5 Replies Latest reply on Sep 23, 2010 1:13 PM by galder.zamarreno

    Replicated cache TimeoutException

    feci

      Hi,

       

      I have this cache:

       

      <namedCache name="requestCache">
          <deadlockDetection enabled="true" spinDuration="100" />
          <locking useLockStriping="false" lockAcquisitionTimeout="5000" />
          <clustering mode="replication">
              <stateRetrieval fetchInMemoryState="true" />
              <sync replTimeout="8000" />
          </clustering>
      </namedCache>

       

      and I have 3 nodes which receive messages from JMS (from a topic listener), and they all put them into this cache at once. I'm calling putIfAbsent, so I expect that the first node will put the message there and the others will not. But it usually fails with:

       

      org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [5 seconds] on key [7269e6c3-76c6-4e39-b7b2-b457de05b0c2] for requestor [Thread[OOB-3,fecihoDesktop-26189,5,Thread Pools]]! Lock held by [(another thread)]

       

      Where could the problem be?

        • 1. Re: Replicated cache TimeoutException
          galder.zamarreno

          Are all 3 nodes receiving the same messages? If they are, and all 3 nodes try to modify the same key, you'll get lock issues like this because all nodes would be trying to modify the same key at the same time. Not sure what the key represents in your case, but if the key is associated with the JMS message, then it'd be better if only one of the nodes consumed the message and stored it in Infinispan.
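
For reference, the putIfAbsent contract the question relies on can be sketched with plain JDK classes. This is a minimal sketch, not Infinispan code: a ConcurrentHashMap stands in for the cache (Infinispan's Cache interface extends ConcurrentMap), three threads stand in for the three nodes, and the class name PutIfAbsentRace is hypothetical. It shows only that exactly one caller sees a null return; it does not model cluster-wide lock acquisition, which is where the timeout in this thread comes from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PutIfAbsentRace {

    // Simulates `nodes` listeners that receive the same message and race to store it.
    // Returns how many of them actually inserted the value.
    static int countWinners(int nodes) throws Exception {
        // Local stand-in for the replicated cache (assumption: no clustering here).
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        String key = "7269e6c3-76c6-4e39-b7b2-b457de05b0c2";

        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < nodes; i++) {
                String payload = "message as seen by node " + i;
                // putIfAbsent returns null only to the caller whose value was stored.
                tasks.add(() -> cache.putIfAbsent(key, payload) == null);
            }
            int winners = 0;
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                if (result.get()) {
                    winners++;
                }
            }
            return winners;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("winners: " + countWinners(3));
    }
}
```

In a synchronously replicated cache each of those three calls also has to acquire the key's lock on the other nodes, so the contract holds but the losers may wait on the lock until lockAcquisitionTimeout expires.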

          • 2. Re: Replicated cache TimeoutException
            feci

            Yes, I did it that way: each node checks whether it is the coordinator, and only the coordinator puts messages...
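
The coordinator-only approach described here might look roughly like the sketch below. It is an illustration under assumptions, not the poster's code: the member list is a hypothetical stand-in for the cluster view (in JGroups the first member of the view is the coordinator), a ConcurrentHashMap stands in for the cache, and the names CoordinatorOnlyWriter, isCoordinator and onMessage are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CoordinatorOnlyWriter {

    // Hypothetical view check: the first member of the ordered view is the coordinator.
    static boolean isCoordinator(List<String> members, String self) {
        return !members.isEmpty() && members.get(0).equals(self);
    }

    // Called on every node for every message; only the coordinator writes.
    // Returns true if this node stored the message.
    static boolean onMessage(List<String> members, String self,
                             ConcurrentMap<String, String> cache,
                             String messageId, String body) {
        if (!isCoordinator(members, self)) {
            return false; // non-coordinators drop the message instead of contending for the key
        }
        return cache.putIfAbsent(messageId, body) == null;
    }

    // Delivers one message to every member and counts how many of them wrote it.
    static int deliverToAll(List<String> members, ConcurrentMap<String, String> cache,
                            String messageId, String body) {
        int writes = 0;
        for (String node : members) {
            if (onMessage(members, node, cache, messageId, body)) {
                writes++;
            }
        }
        return writes;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("node-A", "node-B", "node-C");
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        System.out.println("writes: " + deliverToAll(members, cache, "msg-1", "hello"));
    }
}
```

With a single writer there is no cross-node contention for the key on put, which is the point of the suggestion in reply 1.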

             

            How does the remove method work? Can there be a problem if the three nodes remove the same key? Let's say I want to take the message from the cache with remove (all three nodes at the same time), but I want only one node to succeed, while the other two get null...

            • 3. Re: Replicated cache TimeoutException
              galder.zamarreno

                 public void testEmptySecondLevelCacheEntry() throws Exception {
                    getSessions().getCache().evictEntityRegion(Item.class.getName());
                    Statistics stats = getSessions().getStatistics();
                    stats.clear();
                    SecondLevelCacheStatistics statistics = stats.getSecondLevelCacheStatistics(Item.class.getName() + ".items");
                    Map cacheEntries = statistics.getEntries();
                    assertEquals(0, cacheEntries.size());
                 }
              Yeah, remove works the same way; any modification does. If you want, you can limit your modifications to be local by using one of the overriding Flags.
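
The "only one remover gets the value" behaviour asked about above can be sketched the same way as for any ConcurrentMap. This is a minimal sketch with JDK classes only: a ConcurrentHashMap stands in for the cache, threads stand in for nodes, and the class name RemoveRace is hypothetical. As with any other modification, a replicated cache would additionally need the key's lock on every node, so concurrent removers can hit the same lock timeout. The local-only flag mentioned in the reply is probably Flag.CACHE_MODE_LOCAL, but that is an assumption and is not used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RemoveRace {

    // Lets `nodes` threads remove the same key concurrently and
    // returns how many of them received the stored value (non-null).
    static int countSuccessfulRemovers(int nodes) throws Exception {
        // Local stand-in for the replicated cache (assumption: no clustering here).
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        String key = "request-1";
        cache.put(key, "payload");

        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < nodes; i++) {
                // remove returns the previous value to exactly one caller, null to the rest.
                tasks.add(() -> cache.remove(key));
            }
            int successful = 0;
            for (Future<String> result : pool.invokeAll(tasks)) {
                if (result.get() != null) {
                    successful++;
                }
            }
            return successful;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("successful removers: " + countSuccessfulRemovers(3));
    }
}
```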

              • 4. Re: Replicated cache TimeoutException
                feci

                The problem is that I get this kind of exception even when I'm just starting a new node. I'm putting to the cache from only one node, but when I start another node, it throws again:

                 

                org.infinispan.util.concurrent.TimeoutException: Replication timeout for ...

                 

                How can I put something into a synchronously replicated cache while new nodes are starting? I guess they are retrieving state and so on, so it locks the cache...

                • 5. Re: Replicated cache TimeoutException
                  galder.zamarreno

                  Hmmm, where is that TimeoutException being reported? On the node doing the put? Or the node starting up?

                   

                  I doubt it's the joining node, since a node doing state transfer is not yet in a state that accepts invocations. The node serving the state shouldn't be locking either, since it does a non-blocking state transfer, which basically reads the cache, and reading does not block. So I wonder where exactly this replication timeout comes from. If you could provide a unit test replicating this, it would be a lot of help.