8 Replies Latest reply on Apr 4, 2011 7:29 AM by dan.berindei

    deadlock in put to infinispan?

    omerzohar

      so it goes like this: I have a lot of MDBs running in a JBoss AS 6 + Infinispan configuration, heavily using the cache, putting and getting objects. At one point while running my server I noticed it doesn't do much.

      I noticed the average write time is 4.5 seconds (!)

      I did a thread dump, and basically all threads are stuck in Infinispan's put method.

      This is an example stack trace from one of the threads (they're all more or less identical):

      Thread: Thread-16 (group:HornetQ-client-global-threads-31731945) : priority:5, demon:true, threadId:174, threadState:WAITING
      - waiting on <0x69b10> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      sun.misc.Unsafe.park(Native Method)
      java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
      java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
      java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
      java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
      java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
      org.infinispan.util.concurrent.locks.StripedLock.acquireLock(StripedLock.java:94)
      org.infinispan.loaders.LockSupportCacheStore.lockForWriting(LockSupportCacheStore.java:65)
      org.infinispan.loaders.LockSupportCacheStore.store(LockSupportCacheStore.java:149)
      org.infinispan.interceptors.DistCacheStoreInterceptor.visitPutKeyValueCommand(DistCacheStoreInterceptor.java:81)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
      org.infinispan.interceptors.CacheLoaderInterceptor.visitPutKeyValueCommand(CacheLoaderInterceptor.java:81)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
      org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)
      org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:58)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
      org.infinispan.interceptors.MarshalledValueInterceptor.visitPutKeyValueCommand(MarshalledValueInterceptor.java:125)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
      org.infinispan.interceptors.TxInterceptor.enlistWriteAndInvokeNext(TxInterceptor.java:182)
      org.infinispan.interceptors.TxInterceptor.visitPutKeyValueCommand(TxInterceptor.java:130)
      org.infinispan.interceptors.DistTxInterceptor.visitPutKeyValueCommand(DistTxInterceptor.java:76)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
      org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:113)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
      org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)
      org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:58)
      org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:58)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
      org.infinispan.interceptors.BatchingInterceptor.handleDefault(BatchingInterceptor.java:76)
      org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:58)
      org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
      org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)
      org.infinispan.CacheDelegate.put(CacheDelegate.java:444)
      org.infinispan.CacheSupport.put(CacheSupport.java:28)

      And this is my Cache config:

         <!-- blackbox application cache bean cache definitions -->
        <infinispan-config name="blackBox" jndi-name="java:CacheManager/blackbox">
          <alias>blackBox-cache</alias>
          <infinispan xmlns="urn:infinispan:config:4.2">
            <global>
              <transport clusterName="${jboss.partition.name:DefaultPartition}-blackBox" distributedSyncTimeout="17500">
                <properties>
                  <property name="stack" value="${jboss.default.jgroups.stack:udp}"/>
                </properties>
              </transport>
              <globalJmxStatistics enabled="true"/>
              <shutdown hookBehavior="DONT_REGISTER"/>
            </global>
      
            <default>
              <locking isolationLevel="REPEATABLE_READ" lockAcquisitionTimeout="15000" useLockStriping="false" concurrencyLevel="1000"/>
              <jmxStatistics enabled="true"/>
              <lazyDeserialization enabled="true"/>
              <invocationBatching enabled="true"/>   
              <eviction wakeUpInterval="5000" strategy="LRU"/>
            </default>
      
      
 <!-- DATA PERSISTENT CACHE -->
            <namedCache name="persistant">
      
              <clustering mode="distribution">
                <stateRetrieval timeout="60000" fetchInMemoryState="false"/>
                <sync/>
                <hash numOwners="2"/>
                <l1 enabled="true"/>
              </clustering>
      
              <loaders passivation="false" shared="true" preload="true">
                <loader class="org.infinispan.loaders.file.FileCacheStore" fetchPersistentState="false" purgeOnStartup="false">
                  <properties>
                    <property name="location" value="${jboss.server.data.dir}${/}blackbox.cache"/>
                  </properties>
                </loader>
              </loaders>
      
              <!--eviction wakeUpInterval="600000" maxEntries="1024" strategy="LIRS"/>
              <expiration lifespan="86400000" maxIdle="-1"/-->
              <eviction wakeUpInterval="60000" maxEntries="16384" strategy="LIRS"/>
              <expiration lifespan="14400000" maxIdle="-1"/>
            </namedCache>
          </infinispan>
        </infinispan-config>

      Does anyone have advice on how to resolve this? Am I using Infinispan wrong or something?

        • 1. deadlock in put to infinispan?
          dan.berindei

          Your problem is that FileCacheStore uses striped locks, and because of that, transactions that should normally be isolated end up acquiring the same locks in different orders, causing deadlocks.

           

          You could specify a lockConcurrencyLevel on the loader element higher than the default of 2048, which will reduce your chances of deadlocks. Unfortunately you won't be able to disable lock striping completely in FileCacheStore or any other cache store that uses locking internally, so you will still have a (smaller) chance of deadlocks.
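          As a sketch against the config in the original post (assuming the 4.2 loader schema passes lockConcurrencyLevel through as a store property; the value and placement here are illustrative, not verified):

```xml
<loaders passivation="false" preload="true">
  <loader class="org.infinispan.loaders.file.FileCacheStore"
          fetchPersistentState="false" purgeOnStartup="false">
    <properties>
      <property name="location" value="${jboss.server.data.dir}${/}blackbox.cache"/>
      <!-- assumption: raising the stripe count above the 2048 default
           lowers the odds of two unrelated keys sharing a lock -->
      <property name="lockConcurrencyLevel" value="8192"/>
    </properties>
  </loader>
</loaders>
```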

           

          Your other option is to use a cache store that doesn't implement its own locking, e.g. BdbjeCacheStore or JdbmCacheStore. Those cache stores should be faster than the file-based one anyway, even without the deadlocks.
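          Swapping the store class in the config above might look roughly like this (the location property and directory name are assumptions carried over from the FileCacheStore setup; check the BdbjeCacheStore javadoc for its actual properties):

```xml
<loaders passivation="false" preload="true">
  <loader class="org.infinispan.loaders.bdbje.BdbjeCacheStore"
          fetchPersistentState="false" purgeOnStartup="false">
    <properties>
      <!-- directory for the Berkeley DB JE environment -->
      <property name="location" value="${jboss.server.data.dir}${/}blackbox.bdbje"/>
    </properties>
  </loader>
</loaders>
```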

          • 2. deadlock in put to infinispan?
            genman

            I wouldn't characterize this as a deadlock situation, unless the locking order was somehow reversed and the program stopped working completely.

             

            You would also get better performance using an async cache loader configuration, if you're doing a lot of data storage.

            • 3. deadlock in put to infinispan?
              omerzohar

              @Dan Berindei thanks, I hadn't noticed that FileCacheStore is not for production. I moved to BdbjeCacheStore backed by Berkeley DB and indeed it performs much better.

              One issue though: I can't seem to load the BdbjeCacheStore configuration from the infinispan-config.xml in JBoss. It seems to fail silently and refuses to load Berkeley DB. However, if I load a different config XML with the same configuration, it seems to work perfectly.

               

               

              @Elias Ross: do you mean async put and get, or the async work mode of the cache loader?

              • 4. deadlock in put to infinispan?
                genman

                I'm commenting about the cache loader.

                • 5. deadlock in put to infinispan?
                  omerzohar

                  What are the advantages of such an async cache loader? I'm not sure I understand how it works...

                  • 6. deadlock in put to infinispan?
                    dan.berindei

                    @Elias Ross: You're right, it's not a deadlock. My theory was that you could get deadlocks because FileCacheStore maps keys to locks more or less randomly and so can change the order in which locks are acquired. However, after looking at the code more closely, I realized the locks are acquired without a timeout, so if it had been a deadlock those MDBs would have been stuck forever. I guess what happens is the cache store only acquires the lock for one key at a time, so there is no way to acquire the locks in the wrong order.
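                    To illustrate the striping being discussed, here is a minimal sketch (this is not Infinispan's actual StripedLock implementation, just the same idea): keys are hashed onto a fixed pool of locks, so two unrelated keys can end up contending for the same write lock, which is exactly what the parked threads in the dump above are waiting on.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal sketch of lock striping (hypothetical class, not Infinispan code):
// a fixed pool of locks, with keys mapped onto stripes by hash code.
public class StripedLockSketch {
    private final ReentrantReadWriteLock[] stripes;

    public StripedLockSketch(int concurrency) {
        stripes = new ReentrantReadWriteLock[concurrency];
        for (int i = 0; i < concurrency; i++) {
            stripes[i] = new ReentrantReadWriteLock();
        }
    }

    // Two unrelated keys can hash to the same stripe and contend for one lock.
    int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % stripes.length;
    }

    // Every write for a key blocks on its stripe's write lock.
    public void withWriteLock(Object key, Runnable work) {
        ReentrantReadWriteLock lock = stripes[stripeFor(key)];
        lock.writeLock().lock();
        try {
            work.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

                    With a limited stripe count, any two of the many keys the MDBs touch have a small but real chance of landing on the same stripe, which is why raising the concurrency level only shrinks the contention rather than eliminating it.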

                     

                    @omer zohar: What do you mean it silently fails to load the BdbjeCacheStore? Does it work as if you didn't configure any cache store, or does it work as if you still had the FileCacheStore configured? If it's the latter, you probably have another infinispan-config.xml in your classpath.

                     

                    An async cache loader will improve throughput by performing the stores on a secondary thread; see the javadoc at http://docs.jboss.org/infinispan/4.2/apidocs/org/infinispan/loaders/decorators/AsyncStore.html
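                    Sketched against the config in the original post, enabling the async decorator would look roughly like this (the threadPoolSize attribute and the element ordering inside the loader are assumptions; check the AsyncStore/AsyncStoreConfig javadoc against the 4.2 schema):

```xml
<loaders passivation="false" preload="true">
  <loader class="org.infinispan.loaders.bdbje.BdbjeCacheStore">
    <properties>
      <property name="location" value="${jboss.server.data.dir}${/}blackbox.bdbje"/>
    </properties>
    <!-- writes to the store happen on a background thread pool
         instead of blocking the caller -->
    <async enabled="true" threadPoolSize="5"/>
  </loader>
</loaders>
```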

                     

                    One more thing: I noticed you set shared="true" in your loaders config. That won't work properly with FileCacheStore or BdbjeCacheStore, because 1) each server has its own store on a local disk, and 2) even if you use a shared directory, these cache stores are not engineered to handle multiple processes writing to them at the same time.

                    • 7. deadlock in put to infinispan?
                      omerzohar

                      @Dan Berindei: yes, by failing silently I mean it worked as if it had no cache store at all. Overflowed items were just forgotten...

                       

                      So far performance has been good with sync; I will try the async mode. Thanks.

                       

                      Regarding shared=true: I was under the impression this setting is a must for having the cache work in a cluster. Am I wrong here?

                      • 8. deadlock in put to infinispan?
                        dan.berindei

                        omer zohar wrote:

                         

                        @Dan Berindei: yes, by failing silently I mean it worked as if it had no cache store at all. Overflowed items were just forgotten...

                         

                        That's really odd, could you try to reproduce this in a standalone app?

                         

                        So far performance has been good with sync; I will try the async mode. Thanks.

                        Keep in mind that this will move the stores out of the MDB transaction, so you could end up thinking your data is stored safely on disk even though the cache store has failed. As always, it's a tradeoff...

                         

                        Regarding shared=true: I was under the impression this setting is a must for having the cache work in a cluster. Am I wrong here?

                        It's not required at all; it's just an optimization for when your backing store really is shared. See http://docs.jboss.org/infinispan/4.2/apidocs/index.html?org/infinispan/config/