1 2 Previous Next 22 Replies Latest reply on Jul 22, 2009 9:16 AM by xuetao niu

    BerkeleyDB Deadlock Exception from JBoss Cache Loader: Bdbje

    xuetao niu Newbie

      Hi,

      I am using Jboss Pojo cache on top of Jboss cache (3.0.0GA) managing rather big object trees of collections. And I am randomly getting com.sleepycat.je.DeadlockException(s) when using the BdbjeCacheLoader as cache loader.

      There are 10 threads running under a single machine (JSE environment), each does both reading and writing. Neither reading nor writing (as I believe) should take up to 20 seconds as a transactional batch on my hardware, since the objects are not huge (each object is a list of other object that include an array of maps of basic types).

      If I set the lockAcquisitionTimeout to smaller value, it happens more frequently. But even I set the value to a very big one, like 20000, it still happens with less frequency, and it blocks for 20 seconds before this exception is thrown.

      Due to these posts, I guess it might have something to do with mvcc. (I am using je-3.3.82), but really do not hope so, since mvcc is one of the points I like jboss cache.

      http://forums.oracle.com/forums/thread.jspa?threadID=866063&tstart=60
      http://forums.oracle.com/forums/thread.jspa?threadID=616358&tstart=30

      Below is the jboss cache configuration and the exception stack trace. Thanks in advance for your help!

      <?xml version="1.0" encoding="UTF-8"?>
      <jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="urn:jboss:jbosscache-core:config:3.0">
      
       <locking isolationLevel="READ_COMMITTED"
       lockParentForChildInsertRemove="false" lockAcquisitionTimeout="5000"
       nodeLockingScheme="mvcc" writeSkewCheck="false" concurrencyLevel="500" />
      
       <transaction
       transactionManagerLookupClass="org.jboss.cache.transaction.GenericTransactionManagerLookup"
       syncRollbackPhase="false" syncCommitPhase="false" />
      
       <jmxStatistics enabled="false" />
      
       <startup regionsInactiveOnStartup="true" />
      
       <shutdown hookBehavior="DEFAULT" />
      
       <listeners asyncPoolSize="1" asyncQueueSize="100000" />
      
       <invocationBatching enabled="false" />
      
       <eviction wakeUpInterval="1000">
      
       <!-- Cache wide default -->
       <region name="/_default_" algorithmClass="org.jboss.cache.eviction.LRUAlgorithm">
      
       <property name="maxNodes">10000</property>
       <property name="timeToLiveSeconds">300</property>
       </region>
       </eviction>
      
       <loaders passivation="false" shared="true">
       <preload>
       <node fqn="/" />
       </preload>
      
       <loader class="org.jboss.cache.loader.bdbje.BdbjeCacheLoader" async="false"
       fetchPersistentState="false" ignoreModifications="false"
       purgeOnStartup="false">
      
       <properties>
      
       cache.jdbc.table.name=Cached_Events
       cache.jdbc.table.create=false
       cache.jdbc.table.drop=false
       cache.jdbc.table.primarykey=jbosscache_pk
       cache.jdbc.fqn.column=fqn
       cache.jdbc.fqn.type=varchar2(255)
       cache.jdbc.node.column=node
       cache.jdbc.node.type=blob
       cache.jdbc.parent.column=parent
      
       cache.jdbc.sql-concat=concat(1,2)
      
       cache.jdbc.connection.factory=org.jboss.cache.loader.C3p0ConnectionFactory
       c3p0.maxPoolSize=20
       c3p0.checkoutTimeout=5000
      
       location=tmp/jbosscache
       </properties>
      
       </loader>
      
      
       </loaders>
      
      </jbosscache>
      


      com.sleepycat.je.DeadlockException: (JE 3.3.82) Lock expired. Locker 6761395 -1_Detection2_ThreadLocker: waited for lock on database=JBossCache-Cluster LockAddr:23471145 node=1785 type=READ grant=WAIT_NEW timeoutMillis=20000 startTime=1245426236094 endTime=1245426256097
       [junit] Owners: [<LockInfo locker="12706060 346_Detection1_Txn" type="WRITE"/>]
       [junit] Waiters: [<LockInfo locker="24478649 347_Detection4_Txn" type="WRITE"/>]
       [junit]
       [junit] at com.sleepycat.je.txn.LockManager.makeTimeoutMsgInternal(LockManager.java:469)
       [junit] at com.sleepycat.je.txn.SyncedLockManager.makeTimeoutMsg(SyncedLockManager.java:77)
       [junit] at com.sleepycat.je.txn.LockManager.lock(LockManager.java:278)
       [junit] at com.sleepycat.je.txn.BasicLocker.lockInternal(BasicLocker.java:167)
       [junit] at com.sleepycat.je.txn.Locker.lock(Locker.java:360)
       [junit] at com.sleepycat.je.dbi.CursorImpl.lockLNDeletedAllowed(CursorImpl.java:2516)
       [junit] at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:2438)
       [junit] at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2368)
       [junit] at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1427)
       [junit] at com.sleepycat.je.dbi.CursorImpl.getNextWithKeyChangeStatus(CursorImpl.java:1573)
       [junit] at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1499)
       [junit] at com.sleepycat.je.dbi.CursorImpl.getNextNoDup(CursorImpl.java:1688)
       [junit] at com.sleepycat.je.Cursor.searchInternal(Cursor.java:1936)
       [junit] at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:1748)
       [junit] at com.sleepycat.je.Cursor.search(Cursor.java:1615)
       [junit] at com.sleepycat.je.Cursor.getSearchKeyRange(Cursor.java:1117)
       [junit] at org.jboss.cache.loader.bdbje.BdbjeCacheLoader.getChildrenNames(BdbjeCacheLoader.java:391)
       [junit] at org.jboss.cache.loader.CacheLoaderManager.preload(CacheLoaderManager.java:371)
      


        • 1. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
          Galder Zamarreño Master

          How big are the Pojos that you're storing? Several kbs/megs...etc?

          Did you try getting some thread dumps to see what the thread holding the lock to the database is doing?

          Can you try commenting preloading and trying again?

          • 2. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
            xuetao niu Newbie

            Hi Galder,

            Thanks for the hints to probe the problem!

            The size of a Pojo is between 5k and 100k by estimate.

            I did not track the lock-holder thread's behavior, cuz I do not know how to do it, could you give some tips on this?

            I disabled the preloading and I haven't seen the berkeleyDb deadlock exception, but often see jboss cache lock exception as below if I set the lockAcquisitionTimeout to smaller values.

            org.jboss.cache.lock.TimeoutException: Unable to acquire lock on Fqn [/RTDE/pattern-matching-threshold/663330655] after [100] milliseconds for requestor [GlobalTransaction:<null>:2]! Lock held by [GlobalTransaction:<null>:1]
             at org.jboss.cache.mvcc.MVCCNodeHelper.acquireLock(MVCCNodeHelper.java:170)
             at org.jboss.cache.mvcc.MVCCNodeHelper.wrapNodeForWriting(MVCCNodeHelper.java:203)
             at org.jboss.cache.interceptors.MVCCLockingInterceptor.handlePutKeyValueCommand(MVCCLockingInterceptor.java:98)
             at org.jboss.cache.interceptors.base.PrePostProcessingCommandInterceptor.visitPutKeyValueCommand(PrePostProcessingCommandInterceptor.java:88)
             at org.jboss.cache.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:100)
             at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
             at org.jboss.cache.interceptors.CacheLoaderInterceptor.visitPutKeyValueCommand(CacheLoaderInterceptor.java:130)
             at org.jboss.cache.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:100)
             at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
             at org.jboss.cache.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:131)
             at org.jboss.cache.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:65)
             at org.jboss.cache.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:100)
             at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
             at org.jboss.cache.interceptors.TxInterceptor.attachGtxAndPassUpChain(TxInterceptor.java:284)
             at org.jboss.cache.interceptors.TxInterceptor.handleDefault(TxInterceptor.java:271)
             at org.jboss.cache.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:65)
             at org.jboss.cache.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:100)
             at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
             at org.jboss.cache.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:178)
             at org.jboss.cache.interceptors.InvocationContextInterceptor.visitPutKeyValueCommand(InvocationContextInterceptor.java:82)
             at org.jboss.cache.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:100)
             at org.jboss.cache.interceptors.InterceptorChain.invoke(InterceptorChain.java:265)
             at org.jboss.cache.invocation.CacheInvocationDelegate.put(CacheInvocationDelegate.java:560)
             at org.jboss.cache.pojo.impl.InternalHelper.lockPojo(InternalHelper.java:342)
             at org.jboss.cache.pojo.impl.PojoCacheDelegate.putObject(PojoCacheDelegate.java:101)
             at org.jboss.cache.pojo.impl.PojoCacheImpl.attach(PojoCacheImpl.java:102)
            


            What I am also wondering are these:
            - Does the lockAcquisitionTimeout configuration of jboss cache also change Berkeley DB's such configuration?
            - If I set the lockAcquisitionTimeout to a the largest available figure, is there a true deadlock possibility in JBoss cache (when using either of the READ_COMMITED or REPEATABLE_READ isolation levels)? By "true deadlock" I mean even if someone can let the lock requester wait for any amount of longer time, the requester will NEVER get the lock.

            Thanks a lot!

            Xuetao

            • 3. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
              xuetao niu Newbie

              Btw, our system is not light in cache-writing. Unlike the classical cache use cases, randomly, it can also be very write-intensive.

              • 4. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                xuetao niu Newbie

                Maybe something useful to give:

                Now I use 20 threads doing the same job (setting ockAcquisitionTimeout="4294967", all the rest configuration is the same as I pasted before), I experience a phenonemen which seems to be a "true deadlock" since no threads are running (jobs in the Threadpool Executor's queue are undone) and the cpu usage is zero. Then after a while, like 1 minute, one of the threads threw this:

                 org.jboss.cache.pojo.PojoCacheException: attach failed /RTDE/time-distance-analysis-tdm/4207
                 at org.jboss.cache.pojo.impl.PojoCacheImpl.attach(PojoCacheImpl.java:111)
                 at org.jboss.cache.pojo.impl.PojoCacheImpl.attach(PojoCacheImpl.java:93)
                 at org.jboss.cache.pojo.impl.PojoCacheImpl.attach(PojoCacheImpl.java:88)
                 at com.fiserv.detection.eventcache.jboss.JBossPojoCacheStorage.update(JBossPojoCacheStorage.java:50)
                 at com.fiserv.detection.eventcache.statemachine.CacheAccessor.save(CacheAccessor.java:117)
                 at com.fiserv.detection.eventcache.statemachine.CacheAccessor.matchOne(CacheAccessor.java:84)
                 at com.fiserv.detection.tda.TimeDistanceAnalysis.doDetection(TimeDistanceAnalysis.java:71)
                 ... 53 more
                Caused by: java.lang.IllegalStateException: Cache not in STARTED state!
                 at org.jboss.cache.invocation.CacheInvocationDelegate.cacheStatusCheck(CacheInvocationDelegate.java:660)
                 at org.jboss.cache.invocation.CacheInvocationDelegate.get(CacheInvocationDelegate.java:446)
                 at org.jboss.cache.pojo.impl.InternalHelper.get(InternalHelper.java:100)
                 at org.jboss.cache.pojo.impl.InternalHelper.getPojoReference(InternalHelper.java:47)
                 at org.jboss.cache.pojo.impl.InternalHelper.getPojo(InternalHelper.java:141)
                 at org.jboss.cache.pojo.impl.PojoCacheDelegate.putObject(PojoCacheDelegate.java:104)
                 at org.jboss.cache.pojo.impl.PojoCacheImpl.attach(PojoCacheImpl.java:102)
                 ... 59 more
                


                Have you seen anything like this?

                Regards,

                Xuetao

                • 5. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                  xuetao niu Newbie

                  P.S., I am sharing one org.jboss.cache.pojo.PojoCache instance amongst multiple threads.

                  • 6. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                    xuetao niu Newbie

                    And, this is my class which should have the cache object fully initialized within the constructor:

                     private final PojoCache cache;
                    
                     JBossPojoCacheStorage(Configuration config){
                    
                    
                     cache = PojoCacheFactory.createCache(config,true);
                    
                    
                     }
                    


                    • 8. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                      Galder Zamarreño Master

                      First of all, to generate thread dumps and see where threads holding write locks might be stuck, see generating thread dumps wiki: http://www.jboss.org/community/wiki/StackTrace

                      I must say that 100ms for lock timeout is pretty small. Something around 10-20 seconds is more usual. Rather than increasing the lock timeout to insane numbers, use thread dumps to figure out what's holding up the progress or if there're any deadlocks...etc There're even tools like TDA or Samurai that can help analyze them.

                      WRT "Caused by: java.lang.IllegalStateException: Cache not in STARTED state!" message, that looks strange, did you check whether stop/destroy was called at some point?

                      Finally, when you say that your cache usage is very write heavy, do you mean that you call attach() loads of times? Or that once you've attached a Pojo to the cache, you modify it via its set methods very often?

                      • 9. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                        xuetao niu Newbie

                        Thanks for your reply Galder!

                        I couldn't reproduce the deadlocking problem on my windows machine anymore (with a rather large number of lockWaiting time config), I am trying on Linux where it does happen randomly.

                        About the write-intensive stuff, yes, we will have a lot more calls to attach() than modifying the already cached Pojos. For this, I have some questions:

                        1. If I call attach() with a key and value that already exist in the cache, what will happen?

                        2. How do I use transaction in java code to explicitly begin and end a transaction?
                        - DummyTransactionManager.getUserTransaction()?
                        - or like describe in this page? http://www.jboss.org/file-access/default/members/jbosscache/freezone/docs/1.4.1.SP4/tutorial/en/html_single/index.html which looks a pretty old doc?

                        • 10. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                          Galder Zamarreño Master

                           

                          "niuxuetao_fiserv" wrote:
                          About the write-intensive stuff, yes, we will have a lot more calls to attach() than modifying the already cached Pojos.


                          Attaching a pojo is an expensive operation where the pojo is dissected and mapped to the cache, hence such operation should be used as less as possible. Detailed information on the process can be found in http://www.jboss.org/file-access/default/members/jbosscache/freezone/docs/3.0.0.GA/pojo/userguide_en/html_single/index.html#d0e763

                          If you're calling attach more often than modifying the Pojo directly, I'm not sure of the suitability of PojoCache in your case. It might even be more of a burden than using plain JBoss Cache. The point of a PojoCache is that you attach once and then you interact with the bean as normal, calling it's getters,setters...etc and when you're done with it, you detach it.

                          "niuxuetao_fiserv" wrote:
                          1. If I call attach() with a key and value that already exist in the cache, what will happen?


                          Hmmm, do you mean what happens when you attach a pojo under a fqn+key that already contains a pojo? In that case, the existing pojo is detached and the new one is reattached. Due to the reasons above, this is an expensive operation.

                          "niuxuetao_fiserv" wrote:
                          2. How do I use transaction in java code to explicitly begin and end a transaction?
                          - DummyTransactionManager.getUserTransaction()?
                          - or like describe in this page? http://www.jboss.org/file-access/default/members/jbosscache/freezone/docs/1.4.1.SP4/tutorial/en/html_single/index.html which looks a pretty old doc?


                          Well, depends the environment in which you're in. If you're within an application server, you get a UserTransaction from JNDI or from an EJB context.

                          If you're outside of an application server, you should be using something like JBoss Transactions that allows setting up a transaction manager and getting a user transaction. You can find examples in their docu.

                          Finally, DummyTransactionManager does what it says on the tin. It's dummy, so it's not really a transaction manager, it's simply used as a mock for testing purpouses and hence you shouldn't be using it in conjunction with your production code.

                          • 11. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                            xuetao niu Newbie

                            Thanks a lot Galder for your help!

                            What I need from Pojo cache is that it returns the same reference of object with the same fqn under the same node, which is what jboss core cache probably cannot do at the moment.

                            Do you know any way to use jboss core cache with the reference comparability as what pojo cache provides?

                            Maybe my use case can further clarify what we need:

                            What we are caching is a pojo (say A) composed of a list of other pojos(say B) that are further composed of a map of other pojos(say C). Each A is about 5k-100k size. The operations to the A(s) have the following properties:

                            - High put/removal of A
                            - Low put/removal of B and C
                            - High reading of A,B,C

                            • 12. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                              Galder Zamarreño Master

                              How many A objects do you expect to have in the system? Are all A objects stored under the same FQN?

                              Also, of those A objects, take for example instance A1. Are you calling multiple attach/detach for the same A1 instance? Or for each A instance, i.e. A1, there's only one attach and one detach?

                              • 13. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                                xuetao niu Newbie

                                We use about 10-20 nodes (i.e. 10-20 different FQNs) and each node keeps up to 1000,000 keys in the map, so each node/FQN also keeps up to 1000,000 instances of A.

                                For A1, I am calling attach() and detach() only once respectively.

                                • 14. Re: BerkeleyDB Deadlock Exception from JBoss Cache Loader: B
                                  Galder Zamarreño Master

                                  Right, I suspect that your lock timeout issues might be related to your current fqn structure. If you bear in mind that locks are acquired on a per FQN basis, storing a lot keys within an FQN leads to lock contention. Instead, I'd suggest you add more FQNs and reduce the number of keys stored in each FQN. The side effect of this is that memory consumption is higher but on the other hand, concurrency levels increase.

                                  1 2 Previous Next