3 Replies Latest reply on Jun 11, 2007 6:57 PM by nanreh

    BdbjeCacheLoader exception with JTA distributed transactions

    nanreh

      I have a simple cache configuration that specifies a single BdbjeCacheLoader. I'm using JBoss Transactions and specifying transaction boundaries in my code using JTA. Here's one of the two transactional methods in my application:

       public byte[] loadData(byte[] key) throws ShardStoreException {
      
       Validate.notNull(key,"key is null");
      
       UserTransaction tx = null;
       try {
      
       String keyStr = HexUtil.toHexString(key);
      
       String fqn = new StringBuilder("/shard_").append(shardId).append("/").append(keyStr).toString();
       Fqn nodeFqn = Fqn.fromString(fqn);
      
       tx = com.arjuna.ats.jta.UserTransaction.userTransaction();
       logger.info("-------- About to begin TX");
       tx.begin();
       logger.info("-------- Transaction details: " + tx);
       logger.info("-------- Transaction started");
      
       Object value = cache.get(nodeFqn, KEY_PAYLOAD);
      
       logger.info("-------- About to commit tx.");
       tx.commit();
       logger.info("-------- TX committed.");
      
       if(null==value) {
       return null;
       }
      
       if( !(value instanceof byte[]) ) {
       throw new IllegalArgumentException("Invalid payload found in cache.");
       }
       return (byte[])value;
      
       } catch(Exception e) {
       try {
       logger.error("Error encountered. Rolling back transaction.",e);
       tx.rollback();
       } catch(Exception re) {
       logger.error("Error in rollback.",re);
       }
       throw new ShardStoreException(e);
       }
       }
      


      Pretty basic usage of JBoss cache with lots of logger calls. With my current cache configuration I can call this method once and it appears to succeed normally. When I call it again, however, I get this exception:

      jvm 1 | INFO 10:11.03 [AnonymousIoService-4] JBossCacheShardStore - -------- About to begin TX
      jvm 1 | INFO 10:11.03 [AnonymousIoService-4] JBossCacheShardStore - -------- Transaction details: com.arjuna.ats.internal.jta.transaction.jts.UserTransactionImple@1b2d7df
      jvm 1 | INFO 10:11.03 [AnonymousIoService-4] JBossCacheShardStore - -------- Transaction started
      jvm 1 | [23-05-2007 10:11:03:651] jacorb.poa:INFO: oid:
      jvm 1 | 05 15 0C 47 48 3F 29 14 36 40 ...GH?).6@
      jvm 1 | object is activated
      jvm 1 | ERROR 10:11.04 [AnonymousIoService-4] JBossCacheShardStore - Error encountered. Rolling back transaction.
      jvm 1 | java.lang.RuntimeException: com.sleepycat.je.DeadlockException: (JE 3.2.23) Lock expired. Locker -1_AnonymousIoService-4_ThreadLocker: waited for lock on database=SHARD_CLUSTER node=28 type=READ grant=WAIT_NEW timeoutMillis=500 startTime=1179940263653 endTime=1179940264155
      jvm 1 | Owners: [<LockInfo locker="6_RequestProcessor-5_Txn" type="WRITE"/>]
      jvm 1 | Waiters: []
      jvm 1 |
      jvm 1 | at org.jboss.cache.CacheImpl.invokeMethod(CacheImpl.java:3764)
      jvm 1 | at org.jboss.cache.CacheImpl.get(CacheImpl.java:1323)
      jvm 1 | at org.jboss.cache.CacheImpl.get(CacheImpl.java:1298)
      jvm 1 | at com.scea.scne.acs.shard.cache.JBossCacheShardStore.loadData(JBossCacheShardStore.java:110)
      jvm 1 | at com.scea.scne.acs.shard.server.ShardFetchHandler.handleLoadRequest(ShardFetchHandler.java:72)
      jvm 1 | at com.scea.scne.acs.shard.server.ShardFetchHandler.messageReceived(ShardFetchHandler.java:55)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:703)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:362)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:54)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:800)
      jvm 1 | at org.apache.mina.filter.LoggingFilter.messageReceived(LoggingFilter.java:97)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:362)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:54)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:800)
      jvm 1 | at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:62)
      jvm 1 | at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:200)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:362)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:54)
      jvm 1 | at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:800)
      jvm 1 | at org.apache.mina.filter.executor.ExecutorFilter.processEvent(ExecutorFilter.java:247)
      jvm 1 | at org.apache.mina.filter.executor.ExecutorFilter$ProcessEventsRunnable.run(ExecutorFilter.java:307)
      jvm 1 | at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
      jvm 1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
      jvm 1 | at java.lang.Thread.run(Thread.java:595)
      jvm 1 | Caused by: com.sleepycat.je.DeadlockException: (JE 3.2.23) Lock expired. Locker -1_AnonymousIoService-4_ThreadLocker: waited for lock on database=SHARD_CLUSTER node=28 type=READ grant=WAIT_NEW timeoutMillis=500 startTime=1179940263653 endTime=1179940264155
      jvm 1 | Owners: [<LockInfo locker="6_RequestProcessor-5_Txn" type="WRITE"/>]
      jvm 1 | Waiters: []
      jvm 1 |
      jvm 1 | at com.sleepycat.je.txn.LockManager.lock(LockManager.java:266)
      jvm 1 | at com.sleepycat.je.txn.BasicLocker.lockInternal(BasicLocker.java:90)
      jvm 1 | at com.sleepycat.je.txn.Locker.lock(Locker.java:257)
      jvm 1 | at com.sleepycat.je.dbi.CursorImpl.lockLNDeletedAllowed(CursorImpl.java:2349)
      jvm 1 | at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:2271)
      jvm 1 | at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:1963)
      jvm 1 | at com.sleepycat.je.Cursor.searchInternal(Cursor.java:1178)
      jvm 1 | at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:1148)
      jvm 1 | at com.sleepycat.je.Cursor.search(Cursor.java:1014)
      jvm 1 | at com.sleepycat.je.Database.get(Database.java:547)
      jvm 1 | at org.jboss.cache.loader.bdbje.BdbjeCacheLoader.get(BdbjeCacheLoader.java:380)
      jvm 1 | at org.jboss.cache.interceptors.CacheLoaderInterceptor.loadData(CacheLoaderInterceptor.java:515)
      jvm 1 | at org.jboss.cache.interceptors.CacheLoaderInterceptor.loadNode(CacheLoaderInterceptor.java:395)
      jvm 1 | at org.jboss.cache.interceptors.CacheLoaderInterceptor.loadIfNeeded(CacheLoaderInterceptor.java:181)
      jvm 1 | at org.jboss.cache.interceptors.CacheLoaderInterceptor.invoke(CacheLoaderInterceptor.java:152)
      jvm 1 | at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:75)
      jvm 1 | at org.jboss.cache.interceptors.UnlockInterceptor.invoke(UnlockInterceptor.java:33)
      jvm 1 | at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:75)
      jvm 1 | at org.jboss.cache.interceptors.PessimisticLockInterceptor.invoke(PessimisticLockInterceptor.java:202)
      jvm 1 | at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:75)
      jvm 1 | at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:34)
      jvm 1 | at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:75)
      jvm 1 | at org.jboss.cache.interceptors.NotificationInterceptor.invoke(NotificationInterceptor.java:22)
      jvm 1 | at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:75)
      jvm 1 | at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:294)
      jvm 1 | at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:128)
      jvm 1 | at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:75)
      jvm 1 | at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:126)
      jvm 1 | at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:75)
      jvm 1 | at org.jboss.cache.interceptors.InvocationContextInterceptor.invoke(InvocationContextInterceptor.java:66)
      jvm 1 | at org.jboss.cache.CacheImpl.invokeMethod(CacheImpl.java:3752)
      jvm 1 | ... 23 more
      


      I'm not sure how to interpret this stack trace but it looks like the BerkleyDB is holding onto a lock from the first transaction and the second transaction fails to acquire it and fails.

      When I tell the transaction manager to do local transactions the error goes away. I'm wondering if I need to send the Berkley DB a few properties to set up its transactional behavior but I thought I'd post in case anyone out there recognizes the issue.

      Here's my cache configuration file:

      <?xml version="1.0" encoding="UTF-8"?>
      
      <server>
       <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/>
      
       <mbean code="org.jboss.cache.CacheImpl" name="jboss.cache:service=Cache">
       <depends>jboss:service=Naming</depends>
       <depends>jboss:service=TransactionManager</depends>
      
       <!-- Use our custom TxManagerLookup class -->
       <attribute name="TransactionManagerLookupClass">org.defy.tx.CustomTxManagerLookup</attribute>
      
       <!--
       Node locking level : SERIALIZABLE
       REPEATABLE_READ (default)
       READ_COMMITTED
       READ_UNCOMMITTED
       NONE
       -->
       <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
      
       <!-- Lock parent before doing node additions/removes -->
       <attribute name="LockParentForChildInsertRemove">true</attribute>
      
       <!--
       Valid modes are LOCAL
       REPL_ASYNC
       REPL_SYNC
       INVALIDATION_ASYNC
       INVALIDATION_SYNC
       -->
      
       <attribute name="CacheMode">REPL_SYNC</attribute>
       <!--<attribute name="CacheMode">LOCAL</attribute>-->
      
       <!-- Name of cluster. Needs to be the same for all JBoss Cache nodes in a
       cluster in order to find each other. -->
       <attribute name="ClusterName">SHARD_CLUSTER</attribute>
      
       <!--Uncomment next three statements to enable JGroups multiplexer.
       This configuration is dependent on the JGroups multiplexer being
       registered in an MBean server such as JBossAS. -->
       <!--
       <depends>jgroups.mux:name=Multiplexer</depends>
       <attribute name="MultiplexerService">jgroups.mux:name=Multiplexer</attribute>
       <attribute name="MultiplexerStack">fc-fast-minimalthreads</attribute>
       -->
      
       <!-- JGroups protocol stack properties.
       ClusterConfig isn't used if the multiplexer is enabled and successfully initialized.
       -->
       <attribute name="ClusterConfig">
       <config>
       <!-- UDP: if you have a multihomed machine,
       set the bind_addr attribute to the appropriate NIC IP address -->
       <!-- UDP: On Windows machines, because of the media sense feature
       being broken with multicast (even after disabling media sense)
       set the loopback attribute to true -->
       <UDP mcast_addr="228.1.2.3" mcast_port="48866"
       ip_ttl="64" ip_mcast="true"
       mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
       ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
       loopback="true"/>
       <PING timeout="2000" num_initial_members="3"
       up_thread="false" down_thread="false"/>
       <MERGE2 min_interval="10000" max_interval="20000"/>
       <FD shun="true" up_thread="true" down_thread="true"/>
       <VERIFY_SUSPECT timeout="1500"
       up_thread="false" down_thread="false"/>
       <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
       max_xmit_size="8192" up_thread="false" down_thread="false"/>
       <UNICAST timeout="600,1200,2400" down_thread="false"/>
       <pbcast.STABLE desired_avg_gossip="20000"
       up_thread="false" down_thread="false"/>
       <FRAG frag_size="8192"
       down_thread="false" up_thread="false"/>
       <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
       shun="true" print_local_addr="true"/>
       <pbcast.STATE_TRANSFER up_thread="false" down_thread="false"/>
       </config>
       </attribute>
      
      
       <!--
       The max amount of time (in milliseconds) we wait until the
       initial state (ie. the contents of the cache) are retrieved from
       existing members in a clustered environment
       -->
       <attribute name="InitialStateRetrievalTimeout">20000</attribute>
       <attribute name="FetchInMemoryState">true</attribute>
      
       <!--
       Number of milliseconds to wait until all responses for a
       synchronous call have been received.
       -->
       <attribute name="SyncReplTimeout">20000</attribute>
      
       <!-- Max number of milliseconds to wait for a lock acquisition -->
       <attribute name="LockAcquisitionTimeout">15000</attribute>
      
       <attribute name="EvictionPolicyConfig">
       <config>
       <attribute name="wakeUpIntervalSeconds">5</attribute>
       <!-- This defaults to 200000 if not specified -->
       <attribute name="eventQueueSize">200000</attribute>
       <attribute name="policyClass">org.jboss.cache.eviction.FIFOPolicy</attribute>
      
       <!-- Cache wide default -->
       <region name="/_default_">
       <!-- We're allowing an unlimited number of nodes for now. -->
       <attribute name="maxNodes">0</attribute>
       </region>
      
       </config>
       </attribute>
      
       <attribute name="CacheLoaderConfiguration">
       <config>
       <!-- comma delimited FQNs to preload -->
       <passivation>false</passivation>
       <preload>/</preload>
       <shared>false</shared>
      
       <cacheloader>
       <class>org.jboss.cache.loader.bdbje.BdbjeCacheLoader</class>
      
       <properties>
       location=/tmp/filestore
       transactional=false
       </properties>
       <async>false</async>
       <fetchPersistentState>false</fetchPersistentState>
       <ignoreModifications>false</ignoreModifications>
       <purgeOnStartup>true</purgeOnStartup>
       </cacheloader>
       </config>
       </attribute>
      
       </mbean>
      </server>
      


      thanks in advance,
      frank

        • 1. Re: BdbjeCacheLoader exception with JTA distributed transact
          nanreh

          One additional possible hint: when I shut the process down a shutdown hook runs to do an orderly shutdown of jacorb which is being used by JBoss Transactions for distributed transactions. BdbjeCacheLoader throws an exception when it's asked to shutdown apparently due to a lingering open transaction:

          jvm 1 | com.sleepycat.je.DatabaseException: (JE 3.2.23) There is 1 existing transaction opened against the Environment.
          jvm 1 | Aborting open transactions ...
          jvm 1 |
          jvm 1 | at com.sleepycat.je.Environment.close(Environment.java:331)
          jvm 1 | at org.jboss.cache.loader.bdbje.BdbjeCacheLoader.stop(BdbjeCacheLoader.java:259)
          jvm 1 | at org.jboss.cache.loader.CacheLoaderManager.stopCacheLoader(CacheLoaderManager.java:333)
          jvm 1 | at org.jboss.cache.CacheImpl.stop(CacheImpl.java:870)
          jvm 1 | at com.scea.scne.acs.shard.cache.JBossCacheShardStore$3.run(JBossCacheShardStore.java:72)
          


          thanks...
          f

          • 2. Re: BdbjeCacheLoader exception with JTA distributed transact
            manik

            I just wrote a unit test loosely based on your use case above and it works okay. Can you confirm that this is specific to the transaction manager you use? Could you try with org.jboss.cache.DummyTransactionManager and see how it behaves with that?

            I'm NOT advocating the use of the DummyTransactionManager; just using it to try and pinpoint the fault. :-)

            • 3. Re: BdbjeCacheLoader exception with JTA distributed transact
              nanreh

              Thanks Manik for looking into this issue--I had another forum post which cleared it up for me:

              http://www.jboss.com/index.html?module=bb&op=viewtopic&t=109512

              I was running a distributed transaction manager with JBoss Cache and none of my transactions were being committed. I'm now running local transactions + a REPL_SYNC policy and letting JBoss do its internal prepare/commit across the cluster. For now I'm handling sync failures in my application as you mentioned.

              thanks again,
              Frank[/url]