1 Reply Latest reply on Oct 26, 2005 6:21 PM by allen.wyatt

    Using cache as multi-server locking mechanism

    allen.wyatt

      I want to use a cache as a multi-server locking mechanism. I have several servers (for redundancy in case one dies) that will receive the same message on a pub/sub JMS topic. When each server receives the message it wants to do some processing on it, but only if some other server has not already started processing the message. So, the code would work something like this:

      public boolean canIProcessMessage( String messageIdentity )
      {
          boolean doIOwnThisMessage = false;
          // lock_access_to_treecache_region / unlock_access_to_treecache_region are
          // placeholders for the locking call I am looking for, not real methods
          cacheMBean.lock_access_to_treecache_region( "/message-statuses" );
          String x = ( String ) cacheMBean.get( "/message-statuses", messageIdentity );
          if ( null == x )
          {
              // No one else has ownership of this message yet
              cacheMBean.put( "/message-statuses", messageIdentity, "I own this message" );
              doIOwnThisMessage = true;
          }
          cacheMBean.unlock_access_to_treecache_region( "/message-statuses" );
          return doIOwnThisMessage;
      }
      


      Does this sound reasonable? Is JBossCache a good tool to use here? I'm not sure how to implement the "lock_access_to_treecache_region" call (I didn't find anything like it in the javadoc). I tried an implementation that uses a cache MBean configured with CacheMode REPL_SYNC and IsolationLevel SERIALIZABLE, with a single transaction wrapped around both the get and the put calls. When testing this, though, I often get LockTimeout and Replication exceptions.
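
      To make the intended semantics concrete: the check-then-put above is only safe if the read and the write happen as one atomic step. Below is a minimal single-JVM sketch of that "first caller wins" behavior using java.util.concurrent.ConcurrentHashMap. It is only an illustration of the semantics, not a JBossCache solution, and it does not replicate anything across servers. The class name MessageClaims and the server names are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical single-JVM illustration of "first caller wins" ownership. */
final class MessageClaims {
    // message identity -> name of the server that claimed it
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    /** Returns true only for the first caller that claims the given message. */
    boolean tryClaim(String messageIdentity, String serverName) {
        // putIfAbsent is atomic: it returns null only when this call inserted the entry
        return owners.putIfAbsent(messageIdentity, serverName) == null;
    }

    /** Returns the current owner, or null if nobody has claimed the message. */
    String ownerOf(String messageIdentity) {
        return owners.get(messageIdentity);
    }
}
```

      In a cluster, the same guarantee has to come from the replicated cache itself (locks plus a transaction around the get and the put), which is what the rest of this thread is about.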

        • 1. Re: Using cache as multi-server locking mechanism
          allen.wyatt

          Here are some more details of what's happening. I'm running JBoss 4.0.3 with JBossCache 1.2.4. I have four servers configured, named node0, node1, node2, and node3. They all run the following code:

           TreeCacheMBean cacheMBean = findCache();

           TransactionManager txn = cacheMBean.getTransactionManager();
           txn.begin();

           String fullPath = "/pnr-control/ABC123";
           String co = ( String ) cacheMBean.get( fullPath, "owner" );
           boolean iAmOwner = false;
           if ( null == co )
           {
               cacheMBean.put( fullPath, "owner", "owned" );
               iAmOwner = true;
           }

           txn.commit();
          


          The first time this code is invoked on all four servers after they are started, it works. One of the servers gets through the code and does the put while the other three realize someone else has done the put. The second time this code is invoked all the servers throw exceptions. The exceptions are:

          Exception from node2:
          org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=S46824604252446/15, BranchQual=, localId=15] status=STATUS_NO_TRANSACTION; - nested throwable: (org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false))
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:354)
           at org.jboss.tm.TxManager.commit(TxManager.java:224)
           at server.CacheServiceImpl.processRequest(CacheServiceImpl.java:79)
           ...
          Caused by: org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:406)
           at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
           at org.jboss.tm.TransactionImpl.doBeforeCompletion(TransactionImpl.java:1473)
           at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1092)
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:306)
           ... 48 more
          Caused by: org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3505)
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3526)
           at org.jboss.cache.interceptors.ReplicationInterceptor.runPreparePhase(ReplicationInterceptor.java:485)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:389)
           ... 52 more
          
          Exception from node0:
          org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=S46824604252446/16, BranchQual=, localId=16] status=STATUS_NO_TRANSACTION; - nested throwable: (org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3964, retval=null, received=false, suspected=false))
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:354)
           at org.jboss.tm.TxManager.commit(TxManager.java:224)
           at server.CacheServiceImpl.processRequest(CacheServiceImpl.java:79)
           ...
          Caused by: org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3964, retval=null, received=false, suspected=false)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:406)
           at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
           at org.jboss.tm.TransactionImpl.doBeforeCompletion(TransactionImpl.java:1473)
           at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1092)
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:306)
           ... 48 more
          Caused by: org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3964, retval=null, received=false, suspected=false
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3505)
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3526)
           at org.jboss.cache.interceptors.ReplicationInterceptor.runPreparePhase(ReplicationInterceptor.java:485)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:389)
           ... 52 more
          
          Exception from node1:
          org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=S46824604252446/16, BranchQual=, localId=16] status=STATUS_NO_TRANSACTION; - nested throwable: (org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false))
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:354)
           at org.jboss.tm.TxManager.commit(TxManager.java:224)
           at server.CacheServiceImpl.processRequest(CacheServiceImpl.java:79)
           ...
          Caused by: org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:406)
           at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
           at org.jboss.tm.TransactionImpl.doBeforeCompletion(TransactionImpl.java:1473)
           at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1092)
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:306)
           ... 48 more
          Caused by: org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3505)
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3526)
           at org.jboss.cache.interceptors.ReplicationInterceptor.runPreparePhase(ReplicationInterceptor.java:485)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:389)
           ... 52 more
          
          Exception from node3:
          org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=S46824604252446/16, BranchQual=, localId=16] status=STATUS_NO_TRANSACTION; - nested throwable: (org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false))
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:354)
           at org.jboss.tm.TxManager.commit(TxManager.java:224)
           at server.CacheServiceImpl.processRequest(CacheServiceImpl.java:79)
           ...
          Caused by: org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:406)
           at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
           at org.jboss.tm.TransactionImpl.doBeforeCompletion(TransactionImpl.java:1473)
           at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1092)
           at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:306)
           ... 48 more
          Caused by: org.jboss.cache.ReplicationException: rsp=sender=10.16.34.37:3961, retval=null, received=false, suspected=false
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3505)
           at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3526)
           at org.jboss.cache.interceptors.ReplicationInterceptor.runPreparePhase(ReplicationInterceptor.java:485)
           at org.jboss.cache.interceptors.ReplicationInterceptor$SynchronizationHandler.beforeCompletion(ReplicationInterceptor.java:389)
           ... 52 more
          


          "server.CacheServiceImpl.processRequest" is the method listed above. Line 79 is the "txn.commit" line in the code.

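          Since the failure surfaces at the commit, one defensive variation is to treat any exception from begin/get/put/commit as "I did not get ownership" and attempt a rollback. The sketch below shows only that control flow. SimpleTx and SimpleCache are hypothetical stand-ins for the transaction manager and cache calls used above, not real JBoss types, and this does not explain or fix the replication failure itself.

```java
/** Hypothetical stand-in for the begin/commit/rollback calls used above. */
interface SimpleTx {
    void begin() throws Exception;
    void commit() throws Exception;
    void rollback() throws Exception;
}

/** Hypothetical stand-in for the cache get/put calls used above. */
interface SimpleCache {
    Object get(String fqn, String key) throws Exception;
    void put(String fqn, String key, Object value) throws Exception;
}

final class OwnershipClaimer {
    private final SimpleCache cache;
    private final SimpleTx tx;

    OwnershipClaimer(SimpleCache cache, SimpleTx tx) {
        this.cache = cache;
        this.tx = tx;
    }

    /** Returns true only if this caller wrote the owner entry and the commit succeeded. */
    boolean tryClaim(String fqn) {
        boolean began = false;
        try {
            tx.begin();
            began = true;
            boolean free = cache.get(fqn, "owner") == null;
            if (free) {
                cache.put(fqn, "owner", "owned");
            }
            tx.commit();
            return free;
        } catch (Exception e) {
            if (began) {
                try {
                    // A failed commit may already have ended the transaction,
                    // so rollback can itself fail; that failure is ignored here.
                    tx.rollback();
                } catch (Exception ignored) {
                }
            }
            return false; // be conservative: no confirmed commit, no ownership
        }
    }
}
```

          With this shape, a server whose commit fails simply does not process the message, so a failed claim never turns into duplicate processing, although it can mean that no server processes the message if every commit fails.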
          Here's the configuration of the cache:

          <?xml version="1.0" encoding="UTF-8" ?>
          <server>
              <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/>

              <!-- ==================================================================== -->
              <!-- TreeCache that synchronously replicates changes -->
              <!-- ==================================================================== -->
              <mbean code="org.jboss.cache.TreeCache" name="jboss.cache:service=AMWCache">

                  <depends>jboss:service=Naming</depends>
                  <depends>jboss:service=TransactionManager</depends>

                  <!-- Configure the TransactionManager -->
                  <attribute name="TransactionManagerLookupClass">org.jboss.cache.JBossTransactionManagerLookup</attribute>

                  <!-- Node locking level : SERIALIZABLE, REPEATABLE_READ (default), READ_COMMITTED, READ_UNCOMMITTED,
                       NONE -->
                  <attribute name="IsolationLevel">SERIALIZABLE</attribute>

                  <!-- Valid modes are LOCAL, REPL_ASYNC, REPL_SYNC -->
                  <attribute name="CacheMode">REPL_SYNC</attribute>

                  <!-- Name of cluster. Needs to be the same for all clusters, in order to find each other -->
                  <attribute name="ClusterName">AMW-Cache</attribute>

                  <attribute name="ClusterConfig">
                      <config>
                          <!-- UDP: if you have a multihomed machine, set the bind_addr attribute to the
                               appropriate NIC IP address -->
                          <!-- UDP: On Windows machines, because of the media sense feature being broken with
                               multicast (even after disabling media sense) set the loopback attribute to
                               true -->
                          <UDP mcast_addr="228.1.2.3" mcast_port="45566" ip_ttl="64" ip_mcast="true"
                               mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
                               ucast_send_buf_size="150000" ucast_recv_buf_size="80000" loopback="true"/>
                          <PING timeout="2000" num_initial_members="3" up_thread="false" down_thread="false"/>
                          <MERGE2 min_interval="10000" max_interval="20000"/>
                          <FD shun="true" up_thread="true" down_thread="true"/>
                          <VERIFY_SUSPECT timeout="1500" up_thread="false" down_thread="false"/>
                          <pbcast.NAKACK gc_lag="50" max_xmit_size="8192" retransmit_timeout="600,1200,2400,4800"
                               up_thread="false" down_thread="false"/>
                          <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
                               down_thread="false"/>
                          <pbcast.STABLE desired_avg_gossip="20000" up_thread="false" down_thread="false"/>
                          <FRAG frag_size="8192" down_thread="false" up_thread="false"/>
                          <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true"
                               print_local_addr="true"/>
                          <pbcast.STATE_TRANSFER up_thread="false" down_thread="false"/>
                      </config>
                  </attribute>

                  <!-- The max amount of time (in milliseconds) we wait until the initial state (ie. the contents of the
                       cache) are retrieved from existing members in a clustered environment -->
                  <attribute name="InitialStateRetrievalTimeout">5000</attribute>

                  <!-- Number of milliseconds to wait until all responses for a synchronous call have been received. -->
                  <attribute name="SyncReplTimeout">10000</attribute>

                  <!-- Max number of milliseconds to wait for a lock acquisition -->
                  <attribute name="LockAcquisitionTimeout">15000</attribute>

                  <!-- Name of the eviction policy class. -->
                  <attribute name="EvictionPolicyClass">org.jboss.cache.eviction.LRUPolicy</attribute>

                  <!-- Specific eviction policy configurations. This is LRU -->
                  <attribute name="EvictionPolicyConfig">
                      <config>
                          <!-- <attribute name="wakeUpIntervalSeconds">5</attribute> -->
                          <attribute name="wakeUpIntervalSeconds">3600</attribute>
                          <!-- Cache wide default -->
                          <region name="/_default_">
                              <attribute name="maxNodes">5000</attribute>
                              <attribute name="timeToLiveSeconds">86400</attribute>
                              <!-- Maximum time an object is kept in cache regardless of idle time -->
                              <attribute name="maxAgeSeconds">86400</attribute>
                          </region>
                      </config>
                  </attribute>

              </mbean>

          </server>