1 Reply Latest reply on Sep 7, 2011 4:27 AM by Mircea Markus

    locked keys after replication timeout problem

    Jan Slezak Newbie

      Hi - please could you help me with the following problem:


      I have distributed caches configured with JBossTM. During stress testing we hit the following blocker: some nodes got a replication timeout on node CACHE23:


      2011-08-26 16:28:56,901 ERROR   local -> dist worker| tionContextInterceptor: 111| ISPN000136: Execution error

      org.infinispan.util.concurrent.TimeoutException: Replication timeout for CACHE13-46306


      and therefore the PrepareCommand failed as well:


      2011-08-26 16:28:56,979 ERROR   local -> dist worker| TransactionCoordinator:1640| Error while processing PrepareCommand

      org.infinispan.util.concurrent.TimeoutException: Replication timeout for CACHE13-46306


      but after that the following error occurred:


      2011-08-26 16:28:57,057 WARN    local -> dist worker|   TransactionXaAdapter: 118| ISPN000141: Could not rollback prepared 1PC transaction. This transaction will be rolled back by the recovery process, if enabled. Transaction: LocalXaTransaction{xid=< formatId=131076, gtrid_length=29, bqual_length=28, tx_uid=0:ffff0a000ae5:e6f8:4e57ad79:7, node_name=1, branch_uid=0:ffff0a000ae5:e6f8:4e57ad79:8, eis_name=unknown eis name >} LocalTransaction{remoteLockedNodes=null, isMarkedForRollback=false, transaction=TransactionImple < ac, BasicAction: 0:ffff0a000ae5:e6f8:4e57ad79:7 status: ActionStatus.COMMITTING >} org.infinispan.transaction.xa.LocalXaTransaction@44574163


      2011-08-26 16:28:57,166 ERROR   local -> dist worker| IfspnCacheSynchronizer: 139| Error during commit: Could not commit transaction.

      javax.transaction.RollbackException: Could not commit transaction.


      and then the transaction disappeared:


      2011-08-26 16:28:57,182 WARN    local -> dist worker| IfspnCacheSynchronizer: 129| Error during rollback: Error during commit: Could not commit transaction.

      java.lang.IllegalStateException: BaseTransaction.rollback - no transaction!


      The problem is that afterwards keys remain locked by the node itself (node CACHE23), and the distributed cache is completely blocked for that node:


      org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [1 seconds] on key [ID8541566201966883263] for requestor [GlobalTransaction:<CACHE23-28362>:3:remote]! Lock held by [GlobalTransaction:<CACHE23-28362>:2:remote]
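
      For anyone triaging a log full of these messages, here is a minimal sketch that pulls the key, requestor and lock holder out of each one and flags the case where both transactions originate on the same node. It assumes the message format is exactly as in the line above; the class name LockTimeoutParser and its methods are hypothetical helpers, not part of Infinispan.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log-triage helper; not an Infinispan API.
final class LockTimeoutParser {

    // Fields extracted from one "Unable to acquire lock" message.
    static final class Info {
        final String timeout, key, requestor, holder;

        Info(String timeout, String key, String requestor, String holder) {
            this.timeout = timeout;
            this.key = key;
            this.requestor = requestor;
            this.holder = holder;
        }

        // True when requestor and holder were started on the same node.
        boolean sameOriginNode() {
            String r = node(requestor);
            return r != null && r.equals(node(holder));
        }
    }

    // Assumes the message layout shown in the post above.
    private static final Pattern MESSAGE = Pattern.compile(
        "Unable to acquire lock after \\[(.+?)\\] on key \\[(.+?)\\]"
        + " for requestor \\[(.+?)\\]! Lock held by \\[(.+?)\\]");

    // Returns the parsed fields, or null if the line is not a lock timeout.
    static Info parse(String line) {
        Matcher m = MESSAGE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new Info(m.group(1), m.group(2), m.group(3), m.group(4));
    }

    // Extracts "CACHE23-28362" from "GlobalTransaction:<CACHE23-28362>:3:remote".
    static String node(String globalTx) {
        int start = globalTx.indexOf('<');
        int end = globalTx.indexOf('>');
        return (start >= 0 && end > start) ? globalTx.substring(start + 1, end) : null;
    }

    public static void main(String[] args) {
        String line = "org.infinispan.util.concurrent.TimeoutException: "
            + "Unable to acquire lock after [1 seconds] on key [ID8541566201966883263] "
            + "for requestor [GlobalTransaction:<CACHE23-28362>:3:remote]! "
            + "Lock held by [GlobalTransaction:<CACHE23-28362>:2:remote]";
        Info info = parse(line);
        System.out.println(info.key + " held by " + info.holder
            + ", same origin node: " + info.sameOriginNode());
    }
}
```

      Grouping the parsed holders by transaction across the whole log would show whether a single leftover transaction accounts for all the stuck keys.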


      Do you have any idea whether this is a bug, an improperly configured TM, or something else? I get the same error with BatchingTM and DummyTM, under various configurations.


      Is there a way to explicitly release an acquired lock, or otherwise bring things back to life?


      Thanks a lot!



      Configuration (full log attached):


          <transport clusterName="cachecluster">
                  <property name="configurationFile" value="cw-jgroups-udp.xml"/>
          <globalJmxStatistics enabled="true" jmxDomain="distCache"/>






      <locking isolationLevel="REPEATABLE_READ"


      <clustering mode="distribution">
              <sync replTimeout="30000"/>
              <l1 enabled="false"/>
              <hash numOwners="100"/>


      <jmxStatistics enabled="true"/>
          <!--<invocationBatching enabled="true"/>-->