1 2 3 4 Previous Next 45 Replies Latest reply on Jan 18, 2011 10:25 AM by yelin666 Go to original post
      • 15. Re: failure situation using cluster-wide lock
        manik

        Hi Lin - this is done, see details of the fix in http://lists.jboss.org/pipermail/infinispan-dev/2010-December/006943.html

         

        You will see this fix in 4.2.0.CR3.

        • 16. Re: failure situation using cluster-wide lock
          yelin666

          I restested with 4.2.0.CR3 using the simple test above & JBossTX. Unfortunated, it still locked up every time after two instances running for a while.

          • 17. Re: failure situation using cluster-wide lock
            manik

            Hi Lin

             

            I haven't been able to reproduce this with more recent snapshots.  I just published a new 4.2.0-SNAPSHOT which contains some important fixes in the transaction adapters, which may be related.  Could you pls try with this snapshot, and if it works, this will form the basis of 4.2.0.CR4.

             

            Thanks

            Manik

            • 18. Re: failure situation using cluster-wide lock
              yelin666

              Hi Manik,

               

              I downloaded infinispan-core-4.2.0-20101214.115023-9.jar (assumed that's the snapshot build you mentioned). I am still seeing the lock up running the simple test I attached earlier in this thread, when the counter reached about 20 (doesn't take long). Please suggest if that's the right version to try.

               

              Thanks,

              Lin

              • 19. Re: failure situation using cluster-wide lock
                manik

                I'm not sure if that is the right snapshot, but 4.2.0.CR4 is out.  Could you pls try that?

                • 20. Re: failure situation using cluster-wide lock
                  yelin666

                  Manik,

                   

                  I just tried 4.2.0.CR4, and it locked up as previous versions. It seems one thread was holding the lock and never released it. I am wondering how you verify it is fixed on your side?

                   

                  I understand a cluster-wide lock is not easy to implement. However, with a lock up in a simple test case above, I won't be able to recommend using the lock in a production environment to our team.

                   

                  Thanks,

                  Lin

                  • 21. Re: failure situation using cluster-wide lock
                    manik

                    Hi Lin - we did have a few unit tests for this.  I'll have a look again, but could I ask you to try 4.2.0.FINAL as well?

                    • 22. Re: failure situation using cluster-wide lock
                      yelin666

                      I tried 4.2.0.FINAL, and it's having the same issue with my test attached above. If I kill one instance, it may release the lock and the other instance could get the lock and continue. So it seems one thread in the process being killed was holding the lock.

                       

                      Did you have a unit test for this particular issue? Did it pass? I am just wondering...

                      • 23. Re: failure situation using cluster-wide lock
                        manik

                        This is the test that I use.

                         

                        https://github.com/infinispan/infinispan/blob/master/core/src/test/java/org/infinispan/tx/RemoteLockCleanupStressTest.java

                         

                        I'll give it another run and test for regressions.

                        • 24. Re: failure situation using cluster-wide lock
                          manik

                          I've reopened the JIRA and will investigate further.  I recommend adding a watch on that JIRA.  :-)

                           

                          https://issues.jboss.org/browse/ISPN-777

                          • 25. Re: failure situation using cluster-wide lock
                            yelin666

                            Thanks for the update. Did you reproduce it with the stress test? I hope it'll be fixed in 4.2.1.

                            • 26. failure situation using cluster-wide lock
                              manik

                              Hi - I haven't been able to reproduce this with the stress test as yet.  The test failure you still see - is it still exactly the same?  Or is it some other error that you now see?  Could you share a recent (4.2.0.FINAL) log file?

                              • 27. Re: failure situation using cluster-wide lock
                                yelin666

                                The problem I saw is exactly the same as before. I looked at the code of the stress test you sent, and noticed a major difference between your test and my test. Basically, in your test, only two test threads were started in the main thread, and the CacheManager of one thread got killed and the lock owner was checked to verify the correctness. In my test, I created a timer to repeatedly start the CounterTask at a fixed rate, so more threads got involved in the test. I attached my test code again. It's a standalone Java application, and I run two instances to let them race for the lock. After a while, both of them got locked out, and no one could acquire the lock anymore. The following is the output from one instance.

                                 

                                Create cache manager ...

                                previous settings {}

                                log4j:WARN No appenders could be found for logger (org.infinispan.factories.GlobalComponentRegistry).

                                log4j:WARN Please initialize the log4j system properly.

                                log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

                                get cache

                                memebers [T00696119-60327, T00696119-38912]

                                cache size 1

                                aquiring lock on cache test-grid key locked-counter...

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:1:remote]! Lock held by [GlobalTransaction:<T00696119-60327>:4:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.commands.control.LockControlCommand.perform(LockControlCommand.java:165)

                                        at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:94)

                                        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:177)

                                        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:149)

                                        at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:578)

                                        at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:489)

                                        at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:365)

                                        at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)

                                        at org.jgroups.JChannel.up(JChannel.java:1465)

                                        at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:954)

                                        at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:478)

                                        at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:265)

                                        at org.jgroups.protocols.FRAG2.up(FRAG2.java:190)

                                        at org.jgroups.protocols.FlowControl.up(FlowControl.java:419)

                                        at org.jgroups.protocols.FlowControl.up(FlowControl.java:419)

                                        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:888)

                                        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234)

                                        at org.jgroups.protocols.UNICAST.up(UNICAST.java:310)

                                        at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:806)

                                        at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:671)

                                        at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:132)

                                        at org.jgroups.protocols.FD.up(FD.java:266)

                                        at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:269)

                                        at org.jgroups.protocols.MERGE2.up(MERGE2.java:210)

                                        at org.jgroups.protocols.Discovery.up(Discovery.java:292)

                                        at org.jgroups.protocols.PING.up(PING.java:67)

                                rollback...

                                        at org.jgroups.protocols.MPING.up(MPING.java:175)

                                done rollback

                                        at org.jgroups.protocols.TP.passMessageUp(TP.java:1093)

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                        at org.jgroups.protocols.TP.access$100(TP.java:56)

                                        at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1633)

                                        at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1615)

                                        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

                                        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

                                        at java.lang.Thread.run(Thread.java:619)

                                current value : 6

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 7

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 9

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 11

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 12

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 14

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 16

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 18

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 19

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 21

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                current value : 23

                                commit...

                                done commit

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                2011-01-09 20:54:24,958 [Thread-4] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.TransactionReaper_18] - TransactionReaper::check timeout for TX 0:ffff032b3859:662:4d2a6694:32 in state  RUN

                                2011-01-09 20:54:24,958 [Thread-5] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.BasicAction_58] - Abort of action id 0:ffff032b3859:662:4d2a6694:32 invoked while multiple threads active within it.

                                2011-01-09 20:54:24,958 [Thread-5] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic action 0:ffff032b3859:662:4d2a6694:32 aborting with 1 threads active!

                                2011-01-09 20:54:24,974 [Thread-5] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.TransactionReaper_7] - TransactionReaper::doCancellations worker Thread[Thread-5,5,main] successfully canceled TX 0:ffff032b3859:662:4d2a6694:32

                                java.lang.IllegalStateException: Transaction TransactionImple < ac, BasicAction: 0:ffff032b3859:662:4d2a6694:32 status: ActionStatus.ABORTED > is not in a valid state to be invoking cache operations on.

                                        at org.infinispan.interceptors.TxInterceptor.enlist(TxInterceptor.java:193)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:165)

                                        at org.infinispan.interceptors.TxInterceptor.visitGetKeyValueCommand(TxInterceptor.java:160)

                                        at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:59)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:58)

                                2011-01-09 20:54:26,381 [Timer-0] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.BasicAction_40] - Abort called on already aborted atomic action 0:ffff032b3859:662:4d2a6694:32

                                done rollback

                                tx status at the end : no transaction

                                        at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:88)

                                        at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:59)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.get(CacheDelegate.java:207)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:84)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                aquiring lock on cache test-grid key locked-counter...

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:14:local]! Lock held by [GlobalTransaction:<T00696119-38912>:13:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                done rollback

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:311)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:303)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:82)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:15:local]! Lock held by [GlobalTransaction:<T00696119-38912>:13:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                done rollback

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:311)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:303)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:82)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:16:local]! Lock held by [GlobalTransaction:<T00696119-38912>:13:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                done rollback

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:311)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:303)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:82)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:17:local]! Lock held by [GlobalTransaction:<T00696119-38912>:13:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                done rollback

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:311)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:303)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:82)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:18:local]! Lock held by [GlobalTransaction:<T00696119-38912>:13:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                done rollback

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:311)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:303)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:82)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                2011-01-09 20:55:29,871 [Thread-4] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.TransactionReaper_18] - TransactionReaper::check timeout for TX 0:ffff032b3859:662:4d2a6694:4a in state  RUN

                                2011-01-09 20:55:29,871 [Thread-5] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.BasicAction_58] - Abort of action id 0:ffff032b3859:662:4d2a6694:4a invoked while multiple threads active within it.

                                2011-01-09 20:55:29,871 [Thread-5] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic action 0:ffff032b3859:662:4d2a6694:4a aborting with 1 threads active!

                                2011-01-09 20:55:29,871 [Thread-5] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.TransactionReaper_7] - TransactionReaper::doCancellations worker Thread[Thread-5,5,main] successfully canceled TX 0:ffff032b3859:662:4d2a6694:4a

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:19:local]! Lock held by [GlobalTransaction:<T00696119-38912>:13:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                2011-01-09 20:55:30,027 [Timer-0] WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N - [com.arjuna.ats.arjuna.coordinator.BasicAction_40] - Abort called on already aborted atomic action 0:ffff032b3859:662:4d2a6694:4a

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                done rollback

                                tx status at the end : no transaction

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(aquiring lock on cache test-grid key locked-counter...

                                LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:311)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:303)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:82)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [locked-counter] for requestor [GlobalTransaction:<T00696119-38912>:20:local]! Lock held by [GlobalTransaction:<T00696119-38912>:13:local]

                                        at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:155)

                                        at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:107)

                                        at org.infinispan.interceptors.LockingInterceptor.lockKeysForLockCommand(LockingInterceptor.java:217)

                                        at org.infinispan.interceptors.LockingInterceptor.visitLockControlCommand(LockingInterceptor.java:195)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)

                                        at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:152)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:169)

                                        at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:125)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)

                                        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:87)

                                rollback...

                                done rollback

                                tx status at the end : no transaction

                                aquiring lock on cache test-grid key locked-counter...

                                        at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:63)

                                        at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:143)

                                        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:311)

                                        at org.infinispan.CacheDelegate.lock(CacheDelegate.java:303)

                                        at CacheScheduledCounter$CounterTask.run(CacheScheduledCounter.java:82)

                                        at java.util.TimerThread.mainLoop(Timer.java:512)

                                        at java.util.TimerThread.run(Timer.java:462)

                                • 28. Re: failure situation using cluster-wide lock
                                  manik

                                  I just tried your test.  Here is what I see:

                                   

                                  1. start one instance.  It increments counter to 3 or 4
                                  2. start another instance.  They take turns incrementing the counter.  Occasional timeout exceptions (but this is expected).  In general, the counter is still increasing on both sides.
                                  3. Kill one instance (I specifically kill it just after it prints that it has acquired the lock and is waiting 2 secs)
                                  4. The other instance still continues as per normal.
                                  5. Start the 2nd instance again.  Goto 3.

                                   

                                  I couldn't make it fail?  I did set the networking to use 127.0.0.1 as a bind address and both instances were running on my laptop - although I don't think that would make much of a difference.  Any other suggestions on making this fail?  Did you have more than 2 instances of your program running?

                                   

                                  Cheers

                                  Manik

                                  • 29. failure situation using cluster-wide lock
                                    yelin666

                                    Hi Manik,

                                     

                                    Thanks for the update. I am playing with it right now. Basically, what I got so far is that it doesn't fail with the DummyTransactionManager; However, I got the problem with the JBossStandalonJTAManager. Not sure if it's a TX Manager specific issue. So I am going to do a testing with the TX Manager we used in our container. I am in a whole week meeting at the moment, and will have limited time to test it, but I'll try my best to get back to you as soon as possible.

                                     

                                    Cheers,

                                    Lin