2 Replies Latest reply on Dec 18, 2014 11:21 PM by suhasan

    Getting StateTransferInProgressException on a high load

    suhasan

      Under high traffic on a 3-node cluster, with entries being added to the cache, the Infinispan cache starts throwing StateTransferInProgressException on 2 of the nodes after some time. Subsequent operations then fail.

       

      The cluster returns to normal behaviour after restarting a node.

       

      The stack trace is:

       

      org.infinispan.statetransfer.StateTransferInProgressException: Timed out waiting for the state transfer lock, state transfer in progress for view 65

        at org.infinispan.interceptors.StateTransferLockInterceptor.signalStateTransferInProgress(StateTransferLockInterceptor.java:201)

        at org.infinispan.interceptors.StateTransferLockInterceptor.handleWriteCommand(StateTransferLockInterceptor.java:177)

        at org.infinispan.interceptors.StateTransferLockInterceptor.visitRemoveCommand(StateTransferLockInterceptor.java:157)

        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:72)

        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)

        at org.infinispan.interceptors.CacheMgmtInterceptor.visitRemoveCommand(CacheMgmtInterceptor.java:139)

        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:72)

        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)

        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:132)

        at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:91)

        at org.infinispan.commands.AbstractVisitor.visitRemoveCommand(AbstractVisitor.java:67)

        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:72)

        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:345)

        at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1006)

        • 1. Re: Getting StateTransferInProgressException on a high load
          pruivo

          Hi Suhasan,

           

          It looks like you are using ISPN 5. Can you update to the most recent version (currently 7)?

           

          Anyway, it looks like some nodes are leaving the cluster. This can happen if you have long GC pauses and the node stops responding to heartbeats. You can try tuning the GC and/or increasing the timeout in the FD* protocols in your JGroups configuration.
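
To make the reasoning concrete, here is a minimal sketch of why a long GC pause can get a node suspected, and why raising the failure-detection timeout can help. It is a toy model, not JGroups code: the class name, method names, pause values and timeouts are all hypothetical, and the real FD* protocols have more parameters than a single timeout.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of heartbeat-based failure detection. NOT JGroups code. */
final class FailureDetectionSketch {

    /**
     * A node that is silent (e.g. paused by GC) for longer than the
     * detector's timeout is assumed to be suspected and dropped from the view.
     */
    static boolean isSuspected(long pauseMillis, long timeoutMillis) {
        return pauseMillis > timeoutMillis;
    }

    /** Returns the observed pauses (e.g. read from GC logs) that exceed the timeout. */
    static List<Long> pausesExceeding(long[] pausesMillis, long timeoutMillis) {
        List<Long> tooLong = new ArrayList<>();
        for (long pause : pausesMillis) {
            if (isSuspected(pause, timeoutMillis)) {
                tooLong.add(pause);
            }
        }
        return tooLong;
    }

    public static void main(String[] args) {
        // Hypothetical GC pause durations in milliseconds.
        long[] observedPauses = {200, 1_500, 12_000};

        // With a hypothetical 10 s timeout, the 12 s pause gets the node suspected.
        System.out.println(pausesExceeding(observedPauses, 10_000)); // prints [12000]

        // Raising the timeout to 15 s tolerates every observed pause.
        System.out.println(pausesExceeding(observedPauses, 15_000)); // prints []
    }
}
```

Comparing the longest pauses in your GC logs against the configured timeout in this way shows whether tuning the GC, raising the timeout, or both is needed.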

           

          Cheers,

          Pedro

          • 2. Re: Getting StateTransferInProgressException on a high load
            suhasan

            Hi

            Thanks for replying.

             

            Even if a GC pause is too long, the system should return to normal once the GC completes.

            But the issue persists for hours whenever we try to add to the cache.

             

            The issue is not resolved unless one server is restarted.

            I just wanted to confirm whether this is a bug in ISPN 5.

             

            Regards

            Suhasan