
    Cluster leaver still owns key?

    mdobrinin

      Hi, I'm having an issue where the owner of a key can leave the cluster, the key is not rehashed to the other nodes, and parts of the system still think the leaver is in the cluster (this is a replicated cache). This causes the node trying to perform an operation (a remove) to forward it to the primary owner, but of course that node has already left! To give some more detail, the remove() operation happens in response to cluster topology change events, so it is clear that the other node has already left, but some other parts of Infinispan must not be aware of it yet. The issue only occurs when going from 2 to 1 nodes, i.e., when the second-to-last node leaves the cluster. Here are some logs demonstrating the issue:

       

      The leaver is removed from the cluster:

          2015-01-30 12:07:09,566 TRACE [41      ] UNICAST2 [] CALVIN-8383: removing non members [CALVIN-44280] []

          2015-01-30 12:07:09,566 TRACE [41      ] UNICAST2 [] CALVIN-8383 --> STABLE(CALVIN-44280: 243-243, conn_id=3) []

          2015-01-30 12:07:09,566 TRACE [41      ] UDP [] CALVIN-8383: sending msg to CALVIN-44280, src=CALVIN-8383, headers are UNICAST2: STABLE, seqno=243, conn_id=3, UDP: [channel_name=MOCA_TEST_CLUSTER] []

          2015-01-30 12:07:09,566 DEBUG [41      ] NAKACK [] CALVIN-8383: removed CALVIN-44280 from xmit_table (not member anymore) []

          2015-01-30 12:07:09,566 TRACE [41      ] FD_SOCK [] adjusted suspected_mbrs: [] []

          2015-01-30 12:07:09,566 DEBUG [41      ] FD_SOCK [] VIEW_CHANGE received: [CALVIN-8383] []

          2015-01-30 12:07:09,570 TRACE [41      ] STABLE [] CALVIN-8383: reset digest to CALVIN-8383: [-1] []

          2015-01-30 12:07:09,570 TRACE [41      ] UFC [] new membership: [CALVIN-8383] []

          2015-01-30 12:07:09,570 TRACE [41      ] MFC [] new membership: [CALVIN-8383] []

          2015-01-30 12:07:09,570 TRACE [41      ] FRAG2 [] CALVIN-8383: removed CALVIN-44280 from fragmentation table []

       

      The remove() command is attempted:

       

          2015-01-30 12:07:09,584 TRACE [170     ] InvocationContextInterceptor [] Invoked with command RemoveCommand{key=InfinispanNode [address=CALVIN-44280], value=null, flags=null, valueMatcher=MATCH_ALWAYS} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@15450db] []

          2015-01-30 12:07:09,584 TRACE [170     ] StateTransferInterceptor [] handleNonTxWriteCommand for command RemoveCommand{key=InfinispanNode [address=CALVIN-44280], value=null, flags=null, valueMatcher=MATCH_ALWAYS} []

          2015-01-30 12:07:09,584 TRACE [170     ] NonTransactionalLockingInterceptor [] Are (CALVIN-8383) we the lock owners for key 'InfinispanNode [address=CALVIN-44280]'? false []

          2015-01-30 12:07:09,584 TRACE [170     ] EntryWrappingInterceptor [] Wrapping entry 'InfinispanNode [address=CALVIN-44280]'? false []

          2015-01-30 12:07:09,584 TRACE [170     ] CallInterceptor [] Executing command: RemoveCommand{key=InfinispanNode [address=CALVIN-44280], value=null, flags=null, valueMatcher=MATCH_ALWAYS}. []

          2015-01-30 12:07:09,584 TRACE [170     ] BaseDistributionInterceptor [] I'm not the primary owner, so sending the command to the primary owner(CALVIN-44280) in order to be forwarded []

          2015-01-30 12:07:09,584 TRACE [170     ] RpcManagerImpl [] CALVIN-8383 invoking RemoveCommand{key=InfinispanNode [address=CALVIN-44280], value=null, flags=null, valueMatcher=MATCH_ALWAYS} to recipient list [CALVIN-44280] with options RpcOptions{timeout=20000, unit=MILLISECONDS, fifoOrder=true, totalOrder=false, responseFilter=null, responseMode=SYNCHRONOUS, skipReplicationQueue=false} []

          2015-01-30 12:07:09,585 TRACE [177     ] JGroupsTransport [] dests=null, command=CacheTopologyControlCommand{cache=moca-current-roles, type=CH_UPDATE, sender=CALVIN-8383, joinInfo=null, topologyId=9, currentCH=ReplicatedConsistentHash{members=[CALVIN-8383], numSegments=60, primaryOwners=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}, pendingCH=null, throwable=null, viewId=7}, mode=ASYNCHRONOUS_WITH_SYNC_MARSHALLING, timeout=20000 []

          2015-01-30 12:07:09,585 TRACE [72      ] GlobalComponentRegistry [] Injecting dependencies for method [public void org.infinispan.topology.CacheTopologyControlCommand.init(org.infinispan.topology.LocalTopologyManager,org.infinispan.topology.ClusterTopologyManager,org.infinispan.topology.RebalancePolicy)] on an instance of [org.infinispan.topology.CacheTopologyControlCommand]. []

          2015-01-30 12:07:09,585 TRACE [177     ] DefaultRebalancePolicy [] Cache moca-current-roles status changed: joiners=[], topology=CacheTopology{id=9, currentCH=ReplicatedConsistentHash{members=[CALVIN-8383], numSegments=60, primaryOwners=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}, pendingCH=null} []

          2015-01-30 12:07:09,585 TRACE [170     ] JGroupsTransport [] dests=[CALVIN-44280], command=SingleRpcCommand{cacheName='moca-node-urls', command=RemoveCommand{key=InfinispanNode [address=CALVIN-44280], value=null, flags=null, valueMatcher=MATCH_ALWAYS}}, mode=SYNCHRONOUS, timeout=20000 []

       

      And eventually the call fails with an error:

          org.infinispan.remoting.transport.jgroups.SuspectException: One or more nodes have left the cluster while replicating command SingleRpcCommand{cacheName='moca-node-urls', command=RemoveCommand{key=InfinispanNode [address=CALVIN-44280], value=null, flags=null, valueMatcher=MATCH_ALWAYS}}

              at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:484) ~[infinispan-core-jda.jar:6.0.2-SNAPSHOT]

              at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:281) [infinispan-core-jda.jar:6.0.2-SNAPSHOT]

              at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.handleNonTxWriteCommand(BaseDistributionInterceptor.java:233) [infinispan-core-jda.jar:6.0.2-SNAPSHOT]

              at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.visitRemoveCommand(NonTxDistributionInterceptor.java:110) [infinispan-core-jda.jar:6.0.2-SNAPSHOT]

        • 1. Re: Cluster leaver still owns key?
          mdobrinin

          The version is 6.0.0, but I believe this happens in multiple versions.

          • 2. Re: Cluster leaver still owns key?
            nadirx

            Can you try to reproduce this with Infinispan 7.1.0.Final, please?

            • 3. Re: Cluster leaver still owns key?
              mdobrinin

              I will be sure to try it. Unfortunately, though, upgrading to Infinispan 7.x won't be possible in my case. Are there any configuration settings that might help with this issue?

              • 4. Re: Cluster leaver still owns key?
                william.burns

                Are you sure the SuspectException was thrown to the client code?  In this version the SuspectException should be caught inside Infinispan and the command retried on the new owner (the exception is still logged at trace level, however).

                 

                Also, I am guessing you find out about the view change through an installed listener? Are you running this remove in the listener callback? No blocking operations should be performed in view change listener callbacks; is the listener set to async?

                 

                And to clarify: when the JGroups view changes, the owners are not yet redetermined. That doesn't happen until Infinispan has gone through its own view change events to figure out the new owner(s).
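
                For reference, a minimal sketch of the kind of async view-change listener being discussed; the class name, cache key type, and cleanup logic are assumptions for illustration, not the original poster's code:

                    import java.util.List;

                    import org.infinispan.Cache;
                    import org.infinispan.notifications.Listener;
                    import org.infinispan.notifications.cachemanagerlistener.annotation.Merged;
                    import org.infinispan.notifications.cachemanagerlistener.annotation.ViewChanged;
                    import org.infinispan.notifications.cachemanagerlistener.event.ViewChangedEvent;
                    import org.infinispan.remoting.transport.Address;

                    // Hypothetical listener: removes cache entries keyed by members that left the cluster.
                    @Listener(sync = false) // async, so the callback does not run on the notification thread
                    public class LeaverCleanupListener {

                        private final Cache<Address, String> nodeCache;

                        public LeaverCleanupListener(Cache<Address, String> nodeCache) {
                            this.nodeCache = nodeCache;
                        }

                        @ViewChanged
                        @Merged
                        public void onViewChanged(ViewChangedEvent event) {
                            List<Address> newMembers = event.getNewMembers();
                            for (Address old : event.getOldMembers()) {
                                if (!newMembers.contains(old)) {
                                    // At this point the cache topology may still list the leaver as
                                    // primary owner, so this remove can be forwarded to it and fail
                                    // with a SuspectException (which Infinispan retries internally).
                                    nodeCache.remove(old);
                                }
                            }
                        }
                    }

                Such a listener would be registered on the cache manager, e.g. cacheManager.addListener(new LeaverCleanupListener(cache)).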

                • 5. Re: Cluster leaver still owns key?
                  mdobrinin

                  Thanks for the reply, Will. Indeed, this code runs in response to the @ViewChanged and @Merged events. The actions taken in response to these events could be slow, but the listener is marked as async. The SuspectException is logged by RpcManagerImpl, and it looks like it then rethrows the exception, but I'm not sure where it lands after that:

                  2015-01-30 12:07:09,585 TRACE [170     ] RpcManagerImpl [] replication exception:  []

                      org.infinispan.remoting.transport.jgroups.SuspectException: One or more nodes have left the cluster while replicating command SingleRpcCommand{cacheName='moca-node-urls', command=RemoveCommand{key=InfinispanNode [address=CALVIN-44280], value=null, flags=null, valueMatcher=MATCH_ALWAYS}}

                  I am wondering whether there is any benefit to handling the Merged event in addition to the ViewChanged event, and whether that could cause any issues. I will see if I can catch the SuspectException from the client code.

                  • 6. Re: Cluster leaver still owns key?
                    william.burns

                    Taking a stab in the dark, but my guess is that you are seeing an issue that was fixed here [1].  This should be resolved in 7.0+ if you are able to upgrade.

                     

                    [1] ISPN-2577 Silence SuspectExceptions · d7257f7 · infinispan/infinispan · GitHub

                    • 7. Re: Cluster leaver still owns key?
                      phillavoie

                      Hi,

                       

                      We are experiencing a very similar issue at work. A cluster node goes down, and when we then attempt to remove a key from a cache, the removal is forwarded remotely to the node that just left:

                       

                      The cluster has only 2 nodes, and this is a replicated cache.

                      We are using Infinispan 7.2.3.

                      The node "leaves" by being killed with "kill -9" on the process; this is a robustness test.

                       

                      Node leaving:

                       

                      11:47:38,195 DEBUG [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,elx74379whc-27681) New view accepted: [elx74379whc-27681|4] (1) [elx74379whc-27681]

                      11:47:38,195 DEBUG [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,elx74379whc-27681) Joined: [], Left: [elx74379whc-12535]

                      11:47:38,195 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,elx74379whc-27681) ISPN000094: Received new cluster view for channel JavaSIP: [elx74379whc-27681|4] (1) [elx74379whc-27681]

                       

                      Key removal:

                       

                      11:47:38,228 TRACE [org.infinispan.interceptors.InvocationContextInterceptor] (transport-thread--p8-t1) Invoked with command RemoveCommand{key=MemberIdentity [internalId=elx74379whc-12535], value=null, flags=[IGNORE_RETURN_VALUES], valueMatcher=MATCH_ALWAYS} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@6fbc6d38]

                      11:47:38,228 TRACE [org.infinispan.statetransfer.StateTransferInterceptor] (transport-thread--p8-t1) handleNonTxWriteCommand for command RemoveCommand{key=MemberIdentity [internalId=elx74379whc-12535], value=null, flags=[IGNORE_RETURN_VALUES], valueMatcher=MATCH_ALWAYS}, topology id -1

                      11:47:38,228 TRACE [org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor] (transport-thread--p8-t1) Are (elx74379whc-27681) we the lock owners for key 'MemberIdentity [internalId=elx74379whc-12535]'? false

                      11:47:38,228 TRACE [org.infinispan.interceptors.EntryWrappingInterceptor] (transport-thread--p8-t1) Wrapping entry 'MemberIdentity [internalId=elx74379whc-12535]'? false

                      11:47:38,228 TRACE [org.infinispan.interceptors.CallInterceptor] (transport-thread--p8-t1) Executing command: RemoveCommand{key=MemberIdentity [internalId=elx74379whc-12535], value=null, flags=[IGNORE_RETURN_VALUES], valueMatcher=MATCH_ALWAYS}.

                      11:47:38,228 TRACE [org.infinispan.interceptors.distribution.BaseDistributionInterceptor] (transport-thread--p8-t1) Command topology id is 6, current topology id is 6, successful? true

                      11:47:38,228 TRACE [org.infinispan.interceptors.distribution.BaseDistributionInterceptor] (transport-thread--p8-t1) I'm not the primary owner, so sending the command to the primary owner(elx74379whc-12535) in order to be forwarded

                      11:47:38,228 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (transport-thread--p8-t1) elx74379whc-27681 invoking RemoveCommand{key=MemberIdentity [internalId=elx74379whc-12535], value=null, flags=[IGNORE_RETURN_VALUES], valueMatcher=MATCH_ALWAYS} to recipient list [elx74379whc-12535] with options RpcOptions{timeout=15000, unit=MILLISECONDS, deliverOrder=PER_SENDER, responseFilter=null, responseMode=ASYNCHRONOUS_WITH_SYNC_MARSHALLING, skipReplicationQueue=false}

                      11:47:38,228 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (transport-thread--p8-t1) dests=[elx74379whc-12535], command=SingleRpcCommand{cacheName='TrafficNodeStore', command=RemoveCommand{key=MemberIdentity [internalId=elx74379whc-12535], value=null, flags=[IGNORE_RETURN_VALUES], valueMatcher=MATCH_ALWAYS}}, mode=ASYNCHRONOUS_WITH_SYNC_MARSHALLING, timeout=15000

                      11:47:38,228 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (transport-thread--p8-t1) Replication task sending SingleRpcCommand{cacheName='TrafficNodeStore', command=RemoveCommand{key=MemberIdentity [internalId=elx74379whc-12535], value=null, flags=[IGNORE_RETURN_VALUES], valueMatcher=MATCH_ALWAYS}} to single recipient elx74379whc-12535 with response mode GET_NONE

                      11:47:38,229 TRACE [org.infinispan.commons.marshall.AdaptiveBufferSizePredictor] (transport-thread--p8-t1) Next predicted buffer size for object type 'org.infinispan.commands.remote.SingleRpcCommand' will be 1408

                      11:47:38,229 TRACE [org.infinispan.marshall.core.VersionAwareMarshaller] (transport-thread--p8-t1) Wrote version 510

                      11:47:38,229 TRACE [org.infinispan.commons.marshall.AdaptiveBufferSizePredictor] (transport-thread--p8-t1) Next predicted buffer size for object type 'org.infinispan.commands.remote.SingleRpcCommand' will be 512

                      11:47:38,229 TRACE [org.infinispan.marshall.core.VersionAwareMarshaller] (transport-thread--p8-t1) Wrote version 510

                      11:47:38,229 TRACE [org.infinispan.commons.marshall.jboss.AbstractJBossMarshaller] (transport-thread--p8-t1) Stop marshaller

                      11:47:38,229 TRACE [org.infinispan.commons.marshall.jboss.AbstractJBossMarshaller] (transport-thread--p8-t1) Stop marshaller

                      11:47:38,229 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (transport-thread--p8-t1) Response: null

                      11:47:38,229 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (transport-thread--p8-t1) Response(s) to SingleRpcCommand{cacheName='TrafficNodeStore', command=RemoveCommand{key=MemberIdentity [internalId=elx74379whc-12535], value=null, flags=[IGNORE_RETURN_VALUES], valueMatcher=MATCH_ALWAYS}} is {}

                      11:47:38,232 TRACE [org.infinispan.interceptors.EntryWrappingInterceptor] (transport-thread--p8-t1) The return value is null

                       

                       

                      Thanks,

                      Phil

                      • 8. Re: Cluster leaver still owns key?
                        rvansa

                        It would be better to start a new thread and link the possibly related one.

                         

                        What you observe in the first log excerpt is the JGroups cluster membership change notification reaching Infinispan, but it is not the actual change of the routing tables within Infinispan - for that you should observe a CacheTopologyControlCommand with type CH_UPDATE and pendingCH=null. So the consistent hash switch is delayed somewhere (I can't tell where without a jstack dump). I also see that the removal happens on a transport thread - do you use removeAsync? If the transport thread pool is depleted and the CH update wants to use it, it could get stuck somewhere.
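
                        As an illustration of keeping such cleanup off the transport thread, here is a small sketch using a dedicated executor; the class name and executor choice are assumptions, not the poster's actual code:

                            import java.util.concurrent.ExecutorService;
                            import java.util.concurrent.Executors;

                            import org.infinispan.Cache;

                            // Hypothetical helper: run the remove() on a dedicated executor so the
                            // transport/notification thread never blocks on a remote RPC.
                            public class OffloadedRemover<K> {

                                private final ExecutorService cleanupExecutor = Executors.newSingleThreadExecutor();
                                private final Cache<K, ?> cache;

                                public OffloadedRemover(Cache<K, ?> cache) {
                                    this.cache = cache;
                                }

                                public void removeLater(K key) {
                                    cleanupExecutor.submit(() -> cache.remove(key));
                                    // cache.removeAsync(key) would also avoid blocking the caller, but the
                                    // resulting RPC can still target the stale primary owner if the
                                    // consistent hash has not been updated yet.
                                }
                            }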

                        • 9. Re: Cluster leaver still owns key?
                          dan.berindei

                          There's less than 50ms between the cluster view change and the RPC being sent to the leaver; I would say this is within expected bounds. If the cache topology isn't updated for a much longer time (say, 500ms), then you may have a problem, and you should open a new thread as Radim suggested.

                          • 10. Re: Cluster leaver still owns key?
                            phillavoie

                            Hi,

                             

                            Thanks for your response. I've opened a new thread here: https://developer.jboss.org/thread/269988?sr=inbox

                             

                            One question, though: if the topology update happens that fast, wouldn't it make more sense to wait for it to complete before firing the cluster view change event? Is there a configuration option for that? Can we register somewhere to be notified once the topology is updated? Something like the sketch below is what I have in mind.
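
                            A rough sketch (assuming a cache-level @TopologyChanged listener is the right tool; the class name and handling are illustrative only):

                                import org.infinispan.notifications.Listener;
                                import org.infinispan.notifications.cachelistener.annotation.TopologyChanged;
                                import org.infinispan.notifications.cachelistener.event.TopologyChangedEvent;

                                // Hypothetical cache-level listener: fires when the cache topology
                                // (the consistent hash), not just the JGroups view, has changed.
                                @Listener
                                public class TopologyUpdateListener {

                                    @TopologyChanged
                                    public void onTopologyChanged(TopologyChangedEvent<?, ?> event) {
                                        if (!event.isPre()) {
                                            // The new owners are in place here; this would be the safe
                                            // point to do the per-member cleanup.
                                            System.out.println("Cache topology updated, id=" + event.getNewTopologyId());
                                        }
                                    }
                                }

                            It would be registered on the cache itself, e.g. cache.addListener(new TopologyUpdateListener()).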

                            Thanks,

                            Phil