1 Reply Latest reply on Dec 20, 2007 10:02 PM by manik

    Replication exception when node lost

    fredrikj

      Hi.
      We are currently running a setup with JBoss Cache 2.1.0 CR2 and JGroups 2.6.1.

      When one node is disconnected from a cluster of two nodes, the remaining node often gets the exception below.

      Replication exception : org.jboss.cache.ReplicationException: rsp=sender=192.168.1.112:32904, retval=null, received=false, suspected=true
      org.jboss.cache.ReplicationException: rsp=sender=192.168.1.112:32904, retval=null, received=false, suspected=true
      View changed: [192.168.1.135:50649|4] [192.168.1.135:50649]
       at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:2060)
       at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:1952)
       at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:1945)
       at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:2091)
       at org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:70)
       at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:106)
       at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:69)
       at org.jboss.cache.interceptors.ReplicationInterceptor.handleReplicatedMethod(ReplicationInterceptor.java:118)
       at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:90)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
       at org.jboss.cache.interceptors.NotificationInterceptor.invoke(NotificationInterceptor.java:32)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
       at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:298)
       at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:130)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
       at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:107)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
       at org.jboss.cache.interceptors.InvocationContextInterceptor.invoke(InvocationContextInterceptor.java:62)
       at org.jboss.cache.CacheImpl.invokeMethod(CacheImpl.java:4004)
       at org.jboss.cache.CacheImpl.put(CacheImpl.java:1483)
       at org.jboss.cache.CacheImpl.put(CacheImpl.java:1468)
       at org.jboss.cache.UnversionedNode.addChild(UnversionedNode.java:353)
       at CacheTest$Producer.run(CacheTest.java:129)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
       at java.util.concurrent.FutureTask.run(FutureTask.java:123)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
       at java.lang.Thread.run(Thread.java:595)
      Caused by: org.jboss.cache.SuspectException: Suspected member: 192.168.1.112:32904
       at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:2054)
       ... 28 more
      


      It looks like the cache is throwing an exception since the recipient is no longer with us (well, suspected at least).
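      For illustration, here is a minimal sketch of what catching this around the write looks like. ReplicationFailure and CacheWrite are hypothetical stand-ins (not JBoss Cache classes) so that the snippet compiles on its own; in our test the real write is the addChild call in CacheTest$Producer.run seen in the trace above. The sketch only records the failure and carries on. It assumes nothing about whether the write was applied locally or on other members.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicationFailureSketch {

    // Hypothetical stand-in for the replication exception seen in the trace,
    // so the sketch compiles without JBoss Cache on the classpath.
    static class ReplicationFailure extends RuntimeException {
        ReplicationFailure(String message) {
            super(message);
        }
    }

    // Hypothetical wrapper around a single cache write (e.g. adding a child node).
    interface CacheWrite {
        void apply();
    }

    // Runs the write and returns true if it completed without a replication
    // failure. On failure the message is appended to the log and false is
    // returned, so the producer thread keeps running instead of dying.
    static boolean tryWrite(CacheWrite write, List<String> log) {
        try {
            write.apply();
            return true;
        } catch (ReplicationFailure e) {
            log.add("replication failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();

        // Simulated write while the peer is suspected.
        boolean ok = tryWrite(() -> {
            throw new ReplicationFailure("Suspected member: 192.168.1.112:32904");
        }, log);

        System.out.println("write ok: " + ok);
        System.out.println("log: " + log);
    }
}
```

      Whether the caller should retry, ignore, or roll back after such a failure depends on the answer to question 2 below.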

      1. Is this new behavior in 2.1? I can't remember having to catch these exceptions before when a member left the cluster.

      2. What happens if the group has 10 nodes and one member leaves? Will the replication be discarded, will it be applied to all members except the suspected one, or is the outcome undefined? I tried to find documentation on this but could not find any.