Replication exception when node lost
fredrikj Dec 20, 2007 11:41 AM
Hi.
We are currently running a setup with JBoss Cache 2.1.0 CR2 and JGroups 2.6.1.
When we disconnect one node from a two-node cluster, the remaining node often catches the exception below.
Replication exception : org.jboss.cache.ReplicationException: rsp=sender=192.168.1.112:32904, retval=null, received=false, suspected=true
org.jboss.cache.ReplicationException: rsp=sender=192.168.1.112:32904, retval=null, received=false, suspected=true
View changed: [192.168.1.135:50649|4] [192.168.1.135:50649]
    at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:2060)
    at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:1952)
    at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:1945)
    at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:2091)
    at org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:70)
    at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:106)
    at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:69)
    at org.jboss.cache.interceptors.ReplicationInterceptor.handleReplicatedMethod(ReplicationInterceptor.java:118)
    at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:90)
    at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
    at org.jboss.cache.interceptors.NotificationInterceptor.invoke(NotificationInterceptor.java:32)
    at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
    at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:298)
    at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:130)
    at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
    at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:107)
    at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:77)
    at org.jboss.cache.interceptors.InvocationContextInterceptor.invoke(InvocationContextInterceptor.java:62)
    at org.jboss.cache.CacheImpl.invokeMethod(CacheImpl.java:4004)
    at org.jboss.cache.CacheImpl.put(CacheImpl.java:1483)
    at org.jboss.cache.CacheImpl.put(CacheImpl.java:1468)
    at org.jboss.cache.UnversionedNode.addChild(UnversionedNode.java:353)
    at CacheTest$Producer.run(CacheTest.java:129)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
    at java.util.concurrent.FutureTask.run(FutureTask.java:123)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
    at java.lang.Thread.run(Thread.java:595)
Caused by: org.jboss.cache.SuspectException: Suspected member: 192.168.1.112:32904
    at org.jboss.cache.CacheImpl.callRemoteMethods(CacheImpl.java:2054)
    ... 28 more
It looks like the cache is throwing an exception since the recipient is no longer with us (well, suspected at least).
1. Is this new behavior in 2.1? I can't remember having to catch these exceptions before when a member left the cluster.
2. What happens if the group has 10 nodes and one member leaves? Will the replication be discarded, will it reach every member except the suspected one, or is the outcome undefined? I looked for documentation on this but could not find any.
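For reference, here is a minimal sketch of how calling code could tolerate the exception while the cluster view settles. It is self-contained and does not use JBoss Cache at all: ReplicationFailure is a hypothetical stand-in for org.jboss.cache.ReplicationException, and runWithRetry and the attempt limit are made-up names and values. It also assumes that repeating the write is safe, which is not confirmed anywhere above.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReplicationRetrySketch {

    // Hypothetical stand-in for org.jboss.cache.ReplicationException, so the
    // sketch compiles without JBoss Cache on the classpath.
    static class ReplicationFailure extends RuntimeException {
        ReplicationFailure(String message) {
            super(message);
        }
    }

    // Runs the write and retries it when replication fails, up to maxAttempts
    // times in total. Returns the number of the attempt that succeeded and
    // rethrows the last failure if every attempt failed.
    static int runWithRetry(Runnable write, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ReplicationFailure last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                write.run();
                return attempt;
            } catch (ReplicationFailure e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();

        // Simulated write: the first call fails as if a member had just been
        // suspected, later calls succeed as if the new view were installed.
        Runnable write = () -> {
            if (calls.incrementAndGet() == 1) {
                throw new ReplicationFailure("Suspected member: 192.168.1.112:32904");
            }
        };

        int attempts = runWithRetry(write, 3);
        System.out.println("succeeded after " + attempts + " attempt(s)");
        // prints "succeeded after 2 attempt(s)"
    }
}
```

In the real test the Runnable would wrap the addChild call from CacheTest$Producer.run. Whether a retry is appropriate depends on the answer to question 2, since the write may already have reached the surviving members.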