0 Replies Latest reply on Mar 8, 2018 10:05 AM by vikrant02

    Infinispan not responding with TimeoutException

    vikrant02

      Hi,

       

      I am running two Infinispan clusters, each with 3 nodes, joined in a cross-site cluster on the OpenShift platform. Recently I have seen the following recurring error, after which the whole cluster in one data center stops responding while the other data center keeps running. The only way to recover is to reboot the whole cluster.

       

      WARN  [org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler] (jgroups-13,_cache-am-0) ISPN000071: Caught exception when handling command SingleXSiteRpcCommand{command=PutKeyValueCommand{key=WrappedByteArray{bytes=[B0x033E243734626461..[39], hashCode=1236866894}, value=WrappedByteArray{bytes=[B0x03040B000000446F..[1016], hashCode=0}, flags=[], commandInvocationId=CommandInvocation:33409d6a-b774-aa1b-742e-2fe082901bf8:3568211, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=88200000, version=NumericVersion{version=562962838410850}}, successful=true, topologyId=-1}}: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 5946345 from cache-am-2
      at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:259)
      at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1661)
      at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1309)
      at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:621)
      at org.infinispan.cache.impl.AbstractDelegatingAdvancedCache.put(AbstractDelegatingAdvancedCache.java:318)
      at org.infinispan.cache.impl.EncoderCache.put(EncoderCache.java:438)
      at org.infinispan.xsite.BaseBackupReceiver$BackupCacheUpdater.visitPutKeyValueCommand(BaseBackupReceiver.java:110)
      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:67)
      at org.infinispan.xsite.BaseBackupReceiver.handleRemoteCommand(BaseBackupReceiver.java:76)
      at org.infinispan.xsite.SingleXSiteRpcCommand.performInLocalSite(SingleXSiteRpcCommand.java:37)
      at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler.runXSiteReplicableCommand(GlobalInboundInvocationHandler.java:131)
      at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler.handleFromRemoteSite(GlobalInboundInvocationHandler.java:97)
      at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processRequest(JGroupsTransport.java:1302)
      at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1235)
      at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$200(JGroupsTransport.java:121)
      at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.receive(JGroupsTransport.java:1366)
      at org.jgroups.JChannel.up(JChannel.java:819)
      at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:134)
      at org.jgroups.stack.Protocol.up(Protocol.java:340)
      at org.jgroups.protocols.FORK.up(FORK.java:134)
      at org.jgroups.protocols.relay.RELAY2.deliver(RELAY2.java:645)
      at org.jgroups.protocols.relay.RELAY2.route(RELAY2.java:542)
      at org.jgroups.protocols.relay.RELAY2.handleMessage(RELAY2.java:517)
      at org.jgroups.protocols.relay.RELAY2.handleRelayMessage(RELAY2.java:498)
      at org.jgroups.protocols.relay.Relayer$Bridge.receive(Relayer.java:200)
      at org.jgroups.ReceiverAdapter.receive(ReceiverAdapter.java:24)
      at org.jgroups.JChannel.up(JChannel.java:846)
      at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:896)
      at org.jgroups.protocols.RSVP.up(RSVP.java:233)
      at org.jgroups.protocols.FRAG3.up(FRAG3.java:190)
      at org.jgroups.protocols.FlowControl.up(FlowControl.java:416)
      at org.jgroups.stack.Protocol.up(Protocol.java:372)
      at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:293)
      at org.jgroups.protocols.UNICAST3.deliverBatch(UNICAST3.java:1024)
      at org.jgroups.protocols.UNICAST3.removeAndDeliver(UNICAST3.java:833)
      at org.jgroups.protocols.UNICAST3.handleBatchReceived(UNICAST3.java:799)
      at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:470)
      at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:697)
      at org.jgroups.protocols.Encrypt.up(Encrypt.java:190)
      at org.jgroups.stack.Protocol.up(Protocol.java:372)
      at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:212)
      at org.jgroups.stack.Protocol.up(Protocol.java:372)
      at org.jgroups.stack.Protocol.up(Protocol.java:372)
      at org.jgroups.stack.Protocol.up(Protocol.java:372)
      at org.jgroups.protocols.TP.passBatchUp(TP.java:1255)
      at org.jgroups.util.MaxOneThreadPerSender$BatchHandlerLoop.passBatchUp(MaxOneThreadPerSender.java:284)
      at org.jgroups.util.SubmitToThreadPool$BatchHandler.run(SubmitToThreadPool.java:136)
      at org.jgroups.util.MaxOneThreadPerSender$BatchHandlerLoop.run(MaxOneThreadPerSender.java:273)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      
      
      [org.infinispan.interceptors.impl.InvocationContextInterceptor] (timeout-thread--p3-t1) ISPN000136: Error executing command PutKeyValueCommand, writing keys [WrappedByteArray{bytes=[B0x033E243734626461..[39], hashCode=1236866894}]: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 5946345 from cache-am-2
      at org.infinispan.remoting.transport.impl.SingleTargetRequest.onTimeout(SingleTargetRequest.java:64)
      at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:86)
      at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:21)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
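
      To check whether the timeouts always point at the same node (cache-am-2 in the traces above), the server logs can be scanned for ISPN000476 entries. The following is only a minimal sketch, assuming the message format shown above; the class and method names (TimeoutTally, byNode) are hypothetical helpers and not part of Infinispan.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: groups ISPN000476 timeouts by the node that did not answer.
final class TimeoutTally {
    // Assumes the message format seen in the log excerpts above.
    private static final Pattern TIMEOUT = Pattern.compile(
        "ISPN000476: Timed out waiting for responses for request (\\d+) from (\\S+)");

    // Returns node name -> distinct request ids that timed out against that node.
    // The same request is usually logged more than once (WARN plus the follow-up
    // entry), so ids are collected in a set to avoid double counting.
    static Map<String, Set<String>> byNode(List<String> logLines) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = TIMEOUT.matcher(line);
            while (m.find()) {
                result.computeIfAbsent(m.group(2), k -> new LinkedHashSet<>())
                      .add(m.group(1));
            }
        }
        return result;
    }
}
```

      Feeding it the lines of a log file (for example via java.nio.file.Files.readAllLines) gives one entry per unresponsive node. If a single node accounts for nearly all timed-out requests, that node is the one to inspect first.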
      

       

      Does anyone have any idea what could cause this error and how to avoid it? It doesn't happen very often and I haven't been able to reproduce it on my own, but whenever it happens, only a reboot of the whole cluster resolves the issue.

      Any suggestions or insights are appreciated.

       

      Thanks,

      Vikrant