1 Reply Latest reply on Feb 6, 2019 9:36 AM by galder.zamarreno

    org.infinispan.util.concurrent.TimeoutException and system hangs

    padate

      Version : Infinispan 7.2.0 and Jgroups 3.6.1replicated cache

      Replicated cluster with 4 nodes

       

       

       

      We have product which is deployed at multiple customers its working fine on all but one customer . We have seen that at times we get org.infinispan.util.concurrent.TimeoutException . The cluster has 4 nodes and after few days 2 days or 4 days suddenly we see timeoutException . Even if we leave the system idle for few days and then try to access the cache we get the below on all the nodes.

       

      The jgroups cluster does not show error at this time and the view shows its stable and all the participants.The only error we have seen few hours before in jgroups logs is the node was suspected , the particitants confirm back saying they are not dead.

       

       

      Node 4 Infinispan log next time after we try to write to the cache

       

      2019-01-17 10:46:58,999 [DEBUG] from org.infinispan.remoting.transport.jgroups.JGroupsTransport in Incoming-2,shared=tcp - New view accepted: [BCUVWFMWEB03-62554|7] (3) [BCUVWFMWEB03-62554, BCUVWFMWEB04-53724, BCUVWFMWEB02-48506]

      2019-01-17 10:46:58,999 [DEBUG] from org.infinispan.remoting.transport.jgroups.JGroupsTransport in Incoming-2,shared=tcp - Joined: [BCUVWFMWEB02-48506], Left: []

      2019-01-17 10:46:58,999 [INFO] from org.infinispan.remoting.transport.jgroups.JGroupsTransport in Incoming-2,shared=tcp - ISPN000094: Received new cluster view for channel rad-cluster: [BCUVWFMWEB03-62554|7] (3) [BCUVWFMWEB03-62554, BCUVWFMWEB04-53724, BCUVWFMWEB02-48506]

      2019-01-17 10:49:53,056 [ERROR] from org.infinispan.interceptors.InvocationContextInterceptor in play-akka.actor.default-dispatcher-1175 - ISPN000136: Execution error

      • org.infinispan.util.concurrent.TimeoutException: Node BCUVWFMWEB03-62554 timed out

      at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:195) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:562) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:283) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.handleNonTxWriteCommand(BaseDistributionInterceptor.java:273) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.visitPutKeyValueCommand(NonTxDistributionInterceptor.java:93) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.EntryWrappingInterceptor.invokeNextAndApplyChanges(EntryWrappingInterceptor.java:386) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.EntryWrappingInterceptor.setSkipRemoteGetsAndInvokeNextForDataCommand(EntryWrappingInterceptor.java:474) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.EntryWrappingInterceptor.visitPutKeyValueCommand(EntryWrappingInterceptor.java:187) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.locking.AbstractLockingInterceptor.visitPutKeyValueCommand(AbstractLockingInterceptor.java:46) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:35) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:187) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:108) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:159) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:145) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:35) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333) [org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1578) [org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:1056) [org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1048) [org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1041) [org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:251) [org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at com.ge.hcit.id.visualization.Authentication.authenticate(Authentication.java:223) [default.raddesktop-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]

      at Routes$$anonfun$routes$1$$anonfun$applyOrElse$6$$anonfun$apply$6.apply(routes_routing.scala:249) [default.raddesktop-1.0.0-SNAPSHOT.jar:na]

      at Routes$$anonfun$routes$1$$anonfun$applyOrElse$6$$anonfun$apply$6.apply(routes_routing.scala:249) [default.raddesktop-1.0.0-SNAPSHOT.jar:na]

      at play.core.Router$HandlerInvoker$$anon$8$$anon$3.invocation(Router.scala:198) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at play.core.Router$Routes$$anon$1.invocation(Router.scala:377) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at play.core.j.JavaAction$$anon$1.call(JavaAction.scala:56) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at play.core.j.JavaAction$$anon$3.apply(JavaAction.scala:91) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at play.core.j.JavaAction$$anon$3.apply(JavaAction.scala:90) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at play.core.j.FPromiseHelper$$anonfun$flatMap$1.apply(FPromiseHelper.scala:82) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at play.core.j.FPromiseHelper$$anonfun$flatMap$1.apply(FPromiseHelper.scala:82) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251) [org.scala-lang.scala-library-2.10.3.jar:na]

      at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:249) [org.scala-lang.scala-library-2.10.3.jar:na]

      at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) [org.scala-lang.scala-library-2.10.3.jar:na]

      at play.core.j.HttpExecutionContext$$anon$2.run(HttpExecutionContext.scala:37) [com.typesafe.play.play_2.10-2.2.3.jar:2.2.3]

      at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42) [com.typesafe.akka.akka-actor_2.10-2.2.0.jar:2.2.0]

      at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) [com.typesafe.akka.akka-actor_2.10-2.2.0.jar:2.2.0]

      at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [org.scala-lang.scala-library-2.10.3.jar:na]

      at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [org.scala-lang.scala-library-2.10.3.jar:na]

      at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [org.scala-lang.scala-library-2.10.3.jar:na]

      at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [org.scala-lang.scala-library-2.10.3.jar:na]

      Caused by: org.jgroups.TimeoutException: timeout waiting for response from BCUVWFMWEB03-62554, request: org.jgroups.blocks.UnicastRequest@2f85e511, req_id=53, mode=GET_ALL, target=BCUVWFMWEB03-62554

      at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:429) ~[org.jgroups.jgroups-3.6.1.Final.jar:3.6.1.Final]

      at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:374) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:188) ~[org.infinispan.infinispan-core-7.1.0.Final.jar:7.1.0.Final]

      ... 56 common frames omitted

      2019-01-17 15:02:46,465 [DEBUG] from o

       

       

       

      Node 4 jgroup log

       

      Line 131328: 2019-01-16 18:12:10,830 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: received [dst: BCUVWFMWEB04-53724, src: BCUVWFMWEB03-62554 (2 headers), size=0 bytes, flags=INTERNAL], headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: ARE_YOU_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131328: 2019-01-16 18:12:10,830 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: received [dst: BCUVWFMWEB04-53724, src: BCUVWFMWEB03-62554 (2 headers), size=0 bytes, flags=INTERNAL], headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: ARE_YOU_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131329: 2019-01-16 18:12:10,830 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: sending msg to BCUVWFMWEB03-62554, src=BCUVWFMWEB04-53724, headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: I_AM_NOT_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131329: 2019-01-16 18:12:10,830 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: sending msg to BCUVWFMWEB03-62554, src=BCUVWFMWEB04-53724, headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: I_AM_NOT_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131402: 2019-01-16 18:12:25,518 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: received [dst: BCUVWFMWEB04-53724, src: BCUVWFMWEB02-5915 (2 headers), size=0 bytes, flags=INTERNAL], headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: ARE_YOU_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131402: 2019-01-16 18:12:25,518 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: received [dst: BCUVWFMWEB04-53724, src: BCUVWFMWEB02-5915 (2 headers), size=0 bytes, flags=INTERNAL], headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: ARE_YOU_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131403: 2019-01-16 18:12:25,518 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: sending msg to BCUVWFMWEB02-5915, src=BCUVWFMWEB04-53724, headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: I_AM_NOT_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131403: 2019-01-16 18:12:25,518 [DEBUG] from org.jgroups.protocols.TCP in INT-2,shared=tcp - null: sending msg to BCUVWFMWEB02-5915, src=BCUVWFMWEB04-53724, headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: I_AM_NOT_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131426: 2019-01-16 18:12:28,955 [DEBUG] from org.jgroups.protocols.VERIFY_SUSPECT in Timer-3,shared=tcp - verifying that BCUVWFMWEB03-62554 is dead

      Line 131426: 2019-01-16 18:12:28,955 [DEBUG] from org.jgroups.protocols.VERIFY_SUSPECT in Timer-3,shared=tcp - verifying that BCUVWFMWEB03-62554 is dead

      Line 131427: 2019-01-16 18:12:28,955 [DEBUG] from org.jgroups.protocols.TCP in Timer-3,shared=tcp - null: sending msg to BCUVWFMWEB03-62554, src=BCUVWFMWEB04-53724, headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: ARE_YOU_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131427: 2019-01-16 18:12:28,955 [DEBUG] from org.jgroups.protocols.TCP in Timer-3,shared=tcp - null: sending msg to BCUVWFMWEB03-62554, src=BCUVWFMWEB04-53724, headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: ARE_YOU_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131431: 2019-01-16 18:12:29,018 [DEBUG] from org.jgroups.protocols.TCP in INT-1,shared=tcp - null: received [dst: BCUVWFMWEB04-53724, src: BCUVWFMWEB03-62554 (2 headers), size=0 bytes, flags=INTERNAL], headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: I_AM_NOT_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131431: 2019-01-16 18:12:29,018 [DEBUG] from org.jgroups.protocols.TCP in INT-1,shared=tcp - null: received [dst: BCUVWFMWEB04-53724, src: BCUVWFMWEB03-62554 (2 headers), size=0 bytes, flags=INTERNAL], headers are VERIFY_SUSPECT: [VERIFY_SUSPECT: I_AM_NOT_DEAD], TCP: [cluster_name=rad-cluster]

      Line 131432: 2019-01-16 18:12:29,018 [DEBUG] from org.jgroups.protocols.VERIFY_SUSPECT in INT-1,shared=tcp - member BCUVWFMWEB03-62554 was unsuspected