0 Replies Latest reply on Aug 2, 2016 2:59 PM by mohilkhare

    Wiledfly9 cluster (Infinispan) : Singleton dying and NAKACK2 WARNINGs

    mohilkhare

      Hi,

       

      We are running wildfly 9 in cluster environment in production. We have latency of 30ms and loss of 0.1% on links between wildfly nodes that are running in cluster. We are using following jgroups and infinispan config:

       

      <stack name="tcp">

                          <transport socket-binding="jgroups-tcp" type="TCP"/>

                          <protocol type="TCPPING">

                              <property name="initial_hosts">

                              ip1[7600],ip2[7600],Ip3[7600]</property>

                              <property name=port_range">

                                  0

                              </property>

                          </protocol>

                          <protocol type="MERGE3"/>

                          <protocol socket-binding="jgroups-tcp-fd" type="FD_SOCK"/>

                          <protocol type="VERIFY_SUSPECT"/>

                          <protocol type="pbcast.NAKACK2"/>

                          <protocol type="UNICAST3"/>

                          <protocol type="pbcast.STABLE"/>

                          <protocol type="pbcast.GMS">

                              <property name="join_timeout">

                                  5000

                              </property>

                          </protocol>

                          <protocol type="MFC"/>

                          <protocol type="FRAG2"/>

                          <protocol type="RSVP"/>

                      </stack>

      .

      .

      .

      <subsystem xmlns="urn:jboss:domain:infinispan:3.0">

                  <cache-container aliases="singleton cluster" default-cache="default" module="org.wildfly.clustering.server" name="server">

                      <transport lock-timeout="120000"/>

                      <replicated-cache mode="ASYNC" name="default">

                          <state-transfer enabled="true" timeout="300000"/>

                          <transaction locking="OPTIMISTIC" mode="BATCH"/>

                      </replicated-cache>

                  </cache-container>

                  <cache-container default-cache="session" module="org.wildfly.clustering.web.infinispan" name="web">

                      <transport lock-timeout="120000"/>

                      <replicated-cache mode="ASYNC" name="session">

                          <state-transfer enabled="true" timeout="300000"/>

                          <locking isolation="READ_COMMITTED"/>

                          <transaction locking="OPTIMISTIC" mode="BATCH"/>

                      </replicated-cache>

                  </cache-container>

      </subsystem>

       

      We are seeing that singleton silently dies after few hours of run and logs are filled with following exceptions and warnings:

       

      Wildfly node that was running singleton ( It became unresponsive after this and reboot was needed to recover ):

       

      01-Aug-2016 12:39:00,897 ERROR [InvocationContextInterceptor] (default task-491) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key 8n3KnLBQPhoJ20vGsEdfSzMhq3BO8LgzYW-iFI-L and requestor GlobalTransaction:<c1564e13-c3ff-4adb-a008-bfff42c9a23d>:172851314:local. Lock is held by null, while request came from local

              at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:198) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLock(LockManagerImpl.java:171) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockKeyAndCheckOwnership(AbstractTxLockingInterceptor.java:183) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAndRegisterBackupLock(AbstractTxLockingInterceptor.java:116) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.acquireAllLocks(OptimisticLockingInterceptor.java:253) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]        at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.visitPrepareCommand(OptimisticLockingInterceptor.java:88) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.NotificationInterceptor.visitPrepareCommand(NotificationInterceptor.java:36) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:138) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:125) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitPrepareCommand(TransactionSynchronizerInterceptor.java:42) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:200) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.statetransfer.StateTransferInterceptor.visitPrepareCommand(StateTransferInterceptor.java:88) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:157) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:112) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.finishResource(DummyTransaction.java:367) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.commitResources(DummyTransaction.java:413) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.runCommit(DummyTransaction.java:303) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.commit(DummyTransaction.java:104) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:73) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.wildfly.clustering.ee.infinispan.ActiveTransactionBatch.close(ActiveTransactionBatch.java:48)

              at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:78)

              at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:765) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:548) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:329) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:261) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:80) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:172) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.server.Connectors.executeRootHandler(Connectors.java:199) [undertow-core-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:774) [undertow-core-1.2.9.Final.jar:1.2.9.Final]

              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_72]

              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_72]

      at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_72]

       

       

       

       

       

       

       

       

       

       

       

      01-Aug-2016 12:39:00,899 ERROR [TransactionCoordinator] (default task-491) ISPN000097: Error while processing a prepare in a single-phase transaction: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key 8n3KnLBQPhoJ20vGsEdfSzMhq3BO8LgzYW-iFI-L and requestor GlobalTransaction:<c1564e13-c3ff-4adb-a008-bfff42c9a23d>:172851314:local. Lock is held by null, while request came from local

              at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:198) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLock(LockManagerImpl.java:171) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockKeyAndCheckOwnership(AbstractTxLockingInterceptor.java:183) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAndRegisterBackupLock(AbstractTxLockingInterceptor.java:116) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.acquireAllLocks(OptimisticLockingInterceptor.java:253) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.visitPrepareCommand(OptimisticLockingInterceptor.java:88) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.NotificationInterceptor.visitPrepareCommand(NotificationInterceptor.java:36) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:138) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:125) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitPrepareCommand(TransactionSynchronizerInterceptor.java:42) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:200) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.statetransfer.StateTransferInterceptor.visitPrepareCommand(StateTransferInterceptor.java:88) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:157) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:112) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.finishResource(DummyTransaction.java:367) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.commitResources(DummyTransaction.java:413) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.runCommit(DummyTransaction.java:303) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyTransaction.commit(DummyTransaction.java:104) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.infinispan.transaction.tm.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:73) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]

              at org.wildfly.clustering.ee.infinispan.ActiveTransactionBatch.close(ActiveTransactionBatch.java:48)

              at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:78)

              at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:765) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:548) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:329) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:261) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:80) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:172) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.server.Connectors.executeRootHandler(Connectors.java:199) [undertow-core-1.2.9.Final.jar:1.2.9.Final]

              at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:774) [undertow-core-1.2.9.Final.jar:1.2.9.Final]

              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_72]

              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_72]

              at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_72]



      Other nodes ( reboot was need to recover ):


      01-Aug-2016 16:35:24,641 WARNING [NAKACK2] (Incoming-8022,ee,b0ca1b4b-25b5-45dd-8f0a-b08be6baa84f) JGRP000011: b0ca1b4b-25b5-45dd-8f0a-b08be6baa84f: dropped message batch from non-member c1564e13-c3ff-4adb-a008-bfff42c9a23d (view=MergeView::[62508e13-31f1-45b5-a80b-b50319158ce1|7] (5) [62508e13-31f1-45b5-a80b-b50319158ce1, 0c00809c-3c19-4d49-9b4e-ffa2b5a8face, b4300e4a-6bca-4d38-aa1a-cb0974d54a93, d57132d8-63f6-4291-93ac-f1911ba8aa06, b0ca1b4b-25b5-45dd-8f0a-b08be6baa84f], 1 subgroups: [62508e13-31f1-45b5-a80b-b50319158ce1|0] (1) [62508e13-31f1-45b5-a80b-b50319158ce1]) (received 8 identical messages from c1564e13-c3ff-4adb-a008-bfff42c9a23d in the last 380657 ms)


      Is it because of aforementioned latency and lossy link ? Can we change some config to overcome it ?


      Thanks

      Mohil


      PS: Note: 1) We are still evaluating wildfly 10 before we upgrade in production and I will really appreciate any help or workaround with wildfly 9.

                 2) In above jgroups config, I removed "FD" to avoid [WFLY-4748] Singleton service fails to start after repetitive cluster split with "Failed to reach quorum of 1" - JBoss I…