Thanks a lot Paul. We ran into this issue because some nodes were running old WF version while some new.
While we are still testing wildfly 10 in our local environment, as I mentioned before we can't update wildfly 9 to wildfly 10 in production environment until next release cycle, I tried your suggestion in wildfly 9 i.e. remove FD and it seems to be working fine as of now. I also changed MERGE protocol from MERGE2 to MERGE3 as it is better i.e.
<subsystem xmlns="urn:jboss:domain:jgroups:4.0">
<channels default="ee">
<channel name="ee" stack="tcp"/>
</channels>
<stacks default="tcp">
<stack name="tcp">
<transport type="TCP" socket-binding="jgroups-tcp"/>
<protocol type="TCPPING">
<property name="initial_hosts">
10.0.1.38[7600],10.0.1.32[7600],10.0.1.39[7600]
</property>
<property name="port_range">
0
</property>
</protocol>
<protocol type="MERGE3"/>
<protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
<protocol type="VERIFY_SUSPECT"/>
<protocol type="pbcast.NAKACK2"/>
<protocol type="UNICAST3"/>
<protocol type="pbcast.STABLE"/>
<protocol type="pbcast.GMS">
<property name="join_timeout">
5000
</property>
</protocol>
<protocol type="MFC"/>
<protocol type="FRAG2"/>
<protocol type="RSVP"/>
</stack>
</stacks>
</subsystem>
However I am running into following stacktrace in some of the nodes in cluster running wildfly 9, which prevents any login or other activity in application server. Basically node just gets locked. Hope it is not because I removed FD and changed protocol from MERGE2 to MERGE3.
/var/log/nms/vmanage-server.log.3:07-Jul-2016 16:52:34,138 ERROR [InvocationContextInterceptor] (default task-182) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key ti1_RSFpouZcoDwSOUHCk1XQep0p8I0Py5yc0QQv and requestor GlobalTransaction:<c1564e13-c3ff-4adb-a008-bfff42c9a23d>:2188872:local. Lock is held by GlobalTransaction:<c1564e13-c3ff-4adb-a008-bfff42c9a23d>:2188718:local, while request came from local
/var/log/nms/vmanage-server.log.3:07-Jul-2016 16:52:34,136 ERROR [InvocationContextInterceptor] (default task-131) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key XBQC-N9AKGDt8nc11bYhDWgAzvZoi5M6-EFb66fs and requestor GlobalTransaction:<c1564e13-c3ff-4adb-a008-bfff42c9a23d>:2188915:local. Lock is held by GlobalTransaction:<c1564e13-c3ff-4adb-a008-bfff42c9a23d>:2188722:local, while request came from local
07-Jul-2016 16:36:07,703 WARN [infinispan] (default task-385) WFLYCLWEBINF0006: Failed to schedule expiration/passivation of session 1RK0J_jLSEXbZi4e8-QgdgDNy6VEbQ3NDTjoqTxs on primary owner.: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: timeout sending message to d57132d8-63f6-4291-93ac-f1911ba8aa06
at org.jgroups.blocks.UnicastRequest.getValue(UnicastRequest.java:203)
at org.jgroups.blocks.UnicastRequest.get(UnicastRequest.java:212)
at org.wildfly.clustering.server.dispatcher.ChannelCommandDispatcher.executeOnNode(ChannelCommandDispatcher.java:151)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager$2.call(InfinispanSessionManager.java:193)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager$2.call(InfinispanSessionManager.java:188)
at org.wildfly.clustering.ee.infinispan.RetryingInvoker.invoke(RetryingInvoker.java:69)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager.executeOnPrimaryOwner(InfinispanSessionManager.java:196)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager.schedule(InfinispanSessionManager.java:181)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager$SchedulableSession.close(InfinispanSessionManager.java:453)
at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:77)
at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:765) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:548) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:329) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:261) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:80) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:172) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:199) [undertow-core-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:774) [undertow-core-1.2.9.Final.jar:1.2.9.Final]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_72]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_72]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_72]
Caused by: java.util.concurrent.TimeoutException: timeout sending message to d57132d8-63f6-4291-93ac-f1911ba8aa06
... 21 more
07-Jul-2016 16:36:07,703 WARN [infinispan] (default task-375) WFLYCLWEBINF0006: Failed to schedule expiration/passivation of session 0psukwiO29DjcpLecIPQldb1NRiTz0zWBMn1O3t0 on primary owner.: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: timeout sending message to d57132d8-63f6-4291-93ac-f1911ba8aa06
at org.jgroups.blocks.UnicastRequest.getValue(UnicastRequest.java:203)
at org.jgroups.blocks.UnicastRequest.get(UnicastRequest.java:212)
at org.wildfly.clustering.server.dispatcher.ChannelCommandDispatcher.executeOnNode(ChannelCommandDispatcher.java:151)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager$2.call(InfinispanSessionManager.java:193)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager$2.call(InfinispanSessionManager.java:188)
at org.wildfly.clustering.ee.infinispan.RetryingInvoker.invoke(RetryingInvoker.java:69)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager.executeOnPrimaryOwner(InfinispanSessionManager.java:196)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager.schedule(InfinispanSessionManager.java:181)
at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager$SchedulableSession.close(InfinispanSessionManager.java:453)
at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:77)
at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:765) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:548) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:329) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:261) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:80) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:172) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:199) [undertow-core-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:774) [undertow-core-1.2.9.Final.jar:1.2.9.Final]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_72]
/Execution error: org.infinispan.util.concurrent.TimeoutException
at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:125) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitPrepareCommand(TransactionSynchronizerInterceptor.java:42) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:200) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.statetransfer.StateTransferInterceptor.visitPrepareCommand(StateTransferInterceptor.java:88) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:123) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:157) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:112) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.transaction.tm.DummyTransaction.finishResource(DummyTransaction.java:367) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.transaction.tm.DummyTransaction.commitResources(DummyTransaction.java:413) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.transaction.tm.DummyTransaction.runCommit(DummyTransaction.java:303) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.transaction.tm.DummyTransaction.commit(DummyTransaction.java:104) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.infinispan.transaction.tm.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:73) [infinispan-core-7.2.3.Final.jar:7.2.3.Final]
at org.wildfly.clustering.ee.infinispan.ActiveTransactionBatch.close(ActiveTransactionBatch.java:48)
at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:78)
at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:765) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:548) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:329) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:261) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:80) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:172) [undertow-servlet-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:199) [undertow-core-1.2.9.Final.jar:1.2.9.Final]
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:774) [undertow-core-1.2.9.Final.jar:1.2.9.Final]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_72]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_72]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_72]
I see there are couple of bugs that are open in this area. Is there some workaround ?
I am using following infinispan config:
<cache-container aliases="singleton cluster" default-cache="default" module="org.wildfly.clustering.server" name="server">
<transport lock-timeout="120000"/>
<replicated-cache mode="ASYNC" name="default">
<state-transfer enabled="true" timeout="300000"/>
<transaction locking="OPTIMISTIC" mode="BATCH"/>
</replicated-cache>
</cache-container>
<cache-container default-cache="session" module="org.wildfly.clustering.web.infinispan" name="web">
<transport lock-timeout="120000"/>
<replicated-cache mode="ASYNC" name="session">
<state-transfer enabled="true" timeout="300000"/>
<locking isolation="READ_COMMITTED"/>
<transaction locking="OPTIMISTIC" mode="BATCH"/>
</replicated-cache>
</cache-container>
Should I add acquire-timeout in locking isolation to some value greater than 15 secs ?
Looking forward to hearing from you soon. This issue is really critical for us.
Thanks
Mohil
PS: Though in our application we are only using above two cache containers, I have not deleted following cache containers from config which comes by default in standalone-viptela.xml.
<cache-container default-cache="default" module="org.wildfly.clustering.server" name="vmanage">
<transport lock-timeout="60000"/>
<replicated-cache mode="ASYNC" name="deviceconnection">
<locking isolation="READ_COMMITTED"/>
<transaction locking="OPTIMISTIC" mode="BATCH"/>
</replicated-cache>
</cache-container>
<cache-container aliases="sfsb" default-cache="dist" module="org.wildfly.clustering.ejb.infinispan" name="ejb">
<transport lock-timeout="60000"/>
<distributed-cache l1-lifespan="0" mode="ASYNC" name="dist" owners="2">
<locking isolation="REPEATABLE_READ"/>
<transaction mode="BATCH"/>
<file-store/>
</distributed-cache>
</cache-container>
<cache-container default-cache="local-query" module="org.hibernate.infinispan" name="hibernate">
<transport lock-timeout="60000"/>
<local-cache name="local-query">
<eviction max-entries="10000" strategy="LRU"/>
<expiration max-idle="100000"/>
</local-cache>
<invalidation-cache mode="SYNC" name="entity">
<transaction mode="NON_XA"/>
<eviction max-entries="10000" strategy="LRU"/>
<expiration max-idle="100000"/>
</invalidation-cache>
<replicated-cache mode="ASYNC" name="timestamps"/>
</cache-container>