Exception acquiring ownership of X when load balancer moves session. JBoss 7.1.1
safetytrick Nov 1, 2012 1:15 PM
My application is behind a hardware load balancer instead of mod_cluster. The load balancer is configured to pin user sessions to the same JBoss node for as long as possible, but there are some cases where a user's session is moved between nodes. When this happens our logs show "Exception acquiring ownership of X". There is also a long delay for the user, whose request is blocked here:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
org.jgroups.blocks.Request.responsesComplete(Request.java:196)
org.jgroups.blocks.Request.execute(Request.java:89)
org.jgroups.blocks.MessageDispatcher.cast(MessageDispatcher.java:308)
org.jgroups.blocks.mux.MuxRpcDispatcher.cast(MuxRpcDispatcher.java:110)
org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:239)
org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:164)
org.jboss.as.clustering.impl.CoreGroupCommunicationService.callMethodOnCluster(CoreGroupCommunicationService.java:385)
org.jboss.as.clustering.lock.AbstractClusterLockSupport.lock(AbstractClusterLockSupport.java:152)
org.jboss.as.clustering.lock.SharedLocalYieldingClusterLockManager.lock(SharedLocalYieldingClusterLockManager.java:436)
org.jboss.as.clustering.web.infinispan.DistributedCacheManager.acquireSessionOwnership(DistributedCacheManager.java:372)
org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:520)
org.jboss.as.web.session.ClusteredSession.access(ClusteredSession.java:496)
org.apache.catalina.connector.Request.doGetSession(Request.java:2625)
org.apache.catalina.connector.Request.getSession(Request.java:2375)
org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:81)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155)
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:897)
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:626)
org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:2039)
java.lang.Thread.run(Thread.java:662)
A request will be blocked here for 30 seconds before continuing.
This bug looked relevant to me: https://issues.jboss.org/browse/AS7-4260 (Occurrences of "Exception acquiring ownership of X (via SharedLocalYieldingClusterLockManager)").
The fix, however, seems to be retrying the acquire. Wouldn't that make the pause even worse for my users?
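To make the concern about retries concrete, here is a minimal sketch in plain Java. It is not JBoss code: a local ReentrantLock stands in for the cluster-wide session lock, and the class name, the method names, and the count of 3 attempts are all made up for illustration. I have not checked how the actual AS7-4260 fix schedules its retries. The only figure taken from my logs is the 30-second wait per attempt.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names exist in JBoss AS.
class RetryPauseSketch {

    // Upper bound on blocking time: every attempt may wait for the full timeout.
    static long worstCaseBlockMillis(long timeoutMillis, int attempts) {
        return timeoutMillis * attempts;
    }

    // Timed acquire repeated up to `attempts` times; true once the lock is held.
    static boolean acquireWithRetry(Lock lock, long timeoutMillis, int attempts) {
        try {
            for (int i = 0; i < attempts; i++) {
                if (lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and give up.
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Stand-in for the node that still owns the session and never yields.
        Thread owner = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        owner.start();
        held.await();

        // Short timeout so the demo finishes quickly (assumed 50 ms x 3 attempts).
        long start = System.nanoTime();
        boolean acquired = acquireWithRetry(lock, 50, 3);
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("acquired=" + acquired + ", blocked about " + elapsedMillis + " ms");
        System.out.println("bound for 30 s x 3 attempts: "
                + worstCaseBlockMillis(30_000, 3) + " ms");

        release.countDown();
        owner.join();
    }
}
```

If the owning node never gives up the lock, each extra attempt adds another full timeout to the time the request thread is parked, which is why retrying alone looks like it would lengthen the pause rather than shorten it.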
My users' sessions last a long time, and it's quite possible for a single node to become overloaded. Moving some of my users to another node helps manage load, but a 30-second pause for that user is very painful.
What can I do to cut down these pauses?
Message was edited by: Michael Nielson, attached logs.
Attachment: session-acquire.log.zip (1.6 KB)