3 Replies Latest reply on Dec 22, 2010 11:48 AM by galder.zamarreno

    HAPartition deadlocked

    lexsoto

      Hello,

       

      I am running JBoss 5.1.0.GA.  The web application and the EJBs are on separate servers in the same cluster.  A deadlock occurred where the web application blocked when looking up the EJBs in JNDI.

       

       

      A thread dump from the deadlocked web application server shows multiple threads with the same trace:

       

      - waiting on <0x59cfe529> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
      sun.misc.Unsafe.park(Native Method)
      java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
      java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
      java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
      java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
      java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
      java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
      org.jboss.seam.Component.getInstanceFromFactory(Component.java:2082)
      

       

       

      They are all waiting on synchronization object 0x59cfe529. 

      The one thread that appears to be holding this lock shows:

       

      - waiting on <0x7091fea5> (a org.jboss.ha.framework.server.ClusterPartition$ThreadGate)
      java.lang.Object.wait(Native Method)
      org.jboss.ha.framework.server.ClusterPartition$ThreadGate.await(ClusterPartition.java:2336)
      org.jboss.ha.framework.server.ClusterPartition.callMethodOnCluster(ClusterPartition.java:1089)
      org.jboss.ha.framework.server.ClusterPartition.callMethodOnCluster(ClusterPartition.java:1074)
      org.jboss.ha.jndi.HAJNDI.lookupRemotely(HAJNDI.java:248)
      org.jboss.ha.jndi.HAJNDI.lookup(HAJNDI.java:206)
      sun.reflect.GeneratedMethodAccessor624.invoke(Unknown Source)
      sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      java.lang.reflect.Method.invoke(Method.java:597)
      org.jboss.ha.framework.interfaces.HARMIClient.invoke(HARMIClient.java:318)
      $Proxy424.lookup(Unknown Source)
      org.jnp.interfaces.NamingContext.lookup(NamingContext.java:726)
      org.jnp.interfaces.NamingContext.lookup(NamingContext.java:686)
      javax.naming.InitialContext.lookup(InitialContext.java:392)
      org.jnp.interfaces.NamingContext.resolveLink(NamingContext.java:1346)
      org.jnp.interfaces.NamingContext.lookup(NamingContext.java:817)
      org.jnp.interfaces.NamingContext.lookup(NamingContext.java:833)
      org.jnp.interfaces.NamingContext.lookup(NamingContext.java:686)
      javax.naming.InitialContext.lookup(InitialContext.java:392)
      
      ........
       
      
      
      Locked synchronizers : 
      - locked <0x59cfe529> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
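
The matching done by hand above (find the thread whose "Locked synchronizers" entry is the lock everyone else is waiting on) can also be done from inside the JVM with the standard java.lang.management API. This is only a minimal sketch, assuming Java 8 or later and a JVM that supports synchronizer monitoring. The class name LockOwnerReport, the method names, and the two demo threads "holder" and "waiter" are hypothetical; nothing here is JBoss or Seam code.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class LockOwnerReport {

    /** Lists threads that wait on a synchronizer currently held by another thread. */
    public static List<String> findWaiters() {
        // true, true: include locked monitors and locked ownable synchronizers
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(true, true);

        // Identity hash of each held synchronizer -> name of the owning thread.
        // (This is the hex id printed in the dump; collisions are possible but rare.)
        Map<Integer, String> ownerByLock = new HashMap<>();
        for (ThreadInfo info : infos) {
            for (LockInfo heldLock : info.getLockedSynchronizers()) {
                ownerByLock.put(heldLock.getIdentityHashCode(), info.getThreadName());
            }
        }

        List<String> report = new ArrayList<>();
        for (ThreadInfo info : infos) {
            LockInfo wanted = info.getLockInfo(); // null if the thread is not waiting
            if (wanted == null) {
                continue;
            }
            String owner = ownerByLock.get(wanted.getIdentityHashCode());
            if (owner != null) {
                report.add(info.getThreadName() + " waits on " + wanted + " held by " + owner);
            }
        }
        return report;
    }

    /** Hypothetical demo: "holder" keeps a lock while blocked, "waiter" queues on it. */
    public static List<String> demo() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await(); // stands in for the blocking remote lookup
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.unlock();
        }, "waiter");

        holder.start();
        held.await();
        waiter.start();
        // Wait until the waiter is actually parked inside lock().
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }

        List<String> report = findWaiters();
        release.countDown();
        holder.join();
        waiter.join();
        return report;
    }

    public static void main(String[] args) throws InterruptedException {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

Note that a check like this only reports who holds what. The situation in the dump is not a lock cycle, so a cycle detector such as ThreadMXBean.findDeadlockedThreads() would not be expected to flag it.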
      

       

       

      I thought about enabling deadlock detection on the HAPartition configuration, but I found that this is discouraged here: https://jira.jboss.org/browse/JBAS-5821

       

      This JIRA entry suggests:

      Use of properly sized thread pools does the same job

       

      So should I increase the thread pool?

      How do I go about avoiding this deadlock problem?

       

       

      Any hints will be greatly appreciated.

      TIA

        • 1. Re: HAPartition deadlocked
          galder.zamarreno

          Hmmm, you're only showing part of the problem there. Clearly, the lookup cannot be resolved locally, so it tries to query the cluster for an EJB binding. But what is the node that contains that binding doing? Did you get a thread dump from the other node? One workaround is to deploy all the EJBs on every node; the lookups are then resolved locally and no cluster-wide calls are made.

          • 2. Re: HAPartition deadlocked
            lexsoto

            Firstly, thank you for replying. 

             

            I am no longer experiencing this problem. I think the problem was somehow related to Seam component/factory scopes.

             

            I changed the scope of the class containing factory methods from Application to Event scope.  I was also caching the EJB references in the application scope; I am not doing that anymore.

             

             

            That said, I can't explain why the deadlock occurred, and I am concerned that the problem may come back.  The code in ClusterPartition.java seems to have a timeout on the wait call:

             

            org.jboss.ha.framework.server.ClusterPartition$ThreadGate.await(ClusterPartition.java:2336)

             

            I would expect this wait to be interrupted after some time, but it didn't; it got stuck at that point.

            Can you confirm whether this call is supposed to time out?

             

            Thanks

            • 3. Re: HAPartition deadlocked
              galder.zamarreno

              Alex, this is open source, so you can actually check this yourself:

               

              http://anonsvn.jboss.org/repos/jbossas/tags/JBoss_5_1_0_GA/cluster/src/main/org/jboss/ha/framework/server/ClusterPartition.java

               

              As you can see from the code, the wait is timed.
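
For readers who do not want to open the source, here is what a timed wait generally looks like in Java. A timed Object.wait returns on its own once the timeout elapses; no interrupt is involved. The sketch below is a hypothetical gate written purely for illustration: it is not the ThreadGate code from ClusterPartition.java, and the names TimedGate, open, and await are assumptions.

```java
public class TimedGate {
    private boolean open;

    /** Opens the gate and wakes every waiting thread. */
    public synchronized void open() {
        open = true;
        notifyAll();
    }

    /**
     * Waits until the gate opens or timeoutMillis elapses.
     * Returns true if the gate is open, false on timeout.
     */
    public synchronized boolean await(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1000000L;
        // Loop to guard against spurious wakeups.
        while (!open) {
            long remainingMillis = (deadline - System.nanoTime()) / 1000000L;
            if (remainingMillis <= 0) {
                return false; // timed out; also avoids wait(0), which waits forever
            }
            wait(remainingMillis);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        TimedGate gate = new TimedGate();
        System.out.println(gate.await(100)); // gate was never opened: prints false
        gate.open();
        System.out.println(gate.await(100)); // gate is open: prints true
    }
}
```

A thread can still show up in a dump inside such a wait if the timeout is long or if the caller waits again after each timeout. Whether either applies to the trace in this thread is not established here.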