3 Replies Latest reply on May 6, 2009 10:13 AM by janush

    Deadlock race-condition committing JBossMQ txn

      Hello.

      We are getting an intermittent exception when our system processes a large number of JBossMQ messages in parallel. Whenever we get this exception, one of our Message Driven Beans becomes deadlocked, and never processes any more messages.

      Our scenario is as follows (JBoss 4.2.3.GA, EJB3):

      - We have 5 separate MDBs processing messages from 5 different queues simultaneously.
      - Each MDB is a singleton on its queue (i.e. maxSession = 1).
      - Each MDB has a large backlog of messages to process, so they are all processing at once.
      - During processing of a message, each MDB calls a SLSB to do some work which updates the database. This SLSB in turn calls another SLSB, which sends a message to a JBossMQ topic (e.g. notifying observers that something has changed in the database).

      This will work fine for a short while, but after a few seconds (maybe 50-100 messages processed) we always get the following exception:


      2008-10-31 11:47:50,440 563340 ERROR [org.jboss.resource.adapter.jms.inflow.JmsServerSession] (WorkManager(4)-124:) org.jboss.resource.adapter.jms.inflow.JmsServerSession@af2931 failed to commit/rollback
      org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=vieo-ws01/7090, BranchQual=, localId=7090] status=STATUS_NO_TRANSACTION; - nested throwable: (java.lang.IllegalMonitorStateException)
      at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:372)
      at org.jboss.tm.TxManager.commit(TxManager.java:240)
      at org.jboss.resource.adapter.jms.inflow.JmsServerSession$XATransactionDemarcationStrategy.end(JmsServerSession.java:494)
      at org.jboss.resource.adapter.jms.inflow.JmsServerSession.run(JmsServerSession.java:248)
      at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:204)
      at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:275)
      at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:761)
      at java.lang.Thread.run(Thread.java:595)
      Caused by: java.lang.IllegalMonitorStateException
      at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:125)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1137)
      at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:431)
      at org.jboss.resource.adapter.jms.JmsManagedConnection.unlock(JmsManagedConnection.java:416)
      at org.jboss.resource.adapter.jms.JmsXAResource.prepare(JmsXAResource.java:89)
      at org.jboss.resource.connectionmanager.xa.JcaXAResourceWrapper.prepare(JcaXAResourceWrapper.java:93)
      at org.jboss.tm.TransactionImpl$Resource.prepare(TransactionImpl.java:2212)
      at org.jboss.tm.TransactionImpl.prepareResources(TransactionImpl.java:1660)
      at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:347)
      ... 7 more


      At this point, the affected MDB becomes deadlocked, and fails to process any more messages. If we leave the system running, then eventually more MDBs fail with the same exception.

      Notes:

      1. If we only run ONE MDB, then we never get this exception! However as soon as we introduce even a second MDB, we start seeing these failures again.

      2. The deadlock point seems to be when the code tries to return from the first SLSB call - i.e. execution never returns to the MDB onMessage() method context.

      3. JBossMQ and the SLSB EntityManager use the same datasource (Postgres 8.2 as a local-tx-datasource).

      4. This happens with both persistent and non-persistent messages.

      5. If we mark the second nested SLSB (the one that sends a message to the JMS Topic) with TransactionAttributeType.NOT_SUPPORTED then we don't get this exception! This is not a viable solution however, as we need any messages sent during processing to be rolled back if an exception is thrown.

      My guess is that this problem has something to do with the SLSB committing its transaction (at which point the nested JMS messages would also need to be sent?). As it is unpredictably intermittent and also works when single threaded, I think there must be some thread race-condition in the JMS locking mechanism.
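
      For anyone reading the trace above: the innermost frames show ReentrantLock.unlock() throwing IllegalMonitorStateException, which is what the JDK does when the calling thread does not own the lock. The following is a minimal sketch in plain Java (Java 8+), not JBoss code. The class and method names are made up for illustration, and it only assumes that something similar happens inside JmsManagedConnection. It shows that the failed unlock() leaves the lock held by its real owner, which is consistent with a consumer that later blocks forever.

      ```java
      import java.util.concurrent.atomic.AtomicReference;
      import java.util.concurrent.locks.ReentrantLock;

      // Hypothetical demo class; nothing here comes from the JBoss sources.
      class UnlockByNonOwnerDemo {

          /** What the non-owning thread saw, and the lock state right after. */
          static final class Result {
              final Throwable thrownInOtherThread;
              final boolean stillLockedAfterwards;

              Result(Throwable thrownInOtherThread, boolean stillLockedAfterwards) {
                  this.thrownInOtherThread = thrownInOtherThread;
                  this.stillLockedAfterwards = stillLockedAfterwards;
              }
          }

          static Result run() {
              ReentrantLock lock = new ReentrantLock();
              AtomicReference<Throwable> thrown = new AtomicReference<>();

              lock.lock(); // the calling thread is now the owner
              try {
                  Thread other = new Thread(() -> {
                      try {
                          lock.unlock(); // this thread never acquired the lock
                      } catch (IllegalMonitorStateException e) {
                          thrown.set(e);
                      }
                  });
                  other.start();
                  other.join();
                  // The failed unlock() did not release anything.
                  return new Result(thrown.get(), lock.isLocked());
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                  throw new IllegalStateException("interrupted while waiting", e);
              } finally {
                  lock.unlock(); // released by the real owner
              }
          }

          public static void main(String[] args) {
              Result r = run();
              System.out.println("thrown in other thread: " + r.thrownInOtherThread);
              System.out.println("lock still held afterwards: " + r.stillLockedAfterwards);
          }
      }
      ```

      If the lock in the resource adapter is acquired on one thread and released on another, as could happen when several MDB sessions share pooled connections, the same pattern would explain both the exception and the subsequent hang.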

      Any help here would be greatly appreciated!
      Thanks
      Scott

        • 1. Re: Deadlock race-condition committing JBossMQ txn

          OK I have found a solution!

          The deadlocking seems to be caused by this bug: https://jira.jboss.org/jira/browse/JBAS-5801

          As JBoss AS 4.2.4.GA is not available yet, I manually backported the fixes from the 4.2.4.GA branch into my JBoss 4.2.3.GA sources (building and updating the jms-ra.rar and jboss-xa-jdbc.rar archives accordingly):

          http://fisheye.jboss.org/changelog/JBossAS/?cs=76314

          This seems to have fixed the IllegalMonitorStateException problem and has stopped my MDBs from deadlocking.

          We are now looking forward to an official JBoss 4.2.4.GA release so we don't have to ship with a patched server...

          • 2. Re: Deadlock race-condition committing JBossMQ txn

            Or just use the workaround, i.e. add

            <track-connection-by-tx/>

            to the tx-connection-factory in jms-ds.xml
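
            As a rough sketch of where the element goes, assuming a stock jms-ds.xml in which the factory is bound as JmsXA (both the JNDI name and the position directly after xa-transaction are assumptions here, so check them against your own file and the datasource DTD shipped with your server):

            ```xml
            <tx-connection-factory>
              <jndi-name>JmsXA</jndi-name>
              <xa-transaction/>
              <!-- workaround: keep the connection associated with the transaction -->
              <track-connection-by-tx/>
              <!-- leave the remaining elements of your existing factory definition unchanged -->
            </tx-connection-factory>
            ```

            The server has to redeploy jms-ds.xml for the change to take effect.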

            • 3. Re: Deadlock race-condition committing JBossMQ txn
              janush

              Adrian, I applied the changes from JBAS-5801 because I was also getting IllegalMonitorStateExceptions in JBoss 4.2.3.
              However, java.lang.IllegalMonitorStateException still appears, now from a different method (JmsXAResource.end):

              java.lang.IllegalMonitorStateException
               at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:127)
               at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1175)
               at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:431)
               at org.jboss.resource.adapter.jms.JmsManagedConnection.unlock(JmsManagedConnection.java:416)
               at org.jboss.resource.adapter.jms.JmsXAResource.end(JmsXAResource.java:76)
               at org.jboss.resource.connectionmanager.xa.JcaXAResourceWrapper.end(JcaXAResourceWrapper.java:58)
               at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.endSuspendedRMs(TransactionImple.java:1529)
               at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commit(TransactionImple.java:235)
               at org.jboss.ejb.plugins.TxInterceptorCMT.endTransaction(TxInterceptorCMT.java:501)
               at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:361)
               at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:181)
               at org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:168)
               at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:205)
               at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:138)
               at org.jboss.ejb.SessionContainer.internalInvoke(SessionContainer.java:648)
               at org.jboss.ejb.Container.invoke(Container.java:960)
               at org.jboss.ejb.plugins.local.BaseLocalProxyFactory.invoke(BaseLocalProxyFactory.java:430)
               at org.jboss.ejb.plugins.local.StatelessSessionProxy.invoke(StatelessSessionProxy.java:103)
              

              Is using track-connection-by-tx the only way to avoid it?