3 Replies Latest reply on May 6, 2009 10:13 AM by janush

Deadlock race-condition committing JBossMQ txn

scotto Oct 30, 2008 11:41 PM

Hello.

We are getting an intermittent exception when our system processes a large number of JBossMQ messages in parallel. Whenever we get this exception, one of our Message Driven Beans becomes deadlocked, and never processes any more messages.

Our scenario is as follows (JBoss 4.2.3.GA, EJB3):

- We have 5 separate MDBs processing messages from 5 different queues simultaneously.
- Each MDB is a singleton on its queue (i.e. maxSession = 1).
- Each MDB has a large backlog of messages to process, so they are all processing at once.
- During processing of a message, each MDB calls a SLSB to do some work which updates the database. This SLSB will then in turn call another SLSB which sends a message to a JBossMQ topic (e.g. notifying observers that something has changed in the database).

This will work fine for a short while, but after a few seconds (maybe 50-100 messages processed) we always get the following exception:

2008-10-31 11:47:50,440 563340 ERROR [org.jboss.resource.adapter.jms.inflow.JmsServerSession] (WorkManager(4)-124:) org.jboss.resource.adapter.jms.inflow.JmsServerSession@af2931 failed to commit/rollback
org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=vieo-ws01/7090, BranchQual=, localId=7090] status=STATUS_NO_TRANSACTION; - nested throwable: (java.lang.IllegalMonitorStateException)
at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:372)
at org.jboss.tm.TxManager.commit(TxManager.java:240)
at org.jboss.resource.adapter.jms.inflow.JmsServerSession$XATransactionDemarcationStrategy.end(JmsServerSession.java:494)
at org.jboss.resource.adapter.jms.inflow.JmsServerSession.run(JmsServerSession.java:248)
at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:204)
at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:275)
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:761)
at java.lang.Thread.run(Thread.java:595)
Caused by: java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:125)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1137)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:431)
at org.jboss.resource.adapter.jms.JmsManagedConnection.unlock(JmsManagedConnection.java:416)
at org.jboss.resource.adapter.jms.JmsXAResource.prepare(JmsXAResource.java:89)
at org.jboss.resource.connectionmanager.xa.JcaXAResourceWrapper.prepare(JcaXAResourceWrapper.java:93)
at org.jboss.tm.TransactionImpl$Resource.prepare(TransactionImpl.java:2212)
at org.jboss.tm.TransactionImpl.prepareResources(TransactionImpl.java:1660)
at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:347)
... 7 more

At this point, the affected MDB becomes deadlocked, and fails to process any more messages. If we leave the system running, then eventually more MDBs fail with the same exception.

Notes:

1. If we only run ONE MDB, then we never get this exception! However as soon as we introduce even a second MDB, we start seeing these failures again.

2. The deadlock point seems to be when the code tries to return from the first SLSB call - i.e. execution never returns to the MDB onMessage() method context.

3. JBossMQ and the SLSB EntityManager use the same datasource (Postgres 8.2 as a local-tx-datasource).

4. This happens with both persistent and non-persistent messages.

5. If we mark the second nested SLSB (the one that sends a message to the JMS Topic) with TransactionAttributeType.NOT_SUPPORTED then we don't get this exception! This is not a viable solution however, as we need any messages sent during processing to be rolled back if an exception is thrown.

My guess is that this problem has something to do with the SLSB committing its transaction (at which point the nested JMS messages would also need to be sent?). As it is unpredictably intermittent and also works when single threaded, I think there must be some thread race-condition in the JMS locking mechanism.
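For anyone reading the trace above: the nested IllegalMonitorStateException is what java.util.concurrent.locks.ReentrantLock throws when unlock() is called by a thread that does not currently hold the lock. The following is only a minimal standalone sketch of that behaviour. It is not JBoss code, the class and method names are made up for illustration, and whether this is exactly what happens inside JmsManagedConnection is an assumption on my part.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of JBoss.
class UnlockFromWrongThread {

    // Locks on the calling thread, then tries to unlock from a second thread.
    // Returns whatever the second thread's unlock() call threw (or null).
    static Throwable unlockFromOtherThread() {
        final ReentrantLock lock = new ReentrantLock();
        final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();

        lock.lock(); // the calling thread is now the owner
        try {
            Thread other = new Thread(() -> {
                try {
                    lock.unlock(); // this thread never acquired the lock
                } catch (IllegalMonitorStateException e) {
                    failure.set(e);
                }
            });
            other.start();
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            lock.unlock(); // the real owner can still release normally
        }
        return failure.get();
    }

    public static void main(String[] args) {
        System.out.println("Unlock from non-owner thread threw: " + unlockFromOtherThread());
    }
}
```

If the resource adapter locks the managed connection on one thread and the transaction manager ends up calling unlock() on a path where the lock is not held by the current thread, the same exception would surface, which would fit the intermittent, multi-threaded-only nature of the failure.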

Any help here would be greatly appreciated!
Thanks
Scott

1. Re: Deadlock race-condition committing JBossMQ txn

scotto Nov 4, 2008 3:18 AM (in response to scotto)

OK I have found a solution!

The deadlocking seems to be caused by this bug: https://jira.jboss.org/jira/browse/JBAS-5801

As JBoss AS 4.2.4.GA is not available yet, I manually backported the fixes from the 4.2.4.GA branch into my JBoss 4.2.3.GA sources (building and updating the jms-ra.rar and jboss-xa-jdbc.rar archives accordingly):

http://fisheye.jboss.org/changelog/JBossAS/?cs=76314

This seems to have fixed the IllegalMonitorStateException problem and has stopped my MDBs from deadlocking.

We are now looking forward to an official JBoss 4.2.4.GA release so we don't have to ship with a patched server...
2. Re: Deadlock race-condition committing JBossMQ txn

adrian.brock Nov 19, 2008 9:01 AM (in response to scotto)
Or just use the workaround, i.e. add

<track-connection-by-tx/>

to the tx-connection-factory in jms-ds.xml
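
As a rough sketch of where the element goes, assuming the stock JmsXA factory definition from a default 4.2.x install: only the track-connection-by-tx element comes from this thread, and the surrounding elements and their order are assumptions that should be checked against your own jms-ds.xml.

```xml
<!-- jms-ds.xml (sketch): surrounding elements assumed from a default install -->
<tx-connection-factory>
  <jndi-name>JmsXA</jndi-name>
  <xa-transaction/>
  <track-connection-by-tx/>  <!-- the workaround -->
  <rar-name>jms-ra.rar</rar-name>
  <!-- remaining elements (connection-definition, config-property, pool settings, ...) left unchanged -->
</tx-connection-factory>
```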

3. Re: Deadlock race-condition committing JBossMQ txn

janush May 6, 2009 10:13 AM (in response to scotto)

Adrian, I applied the changes from JBAS-5801 because I was also getting IllegalMonitorStateExceptions in JBoss 4.2.3.
However, java.lang.IllegalMonitorStateException now appears again, in a different method:

java.lang.IllegalMonitorStateException
 at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:127)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1175)
 at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:431)
 at org.jboss.resource.adapter.jms.JmsManagedConnection.unlock(JmsManagedConnection.java:416)
 at org.jboss.resource.adapter.jms.JmsXAResource.end(JmsXAResource.java:76)
 at org.jboss.resource.connectionmanager.xa.JcaXAResourceWrapper.end(JcaXAResourceWrapper.java:58)
 at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.endSuspendedRMs(TransactionImple.java:1529)
 at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commit(TransactionImple.java:235)
 at org.jboss.ejb.plugins.TxInterceptorCMT.endTransaction(TxInterceptorCMT.java:501)
 at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:361)
 at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:181)
 at org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:168)
 at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:205)
 at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:138)
 at org.jboss.ejb.SessionContainer.internalInvoke(SessionContainer.java:648)
 at org.jboss.ejb.Container.invoke(Container.java:960)
 at org.jboss.ejb.plugins.local.BaseLocalProxyFactory.invoke(BaseLocalProxyFactory.java:430)
 at org.jboss.ejb.plugins.local.StatelessSessionProxy.invoke(StatelessSessionProxy.java:103)

Is the only way to avoid it to use track-connection-by-tx?
