MDB XA timeout clash with Atomikos?
stevesmith Jan 11, 2012 10:58 PM
Hi,
We're seeing a problem where a transaction timeout on an MDB causes a message to remain 'stuck' until a restart. This is with Atomikos as the XA transaction service.
A summary of the situation:
* An MDB receives a message (via a Spring DefaultMessageListenerContainer)
* The MDB retrieves an object from the DB and then calls an external service
* The external service holds the connection open but fails to respond
* After 5 minutes the MDB transaction times out
This results in the following error in the logs:
2012-01-12 14:43:59,013 WARN eads-19844008) ResourceManagerImpl : transaction with xid XidImpl (31362155 bq:49.50.55.46.48.46.49.46.49.46.116.109.49.55.52 formatID:1096044365 gtxid:49.50.55.46.48.46.49.46.49.46.116.109.48.48.49.55.54.48.48.48.48.49 timed out
2012-01-12 14:43:59,037 WARN Atomikos:19 atomikos : XA resource 'HornetXAResource': rollback for XID '3132372E302E312E312E746D30303137363030303031:3132372E302E312E312E746D313734' raised -4: the supplied XID is invalid for this XA resource
javax.transaction.xa.XAException
    at org.hornetq.core.client.impl.ClientSessionImpl.rollback(ClientSessionImpl.java:1486)
    at com.atomikos.datasource.xa.XAResourceTransaction.rollback(XAResourceTransaction.java:690)
    at com.atomikos.icatch.imp.RollbackMessage.send(RollbackMessage.java:72)
    at com.atomikos.icatch.imp.PropagationMessage.submit(PropagationMessage.java:111)
    at com.atomikos.icatch.imp.Propagator$PropagatorThread.run(Propagator.java:87)
    at com.atomikos.icatch.imp.Propagator.submitPropagationMessage(Propagator.java:66)
    at com.atomikos.icatch.imp.CoordinatorStateHandler.rollback(CoordinatorStateHandler.java:746)
    at com.atomikos.icatch.imp.ActiveStateHandler.onTimeout(ActiveStateHandler.java:97)
    at com.atomikos.icatch.imp.CoordinatorImp.alarm(CoordinatorImp.java:1105)
    at com.atomikos.timing.PooledAlarmTimer.notifyListeners(PooledAlarmTimer.java:112)
    at com.atomikos.timing.PooledAlarmTimer.run(PooledAlarmTimer.java:99)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
After this the message 'disappears': it is still counted in the messageCount JMX property but does not appear when listing the messages. Only a restart forces it to roll back.
What I think is happening is that the HornetQ transaction times out before the Atomikos transaction does. This confuses Atomikos and results in a message that is neither rolled back nor acknowledged. Unfortunately it's not possible to set the HornetQ transaction timeout separately from the Atomikos one, as the default is overridden by the JTA-defined timeout.
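One way to sidestep the clash, whatever its exact cause, would be to make the external call fail well before either transaction timeout fires, so that the listener throws and the rollback happens while the transaction is still active on both sides. The following is only a minimal sketch of that idea using plain java.util.concurrent; the class name BoundedCall and the method callWithTimeout are hypothetical and not part of Spring, HornetQ or Atomikos. It also assumes the external client reacts to thread interruption; for blocking socket I/O a read timeout on the client itself would be needed instead.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds a blocking call so it fails before the
// surrounding JTA transaction reaches its own timeout.
public class BoundedCall {

    /**
     * Runs the call on a worker thread and waits at most timeoutMillis.
     * Throws TimeoutException if the call has not finished by then.
     */
    public static <T> T callWithTimeout(Callable<T> call, long timeoutMillis)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Ask the worker to stop; this only helps if the call is interruptible.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Surface the original failure from the call where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

In the listener this would wrap the external call with a limit comfortably below the 5 minute transaction timeout (for example 60000 ms), and the resulting TimeoutException would be rethrown from onMessage so that the container rolls the transaction back. Whether that avoids the invalid XID error here is untested.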
Does anyone have a suggestion on how to handle this situation?
Cheers,
Steve