2 Replies Latest reply on Jan 11, 2012 11:47 PM by stevesmith

    MDB XA timeout clash with Atomikos?

    stevesmith

      Hi,

       

      We're seeing a problem where a transaction timeout on an MDB causes a message to remain 'stuck' until a restart. This is with Atomikos as the XA transaction service.

       

      A summary of the situation:

       

      * An MDB receives a message (via a Spring DefaultMessageListenerContainer)

      * The MDB retrieves an object from the DB and then calls an external service

      * The external service holds the connection open but fails to respond

      * After 5 minutes the MDB transaction times out

       

      This results in the following error in the logs:

       

      2012-01-12 14:43:59,013 WARN  eads-19844008) ResourceManagerImpl       : transaction with xid XidImpl (31362155 bq:49.50.55.46.48.46.49.46.49.46.116.109.49.55.52 formatID:1096044365 gtxid:49.50.55.46.48.46.49.46.49.46.116.109.48.48.49.55.54.48.48.48.48.49 timed out
      2012-01-12 14:43:59,037 WARN  Atomikos:19    atomikos                  : XA resource 'HornetXAResource': rollback for XID '3132372E302E312E312E746D30303137363030303031:3132372E302E312E312E746D313734' raised -4: the supplied XID is invalid for this XA resource
      javax.transaction.xa.XAException
                at org.hornetq.core.client.impl.ClientSessionImpl.rollback(ClientSessionImpl.java:1486)
                at com.atomikos.datasource.xa.XAResourceTransaction.rollback(XAResourceTransaction.java:690)
                at com.atomikos.icatch.imp.RollbackMessage.send(RollbackMessage.java:72)
                at com.atomikos.icatch.imp.PropagationMessage.submit(PropagationMessage.java:111)
                at com.atomikos.icatch.imp.Propagator$PropagatorThread.run(Propagator.java:87)
                at com.atomikos.icatch.imp.Propagator.submitPropagationMessage(Propagator.java:66)
                at com.atomikos.icatch.imp.CoordinatorStateHandler.rollback(CoordinatorStateHandler.java:746)
                at com.atomikos.icatch.imp.ActiveStateHandler.onTimeout(ActiveStateHandler.java:97)
                at com.atomikos.icatch.imp.CoordinatorImp.alarm(CoordinatorImp.java:1105)
                at com.atomikos.timing.PooledAlarmTimer.notifyListeners(PooledAlarmTimer.java:112)
                at com.atomikos.timing.PooledAlarmTimer.run(PooledAlarmTimer.java:99)
                at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                at java.lang.Thread.run(Thread.java:662)
      
      

       

      After this the message 'disappears': it is still counted in the messageCount JMX property but does not appear in a list of messages. Only a restart forces it to roll back.

       

      What I think is happening is that the timeout of the HornetQ transaction is occurring before the timeout of the Atomikos transaction. This confuses Atomikos and results in a message that is neither rolled back nor acknowledged. Unfortunately it's not possible to set the HornetQ transaction timeout separately from the Atomikos one, as the default is overridden by the JTA-defined timeout.
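
      As a sanity check on the two log lines above: HornetQ prints the XID as dotted decimal bytes and Atomikos prints it as hex, so it is not obvious at a glance that they name the same transaction. Below is a minimal sketch that decodes both forms. The class and method names (XidDecode, fromDottedDecimal, fromHex, formatIdToAscii) are my own helpers for illustration, not part of HornetQ or Atomikos, and it assumes the XID bytes are plain ASCII, which appears to be the case here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class XidDecode {

    // Decode the dotted-decimal byte list from the HornetQ log line (e.g. "49.50.55").
    static String fromDottedDecimal(String dotted) {
        String[] parts = dotted.split("\\.");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            bytes[i] = (byte) Integer.parseInt(parts[i]);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Decode the hex string from the Atomikos log line (e.g. "313237").
    static String fromHex(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Read the integer formatID as four big-endian ASCII characters.
    static String formatIdToAscii(int formatId) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(formatId).array();
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Values copied from the HornetQ ResourceManagerImpl warning.
        String hornetGtxid = "49.50.55.46.48.46.49.46.49.46.116.109.48.48.49.55.54.48.48.48.48.49";
        String hornetBq = "49.50.55.46.48.46.49.46.49.46.116.109.49.55.52";
        // Values copied from the Atomikos warning (global id, then branch qualifier).
        String atomikosGtxid = "3132372E302E312E312E746D30303137363030303031";
        String atomikosBq = "3132372E302E312E312E746D313734";

        System.out.println("HornetQ  gtxid: " + fromDottedDecimal(hornetGtxid));
        System.out.println("Atomikos gtxid: " + fromHex(atomikosGtxid));
        System.out.println("HornetQ  bq:    " + fromDottedDecimal(hornetBq));
        System.out.println("Atomikos bq:    " + fromHex(atomikosBq));
        System.out.println("formatID:       " + formatIdToAscii(1096044365));
    }
}
```

      Both forms decode to global id 127.0.1.1.tm0017600001 and branch qualifier 127.0.1.1.tm174, and the formatID reads as "ATOM". So Atomikos seems to be rolling back exactly the XID that HornetQ had just reported as timed out, which is at least consistent with HornetQ having already discarded it.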

       

      Does anyone have a suggestion on how to handle this situation?

       

      Cheers,
      Steve