5 Replies Latest reply on Aug 2, 2013 2:43 PM by clebert.suconic

    A possible XA failover issue

    gaohoward

      With a live/backup server pair running, the following sequence of events is expected:

       

      1) A client using an HA connection starts an XA transaction,
      2) sends some messages,
      3) the live server crashes and the backup becomes live,
      4) the client fails over transparently,
      5) the client then ends the tx and gets the expected exception (XA_RBOTHER).
      6) If the client then tries to receive the messages sent in 2),
      7) nothing should be received.

       

      The above is shown in the test:

       

      org.hornetq.tests.integration.cluster.failover.FailoverTest.testXAMessagesSentSoRollbackOnEnd()

       

      However, if we change the test a bit so that '2) sends some messages' happens after '3) live crashes and backup becomes live',
      I would expect the same result, i.e.

       

      1) A client using an HA connection starts an XA transaction,
      2) the live server crashes and the backup becomes live,
      3) the client sends some messages,
      4) the client fails over transparently,
      5) the client then ends the tx and gets the expected exception (XA_RBOTHER).
      6) If the client then tries to receive the messages sent in 3),
      7) nothing should be received.

       

      However, I get a different result. In 5) I get the error code XAER_NOTA rather than XA_RBOTHER,
      and in 7) the messages can be received.

       

      I'm confused by this result. My questions:

       

      1) In the latter case should it throw XA_RBOTHER or XAER_NOTA?
      2) In either case, shouldn't the messages be discarded and never received? Or should they be treated differently depending on the exception code?

       

      Note: the reason we get different error codes is that in the first case the failed-over client session marks itself as 'rollback only',
      while in the second it doesn't. Additionally, messages sent to the (backup) server after the crash are treated non-transactionally, because the
      newly created server session has no knowledge of the transaction (i.e. a non-prepared transaction is not carried over on failover).
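
      To make the note above concrete, here is a toy model of the two orderings. It is only a sketch of the behaviour as I understand it, not HornetQ code: the classes XaFailoverModel, Server and ClientSession and all of their methods are made up for illustration, and the rule "mark rollback-only only if the tx already did work before failover" is my assumption about what the client session does. Only javax.transaction.xa.XAException and its constants are real.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;

// Toy model only: these classes are hypothetical and are NOT HornetQ code.
public class XaFailoverModel {

    // Stand-in for a server; it only tracks a tx that was started on it.
    static class Server {
        final List<String> queue = new ArrayList<>();     // visible to consumers
        final List<String> pendingTx = new ArrayList<>(); // held until commit
        boolean knowsTx = false;

        void send(String msg) {
            // Without a known tx the send is applied immediately (non-transactional).
            if (knowsTx) pendingTx.add(msg); else queue.add(msg);
        }
    }

    // Stand-in for the client session.
    static class ClientSession {
        Server server;
        boolean workDoneInTx = false;
        boolean rollbackOnly = false;

        ClientSession(Server live) { this.server = live; }

        void start() { server.knowsTx = true; }

        void send(String msg) { server.send(msg); workDoneInTx = true; }

        void failover(Server backup) {
            // Assumption: rollback-only is set only if the tx already did work.
            if (workDoneInTx) rollbackOnly = true;
            server = backup; // the unprepared tx is not carried over to the backup
        }

        void end() throws XAException {
            if (rollbackOnly) throw new XAException(XAException.XA_RBOTHER);
            if (!server.knowsTx) throw new XAException(XAException.XAER_NOTA);
        }
    }

    // Returns {error code from end(), number of messages receivable afterwards}.
    static int[] run(boolean sendBeforeCrash) {
        Server live = new Server();
        Server backup = new Server();
        ClientSession session = new ClientSession(live);
        session.start();
        if (sendBeforeCrash) session.send("m1");
        session.failover(backup);
        if (!sendBeforeCrash) session.send("m1");
        int code = 0;
        try {
            session.end();
        } catch (XAException e) {
            code = e.errorCode;
        }
        return new int[] { code, backup.queue.size() };
    }

    public static void main(String[] args) {
        int[] first = run(true);
        int[] second = run(false);
        System.out.println("send before crash: code=" + first[0] + " received=" + first[1]);
        System.out.println("send after crash:  code=" + second[0] + " received=" + second[1]);
    }
}
```

      In this model, sending before the crash ends with XA_RBOTHER and nothing to receive, while sending after the crash ends with XAER_NOTA and the message sitting in the backup's queue, which matches what I see in the modified test.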

       

      A similar issue also happens with receiving. I have two modified tests that show the problem:

       

      https://github.com/gaohoward/hornetq/commit/6d59e984be578ed22f126d9f0fd55a215513c768

       

      Any advice is appreciated.