5 Replies Latest reply on Jan 14, 2015 5:09 AM by marklittle

    Data inconsistency for XA transaction when db resource returns XAER_RMERR code for lost connection

    ochaloup

      Hi,

       

      I would like to start a discussion about a scenario that ends with inconsistent data for an XA transaction: the XA resource returns XAException.XAER_RMERR from commit even though the commit has actually gone through.

       

      This behaviour is against the XA specification, which describes XAER_RMERR on commit as follows:

      An error occurred in committing the work performed on behalf of the transaction branch and the branch’s work has been rolled back. Note that returning this error signals a catastrophic event to a transaction manager since other resource managers may successfully commit their work on behalf of this branch. This error should be returned only when a resource manager concludes that it can never commit the branch and that it cannot hold the branch’s resources in a prepared state. Otherwise, [XA_RETRY] should be returned.

      See the discussion and Tom's comment at Bug 1169671 – Recovery scenario where db connection is halted after prepare phase does not rollback resource

       

      The problem, however, is that a lot of databases behave this way. Databases such as PostgreSQL, MSSQL or Sybase throw XAException.XAER_RMERR whenever the connection is lost.

      The Narayana transaction manager then rolls back the rest of the transaction. In the following scenario the result is inconsistent data.

       

      1. prepare the DB XA resource
      2. prepare the second XA resource
      3. commit the DB XA resource
      4. the DB commits
      5. the connection crashes (before the confirmation is received by the transaction manager)
      6. the JDBC driver returns an XAException because the connection is down
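
To make steps 3 to 6 concrete, here is a minimal sketch of what the transaction manager sees from the resource's side. LostAckBranch is a hypothetical stand-in for a database branch, not a class from any real driver, and the error code is a constructor parameter because it differs between drivers. Only XAException and its errorCode field come from the standard javax.transaction.xa package.

```java
import javax.transaction.xa.XAException;

/** Hypothetical stand-in for a database XA branch; not a real driver class. */
final class LostAckBranch {
    private final int errorCodeOnLostConnection;
    private boolean durablyCommitted = false;

    LostAckBranch(int errorCodeOnLostConnection) {
        this.errorCodeOnLostConnection = errorCodeOnLostConnection;
    }

    /** Steps 3-6: the database commits, then the connection dies before the reply arrives. */
    void commit() throws XAException {
        durablyCommitted = true; // step 4: the DB commits the branch
        // step 5: the connection crashes here, so no confirmation reaches the caller
        throw new XAException(errorCodeOnLostConnection); // step 6: the driver reports an error
    }

    /** What is really stored in the database, which the caller cannot observe. */
    boolean isDurablyCommitted() {
        return durablyCommitted;
    }
}
```

The caller only ever sees the exception and its error code. Whether the branch was committed is hidden behind the lost connection, so the code chosen by the driver is all the transaction manager has to go on.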

       

      There are now two cases, at least for the databases that the EAP application server supports.

       

      In the first case the JDBC driver returns XAException.XAER_RMFAIL or XAException.XA_RETRY. That's OK, as all the subsequent XA resources are committed. The only small issue is that the recovery manager will keep trying to commit a non-existent XID (as the DB has already committed). This should be fixed by [JBTM-860] use XAResourceWrapper metadata for assume complete - JBoss Issue Tracker.

       

      The second case is the problematic one.

      The JDBC driver returns XAException.XAER_RMERR. In this case the DB commits, but after the connection is lost the method doAbort is called for the rest of the XA resources. Thus the other resources are rolled back.

      I understand that this is a problem of the JDBC driver and its incorrect error code, but it disconcerts me a bit that databases like MSSQL, PostgreSQL etc. could end up with inconsistent data, at least in this (corner) case.
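
The difference between the two cases can be reproduced with a toy coordinator. This is only a sketch of the behaviour described in this post, not Narayana's actual code: the Branch interface, both branch classes, secondPhase and run are hypothetical names, and only the three error codes discussed here are modelled.

```java
import java.util.Arrays;
import java.util.List;
import javax.transaction.xa.XAException;

final class CommitPhaseSketch {

    /** Hypothetical minimal branch; a real XAResource has more methods and takes an Xid. */
    interface Branch {
        void commit() throws XAException;
        void rollback();
    }

    /** In-memory branch that simply records its final state. */
    static final class RecordingBranch implements Branch {
        String state = "PREPARED";
        public void commit() { state = "COMMITTED"; }
        public void rollback() { state = "ROLLED_BACK"; }
    }

    /** Database branch that commits and then loses the connection before replying. */
    static final class LostAckDbBranch implements Branch {
        final int errorCode;
        String state = "PREPARED";
        LostAckDbBranch(int errorCode) { this.errorCode = errorCode; }
        public void commit() throws XAException {
            state = "COMMITTED";              // the DB has committed
            throw new XAException(errorCode); // but the driver reports a failure
        }
        public void rollback() { /* connection is gone; nothing is undone */ }
    }

    /** Second phase as described in the post (assumed behaviour, simplified). */
    static void secondPhase(List<Branch> branches) {
        boolean abortRest = false;
        for (Branch b : branches) {
            if (abortRest) { b.rollback(); continue; }
            try {
                b.commit();
            } catch (XAException e) {
                switch (e.errorCode) {
                    case XAException.XAER_RMERR:
                        abortRest = true; // branch is taken as rolled back, abort the rest
                        break;
                    case XAException.XAER_RMFAIL:
                    case XAException.XA_RETRY:
                        break;            // left to recovery, keep committing the rest
                    default:
                        throw new IllegalStateException("code not modelled: " + e.errorCode);
                }
            }
        }
    }

    /** Runs the scenario and returns "dbState/otherState". */
    static String run(int driverErrorCode) {
        LostAckDbBranch db = new LostAckDbBranch(driverErrorCode);
        RecordingBranch other = new RecordingBranch();
        secondPhase(Arrays.asList(db, other));
        return db.state + "/" + other.state;
    }
}
```

With this sketch, CommitPhaseSketch.run(XAException.XAER_RMFAIL) ends with both branches committed, while CommitPhaseSketch.run(XAException.XAER_RMERR) ends with the database committed and the second resource rolled back, which is the inconsistency in question.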

       

      Is this just a documentation issue from the TM point of view? Or could Narayana somehow prevent that situation?

       

      Thanks

      Ondra