9 Replies Latest reply on Dec 1, 2009 6:33 AM by marklittle

    JBoss can't recover transaction which is in prepared phase

    dengyong

      I have run the following transaction test.
      In a transaction, I have two XAResource participants, assume they are x1 and x2.
      With two phase commit protocol, the normal transaction commit sequence will be:
      x1.prepare
      x2.prepare
      x1.commit
      x2.commit

      After x1.prepare and before x2.prepare, I kill JBoss.
      After JBoss is restarted, I never see JBoss perform transaction recovery.

      Is this a transaction recovery bug?

        • 1. Re: JBoss can't recover transaction which is in prepared pha
          jhalliday

          > After JBoss is restarted, I never see JBoss perform transaction recovery.

          Do you mean it does not run a recovery pass, or that it does not process that specific resource? Have you configured recovery for the relevant resource manager?

          • 2. Re: JBoss can't recover transaction which is in prepared pha
            dengyong

             

            "jhalliday" wrote:
            Do you mean it does not run a recovery pass, or that it does not process that specific resource? Have you configured recovery for the relevant resource manager?


            JBoss transaction didn't call x1.rollback to rollback the XA resource.

            Also I find if I kill JBoss after xa.prepare and before x2.prepare, I didn't find any relevant log in data/tx-object-store.

            • 3. Re: JBoss can't recover transaction which is in prepared pha
              dengyong

               

              "jhalliday" wrote:
              Do you mean it does not run a recovery pass, or that it does not process that specific resource? Have you configured recovery for the relevant resource manager?

              Fix some typos.

              The JBoss transaction recovery manager didn't call x1.rollback to roll back the XA resource.
              Also, if I kill JBoss after x1.prepare and before x2.prepare, I don't find any relevant transaction log in data/tx-object-store.

              • 4. Re: JBoss can't recover transaction which is in prepared pha
                jhalliday

                a) That does not fully answer my question and b) you're not expected to - the log is not written until all resources prepare. Read up on the protocol.

                • 5. Re: JBoss can't recover transaction which is in prepared pha
                  dengyong

                   

                  "jhalliday" wrote:
                  a) That does not fully answer my question and b) you're not expected to - the log is not written until all resources prepare. Read up on the protocol.


                  a) In my test, one XA resource is an Oracle database and the other is JBoss Messaging. I configured recovery modules in jbossjta-properties.xml for both of them.
                  b) I haven't gone through the protocol in detail. :) How is recovery handled when some resources have prepared successfully but not all of them?


                  • 6. Re: JBoss can't recover transaction which is in prepared pha
                    mmusgrov

                    The resource that was prepared will either detect that its connection to the TM has dropped or it could time out the transaction branch. So it will rollback the transaction.

                    When the TM comes back up it won't have any knowledge of the transaction, so things will be consistent. On the other hand, if it had told all resources to prepare, it would have written a log record and would call recover on each resource to find out their view of the transaction, and things would still be consistent (unless it had called commit on some resources before crashing, in which case heuristic outcomes are possible).
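
The ordering described above can be sketched in a few lines of Java. This is a toy model only, not JBoss Transactions code: `Participant`, `Recording`, `run` and the `coordinator.writeLog` event are hypothetical names invented for illustration, and the real `javax.transaction.xa.XAResource` interface is deliberately not used. It assumes the coordinator writes its log only after every participant has prepared, as stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseSketch {
    /** Hypothetical stand-in for an XA participant; not the real XAResource interface. */
    interface Participant {
        void prepare();
        void commit();
    }

    /** Records each call so the ordering can be inspected afterwards. */
    static final class Recording implements Participant {
        private final String name;
        private final List<String> events;

        Recording(String name, List<String> events) {
            this.name = name;
            this.events = events;
        }

        public void prepare() { events.add(name + ".prepare"); }
        public void commit() { events.add(name + ".commit"); }
    }

    /**
     * Runs the toy protocol over x1 and x2. If crashAfterPrepares equals the number of
     * prepare calls made so far when the next prepare is due, the run stops there
     * (simulated crash). Pass a negative value for a run without a crash.
     */
    static List<String> run(int crashAfterPrepares) {
        List<String> events = new ArrayList<>();
        List<Participant> participants =
                List.of(new Recording("x1", events), new Recording("x2", events));

        int prepared = 0;
        for (Participant p : participants) {
            if (prepared == crashAfterPrepares) {
                return events; // simulated crash: no coordinator log has been written yet
            }
            p.prepare();
            prepared++;
        }

        // Assumption from the thread: the log record is written only once all have prepared.
        events.add("coordinator.writeLog");
        for (Participant p : participants) {
            p.commit();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(1));  // crash between x1.prepare and x2.prepare
        System.out.println(run(-1)); // normal run
    }
}
```

In the crash run the event list contains only `x1.prepare`, with no `coordinator.writeLog` entry, which matches the empty data/tx-object-store reported earlier in the thread.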

                    • 7. Re: JBoss can't recover transaction which is in prepared pha
                      vickyk

                       

                      "mmusgrov" wrote:
                      The resource that was prepared will either detect that its connection to the TM has dropped or it could time out the transaction branch. So it will rollback the transaction.

                      You mean the rollback on TX which is local at the RM, right?

                      "dengyong" wrote:
                      After x1.prepare and before x2.prepare, I kill JBoss.

                      There would be no transaction logs, so recovery would not be able to detect the failed transaction, and hence there is no recovery.
                      The transaction logs are created after the prepare phase succeeds, so kill the tx after the successful commit of the first resource and you should see recovery working.


                      • 8. Re: JBoss can't recover transaction which is in prepared pha
                        adinn

                         


                        "mmusgrov" wrote:
                        The resource that was prepared will either detect that its connection to the TM has dropped or it could time out the transaction branch. So it will rollback the transaction.

                        You mean the rollback on TX which is local at the RM, right?


                        Not really, Vicky. There is only one TX. You make it sound like there are several of them.

                        If the AS crashes after preparing the first resource but before preparing the second then at reboot the AS knows nothing about the TX which was in progress when it crashed. However, the RM for the first resource does still know about the TX because it completed prepare. This is true even if the RM also crashed when the AS went down. It completed prepare so it will have made a durable record of its participation in the TX.

                        So, the AS has nothing to roll back when it restarts. The RM has to roll back the changes it has made up to prepare. Its own (private) durable log record contains all the info it needs to do that. It may decide to autonomously roll back these changes because of a timeout, but it will not normally do this because this risks a heuristic outcome. For all it knows, the AS might have prepared the remaining participants and sent out several commit messages at the point where the crash occurred (it does not know it was the first and only RM to prepare).

                        This does not mean that the AS is not involved in recovery of these TXs. The normal situation is that the RM waits for the AS to contact it again after reboot. Shortly after reboot the recovery manager runs and tells the RM what to do about any unfinished transactions -- this includes working out what to do about transactions unknown to the AS but known to the RM.

                        Essentially the AS and RM just exchange a list of TX ids. The AS says: here are the TX ids which I found prepared on disk in my log and which, therefore, need to be rolled forward. The RM compares this list with the list of TXs in its log which are prepared but not completed. The RM has to roll forward changes for any TXs whose ids are in the AS list and roll back changes for TXs whose ids are not in the AS list.

                        If there is a TX in the AS's list which is not in the RM's list then this can only happen because i) the RM timed out a prepared TX and rolled it back autonomously or ii) the RM already committed its changes but the AS crashed before processing the committed reply. The RM knows which case is which because it remembers which TXs it has autonomously rolled back. In case ii) the error means the TX has a heuristic outcome (some other participants may have rolled forward).

                        The RM returns to the AS a list of all the TXs it has actually committed. If this list omits some of the TXs in the AS's list then that can only be because the RM rolled them back autonomously. The outcome of these missing transactions is in doubt so the AS logs a heuristic warning.
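
The list comparison described above can be sketched as a small set computation. This is an illustration under stated assumptions, not the recovery module's actual code: `RecoveryReconcileSketch`, `Plan` and `reconcile` are hypothetical names, and TX ids are modelled as plain strings rather than real `Xid` objects.

```java
import java.util.Set;
import java.util.TreeSet;

public class RecoveryReconcileSketch {
    /** Result of comparing the two id lists; all field names are illustrative. */
    static final class Plan {
        final Set<String> commit = new TreeSet<>();      // in the AS log and prepared at the RM
        final Set<String> rollback = new TreeSet<>();    // prepared at the RM, unknown to the AS
        final Set<String> missingAtRm = new TreeSet<>(); // in the AS log, no longer prepared at the RM
    }

    /**
     * loggedByAs:  ids the AS found prepared in its own log.
     * inDoubtAtRm: ids the RM holds as prepared but not completed.
     */
    static Plan reconcile(Set<String> loggedByAs, Set<String> inDoubtAtRm) {
        Plan plan = new Plan();
        for (String id : inDoubtAtRm) {
            if (loggedByAs.contains(id)) {
                plan.commit.add(id);   // roll forward
            } else {
                plan.rollback.add(id); // the AS never logged it, so roll back
            }
        }
        for (String id : loggedByAs) {
            if (!inDoubtAtRm.contains(id)) {
                // Either already committed at the RM or rolled back autonomously;
                // telling the two apart needs the RM's own records.
                plan.missingAtRm.add(id);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Plan plan = reconcile(Set.of("tx2", "tx3"), Set.of("tx1", "tx2"));
        System.out.println("commit=" + plan.commit
                + " rollback=" + plan.rollback
                + " missing=" + plan.missingAtRm);
    }
}
```

With the sample ids in `main`, tx2 is rolled forward, tx1 (the case from the original question, prepared at the RM but never logged by the AS) is rolled back, and tx3 is the in-doubt case that needs the RM's own records to resolve.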

                        So, if you kill the TX in the middle of prepare the recovery manager will still talk to the RM and the RM will do recovery. If you use a debugger you can step through the code executed by the recovery thread (look for class PeriodicRecovery) and watch it invoke the XA recovery module to perform this negotiation with the RM.

                        If you don't understand the full details of this protocol I suggest you read a good book on Transactions (Mark Little's book is very clear on this subject ;-)



                        • 9. Re: JBoss can't recover transaction which is in prepared pha
                          marklittle

                          Couldn't have said any of that better myself. Well done Andrew :-)