14 Replies Latest reply on Jul 15, 2010 7:08 AM by Mark Little

    XAResource.recover() doesn't seem to work in standalone mode

    Ben Spiller Newbie

      I thought HornetQ implemented the full JMS standard including all the XA bits, but I always get 0 Xids returned when I call recover(), even if I do so immediately after successfully preparing the transaction:

       

       

      {code}Xid xid = createXid();
      xar.start(xid, XAResource.TMNOFLAGS);
      producer.send(session.createTextMessage(getClass().getName()+" "+xid.toString()));
      xar.end(xid, XAResource.TMSUCCESS);
      xar.prepare(xid);
      Xid[] xids = xar.recover(XAResource.TMNOFLAGS);
      log("Recovered "+xids.length+": "+Arrays.toString(xids));
      org.junit.Assert.assertArrayEquals(new Xid[]{xid}, xids);{code}

       

       

      Other JMS providers like SonicMQ and ActiveMQ have no trouble doing this... What worries me is that the HornetQ xa-heuristic sample seems to deliberately avoid using the standard JTA/XA API for the recovery operation, and uses a proprietary mechanism based around JMX to do the same job instead. Does that mean it's not possible to use this part of JTA if you're not running in an appserver? This seems weird given that HornetQ clearly has all the information required to return correct results from these API calls, even in standalone mode.

       

      If so, it's a real shame as I guess it means if we decide we want to support HornetQ in our product we'll need to introduce a big if clause like 'if (provider != hornetQ) return xa.recover() else return doTheSameThingUsingProprietaryHornetqJmxBeanMechanism()' to call the HornetQ bean, rather than just using the standard JMS api.

        • 1. Re: XAResource.recover() doesn't seem to work in standalone mode
          Tim Fox Master

          HornetQ fully implements XAResource, including recover(). There is nothing non-standard there.

           

          The JMX stuff is *in addition* to the standard JTA stuff. This allows administrators to apply heuristics via a nicer interface.

          • 2. Re: XAResource.recover() doesn't seem to work in standalone mode
            Tim Fox Master

            Take a look in the test suite for tests that show XA recovery working

            • 3. Re: XAResource.recover() doesn't seem to work in standalone mode
              Tim Fox Master

              BTW, in your code snippet:

               

              xar.recover(XAResource.TMNOFLAGS)

              It must be
              TMSTARTRSCAN to start a scan and TMENDRSCAN to end the scan.

              See JTA specification page 50 for more details.
              • 4. Re: XAResource.recover() doesn't seem to work in standalone mode
                Ben Spiller Newbie

                Hi Tim, thanks for your help. That was exactly the problem: the XAResource javadoc isn't desperately clear, so I copied a sample app for another JMS provider. But as you point out, deep in the JTA spec it does say you have to use start/end (though it's quite a weird API; a Java iterator would have made more sense imho). I spotted a helpful discussion of the recover() method on the WebSphere forums: http://www.mqseries.net/phpBB2/viewtopic.php?t=7233

                 

                Now that I'm calling the method correctly, the recovery operation seems to return correct results.

                 

                However:

                1) Since you're saying XAResource.recover(TMNOFLAGS) is invalid unless you've already called recover(TMSTARTRSCAN), do you think it would be better to throw an exception if people call it while it's in the wrong state, rather than silently failing and returning no Xids?

                 

                2) Could the HornetQ XA sample include a call to XAResource.recover()? The correct use of recover() is sufficiently complicated that it would be helpful to show users a correct example. Also, as you saw from my original post, the fact that the sample uses the JMX API for recovery rather than XAResource.recover() made me wonder whether HornetQ only supports recovery via JMX.

                 

                e.g.

                 

                List<Xid> recovered = new ArrayList<Xid>();
                 
                Xid[] temp = xar.recover(XAResource.TMSTARTRSCAN);
                while (temp != null && temp.length > 0)
                {
                    recovered.addAll(Arrays.asList(temp));
                    temp = xar.recover(XAResource.TMNOFLAGS);
                }
                temp = xar.recover(XAResource.TMENDRSCAN);
                if (temp != null && temp.length > 0)
                    recovered.addAll(Arrays.asList(temp));
                 
                 
                System.out.println("Recovered "+recovered.size()+" prepared transaction(s), which can be rolled back or committed: ");
                for (Xid xid: recovered)
                    System.out.println("   "+xid);
                

                 

                I can put bugs into the database if you think it's a good idea...

                • 5. Re: XAResource.recover() doesn't seem to work in standalone mode
                  Andy Taylor Master

                  The API is historic and comes from C code, hence the weirdness.

                  • 6. Re: XAResource.recover() doesn't seem to work in standalone mode
                    Tim Fox Master

                    Also recover() is not something that most users would ever really call. It's intended to be called by transaction managers so they can obtain the list of prepared transactions. The people who write the transaction managers probably understand the weird recover protocol pretty well.

                    • 7. Re: XAResource.recover() doesn't seem to work in standalone mode
                      Tim Fox Master

                      From JTA spec:

                       

                      "The **transaction manager** calls this method during recovery to obtain the list of transaction branches that are currently in prepared or heuristically completed states."

                       

                      So, unless you're writing a transaction manager, I don't know why you would be calling this.

                      • 8. Re: XAResource.recover() doesn't seem to work in standalone mode
                        Tim Fox Master

                        Note that the recover() method is not designed to be called by users so they can apply heuristic commits or rollbacks independently of the transaction manager.

                         

                        The JTA spec says "... to obtain the list of transaction branches that are currently in prepared or **heuristically completed states**"

                         

                        So if a heuristic commit or rollback is applied independently of the tx mgr and you then call recover() again, you will still get back the transaction that you heuristically completed! So this method is useless to you if you want to use it to find the list of txs you can apply heuristics to.

                         

                        The tx mgr needs to know this information on heuristics so it can make decisions and resolve transactions appropriately.

                         

                        There is no standard way of listing transactions that can have heuristics applied using the JTA API. This is why every provider does it differently. E.g. HornetQ provides a management / JMX way of listing this information.

                         

                        This stuff gets pretty complex. I recommend asking the experts in the JTA forum for more information.

                        • 9. Re: XAResource.recover() doesn't seem to work in standalone mode
                          Ben Spiller Newbie

                          Thanks guys. I take your point about recover() not being used very often by end-users since XA normally involves a transaction manager.

                           

                          Though as far as I can see the way the JTA interfaces are designed, very sensibly, means that the XAResource manager doesn't actually know anything about a transaction manager, it just exposes some primitives to allow some code (which is typically but not necessarily a transaction manager) to create/rollback/commit/recover the resource manager's transactions.

                           

                          In my case I'm not running in a J2EE container so there's no transaction manager... there's just our code, calling methods on the JMS XAResource interface to transactionally combine JMS message sending with our own state-persist operations (in our own data store) using 2PC, since short of using proprietary stuff (e.g. the HornetQ duplicate detection, that many other providers don't have) there's no other way to send messages exactly-once. My algorithm is roughly:

                           

                          1. xa.start(xid)
                          2. send messages
                          3. xa.end(xid)
                          4. xa.prepare(xid) // once this successfully completes, I believe the JMS provider has persisted the messages so they can survive a broker failure
                          5. every 50ms persist(all prepared xids) in my own non-standard persistent data store (along with other app-specific state related to the message) - and at this point I can safely throw away the contents of the messages to be sent, since they're now stored by JMS
                          6. once my persist operation has completed successfully, commit all the xids to the JMS bus

                           

                          a) If my client crashes any time before (5) has completed successfully, after the process restarts I'll call recover() and rollback any prepared or partially-completed xids since they're not in my persistent store

                          b) If my client crashes before (6) the prepared transactions will be stored by JMS provider and I can call recover() then commit the xids that are listed in my own data store, knowing that any transactions that were in fact sent before the client crashed, will not be returned by recover() and so won't get sent twice

                          c) if my client crashes after (6), I can discover using recover() that there are no transactions left to be committed, so there's nothing else to do

                           

                          i.e. there's no transaction manager, so I'm allowed to commit/rollback the JMS XA transactions as necessary, and don't need to worry about a transaction manager messing things up by doing heuristic operations I wasn't expecting. I prepare the JMS send, then commit to my database, then if that worked, commit the JMS send. (combining a JMS 2PC with our own 1PC)
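The restart logic in (a) to (c) could be sketched as follows. The names `StartupRecovery`, `key`, `scan` and `resolve` are made up for illustration, and the sketch assumes this client is the only thing creating Xids on the broker; if other applications share it, the recovered Xids would first need filtering (for example by format ID) before anything is rolled back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical helper, not part of HornetQ or JTA.
final class StartupRecovery {

    // The Xid interface does not define equals(), so compare Xids by content.
    static String key(Xid xid) {
        return xid.getFormatId() + ":" + Arrays.toString(xid.getGlobalTransactionId())
                + ":" + Arrays.toString(xid.getBranchQualifier());
    }

    // One full recovery scan: TMSTARTRSCAN, then TMNOFLAGS while new Xids arrive, then TMENDRSCAN.
    static List<Xid> scan(XAResource xar) throws XAException {
        Map<String, Xid> found = new LinkedHashMap<>();
        Xid[] batch = xar.recover(XAResource.TMSTARTRSCAN);
        while (addAll(found, batch)) {
            batch = xar.recover(XAResource.TMNOFLAGS);
        }
        addAll(found, xar.recover(XAResource.TMENDRSCAN));
        return new ArrayList<>(found.values());
    }

    // Returns true if the batch contained at least one Xid not seen before.
    private static boolean addAll(Map<String, Xid> found, Xid[] batch) {
        boolean added = false;
        if (batch != null) {
            for (Xid xid : batch) {
                added |= found.putIfAbsent(key(xid), xid) == null;
            }
        }
        return added;
    }

    // Commits recovered Xids whose key is in the application's store and rolls back the rest.
    // Returns {committedCount, rolledBackCount}.
    static int[] resolve(XAResource xar, Set<String> persistedKeys) throws XAException {
        int committed = 0;
        int rolledBack = 0;
        for (Xid xid : scan(xar)) {
            if (persistedKeys.contains(key(xid))) {
                xar.commit(xid, false); // case (b): persisted but not yet committed
                committed++;
            } else {
                xar.rollback(xid);      // case (a): prepared but never persisted
                rolledBack++;
            }
        }
        return new int[] { committed, rolledBack };
    }
}
```

Here `persistedKeys` stands for the `key(xid)` strings loaded from the application's own data store at startup. In case (c) the scan comes back empty and `resolve` does nothing.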

                           

                          Hope that made sense... as far as I can tell everything should work fine as long as the JMS provider does what the spec says.

                          • 10. Re: XAResource.recover() doesn't seem to work in standalone mode
                            Tim Fox Master

                            Sounds like you're writing a transaction manager

                             

                            I'd be aware there are many pitfalls and edge cases in implementing 2PC properly. It is not a trivial task. Good luck.

                             


                            b) If my client crashes before (6) the prepared transactions will be stored by JMS provider and I can call recover() then commit the xids that are listed in my own data store, knowing that any transactions that were in fact sent before the client crashed, will not be returned by recover() and so won't get sent twice

                            c) if my client crashes after (6), I can discover using recover() that there are no transactions left to be committed, so there's nothing else to do


                            recover() doesn't just give you the list of prepared and not committed transactions. It gives you the list of prepared and not committed transactions + the list of heuristically committed/rolled back transactions. So if someone has applied a heuristic on the server side and say committed the transaction it will still be returned in recover() so you'll have to deal with that.
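One way to deal with that when committing a recovered Xid is to inspect the error code of the XAException. The sketch below uses only standard `javax.transaction.xa` calls, but `HeuristicAwareCommit`, `Outcome` and `commitRecovered` are made-up names, and whether a given provider actually reports an earlier heuristic decision through these codes is an assumption to verify.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical helper, not part of HornetQ or JTA.
final class HeuristicAwareCommit {

    enum Outcome { COMMITTED, ALREADY_COMMITTED, HEURISTIC_CONFLICT }

    static Outcome commitRecovered(XAResource xar, Xid xid) throws XAException {
        try {
            xar.commit(xid, false);
            return Outcome.COMMITTED;
        } catch (XAException e) {
            switch (e.errorCode) {
                case XAException.XA_HEURCOM:
                    // Someone already committed this branch heuristically, which matches
                    // what we wanted, so tell the resource it can drop its record.
                    xar.forget(xid);
                    return Outcome.ALREADY_COMMITTED;
                case XAException.XA_HEURRB:
                case XAException.XA_HEURMIX:
                case XAException.XA_HEURHAZ:
                    // The heuristic outcome disagrees with ours (or is unknown). Leave the
                    // record in place so the caller can log it before calling forget().
                    return Outcome.HEURISTIC_CONFLICT;
                default:
                    throw e;
            }
        }
    }
}
```

Until forget() is called, a heuristically completed branch may keep showing up in later recover() scans, so the caller needs some policy for the conflict case rather than retrying the commit forever.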

                            • 11. Re: XAResource.recover() doesn't seem to work in standalone mode
                              Mark Little Master

                              As Tim says, you're writing a transaction manager. Don't bother if there's a perfectly good one already there for you to use, such as JBossTS.

                              • 12. Re: XAResource.recover() doesn't seem to work in standalone mode
                                Ben Spiller Newbie

                                Yep, thanks for the luck, I can see I'm gonna need it!

                                 

                                Tim - you're warning me that recover() will return heuristically committed/rolled-back transactions... is that actually going to pose a problem? I believe that in the JMS world transactions would only end up heuristically committed/rolled back if the user intervenes (e.g. using JMX) to mess with outstanding transactions - in which case it's their fault if they make the wrong choices (I can just catch and log the exceptions, right?). But presumably HornetQ isn't going to arbitrarily decide to commit/rollback prepared transactions itself?

                                 

                                This stuff is certainly not easy... but every time I see your friendly muppet avatar it makes me feel less stressed. Thanks for the advice, Tim.

                                • 13. Re: XAResource.recover() doesn't seem to work in standalone mode
                                  Ben Spiller Newbie

                                  Hi Mark... hmmm I don't really want to add a 3rd party transaction manager since I'm not in a J2EE container, and don't need most of the complicated transaction management stuff, just the ability to batch messaging send/receive operations into a transaction with an identifier that persists across restarts of the client until I either commit/rollback - which the XAResource appears to provide. If I went down the txn manager route I'd have to correctly implement XAResource for the proprietary persist operation I need to perform in conjunction with sending/receiving, which would itself be fairly complicated. Also, talking to friends in the JMS world (who work in customer support) I've been told they quite often have problems around XA which are fixed by recommending a switch to a different J2EE transaction manager... so there's no guarantee using someone else's txn management code (which would include lots of features I don't need) would actually fix all the problems, and of course it's harder to work around any JMS XA implementation quirks I do come across if using someone else's code to manage the transactions.

                                   

                                  All I really want is:

                                  1. On the message receiving side, a sliding window of unacknowledged messages in my session - so I can receive and process messages m..n while I do some state-persist operation in the background, and then when it's complete acknowledge the messages received before the persist (i.e. 1..(m-1)) without also acknowledging the later messages (m..n) at the same time. This isn't possible with CLIENT_ACKNOWLEDGE (which acks everything received in the session). Using XA I can batch up groups of receive() operations, prepare the txn then move on to receive more messages and commit them when I'm ready... doing this significantly improves the receive throughput and latency I can get in my application, despite the extra overheads of doing XA (e.g. using client_ack and pausing my receive thread while I do a persist: 7,900 msg/sec; using XA to receive: 10,300 msg/sec).
                                  2. On the message sending side, idempotent/exactly-once message sending, even if the client machine crashes at some point. This is only possible by having some kind of identifier associated with the messages (or transactional batch of messages) which persists across crashes of the client. Some JMS providers such as HornetQ have a proprietary way of doing this, but the only way to attach a cross-process-restart identifier to messages using standard JMS is XA transactions.

                                   

                                  To me these seem like entirely reasonable requirements, and it's strange that XA is the only non-proprietary way of addressing them... and a bit scary that XA seems to vary so much between vendors and have so many quirks and complexities. Maybe I should just give up and accept poorer performance and lose nice exactly-once send semantics...

                                  • 14. Re: XAResource.recover() doesn't seem to work in standalone mode
                                    Mark Little Master

                                    First, our transaction manager (JBossTS) can run outside of any container, so that shouldn't be an issue for you.

                                     

                                    Second, not needing "most of the complicated transaction management stuff" seems to cover precisely what you do need. Think about it this way: your participant (XAResource or whatever) needs to record information durably so that it can be run to completion in the event of failures. Now you need something else (in your example I think you are intending to do this work) that records the participants (durably) and replays the outcome in the event of failures. Guess what? That's what the transaction coordinator does.

                                     

                                    If you remain unconvinced then let me know which "complicated transaction management stuff" you think you don't need.