15 Replies Latest reply on Nov 27, 2006 7:54 AM by timfox

    XA recovery integration

    timfox

      I'm looking at the current state of the 1_0_XARecovery branch.

      One thing I'm confused about: I can see the prepared transaction ids being loaded into the transaction repository at server peer startup, and I can see TransactionRepository::getPreparedTransactions returning a list of ids, but I can't see anywhere that the actual transactional state (i.e. the adds/acks) is loaded from the database and "replayed" into the channel.

      I can also see the code Madhu supplied to do this has been removed. What am I missing here?

        • 1. Re: XA recovery integration
          timfox

          So to clarify, we should be doing something like the following:

          On server startup it should look in the db for any prepared tx states.

          These should then be loaded into the transaction repository.

          We also need to load the state for these, i.e. "replay" the transaction through the channel.

          This should be done by recreating acks/refs for that transaction, and basically just sending or acking them transactionally against the channel.

          This will result in the correct tx callbacks being created on the tx.

          Then, if recover() is called from the client on the XAResource, the corresponding ids should be returned, and an entry should be added to the client resource manager (TXState) for each prepared tx (if one doesn't already exist).

          That's basically it.

          Some of the above could be done lazily (i.e. don't reload prepared states until recover() is called), but the basic principle is the same.
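
          A minimal sketch of the steps above, not the actual JBoss Messaging code. Tx, Channel, PreparedState and loadPrepared are hypothetical stand-ins invented for illustration. The point is that replaying an add or an ack against the channel is what registers the callbacks on the tx, so nothing changes in the channel until the tx is committed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReplaySketch {

    /** Hypothetical stand-in for a transaction: it only collects commit callbacks. */
    static class Tx {
        final long id;
        final List<Runnable> commitCallbacks = new ArrayList<>();

        Tx(long id) { this.id = id; }

        void commit() { commitCallbacks.forEach(Runnable::run); }
    }

    /** Hypothetical stand-in for a channel that holds message ids. */
    static class Channel {
        final List<Long> messages = new ArrayList<>();

        // A transactional send only registers a callback on the tx;
        // the message is added when the tx commits.
        void send(long messageId, Tx tx) {
            tx.commitCallbacks.add(() -> messages.add(messageId));
        }

        // A transactional ack likewise defers the removal until commit.
        void acknowledge(long messageId, Tx tx) {
            tx.commitCallbacks.add(() -> messages.remove(Long.valueOf(messageId)));
        }
    }

    /** Hypothetical persisted state of one prepared transaction. */
    static class PreparedState {
        final long txId;
        final List<Long> adds;
        final List<Long> acks;

        PreparedState(long txId, List<Long> adds, List<Long> acks) {
            this.txId = txId;
            this.adds = adds;
            this.acks = acks;
        }
    }

    /** Startup step: recreate each prepared tx and replay its adds/acks. */
    static Map<Long, Tx> loadPrepared(List<PreparedState> fromDb, Channel channel) {
        Map<Long, Tx> repository = new LinkedHashMap<>();
        for (PreparedState state : fromDb) {
            Tx tx = new Tx(state.txId);
            for (long id : state.adds) channel.send(id, tx);
            for (long id : state.acks) channel.acknowledge(id, tx);
            repository.put(state.txId, tx);
        }
        return repository;
    }

    public static void main(String[] args) {
        Channel channel = new Channel();
        channel.messages.add(1L); // message 1 was in the channel before the crash

        // Prepared tx 7 added message 2 and acked message 1.
        List<PreparedState> fromDb = Arrays.asList(
            new PreparedState(7L, Arrays.asList(2L), Arrays.asList(1L)));

        Map<Long, Tx> repository = loadPrepared(fromDb, channel);
        System.out.println(channel.messages); // prints [1]

        repository.get(7L).commit();
        System.out.println(channel.messages); // prints [2]
    }
}
```

          The lazy variant would call the same loadPrepared step from recover() instead of from startup.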

          • 2. Re: XA recovery integration

             

            "timfox" wrote:

            One thing I'm confused about: I can see the prepared transaction ids being loaded into the transaction repository at server peer startup, and I can see TransactionRepository::getPreparedTransactions returning a list of ids, but I can't see anywhere that the actual transactional state (i.e. the adds/acks) is loaded from the database and "replayed" into the channel.


            Can you explain what you mean by replayed into the channel in terms of API calls?

            The handling of acks and references that I was able to extract from the submitted patch are below.

            public List getPreparedTransactions()
            {
               ArrayList prepared = new ArrayList();

               Iterator iter = globalToLocalMap.values().iterator();

               while (iter.hasNext())
               {
                  Transaction tx = (Transaction) iter.next();

                  try
                  {
                     if (trace)
                        log.trace("Loading and handling refs and acks to the Tx " + tx);

                     // TODO: [JPL] should this only apply to STATE_PREPARED transactions?

                     handleReferences(tx, tx.getId());
                     handleAcks(tx, tx.getId());
                  }
                  catch (Exception e)
                  {
                     // TODO: [JPL] fix this
                     e.printStackTrace();
                  }

                  if (tx.xid != null && tx.getState() == Transaction.STATE_PREPARED)
                  {
                     prepared.add(tx.getXid());
                  }
               }

               return prepared;
            }
            
            
            ...
            
            
            /**
             * Load the references and invoke the channel to handle those refs
             */
            private void handleReferences(Transaction tx, long txId) throws Exception
            {
               long messageId = persistenceManager.getMessageIdForRef(txId);

               List refsList = getRefs(messageId);

               // now we got all the refs
               // for each ref loaded, we'll invoke channel.handle
               for (Iterator iter = refsList.iterator(); iter.hasNext();)
               {
                  CoreDestination d = getChannel(persistenceManager.getChannelId(txId), txId);

                  if (trace)
                     log.trace("Handling the channel");

                  d.handle(null, (MessageReference) iter.next(), tx);
               }
            }
            
            /**
             * Load the acks and acknowledge them
             */
            private void handleAcks(Transaction tx, long txId) throws Exception
            {
               long messageId = persistenceManager.getMessageIdForAck(txId);

               List refsList = getRefs(messageId);

               for (Iterator iter = refsList.iterator(); iter.hasNext();)
               {
                  Delivery del = new SimpleDelivery(null, (MessageReference) iter.next());

                  try
                  {
                     if (trace)
                        log.trace("Acknowledging..");

                     ((DeliveryObserver)del).acknowledge(del, tx);
                  }
                  catch (Throwable e)
                  {
                     // TODO: [JPL] fix this
                     e.printStackTrace();
                  }
               }
            }
            
            /**
             * Get the message references based on the messageId from database
             */
            private List getRefs(long messageId) throws Exception
            {
               List noRefsList = new ArrayList();
               List refsList = new ArrayList();

               // Find the message store
               // TODO: [JPL] this needs to be fixed to go through the kernel
               MessageStore ms = getMessageStore();

               // and message reference from store
               MessageReference ref = ms.reference(messageId);

               // The store sometimes doesn't know about the message reference,
               // so the above ref may be null. In that case we need to load the
               // actual message by going back to the database and loading it by id

               if (ref == null)
               {
                  noRefsList.add(new Long(messageId));
               }
               else
               {
                  refsList.add(ref);
               }

               // ask the pm to get the messages from messageId list
               List messagesList = persistenceManager.getMessages(noRefsList);

               for (Iterator iter = messagesList.iterator(); iter.hasNext();)
               {
                  Message m = (Message) iter.next();
                  MessageReference r = ms.reference(m);
                  refsList.add(r);
               }

               return refsList;
            }
            
            
            



            "timfox" wrote:


            I can also see the code Madhu supplied to do this has been removed. What am I missing here?



            There should be no code removed. The only thing is some additions may have accidentally been omitted due to trying to extract a working patch from the multiple submissions. Let me know if something obvious is missing.


            • 3. Re: XA recovery integration

               

              "timfox" wrote:
              So to clarify, we should be doing something like the following:

              On server startup it should look in the db for any prepared tx states.



              Why should we do this at startup? The tx coordinator is driving the recovery, no? It will eventually find transactions in its logs that are still in the prepared state but ought to have been terminated. This is done by the periodic tasks in JBossTS. Once this happens, JBossTS first calls into its integration layer (XAResourceRecovery) to obtain the XAResource references and then drives them via recover() calls. At that point we need to look into our persistent message state and determine whether we need to drive the prepared transactions to termination.

              This is how I understand JBossTS manages things. Granted, I'm not done with the integration layer yet, so I haven't been able to check this understanding against the actual implementation, and I may yet be proven wrong.
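
              To make the coordinator-driven flow concrete, here is a rough sketch of what the resource side of a recovery scan could look like. It uses the real javax.transaction.xa.XAResource flag constants and the Xid interface, but RecoverScanSketch, SimpleXid and addPrepared are hypothetical names, and returning everything on TMSTARTRSCAN is just one simple way to handle the flags, not necessarily what JBoss Messaging or JBossTS does.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class RecoverScanSketch {

    /** Minimal Xid implementation, for illustration only. */
    static final class SimpleXid implements Xid {
        private final int formatId;
        private final byte[] globalId;
        private final byte[] branchQualifier;

        SimpleXid(int formatId, byte[] globalId, byte[] branchQualifier) {
            this.formatId = formatId;
            this.globalId = globalId;
            this.branchQualifier = branchQualifier;
        }

        @Override public int getFormatId() { return formatId; }
        @Override public byte[] getGlobalTransactionId() { return globalId; }
        @Override public byte[] getBranchQualifier() { return branchQualifier; }
    }

    /** Hypothetical view of the prepared transactions found in persistent state. */
    private final List<Xid> prepared = new ArrayList<>();

    void addPrepared(Xid xid) { prepared.add(xid); }

    /**
     * Same shape as XAResource.recover(int): return all prepared Xids when
     * the coordinator starts a scan, and an empty array on follow-up calls.
     */
    Xid[] recover(int flag) {
        if ((flag & XAResource.TMSTARTRSCAN) != 0) {
            return prepared.toArray(new Xid[0]);
        }
        return new Xid[0];
    }

    public static void main(String[] args) {
        RecoverScanSketch resource = new RecoverScanSketch();
        resource.addPrepared(new SimpleXid(1, new byte[] {1}, new byte[] {0}));

        System.out.println(resource.recover(XAResource.TMSTARTRSCAN).length); // prints 1
        System.out.println(resource.recover(XAResource.TMNOFLAGS).length);    // prints 0
    }
}
```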





              • 4. Re: XA recovery integration
                timfox

                 

                "juha@jboss.org" wrote:


                There should be no code removed. The only thing is some additions may have accidentally been omitted due to trying to extract a working patch from the multiple submissions. Let me know if something obvious is missing.


                I'm sorry, I was looking at an old version. I can see you have added that code now :)

                Yes, this is what I mean by replaying. There are multiple problems in the code, several of which you have already spotted, but the overall approach is pretty much right.





                • 5. Re: XA recovery integration
                  timfox

                  One issue is there is no need to manually add callbacks. Replaying the messages/acks will do this for you.

                  • 6. Re: XA recovery integration
                    timfox

                     

                    "juha@jboss.org" wrote:


                    Why should we do this at startup?



                    Check out this thread:

                    http://www.jboss.com/index.html?module=bb&op=viewtopic&t=90841

                    It seems that it is sometimes valid for the tx mgr to call commit/rollback without first calling recover(), hence we need any prepared tx states to already be in the map.
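
                    A small sketch of why the map needs to be populated up front. The class, the String xid keys and the Runnable "commit work" are hypothetical simplifications; only XAException and its XAER_NOTA error code come from the standard javax.transaction.xa API. If prepared state is only loaded lazily inside recover(), a commit that arrives first finds nothing to commit.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.transaction.xa.XAException;

public class PreparedMapSketch {

    // Hypothetical map of prepared transactions, keyed by a string form of the Xid.
    private final Map<String, Runnable> prepared = new ConcurrentHashMap<>();

    /** Called at server startup for every prepared tx found in the database. */
    void loadPreparedAtStartup(String xid, Runnable commitWork) {
        prepared.put(xid, commitWork);
    }

    /** The tx mgr may call this without ever having called recover(). */
    void commit(String xid) throws XAException {
        Runnable commitWork = prepared.remove(xid);
        if (commitWork == null) {
            // Unknown Xid: the prepared state was never loaded into the map.
            throw new XAException(XAException.XAER_NOTA);
        }
        commitWork.run();
    }

    public static void main(String[] args) throws XAException {
        // Eager loading: commit succeeds even though recover() was never called.
        PreparedMapSketch eager = new PreparedMapSketch();
        eager.loadPreparedAtStartup("xid-1", () -> System.out.println("committed xid-1"));
        eager.commit("xid-1"); // prints "committed xid-1"

        // Nothing loaded yet: the same commit fails with XAER_NOTA.
        PreparedMapSketch lazy = new PreparedMapSketch();
        try {
            lazy.commit("xid-1");
        } catch (XAException e) {
            System.out.println(e.errorCode == XAException.XAER_NOTA); // prints true
        }
    }
}
```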





                    • 7. Re: XA recovery integration
                      timfox

                      Another biggie is that the current code is not using the same message store, but you've spotted this one already.

                      • 8. Re: XA recovery integration

                         

                        "timfox" wrote:
                        One issue is there is no need to manually add callbacks. Replaying the messages/acks will do this for you.


                        Ok, I'll add a fix for this to be completed before we release the beta.

                        Thanks.


                        • 9. Re: XA recovery integration

                           

                          "timfox" wrote:

                          Check out this thread:

                          http://www.jboss.com/index.html?module=bb&op=viewtopic&t=90841



                          Haven't read that thread before, will check it.


                          • 10. Re: XA recovery integration

                             

                            "juha@jboss.org" wrote:
                            "timfox" wrote:

                            Check out this thread:

                            http://www.jboss.com/index.html?module=bb&op=viewtopic&t=90841



                            Haven't read that thread before, will check it.


                            Ok, so from that thread you appear to be saying that you already populate the tx map at server startup. Is that the case or not?



                            • 11. Re: XA recovery integration

                               

                              "juha@jboss.org" wrote:

                              Ok, so from that thread you appear to be saying that you already populate the tx map at server startup. Is that the case or not?



                              And according to ServerPeer, yes you do. TransactionRepository is created and loadPreparedTransactions is called, which populates the tx map with prepared transactions.

                              So we should be fine on that account.


                              • 12. Re: XA recovery integration
                                timfox

                                Not sure what you mean by "already".

                                The code in TRUNK doesn't load the transactions at startup, since this was to be done as part of the transaction recovery task, although the intention was always that it would do that.

                                • 13. Re: XA recovery integration
                                  timfox

                                  If you mean, in the 1_0 branch, then yes, but this code is basically just a placeholder.

                                  • 14. Re: XA recovery integration

                                    Well, it's not a placeholder anymore, since the required methods have been filled in.

                                    So the required hookup should be in place for server startup.


