2 Replies Latest reply on Oct 26, 2010 11:10 PM by Grant Little

    What is the expected behaviour for failover when using JCA

    Grant Little Newbie

      What is the expected behaviour of HornetQ live/backup pairs when working in an XA transaction using the JCA connector?

       

      My assumption is that if the primary server fails, everything should fail over seamlessly to the backup without exceptions being thrown. However, this is not the behaviour I am seeing.

       

      I have a simple test scenario.

       

      Environment:

      * HornetQ 2.1.2

      * Ubuntu 10.10

      * Java 1.6.0_21

      * JBoss 4.0.2

       

      1. Remote Primary Server is running

      2. Remote Backup server is running

      3. I have a DefaultMessageListenerContainer connected to the primary server. It is connected to the ConnectionFactory using a local JNDI lookup (in JBoss - defined in a jms-remote-ds.xml file) which connects to the HornetQ servers.

      4. I send a message to an example queue and I can see the DefaultMessageListenerContainer passing it to my simple MessageListener, which just calls System.out.println(((TextMessage) message).getText()).

      5. I kill the primary server

      6. I get XA transaction exceptions being thrown.

      7. Rather than the existing transaction failing over seamlessly, the transaction is ended (due to the XA exception), the connection is destroyed, and a new one is attempted (which fails because the primary is unavailable). After n retries against the primary, a connection to the backup is established.

       

      Part of the issue is that when the connection fails over, it notifies the container, which appears to destroy all of the connections in the connection pool (in JBoss). I have already logged a possible defect, HORNETQ-555, which is related to this.

       

      However, I have applied a local patch to try to stop this happening. In my patch, if failover occurs I simply don't notify the listeners that the connection failed, and let it resume.

       

      Even with my patch I continue to get XA transaction exceptions. I appreciate that this is just a simple local patch and may not fully implement what is required (not to mention that the documentation states that listeners should always be notified, even after failover).

       

      So this is really a question about the expected outcome in this scenario. Is HornetQ expected to fail over seamlessly within the existing XA transaction? If so, has anybody actually had this working in a real-world situation?

       

      I also supplied a patch, HORNETQ-556, to allow FailoverOnInitialConnection to work from the JCA connector. This makes me suspect that I may be using HornetQ in some uncommon way, because from what I can see a server (after a JBoss outage) could never re-establish a connection to a backup server using JCA (it works from a non-JCA environment).

        • 1. Re: What is the expected behaviour for failover when using JCA
          Clebert Suconic Master

          If the failure happened before prepare, you should see a rollback-only error during commit; that is, the commit will fail because the transaction is marked rollback-only.

           

          If the failure happened between prepare and commit, you will need to restart the server. The TM doesn't support replication (as far as I know), so only the live node will have information about the pending TX.

           

          I believe Andy Taylor talked to the TM guys about this recently.
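
To make the two cases in this reply concrete, here is a minimal Java sketch that models only the phase an XA branch had reached when the live node died, and maps it to the outcome described above. The class, enum, and method names are hypothetical, and nothing here calls HornetQ or the JBoss TM; it simply encodes the rule as stated in the reply.

```java
/** Hypothetical model of the two failure windows described above; not HornetQ or JBoss TM code. */
final class FailoverOutcomeModel {

    /** Phase the XA branch had reached on the live node when that node failed. */
    enum Phase { BEFORE_PREPARE, PREPARED_NOT_COMMITTED }

    /** Outcome the reply describes for each phase. */
    enum Outcome { ROLLBACK_ONLY_AT_COMMIT, PENDING_ON_LIVE_NODE_ONLY }

    /** Maps the phase at failure time to the outcome stated in the reply. */
    static Outcome expectedOutcome(Phase phaseAtFailure) {
        switch (phaseAtFailure) {
            case BEFORE_PREPARE:
                // Nothing was prepared, so the later commit fails as rollback-only.
                return Outcome.ROLLBACK_ONLY_AT_COMMIT;
            case PREPARED_NOT_COMMITTED:
                // Only the failed live node knows about the prepared transaction.
                return Outcome.PENDING_ON_LIVE_NODE_ONLY;
            default:
                throw new IllegalArgumentException("unknown phase: " + phaseAtFailure);
        }
    }

    public static void main(String[] args) {
        for (Phase phase : Phase.values()) {
            System.out.println(phase + " -> " + expectedOutcome(phase));
        }
    }
}
```

The second outcome is the one the reply says needs a server restart, since the backup has no record of the prepared transaction.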

          • 2. Re: What is the expected behaviour for failover when using JCA
            Grant Little Newbie

            Thanks Clebert,

             

            I was under the impression that failover would "replay" the messages to the backup server (from its cache) and that the failover would therefore be seamless. I'm guessing this isn't the case then.

             

            I can understand the situation if the failover happens between the prepare and the commit, as this would leave the transaction in a dangerous state. When you say restart the server, I presume you are talking about the JBoss server (as it contains the TM). Is there some way to determine when this server restart is required (as opposed to a standard, albeit hopefully not too common, XA transaction exception), so we can add some monitoring etc.?
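
One possible starting point for the monitoring question is the errorCode field of the standard javax.transaction.xa.XAException, which separates rollback codes (XA_RBBASE to XA_RBEND) from resource manager failures and heuristic outcomes. The sketch below sorts those codes into buckets. The class name, the bucket names, and the choice of which codes count as needing attention are assumptions for illustration only; whether HornetQ or the JBoss TM actually surfaces these codes in the failover scenario above is not confirmed in this thread.

```java
import javax.transaction.xa.XAException;

/** Hypothetical monitoring helper; the buckets are an assumption, not HornetQ or JBoss behaviour. */
final class XaFailureClassifier {

    enum Severity { ROLLED_BACK, NEEDS_ATTENTION, OTHER }

    /** Sorts a standard XA error code into a coarse bucket for alerting. */
    static Severity classify(XAException e) {
        int code = e.errorCode;

        // XA_RBBASE..XA_RBEND are the rollback codes defined by the XA API.
        if (code >= XAException.XA_RBBASE && code <= XAException.XA_RBEND) {
            return Severity.ROLLED_BACK;
        }

        switch (code) {
            case XAException.XAER_RMFAIL: // resource manager unavailable
            case XAException.XAER_RMERR:  // resource manager error
            case XAException.XAER_NOTA:   // XID not known to the resource manager
            case XAException.XA_HEURHAZ:  // heuristic outcomes
            case XAException.XA_HEURMIX:
            case XAException.XA_HEURCOM:
            case XAException.XA_HEURRB:
                return Severity.NEEDS_ATTENTION;
            default:
                return Severity.OTHER;
        }
    }

    public static void main(String[] args) {
        int[] samples = { XAException.XA_RBROLLBACK, XAException.XAER_RMFAIL, XAException.XA_RETRY };
        for (int code : samples) {
            System.out.println(code + " -> " + classify(new XAException(code)));
        }
    }
}
```

A listener or interceptor that catches XAException could log the bucket and raise an alert only for NEEDS_ATTENTION, leaving ordinary rollbacks as routine noise. Which codes really indicate that a restart is required would have to be checked against the TM in use.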