19 Replies Latest reply on Apr 21, 2010 1:23 PM by timfox

    AsynchronousFailoverTest ignoring failures

    clebert.suconic

      It looks like AsynchronousFailoverTest is ignoring failures.

       

      I read the test and I don't see where an AssertionFailedError (or any other type of failure) thrown on a worker thread would be reported back to the main thread.

       

       

      For instance, I see this in the Hudson logs, and the test is still "passing":

       

       

          [junit] Exception in thread "Thread-4402" junit.framework.AssertionFailedError: count:0 last count:999
          [junit]     at junit.framework.Assert.fail(Assert.java:47)
          [junit]     at junit.framework.Assert.assertTrue(Assert.java:20)
          [junit]     at org.hornetq.tests.integration.cluster.failover.AsynchronousFailoverTest.doTestTransactional(AsynchronousFailoverTest.java:405)
          [junit]     at org.hornetq.tests.integration.cluster.failover.AsynchronousFailoverTest.access$200(AsynchronousFailoverTest.java:45)
          [junit]     at org.hornetq.tests.integration.cluster.failover.AsynchronousFailoverTest$2.run(AsynchronousFailoverTest.java:93)
          [junit]     at java.lang.Thread.run(Thread.java:619)
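A minimal sketch of one way a test could surface such failures, assuming the worker body can be wrapped in a helper. The class and method names (`WorkerFailureCapture`, `runAndPropagate`) are hypothetical and are not taken from the HornetQ test suite; the idea is only that the worker stores whatever it threw and the main thread rethrows it after `join()`.

```java
import java.util.concurrent.atomic.AtomicReference;

public class WorkerFailureCapture {
    /** Runs body on a new thread and rethrows, on the calling thread, anything it threw. */
    public static void runAndPropagate(Runnable body) throws InterruptedException {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try {
                body.run();
            } catch (Throwable t) {
                failure.set(t); // also catches assertion errors, which are Errors
            }
        });
        worker.start();
        worker.join();
        Throwable t = failure.get();
        if (t != null) {
            // Fail the calling (test) thread instead of only logging on the worker
            throw new AssertionError("worker thread failed: " + t, t);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        runAndPropagate(() -> { /* a passing body */ });
        try {
            runAndPropagate(() -> { throw new AssertionError("count:0 last count:999"); });
            System.out.println("failure was ignored");
        } catch (AssertionError e) {
            System.out.println("failure reported: " + e.getCause().getMessage());
        }
    }
}
```

With this shape, an assertion that fails inside the worker fails the test method itself rather than only appearing as "Exception in thread ..." in the log.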
        • 1. Re: AsynchronousFailoverTest ignoring failures
          clebert.suconic

          I will fix the test this afternoon.. and I will see what's broken.

           

          I will keep this thread posted.

          • 2. Re: AsynchronousFailoverTest ignoring failures
            clebert.suconic

            ReplicatedAsynchronousFailoverTest just failed after I fixed the test:

             

            http://hudson.qa.jboss.com/hudson/view/JBM%202/job/HornetQ/lastCompletedBuild/testReport/org.hornetq.tests.integration.cluster.failover/ReplicatedAsynchronousFailoverTest/testTransactional/

             

            I had tried it earlier before cutting Beta2 but it didn't fail.

             

            I don't consider it a big deal for Beta2. (I would have run it more extensively if we were cutting a GA).

             

             

            I will check what the failure is about.

            • 3. Re: AsynchronousFailoverTest ignoring failures
              clebert.suconic

              There are a few issues with the test. For example, it currently doesn't send any messages (that's why the current version always passes: there are no messages to verify).

               

              I have made a few changes, the test is passing locally.

               

               

              There is one issue though on committing ACKs.  There's no way to know if the commit was received on the backup or not.

               

              I was thinking... it would only be possible to fix those cases if we had some sort of handshake to verify whether the backup received the commit, and threw a rolled-back exception only if it wasn't received.

               

               

              I will clean up tomorrow and commit the test. (I had to add a bunch of log.info calls that I don't want committed.. and I'm too lazy to do it tonight :-) )

              • 4. Re: AsynchronousFailoverTest ignoring failures
                timfox

                Clebert Suconic wrote:

                 

                There is one issue though on committing ACKs.  There's no way to know if the commit was received on the backup or not.

                 


                It doesn't matter with acks. Unlike sends, acks can always be resent successfully and reprocessed. If the ref has already been acked, it can just be ignored.

                 

                In other words acks are idempotent, sends aren't.
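To make the distinction concrete, here is a small sketch using a toy in-memory queue. `ToyQueue` is hypothetical and only models the state change being discussed; it is not how HornetQ stores references. Replaying an ack leaves the queue unchanged, while replaying a send adds a second copy.

```java
import java.util.HashMap;
import java.util.Map;

public class IdempotenceSketch {
    /** Hypothetical toy queue; models only the state change, not HornetQ internals. */
    public static final class ToyQueue {
        private final Map<Long, String> refs = new HashMap<>();
        private long nextRef = 0;

        /** Every call adds a new reference, so replaying a send duplicates the message. */
        public long send(String body) {
            long ref = nextRef++;
            refs.put(ref, body);
            return ref;
        }

        /** Removes the reference if present; acking an unknown ref is silently ignored. */
        public void ack(long ref) {
            refs.remove(ref);
        }

        public int size() {
            return refs.size();
        }
    }

    public static void main(String[] args) {
        ToyQueue queue = new ToyQueue();
        long ref = queue.send("credit 100");
        queue.ack(ref);
        queue.ack(ref); // replayed ack: no effect
        System.out.println("after ack replay: " + queue.size());  // prints 0

        queue.send("credit 100");
        queue.send("credit 100"); // replayed send: a second copy appears
        System.out.println("after send replay: " + queue.size()); // prints 2
    }
}
```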

                • 5. Re: AsynchronousFailoverTest ignoring failures
                  clebert.suconic

                  I'm not sure I agree with you.

                   

                   

                  Say you have the following consumer code:

                   

                   

                   

                   
                  // On initialization
                  consumer = session.createConsumer("MoneyTransfer");

                  // At some point
                  try
                  {
                     ClientMessage msg = consumer.receive(500);
                     if (msg != null) // receive() returns null if the timeout expires
                     {
                        msg.acknowledge();
                        session.commit();
                        processCredit(msg);
                     }
                  }
                  catch (HornetQException e)
                  {
                     if (e.getCode() == HornetQException.TRANSACTION_ROLLED_BACK)
                     {
                        // At this point the user expects the message to have been rolled back.
                        // He will just believe what the exception says and won't process the credit.
                     }
                     else if (e.getCode() == HornetQException.UNBLOCKED)
                     {
                        // At this point the user doesn't know what to do.
                        // Will the message be received again or not?
                     }
                  }
                   
                   
                  

                   

                   

                  Problems here:

                   

                  I - The only thing that can happen during failover is TRANSACTION_ROLLED_BACK; UNBLOCKED is never raised in this scenario. And the transaction may actually have been committed on the backup node already. Result: a transaction is lost and the user misses a credit he was supposed to process.

                   

                  II - If UNBLOCKED were working (it's not ATM), the user wouldn't know whether to process the credit. Will the message be received again or not?

                   

                   

                  III - I'm not really sure whether XA would work, or how it would behave here.
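Problem I can be reproduced in a toy model. This is only a sketch under stated assumptions: `ToySession` and `RolledBackException` are hypothetical stand-ins (not HornetQ classes), the ack is folded into `commit()` for brevity, and the "failover" is simulated by applying the commit and then reporting it as rolled back.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class AmbiguousCommitSketch {
    /** Hypothetical stand-in for a HornetQException carrying the rolled-back code. */
    public static final class RolledBackException extends Exception {
        public RolledBackException(String message) { super(message); }
    }

    /** Toy session: on failover the commit is applied, yet the client is told it rolled back. */
    public static final class ToySession {
        private final Deque<String> queue = new ArrayDeque<>();
        private String inFlight;
        private boolean failoverOnNextCommit;

        public ToySession(boolean failoverOnNextCommit, String... messages) {
            this.failoverOnNextCommit = failoverOnNextCommit;
            for (String m : messages) queue.add(m);
        }

        public String receive() {
            inFlight = queue.peek(); // null when nothing is left to deliver
            return inFlight;
        }

        public void commit() throws RolledBackException {
            if (inFlight != null) {
                queue.remove(inFlight); // the ack reached the (toy) backup
                inFlight = null;
            }
            if (failoverOnNextCommit) {
                failoverOnNextCommit = false;
                throw new RolledBackException("transaction rolled back");
            }
        }
    }

    /** Consumer that trusts the exception and waits for redelivery. Returns credits processed. */
    public static int consume(ToySession session, int maxPolls) {
        int credits = 0;
        for (int i = 0; i < maxPolls; i++) {
            String msg = session.receive();
            if (msg == null) continue; // nothing delivered on this poll
            try {
                session.commit();
                credits++; // stands in for processCredit(msg)
            } catch (RolledBackException e) {
                // believes the message will come back, so the credit is skipped
            }
        }
        return credits;
    }

    public static void main(String[] args) {
        System.out.println("no failover: " + consume(new ToySession(false, "credit-1"), 3)); // prints 1
        System.out.println("failover:    " + consume(new ToySession(true, "credit-1"), 3));  // prints 0
    }
}
```

In the failover run the message is gone from the toy queue, but the consumer never processed it because it believed the rollback.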

                  • 6. Re: AsynchronousFailoverTest ignoring failures
                    timfox

                    When you retry the commit, it doesn't matter if the transaction has already been committed, so it's always safe to retry with acks; you don't need any duplicate detection here.

                     

                    Acks are idempotent, sends are not.

                    • 7. Re: AsynchronousFailoverTest ignoring failures
                      clebert.suconic

                      You are only taking our point of view, which is processing messages and ACKs.

                       

                       

                      I'm looking at the user's point of view. Will the message be received again or not? Should the user process the data inside the message or not?

                       

                       

                      I'm about to commit a test where the only thing thrown during failover is a HornetQException with code TRANSACTION_ROLLED_BACK, even though the message was already committed on the backup. As a result, the user might have rolled back the work done after receiving the message, yet the message will never be received again.

                      • 8. Re: AsynchronousFailoverTest ignoring failures
                        timfox

                        I'm not really sure what point you're trying to make here.

                         

                        I was simply making the statement that sends are not idempotent but acks are. That's why it's always safe to retry them.

                         

                        For sends it's different - to avoid duplicate messages you need duplicate detection. With acks, if you try to ack the message again, you simply won't find it and can ignore that - no harm done.

                        • 9. Re: AsynchronousFailoverTest ignoring failures
                          clebert.suconic

                          I'm not talking about ACKs. I'm talking about receiving a message.

                           

                          How does the user know whether the message will be redelivered when the commit fails?

                          • 10. Re: AsynchronousFailoverTest ignoring failures
                            timfox

                            I don't understand the question.

                             

                            Can you rephrase it with some more information / a better description?

                            • 11. Re: AsynchronousFailoverTest ignoring failures
                              clebert.suconic

                              Problems here:

                               

                              I - The only thing that can happen during failover is TRANSACTION_ROLLED_BACK; UNBLOCKED is never raised in this scenario. And the transaction may actually have been committed on the backup node already. Result: a transaction is lost and the user misses a credit he was supposed to process.

                               

                              II - If UNBLOCKED were working (it's not ATM), the user wouldn't know whether to process the credit. Will the message be received again or not?

                               

                               

                              III - I'm not really sure whether XA would work, or how it would behave here.

                               

                               

                              As I said, while consuming a message, the only exception that session.commit() can throw is a HornetQException with code TRANSACTION_ROLLED_BACK.

                               

                              As a consequence, the user doesn't know if the message was ACKed or not.

                               

                              And mainly: Will the message be redelivered or not? ATM there's no way to know that.

                              • 12. Re: AsynchronousFailoverTest ignoring failures
                                timfox

                                If they get transaction rolled back, then they retry the transaction.

                                 

                                This will ensure the ack gets committed. If the transaction was already committed, then it doesn't matter the ack is retried since it's idempotent.

                                 

                                So yes the user does know if the ack was committed or not.

                                 

                                The message will not be redelivered, if the user follows the retry protocol as explained in the user manual.
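One possible reading of "retry the transaction" on the consumer side, as a sketch only: the thread does not quote the user manual's protocol, and `ToySession`, `RolledBackException` and `ackWithRetry` are hypothetical names in a toy model, not HornetQ API. The point it illustrates is that re-acking and re-committing ends in the same state whether or not the first commit had already been applied.

```java
import java.util.HashSet;
import java.util.Set;

public class RetryAckSketch {
    /** Hypothetical stand-in for a HornetQException carrying the rolled-back code. */
    public static final class RolledBackException extends Exception {
        public RolledBackException(String message) { super(message); }
    }

    /** Toy session whose first commit fails; the ack may or may not have been applied already. */
    public static final class ToySession {
        private final Set<Long> unacked = new HashSet<>();
        private final Set<Long> pendingAcks = new HashSet<>();
        private final boolean appliedBeforeFailure;
        private boolean failNextCommit = true;

        public ToySession(boolean appliedBeforeFailure, long ref) {
            this.appliedBeforeFailure = appliedBeforeFailure;
            unacked.add(ref);
        }

        public void acknowledge(long ref) { pendingAcks.add(ref); }

        public void commit() throws RolledBackException {
            boolean failing = failNextCommit;
            failNextCommit = false;
            if (!failing || appliedBeforeFailure) {
                unacked.removeAll(pendingAcks); // removing an already-acked ref is a no-op
            }
            pendingAcks.clear();
            if (failing) throw new RolledBackException("transaction rolled back");
        }

        public boolean isAcked(long ref) { return !unacked.contains(ref); }
    }

    /** Re-acks and re-commits until a commit succeeds; returns the number of attempts used. */
    public static int ackWithRetry(ToySession session, long ref, int maxAttempts)
            throws RolledBackException {
        for (int attempt = 1; ; attempt++) {
            try {
                session.acknowledge(ref);
                session.commit();
                return attempt; // only now would the caller process the message, once
            } catch (RolledBackException e) {
                if (attempt >= maxAttempts) throw e;
            }
        }
    }

    public static void main(String[] args) throws RolledBackException {
        for (boolean applied : new boolean[] {false, true}) {
            ToySession session = new ToySession(applied, 7L);
            int attempts = ackWithRetry(session, 7L, 3);
            System.out.println("appliedBeforeFailure=" + applied
                    + " attempts=" + attempts + " acked=" + session.isAcked(7L));
        }
    }
}
```

Both runs finish with the reference acked after two attempts. The sketch says nothing about application work that was already rolled back in response to the exception.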

                                • 13. Re: AsynchronousFailoverTest ignoring failures
                                  clebert.suconic

                                  > If they get transaction rolled back, then they retry the transaction.

                                   

                                  How? By performing the whole consume loop again?

                                   

                                  I'm telling you: if we tell the user the transaction was rolled back, the user will expect the messages to be redelivered.

                                   

                                  Messages are being lost.

                                   

                                   

                                  >This will ensure the ack gets committed. If the transaction was already committed, then it doesn't matter the ack is retried since it's idempotent.

                                   

                                  Sure.. HornetQ is fine... the ACK will just be ignored. But what about the user's data? How does the user know whether the message will be redelivered?

                                   

                                  As I said, TransactionRolledBack could also mean TransactionCommitted ATM.

                                  • 14. Re: AsynchronousFailoverTest ignoring failures
                                    clebert.suconic

                                    Tim said: "if the user follows the retry protocol as explained in the user manual."

                                     

                                     

                                    The user has a solution for duplicates on send: enabling duplicate detection by setting the duplicate header.

                                     

                                     

                                    But the user doesn't have a solution for losing messages during failover.

                                     

                                     

                                    The user gets a TransactionRolledBackException, so he rolls back all the data he has processed. He then retries consuming the message, but it will never be redelivered because the transaction wasn't really rolled back.
