7 Replies Latest reply on Feb 15, 2012 6:21 AM by Andy Taylor

    Can someone please explain why messages are never delivered in order - in simple (idiot can understand) terms?

    Kent Garner Newbie

      I have been using Hornetq since 2.0.0.GA was released.  I have since upgraded to 2.2.2.Final - 2.2.5.Final and I am now using 2.2.7 (because I need to the bug fix that allows me to retrieve the counts via JMSQueueControl... again)

      Anyway.  I can never remember a time when I was able to consistently retrieve the messages in order.

       

      - I have only 1 producer and 1 consumer at anytime configured per queue.

       

      - My queues are persistent and set to Block not Page.

       

      - My producer is simple it just sends with Session.Auto_ACKNOWLEDGE.

       

      - My Consumer receives messages as fast as it can with Session.CLIENT_ACKNOWLEDGE so that I can insure the message is cached to the drive before acknowledging it.

       

      - All my messages are numbered so I can attempt to keep them in-order on the receiving end.

       

       

       

      Typically, I can expect the messages to slightly out of order for example even numbers will come first ( 2,1,4,3,6,5,8,7, etc )

       

      I just experienced one test of only 20 messages using version 2.2.7 that ordered them like ( 2,4,6,8,10,12,14,  1,3,5,7,9,11,13,15,  16,17,18,19,20 ).

       

      The logging on my producer indicates they were sent to the queue in order.

       

      Please explain this. It is causing me grief on the receiving side because I have to process the messages in order and with this system re-ordering them has gotchas all on its own.

       

      I keep reading that Hornetq guarantees delivery order but I have not yet seen this to be true.

       

      What is the trick?

        • 2. Re: Can someone please explain why messages are never delivered in order - in simple (idiot can understand) terms?
          Kent Garner Newbie

          There is nothing special about my code, however, it is deeply ingrained in a large project and the code would not be easy to show except to - strip out the particulars and then what is the point.

           

          but in a nutshell....

           

          Consumer:

          ...

          createQueueConnection()

          - setExceptionListener()

           

          connection.createQueueSession(false, CLIENT_ACKNOWLEDGE)

          session.createReceiver(queue)

          connection.start()

           

          loop_for_ever{

            receiver.receive(2000) // wait 2 seconds for message

            if ( message != null ){

              cache to drive

              message.acknowledge()

            }

           

            on any exception{

              log exception

              close session/connection and re-establish connection/session/queue etc...

            }

            pause 250 milliseconds // carry over from earlier jboss race condition

          }

          close connections on exit

           

           

          Producer:

          ...

          createQueueConnection()

          - setExceptionListener()

           

          connection.createQueueSession(false, AUTO_ACKNOWLEDGE)

          session.createSender(queue)

          connection.start()

           

          loop_for_ever{

            if( message to send ){

              sender.send(message, DeliveryMode.PERSISTENT, 4, 0)

              log ->  message.getJMSMessageID()

              if ( message != null ){

                cache to drive

                message.acknowledge()

              }

           

              on any exception{

                log exception

                close session/connection and re-establish connection/session/queue etc...

              }

            }

            else // no message pause a bit

              pause 1 second // loop again for next message 

          }

          close connections on exit

           

           

           

          The only thing I have read that makes sense is that delivery order is no longer guaranteed when if a retry is required.

          I have not seen any errors or exceptions and I have not had to re-establish my connection with the queues at anytime on either the consumer or producer.

          Note: I do occasionally stop and restart components during testing.  I notice this issues most during testing becuase I am paying attention to them.

          This system (producer & consumer) is running on the same host and therefore I should not have any ip/network issues.

           

          However, the Hornetq log does have some interesting lines.  Perhaps they are the key.  If so - how do I fix the issue?

           

          see following log segment.

          --------------------------------------------

          hornetq-failure-check-thread] 19:55:43,563 WARNING [org.hornetq.core.protocol.core.impl.RemotingConnectionImpl]  Connection failure has been detected: Did not receive data from /172.16.1.14:53019. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=3]

          [hornetq-failure-check-thread] 19:55:43,564 WARNING [org.hornetq.core.server.impl.ServerSessionImpl]  Client connection failed, clearing up resources for session 57397a6e-56a6-11e1-b2a8-00144f9509a3

          [hornetq-failure-check-thread] 19:55:43,565 WARNING [org.hornetq.core.server.impl.ServerSessionImpl]  Cleared up resources for session 57397a6e-56a6-11e1-b2a8-00144f9509a3

          [hornetq-failure-check-thread] 19:55:43,566 WARNING [org.hornetq.core.server.impl.ServerSessionImpl]  Client connection failed, clearing up resources for session 5743b39f-56a6-11e1-b2a8-00144f9509a3

          [hornetq-failure-check-thread] 19:55:43,567 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421753, ref = Reference[1421753]:RELIABLE:ServerMessage[messageID=1421753,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,567 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421755, ref = Reference[1421755]:RELIABLE:ServerMessage[messageID=1421755,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,567 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421758, ref = Reference[1421758]:RELIABLE:ServerMessage[messageID=1421758,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,568 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421760, ref = Reference[1421760]:RELIABLE:ServerMessage[messageID=1421760,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,568 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421766, ref = Reference[1421766]:RELIABLE:ServerMessage[messageID=1421766,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,568 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421770, ref = Reference[1421770]:RELIABLE:ServerMessage[messageID=1421770,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,568 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421776, ref = Reference[1421776]:RELIABLE:ServerMessage[messageID=1421776,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,569 FINEST [org.hornetq.core.server.impl.ServerConsumerImpl]  Cancelling reference for messageID = 1421787, ref = Reference[1421787]:RELIABLE:ServerMessage[messageID=1421787,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [hornetq-failure-check-thread] 19:55:43,569 FINEST [org.hornetq.core.server.impl.QueueImpl]  Force delivery deliverying async

          [hornetq-failure-check-thread] 19:55:43,569 WARNING [org.hornetq.core.server.impl.ServerSessionImpl]  Cleared up resources for session 5743b39f-56a6-11e1-b2a8-00144f9509a3

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,584 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421753]:RELIABLE:ServerMessage[messageID=1421753,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,585 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421755]:RELIABLE:ServerMessage[messageID=1421755,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,586 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421758]:RELIABLE:ServerMessage[messageID=1421758,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,586 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421760]:RELIABLE:ServerMessage[messageID=1421760,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,587 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421766]:RELIABLE:ServerMessage[messageID=1421766,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,587 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421770]:RELIABLE:ServerMessage[messageID=1421770,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,588 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421776]:RELIABLE:ServerMessage[messageID=1421776,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          [Thread-8 (group:HornetQ-server-threads9050487-28161976)] 19:55:43,588 FINEST [org.hornetq.core.server.impl.QueueImpl]  Queue jms.queue.MLDBR_HL_Content is delivering reference Reference[1421787]:RELIABLE:ServerMessage[messageID=1421787,priority=4,expiration=0, durable=true, address=jms.queue.MLDBR_HL_Content,properties=TypedProperties[{ProcessID=00, Producer=MLDBR}]]

          • 3. Re: Can someone please explain why messages are never delivered in order - in simple (idiot can understand) terms?
            Andy Taylor Master

            The errors you see will happen when you kill a client, any unacked messages will then be redelivered. This would cause odering issues especially if you restart another consumer before the first consumers crash has been discovered.

            • 4. Re: Can someone please explain why messages are never delivered in order - in simple (idiot can understand) terms?
              Kent Garner Newbie

              In my case, I am not killing a client. 

              I may stop/start a client prior to my test, but I have not killed the client during one of these tests where the ordering was messed up. 

              Now, if Hornetq is doing some connection killing/restarting under the covers I do not know or have control over that. 

              I was leaning toward some connection management that is being done under the covers.

               

              like I said, i am not seeing any exceptions being thrown and I am not restarting my queue connection at any time during the test.

              • 5. Re: Can someone please explain why messages are never delivered in order - in simple (idiot can understand) terms?
                Andy Taylor Master

                the following message implies that clients are being closed by theserver

                 

                [hornetq-failure-check-thread] 19:55:43,566 WARNING [org.hornetq.core.server.impl.ServerSessionImpl]  Client connection failed, clearing up resources for session 5743b39f-56a6-11e1-b2a8-00144f9509a3

                 

                this is either because clients arent being closed cleanly or the server is not configured correctly ans is suffering saturation because of lack of memory or some other configuration issue. If the latter you need to supply a configuration and a test we can look at.

                • 6. Re: Can someone please explain why messages are never delivered in order - in simple (idiot can understand) terms?
                  Kent Garner Newbie

                  Andy

                   

                  OK, Thanks,

                   

                  I will add try-finally catches to insure things are closed. 

                   

                  Question: I have the HornetQ logs set to "ALL".  Should I see connection closed messages at the same time that I close the connection? 

                  The reason that I ask is because I know the close is being called based on my logging, however, I do not see a closed connection message in the HornetQ log at the same time.

                  I see a connection closed for port 1098 around 30 seconds after I connect, but then about 1min 30 seconds after I close the connection I see "Connection failure has been detected: Did not receive ping from ..."

                  in the logs.

                   

                  Does it still think I am not closing my connection?

                  In my close do the following ( on consumer )...

                   

                  session.recover();   // if I have not yet acknowledged the message

                  connection.close();

                   

                  Is there more to it that I am missing?

                   

                  btw.  I appreciate your help.

                  • 7. Re: Can someone please explain why messages are never delivered in order - in simple (idiot can understand) terms?
                    Andy Taylor Master

                    if you are seeing 'Connection failure has been detected: Did not receive ping from ...' then its either one of 2 things

                     

                    either

                     

                    1) you are not closing connections properly

                     

                     

                    or

                     

                    2) maybe the server is running out of memory or is overloaded.

                     

                    In case 2 you need to configure it to handle this either via flow control, paging, or by giving more memory to the server