17 Replies Latest reply on Jan 15, 2013 9:06 AM by ataylor

    Messages get lost from the queue!!

    janvandeklok

      Hello there,

       

      We have been using HornetQ messaging for over a year now without any problems. We are using the embedded version in JBoss 6.0.1_final.

       

      Up to now we had about 120 queues, each with 1 consumer and each capable of paging. We have tested some of these queues with over 140,000 messages with no problem at all.

       

      Now we have changed this to about 700 queues using paging. We have loaded a lot of messages into the system and have run into a problem that we can't figure out.

       

      Our pages on disk are about 2 MB each. On several queues the messageCounter states that there are still a number of messages on the queue; however, there is just 1 page file on disk, which has a size of 0 bytes, and the listMessages call on the queue bean in the JBoss console returns 0 messages!

      In our database log we still see that the number of messages that have not been processed equals the messageCounter value. In other words, we have lost a number of messages.

      We checked the logs of the JBoss app server but could not find errors that would indicate such a failure. Also, we process each message in a separate transaction and commit or roll back depending on the processing result.

       

      - What is happening here?

      - Has anyone else run into the same problem?

      - Could someone give me a clue where to look for the source of the problem?

      - What will happen when the app server hits its Xmx limit, for instance due to the in-memory messages for each of the 700 queues?

      - By the way, we do see a HornetQException[errorCode=3 message=Timed out waiting for response when sending packet 43] error, after which we roll back that single transaction, but IMHO this should not cause loss of messages.

       

      Our appserver runs with these memory settings : -Xms3072M -Xmx4096M -XX:MaxPermSize=256m

      The JBoss app server itself and the deployed apps do not use more than 1.2 GB, so that leaves at least 1.8 GB for the messages in memory. Will this be enough for 700 queues where the disk page size is 2 MB?

       

      Any help is greatly appreciated!!!!

       

      Jan van de Klok

       

       

       

      Here is our queue configuration:

       

      <configuration xsi:schemaLocation="urn:hornetq /schema/hornetq-configuration.xsd">

        <!--
          Don't change this name.
          This is used by the dependency framework on the deployers,
          to make sure this deployment is done before any other deployment
        -->
        <name>HornetQ.main.config</name>

        <log-delegate-factory-class-name>org.hornetq.integration.logging.Log4jLogDelegateFactory</log-delegate-factory-class-name>

        <bindings-directory>${jboss.server.data.dir}/hornetq/bindings</bindings-directory>
        <journal-directory>${jboss.server.data.dir}/hornetq/journal</journal-directory>
        <journal-file-size>${hornetq.journal.file.size:1048576}</journal-file-size>
        <journal-min-files>${hornetq.journal.min.files:2}</journal-min-files>
        <large-messages-directory>${jboss.server.data.dir}/hornetq/largemessages</large-messages-directory>
        <paging-directory>${jboss.server.data.dir}/hornetq/paging</paging-directory>

        <!-- true to expose HornetQ resources through JMX -->
        <jmx-management-enabled>true</jmx-management-enabled>

        <connectors>
          <connector name="netty">
            <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
            <param key="host" value="${jboss.bind.address:localhost}"/>
            <param key="port" value="${hornetq.remoting.netty.port:5445}"/>
          </connector>
          <connector name="netty-throughput">
            <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
            <param key="host" value="${jboss.bind.address:localhost}"/>
            <param key="port" value="${hornetq.remoting.netty.batch.port:5455}"/>
            <param key="batch-delay" value="50"/>
          </connector>
          <connector name="in-vm">
            <factory-class>org.hornetq.core.remoting.impl.invm.InVMConnectorFactory</factory-class>
            <param key="server-id" value="${hornetq.server-id:0}"/>
          </connector>
        </connectors>

        <acceptors>
          <acceptor name="netty">
            <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
            <param key="host" value="${jboss.bind.address:localhost}"/>
            <param key="port" value="${hornetq.remoting.netty.port:5445}"/>
          </acceptor>
          <acceptor name="netty-throughput">
            <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
            <param key="host" value="${jboss.bind.address:localhost}"/>
            <param key="port" value="${hornetq.remoting.netty.batch.port:5455}"/>
            <param key="batch-delay" value="50"/>
            <param key="direct-deliver" value="false"/>
          </acceptor>
          <acceptor name="in-vm">
            <factory-class>org.hornetq.core.remoting.impl.invm.InVMAcceptorFactory</factory-class>
            <param key="server-id" value="0"/>
          </acceptor>
        </acceptors>

        <security-settings>
          <security-setting match="#">
            <permission type="createDurableQueue" roles="guest"/>
            <permission type="createNonDurableQueue" roles="guest"/>
            <permission type="consume" roles="guest"/>
            <permission type="send" roles="guest"/>
            <permission type="manage" roles="guest"/>
          </security-setting>
        </security-settings>

        <address-settings>
          <address-setting match="jms.queue.studielink.delivery.#">
            <!-- -1 means no re-delivery-attempt limit -->
            <max-delivery-attempts>-1</max-delivery-attempts>
            <!-- size of disk pages: 2 MB -->
            <page-size-bytes>2097152</page-size-bytes>
            <max-size-bytes>3145728</max-size-bytes>
            <address-full-policy>PAGE</address-full-policy>
          </address-setting>
        </address-settings>

      </configuration>
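
As a rough check on the memory question in the post, here is a minimal sketch of the worst-case arithmetic. It assumes that all 700 queues match the `jms.queue.studielink.delivery.#` address-setting above and that each address can hold up to `max-size-bytes` (3145728) in memory before it starts paging. The 1.8 GB headroom is the figure quoted in the post; per-message object overhead is ignored, so the real footprint would be higher. The class and method names are illustrative only.

```java
public class PagingMemoryEstimate {
    public static final long MIB = 1024L * 1024L;

    /** Worst case: every address is filled up to max-size-bytes at the same time. */
    public static long worstCaseBytes(int addressCount, long maxSizeBytes) {
        return Math.multiplyExact((long) addressCount, maxSizeBytes);
    }

    /** True if the worst-case footprint fits into the available heap headroom. */
    public static boolean fits(long requiredBytes, long headroomBytes) {
        return requiredBytes <= headroomBytes;
    }

    public static void main(String[] args) {
        int queues = 700;                             // number of paging queues from the post
        long maxSizeBytes = 3_145_728L;               // <max-size-bytes> from the config
        long headroom = (long) (1.8 * 1024 * MIB);    // "at least 1.8 GB" from the post

        long worst = worstCaseBytes(queues, maxSizeBytes);
        System.out.printf("worst case: %d MiB, headroom: %d MiB, fits: %b%n",
                worst / MIB, headroom / MIB, fits(worst, headroom));
    }
}
```

Under these assumptions the worst case is 700 × 3 MiB = 2100 MiB, which is more than the 1.8 GB the post counts on, so lowering `max-size-bytes` or raising the heap would be worth considering if many queues can fill up at once.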

        • 1. Re: Messages get lost from the queue!!
          gaohoward

          First, I think you should try the latest release. We have made a lot of fixes/enhancements during the past year, and your issue may already have been fixed.

          Second, the issue seems to be that some messages have been received but the transactions were never committed or rolled back. You mentioned having seen some error messages and assumed they won't cause message loss. I suspect, however, that this is something you should look into further. Do you have the full stack trace?

          • 2. Re: Messages get lost from the queue!!
            janvandeklok

            Hello Yong (is that your first name?)

            Thanks for responding. We will try to upgrade to the latest version, but that will take a lot of time due to SLAs with our customer and hosting provider. Up to now we have not had this kind of problem.

            We are using HornetQ Server version 2.2.5.Final (HQ_2_2_5_FINAL_AS7, 121) now.

            About the fact you mention that we do not roll back or commit: IMHO this should never create a difference between the reported number of messages in the queue and the number of messages that are really on the queue, since both queries are delegated to the JMS API!!

            If this can be the case there is a bug in the HornetQ implementation.

            Thanks to our logging we know that the reported number of messages is correct, so it looks like we lost some pages of messages or, for instance, all messages that were in memory for some queues.

            We are trying right now to simulate and reproduce this phenomenon, and we hoped someone could give us a clue where to look for the cause of this failure.

             

            Regards

             

            Jan

            • 3. Re: Messages get lost from the queue!!
              ataylor

              "About the fact you mention that we do not roll back or commit: IMHO this should never create a difference between the reported number of messages in the queue and the number of messages that are really on the queue, since both queries are delegated to the JMS API!!

              If this can be the case there is a bug in the HornetQ implementation."

              Can you explain what you mean by this? Are you just saying the message count is incorrect? I know there was a bug in older HQ versions, but it wouldn't make messages disappear.

              One other thing to check is the delivering count: this is messages out for delivery but not yet acknowledged.

              Also bear in mind that if there is a network issue when you call commit, you don't actually know for sure whether it has committed or not; make sure your code handles this correctly.

              • 4. Re: Messages get lost from the queue!!
                janvandeklok

                Hi Andy,

                 

                We write every incoming message to a HornetQ queue and write an audit record that keeps track of time received, time delivered and number of delivery tries.

                Using the audit information we know that the message was received and committed to HornetQ. We catch every exception in the delivery process (including any exception that might come from a commit). We always roll back the transaction when an exception occurs! And of course we always commit the transaction after the message is delivered successfully. (We have checked and double-checked the code for this several times already.)

                The remark you are referring to was a possible cause suggested by Yong, but even if we forgot to roll back or commit, the message count should be in sync with the actual messages on the queue (total of messages in disk pages and in memory), or not?

                This is the info from the JMX console for the faulty queue:

                 

                Attribute        Access  Type     Value
                Durable          R       boolean  True
                ConsumerCount    R       int      1
                MessageCount     R       long     321
                DeliveringCount  R       int      0
                ScheduledCount   R       long     0
                MessagesAdded    R       long     642

                 

                It still reports that there are 321 messages on the queue while there are actually 0 messages on the queue, and we see that the audit files show that 321 messages have not been delivered.
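
The same counters can also be read programmatically rather than through the console. The sketch below is a hedged example using only the standard `javax.management` API. It assumes the code runs inside the same JVM as the server, that the HornetQ MBeans are registered in the platform MBeanServer, and that JMS queue MBeans follow the `org.hornetq:module=JMS,type=Queue,name="..."` naming scheme; verify the exact object name in your own JMX console. The class name and the default queue name are hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class QueueCounterCheck {

    /** Builds the (assumed) object name of a HornetQ JMS queue MBean. */
    public static ObjectName queueObjectName(String queueName) {
        try {
            return new ObjectName(
                    "org.hornetq:module=JMS,type=Queue,name=" + ObjectName.quote(queueName));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad queue name: " + queueName, e);
        }
    }

    /** Reads a numeric attribute (int or long) and widens it to long. */
    public static long readLong(MBeanServerConnection mbs, ObjectName name, String attribute)
            throws Exception {
        return ((Number) mbs.getAttribute(name, attribute)).longValue();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: in-JVM access; a remote JMX connector would be needed otherwise.
        MBeanServerConnection mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = queueObjectName(
                args.length > 0 ? args[0] : "studielink.delivery.example"); // hypothetical queue

        if (!mbs.isRegistered(name)) {
            System.out.println("No MBean registered as " + name);
            return;
        }
        // Attribute names as shown in the JMX console listing above.
        System.out.println("MessageCount    = " + readLong(mbs, name, "MessageCount"));
        System.out.println("DeliveringCount = " + readLong(mbs, name, "DeliveringCount"));
        System.out.println("ScheduledCount  = " + readLong(mbs, name, "ScheduledCount"));
        System.out.println("MessagesAdded   = " + readLong(mbs, name, "MessagesAdded"));
    }
}
```

Logging these values periodically next to the audit records would show when MessageCount starts to diverge from what is actually on the queue.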

                 

                Beats me!!!

                 

                regards 

                 

                Jan

                • 5. Re: Messages get lost from the queue!!
                  ataylor

                  "We write every incoming message to a HornetQ queue and write an audit record that keeps track of time received, time delivered and number of delivery tries.

                  Using the audit information we know that the message was received and committed to HornetQ. We catch every exception in the delivery process (including any exception that might come from a commit). We always roll back the transaction when an exception occurs! And of course we always commit the transaction after the message is delivered successfully. (We have checked and double-checked the code for this several times already.)"

                  Just a note: just because commit failed doesn't mean the session didn't commit.

                  "The remark you are referring to was a possible cause suggested by Yong, but even if we forgot to roll back or commit, the message count should be in sync with the actual messages on the queue (total of messages in disk pages and in memory), or not?"

                  Like I mentioned, there were some bugs in the message count which are now fixed, but the actual messages on the queue should be correct.

                  • 6. Re: Messages get lost from the queue!!
                    janvandeklok

                    "Just a note: just because commit failed doesn't mean the session didn't commit."

                    ??? Could you explain that to me? Why should a commit fail when it succeeded???

                    So, if a commit fails and I do a rollback, the rollback may not be executed because the transaction was committed??? This is getting confusing!!

                    Again, what we see is that the counter is correct but messages that were put on the queue disappeared without being processed by a consumer.

                    ------------------------

                    We tried to reproduce the problem in our development environment, but there everything seems to work fine (up to now).

                    We will upgrade to version 2.2.14 just to make sure.

                    We now have a number of queues that actually contain 0 messages but where the message counter reports a number > 0.

                    Is there a way to manually reset these counters??

                     

                    Jan

                    • 7. Re: Messages get lost from the queue!!
                      ataylor

                      "??? Could you explain that to me? Why should a commit fail when it succeeded?"

                      I mean that if an exception is thrown on a commit call, you can't guarantee whether the commit failed or succeeded.

                      "So, if a commit fails and I do a rollback, the rollback may not be executed because the transaction was committed??? This is getting confusing!!"

                      Yes, that is correct. If, say, there were a network problem at the point commit was called, then you don't know for sure whether the commit was executed on the server; you need XA for this.
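
One common way to make the ambiguous-commit case harmless without XA is to make message processing idempotent, so that a redelivery after a commit whose outcome is unknown does no damage. The sketch below illustrates only the idea; it does not use the HornetQ or JMS API, the class name and message IDs are hypothetical, and the in-memory set stands in for what would have to be a durable store (for example a table updated in the same database transaction as the business work).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentReceiver {

    // Assumption: in-memory only to keep the sketch self-contained; it must be
    // durable in a real system, otherwise a restart forgets what was processed.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    /**
     * Runs the work at most once per message ID.
     * Returns true if the work ran, false if the ID had already been processed.
     */
    public boolean handle(String messageId, Runnable work) {
        if (!processedIds.add(messageId)) {
            return false; // duplicate delivery, e.g. after a commit with unknown outcome
        }
        try {
            work.run();
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(messageId); // allow a retry on genuine failure
            throw e;
        }
    }

    public static void main(String[] args) {
        IdempotentReceiver receiver = new IdempotentReceiver();
        int[] timesProcessed = {0};

        receiver.handle("ID:1", () -> timesProcessed[0]++); // first delivery
        receiver.handle("ID:1", () -> timesProcessed[0]++); // redelivery of the same message

        System.out.println("times processed: " + timesProcessed[0]); // prints "times processed: 1"
    }
}
```

With this in place, rolling back after a failed commit call is safe either way: if the server did not commit, the message is redelivered and processed once; if it did commit, nothing is redelivered and nothing is processed twice. An audit trail that records "not delivered" whenever commit throws would still need a separate check of what the server actually holds.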

                       

                      "We tried to reproduce the problem in our development environment, but there everything seems to work fine (up to now).

                      We will upgrade to version 2.2.14 just to make sure."

                      It's unlikely that messages would just disappear; either they never got to the queue or they have been consumed. If the message counter says they were counted, then I would assume the latter. Maybe a rogue consumer?

                      To reset the message counter you need to restart the server.

                      • 8. Re: Messages get lost from the queue!!
                        janvandeklok

                        A rogue consumer is out of the question. We have just 1 consumer for this queue, and even if there were a rogue consumer, the message counter should still be in sync.

                        I was looking for a list of bugs fixed in 2.2.14 but could not find it. Is there such a list (without getting into your bug tracking system)?

                        We already restarted the JBoss server, but the message counter is still off!! I need a way to reset it.

                        • 9. Re: Messages get lost from the queue!!
                          ataylor

                          "A rogue consumer is out of the question. We have just 1 consumer for this queue, and even if there were a rogue consumer, the message counter should still be in sync."

                          OK, forget the message counter; like I say, there was a bug where this could be wrong. However, if messages added = 100 and there are no messages in the queue, then they have to have been delivered. If you still have the journal, you could use the PrintData tool to analyze it.

                          "I was looking for a list of bugs fixed in 2.2.14 but could not find it. Is there such a list (without getting into your bug tracking system)?"

                          You would need to look through JIRA: https://issues.jboss.org/browse/HORNETQ.

                          • 10. Re: Messages get lost from the queue!!
                            janvandeklok

                            Andy,

                             

                            We already restarted the JBoss server, but the message counter is still off!!

                            I need a way to reset it. (This is about MessageCount, not MessagesAdded.)

                            • 11. Re: Messages get lost from the queue!!
                              ataylor

                              If you restart the server, the message count is reset to 0. If it is > 0 afterwards, then this is for messages that were added to the queue from the journal.

                              • 12. Re: Messages get lost from the queue!!
                                janvandeklok

                                I'm not sure that I understand this. As far as I am aware there are no messages on the queue after the restart and the consumer is waiting for a message to arrive; at the same time the messageCount says there are 351 messages on the queue, but no message is arriving for consumption.

                                As soon as we put a message on the queue, that exact message is processed, and we see the messageCount go up and down again by 1 as that message is processed. After that the consumer is waiting again for a message to arrive and the counter is still off.

                                If messages had been placed on the queue from the journal, I would expect the consumer to start processing messages. This is not the case!

                                 

                                I'm very confused now .....

                                • 13. Re: Messages get lost from the queue!!
                                  ataylor

                                  Without debugging the restart I'm not sure what could be happening. Could you debug QueueImpl to see what is happening? Check deliveringCount; this is how many messages are actually in the queue. Maybe they are paged and you have some invalid configuration stopping depaging?

                                  • 14. Re: Messages get lost from the queue!!
                                    janvandeklok

                                    If I find a way to reproduce it I can debug it. I am still not able to reproduce it now :-(

                                    "Maybe they are paged and you have some invalid configuration stopping depaging?"

                                    If the above were happening, any new message we put on the queue would not be consumed, because it would be added to the last page since the queue is FIFO.

                                    If we add a message to the queue it is immediately processed! So I think we can rule out a depaging problem.

                                    We also checked the paging dir for this queue and it contains 1 page with a size of 0 bytes. I assume that it is normal that 1 page is left behind after all the paged data has been processed. Since the size of that page is 0 bytes, I assume the page holds no messages.
