17 Replies Latest reply on Oct 29, 2012 5:32 AM by kzakhar

    Hanging de-paging. Message-count vs. delivering-count

    kzakhar

      Hi, folks!

       

      Below is an excerpt of my HornetQ configuration (JBoss 7, HornetQ 2.2.16):

       

      <subsystem xmlns="urn:jboss:domain:messaging:1.2">
               <hornetq-server>
                  <clustered>false</clustered>
                  <persistence-enabled>true</persistence-enabled>
                  <journal-file-size>102400</journal-file-size>
                  <journal-min-files>2</journal-min-files>
                  <connectors>
                     <in-vm-connector name="in-vm" server-id="0" />
                  </connectors>
                  <acceptors>
                     <in-vm-acceptor name="in-vm" server-id="0" />
                  </acceptors>
                  <address-settings>
                     <address-setting match="jms.topic.systemBusTopic">
                        <dead-letter-address>jms.queue.dlqQueue</dead-letter-address>
                        <expiry-address>jms.queue.expiryQueue</expiry-address>
                        <redelivery-delay>0</redelivery-delay>
                        <max-size-bytes>1048576</max-size-bytes>
                        <page-size-bytes>204857</page-size-bytes>
                        <address-full-policy>PAGE</address-full-policy>
                        <message-counter-history-day-limit>2</message-counter-history-day-limit>
                     </address-setting>
                  </address-settings>
                  <jms-connection-factories>
                      <connection-factory name="InVmConnectionFactory">
                        <pre-acknowledge>true</pre-acknowledge>
                        <connectors>
                           <connector-ref connector-name="in-vm" />
                        </connectors>
                        <entries>
                           <entry name="java:/ConnectionFactory" />
                        </entries>
                        <consumer-window-size>0</consumer-window-size>
                     </connection-factory>
                     <pooled-connection-factory name="hornetq-ra">
                        <min-pool-size>40</min-pool-size>
                        <max-pool-size>100</max-pool-size>
                        <transaction mode="xa" />
                        <connectors>
                           <connector-ref connector-name="in-vm" />
                        </connectors>
                        <entries>
                           <entry name="java:/JmsXA" />
                        </entries>
                        <consumer-window-size>0</consumer-window-size>
                     </pooled-connection-factory>             
                  </jms-connection-factories>
                  <jms-destinations>
                     <!-- ... -->
                  </jms-destinations>
               </hornetq-server>
            </subsystem>
      

       

      systemBusTopic has 80 subscriptions with message selectors.

       

      All of the messages being sent are ObjectMessages, so in case of a burst, paging mode is really needed.

       

      The issue I am facing: after a burst, HornetQ goes into paging mode for systemBusTopic and never comes back:

      • after the burst is over, only 38 new messages were posted to systemBusTopic during the next 30 minutes
      • the new messages go to the paging store
      • old messages are not consumed from the paging store
      • JBoss CLI observations for systemBusTopic:
        • delivering-count == 0
        • message-count > 0 and increasing as new messages are received
        • the list-messages-for-subscription CLI operation does not show any messages ("result" => []), even though message-count > 0 for the requested subscription

       

      I have attached a document with more details on these observations.

       

      One of my questions: why is there an inconsistency between delivering-count and message-count?

       

      As I understand it:

      • delivering-count - number of messages that this queue is currently delivering to consumers, i.e. (# of delivered, but not yet acknowledged) + (# of "delivering", for example because a consumer is busy)
      • message-count - all messages in the queue, including the scheduled and delivered
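
A toy model of these two definitions, written only from the descriptions in this thread, may help when reading the counters. It is not HornetQ code: the class `QueueCounters` and all of its members are hypothetical names, and the assumption is that message-count is the sum of waiting, scheduled and delivering messages.

```java
// Toy model of the two counters as they are described in this thread.
// Not HornetQ code: the class and all member names are hypothetical.
final class QueueCounters {
    private int waiting;    // in the queue, not yet handed to any consumer
    private int scheduled;  // held back for scheduled delivery
    private int delivering; // handed to a consumer, not yet acknowledged

    void enqueue() { waiting++; }

    void schedule() { scheduled++; }

    // Hand one waiting message to a consumer.
    void deliver() {
        if (waiting == 0) {
            throw new IllegalStateException("nothing to deliver");
        }
        waiting--;
        delivering++;
    }

    // The consumer acknowledges one delivered message; it leaves the queue.
    void acknowledge() {
        if (delivering == 0) {
            throw new IllegalStateException("nothing to acknowledge");
        }
        delivering--;
    }

    int deliveringCount() { return delivering; }

    // Assumption: everything still owned by the queue, delivered or not.
    int messageCount() { return waiting + scheduled + delivering; }

    public static void main(String[] args) {
        QueueCounters q = new QueueCounters();
        q.enqueue();
        q.enqueue();
        q.enqueue();
        q.deliver();
        // One message is with a consumer, two are still waiting.
        System.out.println(q.deliveringCount() + " / " + q.messageCount()); // prints "1 / 3"
    }
}
```

In this model the two counters are equal only when no message is waiting for a consumer, so delivering-count == 0 together with message-count > 0 means messages are queued but none has been handed out.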

       

      Since I use neither scheduled messages nor CLIENT_ACKNOWLEDGE, I expect those two values to be the same.

       

      In my case, delivering-count == 0 and message-count > 0: some messages are sitting in the queue and are not being processed by the consumers. This prevents HornetQ from de-paging, because HornetQ first needs to process the oldest messages before reading from the paging store, and at the same time those messages will never reach the consumers. Is this a deadlock?

       

      Meanwhile, I have not seen any messaging issues in the log files that would indicate an abnormal situation.

       

      I would appreciate your feedback and guidance.

       

      Best regards,

      Konstantin

        • 1. Re: Hanging de-paging. Message-count vs. delivering-count
          gaohoward

          You set consumer-window-size to 0, which means no consumer buffering. So HornetQ won't deliver the next message to the client until the previous one has been consumed and acked.

          When you have many messages in the queue but the consumer is slow, messages will accumulate in the queue. So the message count is not zero.

           

          Regarding paging, the server will always deliver the messages in queue memory first, then load messages from the paging store into queue memory. If you have a lot of paged messages, the last message will take a relatively long time to reach the consumer, but it will get there eventually.
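
The delivery order described above can be sketched as a toy model. This is not HornetQ code: `PagedQueueSketch` and its members are hypothetical names, sizes are counted in messages instead of bytes, and the rule that new messages keep going to the store while paging is active is an assumption taken from the observations in the original post.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only, not HornetQ code: every name here is hypothetical and
// sizes are counted in messages rather than bytes.
final class PagedQueueSketch {
    private final Deque<String> memory = new ArrayDeque<>();      // in-memory part of the queue
    private final Deque<String> pagingStore = new ArrayDeque<>(); // messages paged out
    private final int memoryLimit; // stands in for max-size-bytes
    private final int pageSize;    // stands in for page-size-bytes
    private boolean paging;

    PagedQueueSketch(int memoryLimit, int pageSize) {
        this.memoryLimit = memoryLimit;
        this.pageSize = pageSize;
    }

    // Assumption: once paging has started, new messages keep going to the
    // store until it has been drained.
    void send(String message) {
        if (paging || memory.size() >= memoryLimit) {
            paging = true;
            pagingStore.addLast(message);
        } else {
            memory.addLast(message);
        }
    }

    // In-memory messages are delivered first; a page is loaded only when
    // the in-memory part is empty. Returns null when nothing is left.
    String deliverNext() {
        if (memory.isEmpty()) {
            depage();
        }
        return memory.pollFirst();
    }

    // Move up to one page of messages from the store back into memory.
    private void depage() {
        for (int i = 0; i < pageSize && !pagingStore.isEmpty(); i++) {
            memory.addLast(pagingStore.pollFirst());
        }
        if (pagingStore.isEmpty()) {
            paging = false;
        }
    }

    boolean isPaging() { return paging; }

    int pagedCount() { return pagingStore.size(); }

    public static void main(String[] args) {
        PagedQueueSketch queue = new PagedQueueSketch(2, 2);
        for (int i = 1; i <= 5; i++) {
            queue.send("m" + i);
        }
        String message;
        while ((message = queue.deliverNext()) != null) {
            System.out.println(message + " paging=" + queue.isPaging());
        }
    }
}
```

In this sketch the last paged message is reached only after everything ahead of it has been delivered, and de-paging is triggered only once the in-memory part is empty. If delivery from memory stalls, nothing is ever loaded from the store.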

           

           

          Howard

          • 2. Re: Hanging de-paging. Message-count vs. delivering-count
            kzakhar

            Howard, thanks for the answer.

             

             

            You set consumer-window-size to 0, which means no consumer buffering. So HornetQ won't deliver the next message to the client until the previous one has been consumed and acked.

            When you have many messages in the queue but the consumer is slow, messages will accumulate in the queue. So the message count is not zero.

            Yes, I'm OK with the absence of consumer buffering. I agree that in this case the new messages will stay in the queue until the previous one has been consumed and acknowledged, i.e. message-count > 0. But messages that are not yet acked should also make delivering-count > 0; that's the inconsistency I'm talking about. I will add an additional debug message to verify, but for now I see that no messages at all are "being processed" by consumers.

             

             

            Regarding paging, the server will always deliver the messages in queue memory first, then load messages from the paging store into queue memory. If you have a lot of paged messages, the last message will take a relatively long time to reach the consumer, but it will get there eventually.

            For 30 minutes, no messages were consumed from the paging store, and in fact no messages were consumed at all; newly arrived messages only went to paging.

             

            Best regards,

            Konstantin

            • 3. Re: Hanging de-paging. Message-count vs. delivering-count
              gaohoward

              Re: not acked messages should also set the delivering-count > 0

               

              The delivering-count should equal the number of messages that have been delivered but not yet acked, which means that messages which have not been delivered (i.e. whose payload has not left the server) are not counted.

               

              Re: For 30 minutes, no messages were consumed from the paging store, and in fact no messages were consumed at all; newly arrived messages only went to paging.

               

              Do you mean that, once in paging mode, message delivery stops forever after all in-memory messages have been delivered? If you can confirm this, I think it may be a bug.

               

              Howard

              • 4. Re: Hanging de-paging. Message-count vs. delivering-count
                kzakhar

                The delivering-count should equal the number of messages that have been delivered but not yet acked, which means that messages which have not been delivered (i.e. whose payload has not left the server) are not counted.

                OK, let's rephrase this to be closer to the issue statement. When at least one message is being processed by the consumer (in my case an MDB) and is not yet acked (no CLIENT_ACKNOWLEDGE on my side), the delivering-count should be > 0.

                 

                 

                Do you mean that, once in paging mode, message delivery stops forever after all in-memory messages have been delivered? If you can confirm this, I think it may be a bug.

                Based on the attached info, delivering-count is always 0, no messages are being consumed, and the paging store size only increases.

                 

                I am in the process of adding debug messages to all 80 subscriptions to verify that no messages are processed after the hang.

                 

                Can you please suggest what else I could check to pinpoint the issue?

                 

                Once I have the additional info, I hope to be able to prepare an isolated test scenario for this.

                • 5. Re: Hanging de-paging. Message-count vs. delivering-count
                  kzakhar

                  The last time I reproduced the issue (on the full environment), I added debug messages (entering/leaving) to #onMessage for one of the subscriptions (call it ***) that is not used intensively.
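
A minimal sketch of such entering/leaving instrumentation, assuming a plain helper wrapped around the body of onMessage. `OnMessageProbe` is a hypothetical class, not part of HornetQ or JMS, and it prints to standard output only to stay self-contained.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of HornetQ or JMS: counts entries and exits
// of a message handler so that stuck invocations show up as entered > left.
final class OnMessageProbe {
    private final AtomicLong entered = new AtomicLong();
    private final AtomicLong left = new AtomicLong();

    // Wrap the real onMessage body. The finally block records the exit even
    // when the body throws.
    void around(String subscription, Runnable body) {
        long n = entered.incrementAndGet();
        System.out.println("entering onMessage #" + n + " for " + subscription);
        try {
            body.run();
        } finally {
            left.incrementAndGet();
            System.out.println("leaving onMessage #" + n + " for " + subscription);
        }
    }

    // Number of handler invocations that have entered but not yet left.
    long inFlight() { return entered.get() - left.get(); }

    long enteredCount() { return entered.get(); }

    public static void main(String[] args) {
        OnMessageProbe probe = new OnMessageProbe();
        probe.around("exampleSubscription", () -> System.out.println("processing"));
        System.out.println("in flight: " + probe.inFlight());
    }
}
```

If the entered count stops growing while the subscription still reports messages, the messages are not reaching the handler at all, as opposed to being stuck inside it.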

                   

                  After the paging hung, I checked the JBoss CLI output for systemBusTopic:

                  • delivering-count == 0
                  • message-count > 0

                   

                  I executed the "list-all-subscriptions" operation and checked the result for the *** subscription:

                  • "messageCount" => big integer 2,

                   

                  Meanwhile, the debug messages indeed show no "entering" message without a matching "leaving".

                   

                  Could it be a bug in the component responsible for the counters (QueueControlImpl?)? That is, HornetQ considers that some messages are still in the queue (# of messages for the page != # of acked) and, because consumer-window-size == 0 and so on, it does not start de-paging?

                  • 6. Re: Hanging de-paging. Message-count vs. delivering-count
                    clebert.suconic

                    There are some issues being fixed in 2.2.21

                    • 7. Re: Hanging de-paging. Message-count vs. delivering-count
                      kzakhar

                      Hi Clebert. Thanks. Is there a JIRA issue for this?

                      • 8. Re: Hanging de-paging. Message-count vs. delivering-count
                        clebert.suconic

                        HORNETQ-1017

                        • 9. Re: Hanging de-paging. Message-count vs. delivering-count
                          kzakhar

                          Thanks, Clebert. I downloaded the HornetQ 2.2.21 source code from Git, will do my tests against it and come back with the results.

                          • 10. Re: Hanging de-paging. Message-count vs. delivering-count
                            clebert.suconic

                            If you wait just a bit, you may get the synchronization fix (although you probably already applied it manually).

                            • 11. Re: Hanging de-paging. Message-count vs. delivering-count
                              clebert.suconic

                              If you are still seeing issues, I will be glad to look into that.

                              • 12. Re: Hanging de-paging. Message-count vs. delivering-count
                                kzakhar

                                Hi Clebert,

                                 

                                Yes, I did the testing with the updated synchronization for FilterImpl. Anyway, I have just downloaded the latest source code from the GitHub repo (hornetq-hornetq-HornetQ_2_2_19_AS7_Final-45-g6ae1752.zip) and checked that it includes both the synchronization fix and the fix for HORNETQ-1017.

                                 

                                Please let me know if I should try another revision/tag for the tests.

                                 

                                For now, I still see the same behavior after a burst of messages:

                                • delivering-count for systemBusTopic == 0
                                • message-count for systemBusTopic > 0
                                • no messages are being processed anymore
                                • via JBoss CLI (list-all-subscriptions) and the debug messages, I see the same picture: the messages are in the queue for the subscription, but there are no "entering" debug messages

                                 

                                Please advise what else I can check to get more details on the issue.

                                 

                                Best regards,

                                Konstantin

                                • 13. Re: Hanging de-paging. Message-count vs. delivering-count
                                  clebert.suconic

                                  Pardon my ignorance, but what does hornetq-hornetq-HornetQ_2_2_19_AS7_Final-45-g6ae1752.zip mean? Is that something generated by GitHub?

                                   

                                   

                                  I didn't create any patch on top of 2.2.19 with the fixes. You would have to download Branch_2_2_AS7 to get the latest fixes.

                                   

                                   

                                   

                                  Can you check that you are using the right version from GitHub? If you are, I would need a way to replicate what you're seeing.

                                  • 14. Re: Hanging de-paging. Message-count vs. delivering-count
                                    kzakhar

                                    Sorry, I didn't make myself clear last time.

                                     

                                    I am not using a Git client to get the HornetQ source code from GitHub; I simply asked GitHub to "ZIP" Branch_2_2_AS7.

                                     

                                    I checked the zipball, and it contains both fixes: the FilterImpl synchronization and the commit for HORNETQ-1017.

                                     

                                    So I believe I'm using the right version from GitHub.

                                     

                                    I do agree that a stable "simple self contained program" would make it easier to debug and understand the root cause, but unfortunately this is not yet feasible.

                                     

                                    In the first place, I am looking for advice on what I should check on my side to narrow down the scenario, e.g.:

                                    - add the debug messages to the "solution" codebase, e.g. tx/non-tx session open/closed, ...

                                    - add the debug messages into the HornetQ library, e.g. to change the level of log records for the depage checks, ...
