5 Replies Latest reply on Aug 28, 2013 4:30 PM by clebert.suconic Branched to a new discussion.

    Performance Degradation while in paging (v2.3.0)

    johnnysoccer

      Running JBoss 7.2.0.Alpha1 with HornetQ 2.3.0.

       

      We are observing that when we get to ~20 page files, the overall performance of the system seems to drop off to the point where it is no longer usable.

       

      Currently, we are testing with 700 durable consumers on a single address, with all consumers actively consuming.  We produce 700,000 messages which go to each of the 700 consumers.

      When we initially hit paging, things continue to work fine and still perform well.  Once we reach 20-25 page files (10 MB per page file), throughput drops dramatically, to between 1/50 and 1/100 of the previous rate, even to the point where consumers seem to be stopped.

       

      We notice that during these situations, the CPU utilization for HornetQ is relatively high. 

      If we stop producing messages, the poor performance persists.

       

      Should we modify our configuration to help with this scenario, for example by allowing more heap memory overhead relative to the max size for an address?

      Is this a known condition that is a side effect of preventing message starvation in 2.3.0?

       

      Any insight would be helpful.

       

      thanks,

      John

        • 1. Re: Performance Degradation while in paging (v2.3.0)
          clebert.suconic

          Are you sending messages transactionally?

           

          Currently, paging is not infinite when you use transactions: every transaction is held in memory. That is something we will change in the next version.

           

          The reason is that paging was not meant to be used forever. It is meant more as a temporary alternative for when a consumer is temporarily dead, not to hold messages forever like in a database.

          • 2. Re: Performance Degradation while in paging (v2.3.0)
            johnnysoccer

            We are not using transactions to publish, and are not using XA transactions when we subscribe. 

            We typically grab up to 20 messages per subscriber when subscribing,  prior to doing a commit on the session.

             

            We are not using paging as permanent storage, but exactly as you describe: as an overflow when there is a very large surge in published messages, which we expect to catch up on over time after the surge.

             

            We have noticed particularly high utilization of the "Par Survivor Space" when things slow down.  I've tried adjusting the -XX:SurvivorRatio setting in the JVM, and that seems to help for a while, but eventually that space still fills up.  So my question comes back to whether you know of memory/configuration settings for dealing with a larger number of page files.  As I mentioned, this behavior is new as of 2.3.0.

             

            Do you know if using -XX:+AlwaysTenure would be a better approach, and avoid the survivor space altogether?
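
For anyone who wants to watch the survivor pool from code rather than from a console, here is a minimal sketch using the standard java.lang.management API. It only reads the memory pools of the JVM it runs in, so it would have to run inside the broker's JVM (or be adapted to a remote JMX connection) to say anything about HornetQ. The class name SurvivorSpaceProbe and the "Survivor" name filter are illustrative assumptions; pool names depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HornetQ or JBoss.
final class SurvivorSpaceProbe {

    /**
     * Returns used/committed (0.0 to 1.0) for every memory pool of the
     * current JVM whose name contains nameFilter.
     */
    static Map<String, Double> poolUtilization(String nameFilter) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!pool.getName().contains(nameFilter)) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            // getUsage() may return null for an invalid pool; skip pools
            // with nothing committed to avoid dividing by zero.
            if (usage == null || usage.getCommitted() <= 0) {
                continue;
            }
            result.put(pool.getName(), (double) usage.getUsed() / usage.getCommitted());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: the survivor pool has "Survivor" in its name,
        // as in "Par Survivor Space".
        poolUtilization("Survivor").forEach((name, ratio) ->
                System.out.printf("%s: %.1f%% of committed%n", name, ratio * 100));
    }
}
```

Sampling this periodically while the page file count grows would show whether the survivor pool really saturates at the point where throughput collapses.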

            • 3. Re: Performance Degradation while in paging (v2.3.0)
              johnnysoccer

              Just some more strange behavior related to our testing.

              After changing some memory parameters (-XX:NewSize=512m -XX:SurvivorRatio=2), we managed to get all the messages consumed, or at least all the consumer clients believe they have consumed all the messages.

               

              When I look at the message count through JMX, every queue has a nonzero count, ranging from -5361 to 9368 (none of the 699 queues has a count of 0), and I still see 11 page files.

              If I attempt to execute removeMessages(null) via JMX, I get a return value of 0.
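
A check like the one described above could be scripted along these lines. This is only a sketch using the standard javax.management API: the class name QueueCountAudit is hypothetical, the "MessageCount" attribute name is an assumption based on the counts discussed here, and the ObjectName pattern for the queue MBeans depends on the deployment, so it is left as an argument.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.AttributeNotFoundException;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper, not part of HornetQ or JBoss.
final class QueueCountAudit {

    /** Reads one numeric attribute from every MBean matching the pattern. */
    static Map<String, Long> readCounts(MBeanServerConnection conn,
                                        String pattern, String attribute) {
        Map<String, Long> counts = new TreeMap<>();
        try {
            for (ObjectName name : conn.queryNames(new ObjectName(pattern), null)) {
                try {
                    Object value = conn.getAttribute(name, attribute);
                    if (value instanceof Number) {
                        counts.put(name.getCanonicalName(), ((Number) value).longValue());
                    }
                } catch (AttributeNotFoundException e) {
                    // This MBean does not expose the attribute; skip it.
                }
            }
        } catch (JMException | IOException e) {
            throw new IllegalStateException("JMX query failed for " + pattern, e);
        }
        return counts;
    }

    /** Keeps only the entries whose count is below zero. */
    static Map<String, Long> onlyNegative(Map<String, Long> counts) {
        Map<String, Long> negatives = new TreeMap<>();
        counts.forEach((name, count) -> {
            if (count < 0) {
                negatives.put(name, count);
            }
        });
        return negatives;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: QueueCountAudit <ObjectName pattern>");
            return;
        }
        // The platform MBean server only sees the local JVM; pass a remote
        // MBeanServerConnection to readCounts to inspect another process.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        Map<String, Long> counts = readCounts(conn, args[0], "MessageCount");
        System.out.println(counts.size() + " MBeans read, negative: " + onlyNegative(counts));
    }
}
```

Taking an MBeanServerConnection rather than a concrete server keeps the same code usable both inside the broker JVM and over a remote connection.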

               

              If I publish more messages to the queues, the consumers consume the new set of messages, but the incorrect message counts stay the same.

              When I restart jboss/hornetq, the consumer clients reconnect when the service comes back up and start consuming more messages.  Message and paging counts were the same at restart.

              After a period of time, all consumers again act as though they have finished consuming.

               

              Page Files: 10

              Message counts: between -6400 and -1453 (all queues have a negative count).

              This time I restart all the consumer clients, and they still behave as though they have no messages.

              When I restart jboss/hornetq (again), consumers do not act as though they've consumed any messages, and almost immediately the page file count goes to 0 and the message count for all queues shows 0 as well.

               

              Though maybe not the cause of any problem, it certainly seems odd to have negative messageCount values at any time.  We only see this happen after going into paging, and then only when we get to some kind of paging level over 25 page files.

              • 4. Re: Performance Degradation while in paging (v2.3.0)
                clebert.suconic

                I'm currently working on cases with negative counters. There is a case with non-persistent messages: they do not update the paging counters, yet they still use page files when going beyond the limits.

                 

                 

                The best way to verify whether you have consumed the messages is to look at the page file folders. Maybe use PrintData / PrintPages.
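
As a rough illustration of looking at the page file folders, the sketch below counts files per subdirectory of a paging directory using only java.nio. The class name PageFileCount is hypothetical, and both the one-subdirectory-per-address layout and the ".page" extension are assumptions to verify against the actual installation. The paging directory is passed as an argument because its location depends on the configuration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of HornetQ or JBoss.
final class PageFileCount {

    /**
     * For each immediate subdirectory of pagingRoot, counts the regular
     * files whose name ends with the given extension.
     */
    static Map<String, Long> countPerDirectory(Path pagingRoot, String extension) {
        Map<String, Long> counts = new TreeMap<>();
        try (DirectoryStream<Path> dirs =
                     Files.newDirectoryStream(pagingRoot, p -> Files.isDirectory(p))) {
            for (Path dir : dirs) {
                long n = 0;
                try (DirectoryStream<Path> files =
                             Files.newDirectoryStream(dir, "*" + extension)) {
                    for (Path file : files) {
                        if (Files.isRegularFile(file)) {
                            n++;
                        }
                    }
                }
                counts.put(dir.getFileName().toString(), n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: PageFileCount <paging directory>");
            return;
        }
        // Assumption: page files use the ".page" extension.
        countPerDirectory(Paths.get(args[0]), ".page")
                .forEach((dir, n) -> System.out.println(dir + ": " + n + " page file(s)"));
    }
}
```

A directory that still reports page files after every consumer has gone idle would be the one to inspect more closely.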

                • 5. Re: Performance Degradation while in paging (v2.3.0)
                  clebert.suconic

                  Ah, there was also a recent paging change that includes a fix:

                   

                  https://github.com/hornetq/hornetq/pull/1239