4 Replies Latest reply on Jan 26, 2010 3:59 AM by Tim Fox

    Restarting HornetQ after paging starts

    John Muhlestein Newbie

      We have seen several instances where we are unable to restart HornetQ after it reaches certain memory thresholds.

       

      Scenario 1)  HornetQ has crashed after running out of memory, due either to a misconfigured paging parameter or to unexpected resource consumption by the combination of all the processes running on the machine.

       

      Scenario 2) We have entered paging and there are a large number of paging files (40+), meaning we have been paging for a while, typically due to one of our durable subscribers not being available to consume messages.

       

      In both cases, HornetQ is now in a stopped state, and we attempt to start it back up again.  HornetQ never manages to start: it either hangs completely or throws an out-of-memory error.  In both instances, the only way to get HornetQ to restart was to raise the -Xmx parameter in the start script significantly (roughly 50% higher than it was), or to clean out the journal and paging directories, which would not be an option for a system in production.

       

      Our typical configuration starts HornetQ with -Xmx1024m and the following paging configuration:

       

            <address-setting match="jms.topic.Replication.#">
               <max-size-bytes>524288000</max-size-bytes>
               <page-size-bytes>10485760</page-size-bytes>
               <address-full-policy>PAGE</address-full-policy>
            </address-setting>

      There are two topics that match this address setting (we have assumed that max-size-bytes is a total for all the queues identified by the match parameter). We have noticed, through jconsole, that despite max-size-bytes being set to roughly half a gigabyte (524288000 bytes), memory typically grows to about 900 MB before paging starts, which also seems wrong, but that is what we have observed.
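      To make the memory budget concrete, here is a rough sketch of the arithmetic under two possible readings of max-size-bytes. The second reading, where the limit applies to each matching address separately rather than as a shared total, is only an assumption on our part and not something we have confirmed. The function and variable names are purely illustrative.

```python
MIB = 1024 * 1024


def paging_budget(heap_bytes, max_size_bytes, matching_addresses):
    """Return (shared_total, per_address_total, per_address_headroom) in bytes.

    shared_total:         in-memory limit if max-size-bytes is one total
                          shared by all matching addresses (what we assumed).
    per_address_total:    in-memory limit if max-size-bytes applies to each
                          matching address separately (unverified assumption).
    per_address_headroom: heap left over under the per-address reading.
    """
    shared_total = max_size_bytes
    per_address_total = max_size_bytes * matching_addresses
    return shared_total, per_address_total, heap_bytes - per_address_total


heap = 1024 * MIB  # -Xmx1024m
shared, per_address, headroom = paging_budget(heap, 524288000, 2)

print(f"shared reading:      {shared // MIB} MiB before paging")
print(f"per-address reading: {per_address // MIB} MiB before paging")
print(f"per-address headroom: {headroom // MIB} MiB of {heap // MIB} MiB heap")
```

      Under the per-address reading, the two topics together could hold about 1000 MiB before either starts paging, which leaves only about 24 MiB of a 1024 MiB heap. That would be consistent with the roughly 900 MB we see in jconsole, though it does not prove that this is what is happening.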
      Finally, all of our subscribers are durable, and in our test cases, there have typically been 4-5 subscribers, though we expect to have more than 400 in a normal situation.
      It seems that, based on this configuration, there should always be enough memory for HornetQ to restart, but this has happened many times (2.0.0.GA).
      Any suggestions on settings we should be using to better protect the system on startup?