9 Replies Latest reply on Aug 26, 2008 11:28 AM by peterj

    JVM pauses periodically

      Hi,

      We are having trouble with our production application server; the JVM seems to pause from time to time.

      This is apparent both in lack of responsiveness, and gaps in the logs. Normally we have tens of log lines per second, but we see gaps where nothing is logged for anything from 4-30 seconds.

      The server is under constant load, and averages about 150 threads, peaking at about 300.

      It's running on 32 bit Ubuntu Server using the Sun 1.6.0-b105 JVM.

      The jvm is started with the following:

      JAVA_OPTS="-server -Xms2048m -Xmx2048m -XX:PermSize=128m -XX:MaxPermSize=128m

      We have a tool which graphs the memory pools of the JVM, and I can see that survivor space and eden space are running at high percentages, while heap usage stays around 20%, peaking at up to 70%.

      At present the JVM uptime is 203035600 and the following GC figures are as follows.

      PS Scavenge: CollectionCount (9364) CollectionTime (321892)
      PS MarkSweep: CollectionCount (57) CollectionTime (67529)

      From my understanding of these figures, the throughput of the JVM seems ok to me, and doesn't seem to account for the pauses.

      I occasionally take a thread dump and find deadlocked threads, but normally there are none.

      My questions are; what should I be monitoring? Can I tweak the JAVA_OPTS to change some (ratio?) settings?

      I've tried to find the right forum for this, so sorry if this isn't it. If you know a better place to ask, I'd be grateful.

      Thanks in advance,

      James

        • 1. Re: JVM pauses periodically
          kjkoster

          Hmm. I'm curious, what is the machine doing during those 4-30 seconds? Is it burning CPU or I/O maybe? Is the JVM in RAM completely or are portions of it paged out to disk?

          You could run "vmstat 2" in a terminal next to "tail -f acces.log". As soon as you see the access log stall, check the vmstat to see what the box is doing, cpu or disk.

          Kees Jan
          http://java-monitor.com/forum

          • 2. Re: JVM pauses periodically
            peterj

            Sounds like you are hitting a major collection - I can see that taking 4-30 seconds to clean up the tenured generation based on the heap size. You could greatly improve your performance by setting the young generation size. For details, see my presentation at http://www.cecmg.de/doc/tagung_2007/agenda07/24-mai/2b3-peter-johnson/index.html

            • 3. Re: JVM pauses periodically
              kjkoster

              Hmm. I'm thinking that if it is a mayor collection, you should be able to see the time spent in the memory view of jconsole. On that view, check the lower panel, where the garbage collectors are shown.

              Peter: so what you are suggesting is that there actually is such a thing as too much memory?

              The JVM comes with several GC's. Perhaps picking a different one would alleviate this?

              • 4. Re: JVM pauses periodically

                OK, the problem has receded now, thanks for your input, much appreciated.

                The confusing thing was the JVM JMX beans showing the GC times and counts for PS Scavenge and PS MarkSweep painted a totally false picture.

                After reading the document suggested by PeterJ, and reading on about Ergonomics of the JVM I arrived at having only one JVM option set:

                -XX:MaxGCPauseMillis=500

                We now let the JVM manage memory itself, as discussed here:

                http://java.sun.com/j2se/1.5.0/docs/guide/vm/gc-ergonomics.html

                The other problem was quickly highlighted once all the GC'ing was eliminated; our half hourly DB backup was gumming up the system.

                Thanks again to everyone for taking the time.

                PeterJ, I had already read your presentation, and I would like to add that I considered implementing something based on your suggestions, but instead of using the GC debug information, getting the info from the JMX beans. The advantages being that i) we already have a graphing solution ready to plug in the output (cacti), and ii) I could turn graphing of GCs on and off without a restart.

                However, I now have to conclude that the JMX beans do not paint such a complete or accurate picture as the GC debug output.

                • 5. Re: JVM pauses periodically
                  peterj

                  The only reliable picture of what is going on with GC is by collecting GC data - mechanisms that use JMX or the JConsole provide only snapshots of current heap usage at the time the snapshot was taken. What is missing from that data is how much heap space survives each GC and in particular the final heap size after a major GC. That data is critical in right-sizing the heap.

                  And yes, too big of a heap is bad. There is an anti-pattern along the lines of "if a little is good then more must be better", and that definitely applies to the heap. If the heap is too large you will pay for it in time spent on a major GC.

                  • 6. Re: JVM pauses periodically
                    kjkoster

                    Dear Peter,

                    What you said earlier in the thread suggests that it is not so much having a massive heap that is an issue, but having too many objects end up in the tenured generation that are really too young. By increasing the size of the young generations, you are effectively raising the age at which objects pour over into the tenured generation.

                    Also, would using a parallel garbage collector not alleviate the pausing?

                    Kees Jan

                    • 7. Re: JVM pauses periodically
                      peterj

                      Smaller young generations get collected more often, which means that objects will more quickly use up their 32 (by default) lives before being placed into into the tenured generation. Thus medium-lived objects tend to end up in the tenured generation. By keeping your young generation larger, you are more apt to "kill" medium-lived objects via a minor collections because there are fewer of them.

                      In one test (on a real-world application, not on a test app) I saw the number of collections go from over 8000 to under 100 by right-sizing the young generation. The total time spent in GC was also drastically reduced from several minutes to under one minute.

                      Both size and number of live objects are factors in GC time. Larger heaps take longer to collect. Given two heaps of the same size, the one with the most number of live objects takes the longest to collect.

                      Parallel collection helps, given one CPU/core per thread. Fortunately, parallel collection is turned on by default starting with JVM 1.5 if you have two CPUs/cores and at least 1GB or RAM (yes RAM, not heap size).

                      • 8. Re: JVM pauses periodically

                        One final question; how much overhead is associated with having GC debug information recorded?

                        I'm thinking of turning it on for a week to give me a sustained period for analysis.

                        Thanks,

                        • 9. Re: JVM pauses periodically
                          peterj

                          If you use the -Xloggc:file-name option to direct logging statements to to a file, the logging overhead will be minimal, probably around 2-3%. (It is slightly higher when dumping it to the console just because writing to the console is slow.)