I'm guessing that a garbage collector thread crashed.
I doubt that a thread crashed - the JVM would have taken a dump if it did.
But I would guess that you are hitting a major garbage collection. A major collection on a 2GB heap will take a long time, and with a parallel collector you will see 100% CPU usage (and a lot of that usage will be kernel time).
See my presentation for ideas on how to determine if the problem really is garbage collection related and how you could go about tuning it.
Another thing you could do - when this slowdown hits, take a JVM thread dump and look at which threads are running and what they are doing. If it is not the GC threads running, then take several thread dumps a few seconds apart - this will help you identify code that could be causing the problem.
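A minimal sketch of taking several dumps a few seconds apart (the pid argument, dump count, and interval are all assumptions; on a JVM of this era, SIGQUIT / kill -3 triggers the thread dump):

```shell
#!/bin/sh
# Take a few thread dumps, a few seconds apart.
# $1 is assumed to be the JVM's pid; the dump itself lands on the
# JVM's stdout/stderr, not in this script's output.
PID=$1
i=1
while [ "$i" -le 3 ]; do
  # SIGQUIT (signal 3) asks the JVM for a thread dump; ignore
  # errors in case the pid is wrong or the process has gone away.
  kill -3 "$PID" 2>/dev/null || true
  sleep 3
  i=$((i + 1))
done
```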
Thank you very much. I have a question right away:
Another thing you could do - when this slowdown hits, take a JVM thread dump and look at which threads are running and what they are doing.
How do I take a thread dump?
I tried kill -SIGQUIT pid, but that didn't do anything.
I'll check out your slides now.
The kill should do it - the output goes to stderr (I think; it could be stdout) of the JVM. If you ran JBossAS as a service, the service start script probably redirects stdout/stderr to a file - look there.
If you ran JBossAS from a terminal, you can enter CTRL-\ in that terminal to get it.
The kill should do it
Thanks for insisting.
It dumps to the terminal the process was started from. So when I log out and reconnect using SSH, I don't know (yet) how to capture the dump.
If you (or anyone) has an idea I would be grateful if you shared it.
Another direction I'm exploring is using jconsole to look at my JVM.
jstack does the job of creating thread dumps. jstack <pid> is all it takes.
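Since jstack writes the dump to its own stdout rather than the JVM's terminal, it works fine over an SSH session; a sketch, where the pid 12345 stands in for the real JVM pid (find it with ps or jps):

```shell
# Redirect jstack's output to a timestamped file; 12345 is a
# placeholder pid. The redirection creates the file even if
# jstack cannot attach, so ignore errors for robustness.
jstack 12345 > "threaddump-$(date +%Y%m%d-%H%M%S).txt" || true
```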
I now have a few thread dumps, but I'm not quite sure what to look for. I always have about 115 threads out of which about 33-38 are RUNNABLE. 32-34 are in state TIMED_WAITING, 32-34 are WAITING.
Correct me if I'm wrong, but I assume the thread or threads that are burning CPU must be in state RUNNABLE, and I guess I will need to take a look at a number of snapshots to actually catch a thread while it is executing.
In all of my dumps the GC Daemon is in state TIMED_WAITING (on object monitor), even though the machine is really busy. Does that mean it's not the GC Daemon?
What else should I look for?
I wasn't able to turn on GC logging on the production server, so I have no results from that side (yet).
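For reference, GC logging on that generation of HotSpot is usually enabled with flags like the following; the log path and the JAVA_OPTS hook (which JBoss's run scripts pick up) are assumptions:

```shell
# Append classic HotSpot GC-logging flags to the JVM options;
# the log file path is an example, adjust to taste.
JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/jboss/gc.log"
```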
Thank you for more help.
One thing to look for is if, over several thread dumps, certain threads are still working on the same code - this could indicate an infinite loop.
Another thing to look for is if certain code always shows up as being the active code, though in different threads. This could point to code with a performance problem; a profiler should help you track that down.
Another possibility is if threads are always waiting on the database, in which case some database tuning is called for. Perhaps adding a missing index.
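A rough way to run those checks over a set of dumps from the command line (the threaddump-*.txt file names are assumptions; the idea is just to spot thread states and stack frames that keep recurring):

```shell
# Count RUNNABLE threads in each dump file (ignore errors if the
# assumed file names don't match):
grep -c 'RUNNABLE' threaddump-*.txt || true
# List the stack frames that recur most often across all dumps -
# frames that stay on top dump after dump are the suspects:
grep -h 'at ' threaddump-*.txt | sort | uniq -c | sort -rn | head
```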
There are lots of other things. This is one of those areas where if you don't feel comfortable doing the analysis, you really need to hire an expert to do it for you.
Thank you, Peter. I will follow up when I have news.
You can analyse the GC behaviour with HPjmeter, which is a free tool.
As for how the GC works: you can choose between different algorithms in Java.
You can refer to this URL: http://java.sun.com/docs/hotspot/gc1.4.2/faq.html
Or you can increase the number of JBoss instances managing your application.
Thanks everyone. Your help is much appreciated. I was able to identify the root cause of the problem. It wasn't the GC, but actually a query that was overwhelming the hardware.
I wonder if I can set the timeout for the transactional context in JBoss.
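If I remember right, on JBoss AS of that era the default transaction timeout is an attribute of the TransactionManager MBean in conf/jboss-service.xml; a sketch, where the MBean class name, file location, and the 300-second value may differ by version and are assumptions here:

```xml
<mbean code="org.jboss.tm.TransactionManagerService"
       name="jboss:service=TransactionManager">
  <!-- Default transaction timeout in seconds (value is an example) -->
  <attribute name="TransactionTimeout">300</attribute>
</mbean>
```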
Thanks for your help. I was able to determine which threads were causing the slowdown. It wasn't GC, but a query that was overwhelming the resources.
Peter, thanks for getting me on the right track to nail this down.