I've been having a strange problem running an application on JBoss 5.1 and have been trying to find the source of it for weeks now, but haven't succeeded yet.
I will try to describe the problem in the hope that somebody has a hint on how to solve it. The application basically accepts SOAP requests, checks some parameters in a PostgreSQL database and inserts records into Bind9 via DNSJava.
The OS is Solaris 10 10/09 with the latest patch cluster available at the time of testing. The Java version is Sun JDK 1.6.0_20.
When I start the application and begin a load test, it works fine for approximately an hour. Garbage collector overhead is below 1% and I haven't noticed anything growing in the JVM heap or the native heap (watching with JProbe). The only thing that grows is the resident memory (RSS column) of the JBoss process, as reported by the prstat command.
Then, after approximately an hour, the value in the RSS column reaches the value in the SIZE column (also from prstat), and the bars in JProbe that show JVM heap occupation (Old Gen, Young Gen, etc.) jump around wildly for some time. Then they stabilize, but from that point on the old generation fills up very frequently, so the GC is forced to collect it very frequently as well. GC overhead now grows and after several hours reaches 15% to 20%. The load test tool shows a significant performance degradation after the first hour. Both the SIZE and RSS columns in prstat continue to grow until, after 8 to 10 hours, an OutOfMemoryError is thrown. Not all physical memory is occupied at that moment, but probably all of the addressable memory is, since I was running the tests in 32-bit mode (just to reach this error earlier). The crash log (hs_err file) written at the moment of the crash is attached below.
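To cross-check the JProbe and prstat numbers from inside the JVM, something like the following could be used. This is only a minimal sketch: the class name JvmMemorySampler and its methods are made up for illustration, it uses nothing but the standard java.lang.management API, and it reports on whichever JVM it runs in, so it would have to be called from within the JBoss process (or the same MXBeans read remotely over JMX) to say anything about this problem.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

// Hypothetical helper, not part of JBoss or JProbe.
public class JvmMemorySampler {

    private static final long MB = 1024L * 1024L;

    /** Total time spent in GC since JVM start, in ms, summed over all collectors. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time as a percentage of elapsed time; 0 if no time has elapsed. */
    public static double gcOverheadPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    /** One line with heap, non-heap and cumulative GC overhead of the current JVM. */
    public static String sample() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage();
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return String.format(Locale.ROOT,
                "uptime=%ds heapUsed=%dMB heapCommitted=%dMB nonHeapCommitted=%dMB gcOverhead=%.2f%%",
                uptime / 1000,
                heap.getUsed() / MB,
                heap.getCommitted() / MB,
                nonHeap.getCommitted() / MB,
                gcOverheadPercent(totalGcTimeMillis(), uptime));
    }

    public static void main(String[] args) throws InterruptedException {
        // Number of samples to print, one per second; defaults to 3.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        for (int i = 0; i < samples; i++) {
            System.out.println(sample());
            Thread.sleep(1000);
        }
    }
}
```

If heapCommitted and nonHeapCommitted stay flat in such a log while RSS in prstat keeps growing, that would suggest the growth is in memory the JVM does not account for in these pools (native allocations, thread stacks and so on).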
What I've tried so far is changing almost every possible JVM parameter, but there was no improvement. I tried JVM 1.6.0_10, 1.6.0_13, 1.6.0_18 and 1.6.0_20, and even OpenJDK 1.6.0_20 (I think this was the version), but again nothing changed.
I also downloaded and compiled the latest JBoss 5.1 from source, but that made no difference either.
Monitoring with JProbe didn't show any memory leaks. I tried to check for memory leaks in native libraries using libumem and the procedure described here http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/memleaks.html, but couldn't find any.
The only things which cause some changes are switching the JBoss version and changing the OS. I did the same load test with our application on JBoss 6.0 M3 and Solaris 10 and this problem didn't occur. I also did the same test on JBoss 5.1 but on Ubuntu and the problem also didn't appear. Unfortunately switching to either of those should be avoided for other reasons.
So does anybody have any advice on how I can find and resolve the source of this problem on JBoss 5.1?
If any additional information is necessary, just let me know.
hs_err_pid14726.log.zip 5.5 KB