I have run into a GC issue while configuring an application to run on Jboss. It looks like once the application has started it pretty much will exclusively Full GC. I initially attributed this to the way the application ran and figured some memory tuning was in order. I made sure I had enough heap, space in the new/old/perm generations, etc. I have since gone through all the VM tuning knobs I could find without luck.
My next step was to isolate some component in my application which might trigger such activity. This led me to strip out all the application pieces and extra Jboss deployments until I was left with only the initial jboss-service.xml. Based on this I determined that a piece in there is causing this behavior, but I am unable to isolate it.
The example output I include below indicates the Full GCs I am seeing. When I have my application deployed and running this problem is exacerbated and leads to 10+ sec Full GC's roughly every minute or two after the application has started. Those Full GC's are a major throughput/performance hit which I need to solve before I can continue with this deployment.
I am including log output and my jboss-service.xml (as an attachment) for background. Any insight would be helpful.
Jboss: ver 3.0.6 (bundled w/Tomcat 4.1.18)
VM: Sun 1.3.1_05
extra cmd line tuning args: -Xms64M -Xmx600M -XX:MaxNewSize=200M -XX:NewRatio=2 -Xnoclassgc -verbose:gc -verbose:class -verbose:jni -XX:MaxPermSize=128M
20030812 12:45:03:020; INFO; org.jboss.deployment.MainDeployer; org.jboss.logging.Logger; [ main ]; Deployed package: file:/home/.../jboss/conf/jboss-service.xml
20030812 12:45:03:026; INFO; org.jboss.system.server.Server; org.jboss.logging.Logger; [ main ]; JBoss (MX MicroKernel) [3.0.6 (CVSTag=JBoss_3_0_6 Date=200301260037)] Started in 0m:8s:802ms
[Dynamic-linking native method java.lang.Runtime.gc ... JNI]
[Full GC 2646K->1375K(64832K), 0.1425423 secs]
[Full GC 1391K->1375K(64832K), 0.1296706 secs]
[Full GC 1391K->1375K(64832K), 0.1275555 secs]
[Full GC 1391K->1331K(64832K), 0.1892810 secs]
[Full GC 1348K->1331K(64832K), 0.1315105 secs]
[Full GC 1348K->1331K(64832K), 0.1226858 secs]
[Full GC 1348K->1331K(64832K), 0.1253095 secs]
[Full GC 1348K->1331K(64832K), 0.1218643 secs]
[Full GC 1348K->1331K(64832K), 0.1366476 secs]
[Full GC 1348K->1331K(64832K), 0.1274152 secs]
[Full GC 1348K->1331K(64832K), 0.1233673 secs]
[Full GC 1348K->1300K(64832K), 0.1345630 secs]
[Full GC 1316K->1300K(64832K), 0.1347744 secs]
[Full GC 1316K->1300K(64832K), 0.1273432 secs]
[Full GC 1316K->1300K(64832K), 0.1548435 secs]
I was able to narrow it down to the default RMI functionality. When compared to the setup I used for the previous post I removed to following:
With the removal of only these MBeans the Full GC behavior disappeared. I read on the sun site's garbage collection tuning document that RMI could cause Distributed GC leading to major collections (Full GCs.)
Does it make sense that the removal of these would help this problem or is it incidental ?
It is hard to tell with the current testing you have performed. Unfortunately, we have removed all possible causes which doesn't tell us the exact nature of the issue.
I note that the reclaim is very little yet the Full GC kicks in anyway. It is also quite a large reclaim time. It seems to indicate that the standard GC routines are struggling.
The large reclaim at the start is probably due to the amount of object artefacts that JBoss creates and discards at boot - I'm investigating this at the moment. The current JBoss routines generate a reasonable number of ArrayLists while reading the XML configs, particularly with nested child nodes. These are then converted to an Iterator set for processing before discard. So that first hit is expected (at least by me).
As I said, the bigger issue is the continual object reaping with little reward. This is a feature of the Sun implementation - you'll find things different under the IBM implementation as under load, GC can be deferred until heap approaches 70% at which point it starts GC work.
One thing to try is to tune your RMI reap timing. Have you attempted:
Or similar? This sets reaping for 1 hour (maybe you want it less). The default is once per minute. As I've said in a semi-formal study, the Sun memory management is complex and for server applications requires tuning science to be applied - unfortunately, this requires an intimate knowledge of all things server and application.
I am fairly confident I have been able to narrow the cause to RMI behavior in the JRMPInvoker.
Since the JRMPInvoker is the unchangeable default for the access to EJBs under jboss 3.0.x a workaround is required. Under Jboss 3.2.x the container's invoker is configurable.
The recommended approach from the Sun site is to use the following switches to effectively turn off RMI (dgc) collections. My test application running with these properties no longer exhibits the Full GC behavior.
reference section 188.8.131.52:
Well, the RMI was the issue. What I was getting at was determining whether it was limited to a particular MBean and whether it is just RMI in general, or perhaps some particular environmental condition - since you removed 3 MBeans, it takes a leap in conditions.
I'd be interested to know if there are any adverse effects from switching off RMI collections for a server under load.
In the end I was able to further narrow it down to the JRMPInvoker Mbean. Once I saw that I was able to isloate the behavior to that MBean I went into the source to confirm how it works. No real surprises in there.
The application in question is designed as a remote bean server that does not access its beans locally. Access to the beans is provided through the HttpInvoker. RMI access via the JRMP invoker is not desired in this case and should not be used. Without RMI activity I don't expect turning off distributed GC to impact the application.
I am currently running through some tests to determine if there is any impact from turning this off. I will post again if I find anything interesting.