PermGen errors make me think of uncleaned ThreadLocal usage, and we actually removed many ThreadLocals on the road from 5.0 to 5.1.
So which version are you using?
And how do you trigger the PermGen error? Are you deploying and undeploying multiple times, or is a simple boot and run enough to trigger it?
Also, to check the basics: is it possible you simply didn't assign enough PermGen memory to your JVM? Did you check whether it's actually leaking, or is there just not enough?
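One way to tell "leaking" apart from "just not enough" is to sample the class-metadata memory pool from inside the JVM and watch whether usage keeps climbing. The following is only a minimal sketch using the standard java.lang.management API; the class name ClassMetadataPools and the name-matching rule are assumptions for illustration (HotSpot pool names vary by collector, e.g. "PS Perm Gen" or "CMS Perm Gen", and newer JVMs report "Metaspace" instead).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Infinispan or Tomcat.
final class ClassMetadataPools {

    /** Assumption: pools holding class metadata have "perm" or "metaspace" in their name. */
    static boolean holdsClassMetadata(String poolName) {
        String name = poolName.toLowerCase(Locale.ROOT);
        return name.contains("perm") || name.contains("metaspace");
    }

    /** One line per matching pool: name, used bytes, max bytes (-1 means undefined). */
    static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!holdsClassMetadata(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(pool.getName() + ": used=" + usage.getUsed() + " max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Print a snapshot; run it periodically to see whether usage keeps growing.
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

If the "used" figure levels off below the maximum, the limit was simply too low; if it grows with every request until it hits the maximum, something is continuously defining new classes.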
I'm using 5.0.1 Final right now. I had -XX:MaxPermSize=256M when I was running a single web app and that was ok. I did bump to 512M with no real effect. In this case, the behavior is on a fresh startup. The only activity is my load balancer hitting a health check page maybe 10 times before it freezes up with the out of memory error. I do not use the reload of webapps within Tomcat, always a complete restart. Just popped the options to 1024M max and 256M initial size with the same effect. Tomcat logged 44 requests before running out of memory. thanks, Paul
Just noticed this:
For the one with local storage only, I drop the reference to jgroups udp.xml:
I guess you also change the clustering mode="distribution" ?
Dropping the reference to a JGroups configuration makes it use the default JGroups configuration, so it's possible your two CacheManagers are actually connecting and forming a two-node cluster. Please check your logs for JGroups views mentioning more than one address. That still doesn't explain the PermGen errors, though.
I don't know how I could trash 512M of PermGen with just 44 requests, even if I wanted to do it on purpose; it sounds like there is something worse than a memory leak going on. Is there anything else unusual besides the configuration you've posted? Custom Externalizers, event listeners, anything else worth pointing out?
It would greatly help as well if you could test the latest 5.1.0.CR2 instead, so that those ThreadLocals are ruled out.
So here's the actual xml for the webapp with local storage only:
<?xml version="1.0" encoding="UTF-8"?>
<namedCache name="session" />
<namedCache name="application" />
I just cut out most everything in it.
I did not see any ERROR level entries in the logs. The log line that mentions a view is:
2011-12-22 10:17:07,236 - INFO - org.infinispan.remoting.transport.jgroups.JGroupsTransport.viewAccepted(JGroupsTransport.java:542) - ISPN000094: Received new cluster view: [cent-01-44672|1] [cent-01-44672, cent-01-764]
I will give 5.1.0.CR2 a try and let you know.
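The view line quoted above lists two addresses, which is the sign of a two-node cluster being formed. As a rough sketch, a log line in that format can be checked mechanically; the class name ClusterViewParser and the regular expression are assumptions based only on the single line shown here, not on any documented log format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning log files; not an Infinispan or JGroups API.
final class ClusterViewParser {

    // Expects "... Received new cluster view: [viewId] [member, member, ...]"
    private static final Pattern VIEW =
            Pattern.compile("Received new cluster view: \\[[^\\]]*\\] \\[([^\\]]*)\\]");

    /** Returns the member addresses in the view, or an empty list if the line does not match. */
    static List<String> members(String logLine) {
        List<String> result = new ArrayList<>();
        Matcher matcher = VIEW.matcher(logLine);
        if (matcher.find()) {
            for (String part : matcher.group(1).split(",")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    result.add(name);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "ISPN000094: Received new cluster view: [cent-01-44672|1] [cent-01-44672, cent-01-764]";
        List<String> members = members(line);
        System.out.println(members.size() + " member(s): " + members);
    }
}
```

More than one member in a view for a cache that was meant to be local-only suggests the two CacheManagers found each other through the default JGroups configuration.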
MAT has a nice duplicate classes view (there's a link to it in the heap dump overview page). You should have lots of duplicate classes loaded if you're using up 512 MB of perm gen.
There are some more tips on using MAT to explore the perm gen here: http://sites.google.com/site/eclipsebiz/The-Unknown-Generation-Perm
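As a lighter-weight complement to the MAT view, a class can report which class loader defined it and which jar or directory it came from. This is only a sketch using standard JDK calls; the class name ClassOrigin is made up for illustration, and getProtectionDomain() may be refused if a security manager is installed.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper; call it from inside each webapp.
final class ClassOrigin {

    /** Describes the defining class loader and code location of the given class. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();

        // A null loader means the class was loaded by the bootstrap class loader.
        String loaderText = (loader == null)
                ? "bootstrap"
                : loader.getClass().getName() + "@" + Integer.toHexString(System.identityHashCode(loader));

        return type.getName()
                + " loader=" + loaderText
                + " location=" + (location == null ? "unknown" : location.toString());
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(ClassOrigin.class));
    }
}
```

Logging this for a suspect class (for example a logging class) from each webapp shows whether the same class name is being defined by several different loaders from several different jars.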
I've been exploring MAT quite a bit, but have yet to make sense of the immense amount of information. The article you posted was good; I hadn't found it before. It would be much better if the images still existed. The duplicate classes view for my heap dumps isn't telling me much yet. When I load both webapps, I see 3 instances of WebappClassLoader and one StandardClassLoader. When I load only one of my webapps, I get 2 instances of WebappClassLoader and one of StandardClassLoader. In both cases org.apache.log4j.Category is at the top of the list, and the "Defined Classes" and "No. of Instances" values are very close: 4433 and 13547 with both webapps loaded, versus 4433 and 13499 with a single one. So the Defined Classes match and No. of Instances differs by only 48.
I tried 5.1.0.CR2 with the same results. Another idea I had was to load my Infinispan cache at the Tomcat level, so it's only loaded/defined once. I start it from a Listener, so I put that into the Tomcat/conf/web.xml. The needed jars went into Tomcat/lib. I did leave the Infinispan jars in my webapps as well (perhaps not a great idea). I still get the same behavior: it works if I load one webapp and fails if I load both.
Ok, so I've been working on this all day and now believe it's a class loader issue and not an Infinispan issue. In my instance, loading Infinispan twice just brought the problem out in the open.
So far, I think relevant items are:
- multiple copies of log4j jars on the classpath
- multiple copies of metro jars on the classpath
What has seemed to help is getting rid of those copies by placing a single copy in the Tomcat/lib directory and removing them from the webapp lib directories. I also continued to load my Infinispan cluster at the Tomcat level using the tomcat/conf/web.xml. I had not planned on it, but it makes more sense to load it at the container level than the application level.
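Hunting for duplicate copies of a library can be partly automated by comparing jar names across the container's lib directory and each webapp's WEB-INF/lib. The sketch below is an illustration only: the class name DuplicateJarFinder is hypothetical, and stripping the version suffix from the file name is a naming heuristic that will not catch jars that bundle the same classes under a different name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; pass it the lib directories to compare.
final class DuplicateJarFinder {

    /** Heuristic: drops ".jar" and a trailing "-<digit>..." version, e.g. "log4j-1.2.16.jar" becomes "log4j". */
    static String artifactName(String jarFileName) {
        String base = jarFileName.endsWith(".jar")
                ? jarFileName.substring(0, jarFileName.length() - 4)
                : jarFileName;
        return base.replaceFirst("-\\d[\\w.\\-]*$", "");
    }

    /** Maps artifact name to the directories containing it, keeping only artifacts found in two or more directories. */
    static Map<String, List<String>> duplicates(Map<String, List<String>> jarsByDirectory) {
        Map<String, List<String>> dirsByArtifact = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : jarsByDirectory.entrySet()) {
            for (String jar : entry.getValue()) {
                List<String> dirs = dirsByArtifact.computeIfAbsent(artifactName(jar), k -> new ArrayList<>());
                if (!dirs.contains(entry.getKey())) {
                    dirs.add(entry.getKey());
                }
            }
        }
        dirsByArtifact.values().removeIf(dirs -> dirs.size() < 2);
        return dirsByArtifact;
    }

    /** Lists the *.jar file names directly inside a directory. */
    static List<String> listJars(Path dir) {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : stream) {
                names.add(jar.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Each argument is a directory, e.g. the Tomcat lib folder and each webapp's WEB-INF/lib.
        Map<String, List<String>> jars = new LinkedHashMap<>();
        for (String dir : args) {
            jars.put(dir, listJars(Paths.get(dir)));
        }
        duplicates(jars).forEach((name, dirs) -> System.out.println(name + " -> " + dirs));
    }
}
```

Any artifact reported in both the container's lib folder and a webapp's lib folder is a candidate for being loaded twice by different class loaders.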
So hopefully this will help someone looking through the archives.
I agree it's unlikely to be an Infinispan problem. Still, if you find out what the issue was, please post; you got me curious.
If it proves hard to change your application's structure you could try out JBoss AS, which provides a remarkable classloader (http://www.jboss.org/jbossas) and since v.7 has "total isolation".
I've used Tomcat myself too in the past, but never got it to work correctly with duplicate classes (possibly my fault); if you have to, use http://www.jboss.org/tattletale to check your libraries: it provides scanning options for Tomcat or simple "flat" classloaders too.
Btw, a side note. I love Eclipse MAT. I think it's a remarkably easy to use memory analyzer, better than IntelliJ IMO for memory related stuff. It was originally created by SAP guys who did a wonderful job.
I've got my app now working under Tomcat 6. It was a PITA! So the crux of it seemed to be problems with class loaders and Tomcat. I could not load my two distinct webapps in the same container successfully.
Some of the problems I encountered were:
- Duplicate log4j classes loaded due to the log4j jar file being inside the webapp
- Multiple Infinispan caches being started with the same configuration
- Ripple effects of all of it inside PermGen space
What I ended up doing to solve it in my case is:
- Merge my two webapps into a single webapp. Not too much of an issue there as one was a web front end and the other was supporting web services. It will get me a better memory footprint in the end.
- Move my Infinispan listener out of the Tomcat web.xml (where I had moved it during troubleshooting) and back into the webapp's web.xml, so Tomcat only loads it once per container instead of once per class loader.
- Move all log4j and slf4j jars to Tomcat's lib folder
- Move the Sun Metro stack to Tomcat's lib folder (which I learned I should have been doing anyway, but I'm only using 1.4)
What I'm considering changing:
- Add a shared lib folder to Tomcat (via catalina.properties) to put common shared jars such as Metro and Log4j/slf4j (better housekeeping)
- Move to JBoss AS or Jetty as my container
I hope this helps someone in the future.
@Paul, maybe you should consider JBoss AS7 instead of Tomcat 6?