First off, if you are using EAP you are quite likely a support customer. If so, you should open a ticket on the Customer Support Portal, where you'll get much better support than I can provide in the half hour or so a day I spend on this forum.
A basic debugging step is to take a heap histogram on n1 a) before restarting n2, b) after n1's memory usage starts climbing, and c) before restarting n2. That will give a picture of what types of objects are increasing in the heap.
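To make the comparison between histograms concrete, here is a rough sketch of a diff tool. It assumes each histogram was captured with `jmap -histo <pid>` and saved to a text file, and that the rows follow the usual "num: #instances #bytes class name" layout; older JDKs may print different columns, in which case the parsing would need adjusting. The class name HistoDiff and the file names are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JBoss or the JDK.
class HistoDiff {

    /** Parses histogram text into a map of class name -> bytes used. */
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> bytesByClass = new HashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            // Assumed row layout: "1:  12345  678900  java.util.HashMap$Entry"
            // Header, separator and "Total" lines do not match and are skipped.
            if (f.length < 4 || !f[0].endsWith(":")) {
                continue;
            }
            try {
                bytesByClass.put(f[3], Long.parseLong(f[2]));
            } catch (NumberFormatException ignored) {
                // Not a data row after all; skip it.
            }
        }
        return bytesByClass;
    }

    /** Returns the classes whose byte count grew, largest growth first. */
    static List<Map.Entry<String, Long>> growth(Map<String, Long> before,
                                                Map<String, Long> after) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (delta > 0) {
                out.add(new AbstractMap.SimpleEntry<>(e.getKey(), delta));
            }
        }
        out.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return out;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java HistoDiff <before.txt> <after.txt>");
            return;
        }
        Map<String, Long> before = parse(Files.readAllLines(Paths.get(args[0])));
        Map<String, Long> after = parse(Files.readAllLines(Paths.get(args[1])));
        for (Map.Entry<String, Long> e : growth(before, after)) {
            System.out.println(e.getValue() + " bytes more: " + e.getKey());
        }
    }
}
```

Something like `jmap -histo <pid> > before.txt` at step a) and `jmap -histo <pid> > after.txt` at step b), followed by `java HistoDiff before.txt after.txt`, should list the classes that grew the most between the two snapshots.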
Thank you for your reply.
Those are our customer's servers, so I don't have a direct line to JBoss customer support, but I might try to have a ticket opened once I have something more concrete.
I am planning on getting a heap dump when I get the chance to reproduce this, but I was wondering how this is tied to clustering. I did some simple tests, and it does not seem (at least from the org.jboss.cache logs) that any replication happens while n1 is being shut down. But still something is triggering increased memory usage on n2 at exactly this point. Is there any other log category that might give more insight into what is happening?
I don't have any specific log categories, no, as without more information on what's increasing in the heap there's not much to go on.
When n1 is shut down, n2 starts getting failover requests and starts doing twice as much work, so a memory increase is expected. Perhaps the increase would be a bit more than you'd expect, as a session that is being actively handled on a node takes ~2x the memory of one that is just being stored as a backup for another node. (That, BTW, is not the case in JBoss 5.) But whether that's the problem, or it's just that n2 is now doing n1's work, or it's some bug, there's not enough information to say.
I did a bit more testing, and it seems that once I shut down n1, there is some increase in HashMap$Entry and TreeMap$Entry instances on n2. This is a test setup, so n2 is not servicing any requests when I stop n1. So I am guessing it might have to do with the session being promoted from backup to active, as you suggested. Could you point me to the relevant code in JBoss so I can verify? (I want to check where those map entries are instantiated.)
http://anonsvn.jboss.org/repos/jbossas/tags/JBPAPP_4_2_0_GA_CP06/tomcat/src/main/org/jboss/web/tomcat/service/session/ is the package where the distributed session manager code resides.
http://anonsvn.jboss.org/repos/jbossas/tags/JBPAPP_4_2_0_GA_CP06/tomcat/src/main/org/jboss/web/tomcat/service/session/JBossCacheManager.java is the key class.
I highly recommend you get a case opened on the Customer Support Portal.
Note that the only thing that promotes a session from backup to active is a user request.