Re: the memory usage pattern with buddy replication, I too expect the memory usage to be consistent across the 3 nodes if you are applying load evenly to them. When I've tested this, that's what I've seen. So, not sure why you're seeing something different.
Re: the effect of a logout, if you call session.invalidate() the session is removed cluster-wide. (If you just wait for the session to timeout, each node in the cluster independently does that.) I just looked at the impl of the TreeCache.getNumberOfAttributes() method and don't see any reason why it would overcount. So I'm concerned you've identified some bug here. What AS release are you using?
Brian - thanks for your input.
Re memory: I researched further and think this is the reason. We have 3 app servers. The start sequence is a1, a2, a3. When a2 starts, a1 (already started) and a2 form a buddy pair (they appear to back each other up). Then when a3 starts, it looks like a2 also becomes the buddy that backs up a3. So a2 backs up two nodes and a3 ends up backing up no nodes. I think that is why the memory patterns vary between the nodes (with a2 exhibiting the fastest depletion of memory).
Re session.invalidate(): I am still seeing unbounded growth in the number of replicated sessions, and eventually we run out of memory. This is JBoss AS 4.0.5. Any further input on this one would be most valuable since it's now a show stopper for us.
BTW - I assume there is no reason to (and one should not) configure an eviction policy in the tc5-cluster.sar/jboss-service.xml.
This understanding is based on your comment
the effect of a logout, if you call session.invalidate() the session is removed cluster-wide. (If you just wait for the session to timeout, each node in the cluster independently does that.)
Is this understanding correct?
Node a2 becoming the buddy of both a1 and a3 sounds broken; it's not meant to be that way.
There have been a lot of improvements in JBoss Cache related to buddy replication since AS 4.0.5 came out. Can you try replacing the server/all/lib/jboss-cache.jar with the jboss-cache-jdk50.jar that comes with the JBC 1.4.1.SP9 release? For that to work you'll also need to replace jgroups.jar, either with the one that comes with the JBC 1.4.1.SP9 download or, even better, with the one from the JGroups 2.4.1.SP4 release.
That may help your session.invalidate() issue as well. I've never heard of a problem like that before; will investigate more.
Re: setting up eviction, normally I say don't do that, but it may be a valid workaround for now. If you do it, set it up for the _default_ region (i.e. the whole cache) so it also covers the internal buddy replication backup areas. Use LRUPolicy, and configure maxNodes=0 (disabling eviction based on # of nodes). The timeToLiveSeconds value must be greater than the session expiration timeout. I'd say make it a couple minutes greater -- the idea is to make sure that the normal JBossWeb session cleanup process gets a chance to flush things out first, with the JBC eviction only cleaning out stuff that gets left behind.
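For illustration, a rough sketch of what that eviction setup might look like inside the cache mbean in tc5-cluster.sar/jboss-service.xml. The attribute names follow the JBoss Cache 1.x eviction config format as I remember it, so check them against the docs for the JBC release you end up running. The numbers are assumptions, not recommendations: 1920 assumes a 30 minute (1800 second) session timeout plus two minutes of slack, and the wakeUpIntervalSeconds value is an arbitrary pick.

```xml
<!-- Sketch only: verify element and attribute names against your JBC release. -->
<attribute name="EvictionPolicyClass">org.jboss.cache.eviction.LRUPolicy</attribute>
<attribute name="EvictionPolicyConfig">
   <config>
      <!-- How often the eviction thread runs; 5 is an assumed value. -->
      <attribute name="wakeUpIntervalSeconds">5</attribute>
      <!-- The default region covers the whole cache, including the
           internal buddy replication backup areas. -->
      <region name="/_default_">
         <!-- 0 disables eviction based on number of nodes. -->
         <attribute name="maxNodes">0</attribute>
         <!-- Must exceed the session timeout. Assumed here:
              1800 s session timeout + 120 s slack = 1920 s. -->
         <attribute name="timeToLiveSeconds">1920</attribute>
      </region>
   </config>
</attribute>
```

If your web.xml session-timeout is something other than 30 minutes, adjust timeToLiveSeconds to match, keeping it a couple of minutes above the timeout.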
Brian - ok, thanks for the input
I will try running the new jgroups and jboss-cache still using AS 4.0.5. So you are saying these will still be compatible, right? Is there a matrix somewhere that defines which versions of which components can go together, e.g. which versions of jgroups or jboss-cache we can run with AS 4.0.5?
Another option may be just to wait until AS 5.0 is ready. I am assuming that there have been many clustering improvements in that release.
After quite a bit of work I also finally figured out the session leak. It was actually being caused by our load test setup. So right now that issue is resolved.
We are now seeing a 3x CPU hit using session replication, but I will post that as a new topic in this forum.
Our support guys keep that fairly up-to-date.
AS 5 is going to be a while (at least a couple months). You can look at AS 4.2.2 as well, although again I recommend moving the JBoss Cache to 1.4.1.SP9. That's a small change from the stock 4.2.2; just a later bug fix release.