I suspect this code is giving me grief!!!
/**
 * The authenticated Principal associated with this session, if any.
 * IMPLEMENTATION NOTE: This object is not saved and
 * restored across session serializations!
 */
private transient Principal principal = null;
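For anyone unfamiliar with the effect of that keyword: a transient field is skipped by Java serialization, so any mechanism that replicates sessions by serializing them will see the field come back as null on the other node. A minimal standalone sketch (my own demo class, not the JBoss code):

```java
import java.io.*;

// DemoSession is a stand-in for a replicated session object; the transient
// field plays the role of the session's Principal.
class DemoSession implements Serializable {
    transient String principalName;   // NOT written during serialization
    String sessionId;                 // written normally

    DemoSession(String id, String principalName) {
        this.sessionId = id;
        this.principalName = principalName;
    }
}

public class TransientDemo {
    // Serialize and immediately deserialize, simulating replication to another node.
    static DemoSession roundTrip(DemoSession s) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject(s);
        out.flush();
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (DemoSession) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        DemoSession copy = roundTrip(new DemoSession("s1", "jduke"));
        System.out.println(copy.sessionId);      // survives the round trip
        System.out.println(copy.principalName);  // transient field is dropped
    }
}
```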
It is in Tomcat's ClusteredSession (JBoss code, I believe). It is no surprise to me, given this code, that my session is not replicating. Does anyone have any ideas? I will likely try patching this to see if it helps.
I took out the transient keyword, but it had no effect. Does anyone have any ideas?
I will have a look at this. Generally it is useless to replicate the Principal, because a Principal is only valid inside a single JVM (or you would have to replicate the whole JAAS class system along with your Principal). Maybe we have to reauthenticate after replication...
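If reauthentication after replication turns out to be the answer, one sketch of the idea (class and method names here are mine, purely hypothetical, not JBoss code) is to replicate only the principal's name, which is a plain serializable String, and rebuild or reauthenticate the Principal lazily on the receiving node:

```java
import java.io.*;
import java.security.Principal;

// Hypothetical holder: the Principal itself stays transient, but its name
// (a serializable String) travels with the session and lets the receiving
// node re-create the Principal after deserialization. In a real JBoss setup
// the re-creation step would go through the configured login module instead.
class ReplicableAuthInfo implements Serializable {
    private transient Principal principal;
    private String principalName;          // this is what actually replicates

    void setPrincipal(Principal p) {
        this.principal = p;
        this.principalName = (p == null) ? null : p.getName();
    }

    Principal getPrincipal() {
        if (principal == null && principalName != null) {
            // Lazily re-create (or reauthenticate) after deserialization.
            final String name = principalName;
            principal = () -> name;        // Principal's sole abstract method is getName()
        }
        return principal;
    }
}

public class ReauthSketch {
    public static void main(String[] args) throws Exception {
        ReplicableAuthInfo info = new ReplicableAuthInfo();
        info.setPrincipal(() -> "jduke");

        // Round-trip through serialization, simulating replication.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject(info);
        out.flush();
        ReplicableAuthInfo copy = (ReplicableAuthInfo) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        // The Principal is rebuilt from the replicated name on the "new" node.
        System.out.println(copy.getPrincipal().getName());
    }
}
```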
I have been looking into this non-stop since I made that post, so I have some more info. As it turns out, none of my session information is replicating. However, the calls are going all the way to the BMP entity bean with the appropriate attributes, so I now believe that either my deployment configuration for my entities is hosed, or my JavaGroups configuration is hosed. The docs are a little out of date with regard to configuring JavaGroups: some of the options in the clustering docs are not accepted, and I see that the default configuration is commented out of the "ClusterManager" class and replaced by a new one (which does not seem to be working for me).
I tried this configuration here (much like the sample one in the docs), but it did not work either:
The only difference between this one and the sample in the clustering documentation is that the "min_wait_time" parameter has been removed from the UNICAST directive, as it was choking the server on startup.
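For context, the relevant part of a JavaGroups protocol stack of this era typically looks something like the fragment below. This is purely illustrative (attribute names and values vary between JavaGroups releases, which is exactly why stale parameters like min_wait_time can choke startup); it is not my actual configuration:

```
...
pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400):
UNICAST(timeout=5000):
pbcast.STABLE(desired_avg_gossip=20000):
...
```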
Now, to ensure that multicast was working here on my LAN, I went to the javagroups site and got some instructions for testing. Here is what I did, and it seemed to work just fine:
On my workstation:
java org.javagroups.tests.McastReceiverTest -mcast_addr 220.127.116.11 -port 5555
On some random server box:
java org.javagroups.tests.McastSenderTest -mcast_addr 18.104.22.168 -port 5555
And I typed in some messages at "random server box" and saw them appear on my workstation. I used the exact same address above (see javagroups config) to eliminate one of the variables. What I don't know is whether their default configuration differs from the rest of the attributes in the javagroups config in my cluster-config.xml, or from the default in the JBoss code (it appears to be an area of some experimentation; there are two commented-out values for the javagroups default...).
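As a complementary sanity check that doesn't involve JavaGroups at all, the same multicast test can be sketched with the plain JDK MulticastSocket API. The group address and port below are placeholders; substitute the mcast_addr/mcast_port from your cluster configuration to test the exact same settings:

```java
import java.net.*;
import java.nio.charset.StandardCharsets;

// Single-process multicast check: join a group, send a datagram to it, and
// rely on multicast loopback to receive our own packet. If this fails, the
// problem is below JavaGroups (OS/network), not in the cluster config.
public class McastLoopbackTest {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("230.1.2.3"); // placeholder group
        int port = 5555;

        try (MulticastSocket socket = new MulticastSocket(port)) {
            socket.setSoTimeout(5000);              // don't hang forever on receive
            socket.joinGroup(group);                // subscribe, like McastReceiverTest

            byte[] msg = "hello cluster".getBytes(StandardCharsets.UTF_8);
            socket.send(new DatagramPacket(msg, msg.length, group, port)); // like McastSenderTest

            byte[] buf = new byte[1500];
            DatagramPacket in = new DatagramPacket(buf, buf.length);
            socket.receive(in);                     // loopback delivers our own packet
            System.out.println(new String(in.getData(), 0, in.getLength(),
                                          StandardCharsets.UTF_8));
        }
    }
}
```

Running sender and receiver in separate JVMs on separate hosts, as in the JavaGroups test above, exercises the actual LAN path; the single-process version only proves the local stack can multicast at all.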
I could include my configuration files, but I am more or less using the defaults taken from the "all" configuration. I have not tried Jetty as an alternative to Tomcat (our app has a couple of known issues with Jetty), but as I said, I see the attributes going into the CMP In-Memory entity (I added a bunch of nasty println's and set logging to WARN/FATAL to really narrow things down) on the host where the HTTP request is processed, but the attributes don't appear on the other node.
I will be MOST GRATEFUL for any advice, and if I find the solution first I will waste no time getting back to the list. If any further information would be useful, please let me know.
So you're not replicating anything?
Looking at the logs, it could be a bug in JavaGroups which causes a joining member that requests the state immediately after joining to receive incorrect group information. So, for example, jake:2394 does *not* recognize itself in the group.
This bug has been fixed in version 2.0.4 of JavaGroups, but the version used in JBoss is 2.0.3.
I'll change this in the CVS as soon as possible (probably next week). Because we changed some of the state transfer interface in JavaGroups, this will require some changes to JBoss.
If you want a quick fix: create a javagroups.jar from the JavaGroups CVS for version 2.0.4 (up to and including Oct 28) and replace the one in JBoss with it.
It appears I misunderstood how replication worked to some degree. I now see I am getting replication. I was mistakenly looking for the CMPClusteredInMemoryPersistenceManager.loadEntity(ctx) method to be called on every node on every update to the cluster. This is not happening (if it is supposed to, then something is very wrong with my setup). I only know this from many println's and quite a bit of study of the source code (much of which I still don't understand, but I'm getting there).

What I do see in a fail-over case is that when a new node is "failed to", this loadEntity(ctx) method is called successfully, with the marshalled attributes. Immediately after I see the output of my println's in the persistence manager, I am redirected to my login screen. So it looks like all of my state was there; I just was unable to access the application. Armed with this new info, I am going to try eliminating my security constraints to truly isolate the problem to the authentication area. FYI, this application does have a very complex security setup (40+ roles, used to secure individual fields), if that helps anyone at all. My JAAS components are subclasses of the basic login module framework provided with JBoss-3.0.3.