6 Replies Latest reply on Dec 2, 2002 3:24 AM by jakefear

Problems with login in cluster with 3.0.3 and tomcat

jakefear Nov 25, 2002 10:01 PM

Hi, I am having a problem logging in to my cluster configuration. I am doing a basic cluster setup with the HttpSession replication type set to "instant". I am fronting the cluster with Apache 1.X and mod_jk, with a load-balancing setup across two nodes (smaller than production, but good enough for in-house testing). The problem is that when I am load-balanced to the server I did not originally log in to, it forces me to log in again. Once I have logged in a second time, session state replication (and other clustering features) seems to work fine.

Does anyone have any idea what I need to do here? I have subclassed the provided JAAS code for password salting, and I suspect I may need to do something special there to get login state replicated but I'm not sure what. Any help would be greatly appreciated.

1. Transient Principal in Clustered Session!

jakefear Nov 26, 2002 12:17 PM (in response to jakefear)

I suspect this code is giving me grief!!!

/**
* The authenticated Principal associated with this session, if any.
* IMPLEMENTATION NOTE: This object is not saved and
* restored across session serializations!
*/
private transient Principal principal = null;

It is in Tomcat's ClusteredSession (JBoss code, I believe). With this code, it is no surprise to me that my session is not replicating. Does anyone have any ideas? I will likely try patching this and seeing if it helps.
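
For readers unfamiliar with the keyword, here is a minimal sketch of what "transient" does under default Java serialization. DemoSession is a hypothetical stand-in, not the JBoss ClusteredSession class, and the field values are made up for illustration. It only shows the general language behavior and makes no claim about how the session replication code actually marshals its state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldDemo {

    // Hypothetical stand-in for a replicated session; NOT the JBoss class.
    static class DemoSession implements Serializable {
        private static final long serialVersionUID = 1L;
        String attribute;            // written by default serialization
        transient String principal;  // skipped by default serialization

        DemoSession(String attribute, String principal) {
            this.attribute = attribute;
            this.principal = principal;
        }
    }

    // Serializes the session to a byte array and reads it back,
    // roughly what happens when an object is copied to another JVM.
    static DemoSession roundTrip(DemoSession session)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(session);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (DemoSession) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        DemoSession copy = roundTrip(new DemoSession("cart=3", "jake"));
        System.out.println("attribute = " + copy.attribute); // attribute = cart=3
        System.out.println("principal = " + copy.principal); // principal = null
    }
}
```

The copy keeps its ordinary attribute but comes back with a null principal, which matches the symptom of arriving on the second node without a login. Dropping the modifier only helps if the field's type is itself Serializable and the replication path uses default serialization for that field, which this sketch does not establish for ClusteredSession.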
2. please help!

jakefear Nov 26, 2002 1:18 PM (in response to jakefear)

I took out the transient but it had no effect. Does anyone have any ideas?
3. Re: Problems with login in cluster with 3.0.3 and tomcat

derry Nov 30, 2002 5:31 AM (in response to jakefear)

Hello!

I will have a look at this. Generally it is useless to replicate the Principal because a Principal is only valid inside a single JVM (or you have to replicate the whole JAAS-class-system with your Principal). Maybe we have to reauthenticate after replication...

CU
Thomas
4. Re: Problems with login in cluster with 3.0.3 and tomcat

jakefear Nov 30, 2002 12:08 PM (in response to jakefear)

I have been looking into this non-stop since I made that post, so I have some more info. As it turns out, none of my session information is replicating. However, the calls are going all the way to the BMP entity bean with the appropriate attributes, so I now believe that either the deployment configuration for my entities is hosed or my javagroups configuration is hosed. The docs are a little out of date with regard to configuring javagroups. Some of the options in the clustering docs are not accepted, and I see that the default configuration in the "ClusterManager" class has been commented out and replaced by a new one (which does not seem to be working for me).

I tried this configuration here (much like the sample one in the docs), but it did not work either:

UDP(mcast_addr=224.10.10.10;mcast_port=45566):PING:FD(timeout=5000):VERIFY_SUSPECT(timeout=1500):MERGE:NAKACK:UNICAST(timeout=5000):FRAG:FLUSH:GMS:STATE_TRANSFER:QUEUE

The only difference between this one and the sample in the clustering documentation is that the "min_wait_time" parameter has been removed from the UNICAST directive. It was choking the server on startup.
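
When comparing two stack strings like this by eye, it can help to break them into one protocol per line. The following is a rough sketch of such a helper. StackStringDemo and parse are hypothetical names, this is not how JavaGroups itself parses its configuration, and it assumes protocols are separated by ":" outside parentheses with the parameters in a single "(...)" group.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StackStringDemo {

    // Splits a stack string into protocol name -> raw parameter text,
    // keeping the original order. Colons inside parentheses are ignored.
    // If a protocol name is repeated, the last occurrence wins.
    static Map<String, String> parse(String stack) {
        Map<String, String> protocols = new LinkedHashMap<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i <= stack.length(); i++) {
            boolean atEnd = i == stack.length();
            char c = atEnd ? ':' : stack.charAt(i);
            if (c == '(') depth++;
            if (c == ')') depth--;
            if (atEnd || (c == ':' && depth == 0)) {
                String entry = stack.substring(start, i).trim();
                start = i + 1;
                if (entry.isEmpty()) continue;
                int open = entry.indexOf('(');
                int close = entry.lastIndexOf(')');
                String name = open < 0 ? entry : entry.substring(0, open);
                String params = open < 0 ? ""
                        : entry.substring(open + 1, close > open ? close : entry.length());
                protocols.put(name, params);
            }
        }
        return protocols;
    }

    public static void main(String[] args) {
        String stack = "UDP(mcast_addr=224.10.10.10;mcast_port=45566):PING:"
                + "FD(timeout=5000):VERIFY_SUSPECT(timeout=1500):MERGE:NAKACK:"
                + "UNICAST(timeout=5000):FRAG:FLUSH:GMS:STATE_TRANSFER:QUEUE";
        // Print one protocol per line so two configurations can be diffed.
        for (Map.Entry<String, String> e : parse(stack).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Running the same helper over the sample string from the documentation and diffing the two outputs should make a dropped parameter such as "min_wait_time" easy to spot.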

Now, to ensure that multicast was working here on my LAN, I went to the javagroups site and got some instructions for testing. Here is what I did, and it seemed to work just fine:

On my workstation:
java org.javagroups.tests.McastReceiverTest -mcast_addr 224.10.10.10 -port 5555

On some random server box:
java org.javagroups.tests.McastSenderTest -mcast_addr 224.10.10.10 -port 5555

And I typed in some messages at "random server box" and saw them appear on my workstation. I used the exact same address as above (see javagroups config) to eliminate one of the variables. What I don't know is how their default configuration differs from the rest of the attributes in the javagroups config in my cluster-config.xml, or from the default in the jboss code (it appears to be an area of some experimentation; there are two commented-out values for the javagroups default...).
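
If the javagroups test classes are not at hand, a similar receive-side check can be sketched with the plain java.net API. This is an illustrative stand-in, not the McastReceiverTest source. McastCheck and isMulticast are hypothetical names, and the group address and port are simply the ones from the commands above.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;

public class McastCheck {

    // True if the given literal address is in the IP multicast range.
    static boolean isMulticast(String address) throws Exception {
        return InetAddress.getByName(address).isMulticastAddress();
    }

    public static void main(String[] args) throws Exception {
        String group = "224.10.10.10"; // same group as the test commands
        int port = 5555;               // same port as the test commands
        if (!isMulticast(group)) {
            throw new IllegalArgumentException(group + " is not a multicast address");
        }
        try (MulticastSocket socket = new MulticastSocket(port)) {
            // A null interface defers to the socket's default interface.
            socket.joinGroup(
                    new InetSocketAddress(InetAddress.getByName(group), port), null);
            byte[] buffer = new byte[1024];
            // Print every datagram that arrives for the group; stop with Ctrl-C.
            while (true) {
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);
                System.out.println(new String(packet.getData(), 0,
                        packet.getLength(), StandardCharsets.UTF_8));
            }
        }
    }
}
```

Note that a check like this only shows that multicast packets cross the LAN on that group and port. It says nothing about the rest of the protocol stack, and the cluster configuration above uses mcast_port=45566 rather than 5555.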

I could include my configuration files, but I am more or less using defaults taken from the "all" configuration. I have not tried Jetty as an alternative to tomcat (our app has a couple of known issues with Jetty), but as I said, I see the attributes going into the CMP In-Memory entity (added a bunch of nasty println's, and set logging to WARN/FATAL to really narrow things down) on the host where the HTTP request is processed, but the attributes don't appear on the other node.

I will be MOST GRATEFUL for any advice, and if I find the solution first I will waste no time getting back to the list. If any further information would be useful, please let me know.

Cheers,
Jake
5. Re: Problems with login in cluster with 3.0.3 and tomcat

belaban Dec 1, 2002 6:19 PM (in response to jakefear)

So you're not replicating anything ?

Looking at the logs, it could be a bug in JavaGroups which causes a member that requests the state immediately after joining to receive incorrect group information. So, for example, jake:2394 does *not* recognize itself in the group.

This bug has been fixed in version 2.0.4 of JavaGroups, but the version used in JBoss is 2.0.3.

I'll change this in the CVS as soon as possible (probably next week). Because we changed some of the state transfer interface in JavaGroups, this will require some changes to JBoss.

If you want a quick fix: create a javagroups.jar from the JavaGroups CVS for version 2.0.4 (up to and including Oct 28) and replace the one in JBoss with it.

Cheers,
Bela
6. Re: Problems with login in cluster with 3.0.3 and tomcat

jakefear Dec 2, 2002 3:24 AM (in response to jakefear)

It appears I misunderstood how replication works to some degree. I now see I am getting replication. I was mistakenly looking for the CMPClusteredInMemoryPersistenceManager.loadEntity(ctx) method to be called on every node for every update to the cluster. This is not happening (if it is supposed to, then something is very wrong with my setup). I only know this from many println's and quite a bit of study of the source code (much of which I still don't understand, but I am getting there). What I do see in a fail-over case is that when a new node is "failed to", this loadEntity(ctx) method is called successfully, with the marshalled attributes. Immediately after I see the output of my println's in the persistence manager, I am redirected to my login screen. So it looks like all of my state was there; I just was unable to access the application. Armed with this new info, I am going to try eliminating my security constraints to truly isolate the problem to the authentication area. FYI, this application does have a very complex security setup (40+ roles, used to secure individual fields), if that helps anyone at all. My JAAS components are subclasses of the basic login module framework provided with JBoss-3.0.3.

Cheers,
Jake