5 Replies Latest reply on Sep 16, 2004 5:15 PM by joe_klimm

2 node cluster

joe_klimm Sep 14, 2004 12:04 PM

Configuration

2 Linux machines clustered with JBoss 3.2.5
1 Cisco 11000 load balancer

Issue

We are having a problem with a 2 node cluster that replicates session information. Everything works well until we shut down one of the cluster nodes to do work on it (upgrades, etc.). Apparently, when one node is shut down, the other node becomes all but unresponsive for a minute or two (it seems longer to users). I assume this is because the "up" node is trying to verify that the "down" node is really gone before it continues to process requests. My questions are:

1. What is the best way to minimize this unresponsive window? Is there an attribute in cluster-service.xml that would reduce this "downtime"?
2. If we shorten this window, is there a risk that the sessions could get out of sync and cause worse problems than the unresponsiveness?

Thanks in advance.

1. Re: 2 node cluster

belaban Sep 15, 2004 4:41 AM (in response to joe_klimm)

This should *not* happen! You should be able to access node2 immediately. Can you take a stack trace on node2 while it hangs for the 60 secs?
Also post your cluster-service.xml.
Bela
2. Re: 2 node cluster

eric2 Sep 15, 2004 1:13 PM (in response to joe_klimm)

I get the same hang. However, it only occurs when I shut down the master node. I just assumed that the delay is caused by the other node becoming the new master node, which has to create all of the JMS topics and queues.
3. Re: 2 node cluster

anil.saldhana Sep 16, 2004 1:11 AM (in response to joe_klimm)

Try with the latest code in CVS.
4. Re: 2 node cluster

terajiv Sep 16, 2004 10:46 AM (in response to joe_klimm)

Hi joe_klimm
I'm trying to get clustering working for 2 Unix machines (HTTP servers) and 2 app servers.
Can you give me the details of how to set up clustering with JBoss?
Thanks and Regards
Rajiv
5. Re: 2 node cluster

joe_klimm Sep 16, 2004 5:15 PM (in response to joe_klimm)

Thanks everyone for replying.

We ended up trying some other configurations and found that moving to an ASYNC replication-type (configured in jboss-web.xml) eliminated the "downtime" we were seeing.
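
For anyone looking for the setting, here is a minimal sketch of what it might look like in WEB-INF/jboss-web.xml, assuming the replication-config element used by the JBoss 3.2.x clustered HTTP session support. The replication-trigger value shown is only a placeholder assumption; the thread does not say which trigger was in use, so keep whatever your application already has.

```xml
<jboss-web>
   <replication-config>
      <!-- Assumption: placeholder trigger value; the original post does not state it -->
      <replication-trigger>SET_AND_NON_PRIMITIVE_GET</replication-trigger>
      <!-- The change described above: ASYNC instead of SYNC -->
      <replication-type>ASYNC</replication-type>
   </replication-config>
</jboss-web>
```

The web application has to be redeployed on both nodes for the change to take effect.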

I'm just speculating, and Bela can correct me if I'm wrong, but with a SYNC replication-type and many replication-triggers being fired, the up node must do some final negotiations (verifications) to be sure that the modified session attributes do not need to be delivered to the now-down node. This final SYNC negotiation is what causes the up node to become unresponsive. With ASYNC, however, the up node can queue the modifications while it determines whether its partner node is really down, and it continues to service other incoming requests in the meantime.

Obviously, one trades off the guarantee of session replication prior to response when one moves to an ASYNC type, but we have yet to see any issues.

Again, these are just my observations.