5 Replies Latest reply on Sep 16, 2004 5:15 PM by joe_klimm

    2 node cluster

    joe_klimm Newbie


      2 linux machines clustered w/ jboss 3.2.5
      1 cisco 11000 load balancer


      We are having a problem with a two-node cluster that replicates session information. Everything works well until we shut down one of the cluster nodes to work on it (upgrades, etc.). Apparently, when one node is shut down, the other node becomes all but unresponsive for a minute or two (it seems longer to users). I assume this is because the "up" node is trying to verify that the "down" node is really gone before it continues to process requests. My questions are:

      1. What is the best way to minimize this unresponsive window? Is there an attribute in cluster-service.xml that would reduce this "downtime"?
      2. If we shorten this window, is there a risk that the sessions become out of sync and cause worse problems than the unresponsiveness?

      Thanks in advance.

        • 1. Re: 2 node cluster
          Bela Ban Master

          This should *not* happen! You should be able to access node2 immediately. Can you take a stack trace on node2 while it hangs for the 60 secs?
          Also post your cluster-service.xml.

          • 2. Re: 2 node cluster
            Eric Neilsen Newbie

            I see the same hang. However, it only occurs when I shut down the master node. I just assumed that the delay is caused by the other node becoming the new master node, which has to create all of the JMS topics and queues.

            • 3. Re: 2 node cluster
              Anil Saldanha Master

              Try with the latest code in CVS.

              • 4. Re: 2 node cluster
                Rajiv Terwadkar Newbie

                Hi joe_klimm,
                I'm trying to get clustering working with two Unix machines (HTTP servers) and two app servers.
                Can you give me the details of how to set up clustering with JBoss?
                Thanks and regards

                • 5. Re: 2 node cluster
                  joe_klimm Newbie

                  Thanks everyone for replying.

                  We ended up trying some other configurations and found that moving to an ASYNC replication-type (configured in jboss-web.xml) eliminated the "downtime" we were seeing.
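
                  For anyone looking for where this setting lives, here is a minimal sketch of the relevant part of WEB-INF/jboss-web.xml. The element nesting and the replication-trigger value shown are assumptions based on the 3.2-era jboss-web DTD, not a copy of our actual file, so check them against the DTD shipped with your JBoss version before relying on them.

                  ```xml
                  <jboss-web>
                     <replication-config>
                        <!-- Assumed trigger value for illustration; keep whatever your application already uses -->
                        <replication-trigger>SET_AND_NON_PRIMITIVE_GET</replication-trigger>
                        <!-- ASYNC: the response is not held until the session change has been replicated.
                             SYNC: replication completes before the response is returned. -->
                        <replication-type>ASYNC</replication-type>
                     </replication-config>
                  </jboss-web>
                  ```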

                  I'm just speculating, and Bela can correct me if I'm wrong, but with a SYNC replication-type and many replication-triggers being fired, the up node must do some final negotiations (verifications) to be sure that the modified session attributes no longer need to be delivered to the now-down node. This final SYNC negotiation is what causes the up node to become unresponsive. With ASYNC, however, the up node can queue the modifications while it determines whether its partner node is really down, and it continues to service other incoming requests in the meantime.

                  Obviously, one trades off the guarantee of session replication prior to response when one moves to an ASYNC type, but we have yet to see any issues.

                  Again, these are just my observations.