5 Replies Latest reply on Jul 27, 2005 5:11 PM by menkun

    puzzles about load balancing and failover of JBoss cluster

    menkun

      I just went through the book 'JBoss Clustering', but I still have several questions about the load balancing and session bean failover capabilities of a JBoss cluster.

      1) It seems that a JBoss cluster cannot rebalance running processes. For example, I have a two-node cluster (A and B, with my EJB deployed on both in a clustered configuration), and three identical client requests from a third computer C are distributed with the "Round-Robin" policy: two on node A and one on node B. If I kill node A (Ctrl+C), all processes in A are transferred to B seamlessly. However, after I restart A and it successfully rejoins the cluster, all the processes running on B remain there. It seems that a JBoss cluster cannot migrate running processes back. Is my conclusion right, or have I missed some configuration?

      2) Here is my understanding of session bean failover: if we kill a server (say, node A) with Ctrl+C or by shutting down its OS normally, the OS of the dying node sends a specific message to the other nodes, and this message marks A as "dead". The other nodes therefore learn of its death immediately and take over the sessions running on the dead node. My question is: does the "smart proxy" on the client side also capture and use this message? My guess is that the "smart proxy" does not need it. When node A is unreachable, the interceptor catches an RMI call exception, the proxy elects another target node and forwards its RMI call to it, and the new node takes over the session and returns the response together with the new view of the cluster. If my understanding is right, then every time the proxy catches an RMI call exception it forwards the call to another target node immediately, so no matter how A fails, failing over a session should take only a very short time. But we found a problem, described in my question 3).
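
The client-side behaviour guessed at in 2) can be sketched as a toy model. This is only an illustration of the idea (catch the failure, drop the target, retry on another node), not JBoss's actual proxy code. The class names `FailoverProxySketch` and `NodeUnreachableException` are hypothetical, nodes are plain strings, and the step where the server returns a new cluster view with the response is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the exception a failed RMI call would raise.
class NodeUnreachableException extends RuntimeException {
    NodeUnreachableException(String node) {
        super("unreachable: " + node);
    }
}

// Toy model of a client-side "smart proxy": it keeps its own view of the
// cluster and picks a target for every call without asking the servers.
class FailoverProxySketch {
    private final List<String> targets; // the proxy's current view of the cluster
    private int next = 0;               // round-robin cursor

    FailoverProxySketch(List<String> initialView) {
        this.targets = new ArrayList<>(initialView);
    }

    // Invoke `call` on the next target. On failure, drop that target from the
    // local view and retry on another one until a call succeeds or no target is left.
    <R> R invoke(Function<String, R> call) {
        while (!targets.isEmpty()) {
            int index = next % targets.size();
            String target = targets.get(index);
            try {
                R result = call.apply(target);
                next = index + 1;      // advance the cursor for the next call
                return result;
            } catch (NodeUnreachableException e) {
                targets.remove(index); // the following target shifts into `index`
                next = index;
            }
        }
        throw new IllegalStateException("no reachable node left");
    }

    // Copy of the proxy's current view, for inspection.
    List<String> view() {
        return new ArrayList<>(targets);
    }

    public static void main(String[] args) {
        FailoverProxySketch proxy = new FailoverProxySketch(Arrays.asList("A", "B"));
        // Simulate node A being down: any call routed to it fails.
        String reply = proxy.invoke(node -> {
            if (node.equals("A")) {
                throw new NodeUnreachableException(node);
            }
            return "served by " + node;
        });
        System.out.println(reply + ", view is now " + proxy.view());
    }
}
```

Note that in this model the failover is only as fast as the failure is reported: the proxy reacts when the call throws, not before.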

      3) Again suppose we have a two-node cluster (A and B). If we unplug the network cable of A, no signal can be sent to B to mark A as "dead". Node B cannot identify A as a "dead" member immediately, because it is hard to tell a network traffic jam from an unplug event right away. Basically, node B pings node A several times until it times out, to verify its suspicion that node A has died. With the default timeout configuration in "cluster-service.xml", it may take minutes to fail over a session. I have decreased those timeout parameters and the number of retries included in tag, and also set the "shun" attribute of "pbcast.GMS" to "True". From the log we know that node B now detects the death of A immediately, but it still takes another 20 to 30 seconds to fail over the session to B. So I feel puzzled: if the failover is handled exclusively by the proxy on the client side as I described above, there should not be such a delay (even if node B doesn't know that A is dead).
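
The server-side part of 3) is simple arithmetic: a node is suspected only after every ping attempt has waited out its full timeout. A minimal sketch, assuming a plain "timeout times number of tries" model; the class name and the numbers below are illustrative only and are not the actual defaults from "cluster-service.xml".

```java
// Rough worst-case time for a ping-based failure detector to suspect a silent
// node, assuming each of `maxTries` pings waits out the full `timeoutMs`.
class DetectionTimeSketch {
    static long worstCaseDetectionMillis(long timeoutMs, int maxTries) {
        return timeoutMs * maxTries;
    }

    public static void main(String[] args) {
        // Illustrative values only, not real defaults.
        System.out.println(worstCaseDetectionMillis(10_000, 6) + " ms"); // long timeout, many retries
        System.out.println(worstCaseDetectionMillis(1_000, 2) + " ms");  // reduced timeout and retries
    }
}
```

This covers only how quickly B can suspect A. It says nothing about how long the client waits before its own call to A fails.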

      4) My last question: suppose we have an extreme case where a client makes an RMI call to node A at t=0, and this call takes 100 min to finish its computation. At t=2min, node A crashes. What happens now? Can this call also fail over to node B without being restarted? Can the computation started on node A be continued on node B?

      I am not sure my questions are clear and correct, and I may still misunderstand the mechanism of JBoss clustering. Any help will be highly appreciated, thanks a lot!

        • 1. Re: puzzles about load balancing and failover of JBoss clust
          vignesh76

          1) After node B comes online again, subsequent requests will be load-balanced between A and B. I have tested this scenario and the load balancing works fine.


          • 2. Re: puzzles about load balancing and failover of JBoss clust
            menkun

            Hi, thanks for your response, I appreciate your advice. However, maybe I did not make my questions clear enough. In my 1st question the emphasis is on the running processes: I want to know whether the processes that have already failed over to A can be balanced back to B or not. I guess you are talking about balancing new requests, and I am sure JBoss can handle that.

            I guess JBoss cannot do process migration. The reason is that load balancing is handled by the 'smart proxy', which has been downloaded to the client side, so once a process is running it can be failed over, but it cannot be rebalanced while it is running.

            Any comments on my other questions? For my 4th question, I am almost sure the answer is no. But I don't know much about JBoss Clustering, so I hope somebody can give me a hint, or just recommend some material to read. Many thanks!

            • 3. Re: puzzles about load balancing and failover of JBoss clust
              menkun

              I am not sure whether I have made my questions clear enough. If you have any suggestions, please help me out, or just tell me that my questions are not clear and I will try to describe them again. Thanks!

              • 4. Re: puzzles about load balancing and failover of JBoss clust
                vignesh76

                Hi Menkun,

                Thanks for your clarification. Actually, your questions 1 and 4 contradict each other if I am not mistaken. In question 1 you mention that "all processes in A are transferred to B seamlessly", and as per your last clarification these are running processes, i.e., still incomplete. Your 4th question is all about failover of "running processes", so in fact your first question already answers it. Your 4th question led me to assume that the processes you mention in your first question could not have been "running processes".

                I am surprised by your observation in 1 that JBoss could fail over running processes. As far as I know JBoss cannot do that, because the failover is handled by the client. You may also read this material by Sacha, which clarifies it:

                http://www.ieeetcsc.org/content/tfcc-5-2-labourey.shtml

                From what I have read about WebLogic, it can probably fail over running processes. Please correct me if I have misunderstood.

                • 5. Re: puzzles about load balancing and failover of JBoss clust
                  menkun

                  Hi, Vigneshwaran
                  Thanks again for your kind help, and for the valuable information.

                  Actually the confusion is my fault. In my question 1) I should not have used the term 'running process'; what I meant is a 'live session'. In my question 4), what I meant is 'during an RMI call'. The difference is that in 1) the session may consist of many RMI calls, and the session can be failed over between two RMI calls. In 4) I mean that JBoss cannot fail over in the middle of an RMI call; I know that would be much more difficult to do.
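
The distinction between failing over between two calls and failing over during a call can be made concrete with a toy model. This is a sketch under stated assumptions, not JBoss's implementation: it assumes session state is copied to the surviving node only after a call completes, and that the client simply re-issues a call that was interrupted by a crash (which is only safe if the call is repeatable). The class name `SessionFailoverSketch` and its fields are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a live session whose state reaches the surviving node only
// after each completed call.
class SessionFailoverSketch {
    // State as of the last completed call; this is all the survivor knows.
    private final Map<String, Integer> replicatedState = new HashMap<>();
    // Total units of work actually executed, including work that was lost.
    int stepsExecuted = 0;

    // Runs one call made of `steps` units of work on the session counter.
    // A negative `crashAtStep` means the node does not crash during the call.
    int call(String sessionId, int steps, int crashAtStep) {
        int value = replicatedState.getOrDefault(sessionId, 0);
        for (int i = 0; i < steps; i++) {
            if (i == crashAtStep) {
                // The node dies mid-call: the partial `value` is lost. The call
                // is re-issued on the survivor, which restarts it from the
                // replicated state of the previous completed call.
                return call(sessionId, steps, -1);
            }
            value++;
            stepsExecuted++;
        }
        replicatedState.put(sessionId, value); // replicate once the call completes
        return value;
    }

    public static void main(String[] args) {
        SessionFailoverSketch session = new SessionFailoverSketch();
        session.call("s1", 3, -1);                 // completes normally, state is replicated
        int result = session.call("s1", 100, 2);   // node crashes after 2 of 100 steps
        System.out.println("result=" + result + ", steps executed=" + session.stepsExecuted);
    }
}
```

In this model the state from the first call survives the crash, so the session fails over, but the two units of work done before the crash are repeated: the interrupted call is restarted rather than continued.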