26 Replies Latest reply on May 31, 2007 4:11 AM by timfox
      • 15. Re: 1.2.0.GA transparent node failover does not always work
        bander

         

        "timfox" wrote:
        Since it seems you're not interested in the load balancing/automatic failover abilities of JBM, you could just use the non clustered connection factory at /NonClusteredConnectionFactory to create connections.


        We certainly are interested in the load balancing and automatic failover abilities of JBM but we have to verify that all our JBM 1.0.1 issues have been resolved.

        • 16. Re: 1.2.0.GA transparent node failover does not always work
          timfox

          Ben-

          I return to the UK next week (am in US right now), and I'll spend some time trying to get to the bottom of what is happening in your case.

          • 17. Re: 1.2.0.GA transparent node failover does not always work
            sergeypk

            I investigated this with help from Tim. The example worked fine for me, transparent failover was really transparent, and non-transparent failover also worked.

            To make it handle the situation when the entire cluster goes down, I modified the example to re-lookup the connection factory from JNDI when reconnecting.
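The reconnect logic described above can be sketched generically. This is a minimal illustration (not JBM code): the JNDI lookup is abstracted behind a Supplier so the key point stands out - every attempt fetches a fresh factory instead of reusing a cached reference. The names (Reconnector, maxAttempts) are illustrative assumptions.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Sketch of "re-lookup the connection factory when reconnecting".
// Each attempt calls freshFactoryLookup.get() - in real code that would
// be a brand-new JNDI lookup - rather than holding on to the old factory.
final class Reconnector<F> {
    private final Supplier<F> freshFactoryLookup; // e.g. a new JNDI lookup
    private final int maxAttempts;

    Reconnector(Supplier<F> freshFactoryLookup, int maxAttempts) {
        this.freshFactoryLookup = freshFactoryLookup;
        this.maxAttempts = maxAttempts;
    }

    <C> C reconnect(Function<F, C> createConnection) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                F factory = freshFactoryLookup.get(); // never the old reference
                return createConnection.apply(factory);
            } catch (RuntimeException e) {
                last = e; // node (or the whole cluster) still down; retry
            }
        }
        throw last;
    }
}
```

In real code the Supplier would do something like `(ConnectionFactory) new InitialContext().lookup(...)` against the factory's JNDI name, and the Function would call `factory.createConnection()`.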

            • 18. Re: 1.2.0.GA transparent node failover does not always work
              timfox

              Yes, thanks Sergey :)

              One thing we noticed with Ben's code is that it tries to use the same old connection factory after failure has occurred.

              If you're doing the "old style" manual reconnect on failure, you need to throw away the connection factory after failure, or it may not know about the new cluster topology.

              Also, for this kind of thing, the user code should be using HAJNDI to ensure the new JNDI lookup works on a different node after failure of the original node.
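As a sketch of the HAJNDI point above: the InitialContext is given a list of candidate nodes rather than a single host, so the lookup still works after the original node dies. The host names and port 1100 (the JBoss AS HA-JNDI default) are illustrative assumptions.

```java
import java.util.Properties;
import javax.naming.Context;

// Builds a JNDI environment pointed at HA-JNDI instead of a single node.
public final class HaJndiEnv {
    static Properties haJndiEnv(String providerList) {
        Properties env = new Properties();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "org.jnp.interfaces.NamingContextFactory");
        env.put(Context.URL_PKG_PREFIXES,
                "org.jboss.naming:org.jnp.interfaces");
        // Comma-separated list of cluster members; HA-JNDI tries each in turn.
        env.put(Context.PROVIDER_URL, providerList);
        return env;
    }
}
```

Then `new InitialContext(haJndiEnv("nodeA:1100,nodeB:1100")).lookup(...)` can succeed as long as at least one listed node is up.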

              • 19. Re: 1.2.0.GA transparent node failover does not always work
                bander

                 

                "timfox" wrote:
                Yes, thanks Sergey :)

                One thing we noticed with Ben's code is that it tries to use the same old connection factory after failure has occurred.

                If you're doing the "old style" manual reconnect on failure, you need to throw away the connection factory after failure, or it may not know about the new cluster topology.



                Is it standard practice not to reuse a connection factory reference, or is this a JBM-specific requirement? (I'm asking because we're trying to keep our code vendor neutral.)

                • 20. Re: 1.2.0.GA transparent node failover does not always work
                  timfox

                   

                  "bander" wrote:
                  "timfox" wrote:
                  Yes, thanks Sergey :)

                  Is it standard practice not to reuse a connection factory reference, or is this a JBM-specific requirement? (I'm asking because we're trying to keep our code vendor neutral.)


                  It certainly was standard practice with JBoss MQ, and if you think about it, you can't guarantee that provider XYZ doesn't encode information about their topology into the connection factory, so it makes sense to do it for all providers.


                  • 21. Re: 1.2.0.GA transparent node failover does not always work
                    sergeypk

                    Tim, please correct me if I'm wrong on this:

                    I don't think there's anything in the JMS spec about the behavior of clustering and failover. In any case, you only need to re-look up the connection factory if the entire cluster goes down and is restarted later, not if just one node fails.

                    Basically, the factory remembers the last node that was alive and will try to create connections targeting that node. If the node doesn't come back up, the attempts will keep failing, even though there could be other nodes already alive that could be used in its place.

                    • 22. Re: 1.2.0.GA transparent node failover does not always work
                      timfox

                      But, in the general case you need to look it up every time, since you can't make assumptions about how a specific provider implements their clustering.

                      • 23. Re: 1.2.0.GA transparent node failover does not always work
                        bander

                         

                        "timfox" wrote:
                        But, in the general case you need to look it up every time, since you can't make assumptions about how a specific provider implements their clustering.


                        I'm showing my ignorance of JNDI here - what exactly does looking up the connection factory do (other than getting an object reference)? Is a new connection factory object created each time?

                        • 24. Re: 1.2.0.GA transparent node failover does not always work
                          timfox

                          It gets whatever is bound in the JNDI tree at that time.

                          • 25. Re: 1.2.0.GA transparent node failover does not always work
                            bander

                             

                            "timfox" wrote:
                            It gets whatever is bound in the JNDI tree at that time.


                            So JBM is actively changing whatever is bound as nodes go up and down etc?

                            • 26. Re: 1.2.0.GA transparent node failover does not always work
                              timfox

                               

                              "bander" wrote:

                              So JBM is actively changing whatever is bound as nodes go up and down etc?


                              Yes. But it's actually more than that. The connection factory maintains a list of nodes to fail over onto and to load balance connections across. When the cluster topology changes (a node joins or leaves), two things happen:

                              1) The connection factory is rebound in JNDI with the updated list

                              2) An update message is sent to all the clients which already have their connection factories to make them update their internal lists.

                              I suspect other providers might also use a similar approach.

                              So if you reuse a CF from before the crash, it's likely it won't know about the changed topology - at least you can't guarantee it without making big assumptions about the implementation details of the particular messaging system you're using. The safest thing to do is to throw it away and start again.
