3 Replies Latest reply on Sep 15, 2006 3:18 PM by vpsangeetha

    JBAS-3639 -- Lookup fails during failover

    brian.stansberry

      Please confirm that your service makes more than one attempt to find the connection factory. The factory gets bound to JNDI as part of the failover of the HA-JMS service, and HA-JMS itself runs as an HA singleton. If your MBean does the lookup before the deployment of HA-JMS is complete, you'll get a NameNotFoundException (NNFE).
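
      A retry loop of the kind described above could look roughly like the following. This is a minimal sketch, not code from the issue: the class name `ConnectionFactoryLookup`, the helper `lookupWithRetry`, the attempt count, and the delay are all hypothetical, and the JNDI name has to be supplied by the caller because the thread does not state it. Only a `NameNotFoundException` triggers a retry; any other failure is passed straight through.

```java
import java.util.concurrent.Callable;
import javax.naming.InitialContext;
import javax.naming.NameNotFoundException;

public class ConnectionFactoryLookup {

    /**
     * Runs the given lookup, retrying while it fails with NameNotFoundException.
     * Hypothetical helper: maxAttempts and delayMillis are illustrative knobs.
     */
    public static <T> T lookupWithRetry(Callable<T> lookup, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NameNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookup.call();
            } catch (NameNotFoundException e) {
                // The name is not bound yet (e.g. HA-JMS is still deploying).
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // Every attempt failed; report the most recent failure.
        throw last;
    }

    /**
     * Looks up the connection factory under a caller-supplied JNDI name.
     * The 30 attempts and 2-second delay are assumptions, not recommended values.
     */
    public static Object lookupConnectionFactory(String jndiName) throws Exception {
        return lookupWithRetry(() -> new InitialContext().lookup(jndiName), 30, 2000L);
    }
}
```

      Taking the lookup as a `Callable` keeps the retry logic independent of JNDI, so it can be exercised without a running server. The total wait should stay bounded so that a factory that never appears is reported as an error instead of hanging the service.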

        • 1. Re: JBAS-3639 -- Lookup fails during failover
          vpsangeetha

          I implemented the retry loop to find the connection factory. The MBean does start up on the other node, but the failover now takes about 5 minutes, which seems too long.

          We are wondering whether this is because the startSingleton methods in our MBeans are blocking. I am going to change startSingleton so that it returns immediately and does the work in a separate thread. I will see if that works.

          Does that sound right to you? Do you have any other suggestions?

          • 2. Re: JBAS-3639 -- Lookup fails during failover
            brian.stansberry

            That could very well be the issue. When there's a topology change, a single thread loops through all the services that are monitoring the cluster and notifies each of them of the change. Eventually those calls reach your singletons. If each of those singletons then takes a long time to start, the whole process will be slow. If the startup of a singleton is going to take a long time and it can be done asynchronously, it's definitely better to do it that way.
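
            One way to keep startSingleton from holding up the notifying thread is to hand the slow work to an executor and return straight away. The sketch below is only an illustration under stated assumptions: `AsyncSingletonStarter` is a hypothetical standalone class that mirrors the startSingleton/stopSingleton method names but does not extend any JBoss class, and `startupWork` stands in for whatever the real service does on becoming master.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncSingletonStarter {

    private final Runnable startupWork;   // hypothetical: the slow startup logic
    private ExecutorService executor;
    private Future<?> pending;

    public AsyncSingletonStarter(Runnable startupWork) {
        this.startupWork = startupWork;
    }

    /** Schedules the startup work on a background thread and returns immediately. */
    public synchronized void startSingleton() {
        if (executor != null) {
            // Discard any worker left over from an earlier start.
            executor.shutdownNow();
        }
        executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "singleton-startup");
            t.setDaemon(true);   // do not keep the JVM alive for this worker
            return t;
        });
        pending = executor.submit(startupWork);
    }

    /** Interrupts startup work that is still running and releases the worker thread. */
    public synchronized void stopSingleton() {
        if (executor != null) {
            executor.shutdownNow();
            executor = null;
        }
    }

    /** Exposes the in-flight startup so callers can wait on it or check for failure. */
    public synchronized Future<?> pendingStartup() {
        return pending;
    }
}
```

            Because stopSingleton can arrive while startup is still in progress, the background work should respond to interruption. Failures no longer propagate to the caller of startSingleton, so they have to be logged or checked through the returned `Future`.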

            • 3. Re: JBAS-3639 -- Lookup fails during failover
              vpsangeetha

              Thank you for your suggestions. We changed the singletons so that startSingleton returns immediately and the actual work runs asynchronously. We also implemented the retry loop to find the connection factory.

              We just tested it and the initial results look very promising. The failover happens immediately and all our services come back up in a minute. We are going to be doing more testing over the next week to make sure everything is OK.