JBAS-3639 -- Lookup fails during failover

brian.stansberry Sep 11, 2006 1:17 PM

Please confirm that your service makes more than one attempt to find the connection factory. The factory gets bound to JNDI as part of the failover of the HA-JMS service. HA-JMS is also running as an HA singleton. If your MBean does the lookup before the deployment of HA-JMS is complete, you'll get a NameNotFoundException (NNFE).
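A minimal sketch of such a retry loop, for illustration only. The class name JndiRetry, the method lookupWithRetry, and the attempt count and delay are hypothetical and do not come from the thread. The lookup is passed in as a Callable so the sketch does not depend on a particular JNDI provider, and it uses lambda-friendly types from a current JDK.

```java
import java.util.concurrent.Callable;
import javax.naming.NameNotFoundException;

// Hypothetical helper: retries a JNDI lookup while the name is not yet bound.
final class JndiRetry {
    private JndiRetry() {}

    // Calls `lookup` up to maxAttempts times, sleeping delayMillis between
    // attempts. Only NameNotFoundException is retried; any other failure
    // propagates immediately.
    static <T> T lookupWithRetry(Callable<T> lookup, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NameNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookup.call();
            } catch (NameNotFoundException e) {
                // The factory may not be bound yet because HA-JMS is still deploying.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // Every attempt failed with NameNotFoundException; rethrow the last one.
        throw last;
    }
}
```

A caller might use it as `JndiRetry.lookupWithRetry(() -> new javax.naming.InitialContext().lookup("ConnectionFactory"), 30, 2000L)`, where the JNDI name and the numbers are placeholders to adjust for the actual deployment.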

1. Re: JBAS-3639 -- Lookup fails during failover

vpsangeetha Sep 13, 2006 2:03 PM (in response to brian.stansberry)

I implemented the retry loop to find the connection factory. The MBean does start up on the other node, but now the failover takes about 5 minutes, which seems a bit too long.

We are wondering if it is because the startSingleton methods in our MBeans are blocking. I am going to change startSingleton so that it returns immediately and does the work in a separate thread. I will see if that works.

Does that sound right to you? Do you have any other suggestions?
2. Re: JBAS-3639 -- Lookup fails during failover

brian.stansberry Sep 13, 2006 2:25 PM (in response to brian.stansberry)

That could very well be the issue. When there's a topology change, basically one thread loops through all the services that are monitoring the cluster notifying them of the change. Eventually those calls reach your singletons. If each of those singletons then takes a long time starting, the whole process will be slow. If the startup of a singleton is going to take a long time and it can be done asynchronously, it's definitely better to do it that way.
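A rough sketch of a non-blocking startSingleton, under stated assumptions. The class AsyncSingletonStarter and its pendingStartup method are hypothetical. A real service would put this logic in the startSingleton and stopSingleton callbacks of its own HA singleton MBean, which is not reproduced here. The slow work is handed to a single worker thread so the thread delivering topology-change notifications is not held up.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for an HA singleton MBean; only the callback names
// startSingleton/stopSingleton mirror the ones discussed in the thread.
class AsyncSingletonStarter {
    private final Runnable slowStartup;

    // One daemon worker thread, so a stuck startup cannot block JVM shutdown.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "singleton-startup");
        t.setDaemon(true);
        return t;
    });

    private volatile Future<?> pending;

    AsyncSingletonStarter(Runnable slowStartup) {
        this.slowStartup = slowStartup;
    }

    // Returns immediately; the slow work runs on the worker thread.
    public void startSingleton() {
        pending = worker.submit(slowStartup);
    }

    // Interrupts startup work that is still running when this node stops being master.
    public void stopSingleton() {
        Future<?> f = pending;
        if (f != null) {
            f.cancel(true);
        }
    }

    // Exposed so callers (or tests) can wait for the startup work to finish.
    Future<?> pendingStartup() {
        return pending;
    }
}
```

The trade-off is that startup failures no longer surface in the notifying thread, so the Runnable needs its own error handling and logging.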
3. Re: JBAS-3639 -- Lookup fails during failover

vpsangeetha Sep 15, 2006 3:18 PM (in response to brian.stansberry)

Thank you for your suggestions. We implemented the singletons so that they return immediately and run their actual tasks asynchronously. We also implemented the retry loop to find the connection factory.

We just tested it and the initial results look very promising. The failover happens immediately and all our services come back up in a minute. We are going to be doing more testing over the next week to make sure everything is OK.