2 Replies Latest reply on Aug 25, 2009 1:46 AM by vink

    Messaging not working reliably on failover

    vink

      I'm using JBoss 4.2.3GA with Messaging 1.4.2GA-SP1.

      I have two JBoss nodes. My application is deployed on both servers and fires events to a clustered queue and a clustered topic.
      I have a client that listens for these events. I can see the servers load-balancing the calls invoked by clients; those calls produce JMS messages, which the clients receive on the clustered topic.

      I crash node1, and all invocation calls are diverted to node2.
      The messages are seamlessly received by node2 (no exceptions on node2).
      Then I bring node1 up again and wait until invocation calls are shared by both servers.

      Then I crash node2, and I see a lot of exceptions on node1 during the failover. It takes a few seconds until the failover completes, and then everything returns to normal.

      Problem: while the failover is taking place, events are lost.
      - 1st case: The load switches to the other server successfully, with no exceptions and no event loss; it is seamless to the client.
      - 2nd case: A node that has been down once does not rejoin the cluster seamlessly on the server side, because some failover sequence runs during the switchover. Events are lost in this process, although it appears seamless to the client.

      Why? This is critical and not the expected behavior.
      Is it a bug?

        • 1. Re: Messaging not working reliably on failover
          gaohoward

          Are you using durable subscribers or not?
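
          For context on why this matters: under JMS, a non-durable topic subscriber only receives messages published while it is actively connected, whereas a durable subscription keeps messages for the subscriber while it is away. Whether that explains the loss reported here is not established in this thread. The sketch below is a toy in-memory model of that difference, not the JMS or JBoss Messaging API; ToyTopic, SubscriptionDemo, and all method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of a topic. This is NOT the JMS or JBoss Messaging API;
// every name here is hypothetical and exists only to illustrate the idea.
class ToyTopic {
    // Per-subscriber state: what was delivered, and what is held back.
    private static final class Subscription {
        final boolean durable;
        boolean connected = true;
        final List<String> delivered = new ArrayList<>();
        final List<String> backlog = new ArrayList<>();

        Subscription(boolean durable) {
            this.durable = durable;
        }
    }

    private final Map<String, Subscription> subs = new HashMap<>();

    void subscribe(String name, boolean durable) {
        subs.put(name, new Subscription(durable));
    }

    // Models the window in which a client has lost its connection.
    void disconnect(String name) {
        subs.get(name).connected = false;
    }

    // On reconnect, a durable subscriber is handed everything it missed.
    void reconnect(String name) {
        Subscription s = subs.get(name);
        s.connected = true;
        s.delivered.addAll(s.backlog);
        s.backlog.clear();
    }

    void publish(String message) {
        for (Subscription s : subs.values()) {
            if (s.connected) {
                s.delivered.add(message);
            } else if (s.durable) {
                s.backlog.add(message); // kept until the subscriber returns
            }
            // Disconnected and non-durable: the message is dropped for this subscriber.
        }
    }

    List<String> received(String name) {
        return subs.get(name).delivered;
    }
}

class SubscriptionDemo {
    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic();
        topic.subscribe("durableClient", true);
        topic.subscribe("plainClient", false);

        topic.publish("e1");
        topic.disconnect("durableClient");
        topic.disconnect("plainClient");
        topic.publish("e2"); // published while both clients are away
        topic.reconnect("durableClient");
        topic.reconnect("plainClient");
        topic.publish("e3");

        System.out.println("durable: " + topic.received("durableClient"));
        System.out.println("plain:   " + topic.received("plainClient"));
    }
}
```

          In this model the durable client ends up with e1, e2, e3, while the plain client misses e2, the event published during the disconnect window.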

          • 2. Re: Messaging not working reliably on failover
            vink

            No, my scenario doesn't allow me to use them. I have several active clients at the same time.

            Also, I don't see what this has to do with durable subscribers. If my case 1 succeeds, why does my case 2 fail?

            I think the nodes that have gone down once are somehow not rejoining some group that would allow that bridge to remain open for messages.

            Why does the failover sequence run only in case 2?

            In case 2, my client also freezes for a few seconds and does not receive any messages. I see no problem with the invocation thread placing calls to my SLSB.