1 Reply Latest reply on Mar 2, 2010 8:21 AM by Mark Fishman

    The cluster did not reconnect after network failure...

    Mark Fishman Newbie

      Hi,

       

      I am using JBoss ESB 4.4 GA in a clustered environment and ran into a problem where, after a network outage, the cluster did not reconnect correctly.  The following errors started appearing over and over in the log:

       

      [org.jboss.jms.server.connectionmanager.SingleConnectionManager] A problem has been detected with the connection to remote client 76e1i1v-jl5zdp-g5je1pxi-1-g63rvyz5-4sp, jmsClientID=null.  It is possible the client has exited without closing its connection(s) or the network has failed.  All connection resources corresponding to that client process will now be removed.

      [org.jboss.jms.server.connectionmanager.SingleConnectionManager] Cpmmectopm<amaher[e43b44] cannot look up remoting session ID 76e1i1v-jl5zdp-g5je1pxi-1-g63rvyz5-4sp

       

      The system had been idle for 5 days, so it is possible that something timed out, but I am working on the assumption that a network outage caused the problem.

       

      I did some research on the JBoss Remoting forums and found a discussion of similar problems, where the suggestion was to change the following parameters in the remoting-bisocket-service.xml file under jboss-messaging.sar:

       

      1. pingFrequency - from 214748364 to 30000
      2. pingWindowFactor - from 10 to 71582
      3. numberOfCallRetries - from 1 to 5
      4. add a property of generalizeSocketException and set its value to true
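
A note on the numbers in items 1 and 2: both pairs multiply out to just under the largest 32-bit signed integer (214748364 x 10 = 2147483640 and 30000 x 71582 = 2147460000, against a maximum of 2147483647). That suggests, though the post does not confirm it, that the two values are combined into a single millisecond window held in an int. The sketch below only checks that arithmetic. The class and method names are hypothetical and are not part of JBoss Remoting or JBoss Messaging.

```java
public class PingWindowCheck {

    // Product of the two settings, computed as long so the check itself cannot overflow.
    // ASSUMPTION: the real code multiplies pingFrequency by pingWindowFactor; the post does not say so.
    static long windowMillis(long pingFrequencyMillis, long pingWindowFactor) {
        return pingFrequencyMillis * pingWindowFactor;
    }

    // True when the value can be stored in a 32-bit signed int without wrapping.
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long original = windowMillis(214748364L, 10L);   // values listed as the starting point
        long suggested = windowMillis(30000L, 71582L);   // values suggested on the forum

        System.out.println("original  window = " + original + " ms, fits in int: " + fitsInInt(original));
        System.out.println("suggested window = " + suggested + " ms, fits in int: " + fitsInInt(suggested));
    }
}
```

If that assumption holds, changing pingFrequency alone (for example to 5 minutes) while keeping a large pingWindowFactor would need the same check applied to the new pair of values.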

       

      However, when I tried this in JBoss ESB, the ESB would no longer come up.  Even when the only parameter I changed was pingFrequency, setting it to 5 minutes rather than max int, the system still would not start.

       

      Can anyone shed any light on my problem?  Am I going down the wrong path by trying to adjust the ping parameters?

       

      Thanks.

        • 1. Re: The cluster did not reconnect after network failure...
          Mark Fishman Newbie

          Oops... I noticed that I fat-fingered some of the information when I typed it in from the log printout.  I am not sure if it helps, but the error logs should really read:

           

          [org.jboss.jms.server.connectionmanager.SimpleConnectionManager] A problem has been detected with the connection to remote client 76e1i1v-jl5zdp-g5je1pxi-1-g63rvyz5-4sp, jmsClientID=null.  It is possible the client has exited without closing its connection(s) or the network has failed.  All connection resources corresponding to that client process will now be removed.

          [org.jboss.jms.server.connectionmanager.SimpleConnectionManager] ConnectionManager[e43b44] cannot look up remoting session ID 76e1i1v-jl5zdp-g5je1pxi-1-g63rvyz5-4sp