0 Replies Latest reply on Jul 15, 2011 11:37 AM by jcstaff

    JMSBridge slow to shutdown during failed state

    jcstaff

      While tracking down the cause of a problem shutting down our JBoss servers, I found a potential issue that, when avoided, allowed the JMS Bridge and JBoss server to shut down within a tolerable amount of time. I am not yet sure of the ramifications, but the adjustment below to the JMSBridgeImpl FailureHandler seems to fix the issue. (FYI: anyone interested in this thread is likely also interested in the solution in the following discussion thread and fix; unfortunately that patch from Aug 2010 is not yet in 2.2.5.GA.)

       

      * Environment: JBoss 5.1.0.GA, HornetQ 2.2.5.GA, 2 servers (A-B), 2 JMS Bridges (1 pull, 1 push) from B->A, timeout interval = 2secs

      * Scenario: All running, stop server A, then stop server B

      * Result: Server A stops <= 10secs (and B would too if A were running when B stopped). Server B takes ~2 minutes to stop.

       

      What I observed is that the JMS Bridge was going through the FailureHandler.run() -> setupJMSObjectsWithRetry() logic because A was down:

       

            while (true && !stopping)

            {

               boolean ok = setupJMSObjects();

       

      The server shutdown then caused the JMSBridgeImpl.stop() method to be called, which set stopping=true:

       

         public synchronized void stop() throws Exception

         {

            stopping = true;

       

       

      The stop() method then waits for the executor to shut down:

       

      boolean ok = executor.awaitTermination(60, TimeUnit.SECONDS);

       

      Okay so far. The FailureHandler looks like it breaks out of the while loop, but it then attempts to call the same stop() method:

       

               if (!ok)

               {

                  failed();

               }

      ...

            protected void failed()

            {

               // We haven't managed to recreate connections or maxRetries = 0

               JMSBridgeImpl.log.warn("Unable to set up connections, bridge will be stopped");

               try

               {

                  stop();

               }

       

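      The problem is that the stop() method is synchronized. The shutdown thread is already inside stop(), waiting for the executor (and so for this FailureHandler task) to finish, while the FailureHandler task is blocked trying to enter the same synchronized stop() method. Neither can proceed, so it seemed to me that this caused the first 60-second wait to time out before the shutdown could continue.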

       

      My quick fix was to skip the second call to the synchronized stop() method from the FailureHandler if the JMSBridge was already shutting down.

       

      failed()

      ...

      // only stop the bridge if a shutdown is not already in progress
      if (!stopping) {

         stop();

      }
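
      The stall and the effect of the guard can be reproduced outside HornetQ with a small stand-alone sketch. This is not the JMSBridgeImpl code: the class BridgeStopDemo, its guardStop flag, and the shortened 500 ms wait (in place of the 60 seconds) are all made up for illustration, and I am assuming the handler task runs on the same executor that stop() waits on. It only models the locking pattern: a synchronized stop() that awaits the executor, and a task on that executor that calls stop() once it sees stopping=true.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the bridge; only the locking pattern is modelled.
class BridgeStopDemo
{
   private final ExecutorService executor = Executors.newSingleThreadExecutor();
   private volatile boolean stopping = false;
   private final boolean guardStop;   // true = apply the "if (!stopping)" fix
   private final long awaitMillis;    // stands in for the 60 second wait

   BridgeStopDemo(boolean guardStop, long awaitMillis)
   {
      this.guardStop = guardStop;
      this.awaitMillis = awaitMillis;
   }

   // Plays the role of the failure handler: retry until stopping, then failed().
   void startFailureHandler()
   {
      executor.execute(() -> {
         while (!stopping)
         {
            try
            {
               Thread.sleep(10);   // pretend a reconnect attempt failed
            }
            catch (InterruptedException e)
            {
               return;
            }
         }
         failed();
      });
   }

   private void failed()
   {
      if (guardStop && stopping)
      {
         return;   // shutdown already in progress, do not re-enter stop()
      }
      stop();      // unguarded: blocks on the monitor held by the shutdown thread
   }

   // Returns how long this call took, in milliseconds.
   synchronized long stop()
   {
      long start = System.nanoTime();
      stopping = true;
      executor.shutdown();
      try
      {
         executor.awaitTermination(awaitMillis, TimeUnit.MILLISECONDS);
      }
      catch (InterruptedException e)
      {
         Thread.currentThread().interrupt();
      }
      return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
   }

   public static void main(String[] args)
   {
      BridgeStopDemo unguarded = new BridgeStopDemo(false, 500);
      unguarded.startFailureHandler();
      System.out.println("unguarded stop() took ms: " + unguarded.stop());

      BridgeStopDemo guarded = new BridgeStopDemo(true, 500);
      guarded.startFailureHandler();
      System.out.println("guarded stop() took ms: " + guarded.stop());
   }
}
```

      In the unguarded case the outer stop() should only return once the full wait has expired, because the handler task is parked on the monitor and cannot finish. With the guard the task returns right away and the executor terminates almost immediately. Note that in the unguarded case the handler thread then enters stop() itself and waits a second time on its own executor, so the total delay in this sketch is roughly twice the wait.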