8 Replies Latest reply on Nov 27, 2012 4:13 AM by Andy Taylor

    When the master server fails back, the backup server stops and never returns to backup status

    yong deng Newbie

      Description:

       

      HornetQ: 2.3.0.BETA1

      HA mode: data replication

      Topology: Two HornetQ servers (1 master and 1 backup), configured to allow failback
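
      For orientation, here is a minimal sketch of what the backup's HA settings might look like in hornetq-configuration.xml. This is an assumption for illustration, not the attached file: the backup and shared-store values mirror what the backup log reports (backup=true, sharedStore=false), and allow-failback is the HornetQ 2.3 option that lets a backup that has gone live stop when the original master returns. Element names and defaults may differ in BETA1, so treat the attached configuration as authoritative.

      ```xml
      <!-- Hypothetical backup-side fragment; values assumed from the log output, not copied from the attachment -->
      <configuration xmlns="urn:hornetq">
         <!-- this server starts as a backup -->
         <backup>true</backup>
         <!-- data replication rather than a shared store -->
         <shared-store>false</shared-store>
         <!-- stop when the original master comes back and requests failback -->
         <allow-failback>true</allow-failback>
      </configuration>
      ```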

       

      Steps:

      1. Start the master.

      2. Start the backup; it replicates from the master.

      3. Gracefully stop the master after replication finishes.

      4. Start the master again; it fails back.

      5. The backup stops and does not return to backup status unless we restart the Java process.

       

      This behavior is not what we want. In step 5, the backup should return to backup status without restarting the Java process.

      The issue occurs in data replication mode. Shared store mode works fine.

       

      Configuration files are attached.

       

      Backup server log:

       

      A subdirectory or file ..\logs already exists.

      ***********************************************************************************

      "java -Dbind.address=10.111.3.244 -Djnp.port=1099 -Djnp.rmiPort=1098 -Djnp.host=10.111.3.244 -Dhornetq.remoting.netty.host=10.111.3.244 -Dhornetq.remoting.netty.port=5445 -Dorg.hornetq.logger-delegate-factory-class-name=org.hornetq.integration.logging.Log4jLogDelegate -XX:+UseParallelGC  -XX:+AggressiveOpts -XX:+UseFastAccessorMethods -Xms512M -Xmx1024M -Dhornetq.config.dir="D:\work\hornetq-2.3.0.BETA1\bin\..\config\stand-alone\clustered" -Djava.util.logging.manager=org.jboss.logmanager.LogManager -Djava.util.logging.config.file="D:\work\hornetq-2.3.0.BETA1\bin\..\config\stand-alone\clustered"\logging.properties -Djava.library.path=. -classpath "D:\work\hornetq-2.3.0.BETA1\bin\..\config\stand-alone\clustered";..\schemas\;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-bootstrap.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-commons.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-core-client.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-core.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-jboss-as-integration.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-jms-client.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-jms.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-journal.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-rest.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-service-sar.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-spring-integration.jar;D:\work\hornetq-2.3.0.BETA1\lib\hornetq-twitter-integration.jar;D:\work\hornetq-2.3.0.BETA1\lib\jboss-jms-api.jar;D:\work\hornetq-2.3.0.BETA1\lib\jboss-mc.jar;D:\work\hornetq-2.3.0.BETA1\lib\jgroups.jar;D:\work\hornetq-2.3.0.BETA1\lib\jnp-client.jar;D:\work\hornetq-2.3.0.BETA1\lib\jnpserver.jar;D:\work\hornetq-2.3.0.BETA1\lib\log4j.jar;D:\work\hornetq-2.3.0.BETA1\lib\netty.jar org.hornetq.integration.bootstrap.HornetQBootstrapServer hornetq-beans.xml"

      ***********************************************************************************

      16:39:11,443 INFO  [org.hornetq.integration.bootstrap] HQ101001: Starting HornetQ Server

      16:39:13,115 INFO  [org.hornetq.core.server] HQ111045: Configuration option clustered is deprecated. Consult the manual for details.

      16:39:13,208 WARN  [org.hornetq.core.server] HQ112054: AIO was not located on this platform, it will fall back to using pure Java NIO. If your platform is Linux, install LibAIO to enable the AIO journal

      16:39:13,318 INFO  [org.hornetq.core.server] HQ111001: backup server is starting with configuration HornetQ Configuration (clustered=true,backup=true,sharedStore=false,journalDirectory=../data/journal,bindingsDirectory=../data/bindings,largeMessagesDirectory=../data/large-messages,pagingDirectory=../data/paging)

      16:39:13,349 WARN  [org.hornetq.core.server] HQ112216: Moving data directory ../data/bindings to ..\data\bindings12

      16:39:13,349 WARN  [org.hornetq.core.server] HQ112216: Moving data directory ../data/paging to ..\data\paging12

      16:39:13,349 WARN  [org.hornetq.core.server] HQ112216: Moving data directory ../data/large-messages to ..\data\large-messages12

      16:39:13,412 INFO  [org.hornetq.core.server] HQ111017: Using NIO Journal

      16:39:13,427 WARN  [org.hornetq.core.server] HQ112008: Security risk! HornetQ is running with the default cluster admin user and default password. Please see the HornetQ user guide, cluster chapter, for instructions on how to change this.

      2012-11-02 16:39:13,771 WARN  [Configurator] - FD property shun was deprecated and is ignored

      2012-11-02 16:39:14,115 WARN  [Configurator] - FD property shun was deprecated and is ignored

       

      -------------------------------------------------------------------

      GMS: address=NJ-WKST20-1062, cluster=fs_group1, physical address=10.111.3.244:4159

      -------------------------------------------------------------------

      16:39:18,052 INFO  [org.hornetq.core.server] HQ111037: Waiting to become backup node

      16:39:18,052 INFO  [org.hornetq.core.server] HQ111038: ** got backup lock

      16:39:18,052 INFO  [org.hornetq.core.server] HQ001111: HornetQ Backup Server version 2.3.0.BETA1 (HornetQ sting, 122) [f57a81e8-1f74-11e2-b267-3fcd72a9716e] started, waiting live to fail before it gets active

      16:39:23,802 INFO  [org.hornetq.core.server] HQ111028: Backup server HornetQServerImpl::serverUUID=d2e77e52-1f74-11e2-98fe-c7c51388aa70 is synchronized with live-server.

      2012-11-02 16:39:23,818 WARN  [Configurator] - FD property shun was deprecated and is ignored

       

      -------------------------------------------------------------------

      GMS: address=NJ-WKST20-42693, cluster=fs_group1, physical address=10.111.3.244:4164

      -------------------------------------------------------------------

      16:39:27,943 INFO  [org.hornetq.core.server] HQ111036: backup announced

      16:39:31,865 INFO  [org.hornetq.core.server] HQ111044: HornetQServerImpl::serverUUID=d2e77e52-1f74-11e2-98fe-c7c51388aa70 to become live

      16:39:31,880 WARN  [org.hornetq.core.server] HQ112107: Connection failure has been detected: HQ119035: The connection was disconnected because of server shutdown [code=DISCONNECTED]

      16:39:31,896 WARN  [org.hornetq.core.server] HQ112107: Connection failure has been detected: HQ119035: The connection was disconnected because of server shutdown [code=DISCONNECTED]

      16:39:32,349 WARN  [org.hornetq.core.server] HQ112107: Connection failure has been detected: HQ119035: The connection was disconnected because of server shutdown [code=DISCONNECTED]

      16:39:33,818 INFO  [org.hornetq.core.server] HQ111005: trying to deploy queue jms.queue.DLQ

      16:39:33,833 INFO  [org.hornetq.core.server] HQ111005: trying to deploy queue jms.queue.ExpiryQueue

      16:39:33,833 INFO  [org.hornetq.core.server] HQ111005: trying to deploy queue jms.topic.topic1

      16:39:33,880 INFO  [org.hornetq.core.server] HQ111005: trying to deploy queue jms.topic.topic2

      16:39:33,974 INFO  [org.hornetq.core.server] HQ111024: Started Netty Acceptor version 3.4.5.Final-2da5b0e 10.111.3.244:5,455 for CORE protocol

      16:39:33,974 INFO  [org.hornetq.core.server] HQ111024: Started Netty Acceptor version 3.4.5.Final-2da5b0e 10.111.3.244:5,445 for CORE protocol

       

      -------------------------------------------------------------------

      GMS: address=NJ-WKST20-12914, cluster=fs_group1, physical address=10.111.3.244:4177

      -------------------------------------------------------------------

      16:39:34,177 WARN  [org.hornetq.core.server] HQ112050: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=d2e77e52-1f74-11e2-98fe-c7c51388aa70

      2012-11-02 16:39:34,615 WARN  [Configurator] - FD property shun was deprecated and is ignored

       

      -------------------------------------------------------------------

      GMS: address=NJ-WKST20-23110, cluster=fs_group1, physical address=10.111.3.244:4182

      -------------------------------------------------------------------

      16:39:51,287 INFO  [org.hornetq.core.server] HQ111030: Replication: sending JournalFileImpl: (hornetq-data-41.hq id = 97, recordID = 97) (size=10,485,760) to backup. NIOSequentialFile ..\data\journal\hornetq-data-41.hq

      16:39:54,162 INFO  [org.hornetq.core.server] HQ111030: Replication: sending JournalFileImpl: (hornetq-bindings-1.bindings id = 1, recordID = 1) (size=1,048,576) to backup. NIOSequentialFile ..\data\bindings\hornetq-bindings-1.bindings

      16:39:54,380 INFO  [org.hornetq.core.server] HQ111030: Replication: sending JournalFileImpl: (hornetq-bindings-8.bindings id = 8, recordID = 8) (size=1,048,576) to backup. NIOSequentialFile ..\data\bindings\hornetq-bindings-8.bindings

      16:40:00,052 WARN  [org.hornetq.core.server] HQ112165: On ManagementService stop, there are 7 unexpected registered MBeans: [jms.server, jms.queue.ExpiryQueue, jms.topic.topic2, jms.topic.topic1, jms.queue.DLQ, jms.connectionfactory.NettyThroughputConnectionFactory, jms.connectionfactory.NettyConnectionFactory]

      16:40:00,537 INFO  [org.hornetq.core.server] HQ111004: HornetQ Server version 2.3.0.BETA1 (HornetQ sting, 122) [d2e77e52-1f74-11e2-98fe-c7c51388aa70] stopped