1 Reply Latest reply on Nov 15, 2013 12:09 PM by Justin Bertram

    After second failover the backup server does not go to sleep

    Tom Ross Newbie

      This is HornetQ 2.3.11.Final (2.3.11, 123). I have two JBoss instances, each with a live server and a collocated backup server.

       

      After I stop one server, failover happens and the backup server becomes live. After the stopped server is restarted, the backup server goes back to sleep.

       

      I stop the server again and the backup server becomes live again. But after I restart the server a second time, its backup server stays live and I see these messages in the log file:

       

      16:07:22,417 WARN  [org.hornetq.core.client] (hornetq-discovery-group-thread-dg-group1) HQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=f2a8bc08-4de9-11e3-aced-bbe86e93d236

      16:07:27,137 INFO  [org.hornetq.core.server] (Thread-8 (HornetQ-server-HornetQServerImpl::serverUUID=d010af91-4de9-11e3-b792-2f16bdbc1bd1-1746755501)) HQ221027: Bridge ClusterConnectionBridge@72eda5a3 [name=sf.my-cluster.f2a8bc08-4de9-11e3-aced-bbe86e93d236, queue=QueueImpl[name=sf.my-cluster.f2a8bc08-4de9-11e3-aced-bbe86e93d236, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=d010af91-4de9-11e3-b792-2f16bdbc1bd1]]@519d9c72 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@72eda5a3 [name=sf.my-cluster.f2a8bc08-4de9-11e3-aced-bbe86e93d236, queue=QueueImpl[name=sf.my-cluster.f2a8bc08-4de9-11e3-aced-bbe86e93d236, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=d010af91-4de9-11e3-b792-2f16bdbc1bd1]]@519d9c72 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5545&host=192-168-1-100], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1376091014[nodeUUID=d010af91-4de9-11e3-b792-2f16bdbc1bd1, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=192-168-1-100, address=jms, server=HornetQServerImpl::serverUUID=d010af91-4de9-11e3-b792-2f16bdbc1bd1])) [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5545&host=192-168-1-100], discoveryGroupConfiguration=null]] is connected

      16:07:38,120 WARN  [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from /192.168.1.100:51709. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
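      Since the HQ212034 text says the warning is harmless when it appears once per node but signals a real duplicate when it is logged continuously, a rough way to tell the two cases apart is to count occurrences per node id in the server log. The snippet below is only a sketch: the function names and the threshold of 2 are my own assumptions, not anything HornetQ provides.

```python
import re
from collections import Counter

# Matches the duplicate-node-id warning and captures the node id it reports.
DUPLICATE_NODE_ID = re.compile(r"HQ212034:.*?nodeID=([0-9a-fA-F-]+)")


def count_duplicate_node_warnings(lines):
    """Return a Counter mapping node id -> number of HQ212034 warnings seen."""
    counts = Counter()
    for line in lines:
        match = DUPLICATE_NODE_ID.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def nodes_logged_repeatedly(lines, threshold=2):
    """Return the node ids whose warning appears at least `threshold` times.

    The threshold is an assumed cut-off for "logged continuously";
    adjust it to the length of the log being inspected.
    """
    counts = count_duplicate_node_warnings(lines)
    return sorted(node for node, n in counts.items() if n >= threshold)
```

      It can be fed the lines of a log file opened for reading (the log path depends on the installation). A node id that shows up in the result is worth checking for a live and backup pair that are active at the same time.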

       

       

      The CLI console indicates that the live server is still running.

       

      Two standalone-full-ha.xml files are attached (one for each server).