
    Clustered Queue sending problem on cluster node hard kill

    kapfi

Hi,

      I think this question was asked before, and it also relates to a JIRA issue: https://issues.jboss.org/browse/HORNETQ-571

      But I still have trouble sending messages when killing a node.

Scenario:

      I'm working with HornetQ 2.2.14.Final and JBoss 5.1.0.GA.

      I have a clustered HornetQ configuration with one queue (2 cluster nodes). An MDB receives all messages from this queue.

      On startup of each node I register a HAMembershipListener to detect failed nodes in my cluster.
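
      For reference, the registration looks roughly like this. It's only a minimal sketch assuming the JBoss 5 HAPartition API and the default partition name "DefaultPartition"; the class name and the callback body are placeholders, not my actual code:

          import java.util.Vector;
          import javax.naming.InitialContext;
          import org.jboss.ha.framework.interfaces.HAMembershipListener;
          import org.jboss.ha.framework.interfaces.HAPartition;

          public class NodeFailureDetector implements HAMembershipListener {

              public void start() throws Exception {
                  // The HAPartition is bound in JNDI under /HAPartition/<partition-name>;
                  // "DefaultPartition" is the JBoss default (assumption: it was not renamed).
                  HAPartition partition = (HAPartition)
                          new InitialContext().lookup("/HAPartition/DefaultPartition");
                  partition.registerMembershipListener(this);
              }

              public void membershipChanged(Vector deadMembers, Vector newMembers, Vector allMembers) {
                  // Called on every membership change; deadMembers holds the nodes
                  // that dropped out (a kill -9 shows up here once the group
                  // protocol times the node out).
                  if (!deadMembers.isEmpty()) {
                      // Placeholder: this is where I send the update to the clustered queue.
                      System.out.println("Dead members: " + deadMembers);
                  }
              }
          }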

If I shut down one of the nodes properly, the update from the MembershipListener is sent and received correctly by the MDB. But if I kill (kill -9) the second node, the MDB receives no message, even though the first node detects the kill and triggers the update via HornetQ. After the second node is started again, the message is received, which tells me the message is stuck somewhere and not sent over the clustered queue.

      The JIRA issue says this behavior was fixed in release 2.2.0.GA, but it still occurs in my case.

      What would be the best way to make sure I can detect missing nodes immediately and distribute this information correctly?

My HornetQ config files are attached so you can have a look at them.

      If you need more, please let me know.

      Thanks in advance,

      Michael

P.S.: In my JBoss log I get this warning after killing the node:

      WARN  [                                          BridgeImpl ] ClusterConnectionBridge@59c9a28a [name=sf.my-cluster.9a15fad6-f2aa-11e2-ac13-bd6b398ceb1c, queue=QueueImpl[name=sf.my-cluster.9a15fad6-f2aa-11e2-ac13-bd6b398ceb1c, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=d3bf4b2f-f2a9-11e2-a575-81fda13f21d0]]@6bee6eca targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@59c9a28a [name=sf.my-cluster.9a15fad6-f2aa-11e2-ac13-bd6b398ceb1c, queue=QueueImpl[name=sf.my-cluster.9a15fad6-f2aa-11e2-ac13-bd6b398ceb1c, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=d3bf4b2f-f2a9-11e2-a575-81fda13f21d0]]@6bee6eca targetConnector=ServerLocatorImpl [initialConnectors=[org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5446&host=10-30-1-172], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1800826491[nodeUUID=d3bf4b2f-f2a9-11e2-a575-81fda13f21d0, connector=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-30-1-172, address=jms, server=HornetQServerImpl::serverUUID=d3bf4b2f-f2a9-11e2-a575-81fda13f21d0])) [initialConnectors=[org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5446&host=10-30-1-172], discoveryGroupConfiguration=null]]::Connection failed with failedOver=false-HornetQException[errorCode=2 message=Channel disconnected]

      HornetQException[errorCode=2 message=Channel disconnected]

          at org.hornetq.core.client.impl.ClientSessionFactoryImpl.connectionDestroyed(ClientSessionFactoryImpl.java:381)
          at org.hornetq.core.remoting.impl.netty.NettyConnector$Listener$1.run(NettyConnector.java:715)
          at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
          at java.lang.Thread.run(Thread.java:679)
        • 1. Re: Clustered Queue sending problem on cluster node hard kill
          kapfi

OK, after some more research I understand the behavior now.

          The round-robin implementation of clustered HornetQ puts the messages into a store-and-forward queue that waits for the killed node to come back (e.g. when sending 10 messages, 5 are delivered and 5 stay queued for the killed node). This is a design decision and, as far as I found out, it cannot be changed. On a proper shutdown everything is delivered correctly.
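
          To illustrate: the cluster connection creates an internal store-and-forward queue per remote node (the sf.my-cluster.* queue visible in the log above) and round-robins messages across them. This is a generic HornetQ 2.2 sketch with placeholder names, not my actual config, and element names may differ slightly between versions:

              <!-- hornetq-configuration.xml (illustrative sketch, placeholder names) -->
              <cluster-connections>
                 <cluster-connection name="my-cluster">
                    <address>jms</address>
                    <connector-ref>netty</connector-ref>
                    <!-- The internal bridge retries the dead node at this interval;
                         with the default reconnect-attempts of -1 (if your version
                         exposes it) it retries forever, so the sf.* queue holds its
                         share of the messages until the killed node comes back. -->
                    <retry-interval>500</retry-interval>
                    <reconnect-attempts>-1</reconnect-attempts>
                    <use-duplicate-detection>true</use-duplicate-detection>
                    <forward-when-no-consumers>false</forward-when-no-consumers>
                    <max-hops>1</max-hops>
                    <static-connectors>
                       <connector-ref>other-node-connector</connector-ref>
                    </static-connectors>
                 </cluster-connection>
              </cluster-connections>

          So the "stuck" messages are simply sitting in that sf.* queue while the bridge keeps retrying.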

At least no messages are lost as long as the killed node comes back up.