-
1. Re: Same nodeUUID in hornetQ static-connection clustered nodes (wildfly 8.2.0)
jbertram Aug 11, 2016 9:48 PM (in response to meabhi007)
Yes, it is expected that a live and a backup will share the same node UUID.
-
2. Re: Same nodeUUID in hornetQ static-connection clustered nodes (wildfly 8.2.0)
meabhi007 Aug 12, 2016 3:07 PM (in response to jbertram)
Hi Justin,
Thanks for the confirmation.
I have two HornetQ servers: node-1 (live) and node-2 (backup).
When node-1 faced a network issue, node-2 became active.
Node-1's network then got restored.
Since split brain had occurred, a graceful shutdown was manually triggered on node-1.
Based on the HornetQ code, the following flow is executed when node-1 goes down:
When node-1 goes down, it notifies node-2 that it is going down.
On node-2, while handling this message, the node-1 reference is deleted from the topology.
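For reference, here is my simplified reading of that flow, pieced together from the stack traces below. This is an illustrative sketch only, not the actual HornetQ 2.4.5 source; the real classes are ClientSessionFactoryImpl, ServerLocatorImpl and Topology.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only -- not HornetQ source. It mirrors the call chain visible in the
// traces: Channel0Handler.handlePacket -> ServerLocatorImpl.notifyNodeDown -> Topology.removeMember.
class DisconnectFlowSketch {
    // Toy stand-in for org.hornetq.core.client.impl.Topology, keyed by node UUID.
    static final Map<String, String> topology = new HashMap<>();

    // Stand-in for ClientSessionFactoryImpl$Channel0Handler.handlePacket(DisconnectMessage):
    // the only thing taken from the packet is the sender's node UUID.
    static void handleDisconnect(String nodeIdFromPacket) {
        notifyNodeDown(nodeIdFromPacket);
    }

    // Stand-in for ServerLocatorImpl.notifyNodeDown(...), which ends in Topology.removeMember(nodeId):
    // whichever member carries this UUID is dropped, local or not.
    static void notifyNodeDown(String nodeId) {
        topology.remove(nodeId);
    }
}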
Node-1 sends the following packet to node-2 (log from node-1):
2016-08-11 14:54:19,483 TRACE [org.hornetq.core.client] (ServerService Thread Pool -- 151) Sending packet nonblocking PACKET(DisconnectMessage)[type=11, channelID=0, packetObject=DisconnectMessage, nodeID=736f8d16-5f58-11e6-b85d-9951dd74bc8f] on channeID=0
Node-2 handles this packet and deletes node-1 from the topology (log from node-2):
2016-08-11 14:54:19,564 TRACE [org.hornetq.core.client] (Thread-6 (HornetQ-client-netty-threads-1795638920)) handling packet PACKET(DisconnectMessage)[type=11, channelID=0, packetObject=DisconnectMessage, nodeID=736f8d16-5f58-11e6-b85d-9951dd74bc8f]
2016-08-11 14:54:19,564 TRACE [org.hornetq.core.client] (Thread-6 (HornetQ-client-netty-threads-1795638920)) Disconnect being called on client:PACKET(DisconnectMessage)[type=11, channelID=0, packetObject=DisconnectMessage, nodeID=736f8d16-5f58-11e6-b85d-9951dd74bc8f] server locator = ServerLocatorImpl (identity=(main-ClusterConnection::HornetQServerImpl::serverUUID=736f8d16-5f58-11e6-b85d-9951dd74bc8f)) [initialConnectors=[TransportConfiguration(name=netty-connector-cluster-node-1002, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=node1-domain-na&local-address=10-10-24-77, TransportConfiguration(name=netty-connector-cluster-node-2000, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=node2-domain-na&local-address=10-10-24-77], discoveryGroupConfiguration=null] notifying node 736f8d16-5f58-11e6-b85d-9951dd74bc8f as down: java.lang.Exception: trace
at org.hornetq.core.client.impl.ClientSessionFactoryImpl$Channel0Handler.handlePacket(ClientSessionFactoryImpl.java:1588) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.ChannelImpl.handlePacket(ChannelImpl.java:641) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.doBufferReceived(RemotingConnectionImpl.java:556) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:532) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.client.impl.ClientSessionFactoryImpl$DelegatingBufferHandler.bufferReceived(ClientSessionFactoryImpl.java:1712) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.remoting.impl.netty.HornetQChannelHandler.channelRead(HornetQChannelHandler.java:73) [hornetq-core-client-2.4.5.Final.jar:]
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:153) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:132) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_45]
2016-08-11 14:54:19,564 TRACE [org.hornetq.core.client] (Thread-6 (HornetQ-client-netty-threads-1795638920)) nodeDown ServerLocatorImpl (identity=(main-ClusterConnection::HornetQServerImpl::serverUUID=736f8d16-5f58-11e6-b85d-9951dd74bc8f)) [initialConnectors=[TransportConfiguration(name=netty-connector-cluster-node-1002, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=node1-domain-na&local-address=10-10-24-77, TransportConfiguration(name=netty-connector-cluster-node-2000, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=node2-domain-na&local-address=10-10-24-77], discoveryGroupConfiguration=null] nodeID=736f8d16-5f58-11e6-b85d-9951dd74bc8f as being down: java.lang.Exception: trace
at org.hornetq.core.client.impl.ServerLocatorImpl.notifyNodeDown(ServerLocatorImpl.java:1547) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.client.impl.ClientSessionFactoryImpl$Channel0Handler.handlePacket(ClientSessionFactoryImpl.java:1596) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.ChannelImpl.handlePacket(ChannelImpl.java:641) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.doBufferReceived(RemotingConnectionImpl.java:556) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:532) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.client.impl.ClientSessionFactoryImpl$DelegatingBufferHandler.bufferReceived(ClientSessionFactoryImpl.java:1712) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.remoting.impl.netty.HornetQChannelHandler.channelRead(HornetQChannelHandler.java:73) [hornetq-core-client-2.4.5.Final.jar:]
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:153) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:132) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_45]
2016-08-11 14:54:19,564 TRACE [org.hornetq.core.client] (Thread-6 (HornetQ-client-netty-threads-1795638920)) removeMember Topology@3a5ec825[owner=ClusterConnectionImpl@33512835[nodeUUID=736f8d16-5f58-11e6-b85d-9951dd74bc8f, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-10-24-77, address=jms, server=HornetQServerImpl::serverUUID=736f8d16-5f58-11e6-b85d-9951dd74bc8f]] removing nodeID=736f8d16-5f58-11e6-b85d-9951dd74bc8f, result=TopologyMember[name = backup-group-1, connector=Pair[a=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-10-24-77, b=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-10-24-77]], size = 0: java.lang.Exception: trace
at org.hornetq.core.client.impl.Topology.removeMember(Topology.java:350) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.client.impl.ServerLocatorImpl.notifyNodeDown(ServerLocatorImpl.java:1550) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.client.impl.ClientSessionFactoryImpl$Channel0Handler.handlePacket(ClientSessionFactoryImpl.java:1596) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.ChannelImpl.handlePacket(ChannelImpl.java:641) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.doBufferReceived(RemotingConnectionImpl.java:556) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:532) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.client.impl.ClientSessionFactoryImpl$DelegatingBufferHandler.bufferReceived(ClientSessionFactoryImpl.java:1712) [hornetq-core-client-2.4.5.Final.jar:]
at org.hornetq.core.remoting.impl.netty.HornetQChannelHandler.channelRead(HornetQChannelHandler.java:73) [hornetq-core-client-2.4.5.Final.jar:]
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:153) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:132) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101) [netty-all-4.0.15.Final.jar:4.0.15.Final]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_45]
Since both nodes share the same node ID, when node-2 removes node-1 from the topology, it is actually deleting itself.
Now, when clients such as MDBs try to connect to node-2, the "onConnection" method of "org.hornetq.core.server.cluster.impl.ClusterConnectionImpl" tries to find the local member.
Since the local member has already been deleted (as explained above), onConnection logs this message:
Local Member is not set at on ClusterConnection ClusterConnectionImpl@33512835[nodeUUID=736f8d16-5f58-11e6-b85d-9951dd74bc8f, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-10-24-77, address=jms, server=HornetQServerImpl::serverUUID=736f8d16-5f58-11e6-b85d-9951dd74bc8f]
As a result, messaging node-2 is not able to serve clients, even though it is up.
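To make the effect concrete, here is a toy illustration (my own simplification, not HornetQ code) of what happens on node-2 once the shared UUID is removed from its topology:

import java.util.HashMap;
import java.util.Map;

// Toy model only -- not HornetQ source. It assumes, as the traces show, that the live and
// the backup share the same node UUID.
public class LocalMemberLostIllustration {
    public static void main(String[] args) {
        String sharedUuid = "736f8d16-5f58-11e6-b85d-9951dd74bc8f"; // same UUID on node-1 and node-2

        Map<String, String> topologyOnNode2 = new HashMap<>();
        topologyOnNode2.put(sharedUuid, "TopologyMember[name=backup-group-1]"); // node-2's own (local) entry

        // Node-1 shuts down gracefully and sends a DisconnectMessage carrying the shared UUID;
        // node-2 removes that UUID from its topology -- i.e. it removes its own local member.
        topologyOnNode2.remove(sharedUuid);

        // Later, a client (e.g. an MDB) connects and the cluster connection looks up its local member.
        if (topologyOnNode2.get(sharedUuid) == null) {
            // This is the point where "Local Member is not set at on ClusterConnection ..." gets logged.
            System.out.println("local member missing -> node-2 cannot serve the connection");
        }
    }
}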
Can you please help me understand the issue and let me know if we are doing anything wrong here?
-
3. Re: Same nodeUUID in hornetQ static-connection clustered nodes (wildfly 8.2.0)
jbertram Aug 12, 2016 3:31 PM (in response to meabhi007)
At this point, if you get into a split-brain situation, your only real recourse is to shut down both brokers and manually resolve any potential data integrity issues. You shouldn't expect the broker to work properly in any way in a split-brain situation, as it wasn't designed for this.
-
4. Re: Same nodeUUID in hornetQ static-connection clustered nodes (wildfly 8.2.0)
meabhi007 Aug 12, 2016 3:59 PM (in response to jbertram)
Thanks, Justin.
I'll try the shared-store approach to see if it can help avoid split-brain issues.