Live-Backup configuration not working
luiz_gustavo Feb 17, 2016 10:41 AMUsing JBoss EAP 6.2 I've configured a cluster with two nodes. These nodes are on distinct hosts (fiscally separated), and both contains just one server configured (besides host controller, and the domain controller in one of them).
The application deployed in this cluster depends on messaging mechanism to send messages to desktop clients, so I've configured a live-backup pair with the two nodes, in which the node (10.200.76.20) that contains the domain controller is configured as the live server and the other node (10.200.76.21) is configured as the backup server.
In the applications (server and desktop client) we configured the fail-over mechanism, passing the IP addresses of both nodes when connecting, treating connection exceptions on the message consumers so in the case of failure it tries to reconnect, and on initial tests all worked fine.
Well, after some time we've created other clusters in the exact same way, each one independent of the others, but we notice that in some clusters some messages sent by the server application were not being delivered to the desktop clients. Then when we did a shutdown on the live server, the messages that had not been delivered before were delivered to the client.
Looking at the logs, the problem I found in these cases is the following: When I initialize both instances, the HornetQ server starts as live server in both; it seems that the "backup=true" configuration we do in node 2 is ignored.
Log snippet on node 1:
13:56:18,311 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221000: live server is starting with configuration HornetQ Configuration (clustered=true,backup=false,sharedStore=false,journalDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messagingjournal,bindingsDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messagingbindings,largeMessagesDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messaginglargemessages,pagingDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messagingpaging)
Log snippet on node 2:
13:57:56,682 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221000: live server is starting with configuration HornetQ Configuration (clustered=true,backup=false,sharedStore=false,journalDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messagingjournal,bindingsDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messagingbindings,largeMessagesDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messaginglargemessages,pagingDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messagingpaging)
Questions:
1. Look that node 2 starts with "clustered=true,backup=false", but shouldn't it be "clustered=true,backup=true"?
2. What's the exact behaviour of backup servers in initialization? They should really create the destinations, always, or just when assuming the live server messages, in case of failure of the live server?
3. In documentations, I read that even when backup servers are active (but in backup mode) they can receive message replications, but clients can never consume from them, just from the live server, and that just in case of live server failure the consumers can consume from backup server. It that right?
These are the configuration snippets of both servers, and the full domain.xml of both are attached.
One Node 1 (live)
<hornetq-server> <persistence-enabled>true</persistence-enabled> <security-enabled>false</security-enabled> <cluster-password>orgunitx</cluster-password> <backup>false</backup> <allow-failback>true</allow-failback> <failover-on-shutdown>true</failover-on-shutdown> <shared-store>false</shared-store> <journal-type>NIO</journal-type> <journal-min-files>2</journal-min-files> <check-for-live-server>true</check-for-live-server> <backup-group-name>cluster-orgunitx</backup-group-name>
One Node 2 (backup)
<hornetq-server> <persistence-enabled>true</persistence-enabled> <journal-type>NIO</journal-type> <journal-min-files>2</journal-min-files> <security-enabled>false</security-enabled> <cluster-password>orgunitx</cluster-password> <backup>true</backup> <!-- in backup server use "true" --> <allow-failback>true</allow-failback> <failover-on-shutdown>true</failover-on-shutdown> <shared-store>false</shared-store> <check-for-live-server>true</check-for-live-server> <backup-group-name>cluster-orgunitx</backup-group-name>
And below there are more log snippets of both servers right after the initialization (if necessary I can post complete logs, and the lines mentioned are relative the entire log):
Analysis of server-orgunitx-node1.log
line 130 - initializing messaging service (JMS) in node 1:
13:56:18,311 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221000: live server is starting with configuration HornetQ Configuration (clustered=true,backup=false,sharedStore=false,journalDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messagingjournal,bindingsDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messagingbindings,largeMessagesDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messaginglargemessages,pagingDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node1\domain\servers\orgunitx-node1\data\messagingpaging)
lines 777 a 779 - initializing HTTP service of the node 1 in cluster mode (til this point the HTTP cluster is not recognized):
13:56:32,524 INFO [stdout] (ServerService Thread Pool -- 63) ------------------------------------------------------------------- 13:56:32,524 INFO [stdout] (ServerService Thread Pool -- 63) GMS: address=orgunitxnode1:orgunitx-node1/web, cluster=web, physical address=10.200.76.20:7600 13:56:32,524 INFO [stdout] (ServerService Thread Pool -- 63) -------------------------------------------------------------------
lines 817 a 845 - JMS service in node 1 already initialized and starts the creation of destination objects:
13:56:33,117 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221007: Server is now live 13:56:33,117 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221001: HornetQ Server version 2.3.12.Final (2.3.12, 123) [31e6e889-cd9e-11e3-b757-f5fb6f15c333] 13:56:33,148 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 62) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/ConnectionFactory 13:56:33,148 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 95) HQ221003: trying to deploy queue jms.topic.NotificacaoCTETopic 13:56:33,148 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 95) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/topic/NotificacaoCTETopic 13:56:33,148 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 95) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/topic/NotificacaoCTETopic 13:56:33,148 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 97) HQ221003: trying to deploy queue jms.topic.ControleTrafegoEntradaTopic 13:56:33,148 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 97) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/topic/ControleTrafegoEntradaTopic 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 97) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/topic/ControleTrafegoEntradaTopic 13:56:33,164 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 96) HQ221003: trying to deploy queue jms.topic.ListarCargasPatioTopic 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 96) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/topic/ListarCargasPatioTopic 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 96) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/topic/ListarCargasPatioTopic 13:56:33,164 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 98) HQ221003: trying to deploy queue jms.queue.PRXQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 98) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/PRXQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 98) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/PRXQueue 13:56:33,164 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 100) HQ221003: trying to deploy queue jms.queue.BalancaEntradaQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 100) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/BalancaEntradaQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 100) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/BalancaEntradaQueue 13:56:33,164 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 101) HQ221003: trying to deploy queue jms.queue.BalancaSaidaQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 101) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/BalancaSaidaQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 101) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/BalancaSaidaQueue 13:56:33,164 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 104) HQ221003: trying to deploy queue jms.queue.ControleTrafegoSaidaQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 104) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/ControleTrafegoSaidaQueue 13:56:33,164 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 104) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/ControleTrafegoSaidaQueue 13:56:33,179 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 103) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/jms/RemoteConnectionFactory 13:56:33,257 INFO [org.jboss.as.connector.deployment] (MSC service thread 1-6) JBAS010406: Fábrica de conexão registrada java:/JmsXA 13:56:33,304 INFO [org.hornetq.ra] (MSC service thread 1-6) HornetQ resource adaptor started 13:56:33,304 INFO [org.jboss.as.connector.services.resourceadapters.ResourceAdapterActivatorService$ResourceAdapterActivator] (MSC service thread 1-6) IJ020002: Deployed: file://RaActivatorhornetq-ra 13:56:33,304 INFO [org.jboss.as.connector.deployment] (MSC service thread 1-16) JBAS010401: Limite JCA ConnectionFactory [java:/JmsXA]
lines 1173/1174 - in this point the HTTP services (node 1 and node 2) recognize each other in cluster:
13:58:10,207 INFO [org.jboss.as.clustering] (Incoming-1,shared=tcpping) JBAS010225: Nova visualização do cluster para a partição web (id: 1, delta: 1, merge: false) : [orgunitxnode1:orgunitx-node1/web, orgunitxnode2:orgunitx-node2/web] 13:58:10,207 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-1,shared=tcpping) ISPN000094: Received new cluster view: [orgunitxnode1:orgunitx-node1/web|1] [orgunitxnode1:orgunitx-node1/web, orgunitxnode2:orgunitx-node2/web]
line 2713 - in this point the JMS service in node 1 recognizes the cluster
14:05:38,500 INFO [org.hornetq.core.server] (Thread-6 (HornetQ-server-HornetQServerImpl::serverUUID=31e6e889-cd9e-11e3-b757-f5fb6f15c333-9887497)) HQ221027: Bridge ClusterConnectionBridge@1f7ff17 [name=sf.my-cluster.3d9d1beb-cd9e-11e3-a542-3fcdf66d747d, queue=QueueImpl[name=sf.my-cluster.3d9d1beb-cd9e-11e3-a542-3fcdf66d747d, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=31e6e889-cd9e-11e3-b757-f5fb6f15c333]]@88b0d3 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@1f7ff17 [name=sf.my-cluster.3d9d1beb-cd9e-11e3-a542-3fcdf66d747d, queue=QueueImpl[name=sf.my-cluster.3d9d1beb-cd9e-11e3-a542-3fcdf66d747d, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=31e6e889-cd9e-11e3-b757-f5fb6f15c333]]@88b0d3 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-200-76-21], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@12782165[nodeUUID=31e6e889-cd9e-11e3-b757-f5fb6f15c333, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-200-76-20, address=jms, server=HornetQServerImpl::serverUUID=31e6e889-cd9e-11e3-b757-f5fb6f15c333])) [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-200-76-21], discoveryGroupConfiguration=null]] is connected
Analysis of server-orgunitx-node2.log
line 132 - initializing messaging service (JMS) on node 2. Look that the service recognizes the cluster, but its backup configuration is false, and it should be true, because in domain.xml the backup configuration for node 2, in messaging subsystem, is true.
13:57:56,682 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221000: live server is starting with configuration HornetQ Configuration (clustered=true,backup=false,sharedStore=false,journalDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messagingjournal,bindingsDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messagingbindings,largeMessagesDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messaginglargemessages,pagingDirectory=c:\COMPANYX\CLUSTER\jboss-eap-orgunitx-node2\domain\servers\orgunitx-node2\data\messagingpaging)
lines 796 a 798 - initializing HTTP service of the node 2 in cluster mode (til this point the HTTP cluster is not recognized):
13:58:09,772 INFO [stdout] (ServerService Thread Pool -- 63) ------------------------------------------------------------------- 13:58:09,772 INFO [stdout] (ServerService Thread Pool -- 63) GMS: address=orgunitxnode2:orgunitx-node2/web, cluster=web, physical address=10.200.76.21:7600 13:58:09,772 INFO [stdout] (ServerService Thread Pool -- 63) -------------------------------------------------------------------
lines 800 e 801 - in this point the HTTP services (node 1 and node 2) recognize each other in cluster:
13:58:10,661 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 63) ISPN000094: Received new cluster view: [orgunitxnode1:orgunitx-node1/web|1] [orgunitxnode1:orgunitx-node1/web, orgunitxnode2:orgunitx-node2/web] 13:58:10,708 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 63) ISPN000079: Cache local address is orgunitxnode2:orgunitx-node2/web, physical addresses are [10.200.76.21:7600]
lines 829 a 855 - JMS service already initialized and starts the creation of destination objects:
13:58:12,439 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221007: Server is now live 13:58:12,439 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221001: HornetQ Server version 2.3.12.Final (2.3.12, 123) [3d9d1beb-cd9e-11e3-a542-3fcdf66d747d] 13:58:12,455 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 62) HQ221003: trying to deploy queue jms.queue.BalancaSaidaQueue 13:58:12,455 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 62) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/BalancaSaidaQueue 13:58:12,455 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 62) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/BalancaSaidaQueue 13:58:12,471 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 67) HQ221003: trying to deploy queue jms.topic.ListarCargasPatioTopic 13:58:12,471 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 67) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/topic/ListarCargasPatioTopic 13:58:12,471 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 67) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/topic/ListarCargasPatioTopic 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 68) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/ConnectionFactory 13:58:12,486 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 69) HQ221003: trying to deploy queue jms.topic.ControleTrafegoEntradaTopic 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 69) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/topic/ControleTrafegoEntradaTopic 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 69) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/topic/ControleTrafegoEntradaTopic 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 93) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/jms/RemoteConnectionFactory 13:58:12,486 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 90) HQ221003: trying to deploy queue jms.topic.NotificacaoCTETopic 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 90) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/topic/NotificacaoCTETopic 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 90) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/topic/NotificacaoCTETopic 13:58:12,486 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 64) HQ221003: trying to deploy queue jms.queue.BalancaEntradaQueue 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 64) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/BalancaEntradaQueue 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 64) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/BalancaEntradaQueue 13:58:12,486 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 71) HQ221003: trying to deploy queue jms.queue.ControleTrafegoSaidaQueue 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 71) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/ControleTrafegoSaidaQueue 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 71) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/ControleTrafegoSaidaQueue 13:58:12,486 INFO [org.hornetq.core.server] (ServerService Thread Pool -- 102) HQ221003: trying to deploy queue jms.queue.PRXQueue 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 102) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:jboss/exported/queue/PRXQueue 13:58:12,486 INFO [org.jboss.as.messaging] (ServerService Thread Pool -- 102) JBAS011601: Objeto de mensagem vinculado ao nome jndi java:/queue/PRXQueue 13:58:12,549 INFO [org.jboss.as.connector.deployment] (MSC service thread 1-21) JBAS010406: Fábrica de conexão registrada java:/JmsXA 13:58:12,595 INFO [org.hornetq.ra] (MSC service thread 1-21) HornetQ resource adaptor started
line 2292 - in this point the JMS service in node 2 recognizes the cluster
14:05:38,226 INFO [org.hornetq.core.server] (Thread-0 (HornetQ-server-HornetQServerImpl::serverUUID=3d9d1beb-cd9e-11e3-a542-3fcdf66d747d-23729605)) HQ221027: Bridge ClusterConnectionBridge@1d28dc9 [name=sf.my-cluster.31e6e889-cd9e-11e3-b757-f5fb6f15c333, queue=QueueImpl[name=sf.my-cluster.31e6e889-cd9e-11e3-b757-f5fb6f15c333, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=3d9d1beb-cd9e-11e3-a542-3fcdf66d747d]]@10519c6 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@1d28dc9 [name=sf.my-cluster.31e6e889-cd9e-11e3-b757-f5fb6f15c333, queue=QueueImpl[name=sf.my-cluster.31e6e889-cd9e-11e3-b757-f5fb6f15c333, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=3d9d1beb-cd9e-11e3-a542-3fcdf66d747d]]@10519c6 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-200-76-20], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@18916823[nodeUUID=3d9d1beb-cd9e-11e3-a542-3fcdf66d747d, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-200-76-21, address=jms, server=HornetQServerImpl::serverUUID=3d9d1beb-cd9e-11e3-a542-3fcdf66d747d])) [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=10-200-76-20], discoveryGroupConfiguration=null]] is connected
-
domain-node1.xml.zip 5.5 KB
-
domain-node2.xml.zip 5.9 KB