Core Bridge with target on live/backup doesn't failover
brenuart Sep 17, 2013 4:09 PM
Hello everybody,
I am currently setting up the following HornetQ deployment:
- two remote sites with WAN connection (two different networks)
- site A is running a pair of HornetQ instances, standalone, configured as live/backup with multicast discovery (call them A-Live/172.16.1.6 and A-Backup/172.16.1.7)
- site B is running a standalone HornetQ instance (call it B)
- same versions of HornetQ everywhere
- a bridge is configured at site B to forward the content of queue Q from site B to site A. The bridge definition refers to a static connector pointing to A-Live only.
Problem
The core bridge doesn't fail over to the backup server after the live server is killed.
The problem seems to be related to the issue https://issues.jboss.org/browse/HORNETQ-1218
This issue appears to be fixed in versions 2.3.3.Final and 2.4.0.Alpha1, so I ran tests with the following versions but couldn't get it working:
- HornetQ 2.3.0.Final (affected by the issue so it shouldn't work)
- HornetQ 2.3.8.Final (issue fixed - should work but doesn't)
- HornetQ 2.4.0.Beta1 (issue fixed - should work but doesn't)
I suppose I did something wrong in my configuration but can't find what :-(
Scenario
The live/backup configuration seems to work: the backup takes over when the live server is killed. Remote consumer/producer clients, running on site A or B, are transparently redirected to A-Backup (they use JNDI to get access to the ConnectionFactory and the queue). When A-Live is restarted, it discovers A-Backup, synchronises its content and asks it to shut down (as per the configuration).
The bridge configuration seems to work as well: messages posted on queue Q at site B are properly (and transparently) forwarded to queue Q on A-Live.
However, the bridge fails to reconnect to A-Backup after A-Live is killed.
When running with logging set at DEBUG level, one can see the following messages at site B (bridge source) when the topology changes at site A. The example below shows what happens when A-Backup is started and ready (log message is indented for better reading):
DEBUG [org.hornetq.core.client] ClientSessionFactoryImpl received backup update for live/backup pair =
    TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=172-16-1-6
    / TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=172-16-1-7
    but it didn't belong to
    TransportConfiguration(name=central-connector, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5455&host=172-16-1-6
As far as I can tell, the bridge is notified of the new cluster topology (in this case the addition of A-Backup with IP 172.16.1.7) but refuses to consider it because it doesn't belong to its configuration. If A-Live is killed, here is what happens to the bridge:
DEBUG [org.hornetq.core.client] calling cleanup on ClientSessionImpl [name=4fd16b4f-1fd0-11e3-be41-8151284bb877, username=HORNETQ.CLUSTER.ADMIN.USER, closed=false, factory = ClientSessionFactoryImpl [serverLocator=ServerLocatorImpl (identity=Bridge my-bridge) [initialConnectors=[TransportConfiguration(name=central-connector, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5455&host=172-16-1-6], discoveryGroupConfiguration=null], connectorConfig=TransportConfiguration(name=central-connector, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5455&host=172-16-1-6, backupConfig=null], metaData=()]@544e732e
DEBUG [org.hornetq.core.client] Trying reconnection attempt 0/0
DEBUG [org.hornetq.core.client] Trying to connect with connector = org.hornetq.core.remoting.impl.netty.NettyConnectorFactory@51b17cc0, parameters = {port=5455, host=172.16.1.6} connector = NettyConnector [host=172.16.1.6, port=5455, httpEnabled=false, useServlet=false, servletPath=/messaging/HornetQServlet, sslEnabled=false, useNio=false]
DEBUG [org.hornetq.core.client] Started Netty Connector version 3.6.6.Final-90e1eb2
DEBUG [org.hornetq.core.client] Trying to connect at the main server using connector :TransportConfiguration(name=central-connector, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5455&host=172-16-1-6
DEBUG [org.hornetq.core.client] Remote destination: /172.16.1.6:5455
DEBUG [org.hornetq.core.client] Main server is not up. Hopefully there's a backup configured now!
DEBUG [org.hornetq.core.client] Could not connect to any server. Didn't have reconnection configured on the ClientSessionFactory
DEBUG [org.hornetq.core.client] Trying reconnection attempt 0/0
DEBUG [org.hornetq.core.client] Trying to connect with connector = org.hornetq.core.remoting.impl.netty.NettyConnectorFactory@7198dab2, parameters = {port=5455, host=172.16.1.6} connector = NettyConnector [host=172.16.1.6, port=5455, httpEnabled=false, useServlet=false, servletPath=/messaging/HornetQServlet, sslEnabled=false, useNio=false]
DEBUG [org.hornetq.core.client] Started Netty Connector version 3.6.6.Final-90e1eb2
DEBUG [org.hornetq.core.client] Trying to connect at the main server using connector :TransportConfiguration(name=central1-connector, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5455&host=172-16-1-6
The bridge keeps trying to reconnect to the original A-Live server forever, as if there were no backup.
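To make the two sides of the "received backup update" message easier to compare, the connector details can be pulled out of the log line with a small script. This is only a sketch in Python: the regular expression is written against the message format shown above, the helper name parse_connectors is made up for illustration, and nothing here uses a HornetQ API.

```python
import re

# Matches the TransportConfiguration strings as they appear in the DEBUG output above.
# Assumption: the format is "TransportConfiguration(name=..., factory=...) ?port=N&host=a-b-c-d".
TC_PATTERN = re.compile(
    r"TransportConfiguration\(name=(?P<name>[^,]+), factory=[^)]+\) "
    r"\?port=(?P<port>\d+)&host=(?P<host>[\d-]+)"
)


def parse_connectors(log_line):
    """Return a (name, host, port) tuple for every TransportConfiguration in the line."""
    return [
        (m.group("name"), m.group("host").replace("-", "."), int(m.group("port")))
        for m in TC_PATTERN.finditer(log_line)
    ]


# The "received backup update" message, copied from the log above.
LOG_LINE = (
    "DEBUG [org.hornetq.core.client] ClientSessionFactoryImpl received backup update "
    "for live/backup pair = TransportConfiguration(name=netty, "
    "factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) "
    "?port=5445&host=172-16-1-6 / TransportConfiguration(name=netty, "
    "factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) "
    "?port=5445&host=172-16-1-7 but it didn't belong to "
    "TransportConfiguration(name=central-connector, "
    "factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) "
    "?port=5455&host=172-16-1-6"
)

# The message lists the live connector, the backup connector, then the bridge's connector.
live, backup, bridge = parse_connectors(LOG_LINE)

for label, (name, host, port) in (("live", live), ("backup", backup), ("bridge", bridge)):
    print(f"{label:<7} name={name:<18} host={host:<12} port={port}")

# Compare host and port only; the connector names differ in any case.
print("bridge connector equals advertised live connector:", bridge[1:] == live[1:])
```

Run against the message above, the script shows that the live/backup pair is advertised on port 5445 while the bridge's central-connector points at port 5455 on the same host. Whether that difference is what makes the bridge ignore the backup is not established here.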
Configuration
Live/Backup
A-Live and A-Backup are running the same configuration.
Bind address and ports are given at startup by the run.sh script.
A-Backup is started with -Dhornetq.backup=true.
Configuration files available in the "SiteA (live/backup).zip" attachment.
Bridge
Configuration files available in the "SiteB (bridge).zip" attachment.
- SiteA (live:backup).zip 4.9 KB
- SiteB (bridge).zip 4.4 KB