
Primary Attempts to Reconnect to Backup on Failback

bighenry Jan 6, 2012 10:03 AM

Hi all.

Apologies if this is a stupid question; I'm just starting out with HornetQ.

I'm attempting to set up a live/backup node pair (HA) as standalone servers. Both start up fine: the live starts as live, the backup starts as backup, and the backup takes over when the live is killed.

When I bring the live server back up, a failback occurs as I would expect. With logging in FINE mode, however, it seems the live server perpetually tries to re-establish a cluster connection to the now-shutdown 'backup' live server. From the logs, the flow appears to be:

1) Shutdown live, backup takes over as live.

2) Restart live -- it initially starts as a backup server and establishes a cluster connection to the 'backup' live.

3) A request is made to failback to the original live server.

4) 'backup' live server shuts down and restarts as a backup server. In doing so, it signals to all cluster connections that it is shutting down.

5) The original live receives the 'shut down' signal and begins its reconnect attempt cycle.

6) The original live server restarts as live.

Step 5 appears to be the problem: the reconnect cycle now runs as an infinite loop, attempting to connect to the backup, which will remain inactive, e.g.

[Thread-1 (group:HornetQ-client-global-threads-450608779)] 14:55:09,293 FINE [org.hornetq.core.client.impl.ClientSessionFactoryImpl] Trying reconnection attempt 2719

[Thread-1 (group:HornetQ-client-global-threads-450608779)] 14:55:09,294 FINE [org.hornetq.core.remoting.impl.netty.NettyConnector] Started Netty Connector version 3.2.3.Final-r${buildNumber}

[Thread-1 (group:HornetQ-client-global-threads-450608779)] 14:55:09,294 FINE [org.hornetq.core.client.impl.ClientSessionFactoryImpl] Trying to connect at the main server using connector :org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5446&host=localhost

[Thread-1 (group:HornetQ-client-global-threads-450608779)] 14:55:09,294 FINE [org.hornetq.core.client.impl.ClientSessionFactoryImpl] Main server is not up. Hopefully there's a backup configured now!
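To see how fast the attempt counter climbs, the number can be pulled out of lines like the ones above. This is only a rough sketch: the class and method names (`ReconnectLogScan`, `attemptNumber`) are made up for illustration, and it assumes the message text stays exactly "Trying reconnection attempt N" as in my log.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReconnectLogScan {
    // Assumption: the message text matches what appears in the FINE log lines above.
    private static final Pattern ATTEMPT =
            Pattern.compile("Trying reconnection attempt (\\d+)");

    /** Returns the attempt counter if the line is a reconnection-attempt message. */
    public static OptionalInt attemptNumber(String logLine) {
        Matcher m = ATTEMPT.matcher(logLine);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String line = "[Thread-1 (group:HornetQ-client-global-threads-450608779)] "
                + "14:55:09,293 FINE [org.hornetq.core.client.impl.ClientSessionFactoryImpl] "
                + "Trying reconnection attempt 2719";
        // Prints the parsed counter for the sample line.
        System.out.println(attemptNumber(line));
    }
}
```

Lines that are not reconnection attempts (for example the "Main server is not up" message) come back as an empty `OptionalInt`, so they can simply be skipped.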

Is this behavior expected? I note that it doesn't occur when I initially start the servers, so I was surprised to see it on failback.

FYI, I've attached my server0 and server1 configs.

Many thanks,

Henry

failover.zip 14.5 KB

1. Re: Primary Attempts to Reconnect to Backup on Failback

bighenry Jan 6, 2012 10:25 AM (in response to bighenry)

Actually, please ignore me!

I misread the logs. It seems the live server does perpetually attempt to connect to the backup on normal startup, and in fact the backup, when it goes live, perpetually tries to connect to the live. I'm assuming, therefore, that this is normal clustered behaviour.
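For anyone else puzzled by the ever-growing attempt counter, the pattern I think I'm seeing is a plain retry loop with no upper bound. The following is not HornetQ's code, just a minimal sketch of that pattern: `RetryLoopSketch`, `connectWithRetry` and the convention that a negative `maxAttempts` means "retry forever" are all my own assumptions for illustration.

```java
import java.util.function.BooleanSupplier;

public class RetryLoopSketch {
    /**
     * Calls connect until it succeeds or the attempt budget runs out.
     * A negative maxAttempts is treated as "no limit" (assumed convention).
     * Returns the number of attempts used, or -1 if it gave up or was interrupted.
     */
    public static int connectWithRetry(BooleanSupplier connect,
                                       int maxAttempts,
                                       long retryIntervalMs) {
        int attempt = 0;
        while (maxAttempts < 0 || attempt < maxAttempts) {
            attempt++;
            if (connect.getAsBoolean()) {
                return attempt; // peer answered, stop retrying
            }
            try {
                Thread.sleep(retryIntervalMs); // wait before the next attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return -1;
            }
        }
        return -1; // budget exhausted without a connection
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical peer that only becomes reachable on the third attempt.
        int used = connectWithRetry(() -> ++calls[0] >= 3, -1, 10);
        System.out.println("connected after " + used + " attempts");
    }
}
```

With an unlimited budget and a peer that stays passive, a loop like this never exits, which would match the counter reaching the thousands in the log.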

Apologies for the noise ...