1 Reply Latest reply on May 12, 2008 3:53 PM by gklyuzner

Alternative HA, Master/Slave deployment scenario using virtual IP

jstrachan Jan 7, 2008 6:33 AM

I was with a customer the other day, working with their operations team on the ideal HA / clustering mechanism for keeping Master/Slave brokers available, with messages replicated to multiple data centres. I thought I'd mention it here as it's a little different.

We went through the various options. Some kind of Master/Slave is useful when messages need to be replicated to multiple data centres, so that the messages themselves are highly available. A common solution is JDBC Master/Slave, as folks often have an HA database available for reuse.

In this case there was no HA database infrastructure (yet), but there was a SAN we could use, so Shared File System Master Slave (SFSMS) looked like a good option. It turned out that operations wanted control over which machine was the actual master; with SFSMS this is more or less random, as the first one in wins. However, the customer had a virtual IP system in place, which we decided to reuse.

So what actually happens is that we have two masters using the SAN file system. The virtual IP system maps one virtual IP address to the master; if that machine fails, the virtual IP system fails over to the other machine. In other words, master/slave failover on the server side is implemented via virtual IP, the client uses failover to a single hostname to get client-side failover, and the SAN maintains replicas of the data across data centres.
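
On the client side, this amounts to a failover transport URI with a single entry that points at the virtual IP hostname. A minimal sketch of building such a URI follows. The hostname `broker-vip.example.com`, the port, and the helper name `failover_uri` are assumptions for illustration, and the reconnect option should be checked against the failover transport documentation for your broker version.

```python
from urllib.parse import urlencode


def failover_uri(hosts, port=61616, **options):
    """Build an ActiveMQ-style failover transport URI (illustrative helper).

    hosts   -- list of broker hostnames (here: just the virtual IP name)
    options -- failover transport options appended as a query string
    """
    if not hosts:
        raise ValueError("at least one broker host is required")
    brokers = ",".join(f"tcp://{h}:{port}" for h in hosts)
    uri = f"failover:({brokers})"
    if options:
        uri += "?" + urlencode(options)
    return uri


# Hypothetical virtual IP hostname; the client only ever sees this one name,
# whichever physical machine currently holds the virtual IP.
VIP_URI = failover_uri(["broker-vip.example.com"], initialReconnectDelay=100)
print(VIP_URI)
```

Because the list contains only one hostname, the client never needs to know which of the two machines is live; it just keeps reconnecting to the same name until the virtual IP has moved.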

1. Re: Alternative HA, Master/Slave deployment scenario using virtual IP

gklyuzner May 12, 2008 3:53 PM (in response to jstrachan)

In some cases failover based on virtual IP is not a good solution for applications with strict latency requirements (<100 ms). In real life it takes time for network hardware to detect the failure and switch the traffic (more than 200 ms, or even 2-3 sec). In this case the critical components of the system should have application servers and brokers in hot reserve, and should also have an alternate network path for the messages, in order to protect the application from software/hardware/network failures.
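
To put rough numbers on that, here is a small sketch that compares the switch-over pause with the latency budget. The function name `exceeds_budget` is made up for illustration; the figures are the ones quoted above (100 ms budget, 200 ms and 3 sec switch times), not measurements.

```python
def exceeds_budget(failover_ms, budget_ms=100):
    """Return how many ms a failover pause overshoots the latency budget.

    Returns 0 when the pause fits inside the budget.
    """
    return max(0, failover_ms - budget_ms)


# Switch times taken from the figures quoted in the post.
for label, pause_ms in [("fast switch", 200), ("slow switch", 3000)]:
    over = exceeds_budget(pause_ms)
    print(f"{label}: {pause_ms} ms pause, {over} ms over a 100 ms budget")
```

Even the optimistic case is over budget, so any message in flight during the switch misses its deadline.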

BTW, why can't we use a simple cyclical Master/Slave solution?
1. Start a fault-tolerant pair: Master(Host_a)/Slave(Host_b)
2. The Master fails on Host_a
3. The Slave becomes a stand-alone Master on Host_b
4. Start a new Slave on Host_a in place of the failed Master
5. We again have a fault-tolerant pair: Slave(Host_a)/Master(Host_b)

Especially if persistence is not an issue, I cannot see why it would not work. I have used similar procedures, but for our own application process (hot reserve). Obviously some synchronization has to be done in the background before it can become a fully fault-tolerant pair again. During that time the "new" broker (Slave) should not accept any client failover until it has fully synchronized itself with the "old" broker (current Master).
PS: Host_c, Host_a, or a virtual IP could be used to start the new Slave broker.
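
The cyclical scheme and the synchronization gate can be sketched as a small state model. This is only an illustration of the procedure described in this post, not how any broker actually implements Master/Slave; the class and method names (`Broker`, `Pair`, `master_failed`, `start_slave`, `sync_complete`) are all hypothetical.

```python
class Broker:
    def __init__(self, host, role):
        self.host = host
        self.role = role                 # "master", "slave", or "failed"
        self.synced = role == "master"   # a fresh slave starts unsynchronized


class Pair:
    def __init__(self, master_host, slave_host):
        # Step 1: start a fault-tolerant pair that is already in sync.
        self.master = Broker(master_host, "master")
        self.slave = Broker(slave_host, "slave")
        self.slave.synced = True

    def is_fault_tolerant(self):
        # Fault tolerant only with a slave that has fully caught up.
        return self.slave is not None and self.slave.synced

    def master_failed(self):
        # Steps 2-3: the slave takes over, but only if it is synchronized.
        if not self.is_fault_tolerant():
            raise RuntimeError("no synchronized slave to fail over to")
        self.master.role = "failed"
        self.slave.role = "master"
        self.master, self.slave = self.slave, None

    def start_slave(self, host):
        # Step 4: a new slave joins; it must not accept failover yet.
        self.slave = Broker(host, "slave")

    def sync_complete(self):
        # Step 5: background synchronization finished, pair is whole again.
        self.slave.synced = True


pair = Pair("Host_a", "Host_b")
pair.master_failed()              # Host_b becomes the stand-alone master
pair.start_slave("Host_a")        # new slave, still synchronizing
print(pair.master.host, pair.is_fault_tolerant())
pair.sync_complete()
print(pair.master.host, pair.slave.host, pair.is_fault_tolerant())
```

The point of the `synced` flag is the window between steps 4 and 5: a second failure there cannot be covered, which the model reports as an error instead of silently promoting a stale slave.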

Edited by: gklyuzner on May 9, 2008 5:12 PM

Edited by: gklyuzner on May 9, 2008 5:13 PM

Edited by: gklyuzner on May 12, 2008 3:35 PM