4 Replies Latest reply on Dec 11, 2009 3:14 PM by rabrego

HornetQ Replication Functionality

rabrego Dec 11, 2009 12:57 PM

Is there any built-in ability to do replication within HornetQ? I've seen that diverts can be set up along with bridges, but these all seem to rely on queues already being in place and/or destination queues being well-known.

What I would like to do is create a HornetQ cluster wherein when a queue is programmatically created on one node of the cluster, the queue will be replicated amongst all nodes in the cluster. This means that if Node1 has a client which creates a queue called DynamicQueueA, then we would like to have DynamicQueueA be created on Node2, Node3, etc.

Similarly, if a message is sent by a client to Node2 for DynamicQueueA, the message should be copied (replicated) to Node1, Node3, etc. I can't seem to figure out how to do that particular scenario.

It seems that it is possible to use an Interceptor to intercept the Queue creation message and then manually replicate the queue on all other nodes, but that solution won't give the functionality that, for example, bridges would give us regarding tolerance of nodes being down, guaranteed delivery, etc.

Is what I'm laying out possible? If so, how would I go about implementing/architecting such a thing with HornetQ ?

Your expertise is very much appreciated! I am loving HornetQ so far.

1. Re: HornetQ Replication Functionality

timfox Dec 11, 2009 1:24 PM (in response to rabrego)

HornetQ already supports high availability, which can be configured to use either replication or failover via a shared store:

http://hornetq.sourceforge.net/docs/hornetq-2.0.0.CR1/user-manual/en/html/ha.html
2. Re: HornetQ Replication Functionality

rabrego Dec 11, 2009 1:45 PM (in response to rabrego)

The high availability quoted in the manual is for backup pairs only (so far as I can tell) and is not true replication. In the documentation for HA, it explicitly states that for the backup pair, only *one* of the servers is considered "live" until a failover occurs, and then the other server is live (but the first one is offline).

My question relates to true replication wherein you can have 3 or 4 HornetQ servers replicating amongst each other (see the example I quoted in the original post).

I had, in fact, read through the clustering, diverts, bridge, HA, and failover sections of the manual. I agree that if I was only doing HA, then the section you quoted is relevant and would provide what I need. However, I am more interested in a cluster of servers replicating to each other.

I can see that if I were only doing one-way replication (i.e., ServerA replicating to ServerB), then Interceptors would probably be the way to go (maybe), as everything goes one way and all messages can be captured via the interceptor and then posted to the replica server. However, this breaks down when you consider guaranteed delivery or multi-way replication.

Perhaps some form of store-and-forward pattern using Diverts would be of some use, but then I would not know how to dynamically determine the final destination for the diverted message and deliver it. For instance, I could divert (copy) a message to a well-known queue, say, "ForwardQueue". But then how could I deliver from "ForwardQueue" to more than one "dynamic" queue? Example: On ServerA, the QueueAxxx queues (all dynamically named queues starting with QueueA) all have their messages "diverted" (copied) to a store-and-forward queue such as "ForwardQueue". How then do I deliver those messages to ServerB and have them end up in their respective QueueAxxx destination queues? Then, how do I prevent those QueueAxxx queues on ServerB from storing and forwarding the messages back to ServerA? And how do I get the dynamically created queues onto ServerB to begin with if they were created by a client on ServerA?
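
To make the scenario concrete, here is a toy in-memory sketch in plain Java. It uses no HornetQ classes: `Server`, `Msg`, the `replicated` flag, and the "QueueA" prefix rule are all hypothetical names invented for illustration. It only models the behavior being asked for: a copy of each message is diverted to a forward queue, a bridge step delivers it to the same-named queue on the other server (creating that queue if it is missing), and a marker on the message stops it from being forwarded back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only; none of these types are HornetQ APIs.
class StoreAndForwardSketch {

    // A message remembers the address it was originally sent to and
    // whether it arrived through replication (hypothetical headers).
    static final class Msg {
        final String originalAddress;
        final String body;
        final boolean replicated;

        Msg(String originalAddress, String body, boolean replicated) {
            this.originalAddress = originalAddress;
            this.body = body;
            this.replicated = replicated;
        }
    }

    static final class Server {
        final String name;
        final Map<String, List<String>> queues = new HashMap<>();
        final Queue<Msg> forwardQueue = new ArrayDeque<>(); // plays the role of "ForwardQueue"

        Server(String name) {
            this.name = name;
        }

        // A client sends directly to this server.
        void send(String address, String body) {
            deliver(new Msg(address, body, false));
        }

        // Deliver locally, creating the queue on first use. Only messages that
        // did NOT arrive via replication are copied to the forward queue,
        // which prevents them from bouncing back to the originating server.
        void deliver(Msg m) {
            queues.computeIfAbsent(m.originalAddress, k -> new ArrayList<>()).add(m.body);
            if (!m.replicated && m.originalAddress.startsWith("QueueA")) {
                forwardQueue.add(m);
            }
        }

        // Bridge step: drain the forward queue to the remote server,
        // marking each copy as replicated.
        void bridgeTo(Server remote) {
            Msg m;
            while ((m = forwardQueue.poll()) != null) {
                remote.deliver(new Msg(m.originalAddress, m.body, true));
            }
        }
    }

    public static void main(String[] args) {
        Server a = new Server("ServerA");
        Server b = new Server("ServerB");

        a.send("QueueA123", "hello"); // QueueA123 did not exist anywhere before
        a.bridgeTo(b);                // copy reaches ServerB, queue is created there
        b.bridgeTo(a);                // nothing to send back: the copy was marked

        System.out.println(a.name + " " + a.queues);
        System.out.println(b.name + " " + b.queues);
    }
}
```

The sketch leaves out everything that makes the real problem hard (remote nodes being down, persistence, ordering, more than two servers), which is exactly the part a bridge would be expected to provide.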

That is really the crux of my original question.
3. Re: HornetQ Replication Functionality

timfox Dec 11, 2009 2:03 PM (in response to rabrego)

Full server replication is *extremely* difficult to implement in a scalable and performant way; believe me, we have explored this avenue very deeply.

Replication is a large and complex field.

Here's the excerpt from a passage in the user manual in trunk regarding this:

HornetQ does not replicate full server state between live and backup servers. When the new session is automatically recreated on the backup it won't have any knowledge of messages already sent or acknowledged in that session. Any in-flight sends or acknowledgements at the time of failover might also be lost.
By replicating full server state, theoretically we could provide a 100% transparent seamless failover, which would avoid any lost messages or acknowledgements. However, this comes at a great cost: replicating the full server state (including the queues, sessions, etc.). This would require replication of the entire server state machine; every operation on the live server would have to be replicated on the replica server(s) in the exact same global order to ensure a consistent replica state. This is extremely hard to do in a performant and scalable way, especially when one considers that multiple threads are changing the live server state concurrently.
Some messaging systems which provide full state machine replication use techniques such as virtual synchrony, but this does not scale well and effectively serializes all operations to a single thread, dramatically reducing concurrency.
Other techniques for multi-threaded active replication exist such as replicating lock states or replicating thread scheduling but this is very hard to achieve at a Java level.
Consequently it was decided that it was not worth massively reducing performance and concurrency for the sake of 100% transparent failover. Even without 100% transparent failover, it is simple to guarantee once and only once delivery, even in the case of failure, by using a combination of duplicate detection and retrying of transactions. However, this is not 100% transparent to the client code.
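
As a rough illustration of the "duplicate detection plus retry" idea in that last paragraph, here is a minimal plain-Java sketch. It is not HornetQ code: `Server`, `sendWithRetry`, and the `failNextAck` switch are hypothetical, and the real server's mechanism (the sender supplies a duplicate ID with the message) is only imitated here with a set of seen IDs. The point is that the client, not the server, has to carry the retry logic, which is why this is not transparent to client code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only; none of these types are HornetQ APIs.
class DuplicateDetectionSketch {

    static final class Server {
        private final Set<String> seenIds = new HashSet<>();
        final List<String> queue = new ArrayList<>();
        boolean failNextAck = false; // simulates losing the response after the message was stored

        // Stores the message unless its duplicate ID was already seen.
        // Throws to simulate a connection failure before the client gets the ack.
        void send(String duplicateId, String body) {
            if (seenIds.add(duplicateId)) {
                queue.add(body);
            }
            if (failNextAck) {
                failNextAck = false;
                throw new IllegalStateException("connection lost before ack");
            }
        }
    }

    // The client cannot know whether a failed send was stored, so it retries
    // with the SAME duplicate ID and lets the server discard the repeat.
    static void sendWithRetry(Server server, String duplicateId, String body, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                server.send(duplicateId, body);
                return;
            } catch (IllegalStateException e) {
                // Outcome unknown; fall through and retry.
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        Server server = new Server();
        server.failNextAck = true; // first attempt is stored, but the ack is lost
        sendWithRetry(server, "order-1", "payload", 3);
        System.out.println(server.queue.size()); // one copy despite two send attempts
    }
}
```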
4. Re: HornetQ Replication Functionality

rabrego Dec 11, 2009 3:14 PM (in response to rabrego)

Thanks Tim.

I did read that portion of the manual also. I agree that full server replication is no trivial matter.

I think Coward has hit on something though. If we discard the notion of HornetQ supported "Queue" replication, then we could possibly do something close to it, sort of a work-around.

Exploring the concept of a "Divert", I read this portion of the HornetQ manual:

"A divert will only divert a message to an address on the same server, however, if you want to divert to an address on a different server, a common pattern would be to divert to a local store-and-forward queue, then set up a bridge which consumes from that queue and forwards to an address on a different server."

So, using the concept of Diverts, one could conceivably replicate a queue amongst servers. But the documented example only shows a one-to-one mapping between source and destination queues. Could a Transformer be used to "change" the destinations? In other words, if the divert is placed on a dynamic queue (whose name is not known until the time of creation), could we, when the bridge is set up, somehow change the destination queue to an arbitrary queue name that the Transformer determines based on the original message destination?
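
What I have in mind for the Transformer step is roughly the following, written as plain Java over a map of headers rather than against the real HornetQ Transformer interface. The header names `originalAddress` and `destination` and the "QueueA" prefix are made up for this sketch, and whether a real Transformer on a bridge is allowed to re-route a message this way is exactly what I am unsure about.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; the header names below are hypothetical, not HornetQ constants.
class DestinationTransformerSketch {

    static final String ORIGINAL_ADDRESS = "originalAddress";
    static final String DESTINATION = "destination";

    // Returns a copy of the headers. If the message was originally sent to a
    // dynamically named queue (matching the prefix), the destination is
    // rewritten to that original name; otherwise it is left unchanged.
    static Map<String, String> transform(Map<String, String> headers, String dynamicPrefix) {
        Map<String, String> out = new HashMap<>(headers);
        String original = headers.get(ORIGINAL_ADDRESS);
        if (original != null && original.startsWith(dynamicPrefix)) {
            out.put(DESTINATION, original);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> diverted = new HashMap<>();
        diverted.put(ORIGINAL_ADDRESS, "QueueA42");
        diverted.put(DESTINATION, "ForwardQueue"); // where the divert parked the copy

        System.out.println(transform(diverted, "QueueA").get(DESTINATION));
    }
}
```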

Is something like that possible?