Austin Clustering meeting thoughts
timfox May 7, 2006 10:44 AMIn preparation to our meeting on Wednesday in Austin (I believe Adrian, Brian S, Ovidiu and myself are attending), here's a brief summary of the situation as I see it and my thoughts on it, to kick start the meeting a little and especially for the benefit of those we are not completely au fait with the current state of messaging clustering.
We currently have some clustering code in the current codebase which is currently disabled. It is based around the concept of a "distributed destination". (A destination is a queue or a topic).
The term "distributed" really relates to the fact that the destination is load-balanced across multiple nodes in a cluster, it does *not* provide high availability features which would have to be layered on top.
Messages can be sent to a queue/topic via any node in the cluster and consumed with a consumer attached to any node in the cluster.
The way distributed destinations work is described here: http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossMessagingCore so I won't re-iterate here.
Some possible issues with current approach:
1) Does not intrinsically handle high availability. Messages are only stored in memory on one node at any one time, so if that node goes down then the messages are lost.
This is not a fault of the design per se, since HA was always meant to be layered on separately with this solution, however dealing with HA separately makes the solution more complex since you have to a) implement distributed destinations, then b) implement some kind of replication for HA.
2) Joining/leaving cluster. If a node is lost from the cluster then other nodes do not take on responsibility for handling the persistent messages for the lost node. This is because messages are effectively pinned to a node. This means the persistent messages for a node will never be processed until that node is recovered (which may never occur).
An alternative solution would be to forget the current idea of a "distributed destination" and have a solution based on replication (actually the current solution would have to employ replication *anyway* (to provide HA)).
The idea here would be that every node in the cluster is pretty much an exact replica of every other node. I.e. each node has the same queues, subscriptions and transactional state as every other node. In other words we replicate all the relevant state machines between the nodes.
To ensure the state machines are replicated correctly we would have to ensure that each node receives messages in the same order, so a total ordering protocol would have to be used. I believe JGroups currently has 2 protocols which provide total ordering.
With this solution, joining/leaving and recovery becomes much simpler since messages aren't pinned to individual nodes.
Also HA is "built in" since the solution is based on replication- failing over from one node to another is simple since the second node is already a replica of the failed node.
Potential issues / things to investigate
1) More network traffic since every state machine change needs to be broadcast over the cluster. Not sure if this is an issue - hopefully Bela/Brian have some insights here.
1) Larger memory footprint for a node since all messages are stored on all nodes. However this is no more than for a single server "cluster".
Another avenue we could explore is how we actually do the replication.
We could either use JGroups directly to replicate state machines, or perhaps we could use JBossCache (?), not being an expert on JBoss Cache this is one of the things I am interested in discussing more.