3 Replies Latest reply on Mar 3, 2010 5:09 AM by galder.zamarreno

Multiple clusters geographically distributed over unreliable links

teroheinonen Feb 26, 2010 7:09 AM

Assuming the following setup:

- multiple sites with a cluster running in each

- need to syncronize certain (relatively infrequent) information between the clusters

- links between the clusters can be low-bandwidth and/or unreliable (substantial downtimes) and/or congested

The need:

- to replicate (selected) data from a cluster to other clusters (potentially bidirectionally) in best effort given the link availability/quality

- the overall idea being that the cluster's internal state would keep changing, regardless of when the changes are updated over to other clusters (and independent on whether each cluster is run in SYNC or ASYNC mode).

I was thinking if this could be most easily implemented as a CacheStore to handle that (being in the role CacheStore in the originating side, and CacheLoader in the receiving side). The class would queue the updates, purge obsolete updates from the queue. How would that CacheStore need to be configured? What of the required functionality is already implemented?

What would be the easiest way to achive this? Anybody already have done similar implementation?

Initially the distribution logic (and particularly the conflict detectiona and handling) could be based on a simple ownership model, but a timestamp based model would be beneficial as well.

Please provide some starting point of how to approach this.

1. Re: Multiple clusters geographically distributed over unreliable links

manik Feb 26, 2010 7:35 AM (in response to teroheinonen)

Hi Tero

It is on our roadmap to provide a full solution to this (see ISPN-262). JBCACHE-816 has some information too, and related to JBCACHE-816, this wiki page has some discussion on the subject.

http://community.jboss.org/wiki/ReplicationAcrossDataCentres

Now most of this is specific to JBoss Cache, and we'd need to think about how this would apply to Infinispan, particularly with distribution.

Sorry about the lack of a clear and definite answer at his stage, but let's keep this discussion going, as it will help us get to a concrete solution.

Cheers
Manik
Actions
2. Re: Multiple clusters geographically distributed over unreliable links

teroheinonen Mar 2, 2010 8:56 AM (in response to manik)

Excellent, added feedback/opinion on the reqs to Jira (copied below):

Overall requirements:

SENDER
- listen all changes (also those not currently partitioned in the sender node) in the cluster
- queue the changes
- remove redundant older changes from the queue (either at the time of adding to the queue or at sending)
- compress/encrypt transmission as needed
- configurable packet size for transmission (not referring to IP packet size/fragmentation, but application level to manage with really bad transmission paths)
- transaction information would not be needed to be transmitted (except that no uncompleted transaction would ever be sent, even if the SENDER node is the source of the transaction)
- after some thinking, I'd prefer to keep this separated from StoreLoader/StoreSaver, as the use patterns are quite different, would be better as a new class Distributor etc.

RECEIVER
- receive and acknowledge transmissions from the SENDER
- queue changes
- put changes into the local cluster in accordance with conflict resolution policy (see below)

Possible conflict resolution policies:
1. ownership based (one-way) distribution, SENDER is always right, all transmitted changes are implemented
2. time-stamp based (one-way/two-way) distribution, the information with later timestamp is right (requires of course quite good time sync between the clusters, and assuming ms resolution, would not guarantee any absolute order), can be configured either one-way, or two-ways
3. user-definable ConflictSolver class/method
Actions
3. Re: Multiple clusters geographically distributed over unreliable links

galder.zamarreno Mar 3, 2010 5:09 AM (in response to teroheinonen)

Tero, first of all thanks for getting involved moving this forward.

Firstly, I think this would be best implemented as a custom interceptor rather than a cache loader/store (see http://community.jboss.org/docs/DOC-14852 for info on how to write custom interceptors). The reasons for this include the fact that commands received by interceptors are already in a format that could be sent to other nodes, of course after doing some coalescing (this is the process of removing older changes on a key and simply send the its latest modification; we use a similar process for sending asynchronous changes to cache stores)

Such custom interceptor would of course allow you to listen on all modification. Now, compression/encryption and packet size could be just handled by JGroups. We would expect to use JGroups to link up what you call sender and receiver since it enables very flexible group communication. Besides, JGroups enables to not only UDP but also TCP transports to be set up and the latter would be better suited for inter cluster links.

The complexity here is particularly how to handle distributed environments. Since a single node cannot know what's going on in the entire grid, you'd need all instances in the grid to be pushing changes that originate in each of them. I suppose this is the scenario you're talking about when saying: "... (also those not currently partitioned in the sender node)..."

We also agree on the need to only send changes that have been committed. In the interceptors we can know when a transaction is going on and we can wait for the commit to arrive before queuing those changes.

Wrt conflict resolution policies, those sound fine. Number 3 would enable users to merge the data based on their rules.

Finally, we must avoiding ping-pong effect, which is the fact that data bounces back from one cluster to the other continuously. Haven't thought much of how this could be done in Infinispan but a flag could be used to mark a push from another cluster so that this is not send back out to the other cluster again.

Message was edited by: Galder Zamarreno: "I clicked send too quickly earlier!!"
Actions

Go to original post