Now most of this is specific to JBoss Cache, and we'd need to think about how this would apply to Infinispan, particularly with distribution.
Sorry about the lack of a clear and definite answer at his stage, but let's keep this discussion going, as it will help us get to a concrete solution.
Excellent, added feedback/opinion on the reqs to Jira (copied below):
- listen all changes (also those not currently partitioned in the sender node) in the cluster
- queue the changes
- remove redundant older changes from the queue (either at the time of adding to the queue or at sending)
- compress/encrypt transmission as needed
- configurable packet size for transmission (not referring to IP packet size/fragmentation, but application level to manage with really bad transmission paths)
- transaction information would not be needed to be transmitted (except that no uncompleted transaction would ever be sent, even if the SENDER node is the source of the transaction)
- after some thinking, I'd prefer to keep this separated from StoreLoader/StoreSaver, as the use patterns are quite different, would be better as a new class Distributor etc.
- receive and acknowledge transmissions from the SENDER
- queue changes
- put changes into the local cluster in accordance with conflict resolution policy (see below)
Possible conflict resolution policies:
1. ownership based (one-way) distribution, SENDER is always right, all transmitted changes are implemented
2. time-stamp based (one-way/two-way) distribution, the information with later timestamp is right (requires of course quite good time sync between the clusters, and assuming ms resolution, would not guarantee any absolute order), can be configured either one-way, or two-ways
3. user-definable ConflictSolver class/method
Tero, first of all thanks for getting involved moving this forward.
Firstly, I think this would be best implemented as a custom interceptor rather than a cache loader/store (see http://community.jboss.org/docs/DOC-14852 for info on how to write custom interceptors). The reasons for this include the fact that commands received by interceptors are already in a format that could be sent to other nodes, of course after doing some coalescing (this is the process of removing older changes on a key and simply send the its latest modification; we use a similar process for sending asynchronous changes to cache stores)
Such custom interceptor would of course allow you to listen on all modification. Now, compression/encryption and packet size could be just handled by JGroups. We would expect to use JGroups to link up what you call sender and receiver since it enables very flexible group communication. Besides, JGroups enables to not only UDP but also TCP transports to be set up and the latter would be better suited for inter cluster links.
The complexity here is particularly how to handle distributed environments. Since a single node cannot know what's going on in the entire grid, you'd need all instances in the grid to be pushing changes that originate in each of them. I suppose this is the scenario you're talking about when saying: "... (also those not currently partitioned in the sender node)..."
We also agree on the need to only send changes that have been committed. In the interceptors we can know when a transaction is going on and we can wait for the commit to arrive before queuing those changes.
Wrt conflict resolution policies, those sound fine. Number 3 would enable users to merge the data based on their rules.
Finally, we must avoiding ping-pong effect, which is the fact that data bounces back from one cluster to the other continuously. Haven't thought much of how this could be done in Infinispan but a flag could be used to mark a push from another cluster so that this is not send back out to the other cluster again.
Message was edited by: Galder Zamarreno: "I clicked send too quickly earlier!!"