-
1. Re: Behaviour of the cache in case of a node failure
manik Jan 25, 2007 11:51 AM (in response to lucdew)If a node fails during the 2-phase commit protocol, an exception is thrown internally by the comms layer. This is trapped and the transaction is marked for rollback, and the TM sees this and initiates a rollback.
If the node does before the 2-pc protocol commences, then the tx does not fail; it just excludes the dead node from the cluster.
There is no difference between a shutdown and a node crashing as far as the 2-pc protocol is concerned. If the node dies/is shut down in the middle of the 2-pc protocol, it will cause a rollback on the initiator of the commit. Else, it will be excluded from the cluster before the 2-pc commences.
We detect dead nodes using JGroups' failure detection protocols - FD and FD_SOCK. See JGroups docs for details on this. -
2. Re: Behaviour of the cache in case of a node failure
lucdew Jan 25, 2007 6:15 PM (in response to lucdew)Thanks a lot for the explanations.
Reading the Junit test cases also helped.
I also have a question regarding (re)synchronization.
For a cluster of 2 cache members A & B, imagine that:
- A network failure occurs and members A & B can't "see" each other.
- a transaction starts and updates the node "/node1" on member A and commits
- a transaction starts and updates the exact same node on member B and commits.
Then network is back again and nodes can now see each other. What would happen regarding data consistency ? I guess node versioning or timestamping is used to determine which node that should prevail when synchronizing members but if the same node is updated on both members ? Is it "the last wins" rule ? -
3. Re: Behaviour of the cache in case of a node failure
manik Jan 26, 2007 3:54 AM (in response to lucdew)This is actually will have an indeterminate outcome.
It will depend on a number of factors, including how the network heals. JGroups again has a lot of documentation on merging. The problem with the 'split brain' scenario you describe is that there is no way of knowing which version of the data is the more up-to-date one. Timestamps are of relatively little value (unless you have sync clocks, but even these could have gone out of sync during the disconnect).
The only thing you can be sure of after such an event is that a subsequent write to the node (on either member) will bring them both in sync again.