Version 2

    When a network partition occurs and later heals, Infinispan can be left with inconsistent data.  The purpose of ISPN-263 is to provide a mechanism that sacrifices availability (the "A" in Brewer's CAP theorem) in exchange for consistency during network partitions.

    The implementation is based on the following design:

    • A listener is registered with the transport
    • Whenever a ViewChange is detected, each node tries to decide whether this is a normal node failure/leave or an abnormal event (a network partition).
      • This can be "guessed" by looking at the number of nodes that have left.
      • Or, this could be configured.  E.g., if the partition contains fewer than N nodes, the node should consider itself to be in the smaller partition.
    • If a node concludes that a network partition has happened AND that it is in the smaller partition (new_partition.size < old_partition.size / 2) (or a configured number is hit) then:
      • Go into READ-ONLY mode by flipping a switch in an interceptor to throw an exception whenever an update operation is encountered
      • Fail all ongoing transactions and clean up the transaction table
    • When a MergeView is detected, reset the interceptor to allow updates.
      • TODO - how do nodes handle MergeViews from a State Transfer perspective?
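
The design above can be sketched in plain Java. This is only an illustration under stated assumptions, not Infinispan code: `PartitionGuard`, its method names, and the `minNodes` threshold are all hypothetical, and the view sizes are passed in as plain integers instead of coming from a real transport listener. The sketch also treats "the new view has fewer than half of the old members" as the signal for both "a partition happened" and "this node is in the smaller partition".

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper illustrating the design; not part of Infinispan's API. */
final class PartitionGuard {
    // Assumed configuration: a view with fewer than minNodes members is
    // treated as the smaller partition. A value of 0 disables this check.
    private final int minNodes;

    // The "switch" an interceptor would consult before each update.
    private final AtomicBoolean readOnly = new AtomicBoolean(false);

    PartitionGuard(int minNodes) {
        this.minNodes = minNodes;
    }

    /** Heuristic: the new view holds fewer than half of the old members. */
    static boolean inSmallerPartition(int oldSize, int newSize) {
        // Written as 2 * newSize < oldSize to avoid integer division rounding.
        return 2 * newSize < oldSize;
    }

    /** Called when the membership changes from oldSize to newSize nodes. */
    void onViewChange(int oldSize, int newSize) {
        boolean belowConfigured = minNodes > 0 && newSize < minNodes;
        if (inSmallerPartition(oldSize, newSize) || belowConfigured) {
            readOnly.set(true);
            // Failing ongoing transactions and cleaning up the
            // transaction table would happen here.
        }
    }

    /** Called when a merge is detected: allow updates again. */
    void onMerge() {
        readOnly.set(false);
    }

    boolean isReadOnly() {
        return readOnly.get();
    }

    /** What the interceptor would do for every update operation. */
    void checkWrite() {
        if (readOnly.get()) {
            throw new IllegalStateException("Cache is read-only: suspected network partition");
        }
    }
}
```

With the strict inequality, an even split (for example 6 nodes becoming 3 and 3) leaves both sides writable, so that case would need either the configured threshold or a separate tie-breaking rule.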