4 Replies Latest reply on Dec 8, 2016 10:18 AM by vladsz83

    Partitioning type

    vladsz83

      Hi All,

       

      Can anyone explain to me how Infinispan detects the 'split brain' case? The documentation says that split brain occurs when one or more nodes leave the cluster without sending the appropriate messages. But that means there is no difference between a crash of the process/machine and a network problem. I could pull the cable out, or forcibly stop the node/process to imitate a process/system failure. Neither case sends any message to the cluster. But only the first situation should be treated as split brain; the second actually requires immediate rebalancing.

      Does Infinispan distinguish between such cases?

       

      Thanks

        • 1. Re: Partitioning type
          rvansa

          Split-brain handling is an 'extra feature' on top of regular cache resilience. Therefore any situation that can be interpreted as a node crash, that is, any time the cluster does not lose all copies of some data, is treated as a node crash. The exception is when the cluster loses at least half of its members (even if it is lucky enough to still hold at least one copy of each segment): then it is possible that this cluster is actually the partition that "crashed".

          When a node that was considered crashed rejoins, it wipes all of its (possibly stale) data and gets a fresh copy.
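
To make the rule above concrete, here is a minimal sketch in plain Java. It is not Infinispan's implementation and uses no Infinispan API: the class `SplitBrainRule`, the enum `Outcome` and the method `onMembersLost` are hypothetical names, and the rule is only what this reply describes (lost all copies of some segment, or lost at least half of the members, means the remaining nodes cannot assume a simple crash).

```java
// Illustrative model only; names are hypothetical, not Infinispan classes.
final class SplitBrainRule {

    enum Outcome {
        REBALANCE_AS_CRASH, // missing nodes are assumed crashed; stay available and rebalance
        ENTER_DEGRADED      // this side may itself be the "crashed" partition
    }

    /**
     * @param previousMembers      cluster size before nodes disappeared
     * @param remainingMembers     nodes still visible to this side
     * @param everySegmentHasOwner true if at least one copy of each segment survived here
     */
    static Outcome onMembersLost(int previousMembers,
                                 int remainingMembers,
                                 boolean everySegmentHasOwner) {
        int lost = previousMembers - remainingMembers;
        // "At least half" of the members gone, as described in the reply.
        boolean lostAtLeastHalf = 2 * lost >= previousMembers;

        if (!everySegmentHasOwner || lostAtLeastHalf) {
            return Outcome.ENTER_DEGRADED;
        }
        return Outcome.REBALANCE_AS_CRASH;
    }
}
```

Under this sketch, losing one node out of four with all segments still owned is handled as a crash, while a 2/2 split of four nodes puts each side into degraded mode even if each side happens to hold every segment.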

          • 2. Re: Partitioning type
            vladsz83

            Thanks, Radim!

            But if a partition nevertheless happens and the cluster enters 'degraded' mode, do the nodes only merge their state once it heals? In that case no cache copies/segments can be wiped. Am I right?

            • 3. Re: Partitioning type
              rvansa

              Yes. Upon split brain, each *partition* either becomes degraded or stays available, and there can be at most one available partition. In a degraded partition, no modifications can happen (1) and the partition won't rebalance. When a degraded partition joins an available partition, the degraded one wipes all its data. If a degraded partition joins another degraded partition, they merge their state and check whether they contain enough nodes & data to become available.

              (1) unless this partition contains all copies of the given segment
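
The merge rules and footnote (1) can be sketched the same way. Again this is a plain-Java model under stated assumptions, not Infinispan code: `PartitionMergeSketch`, `Mode`, `MergeAction`, `onMerge` and `writeAllowed` are hypothetical names, node names are plain strings, and the "two available partitions" case is rejected only because the reply says it cannot occur.

```java
import java.util.Set;

// Illustrative model only; names are hypothetical, not Infinispan classes.
final class PartitionMergeSketch {

    enum Mode { AVAILABLE, DEGRADED }

    enum MergeAction {
        DEGRADED_SIDE_WIPES_DATA, // degraded partition joins the available one
        MERGE_STATE_AND_RECHECK   // two degraded partitions merge, then re-check availability
    }

    /** What happens when two partitions see each other again. */
    static MergeAction onMerge(Mode left, Mode right) {
        if (left == Mode.AVAILABLE && right == Mode.AVAILABLE) {
            // Assumption taken from the reply: at most one partition is available.
            throw new IllegalStateException("at most one partition may be available");
        }
        if (left == Mode.DEGRADED && right == Mode.DEGRADED) {
            return MergeAction.MERGE_STATE_AND_RECHECK;
        }
        return MergeAction.DEGRADED_SIDE_WIPES_DATA;
    }

    /**
     * Footnote (1): a degraded partition still accepts a modification
     * if it contains every owner of the affected segment.
     */
    static boolean writeAllowed(Mode mode,
                                Set<String> partitionMembers,
                                Set<String> segmentOwners) {
        return mode == Mode.AVAILABLE || partitionMembers.containsAll(segmentOwners);
    }
}
```

For example, a degraded partition {A, B} could still modify a segment owned by A and B, but not one owned by A and C.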

              • 4. Re: Partitioning type
                vladsz83

                Understood. Thank you.