4 Replies Latest reply on Dec 18, 2006 1:50 PM by brian.stansberry

    Buddy Replication and data consistency

    mica


      Hi,

      Some issues are still not clear to me after reading the docs:

      If I execute a put(fqn, key, value, option) command with the auto-gravitation option concurrently on two distinct cache instances (say put(key, val1) on cache1 and put(key, val2) on cache2):

      1) With REPL_SYNC: will the invocations be ordered, so that I end up with a consistent cache state (either val1 from cache1 or val2 from cache2),
      or can I get a CacheException (or something more specific?) from one or both puts?

      2) With REPL_ASYNC: is it true that I cannot get any exception and the cache will eventually reach a consistent state, or can something else happen (an exception, an inconsistent state)?

      3) Is there any difference in behaviour when using an INVALIDATION mode rather than a REPL mode in the case of buddy replication?


      --
      thanks in advance,
      mj

        • 1. Re: Buddy Replication and data consistency
          brian.stansberry

          Buddy replication should not be used in a situation where you're expecting multiple servers to be concurrently modifying the same node. It's meant for use cases where one server owns the data.

          Buddy replication combined with INVALIDATION doesn't make sense. Invalidation means, "I have the latest data; you may be out of date, so throw away your data." Sending such a message to a limited subset of the cluster doesn't make sense.

          • 2. Re: Buddy Replication and data consistency
            mica

            Is it for efficiency or correctness reasons?
            I can imagine a put with the auto-gravitation option on as an atomic operation consisting of, in order:
            get() resulting in gravitation of the data
            put() performed 'locally'
            Then, according to my understanding, the get() has to remove the node from the other servers when dataGravitationRemoveOnFind is set to true. That's where the question about the difference between INVALIDATION and REPL came from.

            But what if I invoke get() concurrently on 2 or more servers? Do I have any guarantee that in the end I will have only one main copy of the node?

            --
            cheers,
            mj

            • 3. Re: Buddy Replication and data consistency
              brian.stansberry

              If you do a put with a local option, it won't replicate to anyone, so the server that did the put will be out of sync with its buddies.

              As to multiple nodes simultaneously doing a put on the same node, here's what happens. I'm assuming the node already exists.

              Assume no tx is running. The data in question is stored on server0 and its buddy group.

              1) You do a put() on server1. Simultaneously, a put() is done on server2.
              2) DataGravitatorInterceptor.1 and DataGravitatorInterceptor.2 both see that the node doesn't exist locally, so each fetches the node's data from across the cluster.
              3) DataGravitatorInterceptor.1 and .2 each take the data and do a put (not local). This replicates the data to their buddies. No tx, so no lock is held on the node. At this point there are three copies of the data: the server0 group's, the server1 group's and the server2 group's.
              4) DataGravitatorInterceptor.1 and .2 send a cleanup call to the cluster. Any copy of the data not associated with the sending server's buddy group is removed.
              5) The original puts go through.

              The end result here will very much depend on how things get interleaved. With REPL_SYNC you could end up with a TimeoutException in step 4 as server1 and server2 tell each other to remove the data and deadlock. Or server1 completes steps 3-5 and then server2 executes steps 3-5, in which case server2's change wins. Or both complete step 3, then server1 completes step 4 (so the server0 and server2 copies are gone), then server2 completes step 4 (so the server1 copy is gone). Then they both complete step 5, resulting in 2 sets of data, each of which has only the key/value pair included in its own put.
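
              The two non-deadlock interleavings above can be replayed with a small toy model. This is only a sketch: it uses plain Java maps, no JBoss Cache classes, and no real locking or networking. The class name, method names and the "existing"/"data" entry are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of steps 2-5 without a tx. Not JBoss Cache code; all names are hypothetical. */
final class GravitationInterleaving {
    // One copy of the node per owning buddy group, keyed by the owner's server number.
    private final Map<Integer, Map<String, String>> copies = new TreeMap<>();

    GravitationInterleaving() {
        Map<String, String> original = new HashMap<>();
        original.put("existing", "data");
        copies.put(0, original); // the node starts out owned by server0's group
    }

    // Step 2: fetch the node's data from whichever group currently holds it.
    Map<String, String> fetch() {
        return copies.isEmpty()
                ? new HashMap<>()
                : new HashMap<>(copies.values().iterator().next());
    }

    // Step 3: store the fetched data under the gravitating server's own group.
    void storeFetched(int server, Map<String, String> fetched) {
        copies.put(server, new HashMap<>(fetched));
    }

    // Step 4: remove every copy that does not belong to the sender's group.
    void cleanup(int server) {
        copies.keySet().removeIf(owner -> owner != server);
    }

    // Step 5: the caller's original put; it recreates the node if cleanup removed it.
    void put(int server, String key, String value) {
        copies.computeIfAbsent(server, s -> new HashMap<>()).put(key, value);
    }

    /** server1 runs steps 3-5, then server2 runs steps 3-5. */
    static Map<Integer, Map<String, String>> sequential() {
        GravitationInterleaving c = new GravitationInterleaving();
        Map<String, String> f1 = c.fetch();
        Map<String, String> f2 = c.fetch();
        c.storeFetched(1, f1); c.cleanup(1); c.put(1, "key", "val1");
        c.storeFetched(2, f2); c.cleanup(2); c.put(2, "key", "val2");
        return c.copies;
    }

    /** Both finish step 3, then both run step 4, then both run step 5. */
    static Map<Integer, Map<String, String>> interleaved() {
        GravitationInterleaving c = new GravitationInterleaving();
        Map<String, String> f1 = c.fetch();
        Map<String, String> f2 = c.fetch();
        c.storeFetched(1, f1);
        c.storeFetched(2, f2);
        c.cleanup(1); // the server0 and server2 copies are gone
        c.cleanup(2); // the server1 copy is gone; nothing is left
        c.put(1, "key", "val1");
        c.put(2, "key", "val2");
        return c.copies;
    }

    public static void main(String[] args) {
        System.out.println("sequential:  " + sequential());
        System.out.println("interleaved: " + interleaved());
    }
}
```

              In this model the sequential run leaves a single copy owned by server2 that holds the original data plus val2, while the interleaved run leaves two copies, each holding only its own key/value pair.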

              Now, if there is a tx in place:

              The put() in step 3 is done in a tx, so a write lock will be held on the node on each server until the tx commits. The put will not replicate until the tx commits.

              The removes in step 4 will also not be broadcast until the tx commits.

              The put in step 5 will not be replicated until the tx commits.

              The fact that the WL from step 3 is held should make steps 3-5 atomic. If it's REPL_SYNC, you have two servers trying to write to the same node, so it's possible that when the tx tries to commit you'll get a TimeoutException due to a lock conflict. With REPL_ASYNC, the later tx will win; the step 5 put from the earlier tx will be lost.

              But... while writing this I'm pretty sure I've spotted a bug in the tx case. The step 4 cleanup call gets bundled together with the other tx changes and therefore only gets replicated to the server's buddies, not to the whole cluster.
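
              To illustrate what that suspected bug would mean, here is a toy model of which servers receive the cleanup call. It is only a sketch under assumptions that are not from the thread: a 4-server cluster where each server's single buddy is the next server in a ring, and hypothetical class and method names. It does not use JBoss Cache classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the cleanup scope at tx commit. Not JBoss Cache code; all names are hypothetical. */
final class TxCleanupScope {
    static final int SERVERS = 4;

    // Assumed topology: each server has one buddy, the next server in a ring.
    static int buddyOf(int server) {
        return (server + 1) % SERVERS;
    }

    /**
     * Returns the servers that still hold a copy from a group other than
     * newOwner's after the gravitating tx on newOwner has committed.
     */
    static Set<Integer> staleHolders(int oldOwner, int newOwner, boolean cleanupClusterWide) {
        // holders.get(s) = owner groups whose copy of the node is stored on server s
        List<Set<Integer>> holders = new ArrayList<>();
        for (int s = 0; s < SERVERS; s++) {
            holders.add(new HashSet<>());
        }
        holders.get(oldOwner).add(oldOwner);
        holders.get(buddyOf(oldOwner)).add(oldOwner);

        // Commit: the gravitated data lands on the new owner and its buddy.
        holders.get(newOwner).add(newOwner);
        holders.get(buddyOf(newOwner)).add(newOwner);

        // The cleanup call reaches either the whole cluster or only the tx's buddy group.
        Set<Integer> recipients = new TreeSet<>();
        if (cleanupClusterWide) {
            for (int s = 0; s < SERVERS; s++) {
                recipients.add(s);
            }
        } else {
            recipients.add(newOwner);
            recipients.add(buddyOf(newOwner));
        }
        for (int r : recipients) {
            holders.get(r).removeIf(owner -> owner != newOwner);
        }

        Set<Integer> stale = new TreeSet<>();
        for (int s = 0; s < SERVERS; s++) {
            for (int owner : holders.get(s)) {
                if (owner != newOwner) {
                    stale.add(s);
                }
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        System.out.println("cleanup to buddies only: stale on " + staleHolders(0, 2, false));
        System.out.println("cleanup cluster-wide:    stale on " + staleHolders(0, 2, true));
    }
}
```

              In this model, with the data originally on server0 and gravitated to server2, a buddies-only cleanup leaves the old copy on server0 and its buddy server1, whereas a cluster-wide cleanup leaves no stale copy.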



              • 4. Re: Buddy Replication and data consistency
                brian.stansberry