1 Reply Latest reply on Jan 25, 2010 8:14 PM by mircea.markus

    Data redundancy lost... not in subspace..

      I ran a small test with 3 Infinispan nodes participating in the same cluster, each putting a key-value pair into the cluster. The concurrency level is set to 3 and the cluster mode is DIST_SYNC. All the entries show up on all the nodes, as expected.

       

      When I bring a 4th node into the cluster and generate a few more entries, they are also distributed so that each entry exists on 3 nodes, as expected. There are now N+1 nodes available across which to distribute the data.

       

      However, when I try to "provoke" the setup by shutting down nodes (allowing adequate time for any replication to happen in the background), the redundancy level for the data doesn't seem to be enforced.

       

      I would expect 3 copies of the data to exist in the cluster at all times (given time to replicate between nodes). For example, when shutting down a node, I would expect the data it contained to be replicated to one of the three remaining nodes so that the cluster always holds 3 copies.

       

      On one of the nodes I saw the following message:

       

      Jan 24, 2010 11:46:13 PM org.infinispan.distribution.DistributionManagerImpl rehash
      INFO: Starting transaction logging!
      Jan 24, 2010 11:46:13 PM org.infinispan.distribution.DistributionManagerImpl rehash
      INFO: Not in same subspace, so ignoring leave event

       

       

      Am I expecting too much from Infinispan in this regard or is there some configuration that I should have a look at?

        • 1. Re: Data redundancy lost... not in subspace..
          mircea.markus

          I ran a small test with 3 Infinispan nodes participating in the same cluster, each putting a key-value pair into the cluster. The concurrency level is set to 3 and the cluster mode is DIST_SYNC. All the entries show up on all the nodes, as expected.

          The concurrency level has nothing to do with data redundancy; it is a hint for the number of threads expected to access the cache concurrently. It is the numOwners attribute on the hash element that defines the number of nodes on which each entry is held:

          <hash numOwners="3" rehashWait="120000" rehashRpcTimeout="600000"/>
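
          For context, here is a minimal sketch of where the two settings would sit in a configuration file, assuming the Infinispan 4.0 XML schema. The cache name "redundantCache" and the concurrencyLevel value are placeholders, not taken from the test above, so check the element names against the schema for your version.

```xml
<namedCache name="redundantCache">
   <!-- DIST_SYNC: distribution mode with synchronous communication -->
   <clustering mode="distribution">
      <sync/>
      <!-- numOwners controls how many nodes hold a copy of each entry -->
      <hash numOwners="3" rehashWait="120000" rehashRpcTimeout="600000"/>
   </clustering>
   <!-- concurrencyLevel sizes the lock containers for the expected number
        of concurrent threads; it does not affect the number of copies -->
   <locking concurrencyLevel="500"/>
</namedCache>
```

          With numOwners="3" and four nodes in the cluster, each entry should live on three of the four nodes, which matches the behaviour seen after the 4th node joined.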