Version 6

    Status -- Design Phase

    Problem


    Some users report significantly uneven data distribution in Infinispan even though their keys hash well. Nodes that end up owning a disproportionate share of the data use more memory and more CPU (due to excessive GC on those nodes).

    Challenges

    • Rehashing/Rebalancing on join/leave
    • Distribution/replication is used for backup of data, so the copies of an entry must be placed on different physical nodes, not just on different virtual nodes
    • Communication between virtual nodes on the same physical node doesn't need to visit the network layer at all
    • Network storm when physical nodes join
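
    The backup-placement challenge above can be illustrated with a minimal sketch of a hash wheel that holds several virtual nodes per physical node. The class name VirtualNodeRing, its methods, and the bit mixer are assumptions made for illustration only; this is not Infinispan code and not the proposed design. When locating owners, the walk around the wheel skips any virtual node whose physical node has already been chosen, so the copies always land on distinct physical nodes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of a hash wheel with virtual nodes; not an Infinispan class. */
class VirtualNodeRing {
    private final int virtualNodesPerPhysical;
    // Wheel position -> name of the physical node that owns that virtual node.
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final Set<String> physicalNodes = new LinkedHashSet<>();

    VirtualNodeRing(int virtualNodesPerPhysical) {
        if (virtualNodesPerPhysical < 1) {
            throw new IllegalArgumentException("need at least one virtual node");
        }
        this.virtualNodesPerPhysical = virtualNodesPerPhysical;
    }

    // Stand-in bit mixer (the MurmurHash3 32-bit finalizer), assumed here
    // only to spread String.hashCode() values over the wheel.
    static int spread(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    void addPhysicalNode(String name) {
        if (!physicalNodes.add(name)) {
            return; // already present
        }
        for (int i = 0; i < virtualNodesPerPhysical; i++) {
            // putIfAbsent: a colliding position keeps its existing owner.
            ring.putIfAbsent(spread((name + "#" + i).hashCode()), name);
        }
    }

    void removePhysicalNode(String name) {
        if (physicalNodes.remove(name)) {
            ring.values().removeIf(name::equals);
        }
    }

    /** Returns up to numOwners distinct physical nodes for the key, primary first. */
    List<String> locate(Object key, int numOwners) {
        List<String> owners = new ArrayList<>();
        int wanted = Math.min(numOwners, physicalNodes.size());
        int start = spread(key.hashCode());
        // Walk clockwise from the key's position, then wrap to the start of the wheel.
        collect(ring.tailMap(start, true).values(), owners, wanted);
        collect(ring.headMap(start, false).values(), owners, wanted);
        return owners;
    }

    private static void collect(Iterable<String> nodes, List<String> owners, int wanted) {
        for (String node : nodes) {
            if (owners.size() >= wanted) {
                return;
            }
            if (!owners.contains(node)) { // skip virtual nodes of an already chosen physical node
                owners.add(node);
            }
        }
    }
}
```

    With three physical nodes registered, locate("some-key", 2) returns two different node names regardless of how their virtual nodes interleave on the wheel. A real implementation would also have to account for the topology information handled by TopologyAwareConsistentHash, which this sketch ignores.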

    Design

    TODO

     

    References

     

    Notes

    • Classes to look at: RehashTask, JoinTask, InvertedLeaveTask, DefaultConsistentHash, TopologyAwareConsistentHash, DistributionManagerImpl, DistributionInterceptor
    • Will need benchmarking to establish recommended parameters
    • Consider staggering the start of virtual nodes to prevent a network storm
    • Aim to confine the work to the consistent hashing and rehashing code, avoiding the need to build this deep into the architecture
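
    As a starting point for the benchmarking mentioned above, the following sketch measures how evenly primary ownership is spread for a given number of virtual nodes per physical node. Everything in it is an assumption made for illustration: the class name DistributionSkew, the synthetic node and key names, the bit mixer, and the max/mean ratio as the skew metric. It is not an Infinispan benchmark and says nothing about rehashing cost or network traffic.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical measurement helper; not part of Infinispan. */
class DistributionSkew {

    // Stand-in bit mixer (the MurmurHash3 32-bit finalizer).
    static int spread(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Counts how many of the sample keys each physical node owns as primary. */
    static int[] primaryCounts(int physicalNodes, int virtualNodes, int keys) {
        // Wheel position -> index of the owning physical node.
        TreeMap<Integer, Integer> ring = new TreeMap<>();
        for (int p = 0; p < physicalNodes; p++) {
            for (int v = 0; v < virtualNodes; v++) {
                ring.putIfAbsent(spread(("node" + p + "#" + v).hashCode()), p);
            }
        }
        int[] counts = new int[physicalNodes];
        for (int k = 0; k < keys; k++) {
            int h = spread(("key-" + k).hashCode());
            Map.Entry<Integer, Integer> owner = ring.ceilingEntry(h);
            if (owner == null) {
                owner = ring.firstEntry(); // wrap around the wheel
            }
            counts[owner.getValue()]++;
        }
        return counts;
    }

    /** Ratio of the busiest node's load to the mean load; 1.0 means perfectly even. */
    static double maxToMean(int[] counts) {
        long total = 0;
        int max = 0;
        for (int c : counts) {
            total += c;
            max = Math.max(max, c);
        }
        return max / ((double) total / counts.length);
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 4, 16, 64, 256}) {
            int[] counts = primaryCounts(10, v, 100_000);
            System.out.printf("virtualNodes=%d max/mean=%.3f%n", v, maxToMean(counts));
        }
    }
}
```

    Running main prints one ratio per virtual-node count. A recommended parameter would still need to be validated against real key sets and weighed against the extra rehashing work that more virtual nodes imply.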