
    The motivation behind this feature is to ensure that, when using distribution, backups do not reside on the same physical server, rack or data centre as the data they back up. This is tracked by ISPN-180.

     

    To this end, the following additional hints may be configured on a transport:

     

    <transport
         clusterName = "MyCluster"
         machineId = "LinuxServer01"
         rackId = "Rack01"
         siteId = "US-WestCoast" />
    

     

    • machineId - this is probably the most useful, to disambiguate between multiple JVM instances on the same node, or even multiple virtual hosts on the same physical host.
    • rackId - in larger clusters with nodes occupying more than a single rack, this setting would help prevent backups being stored on the same rack.
    • siteId - to differentiate between nodes in different data centres replicating to each other. 

     

    All of the above are optional. If they are not provided, the distribution algorithms give no guarantee that backups will not be stored in instances on the same host, rack or site.

    Implementation

    When a node joins and broadcasts a JoinRequest, it includes a NodeLocationMeta instance in addition to its address.

     

    class NodeLocationMeta {
         // Optional location hints; null means the hint was not configured.
         String machineId = null, rackId = null, siteId = null;
    }
    

     

    Each node that receives this metadata caches it in its ConsistentHash implementation.  ConsistentHash implementations make use of this information when placing addresses on the hash wheel.
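
    As a rough illustration of the caching step, the sketch below keeps one NodeLocationMeta per address and answers a simple "same machine?" question. LocationCacheSketch, onJoin, metaFor and sameMachine are hypothetical names invented for this example, addresses are modelled as plain Objects, and none of this is the actual Infinispan API. How the hints ultimately influence placement is left open in this document.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Mirrors the class shown above: all hints are optional and default to null.
class NodeLocationMeta {
    String machineId = null, rackId = null, siteId = null;
}

// Hypothetical cache a ConsistentHash implementation might hold (assumption, not real API).
final class LocationCacheSketch {
    private final Map<Object, NodeLocationMeta> metaByAddress = new ConcurrentHashMap<>();

    // Called when a join request carrying location metadata is received.
    // A missing metadata object is stored as "no hints configured".
    void onJoin(Object address, NodeLocationMeta meta) {
        metaByAddress.put(address, meta != null ? meta : new NodeLocationMeta());
    }

    // Returns the cached metadata, or null if the address has never joined.
    NodeLocationMeta metaFor(Object address) {
        return metaByAddress.get(address);
    }

    // True only if both nodes are known, both supplied a machineId, and the values match.
    boolean sameMachine(Object a, Object b) {
        NodeLocationMeta ma = metaByAddress.get(a);
        NodeLocationMeta mb = metaByAddress.get(b);
        return ma != null && mb != null
                && ma.machineId != null
                && ma.machineId.equals(mb.machineId);
    }
}
```

    Treating a missing hint as "never equal" reflects the statement that unhinted nodes carry no placement guarantees; analogous checks for rackId and siteId would follow the same pattern.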

    Placing addresses on a hash wheel

    Currently, addresses are placed on a hash wheel of size HASH_SPACE by using the following algorithm:

     

    pos = hash(address.hashCode()) % HASH_SPACE
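
    The formula above can be sketched in Java as follows. This is only an illustration under stated assumptions: the value of HASH_SPACE and the body of hash() are not given in this document, so the ones below are placeholders, and HashWheelSketch, position and buildWheel are hypothetical names. Math.floorMod is used instead of a bare % because Java's % can yield a negative result for a negative hash.

```java
import java.util.TreeMap;

// Illustrative sketch only; not the actual ConsistentHash implementation.
final class HashWheelSketch {
    // Assumed wheel size; the real value of HASH_SPACE is not stated in the document.
    static final int HASH_SPACE = 10240;

    // Placeholder bit-spreading function standing in for hash(); the real one is not shown.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // pos = hash(address.hashCode()) % HASH_SPACE, kept in [0, HASH_SPACE) via floorMod.
    static int position(Object address) {
        return Math.floorMod(hash(address.hashCode()), HASH_SPACE);
    }

    // Places every address on the wheel, ordered by ascending position.
    static TreeMap<Integer, Object> buildWheel(Iterable<?> addresses) {
        TreeMap<Integer, Object> wheel = new TreeMap<>();
        for (Object address : addresses) {
            // A position collision overwrites the earlier entry here;
            // a real implementation would need to resolve it.
            wheel.put(position(address), address);
        }
        return wheel;
    }
}
```

    A sorted map is a convenient representation because the next address clockwise from a key's position can be found with ceilingEntry, wrapping to firstEntry at the end of the wheel.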

    Making use of hints

    TODO