    JGroupsConsistentHash, etc. vs. KeyAffinityService: need tactics, samples

      Ambition: on an Infinispan 5.3 grid of 4 nodes (4 JVMs; all nodes share the same IP address, and each node has a different socket port)


      I.   distribute/partition a <K,V> set (call it set-A) uniformly across the grid and

      II.  pin a separate <K,V> set (call it set-B) at exactly 1 node on the grid.


      How do I best do this in 5.3?




      JGroupsTopologyAwareAddress - is it configurable to include at least "ip_address+port"? What is the physical anatomy of a JGroupsTopologyAwareAddress?


      GroupingConsistentHash - this looks like a good starting point for achieving our ambition. Are there code samples that demonstrate how to use it?


      KeyAffinityService - is this API intended to insulate the user from having to worry about JGroups-specific details? I.e., if I use the KeyAffinityService, can I get away without knowing how to code against JGroupsTopologyAwareAddress and/or GroupingConsistentHash?


      Once we get our ambition realized in a running Infinispan 5.3 grid instance of 4 nodes:


      Assume that node #1 has set-B pinned to it (exclusively). Is there any way that nodes #2, #3, #4 can then operate on set-B *by reference* (sort of like RMI operations over a network stub to the operand implementation), or must nodes #2, #3, #4 consume *by value* a copy of set-B into their own JVM address space before they can operate on it?


      Thanks for any helpful insights, tactics and code samples.




          I'll try to address this in generic terms; for details (e.g. which classes to use) contact the Infinispan team.


          To distribute set A (more or less) evenly across a cluster, use mode=DIST.


          To pin set B to a single node, also use DIST, but set numOwners=1 and plug in your own consistent hashing function which pins all keys to a given node. E.g. if you have view {M,N,O,P}, the consistent hash could pin all keys to the first member of the list, M. This works because every member in a cluster has the same view (unless there's a split).
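The pinning rule can be sketched in plain Java as follows. This is only a model of the idea, not an Infinispan consistent hash implementation: the class name `PinToFirstMemberSketch`, the method `locateOwners`, and the use of strings as node addresses are all assumptions made for illustration. The actual interface to implement in 5.3 should be confirmed with the Infinispan team.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a "pin everything to the first member" hash function.
// Node addresses are plain strings here; a real cluster would use its own address type.
final class PinToFirstMemberSketch {

    // Returns the owners for a key. The key is deliberately ignored:
    // every key maps to the first numOwners members of the current view.
    static List<String> locateOwners(Object key, List<String> view, int numOwners) {
        if (view.isEmpty()) {
            throw new IllegalArgumentException("view must contain at least one member");
        }
        int n = Math.min(numOwners, view.size());
        return new ArrayList<>(view.subList(0, n));
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("M", "N", "O", "P");
        // With numOwners=1, every key lands on M.
        System.out.println(locateOwners("key-1", view, 1));
        System.out.println(locateOwners("key-2", view, 1));
    }
}
```

Because the result depends only on the view, every node that holds the same view computes the same owner without any coordination.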


          However, this *will* lead to uneven distribution, so if set B is large, node M will bear more of the work than other nodes.


          Also, numOwners=1 is usually advisable only if you can fetch the data from somewhere else, e.g. a DB, and use the cluster only as a front-end cache to it, since a crash of the single owner loses the data held in the cluster.

          You could fix this by setting numOwners=2 and coming up with a consistent hash function which always returns the pinned node as the first of the 2 owners and varies the second (backup) owner, e.g. [M->N], [M->O], [M->P], [M->N] etc.
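A minimal sketch of such a function in plain Java, under the same assumptions as before: `PinnedPrimarySketch` and `locateOwners` are hypothetical names, node addresses are strings, and picking the backup by key hash is just one possible way to vary the second owner.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of numOwners=2 with a fixed primary and a varying backup.
final class PinnedPrimarySketch {

    // The first owner is always the first view member; the second owner is
    // chosen among the remaining members from the key's hash code.
    static List<String> locateOwners(Object key, List<String> view) {
        if (view.isEmpty()) {
            throw new IllegalArgumentException("view must contain at least one member");
        }
        String primary = view.get(0);
        if (view.size() == 1) {
            return Collections.singletonList(primary); // no backup available
        }
        int others = view.size() - 1;
        // floorMod keeps the index non-negative even for negative hash codes.
        int backupIndex = 1 + Math.floorMod(key.hashCode(), others);
        return Arrays.asList(primary, view.get(backupIndex));
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("M", "N", "O", "P");
        for (int key = 0; key < 4; key++) {
            // Integer keys 0..3 print [M, N], [M, O], [M, P], [M, N]
            System.out.println(locateOwners(key, view));
        }
    }
}
```

M still serves as the first owner for every key, so the load on M does not go away; what changes is that each key also has a copy on one other node.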


          Once key set B is pinned to M, all access (using DIST) on other nodes will be routed to M. Note that you *could* use an L1 cache, which caches locally, but you stated you don't want to do that.
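The difference between routed access and an L1 cache can be modelled with a toy example. Everything here is hypothetical (`RoutedReadSketch`, `Owner`, `Requester`, the `remoteReads` counter); it only counts how often a non-owner has to go to M, and it leaves out what a real L1 cache must also handle, such as invalidating local copies when the owner's data changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: one owner node (M) and a non-owner that reads from it.
final class RoutedReadSketch {

    // Stands in for node M, which holds the only stored copy of set B.
    static final class Owner {
        final Map<String, String> store = new HashMap<>();
        int remoteReads = 0; // number of reads served for other nodes

        String remoteGet(String key) {
            remoteReads++;
            return store.get(key);
        }
    }

    // Stands in for any other node; optionally keeps a local (L1-style) copy.
    static final class Requester {
        private final Owner owner;
        private final boolean l1Enabled;
        private final Map<String, String> l1 = new HashMap<>();

        Requester(Owner owner, boolean l1Enabled) {
            this.owner = owner;
            this.l1Enabled = l1Enabled;
        }

        String get(String key) {
            if (l1Enabled && l1.containsKey(key)) {
                return l1.get(key); // served locally, no trip to the owner
            }
            String value = owner.remoteGet(key); // routed to the owner
            if (l1Enabled && value != null) {
                l1.put(key, value);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        Owner m = new Owner();
        m.store.put("b1", "value-1");
        Requester n = new Requester(m, false);
        n.get("b1");
        n.get("b1");
        System.out.println("remote reads without L1: " + m.remoteReads);
    }
}
```

Without L1, every read on the non-owner costs a trip to M; with L1, only the first read of a key does, at the price of holding a local copy.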

            As always, Bela, your responses are amazingly helpful.  Thank you  -- we will reach out to infinispan-dev team re: Infinispan API specifics to build this solution.