2 Replies Latest reply on Nov 20, 2012 3:21 AM by smariusp

    Get all values for which a specific node is the primary owner

    smariusp

      Hi,

       

          I am running some tests to evaluate Infinispan in distributed mode, and I can't figure out how to get all the values for which a node is the primary owner.

      Could you please help me with some hints?

       

      Thank you!

        • 1. Re: Get all values for which a specific node is the primary owner
          nadirx

          In distribution mode, the Cache<K,V>.values(), entrySet() and keySet() methods only return the data stored locally, so that could be a way to obtain the information. They do not, however, distinguish between primary and backup owners. The DistributionManager (which you can obtain via the Cache.getAdvancedCache().getDistributionManager() call) has a locate method which returns the addresses of the nodes in the cluster that own the specified keys.

          You could also use the DistExec framework to perform operations on the local data of each node.
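
The filtering idea in the reply above can be sketched in plain Java. This is only an illustration, not Infinispan code: `PrimaryOwnedFilter` and `primaryOwnedValues` are hypothetical names, node addresses are modelled as strings, and the `locate` function stands in for a lookup such as `DistributionManager.locate(key)`. It also assumes that the first address in the returned owner list is the primary owner, which should be checked against the Javadoc of the Infinispan version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Infinispan.
class PrimaryOwnedFilter {

    // Returns the values of all local entries whose primary owner is 'self'.
    // 'locate' is a stand-in for an owner lookup; owners are plain strings here.
    static <K, V> List<V> primaryOwnedValues(Map<K, V> localEntries,
                                             Function<K, List<String>> locate,
                                             String self) {
        List<V> result = new ArrayList<>();
        for (Map.Entry<K, V> entry : localEntries.entrySet()) {
            List<String> owners = locate.apply(entry.getKey());
            // Assumption: the first owner in the list is the primary owner.
            if (owners != null && !owners.isEmpty() && owners.get(0).equals(self)) {
                result.add(entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Entries held locally on nodeA (as primary or backup owner).
        Map<String, String> local = new LinkedHashMap<>();
        local.put("k1", "task-1");
        local.put("k2", "task-2");
        local.put("k3", "task-3");

        // Made-up owner lists; the first element is treated as the primary.
        Map<String, List<String>> owners = new LinkedHashMap<>();
        owners.put("k1", Arrays.asList("nodeA", "nodeB"));
        owners.put("k2", Arrays.asList("nodeB", "nodeA"));
        owners.put("k3", Arrays.asList("nodeA", "nodeC"));

        System.out.println(primaryOwnedValues(local, owners::get, "nodeA"));
    }
}
```

In a real cluster the map of local entries would come from the cache's local view and the lookup from the DistributionManager; both are replaced by in-memory stand-ins here so the sketch runs on its own.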

          • 2. Re: Get all values for which a specific node is the primary owner
            smariusp

            Hi Tristan,

             

            Thank you for your response.

             

            This is what I have used, but I was unsure because the documentation states that keySet() and values() are not suitable for production use. For example, when the local cache holds a large number of entries, do they return a copy of the entries?

             

            Just a few words about the context in which I want to use Infinispan. I have a cluster of processors and I am trying to distribute some workload within this cluster. I am using Infinispan to store a sort of task queue. As far as I know, each key has exactly one primary owner. I want each node to process only data which is stored locally (in order to reduce network calls) and I want to ensure that each task is processed exactly once. So I thought of using the primary ownership information, so that each processor selects its next task according to these criteria. Because the cache is distributed, if a node crashes, the tasks belonging to the dead node would get a new primary owner, which could then seamlessly take over and process them.
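
The takeover behaviour described above can be illustrated with a deliberately simplified model. It is not how Infinispan rebalances (the real consistent hash is recomputed when the topology changes); it only assumes that each key has an ordered owner list and that the first owner still in the cluster acts as primary. `PrimaryFailoverModel` and `currentPrimary` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of primary-owner takeover; not Infinispan code.
class PrimaryFailoverModel {

    // Returns the first owner that is still a live cluster member, or null if none is left.
    static String currentPrimary(List<String> owners, Set<String> liveMembers) {
        for (String owner : owners) {
            if (liveMembers.contains(owner)) {
                return owner;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up owner list for one task key: nodeA is primary, nodeB is backup.
        List<String> owners = Arrays.asList("nodeA", "nodeB");
        Set<String> live = new HashSet<>(Arrays.asList("nodeA", "nodeB", "nodeC"));

        System.out.println(currentPrimary(owners, live)); // primary while nodeA is up

        live.remove("nodeA"); // simulate a crash of nodeA
        System.out.println(currentPrimary(owners, live)); // backup takes over
    }
}
```

Note that this only models who is responsible for a task. A task that the crashed node had already started could still be picked up again by the new primary, so exactly-once processing needs additional bookkeeping.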

             

            In order to accomplish the above, I have tried to use the query feature. I store the hash of the key in each value object (this information does not change when the cluster topology changes, so I don't need to recompute it), and each node can then query the cache like this:

                                        Give me the first n values for which hash(key) belongs to a segment for which I am the primary owner.

            But the query feature seems way too much for what I need to do – I don’t need network calls to find these values. And it seems the query feature does not support complex filters.
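
The selection rule above ("the first n values whose key hash falls in a segment I am primary owner of") can be sketched as a plain local scan, without the query module. Everything here is an assumption made for illustration: `SegmentTaskSelector`, `Task`, `segmentOf` and `firstNOwned` are hypothetical names, the hash-to-segment mapping is a simple modulo rather than Infinispan's actual consistent hash, and the segment-to-primary-owner table is supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of segment-based task selection; not Infinispan code.
class SegmentTaskSelector {

    // A value object that carries the hash of its own key, as described in the post.
    static final class Task {
        final String id;
        final int keyHash;

        Task(String id, int keyHash) {
            this.id = id;
            this.keyHash = keyHash;
        }
    }

    // Assumption: segments are assigned by modulo; floorMod keeps negative hashes in range.
    static int segmentOf(int keyHash, int numSegments) {
        return Math.floorMod(keyHash, numSegments);
    }

    // Returns at most n local tasks whose segment has 'self' as primary owner.
    // segmentPrimary[i] is the (made-up) primary owner of segment i.
    static List<Task> firstNOwned(List<Task> localTasks, String[] segmentPrimary,
                                  String self, int n) {
        List<Task> picked = new ArrayList<>();
        for (Task task : localTasks) {
            if (picked.size() >= n) {
                break;
            }
            int segment = segmentOf(task.keyHash, segmentPrimary.length);
            if (segmentPrimary[segment].equals(self)) {
                picked.add(task);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        String[] segmentPrimary = {"nodeA", "nodeB", "nodeA", "nodeB"};
        List<Task> local = Arrays.asList(
                new Task("t0", 0), new Task("t1", 1), new Task("t2", 2),
                new Task("t5", 5), new Task("t6", 6));

        for (Task task : firstNOwned(local, segmentPrimary, "nodeA", 2)) {
            System.out.println(task.id);
        }
    }
}
```

Because the key hash is stored in the value, only the segment-to-owner table has to be refreshed when the topology changes; the scan itself touches local data only.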

             

            Could you please help me with any suggestions?