11 Replies Latest reply on Mar 22, 2011 6:37 AM by galder.zamarreno

    Eviction, Consistency, & Secondary Owners


      I know that this issue has been brought up but I wanted to expand on it a little differently.


      For me, the issue is the fact that evictions are not propagated to secondary owners.


      So if an item exists on nodes 1 and 2 and is evicted from 1, it remains on 2. This creates an inconsistency as any GET performed on node 1 will return null while a GET performed on node 2 will return the value.
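To make the scenario concrete, here is a minimal sketch in plain Java. The two `HashMap`s are hypothetical stand-ins for the in-memory data containers of the two owner nodes; this is not Infinispan API, just an illustration of the read inconsistency:

```java
import java.util.HashMap;
import java.util.Map;

public class EvictionInconsistency {
    public static void main(String[] args) {
        // Hypothetical stand-ins for the data containers of two owner nodes.
        Map<String, String> node1 = new HashMap<>();
        Map<String, String> node2 = new HashMap<>();

        // The entry is written to both owners (numOwners = 2).
        node1.put("key", "value");
        node2.put("key", "value");

        // Node 1 evicts the entry locally; node 2 is never told.
        node1.remove("key");

        // The same GET now yields different answers depending on the node asked.
        System.out.println("node1 GET: " + node1.get("key")); // prints "node1 GET: null"
        System.out.println("node2 GET: " + node2.get("key")); // prints "node2 GET: value"
    }
}
```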


      My first thought is that evictions should be propagated. This would ensure proper consistency. Not sure why they aren't?


      The other option is to do something similar to the cluster cache loader. The problem with this loader, though, is that it is NOT topology aware and thus highly inefficient. It simply can't scale. I'd rather see another mechanism in the interceptor chain that does what the cluster cache loader does but using the topology.


      Perhaps the loader could be updated to take advantage of the topology, but part of me is just not certain that this technique should fall into the 'loader' category.


      The only downside to this type of modification is that it still results in a number of secondary requests for typical cache misses, whereas propagated evictions would not.




      Can we add logic to support propagated evictions?

      Can we add logic to fetch evicted entries from secondary owners without the cache loader?

      Can we update the cache loader to take advantage of the topology?

        • 1. Eviction, Consistency, & Secondary Owners

          Eviction isn't really concerned about consistency, since it's basically a memory-saving mechanism. Inconsistency is expected. If the eviction algorithm decides this node needs to keep the data according to its policy, why shouldn't it keep it?


          So maybe what you're trying to do isn't really "eviction"; it's something else?

          • 2. Eviction, Consistency, & Secondary Owners

            It's not about when the algorithm decides to keep an entry; it's about when it decides to evict it but the secondary owner (numOwners > 1) does not know about it.


            Why can't we support eviction and consistency? I see no reason why they should contradict each other.

            • 3. Eviction, Consistency, & Secondary Owners

              I think you don't understand 'eviction' correctly. Eviction does not mean the entry is invalid. It is just a way to prevent memory overflow. The evicted entry is still valid on the other node.

              • 4. Eviction, Consistency, & Secondary Owners

                I think you're missing my point. I know very well what eviction is, and yes, we use it to limit the amount of memory used.


                However, it can cause consistency issues. The problem is NOT that it still lives on the other node but that some requests will return NULL while others will return a value. That is very bad for us. Either EVERY request should return a value, or EVERY request should return NULL. Inconsistency is simply a bad thing and should be avoided at all costs.


                As mentioned in the other discussion, the clustered cache loader solves this problem. However, because it is NOT topology aware, it does not scale and becomes terribly inefficient.

                • 5. Eviction, Consistency, & Secondary Owners

                  If you want a guaranteed consistent view of your data on all nodes, you have to use transactions to handle data removal. Evictions by their nature aren't intended to be transactional. Even if they were, it wouldn't make sense because not every node would agree on what to remove when. (And also, think about what a "cache" is and what it's intended for...)


                  If you don't want your data to go missing, you can simply use a non-clustered cache loader with passivation. I don't think this would scale too badly.
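As a rough sketch of what passivation buys you, here is a hypothetical plain-Java LRU cache (not Infinispan's implementation) that writes evicted entries to a backing store instead of discarding them, so a later local GET still finds the value:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache that passivates evicted entries to a backing store
// instead of discarding them, so local reads never see a spurious null.
public class PassivatingCache<K, V> {
    private final Map<K, V> store = new HashMap<>(); // stand-in for a cache store (disk, DB, ...)
    private final Map<K, V> memory;

    public PassivatingCache(int maxEntries) {
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > maxEntries) {
                    // Passivate on eviction: move the entry to the store.
                    store.put(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    public void put(K key, V value) {
        memory.put(key, value);
    }

    public V get(K key) {
        V v = memory.get(key);
        if (v == null && (v = store.remove(key)) != null) {
            memory.put(key, v); // activate back into memory
        }
        return v;
    }
}
```

The point of the sketch is that eviction and passivation are two halves of one move: the entry leaves memory but never leaves the node, so the consistency complaint above doesn't arise for local reads.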

                  • 6. Eviction, Consistency, & Secondary Owners

                    I can see your point with respect to eviction as a result of overflow since this executes within the context of the underlying collection itself. That said, expiration could certainly be handled within a transaction.


                    Passivation would only buy us some time, it would not solve the issue.


                    I'll just sum up with two questions.


                    True or False? If an entry exists within a cluster (and there is no partition), it should be accessible via every node in the cluster.


                    True or False? There is no technical argument against updating the cluster cache loader to optionally use the CH to restrict clustered get commands to the known owners.



                    If the answer to the first question is false, then I guess we'll just have to agree to disagree.


                    If the answer to the second question is false, I would love to hear the argument.
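The proposal in the second question (use the consistent hash to ask only the known owners) can be sketched in plain Java. The ring-walk hashing here is hypothetical and much simpler than Infinispan's actual ConsistentHash; it only illustrates why a topology-aware remote get touches numOwners nodes instead of the whole cluster:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical owner selection: map a key onto the node ring and take the
// next numOwners nodes, so a remote load only contacts the nodes that can
// actually hold the entry instead of broadcasting to the whole cluster.
public class OwnerLocator {
    private final List<String> nodes;
    private final int numOwners;

    public OwnerLocator(List<String> nodes, int numOwners) {
        this.nodes = nodes;
        this.numOwners = numOwners;
    }

    public List<String> ownersOf(Object key) {
        int start = Math.floorMod(key.hashCode(), nodes.size());
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < numOwners; i++) {
            owners.add(nodes.get((start + i) % nodes.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        OwnerLocator ch = new OwnerLocator(List.of("A", "B", "C", "D"), 2);
        // Only two of the four nodes are asked for any given key.
        System.out.println(ch.ownersOf("someKey"));
    }
}
```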

                    • 7. Eviction, Consistency, & Secondary Owners

                      The answer for the first question is, it depends on what you need. If you're using eviction without a cache loader/cache store then usually you don't care if entries go missing.


                      The answer to the second is, it sounds like a good idea. And you can easily contribute a patch. But keep in mind, data might still go missing since the topology might be changed mid-flight. Anyway, CCL wasn't intended for what you seem to use it for.


                      I guess it comes down to: What's your specific use case?

                      • 8. Eviction, Consistency, & Secondary Owners

                        Thanks for the feedback Elias. It is always helpful to get some alternate/opposing viewpoints.


                        I think we'll focus on adding some options to the cluster cache loader.


                        Our particular use case is that we don't have the option for passivation and we have millions of entries. With a finite set of hardware, we have to rely on eviction. However, a cache miss is an expensive operation for us and if we can avoid one (even some of the time) using the cluster cache loader that would be great. Though we prefer to avoid a full clustered get. It won't provide us anything over a get to just the assumed owners (most of the time). Even if this helps some of the time, it is a great advantage for us.

                        • 9. Eviction, Consistency, & Secondary Owners

                          Although not explicitly said, it appears that you're using replicated caches here? Did you try distribution?

                          • 10. Eviction, Consistency, & Secondary Owners

                            I should have stated that originally. We are using distribution. Thanks.

                            • 11. Eviction, Consistency, & Secondary Owners

                              Oh right, so the original description is about the case where the key maps to the local node and the entry has been evicted there.


                              ClusterCacheLoader has been primarily designed for the replication use case, where instead of using state transfer you can use the cluster cache loader to load stuff lazily. So, as you rightly said, it doesn't take distribution-related hints into account. We would, however, welcome a DIST-tailored cluster cache loader to make it more efficient.


                              In the meantime, the easiest way to work around this is to use a cache store and passivation. It might be worth trying to find a cache store whose access is not as expensive as your current one?


                              Doing a clustered eviction would be very expensive, since we'd have to wait for replies from all nodes involved, and it could lead to bottlenecks in the eviction layer waiting for responses, etc.