1 2 Previous Next 17 Replies Latest reply on Jul 25, 2011 5:05 AM by pmuir

    BoundedConcurrentHashMap/Eviction Logic on GETs

    shane_dev

      I've been looking at the eviction implementation in the BoundedConcurrentHashMap, and there are a few things that are disconcerting to me.

       

      Capacity = 1000

      Load Factory = .75

      Max Batch Size = 64 (Static - It will almost always be the smaller value.)

       

      BoundedConcurrentHashMap.put

      if segment count + 1 > table length,

      --> LRU.execute

           move accessQueue entries to lruQueue

           --> LRU.isOverflow

                return true if lruQueue.size > 750 (1000 * .75)

           if true, remove the entry from lruQueue and segment

           clear accessQueue

       

      The eviction strategy with respect to PUT commands seems to make sense. If we reach capacity, we evict entries until we get back to Capacity * Load Factor.

       

      What I'm not so sure about is why eviction is executed from GET commands.

       

      BoundedConcurrentHashMap.get

      --> LRU.onEntryHit

           add it to accessQueue

           return true if accessQueue.size >= 48 (64 * .75)

      if true,

      --> BoundedConcurrentHashMap.attemptEviction

           --> LRU.thresholdExpired (I know this is only if a lock has not yet been acquired.)

                return true if accessQueue.size >=  64

           if lock acquired

           --> LRU.execute()

                move accessQueue entries to lruQueue

                --> LRU.isOverflow

                     return true if lruQueue.size > 750 (1000 * .75)

                if true, remove the entry from lruQueue and segment

                clear accessQueue

       

      First, what is the purpose of calling LRU.thresholdExpired? All calls to attemptEviction go through onEntryHit and it already checks using Max Batch Size * Load Factor. It seems the logic at this point is all about onbtaining the lock and I don't see why we're checking the threshold (Max Batch Size) should play a role in obtaining the lock.

       

      Second, why is eviction even initiated by a GET command? I don't mean the call to onEntryHit, but the call to attemptEviction.

       

      Third, why should it happen every 48 GETs? It seems like we'll be in a constant state of eviction (once we reach Capacity * Load Factor + 1) assuming that we are doing at least 1 PUT per every 48 GETs.

       

      Finally, I've looked at the code for the EvictionManagerImpl and the documentation, but to me it looks like it schedules expiration. It does not appear to actually perform eviction. Is this interpretation correct?

       

      EvictionManagerImpl.processEviction -> DefaultDataContainer.purgeExpired

       

       

       

      It might be that I'm not interpreting the code correctly so please feel free to correct me .

        • 1. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
          shane_dev

          It occurred to me that maybe having the GET commands call attemptEviction is more about flushing the accessQueue than it is performing the actual eviction.

           

          If this happens to be the case, 1) could we simply separate these two functions and 2) wouldn't the flushing be better handled by a scheduled task such as the EvictionManagerImpl?

          • 2. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
            an1310

            In fact, that's what we do.  We wrote our own eviction manager and it improved performance tremendously.

            1 of 1 people found this helpful
            • 3. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
              vblagojevic

              Shane,

               

              Yes it is more about flushing that access queue. If I remember correctly I had to make this accomodation in order to plug other eviction algorithms like LIRS. It is great that this obscure area of research is getting some traction from users like you. The idea behind this amortization wrapper over an eviction algorithm is described in http://academic.research.microsoft.com/Publication/4715892

               

              The code that you are reading is my implementation of that research paper. It might not be the best and I am open to review, suggestions and corrections.

               

              Erik,

               

              How did you make removals from container safe given that you have to use another thread for removal? I thought that the best way to achieve safe removal is to piggyback on user threads hitting the container. Why don't you make this improvement public and open for review? We should all capatalize if you choose that course of action.

               

              Regards,

              Vladimir

              1 of 1 people found this helpful
              • 4. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                shane_dev

                Thanks Vladimir,

                 

                At this point we are just looking to understand the logic behind the existing code, so I'll definately take a look at that link. Thanks for that.

                 

                One of the changes I suggested here was to update the code so that the access queue is flushed, but eviction is not started. We also added some logging to help us identify how often it is happening and what not.

                 

                1) Do you foresee any issues with that approach?

                2) Regarding your concern with the eviction manager, what if we used it to simply flush the access queue periodically and left eviction to the PUT calls. Does that seem like a worthwhile approach to look at?

                • 5. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                  shane_dev

                  Do you think you can share the approach/logic behind your version if not the code? I too would be curious to see how you went about it.

                  • 6. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                    shane_dev

                    After reading that paper, I have a better understanding of the code. Here are some comments for anyone else that may come across this thread.

                     

                    1. I see that the maxBatchQueueSize is 64 because that seems to be the point of diminishing returns based on the published results of the paper.
                    2. I see that the code first checks to see if the access queue size > maxBatchQueueSize * batchThresholdFactor to take advantage of tryLock() and that a lock is only exlicity acquired if tryLock() failed and the access queue size >= maxBatchQueueSize.

                     

                    It seems to me that eviction in the context of a DB buffer is not quite the same as eviction in the context of a data grid. My interpretation is that they are victimizing entries in order to optimize the buffer usage whereas we are victimizing entries to make room for new ones.

                     

                    With that said, I'm still inclined to lean towards the following...

                     

                    • Separate flushing the acess queue to the lru queue from the actual eviction.
                    • Because of said separation, we may not need to limit the access queue size to 64 anymore.
                    • The tryLock() optimization seems to be based on the need to frequently perform eviction (for optimization). If we choose to perform eviction only as a result of a PUT and only when we've reached capacity (to make room), I'm not so sure it is needed anymore. At least in the context of maxBatchQueuSize and batchThresholdFactory as metioned in #2 above. Though maybe we want to preemptively perform eviction on PUTs if we reach a similar threshold based on the capacity. I'm a little uncertain about this though.

                     

                    Thoughts?

                    • 7. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                      vblagojevic

                      Hi Shane,

                       

                      I have tried access queue size of more than 64 and have seen no improvement whatsoever. There is MapStressTest you can use to experiement! I think your idea to perform eviction on PUT  would not yield much improvement because get eviction attempts are well amortized. But please do experiment, if you can show better performance with that approach we'll change things around!

                       

                      Vladimir

                      • 8. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                        shane_dev

                        Very cool. I'll definitely check out the stress test.

                         

                        My only concern with the GETs is that you are basically in a constant state of eviction. It will be called continously (every 48-64 GETs) assuming you have any degree of load. If we performing a 100 per second, we're executing an eviction every second. Though maybe this is not that big of a deal.

                         

                        We're currently at the early stages of troubleshooting an issue where once we reach our eviction point (a couple million entries I think) we encounter serious stability issues that never cease.

                        • 9. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                          vblagojevic

                          Hey,

                           

                          Yes it will be called but it will only evict if a segment is above threshold. Keep us updated about any new findings!

                           

                          Regards,

                          Vladimir

                          • 10. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                            shane_dev

                            Indeed, but once you reach that threshold you are going to remain above it assuming you do at least 1 PUT per 64 GETs per second. At that point, depending on the nature of your traffic, you may never get past threshold + 1. If it weren't for the GETs calling eviction, you could work you way back up to the capacity before performing eviction again. Because of the GETs, you'll likely not get much past the threshold.

                            • 11. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                              vblagojevic

                              Ok so we agree that even though eviction is callled from GET it will not be actually performed, right? Segment load will remain equal to load factor x capacity, right? Ok, I agree once we have a serious of PUTs we are going to trim segment back to threshold. Be it just one PUT or many I do not see how is that a problem :-(

                               

                              Regards,

                              Vladimir

                              • 12. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                                shane_dev

                                It may not be. I'm just trying to better understand the process.

                                 

                                This is how I was interpreting it. Capacity = 100, Load Factor = .75, Threshold = 75

                                 

                                1. A PUT results in entry 76.
                                2. 64 GETs are executed.
                                3. The 64th GET actually peforms eviction (76 > 75). Lock is acquired. In this case 1 entry is evicted.
                                4. A PUT results in entry 76.
                                5. 64 GETs are executed.
                                6. The 64th GET actually peforms eviction (76 > 75). Lock is acquired. In this case 1 entry is evicted.

                                 

                                This may not be a bad thing at all, I just assumed it would be more efficient to move closer to the capacity (100) before actually performing eviction seeing as how it would be beneficial to batch a bunch of evictions.

                                 

                                Sort of something like this.

                                 

                                1. A PUT results in entry 76.
                                2. 64 GETs are executed.
                                3. A PUT results in entry 77.
                                4. 64 GETs are excuted.
                                5. A PUT results in entry 78.
                                6. REPEAT until we get to entry 101.
                                7. Perform eviction. Lock is acquired. In this case 25 entries are evicted.

                                 

                                I'm just trying to confirm that what we want (based on the code) is to actually perform an eviction every 64 GETs (max) assuming there has been at least 1 PUT.

                                • 13. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                                  vblagojevic

                                  Yes, but then one could argue contents of a container in such case are a bit stale as we have waited for so many accesses before evicting. I see what you are saying but I do not think your proposal is without trade offs that might be detrimental in one way or other.

                                   

                                  Regards,

                                  Vladimir

                                  • 14. Re: BoundedConcurrentHashMap/Eviction Logic on GETs
                                    shane_dev

                                    I'm not entirely sure I follow.

                                     

                                    I would still be fine with regularly taking the acess queue and updating the lru queue.

                                     

                                    What would be the point of evicting entries from the container if we still have plenty of room left? That doesn't seem to make any sense.

                                     

                                    The notion of being 'stale' is a matter of inconsistent caching or expiration. It has nothing to do with eviction. Just because I haven't accessed an entry lately does not mean that it is stale. It is not an 'out of date' version nor has exceeded its lifespan.

                                    1 2 Previous Next