7 Replies Latest reply on Aug 8, 2008 1:24 PM by manik

    Redesigning eviction


      This is in reference to this thread on the user forum (see last few posts by nnnnn, bstansberry and myself).

      In a nutshell, the current eviction design combines a few things, such as when to evict and how to choose candidates for eviction, into a single eviction policy. Ideally these two should be separated, along with other aspects such as which thread to use for eviction (a separate housekeeping thread or the user thread).
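A minimal sketch of what that separation could look like. Every name here (`EvictionTrigger`, `CandidateSelector`, `LruSelector`, `Evictor`) is hypothetical and made up for illustration; none of them are JBoss Cache APIs, and nodes are reduced to Fqn strings with a last-access timestamp.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.stream.Collectors;

// All names here are hypothetical; none of them are JBoss Cache APIs.

/** WHEN: decides whether an eviction pass should run. */
interface EvictionTrigger {
    boolean shouldEvict(int nodeCount);
}

/** WHICH: picks up to howMany candidates from per-node last-access times. */
interface CandidateSelector {
    List<String> select(Map<String, Long> lastAccess, int howMany);
}

/** Least-recently-used selection: oldest access time first. */
class LruSelector implements CandidateSelector {
    public List<String> select(Map<String, Long> lastAccess, int howMany) {
        return lastAccess.entrySet().stream()
                .sorted(Map.Entry.comparingByValue())
                .limit(Math.max(0, howMany))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}

/** Wires the pieces together; the Executor decides WHICH THREAD does the work. */
class Evictor {
    private final Map<String, Long> lastAccess = new ConcurrentHashMap<>();
    private final EvictionTrigger trigger;
    private final CandidateSelector selector;
    private final Executor executor;
    private final int targetSize;

    Evictor(EvictionTrigger trigger, CandidateSelector selector,
            Executor executor, int targetSize) {
        this.trigger = trigger;
        this.selector = selector;
        this.executor = executor;
        this.targetSize = targetSize;
    }

    /** Records an access to the node at the given Fqn. */
    void touch(String fqn, long timestamp) { lastAccess.put(fqn, timestamp); }

    boolean contains(String fqn) { return lastAccess.containsKey(fqn); }

    int size() { return lastAccess.size(); }

    /** Runs one eviction pass on the configured executor, if the trigger fires. */
    void maybeEvict() {
        executor.execute(() -> {
            if (!trigger.shouldEvict(lastAccess.size())) {
                return;
            }
            int excess = lastAccess.size() - targetSize;
            for (String fqn : selector.select(lastAccess, excess)) {
                lastAccess.remove(fqn);
            }
        });
    }
}
```

Passing `Runnable::run` as the executor runs the pass on the calling (user) thread; a single-thread executor would give the separate housekeeping thread instead, without touching the trigger or the selector.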

      I've also created a JIRA issue to track this: JBCACHE-1141.

        • 1. Re: Redesigning eviction

          I suppose the first question to ask is, is such a fine-grained approach to eviction a worthwhile thing to do? It makes sense in theory, but in practice, is it really a big win?

          • 2. Re: Redesigning eviction

            When is an eviction needed? When we're running out of memory. If memory was infinite, we wouldn't need evictions.

            Once an eviction is needed, what do we kick out? The node occupying the most memory? The node used less frequently? Etc.

            I guess it depends on how the two different aspects highlighted by nnnn are gonna evolve, but the use cases could multiply in the future.

            The only downside to separating the two would be that users would have to think through both, but they already have to do so to some extent. It's certainly not easy to find out how much memory storing 100 nodes of Class X is gonna take up. And then, they have to think about which ones they should be removing.

            Something worth noting even if it's not strictly linked to evictions. We're trying to evict nodes because we have a finite memory, but what if Cache was able to reshuffle itself to make access to nodes that are more regularly used faster? i.e. there's no need to go down the tree n times (/a/b/c/d/e/f/g/h/i/j/k...../myinstance). Evictions are driven by memory requirements, and Reshuffles would be driven by performance, but both could benefit from the same statistics gathered?

            Also, a result of an eviction process could be moving it to a node which has more free space than us, rather than passivating it to a db. For example, this could be done if we're able to determine that bringing the node from another cache instance would be faster than having to go to the db.

            The bottom line is, data gathered from node usage, currently only used by evictions, could serve more purposes. What to do when a node needs to be evicted could also vary, and this is on top of what nnnn suggested, which is when an eviction is needed and who needs to go. We're only scratching the surface here...

            • 3. Re: Redesigning eviction

              Separating what to do with nodes that are evicted, such as deleting them from memory or passivating them, has given us flexibility. The same should be done for what nnnn suggested.

              • 4. Re: Redesigning eviction


                When is an eviction needed? When we're running out of memory. If memory was infinite, we wouldn't need evictions.

                I'm not sure this is the only scenario where an object needs evicting. I think another criterion would be that an object should be evicted when its state has become "stale". Sometimes the logic around what makes "stale state" is going to be so application-specific that there could be no expectation that the cache works it out. However, there are plenty of occasions where a broad-brush approach such as "objects under this sub-branch get evicted after X minutes" is good enough.

                I have a couple of applications in production where JBossCache is used as a 2nd level cache for Hibernate and a time based eviction policy suits the application very well.
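To make the "objects under this sub-branch get evicted after X minutes" idea concrete, here is a rough sketch. `BranchExpiry` and its methods are hypothetical names, not the JBoss Cache eviction configuration, and it assumes "age" means time since the node was last written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the JBoss Cache eviction API.
class BranchExpiry {
    // Branch Fqn -> maximum age in milliseconds; the first matching rule wins.
    private final Map<String, Long> maxAgeByBranch = new LinkedHashMap<>();
    // Node Fqn -> time of last write in milliseconds.
    private final Map<String, Long> lastWrite = new LinkedHashMap<>();

    void setMaxAge(String branch, long maxAgeMillis) {
        maxAgeByBranch.put(branch, maxAgeMillis);
    }

    void recordWrite(String fqn, long nowMillis) {
        lastWrite.put(fqn, nowMillis);
    }

    boolean isTracked(String fqn) {
        return lastWrite.containsKey(fqn);
    }

    /** Removes and returns every node that has outlived its branch's max age. */
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> node : lastWrite.entrySet()) {
            Long maxAge = maxAgeFor(node.getKey());
            if (maxAge != null && nowMillis - node.getValue() > maxAge) {
                expired.add(node.getKey());
            }
        }
        lastWrite.keySet().removeAll(expired);
        return expired;
    }

    /** Finds the rule for the branch containing this Fqn, or null if none applies. */
    private Long maxAgeFor(String fqn) {
        for (Map.Entry<String, Long> rule : maxAgeByBranch.entrySet()) {
            String branch = rule.getKey();
            if (fqn.equals(branch) || fqn.startsWith(branch + "/")) {
                return rule.getValue();
            }
        }
        return null;
    }
}
```

Nodes outside any configured branch are never expired here, which keeps the configuration down to one number per sub-branch.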

                I guess the point of this post is:

                1) Pointing out that memory consumption is not the only driver for eviction

                2) A simple policy (with simple configuration) will suit "real world" applications just fine. Making it overly complicated to do easy stuff would be painful for us "simple users" (well, me anyway ;)

                3) As what's there kind of works, perhaps the effort that would go into this would be better off put into something else for now (like standardising the AOP framework for the pojo cache, even better to be standardising on the JBoss AOP framework so that those of us who are tied to using the appserver don't have to learn another AOP framework or worry about how a different AOP framework will interact with JBoss AOP that's already being used in other parts of their applications...)



                • 5. Re: Redesigning eviction

                  I started an implementation that separates:
                  when is there eviction => decided by EvictionTask.
                  which node is evicted => decided by the algorithm.
                  how many are evicted => decided by the number of nodes in the cache.
                  I posted a patch here:

                  • 6. Re: Redesigning eviction

                    I went down the road of thinking of eviction as also being about invalidation, and created an algorithm for performing timed expiration, but JBoss Cache expiration is really just the lowly task of managing memory.

                    Anyway, in addition to "when", "which" and "how many", I'd like to see "in what way" added to the list of concerns. Typically, people want to do "page to disk", but I can imagine wanting plain removal, or migration, or compression perhaps. (Background compression would be a neat feature, come to think of it.)
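As a rough illustration of "in what way" as its own pluggable concern, the sketch below uses a hypothetical `EvictionAction` interface with two implementations, plain removal and in-place compression. None of this is JBoss Cache API; node data is simplified to a byte array keyed by Fqn string.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; not a JBoss Cache API.

/** IN WHAT WAY: what happens to a node once it has been chosen for eviction. */
interface EvictionAction {
    void evict(String fqn, Map<String, byte[]> memory);
}

/** Plain removal: the node's data is simply dropped from memory. */
class RemoveAction implements EvictionAction {
    public void evict(String fqn, Map<String, byte[]> memory) {
        memory.remove(fqn);
    }
}

/** Compression: the node stays in memory, but its data is replaced by a gzipped copy. */
class CompressAction implements EvictionAction {
    public void evict(String fqn, Map<String, byte[]> memory) {
        byte[] raw = memory.get(fqn);
        if (raw == null) {
            return;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // A real cache would also have to flag the node as compressed
        // so that reads know to decompress it; that is omitted here.
        memory.put(fqn, out.toByteArray());
    }
}
```

Paging to disk or migrating to another cache instance would be further implementations of the same interface, chosen independently of when eviction runs and which nodes it picks.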

                    And in terms of invalidation, it would be nice to see some sort of general "invalidation" interceptor that checks the validity of the data being fetched. This is likely necessary since the eviction thread might not run soon enough.
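A rough sketch of that check-on-read idea, independent of any eviction thread. `ValidatingCache` is a hypothetical stand-in for an interceptor sitting in front of the cache; the validity rule is supplied by the application as a predicate, and nothing here is actual JBoss Cache interceptor API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical sketch; not the JBoss Cache interceptor API.
class ValidatingCache<V> {
    private final Map<String, V> data = new HashMap<>();
    // Application-supplied rule: is this value for this Fqn still valid?
    private final BiPredicate<String, V> isValid;

    ValidatingCache(BiPredicate<String, V> isValid) {
        this.isValid = isValid;
    }

    void put(String fqn, V value) {
        data.put(fqn, value);
    }

    /** Returns the value only if it is still valid; stale entries are dropped on read. */
    V get(String fqn) {
        V value = data.get(fqn);
        if (value != null && !isValid.test(fqn, value)) {
            data.remove(fqn);
            return null;
        }
        return value;
    }
}
```

Because the check runs on every fetch, a caller never sees stale data even when the background eviction pass is late.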

                    • 7. Re: Redesigning eviction