1 Reply Latest reply on Dec 20, 2005 7:49 PM by manik

    CacheLoader optimisations - JBCACHE-118 and JBCACHE-374

    manik

      Looking at JBCACHE-118 now.

      There is a reported inefficiency of calling loader.exists() before calling loader.get().

      We could call loader.get() directly, but since only attributes are stored in the cache loader, a null from loader.get() is ambiguous: it could mean the node does not exist, or that the node exists but contains no data. So whenever loader.get() returns null, we would THEN have to call loader.exists() to test whether the node exists in the first place.

      1) The most efficient thing I can think of is to mandate in the API that loader.get() ALWAYS returns a Map of data, an empty map if there is no data. A null ALWAYS means the node doesn't exist. But this may involve patching our CacheLoaders (easy) and clients patching their custom CacheLoaders (harder).

      The next best option depends on use case.

      2) If most nodes contain data then it is best to do a loader.get() followed by (maybe) a loader.exists().

      3) If most nodes are empty, then the current implementation of loader.exists() followed by loader.get() is better.
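
      To make the three options concrete, here is a rough sketch of each lookup strategy. It is only an illustration under assumptions: SimpleLoader, MapLoader and Lookup are hypothetical, simplified stand-ins keyed by a String (not the real CacheLoader interface or Fqn), and the in-memory loader mimics the current semantics where get() returns null for both a missing node and a node without data.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class LoaderLookupSketch {

    /** Hypothetical, simplified stand-in for a cache loader (not the real interface). */
    public interface SimpleLoader {
        boolean exists(String fqn);
        Map<String, Object> get(String fqn);
    }

    /** In-memory loader for illustration; a null value models a node that exists but has no data. */
    public static final class MapLoader implements SimpleLoader {
        public final Map<String, Map<String, Object>> nodes = new HashMap<>();

        public boolean exists(String fqn) {
            return nodes.containsKey(fqn);
        }

        public Map<String, Object> get(String fqn) {
            // Null for a missing node AND for a node stored without data.
            return nodes.get(fqn);
        }
    }

    /** Outcome of a lookup: whether the node exists, plus its (possibly empty) data. */
    public static final class Lookup {
        public final boolean nodeExists;
        public final Map<String, Object> data;

        Lookup(boolean nodeExists, Map<String, Object> data) {
            this.nodeExists = nodeExists;
            this.data = data == null ? Collections.<String, Object>emptyMap() : data;
        }
    }

    /** Option 1: assumes the stricter contract where null ALWAYS means "no such node". */
    public static Lookup strictContract(SimpleLoader loader, String fqn) {
        Map<String, Object> data = loader.get(fqn);
        return new Lookup(data != null, data);
    }

    /** Option 2: get() first; fall back to exists() only when get() returns null. */
    public static Lookup getThenExists(SimpleLoader loader, String fqn) {
        Map<String, Object> data = loader.get(fqn);
        if (data != null) {
            return new Lookup(true, data);
        }
        return new Lookup(loader.exists(fqn), null);
    }

    /** Option 3: exists() first, then get() only for nodes that exist. */
    public static Lookup existsThenGet(SimpleLoader loader, String fqn) {
        if (!loader.exists(fqn)) {
            return new Lookup(false, null);
        }
        return new Lookup(true, loader.get(fqn));
    }
}
```

      In this sketch, option 1 always costs one loader call. Option 2 costs one call for a node with data and two calls otherwise. Option 3 costs two calls for any node that exists and one call for a node that does not. The real code path may behave differently, so treat these counts as a property of the sketch only.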

      I have both implementations with me at the moment, along with a new test case (CacheLoaderMethodCallCounterTest) that counts how many times each cache loader method is called. That is how I determined the above.
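
      For anyone who wants to try a similar measurement, one way to count calls is a decorator around the loader. This is not the actual CacheLoaderMethodCallCounterTest, just a minimal sketch; SimpleLoader, CountingLoader and inMemory are hypothetical names, and the interface is a cut-down stand-in with String keys.

```java
import java.util.Map;

public class CallCountingSketch {

    /** Hypothetical, simplified stand-in for a cache loader (not the real interface). */
    public interface SimpleLoader {
        boolean exists(String fqn);
        Map<String, Object> get(String fqn);
    }

    /** Decorator that forwards every call to a delegate and counts calls per method. */
    public static final class CountingLoader implements SimpleLoader {
        private final SimpleLoader delegate;
        private int existsCalls;
        private int getCalls;

        public CountingLoader(SimpleLoader delegate) {
            this.delegate = delegate;
        }

        public boolean exists(String fqn) {
            existsCalls++;
            return delegate.exists(fqn);
        }

        public Map<String, Object> get(String fqn) {
            getCalls++;
            return delegate.get(fqn);
        }

        public int existsCalls() { return existsCalls; }
        public int getCalls() { return getCalls; }
        public int totalCalls() { return existsCalls + getCalls; }
    }

    /** Builds a trivial loader backed by the given map, for use in tests. */
    public static SimpleLoader inMemory(final Map<String, Map<String, Object>> nodes) {
        return new SimpleLoader() {
            public boolean exists(String fqn) {
                return nodes.containsKey(fqn);
            }

            public Map<String, Object> get(String fqn) {
                return nodes.get(fqn);
            }
        };
    }
}
```

      Wrapping the loader under test in CountingLoader and running the same set of lookups through each strategy gives a per-method call count to compare.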

      Does anyone have a feel for the most common use cases? I doubt many people have caches full of nodes without any data in them (case 3 above); I think nodes with data (case 2 above) are more common. I don't suppose there is any chance we could "enforce" a change in API semantics (case 1, most efficient), is there? :)