1 2 Previous Next 29 Replies Latest reply on Aug 6, 2008 7:41 AM by manik

    JBoss Cache Public API

    manik

      So I've started on work on JBC 3.0.0. I don't intend to make any public API changes, the main new features are internal, but significant enough to warrant a major version update.

      But, if anyone can convince me of any reasons why the public API does need to change, now would be a good time to do it.

      Specifically, genericisation of the Cache interface - which happened in 2.0.0 - is not something I'm overly pleased with as I find it clunky. What do people think, should it stay or go?

      Ah, there is one public API that is due for a change though. Fqn. No more public constructors, all Fqn construction needs to go through factory methods (which have been in existence for some time now). This will help us optimise Fqns later as needed. Also, Fqn generics are going away. This has been more than just clunky, in fact downright error-prone. And without much value-add.

      Thoughts, comments?

        • 1. Re: JBoss Cache Public API
          genman

          Removing type parameters seems like a good idea.

          Something to consider would be to change the Node interface to return live references, e.g. "Cache.getNode(x).getData()" would return a Map whose operations affect the in-cache state. So a "getData().clear()" would clear the node data. I'm not sure how well this would work. It may be poorly performing.

          I created a "Caches" utility/factory object a long time ago that created java.util.Map instances backed by JBoss Cache and it'd be interesting if something like that were endorsed.

          I would suggest changing the CacheLoader interface.

          Remove these methods:

           Object put(Fqn name, Object key, Object value) throws Exception;
           Object remove(Fqn fqn, Object key) throws Exception;
          

          These are low-performing operations on many CacheLoaders implementations.

          Also I would remove
           void put(Fqn name, Map<Object, Object> attributes) throws Exception;
          

          and replace the put operation with:
           void replaceAll(Fqn name, Map<Object, Object> attributes) throws Exception;
          


          Remove methods that are redundant:
           void loadEntireState(ObjectOutputStream os) throws Exception;
           void storeEntireState(ObjectInputStream is) throws Exception;
          

          (Though their removal is probably not critical.)

          Also, I'd take a look at how Hibernate uses JBoss Cache 2. It seems there's a bunch of "util" methods that might make sense to carry over. One thing that comes to mind is the use case of storing a single item in a Node. Maybe create a "key" singleton object?

          • 2. Re: JBoss Cache Public API
            manik

             

            "genman" wrote:
            Removing type parameters seems like a good idea.


            Agreed, but I'd want to see to what degree this will affect backward compatibility. E.g., how many folk have written code to make use of this in 2.X.

            "genman" wrote:

            Something to consider would be to change the Node interface to return live references, e.g. "Cache.getNode(x).getData()" would return a Map whose operations affect the in-cache state. So a "getData().clear()" would clear the node data. I'm not sure how well this would work. It may be poorly performing.


            Could be poor performing if it has to be by way of a proxy map (which would then make calls up and down the interceptor chain to ensure locking, loading, etc.). But I like the "usability" implications.

            "genman" wrote:

            I created a "Caches" utility/factory object a long time ago that created java.util.Map instances backed by JBoss Cache and it'd be interesting if something like that were endorsed.


            It (or something very similar) certainly will be in 3.X. I think there is a lot of value in a "flat map" interface, especially since this plays into JSR107 nicely.

            "genman" wrote:

            I would suggest changing the CacheLoader interface.

            Remove these methods:
             Object put(Fqn name, Object key, Object value) throws Exception;
             Object remove(Fqn fqn, Object key) throws Exception;
            

            These are low-performing operations on many CacheLoaders implementations.

            Also I would remove
             void put(Fqn name, Map<Object, Object> attributes) throws Exception;
            

            and replace the put operation with:
             void replaceAll(Fqn name, Map<Object, Object> attributes) throws Exception;
            


            Remove methods that are redundant:
             void loadEntireState(ObjectOutputStream os) throws Exception;
             void storeEntireState(ObjectInputStream is) throws Exception;
            

            (Though their removal is probably not critical.)


            * put/remove and replaceAll - makes sense, but may be a case of "change for change's sake" since most cache loader impl's would actually do something like what you suggested internally anyway.
            * load/store state - this is needed to be able to stream state from one cache loader to another. Deserializing everything and then calling replaceAll() may be unnecessarily expensive if the CLs can have an internal mechanism of working with the streams directly.

            "genman" wrote:

            Also, I'd take a look at how Hibernate uses JBoss Cache 2. It seems there's a bunch of "util" methods that might make sense to carry over. One thing that comes to mind is the use case of storing a single item in a Node. Maybe create a "key" singleton object?


            Something like this already exists - JBCACHE-1082. Needs to be worked in to the codebase, properly profiled and optimised though.

            • 3. Re: JBoss Cache Public API
              genman

               

              "manik.surtani@jboss.com" wrote:

              * put/remove and replaceAll - makes sense, but may be a case of "change for change's sake" since most cache loader impl's would actually do something like what you suggested internally anyway.


              The problem with "putAll" (and "put" and "remove") is that they require a storage read for every call. If you can get them to perform okay, then I don't really care.

              When I worked on the Amazon S3 CacheLoader I felt like having to download data on the client side for those operations was unnecessary overhead--most of the time. If the local cache already had loaded the node data, then really all the local cache should do is perform the data modification on the local node map and do a "replaceAll".

              I suppose each CacheLoader could check if a node was loaded through the CacheSPI handle it has, but really this could be done uniformly in the CacheStoreInterceptor.

              You could always make a new interface in 3.0 like OptimizedCacheLoader with a "replaceAll" and detect it.

              "manik.surtani@jboss.com" wrote:

              "genman" wrote:

              Also, I'd take a look at how Hibernate uses JBoss Cache 2. It seems there's a bunch of "util" methods that might make sense to carry over. One thing that comes to mind is the use case of storing a single item in a Node. Maybe create a "key" singleton object?


              Something like this already exists - <a href="http://jira.jboss.org/jira/browse/JBCACHE-1082">JBCACHE-1082</a>. Needs to be worked in to the codebase, properly profiled and optimised though.


              It'd be nice if there was a canonical key object though. Something that could be optimally serialized and deserialized as a single byte using the existing CacheMarshaller. Maybe like have something like:

              public interface Node {
              
               private enum Key { KEY; }
              
               /** For nodes that store a single value, use this as the key */
               Object key = KEY;
              
               ... node methods here
              }
              


              "enum" is good for creating singleton instances.

              • 4. Re: JBoss Cache Public API
                manik

                 

                "genman" wrote:

                The problem with "putAll" (and "put" and "remove") is that they require a storage read for every call. If you can get them to perform okay, then I don't really care.


                Why? I agree that such calls should be optimised by the Interceptors not to do a load if the state needed is already in mem (for the return value of a put or remove), but beyond that, why would putAll, etc. do a storage read?

                "genman" wrote:

                "manik.surtani@jboss.com" wrote:

                Something like this already exists - <a href="http://jira.jboss.org/jira/browse/JBCACHE-1082">JBCACHE-1082</a>. Needs to be worked in to the codebase, properly profiled and optimised though.


                It'd be nice if there was a canonical key object though. Something that could be optimally serialized and deserialized as a single byte using the existing CacheMarshaller. Maybe like have something like:

                public interface Node {
                
                 private enum Key { KEY; }
                
                 /** For nodes that store a single value, use this as the key */
                 Object key = KEY;
                
                 ... node methods here
                }
                


                "enum" is good for creating singleton instances.


                Yup. Good thoughts to add to the relevant JIRA.


                • 5. Re: JBoss Cache Public API
                  genman

                   

                  "manik.surtani@jboss.com" wrote:
                  "genman" wrote:

                  The problem with "putAll" (and "put" and "remove") is that they require a storage read for every call. If you can get them to perform okay, then I don't really care.


                  Why? I agree that such calls should be optimised by the Interceptors not to do a load if the state needed is already in mem (for the return value of a put or remove), but beyond that, why would putAll, etc. do a storage read?


                  I don't see why they do, but they do.

                  For example, for FileCacheLoader, if erase is "false", which is called by "put(Fqn, Map)", then a read is made:
                   @Override
                   public void put(Fqn fqn, Map attributes, boolean erase) throws Exception
                   {
                   lock.acquireLock(fqn, true);
                   try
                   {
                   Map m = erase ? new HashMap() : loadAttributes(fqn);
                   if (m == null) m = new HashMap();
                   if (attributes != null)
                   {
                   m.putAll(attributes);
                   }
                   storeAttributes(fqn, m);
                   }
                   finally
                   {
                   lock.releaseLock(fqn);
                   }
                   }
                  


                  JDBC and all the other cache loaders are similar... I don't see why they should necessarily do a read for this operation.

                  • 6. Re: JBoss Cache Public API
                    manik

                    I think that is something we need to analyse in the interceptors and see if that can be done away with. No real reason to do a read here. From the cache loader javadoc for the put(Fqn, Map) method:

                    "javadoc" wrote:

                    Puts all entries of the map into the existing map of the given node, overwriting existing keys, but not clearing the existing map before insertion.


                    So this is all that should be done in the case of Cache.put(Fqn, Map).


                    • 7. Re: JBoss Cache Public API
                      genman

                      Filed the optimization issue as JBCACHE-1368

                      • 8. Re: JBoss Cache Public API
                        mircea.markus

                        Another method I don't find appropriate for public API is evict. I do not see an y reason for it to be publicly available, as it is rather internally used by eviction thread.
                        The only other possible usage I am aware of is in the case of eviction, when you want to use your own "eviction timer" - starting with 3.0 this will be handled through a plugable eviction timer (JBCACHE-1268).
                        So if no other use cases for using Cache.evict, I suggest deprecating it from the public API.

                        • 9. Re: JBoss Cache Public API
                          mircea.markus

                          curently there are 4 life cycle methods on the cache:
                          create, start, stop, destroy. Although these methods have a different semantics, merging them in two (start and stop) is simpler for the user to understand and use. Also the performance gain we bring by allowing one to reuse (re-start) a cache is not that big, start is doing all the hard word compared with create (object creation).

                          • 10. Re: JBoss Cache Public API
                            mircea.markus

                            move throws an exception if either source or destination does not exist. This is inconsistent with the rest of the API, as it is the only method that trows an exception if nodes(s) are inexistent. E.g. removeNode(fqn) will return a false if the node to be added was not found.
                            Here are some suggestions for changing this:
                            1) return a false if either source or destination does not exist (no-op)
                            2) treat it as a remove followed by an put: remove the node (return false if does not exist) and if destination does not exist create it (put semantics).
                            I vote for 1, mainly because 2 is hard to explain to the user

                            • 11. Re: JBoss Cache Public API
                              brian.stansberry

                               

                              Another method I don't find appropriate for public API is evict. I do not see an y reason for it to be publicly available, as it is rather internally used by eviction thread.


                              No! It makes total sense for a cache-based application to want to control memory management itself. The application often knows more about the overall state of the system than the cache does. The fact that JBC provides a feature whereby it can help the application control memory doesn't mean that users should be forced to use that mechanism.

                              • 12. Re: JBoss Cache Public API
                                brian.stansberry

                                Re: create, start, stop, destroy, yeah, looking into simplifying that makes sense. The 4 step lifecycle is kind of a legacy artifact of the way the old JBoss AS JMX microkernel worked.

                                • 13. Re: JBoss Cache Public API
                                  manik

                                  Re: lifecycle, don't people find it useful to create a cache, then set up stuff, such as register classloaders for regions, register listeners, etc., and then start?

                                  Or should create() really be an initial state that is attained by calling CacheFactory.createCache()?

                                  And then what about stop() and destroy()? Does it not make sense that a cache, once stopped, could be started again - which implies that internal components should not be cleaned up/released? In which case isn't a destroy() necessasry to actually release resources and clean up?

                                  • 14. Re: JBoss Cache Public API
                                    manik

                                    Re: evict, Brian is right. We shouldn't restrict applications from controlling eviction if they so wish.

                                    1 2 Previous Next