7 Replies Latest reply on Sep 5, 2007 1:26 PM by jason.greene

    Space versus time, e.g. JBCACHE-1157

    genman

      JBCACHE-1157 changes the HashMap used in an UnversionedNode to a ConcurrentHashMap, which makes an empty node take about 1.3K to store, from about 150 bytes. To do this sort of change on 1.4 might affect people's runtime memory usage quite a bit. For 2.0, I don't know. A cache that has so much overhead sort of bothers me.

      Perhaps with a "global" NavigableMap (backported from 1.6), you could store all the data in a single flat map, with the following techniques:

      1. Node.put(k, v) -> NM.put(new Pair(Fqn, k), v)
      2. Node.getMap() -> NM.getRange(new Pair(Fqn, FirstKey), new Pair(Fqn, LastKey));

      The main downside is that all operations are of course slower, depending on how the Fqn is constructed. (A Fqn is much faster than Fqn).

        • 1. Re: Space versus time, e.g. JBCACHE-1157
          genman

          I meant,

          Fqn<Integer>
          versus
          Fqn<String
          . Which is why I don't want to make Fqn be only for strings...

          • 2. Re: Space versus time, e.g. JBCACHE-1157
            manik

            Eh? 1157 has nothing to do with a CHM in UnversionedNode. I did try using a CHM at one point, which did "hide" the issue again, but this posed other problems (inability to store nulls in the node).

            It's back to HashMap in UnversionedNode now, and the correct fix for 1157 is in the PessimisticLockInterceptor.

            • 3. Re: Space versus time, e.g. JBCACHE-1157
              manik

              Another approach is to create a data structure that dynamically changes based on how the node is populated.

              E.g., a CacheDataStructure may be a very light weight wrapper, which could create and hold a reference to a CHM only if the number of attributes crosses a threshold, and use simple object references for smaller numbers of attributes.

              I've seen a lot of use cases where a node holds very few (<3) attribs, often only one.

              • 4. Re: Space versus time, e.g. JBCACHE-1157
                manik

                See JBCACHE-1082

                • 5. Re: Space versus time, e.g. JBCACHE-1157
                  genman


                  Looking at the JIRA issue discussion, I can sort of see people are missing the point. The point is to save on space and mostly people talk of "time" performance improvements.

                  Along the lines of "space", there's a couple of changes I would suggest as well:
                  1. Don't keep the Fqn in the Node. Instead, keep a reference to the parent and only store the name. Build the Fqn when necessary. (Not sure this really saves tons of memory, but...)
                  2. Consolidate the boolean flags into a byte or short. Use some sort of utility class (with "enum" values) to set/check the bits of the byte/short.
                  3. Store the key/values/children in flat arrays.
                  4. Use lock striping, rather than a lock per node.

                  The problem with all of these, though, it really complicates the hell out of supporting the APIs that expose the data as a Set/Map/etc. And how do you manage concurrency?

                  What you need is some sort of obsessive/compulsive possibly autistic intern to come in and get this done. :-)

                  • 6. Re: Space versus time, e.g. JBCACHE-1157
                    jason.greene

                     

                    "genman" wrote:

                    4. Use lock striping, rather than a lock per node.


                    We are doing this as part of MVCC. Locks will be FQN based, with configurable stripping.

                    -Jason

                    • 7. Re: Space versus time, e.g. JBCACHE-1157
                      jason.greene

                       

                      "manik.surtani@jboss.com" wrote:
                      Eh? 1157 has nothing to do with a CHM in UnversionedNode. I did try using a CHM at one point, which did "hide" the issue again, but this posed other problems (inability to store nulls in the node).

                      It's back to HashMap in UnversionedNode now, and the correct fix for 1157 is in the PessimisticLockInterceptor.



                      I believe he was referring to the child node map which has to be a CHM, even with Pessimistic locking, since we do not lock the parent node for inserts / removes. We could alternatively use a synchronized map.

                      With MVCC we could restore parent node locking on insert / remove, and thus switch to a pain hashmap, since reads are non blocking and would no longer affect traversal.

                      -Jason