12 Replies Latest reply on Jan 11, 2006 11:02 PM by bdueck

    String representation for Fqn

    belaban

      I noticed we're caching the string representation of an Fqn when toString() is called:

      public String toString()
       {
       if(stringRepresentation == null)
       {
       StringBuffer sb=new StringBuffer();
       for(Iterator it=elements.iterator(); it.hasNext();)
       {
       sb.append(TreeCache.SEPARATOR).append(it.next());
       }
       stringRepresentation = sb.length() == 0 ? TreeCache.SEPARATOR : sb.toString();
       }
       return stringRepresentation;
       }
      


      I'm questioning whether this is good. Okay, we're gaining some performance, but we're also using more memory.
      I'm not sure we made a good decision here.
      Opinions ?

        • 1. Re: String representation for Fqn
          gozilla

          Maybe we can use a Soft or Weak reference then ?

          • 2. Re: String representation for Fqn
            belaban

            Maybe that's the best compromise between space and speed.

            • 3. Re: String representation for Fqn
              manik

              A weak ref may do the trick - except perhaps in the case of JDBC cache loaders where Fqn.toString() is called quite frequently.

              Perhaps it makes sense to cache the String representation of an Fqn in the JDBC cache loader itself, rather than in the Fqn class, since not much else in the codebase would use Fqn.toString() that much?

              And even then, we could use a WeakHashMap<Fqn, String> in the JDBC cache loader to hold the references.

              • 4. Re: String representation for Fqn
                belaban

                I switched to WeakReference. AFAIK the JDBCCacheLoader already uses strings (varchars) to hold the keys, which is one of its weaknesses.

                • 5. Re: String representation for Fqn

                  Let's study the performance impact first then. I can think of two places that use it: eviction policy (needs Sting representation of Fqn) and also some places in POJO cache (I think).

                  • 6. Re: String representation for Fqn
                    manik

                     

                    "bela@jboss.com" wrote:
                    AFAIK the JDBCCacheLoader already uses strings (varchars) to hold the keys, which is one of its weaknesses.


                    True - but every time you do a cacheloader.get(Fqn, ... ) or cacheloader.put(Fqn, ...) with a JDBC cache loader, the cache loader calls Fqn.toString() to create it's SQL calls. If the String representation is weak, the toString() operation may have to recalculate the String rep of the Fqn quite frequently, when the weak refs are gc'd.

                    Wouldn't a better approach be:

                    1. Fqn does *not* cache the string representation at all
                    2. The JDBCCacheLoader maintains a WeakHashMap<Fqn, String> and caches the string representations there.

                    This way,

                    * String references are not maintained unnecessarily and when not needed (such as when NOT using a JDBCCacheLoader)
                    * JDBCCacheLoader performance can be improved if it maintains it's own reference, since it won't need to regenerate String representations of Fqns when the weak references in the Fqn class get gc'd.

                    • 7. Re: String representation for Fqn
                      belaban

                      Actually, we use toString() *a lot*: every time we log something we usually log Fqns as well, so toString() is called.
                      WaekRefs are only GC'ed when the garbage collector runs (full GC afaik)

                      • 8. Re: String representation for Fqn
                        manik

                        when logging, that would only be in DEBUG and INFO messages, right? So most production setups running with WARN or above will not see much usage of Fqn.toString()

                        • 9. Re: String representation for Fqn

                          OK, I've done some testing

                          1. Test the assumption that caching the string form of an Fqn is of benefit.
                          2. Test various ways of efficiently caching the Fqn.

                          To do this, I ran test of my cache test benchmark which simulates how my application uses JBossCache

                          - All tests are run with 1 client thread (concurrency not relevant for this test) performing about 92% read operations and 8% write operations.
                          - Each operation is a test to create or retrieve 340 nodes from the cache.
                          - A total of 100,000 operations is performed running for about 2-2.5 minutes.
                          - The cache is running in local mode
                          - All tests were run using Java 1.4.2_08-b03 on Windows with the -hotspot and -server options and a value of 700mb for min and max heap size.
                          - I ran each test twice and got less than a 2% variation between them and took the average between the tests.

                          Here are the results of the various runs. To make it easier to interpret the results, I normalized all the test results to a percentage and used a value of 100% as the result I got from v1.16 of Fqn. So, if a test got a value higher than 100 (say 105) then it would be 5% faster than version 1.16 of Fqn.

                          v1.15 = 102
                          v1.15 with caching removed = 107
                          v1.15 with my proposed constructor optimizations to stringRepresentation = 107

                          v1.16 baseline = 100
                          v1.16 with my proposed optimizations to stringRepresentation = 98
                          v1.16 with changes to use String.intern() = 70

                          As you can see from above, one of the fastest two implementations is actually the one without any caching involved at all. I suspect the reason is because the VM is spending less time doing GC. If I'm correct, then the benefits of not caching the string representation will be much more pronounced on multi-CPU machines where GC cycles absolutely kill scaleability due to their inherent single-threadedness on most VMs.

                          So, my recommendation here is that we remove the caching logic from Fqn.

                          Thoughts?

                          Brian.

                          • 10. Re: String representation for Fqn
                            belaban

                            Excellent ! This shows that the caching was absolutely not necessary ! I never did like the fact the now we had, for each Fqn, an additional WeakRef plus a string !
                            So remove them !

                            • 11. Re: String representation for Fqn

                              Will do.

                              • 12. Re: String representation for Fqn

                                This change has been done.