1 2 Previous Next 24 Replies Latest reply on Mar 1, 2006 11:51 PM by Ben Wang

    New version of CachedSetImpl

    Jussi Pyörre Newbie

      Hello,

      As I was running into severe performance problems with CachedSetImpl in TreeCacheAop, I wrote a new version with increased performance (can't remember the exact amount of improvement, but it was substantial anyway). This version has two main improvements:

      1. The older version used size() method several times in next() method of the IteratorImpl. Since size() is a quite heavy operation in CachedSetImpl, the performance was poor. New version doesn't use size() for iterations at all.

      2. The old version used ArrayList style indexing for node children names. This was only slowing things down. The new version uses the first available integer as a new child's name. For example, if first there were three children (0, 1 and 2) and '1' was destroyed, '0' and '2' would not get renamed. The next element addition would use '1' as its child name.

      As I am currently really busy, I unfortunately don't have time to get familiar with your ways of contributing to source code, so I'll just publish the new version here. So, here's the new CachedSetImpl:

      /*
       * JBoss, the OpenSource J2EE webOS
       *
       * Distributable under LGPL license.
       * See terms of license at gnu.org.
       */
      package org.jboss.cache.aop.collection;
      
      import org.jboss.cache.CacheException;
      import org.jboss.cache.Fqn;
      import org.jboss.cache.DataNode;
      import org.jboss.cache.aop.TreeCacheAop;
      import org.jboss.cache.aop.util.AopUtil;
      import org.jboss.util.NestedRuntimeException;
      
      import java.util.AbstractSet;
      import java.util.Collection;
      import java.util.Collections;
      import java.util.Iterator;
      import java.util.Map;
      import java.util.NoSuchElementException;
      import java.util.Set;
      
      /**
       * Set that uses cache as a underlying backend store
       *
       * @author Ben Wang, Jussi Pyörre
       */
      public class CachedSetImpl extends AbstractSet {
       protected TreeCacheAop cache;
      
       protected AbstractCollectionInterceptor interceptor;
      
       public CachedSetImpl(TreeCacheAop cache,
       AbstractCollectionInterceptor interceptor) {
       this.cache = cache;
       this.interceptor = interceptor;
       }
      
       private Collection keySet() {
       DataNode node = getNode();
      
       if (node != null) {
       Map children = node.getChildren();
      
       if (children != null) {
       return children.keySet();
       }
       }
      
       return Collections.EMPTY_SET;
       }
      
       private class IteratorImpl implements Iterator {
       private Iterator iterator;
      
       private Object key;
      
       private IteratorImpl(Collection keys) {
       iterator = keys.iterator();
       }
      
       private IteratorImpl() {
       iterator = CachedSetImpl.this.keySet().iterator();
       }
      
       public boolean hasNext() {
       return iterator.hasNext();
       }
      
       public Object next() throws NoSuchElementException {
       if (!this.hasNext()) {
       throw new NoSuchElementException();
       }
      
       this.key = iterator.next();
      
       try {
       return cache
       .getObject(AopUtil.constructFqn(getFqn(), this.key));
       } catch (CacheException e) {
       throw new RuntimeException(e);
       }
       }
      
       public void remove() throws IllegalStateException {
       if (this.key == null) {
       throw new IllegalStateException();
       }
      
       try {
       cache.removeObject(AopUtil.constructFqn(getFqn(), this.key));
       } catch (CacheException e) {
       throw new RuntimeException(e);
       }
       }
      
       }
      
       // protected static final Log log_=LogFactory.getLog(CachedSetImpl.class);
      
       protected DataNode getNode() {
       try {
       return cache.get(getFqn());
       } catch (Exception e) {
       throw new NestedRuntimeException(e);
       }
       }
      
       protected Fqn getFqn() {
       // Need this since fqn can be reset.
       return interceptor.getFqn();
       }
      
       public boolean add(Object o) {
       Collection keys = keySet();
      
       // This could be done with 'contains(o)' but it would invoke 'keySet()'
       // twice
       for (Iterator iter = new IteratorImpl(keys); iter.hasNext();)
       if (iter.next() == o)
       return false;
      
       // Search first available key. This is a fast operation as the key set
       // is already available
       int key = 0;
       while (keys.contains(Integer.toString(key)))
       key++;
      
       try {
       cache.putObject(AopUtil.constructFqn(getFqn(), Integer
       .toString(key)), o);
       } catch (CacheException e) {
       throw new RuntimeException(e);
       }
      
       return true;
       }
      
       public Iterator iterator() {
       return new IteratorImpl();
       }
      
       public int size() {
       return keySet().size();
       }
      
       public boolean equals(Object o) {
       if (o == null)
       return false;
      
       if (o == this)
       return true;
      
       try {
       Set set = (Set) o;
      
       return (set.size() == keySet().size() && this.containsAll(set));
       } catch (ClassCastException e) {
       return false;
       }
       }
      }
      
      


      It might need modification with older JDKs with rethrowing of catched exceptions (I use RuntimeException as you seem to throw NestedRuntimeException...). Anyway, that is a minor modification, if needed.

      Regards,

      Jussi Pyörre


        • 1. Re: New version of CachedSetImpl
          Brian Stansberry Master

          Hi Jussi,

          Thanks for the suggested improvements :-) I've created a JIRA issue to track your suggestion so it doesn't get lost in the forum. See

          http://jira.jboss.com/jira/browse/JBCACHE-394.

          Re: ways of contributing source code, the best is to create a diff to the current HEAD version of the file, create a "Patch" issue in JIRA, and attach the diff file there.

          You mention "severe performance problems". Do you have any more info about what you were doing and what you saw? I initially classified the JIRA issue as "Minor" as that's kind of standard for a performance improvement patch, but any more info you have would help to better prioritize the issue.

          Thanks again!

          • 2. Re: New version of CachedSetImpl
            Ben Wang Master

            Like Brian mentioned, tahnk you for the patch. We have been working on some optimization already actually (JBCACHE-241 and 182, specifically). So if you can do us one more favor of producing a patch based on head, that'd be great!

            -Ben

            • 3. Re: New version of CachedSetImpl
              Jussi Pyörre Newbie

               

              "bstansberry@jboss.com" wrote:

              You mention "severe performance problems". Do you have any more info about what you were doing and what you saw? I initially classified the JIRA issue as "Minor" as that's kind of standard for a performance improvement patch, but any more info you have would help to better prioritize the issue.


              I have to perform lots of searches on Sets which causes substantial use of iteration. I profiled my code and found out that most of the time was spent on size() method that calls getChildren() (size() is called twice in hasNext() and again once in next() ). The new version acquires node children only once when constructing a new Iterator and then utilizes them in iteration (no calls to getChildren() in hasNext() and next() ). This is very similar to your current implementation of CachedMapImpl.

              The change improved the performance of my program from about 70 seconds to 6 seconds. As my program is also performing other things than just iterating through Sets, I guess that the performance of CachedSetImpl must have improved at least by a factor of 20.

              BTW, have you considered implementing a new POJO Cache such that it would not use TreeCache as its backend? It seems that most of the time in TreeCacheAop is spent on updating the tree structure when some reference in cached POJO is updated... If you are considering such thing, I would gladly share my experiences during the specification.

              -Jussi-


              • 4. Re: New version of CachedSetImpl
                Brian Stansberry Master

                Sorry for the slow reply.

                No, AFAIK we're not thinking about moving away from a TreeCache-based approach, or at least if we are it's way back in the back of someone's mind. But can we get the benefit of your experiences anyway? :) We're definitely interested.

                BTW, reviewing your patch revealed a major bug in the existing implementation -- http://jira.jboss.com/jira/browse/JBCACHE-396. So thanks for that as well.

                I'll be applying changes that are pretty much your patch next week. Only semi-significant change is when the add method searches for a new key, it starts with the size of the existing key set rather than 0. This will leave gaps in the keys, but that doesn't matter in a Set.

                Looks like the List implementation could benefit from the same approach.

                • 5. Re: New version of CachedSetImpl
                  Scott Marlow Master

                  I compared the performance of this new implementation against what I checked in on December 5th 2005 and don't see much difference:

                  December 5th changes:
                  1. Calling add 2000 times in a loop took 39.816 seconds
                  2 Performing 2000 iterations in a loop took 39.329 seconds

                  With the new implementation changes:
                  1. Calling add 2000 times in a loop took 38.887 seconds
                  2. Performing 2000 iterations in a loop took 43.188 seconds

                  Prior to my changes on December 5th, performance was lower.

                  I do like the new implementation and thought that performance would be better than it was but it doesn't seem any better than the December 5th change.

                  If anyone would like the test case code, let me know (I'll check it in enventually, just not ready yet).

                  Scott

                  • 6. Re: New version of CachedSetImpl
                    Brian Stansberry Master

                    So for iterating over the set, the new code was actually slower! That's not good :(

                    • 7. Re: New version of CachedSetImpl
                      Scott Marlow Master

                      I would say that performance was about the same as the results almost always vary (I didn't repeat the test enough times to get an exact measure).

                      • 8. Re: New version of CachedSetImpl
                        Manik Surtani Master

                        What I tend to do when running such tests is to give it a few 'warm up' loops (perhaps 2000) for the JIT compiler to come up to speed - I've noticed this irons out inconsistencies. I usually then run the test with about 10,000 loops...

                        Either way though, even if there isn't a performance gain, I understand this actually fixes a pretty major bug?

                        • 9. Re: New version of CachedSetImpl
                          Scott Marlow Master

                          Yes, it definitely fixes a major bug and I have checked in the donated patch which resolves the bug.

                          I have some longer term performance changes that I'm working on for CacheSetImpl.

                          Before my longer term changes:

                          1. Calling add 2000 times in a loop took 39.816 seconds
                          2 Performing 2000 iterations in a loop took 39.329 seconds

                          After my longer term changes:
                          1. Calling add 2000 times in a loop took 0.757 seconds
                          2 Performing 2000 iterations in a loop took 0.429 seconds

                          • 10. Re: New version of CachedSetImpl
                            Scott Marlow Master

                            I have a newer implementation of CachedSetImpl that improves the performance for "add" + "iteration", however, I need to resolve the following questions before proceeding.

                            1. For Map, I think that we convert the key to String in one place but not the other.

                            CollectionClassHandler computes key value as "new Fqn(fqn, entry.getKey())"

                            CachedMapImpl computes key value as "AopUtil.constructFqn(getFqn(), key)"

                            2. I came up with a faster Set implementation, however, I'm using the "Set" values as keys and I'm not converting them to (serializable) String.

                            3. I modeled the new Set implementation after HashSet, but what if they pass a TreeSet. I'm not sure if we should change the collection support to be implementation class based (separate mappings for HashSet, TreeSet, LinkedHashSet rather than interface Set).

                            4. The issue with not converting the key value to String is for non-local case and the key value is not serializable. How should we deal with this?

                            • 11. Re: New version of CachedSetImpl
                              Ben Wang Master

                              1. This is an oversight. I have fixed that.

                              2. The requirement to use String as a key is to avoid the recursive loop in Map. It is probably ok to use Object in "Set"

                              3. We have discussed to provide different kind of implementations in the future.

                              4. In the case that Key is non-primitive, we shall require them to be Serializable. I think this is in line with regular POJO Cache where we require the POJO to be aspectizable (.e.g, with xml or annotation declaration), otherwise, it needs to be Serializable.

                              -Ben

                              • 12. Re: New version of CachedSetImpl
                                Scott Marlow Master

                                I took the latest source and ran the testsuite against my new CachedSetImpl and I'm getting a stack overflow. I need to adjust to the adjustment that you made (I'm guessing). :)

                                Caused by: java.lang.StackOverflowError
                                at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:394)
                                at java.lang.StringBuilder.append(StringBuilder.java:120)
                                at org.jboss.cache.aop.test.Address.toString(Address.java:59)
                                at java.lang.String.valueOf(String.java:2577)
                                at java.lang.StringBuffer.append(StringBuffer.java:220)
                                at org.jboss.cache.Fqn.toString(Fqn.java:160)
                                at java.lang.String.valueOf(String.java:2577)
                                at java.lang.StringBuilder.append(StringBuilder.java:116)
                                at org.jboss.cache.aop.CacheInterceptor.isPojoDetached(CacheInterceptor.java:171)
                                at org.jboss.cache.aop.CacheInterceptor.invoke(CacheInterceptor.java:77)
                                at org.jboss.aop.joinpoint.FieldReadInvocation.invokeNext(FieldReadInvocation.java:48)
                                at org.jboss.cache.aop.test.Address.street_r_$aop(Address.java)
                                at org.jboss.cache.aop.test.Address.getStreet(Address.java:29)
                                at org.jboss.cache.aop.test.Address.toString(Address.java:59)
                                at java.lang.String.valueOf(String.java:2577)
                                at java.lang.StringBuffer.append(StringBuffer.java:220)
                                at org.jboss.cache.Fqn.toString(Fqn.java:160)
                                at java.lang.String.valueOf(String.java:2577)
                                at java.lang.StringBuilder.append(StringBuilder.java:116)
                                at org.jboss.cache.aop.CacheInterceptor.isPojoDetached(CacheInterceptor.java:171)
                                at org.jboss.cache.aop.CacheInterceptor.invoke(CacheInterceptor.java:77)
                                at org.jboss.aop.joinpoint.FieldReadInvocation.invokeNext(FieldReadInvocation.java:48)
                                at org.jboss.cache.aop.test.Address.street_r_$aop(Address.java)
                                at org.jboss.cache.aop.test.Address.getStreet(Address.java:29)
                                at org.jboss.cache.aop.test.Address.toString(Address.java:59)
                                at java.lang.String.valueOf(String.java:2577)
                                at java.lang.StringBuffer.append(StringBuffer.java:220)
                                at org.jboss.cache.Fqn.toString(Fqn.java:160)
                                at java.lang.String.valueOf(String.java:2577)
                                at java.lang.StringBuilder.append(StringBuilder.java:116)
                                .
                                .
                                .

                                • 13. Re: New version of CachedSetImpl
                                  Scott Marlow Master

                                  No, looks like I was wrong, it wasn't your change. You only changed Map handling.

                                  • 14. Re: New version of CachedSetImpl
                                    Scott Marlow Master

                                    I'm past the stack overflow and on to the next error. :)

                                    1 2 Previous Next