6 Replies Latest reply on Sep 16, 2009 5:34 AM by tom.baeyens

    Change caching strategy from nonstrict-read-write to transactional

    the_olo

      Hi!

      I'm opening this thread in relation to this JIRA issue: https://jira.jboss.org/jira/browse/JBPM-2518

      In general, after reading the excellent article about clustering jBPM (http://www.theserverside.com/tt/articles/content/WorkflowEngineJBossCluster/article.html) I came to the conclusion that clustering should not require the user to make such far-reaching modifications to jBPM itself.

      It seems that the changes needed to cluster jBPM effectively are:

      1) Change the Hibernate 2nd level caching strategy from nonstrict-read-write to transactional, which is supported by the clustered TreeCache.
      2) Change the Hibernate 2nd level cache implementation from HashtableCacheProvider to the clustered TreeCache.

      I think that the first change could be made in the source without any drawbacks (apart from a possible performance hit, in exchange for more correct behaviour).

      The second change should be possible just by using a different configuration file, and an example configuration file for clusters should be distributed with jBPM releases.
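
      To make the second change concrete, here is a minimal sketch of the Hibernate 3.x settings involved, written as plain Java properties so it runs without any dependencies. The class name ClusterCacheSettings and the build method are made up for illustration. The transaction manager lookup line assumes deployment inside JBoss AS, which the thread does not state explicitly.

```java
import java.util.Properties;

public class ClusterCacheSettings {

    /**
     * Builds Hibernate 3.x second level cache properties.
     * clustered = true selects TreeCache, false keeps the in-memory Hashtable cache.
     */
    public static Properties build(boolean clustered) {
        Properties p = new Properties();
        p.setProperty("hibernate.cache.use_second_level_cache", "true");
        if (clustered) {
            // Replicated cache provider shipped with Hibernate 3.x for JBoss TreeCache
            p.setProperty("hibernate.cache.provider_class",
                    "org.hibernate.cache.TreeCacheProvider");
            // Assumption: running in JBoss AS, so the transactional strategy
            // can find the JTA transaction manager
            p.setProperty("hibernate.transaction.manager_lookup_class",
                    "org.hibernate.transaction.JBossTransactionManagerLookup");
        } else {
            // Non-clustered provider named in this thread
            p.setProperty("hibernate.cache.provider_class",
                    "org.hibernate.cache.HashtableCacheProvider");
        }
        return p;
    }

    public static void main(String[] args) {
        // Print the clustered variant; iteration order is not guaranteed
        build(true).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

      In a real deployment the same keys would more likely appear as property entries in the Hibernate configuration file. The caching strategy itself (nonstrict-read-write vs. transactional) is set on the mapped classes and collections, not here.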

      BTW I've looked at changes between jbpm3 and jbpm4 SVN branches and I think this still applies to both. Correct me if I'm wrong.

      BTW2 This also applies to jBPM-BPEL although its development seems to have stagnated.

        • 1. Re: Change caching strategy from nonstrict-read-write to tra
          tom.baeyens

          i didn't read the article in that much detail.

          but changing caching strategy from nonstrict-read-write to transactional doesn't seem to be necessary to me.

          in jBPM 3, we only cached process DEFINITION data. which is assumed to be static in the DB. so the idea is that this can be cached in read only mode. the reason why we used nonstrict-read-write instead of read-only is to allow for new process definitions to be deployed (read: inserted)

          in jBPM 4, the process definitions are cached in memory by jBPM itself. so there we don't even have hibernate second level cache configurations at all.

          we didn't specify 2nd level cache configurations on the runtime data. not in jBPM 3 and not in jBPM 4. that is a topic we could explore. but it doesn't have a priority for us at this time. in that case i guess the cache must be configured as transactional as the runtime data is not read only (of course)

          • 2. Re: Change caching strategy from nonstrict-read-write to tra
            the_olo

             

            "tom.baeyens@jboss.com" wrote:
            changing caching strategy from nonstrict-read-write to transactional doesn't seem to be necessary to me.

            in jBPM 3, we only cached process DEFINITION data. which is assumed to be static in the DB. so the idea is that this can be cached in read only mode. the reason why we used nonstrict-read-write instead of read-only is to allow for new process definitions to be deployed (read: inserted)


            That is exactly why we need a write-capable cache that is consistent cluster-wide.

            We're going to hot-redeploy business processes. Otherwise, what's the point of a whole business process engine, if we could just as well write some logic and deploy a new EAR? (OK, OK, a ready-made framework for keeping the state of long-running processes, the process abstraction it imposes and other features are good to have, but hot deployment is the killer feature for many.)

            I've investigated the issue a bit further and it turns out that the problem lies in a lack of coordination between jBPM-jPDL and jBPM-BPEL development, coupled with a Hibernate bug and the general quirks of the Hibernate 2nd level cache.

            The jBPM-BPEL 1.1.GA release embedded jBPM-jPDL version 3.2.2, which kept all of its cache configuration in .hbm.xml files.

            The per-class HBM files in jbpm-jpdl.jar, as bundled with jBPM-BPEL, had the cache usage hardcoded to nonstrict-read-write.

            This is the caching strategy that works with all 2nd level cache implementations apart from JBoss's TreeCache and JBoss Cache 2 (http://docs.jboss.org/hibernate/stable/core/reference/en/html/performance.html#performance-cache-compat-matrix).

            It seems that between versions 3.2.2 and 3.2.3, the jBPM-jPDL developers decided to centralize the second level cache configuration: they removed it from the individual HBM files and placed it in the Hibernate config file (config/hibernate.cfg.xml), after the mapping section.

            I can't point to a URL for the exact distributed file (you can download the jbpm-jpdl-3.2.3 release and have a look yourself), but an approximate version can be seen here, since it has been embedded in Seam:
            http://svn.apache.org/repos/asf/myfaces/tobago/trunk/example/seam/src/main/resources/hibernate.cfg.xml

            As you can see, the file contains the mapping declarations, sourced from individual HBM files, then goes on to specify cache settings for the mapped classes and collections.

            Now, two things went wrong here:

            1) It seems that Alejandro Guizar, when incorporating jbpm-jpdl-3.2.3 into the next release, jbpm-bpel-1.1.1 (https://jira.jboss.org/jira/browse/BPEL-297), didn't notice the caching change. As a result, jbpm-bpel-1.1.1 runs with no caching at all for jPDL entities. We observed a ten-fold performance drop when upgrading from jbpm-bpel-1.1.GA to jbpm-bpel-1.1.1. Process execution speed dropped through the floor.

            Only after adding the cache tags back to the HBM files in jbpm-jpdl.jar (one by one, since the HBM files for jPDL 3.2.3 contained some vital modifications unrelated to the 2nd level cache) did we get the engine's performance back to normal.

            One might think that it would be a lot easier to simply copy the centralized cache settings from the jbpm-jpdl-3.2.3 hibernate.cfg.xml to jbpm.hibernate.cfg.xml in jbpm-bpel-1.1.1. It's not that simple, unfortunately, because of problem no. 2:

            2) Due to a Hibernate bug (http://opensource.atlassian.com/projects/hibernate/browse/HHH-2808 - rejected BTW by Hibernate devs), one cannot specify collection cache settings in the Hibernate session factory configuration file.

            The elements are there in the DTD (see http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd), but using them on a collection mapped in a subclass simply results in an exception: "org.hibernate.MappingException: Cannot cache an unknown collection".

            So the example configuration distributed with jbpm-jpdl-3.2.3 won't work for jbpm-bpel-1.1.1 anyway, and unless issue HHH-2808 gets fixed, one has to resort to modifying the jPDL HBM files by hand.

            This also makes clustering the business process engine much harder. To guarantee consistency of hot-deployed business processes throughout the cluster, one needs a replicated second level cache, which requires changing the caching strategy from nonstrict-read-write to transactional. Doing this across dozens of .hbm.xml files packed in a .jar is far from ideal.

            The method we employed was to unpack the jar, run a find|perl one-liner that did the substitution, then jar it up again.
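
            The substitution step can be sketched in Java as follows. This is not the original one-liner, which isn't reproduced in this thread, only an approximation of what it did. The class name CacheUsageRewriter is made up, the files are assumed to be UTF-8, and the jar is assumed to be already unpacked into the directory passed as the first argument.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CacheUsageRewriter {

    // Matches the usage attribute of a <cache .../> element and captures
    // the text before and after the value so only the value is replaced.
    private static final Pattern CACHE_USAGE = Pattern.compile(
            "(<cache\\b[^>]*\\busage\\s*=\\s*[\"'])nonstrict-read-write([\"'])");

    /** Returns the mapping XML with every nonstrict-read-write cache switched to transactional. */
    public static String rewrite(String xml) {
        return CACHE_USAGE.matcher(xml).replaceAll("$1transactional$2");
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> mappings;
        // Collect all *.hbm.xml files under the unpacked jar directory
        try (Stream<Path> walk = Files.walk(root)) {
            mappings = walk.filter(p -> p.toString().endsWith(".hbm.xml"))
                           .collect(Collectors.toList());
        }
        for (Path file : mappings) {
            String before = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            String after = rewrite(before);
            if (!after.equals(before)) {
                // Only touch files that actually contained the old strategy
                Files.write(file, after.getBytes(StandardCharsets.UTF_8));
                System.out.println("rewrote " + file);
            }
        }
    }
}
```

            Restricting the match to cache elements keeps any other attribute that happens to contain the same text untouched. Unpacking and re-packing the jar stay as manual steps.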

            "tom.baeyens@jboss.com" wrote:
            in jBPM 4, the process definitions are cached in memory by jBPM itself. so there we don't even have hibernate second level cache configurations at all.

            we didn't specify 2nd level cache configurations on the runtime data. not in jBPM 3 and not in jBPM 4. that is a topic we could explore. but it doesn't have a priority for us at this time. in that case i guess the cache must be configured as transactional as the runtime data is not read only (of course)


            Based on our testing, we can say that properly caching the definition data helps a lot: it improves performance by an order of magnitude.
            A cache for runtime data might help too, but first and foremost, I don't think that definition data can be treated as read-only, since the point of a business process engine is to gain flexibility in a world of inevitable change, and that change affects the business processes themselves.

            We cannot treat processes as static data.


            • 3. Re: Change caching strategy from nonstrict-read-write to tra
              the_olo

              BTW, I haven't tried jBPM 4 yet, but as I understand it, it implements its own process definition cache instead of relying on Hibernate's.

              In this situation, how do you handle clustering of the jBPM engine?

              Most interestingly, how is cache consistency maintained in a cluster?

              • 4. Re: Change caching strategy from nonstrict-read-write to tra
                tom.baeyens

                Aleksander,

                jBPM itself always relies on the DB for concurrency and transaction handling.

                If jBPM caches something, it means that information is considered static. Process definitions are static.

                When a new version of a process definition is deployed, it is stored NEXT to the old version. Both are kept in the DB. so the old version doesn't change.

                That is why jBPM can cache process definitions in memory.

                Whenever a process definition is being used, it is verified that it still exists in the DB. If not, it is removed from the cache.
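
                The scheme described above can be sketched roughly as follows. This is not jBPM 4 code: DefinitionCacheSketch, DefinitionStore and InMemoryStore are hypothetical names, the store is an in-memory stand-in for the DB, and definitions are plain strings keyed by a versioned id such as "order-1" and "order-2".

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class DefinitionCacheSketch {

    /** Hypothetical stand-in for the database; not a jBPM interface. */
    public interface DefinitionStore {
        boolean exists(String versionedId);
        Optional<String> load(String versionedId);
    }

    /** In-memory fake of the DB, only for trying the sketch out. */
    public static class InMemoryStore implements DefinitionStore {
        private final Map<String, String> rows = new ConcurrentHashMap<>();
        public void deploy(String versionedId, String definition) { rows.put(versionedId, definition); }
        public void delete(String versionedId) { rows.remove(versionedId); }
        @Override public boolean exists(String versionedId) { return rows.containsKey(versionedId); }
        @Override public Optional<String> load(String versionedId) {
            return Optional.ofNullable(rows.get(versionedId));
        }
    }

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final DefinitionStore store;

    public DefinitionCacheSketch(DefinitionStore store) { this.store = store; }

    /** Returns the definition, first verifying that it still exists in the store. */
    public Optional<String> get(String versionedId) {
        if (!store.exists(versionedId)) {
            cache.remove(versionedId);          // gone from the DB: evict
            return Optional.empty();
        }
        String cached = cache.get(versionedId);
        if (cached != null) {
            return Optional.of(cached);         // a deployed version never changes, so this is safe
        }
        Optional<String> loaded = store.load(versionedId);
        loaded.ifPresent(def -> cache.put(versionedId, def));
        return loaded;
    }

    public boolean isCached(String versionedId) { return cache.containsKey(versionedId); }

    public static void main(String[] args) {
        InMemoryStore db = new InMemoryStore();
        DefinitionCacheSketch definitions = new DefinitionCacheSketch(db);
        db.deploy("order-1", "<process v1>");
        db.deploy("order-2", "<process v2>");   // new version is stored next to the old one
        System.out.println(definitions.get("order-1").isPresent()); // true
        db.delete("order-1");
        System.out.println(definitions.get("order-1").isPresent()); // false
        System.out.println(definitions.isCached("order-1"));        // false
    }
}
```

                Because each node only checks existence against the shared DB, no cache replication between cluster nodes is needed in this sketch. The cost is one existence lookup per use.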

                hth

                • 5. Re: Change caching strategy from nonstrict-read-write to tra
                  the_olo

                  So I suppose, when (if) jBPM-BPEL makes the move to jBPM 4, the clustering and caching issue will be fundamentally solved?

                  Are you aware of any plans (a roadmap, perhaps) for jBPM-BPEL? Is there active development going on in that direction?

                  • 6. Re: Change caching strategy from nonstrict-read-write to tra
                    tom.baeyens

                     

                    "the_olo" wrote:
                    So I suppose, when (if) jBPM-BPEL makes the move to jBPM 4, the clustering and caching issue will be fundamentally solved?


                    yes

                    "the_olo" wrote:
                    Are you aware of any plans (a roadmap, perhaps) for jBPM-BPEL? Is there active development going on in that direction?


                    there was an initiative at Bull. but i'm not sure if they are still actively working on that. if we ever revive BPEL on jBPM, that is a good starting point. but in the near future, we won't have time to look into that. that will probably be mid-to-long term.

                    in the meantime, the jboss products will include riftsaw (based on apache ode)