8 Replies Latest reply on May 2, 2007 10:28 AM by Edward Staub

    ProcessDefinition in 2nd level cache

    Andriy Gryn Newbie

      Hello,

      Process definitions are cached in the 2nd level cache, and nearly all components of a process definition are lazily loaded. Process definitions are accessed by many threads and are read from the 2nd level cache after being only partially loaded from the database. I see the following problem:
      If we keep the Hibernate session that loaded the ProcessDefinition instance open, then sooner or later it will be accessed by multiple threads simultaneously, and according to the documentation a Hibernate session is not thread safe. Reattaching the ProcessDefinition instance to a new session leads to the same problem.
      If we close the session, then we get a lazy loading exception. Does anyone have experience addressing this problem?

      Thank you.
      Andrey.

        • 1. Re: ProcessDefinition in 2nd level cache
          Ronald van Kuijk Master

          imo (but I'm no expert on databases/hibernate etc...) simultaneous read-only access to process definitions should not be a problem.

          • 3. Re: ProcessDefinition in 2nd level cache
            Ronald van Kuijk Master

            I'm kind of aware of these, but what I meant to say is that there probably is no problem with accessing read-only data in a second-level cache from a non-thread-safe session, is there?

            • 4. Re: ProcessDefinition in 2nd level cache
              Andriy Gryn Newbie

              Hi Ronald,

              Read-only data doesn't mean a read-only Session instance. The Session caches all data it reads internally (1st level cache), so each select may change its state. I'm not sure whether this also applies to objects that are cached in the 2nd level cache, but in any case I would follow the recommendation in the Hibernate documentation not to use one session from several threads. Even if this concurrent access works without problems now (and I don't expect it does), there is no guarantee that it will keep working with future versions.
              Besides, you usually do not access the process definition directly. You start from a process instance, or in my case even from another object that contains the process instance. The process definition (there can of course be many of them) is then loaded transparently through the session that loaded my main object, and keeping all these sessions open is problematic too. Having used Hibernate intensively for the last 3 years, I believe that lazily loaded objects in the 2nd level cache are a problem. At least I have no idea how to deal with it. Maybe someone from jBPM development can explain what usage scenario they had in mind.

              Thanks. Andrey.
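
Editor's illustration of the point above that read-only data does not make the session itself read-only. This is a toy model, not Hibernate's implementation: ToySession and its methods are hypothetical names, and the "database" is simulated. It only shows that a lookup which looks like a pure read still writes to the session's internal first-level cache, which is why sharing one instance between threads without synchronization is unsafe.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a session with a first-level cache (hypothetical, not Hibernate code). */
class ToySession {
    // Plain HashMap on purpose: unsynchronized, like any non-thread-safe internal cache.
    private final Map<Long, String> firstLevelCache = new HashMap<>();
    private int databaseHits = 0;

    /** Looks like a read to the caller, but mutates the session on a cache miss. */
    String load(long id) {
        String cached = firstLevelCache.get(id);
        if (cached != null) {
            return cached;
        }
        databaseHits++;                          // simulated database query
        String loaded = "definition-" + id;
        firstLevelCache.put(id, loaded);         // the "read" writes to session state
        return loaded;
    }

    int cachedCount() {
        return firstLevelCache.size();
    }

    int databaseHits() {
        return databaseHits;
    }
}
```

Two threads calling load() on the same ToySession at the same time would both be modifying the HashMap, which is exactly the kind of access the thread is warning about.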

              • 5. Re: ProcessDefinition in 2nd level cache
                Tom Baeyens Master

                indeed. only use a session in 1 thread.

                • 6. Re: ProcessDefinition in 2nd level cache
                  Edward Staub Expert

                  Tom, Andrey,

                  Either I misunderstand, or Tom did.

                  Tom, as I understand it, Andrey is reporting a latent bug in JBPM's definition cache-loading.

                  Andrey, is that correct?

                  Given that the frequency of failure will vary from database to database, and from one Hibernate version to the next, it may not be possible to create a unit test that reliably demonstrates this. As in many low-frequency multithreading problems, analysis is the first line of defense - not tests.

                  -Ed Staub

                  • 7. Re: ProcessDefinition in 2nd level cache
                    Andriy Gryn Newbie

                    Ed, Tom,

                    Yes, I do think it is a problem.
                    In nearly any real-life application the process definition has to be accessed from many threads, each working with its own process instance. For simplicity, let's assume that they are all based on the same process definition and that the process instances are at different stages of the process. As soon as you send a signal to a process instance, jBPM will try to access the corresponding process definition internally. The process definition will then have to read some additional data from the database using the Hibernate session it is attached to. With multiple process instances processed in different threads, sooner or later the same Hibernate session will be used to read different parts (or even the same part) of the process definition in 2 or more threads.
                    I think it is clear that it is not feasible to synchronize all accesses to the process instances.
                    In my case I wrote some generic code that recursively initializes all the associations of a process definition whenever that process definition is loaded for the first time. I just didn't want to modify the jBPM hbm files. But I do not see a reason to have lazy loading for the process definition, considering the problems it causes, at least for the cases that I would consider mainstream: few process definitions and many process instances (each instance requiring some part of the process definition for further processing).

                    Andrey.
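
Editor's sketch of the kind of "generic code that recursively initializes all the associations" described above. The original code was not posted, so everything here is an assumption: GraphInitializer and DemoNode are hypothetical names, and the walk uses plain reflection. The per-object action is passed in as a callback; in a Hibernate application one might pass a reference to org.hibernate.Hibernate.initialize there, but real Hibernate proxies need more care than this sketch attempts.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Walks an object graph once and applies an initializer to every reachable object. */
class GraphInitializer {
    private final Consumer<Object> initializer;
    // Identity-based set so cycles (child -> parent -> child) terminate.
    private final Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());

    GraphInitializer(Consumer<Object> initializer) {
        this.initializer = initializer;
    }

    void initializeDeep(Object node) {
        if (node == null || !visited.add(node)) {
            return; // nothing to do, or already handled
        }
        initializer.accept(node);
        if (node instanceof Collection) {
            for (Object element : (Collection<?>) node) {
                initializeDeep(element);
            }
        } else if (node instanceof Map) {
            for (Object value : ((Map<?, ?>) node).values()) {
                initializeDeep(value);
            }
        } else {
            // Reflect only over application classes; JDK classes are left alone.
            for (Class<?> c = node.getClass();
                 c != null && !c.getName().startsWith("java.");
                 c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    if (Modifier.isStatic(field.getModifiers()) || field.getType().isPrimitive()) {
                        continue;
                    }
                    try {
                        field.setAccessible(true);
                        initializeDeep(field.get(node));
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
    }
}

/** Hypothetical stand-in for a definition element with a back-reference (a cycle). */
class DemoNode {
    final String name;
    final List<DemoNode> children = new ArrayList<>();
    DemoNode parent;

    DemoNode(String name) {
        this.name = name;
    }
}
```

The idea is to run such a walk once, in the thread and session that first load the definition, so that later readers never trigger lazy loading through a shared session.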

                    • 8. Re: ProcessDefinition in 2nd level cache
                      Edward Staub Expert

                      Tom,

                      The exact same design issue is in last week's commit (rev 1.2) to JpdlParser.java. SaxParserFactory is not threadsafe.

                      Either each thread needs its own SaxParserFactory (typically hung off a ThreadLocal),

                      ... or the newInstance() call in SaxParserFactory.createXmlReader() needs to be synchronized on a static class member or SaxParserFactory.class.

                      In this case, since contention will be so rare, just adding synchronized (...JpdlParser.class) seems fine.

                      -Ed Staub
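
Editor's sketch of the two options proposed above, assuming the post refers to JAXP's javax.xml.parsers.SAXParserFactory. The actual JpdlParser code is not shown in this thread, so XmlReaders and its method names are hypothetical and only illustrate the two locking strategies: a factory per thread held in a ThreadLocal, or a single shared factory whose use is guarded by a class-level lock.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

/** Hypothetical helper; not the jBPM JpdlParser source. */
class XmlReaders {
    // Option 1: each thread lazily gets its own factory, so no locking is needed.
    private static final ThreadLocal<SAXParserFactory> PER_THREAD_FACTORY =
        ThreadLocal.withInitial(() -> {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory;
        });

    static XMLReader createPerThread() throws ParserConfigurationException, SAXException {
        return PER_THREAD_FACTORY.get().newSAXParser().getXMLReader();
    }

    // Option 2: one shared factory; every use goes through the class lock.
    private static final SAXParserFactory SHARED_FACTORY = SAXParserFactory.newInstance();

    static XMLReader createSynchronized() throws ParserConfigurationException, SAXException {
        synchronized (XmlReaders.class) {
            return SHARED_FACTORY.newSAXParser().getXMLReader();
        }
    }
}
```

In both variants only the factory is protected. Each call returns a fresh XMLReader, which should then stay with the thread that created it.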