10 Replies Latest reply on Feb 18, 2009 3:37 PM by heiko.braun

    process definition as a resource

    tom.baeyens

      jbpm users typically deploy the process to the db and then store the instances in the DB as well.

      we also supported non-persisted execution of processes to some extent.

      after thinking this through, there are basically 6 execution modes. see next comment in this thread for a reference to the presentation on execution modes. 2 of those 6 modes are now in the test suite: dynamic persistent and memory

      but i want to discuss persistent resource. In that execution mode, the process definition is loaded from a resource on the classpath. the process executions still live in the DB and the history can also be captured in the DB.

      in order to achieve this execution mode, we must change the references from the execution datamodel to the process definition datamodel. the references cannot be based on foreign keys, but they must be based on just the plain activity name.
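To make the idea of text-based references concrete, here is a minimal sketch. It assumes two hypothetical classes, `ProcessDefinitionSketch` and `ExecutionSketch`, that are not part of jBPM. The execution keeps only the process name and the activity name as plain strings, and resolves them against whatever definition was loaded from the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a process definition parsed from a classpath resource.
final class ProcessDefinitionSketch {
    private final String name;
    // activity name -> activity description, in definition order
    private final Map<String, String> activities = new LinkedHashMap<>();

    ProcessDefinitionSketch(String name) {
        this.name = name;
    }

    ProcessDefinitionSketch addActivity(String activityName, String description) {
        activities.put(activityName, description);
        return this;
    }

    String getName() {
        return name;
    }

    // Resolves a text reference; fails loudly if the resource no longer contains the activity.
    String findActivity(String activityName) {
        String description = activities.get(activityName);
        if (description == null) {
            throw new IllegalStateException(
                "activity '" + activityName + "' not found in process '" + name + "'");
        }
        return description;
    }
}

// Hypothetical persisted execution: both references would be plain text columns, not foreign keys.
final class ExecutionSketch {
    private final String processDefinitionName;
    private final String activityName;

    ExecutionSketch(String processDefinitionName, String activityName) {
        this.processDefinitionName = processDefinitionName;
        this.activityName = activityName;
    }

    // Looks up the current activity by name in the definition that was loaded from a resource.
    String resolveCurrentActivity(ProcessDefinitionSketch definition) {
        if (!processDefinitionName.equals(definition.getName())) {
            throw new IllegalArgumentException(
                "execution belongs to process '" + processDefinitionName + "'");
        }
        return definition.findActivity(activityName);
    }
}
```

The trade-off shows up in `findActivity`: with foreign keys the database guarantees the target exists, while with names the check only happens at lookup time.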

      in a lot of situations, i think this would be the most appropriate/optimal way to deploy and run processes.

      some questions:

      in the persistent dynamic execution mode, do we want to keep using the foreign key mappings ?

      or would it be possible to transition to text-based activityName references as well ?

      in the latter case, we would be able to keep the same hibernate mapping files for both execution modes. if we can find a good solution for text-based activityName references, then supporting the 'persistent process resource' execution mode becomes feasible.

      outside of the current activity pointer, there are a number of other foreign key references in the execution datamodel. we would have to make all of those text-based.

      another aspect that becomes more complicated in 'persistent process resource' execution mode is deployment in a clustered environment. when using the DB to store process definitions, this is the single entry point accessible from all nodes. but how would we deploy a resource to all nodes in a cluster ? and how would the process definitions be cached in memory ?

      ok... my thoughts are not yet clear and structured. but i wanted to start this discussion. cause if we want to include the persistent process resource mode for GA, then we need to discuss now and start working on it in the next iteration.

        • 1. Re: process definition as a resource
          tom.baeyens

          Here are the images that should explain the execution modes: http://www.jboss.org/community/docs/DOC-13340

          • 2. Re: process definition as a resource
            kukeltje

             

            "tom.baeyens@jboss.com" wrote:
            jbpm users typically deploy the process to the db and then store the instances in the DB as well.

            I think most do not really do this 'on purpose'. It's how it works.

            "tom.baeyens@jboss.com" wrote:

            another aspect that becomes more complicated in 'persistent process resource' execution mode is deployment in a clustered environment. when using the DB to store process definitions, this is the single entry point accessible from all nodes. but how would we deploy a resource to all nodes in a cluster ? and how would the process definitions be cached in memory ?


            Uhhmm.... JBoss Cache?

            Funny thing is that I kind of semi-use jBPM in this resource way. I extend the processdefinition with my own xml tags, for which I do not develop a specific jpdl parser or whatever. At runtime, I read the xml nodes with xpath from the processdefinition in the processarchive (I will also show this in the presentation on the community day).
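Roughly, reading a custom tag with XPath could look like the sketch below. It uses only the standard JDK XML APIs; the class name `CustomTagReader`, the `my-sla` element, and its `hours` attribute are made up for illustration and are not jPDL.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper: evaluates an XPath expression against process definition XML.
final class CustomTagReader {

    static String readCustomValue(String processXml, String expression) {
        try {
            // Parse the process definition XML into a DOM document.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(processXml.getBytes(StandardCharsets.UTF_8)));
            // Evaluate the expression and return its string value.
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not read custom tag: " + expression, e);
        }
    }
}
```

For example, given `<process-definition name='order'><state name='review'><my-sla hours='24'/></state></process-definition>`, the expression `/process-definition/state[@name='review']/my-sla/@hours` yields the string `24`.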

            I'll think about the other things though. I don't fully grasp all the differences yet, but that might be because it is too late (04:00 am).

            • 3. Re: process definition as a resource
              heiko.braun

              I am very much in favor of what you call "Persistent Process Resource", which basically means just storing execution state and history in the DB.
              Anything else, and most importantly, all class resources should be retrieved through the classloader. This already answers your question regarding the clustering: Leave it to the deployment/classloading framework of the AS.

              But in general we should aim at the plain non clustered AS integration first
              and then worry about clustering when we need it. You can easily relay questions like this if you leverage the AS infrastructure to a large degree.

              I do actually have uncommitted AS integration code on my disk that installs the process engine as a service and adds the corresponding deployers, so that you can drop *.par archives into the deploy folder. Quite similar to what Bernd described in his blog, just for AS 5.


              I would say, to begin with we should aim at

              1) Persistent Process Resource as execution mode inside the AS
              2) Add a service and deployer
              3) Fix the classloading (classloader registry, reassociation upon reboot)
              4) Deal with different versioning strategies upon deployment and server boot.

              We need to get this sorted before diving into clustering topics.

              • 4. Re: process definition as a resource
                tom.baeyens

                 

                "heiko.braun@jboss.com" wrote:
                I am very much in favor of what you call "Persistent Process Resource", which basically means just storing execution state and history in the DB.
                Anything else, and most importantly, all class resources should be retrieved through the classloader. This already answers your question regarding the clustering: Leave it to the deployment/classloading framework of the AS.


                downside is that this would not support process versioning as we're used to (at least not without hacks). maybe we should consider supporting this as a separate execution mode.

                so if we assume that versioning is done by the client in the process file, then this would really fit the jboss deployer architecture. (finally !)

                "heiko.braun@jboss.com" wrote:
                But in general we should aim at the plain non clustered AS integration first
                and then worry about clustering when we need it. You can easily relay questions like this if you leverage the AS infrastructure to a large degree.


                if possible, we should look at the DB for clustering. if we stick with that, then our clustering works in any environment. so those types of solutions are preferable.

                but on top of that, we can always look at how we can improve jboss specific clustering. (like described above)

                "heiko.braun@jboss.com" wrote:
                I do actually have uncommitted AS integration code on my disk that installs the process engine as a service and adds the corresponding deployers, so that you can drop *.par archives into the deploy folder. Quite similar to what Bernd described in his blog, just for AS 5.


                loading versioned classes leads to really complex solutions. and i'm not sure if those tricky things are justified.

                if a user deploys a new version of a process, it's always possible to update the classnames that are referenced and append _V2 or something to the classname. users can always reference new classes by pointing to different classnames. and then we don't have to come up with the very complex classloading to load similarly named classes in different process definition scopes.

                "heiko.braun@jboss.com" wrote:
                I would say, to begin with we should aim at

                1) Persistent Process Resource as execution mode inside the AS
                2) Add a service and deployer
                3) Fix the classloading (classloader registry, reassociation upon reboot)
                4) Deal with different versioning strategies upon deployment and server boot.

                We need to get this sorted before diving into clustering topics.


                you're forgetting the basic BPM Suite use case: if you want the process versioning to work on every platform to give a no-coding, point-and-click-only demo, then you need the dynamic persistent use case. so that should be 0). for the next efforts, i indeed agree with your list.

                • 5. Re: process definition as a resource
                  heiko.braun

                   


                  so if we assume that versioning is done by the client in the process file, then this would really fit the jboss deployer architecture. (finally !)


                  yes, that's what bernd did with the AS 4 deployer as well. let's start with explicit versioning inside the process description, at least for AS deployments. It sounds reasonable to me.



                  if a user deploys a new version of a process...


                  we'd need to be more precise here. I can actually see three use cases:

                  1) a changed pdl file, but classes remain the same
                  2) same pdl, but classes changed
                  3) changed pdl and classes changed

                  IMO 1) and 3) should lead to a new version of that process, i.e. demanding an explicit version increment in the pdl, whereas 2) simply associates a different resource set with the process (i think bernd called it patching/bug fixes).

                  In general, in all those cases where the pdl is changed, we can encounter two situations: the user explicitly changed the version within the pdl or he didn't. IMO the latter case indicates "replacement" of a process definition.
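One possible reading of these rules, written down as a decision function. This is only a sketch of the classification described above; the class `DeploymentClassifier`, its `Action` enum, and the three boolean inputs are hypothetical names and not part of any deployer.

```java
// Hypothetical classifier for an incoming deployment, following the three use cases above.
final class DeploymentClassifier {

    enum Action {
        NEW_VERSION,        // pdl changed and the version in the pdl was incremented
        REPLACE_DEFINITION, // pdl changed but the version was left untouched
        PATCH_RESOURCES,    // same pdl, only classes changed (patching/bug fixes)
        NOTHING_TO_DO       // neither pdl nor classes changed
    }

    static Action classify(boolean pdlChanged, boolean classesChanged, boolean versionIncremented) {
        if (pdlChanged) {
            // Covers use cases 1) and 3): whether classes changed does not matter here.
            return versionIncremented ? Action.NEW_VERSION : Action.REPLACE_DEFINITION;
        }
        if (classesChanged) {
            // Use case 2): associate a different resource set with the same process.
            return Action.PATCH_RESOURCES;
        }
        return Action.NOTHING_TO_DO;
    }
}
```

Note that this sketch treats an unchanged pdl with an incremented version the same as a pure class patch; whether that is the desired behaviour is left open.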



                  • 6. Re: process definition as a resource
                    heiko.braun

                    It's actually not necessary to do out-of-band class name tricks to isolate process resources inside the AS. The classloading framework in place takes care of the scoping.

                    • 7. Re: process definition as a resource
                      camunda

                       

                      Heiko wrote:

                      1) a changed pdl file, but classes remain the same
                      2) same pdl, but classes changed
                      3) changed pdl and classes changed

                      IMO 1) and 3) should lead to a new version of that process, i.e. demanding an explicit version increment in the pdl, whereas 2) simply associates a different resource set with the process (i think bernd called it patching/bug fixes).


                      I see the use cases a bit differently:

                      1) Fix a process (patching/bug fixing)
                      a) with changed classes
                      b) with changed jpdl
                      c) or both
                      2) Deploy a real new version
                      with changed classes
                      and with changed jpdl

                      The distinction between a fix and a new version cannot be derived from which artifacts changed; it has to be made by the user.

                      1a is simply a redeployment on the server
                      1b creates a new process version of same process in database
                      2 creates a new process in database and adds the deployment classes.

                      I would like to keep process db versioning for use case 1b. If you think of long running processes, being forced to release a new version (with classes and stuff as a new deployment artifact) just because one state was forgotten seems cumbersome to me. And then you have to develop sophisticated mechanisms to undeploy processes that are no longer used. With the db versioning it just "fades out".

                      But maybe I am too used to the current process versioning concept already and like it too much ;-) At least I can say, that it is good for marketing to support it ;-)

                      @Tom: Classloading works pretty well with different parallel versions if you use the scoped classloading of the AS and the deployer and service take care of it correctly. The deployer I wrote for jbpm 3 runs in production at the customer without problems...

                      For the execution modes: Yeah, differentiating these and supporting each of them would be quite interesting!
                      Even so, I would prefer "persistent dynamic" for most use cases, because then you have all information together in the database (good for reporting or BI as well, right? In fact BI/ETL tools normally talk to databases, not to XML files in the classpath ;-)).

                      But on the other hand, persistent process resource could be a really interesting choice for some scenarios. I have to think more about it and play with it a bit... What scares me is losing the foreign keys to the process definition. And I could imagine that implementing all the nuts and bolts of it isn't easy. And the question is: where is the big advantage of it? But while writing this, I start to like it more and more ;-)

                      The idea of the embedded execution modes is really very interesting, since you find these kinds of customer tables quite often and it could provide a good migration strategy towards jbpm :-)

                      Cheers
                      Bernd

                      • 8. Re: process definition as a resource
                        tom.baeyens

                         

                        "camunda" wrote:
                        @Tom: Classloading works pretty well with different parallel versions if you use the scoped classloading of the AS and the deployer and service take care of it correctly. The deployer I wrote for jbpm 3 runs in production at the customer without problems...


                        my question is: would it have been possible at this client to work without versioned classes by just using different classnames if a class needs to be changed ?

                        • 9. Re: process definition as a resource
                          camunda

                          Technically I think yes.

                          But for development this would be a nightmare, I think (renaming the stuff yourself, keeping track of what has changed so you don't forget to rename something, and so on).

                          And it could be that your new version requires a different third party library version, which is easily possible with scoped deployment, but not without (or am I forgetting something here?)....

                          • 10. Re: process definition as a resource
                            heiko.braun

                            nobody will rename the classes. but you'd need to rename the deployment artifact (i.e. myProcess_v2.par). Otherwise you'd force redeployment of an existing deployment artifact. But then it works like Bernd says: the scoping is guaranteed by the classloading/deployer framework.