11 Replies Latest reply on Jan 9, 2007 10:21 AM by dmlloyd

    user based process versioning ?

    tom.baeyens

      the current process versioning is based on the fact that during deployment, jBPM will assign a version number.

      while in the past, i thought this was the only and natural way to go, i'm now seeing some problems with that approach. the main problem is that it is hard to integrate jBPM process deployment with JBoss's traditional mechanism for deploying things. previously i thought that this was because the process life cycle is inherently different from that of other deployment artifacts.

      The new alternative approach would be to add an extra attribute in the process-definition element: version. The user would be able to specify the version of the process here.

      The downside of this approach is that it potentially opens the door to more problems: we need to add checks that throw an exception when a process with the same name and version is already deployed, version numbers can be skipped, ...

      But the advantage is that we can create a plain file based deployer. Upon server boot, the process directory (or the jboss deploy directory) can be scanned for new process definitions. All process definitions for which the given version is already deployed are ignored. The newly added processes will be deployed.
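
      The directory scan described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and not part of jBPM, the version attribute on process-definition is the proposal from this post rather than an existing feature, the attributes are pulled out with a crude regular expression instead of a real XML parser, and the set of deployed keys stands in for a database lookup. It assumes Java 11 or later for Files.readString.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: none of these names are part of the jBPM API.
class DirectoryScanDeployer {
    // Crude attribute extraction, good enough for a sketch; a real deployer would parse the XML.
    private static final Pattern ROOT = Pattern.compile("<process-definition\\b([^>]*)>");
    private static final Pattern NAME = Pattern.compile("\\bname\\s*=\\s*\"([^\"]*)\"");
    private static final Pattern VERSION = Pattern.compile("\\bversion\\s*=\\s*\"(\\d+)\"");

    // Stands in for the table of deployed definitions; entries look like "name:version".
    private final Set<String> deployed = new HashSet<>();

    /** Scans the directory and deploys every definition whose name and version are not yet known. */
    List<String> scan(Path dir) throws IOException {
        List<Path> candidates;
        try (Stream<Path> files = Files.list(dir)) {
            candidates = files.filter(f -> f.toString().endsWith(".xml"))
                              .sorted()
                              .collect(Collectors.toList());
        }
        List<String> newlyDeployed = new ArrayList<>();
        for (Path file : candidates) {
            String key = keyOf(Files.readString(file));
            // Definitions whose name and version are already deployed are silently ignored.
            if (key != null && deployed.add(key)) {
                newlyDeployed.add(key);
            }
        }
        return newlyDeployed;
    }

    /** Returns "name:version", or null if either attribute is missing from the root element. */
    static String keyOf(String xml) {
        Matcher root = ROOT.matcher(xml);
        if (!root.find()) {
            return null;
        }
        Matcher name = NAME.matcher(root.group(1));
        Matcher version = VERSION.matcher(root.group(1));
        if (!name.find() || !version.find()) {
            return null;
        }
        return name.group(1) + ":" + version.group(1);
    }
}
```

      Calling scan on every server boot would then deploy only the files that carry a name and version not seen before; files without a version attribute are skipped in this sketch.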

      other thoughts ?

      regards, tom.

        • 1. Re: user based process versioning ?

          Yes, I agree, the addition of a version attribute can be useful for managing processes in a production environment.

          I also think that an attribute specifying whether the process definition is 'public' or not could be interesting: if the process definition is 'public', users can start a process based on this definition; if it is 'private', users cannot see or start the associated process.

          It's useful if you have sub-processes which should be started only by main processes.


          Regards,
          David

          • 2. Re: user based process versioning ?
            saviola

            One more suggestion: how about adding a property which will contain the byte[] representation of the process definition?
            That way two process definitions will be easily comparable, and any minor change will be reflected in this property.

            • 3. Re: user based process versioning ?
              tom.baeyens

              comparing byte arrays has 2 disadvantages:

              1) it's big.
              2) compiling a process class with a different compiler option (or a different compiler) will lead to an unnecessary deploy.

              regards, tom.

              • 4. Re: user based process versioning ?
                aguizar

                I like the idea of process archives as deployment artifacts. I think we should reject a new version of a process if the version number does not follow the existing sequence. This would prevent the process deployer (I mean, the person in control of the process) from inadvertently using a wrong number.
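
                A minimal sketch of such a sequence check, assuming versions are integers that start at 1 and must increase by exactly one. The class and method names are hypothetical, and the map stands in for a query against the deployed definitions:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; class and method names are illustrative, not jBPM API.
class SequentialVersionCheck {
    // Latest deployed version per process name; stands in for a database lookup.
    private final Map<String, Integer> latest = new HashMap<>();

    /** Accepts the deployment only if the version is exactly one above the latest deployed one. */
    void deploy(String name, int version) {
        int expected = latest.getOrDefault(name, 0) + 1;
        if (version != expected) {
            throw new IllegalArgumentException(
                "process '" + name + "': expected version " + expected + " but got " + version);
        }
        latest.put(name, version);
    }
}
```

                With this rule, deploying versions 1 and 2 succeeds, while a later attempt to deploy version 4 (or version 2 again) is rejected with an exception.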

                • 5. Re: user based process versioning ?
                  kukeltje

                  Since 3.1.2 (and correctly in 3.1.3) the process definition is also stored in the database as plain XML. Generating and comparing a hash of the latest pd in the DB and of the pd that is provided is fairly simple. It's non-intrusive, backwards compatible, etc.

                  Normalization should take place to prevent mere reformatting from leading to a new deployment. Maybe even a "reordering" kind of normalization, since switching two nodes in jBPM does not lead to a different process at execution time. So start state first, then all generic nodes, task nodes, etc. Or reordering based on node name is another option (but what about the elements/attributes IN nodes... hmmm...).

                  comments?
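
                  A rough sketch of the hash comparison, assuming SHA-256 via the standard MessageDigest API. The class and method names are hypothetical, and the normalization shown is deliberately crude: it only drops whitespace between tags, so it tolerates re-indentation but does not attempt the node reordering discussed above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch; not part of the jBPM API.
class DefinitionFingerprint {
    /** Crude normalization: trims the ends and drops whitespace between tags. Does not reorder nodes. */
    static String normalize(String xml) {
        return xml.trim().replaceAll(">\\s+<", "><");
    }

    /** Hex-encoded SHA-256 of the normalized definition. */
    static String sha256(String xml) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(normalize(xml).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** True if nothing is stored yet or the candidate differs from the latest stored definition. */
    static boolean needsDeployment(String latestStoredXml, String candidateXml) {
        return latestStoredXml == null || !sha256(latestStoredXml).equals(sha256(candidateXml));
    }
}
```

                  In practice the hash of the stored definition could be computed once and kept next to it, so only the candidate has to be hashed on each deployment attempt.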

                  • 6. Re: user based process versioning ?
                    brittm

                    Being able to normalize two Process Definitions could be a nice feature, but it also looks to me like a whole lot of trouble. Also, it seems more like a developer's "Oops, I didn't record what I was doing, so now I don't know what I have," kind of feature, rather than a solid business function.

                    I do think that carrying a settable version number in addition to the current incremented one could be a good idea--sort of like the difference between a business key and a primary key. This way the developer could specify a version number that could be kept in sync with CVS and other deployables.

                    -Britt

                    • 7. Re: user based process versioning ?
                      dmlloyd

                      In my opinion the new prototype would work like this:

                      All process information is stored in the process definition archive (not in the database). The database has a row for the process, and a row for the specific version; this allows executions to link back to their owning process. These rows need not use system-maintained versioning, and in fact probably should not.

                      A process definition archive contains the name and version of the process, as well as any associated information (process diagram, forms, etc). A variation on the standard file deployer is used to add a process definition to the container.

                      A jBPM instance can execute any process that is currently deployed. An attempt to execute a process that is not deployed in the current container will result in an exception.

                      Deploying processes in this way has several advantages:

                      * We no longer have to deal with BLOBs, as we are not storing large chunks of binary data in the database. The process definition archives and their associated metadata are deployed much like any other javaee component.
                      * We support two models of versioning. The first allows the user to deploy a new, independent version of their process by specifying a new version number, allowing the old version to continue to exist. The second allows the user to easily *replace* the definition of an existing version, in the case where they have (for example) a critical bug in their process definition and they want to get a quick fix out there, without having to execute a potentially complicated data update as well.
                      * Table normalization. This should always be a goal in my opinion. We should never require grouping or distinct operations for a query of a simple domain object, as we do today for distinct process definitions.
                      * Reusing standard javaee deployment mechanism. This is the standard way of performing a javaee deployment. By way of comparison, EJB authors do not deploy their code into the database; and they are used to the idea of doing a new file deployment for new versions of their applications.
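
                      To make the two versioning models and the "not deployed" exception concrete, here is a small in-memory sketch. Everything in it is an assumption for illustration: the class and method names are hypothetical, the definition is reduced to a String, and the map stands in for whatever the deployer would populate from the process definition archives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a registry filled by a deployer; not part of the jBPM API.
class DeployedProcessRegistry {
    // Key is "name#version"; the version is user-specified, so it is kept as a String.
    private final Map<String, String> definitions = new ConcurrentHashMap<>();

    private static String key(String name, String version) {
        return name + "#" + version;
    }

    /** Model 1: add a new, independent version; existing versions stay deployed. */
    void deployNew(String name, String version, String definition) {
        if (definitions.putIfAbsent(key(name, version), definition) != null) {
            throw new IllegalStateException(name + " " + version + " is already deployed");
        }
    }

    /** Model 2: replace the definition of a version that is already deployed. */
    void replace(String name, String version, String definition) {
        if (definitions.replace(key(name, version), definition) == null) {
            throw new IllegalStateException(name + " " + version + " is not deployed");
        }
    }

    /** Executions resolve their definition here; an undeployed process results in an exception. */
    String lookup(String name, String version) {
        String definition = definitions.get(key(name, version));
        if (definition == null) {
            throw new IllegalStateException(name + " " + version + " is not deployed");
        }
        return definition;
    }
}
```

                      The database rows for the process and its version would then only carry the name and version needed to call lookup, not the definition itself.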

                      • 8. Re: user based process versioning ?
                        koen.aers

                         

                        "david" wrote:
                        A process definition archive contains the name and version of the process, as well as any associated information (process diagram, forms, etc). A variation on the standard file deployer is used to add a process definition to the container.

                        Hm, does this not imply that you are forced to use Java EE? I think that much of the elegance of the current system comes from its lightweightness and its ability to embed itself in whatever system. It would be a pity to lose that IMO. Or is there something I am not getting?

                        I like the option of a 'settable version number', much like manifest information that you can add to any archive.

                        Regards,
                        Koen

                        • 9. Re: user based process versioning ?
                          dmlloyd

                           

                          "koen.aers@jboss.com" wrote:
                          Hm, does this not imply that you are forced to use Java EE? I think that much of the elegance of the current system comes from its lightweightness and its ability to embed itself in whatever system. It would be a pity to lose that IMO.


                          Well, storing the archive data inside the database isn't exactly lightweight. :-)

                          I don't think you'd have to use java ee deployers if you didn't want to, I'm just using that as an example. As long as the engine knows where to pick up process definitions it should work. This is (I think) actually a simpler use-case of javaee deployers, since there's no special startup or shutdown action to be taken; it just has to store the process information in a Map. Likewise the standalone version can just read the process definitions from the filesystem; it already has the plumbing to do this today as far as I can see.

                          • 10. Re: user based process versioning ?
                            tom.baeyens

                            we shouldn't rely on enterprise java.

                            the more i think about it, the more i'm convinced that we cannot put the users in charge of versioning.

                            so if we want an auto-deploy directory, full copies of the process archive files will have to be stored for comparison upon the next directory scan. This full copy could potentially be stored on the filesystem or in the database. that doesn't really matter. but since jbpm already depends on a db in most deployments, storing it in the database would be the first thing i implement.

                            david, i don't yet see the point of having a table separation for process definition versions and process definitions. now there is only a process definition version table and the name and version are columns.

                            • 11. Re: user based process versioning ?
                              dmlloyd

                               

                              "tom.baeyens@jboss.com" wrote:
                              david, i don't yet see the point of having a table separation for process definition versions and process definitions. now there is only a process definition version table and the name and version are columns.


                              Yes, which means that in order to determine the latest version of a process, the generated SQL must select all rows for a process to know what the latest version is.

                              If we have a separate version table, then only one row from each table need be selected to determine the latest version. This is just one benefit of having normalized database tables. All the tables in jBPM should be normalized unless there is a very good reason not to (which there usually isn't).
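
                              As an in-memory analogy of the two-table layout (not actual jBPM mapping code): the sketch below assumes the process row carries a pointer to its latest version, so finding the latest definition takes one lookup per "table". The class name, the maps standing in for tables, and the integer versions are all illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory analogy of two normalized tables; not jBPM code.
class NormalizedProcessTables {
    // "Process" table: one row per process name, pointing at its latest version.
    private final Map<String, Integer> processLatestVersion = new HashMap<>();
    // "Process version" table: one row per (name, version) pair.
    private final Map<String, String> versionRows = new HashMap<>();

    /** Inserts a version row and keeps the process row's latest-version pointer up to date. */
    void addVersion(String name, int version, String definition) {
        versionRows.put(name + "#" + version, definition);
        processLatestVersion.merge(name, version, Integer::max);
    }

    /** One lookup in each "table", regardless of how many versions exist; null if unknown. */
    String latestDefinition(String name) {
        Integer latest = processLatestVersion.get(name);
        return latest == null ? null : versionRows.get(name + "#" + latest);
    }
}
```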