Rework of jBPM deployment within JBoss | JBoss.org Content Archive (Read Only)

15. Re: Rework of jBPM deployment within JBoss

bill.burke Jun 7, 2007 8:17 AM (in response to bill.burke)

"kukeltje" wrote:
Question: Why should a new processarchive with an identical pd.xml file but new classes not always require a new deployment? This is related to the problem how to decide a class is new. (build clean gives you a new timestamp but is something else identical?)

Expose a version attribute in the jpdl schema. This would allow you to rev the version if you want when only class changes are present.

16. Re: Rework of jBPM deployment within JBoss

bill.burke Jun 7, 2007 8:54 AM (in response to bill.burke)

"tom.baeyens@jboss.com" wrote:
"bill.burke@jboss.com" wrote:

1. The hibernate and jbpm xml files are embedded deep within the enterprise archive. These need to be moved up for easy edits by the user

one option could be to extract all the .hbm.xml files and put them in the config jar file. would that qualify as 'moved up'?

Has to be an exploded archive so people don't have to unjar things. The less people have to do, the better.

i definitely want at least something that is as portable as this solution. if there are jboss specific ways to move the configs up even further, i'm ok with that as long as it's another option that we offer apart from the portable deployment.

It would be another option, as it's definitely JBoss-specific.

"bill.burke@jboss.com" wrote:

a) You need to create the jBPM schema in the database

Previously detection and update of the schema was done automatically in a .sar. But i want to move to a portable .rar for this. Would that be an option for you ?

It's as simple as configuring Hibernate to "update" mode so that the schema is created at boot time.

As for JCA, an outbound adapter for jbpm might be interesting so that you could inject JBPM contexts and would give you somewhat of a portable way of doing deployment. Like, with the outbound configuration, you'd specify the .jpdl file or process archive to load.

"bill.burke@jboss.com" wrote:

At deployment time, the JBPM deployer would calculate a hash for the .xml file. If this hash matches a previously deployed .xml file nothing is done. If it is a new hash, a new version of the process definition is saved to the database.

hash in combination with the last update time of the file would be good, i think. i wouldn't trust hash only. chances are small for a collision, but imagine if you have such a collision, then you would have to make sure something in the file is changed to get it deployed.

I can't imagine having such a collision, as you have a higher chance of getting struck by lightning. RMI and our EJB code use an MD5 of a method to send its identity over the wire.

saving that info in the database is going to be tricky in 3.2.x as we only want to allow updates to the db schema between minor version releases. 3.3 is still some time out.

Why so tricky? Just create another table.
create table jbpm_deployer_hash ( -- table name is a placeholder
  process_name VARCHAR(255),
  process_version INT,
  process_hash BIGINT
);

It's something only the deployer knows about.

1. Parse the .jpdl file.
2. Create an MD5 hash of the file (or object model or whatever)
3. Get PD name from deployed file
4. query table for latest PD
5. compare hashes
6. If new, insert into table name, version++, new_hash
7. create new PD in JBPM
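
The steps above can be sketched roughly as follows. This is only an illustration, not jBPM or JBoss deployer code: the class HashDeployer, its methods, and the in-memory map standing in for the bookkeeping table are all hypothetical, and the hash is taken over the raw file bytes rather than a parsed object model.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the hash-based deployment check; not a jBPM API.
final class HashDeployer {

    /** One row of the bookkeeping table: latest version and hash for a process name. */
    static final class Entry {
        final int version;
        final String hash;

        Entry(int version, String hash) {
            this.version = version;
            this.hash = hash;
        }
    }

    // In-memory stand-in for the table that only the deployer knows about.
    private final Map<String, Entry> table = new HashMap<>();

    /** MD5 of the given bytes as a lowercase hex string. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Returns the version that is current after this call. The version is
     * bumped only when the hash differs from the latest recorded one.
     */
    int deploy(String processName, byte[] jpdlBytes) {
        String hash = md5Hex(jpdlBytes);            // step 2
        Entry latest = table.get(processName);      // steps 3 and 4
        if (latest != null && latest.hash.equals(hash)) {
            return latest.version;                  // step 5: same hash, nothing to do
        }
        int next = (latest == null) ? 1 : latest.version + 1;
        table.put(processName, new Entry(next, hash)); // step 6
        // step 7: a real deployer would now create the new process definition in jBPM
        return next;
    }
}
```

Deploying the same bytes twice under one name leaves the version unchanged; deploying different bytes under that name bumps it by one.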

"bill.burke@jboss.com" wrote:

A new config structure would look like this:
jbpm.deployer/
  hibernate.cfg.xml
  jbpm.cfg.xml
  jbpm-ds.xml
  jbpm-jpdl.jar
  jbpm-enterprise.ear
  META-INF/
    jboss-service.xml

that looks good. but consider JCA and a .rar to deploy jbpm separately on the appserver, apart from the deployer. then the .rar would be portable.

A RAR would be limited and you would still need to deploy 2 separate files, a pd-ds.xml file, to define the process instance, and the .par file (or .jpdl file). Whereas in a JBoss deployer, it's just one file and less metadata to write.

What's also good about the JBoss model is you would be able to compose applications into one archive:

foo.sar/
 ejbs.jar
 jbpm-app.par
 foo-dsl.xml

With JBoss 5 it would be even more interesting and simple, as multiple deployers can process one archive (unlike JBoss 4).

jbpm-app.jar/
  com/
  org/ (class files, could be annotated components like EJB and SEAM)
  foo-ds.xml
  META-INF/
    ejb-jar.xml
    jbpm.jpdl.xml
    seam-components.xml

17. Re: Rework of jBPM deployment within JBoss

kukeltje Jun 7, 2007 9:08 AM (in response to bill.burke)

A RAR would be limited and you would still need to deploy 2 separate files, a pd-ds.xml file, to define the process instance, and the .par file (or .jpdl file). Whereas in a JBoss deployer, it's just one file and less metadata to write.

Not completely. It is not really a pd-ds.xml but a jbpm-ds.xml. Multiple processdefinitions can share the same jbpm database. In fact, if you have a separate ds for each process you also have to deploy the console app multiple times to connect to the different ds's.

18. RAR

bill.burke Jun 7, 2007 9:32 AM (in response to bill.burke)

"kukeltje" wrote:
A RAR would be limited and you would still need to deploy 2 separate files, a pd-ds.xml file, to define the process instance, and the .par file (or .jpdl file). Whereas in a JBoss deployer, it's just one file and less metadata to write.

Not completely. It is not really a pd-ds.xml but a jbpm-ds.xml. Multiple processdefinitions can share the same jbpm database. In fact, if you have a separate ds for each process you also have to deploy the console app multiple times to connect to the different ds's.

You're missing my idea. A JBoss -ds.xml file really isn't solely a datasource definition, it is an outbound adapter definition/configuration. Each process definition would be an instance/activation of the JBPM RAR resource adapter. Just like a datasource in JBoss is an activation of the JDBC RAR.

19. Re: Rework of jBPM deployment within JBoss

kukeltje Jun 7, 2007 9:50 AM (in response to bill.burke)

ok, I was misled by the -ds postfix...

20. what's wrong with deployment

bill.burke Jun 7, 2007 9:55 AM (in response to bill.burke)

Let me expand on why I don't like the JBPM deployment model:

* First of all, if you are deploying via an ANT script or the designer, these mechanisms need a database and knowledge of how to connect to it. This requires the user to either set up a database (MySQL/Oracle), or use JBoss's embedded Hypersonic. The user then also has to configure

* I don't like the Servlet mode of configuration at all. Why? Because it's inconsistent with the way Seam deploys JBPM and inconsistent with the way every other JBoss project does deployment.

There is a huge problem at JBoss.org at the moment. Every project is more interested in being portable to other non-JBoss environments than taking advantage of the JBoss platform and becoming one integrated, consistent suite. Take deployment for instance: Seam does it different than JBPM which does it different than ESB which does it different than AppServer which does it different than Drools. I brought ESB into the JBoss deployment model, I'd like to do the same for JBPM.

We need to start being consistent across JBoss.org projects. This way we can start writing generic tools and frameworks that can abstract things like deployment and write a portability layer to other application servers. JBoss 5 kernel does a lot to modularize things and the JBoss Embedded project is where I want to write this portability layer.

21. Re: Rework of jBPM deployment within JBoss

dmlloyd Jun 7, 2007 9:56 AM (in response to bill.burke)

"estaub" wrote:
Though I've always disagreed with storing the process archive (especially the classes) in the database.

Don't forget about clustering.

I haven't - you'd do clustering in the exact same way you would with any other J2EE archive, like an EAR or WAR or whatever. It's a problem that has already been solved - and not by storing the archive in the DB.

22. Re: Rework of jBPM deployment within JBoss

tom.baeyens Jun 7, 2007 10:25 AM (in response to bill.burke)

"david.lloyd@jboss.com" wrote:
"kukeltje" wrote:
I agree, that he/she should have the option, but that should also be the case for changes in the .xml file. (small typo's in some text e.g.) and what about the forms....

Exactly. That's why I said to hash the semantic graph structure, rather than the file itself. The only change that absolutely requires a deployment of a new version is a change in the graph structure. For any other change, it should be controllable by the user whether a new version is deployed.

i don't think it's good to have a model that has a complicated calculation on whether or not it will redeploy.

in development it doesn't matter if you redeploy too much.

in production you won't update the file that often.

so i don't really see a problem with just using a hashcode of the total process archive, preferably in combination with the timestamp of the file, without inspecting the contents of the deployed process.

23. Re: Rework of jBPM deployment within JBoss

tom.baeyens Jun 7, 2007 10:38 AM (in response to bill.burke)

"bill.burke@jboss.com" wrote:

It's as simple as configuring Hibernate to "update" mode so that the schema is created at boot time.

As for JCA, an outbound adapter for jbpm might be interesting so that you could inject JBPM contexts and would give you somewhat of a portable way of doing deployment. Like, with the outbound configuration, you'd specify the .jpdl file or process archive to load.

I'm not sure if we're on the same page here. I discussed with weston that it would be possible to use JCA/RAR for the following:

upon server boot,
1) update the schema using hibernate's tool
2) produce a JbpmConfiguration in JNDI

"bill.burke@jboss.com" wrote:

saving that info in the database is going to be tricky in 3.2.x as we only want to allow updates to the db schema between minor version releases. 3.3 is still some time out.

Why so tricky? Just create another table.
create table jbpm_deployer_hash ( -- table name is a placeholder
  process_name VARCHAR(255),
  process_version INT,
  process_hash BIGINT
);

Managing and documenting all of this. Managing the right hibernate mappings to be used in deployment. People deploy jBPM in *many* different ways. My current strategy is to try and limit the ways it is being deployed. Creating an extra table is not hard to get working, but many users will be confronted with either a table that doesn't exist and that should exist or similar problems with the hibernate mappings.

i think it should be possible to reuse an existing field for it in 3.2.x or just wait till 3.3 comes out. Then we can introduce a new table.

A RAR would be limited and you would still need to deploy 2 separate files, a pd-ds.xml file, to define the process instance, and the .par file (or .jpdl file). Whereas in a JBoss deployer, it's just one file and less metadata to write.

i agree with building a jboss deployer based on hash codes.

i'm only trying to separate the parts that you put into the deployer into 2 separate modules:

1) automatic database update and putting the JbpmConfiguration in JNDI (that can be done in a portable RAR deployment, afaik)
2) jboss process deployer

24. Re: Rework of jBPM deployment within JBoss

kukeltje Jun 7, 2007 11:02 AM (in response to bill.burke)

@Bill

+1 on the unified deployment approach across jboss (unified UI over projects too btw)

+1 on not really promoting the ANT way...

I do not agree however on not liking the servlet way, at least not if there are no other options for other servers. A unified approach for other servers (maybe a lot simpler, with less features) would be nice though

25. Re: Rework of jBPM deployment within JBoss

dmlloyd Jun 7, 2007 7:00 PM (in response to bill.burke)

"tom.baeyens@jboss.com" wrote:
"david.lloyd@jboss.com" wrote:
The only change that absolutely requires a deployment of a new version is a change in the graph structure. For any other change, it should be controllable by the user whether a new version is deployed.

i don't think it's good to have a model that has a complicated calculation on whether or not it will redeploy.

It doesn't have to be a complicated calculation. You could use StAX to filter the XML file to remove whitespace and to filter out any elements that are not part of the XML namespace that defines the graph structure, and hash the result of that incrementally. It could actually be more efficient than hashing the whole file because the complicated md5 calculation only takes place over a portion of the data!
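
A minimal sketch of that idea, using the standard javax.xml.stream (StAX) API and MD5 from java.security. The class GraphHasher, the method hashGraph, and the way elements are serialized into the digest are my own assumptions for illustration; the namespace that defines the graph structure is passed in as a parameter rather than hard-coded. Note that attribute order still affects the hash here.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: hashes only the elements of one namespace, ignoring whitespace.
final class GraphHasher {

    /** MD5 (hex) over the elements in graphNamespace; foreign subtrees are skipped. */
    static String hashGraph(String xml, String graphNamespace) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Merge adjacent text events so chunking cannot change the result.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            MessageDigest md = MessageDigest.getInstance("MD5");
            int skipDepth = 0; // greater than 0 while inside a foreign-namespace subtree

            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (skipDepth > 0 || !graphNamespace.equals(reader.getNamespaceURI())) {
                            skipDepth++;
                            break;
                        }
                        feed(md, "<" + reader.getLocalName());
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            String ns = reader.getAttributeNamespace(i);
                            if (ns != null && !ns.isEmpty()) {
                                continue; // attribute from another namespace
                            }
                            feed(md, " " + reader.getAttributeLocalName(i)
                                    + "=" + reader.getAttributeValue(i));
                        }
                        feed(md, ">");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (skipDepth > 0) {
                            skipDepth--;
                            break;
                        }
                        feed(md, "</" + reader.getLocalName() + ">");
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (skipDepth == 0 && !reader.isWhiteSpace()) {
                            feed(md, reader.getText().trim());
                        }
                        break;
                    default:
                        break; // comments, processing instructions, etc. are ignored
                }
            }
            reader.close();

            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Feeds one piece of the filtered document into the running digest.
    private static void feed(MessageDigest md, String s) {
        md.update(s.getBytes(StandardCharsets.UTF_8));
    }
}
```

With this filter, reindenting the file or adding elements and attributes from another namespace (GPD layout data, for example) leaves the hash unchanged, while changing a transition target changes it.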

"tom.baeyens@jboss.com" wrote:
in development it doesn't matter if you redeploy too much.

Agreed.

"tom.baeyens@jboss.com" wrote:
in production you won't update the file that often.

Well, yes and no. I might want to update all the deployed processes with new GPD information for example, without causing a new version to be created for every process. Or maybe I want to change other auxiliary information, again using other XML namespaces.

"tom.baeyens@jboss.com" wrote:
so i don't really see a problem with just using a hashcode of the total process archive, preferably in combination with the timestamp of the file, without inspecting the contents of the deployed process.

I disagree with the timestamp though. That implies that any change to the file requires a new process version, and I don't believe that this is true.

26. Re: Rework of jBPM deployment within JBoss

kukeltje Jun 8, 2007 3:50 AM (in response to bill.burke)

complex calculation: I think Tom means that it should be clear to the user when the process is (not) deployed. If too many factors are taken into account, it gets too complicated.

+1 on not using the timestamp

27. Re: Rework of jBPM deployment within JBoss

bill.burke Jun 8, 2007 11:37 AM (in response to bill.burke)

I've been thinking about this a bit the last day. Here's the advantages, IMO, of the new deployer I'm proposing:

* Easier for newbies to get started. One less thing they have to worry about.
* Easier for development
* Less configuration
* Doesn't impair other jBPM deployment options
* Gets in line with how other JEMS projects deploy into JBoss

Disadvantages:
* Not portable to other application servers. (Until we get JBoss Embedded going).
* I don't think this will work well with class versioning unless we add metadata to the .par (like a version number)

My questions are:
* How often is versioning used?
* How many versions are live at one time? How many versions do applications usually have in play?

One last thing:
* Maybe with this new deployer, versioning should be turned off by default?
* Should a piece of metadata be added to the deployment to say whether or not it should be versioned or just overwrite the old deployment?

28. Re: Rework of jBPM deployment within JBoss

brittm Jun 8, 2007 5:45 PM (in response to bill.burke)

* How often is versioning used?
* How many versions are live at one time? How many versions do applications usually have in play?

Over the past year, we've put 15 versions of a "New Order" process in production and 5 or 6 versions of several other supporting processes. Yes, we still have a few version 1 New Order processes that are still active. (A typical order can take 3 months to fulfill, while some can take 8 months to a year if the salesman is "pre-selling".)

Over that amount of time, it's almost a guarantee that business processes will change, or we'll acquire someone (or be acquired by someone) that we'll have to integrate with. The ability to EITHER version OR replace any of the resources associated with a process is extremely important.

-Britt

29. Re: Rework of jBPM deployment within JBoss

tom.baeyens Jun 9, 2007 6:24 AM (in response to bill.burke)

"bill.burke@jboss.com" wrote:
I've been thinking about this a bit the last day. Here's the advantages, IMO, of the new deployer I'm proposing:

* Easier for newbies to get started. One less thing they have to worry about.
* Easier for development
* Less configuration
* Doesn't impair other jBPM deployment options
* Gets in line with how other JEMS projects deploy into JBoss

Disadvantages:
* Not portable to other application servers. (Until we get JBoss Embedded going).
* I don't think this will work well with class versioning unless we add metadata to the .par (like a version number)

could you summarize your new proposal on a wiki page? a lot of arguments and aspects have been put forward in this discussion and i lost track of which combination you're evaluating

e.g. on http://www.jboss.org/wiki/Wiki.jsp?page=JbpmJBossDeployment

also, it's not a question of pro's and con's for me. i'm all in for a jboss deployment model in addition to the models that we have.

the deployment models might be sanitized later, but to date, we don't have a clear enough view on what people actually need. so we offer everything.

the motivation is that people must be free to work the way *they* want. we shouldn't force them to work in one way or another.

"bill.burke@jboss.com" wrote:

My questions are:
* How often is versioning used?

It's used quite a bit. Although we recommend for performance reasons to skip versioning of classes and just put them in the classpath.

"bill.burke@jboss.com" wrote:

* How many versions are live at one time? How many versions do applications usually have in play?

no clue on the average.

but process instance migration is not a straightforward task. when you deploy a new process definition (either as a new separate definition or as an update of the existing) you'll need to convert the old executions to the new definitions.

if all the nodes are the same, it's easy to convert the tokens to the new process. if you can map old node names to new node names that could be an enhancement too.

but in the general case, you can't translate old process instances to new definitions. even your concurrency model might have changed. in that case it becomes impossible to map the old tokens to new tokens.

on top of that, you need to take the logs into account. in case of a process update, the previous logs might not match the process definition any more.

that is why i only see one way of handling versioning:
when deploying a process that already exists (process equality is based on the name), then a completely new process definition is created. old executions keep running in the old definition, new process instances are started in the new definition.
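
a rough sketch of that versioning rule, purely for illustration. the class VersionedRegistry and its nested types are hypothetical and are not the jBPM API; the only thing taken from the description above is the behaviour: equality by name, a new definition on every deployment, and instances that stay bound to the definition they were started in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of name-based process versioning; not jBPM code.
final class VersionedRegistry {

    /** An immutable deployed definition: a name plus a version number. */
    static final class Definition {
        final String name;
        final int version;

        Definition(String name, int version) {
            this.name = name;
            this.version = version;
        }
    }

    /** A running execution, bound for life to the definition it started in. */
    static final class Instance {
        final Definition definition;

        Instance(Definition definition) {
            this.definition = definition;
        }
    }

    // All versions ever deployed, per process name, oldest first.
    private final Map<String, List<Definition>> byName = new HashMap<>();

    /** Deploying an existing name never overwrites; it appends a new version. */
    Definition deploy(String name) {
        List<Definition> versions = byName.computeIfAbsent(name, k -> new ArrayList<>());
        Definition definition = new Definition(name, versions.size() + 1);
        versions.add(definition);
        return definition;
    }

    /** New instances always start in the latest version of the named process. */
    Instance start(String name) {
        List<Definition> versions = byName.get(name);
        if (versions == null) {
            throw new IllegalArgumentException("no such process: " + name);
        }
        return new Instance(versions.get(versions.size() - 1));
    }
}
```

an instance started before a redeployment keeps pointing at version 1, while instances started afterwards run in version 2, so no token or log migration is needed.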

"bill.burke@jboss.com" wrote:

One last thing:
* Maybe with this new deployer, versioning should be turned off by default?
* Should a piece of metadata be added to the deployment to say whether or not it should be versioned or just overwrite the old deployment?

see previous remark about process instance migration. please explain how you want to handle that if you talk about not doing versioning.