9 Replies Latest reply on Oct 5, 2009 7:01 PM by kukeltje

parser for each jpdl release

tom.baeyens Oct 1, 2009 3:43 PM

Just wanted to give you guys an early update. I encountered a significant unexpected issue related to updating the classloading: because of those updates, I ended up changing the jPDL parsing. The occasion is minor: I want to refactor the expr attribute to object-expr in user code declarations. But the implications turn out to be hard.

Then I realized that the old (v4.0 and v4.1) process files are in the DB. For users that want to upgrade, this parser change potentially breaks existing installations in a non-recoverable manner as the new parser might not parse the existing deployed process definitions in the same way. (the XML is stored in the DB)

The solution I'm currently targeting is pretty involved:

* Each release will have its own namespace URI. (i forgot this in 4.1. so release 4.1 still contains the 4.0 namespace)

* Each version of jPDL will add its own parser. All old parsers will be kept in the codebase as well, so that jBPM can still parse older process versions that are already deployed and stored in the DB.

* Each time a process is deployed, the optional namespace declaration is checked. If it is not present, the deployer will add the namespace of the current jPDL version and re-serialize the process XML in the deployment that will be stored in the DB. That way the deployed process in the DB explicitly contains the version of the process XML. Later, each time jBPM parses that process file after a reboot, the correct parser for that jPDL version can be used.

* So users should still be able to deploy e.g. a jPDL process in version 4.3 XML in jBPM version 4.7

* Migration tool will apply the namespace check and add the 4.0 namespace.

* For testing, I think it is sufficient to use a single test suite. When we embark on a new version, the full test suite will be using the new version's parser. The old version's parser will not be changed any more. So that is why I think it is ok if it is not tested any more.
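A minimal sketch of the parser-per-namespace idea in the list above, in Java. The names `JpdlParserSketch` and `JpdlParserRegistry` are hypothetical and are not jBPM classes; the real parser contract is certainly richer than a single method returning a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser contract, standing in for whatever the real jPDL parser exposes. */
interface JpdlParserSketch {
    String parse(String processXml);
}

/** Keeps one frozen parser per release and selects it by the namespace URI of the process. */
final class JpdlParserRegistry {
    private final Map<String, JpdlParserSketch> parsersByNamespace = new LinkedHashMap<>();

    /** Registers the parser that belongs to one release's namespace URI. */
    void register(String namespaceUri, JpdlParserSketch parser) {
        parsersByNamespace.put(namespaceUri, parser);
    }

    /** Returns the parser for the given namespace, or fails loudly if none is registered. */
    JpdlParserSketch parserFor(String namespaceUri) {
        JpdlParserSketch parser = parsersByNamespace.get(namespaceUri);
        if (parser == null) {
            throw new IllegalArgumentException(
                "No jPDL parser registered for namespace: " + namespaceUri);
        }
        return parser;
    }
}
```

A caller would register one parser per release at startup, for example under URIs such as `http://jbpm.org/4.0/jpdl` (the URI pattern is an assumption here), and look the parser up with the namespace read from the stored process XML.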

at first glance, this seems to be the only way in which we can support decent backwards compatibility and still allow us to evolve the language.

thoughts ?

1. Re: parser for each jpdl release

sebastian.s Oct 1, 2009 4:24 PM (in response to tom.baeyens)

Thank you very much for the update, Tom. From what I can say with my limited knowledge, this sounds like a reasonable way to provide backwards compatibility while allowing the enhancements to jPDL that are necessary.

By the way: Did anybody manage to look into this issue here?

https://jira.jboss.org/jira/browse/JBPM-2537

I'd love to hear your opinion on this. To me it is a priority issue for the next release because the bug prevents you from using a lot of the features and the power jBPM offers.

Thanks for your efforts and keep up the good work! :)
2. Re: parser for each jpdl release

camunda Oct 2, 2009 2:25 AM (in response to tom.baeyens)

Hey Tom.

Sorry for not replying earlier to the mail. I am not sure about this solution. It depends on how well we can develop these new parsers. I am thinking of parsing fixes for older releases, where complexity might explode if you have to look at a couple of versions.

Hmm, and you changed, you want to ADD namespaces per default???? ;-)

A basic question upfront: Why not forbid removing attributes from the schema at all? That shouldn't happen regularly and shouldn't be hard to avoid, or am I wrong here? It would save us all that complexity...

Cheers
Bernd
3. Re: parser for each jpdl release

koen.aers Oct 2, 2009 4:27 AM (in response to tom.baeyens)

Bernd,

As far as I understood Tom, the old parsers would be parked and no more fixes would be applied to them. In that case the approach seems feasible to me. We possibly need to add a parser with each release *if* the language changes.

And indeed, to my own surprise I saw Tom happily experiment with namespaces yesterday ;-)

Cheers,
Koen
4. Re: parser for each jpdl release

kukeltje Oct 2, 2009 7:41 AM (in response to tom.baeyens)

Ok, the 'parking' of older parsers makes sense. I do have some questions though:
- A version is not only the parser but also accompanying activity implementations. These are instantiated once a process is started (from what I understand). Does this have implications?
- Currently the schema is not fully 'any:any' aware. (I still think it should be 'other:any' for the extensibility.) Fully supporting this is needed if you automatically want to add a namespace, since people can remove the ns now if they want to add custom attributes.
- Does this mean that people have to redeploy their old processes if they want to take advantage of certain bugfixes in e.g. activity implementations?
5. Re: parser for each jpdl release

tom.baeyens Oct 3, 2009 9:17 AM (in response to tom.baeyens)

"camunda" wrote:
It depends on how well we can develop these new parsers. I am thinking of parsing fixes for older releases, where complexity might explode if you have to look at a couple of versions.

after an attempt i did yesterday, I realized that having a parser-per-version gets much trickier than I originally thought.

"camunda" wrote:
Hmm, and you changed, you want to ADD namespaces per default???? ;-)

not really.

basically one of the techniques that we could use is the following: upon deploying the process, we could
* parse the process into dom
* if no namespace is present to indicate the version, we could add a namespace declaration or another attribute that indicates the version
* then serialize that process
* and update the xml in the deployment before it gets saved
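The four steps above can be sketched in Java with the standard JAXP DOM and Transformer APIs. This is only an illustration under stated assumptions: the class `NamespaceStamper` and its methods are hypothetical, and the sketch makes the version explicit by moving every un-namespaced element into the given namespace with `Document.renameNode`, which is one of several ways to get a namespace declaration into the serialized XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: makes the jPDL version explicit before the XML is stored. */
final class NamespaceStamper {

    /** Step 1: parse the process into a namespace-aware DOM. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse process XML", e);
        }
    }

    /** Steps 2 to 4: add the namespace if it is missing and return the XML to store. */
    static String stamp(String processXml, String namespaceUri) {
        Document doc = parse(processXml);
        if (doc.getDocumentElement().getNamespaceURI() != null) {
            return processXml; // version already explicit: keep the user's XML untouched
        }
        moveIntoNamespace(doc, doc.getDocumentElement(), namespaceUri);
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("could not serialize process XML", e);
        }
    }

    /** Renames every element that has no namespace; elements in other namespaces are left alone. */
    private static void moveIntoNamespace(Document doc, Node node, String namespaceUri) {
        Node current = node;
        if (node.getNodeType() == Node.ELEMENT_NODE && node.getNamespaceURI() == null) {
            current = doc.renameNode(node, namespaceUri, node.getNodeName());
        }
        Node child = current.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // captured before a possible node replacement
            moveIntoNamespace(doc, child, namespaceUri);
            child = next;
        }
    }
}
```

Note that the re-serialized XML is not byte-for-byte what the user deployed, which is exactly the downside raised in this post.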

in general we could potentially add or change the xml when we deploy it. but I think a downside is that developers won't like this, because they prefer to still recognize their own XML in the DB.

the alternative is that we leverage the properties that are associated to a deployment object.

"camunda" wrote:
A basic question upfront: Why not forbid removing attributes from the schema at all? That shouldn't happen regularly and shouldn't be hard to avoid, or am I wrong here? It would save us all that complexity...

here's the dilemma:

attribute 'expr' is used in conditions on transitions in a decision. in that case the returned value is expected to be a boolean.

the 'expr' attribute is also used in one of the places where you specify user code. in that case the resulting object is used as the user defined object.

now I want to clean that up:

1) i want to keep expr for conditions, where the returned value is expected to be a boolean.

2) i want to make all the user code parsing consistent. in order not to clash with the expr of condition, I want to use object-expr for all usages where the resulting object is used as user code.

i don't see how we can clean this up without differentiating the parsing between versions.

in the meantime i do think that we should be able to support this with just a couple of if-then-else statements in the single parser based on the namespace or version info.
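As a rough illustration of such a version-dependent branch in a single parser, here is a minimal Java sketch. The class `UserCodeAttributes`, the attribute map parameter, and the version strings used for the branch are all assumptions made for the example; the thread does not say which release would introduce object-expr.

```java
import java.util.Map;

/** Hypothetical helper showing one if-then-else keyed on the jPDL version. */
final class UserCodeAttributes {

    /**
     * Returns the expression that yields the user code object, or null if absent.
     * The attributes map stands in for the attributes of a user code element.
     */
    static String objectExpression(Map<String, String> attributes, String jpdlVersion) {
        if ("4.0".equals(jpdlVersion) || "4.1".equals(jpdlVersion)) {
            // Old releases: 'expr' doubled as the user code object expression.
            return attributes.get("expr");
        } else {
            // Newer releases (assumed): only 'object-expr' yields a user code object,
            // so 'expr' stays reserved for boolean conditions.
            return attributes.get("object-expr");
        }
    }
}
```

The branch keeps old deployed processes readable while new processes use the cleaned-up attribute; every further language change would add another branch of this kind.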
6. Re: parser for each jpdl release

tom.baeyens Oct 3, 2009 9:36 AM (in response to tom.baeyens)

"kukeltje" wrote:
Ok, the 'parking' of older parsers makes sense. I do have some questions though:
- A version is not only the parser but also accompanying activity implementations. These are instantiated once a process is started (from what I understand). Does this have implications?

very true ! that's the tricky thing i ran into yesterday.

so now i think we should aim for the following approach:

* make sure that we can know the version of each process when we parse it. this can be a bit tricky, because 1) we do not force people to specify the namespace, so when deploying a process, the current library version somehow has to be added to the db.

and another issue is this: if we use the namespace for knowing the version, then this is also used to activate or not activate the xml validation. so we could end up with the situation that an invalid process deploys because originally the namespace was not present. if we then add the namespace before we save it in the deployment, then when loading it from the db it contains the namespace and that will activate validation. and hence the process can not be used. given that we use a cache after deployment, this kind of problem might not show up in test environments.

* introduce a couple of if-then-else statements in the parser that depend on the version

* the hard part is testing. i'm still trying to find out how we test this: how can we make sure that old deployed processes will still work correctly in newer versions of jBPM?

i've been thinking about making sure that we run the test suite with a version parameter. then all the processes in the testsuite which don't have an explicit namespace, will use the configured parser version.

but i'm not yet sure if this tests what we actually need, and to what extent it protects us from making backwards incompatible changes.

if we get this wrong, the risk is that we break existing installations after they upgrade.

"kukeltje" wrote:
- Currently the schema is not fully 'any:any' aware. (I still think it should be 'other:any' for the extensibility.) Fully supporting this is needed if you automatically want to add a namespace, since people can remove the ns now if they want to add custom attributes.

if you declare extensibility with any:any, then users can *still* define all their extensions in their own namespace if they want to. they are just not forced to.

why do you think users *need* to define their extensions in a separate namespace? i agree that it would be good practice from a user perspective. but i don't see a reason why we should force our users to do it like that.

"kukeltje" wrote:
- Does this mean that people have to redeploy their old processes if they want to take advantage of certain bugfixes in e.g. activity implementations?

that's the kind of questions that we have to ask ourselves when composing a solution.

i didn't yet see the best solution for this whole compatibility and versioning of deployed processes issue. but my neural network is chewing on this with 100% CPU utilization.
7. Re: parser for each jpdl release

kukeltje Oct 3, 2009 7:16 PM (in response to tom.baeyens)

Detailed comment coming later, but one remark first:

but my neural network is chewing on this with 100% CPU utilization.

I've seen CPUs being 100% busy and effectively doing nothing. The times I've seen a neural network busy 100% of the time it was shortly before a nervous breakdown... Either way, it is not good ;-P
8. Re: parser for each jpdl release

tom.baeyens Oct 5, 2009 3:50 AM (in response to tom.baeyens)

exactly. with my formulation i explicitly didn't want to imply any result coming out of it :-)
9. Re: parser for each jpdl release

kukeltje Oct 5, 2009 7:01 PM (in response to tom.baeyens)

"tom.baeyens@jboss.com" wrote:
"kukeltje" wrote:
Ok, the 'parking' of older parsers makes sense. I do have some questions though:
- A version is not only the parser but also accompanying activity implementations. These are instantiated once a process is started (from what I understand). Does this have implications?

very true ! that's the tricky thing i ran into yesterday.

so now i think we should aim for the following approach:

* make sure that we can know the version of each process when we parse it. this can be a bit tricky, because 1) we do not force people to specify the namespace, so when deploying a process, the current library version somehow has to be added to the db.

I disagree. We should just enforce the namespace (declaration). That is what it is for.

"tom.baeyens@jboss.com" wrote:
and another issue is this: if we use the namespace for knowing the version, then this is also used to activate or not activate the xml validation. so we could end up with the situation that an invalid process deploys because originally the namespace was not present.

if we then add the namespace before we save it in the deployment, then when loading it from the db it contains the namespace and that will activate validation. and hence the process can not be used. given that we use a cache after deployment, this kind of problem might not show up in test environments.

Then don't add it, but (unfortunately) assume that every deployed process without a namespace declaration is 4.0 or 4.1.

"tom.baeyens@jboss.com" wrote:

* introduce a couple of if-then-else statements in the parser that depend on the version

* the hard part is testing. i'm still trying to find out how we test this: how can we make sure that old deployed processes will still work correctly in newer versions of jBPM?

i've been thinking about making sure that we run the test suite with a version parameter. then all the processes in the testsuite which don't have an explicit namespace, will use the configured parser version.

but i'm not yet sure if this tests what we actually need, and to what extent it protects us from making backwards incompatible changes.

if we get this wrong, the risk is that we break existing installations after they upgrade.

Yes, this is *the* challenge. But I'd use a namespace declaration for this. If none is there, assume 4.1. For the testing itself, maybe there is an option to use the tagged (svn) testcases for a specific version and run them against the latest jbpm release. This would prevent an explosion of the number of testcases that we need (to duplicate?) in the source.
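The "use the namespace, otherwise assume a default" rule can be sketched as follows. This is a hedged Java example: the class `JpdlVersionResolver` is hypothetical, and the URI pattern `http://jbpm.org/<version>/jpdl` is an assumption used for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: derives the jPDL version from the root element's namespace URI. */
final class JpdlVersionResolver {

    // Assumed URI layout: http://jbpm.org/<major>.<minor>/jpdl
    private static final Pattern JPDL_NAMESPACE =
        Pattern.compile("http://jbpm\\.org/(\\d+\\.\\d+)/jpdl");

    /**
     * Returns the version encoded in the namespace, or defaultVersion when the
     * process declares no namespace at all.
     */
    static String resolve(String namespaceUri, String defaultVersion) {
        if (namespaceUri == null || namespaceUri.isEmpty()) {
            return defaultVersion;
        }
        Matcher matcher = JPDL_NAMESPACE.matcher(namespaceUri);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a jPDL namespace: " + namespaceUri);
        }
        return matcher.group(1);
    }
}
```

In production the default would be the fixed value suggested here (4.1). A test run could instead pass a configured value, for example one read with `System.getProperty` from a made-up property such as `jpdl.test.version`, which matches the idea of running the test suite with a version parameter.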

"tom.baeyens@jboss.com" wrote:

"kukeltje" wrote:
- Currently the schema is not fully 'any:any' aware. (I still think it should be 'other:any' for the extensibility.) Fully supporting this is needed if you automatically want to add a namespace, since people can remove the ns now if they want to add custom attributes.

if you declare extensibility with any:any, then users can *still* define all their extensions in their own namespace if they want to. they are just not forced to.

Again I strongly disagree. What if they extend some things and we start using the same tags? I *hate* the any:any. It was probably introduced (read: forced upon us) in some weird way by a company that already developed something for its customers. Just like with BPMN2 ;-)

"tom.baeyens@jboss.com" wrote:

why do you think users *need* to define their extensions in a separate namespace? i agree that it would be good practice from a user perspective. but i don't see a reason why we should force our users to do it like that.

To prevent future element/tag clashes, and to be explicit. (I love explicit things, or rather, I hate implicit things: saying one thing but implying another...)
"tom.baeyens@jboss.com" wrote:

"kukeltje" wrote:
- Does this mean that people have to redeploy their old processes if they want to take advantage of certain bugfixes in e.g. activity implementations?

that's the kind of questions that we have to ask ourselves when composing a solution.

What about an activities file per jPDL version? If we know we are breaking backwards compatibility (testcases of the previous version failing against the latest source), we create a new activities class which extends the previous one, and we e.g. introduce a version number in the package name and put that in the new activities file. (I was dreaming when I wrote this, forgive me if it goes astray.)
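One way to picture this proposal is the Java sketch below. Everything in it is hypothetical: the classes `DecisionActivity40`, `DecisionActivity42` and `ActivityBindings` are invented for the example, version suffixes in class names stand in for versioned package names, and a map stands in for the per-version activities file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical activity as shipped with jPDL 4.0; frozen once released. */
class DecisionActivity40 {
    String evaluate() {
        return "4.0 behaviour";
    }
}

/** Hypothetical newer activity: extends the previous one and overrides only what changed. */
class DecisionActivity42 extends DecisionActivity40 {
    @Override
    String evaluate() {
        return "4.2 behaviour";
    }
}

/** Stands in for the per-version activities file: tag name to activity factory. */
final class ActivityBindings {
    static Map<String, Supplier<DecisionActivity40>> forVersion(String jpdlVersion) {
        Map<String, Supplier<DecisionActivity40>> bindings = new HashMap<>();
        bindings.put("decision", DecisionActivity40::new);
        if ("4.2".equals(jpdlVersion)) {
            // Only the incompatible activity is rebound; the rest is inherited.
            bindings.put("decision", DecisionActivity42::new);
        }
        return bindings;
    }
}
```

With such a layout, a process deployed under 4.0 keeps getting the 4.0 activity after an upgrade, while a process that declares the newer version gets the overriding class.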