7 Replies Latest reply on Aug 9, 2006 6:00 PM by Jeff DeLong

    jbpm drools integration meeting

    Tom Baeyens Master

      This is a report of the jBPM-Drools meeting we just had in London.

      1) Rules deployment model is quite different from jBPM's deployment model. This discussion brought us to an interesting conclusion: there does not seem to be an obvious common model on how domain specific sources end up on a runtime deployed system. Drools and jBPM will keep their current deployment models. But as part of the jBPM-drools integration, we worked out how we can build a unified process-rules repository.

      Basically, the idea will be that the unified repository will be a jBPM database, extended with a table for rule bases. A rule base will be stored as a blob and referenced by a name. Then from inside of a process, you can reference the rule bases by name. This scenario allows for rule bases to be deployed and managed separately from processes but in the same repository.
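The deployment idea above can be sketched roughly as follows. This is purely illustrative: the class and method names are hypothetical, and a real implementation would store the blob in the jBPM database rather than an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the unified repository idea: rule bases are stored
// as named blobs next to the jBPM process tables, and processes reference
// them by logical name. All names and types here are illustrative only.
public class RuleBaseRepository {

    // Stands in for the proposed database table: name -> compiled rule base blob
    private final Map<String, byte[]> ruleBasesByName = new HashMap<>();

    // Deploying a rule base is independent of deploying a process
    public void deploy(String name, byte[] compiledRuleBase) {
        ruleBasesByName.put(name, compiledRuleBase);
    }

    // A process references a rule base by logical name at runtime
    public byte[] lookup(String name) {
        byte[] blob = ruleBasesByName.get(name);
        if (blob == null) {
            throw new IllegalArgumentException("no rule base deployed under: " + name);
        }
        return blob;
    }
}
```

The point of the sketch is only the lifecycle split: rule-base deployment and process deployment share one store but have separate deploy operations.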

      A special simplified use case is when a process archive contains a set of rules in source format. In that case a default rule base will be created and associated with the process. The process and the rules will then appear to the user as one logical entity. In a process you can then say 'fireAllRules' without specifying a specific rule base.

      Processes will contain a mapping between logical names and rulebases. Rule bases can be referenced in the unified process repository or in any other location. One of the referenced rule bases can be the default rule base.

      2) Process variables as facts. In each process operation you can already feed all the process variables into a working memory. This will fire rules, and many of the intermediate results will be kept in the working memory. In a subsequent process operation, you might want to fire all rules again. Currently, that would require all the process variables to be fed in again. Re-asserting all the variables as facts is not desirable: all the consequences would be re-fired, with potential side effects. It also implies redoing a lot of calculations whose results were already available in the working memory. Currently, there is no way to store the working memory, restore it later, and re-install the process variables as facts. Mark will be adding support for storing and retrieving working memories. A working memory is composed of facts and an agenda. The solution will be to store the agenda plus a mapping between rule fact handles and process variables. At working-memory reconstruction time, the agenda will be deserialized and the mapping will be used to rebind the fact handles in the agenda to the process variables.
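The rebinding idea might look something like this sketch. It is purely illustrative: the real Drools fact handles, agenda serialization, and restore API would differ, and all names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the proposed working-memory restore: the agenda is
// serialized as-is, while facts are re-bound to process variables through a
// persisted handle -> variableName mapping instead of being re-asserted
// (re-asserting would re-fire consequences). All names are hypothetical.
public class WorkingMemoryState {

    private final byte[] serializedAgenda;            // stored blob of the agenda
    private final Map<Long, String> handleToVariable; // fact handle id -> process variable name

    public WorkingMemoryState(byte[] serializedAgenda, Map<Long, String> handleToVariable) {
        this.serializedAgenda = serializedAgenda;
        this.handleToVariable = new HashMap<>(handleToVariable);
    }

    // On restore, each fact handle is pointed at the current value of its
    // process variable, without going through assert/fireAllRules again.
    public Map<Long, Object> rebind(Map<String, Object> processVariables) {
        Map<Long, Object> handleToFact = new HashMap<>();
        for (Map.Entry<Long, String> e : handleToVariable.entrySet()) {
            handleToFact.put(e.getKey(), processVariables.get(e.getValue()));
        }
        return handleToFact;
    }

    public byte[] getSerializedAgenda() {
        return serializedAgenda;
    }
}
```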

      By default, jBPM will fire all rules when a process variable is updated. The working memory will be stored as process execution data (along side the process variables).

      3) Making process objects (variables, taskInstance, token, ...) available on the right hand side of rules as global variables. Drools will add an interface like this:

      public interface VariableResolver {
          Object resolveVariable(String name);
      }

      Then jBPM and SEAM can each provide an implementation that knows about their contextual objects.
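A jBPM-side implementation of that interface might look like the sketch below. The Map stands in for jBPM's real ContextInstance; the wiring is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// The proposed interface, plus one possible jBPM-side implementation.
// ContextInstance access is simulated with a plain Map here; a real
// implementation would delegate to jBPM's ContextInstance.
interface VariableResolver {
    Object resolveVariable(String name);
}

class MapBackedVariableResolver implements VariableResolver {

    private final Map<String, Object> contextVariables;

    MapBackedVariableResolver(Map<String, Object> contextVariables) {
        this.contextVariables = new HashMap<>(contextVariables);
    }

    // Rules asking for a global by name get the matching process variable
    public Object resolveVariable(String name) {
        return contextVariables.get(name);
    }
}
```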

      All questions and remarks on these strategic directions are appreciated.

        • 1. Re: jbpm drools integration meeting
          Ronald van Kuijk Master

          I know some things about Drools and use it (and will probably use it in a major financial system in the Netherlands, but I still have to convince people).


          Some things I read above:
          1:

          This scenario allows for rule bases to be deployed and managed separately from processes but in the same repository.
          and
          A special simplified use case is when a process archive contains a set of rules in source format.
          sound contradictory; am I missing something?

          In a process you can then say 'fireAllRules' without specifying a specific rule base.
          All rules specific to that process, right? Or also global rules? Making processes dependent on global rules is not necessarily a problem, but keeping referential integrity is.

          2:
          I cannot grasp the first part of this paragraph, which might be the reason for the next questions.

          In a subsequent process operation, you might want to fire all rules again. Currently, that would require that all the process variables will have to be fed in again.
          I do not think this is unwanted behaviour, since chances are they have changed.

          By default, jBPM will fire all rules when a process variable is updated.
          I hope you mean after all process variables in a transaction are updated. I do not (never?) want rules to go off after one update if I update several variables in e.g. a task.

          And why automatically? What is the use case for this? Isn't doing it explicitly in an actionhandler/decision node/custom node enough?

          3:


          Initially this sounds OK, but it introduces the chance that people start signalling tasks/nodes from within JBoss Rules (not Drools anymore, right ;-)). You then get a mix of the pd and the rules, which might lead to a more complex definition. We internally made the choice to let the rules have minimal impact on the process: just return an outcome and have the pd use that info to e.g. make a decision. Since we WANT to store the outcome anyway, we map the outcome of the rules to a process variable and do NOT have the rules definition set the process variable.

          The only situation where we want the rules engine to start a number of tasks is in a custom node (not implemented yet), in a kind of evaluation system where, depending on the outcome of each of 100 rules, either a task should be started, a message sent, or some dossier updated, e.g.
          if (var1 > 10) {
              start task2;
              risk = risk + 10;
          }

          if (var2 == "yes") {
              start task5;
              send message3;
          }

          etc...


          This is one rule base (customer risk assessment) which will fire. We'd like the tasks to be separately defined in the pd (due date etc...) but only created if needed (create task=false, or even skip the node). Maybe we can take this as an example use case and see how JBoss Rules fits in.


          • 2. Re: jbpm drools integration meeting
            Tom Baeyens Master

            Hi, I take the liberty of posting Jeff's response:

            "Jeff" wrote:

            Thanks for including me on the email thread. I am very interested in this topic, as I have been working with a customer to integrate jBPM and Drools.

            The emails contain various solutions (combined repositories, variable mappings, etc.), but in some cases I am not sure what problems are being solved. So I thought I might back up and start with what I think the jBPM / Drools integration requirements are, and see if that helps me understand this email thread.

            I think the goal of jBPM / Drools integration should be to provide the application developers with several integration components out-of-the box that are well documented and address many of the common use cases. In many instances however a customer's requirements will be different enough that they will need to create their own integration components. Even in these cases, what we provide out of the box will help them as a starting point.

            I think the requirements for jBPM / Drools integration are derived from a few general use cases:

            1) Task Assignment
            2) Actions Nodes
            3) Decision Nodes
            4) Rule Flow

            Before discussing the details of each of these, I think there are a couple general observations that should be made. In terms of "how often they change", I believe that users would like to change their rules the most frequently, business processes somewhat frequently, and code (hopefully), not as frequently. Second, they would like business analysts to be able to make the frequent, minor changes to their rules (example below).


            1) Task Assignment. This involves invoking rules from a "RulesAssignmentHandler" to determine the actorId to set on the assignable (TaskInstance or SwimlaneInstance). In this case the organizational model, some domain objects, and the assignable need to be asserted into the rules engine.

            The organizational model does not change too frequently (not more than daily in any case), and is perhaps quite large (thousands of employees), so it would be nice not to have to re-assert it into working memory each time an assignment is made. The domain objects change from process to process, so they should be re-asserted. The assignable could be a global variable.

            The rules themselves might be quite numerous. For example, if we had an organizational model where users were mapped to roles within an office, and we wanted to assign a Task to someone in a specific role (based on the task definition) and office based on the state the insured lived in and the type of claim, then there might be a separate rule for each office and type of claim.

            rule "Determine Chicago Office"
            when
                Insured( $state : state == "Illinois" )
                Claim( $lob : lineOfBusiness == "health" )
            then
                assert( new Office("chicagohealth") );
            end

            ...

            rule "Determine Actor"
            when
                exists Office()
                Office( $officeName : officeName )
                $role : Role()
                Membership( role == $role, office == $officeName, $user : user )
            then
                $a.setActorId( $user.getName() );
            end

            In most cases the same rules would determine task assignment for all tasks within a process definition, so the task assignment rules should be able to be specified for the entire process definition.

            If the assignment rules change, e.g., a new office is opened to support the southern half of Illinois, then it is probably the case that the process definition should change as well. Otherwise, in our example, we might have a claim that was being worked in the Chicago office suddenly being assigned to the new office.

            This type of usage would be supported by a pre-compiled rule base associated with a process definition. (as Tom describes in 1 in his email).

            This scenario would also be well supported through the ability to assert the organizational model into the working memory once, and then use this working memory, asserting new instances of domain objects (and I suppose retracting them as well?). It is not really clear to me how this would work, particularly with multiple threads. Perhaps some kind of partitionable / composite working memory is required.

            Although each customer would likely have their own organizational model, an out-of-the-box implementation based on the jBPM Identity component would be very useful for customers to understand how rules-based assignment can work.


            2) Actions / Nodes. This involves invoking rules at a node in a process, either from an action configured on the node itself or associated with some node event. In this scenario domain objects would be asserted into the working memory, along with the jBPM ContextInstance. The latter would be used to communicate the results of the rule execution, i.e., the right hand side would invoke contextInstance.setVariable("resultsOfRuleExecution", "some result").
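A minimal sketch of the handler shape described here might look as follows. The jBPM ActionHandler and Drools working-memory types are replaced by toy stand-ins (the "rule" is a hard-coded check), so only the shape of the integration is shown; all names are illustrative.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a RulesActionHandler: assert domain objects, fire rules, and
// write the result back as a process variable. Real jBPM/Drools types are
// replaced by minimal stand-ins; names are hypothetical.
public class RulesActionHandlerSketch {

    // Stand-in for jBPM's ContextInstance
    static class Context {
        final Map<String, Object> variables = new HashMap<>();
        void setVariable(String name, Object value) { variables.put(name, value); }
        Object getVariable(String name) { return variables.get(name); }
    }

    // Stand-in for a working memory that evaluates one hard-coded rule
    static String fireRules(List<Object> facts) {
        for (Object fact : facts) {
            if (fact instanceof Integer && (Integer) fact > 100) {
                return "needsReview";   // the "consequence" of our toy rule
            }
        }
        return "ok";
    }

    public void execute(Context ctx, List<Object> domainObjects) {
        // Assert the configured domain objects, fire the rules, and
        // communicate the result to the process through the context instance.
        String result = fireRules(domainObjects);
        ctx.setVariable("resultsOfRuleExecution", result);
    }
}
```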

            The ability to describe on the process definition which domain objects should be asserted by this "RulesActionHandler" is required. However, this must be extremely flexible, as there are many different possible requirements here: assert all instances of a specific domain object, assert a subset of a specific domain object, assert single instances of many types of domain objects, assert multiple instances of many types of domain objects, etc.

            Also, where the domain objects live must be flexible. While it may be convenient to store them as ContextInstance variables, this involves serializing each object and, depending on when they are stored in jBPM, the potential for them to become out of date. Thus just specifying a set of jBPM ContextInstance variables is not enough; there must be a way to describe how to get the domain objects from other repositories (e.g., a Hibernate query, a web service, etc.).

            As noted above, customers expect to be able to make minor changes to their rules within the context of an executing business process. So for example, a rule compares values from two different domain objects, and if the values differ by more than a specified percentage, the rule fires and the consequence sets some jBPM ContextInstance variable. In this scenario the user would expect to be able to adjust that deviation percentage, re-deploy the rule file, and when a currently executing process instance gets to the node that fires that rule, it will fire the latest version. So if the rule base was precompiled and stored within jBPM, there must be a way to modify it while a process instance is executing. I.e., it is okay for the "rule bases to be deployed and managed separately from processes but in the same repository" as long as "separately managed" allows this requirement to be met.

            Although each customer would have their own domain model, an out-of-the-box implementation of a RulesActionHandler along with an example would be very useful for customers to understand how rules can be invoked at a node.


            3) Decision Nodes. This involves invoking rules from a RulesDecisionHandler. In this scenario, domain objects would be asserted into the working memory along with a set of transitions and the token. The RHS of the rules would specify which transition to take, e.g., token.signal("transition"). The token and the transitions could be global variables.
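One possible shape for such a handler is sketched below. The toy "rule firing" replaces a real working memory, and the handler makes explicit the requirement that a decision must yield exactly one transition (the 0-or-N problem raised elsewhere in this thread). All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a RulesDecisionHandler: the rules nominate transitions, and the
// handler insists on exactly one outcome before signalling the token.
// All types here are illustrative stand-ins, not jBPM API.
public class RulesDecisionSketch {

    // Toy "rule firing": each branch nominates a transition if its condition holds
    static List<String> fireRules(int creditScore) {
        List<String> nominated = new ArrayList<>();
        if (creditScore >= 700) nominated.add("approve");
        if (creditScore < 700) nominated.add("reject");
        return nominated;
    }

    public String decide(int creditScore) {
        List<String> results = fireRules(creditScore);
        if (results.size() != 1) {
            throw new IllegalStateException(
                "decision rules produced " + results.size() + " transitions");
        }
        return results.get(0);  // the single transition the token should take
    }
}
```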

            In this scenario we might expect that if the rules for making a process routing decision change, then we might want to re-deploy the process definition.

            Although each customer would have their own domain model, an out-of-the-box implementation of a RulesDecisionHandler along with an example would be very useful for customers to understand how rules based decision nodes can work.


            4) Rule Flow. Using jBPM to control rule flow. In this scenario there is a single working memory for the process instance, and every node in the process definition asserts, retracts, or modifies data in this working memory, causing rules to activate / fire. It is another way to control rule flow at a more macro level (for situations where agenda groups, salience, and just good rule design are not enough).

            This requirement would be supported by a "rule flow jpdl" (much like the Seam "page flow" jpdl). Nodes would become blocks of rules (like agenda groups with even more control over when they execute). Process variables would constitute the objects asserted. When the process instance is created, the process variables would be instantiated; others may be created as the process proceeds.

            I am not sure this requirement involves persistent wait states. That is, I believe the execution of the entire process instance happens in a single transaction, with no requirement to persist the intermediate state of the rule execution (I may be wrong here; I have not given this scenario much thought). On the other hand, if persistent wait states within the rule flow are required, then what Tom proposes for storing / reconstructing the working memory would be required as well.

            I don't understand Tom's comment "By default, jBPM will fire all rules when a process variable is updated." If the update comes from the rules updating the working memory, then they will fire (if their conditions are now satisfied). How else would the process variables change (from some outside source?). Perhaps Tom has a different use case in mind for number 2, although I don't think it fits in the other use cases I described above.

            Jeff


            • 3. Re: jbpm drools integration meeting
              Tom Baeyens Master

              Hi Jeff,

              "Jeff" wrote:
              | 1) Task Assignment.


              For calculating the actual actorId (user or group) that is responsible for a certain task, I think that rules are not really suited. Jeff, you obviously have more real-life experience, so I would appreciate your comment on this. One of the big problems I see is that with rules you're not sure if you'll find 0, 1 or more solutions.

              But I DO think that rules are suited to calculate all the constraints that people have to satisfy in order to apply for a certain task. So after the rules have fired, the rule base could yield those constraints, and a plain Hibernate query could be built from those results. Would that approach make sense? Another advantage would be that we don't have to assert the whole identity store into the working memory. The data model of the identity store might have to have a generic way of managing properties for people, so that the rules can produce plain (Hibernate?) criteria that express the constraints.
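The constraint-based alternative might be sketched like this. A Predicate stands in for the Hibernate criteria mentioned above, and the User class and its properties are hypothetical; the point is that the rules emit constraints rather than receiving the whole identity store as facts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Sketch: instead of asserting the whole identity store into working memory,
// the rules emit constraints, and a query applies them to find candidate
// actors. Predicate stands in for Hibernate criteria; all names hypothetical.
public class ConstraintBasedAssignment {

    static class User {
        final String name;
        final String office;
        User(String name, String office) { this.name = name; this.office = office; }
    }

    // Pretend rule output: constraints the assignee must satisfy
    static List<Predicate<User>> rulesProduceConstraints() {
        List<Predicate<User>> constraints = new ArrayList<>();
        constraints.add(u -> u.office.equals("chicagohealth"));
        return constraints;
    }

    // The "query" step: filter the identity store with the accumulated constraints
    public List<String> candidates(List<User> identityStore) {
        List<Predicate<User>> constraints = rulesProduceConstraints();
        List<String> names = new ArrayList<>();
        for (User u : identityStore) {
            boolean ok = true;
            for (Predicate<User> c : constraints) ok = ok && c.test(u);
            if (ok) names.add(u.name);
        }
        return names;
    }
}
```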

              "Jeff" wrote:
              | Perhaps some kind of
              | partitionable / composite working memory is required.


              I think that would make sense. But Mark and Michael will know better than me.

              "Jeff" wrote:
              | 2) Actions / Nodes. This involves invoking rules as at a node
              | in a process, either from an action configured on the node
              | itself or associated with some node event.


              Maybe the rules only need to be fired when the process data (== facts) change. So I think I should add a variable-update event or something like that, so that the rules can be fired after each variable update. I'm still in doubt whether explicit rule invocations at a specific location in the process make sense.

              "Jeff" wrote:
              | communicate the results of the rule execution. I.e., the
              | right hand side would invoke
              | contextInstance.setVariable("resultsOfRuleExecution", "some result").


              Another option could be that after a rules calculation, certain facts could be fetched from the working memory and copied into the process variables. The variable names could be part of the action handler configuration. But probably this is not possible because the working memory doesn't yet have named facts, right?

              So how do we collect the new data that is generated by the rules? Should the rules themselves distinguish between an assert and a setVariable on the process? Defining contextInstance as a global variable seems the best approach to me. But it would be great if the facts were named, rules only contained asserts, and jBPM was able to collect all newly asserted information into the process variables transparently.

              "Jeff" wrote:
              | As noted above, Customers expect to be able to make minor
              | changes to their rules within the context of an executing
              | business process.


              Maybe my next answer is not quite in the context of your point, but it relates to this and I have been thinking about it since our London meeting.

              In my opinion, rules sources and process sources are on the same level as Java sources. They're best managed in a source control system like CVS or SVN. The fact that projections of rules and processes can be presented as pictures that make sense to non-tech business people does not change this. We should be careful in our admin consoles not to allow process and rule updates that are not synched with the source code just for the sake of the business user. We can only do that if there is a clear strategy for how the sources will be kept in sync.

              This is important because I do NOT think that, in general, rules and processes are stand-alone things that can be managed by non-techies. While the graphical projections of processes and rules make sense, both processes and rules will most likely include technical details. So processes and rules cannot live in isolation; they will be integrated in plain Java development. Therefore, they cannot be managed by business users alone, and it is important that we keep an eye on the development environment. We should prevent a situation where a developer has to change some code in CVS, then update the rule DB, then update the process DB just to start testing. That creates a very difficult deployment structure. Similarly, we have to prevent the admin console from allowing updates to the deployment that then have to be synched back to the development sources.

              I just want to warn that by focusing too much on the business user, we might introduce a complex source control, build and deployment setup for the development team.

              This does not mean that business users can't have what they want. They can still have their GUI editing of processes and rules, but it should be kept in sync with the technical development source control system.

              "Jeff" wrote:
              | 3) Decision Nodes.


              Even if in natural English it makes sense that 'a decision is based on rules', I'm not sure a rule engine is the proper instrument to calculate a decision node in a jPDL process. Again, as with task assignment, you're not sure whether 0, 1 or more results will come out of the rule engine calculation. The decision node needs exactly *1* result: namely, the name of the transition to take.

              So finding the proper use cases for how the technologies make sense together would be more interesting to me than showing how the two can be hooked up technically. The latter is not so hard, I think.

              "Jeff" wrote:
              | This requirement would be supported by a "rule flow jpdl"
              | (much like the Seam "page flow" jpdl).


              Yes, that could be done in another process language. It probably could be done with jPDL as well, but a special-purpose process language would give you much more convenience than the more generic jPDL.

              "Jeff" wrote:
              | I am not sure this requirement involves persistent wait
              | states.

              A node would be a wait state, but it wouldn't have to be persisted.

              "Jeff" wrote:
              | I don't understand Tom's comment "By default, jBPM will fire
              | all rules when a process variable is updated."



              I mean that there is one use case in which the combination of rules and process makes good sense to me (much more than in assignment or decisions): where the process variables are the facts in a working memory and the rules calculate derived information from the process variables. That derived information might also be stored as process variables (as discussed above). To keep the derived information up to date, all rules need to be fired each time a process variable changes. So that is where we need some sort of event in jBPM that allows us to inject the 'fireAllRules' after each setVariable in the jBPM context.
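The variable-update hook described here might be sketched as follows. The "rule" is a hard-coded recomputation standing in for fireAllRules, and all names are illustrative rather than jBPM API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed hook: every setVariable on the process context
// triggers a fireAllRules so derived facts stay current. The toy rule
// recomputes one derived variable; all names are hypothetical.
public class AutoFiringContext {

    private final Map<String, Object> variables = new HashMap<>();

    public void setVariable(String name, Object value) {
        variables.put(name, value);
        fireAllRules();   // the proposed jBPM variable-update event
    }

    public Object getVariable(String name) {
        return variables.get(name);
    }

    // Toy rule: derived "risk" is recomputed whenever "amount" changes
    private void fireAllRules() {
        Object amount = variables.get("amount");
        if (amount instanceof Integer) {
            variables.put("risk", (Integer) amount > 1000 ? "high" : "low");
        }
    }
}
```

Whether the firing should happen per update or once per save (as discussed later in this thread) is exactly the open batching question.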

              • 4. Re: jbpm drools integration meeting
                Tom Baeyens Master

                 

                "kukeltje" wrote:

                Some things I read above:
                1:

                This scenario allows for rule bases to be deployed and managed separately from processes but in the same repository.
                and
                A special simplified use case is when a process archive contains a set of rules in source format.
                sounds contradictory, do I miss something?


                The jBPM deployment can detect that there are rule sources in the process archive. Then at deployment time, it can compile all the rule files into a rule base and deploy that to the rule repository with a link to the process definition.

                "kukeltje" wrote:

                In a process you can then say 'fireAllRules' without specifying a specific rule base.
                All rules specific to that process right? or also global rules? Making processes dependend on global rules in not neccesarilly a problem, but keeping referential integrity is.


                The process-local rule base would only be there to create a simple environment where a process and its rules appear as one logical entity. Apart from that, we must also support global rule bases. But in that case, the process developer must always specify the name of the rule base in the process and manage the rule deployment separately. The process-local rules are a simplified scenario for when the developer considers the process and rules as one logical entity.

                "kukeltje" wrote:

                2:
                I cannot grasp the first part of this paragraph which might be the reason for the next questions

                In a subsequent process operation, you might want to fire all rules again. Currently, that would require that all the process variables will have to be fed in again.
                I do not think this is unwanted behaviour since the chances are big they have changed


                They need to be re-installed in the rule base because indeed they might have changed. But they should not be re-asserted, as that might fire rules and cause consequences (right hand sides) to be executed again, with possible side effects, since these have already been executed. The idea is that the working memory should behave as if it is restored transparently after the previous use. One way would be to serialize it into a process variable, but that would create duplicates of the process-variable facts. Only the agenda part of the working memory should be serialized, and the facts should be re-installed without being asserted, as asserting might mess up the agenda.

                "kukeltje" wrote:

                By default, jBPM will fire all rules when a process variable is updated.
                I hope you mean after all process variables in a transaction are updated. I do not (never?) want rules to go of after one update if I update several variables in e.g. a task


                Probably it doesn't make sense to fire the rules after each update, but only when a process is saved. Every process variable update should then be marked with a flag that triggers the rule firing... I am still in doubt and haven't thought this through yet.

                I'll respond to the rest later, as my 30' internet connection is almost consumed :-)


                • 5. Re: jbpm drools integration meeting
                  Peter Lin Newbie

                  Mark asked me to comment on this, so here goes my biased 2 cents.

                  From first-hand experience building applications with JESS, treating rules as ordinary code doesn't match reality. I understand the concerns about controlling the rules and reducing risk, but the main selling point of a rule engine and rule approach is that it enables business analysts to write the rules.

                  Take for example eBay. They currently use JRules from ILOG. At eBay, the rules are used to route transactions and filter out bogus stuff. If the rules are considered Java code, that implies they should go through a full deployment process. This would mean the business analysts couldn't change the rules on the fly during the day to filter out bogus products or bids.

                  Another example: within the securities world, compliance officers are responsible for writing rules and making sure violations of government regulations do not occur. When violations occur, they result in hefty fines ranging into the millions. Many older compliance systems treat rules as code, which means it takes large institutions 8-10 months to write, test and deploy new rules.

                  The reality is that the rules always have to be in sync with the application, so whether CVS or a rule repository is used is not the issue. The issue is how a rule repository can find a good balance between risk control and flexibility. Many of the cases I know of use a rule template approach. A rule template defines a given pattern, which users populate. Whenever a new rule template is introduced, it goes through a rigorous validation and testing process. If a business analyst creates a new rule using an existing template, the risk is rather low. This reduces the cost of testing and validation.
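The template idea might be sketched as follows: the template fixes the rule's shape, and an analyst only supplies parameter values, so each new instance carries little risk. The DRL-like output format, class names, and field names are illustrative only.

```java
// Sketch of the rule-template approach: a fixed, pre-validated pattern is
// instantiated with analyst-supplied parameters. The generated DRL-like
// text and the fact types (Insured, Claim, Office) are illustrative only.
public class RuleTemplate {

    private final String name;

    public RuleTemplate(String name) {
        this.name = name;
    }

    // Fill the fixed pattern with analyst-supplied values
    public String instantiate(String state, String lineOfBusiness, String office) {
        return "rule \"" + name + "\"\n"
             + "when\n"
             + "    Insured( state == \"" + state + "\" )\n"
             + "    Claim( lineOfBusiness == \"" + lineOfBusiness + "\" )\n"
             + "then\n"
             + "    assert( new Office(\"" + office + "\") );\n"
             + "end\n";
    }
}
```

Because only the parameter slots vary, testing effort concentrates on validating the template once rather than every generated rule.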

                  Having said that, using a BRMS (business rule management system) is best suited to rules that use a data-driven approach, like Drools 3, JRules, Blaze, JESS, etc. If the rules couple the data to the rules, then a BRMS approach is more painful and likely not useful. I don't know how jBPM works or whether it uses a data-driven approach. If jBPM doesn't use a data-driven approach, then I would agree a CVS/SVN approach is better suited.

                  Whether jBPM should tie the data to the rules is a different issue beyond the scope of this thread.

                  peter

                  • 6. Re: jbpm drools integration meeting
                    Tom Baeyens Master

                     

                    "woolfel" wrote:
                    Many of the cases I know of use a rule template approach. A rule template defines a given pattern, which users populate. When ever a new rule template is introduced, it goes through a rigorous validation and testing process. If a business analyst creates a new rule using an existing template, the risk is rather low. this reduces the cost of testing and validation.


                    Exactly. I agree, and come from a different perspective: business users see a projection of the actual software artifacts, like rules and processes. If the 'moving parts' are restricted within the projection, business users can just do their thing without needing a developer.

                    A decision table spreadsheet is a good example of that. See also Keith Swenson http://kswenson.wordpress.com/2006/07/09/what-bpm-can-learn-from-a-spreadsheet/ and me http://jboss.org/jbossBlog/blog/tbaeyens/2006/07/26/Clean_handoff_Collaboration_and_Pluggable_Process_Constructs.txt


                    "woolfel" wrote:
                    having said that, using a BRMS (business rule management system) is best suited if the rules use a data driven approach like drools3, jrules, blaze, jess, etc. If the rules couple the data to the rules, then a BRMS approach is more painful and likely not useful. I don't know how jBPM works or whether it uses a data driven approach. If jBPM doesn't use a data driven approach, then I would agree a CVS/SVN approach is better suited.


                    I don't care about CVS/SVN either. But my reasoning is that sources should always be kept in sync by the process in which you develop software, not by software that tries to synchronize CVS with a runtime rules deployment server. In my opinion, in the general case, rules and processes cannot live outside a development environment. And development environments are typically managed by CVS/SVN, so that is why I think our efforts should go toward providing the tools so that business users operate on that repository, rather than on a separate repository, which would leave us with multiple repos to keep synchronized.

                    • 7. Re: jbpm drools integration meeting
                      Jeff DeLong Master

                      Tom,

                      Well, I just spent two hours responding in the forum point by point to your comments on my last email, only to have the page go away when I hit Preview. So now I have written this up in a separate doc and copy-and-pasted it. So if the tone of this is too terse, it is the result of having lost that first response.

                      The short answer is that I believe the use cases I have described are valid: that is, real customers do use BPM products and production rules engines in the manners I have described (including jBPM and Drools!). They are based on real-life customer engagements, both from my work as a JBoss consultant and from my past consulting experience.

                      For a quick example however, just think of credit scoring within a lending process. Lenders have been using production rules technology to perform credit scoring for years. From a process perspective, it could be modeled as a separate node in the process graph, or as a "decision node" to approve or reject. Either will work and deliver the same result, but one process design may be more expressive to a given customer than the other.

                      As for task assignment, think of the example I gave before, then add skill level, procedure code, availability, and three or four other constraints, and you can understand why an insurance company might want to use a rules engine to determine which user (or set of users) to assign the task to, as opposed to writing some complex, unmanageable SQL query. Clearly the issue of asserting a large organizational model needs to be addressed as part of the design, but that is a rules design issue.
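To make the task-assignment idea concrete, here is a minimal Java sketch. The `Adjuster` record and its field names are invented for illustration, and plain stream predicates stand in for actual Drools rules; in jBPM the resulting list would be handed to `assignable.setPooledActors(...)` from inside an AssignmentHandler.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical user record; the field names are illustrative, not part of jBPM.
class Adjuster {
    final String id;
    final Set<String> procedureCodes;
    final boolean available;
    Adjuster(String id, Set<String> procedureCodes, boolean available) {
        this.id = id; this.procedureCodes = procedureCodes; this.available = available;
    }
}

// AssignmentHandler-style selection: declarative conditions (plain predicates
// here, standing in for Drools rules) pick the pool of candidate actors,
// instead of a hand-written SQL query.
class ClaimAssignment {
    static List<String> pooledActors(List<Adjuster> organization, String procedureCode) {
        return organization.stream()
                .filter(a -> a.available)                              // availability constraint
                .filter(a -> a.procedureCodes.contains(procedureCode)) // skill/procedure constraint
                .map(a -> a.id)
                .collect(Collectors.toList());
    }
}
```

Each added constraint is one more filter (one more rule), which is exactly where a rules engine stays manageable and a growing SQL query does not.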

                      Also, to clarify: the fact that a set of rules might be satisfied by multiple objects is not a detriment to their use; another rule can then determine which of the matching objects should be selected if the business case only requires one. In task assignment, it is legitimate to set multiple users via the setPooledActors interface, if this supports the business model.

                      From a practical standpoint, jBPM already supports the integration that I have described through its delegation model. AssignmentHandlers, ActionHandlers, and DecisionHandlers can all invoke rules very easily. And Drools can assert new objects, or set properties on existing objects, making it quite easy to communicate the results of the rule execution to the process instance. So there is really no new jBPM development work needed to support what I have described. I do think providing documented examples would be useful, however.
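The delegation pattern described above can be sketched as follows. The `ExecutionContext` and handler classes here are simplified stand-ins for the real jBPM and Drools APIs (the actual classes live under `org.jbpm.*` and `org.drools.*`), and the inlined condition with its 620 threshold is an invented placeholder for a compiled rule base.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for org.jbpm.graph.exe.ExecutionContext; only the shape
// of the variable-access calls is meant to match.
class ExecutionContext {
    private final Map<String, Object> variables = new HashMap<>();
    Object getVariable(String name) { return variables.get(name); }
    void setVariable(String name, Object value) { variables.put(name, value); }
}

class Applicant {
    final int creditScore;
    Applicant(int creditScore) { this.creditScore = creditScore; }
}

// A DecisionHandler-style delegate: pull process variables, hand them to the
// rule engine as facts, fire the rules, and return the result as a transition name.
class CreditScoringDecisionHandler {
    public String decide(ExecutionContext ctx) {
        Applicant applicant = (Applicant) ctx.getVariable("applicant");
        // With real Drools this would be: workingMemory.assertObject(applicant);
        // workingMemory.fireAllRules(); — here one inlined condition stands in
        // for the rule base, and 620 is an invented example threshold.
        boolean approved = applicant.creditScore >= 620;
        ctx.setVariable("approved", approved);   // rules can also set properties on facts
        return approved ? "approve" : "reject";  // jBPM follows the transition with this name
    }
}
```

An AssignmentHandler or ActionHandler would follow the same shape: read variables from the execution context, assert them as facts, fire the rules, and write the results back.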

                      As for some of the other ideas being discussed, I don't understand the need to persist Rules or RulesBases inside the jBPM database, as long as there is a BRMS to manage these objects.

                      I don't agree with the need to manage rules, process definitions, and Java all from within CVS either. This works for simple scenarios (like the examples I create) but would not support a production environment very well. I think a BRMS requires a separate "rule repository" to manage rules and rule bases, and process components should go to the BRMS when they need to run some rules.

                      Users do require rules to be updatable by non-developers to a certain extent, without the need to change the process definition or the code, and without having to re-deploy either the ear or the process definition. It is up to rule authors to create rules that allow some flexibility in this regard. An example of this is relaxing the number of years at the same address from three years to two in some credit scoring rules. Since the domain model is not changing (i.e., numberOfYearsAtCurrentResidence is already an attribute of the Applicant object), this change can be made without re-compiling and re-deploying code. Clearly the BRMS needs access to the latest Java from CVS in order to compile the rules, but that is a BRMS function.
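The years-at-address example amounts to keeping the threshold in the rule (data a BRMS can manage) while the compiled domain model stays fixed. A minimal sketch, where the `Applicant` attribute name follows the post but the `ResidencyRule` class is hypothetical:

```java
// Domain model: compiled Java, unchanged when the rule is edited.
class Applicant {
    final int numberOfYearsAtCurrentResidence;
    Applicant(int years) { this.numberOfYearsAtCurrentResidence = years; }
}

// Hypothetical rule object: the threshold is BRMS-managed data, so relaxing
// it from 3 to 2 requires no Java recompile or redeploy.
class ResidencyRule {
    final int minYears;
    ResidencyRule(int minYears) { this.minYears = minYears; }
    boolean passes(Applicant a) {
        return a.numberOfYearsAtCurrentResidence >= minYears;
    }
}
```

Only edits that stay within the existing domain model get this flexibility; adding a new attribute would still require a code change.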

                      Finally, I don't understand the use case for "changing process variables causes the rules to fire", so I would be interested in discussing a business process example that requires this in more detail.