6 Replies Latest reply on Feb 19, 2007 11:29 AM by drjava

Implementing data flow

drjava Feb 15, 2007 2:27 PM

Has anyone attempted to implement a data flow representation of the process graph using jBPM? In data flow, nodes would have inputs and outputs of different types. Output ports of nodes could be hooked together with compatible input ports of other nodes. Then, object instances could be "passed" from node to node throughout the graph. For example, let's say you created a database connection in the first node of a graph and another node later in the graph also takes a database connection as an input. Using dataflow, the first node could declare an output that could be hooked to the input of the node later in the graph.

I would think process variables could be used to implement this, but input and output constructs would need to be added to jPDL, and the workflow engine would have to do a little work passing the data around and type checking.
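To make the type-checking part concrete, here is a minimal sketch of how compatibility between an output port and an input port might be decided. The `Port` and `PortWiring` classes are hypothetical names invented for illustration; jPDL has no such construct, and this is only one possible rule (plain Java assignability).

```java
/** Hypothetical port descriptor: a named port carrying values of one Java type. */
final class Port {
    final String name;
    final Class<?> type;

    Port(String name, Class<?> type) {
        this.name = name;
        this.type = type;
    }
}

/** Hypothetical wiring check, not part of jBPM. */
final class PortWiring {
    /**
     * An output may feed an input when every value the output can produce
     * is acceptable to the input, i.e. the input type is assignable from
     * the output type.
     */
    static boolean compatible(Port output, Port input) {
        return input.type.isAssignableFrom(output.type);
    }
}
```

Under this rule, an output declared as `Integer` could be hooked to an input declared as `Number`, but not the other way around.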

Any other thoughts? Has anyone tried to do this?

1. Re: Implementing data flow

estaub Feb 15, 2007 5:01 PM (in response to drjava)

Good idea - I was thinking similar things. Being able to see data flow at a high level is very useful.

Currently, I don't believe the process definition can actually know what the flow is, because the handlers can access the variables at runtime without any declaration. The exception is scripts, although even they can access process variables without declaration, through the ExecutionContext.

A complication is presented by events. If a variable is used in an event action handler on entry or exit to/from a node, how should this be denoted?

This kind of UI work requires a really good eye, enough time to experiment, and strong implementation skills. Process graphs are often cluttered already - adding another 2 or 3 dimensions to them (variable/read-write/event-type) will make things too cluttered unless done very cleverly. For example, it may be desirable to show data flow in a display mode that hides the node rectangles, or at least reduces their alpha.

-Ed Staub
2. Re: Implementing data flow

kukeltje Feb 15, 2007 5:05 PM (in response to drjava)

afaik, nobody has tried to do this. But instead of using jpdl, I would suggest using the Graph Oriented Programming 'kernel' (I think Tom invented a new acronym for GOP, but I do not know what it is) and creating a new language. jpdl and bpel use the same core, so an additional language is possible.

Ronald
3. Re: Implementing data flow

koen.aers Feb 15, 2007 5:32 PM (in response to drjava)

"kukeltje" wrote:
I think Tom invented a new acronym for GOP, but do not know what it is

The acronym is PVM for Process Virtual Machine ;-)

Cheers,
Koen
4. Re: Implementing data flow

drjava Feb 16, 2007 11:13 AM (in response to drjava)
Ed - thoughtful points, thank you.

The way I'm imagining it, with the added input and output constructs in the jpdl, nodes could declare n "ports" that take or produce certain types of data. The code in the node, or handler I guess, would be responsible for retrieving data from an input and staging data on an output. The execution engine could then handle moving the data structures in and out of the process variable space using a naming convention that would identify them as belonging to inputs or outputs. For example, when a node calls, let's say,
getInput(String portName)
some infrastructure code would look for the corresponding input in the process variable map, return the object, and remove it from the process variable map. Conversely, when a node wanted to stage an output, it would call
setOutput(String portName, Object output)
and infrastructure code would handle moving the data into the process variable map using the specified naming convention.
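As a rough sketch of the idea, the two calls could be layered over a variable map like this. Everything here is an assumption made for illustration: the `PortVariables` class, the `port:` key prefix used as the naming convention, and the use of a plain `java.util.Map` standing in for the process variable map. It does not use or describe any real jBPM API, and it ignores how an output port would be wired to a differently named input port.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of jBPM. */
final class PortVariables {
    // Assumed naming convention: a staged value lives under "port:" + portName.
    static final String PREFIX = "port:";

    // Stand-in for the process variable map.
    private final Map<String, Object> processVariables;

    PortVariables(Map<String, Object> processVariables) {
        this.processVariables = processVariables;
    }

    /** Stage a value on a port by storing it under the prefixed name. */
    void setOutput(String portName, Object output) {
        processVariables.put(PREFIX + portName, output);
    }

    /** Return the value staged for a port and remove it from the map. */
    Object getInput(String portName) {
        String key = PREFIX + portName;
        if (!processVariables.containsKey(key)) {
            throw new IllegalStateException("No value staged for port: " + portName);
        }
        return processVariables.remove(key);
    }

    /** Small demonstration: one node stages a value, a later node consumes it. */
    public static void main(String[] args) {
        Map<String, Object> vars = new HashMap<>();
        PortVariables ports = new PortVariables(vars);
        ports.setOutput("dbConnection", "connection placeholder");
        Object received = ports.getInput("dbConnection");
        System.out.println(received + ", variables left: " + vars.size());
    }
}
```

Because the prefixed keys never collide with ordinary variable names (as long as nothing else uses the prefix), regular process variables keep working unchanged alongside the port values.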

Using this mechanism, I think the scenarios you describe could be prevented. But I might be missing something. Also, process variables would work just as they currently do, but would be overloaded to support data flow.

Maybe this sounds a lot like a hack, but I was just brainstorming about how this could be implemented with the least amount of work up front.

Your point about cluttering up the process graph is well taken. I concur that another "view" of the graph would be required specifically for the data flow to reduce the complexity.
5. Re: Implementing data flow

drjava Feb 16, 2007 11:27 AM (in response to drjava)

"kukeltje" wrote:
afaik, nobody has tried to do this. But instead of using jpdl, I would suggest using the Graph Oriented Programming 'kernel' (I think Tom invented a new acronym for GOP, but I do not know what it is) and creating a new language. jpdl and bpel use the same core, so an additional language is possible.

Ronald

I hadn't considered that. Thanks. However, that seems like a lot of work considering I need the existing functionality provided by the jpdl. My organization needs control flow *and* data flow. I haven't seen many workflow frameworks provide both, despite the fact that they go hand in hand in my opinion. Most BPM engines seem to do exclusively control flow. Scientific workflows tend to lean towards data flow. I'd love to see a high quality product like jBPM support both.
6. Re: Implementing data flow

drjava Feb 19, 2007 11:29 AM (in response to drjava)

I'm interested in hearing from other jBPM contributors if they see data flow, as I am describing it, as being a possible feature addition to future versions of jBPM.

Andrew
