6 Replies Latest reply on Feb 19, 2007 11:29 AM by Sam Healy

    Implementing data flow

    Sam Healy Newbie

      Has anyone attempted to implement a data flow representation of the process graph using jBPM? In data flow, nodes would have inputs and outputs of different types. Output ports of nodes could be hooked together with compatible input ports of other nodes. Then, object instances could be "passed" from node to node throughout the graph. For example, let's say you created a database connection in the first node of a graph and another node later in the graph also takes a database connection as an input. Using dataflow, the first node could declare an output that could be hooked to the input of the node later in the graph.

      I would think process variables could be used to implement this, but input and output constructs would need to be added to the jpdl. And, the workflow engine would have to do a little work passing the data around and type checking.
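To make the idea of typed, connectable ports concrete, here is a minimal sketch in plain Java. It is only an illustration: `Port`, `DataFlowGraph`, `compatible` and `connect` are hypothetical names, not part of jBPM or jPDL, and the compatibility rule (the input type must be assignable from the output type) is an assumption about what "compatible" would mean.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; nothing here exists in jBPM or jPDL.
final class Port {
    final String node;   // name of the node that owns the port
    final String name;   // port name, unique within the node
    final Class<?> type; // declared type of the data on this port

    Port(String node, String name, Class<?> type) {
        this.node = node;
        this.name = name;
        this.type = type;
    }
}

final class DataFlowGraph {
    // Each entry is an {output, input} pair.
    private final List<Port[]> connections = new ArrayList<>();

    // Assumed rule: an output may feed an input only if the input's declared
    // type can hold every value the output may produce.
    static boolean compatible(Port output, Port input) {
        return input.type.isAssignableFrom(output.type);
    }

    // Type checking happens when the graph is wired, not when it runs.
    void connect(Port output, Port input) {
        if (!compatible(output, input)) {
            throw new IllegalArgumentException(
                "Cannot connect " + output.node + "." + output.name
                + " (" + output.type.getName() + ") to "
                + input.node + "." + input.name
                + " (" + input.type.getName() + ")");
        }
        connections.add(new Port[] {output, input});
    }

    int connectionCount() {
        return connections.size();
    }
}
```

With this in place, the database example would declare an output port of type `java.sql.Connection` on the first node and an input port of the same type on the later node, and `connect` would reject any attempt to hook that output to, say, a `String` input.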

      Any other thoughts? Has anyone tried to do this?

        • 1. Re: Implementing data flow
          Edward Staub Expert

          Good idea - I was thinking similar things. Being able to see data flow at a high level is very useful.

          Currently, I don't believe the process definition can actually know what the flow is, because the handlers can access the variables at runtime without any declaration. The exception is scripts, although even they can access process variables without declaration, through the ExecutionContext.

          A complication is presented by events. If a variable is used in an event action handler on entry or exit to/from a node, how should this be denoted?

          This kind of UI work requires a really good eye, enough time to experiment, and strong implementation skills. Process graphs are often cluttered already - adding another 2 or 3 dimensions to them (variable/read-write/event-type) will make things too cluttered unless done very cleverly. For example, it may be desirable to show data flow in a display mode that hides the node rectangles, or at least reduces their alpha.

          -Ed Staub

          • 2. Re: Implementing data flow
            Ronald van Kuijk Master

            afaik, nobody has tried to do this. But instead of using jpdl, I would suggest using the Graph Oriented Programming 'kernel' (I think Tom invented a new acronym for GOP, but do not know what it is) and creating a new language. jpdl and bpel use the same core. An additional language is possible.

            Ronald

            • 3. Re: Implementing data flow
              Koen Aers Master

               

              "kukeltje" wrote:
              I think Tom invented a new acronym for GOP, but do not know what it is


              The acronym is PVM for Process Virtual Machine ;-)

              Cheers,
              Koen

              • 4. Re: Implementing data flow
                Sam Healy Newbie

                Ed- Thoughtful points, thank you.

                The way I'm imagining it, with the added input and output constructs in the jpdl, nodes could declare n "ports" that take or produce certain types of data. The code in the node (or in its handler, I guess) would be responsible for retrieving data from an input and staging data on an output. The execution engine could then handle moving the data structures in and out of the process variable space using a naming convention that would identify them as belonging to inputs or outputs.

                For example, when a node calls, let's say, getInput(String portName), some infrastructure code would look for the corresponding input in the process variable map, return the object, and remove it from the process variable map. Conversely, when a node wants to stage an output, it would call setOutput(String portName, Object output), and infrastructure code would handle moving the data into the process variable map using the specified naming convention.
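A rough sketch of that mechanism in plain Java might look like the following. Everything here is an assumption made for illustration: a `java.util.Map` stands in for the process variable map, `NodePorts` and `transfer` are hypothetical names, and the `in:`/`out:` key prefixes are just one possible naming convention. No jBPM API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a plain Map stands in for the process variable map.
final class NodePorts {
    private final String node;
    private final Map<String, Object> variables;

    NodePorts(String node, Map<String, Object> variables) {
        this.node = node;
        this.variables = variables;
    }

    // Assumed naming convention: "in:<node>:<port>" and "out:<node>:<port>".
    static String inputKey(String node, String port) {
        return "in:" + node + ":" + port;
    }

    static String outputKey(String node, String port) {
        return "out:" + node + ":" + port;
    }

    // Returns the object staged on the input and removes it from the map.
    Object getInput(String portName) {
        String key = inputKey(node, portName);
        if (!variables.containsKey(key)) {
            throw new IllegalStateException("No data staged on " + key);
        }
        return variables.remove(key);
    }

    // Stages an object on an output port of this node.
    void setOutput(String portName, Object output) {
        variables.put(outputKey(node, portName), output);
    }

    // What the engine would do along a connection: move a staged output
    // to the downstream node's input, if anything was staged.
    static void transfer(Map<String, Object> variables,
                         String fromNode, String outPort,
                         String toNode, String inPort) {
        String from = outputKey(fromNode, outPort);
        if (variables.containsKey(from)) {
            variables.put(inputKey(toNode, inPort), variables.remove(from));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> vars = new HashMap<>();
        new NodePorts("openDb", vars).setOutput("connection", "conn-1");
        transfer(vars, "openDb", "connection", "runQuery", "connection");
        System.out.println(new NodePorts("runQuery", vars).getInput("connection"));
    }
}
```

Because the port keys are prefixed, ordinary process variables can live in the same map without colliding with port data, which is the "overloading" of process variables described in this post.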

                Using this mechanism, I think the scenarios you describe could be prevented. But I might be missing something. Also, process variables would work just as they currently do, but would be overloaded to support data flow.

                Maybe this sounds a lot like a hack, but I was just brainstorming about how this could be implemented with the least amount of work up front.

                Your point about cluttering up the process graph is well taken. I concur that another "view" of the graph would be required specifically for the data flow to reduce the complexity.



                • 5. Re: Implementing data flow
                  Sam Healy Newbie

                   

                  "kukeltje" wrote:
                  afaik, nobody has tried to do this. But instead of using jpdl, I would suggest using the Graph Oriented Programming 'kernel' (I think Tom invented a new acronym for GOP, but do not know what it is) and creating a new language. jpdl and bpel use the same core. An additional language is possible.

                  Ronald


                  I hadn't considered that. Thanks. However, that seems like a lot of work considering I need the existing functionality provided by the jpdl. My organization needs control flow *and* data flow. I haven't seen many workflow frameworks provide both, despite the fact that they go hand in hand in my opinion. Most BPM engines seem to do exclusively control flow. Scientific workflows tend to lean towards data flow. I'd love to see a high quality product like jBPM support both.

                  • 6. Re: Implementing data flow
                    Sam Healy Newbie

                    I'm interested in hearing from other jBPM contributors if they see data flow, as I am describing it, as being a possible feature addition to future versions of jBPM.

                    Andrew