I'm having great trouble designing an application that uses jBPM5 in a scalable way. It seems to be impossible on a practical level. I hope that this is just due to my misunderstanding of the way jBPM5 works.
I have listed each of my assumptions, and its ramifications, below. I hope that someone from the community will be able to point out where these assumptions are incorrect.
I would particularly appreciate it if anyone who has managed to scale jBPM5 could describe how they achieved it.
1. Every time jBPM5 executes a process, the database transaction updates the process instance info AND the session info. Therefore:
1a. it's not scalable to use a single session to execute all processes, or you would suffer contention on the session info.
1b. it's not scalable to use a single session to execute all processes in a cluster, or the updated session info would have to be continually synchronised across the cluster.
2. When using BPMN2 events, jBPM5 only allows you to send events to the process instances within a single session at a time. You need to maintain a list of all the sessions which have incomplete process instances(*), and loop through them all to send events. Therefore:
2a. you should execute all processes in as few sessions as possible, to lessen the number of iterations through this loop.
3. jBPM5 persists BPMN2 timer info in the session info, but the session must be active (i.e. loaded from persistence) in order for the timers to fire. Therefore:
3a. when your application starts, you must load all sessions that have active process instances that have timers(**).
3b. you must not have the same session active in two different nodes of a cluster, or the same timers will fire on both nodes at around the same time.
3c. when a node crashes, your application must detect this and reload the sessions that were active in the crashed node.
4. If you start a process instance in a session, that process instance must always be executed in that session. Therefore:
4a. when a node wishes to resume a process instance that was persisted, it must first (due to 3b) ask all other nodes if they have the session active, and if so instruct them to dispose it. It can then load the session, load and resume the process. All while preventing race conditions.
4b. when a node receives an event it must (due to 2) carry out all the processing in 4a for each session with active process instances.
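As a rough illustration of what 3b, 3c and 4a imply, here is a minimal sketch of the "only one node may hold a session" bookkeeping. Everything in it is an assumption made for illustration: `SessionOwnershipTable` is a hypothetical class, not part of jBPM5, and the in-memory map stands in for whatever shared store a real cluster would need (a database row, a distributed lock, and so on). In real code, a successful claim would presumably be followed by loading the session from persistence, and a release would be preceded by disposing it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical stand-in for a cluster-wide ownership table (not a jBPM5 API).
 * It records which node currently has each persisted session loaded.
 */
final class SessionOwnershipTable {
    // sessionId -> id of the node that currently has the session active
    private final Map<Integer, String> owners = new ConcurrentHashMap<>();

    /** Atomically claims the session; true if nodeId is the owner after the call (3b). */
    boolean tryClaim(int sessionId, String nodeId) {
        String current = owners.putIfAbsent(sessionId, nodeId);
        return current == null || current.equals(nodeId);
    }

    /** Releases the session, but only if nodeId really is the current owner (4a). */
    boolean release(int sessionId, String nodeId) {
        return owners.remove(sessionId, nodeId);
    }

    /** Reassigns a session from a node known to have crashed to a surviving node (3c). */
    boolean takeOver(int sessionId, String crashedNodeId, String newNodeId) {
        return owners.replace(sessionId, crashedNodeId, newNodeId);
    }

    /** Returns the owning node, or null if no node has the session active. */
    String ownerOf(int sessionId) {
        return owners.get(sessionId);
    }
}
```

A node wanting to resume a process instance would call `tryClaim` first; if that returns false, it would have to ask the node returned by `ownerOf` to dispose the session and call `release` before retrying. The compare-and-set style of `release` and `takeOver` is what keeps two nodes from racing each other in this sketch.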
(*) I don't think it's possible to know if a session has incomplete process instances..?
(**) I don't think it's possible to know if a session has timers..?
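The list-and-loop described in assumption 2, together with the bookkeeping that footnote (*) says is missing, can be sketched as follows. This is only an illustration under stated assumptions: `ActiveSessionRegistry` and `SignalTarget` are hypothetical names, not jBPM5 classes. `SignalTarget` stands in for the single session method used here, `signalEvent(type, event)`, and the `loader` parameter stands in for whatever loads a session by id. The registry would have to be fed by the application itself, for example from process started/completed callbacks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

/** Hypothetical stand-in for a loaded session; only the signalling method is modelled. */
interface SignalTarget {
    void signalEvent(String type, Object event);
}

/** Application-side bookkeeping (not a jBPM5 API): sessions that still have incomplete process instances. */
final class ActiveSessionRegistry {
    // sessionId -> ids of process instances started in that session and not yet completed
    private final Map<Integer, Set<Long>> active = new ConcurrentHashMap<>();

    void processStarted(int sessionId, long processInstanceId) {
        active.computeIfAbsent(sessionId, id -> ConcurrentHashMap.<Long>newKeySet())
              .add(processInstanceId);
    }

    void processCompleted(int sessionId, long processInstanceId) {
        // Returning null from the remapping function drops the session entry once it is empty.
        active.computeIfPresent(sessionId, (id, instances) -> {
            instances.remove(processInstanceId);
            return instances.isEmpty() ? null : instances;
        });
    }

    /** Session ids that still have at least one incomplete process instance, in ascending order. */
    List<Integer> sessionsWithIncompleteInstances() {
        List<Integer> ids = new ArrayList<>(active.keySet());
        Collections.sort(ids);
        return ids;
    }

    /** The loop from assumption 2: signal every tracked session; returns how many were signalled. */
    int broadcast(String type, Object event, IntFunction<SignalTarget> loader) {
        int signalled = 0;
        for (int sessionId : sessionsWithIncompleteInstances()) {
            loader.apply(sessionId).signalEvent(type, event);
            signalled++;
        }
        return signalled;
    }
}
```

The cost of `broadcast` grows with the number of tracked sessions, which is the pressure behind 2a, while 1a and 1b push in the opposite direction. A similar registry keyed on sessions with pending timers would be needed for footnote (**).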