I'm working on a rather complex project with jBPM and would like to see what other contributors think of some concerns I have. I will address them if necessary by customizing jBPM and would like to contribute the changes. That way I also win, because the project I'm working on can pick up other updates as jBPM continues to be improved in other areas. I should say up front that I've looked at jBPM 3.0 and 3.0.1, so if some of these items are included in 3.0.2 or 3.1, let me know.
Basically, I'm working on a system that requires execution of about 300k workflows/day. Each workflow will have between 40-60 steps, most of them asynchronous. I am concerned about the following:
1) Number of relationships in schema. It's great that the model seems to be completely normalized, but it might make sense to denormalize just a little bit to ensure good performance. Has anyone had performance issues in this sense? Anyone interested in proposals?
2) Not having looked at Hibernate too much yet and how it is used within jBPM, this might sound silly, but can Hibernate's cache be turned off for most runtime process information and be turned on only for configuration and process definitions?
3) The size of the log table is of concern. With 300k workflows a day, it is going to be huge. Anyone interested in a built-in archiving mechanism so that old data can be moved to a two-stage (online/offline) archive? The idea would be to also add capabilities to the API to search and return information on archived process instances. Another idea is to add the ability to reduce the logging level. Is this possible? I'm concerned about possible side effects of doing this and I kind of like keeping that information handy, but I recognize that the storage requirements for my production environment would be immense.
4) Error handling: I know there are open issues about this, but I would like to get opinions on revamping exception management in jBPM. There are two issues: fixing the specific cases where exceptions are not being thrown up the stack, and wrapping exceptions in jBPM-specific classes rather than using java.lang.RuntimeException, et al.
5) As I mentioned before, we will require asynchronous execution, so we will add the capability to jBPM somehow. I would be interested in what Tom and others think is the best way to accomplish that.
Some of these are not issues in JIRA. If you would like to add them, it's fine by me as long as someone has them assigned so that there is little duplication of work. I will begin work on some parts of these enhancements sometime soon.
Thanks a lot for your time.
3) Logging is quite easy to customize. It is not yet configurable, however, so it still requires some code tweaking.
5) Async execution will be part of 3.1.
I am also planning on implementing a system potentially similar to Eduardo's (in size), with jBPM.
The main issue I have is that each task in the workflow needs to be an automated task, not a user one. Because the automated task will be carried out on a separate machine, in a separate JVM (I will do this through EJBs or web services), I need to add robustness to the task execution by implementing retry/timeout logic for each task.
It seems that the best way to do this would be to encapsulate the task execution and timeout/retry logic in an action handler (away from jBPM). BTW - I would do this asynchronously, signalling the end of task execution with a callback.
I know that someone will tell me to add the automated tasks as action events, but this will not provide the level of control I require. This is because I need to drive the workflow based on the results of the automated tasks, and also have a robust, asynchronous way of executing the tasks.
So my question is:
I see that async execution will be a part of 3.1. I believe the best way for me to implement my custom logic (retries etc.) is to encapsulate this logic inside a custom action handler, not directly in the jBPM process definition. Is there any plan to add a number of retries to the scheduler component of jBPM, and is there a better way of implementing the robustness logic I require in a jBPM process?
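A minimal sketch of the retry/timeout idea described above, written in plain Java so that it stands alone. `RetryingTaskRunner` and its parameters are hypothetical names, not part of jBPM. The assumption is that a custom action handler would wrap the remote call (EJB, JMS, web service) in something like this and then signal the process once a result comes back.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a task with a per-attempt timeout and a cap on attempts. */
final class RetryingTaskRunner {
    private final int maxAttempts;
    private final long timeoutMillis;

    RetryingTaskRunner(int maxAttempts, long timeoutMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Returns the task's result, or throws IllegalStateException once every
     * attempt has either failed or timed out.
     */
    <T> T run(Callable<T> task) {
        // One thread per attempt, so a hung attempt cannot block the next one.
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(task);
                try {
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the attempt that timed out
                    last = e;
                } catch (ExecutionException e) {
                    last = e; // the task itself threw; try again
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            throw new IllegalStateException(
                    "task failed after " + maxAttempts + " attempts", last);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The handler could then branch on the outcome: take one transition when `run` returns a result, and a different (error) transition when it throws, which keeps the retry policy out of the process definition itself.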
hmmm.. you need robustness and are thinking of using web services? hmmm.. once you have made those robust, the performance of the jBPM core itself is the least of your worries.
Let me summarize:
- transaction aware
- load balancing
All by default in JMS.
I agree with Ronald: in most cases JMS is more relevant than web services.
In my last project I implemented jBPM with JMS for the integration with the other applications and it works fine. But maybe this solution is not suitable for your project.
Actually I only mentioned web services because it is possible we will have to plug in legacy (non java) applications.
My preferred method is EJBs for communication between applications, although now I am thinking JMS would have a part to play.
We still need to send out parameters between the participating applications and I am not sure if JMS handles that - I need to do some reading up on JMS!
hmmm.... (sorry, I repeat myself ;-))
Aren't legacy and web services kind of contradictory? I know multiple vendors would like you to think otherwise, but in my (not so) humble opinion (at least not on this subject) it is easier to integrate e.g. a mainframe via MQ or CICS than to install a web service piece of software on the mainframe, connect that to CICS, convert everything to a web service and, last, call it (unreliably!) remotely.
I'd look more into combining jBPM with something like openadaptor or an ESB (I wonder what JBoss has up its sleeve).
OK, what are "openadaptor" or "esb"?
If I implement a subset of these features, can they be checked into the CVS and possibly approved for rolling into the main jbpm branch?
Totally agree on JMS for async processing. Is the async support patch in the latest alpha release already? Haven't seen any interest in a built-in archiver.
One more thing, anyone interested in a Quartz-based scheduler/timer service?
My scenario is completely automated with some minimal user interaction as well.
CVS etc. should be answered by Tom. Regarding Quartz, we've had that discussion previously. The only advantage of the business scheduler as it is now is that it takes working days into account. If you have a pros/cons list for the current implementation and Quartz, maybe you can convince us ;-)
Well the main problem (for me anyway) with the jBPM scheduler is that it does not support a max retry limit.
Also, can someone confirm that the scheduler as it stands is persistent?
It is persistent. Triggers etc. are in the db and are 'polled' by a scheduler thread. A max retry limit could be added fairly easily, I think. Open a JIRA issue and see what Tom thinks.
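To make the max retry limit idea concrete, here is a rough sketch of a polling pass with a retry cap. It is not the jBPM scheduler code: `RetryableTimer`, `PollingScheduler` and their fields are hypothetical, and an in-memory list stands in for the timer table that the real scheduler thread would poll in the db.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical timer record; the real equivalent would be a row in the db. */
final class RetryableTimer {
    final String name;
    final Runnable action;
    long dueTimeMillis;
    int failures;

    RetryableTimer(String name, long dueTimeMillis, Runnable action) {
        this.name = name;
        this.dueTimeMillis = dueTimeMillis;
        this.action = action;
    }
}

/** Hypothetical scheduler: each poll runs due timers and enforces a retry cap. */
final class PollingScheduler {
    private final List<RetryableTimer> pending = new ArrayList<>();
    private final List<String> abandoned = new ArrayList<>();
    private final int maxRetries;
    private final long retryDelayMillis;

    PollingScheduler(int maxRetries, long retryDelayMillis) {
        this.maxRetries = maxRetries;
        this.retryDelayMillis = retryDelayMillis;
    }

    void schedule(RetryableTimer timer) {
        pending.add(timer);
    }

    /** One polling pass: execute every timer that is due at time 'now'. */
    void poll(long now) {
        Iterator<RetryableTimer> it = pending.iterator();
        while (it.hasNext()) {
            RetryableTimer timer = it.next();
            if (timer.dueTimeMillis > now) {
                continue; // not due yet
            }
            try {
                timer.action.run();
                it.remove(); // success: the timer is finished
            } catch (RuntimeException e) {
                timer.failures++;
                if (timer.failures > maxRetries) {
                    it.remove(); // first attempt plus maxRetries retries all failed
                    abandoned.add(timer.name);
                } else {
                    timer.dueTimeMillis = now + retryDelayMillis; // try again later
                }
            }
        }
    }

    int pendingCount() {
        return pending.size();
    }

    List<String> abandoned() {
        return abandoned;
    }
}
```

With `maxRetries` set to 2, a timer whose action always fails is executed three times (the first attempt plus two retries) and then moved to the abandoned list instead of being rescheduled forever.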
You might want to take a look at Mule (http://mule.codehaus.org), especially the "LoanBroker ESB" example (http://mule.codehaus.org/LoanBroker+ESB).
Mule is an open source ESB based on the Enterprise Integration Patterns book (by the same author as the developerWorks article just posted in this thread).
I'm using jBPM together with Mule (and JMS) for a legacy integration project and they make a very powerful combination!
Thanks for the replies, haven't had a chance to look into it yet, but I will definitely look into ESB.