- Has some performance/scale testing been done for jBPM 5, and if yes, can I please get a pointer to the results?
Yes, we have done performance / scale testing (and are still doing this as part of the productization process), and we use these tests to make sure we can detect regressions, memory leaks etc. In general we don't publish the results, however, as they depend heavily on the hardware you're using, what process you're testing, how the engine is configured, the architecture you are using, etc. Our recommended approach is to set up a prototype to check performance and, if necessary, check what you can do to improve it (as there are numerous ways to scale this). I did a blog entry about performance a long time ago, but it might still serve as a basic reference:
- Any pointers on how to deploy jBPM in a high-scale environment where we have to manage tens of thousands of workflows concurrently?
That's a difficult question to answer without knowing more about the problem domain itself. The number of active instances usually isn't a limiting factor, as most of those will probably be inactive (waiting for some input) and will be stored in the database, so in that case there will just be thousands of rows in the database. The number of requests per second is usually what matters, as this relates to the actual processing of these workflows, and processing power is of course limited. One session will in general be able to support quite a number of requests per second (see my advice on prototyping to get an idea in your specific case). If this is insufficient, you can start looking at an architecture where multiple sessions are used to handle requests. If your problem domain can be divided into independent sections, then you can just have multiple sessions running completely independently, in parallel. Since you can have any number of sessions on any number of machines, this makes it very scalable.
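As a rough starting point for the prototyping advice above, here is a minimal sketch of a throughput probe. It is not part of jBPM: `ThroughputProbe` and `requestsPerSecond` are hypothetical names, and the `Runnable` stands for whatever a single request means in your prototype (for example, starting one process instance on your session).

```java
// Hypothetical helper, not a jBPM class: estimates how many requests
// per second a single-threaded caller can push through some operation.
final class ThroughputProbe {

    /**
     * Runs the request repeatedly for roughly durationMillis and returns
     * the number of completed calls per second of elapsed time.
     */
    static double requestsPerSecond(Runnable request, long durationMillis) {
        long start = System.nanoTime();
        long deadline = start + durationMillis * 1_000_000L;
        long completed = 0;
        while (System.nanoTime() < deadline) {
            request.run();   // one "request" against the system under test
            completed++;
        }
        double elapsedSeconds = (System.nanoTime() - start) / 1e9;
        return completed / elapsedSeconds;
    }
}
```

The number you get only means something for your own process, configuration and hardware, and it is worth running the probe once as a warm-up before trusting a measurement.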
- From the documentation it seems like the following is possible but it would be great if someone can confirm this:
- Developers can write domain specific activities and 'publish' them to some repository (Guvnor?)
- Users can use the web front end to build workflows that compose these domain specific activities
Yes, we use the term service repository (which could be URL or file based). You can then import these into your Eclipse workspace or Guvnor package so you can start using them in your processes.
For an example of a Twitter service:
- Are there any good samples around building these domain specific activities?
There are some simple examples in the documentation and the jbpm-workitems module contains a number of implementations that we support out-of-the-box. The human task service itself is also an example. But as you will see, the handler code is usually relatively simple, as it's integration code that usually just calls existing services.
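To give an idea of how small such a handler tends to be, here is a sketch. To keep it self-contained, the `WorkItem`, `WorkItemManager` and `WorkItemHandler` interfaces below are local stand-ins that only mirror the general shape of the jBPM work item API; in a real project you would import them from the jBPM/Drools API rather than declare them. `NotificationService`, `NotificationWorkItemHandler` and the parameter names `To`, `Message` and `Receipt` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-ins mirroring the shape of the jBPM work item API.
// Assumption: in a real project these come from the jBPM/Drools jars.
interface WorkItem {
    long getId();
    Object getParameter(String name);
}

interface WorkItemManager {
    void completeWorkItem(long id, Map<String, Object> results);
    void abortWorkItem(long id);
}

interface WorkItemHandler {
    void executeWorkItem(WorkItem workItem, WorkItemManager manager);
    void abortWorkItem(WorkItem workItem, WorkItemManager manager);
}

// Hypothetical existing service that the handler delegates to.
interface NotificationService {
    String send(String recipient, String message);
}

// The handler itself is just integration code: read the inputs,
// call the existing service, report the results back to the engine.
final class NotificationWorkItemHandler implements WorkItemHandler {
    private final NotificationService service;

    NotificationWorkItemHandler(NotificationService service) {
        this.service = service;
    }

    @Override
    public void executeWorkItem(WorkItem workItem, WorkItemManager manager) {
        String to = (String) workItem.getParameter("To");
        String message = (String) workItem.getParameter("Message");
        String receipt = service.send(to, message);

        Map<String, Object> results = new HashMap<>();
        results.put("Receipt", receipt);
        // Tell the engine the work item is done so the process can continue.
        manager.completeWorkItem(workItem.getId(), results);
    }

    @Override
    public void abortWorkItem(WorkItem workItem, WorkItemManager manager) {
        // Nothing to roll back in this sketch; just acknowledge the abort.
        manager.abortWorkItem(workItem.getId());
    }
}
```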
Thanks much Kris!
I understand that the exact scale/perf requires more intimate details about the problem domain.
Quick follow-up question:
" One session will in general be able to support quite a number of requests per second (see my advice on prototyping to get an idea in your specific case). If this is unsufficient, you can start looking at an architecture where multiple sessions are used to handle requests."
Do you think it would be useful if multiple sessions are loaded on the same machine or were you referring to loading these sessions on different physical boxes?
Depends on what you are trying to achieve. If you are running out of CPU power on one node, then instantiating another session on the same node probably won't help a lot (but you might still see improvements on multi-CPU machines, as multiple sessions will probably be able to make use of all of them in parallel).
That doesn't mean that loading multiple sessions on the same node isn't useful. You could use it to divide the processing into multiple, independent units (for example, a common approach we see is one session per customer, if you need to make sure requests from different customers run independently of each other), and it makes scaling out later easier as well (you can move some sessions to a new node).
But in this case I was referring to instantiating sessions on new nodes to increase the total processing power, yes.
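The one-session-per-customer idea mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ProcessSession` is a hypothetical stand-in for an engine session (not a jBPM type), and `SessionPerCustomerRouter` and its session factory are made-up names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for an engine session; not a jBPM type.
interface ProcessSession {
    void startProcess(String processId);
}

// Routes every request to the session that belongs to its customer,
// creating that session lazily the first time the customer is seen.
final class SessionPerCustomerRouter {
    private final Map<String, ProcessSession> sessions = new ConcurrentHashMap<>();
    private final Function<String, ProcessSession> sessionFactory;

    SessionPerCustomerRouter(Function<String, ProcessSession> sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    void startProcess(String customerId, String processId) {
        // computeIfAbsent ensures at most one session is created per customer.
        sessions.computeIfAbsent(customerId, sessionFactory).startProcess(processId);
    }

    int sessionCount() {
        return sessions.size();
    }
}
```

Because callers only go through the router, the factory could later hand out sessions that live on another node without the calling code changing, which is the "move some sessions to a new node" step.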