Twins Father

August 2010

How will we reach the "Linear Parallelism" (Took JBPM as examples)

Posted by andy.song Aug 16, 2010

[Introduction]

Many of my colleagues and readers have asked me for details about the "parallelism" I mentioned in my previous blog post http://community.jboss.org/people/andy.song/blog/2010/08/06/new-way-scale-upout-the-jbpm-without-jbpm-clustering-jbpm-sharding. From their own experience they think it is impossible, and without the details they cannot even accept the "JBPM sharding" solution. Then I realized I had hidden too much: without that level of detail, the new ideas are not practicable.

[Glossary]

Parallelism

I will not borrow an expert's explanation here; I will use my own plain words. A system has parallelism if its overall behavior (SLA, stability) stays almost the same, growing only about as log(N), whether it handles a single request or multiple requests. An example makes this simpler:

Type	Single Request (Total Time)	10 Requests (Total Time)	100 Requests (Total Time)
Parallelism	10 seconds	10.XX seconds	11.XX seconds
Non-parallelism	10 seconds	15 seconds	30 seconds

Linear Parallelism

[Details]

A normal JBPM operation consists of 3 steps:

1. Start a new process and signal it to the first node.

2. Run the business logic defined for each node.

3. Signal the process to continue execution to the next node.

So fundamentally two different choices come up; any others are small variants of these two.

1. Everything is conceptually an inside-BPM operation

2. Everything is conceptually an outside-BPM operation

So which one is the parallel version? Of course the second is the one we want.

[Pros]

1. Any invocation of or interaction with JBPM should be broken into two pieces: (1) the trigger and (2) the execution. The trigger uses an asynchronous messaging protocol. JBoss does a great job in that area, so HornetQ/JBoss Messaging is your first choice.

2. The input of each invocation/interaction can be serialized to a text stream, so the business logic only receives a stream of identifiers and loads the real business state from storage (DB, cache, etc.).

3. Keeping jBPM and the execution logic in the same data center is important; otherwise network latency becomes a bottleneck. Please ref: http://highscalability.com/blog/2010/8/12/think-of-latency-as-a-pseudo-permanent-network-partition.html

4. For transaction isolation, prefer eventual consistency over READ_COMMITTED. Snapshot information is used heavily throughout process execution, so traditional transaction isolation becomes a stumbling block for parallelism.
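
Points 1 and 2 above can be sketched in plain Java. This is only a minimal illustration under stated assumptions: an in-memory BlockingQueue stands in for a HornetQ queue, a ConcurrentHashMap stands in for the DB/cache, and the names TriggerExecutionSketch, trigger, executeNext and resultOf are hypothetical, not part of jBPM or HornetQ. It uses Java 8 lambdas, which are newer than the JDK used in the tests for these posts.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class TriggerExecutionSketch {
    // Stand-in for a message queue (assumption: HornetQ would play this role).
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    // Stand-in for the DB/cache that holds the real business state.
    private final Map<String, String> storage = new ConcurrentHashMap<>();
    // Results written by the execution side, keyed by business identifier.
    private final Map<String, String> results = new ConcurrentHashMap<>();

    // Trigger: store the state, enqueue only its identifier, return at once.
    void trigger(String businessId, String state) {
        storage.put(businessId, state);
        queue.add(businessId);
    }

    // Execution: take one identifier, load the real state, run the logic.
    // Returns false if no message arrived within the timeout.
    boolean executeNext(long timeoutMillis) {
        String businessId;
        try {
            businessId = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (businessId == null) {
            return false;
        }
        String state = storage.get(businessId);
        results.put(businessId, "processed:" + state);
        return true;
    }

    String resultOf(String businessId) {
        return results.get(businessId);
    }

    public static void main(String[] args) throws InterruptedException {
        TriggerExecutionSketch sketch = new TriggerExecutionSketch();
        sketch.trigger("order-1", "rfp-draft");
        // The execution side runs on its own thread, decoupled from the trigger.
        Thread worker = new Thread(() -> sketch.executeNext(1000));
        worker.start();
        worker.join();
        System.out.println(sketch.resultOf("order-1")); // prints "processed:rfp-draft"
    }
}
```

The caller of trigger never waits for the business logic, and the message itself carries only an identifier, which is what keeps the trigger side cheap.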

[Cons]

1. It is easy to model a transparent wrapper layer around JBPM, so JBPM sharding implicitly becomes possible.

2. Serializing whole objects is no longer acceptable; only serializing object identifiers is applicable.

3. As a result, an individual request takes a little longer to process than in non-parallel mode.

[Challenges]

1. Selecting the correct phases to split on matters most, and the SEDA model can help you rethink your system architecture. Please ref: http://community.jboss.org/people/andy.song/blog/2010/08/12/how-jboss-products-contribute-to-seda-models

2. Selecting the correct serialization mode is also important; TEXT/String is an applicable solution. But Strings consume a lot of memory in Java, so you should do your best to balance memory consumption against maximum parallelism.

3. Everything has to sit in the same data center. If it spans network locations, the pattern is not applicable, and we may need help from other tools such as an Inter-Data Grid service or cloud services.

4. You have to build your own diagnostic tools for tracing, debugging and deployment. That means you get to learn more, which is a good thing from a DEV perspective, but managers may hate it.
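
As a rough illustration of challenge 2, the sketch below encodes only identifiers as a compact text payload instead of serializing whole objects. The class name IdentifierMessageSketch and the "key=value;key=value" format are assumptions made for this example, not something the posts prescribe; it also assumes keys and values never contain '=' or ';'.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IdentifierMessageSketch {
    // Encode identifiers as "key=value;key=value".
    // Assumption: keys and values contain neither '=' nor ';'
    // (true for numeric or UUID-style identifiers).
    static String encode(Map<String, String> ids) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : ids.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Decode the text back into an insertion-ordered map of identifiers.
    static Map<String, String> decode(String text) {
        Map<String, String> ids = new LinkedHashMap<>();
        if (text.isEmpty()) {
            return ids;
        }
        for (String pair : text.split(";")) {
            int eq = pair.indexOf('=');
            ids.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> ids = new LinkedHashMap<>();
        ids.put("processId", "42");
        ids.put("rfpId", "1001");
        String payload = encode(ids);
        System.out.println(payload);          // prints "processId=42;rfpId=1001"
        System.out.println(decode(payload));  // prints "{processId=42, rfpId=1001}"
    }
}
```

A payload like this fits in a text message and stays small no matter how large the business objects behind the identifiers are.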

So do you now understand a little more of the detail behind my previous two blog posts on parallelism? Some simple principles emerge:

1. Do your best to split and isolate.

2. Serialization is not always to blame.

3. Network communication is not always to blame.

4. Synchronous or in-JVM execution does not always mean parallelism.

5. Research, research, .....

6. Reading source code is a good habit.

7. Slowness and lack of parallelism are not the tool's fault; the limit is actually in your own thinking.

8. You should always keep your own libraries, because you never have enough time to practice.

9. Do your best to balance performance (SLA), parallelism and high availability.

How JBoss Products Contribute to SEDA Models

Posted by andy.song Aug 12, 2010

[Introduction]

SEDA is a design for highly-concurrent servers based on a hybrid of event-driven and thread-driven concurrency. The idea is to break the server logic into a series of stages connected with queues; each stage has a (small and dynamically-sized) thread pool to process incoming events, and passes events to other stages. http://matt-welsh.blogspot.com/2010/07/retrospective-on-seda.html
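
The staged structure described above can be sketched in a few lines of Java. This is a simplified illustration, not the SEDA reference implementation: each stage here is a fixed-size thread pool whose internal work queue plays the role of the stage queue, whereas SEDA proper resizes its pools dynamically. The names SedaSketch, Stage and run, and the two example stages, are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

class SedaSketch {
    // One stage: a small thread pool (its work queue is the stage queue),
    // a handler for each event, and a hand-off to the next stage.
    static final class Stage {
        private final ExecutorService pool;
        private final Function<String, String> handler;
        private final Consumer<String> downstream;

        Stage(int threads, Function<String, String> handler, Consumer<String> downstream) {
            this.pool = Executors.newFixedThreadPool(threads);
            this.handler = handler;
            this.downstream = downstream;
        }

        // Queue the event; a worker thread processes it and passes it on.
        void enqueue(String event) {
            pool.execute(() -> downstream.accept(handler.apply(event)));
        }

        void shutdown() {
            pool.shutdown();
        }
    }

    // Push events through two stages and return the outputs in sorted order.
    static List<String> run(List<String> events) {
        List<String> out = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(events.size());
        Stage format = new Stage(2, e -> e + "|formatted", e -> {
            out.add(e);
            done.countDown();
        });
        Stage validate = new Stage(2, e -> e + "|validated", format::enqueue);
        for (String e : events) {
            validate.enqueue(e);
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        } finally {
            validate.shutdown();
            format.shutdown();
        }
        List<String> sorted = new ArrayList<>(out);
        Collections.sort(sorted); // completion order is not deterministic
        return sorted;
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        events.add("request-2");
        events.add("request-1");
        System.out.println(run(events));
    }
}
```

Note that enqueue returns nothing: the caller gets no direct result, and the output only appears at the end of the pipeline.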

[Pre-conditions]

Your system's processing can be modeled as asynchronous processing.
You had better not expect the SEDA model to give you a reliable return result (there is no direct return value).
Your system's processing can be broken into several stages/states, and the transitions between stages/states may carry dependencies or be plain state transitions.
Each stage/state is repeatable when the input is the same. (That removes a lot of complexity from the recovery part.)
Performance is not your highest priority.

[Challenges]

State transitions need to be visible and manageable, and in most situations you will face requests of multiple versions.
You may not find a feasible data communication channel for transferring data when switching between stages/states.
You may not find a feasible, lightweight, portable multi-threaded execution component.
Recovery becomes even harder because the execution logic is broken up manually.

[Solutions]

JBPM 3.2.7 or later. I would rather expect JBPM 3.2.9, with its SQL changes and fork/join improvements.
HornetQ 2.0 or later: a reliable asynchronous messaging system.
CommonJ pattern + Command pattern. These patterns support multi-threading and replay.
Infinispan: the JBoss data grid cache solution.

[Need Consideration]

JBPM performance issues: please refer to my blog post on a new way to scale up/out JBPM, ref: http://community.jboss.org/people/andy.song/blog/2010/08/06/new-way-scale-upout-the-jbpm-without-jbpm-clustering-jbpm-sharding
Serialization issues: please use MapMessage or TextMessage when you transfer data over the messaging protocol. JSON or JAXB are good solutions for that.
Reliable multi-threading components:
- CommonJ pattern
- Data Grid
- Computing Grid
Recovery may require pre-persisting the input of each stage/state execution. The REDO part of the Command pattern can then help with recovery.
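
The recovery idea in the last point can be sketched as follows. It is a minimal illustration that assumes the stage is repeatable for the same input, as the pre-conditions require, and uses in-memory maps where a real system would use a DB or data grid. The names RedoSketch, submit and recover are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class RedoSketch {
    // Inputs that were persisted but whose execution is not yet confirmed.
    // Assumption: a real system would keep this journal in a DB or data grid.
    private final Map<String, String> pending = new LinkedHashMap<>();
    private final Map<String, String> completed = new LinkedHashMap<>();
    // The command for this stage; it must give the same output for the same input.
    private final Function<String, String> command;

    RedoSketch(Function<String, String> command) {
        this.command = command;
    }

    // Persist the input first, then execute. The flag simulates a crash
    // between the two steps, which leaves a pending journal entry behind.
    void submit(String id, String input, boolean crashBeforeExecute) {
        pending.put(id, input);
        if (crashBeforeExecute) {
            return;
        }
        execute(id);
    }

    private void execute(String id) {
        completed.put(id, command.apply(pending.get(id)));
        pending.remove(id);
    }

    // REDO: replay every pending input and report how many were replayed.
    int recover() {
        List<String> ids = new ArrayList<>(pending.keySet());
        for (String id : ids) {
            execute(id);
        }
        return ids.size();
    }

    String result(String id) {
        return completed.get(id);
    }

    public static void main(String[] args) {
        RedoSketch stage = new RedoSketch(input -> input.toUpperCase());
        stage.submit("a", "first", false);
        stage.submit("b", "second", true);     // simulated crash
        System.out.println(stage.result("b")); // prints "null"
        System.out.println(stage.recover());   // prints "1"
        System.out.println(stage.result("b")); // prints "SECOND"
    }
}
```

Because the input is written before execution starts, nothing is lost in the crash window; the price is one extra write per stage execution.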

[Benefits]

Parallelism is largely built into the SEDA model, so your system can survive under both huge and small loads.
Scalability is already built in, so adding more servers helps a lot, and upgrading the power of a single machine is also good. (Scale-up/Scale-out)
JBPM makes state transitions visible and manageable, and versioning is already built in.
Infinispan helps with the data grid part, and very soon it will provide a much more reliable data grid execution platform.
Plug-ins and extensions, such as Drools, become easy to add.

New way scale up/out the JBPM without JBPM clustering [JBPM Sharding]

Posted by andy.song Aug 6, 2010

[Introduction]

JBPM (JBoss Business Process Management) is one of the most famous open source BPM tools in the world! Even though the original founder of JBPM has already moved to another company, you have to admit JBPM is still as good as you would expect. But as usual, most BPM trigger patterns are built around a user/role interaction model, an explicit rule model or a business calendar pattern, so JBPM is not well suited to OLTP/emergency action needs. In my research and crawling around the internet I did not find much interesting discussion of JBPM scalability and performance. So we dug in and found our own way through the barrier that keeps JBPM from supporting OLTP/emergency action needs. We did some interesting work and got a huge benefit from it, and I will share that information with all of you.

[Background]

Our company mainly handles communication/negotiation between planners and suppliers in the meeting planning field. Most actions are completed through online web operations, so when one party finishes an operation we need to trigger the designated process and then notify the other parties of the details by email, SMS or fax. The process includes several moderately complex rules, and in most situations the output of each step affects the behavior of the next steps. We found this mostly matches the BPM pattern, so we decided to use JBPM plus MVEL. Someone may jump out and ask: why not Drools? It is too complex for our small business requirements, at least from my perspective.

[Lessons]

So what did we learn?

Which version can be trusted in most situations?

The answer is JBPM 3.X. Why? Because the founder of JBPM left JBoss around the time JBPM 4.X was due to be released. Many of the concepts and APIs in 4.X are designed very well, but the backend storage design still builds on the old JBPM 3.X model (and I can hardly believe the JBoss guys designed the DB schema and Hibernate mapping so badly, especially from a performance perspective). After careful and painful trials we gave up on JBPM 4.X and moved back to JBPM 3.X. Currently in prod we are using JBPM 3.2.7.

Please don't put any execution of business logic inside the JBPM VM

That doesn't mean you cannot do it; the only reason is that it brings a lot of contention into the JBPM backend model (especially the DB part) and also into memory. So do your best to push execution outside the JBPM VM with a messaging system or some kind of distributed invocation. Currently in prod we are using JMS (HornetQ is a really good message-oriented product and really worth trying).

Please don't simply persist your map variables into JBPM

If you are familiar with JBPM, you know it passes variables in a map format (similar to java.util.Map) during process execution. In most cases I see people simply rely on that protocol to pass their variables along, and that is the biggest performance killer. We still need the protocol, but please use it smartly: rather than persisting the map entries one by one, persist only one key/value pair, a unique business key mapped to the whole serialized map. You will get a huge performance gain.

Before:

    // Initialize the map
    Map<String, Object> variables;
    ....
    variables.put("hello", 10);
    variables.put("hello1", 20);
    variables.put("hello1", new HashMap());
    ...
    // Call the JBPM API: one setVariable call per map entry
    ProcessInterface pi = getProcessInterface();
    ContextInterface ci = pi.getContextInterface();
    for (Map.Entry<String, Object> aEntry : variables.entrySet()) {
        ci.setVariable(aEntry.getKey(), aEntry.getValue());
    }

Recommend:

    // Initialize the map
    Map<String, Object> variables;
    ....
    variables.put("hello", 10);
    variables.put("hello1", 20);
    variables.put("hello1", new HashMap());
    ...
    // Call the JBPM API: serialize the whole map and persist it under one key
    ProcessInterface pi = getProcessInterface();
    ContextInterface ci = pi.getContextInterface();
    String serializedText = ToStringUtil.serializedByJason(variables);
    ci.setVariable("BusinessKey", serializedText);

After at least 10 rounds of performance testing we got the following comparable results:

Mode	Total invocations (start process) per hour	Avg time (milliseconds)
Before	1,000	8,982
Recommend	2,000	412

That is roughly a 20X improvement in average invocation time (8,982 ms down to 412 ms).

[Sharding !!!]

Sorry it took so long to reach the main point; here I introduce sharding.

JBPM clustering

JBPM sharding

Pros:

Every node should hold the same process definitions (not mandatory; I will discuss some variants).
You need your own distributor or LB controller (inter-JVM or intra-JVM are both fine; the only difference is whether you want to scale up or scale out).
Do your best to know the capacity of a single node; if your application's request load goes far beyond the capacity of a single node or of clustered nodes, then try this pattern.

Cons:

The load of your application (for the JBPM part) is divided into X parts (X being the number of shards you set up), which keeps each single node performing well under its share of the load and increases total capacity.
Maintenance is harder than before; some deployment and other useful tools have to be developed by yourselves.
Diagnosing issues is even harder than before. You need a centralized logging system to help you find out which node a process lives on, etc.
The archiving strategy needs to be clearly defined and controlled. Otherwise you will end up in a mess.

Suitable areas:

It can be used in any area, but you need to resolve the process locality/stickiness issue. I will discuss that later.

[Knowledge]

Distributor design
- Intra-JVM

Within a single JVM we start multiple JBPM shards, so the distributor simply becomes a Java method call.

- Inter-JVM

Multiple JVMs each host different JBPM shards, so the distributor is designed as a remote method call over one of:

Web Services
Socket
JMS

Currently in prod we use the intra-JVM design.

Distribution Algorithm
- Consistent Hashing LB
- Weighted LB
- Round robin LB
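
As one possible shape for the first option, here is a minimal consistent hashing distributor. It is a sketch under assumptions, not the distributor used in production in this post: the class name ConsistentHashDistributor, the use of virtual nodes and the MD5-based hash are all choices made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

class ConsistentHashDistributor {
    // The ring: hash position -> shard ID, kept sorted by position.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    // Virtual nodes per shard; more of them spread keys more evenly.
    private final int virtualNodes;

    ConsistentHashDistributor(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    void addShard(String shardId) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(shardId + "#" + i), shardId);
        }
    }

    void removeShard(String shardId) {
        for (int i = 0; i < virtualNodes; i++) {
            // Two-argument remove only deletes positions owned by this shard.
            ring.remove(hash(shardId + "#" + i), shardId);
        }
    }

    // Walk clockwise from the key's position to the first virtual node,
    // wrapping around to the start of the ring if necessary.
    String shardFor(String businessKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no shards registered");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(businessKey));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // First 8 bytes of an MD5 digest as a long (used for spread, not security).
    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xff);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashDistributor distributor = new ConsistentHashDistributor(100);
        for (int i = 0; i < 4; i++) {
            distributor.addShard("jbpm-shard-" + i);
        }
        System.out.println(distributor.shardFor("rfp-1001"));
    }
}
```

The useful property is that removing a shard only moves the keys that lived on that shard; every other key keeps its assignment, which matters when running processes must stay on the shard that stores them.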
Process Locality/Stickiness
- Tight coupling with JBPM

The JBPM ProcessInstance structure contains an unused field, "key", so when we create a process instance in JBPM we can put the JBPM shard ID into it. Later we need to pass that key along when handling node events and signaling the process to continue.

pros:

Very easy and useful; no SPOF

cons:

Hard to diagnose

Currently in prod we use this design, but we log the process creation information into the logging system.

- Loose coupling with JBPM

Create one centralized JBPM process lookup table (like DNS), similar to other DB sharding designs. But then you also face SPOF and maintenance issues.
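
The tight-coupling variant can be sketched like this. The example does not call the real jBPM API: Shard is a hypothetical in-memory stand-in for one JBPM shard, and the "shardIndex:businessKey" key format is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StickyKeySketch {
    // Hypothetical stand-in for one JBPM shard with its own storage.
    static final class Shard {
        private final Map<String, Integer> signals = new HashMap<>();

        void start(String key) {
            signals.put(key, 0);
        }

        void signal(String key) {
            if (!signals.containsKey(key)) {
                throw new IllegalStateException("unknown process " + key);
            }
            signals.merge(key, 1, Integer::sum);
        }

        int signalCount(String key) {
            return signals.getOrDefault(key, 0);
        }
    }

    private final List<Shard> shards = new ArrayList<>();
    private int next = 0; // round-robin cursor; not thread-safe, illustration only

    StickyKeySketch(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new Shard());
        }
    }

    // Pick a shard round-robin and embed its index in the process key.
    String startProcess(String businessKey) {
        int shardIndex = next;
        next = (next + 1) % shards.size();
        String key = shardIndex + ":" + businessKey;
        shards.get(shardIndex).start(key);
        return key;
    }

    // Route a later signal using only the key; no lookup table is needed.
    void signal(String key) {
        shards.get(shardOf(key)).signal(key);
    }

    int signalCount(String key) {
        return shards.get(shardOf(key)).signalCount(key);
    }

    // Recover the shard index from the prefix before the first ':'.
    static int shardOf(String key) {
        return Integer.parseInt(key.substring(0, key.indexOf(':')));
    }

    public static void main(String[] args) {
        StickyKeySketch distributor = new StickyKeySketch(4);
        String key = distributor.startProcess("rfp-1001");
        distributor.signal(key);
        System.out.println(key + " signaled " + distributor.signalCount(key) + " time(s)");
    }
}
```

Since the shard is recoverable from the key alone, there is no central table to fail, but as noted above you need logging to find a process when the key is not at hand.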

Our final design

Results

Setup	Invocation count (1 hour)	Avg invocation time (milliseconds)
Single JBPM	9,000	412
JBPM clustering (4 JBPM nodes + 1 JBPM storage)	10,000	350
JBPM sharding (4 JBPM nodes + 4 JBPM storages) + intra-JVM distributor	38,000	370

So we can see that throughput increases a lot: sharding handles about 4.2 times the invocations of a single JBPM (38,000 vs 9,000 per hour), while the average invocation time stays around 370 ms.

Other variants
- Computing Grid + JBPM sharding (TBD)
- Cloud + JBPM sharding (JBPM clouding) (TBD)

[Note]

Please forgive my poor English; that skill needs long-term improvement, as I am Chinese.

All testing setting:

1. JVM: jdk 1.6 u21

2. Machine: single machine (VM) with a dual-core CPU and 4 GB memory

[Reference]

JBPM site: http://jboss.org/jbpm
JBPM 3.2.7 docs: http://docs.jboss.com/jbpm/v3.2/userguide/html_single/
JBPM Clustering references: http://www.theserverside.com/news/1363588/Scalability-and-Performance-of-jBPM-Workflow-Engine-in-a-JBoss-Cluster
Consistent Hashing: http://michaelnielsen.org/blog/consistent-hashing/
CommonJ pattern: http://commonj.myfoo.de/
Terracotta: http://www.terracotta.org/
My twins photo: http://cid-afd8b274e5b99db9.photos.live.com/browse.aspx/10%e6%9c%885%e6%97%a5

[My Bio]

I live and work in Shanghai, China; my current company is StarCite. I work as a technical leader there, and my main focus is scalability and performance design.

You can contact me via email: hqxsn@hotmail.com

MSN: hqxsn@sina.com
