[Introduction]

 

"How many threads should I create?". Many years before one of my friends asked me the question, then I gave him the answer follow the guideline with " Number of  CPU core + 1". Most of you will be nodding when you are reading here. Unfortunately all of us are wrong at that point.

 

Right now I would give the answer with if your archiecture was based on shared resource model then your thread number should be "Number of CPU core + 1" with better throughput, but if your architecture was shared-nothing model (like SEDA, ACTOR) then you could create as many thread as your need.


[Walk Through]

So here came one question why so many eldership continuely gave us the guideline with "Number of  Cpu core + 1", because they told us the context switching of thread was heavy and would block your system scalability. But noboday noticed the programming or architecture model they were under. So if you read carefully you would find most of them described the pragramming or architecture model were based on shared resource model.

 

Give you several examples:

1. Socket programming - socket layer was shared by many requests, so you need context switch between every requests.

2. Information provider system - most customer will contiuely access the same request

etc...

 

So they would meet the multiple requests access the same resource situation so system would require add lock to that resource since consistency requirement of their system. Lock contention would come into play so the context swich of multiple threading would be very heavy.

 

After I find this interesting thing, I consider willother programming or architecture models can walk around that limitation. So if shared resource model has failed for creating more java threading, maybe we can try shared nothing model.

 

So fortunately I get one chance create one system need large scalability, the system need send out lots of notfication in very quick manner. So I decide go ahead with SEDA model for trial and leverage with my multiple-lane commonj pattern, current I can run the java application with maximum number around 600 threads if your java heap setting with 1.5 gigabytes in one machine.

 

So how about the average memory consumption for one java thread is around 512 kilobytes (Ref: http://www.javacodegeeks.com/2011/04/erlang-vs-java-memory-architecture.html), so 600 threads almost you need 300M memory consumption (include java native and java heap). And if you system design is good, the 300M usage will  not your burden acutally.

 

By the way in windows you can't create more then 1000 since windows can't handle the threads very well, but you can create 1000 threads in linux if you leverage with NPTL. So many persons told you java couldn't handle large concurrent job processings that wasn't 100% true.


Someone may ask how about thread itself lifecycle swap: ready - runnable - running - waiting. I would say java and latest OS already could handle them suprisingly effecient, and if you have mutliple-core cpu and turn on NUMA the whole performance will be enhanced more further. So it's not your bottleneck at least from very beginning phase.

 

Of course create thread and make thread to running stage are very heavy things, so please leverage with threadpool (jdk: executors)

 

And you could ref : http://community.jboss.org/people/andy.song/blog/2011/02/22/performance-compare-between-kilim-and-multilane-commj-pattern for power of many java threads

[Conclusion]

 

In the future how will you answer the question "How many java threads should I create?". I hope your answer will change to:

1. if your archiecture was based on shared resource model then your thread number should be "Number of CPU core + 1" with better throughput

2. if your architecture was shared-nothing model (like SEDA, ACTOR) then you could create as many thread as your need.