6 Replies Latest reply on May 18, 2007 12:31 PM by gavin.king

A performance concern with Init.instance(), a suggested work

raghinii May 3, 2007 12:07 AM

The short version:

I claim that in Init.instance()

this line:

Init init = (Init) Contexts.getApplicationContext().get(Init.class);

should be replaced with :

Init init = (Init) Contexts.getApplicationContext().get("org.jboss.seam.core.init")

because the current version calles Class.getAnnotation which in turns calls synchronized method and this results in poor behavior under heavy load.

The long version:

First some history (and a similar previous problem):

Over the past few months I've been working with a team converting an existing webapp of moderate complexity to a JSF / Seam stack w/ Facelets.

It's been in production now for about a month and a half.
Running on jboss 4.0.5.GA w/ seam 1.2.0.patch1 Sun JSF RI 1.2 b4 (?)

Before we put it in production had to do some performance tuning. What we started out with was about 10% as fast as the original app , which wasn't acceptable, but eventually we got to about 60%, which was fine.

One undesirable behavior was that under moderately heavy load, when requests were slowing down from their normal ~300ms to 3-4 seconds, the distribution of request response time got very wide with some requests taking 5 seconds while other, identical, requests were timing out at 5 minutes.

With some testing we tracked this down to the fact that org/apache/catalina/core/ApplicationContext.get() is being called something like 50K - 100K times per request in our app. Internally this call is synchronized.

The solution we tried and eventually adopted was to edit the tomcat ApplicationContext code and replace the HashMap with a ConcurrentHashMap and to remove the synchronization in the get().

This resulted in something like a 30-40% performance improvement (if memory serves), but more importantly it significantly narrowed the response time curve. That is, the response time for requests ,under load, was a lot more consistent.
(note 1: this makes sense since having lots of threads contending for a lock can lead to starvation. In this case, the fact that any request has to "win" the lock 100K times before it succeeds makes it more likely that starvation will occur despite any anti-starvation attempts by the OS/VM)

(Note 2: This change not strictly speaking a safe thing to do, but it works for us. (Tomcat 6 also takes this approach but presumably they've gone further in validating that the rest of the app is safe with the concurrent hash map semantics)).

THE CURRENT ISSUE:

Basically, in production under heavy load, we saw the same distribution of request response time. (measured by mod-log-firstbyte in apache). Bascially once the per-page response got to ~7-10 seconds / page some requests start to take 100 - 300 seconds (there's some timeout in the system that caps things at 300 seconds). The situation then often spirals out of control with active connections climbing to the 100s and inevitably resulting in an OOM.

This time a thread-dump of jboss showed that many threads were waiting in to lock a specific lock in java.lang.Class.initAnnotationsIfNecessary().

Looking at the stack traces it became clear that the Class in question was Init.class and that all these threads were calling Init.instance().
( Init.instance() is called ... everywhere... all the time).

The specific line that leads to the eventual lock contention is this one:
Init init = (Init) Contexts.getApplicationContext().get(Init.class);

Which internally leads to a call to get the value of the @Name annotation for the Init.class so that it can then look up the instance by that name.

It is this lookup of the Name of the Init component that causes the contention.

Replacing the get(Init.class) with get("org.jboss.seam.core.init") bypasses this lookup and also that synchronized method.

I've been running this modification on half of the servers for the past two days and they are showing MUCH more linear performance under heavy load. Basically when subjected to a stressor (GC pause, DB related pause, traffic spike etc ) the modified machines return to normal much more quickly (say... 20 -30 seconds compared to 5-10 minutes) and they haven't shown a tendency to spiral out of control towards OOM.

I generally like the compile time checking that one gets from using the get(Class) version of this method, however in this specific case i think the loss of safety is worth it given the deep loop nature of this call.

I'm curious as to what you guys think.

Thanks,
radu

1. Re: A performance concern with Init.instance(), a suggested

raghinii May 3, 2007 12:24 AM (in response to raghinii)

To clarify two things:

1) In my app, running on some dual processors Xeons w/ Hyperthreading contention on this lock starts to show up with maybe... w/ 20 concurrent .seam connections. Below that there is no locking on it.

But above that most of the threads are waiting on this lock.

(The section in the actual synchronized method in Class is basically just a null check. So it looks like just acquiring the lock (and the memory implications of that ) that are causing problems not the stuff that's going on in the synchronized method.)

2) If there is no strong objection i'd like this change made in the core code so that i don't have to edit/patch the seam jar.
Actions
2. Re: A performance concern with Init.instance(), a suggested

christian.bauer May 3, 2007 1:00 AM (in response to raghinii)

Sounds like you found a scalability bottleneck. Please report with full story on JIRA.
Actions
3. Re: A performance concern with Init.instance(), a suggested

gavin.king May 18, 2007 12:00 PM (in response to raghinii)

First, the bottleneck in Tomcat was already found by Michael and Phillip, during their performance profiling of Seam. It is fixed in TC6 / JBoss 4.2.

Second, I will indeed fix the problem with Class.getAnnotation(), as reported here:

http://jira.jboss.org/jira/browse/JBSEAM-1332

What is not clear to me at this time is whether we need to go through and make the same change in many other places in Seam....
Actions
4. Re: A performance concern with Init.instance(), a suggested

michael.yuan May 18, 2007 12:09 PM (in response to raghinii)

Thanks radu! We have been doing some internal performance testing / tuning for Seam as well. We fixed a number of contentions (including the Tomcat one you mentioned :)).

But I have not observed the "init" contention you described -- that is probably my test demo does not do as much lookup as your app. Can you give me some basic information about your app? For instance, how many SFSBs do you typically access for each web request? Thanks a lot.

cheers
Michael
Actions
5. Re: A performance concern with Init.instance(), a suggested

gavin.king May 18, 2007 12:30 PM (in response to raghinii)

OK, I think I've got it fixed in CVS, please test that for me, and let us know the results.
Actions
6. Re: A performance concern with Init.instance(), a suggested

gavin.king May 18, 2007 12:31 PM (in response to raghinii)

Note that this is fixed in Seam 1.3, you'll need to run it on AS 4.2.
Actions

Go to original post