-
1. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 9:59 AM (in response to adrian.brock)On the TimeoutFactory class
"
We should certainly remove the stupid object pooling done by "freeList".
The whole class needs separating into the TimeoutFactory implementation
and the PriorityQueue implementation. It is incomprehensible as it stands. :-)
" -
2. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 6, 2006 11:04 AM (in response to adrian.brock)Phillip is not seeing just one thread with that stack trace.
There is a lot, around 600 in our performance tests, for the Injection Rate he disclosed on the case. -
3. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 6, 2006 11:08 AM (in response to adrian.brock)And... we didn't have this behavior in any previous version.
Looking at CVS history, this is something introduced on JBoss_4_0_4_RC1.
We immediatly started seeing this contention in our performance tests after upgraded our version.
So, this is a regression, and that's why we decided to open the case. -
4. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 11:10 AM (in response to adrian.brock)Yes I know. But since he doesn't show who the holds the lock
we cannot determine what is the cause of the contention.
Just dumping a stacktrace into JIRA and claiming a bug
is useless under definition of analysis I've ever heard of.
It falls into the "IT DOES NOT WORK" category of questions. -
5. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 11:12 AM (in response to adrian.brock)"clebert.suconic@jboss.com" wrote:
And... we didn't have this behavior in any previous version.
Looking at CVS history, this is something introduced on JBoss_4_0_4_RC1.
We immediatly started seeing this contention in our performance tests after upgraded our version.
So, this is a regression, and that's why we decided to open the case.
Well actually the code in 4_0_4RC1 was broken. It leaked memory all over
the place. You've seen CVS history, look at the bug reports they reference. -
6. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 11:17 AM (in response to adrian.brock)You'll also notice I didn't reject the bug report.
I said it was incomplete and described the information required. -
7. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 6, 2006 12:31 PM (in response to adrian.brock)Phillip will provide Stack Traces.
But Adrian,
TransactionImpl is creating/destroying Timeouts over a singleton factory, using some "expensive" operations, like an arrayCopy on TimeoutFactory.newTimeout and removeNode on dropTimeout.
Every single commit is competing access on the Singleton TimeoutFactory.
On TransactionImpl:TransactionImpl(long timeout) { ... this.timeout = TimeoutFactory.createTimeout(start+timeout, this); ... }
On TimeoutFactory:static public Timeout createTimeout(long time, TimeoutTarget target) { return getSingleton().schedule(time, target); }
schedule will call newTimeout...private synchronized Timeout newTimeout(long time, TimeoutTarget target) { ... System.arraycopy(q, 0, newQ, 0, q.length); //... there are other possible contentions on this method also ... .. }
and on dropTimeout....private boolean dropTimeout(TimeoutImpl timeout) { synchronized (this) { ... removeNode(timeout.index); ... } }
And removeNode looks an expensive operation to me.
So, by looking at source code I really think this is going to create a contention. -
8. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 12:44 PM (in response to adrian.brock)"clebert.suconic@jboss.com" wrote:
So, by looking at source code I really think this is going to create a contention.
I don't doubt it, but until we understand it, we won't know how to fix it.
i.e.. Are the operations bad/expensive or is it the global lock?
I've added some stress tests to the testsuite,
and these indicate to me that the problem is just the global lock
from looking at the stack traces.
I'm finding it hard to get a stacktrace with somebody holding the lock.
The PriorityBlockingQueue from JDK5/backport-concurrent-util
looks even more expensive in terms of the remove/cancel. -
9. Re: TimeoutFactory flagged as point of contention
dimitris Apr 6, 2006 1:22 PM (in response to adrian.brock)When I looked at TimeoutFactory my conclusion was that:
http://www.jboss.com/index.html?module=bb&op=viewtopic&t=75955
I looked at many generic Heap / PriorityQueue implementations and they all suffer from the inefficiency that removing an arbitrary object, other than the one on the "top" on the heap requires (a) a sequencial scan of the stored elements, until it is found & removed, (b) the heap restructured.
Ole's implementation avoids (a) because the stored elements keep an index back to Heap, that is updated whenever stored elemenets are swapped inside the Heap.
So generalizing this (factoring out the priority queue), without the performance hit it not possible, I think. -
10. Re: TimeoutFactory flagged as point of contention
dimitris Apr 6, 2006 1:34 PM (in response to adrian.brock)I agree the freeList pooling is unecessary. In fact, only the canceled items are returned to the pool, not the ones that expire normally.
-
11. Re: TimeoutFactory flagged as point of contention
dimitris Apr 6, 2006 1:55 PM (in response to adrian.brock)After 'genmans' changes and my refactoring/fixes, I think the biggest change in terms of what we had before and the way it works now, is the synchronized call to get the singleton object.
I guess, one way to avoid this, is to initialize this singleton as before, early, rather than lazily. -
12. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 2:03 PM (in response to adrian.brock)I'm not seeing contention on the singleton (synchronizing on the class).
Though obviously that could be optimized away in TransactionImpl.
I am seeing wait for locks on the scheduler instances
(even with only one thread :-)
One possible problem is that the main thread is waiting on the same
lock as the add/remove operations. So it is likely the JDK is falling
back to real locks?
Also if the timeout is the first one in the list, the main thread
is notified and holds the lock while it recalcuates the new time
to wakeup.
However, I don't see the main thread actually performing this
calcuation. It is always waiting as far as I can see. -
13. Re: TimeoutFactory flagged as point of contention
pthurmond Apr 6, 2006 2:19 PM (in response to adrian.brock)This sample thread dump shows 200 threads blocked on TimeoutFactory. All of the blocked stack traces are similar to the one I put in the JIRA task. If you need complete dumps, I have several which I can provide. Not sure where to attach these since the JIRA issue is closed.
Find unique contentions:
[pthurmond@dev09 tmp]$ grep Blocked 2.dump | sort -u
-- Blocked trying to get lock: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290[fat lock]
Count contentions on TimeoutFactory:
[pthurmond@dev09 tmp]$ grep "Blocked trying to get lock: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290" 2.dump | wc -l
207 -
14. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 2:24 PM (in response to adrian.brock)The JIRA issue is only closed until you provide the full information.
As it is, I 've manged to get the information with a stress test.
Now we just to figure out what we can do to remove the full locking
that is going on. i.e. Make it do something more like spin locking?