-
15. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 6, 2006 2:29 PM (in response to adrian.brock)Adrian, what other information do you want?
We already have this whole thread, and we kind of already acknowledge there is a problem. So, isn't that enough to be a bug?
Phillip could attach the whole thread dump to the case, but he can't do that because the JIRA is closed. It's what comes first.. the chicken or the egg? -
16. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 3:19 PM (in response to adrian.brock)Just reopen the bug report.
I know for a fact, that anybody who creates a bug report can reopen it.
Failing that, I'll reopen when I know what the real problem is,
i.e. what needs fixing.
I'm only interested in seeing the real thread dump so we can
see it isn't some other problem, to what we are *guessing* at on
this thread. -
17. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 3:28 PM (in response to adrian.brock)I tried removing the continued lookup of the singleton.
That made no discernable difference.
So tomorrow, I am going to experiment with using a different "lock"
for the main thread's wait/notify and the maintenance of the queue.
The main thread looks overly contentious, given that once it has the
work it can submit it from outside the synchronized block.
I'll also try removing object pooling.
If none of that works, then we'll probably need to design a more
concurrent priority queue implementation. -
18. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 6, 2006 3:32 PM (in response to adrian.brock)Some academic papers, if somebody fancies a read or wants
to play with implementing them in java. :-)
http://www.google.co.uk/search?hl=en&q=concurrent+priority+queue&btnG=Google+Search -
19. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 6, 2006 3:45 PM (in response to adrian.brock)ok... just reopened it. I thought you wanted to keep the case closed.
I've just attached Phillip's thread dump. (I asked him to send me the file).
208 threads waiting a lock on this point"SocketServerInvokerThread-10.64.36.1-58" id=151 idx=0x120 tid=16201 prio=5 alive, in native, blocked -- Blocked trying to get lock: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290[fat lock] at jrockit/vm/Threads.waitForSignal()V(Native Method) at jrockit/vm/Locks.fatLockBlockOrSpin(JLjrockit/vm/ObjectMonitor;II)V(Unknown Source)[optimized] at jrockit/vm/Locks.lockFat(Ljava/lang/Object;JLjrockit/vm/ObjectMonitor;Z)Ljava/lang/Object;(Unknown Source)[optimized] at jrockit/vm/Locks.monitorEnterSecondStage(Ljava/lang/Object;I)Ljava/lang/Object;(Unknown Source)[optimized] at jrockit/vm/Locks.monitorEnter(Ljava/lang/Object;)Ljava/lang/Object;(Unknown Source)[optimized] at org/jboss/util/timeout/TimeoutFactory.dropTimeout(Lorg/jboss/util/timeout/TimeoutFactory$TimeoutImpl;)Z(TimeoutFactory.java:549) at org/jboss/util/timeout/TimeoutFactory.access$200(Lorg/jboss/util/timeout/TimeoutFactory;Lorg/jboss/util/timeout/TimeoutFactory$TimeoutImpl;)Z(TimeoutFactory.java:46) at org/jboss/util/timeout/TimeoutFactory$TimeoutImpl.cancel()Z(TimeoutFactory.java:696) at org/jboss/tm/TransactionImpl.cancelTimeout()V(TransactionImpl.java:1374) at org/jboss/tm/TransactionImpl.completeTransaction()V(TransactionImpl.java:1194)[inlined] at org/jboss/tm/TransactionImpl.commit()V(TransactionImpl.java:315)[optimized] at org/jboss/ejb/plugins/TxInterceptorCMT.endTransaction(Lorg/jboss/invocation/Invocation;Ljavax/transaction/Transaction;Ljavax/transaction/Transaction;I)V(TxInterceptorCMT.java:501)[optimized] at org/jboss/ejb/plugins/TxInterceptorCMT.runWithTransactions(Lorg/jboss/invocation/Invocation;)Ljava/lang/Object;(TxInterceptorCMT.java:361)[optimized] at org/jboss/ejb/plugins/TxInterceptorCMT.invoke(Lorg/jboss/invocation/Invocation;)Ljava/lang/Object;(TxInterceptorCMT.java:181)[optimized] at org/jboss/ejb/plugins/LogInterceptor.invoke(Lorg/jboss/invocation/Invocation;)Ljava/lang/Object;(LogInterceptor.java:165)[optimized] at org/jboss/ejb/plugins/CleanShutdownInterceptor.invoke(Lorg/jboss/invocation/Invocation;)Ljava/lang/Object;(CleanShutdownInterceptor.java:278) at org/jboss/ejb/plugins/ProxyFactoryFinderInterceptor.invoke(Lorg/jboss/invocation/Invocation;)Ljava/lang/Object;(ProxyFactoryFinderInterceptor.java:136)[optimized] at jrockit/reflect/CompiledMethodInvoker.invoke0(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;(Unknown Source) at jrockit/reflect/CompiledMethodInvoker.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;(Unknown Source)[optimized] at java/lang/reflect/Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;J)Ljava/lang/Object;(Unknown Source)[optimized] at org/jboss/mx/interceptor/ReflectedDispatcher.invoke(Lorg/jboss/mx/server/Invocation;)Ljava/lang/Object;(ReflectedDispatcher.java:155) at org/jboss/mx/server/AbstractMBeanInvoker.invoke(Ljava/lang/String;[Ljava/lang/Object;[Ljava/lang/String;)Ljava/lang/Object;(AbstractMBeanInvoker.java:189)[optimized] at org/jboss/mx/server/MBeanServerImpl.invoke(Ljavax/management/ObjectName;Ljava/lang/String;[Ljava/lang/Object;[Ljava/lang/String;)Ljava/lang/Object;(MBeanServerImpl.java:659) at org/jboss/invocation/unified/server/UnifiedInvokerHA.invoke(Lorg/jboss/remoting/InvocationRequest;)Ljava/lang/Object;(UnifiedInvokerHA.java:146) at org/jboss/remoting/ServerInvoker.invoke(Lorg/jboss/remoting/InvocationRequest;)Ljava/lang/Object;(ServerInvoker.java:828) at org/jboss/remoting/ServerInvoker.invoke(Ljava/lang/Object;)Ljava/lang/Object;(ServerInvoker.java:681) at org/jboss/remoting/transport/socket/ServerThread.processInvocation()V(ServerThread.java:358) at org/jboss/remoting/transport/socket/ServerThread.dorun()V(ServerThread.java:412) at org/jboss/remoting/transport/socket/ServerThread.run()V(ServerThread.java:239) at jrockit/vm/RNI.c2java(JJJJ)V(Native Method)
and just 1 at this point"TimeoutFactory-0" id=58 idx=0x74 tid=16107 prio=5 alive, in native, waiting, daemon -- Waiting for notification on: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290[fat lock] at jrockit/vm/Threads.waitForSignalWithTimeout(J)V(Native Method) at java/lang/Object.wait(J)V(Native Method) at org/jboss/util/timeout/TimeoutFactory.doWork()V(TimeoutFactory.java:603) ^-- Lock released while waiting: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290[fat lock] at org/jboss/util/timeout/TimeoutFactory.access$000(Lorg/jboss/util/timeout/TimeoutFactory;)V(TimeoutFactory.java:46) at org/jboss/util/timeout/TimeoutFactory$1.run()V(TimeoutFactory.java:264) at jrockit/vm/RNI.c2java(JJJJ)V(Native Method) -- end of trace
No wonder why...// thread.run will call this method... private void doWork() { while (!cancelled) { TimeoutImpl work = null; // Look for work synchronized (this) { if (size == 0 && !cancelled) { try { wait(); } catch (InterruptedException ex) { } }
-
20. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 6, 2006 3:57 PM (in response to adrian.brock)It's kind of funny...
wait is releasing the first lock.
but it's also creating another lock on the wait itself:
Look at Waiting for notification on .... [fat lock]"TimeoutFactory-0" id=58 idx=0x74 tid=16107 prio=5 alive, in native, waiting, daemon -- Waiting for notification on: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290[fat lock] at jrockit/vm/Threads.waitForSignalWithTimeout(J)V(Native Method) at java/lang/Object.wait(J)V(Native Method) at org/jboss/util/timeout/TimeoutFactory.doWork()V(TimeoutFactory.java:603) ^-- Lock released while waiting: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290[fat lock] at org/jboss/util/timeout/TimeoutFactory.access$000(Lorg/jboss/util/timeout/TimeoutFactory;)V(TimeoutFactory.java:46) at org/jboss/util/timeout/TimeoutFactory$1.run()V(TimeoutFactory.java:264) at jrockit/vm/RNI.c2java(JJJJ)V(Native Method) -- end of trace
-
21. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 6, 2006 4:06 PM (in response to adrian.brock)
Maybe the thing locks / unlocks so frequently (spinning between waits and notifies) that we couldn't capture the lock holder.
The stack trace, and high response times when going higher in our performance tests is the information we have now. -
22. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 11, 2006 5:22 AM (in response to adrian.brock)Over the weekend, I refactored the TimeoutFactory into
the core part and a TimeoutPriorityQueue plugin with one
implementation that is just the old code.
I've stress tested this such that we can use it with the 4.0.4.GA
release if necessary.
The only material change over the current behaviour is that I moved the
submission to the thread pool for time outs outside the synchronized
block.
I can now try alternative priority queue impls.
This is done by specifying the system property:
-Dorg.jboss.util.timeout.TimoutPriorityQueue=some.class.name -
23. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 11, 2006 5:30 AM (in response to adrian.brock)From the testing that I have done, the issue appears to be the
algorithm doesn't scale the way it is intended.
It is "log(n)" in terms of adding and removing, but that is just for a
single thread. Once the background thread is contending with the
schedulers it ceases to be so.
It certainly doesn't scale with the number of scheduling threads in a nice way.
I am also seeing a wide variation in the length of time a test takes
to run. Which suggests to me that there is a "fair queuing" issue
from the use of the simple synchronization primitives.
P.S. This alogirthm is very similar to the DelayQueue from JSR166
(backport-concurrent-util).
The only main difference I can see is that it uses two different
synchronization primitives.
1) A re-entrant lock to make the queue thread safe
2) A condition to wake up the waiting takers.
The re-entrant lock does not use fair queuing though? -
24. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 11, 2006 5:32 AM (in response to adrian.brock)I am also unsure whether what I am really measuring is just the
scalability of the JDK's lock/notify primitives rather than the efficiency
of the algorithm. -
25. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 11, 2006 3:41 PM (in response to adrian.brock)Adrian,
Phillip Thurmond will take another run with current CVS version to see how it would behave.
(Unless you think it doesn't worth trying yet).
We will make these tests in about three days. (unless we need to change priorities)
We will post results here. -
26. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 11, 2006 4:01 PM (in response to adrian.brock)You can try it, but I haven't changed much.
1) Moved the thread pool submission out of the synchronization block
2) Removed the object pooling
I've tried three different implementations today.
All were slower or made little difference compared with the current version.
* Using a "scheduler" queue such that the submitting threads do
virutally no work
* Using a modified version of the public domain JSR166 version of DelayQueue to optimize removal (using fair queued re-entrant locks with this was even slower :-)
* A modification of the JSR166 DelayQueue to use normal synchronization
rather than a Reentrant lock.
Looks like I am going to have to look some of the academic
implementations that do more fine grained locking
or the some of the skip-list stuff. -
27. Re: TimeoutFactory flagged as point of contention
clebert.suconic Apr 11, 2006 4:47 PM (in response to adrian.brock)Just brain storming...
Isn't possible to partition the queue somehow?
Using hashcodes? -
28. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 12, 2006 3:12 PM (in response to adrian.brock)Probably. But the issue is synchronization between the adders/removers
and the background thread that needs to wait/notify when a timeout
is required.
I tried today to implement this:
http://www.cs.rochester.edu/~scott/papers/1996_IPL_heaps.pdf
but it doesn't scale as well as the current implementation.
That is without the "bit-reversedsounter" to hash the adds/removes
across different parts of the tree.
But I couldn't get this part to work. The alogrithm as it stands is broken.
I've done a few fixes to get it working as a priority queue, but I
haven't been able to get it to work with the "random" remove/cancellation
of the timeouts.
Next on the list is this :-)
http://citeseer.ist.psu.edu/sundell03fast.html
Which is a more recent alogorithm, but also doesn't include a random
remove/cancel operation. :-( -
29. Re: TimeoutFactory flagged as point of contention
adrian.brock Apr 13, 2006 8:21 AM (in response to adrian.brock)I've implemented your Hash based priority queue.
To use it, start JBoss with:
-Dorg.jboss.util.timeout.TimeoutPriorityQueue=org.jboss.util.timeout.HashedTimeoutPriorityQueueImpl
This divides the queue into 40 subqueues with an individual node at the top
for the next to be scheduled.
It doesn't make any significant difference in my tests.
But like I said before, I am not sure whether I am just measuring the
locking/scheduling capabilities of the OS/JDK.