1 2 3 Previous Next 30 Replies Latest reply on Feb 5, 2008 10:39 AM by Adhi R

    TimeoutFactory flagged as point of contention

    Adrian Brock Master

      This "bug report" flags the TimeoutFactory as a point of contention.
      http://jira.jboss.com/jira/browse/JBAS-3073
      with little information.

      The problem should be validated/discussed here and decision
      on what needs doing, before "bug reports" are raised.

        • 1. Re: TimeoutFactory flagged as point of contention
          Adrian Brock Master

          On the TimeoutFactory class

          "
          We should certainly remove the stupid object pooling done by "freeList".

          The whole class needs separating into the TimeoutFactory implementation
          and the PriorityQueue implementation. It is incomprehensible as it stands. :-)
          "

          • 2. Re: TimeoutFactory flagged as point of contention
            Clebert Suconic Master

            Phillip is not seeing just one thread with that stack trace.

            There is a lot, around 600 in our performance tests, for the Injection Rate he disclosed on the case.

            • 3. Re: TimeoutFactory flagged as point of contention
              Clebert Suconic Master

              And... we didn't have this behavior in any previous version.

              Looking at CVS history, this is something introduced on JBoss_4_0_4_RC1.
              We immediatly started seeing this contention in our performance tests after upgraded our version.

              So, this is a regression, and that's why we decided to open the case.

              • 4. Re: TimeoutFactory flagged as point of contention
                Adrian Brock Master

                Yes I know. But since he doesn't show who the holds the lock
                we cannot determine what is the cause of the contention.

                Just dumping a stacktrace into JIRA and claiming a bug
                is useless under definition of analysis I've ever heard of.

                It falls into the "IT DOES NOT WORK" category of questions.

                • 5. Re: TimeoutFactory flagged as point of contention
                  Adrian Brock Master

                   

                  "clebert.suconic@jboss.com" wrote:
                  And... we didn't have this behavior in any previous version.

                  Looking at CVS history, this is something introduced on JBoss_4_0_4_RC1.
                  We immediatly started seeing this contention in our performance tests after upgraded our version.

                  So, this is a regression, and that's why we decided to open the case.


                  Well actually the code in 4_0_4RC1 was broken. It leaked memory all over
                  the place. You've seen CVS history, look at the bug reports they reference.

                  • 6. Re: TimeoutFactory flagged as point of contention
                    Adrian Brock Master

                    You'll also notice I didn't reject the bug report.
                    I said it was incomplete and described the information required.

                    • 7. Re: TimeoutFactory flagged as point of contention
                      Clebert Suconic Master

                      Phillip will provide Stack Traces.

                      But Adrian,

                      TransactionImpl is creating/destroying Timeouts over a singleton factory, using some "expensive" operations, like an arrayCopy on TimeoutFactory.newTimeout and removeNode on dropTimeout.

                      Every single commit is competing access on the Singleton TimeoutFactory.

                      On TransactionImpl:

                      TransactionImpl(long timeout)
                       {
                       ...
                       this.timeout = TimeoutFactory.createTimeout(start+timeout, this);
                       ...
                       }
                      



                      On TimeoutFactory:

                       static public Timeout createTimeout(long time, TimeoutTarget target)
                       {
                       return getSingleton().schedule(time, target);
                       }
                      


                      schedule will call newTimeout...

                       private synchronized Timeout newTimeout(long time, TimeoutTarget target)
                       {
                       ...
                       System.arraycopy(q, 0, newQ, 0, q.length);
                       //... there are other possible contentions on this method also ...
                       ..
                       }
                      
                      



                      and on dropTimeout....


                       private boolean dropTimeout(TimeoutImpl timeout)
                       {
                       synchronized (this)
                       {
                       ...
                       removeNode(timeout.index);
                       ...
                       }
                       }
                      
                      



                      And removeNode looks an expensive operation to me.


                      So, by looking at source code I really think this is going to create a contention.

                      • 8. Re: TimeoutFactory flagged as point of contention
                        Adrian Brock Master

                         

                        "clebert.suconic@jboss.com" wrote:
                        So, by looking at source code I really think this is going to create a contention.


                        I don't doubt it, but until we understand it, we won't know how to fix it.
                        i.e.. Are the operations bad/expensive or is it the global lock?

                        I've added some stress tests to the testsuite,
                        and these indicate to me that the problem is just the global lock
                        from looking at the stack traces.
                        I'm finding it hard to get a stacktrace with somebody holding the lock.

                        The PriorityBlockingQueue from JDK5/backport-concurrent-util
                        looks even more expensive in terms of the remove/cancel.

                        • 9. Re: TimeoutFactory flagged as point of contention
                          Dimitris Andreadis Master

                          When I looked at TimeoutFactory my conclusion was that:

                          http://www.jboss.com/index.html?module=bb&op=viewtopic&t=75955


                          I looked at many generic Heap / PriorityQueue implementations and they all suffer from the inefficiency that removing an arbitrary object, other than the one on the "top" on the heap requires (a) a sequencial scan of the stored elements, until it is found & removed, (b) the heap restructured.

                          Ole's implementation avoids (a) because the stored elements keep an index back to Heap, that is updated whenever stored elemenets are swapped inside the Heap.

                          So generalizing this (factoring out the priority queue), without the performance hit it not possible, I think.


                          • 10. Re: TimeoutFactory flagged as point of contention
                            Dimitris Andreadis Master

                            I agree the freeList pooling is unecessary. In fact, only the canceled items are returned to the pool, not the ones that expire normally.

                            • 11. Re: TimeoutFactory flagged as point of contention
                              Dimitris Andreadis Master

                              After 'genmans' changes and my refactoring/fixes, I think the biggest change in terms of what we had before and the way it works now, is the synchronized call to get the singleton object.

                              I guess, one way to avoid this, is to initialize this singleton as before, early, rather than lazily.

                              • 12. Re: TimeoutFactory flagged as point of contention
                                Adrian Brock Master

                                I'm not seeing contention on the singleton (synchronizing on the class).
                                Though obviously that could be optimized away in TransactionImpl.

                                I am seeing wait for locks on the scheduler instances
                                (even with only one thread :-)
                                One possible problem is that the main thread is waiting on the same
                                lock as the add/remove operations. So it is likely the JDK is falling
                                back to real locks?

                                Also if the timeout is the first one in the list, the main thread
                                is notified and holds the lock while it recalcuates the new time
                                to wakeup.
                                However, I don't see the main thread actually performing this
                                calcuation. It is always waiting as far as I can see.

                                • 13. Re: TimeoutFactory flagged as point of contention
                                  Phillip Thurmond Newbie

                                  This sample thread dump shows 200 threads blocked on TimeoutFactory. All of the blocked stack traces are similar to the one I put in the JIRA task. If you need complete dumps, I have several which I can provide. Not sure where to attach these since the JIRA issue is closed.

                                  Find unique contentions:
                                  [pthurmond@dev09 tmp]$ grep Blocked 2.dump | sort -u
                                  -- Blocked trying to get lock: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290[fat lock]

                                  Count contentions on TimeoutFactory:
                                  [pthurmond@dev09 tmp]$ grep "Blocked trying to get lock: org/jboss/util/timeout/TimeoutFactory@0x2000000015c53290" 2.dump | wc -l
                                  207



                                  • 14. Re: TimeoutFactory flagged as point of contention
                                    Adrian Brock Master

                                    The JIRA issue is only closed until you provide the full information.

                                    As it is, I 've manged to get the information with a stress test.

                                    Now we just to figure out what we can do to remove the full locking
                                    that is going on. i.e. Make it do something more like spin locking?

                                    1 2 3 Previous Next