15 Replies Latest reply on Apr 24, 2008 9:57 AM by harish43

    Resource GC problem

    kurtstam

      Hi guys,

      I'm posting this to keep track of the issue that when you write connection open/close intensive code, it seems that the GC cannot free up all the resources that it was using. This will not be an issue if you use pooling of the connections, but if you don't and simply do:

      lookup factory, lookup queue, connect to queue, open session, send message, receive message, close connection.

      and you run this in a loop, then you will run out of memory at some point. Both JBM and JBMQ seem to have this problem. From what Kevin could see it may have to do with the way JBoss Remoting is used. Note that we did not look at the code, only at what the profiler is telling us.

      Thanks,

      --Kurt

        • 1. Re: Resource GC problem
          kconner

          The following is a copy of the email I sent to our group earlier today.

          Kev

          -----------------------------------------------------------------------------

          From what I can see the issue appears to be related to two things

          - The way the current codebase repeatedly creates connections
          - The way JBoss Remoting works

          One cause appears to be the timers used by remoting. From what I can
          see there are three in play

          - ConnectionValidator
          - BisocketServerInvoker$ControlMonitorTimerTask
          - LeasePinger$LeaseTimerTask

          Each one of these timers has indirect access to the majority of the heap.

          The big culprit in the timers appears to be LeasePinger$LeaseTimerTask.
          When connections are closed this task is cancelled but unfortunately
          cancelling j.u.TimerTask does *not* remove the task from the queue; all
          it does is mark it as cancelled. The consequence of this is that every
          instance referenced by the task *cannot* be garbage collected until the
          timer would normally fire (and the task is then removed from the queue).
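The behaviour of java.util.Timer described above can be reproduced with a minimal sketch. LeaseTask here is a hypothetical stand-in that pins a 64k buffer; it is not the Remoting LeasePinger code. The sketch assumes a live task with an earlier deadline sits at the head of the queue, and uses Timer.purge() to count how many cancelled tasks were still queued. Whether purge() is the appropriate fix for Remoting is a separate question.

```java
import java.util.Timer;
import java.util.TimerTask;

final class TimerCancelDemo {
    /** Hypothetical stand-in for a lease task that pins a 64k buffer. */
    static final class LeaseTask extends TimerTask {
        final byte[] buffer = new byte[64 * 1024]; // reachable for as long as the task is queued

        @Override
        public void run() {
            // a real task would ping the lease here
        }
    }

    /** Schedules and cancels count tasks, then returns how many Timer.purge() removes. */
    static int cancelThenPurge(int count) {
        Timer timer = new Timer(true); // daemon timer thread
        try {
            // A live task with an earlier deadline stays at the head of the queue,
            // so the timer thread does not inspect the cancelled tasks behind it
            // before purge() runs.
            timer.schedule(new LeaseTask(), 30_000L);
            for (int i = 0; i < count; i++) {
                TimerTask task = new LeaseTask();
                timer.schedule(task, 60_000L);
                task.cancel(); // only marks the task; it is still in the timer's queue
            }
            return timer.purge(); // drops every cancelled task from the queue
        } finally {
            timer.cancel(); // stop the timer thread and discard the remaining task
        }
    }

    public static void main(String[] args) {
        System.out.println("purged " + cancelThenPurge(100));
    }
}
```

Every task that purge() reports had been cancelled yet was still strongly referenced by the timer, together with its buffer.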

          Referenced from each LeasePinger instance is a BisocketClientInvoker
          which contains a ClientSocketWrapper. Each ClientSocketWrapper
          references a Socket, a DataInputStream (containing BufferedInputStream)
          and a DataOutputStream (containing BufferedOutputStream). Each BIS/BOS
          contains a 64k array! In my tests these instances amount to a
          cumulative size of about 1/3 of the heap.

          Another cause appears to be the use of hash maps. There are numerous
          hashmaps referenced from BisocketServerInvoker and BisocketClientInvoker
          which do not appear to be garbage collected. One reason is the above
          timers but a second is that BisocketServerInvoker holds on to
          BisocketServerInvoker references in a static map called
          listenerIdToServerInvokerMap. This map currently contains an instance
          of BisocketServerInvoker for every iteration of the loop.
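The static-map pattern described above can be sketched as follows. RegisteredInvoker and its REGISTRY map are hypothetical names used only for illustration; the real listenerIdToServerInvokerMap has only been seen in the profile, so its actual handling may differ. The sketch assumes the entry is added on construction and shows that it has to be removed explicitly before the instance can be collected.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical invoker that registers itself in a static map. */
final class RegisteredInvoker implements AutoCloseable {
    // Entries in a static map stay reachable until they are removed explicitly.
    private static final Map<String, RegisteredInvoker> REGISTRY = new ConcurrentHashMap<>();

    private final String listenerId;

    RegisteredInvoker(String listenerId) {
        this.listenerId = listenerId;
        REGISTRY.put(listenerId, this);
    }

    /** Removing the entry on close is what makes the instance collectable again. */
    @Override
    public void close() {
        REGISTRY.remove(listenerId, this);
    }

    static int registeredCount() {
        return REGISTRY.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            try (RegisteredInvoker invoker = new RegisteredInvoker("listener-" + i)) {
                // use the invoker for one iteration of the loop
            }
        }
        System.out.println("still registered: " + registeredCount());
    }
}
```

If close() did not remove the entry, the map would hold one instance per loop iteration, which matches what the profiler shows.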

          This has all been discovered from examining profile information, not
          source code. It may be that this analysis is completely wrong and that
          examination of the source code will highlight other issues.

          Kev

          • 2. Re: Resource GC problem
            timfox

            Thanks Kev - looks like a JBoss Remoting issue.

            I will investigate on Monday after I return from California.

            BTW - any particular reason ESB is creating so many ephemeral connections?

            I would say this is an anti-pattern, although of course this should not be causing a resource leak.

            • 3. Re: Resource GC problem
              kconner

              Hiya Tim.

              Yes, it appears from the profiler that this is a remoting issue. Which version are you using? Is it 2.0.0GA?

              I also agree that it is an anti-pattern and we are working to address this in the 4.x code. Kurt and I are rewriting this part of the code now.

              Kev

              • 4. Re: Resource GC problem
                kconner

                The version of remoting appears to be 2.2.0 beta1, is this correct?

                • 5. Re: Resource GC problem
                  kconner

                  Hiya Tim.

                  I have suggested fixes for both of these issues. Initial tests show the client appearing to remain stable at around 2m-5m.

                  I'll run more tests later this weekend.

                  Kev

                  • 6. Re: Resource GC problem
                    timfox

                    Thx Kevin.

                    Yes, we are running 2.2.0.beta1.

                    BTW I think Ron Sigal may have some fixes for this too - maybe you guys should liaise?

                    • 7. Re: Resource GC problem
                      kconner

                      I have sent my suggested fixes to Ron.

                      I will run more tests later this weekend though as there appears to be another issue on the server side.

                      Kev

                      • 8. Re: Resource GC problem
                        kconner

                        Hiya Tim.

                        The server side issue is with Messaging :-)

                        The ServerSessionEndpoint code contains an executor for each instance. Unfortunately nothing shuts down the executor which means that the threads created by the executor are never destroyed.

                        I modified the messaging code to include a call to executor.shutdownNow() in ServerSessionEndpoint.close() and this appears to have done the trick.
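A minimal sketch of the per-instance executor problem and the fix described above. SessionEndpointSketch is a hypothetical class, and java.util.concurrent is used as a stand-in; it is not the executor class that Messaging uses. With a fixed thread pool, the idle worker thread is kept until the pool is shut down, so an endpoint that never shuts its executor down leaves one thread behind per instance.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical endpoint that owns one single-threaded executor per instance. */
final class SessionEndpointSketch {
    private final ExecutorService executor = Executors.newFixedThreadPool(1);

    void handle(Runnable work) {
        executor.execute(work);
    }

    /** Shuts the executor down so its worker thread can exit; returns true once it has terminated. */
    boolean close() {
        executor.shutdownNow(); // interrupts the running task and discards queued ones
        try {
            return executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            SessionEndpointSketch endpoint = new SessionEndpointSketch();
            endpoint.handle(() -> { });
            endpoint.close();
        }
        System.out.println("live threads: " + Thread.activeCount());
    }
}
```

Removing the call to close() in the loop leaves one idle worker thread per endpoint, which is the kind of growth seen on the server side.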

                        Kurt's test is now up to 40000 iterations and the client/server look stable.

                        I'll give it a longer run over the next day or so.

                        Kev

                        • 9. Re: Resource GC problem
                          clebert.suconic

                          I am moving this thread to "Design of Messaging on JBoss (Messaging/JBoss)"

                          • 10. Re: Resource GC problem
                            timfox

                            Kev-

                            Thanks for your work over the past few days :) If you want to join the messaging team there are vacancies ;)

                            Regarding the executor - good catch. Actually this queued executor was a last-minute addition to work around another remoting issue.

                            BTW are you creating/destroying a lot of sessions rapidly now? If so, pls bear in mind that sessions are also fairly heavyweight objects so this would be considered an anti-pattern too.

                            This is why the JCA layer for instance, caches underlying JMS sessions.
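The caching idea can be illustrated with a very small pool. SimplePool is a hypothetical class written for this sketch; it is not the JCA layer's implementation, and it leaves out validation, size limits and eviction that a real pool would need. It only shows that reusing a released resource avoids creating a new heavyweight object per iteration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical minimal pool: hands out an idle resource if one exists, else creates one. */
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    synchronized T borrow() {
        T resource = idle.pollFirst();
        if (resource == null) {
            resource = factory.get(); // the expensive creation happens only on a miss
            created++;
        }
        return resource;
    }

    synchronized void release(T resource) {
        idle.addFirst(resource); // make the resource available for the next borrower
    }

    synchronized int createdCount() {
        return created;
    }
}
```

A loop that borrows and releases one resource per iteration creates a single instance, however many iterations it runs.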

                            • 11. Re: Resource GC problem
                              kconner

                              Hiya Tim.

                              Thanks for the offer, I'll bear it in mind :-)

                              These observations have come from the initial test which Kurt wrote, the intention of which was to mimic the code which is currently present in the ESB.

                              We are aware that these are anti-patterns and that their use will have a negative effect on performance. Kurt and I are rewriting this as we speak.

                              Do you have any plans to release a version of Messaging which addresses these issues?

                              Thanks,
                              Kev

                              • 12. Re: Resource GC problem
                                timfox

                                We are hoping to release a follow up to 1.2.0 fairly soon, although I'm not sure of the exact timing.

                                BTW I couldn't see your change for the QueuedExecutor in SVN - I assume you haven't committed it yet?

                                (I'm thinking a shutdownAfterExecutingCurrentTask() would be more appropriate too)

                                • 13. Re: Resource GC problem
                                  kconner

                                  Hiya Tim.

                                  I have not committed anything to svn as I felt the decision on how best to handle this should come from the messaging team. :-)

                                  There are two other shutdown methods which could be used depending on what you wish to achieve. The choice of shutdownNow was only made to test the fix.

                                  My instinct would be to let the queue drain using shutdownAfterProcessingCurrentlyQueuedTasks rather than to use the shutdownAfterExecutingCurrentTask.
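The difference between draining the queue and stopping immediately can be sketched with java.util.concurrent, used here as a stand-in for the executor discussed in this thread. As an assumption of this sketch, ExecutorService.shutdown() plays the role of the "drain the queue" option and ExecutorService.shutdownNow() the role of the immediate stop. java.util.concurrent has no direct counterpart to finishing only the current task, since shutdownNow() also interrupts the running one. ShutdownModes is a hypothetical class name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ShutdownModes {
    /** Queues jobs behind a blocked first job, shuts down, and returns how many jobs completed. */
    static int completedAfterShutdown(int jobs, boolean drain) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // The first job blocks, so the remaining jobs are still queued at shutdown time.
            executor.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // interrupted by shutdownNow(): give up without completing
                }
                completed.incrementAndGet();
            });
            for (int i = 1; i < jobs; i++) {
                executor.execute(completed::incrementAndGet);
            }
            started.await();

            if (drain) {
                executor.shutdown();    // accept no new work, but run everything already queued
                release.countDown();
            } else {
                executor.shutdownNow(); // interrupt the running job and discard the queue
            }
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            executor.shutdownNow();
        }
        return completed.get();
    }
}
```

With drain set to true every queued job completes before the worker thread exits; with drain set to false none of them do, which is why draining looks like the safer default when queued work must not be lost.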

                                  Kev

                                  • 14. Re: Resource GC problem
                                    timfox

                                    np

                                    I'll apply the fix
