1 2 Previous Next 29 Replies Latest reply on Feb 4, 2009 9:39 AM by adrian.brock Go to original post
      • 15. Re: Strange classloading behavior -- thread stuck

         

        "adrian@jboss.org" wrote:
        "adrian@jboss.org" wrote:

        The code is the same as JBoss4.x


        There is one difference between 4.x and 5.x, which is the 4.x code won't loop forever
        since it uses a for loop
        JBoss 4.x
        
         int size = taskList != null ? taskList.size() : 0;
         synchronized( taskList )
         {
         for(int i = 0; i < size; i ++)
        
        JBoss5.x
        
         synchronized (taskList)
         {
         while (taskList.isEmpty() == false)
         {
        


        But using a for loop would just hide the problem.

        The thread on Trace2 still wouldn't be woken up because the other thread wouldn't
        do the notification on its task list (it would still assign the task to the wrong thread).


        One horrible thought is that JBoss4.x has the same problem but we haven't seen it
        as clearly before because of the for loop. ;-(

        • 16. Re: Strange classloading behavior -- thread stuck
          brian.stansberry

          Just an FYI: Bela's likely out for the rest of the day. I'm trying to reproduce w/ TRACE logging now, but so far w/o success. If I can figure out the secret recipe to reproduce, I'll switch the jboss-classloader.jar and see what happens.

          • 17. Re: Strange classloading behavior -- thread stuck

            I think I know what the problem is, but if I'm correct it will be very difficult to
            reproduce the problem.

            If you look at the old code in JBoss4, LoadMgr3 in nextTask()
            when releaseInNextTask == true just removes the lock on the classloader
            and unassigns the current thread as the owner of the classloader.

            The important part is that it doesn't try to reschedule any outstanding tasks
            back to their original threads. That only occurs in endLoadTask().

             if( threadTask.releaseInNextTask == true )
             {
             if( trace )
             log.trace("Releasing loadLock and ownership of UCL: "+threadTask.ucl);
             synchronized( registrationLock )
             {
             loadClassThreads.remove(threadTask.ucl);
             }
             synchronized( threadTask.ucl )
             {
             ucl3.release();
             ucl3.notifyAll();
             }
             }
            


            While in JBoss5, it does:
             if (threadTask.isReleaseInNextTask())
             threadTask.getClassLoader().unlock();
            


            The classloader unlock() will always try to reassign the outstanding tasks
            when it invokes ClassLoaderManager.unregisterLoaderThread()
            so it is reassigning tasks in both places.

            • 18. Re: Strange classloading behavior -- thread stuck

               

              "adrian@jboss.org" wrote:
              I think I know what the problem is, but if I'm correct it will be very difficult to reproduce the problem.


              The reason I think it will be difficult to reproduce the problem is because
              I think this only occurs when you hit the ClassCircularity bug.

              The releaseInNextTask == true means that it got a temporary lock on the classloader.
              Its not the original classloader used to initiate the classloading attempt.

              This can only occur if the thread owns the classloader so it should in normal circumstances be able to complete the task, release the classloader and reschedule back the other requests.

              If however, it hits the ClassCircularity problem (see the catch block just above the unlock), it will reschedule that task. So when it does the unlock, it will loop
              on the reassignment of the task back to the same thread.

              • 19. Re: Strange classloading behavior -- thread stuck
                brian.stansberry

                 

                "adrian@jboss.org" wrote:
                The reason I think it will be difficult to reproduce the problem is because I think this only occurs when you hit the ClassCircularity bug.


                Perhaps the mod_cluster objects being deserialized trigger the ClassCircularity bug? That may explain why Bela is seeing this with mod_cluster while we haven't gotten a ton of reports on it for AS 5.

                Still trying to reproduce... Bela said it wasn't that hard for him to reproduce by stumbling into it, so if I can't there's a reasonable chance he'll be able to.

                • 20. Re: Strange classloading behavior -- thread stuck

                   

                  "bstansberry@jboss.com" wrote:
                  "adrian@jboss.org" wrote:
                  The reason I think it will be difficult to reproduce the problem is because I think this only occurs when you hit the ClassCircularity bug.


                  Perhaps the mod_cluster objects being deserialized trigger the ClassCircularity bug? That may explain why Bela is seeing this with mod_cluster while we haven't gotten a ton of reports on it for AS 5.


                  It also depends which version of the JDK you use. e.g. Sun claim it is fixed from
                  1.5.0-b08 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=2139628


                  • 21. Re: Strange classloading behavior -- thread stuck

                    I've uploaded a second snapshot which includes a fix for the reassignment problem.
                    http://snapshots.jboss.org/maven2/org/jboss/cl/jboss-classloader/2.0.3-SNAPSHOT/
                    i.e. this part of the code should now be back to what it was in JBoss-4.x

                    But Ales says he is going to do the 2.0.3.GA release tonight anway.

                    I'm going to close JBCL-81 since the reassignment problem has all the symptoms
                    of the original problem and it is consistent with the stacktraces posted before.

                    An extra check would be to get a TRACE log (if you reproduce the problem)
                    that has the ClassCircularityError or LinkageError. We only log it as an error
                    if it fails after 10 attempts. :-)

                    • 22. Re: Strange classloading behavior -- thread stuck
                      brian.stansberry

                      OK.

                      Note that Bela was using some JDK 6 release; you can tell from the "java.lang.Thread.State: RUNNABLE" in the thread dumps.

                      • 23. Re: Strange classloading behavior -- thread stuck

                         

                        "bstansberry@jboss.com" wrote:
                        OK.

                        Note that Bela was using some JDK 6 release; you can tell from the "java.lang.Thread.State: RUNNABLE" in the thread dumps.


                        Notice I said "claim" to have fixed it. :-)

                        I remember still seeing it on JDK 6 with one of our classloading tests, I forwarded
                        it to one of the guys at Sun working on the concurrent classloading in JDK 7
                        but I never heard back from him. Maybe it got fixed in a later release?

                        The changes in JDK 7 (if they happen) will make all this nonsense unnecessary!


                        • 24. Re: Strange classloading behavior -- thread stuck
                          brian.stansberry

                          Any reason why the category for the logger in org.jboss.classloader.spi.base.ClassLoaderManager is

                          private static Logger log = Logger.getLogger("org.jboss.detailed.classloader.ClassLoaderManager");


                          which means enabling TRACE for org.jboss.classloader doesn't pick up output from this class. :-(

                          I'd just change it but the hardcoded String category makes me think there's a 0.5% chance there's a reason for that, so I'm asking first. ;-)

                          Anyway, good news is I can reproduce this now; will fix my jboss-log4j.xml to work around above issue and will have logs showing what's going on.

                          Then I'll try the 2.0.3-SNAPSHOT.

                          • 25. Re: Strange classloading behavior -- thread stuck
                            brian.stansberry

                            I was able to induce the classloading loop quite easily using JDK 6 (u12) a stock 5.0.0.GA. But with the 2.0.3-SNAPSHOT libraries I've been trying for over half an hour without success.

                            So, something you did helped. :-)

                            More mysterious is exactly what caused the problem. I have logs from a server that failed and grepped for "Circular" or "LinkageError" and saw nothing. The logs are quite complex so it would probably take hours of detail analysis to get anything out of them. I attached them to JBCL-81 just so they are archived somewhere in case they prove useful.

                            I believe a bug I found today (https://jira.jboss.org/jira/browse/MODCLUSTER-47) was relevant here. It basically resulted in one node continually sending an RPC to another, with a complex object passed as a param and a complex object returned. So, fair bit of classloading calls involved. This RPC happens about 20 times a second. Failure seemed to occur when you made an AJP request to the server while this was going on; the thread handling the AJP request ends up in the state shown in my 3rd post on this thread; the JGroups thread ends up in the tight loop.

                            The continual RPC issue is why the log attached to the JIRA is so large.

                            • 26. Re: Strange classloading behavior -- thread stuck
                              belaban

                              I downloaded Adrian's snapshot and I wasn't able to reproduce the class loading issue.

                              However, I ran into hangs when shutting down an instance. I think this is related to MODCLUSTER-47, so it is not relevant to this forum. I'll continue in MODCLUSTER-47.
                              Thanks !

                              • 27. Re: Strange classloading behavior -- thread stuck

                                 

                                "bstansberry@jboss.com" wrote:
                                Any reason why the category for the logger in org.jboss.classloader.spi.base.ClassLoaderManager is

                                private static Logger log = Logger.getLogger("org.jboss.detailed.classloader.ClassLoaderManager");


                                which means enabling TRACE for org.jboss.classloader doesn't pick up output from this class. :-(


                                Sorry about that. I did it deliberately (but forgot I did it)
                                because I wanted to keep Scott's logging for the
                                thread scehduling in the code, but not normally show it.

                                As you found, this thread scheduling logging tends to swamp the main
                                classloader logging making it hard to read for normal classloader debugging
                                e.g. why didn't it load that class from where I expected?


                                • 28. Re: Strange classloading behavior -- thread stuck
                                  brian.stansberry

                                  No problem. :-)

                                  The ClassLoadingTask log category had I typo:

                                  protected static Logger log = Logger.getLogger("org.jboss.detailed.classloader.ClassLoadinTask");


                                  Missing "g" in "Loading". I just fixed that, r83853. To which I just a just realized I didn't add a commit message. :( Better get some coffee.

                                  • 29. Re: Strange classloading behavior -- thread stuck

                                     

                                    "bstansberry@jboss.com" wrote:
                                    I was able to induce the classloading loop quite easily using JDK 6 (u12) a stock 5.0.0.GA. But with the 2.0.3-SNAPSHOT libraries I've been trying for over half an hour without success.

                                    So, something you did helped. :-)

                                    More mysterious is exactly what caused the problem. I have logs from a server that failed and grepped for "Circular" or "LinkageError" and saw nothing. The logs are quite complex so it would probably take hours of detail analysis to get anything out of them. I attached them to JBCL-81 just so they are archived somewhere in case they prove useful.


                                    I've been examing your log and I think I understand another way this can happen.
                                    My fix is also correct in that case as well, which is why it helped.

                                    Its failry complicated so bare with me (I'll try and simplify it to the bare minimum :-).

                                    Step 1

                                    Thread1: gets a lock on ClassLoader A to do a normal classloading request on ClassA

                                    Step 2

                                    Thread1: to load ClassA it also needs to load ClassB from ClassLoader B
                                    Thread1: aquires the lock - but hasn't scheduled the task yet

                                    Step 3

                                    Thread2: wants to load ClassC from ClassLoaderA and schedules against Thread1
                                    to do that since it owns the lock

                                    Step 4

                                    Thread1: schedules the load ClassB, but it is behind ClassC

                                    Step 5

                                    Thread1: does the load on ClassC but this requires to load ClassD from ClassLoaderA
                                    (note Thread1 is now the requestingThread even though the requestingThread for the
                                    initial load of ClassC is Thread2 - this is the importan part)

                                    Step 6

                                    Thread1: Schedules the load of ClassD

                                    Step 7

                                    Thread1: does the load of ClassB and then tries to reassign the tasks.

                                    But there is still the load of ClassD for the same thread in the list from step (5)
                                    So it loops trying to reassign to itself.

                                    As I noted in step (5) the case I'd missed when trying to determine how it could happen
                                    is when you get a recursive classloading request. Although the load of ClassC
                                    belongs to Thread2, its knock on effect to load ClassD actually belongs to Thread1

                                    So the ClassCircularity bug is not required to reproduce the problem,
                                    just this "jumping the queue" or more correctly, interleaving of recursive requests
                                    that get executed on the same thread.


                                    1 2 Previous Next