2 Replies Latest reply on Oct 31, 2012 11:18 AM by tomjenkinson

    Byteman hanging during BMUnit.unloadScriptFile?

    tomjenkinson

      Hi,

       

      I am using Byteman 2.0.1 to test some crash recovery behaviour over at JBTM.

       

      We are having a strange issue where one of our tests occasionally (1 out of 10 runs?) hangs. We have done a kill -3 and have a stack trace below.

       

      You will see 3x Transaction Reaper Worker 0, Transaction Reaper, Transaction Expired Entry Monitor as the test launches three transaction managers in the same VM with port offsets.

       

      What I am interested in is what looks to be happening in the "main" thread and wondered if you could possibly shed some light for me please?

       

      Full thread dump Java HotSpot(TM) 64-Bit Server VM (22.1-b02 mixed mode):

      "Transaction Reaper Worker 0" daemon prio=10 tid=0x0000000015455800 nid=0x4a22 in Object.wait() [0x00000000428c5000]

         java.lang.Thread.State: WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3a5868> (a java.util.LinkedList)

          at java.lang.Object.wait(Object.java:503)

          at com.arjuna.ats.arjuna.coordinator.TransactionReaper.waitForCancellations(TransactionReaper.java:321)

          - locked <0x00000000fa3a5868> (a java.util.LinkedList)

          at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:65)

      "Transaction Reaper" daemon prio=10 tid=0x000000001736f800 nid=0x4a21 in Object.wait() [0x0000000041d5a000]

         java.lang.Thread.State: TIMED_WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3a5600> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)

          at com.arjuna.ats.internal.arjuna.coordinator.ReaperThread.run(ReaperThread.java:90)

          - locked <0x00000000fa3a5600> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)

      "Transaction Expired Entry Monitor" daemon prio=10 tid=0x0000000015dc2000 nid=0x4a20 in Object.wait() [0x00000000403b0000]

         java.lang.Thread.State: TIMED_WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3a8ba0> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)

          at com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor.run(ExpiredEntryMonitor.java:190)

          - locked <0x00000000fa3a8ba0> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)

      "Transaction Reaper Worker 0" daemon prio=10 tid=0x0000000017217800 nid=0x4a1f in Object.wait() [0x00000000425c2000]

         java.lang.Thread.State: WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3c4f78> (a java.util.LinkedList)

          at java.lang.Object.wait(Object.java:503)

          at com.arjuna.ats.arjuna.coordinator.TransactionReaper.waitForCancellations(TransactionReaper.java:321)

          - locked <0x00000000fa3c4f78> (a java.util.LinkedList)

          at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:65)

      "Transaction Reaper" daemon prio=10 tid=0x000000001773f000 nid=0x4a1e in Object.wait() [0x0000000041c59000]

         java.lang.Thread.State: TIMED_WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3c4d10> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)

          at com.arjuna.ats.internal.arjuna.coordinator.ReaperThread.run(ReaperThread.java:90)

          - locked <0x00000000fa3c4d10> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)

      "Transaction Expired Entry Monitor" daemon prio=10 tid=0x000000001569f000 nid=0x4a1d in Object.wait() [0x00000000424c1000]

         java.lang.Thread.State: TIMED_WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3c8010> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)

          at com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor.run(ExpiredEntryMonitor.java:190)

          - locked <0x00000000fa3c8010> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)

      "Transaction Reaper Worker 0" daemon prio=10 tid=0x00000000131f9800 nid=0x4a1c in Object.wait() [0x00000000422bf000]

         java.lang.Thread.State: WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3e52b8> (a java.util.LinkedList)

          at java.lang.Object.wait(Object.java:503)

          at com.arjuna.ats.arjuna.coordinator.TransactionReaper.waitForCancellations(TransactionReaper.java:321)

          - locked <0x00000000fa3e52b8> (a java.util.LinkedList)

          at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:65)

      "Transaction Reaper" daemon prio=10 tid=0x0000000013033800 nid=0x4a1b in Object.wait() [0x00000000423c0000]

         java.lang.Thread.State: TIMED_WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3e5050> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)

          at com.arjuna.ats.internal.arjuna.coordinator.ReaperThread.run(ReaperThread.java:90)

          - locked <0x00000000fa3e5050> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)

      "Transaction Expired Entry Monitor" daemon prio=10 tid=0x0000000016769000 nid=0x4a1a in Object.wait() [0x0000000041ae7000]

         java.lang.Thread.State: TIMED_WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000fa3e8550> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)

          at com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor.run(ExpiredEntryMonitor.java:190)

          - locked <0x00000000fa3e8550> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)

      "Attach Listener" daemon prio=10 tid=0x0000000013229800 nid=0x49ca runnable [0x0000000000000000]

         java.lang.Thread.State: RUNNABLE

      "Service Thread" daemon prio=10 tid=0x00002aaab4039800 nid=0x4969 runnable [0x0000000000000000]

         java.lang.Thread.State: RUNNABLE

      "C2 CompilerThread1" daemon prio=10 tid=0x00002aaab4037000 nid=0x4968 waiting on condition [0x0000000000000000]

         java.lang.Thread.State: RUNNABLE

      "C2 CompilerThread0" daemon prio=10 tid=0x00002aaab4034000 nid=0x4967 waiting on condition [0x0000000000000000]

         java.lang.Thread.State: RUNNABLE

      "Signal Dispatcher" daemon prio=10 tid=0x00002aaab4032000 nid=0x4966 runnable [0x0000000000000000]

         java.lang.Thread.State: RUNNABLE

      "Finalizer" daemon prio=10 tid=0x0000000012292800 nid=0x4965 in Object.wait() [0x00000000420bd000]

         java.lang.Thread.State: WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000f9c58a20> (a java.lang.ref.ReferenceQueue$Lock)

          at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)

          - locked <0x00000000f9c58a20> (a java.lang.ref.ReferenceQueue$Lock)

          at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)

          at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

      "Reference Handler" daemon prio=10 tid=0x0000000012290000 nid=0x4964 in Object.wait() [0x0000000041fbc000]

         java.lang.Thread.State: WAITING (on object monitor)

          at java.lang.Object.wait(Native Method)

          - waiting on <0x00000000f9c58ab8> (a java.lang.ref.Reference$Lock)

          at java.lang.Object.wait(Object.java:503)

          at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

          - locked <0x00000000f9c58ab8> (a java.lang.ref.Reference$Lock)

      "main" prio=10 tid=0x0000000012208800 nid=0x4960 runnable [0x0000000040aa1000]

         java.lang.Thread.State: RUNNABLE

          at java.net.SocketInputStream.socketRead0(Native Method)

          at java.net.SocketInputStream.read(SocketInputStream.java:150)

          at java.net.SocketInputStream.read(SocketInputStream.java:121)

          at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)

          at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)

          at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)

          - locked <0x00000000fa45a0c8> (a java.io.InputStreamReader)

          at java.io.InputStreamReader.read(InputStreamReader.java:184)

          at java.io.BufferedReader.fill(BufferedReader.java:154)

          at java.io.BufferedReader.readLine(BufferedReader.java:317)

          - locked <0x00000000fa45a0c8> (a java.io.InputStreamReader)

          at java.io.BufferedReader.readLine(BufferedReader.java:382)

          at org.jboss.byteman.agent.submit.Submit$Comm.readResponse(Submit.java:857)

          at org.jboss.byteman.agent.submit.Submit.submitRequest(Submit.java:738)

          at org.jboss.byteman.agent.submit.Submit.deleteScripts(Submit.java:646)

          at org.jboss.byteman.agent.submit.Submit.deleteRulesFromFiles(Submit.java:619)

          at org.jboss.byteman.contrib.bmunit.BMUnit.unloadScriptFile(BMUnit.java:291)

          at org.jboss.byteman.contrib.bmunit.BMUnitRunner$5.evaluate(BMUnitRunner.java:252)

          at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)

          at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)

          at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)

          at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)

          at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)

          at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)

          at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)

          at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)

          at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)

          at org.junit.runners.ParentRunner.run(ParentRunner.java:236)

          at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)

          at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)

          at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)

          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

          at java.lang.reflect.Method.invoke(Method.java:601)

          at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)

          at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)

          at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:172)

          at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:78)

          at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:70)

      "VM Thread" prio=10 tid=0x0000000012288800 nid=0x4963 runnable

      "GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000012213000 nid=0x4961 runnable

      "GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000012215000 nid=0x4962 runnable

      "VM Periodic Task Thread" prio=10 tid=0x00002aaab4044000 nid=0x496a waiting on condition

      JNI global references: 6928

      Heap

      PSYoungGen      total 29312K, used 843K [0x00000000fdeb0000, 0x0000000100000000, 0x0000000100000000)

        eden space 24448K, 3% used [0x00000000fdeb0000,0x00000000fdf82dd0,0x00000000ff690000)

        from space 4864K, 0% used [0x00000000ffb40000,0x00000000ffb40000,0x0000000100000000)

        to   space 4800K, 0% used [0x00000000ff690000,0x00000000ff690000,0x00000000ffb40000)

      PSOldGen        total 68288K, used 10050K [0x00000000f9c00000, 0x00000000fdeb0000, 0x00000000fdeb0000)

        object space 68288K, 14% used [0x00000000f9c00000,0x00000000fa5d0948,0x00000000fdeb0000)

      PSPermGen       total 83968K, used 81266K [0x00000000f4a00000, 0x00000000f9c00000, 0x00000000f9c00000)

        object space 83968K, 96% used [0x00000000f4a00000,0x00000000f995cb38,0x00000000f9c00000)

       

      (The test is a bit complex, but the BMUnit config can be seen here: https://github.com/jbosstm/narayana/blob/master/ArjunaJTS/integration/src/test/java/com/arjuna/ats/jta/distributed/SimpleIsolatedServers.java)

       

      Thanks!

      Tom

        • 1. Re: Byteman hanging during BMUnit.unloadScriptFile?
          adinn

          Hi Tom,

           

          The main thread is hanging in the loop where the Submit class is reading the response to an unload request from the Byteman agent. I don't see any evidence of the Byteman Agent Listener thread in the thread backtrace. I wonder if perhaps this thread has died and, hence, failed to post a response via the socket the Submit thread is listening on. If so then I would have expected to see some sort of exception trace in the JVM output (wherever System.out is directed to).

           

          I didn't think that it was possible for this thread just to die but I will take a look to see if it could be killed by an unhandled exception.

          • 2. Re: Byteman hanging during BMUnit.unloadScriptFile?
            tomjenkinson

            Hi Andrew,

             

            The output that I have is here: http://<albany>/job/jbossts-narayana-java7/36/console

             

            There is nothing in the console on that run, however I do think the way I had surefire configured it was caching the output till the end of the test and as the test failed it didn't capture any of it. I have now changed this, so if it happens again I should be able to give more information.

             

            (There are 3 sets of dumps ran in quick succession as I tried to work out which process was the sure fire one. For information it is the first and third dump, the middle one was the main surefire process I think but I have forked jvms on).

             

            Tom