2 Replies Latest reply on Mar 4, 2013 11:20 AM by willreichert

    Rule does not fire (missing log entry) at high concurrency

    willreichert

      I created simple method entry exit rules to track method execution times (wall time) but some of the rules are not firing when I increase the concurrency for the benchmark.

       

      Background

      I started by using the built in timers but the output reported a smaller elapsed time for methods that were lower on the call stack (which I believe is not possible). I could not figure out why the methods were reporting a longer execution time than the calling method so I switched my approach to log the timestamp of method entry and exit. This generates a great deal more log data (8G for a 20min test). Using the method entry / exit logging I noticed there were more method entries than exits for some of the methods in the callstack.

       

      Byteman Script:

      The actual script traces several methods but each method follows the entry / exit pattern below:

      RULE org.spec.jent.ejb.mfg.session.WorkOrderSession.scheduleWorkOrder_entry
      CLASS org.spec.jent.ejb.mfg.session.WorkOrderSession
      METHOD scheduleWorkOrder
      AT ENTRY
      BIND
      threadId = Thread.currentThread().getId();
      threadName = Thread.currentThread().getName();
      nanoTime = ""+System.nanoTime();
      milliTime = ""+System.currentTimeMillis();
      methodId = "org.spec.jent.ejb.mfg.session.WorkOrderSession.scheduleWorkOrder";
      IF TRUE
      DO
      openTrace("timedStacks","/home/benchuser/entryExit.log");
      traceln("timedStacks","{\"entry\":\""+methodId+"\",\"threadId\":"+threadId+",\"threadName\":\""+threadName+"\",\"nano\":"+nanoTime+",\"milli\":"+milliTime+"}");
      ENDRULE
      RULE org.spec.jent.ejb.mfg.session.WorkOrderSession.scheduleWorkOrder_exit
      CLASS org.spec.jent.ejb.mfg.session.WorkOrderSession
      METHOD scheduleWorkOrder
      AT EXIT
      BIND
      threadId = Thread.currentThread().getId();
      threadName = Thread.currentThread().getName();
      nanoTime = ""+System.nanoTime();
      milliTime = ""+System.currentTimeMillis();
      methodId = "org.spec.jent.ejb.mfg.session.WorkOrderSession.scheduleWorkOrder";
      IF TRUE
      DO
      openTrace("timedStacks","/home/benchuser/entryExit.log");
      traceln("timedStacks","{\"exit\":\""+methodId+"\",\"threadId\":"+threadId+",\"threadName\":\""+threadName+"\",\"nano\":"+nanoTime+",\"milli\":"+milliTime+"}");
      ENDRULE
      

       

      JVM arguments:

      JAVA_OPTS="$JAVA_OPTS -Dorg.jboss.byteman.compile.to.bytecode"
      JAVA_OPTS="$JAVA_OPTS -javaagent:${BYTEMAN_HOME}/lib/byteman.jar=listener:false,boot:${BYTEMAN_HOME}/lib/byteman.jar,script:/home/benchuser/entryExit.btm"
      JAVA_OPTS="$JAVA_OPTS -Dorg.jboss.byteman.transform.all"
      

       

      It looks like traceln immediately flushes the message to the stream so I do not think there is an issue with buffer overflow for logging but please correct me if I am wrong. I suspect the problem may be that an exception is being thrown and therefore AT EXIT is not being invoked. Is there a different rule type I could use which would invoke even if the method exits becasue of an exception?

        • 1. Re: Rule does not fire (missing log entry) at high concurrency
          adinn

          Hi Will,

           

          Thanks for posting your results here.

          Will Reichert wrote:

           

          Background

          I started by using the built in timers but the output reported a smaller elapsed time for methods that were lower on the call stack (which I believe is not possible). I could not figure out why the methods were reporting a longer execution time than the calling method so I switched my approach to log the timestamp of method entry and exit. This generates a great deal more log data (8G for a 20min test). Using the method entry / exit logging I noticed there were more method entries than exits for some of the methods in the callstack.

           

          Is there any possibility that you could be entering instrumented methods recursively? If so then that might explain the disparity.

           

          Will Reichert wrote:

           

          It looks like traceln immediately flushes the message to the stream so I do not think there is an issue with buffer overflow for logging but please correct me if I am wrong. I suspect the problem may be that an exception is being thrown and therefore AT EXIT is not being invoked. Is there a different rule type I could use which would invoke even if the method exits becasue of an exception?

           

          Yes, traceln will flush the message.

           

          If there are exceptions below a call to the instrumented method which propagate out through the  call then they will bypass the EXIT rule. So, that may account for the missing EXIT firings. The only other things I can think of to account for the disparity are i) a thread exit occuring below an ENTRY or ii) a thread left suspended under an ENTRY at the end of your test run. It may be that there is a concurrency bug in Byteman which means thats certain firings are not happening or are omitting to ouptut trace but I think that is very unlikely (after 30 years of writing concurrent code I am not going  to rule that out). is it possible for you to detect whether exceptions are occuring? or indeed whether any of the threads are exiting or suspended at test end? It would be nice to know that this was the cause of the disparity in the ENTRY and EXIT counts.

           

          Unfortunately, Byteman does not provide any way of dealing with the situation where an exception passes through a trigger method. It only transforms trigger methods by injecting code which gets executed during control flow within the method itself. Exception flow through a method happens because of a stack unwind and this unwind ignores any method which does not explicitly catch ands handle the exception that caused it. In theory it ought to be possible to support something like an AT EXCEPTIONAL_EXIT location type to catch this sort of transfer of control. Byteman could wrap the whole of the trigger method body in a try catch block catching Throwable. The handler block for this injected try catch could trigger the rule and then rethrow the exception. That's in theory, of course --  I am not sure how easy it would be in practice to implement this. I'll be happy to think about this as a possible feature request but I am not likely to be able to add it very soon ad it may turn out to be infeasible.

           

          Thanks very much for posting your results. I am sorry I cannot offer any simple fixes.

           

          regards,

           

           

          Andrew Dinn

          • 2. Re: Rule does not fire (missing log entry) at high concurrency
            willreichert

            The method I was using to set the flag to true does ocassionally throw an exception which is what caused the flag to not be set. Thank you for the explination and for taking a look at my issue. I changed to simply logging the entry and exit times individually and using a script to match entry and exit log entries.