0 Replies Latest reply on Apr 4, 2016 1:01 PM by guttorm2

    jconsole running at high cpu load and timing out, possibly because of large # of files in log dir

    guttorm2

      In our production environment we've occasionally had a problem with all cores running at about 50%.

      In these instances I've been unable to run jconsole to look at the problem live, so instead I have run it after restart to monitor.

       

      My first problem was that jconsole timed out, forcing me to reconnect.

      While I was monitoring, some more cores went up to close to 100%, and I tried to use the threads view in jconsole to find the running threads.

      The first couple of times I did this I did not see anything suspicious.


      Today when I looked at this, I noticed that some of the threads with State: RUNNABLE were in "java.io.WinNTFileSystem.list(Native Method)", and further down in the stack trace I was seeing "org.jboss.as.jmx.*".
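
      For anyone who wants to look for the same pattern without clicking through the threads view, here is a minimal sketch using the standard java.lang.management API. It lists the RUNNABLE threads whose stack contains a given frame. Note the assumptions: the class name RunnableFrameFinder is made up for this example, and the code only inspects the JVM it runs in, so to look at Wildfly it would have to run inside the server or be adapted to use a remote ThreadMXBean.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Wildfly or the JDK.
final class RunnableFrameFinder {

    /** True if any frame's "class.method" name starts with the given prefix. */
    static boolean hasFrame(StackTraceElement[] stack, String prefix) {
        for (StackTraceElement frame : stack) {
            String name = frame.getClassName() + "." + frame.getMethodName();
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Names of RUNNABLE threads in this JVM whose stack contains a matching frame. */
    static List<String> findRunnable(String prefix) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        // false, false: skip lock details, we only need state and stack trace
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null
                    && info.getThreadState() == Thread.State.RUNNABLE
                    && hasFrame(info.getStackTrace(), prefix)) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Default prefix is the frame seen in the stack traces described in this post.
        String prefix = args.length > 0 ? args[0] : "java.io.WinNTFileSystem.list";
        for (String name : findRunnable(prefix)) {
            System.out.println(name);
        }
    }
}
```

      Passing "org.jboss.as.jmx" as the argument instead would match the frames further down in the same stack trace.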

      Could it be that the CPU spikes I noticed were connected to the timeouts in jconsole?


      I clicked the disconnect icon in jconsole, waited several minutes until there was no longer any load, and then clicked connect again.

      At once, one of the cores shot up to 100% and remained there.

      I disconnected, waited until the CPU calmed down and tried again with the same result.


      So, it seems that each time I got a timeout and reconnected in jconsole, a new busy thread was started in the Java process on our server.

      This thread then kept running for several minutes longer than it took for the timeout to happen, so when I kept reconnecting, the combined load just got higher and higher.


      These threads showed up in jconsole as "pool-2-thread-##", and I've attached one example stack trace.

      When I had 8 cores busy, I had about 8 of these threads as well, plus some that were not RUNNABLE.

      I think the stack traces of all the RUNNABLE ones were identical.


      On a hunch, since the stack trace seems to involve iterating over files, I deleted about 1000 files from our Wildfly log directory and tried again.

      Now I get no spike while running jconsole, and I get no timeout.
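
      To check how many files have piled up before deleting anything, a small sketch like the following could help. It counts the regular files in a directory and lists those older than a cutoff, without deleting them. The class name LogDirCheck, the command-line arguments and the 30-day default are assumptions made for this example; the log directory path has to be supplied, since it depends on the installation.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Wildfly or the JDK.
final class LogDirCheck {

    /** Number of regular files directly inside dir (subdirectories are not entered). */
    static long countFiles(Path dir) throws IOException {
        long count = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Regular files directly inside dir that were last modified before cutoff. */
    static List<Path> filesOlderThan(Path dir, Instant cutoff) throws IOException {
        List<Path> old = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)
                        && Files.getLastModifiedTime(entry).toInstant().isBefore(cutoff)) {
                    old.add(entry);
                }
            }
        }
        return old;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: LogDirCheck <log-dir> [max-age-days]");
            return;
        }
        Path dir = Paths.get(args[0]);
        int days = args.length > 1 ? Integer.parseInt(args[1]) : 30; // assumed default
        Instant cutoff = Instant.now().minus(Duration.ofDays(days));

        System.out.println("files in " + dir + ": " + countFiles(dir));
        // Only prints the candidates; deleting them is left as a manual step.
        for (Path old : filesOlderThan(dir, cutoff)) {
            System.out.println("older than " + days + " days: " + old);
        }
    }
}
```

      The sketch only reports candidates on purpose, so the list can be reviewed before anything is removed from a production server.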

       

      This was on Windows Server 2008 R2, with jdk1.8.0_66 and Wildfly 8.2.Final.

      We run Wildfly as a service. We run jconsole on the server itself and connect with "service:jmx:http-remoting-jmx://127.0.0.1:9990".

       

      So, my problem with jconsole seems to be solved. It will be interesting to see if this also has an effect on the original problem I was trying to debug.