6 Replies Latest reply on Oct 3, 2013 5:39 PM by genman

    RHQ 4.9 performance issue

    genman

      Hi, I'm seeing the RHQ server use a lot of CPU. I didn't really do a lot of deep analysis, but this is what I'm seeing a lot of.


      23:12:50,760 INFO  [org.rhq.enterprise.server.util.concurrent.AvailabilityReportSerializer] (http-/0.0.0.0:7080-33) tid=433; agent=x: releasing write lock after being locked for millis=6075
      23:13:46,624 INFO  [org.rhq.enterprise.server.util.concurrent.AvailabilityReportSerializer] (http-/0.0.0.0:7080-14) tid=193; agent=y: releasing write lock after being locked for millis=5629
      23:14:34,826 INFO  [org.rhq.enterprise.server.util.concurrent.AvailabilityReportSerializer] (http-/0.0.0.0:7080-3) tid=178; agent=z: releasing write lock after being locked for millis=6273
      

       

      This is with about 200 agents.

      It seems like it's doing a lot of this. Basically hitting JMX stuff for whatever reason which is fairly inefficient. Maybe this is better in JDK 1.7, but I am using 1.6.

       

      "http-/0.0.0.0:7080-192" daemon prio=10 tid=0x00007f5118124000 nid=0x40d waiting for monitor entry [0x00007f505547f000]
         java.lang.Thread.State: BLOCKED (on object monitor)
              at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1088)
              - waiting to lock <0x000000040017b1e0> (a com.sun.jmx.interceptor.DefaultMBeanServerInterceptor)
              at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:833)
              at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
              at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:288)
              at $Proxy140.execute(Unknown Source)
              at org.rhq.enterprise.communications.command.server.CommandProcessor.handleIncomingInvocationRequest(CommandProcessor.java:290)
              at org.rhq.enterprise.communications.command.server.CommandProcessor.invoke(CommandProcessor.java:184)
              at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:967)
              at org.jboss.remoting.transport.servlet.ServletServerInvoker.processRequest(ServletServerInvoker.java:416)
              at sun.reflect.GeneratedMethodAccessor126.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
              at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
              at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
              at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
              at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
              at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
              at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
              at org.jboss.as.jmx.PluggableMBeanServerImpl$TcclMBeanServer.invoke(PluggableMBeanServerImpl.java:527)
              at org.jboss.as.jmx.PluggableMBeanServerImpl.invoke(PluggableMBeanServerImpl.java:263)
              at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:288)
              at $Proxy145.processRequest(Unknown Source)
              at org.jboss.remoting.transport.servlet.web.ServerInvokerServlet$3.run(ServerInvokerServlet.java:413)
      

       

       

      There seems to be a lot of security stuff going on:

       

      "http-/0.0.0.0:7080-54" daemon prio=10 tid=0x00007fd7e4043000 nid=0xf80 waiting for monitor entry [0x00007fd8679f5000]
         java.lang.Thread.State: BLOCKED (on object monitor)
              at java.util.Collections$SynchronizedMap.get(Collections.java:1979)
              - waiting to lock <0x00000004006cc6e8> (a java.util.Collections$SynchronizedMap)
              at java.security.ProtectionDomain$2$1.get(ProtectionDomain.java:414)
              at sun.security.provider.PolicyFile.implies(PolicyFile.java:1108)
              at org.jboss.security.jacc.DelegatingPolicy.implies(DelegatingPolicy.java:160)
              at java.security.ProtectionDomain.implies(ProtectionDomain.java:224)
              at java.security.AccessControlContext.checkPermission(AccessControlContext.java:352)
              at java.security.AccessController.checkPermission(AccessController.java:549)
              at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
              at java.lang.reflect.AccessibleObject.setAccessible(AccessibleObject.java:107)
              at org.rhq.enterprise.server.safeinvoker.HibernateDetachUtility.nullOutFieldsByFieldAccess(HibernateDetachUtility.java:513)
      
      
      "http-/0.0.0.0:7080-57" daemon prio=10 tid=0x00007fd7e4046000 nid=0x1931 runnable [0x00007fd8673ef000]
         java.lang.Thread.State: RUNNABLE
              at java.security.AccessController.getStackAccessControlContext(Native Method)
              at java.security.AccessController.checkPermission(AccessController.java:523)
              at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
              at java.lang.reflect.AccessibleObject.setAccessible(AccessibleObject.java:107)
              at org.rhq.enterprise.server.util.CriteriaQueryGenerator.getFilterFields(CriteriaQueryGenerator.java:695)
              at org.rhq.enterprise.server.util.CriteriaQueryGenerator.setBindValues(CriteriaQueryGenerator.java:887)
              at org.rhq.enterprise.server.util.CriteriaQueryGenerator.getCountQuery(CriteriaQueryGenerator.java:882)
              at org.rhq.enterprise.server.util.CriteriaQueryRunner.execute(CriteriaQueryRunner.java:78)
              at org.rhq.enterprise.server.resource.ResourceManagerBean.findResourcesByCriteria(ResourceManagerBean.java:2614)
              at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.jboss.as.ee.component.ManagedReferenceMethodInterceptorFactory$ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptorFactory.java:72)
      

       

      Again, I haven't really profiled the server heavily, but it is a bit of a concern.

        • 1. Re: RHQ 4.9 performance issue
          mazz

          Elias Ross wrote:

           

          It seems like it's doing a lot of this. Basically hitting JMX stuff for whatever reason which is fairly inefficient. Maybe this is better in JDK 1.7, but I am using 1.6.

           

          1. "http-/0.0.0.0:7080-192" daemon prio=10 tid=0x00007f5118124000 nid=0x40d waiting for monitor entry [0x00007f505547f000] 
          2.    java.lang.Thread.State: BLOCKED (on object monitor) 
          3.         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1088) 
          4.         - waiting to lock <0x000000040017b1e0> (a com.sun.jmx.interceptor.DefaultMBeanServerInterceptor) 
          5.         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:833) 
          6.         at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) 
          7.         at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:288) 
          8.         at $Proxy140.execute(Unknown Source) 
          9.         at org.rhq.enterprise.communications.command.server.CommandProcessor.handleIncomingInvocationRequest(CommandProcessor.java:290) 
          10.         at org.rhq.enterprise.communications.command.server.CommandProcessor.invoke(CommandProcessor.java:184) 
          11.         at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:967) 
          12.         at org.jboss.remoting.transport.servlet.ServletServerInvoker.processRequest(ServletServerInvoker.java:416) 
          13.         at sun.reflect.GeneratedMethodAccessor126.invoke(Unknown Source) 
          14.         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
          15.         at java.lang.reflect.Method.invoke(Method.java:597) 
          16.         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) 
          17.         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) 
          18.         at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) 
          19.         at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) 
          20.         at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) 
          21.         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) 
          22.         at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) 
          23.         at org.jboss.as.jmx.PluggableMBeanServerImpl$TcclMBeanServer.invoke(PluggableMBeanServerImpl.java:527) 
          24.         at org.jboss.as.jmx.PluggableMBeanServerImpl.invoke(PluggableMBeanServerImpl.java:263) 
          25.         at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:288) 
          26.         at $Proxy145.processRequest(Unknown Source) 
          27.         at org.jboss.remoting.transport.servlet.web.ServerInvokerServlet$3.run(ServerInvokerServlet.java:413) 

           

           

          This is nothing new - this is the comm layer and its been doing this since RHQ 2.0. (i.e. since the beginning). So, I would say this doesn't have anything to do with any new CPU issues that might have cropped up recently.

          • 2. Re: RHQ 4.9 performance issue
            genman

            Yeah, probably not anything specific to JMX, but it was probably in the context of the security manager.

             

            I really felt like RHQ hardly used any CPU at all before, at least for RHQ 4.5.1. 'top' reports 800% to 700% CPU, which is a little scary.


            I turned off the security manager on one host, and left it on on another. Looking at the user load stats, the usage is about 30% on one host, compared to 13%. So JBoss is nearly 3 times slower with the security manager in place. The hot spot for security manager seems to be Hibernate with all the reflection. I'll actually gather some real stats and create a bug.


            During RHQ 4.5.1 the CPU usage was at about 3-4% not 30%. This was with the security manager on.


            The other thought was maybe the agents are doing a lot more communication. Meaning they are chatty.

            • 3. Re: RHQ 4.9 performance issue
              jayshaughnessy

              Not much to add at the moment, I would say that the Availability stuff should calm down after agents have started up as the reports should only be sent when there are actually availability changes.  Assuming your monitored resources are staying up (or down) there should not be much availability traffic.

              • 4. Re: Re: RHQ 4.9 performance issue
                genman

                What I figured out was there's something to do with inventory syncing, which doesn't seem to be happening. Which might explain the obvious load issues.

                 

                2013-10-03 20:56:59,381 DEBUG [InventoryManager.discovery-1] (enterprise.communications.command.client.JBossRemotingRemoteCommunicator)-
                {CommandService.remote-pojo-execute-not-permitted}Command not permitted - server reached its limit of concurrent
                invocations [Command: type=[remotepojo]; cmd-in-response=[false];
                config=[{rhq.agent-name=xxx, rhq.externalizable-strategy=AGENT, rhq.security-token=+W4vSg=, rhq.timeout=0, rhq.send-throttle=true}]
                ; params=[{invocation=NameBasedInvocation[mergeInventoryReport], targetInterfaceName=org.rhq.core.clientapi.server.discovery.DiscoveryServerService}]]. Retry in [5,000]ms
                
                

                 

                What I notice is that server discovery is basically stuck in a loop (every 5 seconds) and won't continue. Even though it says:

                 

                2013-10-03 21:02:36,803 INFO  [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Scanned platform and 0 server(s) and discovered 0 new descendant Resource(s).
                

                 

                I'm guessing it has to do with the merge inventory report step failing over and over again. This could be because a lot of agents were started at the same time.

                • 5. Re: Re: RHQ 4.9 performance issue
                  mazz

                  That "server reached its limit" message is because the server got clobbered with agent messages so its throttling the agents - this is the concurrency limits thing:

                   

                  Server Properties - RHQ - Project Documentation Editor

                  • 6. Re: RHQ 4.9 performance issue
                    genman

                    Yeah, I figured this out and increased the concurrency limits. The CPU is fine now, life is good.

                     

                    The thing that confused me was why the server does not report concurrent limit issues in the logs. I guess having an log message or alert there would be helpful...The agent doesn't mention anything except at debug.

                     

                    The other thing that confused me was why after 24 hours why the server could never get into a good state. One thing was I patched the agent with my async avail fix, which I thought might have been the problem. https://bugzilla.redhat.com/show_bug.cgi?id=971556