8 Replies Latest reply on Sep 12, 2011 5:12 AM by mdpjhammett

    Can't get more than 16 nodes to talk to mod_cluster

    mdpjhammett

      Solaris 10 SPARC

      Apache/2.2.20

      mod_cluster/1.1.3

      JBoss 5.1.1

       

      I'm getting an error from mod_cluster in the JBoss logs when I start up a 17th node (I need to be able to get 40 working with mod_cluster). The first 16 nodes start up fine and populate the nodes and contexts properly in the mod_cluster manager. The 17th node and beyond throw the errors below and only populate the manager with a node entry but no contexts. The Load factor stays at -1 for the erring nodes.


      2011-09-07 14:31:37,682 INFO  [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-11,10.2.32.21:55201) Error parsing response header for command CONFIG
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.sendRequest(DefaultMCMPHandler.java:666)
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.sendRequests(DefaultMCMPHandler.java:476)
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:417)
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:377)
      at org.jboss.modcluster.ha.ClusteredMCMPHandlerImpl.updateServersFromMasterNode(ClusteredMCMPHandlerImpl.java:144)
      at org.jboss.modcluster.ha.HAModClusterService$RpcHandler.getClusterCoordinatorState(HAModClusterService.java:907)
      
      2011-09-07 16:05:33,168 ERROR [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-15,10.2.32.22:55200) Error [null: null: {4}] sending command CONFIG to proxy pweb1dsm/10.2.32.48:6666, configuration will be reset
      2011-09-07 16:05:33,184 INFO  [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-15,10.2.32.22:55200) Error parsing response header for command CONFIG
      java.lang.StringIndexOutOfBoundsException: String index out of range: -1
      at java.lang.String.substring(String.java:1937)
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.sendRequest(DefaultMCMPHandler.java:666)
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.sendRequests(DefaultMCMPHandler.java:476)
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:417)
      at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:377)
      at org.jboss.modcluster.ha.ClusteredMCMPHandlerImpl.updateServersFromMasterNode(ClusteredMCMPHandlerImpl.java:144)
      
      
      
      

       

      Debug logs in Apache and JBoss don't seem to show anything interesting. I set these in the Apache config to rule out any limits:

       

      Maxnode 400
      Maxhost 400
      Maxcontext 400
      

       

      It doesn't matter which nodes of the 40 I try to start up...only the first 16 will be recognized properly (and everything about those 16 works fine). I'm using proxyList to specify the web servers instead of Advertise. Any pointers?

        • 1. Re: Can't get more than 16 nodes to talk to mod_cluster
          pferraro

          Can you turn on trace logging for the org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler class?  That will output the problematic INFO-RSP response from the load balancer, and allow us to better assess the issue.

          • 2. Re: Can't get more than 16 nodes to talk to mod_cluster
            mdpjhammett

            I enabled TRACE on the org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler class...does this show the response?

             

            2011-09-10 22:49:35,369 TRACE [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-11,10.2.32.19:55206) Sending command [org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=INFO,wildcard=false,jvmRoute=null,parameters={}}] to proxy [pweb1dsm/10.2.32.48:6666]
            2011-09-10 22:49:35,381 TRACE [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-11,10.2.32.19:55206) [org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=CONFIG,wildcard=false,jvmRoute=117,parameters={Host=10.2.32.19, Maxattempts=1, Port=7171, StickySessionForce=No, Timeout=600, Type=ajp, ping=20}}, org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=ENABLE-APP,wildcard=false,jvmRoute=117,parameters={Alias=localhost, Context=/}}, org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=ENABLE-APP,wildcard=false,jvmRoute=117,parameters={Alias=localhost, Context=/ws/profile/v2}}, org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=ENABLE-APP,wildcard=false,jvmRoute=117,parameters={Alias=localhost, Context=/CSR}}, org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=ENABLE-APP,wildcard=false,jvmRoute=117,parameters={Alias=localhost, Context=/dyn}}, org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=ENABLE-APP,wildcard=false,jvmRoute=117,parameters={Alias=localhost, Context=/ws/profile}}]
            2011-09-10 22:49:35,382 TRACE [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-11,10.2.32.19:55206) Sending command [org.jboss.modcluster.mcmp.impl.DefaultMCMPRequest{requestType=CONFIG,wildcard=false,jvmRoute=117,parameters={Host=10.2.32.19, Maxattempts=1, Port=7171, StickySessionForce=No, Timeout=600, Type=ajp, ping=20}}] to proxy [pweb1dsm/10.2.32.48:6666]
            2011-09-10 22:49:35,383 INFO  [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-11,10.2.32.19:55206) Error parsing response header for command CONFIG
            java.lang.StringIndexOutOfBoundsException: String index out of range: -1
                    at java.lang.String.substring(String.java:1937)
                    at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.sendRequest(DefaultMCMPHandler.java:666)
                    at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.sendRequests(DefaultMCMPHandler.java:476)
                    at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:417)
                    at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:377)
                    at org.jboss.modcluster.ha.ClusteredMCMPHandlerImpl.updateServersFromMasterNode(ClusteredMCMPHandlerImpl.java:144)
                    at org.jboss.modcluster.ha.HAModClusterService$RpcHandler.getClusterCoordinatorState(HAModClusterService.java:907)
            ...
            2011-09-10 22:49:35,386 ERROR [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (Incoming-11,10.2.32.19:55206) Error [null: null: {4}] sending command CONFIG to proxy pweb1dsm/10.2.32.48:6666, configuration will be reset
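
For readers puzzling over the exception message itself: "String index out of range: -1" coming out of String.substring is the usual signature of an indexOf() result being passed straight through when the expected delimiter is missing. The sketch below is NOT the mod_cluster source. HeaderParseSketch, headerName and headerNameOrNull are hypothetical names, and the "Name: value" line format is an assumption used only to reproduce that failure mode, for example when the proxy's reply is empty or is not a header line at all.

```java
// Hypothetical illustration only; this is not DefaultMCMPHandler code.
final class HeaderParseSketch {

    // Naive parse: takes everything before the first colon.
    // If the line has no colon, indexOf returns -1 and substring(0, -1) throws
    // StringIndexOutOfBoundsException.
    static String headerName(String line) {
        int colon = line.indexOf(':');
        return line.substring(0, colon);
    }

    // Defensive parse: reports a malformed line as null instead of throwing.
    static String headerNameOrNull(String line) {
        int colon = line.indexOf(':');
        return colon < 0 ? null : line.substring(0, colon);
    }

    public static void main(String[] args) {
        System.out.println(headerName("Name: value"));
        System.out.println(headerNameOrNull("no delimiter here"));
        try {
            headerName(""); // an empty response line has no colon
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The exact exception text varies between JDK versions, so the sketch prints it rather than relying on it.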
            
            
            • 3. Re: Can't get more than 16 nodes to talk to mod_cluster
              mdpjhammett

              I'm also seeing this:

               

               

              2011-09-11 00:00:23,188 ERROR [org.jboss.modcluster.load.impl.DynamicLoadBalanceFactorProvider] (ContainerBackgroundProcessor[StandardEngine[jboss.web]]) committed = 8559067136 should be < max = 8536260608
              java.lang.IllegalArgumentException: committed = 8559067136 should be < max = 8536260608
                      at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:145)
                      at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
                      at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:61)
                      at org.jboss.modcluster.load.metric.impl.HeapMemoryUsageLoadMetric.getLoad(HeapMemoryUsageLoadMetric.java:56)
                      at org.jboss.modcluster.load.impl.DynamicLoadBalanceFactorProvider.getLoadBalanceFactor(DynamicLoadBalanceFactorProvider.java:135)
                      at org.jboss.modcluster.ModClusterService.getLoadBalanceFactor(ModClusterService.java:508)
                      at org.jboss.modcluster.ha.HAModClusterService$ClusteredModClusterService.status(HAModClusterService.java:1161)
                      at org.jboss.modcluster.ha.HAModClusterService.status(HAModClusterService.java:360)
                      at org.jboss.modcluster.catalina.CatalinaEventHandlerAdapter.lifecycleEvent(CatalinaEventHandlerAdapter.java:323)
                      at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
                      at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1348)
                      at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1612)
                      at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1601)
                      at java.lang.Thread.run(Thread.java:662)
              
              
              • 4. Re: Can't get more than 16 nodes to talk to mod_cluster
                mdpjhammett

                RESOLVED...I took a peek at org.jboss.ha.framework.server.ClusterPartition and saw that the JGroups cluster nodes were not all talking to each other. It turns out UDP multicast packets were going to other nodes on each server but not between servers. I bound JGroups to a different interface with -Djgroups.bind_addr to fix multicasting, restarted all nodes and it all works fine now.
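
When picking a value for -Djgroups.bind_addr, it can help to see which local interfaces the JVM considers up and multicast-capable. The following is a minimal sketch using only java.net.NetworkInterface; the class and method names (MulticastInterfaces, candidates, usable) are made up for illustration. It only lists local candidates and does not prove that multicast packets actually travel between servers, which still has to be verified on the network.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists "interface address" pairs that could be used as a bind address.
final class MulticastInterfaces {

    // True if the interface is up, is not loopback, and reports multicast support.
    // An interface that cannot be queried is treated as unusable.
    static boolean usable(NetworkInterface nic) {
        try {
            return nic.isUp() && !nic.isLoopback() && nic.supportsMulticast();
        } catch (SocketException e) {
            return false;
        }
    }

    // Collects one entry per address on every usable interface.
    static List<String> candidates() {
        List<String> result = new ArrayList<String>();
        try {
            java.util.Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // no interfaces reported
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!usable(nic)) {
                    continue;
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    result.add(nic.getName() + " " + addr.getHostAddress());
                }
            }
        } catch (SocketException e) {
            // Enumeration failed; return whatever was gathered so far.
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : candidates()) {
            System.out.println(line);
        }
    }
}
```

Run it on each server and pass one of the printed addresses on the shared network as -Djgroups.bind_addr.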

                 

                I'm still curious about the error above, but I guess it's a topic for another thread. I wonder if it has something to do with my using -XX:+UseAdaptiveGCBoundary.

                 

                EDIT: I'm still getting the former error as well (String index out of range), so it must've been a red herring.

                • 5. Re: Can't get more than 16 nodes to talk to mod_cluster
                  rhusar

                  Which JDK are you using?

                   

                  $ java -version

                  • 6. Re: Can't get more than 16 nodes to talk to mod_cluster
                    mdpjhammett

                    java version "1.6.0_26"

                    Java(TM) SE Runtime Environment (build 1.6.0_26-b03)

                    Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

                    • 7. Re: Can't get more than 16 nodes to talk to mod_cluster
                      rhusar

                      And are you using 8 GB heap? Just trying to see if that number is right.

                       

                      PS: u27 is already available at http://www.oracle.com/technetwork/java/javase/downloads/index-jsp-138363.html (not that it would solve the problem, just FYI).

                      • 8. Re: Can't get more than 16 nodes to talk to mod_cluster
                        mdpjhammett

                        Yes, 8 GB of heap. When I look at the JMX console under java.lang:type=Memory,name=HeapMemoryUsage it actually shows this text: "javax.management.RuntimeMBeanException: java.lang.IllegalArgumentException: committed = 8559067136 should be < max = 8536260608"

                         

                        Sounds like a Java bug (and Google shows a lot of it out in the wild). Thanks everyone!
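
For anyone landing here with the same trace: the IllegalArgumentException is thrown by the java.lang.management.MemoryUsage constructor, which rejects a committed value larger than max, so any caller of getHeapMemoryUsage() fails when the JVM reports such a pair. Below is a minimal sketch that reproduces the check with the two numbers from this thread and shows one way a caller could guard the read. HeapUsageCheck, rejects and heapUsageOrNull are hypothetical names, and whether mod_cluster should guard the call this way is an assumption, not something confirmed in the thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper illustrating the MemoryUsage argument check.
final class HeapUsageCheck {

    // Returns true when the MemoryUsage constructor rejects this committed/max pair.
    static boolean rejects(long committed, long max) {
        try {
            new MemoryUsage(0L, 0L, committed, max);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Reads heap usage, returning null if the JVM hands back inconsistent values.
    static MemoryUsage heapUsageOrNull() {
        try {
            return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The pair reported in this thread: committed is about 22 MB above max.
        System.out.println("rejected: " + rejects(8559067136L, 8536260608L));
        MemoryUsage usage = heapUsageOrNull();
        System.out.println(usage == null ? "heap usage unavailable" : usage.toString());
    }
}
```

A caller that gets null back could skip that load sample instead of logging an ERROR on every status cycle.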