0 Replies Latest reply on Apr 10, 2014 11:48 AM by Stefano Nichele

    "All workers are in error state". What is the reason ?

    Stefano Nichele Newbie

      Hi all,

      Sporadically I find the message "All workers are in error state" in the httpd error.log. I tried to understand the root cause of the issue, but unfortunately I was not able to find any useful information on how to avoid that error.

       

      My environment consists of 2 instances of httpd 2.2.24 and 2 instances of Tomcat 7.0.37 + mod_cluster 1.2.0. Both httpd instances are connected to both Tomcat instances. In front of the httpd instances there is a separate hardware load balancer.

       

      The mod_cluster configuration on the Tomcat side is:

       

        <Listener className="org.jboss.modcluster.container.catalina.standalone.ModClusterListener"
                  advertise="false"
                  proxyList="172.18.200.16:6666,172.18.200.17:6666"
                  maxAttempts="3"
                  nodeTimeout="600"
                  workerTimeout="-1"
                  ping="60"
                  stickySession="true"
                  stickySessionRemove="false"
                  stickySessionForce="false"
                  loadMetricClass="org.jboss.modcluster.load.metric.impl.AverageSystemLoadMetric"
                  loadMetricCapacity="100"
        />

       

      maxThreads for the AJP connector is set to 3000 on both Tomcat instances:

         <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" maxThreads="3000" connectionTimeout="600000" />

       

      MaxClients is set to 1480 on both httpd instances.
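
      As a sanity check on the sizing above, here is a rough sketch in Python. It assumes that each httpd instance can hold at most MaxClients simultaneous AJP connections to a single Tomcat instance (one backend connection per busy httpd worker); that is an assumption about the worst case, not something measured on this system. The function name is made up for illustration.

```python
def worst_case_ajp_connections(httpd_instances, max_clients):
    """Upper bound on simultaneous AJP connections hitting ONE Tomcat.

    Assumption: every busy httpd worker holds at most one backend
    connection, and in the worst case all of them target the same Tomcat.
    """
    return httpd_instances * max_clients


# Values taken from the configuration described in this post.
HTTPD_INSTANCES = 2
MAX_CLIENTS = 1480
TOMCAT_MAX_THREADS = 3000

needed = worst_case_ajp_connections(HTTPD_INSTANCES, MAX_CLIENTS)
print(needed, needed <= TOMCAT_MAX_THREADS)  # prints: 2960 True
```

      Under that assumption the AJP connector has enough threads (2960 <= 3000), so thread exhaustion on the Tomcat side does not look like an obvious explanation.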

       

      As I said, everything works fine except that 2 or 3 times per day, for just a few seconds, I get "All workers are in error state", even though the number of connections on port 8009 is quite low (around 100) and the request rate and traffic are low as well. So there seems to be no correlation between the load and the error.

       

      Attached you can find the httpd error log with the log level set to debug (I have masked some information with XXXXXXXXXXXX). The error messages appear at 04:15:23.
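
      For anyone who wants to look for the same pattern in their own logs, here is a minimal sketch in Python that counts how many "All workers are in error state" lines appear per second. It assumes the standard httpd 2.2 error log layout, where each line starts with "[timestamp] [level] message"; the function name and the regular expression are illustrative only and may need adjusting for a custom log format.

```python
import re
from collections import Counter

# Assumed httpd 2.2 error log layout: "[Thu Apr 10 04:15:23 2014] [error] message"
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")
NEEDLE = "All workers are in error state"


def count_error_bursts(lines):
    """Map each timestamp (one-second resolution) to the number of
    log lines at that timestamp whose message contains NEEDLE."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and NEEDLE in match.group("msg"):
            hits[match.group("ts")] += 1
    return hits
```

      It can be fed an open file, for example count_error_bursts(open("error.log")), where the file name is a placeholder. Comparing the resulting timestamps with the connection count on port 8009 at the same moments shows whether the bursts really are independent of load.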

       

      Can you please help me figure out the root cause of this issue?

       

      Many thanks in advance,

      ste