11 Replies Latest reply on Jun 25, 2013 8:40 AM by rhusar

    All / Most elections going to one node

    jmsjr

      mod_cluster-1.2.3

      Apache httpd 2.2.24

      JBoss AS 7.1.3.Final, with the mod_cluster modules upgraded from 1.2.1 to 1.2.3

      All of the above built from source

       

      httpd+mod_cluster acts as the load balancer for 2 JBoss nodes running 7.1.3.Final

       

      Failover is working fine, that's not an issue.

      What I am seeing is that most, if not all, node elections are going to one node.

      There are at least 5 users testing the system most of the time.

      Sessions are not long-lived, mostly short-lived ( Using @ViewScoped JSF managed beans ).

      Users would open up and close different "views" during the day.

       

      Here's the output from httpd mod_cluster_manager page:

       

      mod_cluster/1.2.3.Final

      Auto Refresh show DUMP output show INFO output

      Node 549de3eb-7aa1-3456-bcc1-8c884bd3bfe0 (ajp://172.21.5.217:8009):

      Enable Contexts Disable Contexts

      Balancer: bpm-webcluster,LBGroup: ,Flushpackets: Off,Flushwait: 10000,Ping: 10000000,Smax: 26,Ttl: 60000000,Status: OK,Elected: 2545,Read: 21757179,Transferred: 5317754,Connected: 0,Load: 80

      Virtual Host 1:

      Contexts:

      /insurance, Status: ENABLED Request: 0 Disable

       

       

      Aliases:

      default-host

      xxxx1.yyyy.zzzz

      xxxx2.yyyy.zzzz

       

       

      Node a80cee80-e1d4-3d88-8cd8-64413f9732eb (ajp://172.21.5.218:8009):

      Enable Contexts Disable Contexts

      Balancer: bpm-webcluster,LBGroup: ,Flushpackets: Off,Flushwait: 10000,Ping: 10000000,Smax: 26,Ttl: 60000000,Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: 79

      Virtual Host 1:

      Contexts:

      /insurance, Status: ENABLED Request: 0 Disable

       

       

      Aliases:

      default-host

      xxxx1.yyyy.zzzz

      xxxx2.yyyy.zzzz

       

       

      Both JBoss nodes are running the standalone-ha.xml configuration, with the following load metrics:

       

              <subsystem xmlns="urn:jboss:domain:modcluster:1.1">
                  <mod-cluster-config advertise-socket="modcluster" connector="ajp">
                      <dynamic-load-provider history="5">
                          <load-metric type="busyness" weight="3"/>
                          <load-metric type="heap" weight="2"/>
                          <load-metric type="mem" weight="1"/>
                      </dynamic-load-provider>
                  </mod-cluster-config>
              </subsystem>

       

      Both nodes have identical hardware ( VMware templates ) and run the same processes.

       

      I was thinking:

      1) If both JBoss nodes return the same load factor, does httpd+mod_cluster always elect the first node on its list ?

      2) Is there a way to add some randomness to the load factor, but only when all nodes return the same load factor ?

        • 1. Re: All / Most elections going to one node
          jfclere

          1) yes.

          2) Well using a different dynamic-load-provider.

          Usually type="busyness" alone is good to demo the load balancing.

          • 2. Re: All / Most elections going to one node
            jmsjr

            Jean-Frederic Clere wrote:

             

            1) yes.

            2) Well using a different dynamic-load-provider.

            Usually type="busyness" alone is good to demo the load balancing.

             

            Thanks.

             

            After several days, the distribution of elections is still not balanced ( i.e. the majority of elections go to the first node ).

            It's probably because new sessions are started by different users at different times ( e.g. 2 or 3 users are NOT creating new sessions at the same moment ). Thus, when a new session is started, the busyness of both nodes is the same.

             

            The only time I see traffic / sessions being started on node2 is when node1 is actively serving a request.

             

            I'll probably add the number of active sessions to the dynamic-load-provider so that the distribution becomes even.

            • 3. Re: All / Most elections going to one node
              jfclere

              Try <load-metric type="sessions"/>

              See http://docs.jboss.org/mod_cluster/1.2.0/html/java.AS7config.html
              • 4. Re: All / Most elections going to one node
                jmsjr

                Jean-Frederic Clere wrote:

                 

                Try <load-metric type="sessions"/>

                See http://docs.jboss.org/mod_cluster/1.2.0/html/java.AS7config.html

                 

                Yup. That's what I meant and that's what I already did. Just needed to get a time / window to restart the clusters.

                • 5. Re: All / Most elections going to one node
                  rhusar

                  I took a look at your problem, and what you are seeing did look suspicious. What is actually happening is that mod_cluster updates its list of workers periodically, e.g. every 10 seconds. A new request comes in and the node with the highest load factor is elected. No other request arrives within that 10-second window, so by the time the next one comes in the list of workers has been updated again, and the update does not keep the election preferences from the previous window. So it looks like mod_cluster is not doing any balancing, but this is a result of the very small number of requests, which generate virtually no load on the server.

                   

                  So I don't think it's a problem. If your number of sessions or load increases you should see even balancing.
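A toy simulation of the behaviour described above may help. It is only a sketch under stated assumptions, not mod_cluster's actual election code: it assumes a simple credit-based weighted round-robin whose state is thrown away at every worker-list update, and it reuses the load factors 80 and 79 shown on the mod_cluster_manager page. The function name `simulate` and its parameters are made up for illustration.

```python
def simulate(load_factors, requests_per_window, windows):
    """Toy model: weighted round-robin whose state is reset every window."""
    elected = [0] * len(load_factors)
    total = sum(load_factors)
    for _ in range(windows):
        # Assumption: the periodic worker-list update discards election state.
        status = [0] * len(load_factors)
        for _ in range(requests_per_window):
            # Every node earns credit in proportion to its load factor.
            status = [s + f for s, f in zip(status, load_factors)]
            # Highest credit wins; max() keeps the first node on ties.
            winner = max(range(len(status)), key=status.__getitem__)
            # The winner pays back the total, so the others catch up.
            status[winner] -= total
            elected[winner] += 1
    return elected


# Sparse traffic: at most one new request per update window.
print("1 request per window:   ", simulate([80, 79], 1, 1000))
# Heavier traffic: many requests between two updates.
print("100 requests per window:", simulate([80, 79], 100, 10))
```

In this model every election goes to the first node when only one request arrives per window, because the node with the slightly higher load factor always wins a fresh comparison. With many requests per window the credit bookkeeping has time to work and the split becomes close to even.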

                   

                  Using the number-of-active-sessions metric seems to be a solution to your problem. The load factor will change the node preference for election. However, this load metric requires an explicit capacity (see http://docs.jboss.org/mod_cluster/1.2.0/html_single/#ActiveSessionsLoadMetric ), so it needs fine-tuning.

                  • 6. Re: All / Most elections going to one node
                    jmsjr

                    Radoslav Husar wrote:

                     

                    I took a look at your problem, and what you are seeing did look suspicious. What is actually happening is that mod_cluster updates its list of workers periodically, e.g. every 10 seconds. A new request comes in and the node with the highest load factor is elected. No other request arrives within that 10-second window, so by the time the next one comes in the list of workers has been updated again, and the update does not keep the election preferences from the previous window. So it looks like mod_cluster is not doing any balancing, but this is a result of the very small number of requests, which generate virtually no load on the server.

                     

                    So I don't think it's a problem. If your number of sessions or load increases you should see even balancing.

                     

                    Using the number-of-active-sessions metric seems to be a solution to your problem. The load factor will change the node preference for election. However, this load metric requires an explicit capacity (see http://docs.jboss.org/mod_cluster/1.2.0/html_single/#ActiveSessionsLoadMetric ), so it needs fine-tuning.

                     

                    I did try this combination below

                     

                                <mod-cluster-config advertise-socket="modcluster" connector="ajp">
                                    <dynamic-load-provider history="5">
                                        <load-metric type="busyness" weight="3"/>
                                        <load-metric type="heap" weight="2"/>
                                        <load-metric type="mem" weight="1"/>
                                        <load-metric type="sessions" weight="1" capacity="1024"/>
                                    </dynamic-load-provider>
                                </mod-cluster-config>

                     

                    ... but it still did not have the desired effect, probably because the sessions metric is factored in too late / with too low a weight.

                    So I am now trying the sessions metric with the same weight as the busyness metric, as below:

                     

                                <mod-cluster-config advertise-socket="modcluster" connector="ajp">
                                    <dynamic-load-provider history="20">
                                        <load-metric type="busyness" weight="3"/>
                                        <load-metric type="sessions" weight="3" capacity="1024"/>
                                        <load-metric type="heap" weight="2"/>
                                        <load-metric type="mem" weight="1"/>
                                    </dynamic-load-provider>
                                </mod-cluster-config>

                     

                    ... and I'll find out the results early this coming week.

                    Will post the results later in the week.

                    • 7. Re: All / Most elections going to one node
                      rhusar

                      I suspect this again won't do what you want, because a GC here and there will overwhelm the sessions/busyness metrics. I would try removing both the heap and mem metrics.
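To make the point about GC concrete, here is a rough numeric sketch. It assumes a simplified model in which each metric's load is divided by its capacity, the normalized loads are combined as a weighted average, and the node reports 100 minus that percentage (higher meaning less loaded). History and decay are ignored, the heap and mem fractions are invented values, and the helper names `load_factor` and `node` are hypothetical. The weights and the sessions capacity of 1024 are the ones from the configuration posted above.

```python
def load_factor(metrics):
    """metrics: list of (load, capacity, weight) tuples for one node.

    Assumed model: weighted average of capacity-normalized loads,
    reported as 100 minus the load percentage.
    """
    total_weight = sum(weight for _, _, weight in metrics)
    weighted = sum((load / capacity) * weight for load, capacity, weight in metrics)
    percent = round(100 * weighted / total_weight)
    return 100 - max(0, min(percent, 100))


def node(sessions, heap_used, session_capacity=1024, with_heap_and_mem=True):
    """Build the metric list for an idle node (busyness 0)."""
    metrics = [
        (0.0, 1.0, 3),                       # busyness: nothing in flight
        (sessions, session_capacity, 3),     # active sessions
    ]
    if with_heap_and_mem:
        metrics.append((heap_used, 1.0, 2))  # heap fraction in use (made up)
        metrics.append((0.5, 1.0, 1))        # mem, identical on both nodes
    return metrics


# Node A holds fewer sessions, but node B has just been through a GC.
print(load_factor(node(1, heap_used=0.70)), load_factor(node(2, heap_used=0.40)))
# Without heap and mem, capacity 1024 still hides a one-session difference.
print(load_factor(node(1, 0, with_heap_and_mem=False)),
      load_factor(node(2, 0, with_heap_and_mem=False)))
# A much smaller, purely illustrative capacity makes the difference visible.
print(load_factor(node(1, 0, 10, False)), load_factor(node(2, 0, 10, False)))
```

Under these assumptions the node with more sessions still scores as less loaded in the first comparison, purely because of its heap reading. The second comparison suggests why the capacity needs tuning as well: with 1024 as the capacity, one session versus two disappears in the rounding.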

                      • 8. Re: All / Most elections going to one node
                        mbabacek

                        Ad "mem" load metric

                        Guys, do not use it. It won't do any good. It has been deprecated and will be removed in the future.

                        See MODCLUSTER-288

                        • 9. Re: All / Most elections going to one node
                          jmsjr

                          Radoslav Husar wrote:

                           

                          I suspect this again won't do what you want, because a GC here and there will overwhelm the sessions/busyness metrics. I would try removing both the heap and mem metrics.

                           

                          Even if the heap metric has a lower weight than the sessions and busyness metric ?

                           

                          What I am trying to do is, when starting a new session:

                           

                          1) Elect the node that is less busy ( via the busyness metric )

                          2) If both nodes have the same load for [1], elect the node that has fewer sessions

                          3) If both nodes still have the same load for [1] and [2], elect the node that has more heap

                          4) If both nodes still have the same load for [1] and [2] and [3], elect the node that has more memory
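The four steps above describe a strict priority order, where a later criterion only matters when all earlier ones tie. A weighted combination of metrics behaves differently, because every metric always contributes to the result. The sketch below contrasts the two on invented data. It assumes metrics normalized to the range 0 to 1, uses the weights 3, 3 and 2 from the configuration in this thread, leaves out the mem step, and all function and key names are hypothetical.

```python
def lexicographic_pick(nodes):
    """Strict priority: busyness first, then sessions, then used heap."""
    best = min(nodes, key=lambda n: (n["busyness"], n["sessions"], n["heap_used"]))
    return best["name"]


def weighted_pick(nodes, weights):
    """Pick the node with the lowest weighted sum of all its metrics."""
    def score(n):
        return sum(n[metric] * weight for metric, weight in weights.items())
    return min(nodes, key=score)["name"]


# Invented readings: both nodes idle, node2 has fewer sessions but a fuller heap.
nodes = [
    {"name": "node1", "busyness": 0.0, "sessions": 0.3, "heap_used": 0.2},
    {"name": "node2", "busyness": 0.0, "sessions": 0.1, "heap_used": 0.7},
]
weights = {"busyness": 3, "sessions": 3, "heap_used": 2}

print("strict priority picks:", lexicographic_pick(nodes))
print("weighted sum picks:   ", weighted_pick(nodes, weights))
```

With these numbers the strict ordering chooses node2 (busyness ties, fewer sessions), while the weighted sum chooses node1, because the heap reading outweighs the session difference even though its weight is lower.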

                           

                          I'd like the load to be evenly distributed whether there is a very light load ( too few sessions ) or high load ( 100s to 1000s of sessions )

                          I'll remove the mem metric as per Michal Babacek's comment.

                           

                          Silly question:

                          My assumption at the moment is that an election is only done when a NEW session is being created ( the HTTP request did not include a JSESSIONID ).

                          In the mod_cluster_manager web page, does the "Elected" count include the election of a node for both NEW and EXISTING sessions?

                           

                          e.g. if session A was created 2 minutes ago, and the user does an HTTP POST with the JSESSIONID included, which forces httpd+mod_cluster to direct the HTTP request to the node where the session was initially created because of session stickiness ... does that count as an election on the mod_cluster_manager page?

                           

                          It seems that the "Elected" count includes both NEW and EXISTING sessions, as I do not have the thousands of sessions that my mod_cluster_manager page would otherwise indicate.

                          If so, it would be good if there was a separate count for elections of NEW sessions.

                          • 10. Re: All / Most elections going to one node
                            jmsjr

                            Michal Babacek wrote:

                             

                            Ad "mem" load metric

                            Guys, do not use it. It won't do any good. It has been deprecated and will be removed in the future.

                            See MODCLUSTER-288

                             

                            OK ... will remove the "mem" metric. Thanks for the heads up.

                            Should the deprecation at least be documented in http://docs.jboss.org/mod_cluster/1.2.0/html/ in the same way that the "cpu" metric is described as not really being available?

                            • 11. Re: All / Most elections going to one node
                              rhusar

                              You see, what you described wanting to achieve is fine. But take GC into account. A few dozen megabytes can easily be cleaned up in one go, and that can move the factor calculation by, say, a value of 3. But if you have one session on the first node and 2 sessions on the second node, the sessions metric will contribute a value of only 1 to the factor on both nodes. So in that case, with an extremely small load, even the GC can influence your calculation significantly.

                               

                              In the mod_cluster_manager web page, does the "Elected" count include the election of a node for both NEW and EXISTING sessions?

                              Should be only new. Maybe your clients are not doing what you are expecting?
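One way to check what the clients are actually sending is to look at the session IDs arriving at httpd. The sketch below assumes the usual sticky-session convention in which the owning node's route is appended to the session ID after the last dot. The sample IDs and the function names are invented, and the "election" label simply follows the statement above that only new sessions should be elected.

```python
def route_of(session_id):
    """Return the route suffix of a session ID, or None if there is none."""
    _, dot, route = session_id.rpartition(".")
    return route if dot and route else None


def classify(session_id):
    """Label a request as sticky (carries a route) or as needing an election."""
    if session_id is None:
        return "election"            # no session cookie at all: a new session
    route = route_of(session_id)
    if route is None:
        return "election"            # session ID without a route suffix
    return "sticky:" + route


# Invented examples of what a request might carry.
for sid in (None, "abc123", "abc123.node1"):
    print(repr(sid), "->", classify(sid))
```

If many requests from established users fall into the "election" group, for instance because the cookie is dropped or the ID carries no route, that could explain an "Elected" count far above the number of sessions.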