No, there is no directly configurable attribute that would allow that on the balancer side, but you can tweak the capacity attribute of your load metrics on the AS/WildFly side, within the mod_cluster subsystem configuration. If the use case is that one or some of the servers are weaker than the others, this tweaking might help.
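For reference, the capacity knob sits on the load metrics inside the dynamic load provider of the modcluster subsystem. A hedged fragment (the metric type and values here are purely illustrative): since a metric's load is normalized against its capacity, a weaker box gets a lower capacity so it reports a relatively higher load and receives less traffic.

```xml
<!-- Illustrative fragment of the WildFly modcluster subsystem configuration.
     Lowering capacity on a weaker node makes its reported load higher,
     so the balancer sends it proportionally fewer new sessions. -->
<dynamic-load-provider history="9" decay="2">
    <load-metric type="busyness" weight="1" capacity="50"/>
</dynamic-load-provider>
```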
Anyway, with sticky sessions, the load balancer will keep sending requests to that one server until it becomes unavailable. Note that I mean requests within the same session, not new-session requests...
If you feel there is a need for an additional feature, please state the expected behaviour clearly on the Jira, filing a feature request.
Thanks Michal, I think I can achieve this behavior by implementing some metric that returns 0 or -1 when the load exceeds some threshold. What do you think?
Well, yes, doing it on the application server side would definitely be more in the "mod_cluster way", i.e. not introducing static settings on the balancer side.
Take a look at this example of a custom load metric. It takes the load number from a file, so it's handy for testing.
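In that spirit, here is a minimal self-contained sketch of a file-backed metric. The `FileLoadMetric` name, the local stand-in interface, and the file format are assumptions for illustration only; the real thing would implement `org.jboss.modcluster.load.metric.LoadMetric` from the mod_cluster SPI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Stand-in for org.jboss.modcluster.load.metric.LoadMetric (illustrative,
// so this sketch compiles without the mod_cluster jar).
interface LoadMetric {
    double getLoad() throws Exception;
}

// Reads the current load from a plain text file, which makes it easy to
// drive the balancer by hand while testing: just echo a new number into
// the file and watch where new sessions land.
public class FileLoadMetric implements LoadMetric {
    private final Path file;

    public FileLoadMetric(Path file) {
        this.file = file;
    }

    @Override
    public double getLoad() throws IOException {
        // The file is expected to contain a single numeric value.
        return Double.parseDouble(Files.readString(file).trim());
    }

    public static void main(String[] args) throws Exception {
        Path f = Files.createTempFile("load", ".txt");
        Files.writeString(f, "0.75");
        System.out.println(new FileLoadMetric(f).getLoad()); // prints 0.75
    }
}
```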
Hi Michal, I'm also interested in having the application server side take a metric threshold into account, so that the balancer stops sending requests to the node on which the threshold was reached. Subsequent requests from the sessions on that node, as well as requests from new sessions, should then be redirected to other nodes on which the threshold hasn't been reached yet.
Should the ModClusterService send a STATUS message to the balancer side with value 0 or -1 to reflect this situation? LoadBalanceFactorProvider should have the responsibility, in collaboration with LoadMetric, of calculating whether any metric exceeded the threshold.
By the way, digging into the code, I couldn't find the place on the application server side where a STATUS message is created with value 0 or -2. Are these values used at present?
Thanks in advance
Reading some related issues in the mod_cluster JIRA, I've just realized that STATUS messages with load 0 are generated by using a SimpleLoadBalanceFactorProvider with loadBalanceFactor 0. As I understand it, this value is used to indicate a node is on stand-by, which makes that node eligible when another node fails. Therefore, that value can't be used to indicate a node has exceeded some metric threshold.
This leads us to resort to using -1 to indicate a node has reached a threshold in some metric. To do so, DynamicLoadBalanceFactorProvider could return -1 in this circumstance.
In a sticky-session scenario, taking a threshold into account prevents subsequent requests from these sessions from overloading the node, maintaining quality of service.
Dear Diego, I'm sorry, I'm not following you. Is this a question?
Yes, load 0, since MODCLUSTER-235, indicates a stand-by node, whereas -1 indicates the worker is in an error state.
Feel free to set -1 with your custom load metric and/or play with the capacity and history values of the current metrics.
About the balancing logic in general
It is noteworthy that one has to send a substantial number of requests to see the balancing behaviour; if you send only three or five requests, it might look as if the balancer is targeting overloaded nodes.
You might want to take a look at this article: FAQ · modcluster/mod_cluster Wiki · GitHub
Hi Michal, what I am trying to say is that sending a STATUS message with -1 requires DynamicLoadBalanceFactorProvider to be modified, or a new implementation of LoadBalanceFactorProvider. It doesn't depend only on the value returned by the LoadMetric. Currently, DynamicLoadBalanceFactorProvider normalizes the load factor, whatever its value, to a number between 1 and 100. In fact, if the load metric value were -1, DynamicLoadBalanceFactorProvider would normalize it to a value of 100.
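To see why a metric alone can't get -1 through, here is a self-contained sketch of that clamp-and-invert normalization (`normalize` is just a local helper mirroring the expression in DynamicLoadBalanceFactorProvider, not the actual class):

```java
public class NormalizeDemo {
    // Same ceiling/floor-and-invert expression used when computing
    // the load factor: clamp the load to [0, 99], then invert.
    static int normalize(int load) {
        return 100 - Math.max(0, Math.min(load, 99));
    }

    public static void main(String[] args) {
        System.out.println(normalize(50)); // prints 50
        System.out.println(normalize(-1)); // prints 100: -1 is clamped to 0,
                                           // so the node looks completely idle
    }
}
```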
Hope I've made myself clear.
Hmm, I see, you don't like:
// apply ceiling & floor and invert to express as "load factor"
// result should be a value between 1-100
return 100 - Math.max(0, Math.min(load, 99));
Well, there are two options:
1) Set your returning load and capacity so as your metric does something like this:
measured something - returning 80
measured something - returning 60
measured something - returning 50
measured something - internal threshold crossed - returning 1
This way, having a low history, you will easily make the balancer avoid this worker (see that FAQ article on GitHub I linked to previously). On the other hand, you are right: the requests within active sessions will keep coming until those sessions become inactive.
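Option 1 could be sketched like this, self-contained for illustration: the threshold value and the `loadFactor` helper are assumptions, standing in for whatever your metric (via its load and capacity settings) arranges the resulting load factor to be. The factor sequence mirrors the example above.

```java
public class ThresholdFactor {
    // Turns a measured value (e.g. CPU usage in percent) into a load
    // factor between 1 and 100, collapsing to the minimum factor of 1
    // once the internal threshold is crossed so the balancer strongly
    // prefers other workers for new sessions.
    static int loadFactor(double measured, double threshold) {
        if (measured >= threshold) {
            return 1; // threshold crossed: minimum factor
        }
        // otherwise express remaining headroom as a factor in [1, 100]
        return Math.max(1, (int) (100 - measured));
    }

    public static void main(String[] args) {
        double threshold = 90.0; // illustrative threshold
        System.out.println(loadFactor(20.0, threshold)); // prints 80
        System.out.println(loadFactor(40.0, threshold)); // prints 60
        System.out.println(loadFactor(95.0, threshold)); // prints 1
    }
}
```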
2) Open a MODCLUSTER JIRA feature request
This might be something along the lines that you want the power to programmatically disable (switch to error state) the worker node from within your own custom load metric, thus forcing failover.
By the way, you might not be aware of it, but it's possible to trigger failover from your web applications by returning a special predefined HTTP code, e.g. you might define HTTP 203 as the code on which the balancer fails over to another box: see https://issues.jboss.org/browse/MODCLUSTER-390
Yes, I've been playing around with option 1, and have concluded that this can't be solved solely in a LoadMetric, so perhaps we should file a feature request in JIRA.
Meanwhile I'm going to look around what you said about the application triggering a failover through a predefined HTTP code.
Very roughly scanning through this thread, it seems the correct solution would be writing a custom threshold load metric (that can delegate to others or whatever you want to do) and indeed returning a load of 0 once the threshold is reached. This way no new sessions will be sent to the node, but the remaining ones will still stay sticky to that node.
IMHO nope, see my previous comment, line 4. It's not possible with the current code.
Besides, Diego doesn't want even the current sessions to carry on, he wants to force a failover at a certain threshold. I understand it as some kind of an emergency overload precaution, when it's actually better for the client to be routed to a new worker, because digging out the session data is still faster than continuing with the former overloaded worker.
Anyway, whatever you set as load in your custom load metric, it won't be lower than 1; hence my recommendations 1) and 2) in my previous comment.