I tried creating a custom LoadMetric that only uses the AJP connector, and that helped, but the load number never gets to 0. Reviewing the mod_cluster code for Apache HTTP, it appears that it would stop sending traffic to a node if its load were 0, but DynamicLoadBalanceFactorProvider forces the number to be 1 when I return a load equal to capacity:
return 100 - Math.max(0, Math.min(load, 99));
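To show what I mean, here is a minimal sketch that isolates just that expression. The class and method names are mine, not mod_cluster's, and I'm assuming `load` is an integer percentage of capacity:

```java
// Hypothetical standalone demo; only the return expression comes from
// DynamicLoadBalanceFactorProvider, everything else is illustrative.
public class LoadFactorClamp {

    // Clamp load to the range 0..99, then invert it to get the factor.
    static int toLoadFactor(int load) {
        return 100 - Math.max(0, Math.min(load, 99));
    }

    public static void main(String[] args) {
        int[] samples = {0, 50, 99, 100, 150};
        for (int load : samples) {
            System.out.println("load=" + load + " -> factor=" + toLoadFactor(load));
        }
    }
}
```

With this expression a load of 99, 100, or anything higher all produce a factor of 1, so the result can never reach 0 no matter what the metric returns.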
Any alternative ways of marking the node as "offline" when it needs to cool down?
Any recommendations or "best practices" to avoid a thundering herd when adding new nodes?
A challenge we are seeing is that when a new node is added to a busy cluster, it receives the majority of the traffic for a few minutes, I assume because its elected numbers are low and its load is high. The problem is that this often amounts to a DoS of that box and sometimes brings the JBoss server down. As a result, we've had to make a practice of adding many servers at once to limit the impact.
Unfortunately, this is a very important feature to be missing.