-
1. Re: Failover performance
wdfink Jun 27, 2011 9:33 AM (in response to ronsen)
EJB load balancing and failover work through a client proxy that communicates with the server.
The server either announces a shutdown, or a full stop is detected (e.g. a complete hang or a very long full GC).
A crash, e.g. a JVM core dump, is also detected.
How long detection takes depends on the situation.
The facts are:
- no transaction on the crashed server is committed
- the client proxy might hang for a few milliseconds and then try the next server in its list
- the next server provides the new cluster view without the crashed server.
So in the best case there is no measurable time for such a failover, and in the worst case only a few milliseconds.
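To illustrate the proxy behaviour described above, here is a minimal sketch of a client that walks its target list and drops a node that does not answer. It is not the actual JBoss proxy code: the class `FailoverProxySketch`, its methods, and the node names are hypothetical. As a simplification, the client prunes its own list instead of receiving the new cluster view from the next server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of client-side failover, not the real JBoss proxy.
class FailoverProxySketch {
    // The cluster view as this client currently knows it.
    private final List<String> targets;

    FailoverProxySketch(List<String> initialTargets) {
        this.targets = new ArrayList<>(initialTargets);
    }

    // Try the first known target; on failure drop it and try the next one.
    <R> R invoke(Function<String, R> call) {
        while (!targets.isEmpty()) {
            String target = targets.get(0);
            try {
                return call.apply(target);
            } catch (RuntimeException e) {
                // Assumption: any runtime exception means the node is unreachable.
                targets.remove(0);
            }
        }
        throw new IllegalStateException("no cluster node reachable");
    }

    // Defensive copy of the current view.
    List<String> currentTargets() {
        return new ArrayList<>(targets);
    }

    public static void main(String[] args) {
        FailoverProxySketch proxy =
                new FailoverProxySketch(Arrays.asList("node1", "node2", "node3"));
        // Simulate node1 having crashed: the call against it throws.
        String result = proxy.invoke(node -> {
            if (node.equals("node1")) {
                throw new RuntimeException("connection refused");
            }
            return "handled by " + node;
        });
        System.out.println(result);                 // prints "handled by node2"
        System.out.println(proxy.currentTargets()); // prints "[node2, node3]"
    }
}
```

The failed attempt costs only as long as the call against the dead node takes to fail, which is why the delay depends on how the crash shows up on the client side.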
-
2. Re: Failover performance
ronsen Jun 27, 2011 10:02 AM (in response to wdfink)
That means the detection is done on the server side and the client is then informed?
There must at least be a connection attempt to the crashed server: nothing comes back -> next server in the list (DP). So there should be a measurable time for it?
Anyway, good to know, thanks for clarifying. What about failover with session replication, would that take a measurable amount of time?
-
3. Re: Failover performance
wdfink Jun 27, 2011 1:00 PM (in response to ronsen)
For the internal communication you should have a look at:
http://community.jboss.org/wiki/Shunning
http://community.jboss.org/wiki/FDVersusFDSOCK
http://community.jboss.org/wiki/JGroupsPbcastGMS
http://community.jboss.org/wiki/JGroupsFD
There you will find a lot of information about how it works internally.
I am not working with HTTP session replication at the moment.
I know that the most common approach is buddy replication, where only two nodes keep the state of the session.
If the node the session is bound to fails, another server will process the next call. If that server is not the 'buddy', the session must be copied to the current instance, and this takes time depending on the size of the session data.
-
4. Re: Failover performance
ronsen Jun 28, 2011 3:12 AM (in response to wdfink)
Great, thanks. I'm going to take a look at this and will try to get further with the replication.
-
5. Re: Failover performance
ronsen Jul 1, 2011 5:47 AM (in response to ronsen)
Hey, can somebody explain (please only if you are sure) why, with the load-balancing policy RandomRobin/RoundRobin, only the first nodes-1 requests are slow and afterwards everything becomes much faster? Is something cached, and is there a timeout? When will these values be invalidated?
As an example, send counting numbers to a cluster with a round-robin policy, with a 50 ms pause in between, and measure the time it takes to print the first clustersize-1 values. I found that this time was higher by a factor of ~4.
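The measurement described above could be set up roughly like this. This is only a sketch: `LatencyProbe` and `measure` are hypothetical names, and the `IntConsumer` stands in for the real remote EJB call, which in `main` is replaced by a local print.

```java
import java.util.function.IntConsumer;

// Hypothetical timing harness; the remote call is passed in as an IntConsumer.
class LatencyProbe {
    // Sends the numbers 0..count-1, pausing between calls, and returns
    // the duration of each individual call in nanoseconds.
    static long[] measure(int count, long pauseMillis, IntConsumer call) {
        long[] nanos = new long[count];
        for (int i = 0; i < count; i++) {
            long start = System.nanoTime();
            call.accept(i);
            nanos[i] = System.nanoTime() - start;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop measuring early.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return nanos;
    }

    public static void main(String[] args) {
        // Stand-in for the clustered bean call: just print the number locally.
        long[] timings = measure(10, 50, i -> System.out.println(i));
        for (int i = 0; i < timings.length; i++) {
            System.out.printf("call %d: %.3f ms%n", i, timings[i] / 1e6);
        }
    }
}
```

Recording each call separately, instead of one total for the first clustersize-1 values, makes it visible exactly which requests are the slow ones.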
thanks a lot,