Definition
FD_SOCK is a failure detection protocol based on a ring of TCP sockets created between group members. Each member in a group connects to its neighbor (the last member connects to the first), thus forming a ring. Member B is suspected when its neighbor A detects that the TCP socket to B was closed abnormally (presumably because node B crashed). However, if member B is about to leave gracefully, it informs its neighbor A so that B is not suspected.
One disadvantage of FD_SOCK is that hung servers and crashed switches do not cause sockets to be closed. Therefore hung members will not be suspected, and network partitions due to switch failures will not be detected. A solution to this problem is to use both the FD and FD_SOCK failure detection protocols. For more details, refer to Failure Detection.
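As a rough illustration of combining the two protocols, the fragment below places FD_SOCK and FD next to each other in a protocol stack. The `timeout` and `max_tries` attributes of FD, and the values shown, are assumptions based on commonly shipped JGroups stacks; they are not taken from this page and should be checked against the JGroups version in use.

```xml
<!-- Sketch only: fragment of a protocol stack, not a complete configuration. -->

<!-- Socket-based detection: suspects a neighbor when its TCP socket closes abnormally. -->
<FD_SOCK/>

<!-- Heartbeat-based detection: intended to catch hung members and switch failures
     that leave sockets open. Attribute names and values are assumptions. -->
<FD timeout="3000" max_tries="3"/>
```

With both protocols present, a crashed member is typically noticed quickly through the closed socket, while a hung member is only noticed after the heartbeat attempts have been exhausted.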
Configuration Example
<FD_SOCK/>
Configuration Parameters
Name | Description |
---|---|
bind_addr | The NIC on which the ServerSocket should listen |
bind_interface_str | The interface (NIC) which should be used by this transport |
get_cache_timeout | Timeout for getting the socket cache from the coordinator. Default is 1000 msec |
id | Give the protocol a different ID if needed so we can have multiple instances of it in the same stack |
keep_alive | Whether to use KEEP_ALIVE on the ping socket or not. Default is true |
level | Sets the logger level (see javadocs) |
name | Give the protocol a different name if needed so we can have multiple instances of it in the same stack |
num_tries | Number of times the coordinator is asked for the socket cache before giving up. Default is 3 |
sock_conn_timeout | Max time in millis to wait for ping Socket.connect() to return |
start_port | Start port for server socket. Default value of 0 picks a random port |
stats | Determines whether to collect statistics (and expose them via JMX). Default is true |
suspect_msg_interval | Interval for broadcasting suspect messages. Default is 5000 msec |
See also Protocol Configuration Common Parameters.
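The parameters listed in the table are set as XML attributes on the protocol element. The sketch below spells out a few of them explicitly; the address and port are hypothetical placeholders, the `sock_conn_timeout` value is an arbitrary illustration (its default is not stated here), and the remaining values simply repeat the defaults from the table.

```xml
<!-- Sketch only: values are illustrative, not recommendations. -->
<FD_SOCK bind_addr="192.168.1.10"
         start_port="9780"
         keep_alive="true"
         get_cache_timeout="1000"
         num_tries="3"
         sock_conn_timeout="1000"
         suspect_msg_interval="5000"/>
```

Fixing `start_port` to a known value instead of the default 0 can make firewall rules easier to write, since the server socket no longer listens on a random port.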