Definition
The TCPPING protocol layer retrieves the initial membership in answer to the GMS's FIND_INITIAL_MBRS event. The initial membership is retrieved by directly contacting other group members, sending Messages containing point-to-point membership requests. The responses should allow us to determine the coordinator whom we have to contact in case we want to join the group. When we are a server (after having received the BECOME_SERVER event), we'll respond to TCPPING requests with a TCPPING response. The FIND_INITIAL_MBRS event will eventually be answered with a FIND_INITIAL_MBRS_OK event up the stack.
The TCPPING protocol requires a static configuration, which assumes that you know in advance where to find the other members of your group. For dynamic discovery in a TCP-based stack, use the MPING protocol, which uses multicast discovery, or the TCPGOSSIP protocol, which contacts a Gossip Router to acquire the initial membership.
Configuration Example
<TCPPING initial_hosts="hosta[2300],hostb[3400],hostc[4500]" port_range="3" timeout="3000" num_initial_members="2"></TCPPING>
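To make the example above concrete, the following sketch expands an initial_hosts string into the individual host:port candidates that would be probed. It is an illustration, not JGroups code: the class and method names are hypothetical, and it assumes that port_range="3" means three consecutive ports starting at the listed port, which follows the "number of ports to be probed" wording in the parameter table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JGroups.
class InitialHostsSketch {

    /** Expands "host[port],host[port]" into "host:port" candidates. */
    static List<String> expand(String initialHosts, int portRange) {
        List<String> candidates = new ArrayList<>();
        for (String entry : initialHosts.split(",")) {
            entry = entry.trim();
            int open = entry.indexOf('[');
            int close = entry.indexOf(']');
            if (open < 1 || close < open) {
                throw new IllegalArgumentException("expected host[port], got: " + entry);
            }
            String host = entry.substring(0, open);
            int basePort = Integer.parseInt(entry.substring(open + 1, close));
            // Assumption: port_range=N probes N consecutive ports from basePort.
            for (int i = 0; i < portRange; i++) {
                candidates.add(host + ":" + (basePort + i));
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        for (String c : expand("hosta[2300],hostb[3400],hostc[4500]", 3)) {
            System.out.println(c);
        }
    }
}
```

Under that assumption, the three configured hosts yield nine candidates, from hosta:2300 through hostc:4502.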
Configuration Parameters
Name | Description |
---|---|
break_on_coord_rsp | Return from the discovery phase as soon as we have 1 coordinator response |
id | Give the protocol a different ID if needed so we can have multiple instances of it in the same stack |
initial_hosts | Comma delimited list of hosts to be contacted for initial membership |
level | Sets the logger level (see javadocs) |
max_dynamic_hosts | Maximum number of hosts to keep beyond the ones in initial_hosts |
name | Give the protocol a different name if needed so we can have multiple instances of it in the same stack |
num_initial_members | Minimum number of initial members to get a response from. Default is 2 |
num_initial_srv_members | Minimum number of server responses (PingData.isServer()=true). If this value is greater than 0, we'll ignore num_initial_members |
num_ping_requests | Number of discovery requests to be sent distributed over timeout. Default is 2 |
port_range | Number of ports to be probed for initial membership. Default is 1 |
return_entire_cache | Whether to return the entire cache of logical-physical address mappings on a discovery request. Default is false, except for TCPPING |
stats | Determines whether to collect statistics (and expose them via JMX). Default is true |
timeout | Timeout to wait for the initial members. Default is 3000 msec |
See also Protocol Configuration Common Parameters.
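The parameters timeout, break_on_coord_rsp, num_initial_members and num_initial_srv_members together decide when the discovery phase returns. The sketch below is one reading of the descriptions in the table, not the JGroups implementation. The class, method and argument names are hypothetical, and the order in which the conditions are checked is an assumption.

```java
// Hypothetical model of the discovery exit conditions, not JGroups code.
class DiscoveryCompletionSketch {

    /** Returns true if discovery would stop waiting for further responses. */
    static boolean done(int responses, int serverResponses, boolean coordResponded,
                        long elapsedMs, int numInitialMembers, int numInitialSrvMembers,
                        boolean breakOnCoordRsp, long timeoutMs) {
        // timeout: upper bound on how long to wait for initial members.
        if (elapsedMs >= timeoutMs) return true;
        // break_on_coord_rsp: return as soon as one coordinator has answered.
        if (breakOnCoordRsp && coordResponded) return true;
        // num_initial_srv_members > 0 replaces num_initial_members.
        if (numInitialSrvMembers > 0) return serverResponses >= numInitialSrvMembers;
        // num_initial_members: minimum number of responses of any kind.
        return responses >= numInitialMembers;
    }

    public static void main(String[] args) {
        // One non-coordinator response after 500 ms, two required: keep waiting.
        System.out.println(done(1, 0, false, 500, 2, 0, true, 3000));
        // Same situation once the 3000 ms timeout has elapsed: stop.
        System.out.println(done(1, 0, false, 3000, 2, 0, true, 3000));
    }
}
```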
More on initial_hosts
TCPPING.initial_hosts is a static value read from the config when the channel is created. It never changes. If you decide to add a new member who isn't listed in TCPPING.initial_hosts, that node will still be able to join, but you may have the problems discussed below.
TCPPING.initial_hosts serves two functions:
1) First, during channel startup it provides the list of members to contact for the discovery algorithm described above. For this purpose the list can be incomplete, as long as it includes at least one member that is currently in the group. Examples:
4 nodes, A,B,C,D. On C, TCPPING.initial_hosts="A,D".
Scenario 1:
Current view is {B, A}. B is coordinator. C tries to connect its channel. TCPPING on C contacts A (successfully) and D (unsuccessfully). A tells C that B is coordinator, so GMS on C contacts B to join the group. All is well.
Scenario 2:
Current view is {B}. C tries to connect its channel. TCPPING on C tries to contact A (unsuccessfully) and D (unsuccessfully). C knows nothing about B, so it doesn't try to contact it. C gets no responses from A and D, so it decides that it is the coordinator. This is a problem, because we now have two subgroups that need to be merged.
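The two scenarios can be reproduced with a small model. This is a sketch rather than JGroups code: the class and method names are hypothetical, a missing map entry stands for an unsuccessful contact, and for Scenario 2 it assumes B is running on its own while A and D are down.

```java
import java.util.List;
import java.util.Map;

// Hypothetical model of TCPPING discovery on a joining node, not JGroups code.
class DiscoveryScenarioSketch {

    /**
     * Returns the coordinator the joiner would contact, or the joiner itself
     * if nobody in initialHosts answered.
     * running maps each live member to the coordinator of its current view.
     */
    static String discover(String joiner, List<String> initialHosts,
                           Map<String, String> running) {
        for (String host : initialHosts) {
            String coordinator = running.get(host); // null: contact failed
            if (coordinator != null) {
                return coordinator;
            }
        }
        return joiner; // no responses, so the joiner makes itself coordinator
    }

    public static void main(String[] args) {
        List<String> initialHosts = List.of("A", "D"); // configured on C

        // Scenario 1: A and B are up, B is coordinator.
        System.out.println(discover("C", initialHosts, Map.of("A", "B", "B", "B"))); // prints B

        // Scenario 2: only B is up, and B is not in initial_hosts.
        System.out.println(discover("C", initialHosts, Map.of("B", "B"))); // prints C
    }
}
```

In the second call C ends up as coordinator of its own subgroup even though B is alive, which is the split described in Scenario 2.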
2) Second, it assists MERGE2. MERGE2 running on a coordinator node periodically tries to contact all possible members of the cluster to see whether any of them also considers itself the coordinator of a group with the same name. If it finds such a node, the cluster has somehow split into 2 or more subgroups, and the merge process is initiated to combine the subgroups into one. For a stack using TCPPING, TCPPING.initial_hosts controls who gets contacted. If a node (e.g. 'B' above) is not in that list, it will never be contacted, and thus a subgroup coordinated by that node will never get merged.
To avoid the 2 issues above, TCPPING.initial_hosts should contain the universe of all possible members. A member missing from TCPPING.initial_hosts can still join, but it exposes the cluster to both problems.
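A deployment check for this rule could compare each node's configured list against the full set of possible members. The sketch below is a hypothetical helper, not a JGroups facility. It uses plain member names instead of host[port] entries for brevity.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical configuration check, not part of JGroups.
class InitialHostsCoverageSketch {

    /** Returns the possible members that initialHosts does not list, sorted. */
    static Set<String> missing(Set<String> allPossibleMembers, Set<String> initialHosts) {
        Set<String> result = new TreeSet<>(allPossibleMembers);
        result.removeAll(initialHosts);
        return result;
    }

    public static void main(String[] args) {
        // The configuration on C from the example: initial_hosts="A,D".
        System.out.println(missing(Set.of("A", "B", "C", "D"), Set.of("A", "D"))); // prints [B, C]
    }
}
```

An empty result means the list covers every possible member. Any name it reports is a node that neither discovery nor MERGE2 on this node will ever contact.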
Use of MPING allows you to avoid this static configuration, but requires multicast for discovery.