#1 Check that 'localhost' really resolves to the correct address (e.g. not to 127.0.0.1) on *all* hosts
#2 You can't set AUTO_RECONNECT declaratively; use the following code to do this:
"localhost" resolves to a valid domain, and all five nodes run on a single box. We are using JGroups 2.4.1.
Is the configuration otherwise OK?
Why does the split happen at all, given that all the nodes are running on the same box? Could it be because of long GC pauses? Could there be any other reasons?
Please let us know.
Use FD_SOCK instead of or on top of FD (see http://wiki.jboss.org/wiki/Wiki.jsp?page=FDVersusFD_SOCK for details). Suspicions can happen for a number of reasons, e.g. garbage collection or the up queue being blocked by a callback; these are also explained there.
Thanks for your suggestion Bela.
If I use FD_SOCK on top of FD, what happens when FD has timed out after retrying, but the FD_SOCK socket between the nodes is still open? Would FD still send a SUSPECT message?
Thanks in advance.
Yes. So set the timeout in FD to a sufficiently high value.
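To make "sufficiently high" a bit more concrete, here is a small stand-alone sketch of the reasoning. It is not JGroups code: the class and method names are hypothetical, and it assumes that FD suspects a member after roughly timeout * max_tries milliseconds without a heartbeat reply, while FD_SOCK suspects only when the socket to its neighbor closes. The 20-second GC pause is an assumed figure for illustration.

```java
// Hypothetical model of the FD / FD_SOCK interaction; not part of JGroups.
final class FailureDetectionSketch {

    /** Approximate time FD waits before suspecting a silent member (assumed: timeout * max_tries). */
    static long fdWindowMs(long timeoutMs, int maxTries) {
        return timeoutMs * maxTries;
    }

    /** FD suspects when a member stays silent for longer than its window. */
    static boolean fdSuspects(long pauseMs, long timeoutMs, int maxTries) {
        return pauseMs > fdWindowMs(timeoutMs, maxTries);
    }

    /** FD_SOCK suspects only when the socket to the neighbor has been closed. */
    static boolean fdSockSuspects(boolean socketClosed) {
        return socketClosed;
    }

    /** With both protocols in the stack, either one alone can raise a suspicion. */
    static boolean memberSuspected(long pauseMs, long timeoutMs, int maxTries, boolean socketClosed) {
        return fdSuspects(pauseMs, timeoutMs, maxTries) || fdSockSuspects(socketClosed);
    }

    public static void main(String[] args) {
        long gcPauseMs = 20_000; // assumed worst-case GC pause; socket stays open throughout

        // timeout=3000, max_tries=5: window is 15 s, shorter than the pause
        System.out.println(memberSuspected(gcPauseMs, 3_000, 5, false));  // prints "true"

        // timeout=10000, max_tries=5: window is 50 s, longer than the pause
        System.out.println(memberSuspected(gcPauseMs, 10_000, 5, false)); // prints "false"
    }
}
```

Under these assumptions, an open FD_SOCK socket does not veto FD, so the FD window has to be longer than the longest pause you expect on the box.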