I would also like to know. I am having issues with a server dropping out of a cluster and then not being able to re-join.
JBoss Version: 3.2.1
Java Version: 1.4.1_03-b02
OS Version: Solaris 5.9 Generic_112233-06
You got me excited there for a moment Mike.
I thought I actually got someone to reply to one of my
I'll post back if I find anything.
FYI, one action I am taking is to increase the timeout
for the FD and VERIFY_SUSPECT portions of the protocol. I have no idea if this will help, but I have to
VERIFY_SUSPECT would have been my first guess if the node is busy and not answering the other nodes.
I'd be curious to know if that solves it.
Here's what I know at this point.
I've made the following changes to my cluster-service.xml file:
Added a long timeout to the FD element: timeout="20000" (20 seconds)
Extended the timeout on the VERIFY_SUSPECT element: timeout="15000" (15 seconds)
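For anyone trying the same thing, the two changes above would look roughly like the fragment below. This is only a sketch: it assumes the FD and VERIFY_SUSPECT elements sit inside the JGroups protocol stack in cluster-service.xml, and it shows only the timeout attributes. Any other attributes already on those elements in your file should be left as they are.

```xml
<!-- Sketch only: fragment of the protocol stack in cluster-service.xml. -->
<!-- Other attributes on these elements are omitted here; keep your own. -->

<!-- Failure detection: wait 20 seconds (value is in milliseconds) -->
<FD timeout="20000"/>

<!-- Double-check a suspected member for 15 seconds before excluding it -->
<VERIFY_SUSPECT timeout="15000"/>
```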
The good news is that I have yet to see a cluster dropout since making these changes.
Unfortunately, due to circumstances beyond my control, I haven't been able to keep our servers up for
extended periods. We had a power failure last week,
and this weekend we had to shut down our non-production servers.
I hope to be able to show multi-day cluster stability
Hope that helps.
Set shun=true in both FD and GMS, and the member will re-join the cluster after leaving it. Increasing the timeout in VERIFY_SUSPECT will mitigate the problem, but not completely eliminate it.
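As a rough illustration of that suggestion, the fragment below sets shun on both elements. It assumes the GMS protocol appears as pbcast.GMS in the protocol stack of cluster-service.xml (check the element name in your own file), and it again leaves out every other attribute.

```xml
<!-- Sketch only: shun enabled on both failure detection and group membership. -->
<!-- Other attributes on these elements are omitted here; keep your own. -->
<FD shun="true" timeout="20000"/>
<VERIFY_SUSPECT timeout="15000"/>

<!-- Element name assumed to be pbcast.GMS -->
<pbcast.GMS shun="true"/>
```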