-
1. Re: Does anyone know what causes a server to drop out of the
maddmike Oct 1, 2003 8:46 AM (in response to pnwhitney)
I would also like to know. I am having issues with a server dropping out of a cluster and then not being able to re-join.
JBoss Version: 3.2.1
Java Version: 1.4.1_03-b02
OS Version: Solaris 5.9 Generic_112233-06 -
2. Re: Does anyone know what causes a server to drop out of the
pnwhitney Oct 1, 2003 9:10 AM (in response to pnwhitney)
You got me excited there for a moment, Mike.
I thought I actually got someone to reply to one of my
postings!
I'll post back if I find anything.
FYI, one action I am taking is to increase the timeout
for the FD and VERIFY_SUSPECT portions of the protocol. I have no idea if this will help, but I have to
try something.
Thanks,
Pete -
3. Re: Does anyone know what causes a server to drop out of the
juha Oct 1, 2003 5:35 PM (in response to pnwhitney)
VERIFY_SUSPECT would have been my first guess if the node is busy and is not answering the other nodes.
I'd be curious to know if that solves it.
-- Juha -
4. Re: Does anyone know what causes a server to drop out of the
pnwhitney Oct 4, 2003 5:06 AM (in response to pnwhitney)
OK Guys,
Here's what I know at this point.
I've made the following changes to my cluster-service.xml file:
Added a long timeout to the FD element: timeout="20000" (20 seconds)
Extended the timeout on the VERIFY_SUSPECT element: timeout="15000" (15 seconds)
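For reference, the two changes above would look roughly like the fragment below inside the protocol stack in cluster-service.xml. This is only a sketch: the timeout values are the ones quoted above, and it assumes that FD and VERIFY_SUSPECT in your file carry other attributes that should stay as they are.

```xml
<!-- Sketch of the two edited elements in the JGroups protocol stack.
     Only the timeout attributes discussed above are shown; any other
     attributes already present on these elements stay unchanged. -->
<FD timeout="20000"/>              <!-- failure detection timeout: 20 seconds -->
<VERIFY_SUSPECT timeout="15000"/>  <!-- wait 15 seconds before confirming a suspect -->
```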
The good side is that I've yet to see a cluster dropout since making these changes.
Unfortunately, due to circumstances beyond my control, I haven't been able to keep our servers up for
extended periods. We had a power failure last week,
and this weekend we had to shut down our non-production servers.
I hope to be able to show multi-day cluster stability
next week.
Hope that helps.
Thanks,
Pete -
5. Re: Does anyone know what causes a server to drop out of the
belaban Oct 9, 2003 9:15 PM (in response to pnwhitney)
Set shun=true in both FD and GMS, and the member will re-join the cluster after leaving it. Increasing the timeout in VERIFY_SUSPECT will mitigate the problem, but not completely eliminate it.
Bela
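As a rough illustration of the shun setting, the fragment below shows where the attribute would go in the protocol stack of cluster-service.xml. It assumes that GMS appears under the element name pbcast.GMS, as it usually does in JGroups stacks of this era; check the name in your own file. Other attributes on both elements are omitted and should be left as they are.

```xml
<!-- Sketch only: shun="true" on both failure detection and group membership.
     The element name pbcast.GMS is an assumption about your stack;
     attributes not discussed in this thread are omitted here. -->
<FD shun="true"/>
<pbcast.GMS shun="true"/>
```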