-
1. Re: Problem with GMS joining nodes when master node was killed
praveen.kumar Jun 11, 2010 4:03 AM (in response to felixreuthlinger)
Hi Felix,
This problem may be caused by the machine having multiple NICs/IP addresses. Edit the following two files:
.../deploy/cluster-service.xml
.../deploy/tc5-cluster.sar/META-INF/jboss-service.xml
<!--
The default UDP stack:
- If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
appropriate NIC IP address, e.g bind_addr="192.168.0.2".
- On Windows machines, because of the media sense feature being broken with multicast
(even after disabling media sense) set the UDP protocol's loopback attribute to true
-->
In bind_addr, specify the IP address of the interface you want to bind to for multicasting.
Check whether that works.
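If it is not obvious which address to put in bind_addr on a multihomed machine, a small standalone program can list the candidates. The following is only a sketch: the class and method names are made up for illustration, and it assumes the cluster uses IPv4 multicast. It prints the IPv4 addresses of interfaces that are up, not loopback, and report multicast support; which one is correct still depends on your network layout.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of JBoss or JGroups.
public class BindAddrCandidates {

    /** Returns IPv4 addresses of interfaces that are up, non-loopback and multicast-capable. */
    static List<String> candidates() throws SocketException {
        List<String> result = new ArrayList<String>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return result; // no interfaces reported on this machine
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (!nic.isUp() || nic.isLoopback() || !nic.supportsMulticast()) {
                continue;
            }
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                if (addr instanceof Inet4Address) {
                    result.add(addr.getHostAddress());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        // One candidate per line; pick the one on the cluster network.
        for (String address : candidates()) {
            System.out.println(address);
        }
    }
}
```

If more than one address is printed, the machine is multihomed in the sense of the comment above, and every node should be bound to its address on the same network.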
Cheers,
Praveen Kumar
-
2. Re: Problem with GMS joining nodes when master node was killed
payne51558 Oct 21, 2010 5:03 PM (in response to felixreuthlinger)
Did you ever find a resolution to this? I am experiencing the same issue with 15 nodes on JBoss 5.1GA, and the only workaround I have found is to change the multicast address: I shut down the nodes one at a time and update each with the new multicast address. This only gets me around the issue, which seems to come back after a few days or if I manually take a node down for maintenance.
2010-10-21 15:34:45,206 INFO [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] (main) Initializing partition DefaultPartition
2010-10-21 15:34:45,277 INFO [STDOUT] (JBoss System Threads(1)-3)
---------------------------------------------------------
GMS: address is 10.230.3.15:54390 (cluster=DefaultPartition)
---------------------------------------------------------
2010-10-21 15:34:45,385 INFO [org.jboss.cache.jmx.PlatformMBeanServerRegistration] (main) JBossCache MBeans were successfully registered to the platform mbean server.
2010-10-21 15:34:45,449 INFO [STDOUT] (main)
---------------------------------------------------------
GMS: address is 10.230.3.15:54390 (cluster=DefaultPartition-HAPartitionCache)
---------------------------------------------------------
2010-10-21 15:34:45,501 INFO [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] (JBoss System Threads(1)-3) Number of cluster members: 15
2010-10-21 15:34:45,504 INFO [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] (JBoss System Threads(1)-3) Other members: 14
2010-10-21 15:34:48,464 WARN [org.jgroups.protocols.pbcast.GMS] (main) join(10.230.3.15:54390) sent to 10.230.3.16:42616 timed out (after 3000 ms), retrying
2010-10-21 15:34:51,467 WARN [org.jgroups.protocols.pbcast.GMS] (main) join(10.230.3.15:54390) sent to 10.230.3.16:42616 timed out (after 3000 ms), retrying
2010-10-21 15:34:54,470 WARN [org.jgroups.protocols.pbcast.GMS] (main) join(10.230.3.15:54390) sent to 10.230.3.16:42616 timed out (after 3000 ms), retrying
Also seeing this on the 10.230.3.16 node that 10.230.3.15 is trying to connect to:
2010-10-21 17:02:14,146 WARN [org.jgroups.protocols.pbcast.GMS] (ViewHandler,DefaultPartition-HAPartitionCache,10.230.3.16:42616) GMS flush by coordinator at 10.230.3.16:42616 failed
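With 15 nodes it can help to tally which member the join requests are timing out against. The following is a rough sketch only: the class name is made up, and the pattern is based solely on the wording of the WARN lines quoted above, so it may need adjusting for other JGroups versions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JBoss or JGroups.
public class JoinTimeoutCounter {

    // Matches e.g. "join(10.230.3.15:54390) sent to 10.230.3.16:42616 timed out (after 3000 ms)"
    private static final Pattern JOIN_TIMEOUT = Pattern.compile(
            "join\\(([^)]+)\\) sent to (\\S+) timed out \\(after (\\d+) ms\\)");

    /** Counts join timeouts per target member, in order of first appearance. */
    static Map<String, Integer> countByTarget(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String line : lines) {
            Matcher m = JOIN_TIMEOUT.matcher(line);
            if (m.find()) {
                String target = m.group(2);
                Integer previous = counts.get(target);
                counts.put(target, previous == null ? 1 : previous + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "2010-10-21 15:34:48,464 WARN [org.jgroups.protocols.pbcast.GMS] (main) join(10.230.3.15:54390) sent to 10.230.3.16:42616 timed out (after 3000 ms), retrying",
            "2010-10-21 15:34:51,467 WARN [org.jgroups.protocols.pbcast.GMS] (main) join(10.230.3.15:54390) sent to 10.230.3.16:42616 timed out (after 3000 ms), retrying",
            "2010-10-21 15:34:54,470 WARN [org.jgroups.protocols.pbcast.GMS] (main) join(10.230.3.15:54390) sent to 10.230.3.16:42616 timed out (after 3000 ms), retrying");
        System.out.println(countByTarget(sample)); // prints {10.230.3.16:42616=3}
    }
}
```

Run over the full server.log of each node, this would show whether every joiner is stuck on the same member, as in the excerpt above, or whether the timeouts are spread across several members.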
Appreciate your response!
Thanks
Cody
-
3. Re: Problem with GMS joining nodes when master node was killed
felixreuthlinger Oct 22, 2010 3:10 AM (in response to payne51558)
Hey Praveen, hey Cody,
@Praveen Kumar: at first I did not want to bind to a specific IP address, so I did not try it that way.
@Cody: no, I did not solve this problem. When it happened to me, I had to clean and restart the nodes more than once until somehow the information got lost and the nodes could join a well-formed cluster again. I can't remember exactly how I got it running again.
Cheers
Felix
-
4. Re: Problem with GMS joining nodes when master node was killed
swapnath Jan 30, 2013 2:44 AM (in response to felixreuthlinger)
Hi Felix,
Did you find a solution for this? I have a similar problem.