Hello,
We have a 4-node JBoss 5.1.0 cluster, and we have observed strange behavior during failover:
Sometimes, when all the nodes are under heavy load and the master node is shut down, no other member node becomes coordinator, and the following log messages keep repeating indefinitely on the other nodes:
DEBUG [org.jgroups.protocols.VERIFY_SUSPECT] diff=1500, mbr 10.XXX.XX.XXX:7600 is dead (passing up SUSPECT event)
DEBUG [org.jgroups.protocols.VERIFY_SUSPECT] diff=1500, mbr 10.XXX.XX.XXX:7600 is dead (passing up SUSPECT event)
DEBUG [org.jgroups.protocols.VERIFY_SUSPECT] diff=1500, mbr 10.XXX.XX.XXX:7600 is dead (passing up SUSPECT event)
..
..
..
and
INFO [org.jboss.ha.framework.interfaces.HAPartition.sfhsw-fdfksdjbvsdvsdv9fsdfj-311ee0d88e9f] Suspected member: 10.XXX.XX.XXX:7600
INFO [org.jboss.ha.framework.interfaces.HAPartition.sfhsw-fdfksdjbvsdvsdv9fsdfj-311ee0d88e9f] Suspected member: 10.XXX.XX.XXX:7600
INFO [org.jboss.ha.framework.interfaces.HAPartition.sfhsw-fdfksdjbvsdvsdv9fsdfj-311ee0d88e9f] Suspected member: 10.XXX.XX.XXX:7600
INFO [org.jboss.ha.framework.interfaces.HAPartition.sfhsw-fdfksdjbvsdvsdv9fsdfj-311ee0d88e9f] Suspected member: 10.XXX.XX.XXX:7600
INFO [org.jboss.ha.framework.interfaces.HAPartition.sfhsw-fdfksdjbvsdvsdv9fsdfj-311ee0d88e9f] Suspected member: 10.XXX.XX.XXX:7600
INFO [org.jboss.ha.framework.interfaces.HAPartition.sfhsw-fdfksdjbvsdvsdv9fsdfj-311ee0d88e9f] Suspected member: 10.XXX.XX.XXX:7600
...
...
..
Any idea why this might be happening?
Any help would be highly appreciated.
Thanks