1 Reply Latest reply on Nov 11, 2010 6:14 AM by belaban

    Shutdown cluster node: Timing problem?

    kevin.lohmann

      Hi folks,

       

      I'm using JBoss 5.1.0 EAP and I've set up a cluster with 2 nodes.

       

      From time to time, when I shut down one node (B, with IP address XXX), the other node (A, with IP address YYY) logs these messages:

      2010-11-11 11:14:08,565 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to XXX:7900: java.net.SocketException: Socket closed
      2010-11-11 11:14:08,628 INFO  [org.jboss.messaging.core.impl.postoffice.GroupMember] org.jboss.messaging.core.impl.postoffice.GroupMember$ControlMembershipListener@16d478b got new view [YYY:55066|2] [YYY:55066], old view is [YYY:55066|1] [YYY:55066, XXX:55200]
      2010-11-11 11:14:08,628 INFO  [org.jboss.messaging.core.impl.postoffice.GroupMember] I am (YYY:55066)
      2010-11-11 11:14:08,628 INFO  [org.jboss.messaging.core.impl.postoffice.GroupMember] Dead members: 1 ([XXX:55200])
      2010-11-11 11:14:08,628 INFO  [org.jboss.messaging.core.impl.postoffice.GroupMember] All Members : 1 ([YYY:55066])
      2010-11-11 11:14:10,628 INFO  [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.cluster] New cluster view for partition cluster (id: 2, delta: -1) : [YYY:3045]
      2010-11-11 11:14:10,628 INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.cluster] I am (YYY:3045) received membershipChanged event:
      2010-11-11 11:14:10,628 INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.cluster] Dead members: 1 ([XXX:3045])
      2010-11-11 11:14:10,628 INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.cluster] New Members : 0 ([])
      2010-11-11 11:14:10,628 INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.cluster] All Members : 1 ([YYY:3045])
      
      
      

      server.log from node B (the one that is shutting down):

      2010-11-11 11:14:08,669 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to YYY:7900: java.net.SocketException: Socket closed
      2010-11-11 11:14:10,669 ERROR [org.jgroups.protocols.UNICAST] XXX:55200: sender window for YYY:55066 not found

       

      After restarting node B, the two nodes form a cluster again without any error or warning messages.

      I am wondering whether these errors are just a timing problem or whether there is a real problem I should care about.

       

      I would appreciate any hints.

       

      Thanks in advance,

      Kevin