0 Replies Latest reply on Mar 26, 2010 12:16 AM by sanjayms

    Please help me with JBoss clustering :(

      Hi Everybody,

       

         I am trying out a clustering example on jboss-5.1.0.GA.

       

         As part of the setup, I copied JBOSS_HOME/server/all twice, as node1 and node2.

       

         I started node1 with:

         $ ./run.sh -c node1 -g DocsPartition -u 239.255.100.100 -Djboss.messaging.ServerPeerID=1 -Djboss.service.binding.set=ports-default
         I started node2 with:
         $ ./run.sh -c node2 -g DocsPartition -u 239.255.100.100 -Djboss.messaging.ServerPeerID=2 -Djboss.service.binding.set=ports-01
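         To make the difference between the two start commands easier to see, here is a small sketch that builds both from the node number. The helper name node_cmd is made up for illustration; the flags and values are copied from my commands above. As far as I understand, -c picks the server configuration, -g the partition name, -u the multicast address, and the binding set ports-01 shifts node2's ports so both nodes can run on one machine.

```shell
#!/bin/sh
# Sketch only: node_cmd is a hypothetical helper, not part of JBoss.
# It prints the run.sh command line for node 1 or node 2, using the
# same partition name and multicast address for both nodes.
node_cmd() {
    n="$1"
    case "$n" in
        1) ports="ports-default" ;;   # node1 keeps the default ports
        2) ports="ports-01" ;;        # node2 uses the second binding set
        *) echo "unsupported node: $n" >&2; return 1 ;;
    esac
    echo "./run.sh -c node$n -g DocsPartition -u 239.255.100.100 -Djboss.messaging.ServerPeerID=$n -Djboss.service.binding.set=$ports"
}

node_cmd 1
node_cmd 2
```

         Only the configuration name, the ServerPeerID and the binding set differ between the nodes; the partition name and multicast address have to match for them to join the same cluster.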
         I was able to reach the GUI of both nodes through their respective URLs.
          
         Both nodes were working fine within the cluster until node1 failed (that is, I pressed Ctrl-C on node1).
         In an ideal clustering scenario, when node1 fails, node2 is responsible for taking over from node1.

       

         This switchover of responsibility is not happening.

       

         I am pasting the last few lines of the node2 log from the moment node1 fails. The full logs of node1 and node2 are attached.

       

         Can anybody please explain this switchover issue from node1 to node2?

       

       

      16:56:37,986 INFO  [AjpProtocol] Starting Coyote AJP/1.3 on ajp-127.0.0.1-8109
      16:56:37,986 INFO  [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221053)] Started in 52s:655ms
      16:57:39,063 ERROR [ConnectionTable] failed sending data to 127.0.0.1:7900: java.net.SocketException: Connection reset by peer: socket write error
      16:57:40,578 INFO  [DocsPartition] Suspected member: 127.0.0.1:3756
      16:57:40,672 INFO  [RPCManagerImpl] Received new cluster view: [127.0.0.1:3769|2] [127.0.0.1:3769]
      16:57:40,672 INFO  [GroupMember] org.jboss.messaging.core.impl.postoffice.GroupMember$ControlMembershipListener@91d4b5 got new view [127.0.0.1:3769|2] [127.0.0.1:3769], old view is [127.0.0.1:3756|1] [127.0.0.1:3756, 127.0.0.1:3769]
      16:57:40,688 INFO  [GroupMember] I am (127.0.0.1:3769)
      16:57:40,688 INFO  [RPCManagerImpl] Received new cluster view: [127.0.0.1:3769|2] [127.0.0.1:3769]
      16:57:40,672 INFO  [DocsPartition] New cluster view for partition DocsPartition (id: 2, delta: -1) : [127.0.0.1:1199]
      16:57:40,688 INFO  [MessagingPostOffice] JBoss Messaging is failing over for failed node 1. If there are many messages to reload this may take some time...
      16:57:40,688 INFO  [DocsPartition] I am (127.0.0.1:1199) received membershipChanged event:
      16:57:40,688 INFO  [DocsPartition] Dead members: 1 ([127.0.0.1:1099])
      16:57:40,703 INFO  [DocsPartition] New Members : 0 ([])
      16:57:40,703 INFO  [DocsPartition] All Members : 1 ([127.0.0.1:1199])
      16:57:42,672 WARN  [GMS] 127.0.0.1:3769 failed to collect all ACKs (1) for mcasted view [127.0.0.1:3769|2] [127.0.0.1:3769] after 2000ms, missing ACKs from [127.0.0.1:3769] (received=[]), local_addr=127.0.0.1:3769
      16:57:42,703 INFO  [MessagingPostOffice] JBoss Messaging failover completed
      16:57:42,703 INFO  [GroupMember] Dead members: 1 ([127.0.0.1:3756])
      16:57:42,703 INFO  [GroupMember] All Members : 1 ([127.0.0.1:3769])
      16:58:13,703 WARN  [SimpleConnectionManager] A problem has been detected with the connection to remote client 5c4o13w-vi9nra-g761t6bc-1-g761ubvu-am, jmsClientID=null. It is possible the client has exited without closing its connection(s) or the network has failed. All associated connection resources will be cleaned up.
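
      Since the attached logs are long, here is a rough sketch of how the membership and failover lines could be pulled out of a node's log. The function name membership_events is made up, the patterns are simply the message fragments that appear in the excerpt above, and the log path in the usage comment is an assumption about the default log location for the node2 configuration.

```shell
#!/bin/sh
# Sketch only: membership_events is a hypothetical helper.
# It prints the lines of the given log file that mention a suspected
# member, a new cluster view, dead/current members, or messaging failover.
membership_events() {
    grep -E 'Suspected member|cluster view|Dead members|All Members|failing over|failover completed' "$1"
}

# Usage (path is an assumption, adjust to wherever the log really is):
#   membership_events "$JBOSS_HOME/server/node2/log/server.log"
```

      Run against the node2 log, this should leave just the lines around the moment node1 was stopped, which makes it easier to see what node2 did in response.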