1 Reply Latest reply on Aug 11, 2006 12:18 PM by brian.stansberry

    Clustering problem

    earniedyke

      Greetings all.

      We have two nodes, each on its own Windows 2000 server, in our production environment. A couple of weeks ago we had to reboot the servers and restart the JBoss instances. Both instances started fine, but they no longer see each other as a cluster. No changes were made to cluster-service.xml on either server, and I can't find out why they can't see each other anymore. Here is a snippet from cluster-service.xml:

      <attribute name="PartitionConfig">
        <Config>
          <!-- UDP: if you have a multihomed machine,
               set the bind_addr attribute to the appropriate NIC IP address -->
          <!-- UDP: On Windows machines, because of the media sense feature
               being broken with multicast (even after disabling media sense)
               set the loopback attribute to true -->
          <!--UDP mcast_addr="228.1.2.3" mcast_port="45566"-->
          <UDP mcast_addr="228.1.2.1" mcast_port="45566"
               ip_ttl="32" ip_mcast="true"
               mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
               ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
               loopback="true" />
          <PING timeout="2000" num_initial_members="3"
                up_thread="true" down_thread="true" />
          <MERGE2 min_interval="10000" max_interval="20000" />
          <FD shun="true" up_thread="true" down_thread="true"
              timeout="2500" max_tries="5" />
          <VERIFY_SUSPECT timeout="3000" num_msgs="3"
                          up_thread="true" down_thread="true" />
          <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
                         max_xmit_size="8192"
                         up_thread="true" down_thread="true" />
          <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
                   down_thread="true" />
          <pbcast.STABLE desired_avg_gossip="20000"
                         up_thread="true" down_thread="true" />
          <FRAG frag_size="8192"
                down_thread="true" up_thread="true" />
          <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                      shun="true" print_local_addr="true" />
          <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
        </Config>
      </attribute>
      


      Note that mcast_addr is 228.1.2.1. To test things, I ran the Draw demo program, one instance on each server, and nothing was shared between them. I did notice, however, that the log displayed by the Draw program shows mcast_addr=228.8.8.8 rather than 228.1.2.1. It was the same on both servers.
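
      One way to double-check the comparison above is to read the multicast group straight out of each server's cluster-service.xml and set it next to the address the Draw log reports. The following is only a minimal sketch using the standard Java XML parser; the class name McastSettings and its two helper methods are hypothetical and not part of JBoss or JGroups, and the config string in main is a shortened copy of the snippet above.

```java
import java.io.StringReader;
import java.net.InetAddress;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of JBoss or JGroups.
public class McastSettings {

    // Returns "address:port" from the first <UDP> element in the XML text.
    // Commented-out <UDP> entries are ignored because they are XML comments.
    static String udpGroup(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element udp = (Element) doc.getElementsByTagName("UDP").item(0);
        if (udp == null) {
            throw new IllegalArgumentException("no <UDP> element found");
        }
        return udp.getAttribute("mcast_addr") + ":" + udp.getAttribute("mcast_port");
    }

    // True if the literal IP address lies in the multicast range.
    static boolean isMulticast(String ip) throws Exception {
        return InetAddress.getByName(ip).isMulticastAddress();
    }

    public static void main(String[] args) throws Exception {
        // Shortened copy of the <Config> block from cluster-service.xml.
        String config =
              "<Config>"
            + "<!--UDP mcast_addr=\"228.1.2.3\" mcast_port=\"45566\"-->"
            + "<UDP mcast_addr=\"228.1.2.1\" mcast_port=\"45566\" loopback=\"true\" />"
            + "</Config>";

        String fromConfig = udpGroup(config);
        String fromDrawLog = "228.8.8.8"; // address shown in the Draw program's log

        System.out.println("configured group: " + fromConfig);
        System.out.println("matches Draw log: " + fromConfig.startsWith(fromDrawLog + ":"));
        System.out.println("valid multicast address: " + isMulticast("228.1.2.1"));
    }
}
```

      Running the same check against the file on each server would show whether both nodes are configured for the same group, and whether the Draw test was actually using that group at all.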

      I am at a loss and am looking for any and all suggestions.

      Thanks in advance!!!

      Earnie!