2 Replies Latest reply on Nov 25, 2004 4:46 AM by sheckler

    Second node slowing down

    sheckler

      Hi,

      I am running a cluster of 2-4 JBoss 3.2.6 nodes with HAJMS. In the development environment (W2K on PCs) everything works fine.
      In the target environment (Solaris, SunOS 5.8) the second node slows down dramatically while deploying the clustered EJB components, but no error or warning is logged. After a long time (some hours) the second node comes up successfully. The second node is started only after the first node has started successfully. The network people told me that everything looks fine on their side.

      <!-- The JGroups protocol configuration -->
      <attribute name="PartitionConfig">
        <Config>
          <!-- UDP: if you have a multihomed machine,
               set the bind_addr attribute to the appropriate NIC IP address -->
          <!-- UDP: On Windows machines, because of the media sense feature
               being broken with multicast (even after disabling media sense)
               set the loopback attribute to true
          -->
          <UDP mcast_addr="226.1.2.3" mcast_port="29944"
               ip_ttl="32" ip_mcast="true"
               mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
               ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
               loopback="false" />
          <PING timeout="2000" num_initial_members="3"
                up_thread="true" down_thread="true" />
          <MERGE2 min_interval="10000" max_interval="20000" />
          <FD shun="true" up_thread="true" down_thread="true"
              timeout="2500" max_tries="5" />
          <VERIFY_SUSPECT timeout="3000" num_msgs="3"
                          up_thread="true" down_thread="true" />
          <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
                         max_xmit_size="8192"
                         up_thread="true" down_thread="true" />
          <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
                   down_thread="true" />
          <pbcast.STABLE desired_avg_gossip="20000"
                         up_thread="true" down_thread="true" />
          <FRAG frag_size="8192"
                down_thread="true" up_thread="true" />
          <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                      shun="true" print_local_addr="true" />
          <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
        </Config>
      </attribute>


      Does anybody know the reason for that behaviour?


      Thanks
      Stefan Heckler

        • 1. Non-multicast (TCP), servers don't see each other.
          sheckler

          Hi all,
          I'm running JBoss in a cluster without multicast, that is, I have configured JGroups to use TCP. I'm running two JBoss instances on different machines. Both start without any error or exception, but they do not discover each other. My cluster-service.xml file is:


          <!-- UDP: if you have a multihomed machine,
          set the bind_addr attribute to the appropriate NIC IP address -->
          <!-- UDP: On Windows machines, because of the media sense feature
          being broken with multicast (even after disabling media sense)
          set the loopback attribute to true -->
          <TCP start_port="12800"/>
          <TCPPING initial_hosts="192.9.200.152[12800],192.9.200.150[12800]" port_range="5" timeout="3000" num_initial_members="3" up_thread="true" down_thread="true"/>
          <MERGE2 min_interval="5000" max_interval="10000"/>
          <FD shun="true" up_thread="true" down_thread="true" timeout="2500" max_tries="5"/>
          <VERIFY_SUSPECT timeout="3000" num_msgs="3" up_thread="true" down_thread="true"/>
          <pbcast.STABLE desired_avg_gossip="20000" up_thread="true" down_thread="true"/>
          <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800" up_thread="true" down_thread="true"/>
          <UNICAST timeout="5000" window_size="100" min_threshold="10" down_thread="true"/>
          <FRAG frag_size="8192" down_thread="true" up_thread="true"/>
          <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true" print_local_addr="true"/>
          <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>



          As the documentation describes, I listed the IP addresses of both machines in "initial_hosts". Machine A and machine B both use this same cluster-service.xml.

          Please suggest a solution.

          Bela?

          Regards
          Raj........

          • 2. Re: Second node slowing down
            sheckler

            I tried a separate Oracle datasource (connection pool min=1, max=10) for JMS and now it works.
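
            For anyone finding this later, a dedicated JMS datasource along those lines might look roughly like the sketch below. Only the pool sizes (min=1, max=10) come from the post above. The JNDI name, host, port, SID and credentials are made-up placeholders, and the JMS persistence configuration would still have to be pointed at this datasource, which depends on the installation.

            ```xml
            <?xml version="1.0" encoding="UTF-8"?>
            <!-- Sketch of a separate datasource file for JMS, e.g. deploy/jms-oracle-ds.xml
                 (the file name is an assumption). -->
            <datasources>
              <local-tx-datasource>
                <!-- Hypothetical JNDI name, chosen only for illustration -->
                <jndi-name>JmsDS</jndi-name>
                <!-- Placeholder host, port and SID: replace with your own -->
                <connection-url>jdbc:oracle:thin:@dbhost:1521:ORCL</connection-url>
                <driver-class>oracle.jdbc.driver.OracleDriver</driver-class>
                <!-- Placeholder credentials -->
                <user-name>jmsuser</user-name>
                <password>changeme</password>
                <!-- Pool sizes as mentioned in the post: min=1, max=10 -->
                <min-pool-size>1</min-pool-size>
                <max-pool-size>10</max-pool-size>
              </local-tx-datasource>
            </datasources>
            ```

            Giving JMS its own pool means it no longer has to compete with the EJB components for connections, which may be why the slowdown went away, although that is a guess rather than something confirmed here.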