4 Replies Latest reply on Oct 2, 2005 8:23 AM by tomerbd2

    using tcp stack no manual tcp discovery

    tomerbd2 Newbie

      Hi

      I have enabled this section


      <Config>
      <TCP bind_addr="10.10.1.226" start_port="7800" loopback="true"/>
      <TCPPING initial_hosts="10.10.2.80[7800]" port_range="3" timeout="3500"
      num_initial_members="3" up_thread="true" down_thread="true"/>
      <MERGE2 min_interval="5000" max_interval="10000"/>
      <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
      <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
      <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
      retransmit_timeout="3000"/>
      <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
      <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
      print_local_addr="true" down_thread="true" up_thread="true"/>
      <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
      </Config>


      while trying to make my nodes manually discover each other.

      Is this the right way?


      Why do I see "UDP" in the log? Wasn't it supposed to be "TCP"?

      17:54:03,984 INFO [TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;ip_ttl=8;loopback=false;mcast_addr=230.1.2.7;mcast_port=45577;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD_SOCK:VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)


        • 1. Re: using tcp stack no manual tcp discovery
          Brian Stansberry Master

          I expect you enabled the TCP section in the cluster-service.xml file. The log entry you are seeing is from the TreeCache used for HttpSession replication; this is configured in the tc5-cluster-service.xml file. You'll want to make equivalent changes in that file as well. Note that the two services should use different ports; don't just cut-and-paste from one to the other.

          Also, in your TCPPING entry, it's good to add the local node to the initial node list:

          <TCPPING initial_hosts="10.10.1.226[7800],10.10.2.80[7800]" port_range="3" timeout="3500"
          num_initial_members="3" up_thread="true" down_thread="true"/>
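
          To illustrate the point about ports, the two corresponding entries in tc5-cluster-service.xml might look roughly like the sketch below, in place of the UDP and PING entries shown in the log. The port 7810 is only an assumed example value (any free port outside the range already used by cluster-service.xml should do), and the rest of the TreeCache stack is left as it is:

          ```xml
          <!-- Sketch only: 7810 is an assumed example port, not a required value -->
          <TCP bind_addr="10.10.1.226" start_port="7810" loopback="true"/>
          <TCPPING initial_hosts="10.10.1.226[7810],10.10.2.80[7810]" port_range="3" timeout="3500"
          num_initial_members="3" up_thread="true" down_thread="true"/>
          ```

          The other node would use its own address in bind_addr, with the same port in both initial_hosts entries.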


          • 2. Re: using tcp stack no manual tcp discovery
            tomerbd2 Newbie

            Thanks,

            * Yes, I have enabled the TCP section in the cluster-service.xml file and disabled the UDP one.
            * I have added the local IP to the initial node list as you suggested.

            * For some reason my 2 nodes do not recognize each other. I guess this is not a multicast issue, since I'm using the TCP stack.
            Following is the TCPPING section on both nodes:
            On 10.10.2.80:

            <Config>
             <TCP bind_addr="10.10.2.80" start_port="7800" loopback="false"/>
             <TCPPING initial_hosts="10.10.2.80[7800],10.10.2.90[7800]" port_range="3" timeout="3500"
             num_initial_members="3" up_thread="true" down_thread="true"/>
             <MERGE2 min_interval="5000" max_interval="10000"/>
             <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
             <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
             <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
             retransmit_timeout="3000"/>
             <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
             <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
             print_local_addr="true" down_thread="true" up_thread="true"/>
             <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
             </Config>
            


            and in 10.10.2.90:
             <Config>
             <TCP bind_addr="10.10.2.90" start_port="7800" loopback="false"/>
             <TCPPING initial_hosts="10.10.2.90[7800],10.10.2.80[7800]" port_range="3" timeout="3500"
             num_initial_members="3" up_thread="true" down_thread="true"/>
             <MERGE2 min_interval="5000" max_interval="10000"/>
             <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
             <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
             <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
             retransmit_timeout="3000"/>
             <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
             <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
             print_local_addr="true" down_thread="true" up_thread="true"/>
             <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
             </Config>


            I know they do not recognize each other because in the jmx-console I see this in the partition service:

            10.10.2.80:

            CurrentView java.util.Vector R [10.10.2.80:2099] MBean Attribute.


            As you can see, 10.10.2.80 did not recognize 10.10.2.90. However, 10.10.2.90 (in the following section) recognized nodes coming from 10.10.2.32, which have a different partition name and are using the UDP stack...

            Any suggestions on how I can investigate or make progress with this issue?

            10.10.2.90:
            CurrentView java.util.Vector R [10.10.2.32:1a9d8094:10691ba153e:-7fff, 10.10.2.32:-4186d067:10691ba9254:-7fff, 10.10.2.90:2099] MBean Attribute.


            • 3. Re: using tcp stack no manual tcp discovery
              tomerbd2 Newbie

              I have some more important info to add.

              I tried two nodes on 2 different machines (not the ones previously mentioned), one running Linux and the other Solaris, and everything is just fine! The nodes did not see other nodes that didn't belong to their partition, and they did see each other (each node saw the other node belonging to the same partition).

              That means the communication problems between the nodes are probably due to the OS (my estimation). Can anyone hint at what to check or update on the Linux/Solaris OS so that discovery and communication between TCPPING-configured nodes work well?

              (Note that I have read the troubleshooting section in JbossClustering7.pdf, but it only talks about multicast, and I'm not doing multicast...)

              Tomer

              • 4. Re: using tcp stack no manual tcp discovery
                tomerbd2 Newbie

                I kind of solved my problem, so I want to update you: I have 2 nodes running on Windows in TCP mode, one on port 7800 and the other on port 7801. One is master and the other is slave, and I'm happy :)