4 Replies Latest reply on Oct 2, 2005 8:23 AM by tomerbd2

using tcp stack no manual tcp discovery

tomerbd2 Sep 27, 2005 10:58 AM

Hi

I have enabled this section

<Config>
<TCP bind_addr="10.10.1.226" start_port="7800" loopback="true"/>
<TCPPING initial_hosts="10.10.2.80[7800]" port_range="3" timeout="3500"
num_initial_members="3" up_thread="true" down_thread="true"/>
<MERGE2 min_interval="5000" max_interval="10000"/>
<FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
<pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
retransmit_timeout="3000"/>
<pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
<pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
print_local_addr="true" down_thread="true" up_thread="true"/>
<pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
</Config>

while trying to make my nodes discover each other manually.

Is this the right way?

Why do I see "UDP" in the log? Wasn't it supposed to be "TCP"?

17:54:03,984 INFO [TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;ip_ttl=8;loopback=false;mcast_addr=230.1.2.7;mcast_port=45577;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD_SOCK:VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)

1. Re: using tcp stack no manual tcp discovery

brian.stansberry Sep 27, 2005 12:26 PM (in response to tomerbd2)
I expect you enabled the TCP section in the cluster-service.xml file. The log entry you are seeing is from the TreeCache used for HttpSession replication; this is configured in the tc5-cluster-service.xml file. You'll want to make equivalent changes in that file as well. Note that the two services should use different ports; don't just cut-and-paste from one to the other.

Also, in your TCPPING entry, it's good to add the local node to the initial node list:

<TCPPING initial_hosts="10.10.1.226[7800],10.10.2.80[7800]" port_range="3" timeout="3500" num_initial_members="3" up_thread="true" down_thread="true"/>
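
As a rough sketch of the equivalent change in tc5-cluster-service.xml, the TCP stack there might look something like the following. The start_port of 7810 is purely an illustrative assumption (any free port range that does not overlap with the 7800 range used in cluster-service.xml should do), and the remaining protocol settings are simply copied from the stack posted above:

```xml
<!-- Hypothetical stack for tc5-cluster-service.xml.
     Port 7810 is an assumed free port, chosen only so that it does not
     collide with the 7800-7802 range used by cluster-service.xml. -->
<Config>
  <TCP bind_addr="10.10.1.226" start_port="7810" loopback="true"/>
  <!-- Both the local node and the remote node are listed, on the new port -->
  <TCPPING initial_hosts="10.10.1.226[7810],10.10.2.80[7810]" port_range="3" timeout="3500"
           num_initial_members="3" up_thread="true" down_thread="true"/>
  <MERGE2 min_interval="5000" max_interval="10000"/>
  <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true"/>
  <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
  <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
                 retransmit_timeout="3000"/>
  <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false"/>
  <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
              print_local_addr="true" down_thread="true" up_thread="true"/>
  <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
</Config>
```

On the other node, bind_addr would be that node's own address, while initial_hosts and the port would stay the same on both.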

2. Re: using tcp stack no manual tcp discovery

tomerbd2 Sep 28, 2005 1:55 AM (in response to tomerbd2)

Thanks,

* Yes, I have enabled the TCP section in the cluster-service.xml file and disabled the UDP one
* I have added the local IP to the initial node list as you suggested

* For some reason my 2 nodes do not recognize each other. I guess this is not a multicast issue, since I'm using the TCP stack.
Following is the TCPPING section on both nodes.
On 10.10.2.80:

<Config>
 <TCP bind_addr="10.10.2.80" start_port="7800" loopback="false"/>
 <TCPPING initial_hosts="10.10.2.80[7800],10.10.2.90[7800]" port_range="3" timeout="3500"
 num_initial_members="3" up_thread="true" down_thread="true"/>
 <MERGE2 min_interval="5000" max_interval="10000"/>
 <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
 <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
 <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
 retransmit_timeout="3000"/>
 <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
 <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
 print_local_addr="true" down_thread="true" up_thread="true"/>
 <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
 </Config>

and in 10.10.2.90:

 <Config>
 <TCP bind_addr="10.10.2.90" start_port="7800" loopback="false"/>
 <TCPPING initial_hosts="10.10.2.90[7800],10.10.2.80[7800]" port_range="3" timeout="3500"
 num_initial_members="3" up_thread="true" down_thread="true"/>
 <MERGE2 min_interval="5000" max_interval="10000"/>
 <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
 <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
 <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
 retransmit_timeout="3000"/>
 <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
 <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
 print_local_addr="true" down_thread="true" up_thread="true"/>
 <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
 </Config>

I know they do not recognize each other because in jmx-console I see this in the partition service:

10.10.2.80:

CurrentView java.util.Vector R [10.10.2.80:2099] MBean Attribute.

As you can see, 10.10.2.80 did not recognize 10.10.2.90. However, 10.10.2.90 (in the following section) recognized nodes coming from 10.10.2.32, which have a different partition name and are using the UDP stack...

Any suggestions on how I can investigate or make progress with this issue?

10.10.2.90:

CurrentView java.util.Vector R [10.10.2.32:1a9d8094:10691ba153e:-7fff, 10.10.2.32:-4186d067:10691ba9254:-7fff, 10.10.2.90:2099] MBean Attribute.

3. Re: using tcp stack no manual tcp discovery

tomerbd2 Sep 28, 2005 4:47 AM (in response to tomerbd2)

I have some more important info to add.

I tried two nodes on 2 different machines (not the ones previously mentioned); one of them is Linux and the other is Solaris, and everything is just fine! The nodes did not see other nodes that didn't belong to their partition, and they did see each other (each node saw the other node belonging to the same partition).

That means the communication problems between the nodes are probably due to the OS (my estimation). Can anyone give me a hint on what to check or update on the Linux/Solaris OS so that discovery and communication between TCPPING-configured nodes will work well?

(Note that I have read the troubleshooting section in JbossClustering7.pdf, and it only talked about multicast, but I'm not doing multicast...)

Tomer
4. Re: using tcp stack no manual tcp discovery

tomerbd2 Oct 2, 2005 8:23 AM (in response to tomerbd2)

I kind of solved my problem, so I want to update you: I have 2 nodes running on Windows in TCP mode, one on port 7800 and another on port 7801. One is master and the other is slave, and I'm happy :)