Could you give a bit more information?
What OS do you use?
What version of JBoss do you use?
Are both JBoss instances on one system or on two different systems?
Do you use IP binding (option -b for run.sh)?
With which IP addresses are your instances started?
BTW, to test the JGroups functionality it is recommended to use the same multicast address and the same IP binding as you use with JBoss; any other combination might behave differently.
OS is Windows Server 2008.
JBoss 3.2.5 and JDK 1.4.2.
We use a wrapper to run JBoss as a Windows service.
The JBoss instances are running on two separate servers on the same VLAN.
I am not using farm deployment. The application .ear file is deployed in the server/all/deploy folder as an exploded archive.
Please note that currently both nodes are running as master nodes instead of forming a cluster of master and child nodes.
I don't use IP binding.
Below is the configuration used for both nodes.
Any idea why the two nodes do not form a cluster even though multicasting is working?
<attribute name="PartitionConfig">
  <Config>
    <!-- UDP: if you have a multihomed machine,
         set the bind_addr attribute to the appropriate NIC IP address -->
    <!-- UDP: On Windows machines, because of the media sense feature
         being broken with multicast (even after disabling media sense)
         set the loopback attribute to true -->
    <UDP mcast_addr="184.108.40.206" mcast_port="45566" ip_ttl="32" ip_mcast="true"
         mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
         ucast_send_buf_size="800000" ucast_recv_buf_size="150000" loopback="false" />
    <PING timeout="2000" num_initial_members="3" up_thread="true" down_thread="true" />
    <MERGE2 min_interval="10000" max_interval="20000" />
    <FD shun="true" up_thread="true" down_thread="true" timeout="2500" max_tries="5" />
    <VERIFY_SUSPECT timeout="3000" num_msgs="3" up_thread="true" down_thread="true" />
    <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800" max_xmit_size="8192" up_thread="true" down_thread="true" />
    <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10" down_thread="true" />
    <pbcast.STABLE desired_avg_gossip="20000" up_thread="true" down_thread="true" />
    <FRAG frag_size="8192" down_thread="true" up_thread="true" />
    <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true" print_local_addr="true" />
    <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
  </Config>
</attribute>
OK, that is quite an old JBoss, but nevertheless it should work.
It may help to start JBoss for a test without the Windows service wrapper.
Also, your post contains a hint for Windows about setting the loopback attribute to true, but you have set it to false.
What IP addresses do your Windows servers use? Different subnets can cause problems.
Maybe a test with a current JBoss 4.3, or the JGroups test with this version (see my previous comment), will help.
The strange thing is that we set up clustering about two months ago and it worked 100%. After a month we applied a small patch which required a JBoss restart. When we restarted one node, it did not join the cluster, and when we stopped both nodes and restarted them, they still did not join the cluster. When we started both instances at the same time, both became master nodes, each with a cluster view containing only itself. We noticed that the machine has two network interfaces. Can this cause problems?
Thanks for your help!
The two network interfaces might cause this. The misconfiguration may have been made earlier without any visible effect, so that you only see it when you restart the instances.
It depends on the IP address your JBoss instances are bound to and on the multicast address.
It might be that you can form a cluster with a different IP binding and/or multicast address.
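To see which interfaces and addresses are candidates for the binding, something like the following could help. It is only a rough sketch in plain Java, not part of JBoss or JGroups, and the class name NicReport is made up for this post. Note that isUp() and supportsMulticast() need Java 6 or later, so it will not compile on JDK 1.4.2; run it with a newer JDK on the same machine.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: lists every network interface with its state and addresses.
class NicReport {

    // Returns one line per interface: name, up/down, multicast support, addresses.
    static List<String> describeInterfaces() {
        List<String> lines = new ArrayList<String>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return lines; // no interfaces found on this machine
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                StringBuilder line = new StringBuilder(nic.getName());
                line.append(" up=").append(nic.isUp());
                line.append(" multicast=").append(nic.supportsMulticast());
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    line.append(' ').append(addr.getHostAddress());
                }
                lines.add(line.toString());
            }
        } catch (SocketException e) {
            lines.add("could not query interfaces: " + e.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeInterfaces()) {
            System.out.println(line);
        }
    }
}
```

The address of the interface that sits on the cluster VLAN is the one to try for -b and for bind_addr in the UDP element.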
A simple way to test is http://community.jboss.org/wiki/TestingJBoss
You can use it in parallel to the production JBoss without side effects.
There you can try different IP bindings and multicast addresses (just use a different mcast port).
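If you want a check that does not involve JGroups at all, a bare multicast sender/receiver is enough to see whether packets get from one server to the other. The following is a minimal sketch under my own assumptions, not the test from the wiki page: the class name McastCheck, the argument order and the 30 second timeout are made up, and the TTL of 32 simply mirrors ip_ttl in the posted config. It uses only java.net calls that exist since JDK 1.4.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

// Hypothetical helper: sends or receives one multicast packet on a chosen interface.
class McastCheck {

    // True only for addresses in the multicast range 224.0.0.0 to 239.255.255.255.
    static boolean isMulticastGroup(String address) {
        try {
            return InetAddress.getByName(address).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // Usage: java McastCheck <bindIp> <group> <port> [send]
    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: java McastCheck <bindIp> <group> <port> [send]");
            return;
        }
        if (!isMulticastGroup(args[1])) {
            System.out.println(args[1] + " is not a multicast address");
            return;
        }
        InetAddress bindIp = InetAddress.getByName(args[0]);
        InetAddress group = InetAddress.getByName(args[1]);
        int port = Integer.parseInt(args[2]);
        boolean send = args.length > 3 && "send".equals(args[3]);

        // Pick the interface that owns the given local IP address.
        NetworkInterface nic = NetworkInterface.getByInetAddress(bindIp);
        if (nic == null) {
            System.out.println("no local interface has address " + args[0]);
            return;
        }

        MulticastSocket socket = new MulticastSocket(port);
        try {
            socket.setNetworkInterface(nic); // interface used for outgoing packets
            socket.setTimeToLive(32);        // same value as ip_ttl in the posted config
            if (send) {
                byte[] data = ("hello from " + bindIp.getHostAddress()).getBytes("UTF-8");
                socket.send(new DatagramPacket(data, data.length, group, port));
                System.out.println("sent one packet to " + args[1] + ":" + port);
            } else {
                socket.joinGroup(new InetSocketAddress(group, port), nic);
                socket.setSoTimeout(30000); // wait at most 30 seconds
                byte[] buf = new byte[1024];
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                try {
                    socket.receive(packet);
                    System.out.println("received from " + packet.getAddress().getHostAddress()
                            + ": " + new String(packet.getData(), 0, packet.getLength(), "UTF-8"));
                } catch (SocketTimeoutException e) {
                    System.out.println("nothing received within 30 seconds");
                }
            }
        } finally {
            socket.close();
        }
    }
}
```

Start it without "send" on one server and with "send" on the other, each with its own local IP address, for example (example values only) "java McastCheck 192.168.0.2 228.1.2.3 45577" and "java McastCheck 192.168.0.3 228.1.2.3 45577 send". If the receiver times out although both machines are on the same VLAN, the problem is more likely in the network than in the JBoss configuration.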
This proved to be a network issue: the switches had been upgraded, and multicasting was not enabled on them by default.