merge error in AS5.1
ablevine1 Aug 2, 2011 8:12 PM
I am using JBoss AS 5.1.0 with JGroups 2.6.10.GA.
I have a cluster with 3 members configured to use the TCP stack shown below. It works fine for a while, but eventually merging fails and I end up with multiple separate clusters.
<stack name="tcp"
       description="TCP based stack, with flow control and message bundling.
                    TCP stacks are usually used when IP multicasting cannot
                    be used in a network, e.g. because it is disabled (e.g.
                    routers discard multicast)">
    <config>
        <TCP
            singleton_name="tcp"
            start_port="${jboss.jgroups.tcp.tcp_port:7600}"
            tcp_nodelay="true"
            loopback="false"
            recv_buf_size="20000000"
            send_buf_size="640000"
            discard_incompatible_packets="true"
            max_bundle_size="64000"
            max_bundle_timeout="30"
            use_incoming_packet_handler="true"
            enable_bundling="true"
            use_send_queues="false"
            sock_conn_timeout="300"
            skip_suspected_members="true"
            timer.num_threads="12"
            enable_diagnostics="${jboss.jgroups.enable_diagnostics:true}"
            diagnostics_addr="${jboss.jgroups.diagnostics_addr:224.0.0.75}"
            diagnostics_port="${jboss.jgroups.diagnostics_port:7500}"
            use_concurrent_stack="true"
            thread_pool.enabled="true"
            thread_pool.min_threads="20"
            thread_pool.max_threads="200"
            thread_pool.keep_alive_time="5000"
            thread_pool.queue_enabled="true"
            thread_pool.queue_max_size="1000"
            thread_pool.rejection_policy="discard"
            oob_thread_pool.enabled="true"
            oob_thread_pool.min_threads="1"
            oob_thread_pool.max_threads="20"
            oob_thread_pool.keep_alive_time="5000"
            oob_thread_pool.queue_enabled="false"
            oob_thread_pool.queue_max_size="100"
            oob_thread_pool.rejection_policy="run"/>
        <!-- Alternative 1: multicast-based automatic discovery. -->
        <!-- Alternative 2: non multicast-based replacement for MPING. Requires a static configuration
             of *all* possible cluster members.
        -->
        <TCPPING timeout="3000"
                 initial_hosts="${jgroups.tcpping.initial_hosts:jboss-batch-stage-1[7600],jboss-batch-stage-2[7600],jboss-batch-stage-3[7600]}"
                 port_range="1"
                 num_initial_members="2"/>
        <MERGE2 max_interval="100000" min_interval="20000"/>
        <FD_SOCK/>
        <FD timeout="6000" max_tries="5" shun="true"/>
        <VERIFY_SUSPECT timeout="1500"/>
        <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"
                       retransmit_timeout="300,600,1200,2400,4800"
                       discard_delivered_msgs="true"/>
        <UNICAST timeout="300,600,1200,2400,3600"/>
        <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                       max_bytes="400000"/>
        <pbcast.GMS print_local_addr="true" join_timeout="3000"
                    shun="true"
                    view_bundling="true"
                    view_ack_collection_timeout="5000"/>
        <FC max_credits="2000000" min_threshold="0.10"
            ignore_synchronous_response="true"/>
        <FRAG2 frag_size="60000"/>
        <!-- pbcast.STREAMING_STATE_TRANSFER/ -->
        <pbcast.STATE_TRANSFER/>
        <pbcast.FLUSH timeout="0"/>
    </config>
</stack>
I see the following ERROR log statement repeatedly:
ERROR [OOB-21202,10.10.67.81:7600][2011-07-31 09:12:22,290][org.jgroups.protocols.pbcast.GMS] CoordGmsImpl.java(217): merge_id ([10.10.67.81:7600|1312128725273]) or this.merge_id (null) is null (sender=10.10.67.81:7600).
One thing to note is that the sender address in the message is the address of the machine I'm seeing the error on.
I see this on two of the three machines in the cluster.
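To check whether the sender always matches the logging node across a whole log file, a small script along the following lines might help. This is only a sketch: the regular expression is derived from the single ERROR line quoted above, the function name `parse_merge_error` is made up for illustration, and it assumes the address embedded in the thread name (`OOB-21202,10.10.67.81:7600`) is the local node's address.

```python
import re

# Pattern assumed from the one ERROR line quoted in this post;
# a different log4j layout would need a different expression.
LINE_RE = re.compile(
    r"ERROR \[(?P<thread>[^,\]]+),(?P<local>[\d.]+:\d+)\]"
    r"\[(?P<ts>[^\]]+)\]"
    r"\[org\.jgroups\.protocols\.pbcast\.GMS\].*?"
    r"merge_id \(\[(?P<merge_addr>[\d.]+:\d+)\|(?P<merge_ts>\d+)\]\)"
    r".*?\(sender=(?P<sender>[\d.]+:\d+)\)"
)


def parse_merge_error(line):
    """Return the fields of a GMS merge_id error line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Assumption: the address in the thread name is the local node's address.
    info["sender_is_local"] = info["sender"] == info["local"]
    return info


sample = (
    "ERROR [OOB-21202,10.10.67.81:7600][2011-07-31 09:12:22,290]"
    "[org.jgroups.protocols.pbcast.GMS] CoordGmsImpl.java(217): "
    "merge_id ([10.10.67.81:7600|1312128725273]) or this.merge_id (null) "
    "is null (sender=10.10.67.81:7600)."
)

info = parse_merge_error(sample)
print(info["local"], info["sender"], info["sender_is_local"])
```

Running `parse_merge_error` over each line of the server log on every node and counting how often `sender_is_local` is true would show whether the self-sent case is the only one that occurs.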
Any idea what exactly this means and/or how I can fix it?