5 Replies Latest reply on Feb 7, 2006 8:45 AM by hariv

    Configuration

    hariv

  We are using JGroups in our application, primarily to propagate cached data across the different nodes in a subnet. It is a very high-volume site with about 8 nodes in the cluster. The following is the configuration I am using.

      UDP(ucast_send_buf_size=800000;ucast_recv_buf_size=150000):
      PING(timeout=2000;num_initial_members=3;up_thread=true;down_thread=true;):
      MERGE2(min_interval=1000000;max_interval=2000000):
      FD(shun=true;up_thread=true;down_thread=true;timeout=54000000;max_tries=5):
      pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400,4800;max_xmit_size=8192;up_thread=true;down_thread=true):
      pbcast.STABLE(desired_avg_gossip=2000000;up_thread=true;down_thread=true):
      UNICAST:FRAG(frag_size=8192;down_thread=true;up_thread=true):
      pbcast.GMS:VIEW_ENFORCER:QUEUE

  Everything is working fine in our dev environment. I would appreciate it if somebody could validate the configuration string I am using before we move this to production.


      Thanks

      Hari

        • 1. Re: Configuration
          belaban

          The timeout value in FD means that crashed members will only be detected after a very long time (54000000 ms is 15 hours per try, and max_tries=5). Also, VIEW_ENFORCER and QUEUE are non-standard for this stack.
          I suggest you start from default.xml and modify it for your purposes.
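
          For illustration only, here is a hedged rewrite of the posted stack along those lines: FD gets a short timeout (the values timeout=2500 and max_tries=5 are typical examples, not tuned for this deployment), and the non-standard VIEW_ENFORCER and QUEUE protocols are dropped. The other parameters are left exactly as posted.

          ```
          UDP(ucast_send_buf_size=800000;ucast_recv_buf_size=150000):
          PING(timeout=2000;num_initial_members=3;up_thread=true;down_thread=true):
          MERGE2(min_interval=1000000;max_interval=2000000):
          FD(shun=true;timeout=2500;max_tries=5;up_thread=true;down_thread=true):
          pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400,4800;max_xmit_size=8192;up_thread=true;down_thread=true):
          pbcast.STABLE(desired_avg_gossip=2000000;up_thread=true;down_thread=true):
          UNICAST:FRAG(frag_size=8192;down_thread=true;up_thread=true):
          pbcast.GMS
          ```

          With FD at 2500 ms per try and 5 tries, a crashed member would be suspected after roughly 12.5 seconds instead of days. Diffing against the default.xml that ships with JGroups, as suggested above, is still the safer route.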

          • 2. Re: Configuration
            hariv

            We rolled out a JGroups-based cache propagation solution in production. The solution worked in our QA environment, but unfortunately it is not working in production. I am using the following configuration string:

            UDP(ucast_send_buf_size=800000;ucast_recv_buf_size=150000):
            PING(timeout=2000;num_initial_members=3;up_thread=true;down_thread=true;):
            MERGE2(min_interval=1000000;max_interval=2000000):
            FD(shun=true;up_thread=true;down_thread=true;timeout=54000000;max_tries=5):
            pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400,4800;max_xmit_size=8192;up_thread=true;down_thread=true):
            pbcast.STABLE(desired_avg_gossip=2000000;up_thread=true;down_thread=true):
            UNICAST:FRAG(frag_size=8192;down_thread=true;up_thread=true):
            pbcast.GMS:VIEW_ENFORCER:QUEUE


            When I turn on debug logging for JGroups, I see the following messages in the log file.


            xx.xx.105.65, xx.xx.105.67, xx.xx.105.69, and xx.xx.105.71
            are the machines.


            [org.jgroups.protocols.pbcast.STABLE] received digest xx.xx.105.65:33163#5 (5), xx.xx.105.67:33020#0 (0), xx.xx.105.69:33018#2 (2), xx.xx.105.71:32975#3 (3) from xx.xx.105.69:33018

            2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] received (mcast) 51 bytes from /xx.xx.105.65:33166 (size=51 bytes)
            2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] message is [dst: 228.8.8.8:7600, src: xx.xx.105.65:33165 (2 headers), size = 0 bytes], headers are {UDP=[UDP:channel_name=dictionary], PING=[PING: type=GET_MBRS_REQ, arg=null]}
            2006-02-01 14:00:34,868 WARN [org.jgroups.protocols.UDP] discarded message from different group (dictionary). Sender was xx.xx.105.65:33165
            2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] received (mcast) 51 bytes from /xx.xx.105.65:33166 (size=51 bytes)
            2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] message is [dst: 228.8.8.8:7600, src: xx.xx.105.65:33165 (2 headers), size = 0 bytes], headers are {UDP=[UDP:channel_name=dictionary], PING=[PING: type=GET_MBRS_REQ, arg=null]}
            2006-02-01 14:00:34,869 WARN [org.jgroups.protocols.UDP] discarded message from different group (dictionary). Sender was xx.xx.105.65:33165
            2006-02-01 14:00:34,869 DEBUG [org.jgroups.protocols.PING] received GET_MBRS_REQ from xx.xx.105.65:33165, sending response [PING: type=GET_MBRS_RSP, arg=[own_addr=xx.xx.105.67:33023, coord_addr=xx.xx.105.65:33165, is_server=true]]
            2006-02-01 14:00:34,869 DEBUG [org.jgroups.protocols.UDP] sending msg to xx.xx.105.65:33165 (src=xx.xx.105.67:33023), headers are {PING=[PING: type=GET_MBRS_RSP, arg=[own_addr=xx.xx.105.67:33023, coord_addr=xx.xx.105.65:33165, is_server=true]], UDP=[UDP:channel_name=dictionary]}

            My understanding is that the above log messages mean multicasting is enabled.

            But when I publish text from one of the nodes, the other nodes don't receive the published text. I would appreciate your help with this.
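
            One way to rule out the network layer is to test raw IP multicast between the hosts with the diagnostic classes that ship with JGroups, using the same multicast address and port that appear in the logs above (228.8.8.8:7600). The jar name below is an assumption; use whatever JGroups jar is on your classpath. Lines typed into the sender should appear on every receiver; if they don't, multicast routing between the production hosts is the problem, not the stack configuration.

            ```
            # on each receiving host:
            java -cp jgroups-all.jar org.jgroups.tests.McastReceiverTest -mcast_addr 228.8.8.8 -port 7600

            # on the sending host, then type some lines:
            java -cp jgroups-all.jar org.jgroups.tests.McastSenderTest -mcast_addr 228.8.8.8 -port 7600
            ```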



            • 3. Re: Configuration
              hariv

              Bela,

              I get the following message in the log file:
              DEBUG [org.jgroups.protocols.MERGE2] didn't find multiple coordinators in [[own_addr=xx.xx.105.71:32973, coord_addr=xx.xx.105.65:33161, is_server=true], [own_addr=xx.xx.105.69:33016, coord_addr=xx.xx.105.65:33161, is_server=true], [own_addr=xx.xx.105.67:33017, coord_addr=xx.xx.105.65:33161, is_server=true], [own_addr=xx.xx.105.65:33161, coord_addr=xx.xx.105.65:33161, is_server=true]], no need for merge

              The following is the configuration

              UDP(ucast_send_buf_size=800000;ucast_recv_buf_size=150000):
              PING(timeout=2000;num_initial_members=3;up_thread=true;down_thread=true;):
              MERGE2(min_interval=1000000;max_interval=2000000):
              FD(shun=true;up_thread=true;down_thread=true;timeout=54000000;max_tries=5):
              pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400,4800;max_xmit_size=8192;up_thread=true;down_thread=true):
              pbcast.STABLE(desired_avg_gossip=2000000;up_thread=true;down_thread=true):
              UNICAST:FRAG(frag_size=8192;down_thread=true;up_thread=true):
              pbcast.GMS:VIEW_ENFORCER:QUEUE

              • 4. Re: Configuration
                belaban

                What's the problem then? It looks like there was no partition, so there is no need for a merge.

                • 5. Re: Configuration
                  hariv

                  The issue is that whenever the application invokes castMessage on an instance of RpcDispatcher, the handler doesn't get invoked on the other nodes. This solution worked in our QA environment but is not working in our production environment.
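
                  For context, a hypothetical sketch of the publish path being described, assuming the JGroups 2.x API (castMessage is inherited from MessageDispatcher); the variable names, payload, and timeout are illustrative only, and this will not compile without the JGroups jar:

                  ```java
                  // props = the stack string below; "dictionary" is the group
                  // name from the logs and must match on every node
                  JChannel channel = new JChannel(props);
                  channel.connect("dictionary");

                  // requestHandler's handle() is what should run on the receivers
                  RpcDispatcher disp =
                      new RpcDispatcher(channel, null, null, requestHandler);

                  // dests == null multicasts to the whole group; GET_ALL blocks
                  // until all current members respond or the timeout expires
                  Message msg = new Message(null, null, "published text");
                  RspList rsps = disp.castMessage(null, msg, GroupRequest.GET_ALL, 5000);
                  ```

                  If your code looks roughly like this, the next thing to check is whether every node really connects to the same group name and multicast address.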

                  The following is the configuration:
                  UDP(ucast_send_buf_size=800000;ucast_recv_buf_size=150000):
                  PING(timeout=2000;num_initial_members=3;up_thread=true;down_thread=true;):
                  MERGE2(min_interval=1000000;max_interval=2000000):
                  FD(shun=true;up_thread=true;down_thread=true;timeout=54000000;max_tries=5):
                  pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400,4800;max_xmit_size=8192;up_thread=true;down_thread=true):
                  pbcast.STABLE(desired_avg_gossip=2000000;up_thread=true;down_thread=true):
                  UNICAST:FRAG(frag_size=8192;down_thread=true;up_thread=true):
                  pbcast.GMS:VIEW_ENFORCER:QUEUE


                  From the log messages below I can see that every node sees the other nodes.

                  When I turn on debug logging for JGroups, I see the following messages in the log file.


                  [org.jgroups.protocols.pbcast.STABLE] received digest xx.xx.105.65:33163#5 (5), xx.xx.105.67:33020#0 (0), xx.xx.105.69:33018#2 (2), xx.xx.105.71:32975#3 (3) from xx.xx.105.69:33018

                  2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] received (mcast) 51 bytes from /xx.xx.105.65:33166 (size=51 bytes)
                  2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] message is [dst: 228.8.8.8:7600, src: xx.xx.105.65:33165 (2 headers), size = 0 bytes], headers are {UDP=[UDP:channel_name=dictionary], PING=[PING: type=GET_MBRS_REQ, arg=null]}
                  2006-02-01 14:00:34,868 WARN [org.jgroups.protocols.UDP] discarded message from different group (dictionary). Sender was xx.xx.105.65:33165
                  2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] received (mcast) 51 bytes from /xx.xx.105.65:33166 (size=51 bytes)
                  2006-02-01 14:00:34,868 DEBUG [org.jgroups.protocols.UDP] message is [dst: 228.8.8.8:7600, src: xx.xx.105.65:33165 (2 headers), size = 0 bytes], headers are {UDP=[UDP:channel_name=dictionary], PING=[PING: type=GET_MBRS_REQ, arg=null]}
                  2006-02-01 14:00:34,869 WARN [org.jgroups.protocols.UDP] discarded message from different group (dictionary). Sender was xx.xx.105.65:33165
                  2006-02-01 14:00:34,869 DEBUG [org.jgroups.protocols.PING] received GET_MBRS_REQ from xx.xx.105.65:33165, sending response [PING: type=GET_MBRS_RSP, arg=[own_addr=xx.xx.105.67:33023, coord_addr=xx.xx.105.65:33165, is_server=true]]
                  2006-02-01 14:00:34,869 DEBUG [org.jgroups.protocols.UDP] sending msg to xx.xx.105.65:33165 (src=xx.xx.105.67:33023), headers are {PING=[PING: type=GET_MBRS_RSP, arg=[own_addr=xx.xx.105.67:33023, coord_addr=xx.xx.105.65:33165, is_server=true]], UDP=[UDP:channel_name=dictionary]}