1 Reply Latest reply on Aug 29, 2007 9:33 AM by brian.stansberry

State transfer out of Memory.

fdifonzo Aug 29, 2007 4:36 AM

I'm using jboss-cache1.4.1-SPA (TreeCache) and the jgroups 1.4.1 it ships with, in a cluster environment.
My application runs with a heap size of 2G. With a cache size of
372529026 bytes (JBoss Cache prints this value in my log), I get an OutOfMemoryError
from JGroups while the slave node is fetching state. Here's my log:

2007-08-23 12:41:59,164 INFO [ztc.cache.JBossCachePool] FREE MEM: 1209736176
2007-08-23 12:41:59,164 INFO [ztc.cache.JBossCachePool] TOTAL MEM: 2029518848
2007-08-23 12:42:04,071 INFO [ztc.tftpd.TFTPServerWrapper] processRequest(), RRQ by 127.0.0.1
2007-08-23 12:42:04,073 INFO [PROFILE] configure:1 msecs
2007-08-23 12:42:04,073 INFO [ztc.tftpd.TFTPServerWrapper] File "/000000000000" not found for client 127.0.0.1.34616
2007-08-23 12:42:14,081 INFO [ztc.tftpd.TFTPServerWrapper] processRequest(), RRQ by 127.0.0.1
2007-08-23 12:42:14,083 INFO [PROFILE] configure:1 msecs
2007-08-23 12:42:14,083 INFO [ztc.tftpd.TFTPServerWrapper] File "/000000000000" not found for client 127.0.0.1.34616
2007-08-23 12:42:24,007 INFO [org.jboss.cache.TreeCache] viewAccepted(): [192.168.1.249:34224|3] [192.168.1.249:34224, 192.168.1.250:32789]
2007-08-23 12:42:24,115 INFO [org.jboss.cache.TreeCache] locking the subtree at / to transfer state
2007-08-23 12:42:29,457 INFO [org.jboss.cache.statetransfer.StateTransferGenerator_140] returning the state for tree rooted in /(372529026 bytes)
2007-08-23 12:42:34,514 ERROR [org.jgroups.stack.DownHandler] DownHandler (FRAG) caught exception
java.lang.OutOfMemoryError
2007-08-23 12:42:34,514 INFO [ztc.tftpd.TFTPServerWrapper] processRequest(), RRQ by 127.0.0.1
2007-08-23 12:42:34,538 INFO [PROFILE] configure:23 msecs

Note that just before the error, my memory checker thread reports nearly
1.2G of free memory!

With a smaller cache, everything works fine.
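To put the two numbers from the log side by side, here is a minimal arithmetic sketch. It assumes (this is not confirmed anywhere in the log) that the transfer path may hold several full copies of the state byte array at once, for example one from state generation and further ones from serialization or buffering. The class and method names are hypothetical and only illustrate the calculation.

```java
// Hypothetical helper, not part of JBoss Cache or JGroups.
class StateCopyHeadroom {

    /** Number of complete copies of the state that fit into the free heap. */
    static long copiesThatFit(long freeBytes, long stateBytes) {
        if (stateBytes <= 0) {
            throw new IllegalArgumentException("stateBytes must be positive");
        }
        return freeBytes / stateBytes;
    }

    /** Free heap remaining after holding the given number of full copies (negative means exhausted). */
    static long heapLeftAfterCopies(long freeBytes, long stateBytes, int copies) {
        return freeBytes - (long) copies * stateBytes;
    }

    public static void main(String[] args) {
        long freeBytes = 1_209_736_176L;  // FREE MEM value from the log
        long stateBytes = 372_529_026L;   // state size reported by StateTransferGenerator_140

        System.out.println("full copies that fit: " + copiesThatFit(freeBytes, stateBytes));
        System.out.println("left after 3 copies:  " + heapLeftAfterCopies(freeBytes, stateBytes, 3));
        System.out.println("left after 4 copies:  " + heapLeftAfterCopies(freeBytes, stateBytes, 4));
    }
}
```

Under that assumption, 1.2G of free heap covers only three full copies of a 372529026-byte state, so a fourth temporary copy anywhere in the stack would be enough to trigger an OutOfMemoryError without any leak being involved.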

Debugging your code, I found that the slave hangs in the following JGroups call:

boolean rc = channel.getState(null, state_fetch_timeout);

So, at first glance, it seems to me that JGroups introduces a memory leak, but it may be a protocol problem.
In my configuration file, the part related to JGroups looks like this:

<UDP mcast_addr="229.1.2.4" mcast_port="45555"
ip_ttl="64" ip_mcast="true"
bind_addr="192.168.1.250"
mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
loopback="false" />
<PING timeout="2000" num_initial_members="3"
up_thread="true" down_thread="true" />
<MERGE2 min_interval="5000" max_interval="10000" />
<FD_SOCK/>
<VERIFY_SUSPECT timeout="3000" num_msgs="3"
up_thread="true" down_thread="true" />
<pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
up_thread="true" down_thread="true" />
<pbcast.STABLE desired_avg_gossip="20000"
up_thread="true" down_thread="true" />
<UNICAST timeout="5000" window_size="100" min_threshold="10"
down_thread="true" />
<FRAG frag_size="8192"
down_thread="true" up_thread="true" />
<pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
shun="true" print_local_addr="true" />
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false"/>

I read in the JGroups user guide that pbcast.STATE_TRANSFER consumes a lot of memory, so STREAMING_STATE_TRANSFER is better for big caches.
I replaced <pbcast.STATE_TRANSFER down_thread="false" up_thread="false"/> with <pbcast.STREAMING_STATE_TRANSFER down_thread="false" up_thread="false"/>,
but the slave hangs and I see no attempt to transfer state in the master log. (Note that the cache is currently empty, and with STATE_TRANSFER everything works fine in that case.)

Do you have any suggestions?

Many thanks, Fabrizio

1. Re: State transfer out of Memory.

brian.stansberry Aug 29, 2007 9:33 AM (in response to fdifonzo)

Use FRAG2 instead of FRAG in your protocol stack. See http://wiki.jboss.org/wiki/Wiki.jsp?page=JGroupsFRAG2 .
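As far as I understand the JGroups documentation, the relevant difference is that FRAG serializes the whole message into a new buffer before splitting it, while FRAG2 splits the existing payload buffer directly. The sketch below is plain Java, not JGroups code, and only illustrates that second idea: fragments are offset/length views over one array, so no second full-size copy is allocated. OffsetFragmenter and its methods are hypothetical names; the 8192 value is the frag_size from the configuration above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of fragmenting without copying the payload.
class OffsetFragmenter {

    /** Number of fragments needed for a payload of the given size (ceiling division). */
    static int fragmentCount(long payloadBytes, int fragSize) {
        if (fragSize <= 0) {
            throw new IllegalArgumentException("fragSize must be positive");
        }
        return (int) ((payloadBytes + fragSize - 1) / fragSize);
    }

    /** Splits the payload into views of at most fragSize bytes; the backing array is shared, not copied. */
    static List<ByteBuffer> fragment(byte[] payload, int fragSize) {
        List<ByteBuffer> fragments = new ArrayList<>(fragmentCount(payload.length, fragSize));
        for (int offset = 0; offset < payload.length; offset += fragSize) {
            int length = Math.min(fragSize, payload.length - offset);
            fragments.add(ByteBuffer.wrap(payload, offset, length));
        }
        return fragments;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[20_000];  // small stand-in for the real state
        List<ByteBuffer> fragments = fragment(payload, 8192);
        ByteBuffer last = fragments.get(fragments.size() - 1);
        System.out.println(fragments.size() + " fragments, last one has " + last.remaining() + " bytes");

        // Fragment count for the state size reported in the log.
        System.out.println(fragmentCount(372_529_026L, 8192) + " fragments for the full state");
    }
}
```

In the posted configuration, the corresponding change would presumably be to replace the FRAG element with a FRAG2 element, keeping frag_size as it is; check the linked wiki page for the attributes FRAG2 actually accepts in your JGroups version.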