4 Replies · Latest reply on Jul 28, 2004 12:12 PM by belaban

    Cluster Problem: Is there a cache object size limit?

    hanson

      I am testing the clustering function of TreeCache.

      Steps:
      1) The first cache is started and about 100,000 objects are put into it.
      2) The second cache is then started; it retrieves the data from the first cache.


      The test source is:
      /////////////////////////////////////////////////////////////////////////////////
      import org.jboss.cache.*;
      import java.io.*;

      public class MyTreeCache {

          public static void main(String[] args) {
              try {
                  TreeCache tree = new TreeCache();
                  PropertyConfigurator config = new PropertyConfigurator();
                  config.configure(tree, "META-INF/replAsync-service.xml");
                  tree.start(); // kick-start the tree cache

                  // These two lines are commented out in the second cache,
                  // since the second cache only "reads" data from the first.
                  long time = fillCache(tree, "a", 100000);
                  System.out.println("time = " + time);

                  while (true) {
                      Thread.sleep(5000);
                      Node node = tree.get(new Fqn(new Object[] { "a" }));
                      System.out.println("data size " + node.getDataKeys().size());
                  }
              } catch (Exception e) {
                  e.printStackTrace();
              }
          }

          // Puts 'count' CacheMessage objects under /regionRoot and returns the elapsed time in ms.
          private static long fillCache(TreeCache cache, String regionRoot, int count)
                  throws Exception {
              long time = System.currentTimeMillis();
              for (int i = 0; i < count; i++) {
                  String item = "item" + i;
                  CacheMessage value = new CacheMessage(i);
                  cache.put(new Fqn(new Object[] { regionRoot }), item, value);
              }
              return System.currentTimeMillis() - time;
          }
      }

      class CacheMessage implements Serializable {
          public int index;
          public byte[] body;

          CacheMessage(int index) {
              body = new byte[100];
              this.index = index;
          }
      }
      ///////////////////////////////////////////////////////////////////////////////////


      When the size of CacheMessage's "body" member is 100 bytes, the second cache is not able to get the data from the first cache.

      This is the error message:
      10:36:22,171 WARN [AckReceiverWindow] discarded msg with seqno=159 (next msg to
      receive is 164)
      10:36:22,171 WARN [AckReceiverWindow] discarded msg with seqno=145 .............................

      But if the size is 50 bytes, it works fine.

      The cache config file replAsync-service.xml was copied from the etc/META-INF directory. I adjusted the log level to INFO.
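      (For reference, a minimal log4j.properties sketch for that log-level change, assuming logging goes through log4j as in the standalone TreeCache examples; the file name and appender here are assumptions, and the pattern just mirrors the output format shown below:)

      # Assumed standalone log4j setup -- adjust to your own logging config
      log4j.rootLogger=INFO, stdout
      log4j.appender.stdout=org.apache.log4j.ConsoleAppender
      log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
      log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} %-5p [%c{1}] %m%n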

      Is there any size limit in JBoss TreeCache?




      ////////////////////////////////////////////////////////////////////////////////////
      The TreeCache config file (replAsync-service.xml):

      <?xml version="1.0" encoding="UTF-8"?>

      <!-- ===================================================================== -->
      <!--                                                                       -->
      <!--  Sample TreeCache Service Configuration                               -->
      <!--                                                                       -->
      <!-- ===================================================================== -->

      <server>

          <!-- ==================================================================== -->
          <!-- Defines TreeCache configuration                                      -->
          <!-- ==================================================================== -->

          <mbean code="org.jboss.cache.TreeCache" name="jboss.cache:service=TreeCache">

              <depends>jboss:service=Naming</depends>
              <depends>jboss:service=TransactionManager</depends>

              <!--
                  Configure the TransactionManager
              -->
              <attribute name="TransactionManagerLookupClass">org.jboss.cache.DummyTransactionManagerLookup</attribute>

              <!--
                  Isolation level : SERIALIZABLE
                                    REPEATABLE_READ (default)
                                    READ_COMMITTED
                                    READ_UNCOMMITTED
                                    NONE
              -->
              <attribute name="IsolationLevel">REPEATABLE_READ</attribute>

              <!--
                  Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC
              -->
              <attribute name="CacheMode">REPL_ASYNC</attribute>

              <!--
                  Just used for async repl: use a replication queue
              -->
              <attribute name="UseReplQueue">false</attribute>

              <!--
                  Replication interval for replication queue (in ms)
              -->
              <attribute name="ReplQueueInterval">0</attribute>

              <!--
                  Max number of elements which trigger replication
              -->
              <attribute name="ReplQueueMaxElements">0</attribute>

              <!-- Name of cluster. Needs to be the same for all clusters, in order
                   to find each other
              -->
              <attribute name="ClusterName">TreeCache-Cluster</attribute>

              <!-- JGroups protocol stack properties. Can also be a URL,
                   e.g. file:/home/bela/default.xml
              -->
              <attribute name="ClusterConfig">
                  <config>
                      <!-- UDP: if you have a multihomed machine,
                           set the bind_addr attribute to the appropriate NIC IP address -->
                      <!-- UDP: On Windows machines, because of the media sense feature
                           being broken with multicast (even after disabling media sense)
                           set the loopback attribute to true -->
                      <UDP mcast_addr="228.1.2.3" mcast_port="45566"
                           ip_ttl="64" ip_mcast="true"
                           mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
                           ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
                           loopback="false"/>
                      <PING timeout="2000" num_initial_members="3"
                            up_thread="false" down_thread="false"/>
                      <MERGE2 min_interval="10000" max_interval="20000"/>
                      <!-- <FD shun="true" up_thread="true" down_thread="true"/> -->
                      <FD_SOCK/>
                      <VERIFY_SUSPECT timeout="1500"
                                      up_thread="false" down_thread="false"/>
                      <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
                                     max_xmit_size="8192" up_thread="false" down_thread="false"/>
                      <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
                               down_thread="false"/>
                      <pbcast.STABLE desired_avg_gossip="20000"
                                     up_thread="false" down_thread="false"/>
                      <FRAG frag_size="8192"
                            down_thread="false" up_thread="false"/>
                      <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                                  shun="true" print_local_addr="true"/>
                      <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
                  </config>
              </attribute>

              <!--
                  Max number of entries in the cache. If this is exceeded, the
                  eviction policy will kick some entries out in order to make
                  more room
              -->
              <attribute name="MaxCapacity">2000000</attribute>

              <!--
                  Whether or not to fetch state on joining a cluster
              -->
              <attribute name="FetchStateOnStartup">true</attribute>

              <!--
                  The max amount of time (in milliseconds) we wait until the
                  initial state (ie. the contents of the cache) are retrieved from
                  existing members in a clustered environment
              -->
              <attribute name="InitialStateRetrievalTimeout">5000</attribute>

              <!--
                  Number of milliseconds to wait until all responses for a
                  synchronous call have been received.
              -->
              <attribute name="SyncReplTimeout">10000</attribute>

              <!-- Max number of milliseconds to wait for a lock acquisition -->
              <attribute name="LockAcquisitionTimeout">15000</attribute>

              <!-- Max number of milliseconds we hold a lock (not currently
                   implemented) -->
              <attribute name="LockLeaseTimeout">60000</attribute>

              <!-- Name of the eviction policy class. Not supported now. -->
              <attribute name="EvictionPolicyClass"></attribute>

          </mbean>

          <!-- Uncomment to get a graphical view of the TreeCache MBean above -->
          <!--
          <mbean code="org.jboss.cache.TreeCacheView" name="jboss.cache:service=TreeCacheView">
              <depends>jboss.cache:service=TreeCache</depends>
              <attribute name="CacheService">jboss.cache:service=TreeCache</attribute>
          </mbean>
          -->

      </server>


        • 1. Re: Cluster Problem: Is there a cache object size limit?
          hanson


          The problem is solved. The default "InitialStateRetrievalTimeout" is 5 secs; I changed it to 500 secs and now it works.
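          (In the cache XML this is the InitialStateRetrievalTimeout attribute, which takes milliseconds, so 500 secs looks roughly like this:)

          <attribute name="InitialStateRetrievalTimeout">500000</attribute>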
          But I have a new problem: retrieving the data from the other cache takes much longer than inserting the same data into the local cache.
          Inserting the data into the local cache takes about 8 secs, but retrieving the data from the other cache takes:
          1) about 30 secs when the "body" size is 100 bytes
          2) about 120 secs when the "body" size is 500 bytes

          I ran the two caches on different machines. Each machine is a single-processor Intel Pentium 4 1.8GHz with 256M RAM.

          I noticed that when the second cache starts up, its JVM memory usage does not increase until the last several seconds of the retrieval.
          I don't know what the cluster is doing before that. When the retrieval is over, the first JVM uses almost 600M of memory; I don't
          know whether that is a memory leak.



          • 2. Re: Cluster Problem: Is there a cache object size limit?
            belaban

            Increase the 5000 (the InitialStateRetrievalTimeout) in your cache XML file.

            Bela

            • 3. Re: Cluster Problem: Is there a cache object size limit?
              hanson

              Bela, thanks for your reply.

              I just tested replication performance on the Solaris platform and ran into some questions.

              Steps:
              Start up the first JBoss cache and insert 100,000 objects into it; each object is about 500 bytes. The insert time is about 10 secs.

              -------------------------------------------------------
              GMS: address is fire1:38833
              -------------------------------------------------------
              16:57:36,433 INFO [TreeCache] viewAccepted(): new members: [fire1:38833]
              16:57:36,447 INFO [TreeCache] state could not be retrieved (must be first member in group)
              16:57:36,447 INFO [TreeCache] setState(): new cache is null (maybe first member in cluster)
              time = 10319
              data size 100000


              Then start the second JBoss cache to back up the contents of the first; the replication takes about 20 secs:
              17:03:56,372 WARN [AckReceiverWindow] discarded msg with seqno=6232 (next msg to receive is 6558)
              17:03:56,373 WARN [AckReceiverWindow] discarded msg with seqno=6278 (next msg to receive is 6558)
              17:03:56,373 WARN [AckReceiverWindow] discarded msg with seqno=6360 (next msg to receive is 6558)
              17:03:56,374 WARN [AckReceiverWindow] discarded msg with seqno=6412 (next msg to receive is 6558)
              17:03:59,545 INFO [TreeCache] setState(): locking the old tree
              17:03:59,567 INFO [TreeCache] setState(): locking the old tree was successful
              17:03:59,568 INFO [TreeCache] setState(): forcing release of all locks in old tree
              17:03:59,568 INFO [TreeCache] state was retrieved successfully (in 21450 milliseconds


              The memory usage on the first node:
              1597 hanson 26 29 10 570M 474M sleep 0:21 0.95% java

              The memory usage on the backup node:
              18990 hanson 24 28 10 566M 277M sleep 0:10 0.03% java


              If I insert 400,000 objects, the insert time is about 40 secs:

              -------------------------------------------------------
              GMS: address is fire1:38856
              -------------------------------------------------------
              17:03:55,271 INFO [TreeCache] viewAccepted(): new members: [fire1:38856]
              17:03:55,291 INFO [TreeCache] setState(): new cache is null (maybe first member in cluster)
              17:03:55,292 INFO [TreeCache] state could not be retrieved (must be first member in group)
              time = 40046


              and the replication takes about 100 secs:

              17:13:46,654 WARN [AckReceiverWindow] discarded msg with seqno=25746 (next msg to receive is 26175)
              17:13:46,655 WARN [AckReceiverWindow] discarded msg with seqno=25810 (next msg to receive is 26175)
              17:13:46,655 WARN [AckReceiverWindow] discarded msg with seqno=25852 (next msg to receive is 26175)
              17:13:46,655 WARN [AckReceiverWindow] discarded msg with seqno=26036 (next msg to receive is 26175)
              17:14:00,739 INFO [TreeCache] setState(): locking the old tree
              17:14:00,763 INFO [TreeCache] setState(): locking the old tree was successful
              17:14:00,764 INFO [TreeCache] setState(): forcing release of all locks in old tree
              17:14:00,764 INFO [TreeCache] state was retrieved successfully (in 98070 milliseconds
              data size 400000


              The memory usage on the first node:
              23659 hanson 26 29 10 1881M 1632M sleep 1:37 0.12% java
              The memory usage on the backup node:
              19012 hanson 25 28 10 613M 579M sleep 0:35 0.16% java

              My questions are:

              [1] Why does the first JBoss cache use so much memory after replicating? Before replicating, it only used about 560M:

              26793 hanson 24 29 10 566M 320M sleep 0:41 8.87% java

              [2] When I insert 800,000 objects into the first cache and then start the backup cache, the replication fails within the 500 secs, and the first JBoss cache uses almost all of the memory (1800M).
              [3] Is there any way to improve performance when replicating a huge volume of data (>500M)? How much memory is required?

              • 4. Re: Cluster Problem: Is there a cache object size limit?
                belaban

                2 issues:

                #1 When we do state transfer, we have to copy the state (actually worse: serialize it) into a byte[] buffer. The same happens on the receiver. This means that you will have a memory spike that is double the size of your state. If your state is 400M, then allocate at least 1GB of memory to your JVM.
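                (As a rough worked example with the figures from the previous post: a ~400M state means a ~400M serialized byte[] on the sender and a matching buffer on the receiver, on top of the live cache itself, so each JVM needs on the order of 400M of extra headroom, e.g. started with something like:)

                java -Xmx1024m -cp <your classpath> MyTreeCache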

                #2 For large states I have an item on the todo list to provide a streaming state transfer API, where you transfer chunks (e.g. 10K in size) of state across the network. Your app (sender and receiver) therefore doesn't have to allocate double the memory of its state, but only an additional <chunk-size>, e.g. 10K.
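                (Purely as an illustration of that idea, not an actual TreeCache API: a minimal sketch of a sender that serializes entries straight onto the network stream in small chunks, so the extra memory stays around one chunk instead of the whole serialized state:)

                import java.io.*;
                import java.util.Iterator;

                // Hypothetical chunked sender -- assumed names, for illustration only.
                class ChunkedStateSender {
                    static final int CHUNK_SIZE = 10 * 1024; // e.g. 10K, as suggested above

                    static void sendState(Iterator entries, OutputStream network) throws IOException {
                        // Entries are written one by one onto a buffered stream whose buffer
                        // is a single chunk, so the sender never builds the full serialized
                        // state in memory.
                        ObjectOutputStream out = new ObjectOutputStream(
                                new BufferedOutputStream(network, CHUNK_SIZE));
                        while (entries.hasNext()) {
                            out.writeObject(entries.next());
                            out.reset(); // keep the serialization back-reference table from growing
                        }
                        out.flush();
                    }
                }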

                Bela