19 Replies. Latest reply on Jul 26, 2008 6:41 PM by phpguy99

JBoss Cache performance looks pretty poor :(

phpguy99 Jul 22, 2008 6:43 PM

Hi,
I'm evaluating various Java-based distributed caching solutions: JBoss Cache (version 2.1.1 GA), EHCache, and Terracotta.
My data set is 10 million objects, and I'm using 3 nodes, each with a 10GB heap.
So far EHCache has proven to be very fast, reliable (it never went down over many test iterations running for many hours), and light on memory. I can do 40,000 puts/second on a 3-node cluster.

Ok - I'm here to ask about JBoss Cache.
Putting 1 million objects (small ones, each consisting of 3-4 Strings), the rate is about 2000/second, and once I start the second node, it dies right away with this error:
org.jboss.cache.CacheException: Unable to fetch state on startup
.....

The memory usage is way too high: 1 million objects require 2GB of heap (measured in JConsole after forcing a GC, of course).

The way I'm using the tree cache is that I create *all* 1 million objects, each on its own node/Fqn. So I have ROOT/Object-1, ROOT/Object-2 ... ROOT/Object-1000000.

My configuration file:
<?xml version="1.0" encoding="UTF-8"?>

jboss:service=Naming
jboss:service=TransactionManager

org.jboss.cache.transaction.GenericTransactionManagerLookup

READ_COMMITTED

false

REPL_SYNC

JBossCache-Cluster

<UDP mcast_addr="228.1.2.3" mcast_port="48866"
ip_ttl="64" ip_mcast="true"
mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
loopback="false"/>
<PING timeout="2000" num_initial_members="3"/>
<MERGE2 min_interval="10000" max_interval="20000"/>

<FD_SOCK/>
<VERIFY_SUSPECT timeout="1500"/>
<pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800" />

<pbcast.STABLE desired_avg_gossip="400000"/>
<FC max_credits="2000000" min_threshold="0.10"/>
<FRAG2 frag_size="8192"/>
<pbcast.GMS join_timeout="5000" shun="true" print_local_addr="true"/>
<pbcast.STATE_TRANSFER/>

20000
20000
15000

I believe I'm missing something here, so it would be great if anybody could help.
Thanks a lot.

1. Re: JBoss Cache performance looks pretty poor :(

phpguy99 Jul 22, 2008 6:46 PM (in response to phpguy99)

My XML configuration didn't get posted correctly. :(
2. Re: JBoss Cache performance looks pretty poor :(

jason.greene Jul 22, 2008 7:25 PM (in response to phpguy99)

You have to wrap XML in a BBCode code tag for it to look right.

You probably need to increase the StateRetrievalTimeout if you have that much state. Could you run jmap -histo on the process to see what's taking up all that space?

Thanks
3. Re: JBoss Cache performance looks pretty poor :(

jason.greene Jul 22, 2008 7:33 PM (in response to phpguy99)

Also, could you tell us about your EHCache config? Were you using asynchronous replication, for example?
4. Re: JBoss Cache performance looks pretty poor :(

phpguy99 Jul 22, 2008 7:41 PM (in response to phpguy99)

Reposting my config again, now with the code tag:

<?xml version="1.0" encoding="UTF-8"?>

jboss:service=Naming
jboss:service=TransactionManager

org.jboss.cache.transaction.GenericTransactionManagerLookup

READ_COMMITTED

false

REPL_SYNC

JBossCache-Cluster

<UDP mcast_addr="228.1.2.3" mcast_port="48866"
ip_ttl="64" ip_mcast="true"
mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
loopback="false"/>
<PING timeout="2000" num_initial_members="3"/>
<MERGE2 min_interval="10000" max_interval="20000"/>

<FD_SOCK/>
<VERIFY_SUSPECT timeout="1500"/>
<pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800" />

<pbcast.STABLE desired_avg_gossip="400000"/>
<FC max_credits="2000000" min_threshold="0.10"/>
<FRAG2 frag_size="8192"/>
<pbcast.GMS join_timeout="5000" shun="true" print_local_addr="true"/>
<pbcast.STATE_TRANSFER/>

20000

20000

15000

$ jmap -histo 14359 ## for 200,000 of my objects in the cache.

num #instances #bytes class name
----------------------------------------------
1: 6058302 290798496 java.util.concurrent.locks.ReentrantLock$NonfairSync
2: 6058260 290796480 java.util.concurrent.ConcurrentHashMap$Segment
3: 6058260 198060352 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
4: 379178 57632968 [Ljava.util.HashMap$Entry;
5: 378642 57553488 [Ljava.util.concurrent.ConcurrentHashMap$Segment;
6: 765551 57436024 [C
7: 378601 33316888 org.jboss.cache.UnversionedNode
8: 763508 30540320 java.lang.String
9: 378601 30288080 org.jboss.cache.invocation.NodeInvocationDelegate
10: 378642 27262224 java.util.concurrent.ConcurrentHashMap
11: 378600 27259200 org.jboss.cache.lock.NonBlockingWriterLock
12: 379058 24259712 java.util.HashMap
13: 383949 18429552 java.util.concurrent.ConcurrentHashMap$HashEntry
14: 379889 18234672 java.util.HashMap$Entry
15: 378600 18172800 org.jboss.cache.lock.IdentityLock
16: 396538 15881688 [Ljava.lang.Object;
17: 384436 15377440 java.util.ArrayList
18: 381480 15259200 org.jboss.cache.Fqn
19: 378599 15143960 com.ssn.jbosscache.Meter
20: 378600 12115200 org.jboss.cache.lock.LockMap
21: 378600 9086400 org.jboss.cache.lock.ReadWriteLockWithUpgrade$WriterLock
22: 378600 9086400 org.jboss.cache.lock.ReadWriteLockWithUpgrade$ReaderLock
23: 378600 9086400 org.jboss.cache.util.concurrent.ConcurrentHashSet
24: 378600 9086400 org.jboss.cache.lock.LockStrategyReadCommitted
25: 21274 2804184
26: 21274 2560560
27: 1838 2036776
28: 33111 1620128
29: 1838 1356816
30: 1612 1315584
31: 2919 552944 [I
32: 2123 544336 [B
33: 2989 454328 java.lang.reflect.Method
34: 2008 369472 java.lang.Class
5. Re: JBoss Cache performance looks pretty poor :(

phpguy99 Jul 22, 2008 7:47 PM (in response to phpguy99)

I'm using:
CacheMode=REPL_SYNC
IsolationLevel=READ_COMMITTED
LockParentForChildInsertRemove=false
StateRetrievalTimeout=20000
SyncReplTimeout=20000
LockAcquisitionTimeout=15000

I put each of my objects into its own Fqn, so the ROOT has 1,000,000 direct children.

jdk: 1.6.0_07
OS: RHEL 5.1 on an 8-core Xeon
All my nodes are on the same subnet (and the same switch).
6. Re: JBoss Cache performance looks pretty poor :(

manik Jul 22, 2008 7:55 PM (in response to phpguy99)

Is all your data placed under the same Fqn? All locking and replication granularity happens on a per-Node (Fqn) basis. I would recommend making better use of the tree structure of the cache and spreading your state around a bit better.

Also, if you don't need the atomicity guarantees of REPL_SYNC (and there is not much you can do with it if you don't use transactions anyway) you're better off using REPL_ASYNC.

Cheers,
Manik
7. Re: JBoss Cache performance looks pretty poor :(

phpguy99 Jul 22, 2008 8:05 PM (in response to phpguy99)

That is why I place each of my objects into its own Fqn. I have 1,000,000 objects (Meters), and each of them is stored using:

Fqn fqn = new Fqn("root", "MeterID" + i);
Node<Object,Object> node = rootNode.addChild(fqn);
node.put("data", meter);

Does that mean I'm *over*doing it? Instead of using a flat, one-level "sunshine" structure, should I arrange my objects in something like a radix tree?

The reason I'm using REPL_SYNC is to be fair to EHCache, where I'm using SYNC too, and to Terracotta (write-lock).

Thanks for the quick response :)
8. Re: JBoss Cache performance looks pretty poor :(

jason.greene Jul 22, 2008 8:09 PM (in response to phpguy99)

You have to use square brackets with code since it's BBCode; see:
http://en.wikipedia.org/wiki/BBCode

BTW, in addition to Manik's suggestions, you are also hitting this bug (it will be fixed in the next 2.2 release), which is why your memory usage is so high:

http://jira.jboss.org/jira/browse/JBCACHE-1383
http://www.jboss.com/index.html?module=bb&op=viewtopic&t=138338

-Jason
9. Re: JBoss Cache performance looks pretty poor :(

manik Jul 22, 2008 8:11 PM (in response to phpguy99)

Spreading your data across the tree structure will help. The node structure is maintained using a CHM (ConcurrentHashMap) per Node to hold references to its children. These CHMs are tuned for a lower-than-normal memory footprint, which means that having lots of children per node will hurt concurrency. I'd recommend putting no more than 50 children under any node and going as deep as you have to.
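
A minimal sketch of one way to follow the 50-children guideline, assuming the objects have dense numeric IDs starting at 0. The class and method names (FqnSpreader, pathFor, depthFor) are made up for illustration and are not part of the JBoss Cache API. The sketch only computes the path elements, writing each ID as a fixed-width base-50 number with one digit per tree level:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JBoss Cache.
final class FqnSpreader {
    // Upper bound on children per node, taken from the advice above.
    static final int MAX_CHILDREN = 50;

    // Number of distinct IDs that fit into a tree of the given depth.
    static long capacity(int depth) {
        long c = 1;
        for (int i = 0; i < depth; i++) {
            c *= MAX_CHILDREN;
        }
        return c;
    }

    // Smallest depth whose capacity covers IDs 0 .. objectCount-1.
    static int depthFor(long objectCount) {
        int depth = 1;
        while (capacity(depth) < objectCount) {
            depth++;
        }
        return depth;
    }

    // Writes the ID as a fixed-width base-50 number, one digit per level.
    static List<String> pathFor(long id, int depth) {
        if (id < 0 || id >= capacity(depth)) {
            throw new IllegalArgumentException("id out of range: " + id);
        }
        String[] parts = new String[depth];
        long rest = id;
        for (int level = depth - 1; level >= 0; level--) {
            parts[level] = Long.toString(rest % MAX_CHILDREN);
            rest /= MAX_CHILDREN;
        }
        return Arrays.asList(parts);
    }

    public static void main(String[] args) {
        int depth = depthFor(1_000_000);
        System.out.println(depth);                                            // 4
        System.out.println("/" + String.join("/", pathFor(999_999, depth))); // /7/49/49/49
    }
}
```

Because the path is derived from the ID alone, a reader can recompute it before a get without any lookup table. The resulting elements would then be turned into an Fqn with whatever constructor or factory your JBoss Cache version provides.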

Also, re: your state retrieval timeout, 20000 ms (20 seconds) is pretty low; if you have a lot of state, there is no way you will be able to transfer it all in 20 seconds! :-)
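
To put rough numbers on that, here is a back-of-the-envelope sketch. Both inputs are assumptions: the 2GB figure is the heap usage reported earlier in the thread, used only as a stand-in for the serialized state size, and the 50 MB/s effective transfer rate is purely hypothetical. The class name is made up for illustration:

```java
import java.time.Duration;

// Hypothetical estimator; the inputs used in main are assumptions, not measurements.
final class StateTransferEstimate {
    // Time needed to move stateBytes at bytesPerSecond, rounded up to whole milliseconds.
    static Duration estimate(long stateBytes, long bytesPerSecond) {
        if (stateBytes < 0 || bytesPerSecond <= 0) {
            throw new IllegalArgumentException("invalid size or rate");
        }
        long millis = (stateBytes * 1000 + bytesPerSecond - 1) / bytesPerSecond;
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        long stateBytes = 2L << 30;       // 2 GB, stand-in for the serialized state (assumption)
        long bytesPerSecond = 50L << 20;  // 50 MB/s effective rate (hypothetical)
        Duration configured = Duration.ofMillis(20_000); // StateRetrievalTimeout from the posted config

        Duration needed = estimate(stateBytes, bytesPerSecond);
        System.out.println(needed.toMillis() + " ms needed");                 // 40960 ms needed
        System.out.println("fits in timeout: " + (needed.compareTo(configured) <= 0)); // false
    }
}
```

Under these assumed inputs the transfer alone would take about twice the configured 20 seconds, before counting serialization and deserialization time.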
10. Re: JBoss Cache performance looks pretty poor :(

phpguy99 Jul 22, 2008 8:20 PM (in response to phpguy99)

That is the kind of advice I like to hear. I may have missed it, but I don't recall reading about the proper way to spread objects across the tree to get maximum performance.
Thanks. I'll modify the code and the configuration, download the 2.2 beta, and test again.
11. Re: JBoss Cache performance looks pretty poor :(

manik Jul 22, 2008 8:24 PM (in response to phpguy99)

2.2 is in CR6 and very close to a GA release. :-)

If you want to have some more fun, check out 3.0.0.ALPHA which I recently released. Early benchmarks show that it is *much* faster.

http://jbosscache.blogspot.com/2008/07/jboss-cache-300-naga-first-alpha-now.html

12. Re: JBoss Cache performance looks pretty poor :(

phpguy99 Jul 23, 2008 1:14 PM (in response to phpguy99)

Really would like to have that fun but I'm evaluating it for production use in the next 3-4 months.
I downloaded 2.2 CR6 and changed:
StateRetrievalTimeout=600000 (10 minutes)
pbcast.GMS join_timeout="60000"

I haven't changed my code to spread objects further down the tree.
BTW, this seems odd, or I may have missed something, but shouldn't the cache system do this spreading behind the scenes? Depending on the keys of the cached objects, it could be difficult to balance the tree, and I'd have to know the "path" before I could do a "get". A straight "key" lookup is much simpler. (Just my 2 cents.)

Back to performance and memory.
It's stable now with 2 nodes. My insert rate has increased from 2000/s to 4000/s. That is one insert at a time with SYNC replication, i.e. 0.25ms/operation, which is very good, and multithreading should increase it by a lot (I hope).
But the memory consumption is still very high (maybe because I put everything right under "root"): 4GB for my 1M objects.

 num #instances #bytes class name
----------------------------------------------
 1: 16000897 768043056 java.util.concurrent.locks.ReentrantLock$NonfairSync
 2: 16000820 768039360 java.util.concurrent.ConcurrentHashMap$Segment
 3: 16000820 528806520 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
 4: 43792 323829848 [I
 5: 2021461 153453344 [C
 6: 1002432 152294120 [Ljava.util.HashMap$Entry;
 7: 1000052 152007808 [Ljava.util.concurrent.ConcurrentHashMap$Segment;
 8: 1000002 88000176 org.jboss.cache.UnversionedNode
 9: 2016545 80661800 java.lang.String
 10: 1000002 80000160 org.jboss.cache.invocation.NodeInvocationDelegate
 11: 1000052 72003744 java.util.concurrent.ConcurrentHashMap
 12: 1000002 72000144 org.jboss.cache.lock.NonBlockingWriterLock
 13: 1002113 64135232 java.util.HashMap
 14: 1001773 48085104 java.util.HashMap$Entry
 15: 1000281 48013488 java.util.concurrent.ConcurrentHashMap$HashEntry
 16: 1000002 48000096 org.jboss.cache.lock.IdentityLock
 17: 1008529 40619728 [Ljava.lang.Object;
 18: 1000771 40030840 java.util.ArrayList
 19: 1000004 40000160 org.jboss.cache.Fqn
 20: 1000002 40000080 java.util.RegularEnumSet
 21: 1000000 40000000 com.ssn.jbosscache.Meter (my objects)

I constantly see:

2008-07-23 10:09:47,929 [Incoming,JBossCache-Cluster,10.57.132.54:38174] WARN org.jgroups.protocols.pbcast.NAKACK.handleMessage - 10.57.132.54:38174] discarded message from non-member 10.57.132.53:33187, my view is [10.57.132.54:38174|0] [10.57.132.54:38174]

13. Re: JBoss Cache performance looks pretty poor :(

manik Jul 23, 2008 4:02 PM (in response to phpguy99)

"phpguy99" wrote:
Really would like to have that fun but I'm evaluating it for production use in the next 3-4 months.

I would still recommend trying it out. I may push out 3.0.0 fairly quickly (within the next 2 months); the major bits are ready and a lot of people are keen to start using it.

Either way, it should be a painless upgrade path from 2.2.0.

"phpguy99" wrote:

I haven't changed my code to spread objects further down the tree.
BTW, this seems odd, or I may have missed something, but shouldn't the cache system do this spreading behind the scenes? Depending on the keys of the cached objects, it could be difficult to balance the tree, and I'd have to know the "path" before I could do a "get". A straight "key" lookup is much simpler. (Just my 2 cents.)

I agree - but there are always 2 sides to that argument. Some people want the more direct control, some don't. It is on our roadmap as an option, and we do have an implementation that someone contributed that may even make it into 3.0.0.

See https://jira.jboss.org/jira/browse/JBCACHE-67
and
https://jira.jboss.org/jira/browse/JBCACHE-941.

Cheers
Manik
14. Re: JBoss Cache performance looks pretty poor :(

jason.greene Jul 23, 2008 5:58 PM (in response to phpguy99)
"phpguy99" wrote:

I haven't changed my code to spread objects further down the tree.
BTW, this seems odd, or I may have missed something, but shouldn't the cache system do this spreading behind the scenes? Depending on the keys of the cached objects, it could be difficult to balance the tree, and I'd have to know the "path" before I could do a "get". A straight "key" lookup is much simpler. (Just my 2 cents.)

Could you tell us about your access patterns? How often do you insert? How many simultaneous writers will you have on the same server/process? A node currently has 4 segments, so it is tuned to allow 4 simultaneous child node inserts. It would only become a bottleneck if you have more than 4 concurrent threads (on more than 4 CPU cores) all inserting at the same time.

If that is the case and you need to spread, you don't really need to do active balancing; a simple modulus-based spread would work fine. For example:
x = ID % 10000
fqn = /x/ID

We do plan to make this concurrency level configurable in 3.0, so you won't need to spread things out if you don't need to.
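
The modulus spread described above can be sketched in plain Java as follows. This assumes numeric IDs and only builds the path as a string; the class and method names are made up for illustration and are not part of the JBoss Cache API:

```java
// Hypothetical helper, not part of JBoss Cache.
final class BucketedPath {
    // Bucket count taken from the example above (ID % 10000).
    static final long BUCKETS = 10_000L;

    // Builds "/<bucket>/<id>", where bucket = id mod BUCKETS.
    static String fqnFor(long id) {
        long bucket = Math.floorMod(id, BUCKETS); // stays non-negative even for negative IDs
        return "/" + bucket + "/" + id;
    }

    public static void main(String[] args) {
        System.out.println(fqnFor(42L));        // /42/42
        System.out.println(fqnFor(1_234_567L)); // /4567/1234567
    }
}
```

Since the bucket is a pure function of the ID, a reader can rebuild the full path from the key alone before doing a get. With 1,000,000 sequential IDs, this layout puts 100 children under each of 10,000 buckets.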

"phpguy99" wrote:

Back to performance and memory.
It's stable now with 2 nodes. My insert rate has increased from 2000/s to 4000/s. That is one insert at a time with SYNC replication, i.e. 0.25ms/operation, which is very good, and multithreading should increase it by a lot (I hope).
But the memory consumption is still very high (maybe because I put everything right under "root"): 4GB for my 1M objects.

You are still experiencing JBCACHE-1383, which causes 16 CHM segments and locks to be created per node. The fix reduces that to 4. It is not yet in a release, although you can build the latest 2.2.x branch if you want. The MVCC locking mode in 3.0 completely eliminates the per-node overhead of the 4 CHM segments and locks, so you might want to give that a try.
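
The 16-segments-per-node figure can be cross-checked against the 1M-object jmap histogram posted earlier in the thread. The sketch below only does arithmetic on numbers copied from that histogram. It assumes that all ConcurrentHashMap segments, their locks, and their entry arrays belong to cache nodes (a slight overcount), and the projection for 4 segments is a simple proportional estimate rather than a measurement:

```java
// Arithmetic on figures copied from the 1M-object histogram in this thread.
final class SegmentOverhead {
    static final long NODES = 1_000_002L;               // org.jboss.cache.UnversionedNode instances
    static final long SEGMENTS = 16_000_820L;           // ConcurrentHashMap$Segment instances
    static final long SEGMENT_BYTES = 768_039_360L;     // ConcurrentHashMap$Segment
    static final long LOCK_BYTES = 768_043_056L;        // ReentrantLock$NonfairSync
    static final long ENTRY_ARRAY_BYTES = 528_806_520L; // ConcurrentHashMap$HashEntry[]

    // Average number of CHM segments per cache node.
    static double segmentsPerNode() {
        return (double) SEGMENTS / NODES;
    }

    // Segment-related bytes per cache node (segments + locks + entry arrays).
    static long overheadBytesPerNode() {
        return (SEGMENT_BYTES + LOCK_BYTES + ENTRY_ARRAY_BYTES) / NODES;
    }

    // Proportional projection of that overhead for a different segment count (assumption).
    static long projectedBytesPerNode(int newSegmentsPerNode) {
        return overheadBytesPerNode() * newSegmentsPerNode / Math.round(segmentsPerNode());
    }

    public static void main(String[] args) {
        System.out.println(Math.round(segmentsPerNode())); // 16
        System.out.println(overheadBytesPerNode());        // 2064
        System.out.println(projectedBytesPerNode(4));      // 516
    }
}
```

On these numbers, roughly 2KB of the reported 4KB per object comes from the segment structures, compared with about 40 bytes for each Meter object itself.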

"phpguy99" wrote:

I constantly see:
2008-07-23 10:09:47,929 [Incoming,JBossCache-Cluster,10.57.132.54:38174] WARN org.jgroups.protocols.pbcast.NAKACK.handleMessage - 10.57.132.54:38174] discarded message from non-member 10.57.132.53:33187, my view is [10.57.132.54:38174|0] [10.57.132.54:38174]

This could indicate you have other traffic on the same multicast address. You might want to make sure the nodes in your cluster are the only ones using that address.
