To answer part of your question: we are scheduled to have a 1.0 release in the middle of this month, with bug fixes and new features such as an eviction policy and AOP object graph handling. On the AOP side, it will also use the latest JBossAop that Bill has been working on.
I will be interested to hear some user experiences in this forum as well. :-)
Thanks for the info on the releases. I'll wait until then to implement, and will post my experience.
Excellent, we are just starting to use TreeCache this week (as a second level cache for hibernate). We are still in early development phases, and we were hoping it would be production ready by the time we hit rollout (several months from now).
Keep up the good work!
We (www.digijava.org) are trying to use JBossCache under Hibernate in production. Only ASYNC replication gave reasonable performance.
The number of cached objects we are testing with is about a hundred thousand.
We are having really serious problems with the existing LRU implementation. Setting the LRU parameter wakeUpIntervalSeconds to anything higher than 5 seconds gives unacceptable performance.
We get nice results with wakeUpIntervalSeconds=1, but I am not really happy with a process waking up every second.
The profiler shows that the bottleneck is put() in edu.oswego.cs.dl.util.concurrent.BoundedLinkedQueue.
"Something is rotten in the state of Denmark..."
Regarding performance, SYNC mode has to be slower than ASYNC since it is blocking. But if you have read-mostly data, it can be acceptable. Otherwise, use ASYNC if you can.
For LRU, basically there is a TimerTask thread that wakes up every x seconds to check the node event queue. Do you evict nodes often? Otherwise, I am a bit puzzled why it is slow.
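The wake-up-and-drain mechanism described above can be sketched like this. This is a minimal, self-contained illustration of the pattern, not the actual LRUPolicy code; the class and method names (EvictionTimer, recordPut, processQueue) are invented for the example.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of a timer-based eviction policy: every cache put() records a node
// event, and a TimerTask wakes up every wakeUpInterval to drain the queue.
class EvictionTimer {
    // put()s enqueue node FQNs here; the timer thread drains them periodically
    private final BlockingQueue<String> nodeEvents = new LinkedBlockingQueue<>();
    private final Timer timer = new Timer(true); // daemon thread

    void start(long wakeUpIntervalMillis) {
        timer.schedule(new TimerTask() {
            @Override public void run() { processQueue(); }
        }, wakeUpIntervalMillis, wakeUpIntervalMillis);
    }

    // called on every cache put(); must be cheap, since it is on the hot path
    void recordPut(String fqn) {
        nodeEvents.offer(fqn);
    }

    // drains all pending events; a real policy would update LRU bookkeeping
    // here and evict nodes exceeding maxNodes or timeToLiveSeconds
    void processQueue() {
        while (nodeEvents.poll() != null) {
            // eviction bookkeeping would go here
        }
    }

    int pendingEvents() { return nodeEvents.size(); }
}
```

Note that with this design the per-put cost is dominated by the queue's offer(), which is why a slow queue implementation hurts every put() in the cache.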
If you can provide more information, I will be glad to look into that.
thanks for replying.
Here's a graph of an LRU-enabled test benchmark. The only thing it does is put()s into JBossCache. The test is a standalone app with no interference from anything else, no Hibernate or whatever. This snapshot is with wakeUpTime=5sec.
We are doing the same with different wakeUpTimes and with more fine-grained data points (on this picture too many points were taken at the beginning).
I will post results (graphs, code, profiler output) as soon as we have them (in several more hours). At first glance, the LRU queue usage has some serious performance/scalability problems.
Here are the final results.
We ran a series of tests with the following values of wakeUpIntervalSeconds: 1, 3, 5, 10, 15. The default indicated in the sample code on your site was 5 seconds. In each series, several runs of put()s were performed: from 500 to 50,000 with a step of 1,000 (10 points for each wakeUpInterval series).
Following is the source that performs the whole cycle:
And sample configuration for 5 seconds:
You can also download the complete package with all the config files, JARs, Ant build script, and run.bat from here:
The memory settings used in the test were the same as in run.bat:
-Xms512M -Xmx800M
The machine used was a single-processor Intel Pentium 4 at 2.6GHz with Hyper-Threading enabled.
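For readers without the download, the put() loop described above might be sketched roughly as follows. This is a hypothetical reconstruction, not the actual test source: a ConcurrentHashMap stands in for TreeCache so the sketch stays self-contained, where the real test calls TreeCache's put() on node FQNs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Rough shape of the benchmark described in the post: timed series of put()s
// of increasing size. The Map is a stand-in; the real test would do
// cache.put("/test/node" + i, key, value) against a configured TreeCache.
class PutBenchmark {
    // performs `count` puts and returns elapsed wall-clock time in millis
    static long timePuts(Map<String, Object> cache, int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            cache.put("/test/node" + i, new Object());
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // the post sweeps sizes from 500 up to 50,000; a few sample points:
        for (int n : new int[] {500, 10_000, 50_000}) {
            Map<String, Object> cache = new ConcurrentHashMap<>();
            System.out.println(n + " puts: " + timePuts(cache, n) + " ms");
        }
    }
}
```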
Now the results.
Following is all the series together:
As you can see, not only is the time of each put() proportional to the number of objects (an inefficient algorithm), but the coefficient of that linear dependence increases drastically as wakeUpInterval grows. To get a feeling for the "speed", you can also see the proportionally scaled graph of the same results:
As you can see, at 15 seconds it is almost vertical!
Please also note the brown line representing the same test with LRU turned off: it is almost horizontal. Without LRU the cache performance per se is not bad; that is, if you can imagine a cache running without LRU in production, when the JVM has a hard limit of ~2GB of available memory :)
Following is just the LRU-turned-off test:
Then we did some profiling to get a feeling for what was going wrong.
Following are overall and zoomed-in snapshots:
Diving into one of the slow branches:
These are just first-glance conclusions, so they may not be 100% the reason, but this is what we think is wrong:
1. The org.jboss.cache.Fqn class uses a non-static logger, which means a logger instance is created every time someone calls new Fqn() or clone().
2. The toString() method in org.jboss.cache.Fqn is VERY slow (no surprise there: StringBuffer and other slow constructs are used). What seems odd to us is that org.jboss.cache.eviction.LRUPolicy uses code like Region region = regionManager_.getRegion(fqn.toString()); when Fqn could be used directly as a key in a java.util.Map (?)
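Both suggested fixes can be illustrated on a simplified Fqn-like class. SimpleFqn is invented for the example (it is not the real org.jboss.cache.Fqn), and java.util.logging stands in for whatever logging library the project actually uses: the logger becomes a single static field shared by all instances, and equals()/hashCode() let the object itself key a Map, avoiding the expensive toString() on every region lookup.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

// Simplified stand-in for an FQN (fully qualified name) class.
class SimpleFqn {
    // static: created once per class, not once per new SimpleFqn()/clone()
    private static final Logger log = Logger.getLogger(SimpleFqn.class.getName());

    private final List<String> elements;

    SimpleFqn(List<String> elements) {
        this.elements = List.copyOf(elements);
    }

    // equals/hashCode make the object usable directly as a Map key,
    // so callers never need the costly toString() for lookups
    @Override public boolean equals(Object o) {
        return o instanceof SimpleFqn && elements.equals(((SimpleFqn) o).elements);
    }

    @Override public int hashCode() { return elements.hashCode(); }

    public static void main(String[] args) {
        // regions keyed by the FQN object itself, instead of
        // regionManager.getRegion(fqn.toString())
        Map<SimpleFqn, String> regions = new HashMap<>();
        regions.put(new SimpleFqn(List.of("a", "b")), "regionAB");
        System.out.println(regions.get(new SimpleFqn(List.of("a", "b"))));
    }
}
```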
Thank you very much for all the effort. This is valuable information for me to troubleshoot. Just one quick question: what is the log level for org.jboss.cache, "DEBUG" or "INFO"? The default shipped with the package is "DEBUG", and that may have a significant impact on overall performance.
In the tests above the log level was INFO. You can find log4j.properties in the zip file that irakli posted, under the src directory.
Thanks for the info. I will take a look.
OK, I have fixed the performance bottleneck in LRUPolicy. I replaced the eviction queue implementation, switching from BoundedLinkedQueue to BoundedBuffer. I have also increased the initial capacity of the queue. Eventually, we will externalize the initial capacity in the next full release.
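For readers on modern JDKs, the swap is analogous to choosing between java.util.concurrent queue implementations: LinkedBlockingQueue and ArrayBlockingQueue are rough stand-ins for Doug Lea's BoundedLinkedQueue (node allocation on every offer) and BoundedBuffer (a pre-allocated array-based ring buffer). The names and capacity below are illustrative, not taken from the actual patch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the queue swap described above, using JDK equivalents of the
// edu.oswego.cs.dl.util.concurrent classes.
class EvictionQueueChoice {
    // the "initial capacity" knob that was later externalized; sized above
    // the expected burst of node events between timer wake-ups (illustrative)
    static final int CAPACITY = 200_000;

    // before: linked queue, allocates a node object on every offer()
    static BlockingQueue<String> linkedQueue() {
        return new LinkedBlockingQueue<>(CAPACITY);
    }

    // after: array-backed ring buffer, one allocation up front,
    // better and steadier offer() throughput on the put() hot path
    static BlockingQueue<String> arrayQueue() {
        return new ArrayBlockingQueue<>(CAPACITY);
    }
}
```

Since every cache put() enqueues an eviction event, the queue's offer() cost is paid on the hot path, which is why the array-backed choice matters here.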
I have run the example that you guys set up, and now the worst case (25 seconds) is about twice as slow as the run without the eviction policy turned on. So that's more reasonable. :-)
I will upload the patch, JBoss1_02, to the JBoss site tomorrow and then announce it here.
Thanks everyone for the help.
Wooow. Very impressive performance improvement.
Thanks a lot, Ben!