Tree versus flat hashmap cache structure

nonius Oct 27, 2008 1:45 PM

We are studying the internals of JBossCache and we are wondering what exactly are the performance reasons that made you choose to struture cache as a tree rather than as a flat hastable.

In this tutorial:
http://www.jboss.org/file-access/default/members/jbosscache/freezone/docs/1.4.0/TreeCache/en/html_single/index.html#d0e100

We found this paragraph:

The first version of TreeCache was essentially a single HashMap that replicated. However, the decision was taken to go with a tree structured cache because (a) it is more flexible and efficient and (b) a tree can always be reduced to a map, thereby offering both possibilities. The efficiency argument was driven by concerns over replication overhead, and was that a value itself can be a rather sophisticated object, with aggregation pointing to other objects, or an object containing many fields. A small change in the object would therefore trigger the entire object (possibly the transitive closure over the object graph) to be serialized and propagated to the other nodes in the cluster. With a tree, only the modified nodes in the tree need to be serialized and propagated. This is not necessarily a concern for TreeCache, but is a vital requirement for PojoCache (as we will see in the separate PojoCache documentation).

We agree on the fact that a tree structure can be always reduced to a hashtable, but we do not see particular difficulties in handling with incremental updates to replicate a hashtable.

We could not find a previous discussion on this topic in the forum. So, we would appreciate if someone can help us in better understanding the reasons underlying this design choice.

Thanks in advance,
nuno