-
1. Re: Recommendation for collections with large number of item
manik Dec 20, 2008 6:53 AM (in response to kringdahl)
There hasn't been any specific testing or benchmarking for the best approach with which to store large collections, but I suspect the best may be something like the following:
1. Create a facade implementation of your collection, let's say a List.
2. The facade creates a node in the cache for the list, say /lists/mylist
3. Each list item is stored as a child node under the node in question, e.g., /lists/mylist/uuid
4. /lists/mylist contains metadata about the collection, including a mapping of location in the list to uuid, as well as list size.
This way, you get the fine-grained replication needed, as the only nodes changed when adding or removing elements are /lists/mylist and the specific sub-node added or removed. Retrieval and iteration should be efficient as well, since you just need to query /lists/mylist and each subnode in turn. Sizing operations would be fast as well, since the size is kept in the metadata. The only thing that would be slow is List.contains(), but IMO that is a linear scan even in a JDK LinkedList or ArrayList.
Yes, this is a lot of extra hoops to jump through, but to gain performance with large collections, this may be the most efficient thing you can do.
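The four steps above can be sketched roughly as follows. This is only an illustration, not tested against a real cache: NodeBackedList and its field names are hypothetical, and a plain Map from node path to node attributes stands in for the cache tree so that the example runs on its own. With a real cache, each map operation would become the corresponding cache call, and the updated position-to-uuid mapping would have to be written back to /lists/mylist so the change is actually replicated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.UUID;

// Hypothetical list facade. The Map (node path -> node attributes) is an
// assumed stand-in for the cache tree, not the real cache API.
class NodeBackedList<E> {
    private static final String ORDER = "order"; // metadata: list position -> uuid
    private static final String VALUE = "value"; // attribute holding the item

    private final Map<String, Map<String, Object>> tree;
    private final String listPath;

    NodeBackedList(Map<String, Map<String, Object>> tree, String listPath) {
        this.tree = tree;
        this.listPath = listPath;
        // Step 2: create the node for the list itself, e.g. /lists/mylist.
        Map<String, Object> meta = new HashMap<>();
        meta.put(ORDER, new ArrayList<String>());
        tree.putIfAbsent(listPath, meta);
    }

    // Step 4: the list node holds the ordering metadata.
    @SuppressWarnings("unchecked")
    private List<String> order() {
        return (List<String>) tree.get(listPath).get(ORDER);
    }

    // Touches only the list node and one new child node.
    void add(E item) {
        String uuid = UUID.randomUUID().toString();
        Map<String, Object> node = new HashMap<>();
        node.put(VALUE, item);
        tree.put(listPath + "/" + uuid, node); // Step 3: /lists/mylist/uuid
        order().add(uuid);
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        String uuid = order().get(index);
        return (E) tree.get(listPath + "/" + uuid).get(VALUE);
    }

    // Touches only the list node and the one child node being removed.
    @SuppressWarnings("unchecked")
    E remove(int index) {
        String uuid = order().remove(index);
        return (E) tree.remove(listPath + "/" + uuid).get(VALUE);
    }

    // Size comes from the metadata alone; no child node is visited.
    int size() {
        return order().size();
    }

    // Linear scan over every child node, hence the slow operation.
    boolean contains(Object o) {
        for (String uuid : order()) {
            if (Objects.equals(o, tree.get(listPath + "/" + uuid).get(VALUE))) {
                return true;
            }
        }
        return false;
    }
}
```

Here size is derived from the ordering metadata rather than stored as a separate attribute; keeping an explicit counter on /lists/mylist would work just as well.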
I'm sure others will appreciate it if you started a wiki page on this and posted results of your findings/experiments and maybe even code for such collection facades.
HTH,
Manik
-
2. Re: Recommendation for collections with large number of item
kringdahl Dec 22, 2008 4:23 PM (in response to kringdahl)
Thanks Manik. We'll certainly be doing some experimentation here as it is a problem we need to address in the near term. When we've settled on an approach, I will post some results here.
-
3. Re: Recommendation for collections with large number of item
jason.greene Dec 22, 2008 11:49 PM (in response to kringdahl)
Can you qualify what you mean by performance? It sounds like you are measuring full collection attach time, and not individual collection element operations.
-
4. Re: Recommendation for collections with large number of item
kringdahl Dec 23, 2008 12:10 AM (in response to kringdahl)
Right, performance is a very general term. For my purposes, I'll define it as how quickly I can add or remove a single element from a collection of several hundred. That should be a relatively quick operation. But, referring back to my earlier post, using a native collection (such as a List or Set) is a non-starter due to the massive amount of CPU used weaving the collection.