-
1. Re: What is the correct way to get all keySet of a cache
Martin Gencur May 21, 2012 3:39 PM (in response to dex chen)
Getting all keys via keySet() is not recommended because it's a dangerous method. It's not atomic and could result in inconsistencies, i.e. you could get an incomplete list if another thread on another node in the cluster adds new keys while you are retrieving the key set.
The proper solution is IMO to store the key set as a separate cache entry:
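import java.util.HashSet;
import java.util.Set;

// Collect the known keys and store them as one ordinary cache entry.
Set<String> keys = new HashSet<String>();
keys.add("key1");
// ...
keys.add("keyX");
cache.put(KNOWN_KEYS, keys);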
Then you can atomically get the whole set of keys.
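A minimal sketch of how such a key index could be kept up to date, assuming the cache can be treated as a plain java.util.concurrent.ConcurrentMap (a ConcurrentHashMap stands in for the clustered cache here). The class name KnownKeysIndex, the helper putTracked, and the reserved key "__known_keys__" are hypothetical names chosen for illustration. Note that the value and the index are still written in two separate steps, so a real deployment would probably need a transaction or lock around them.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class KnownKeysIndex {
    // Hypothetical reserved key under which the index entry lives.
    static final String KNOWN_KEYS = "__known_keys__";

    // Stores the value, then swaps in a new immutable copy of the key index.
    // The loop retries if another thread changed the index in between.
    static void putTracked(ConcurrentMap<String, Object> cache, String key, Object value) {
        cache.put(key, value);
        while (true) {
            Object old = cache.get(KNOWN_KEYS);
            Set<String> updated = new HashSet<String>(asKeySet(old));
            updated.add(key);
            Object frozen = Collections.unmodifiableSet(updated);
            boolean swapped = (old == null)
                    ? cache.putIfAbsent(KNOWN_KEYS, frozen) == null
                    : cache.replace(KNOWN_KEYS, old, frozen);
            if (swapped) {
                return;
            }
        }
    }

    // Reads the whole index with a single get(), so the caller sees one snapshot.
    static Set<String> knownKeys(ConcurrentMap<String, Object> cache) {
        return asKeySet(cache.get(KNOWN_KEYS));
    }

    @SuppressWarnings("unchecked")
    private static Set<String> asKeySet(Object entry) {
        return entry == null ? Collections.<String>emptySet() : (Set<String>) entry;
    }

    public static void main(String[] args) {
        // Local stand-in for the clustered cache (assumption for this sketch).
        ConcurrentMap<String, Object> cache = new ConcurrentHashMap<String, Object>();
        putTracked(cache, "key1", "a");
        putTracked(cache, "keyX", "b");
        System.out.println(knownKeys(cache));
    }
}
```

Every update rewrites the whole index entry, so the cost of each put grows with the number of keys.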
-
2. Re: What is the correct way to get all keySet of a cache
dex chen May 21, 2012 4:01 PM (in response to Martin Gencur)
Thanks for the suggestion and the insight.
The suggested approach does not scale. In my case, the number of keys could be up to 100K. This will result in extra replication across nodes.
I am aware of the potential "inconsistencies" and that the method is "not atomic". That is one of the reasons I posted an earlier question about having a cache-wide lock or making a cache read-only.
In my case, I have the cache configured to use sync replication, and I use a cluster-wide lock token to ensure that no cache items are added or deleted while I call cache.keySet().
Do you see any other reasons why the keySet() method on a cache is not recommended?
The comment in the source code does not state why the method is not recommended for production.
If that is the case, it seems the size() method will not be reliable either.
It seems to me that operations such as keySet(), values(), and size() are so fundamental to a cache that they have to be supported.
-
3. Re: What is the correct way to get all keySet of a cache
Vladimir Rodionov May 21, 2012 6:06 PM (in response to dex chen)
Some caches can have millions or even billions of keys, and this API is not suitable for very large caches, IMO (it will take too long and too much RAM to collect all keys from the cluster), but you can try, of course. I think that instead of returning a Set<>, this API call should return an iterator over the keys and should allow specifying a Filter object as well.
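A rough sketch of what an iterator-plus-filter style of key access could look like, assuming a plain java.util.Map as a local stand-in for the cache. The KeyFilter interface and the filteredKeys method are hypothetical names invented for this example, not part of the cache API discussed in this thread. The point is that keys are handed out one at a time and non-matching keys are never collected into a second set.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

final class FilteredKeys {
    // Hypothetical filter interface; a real API might name this differently.
    interface KeyFilter<K> {
        boolean accept(K key);
    }

    // Lazily walks the key set and yields only the keys the filter accepts.
    static <K> Iterator<K> filteredKeys(final Map<K, ?> cache, final KeyFilter<? super K> filter) {
        final Iterator<K> source = cache.keySet().iterator();
        return new Iterator<K>() {
            private K pending;
            private boolean ready;

            public boolean hasNext() {
                // Advance until a matching key is buffered or the source is exhausted.
                while (!ready && source.hasNext()) {
                    K candidate = source.next();
                    if (filter.accept(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            public K next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                K result = pending;
                pending = null;
                return result;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        // Local stand-in for the cache (assumption for this sketch).
        Map<String, Integer> cache = new TreeMap<String, Integer>();
        cache.put("user:1", 1);
        cache.put("order:7", 2);
        cache.put("user:2", 3);
        Iterator<String> it = filteredKeys(cache, k -> k.startsWith("user:"));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```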
-
4. Re: What is the correct way to get all keySet of a cache
dex chen May 21, 2012 8:19 PM (in response to Vladimir Rodionov)
I agree it could take a long time to get all keys. The memory is a different issue. The question is how a user can get the key set of a cache reliably.
-
5. Re: What is the correct way to get all keySet of a cache
Galder Zamarreño May 23, 2012 3:33 AM (in response to dex chen)
With replication mode, keySet() is not problematic, you can use it anytime really.
With distribution mode though, it only gives you a local view of the keys present in the cache. IOW, it doesn't go and try to find all keys in the distributed cache, since that could be lengthy.
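The difference between the two modes can be illustrated with a toy model, assuming each "node" is just a local HashMap. The class LocalViewDemo and its methods are hypothetical, and the single-owner hash placement below is a simplification made for this sketch; it is not how the real cluster assigns keys to nodes. It only shows why a node-local keySet() is complete when every node holds every entry and partial when entries are spread across nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LocalViewDemo {
    // Creates a toy cluster: one plain map per node.
    static List<Map<String, String>> newCluster(int nodeCount) {
        List<Map<String, String>> nodes = new ArrayList<Map<String, String>>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<String, String>());
        }
        return nodes;
    }

    // Replication: every node stores every entry.
    static void putReplicated(List<Map<String, String>> nodes, String key, String value) {
        for (Map<String, String> node : nodes) {
            node.put(key, value);
        }
    }

    // Distribution (simplified to one owner per key, chosen by hash).
    static void putDistributed(List<Map<String, String>> nodes, String key, String value) {
        int owner = Math.floorMod(key.hashCode(), nodes.size());
        nodes.get(owner).put(key, value);
    }

    // Cluster-wide view: the union of every node's local key set.
    static Set<String> allKeys(List<Map<String, String>> nodes) {
        Set<String> union = new HashSet<String>();
        for (Map<String, String> node : nodes) {
            union.addAll(node.keySet());
        }
        return union;
    }

    public static void main(String[] args) {
        List<Map<String, String>> replicated = newCluster(3);
        List<Map<String, String>> distributed = newCluster(3);
        for (String key : new String[] {"a", "b", "c", "d"}) {
            putReplicated(replicated, key, "v");
            putDistributed(distributed, key, "v");
        }
        System.out.println("replicated, node 0 local keys: " + replicated.get(0).keySet());
        System.out.println("distributed, node 0 local keys: " + distributed.get(0).keySet());
        System.out.println("distributed, all keys: " + allKeys(distributed));
    }
}
```

In the distributed case, getting the full key set means visiting every node, which is the lengthy step the local keySet() avoids.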
-
6. Re: What is the correct way to get all keySet of a cache
dex chen May 23, 2012 1:25 PM (in response to Galder Zamarreño)
Galder: that's what I wanted to get confirmed. Thanks.