Getting all keys via keySet is not recommended because it is a dangerous method. It is not atomic and can return inconsistent results, i.e. you could get an incomplete list if another thread on another node in the cluster adds new keys while the key set is being collected.
The proper solution, IMO, is to store the key set as a separate cache entry:
Set<String> keys = new HashSet<String>();
Then you can atomically get the whole set of keys.
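A minimal sketch of this pattern, using a plain ConcurrentMap as a stand-in for the replicated cache (the class and entry names are hypothetical; in a real cluster the two puts would run inside a transaction or under a cluster-wide lock):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: the cache maintains its own key set under a
// well-known entry, so readers can fetch the whole set with one atomic get.
public class KeySetEntryCache {
    private static final String KEY_SET_ENTRY = "__keys__";
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    public KeySetEntryCache() {
        cache.put(KEY_SET_ENTRY, new HashSet<String>());
    }

    // Update the value and the key-set entry together; synchronized here
    // stands in for the transaction/lock a replicated cache would need.
    @SuppressWarnings("unchecked")
    public synchronized void put(String key, Object value) {
        cache.put(key, value);
        Set<String> keys = new HashSet<>((Set<String>) cache.get(KEY_SET_ENTRY));
        keys.add(key);
        cache.put(KEY_SET_ENTRY, keys); // replace with an updated copy
    }

    // A single get returns a consistent snapshot of all keys.
    @SuppressWarnings("unchecked")
    public synchronized Set<String> keys() {
        return new HashSet<>((Set<String>) cache.get(KEY_SET_ENTRY));
    }
}
```

The trade-off is that every write now also rewrites the key-set entry, which is what the follow-up below objects to for large key counts.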
Thanks for the suggestion and the insight.
The suggested approach does not scale. In my case, the number of keys could be up to 100K, so storing them all in a single entry would cause extra replication traffic across the nodes.
I am aware of the potential inconsistencies and the lack of atomicity. That is one of the reasons I asked earlier about having a cache-wide lock or making the cache read-only.
In my case, the cache is configured for synchronous replication, and I use a cluster-wide lock token to ensure that no cache items are added or deleted while I call cache.keySet().
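The lock-token approach described above might be sketched roughly like this (a simplified stand-in using ConcurrentMap; all names are illustrative, and a real cluster would acquire the token as a replicated cache entry via a conditional put):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LockTokenSnapshot {
    private static final String LOCK_TOKEN = "__lock__";
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    // Writers back off while the token is held. (Simplified check-then-act;
    // a real implementation would need a stronger protocol.)
    public boolean put(String key, Object value) {
        if (cache.containsKey(LOCK_TOKEN)) {
            return false;
        }
        cache.put(key, value);
        return true;
    }

    // The conditional put stands in for cluster-wide lock acquisition.
    public Set<String> snapshotKeys(String owner) {
        if (cache.putIfAbsent(LOCK_TOKEN, owner) != null) {
            throw new IllegalStateException("lock token already held");
        }
        try {
            Set<String> keys = new HashSet<>(cache.keySet());
            keys.remove(LOCK_TOKEN); // the token itself is not user data
            return keys;
        } finally {
            cache.remove(LOCK_TOKEN, owner); // release only our own token
        }
    }
}
```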
Do you see any other reasons why the keySet() method on the cache is not recommended?
The comment in the source code does not state why the method is not recommended for production.
Then it seems the size() method will not be reliable either.
It seems to me that operations such as keySet(), values(), and size() on a cache are so fundamental that they have to be supported.
Some caches can have millions or even billions of keys, and this API is not suitable for very large caches, IMO (it would take too long and too much RAM to collect all keys from the cluster), but you can try, of course. I think that instead of returning a Set<>, this API call should return an iterator over the keys, and it must allow specifying a Filter object as well.
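A hedged sketch of what such an iterator-plus-filter API might look like, shown against a plain Map (the method and class names are illustrative, not an existing cache API); the keys matching the filter are yielded lazily instead of being materialized as a whole Set:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

public class FilteredKeyIterator {
    // Lazily yields only the keys that match the filter, without copying
    // the entire key set into memory at once.
    public static <K, V> Iterator<K> keys(Map<K, V> cache, Predicate<K> filter) {
        return cache.keySet().stream().filter(filter).iterator();
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = new ConcurrentHashMap<>();
        cache.put("user:1", 1);
        cache.put("user:2", 2);
        cache.put("session:9", 9);
        Iterator<String> it = keys(cache, k -> k.startsWith("user:"));
        int count = 0;
        while (it.hasNext()) { it.next(); count++; }
        System.out.println(count); // 2
    }
}
```

In a clustered cache the filter would ideally be shipped to each node so that only matching keys travel over the network.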