I think the biggest problem will be the in-memory merge and OOMs. One of the primary reasons to use a cache loader is that you have more state in the cache than fits in memory. :-)
Let's think about why we bother with the !exists test in memory first. If this is just an optimisation so that we don't have to write state to the DB when it already exists, then in this case the optimisation hinders rather than helps. We should just write *everything* to the CL.
The other reason why you may not want to write everything to the CL is if you are using passivation. In that case, state that is in memory should not be in the CL.
So, perhaps this is what we need to do (if we are using a JDBC cache loader only):
1. If !using passivation, write all state to the DB, regardless of whether it exists in memory or not.
2. If using passivation, when attempting to deserialize state to put into your batch, ignore statements which pertain to Fqns that are in memory.
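To make the two rules above concrete, here is a minimal Java sketch. It is not JBoss Cache code: `PersistentStateFilter`, `Entry` and `selectForWrite` are hypothetical names, Fqns are represented as plain strings, and the set of in-memory Fqns is assumed to be available to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of JBoss Cache.
final class PersistentStateFilter {

    /** Hypothetical stand-in for one persisted node: its Fqn plus serialized data. */
    static final class Entry {
        final String fqn;
        final byte[] data;

        Entry(String fqn, byte[] data) {
            this.fqn = fqn;
            this.data = data;
        }
    }

    /**
     * Selects the incoming persistent-state entries that should go into the DB batch.
     * Rule 1: without passivation, every entry is written, with no exists() check.
     * Rule 2: with passivation, entries whose Fqn is already in memory are skipped.
     */
    static List<Entry> selectForWrite(List<Entry> incoming,
                                      Set<String> inMemoryFqns,
                                      boolean usingPassivation) {
        if (!usingPassivation) {
            return new ArrayList<>(incoming);
        }
        List<Entry> toWrite = new ArrayList<>();
        for (Entry e : incoming) {
            if (!inMemoryFqns.contains(e.fqn)) {
                toWrite.add(e);
            }
        }
        return toWrite;
    }
}
```

In a streaming implementation the same test would be applied entry by entry as the state is deserialized, rather than on a fully materialised list, so that the whole persistent state never has to sit in memory at once.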
I agree about having configurable batch size limits, with perhaps a 1k batch size default.
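A rough sketch of what a configurable batch size limit could look like, in Java. `BoundedBatchWriter` and its `flusher` callback are hypothetical; in a JDBC cache loader the callback would presumably add each item with `PreparedStatement.addBatch()` and then call `executeBatch()`. The default of 1000 is the 1k suggested above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of JBoss Cache.
final class BoundedBatchWriter<T> {

    /** Default limit, matching the 1k batch size suggested in the discussion. */
    static final int DEFAULT_BATCH_SIZE = 1000;

    private final int batchSize;
    private final Consumer<List<T>> flusher;
    private final List<T> pending = new ArrayList<>();

    BoundedBatchWriter(int batchSize, Consumer<List<T>> flusher) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.flusher = flusher;
    }

    /** Buffers one item and flushes as soon as the limit is reached. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Hands any buffered items to the flusher; call once more after the last add(). */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flusher.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

With a limit in place, at most `batchSize` deserialized entries are held in memory at any time, which addresses the OOM concern raised at the start.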
Isn't the idea of state transfer that the recipient has no state in the region being transferred; i.e. it's a complete replace?
If so, then for sure there's no need for an exists check.
And, if so, then in the passivation case it's the responsibility of the sender to properly segregate the in-memory nodes from the passivated nodes (i.e. it's a bug if it isn't that way already). So, no need to check if a node in the persistent state is in memory before writing it.
There could be state on the recipient, since in-memory state is integrated before persistent state. This is why I suspected that the exists() check is an optimisation, since integrating in-memory state will result in a cache loader put() as well.
And as for passivation, if the state is in-memory (after integrating in-memory state), it won't (and shouldn't) be in the cacheloader.
Ah, now I understand. Yeah, that makes sense.