[ModeShape 5.x] Caching strategy
illia.khokholkov Sep 15, 2016 11:35 AM
I am struggling to find a description of the caching mechanism used by ModeShape 5.x. The documentation [1] is great, but unfortunately it did not help me get the gist of the caching policies. I was only able to find the notes presented below, which are certainly helpful, but not complete.
Clustering [2]:
A cluster in this model can have any number of members, each with its own in-memory cache but all using a shared database for persisting and reading the content.
Persistence [3]:
ModeShape 3 and 4 used, in addition to the main Infinispan cache which stored the repository data, a second, local, in-memory cache for each repository workspace in order to provide fast read access to frequently used nodes. This cache exists solely for performance reasons and ModeShape 5 preserves the concept, using an LRU ConcurrentMap implementation.
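For illustration only, here is a minimal sketch of what a bounded LRU map looks like in plain Java. It is not ModeShape's implementation: the class name `LruCacheSketch` and the `maxEntries` parameter are hypothetical, and it swaps in a synchronized `LinkedHashMap` in access order rather than a true `ConcurrentMap`.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded LRU cache; NOT ModeShape's actual workspace cache class. */
final class LruCacheSketch<K, V> {
    private final Map<K, V> entries;

    LruCacheSketch(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently accessed.
        this.entries = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    // Evict the least recently used entry once the bound is exceeded.
                    return size() > maxEntries;
                }
            });
    }

    V get(K key) {
        return entries.get(key); // counts as an access and refreshes recency
    }

    void put(K key, V value) {
        entries.put(key, value);
    }

    boolean contains(K key) {
        return entries.containsKey(key); // does not refresh recency
    }

    int size() {
        return entries.size();
    }
}
```

In this sketch an entry is added on `put`, and removed only when a later `put` pushes the map over its bound. Nothing in it refreshes an entry whose backing data has changed elsewhere.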
Repository and Session [4]:
ModeShape uses the copy-on-write behavior. Note that this is different than ModeShape 2.x, which used copy-on-read.
Knowing that my understanding of caching behavior is practically non-existent, please consider the scenario presented below.
Cluster members: M1, M2.
Application consumers: C1, C2.
Node: N.
- C1 creates N in M1. N gets persisted in the DB, the cache in M1 is now aware of N, and a notification about the new node is sent to M2 via JGroups.
- M2 sees the message from M1. However, its cache does not contain N, so nothing has to be refreshed.
- C2 gets N, but does so using M2. Since this is a read-only operation, no notifications to other members get sent.
- M2 loads N in its cache.
- C1 updates N in M1 by changing some of its properties. A change notification is about to be sent to M2.
- M1 loses network connectivity before the JGroups message about the node update is sent to M2.
- C2 gets N from M2, expecting to see the changes made by C1. However, nothing has changed, because N was already in the cache of M2 and no update notification was received.
- M2 now has stale data, i.e. N is no longer current. Is the M2 cache entry for N ever going to be updated, assuming M1 no longer updates N (so that no update notification gets sent, even if network connectivity on M1 is re-established) and M2 only reads N?
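The scenario above can be reproduced with a toy model. This is only a sketch of the question, not a description of how ModeShape actually behaves: the `Member` class, its methods, and the "drop the cached entry when a notification arrives" rule are all assumptions made for illustration, and a plain map stands in for the shared database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class StaleCacheScenario {

    /** Hypothetical cluster member: a local cache in front of a shared store. */
    static final class Member {
        private final Map<String, String> sharedDb;
        private final Map<String, String> localCache = new ConcurrentHashMap<>();

        Member(Map<String, String> sharedDb) {
            this.sharedDb = sharedDb;
        }

        /** Persist to the shared store, then cache locally. */
        void write(String nodeId, String value) {
            sharedDb.put(nodeId, value);
            localCache.put(nodeId, value);
        }

        /** Serve from the local cache; load from the shared store only on a miss. */
        String read(String nodeId) {
            return localCache.computeIfAbsent(nodeId, sharedDb::get);
        }

        /** ASSUMED reaction to a change notification: drop the cached entry. */
        void onChangeNotification(String nodeId) {
            localCache.remove(nodeId);
        }
    }

    /** Returns what C2 sees through M2 before and after C1's update. */
    static String[] run(boolean notificationDelivered) {
        Map<String, String> db = new ConcurrentHashMap<>();
        Member m1 = new Member(db);
        Member m2 = new Member(db);

        m1.write("N", "v1");                 // C1 creates N in M1
        m2.onChangeNotification("N");        // M2 is notified, nothing cached yet
        String firstRead = m2.read("N");     // C2 reads N via M2, M2 caches it
        m1.write("N", "v2");                 // C1 updates N in M1
        if (notificationDelivered) {
            m2.onChangeNotification("N");    // skipped when M1 loses connectivity
        }
        String secondRead = m2.read("N");    // C2 reads N via M2 again
        return new String[] {firstRead, secondRead};
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", run(false))); // notification lost
        System.out.println(String.join(",", run(true)));  // notification delivered
    }
}
```

In this model, `run(false)` leaves M2 serving `v1` indefinitely, because nothing other than a notification or an eviction ever removes N from its cache. Whether ModeShape has an additional mechanism that prevents this is exactly the question.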
What are the caching policies regarding the following?
- Adding a new node to the cache (i.e. when exactly a new entry gets added).
- Removing a node from the cache (i.e. under what conditions an entry gets removed).
- Refreshing an already cached node (i.e. when an entry gets refreshed, e.g., periodically, on write, etc.).
- There is a property called "cacheSize" under the "workspaces" entry in the repository JSON configuration file. Is it related to the caching of the JCR nodes a consumer works with directly, or to internal caching that ModeShape does under the hood for some kind of system nodes? Furthermore, if that value is set to 0, does it effectively disable the caching, i.e. force every read to go to the underlying DB?
Many thanks in advance, any help is greatly appreciated. My apologies if I missed the part of the official documentation that explains exactly what I want to know about caching.
[1] Home - ModeShape 5 - Project Documentation Editor
[2] Clustering - ModeShape 5 - Project Documentation Editor
[3] Persistence - ModeShape 5 - Project Documentation Editor
[4] Repository and Session - ModeShape 5 - Project Documentation Editor