I would like your feedback on a new locking strategy design. This design,
if ratified, would be targeted for a future version of JBoss Cache (not 2.0.0)
Modern databases are able to achieve high concurrency by ensuring that
readers are never blocked by writers. Our current pessimistic locking
strategy does not offer this precondition, which leads to poor
performance (lock contention, upgrade exceptions, timeouts, etc) when
there is a mixture of reads and writes on the same node (Pretty much the
use case of PojoCache). Also, the current pessimistic model also has a
dirty-read race condition with READ COMMITTED.
Optimistic locking solves the concurrency problems by implementing full
MVCC, this however has other problems. It has memory overhead, since
every tx has to keep its own node copy. It can't be used with write
locks on reads (SELECT FOR UPDATE, Option.setForceWriteLock()) and for
the same reason it can't implement the SERIALIZATION isolation level.
Last, it has to prevent overwrites (rollback exception), which while
useful in some cases, it's not always the desired behavior.
My proposed solution is capable of all isolation levels, high read
concurrency, and has a relatively low memory footprint. This is done by
using limited global versioning with write locking, and MVCC for reads:
An incoming reader is notated as R, a writer as W, the current version
as v, and the node state is a function N(v). Essentially at any given
time a reader will see N(v). Once a write occurs, we copy the node,
N(v)' which only the writer sees. So in short we have:
R = N(v)
W = N(v)'
There can only ever be one writer (one write lock), so therefore only
one instance of N(v)'. This gives us the essential property "readers
never blocked by writers" as well as prevents dirty reads.
Once the writer TX commits, the new version is visible to new readers,
this will of course require a short lived atomic operation for the swap,
which gives us READ COMMITTED:
N(v + 1) = N(v)'
v = v + 1
If REPEATABLE READ is desired, on the first read of a node, the TX holds
a reference to that specific version (this is just a reference! no
copying, so everyone that hits the same version shares the same
version), z. This version is used for all read calls, until either a
write occurs within the same TX or the TX ends.
z = v
RR = N(z)
When a RR TX obtains a write lock on a node we toss the stale version,
N(z), (which is eventually GCd). The same semantics in READ COMMITTED
are followed, N(v)' is created as a copy of N(v). Reads and writes from
the write lock holding TX then use N(v)' This gives us SELECT FOR
READ UNCOMMITTED can be implemented by either substituting READ
COMMITTED (which is allowed), or by not creating an N(v)' for writes as
SERIALIZABLE can be implemented by just obtaining a write lock on all
reads, N(v)' is not needed as well.
So, in summary, all isolation modes with the exception of SERIALIZABLE
will never block on another TX for a read (unless setForceWriteLock() is
used on the read). Memory footprint will only be 1 additional copy per
active writer in READ COMMITTED, and in the case of REPEATABLE READ only
1 additional copy per write commit that is visible to a reading TX.
I created a nice table to outline the differences between the previous
locking modes, and this proposal. I refer to the proposal
as NL (New Locking)
Item PL OL NL --------------------------------------------------------------- Non-Blocking Reads N Y Y All Isolation Modes Y N Y SELECT FOR UPDATE Y N Y Prevents write-skew in RR Y Y Y (Optional) Low memory overhead Y N Y No comparison phase Y N Y No copy on read Y N Y No copy on write Y N Y (Only RU + S) Fail-fast version checking N/A N Y (Optional) 1. READ COMMITTED has to use a RW lock to be reliable, this isn't the case now (dirty read race). 2. SERIALIZABLE is not possible