8 Replies Latest reply on Jul 27, 2007 7:59 AM by manik

    New Locking Strategy Proposal for JBoss Cache

    jason.greene

      Hello everyone,

      I would like your feedback on a new locking strategy design. This design,
      if ratified, would be targeted for a future version of JBoss Cache (not 2.0.0)

      Modern databases are able to achieve high concurrency by ensuring that
      readers are never blocked by writers. Our current pessimistic locking
      strategy does not offer this precondition, which leads to poor
      performance (lock contention, upgrade exceptions, timeouts, etc) when
      there is a mixture of reads and writes on the same node (Pretty much the
      use case of PojoCache). Also, the current pessimistic model also has a
      dirty-read race condition with READ COMMITTED.

      Optimistic locking solves the concurrency problems by implementing full
      MVCC, this however has other problems. It has memory overhead, since
      every tx has to keep its own node copy. It can't be used with write
      locks on reads (SELECT FOR UPDATE, Option.setForceWriteLock()) and for
      the same reason it can't implement the SERIALIZATION isolation level.
      Last, it has to prevent overwrites (rollback exception), which while
      useful in some cases, it's not always the desired behavior.

      My proposed solution is capable of all isolation levels, high read
      concurrency, and has a relatively low memory footprint. This is done by
      using limited global versioning with write locking, and MVCC for reads:

      An incoming reader is notated as R, a writer as W, the current version
      as v, and the node state is a function N(v). Essentially at any given
      time a reader will see N(v). Once a write occurs, we copy the node,
      N(v)' which only the writer sees. So in short we have:

      R = N(v)
      W = N(v)'

      There can only ever be one writer (one write lock), so therefore only
      one instance of N(v)'. This gives us the essential property "readers
      never blocked by writers" as well as prevents dirty reads.

      Once the writer TX commits, the new version is visible to new readers,
      this will of course require a short lived atomic operation for the swap,
      which gives us READ COMMITTED:
      N(v + 1) = N(v)'
      v = v + 1

      If REPEATABLE READ is desired, on the first read of a node, the TX holds
      a reference to that specific version (this is just a reference! no
      copying, so everyone that hits the same version shares the same
      version), z. This version is used for all read calls, until either a
      write occurs within the same TX or the TX ends.

      z = v
      RR = N(z)

      When a RR TX obtains a write lock on a node we toss the stale version,
      N(z), (which is eventually GCd). The same semantics in READ COMMITTED
      are followed, N(v)' is created as a copy of N(v). Reads and writes from
      the write lock holding TX then use N(v)' This gives us SELECT FOR
      UPDATE.

      READ UNCOMMITTED can be implemented by either substituting READ
      COMMITTED (which is allowed), or by not creating an N(v)' for writes as
      an optimization.

      SERIALIZABLE can be implemented by just obtaining a write lock on all
      reads, N(v)' is not needed as well.

      So, in summary, all isolation modes with the exception of SERIALIZABLE
      will never block on another TX for a read (unless setForceWriteLock() is
      used on the read). Memory footprint will only be 1 additional copy per
      active writer in READ COMMITTED, and in the case of REPEATABLE READ only
      1 additional copy per write commit that is visible to a reading TX.

      I created a nice table to outline the differences between the previous
      locking modes, and this proposal. I refer to the proposal
      as NL (New Locking)

      Item PL OL NL
      ---------------------------------------------------------------
      Non-Blocking Reads N Y Y
      All Isolation Modes Y[1] N[2] Y
      SELECT FOR UPDATE Y N Y
      Prevents write-skew in RR Y Y Y (Optional)
      Low memory overhead Y N Y
      No comparison phase Y N Y
      No copy on read Y N Y
      No copy on write Y N Y (Only RU + S)
      Fail-fast version checking N/A N Y (Optional)
      
      
      1. READ COMMITTED has to use a RW lock to be reliable,
      this isn't the case now (dirty read race).
      2. SERIALIZABLE is not possible
      


      Thoughts?

        • 1. Re: New Locking Strategy Proposal for JBoss Cache
          fredrikj

          Allthough I'm not familiar enough with the code to understand every implication of the proposed implementation, I think that the idea is sound.

          I encountered blocking reads in our application which need to be addressed. In the end I constructed a local cache in front of the distributed cache. The local cache is updated by the distributed in a fashion where we never block readers. (Optimistically locking was not a perfect fit here, I might revise that in the future if I get the time).

          The blocking encountered was not only due to write locked nodes but also synchronization blocks. However, the local cache solved the issues and I now get very performant concurrent reading from the cache layers.

          If I haven't gotten this all backwards this is a bit similar to what you propose. And the implications by this is that you would infact be solving a real problem that at least we had to deal with. I think that given the application areas for a cache, fast, non-blocking read access is definitely crucial. And I would love to see it implemented in the core cache instead of a tacked on solution like mine. =)





          • 2. Re: New Locking Strategy Proposal for JBoss Cache
            manik

            Ok, after much discussion over the last 2 days with Jason et al in Austin, we've hashed out more details on this.

            I've documented this on this wiki page, and would like feedback/thoughts on this.

            http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossCacheMVCC

            • 3. Re: New Locking Strategy Proposal for JBoss Cache
              gavin.king

               

              "jason.greene@jboss.com" wrote:
              Hello everyone,

              I would like your feedback

              Thoughts?


              I think the colored lights on your laptop are great!

              • 4. Re: New Locking Strategy Proposal for JBoss Cache
                galder.zamarreno

                Manik, thanks for the wiki, it's clearer now :)

                For curiosity, how would you enforce SERIALIZABLE? It's quite an edge case but I guess it's exclusively locking read operations and the rest as it is, correct?

                • 5. Re: New Locking Strategy Proposal for JBoss Cache
                  jason.greene

                   

                  "gavin.king@jboss.com" wrote:
                  "jason.greene@jboss.com" wrote:
                  Hello everyone,

                  I would like your feedback

                  Thoughts?


                  I think the colored lights on your laptop are great!


                  LOL thanks Gavin, they are almost as cool as the seam blog demo color scheme :)



                  • 6. Re: New Locking Strategy Proposal for JBoss Cache
                    jason.greene

                     

                    "galder.zamarreno@jboss.com" wrote:
                    Manik, thanks for the wiki, it's clearer now :)

                    For curiosity, how would you enforce SERIALIZABLE? It's quite an edge case but I guess it's exclusively locking read operations and the rest as it is, correct?


                    Yes, essentially an exclusive FQN lock is acquired on all read operations. We also have the ability to skip the node copy on writes, since it's not needed.

                    I think everyone is more likely to use the forceWriteLock option on a read (AKA SELECT FOR UPDATE) with READ COMMITTED or REPEATABLE READ instead of SERIALIZABLE, since it allows you to selectively chose when and what you serialize access to.

                    • 7. Re: New Locking Strategy Proposal for JBoss Cache
                      manik

                       

                      "jason.greene@jboss.com" wrote:
                      "galder.zamarreno@jboss.com" wrote:
                      Manik, thanks for the wiki, it's clearer now :)

                      For curiosity, how would you enforce SERIALIZABLE? It's quite an edge case but I guess it's exclusively locking read operations and the rest as it is, correct?


                      Yes, essentially an exclusive FQN lock is acquired on all read operations. We also have the ability to skip the node copy on writes, since it's not needed.

                      I think everyone is more likely to use the forceWriteLock option on a read (AKA SELECT FOR UPDATE) with READ COMMITTED or REPEATABLE READ instead of SERIALIZABLE, since it allows you to selectively chose when and what you serialize access to.


                      Either way, to keep the design clean and standardised, this could be encapsulated in a SerializableNode. But the concept is essentially an exclusive lock even for reads.



                      • 8. Re: New Locking Strategy Proposal for JBoss Cache
                        manik

                        FYI I've just updated the wiki with some spiffy diagrams I put together on the flight back to London. I've also created a JIRA to keep track of this: JBCACHE-1151