9 Replies Latest reply on Oct 19, 2005 5:50 PM by starksm64

    Improving node versioning semantics in optimistic locking

    manik

      Background:

      Read http://wiki.jboss.org/wiki/Wiki.jsp?page=OptimisticNodeLocking

      Where it is:

      Currently when a node in a workspace is modified (created, deleted, or has any of its attributes changed) it is marked as dirty. When a node is marked as dirty, it's state is copied back into the underlying TreeCache, and it's version incremented.

      The problem:

      If we create a new node in the workspace, the new node is marked as dirty (as expected) but even it's direct parent is marked as dirty since it's list of children has changed.

      While this works on a fundamental level (albeit not as efficiently) it does create issues when there are a large number of threads creating new nodes directly under the root. The root node's version is frequently updated, and a number of these threads would have to roll back due to validation exceptions.

      My solution:

      Don't mark parent nodes as dirty when children are created. Instead, at commit time, lock the parent node and update the TreeNode's children list directly (adding references to new children, removing old ones). Since the actual data in the parent is unchanged, it's version should be kept the same.

      What do people think about this?

      Cheers,
      Manik

        • 1. Re: Improving node versioning semantics in optimistic lockin
          starksm64

          My first thought is that this is working around an improper use of optimistic locking. If I have many concurrent writes to the child list of a node its not a read-mostly situation.

          If the root and children map to a Order and OrderItems collection and a tx updates the Order.sum, its possible that sum written is inconsistent with the OrderItems and this would not be detected because the Order version has not updated beween initial read and the time the sum is written.

          • 2. Re: Improving node versioning semantics in optimistic lockin
            manik

            Wouldn't that argument then extend beyond direct parents?

            E.g., if I have an OrderGroup, which has many Orders, and each Order having many OrderItems, my having an OrderGroup.grandTotal written would also be inconsistent unless OrderGroup's version gets updated if I add an OrderItem to one of my Orders.

            And if we were to do this, we would have to update the version of the root node if any node directly or indirectly under it is changed/added/removed... not very good for concurrency.

            I agree that with that many writes it is an incorrect application of optimistic locking, though.

            • 3. Re: Improving node versioning semantics in optimistic lockin
              starksm64

              Yes it would, so a question that arises is what is being optimistically locked? Whether or not you want the tree to be treated as a consistently versioned object vs hierarchical data where each node should be versioned independently depends on what is being stored and how its being used.

              Take the case of the cache being used as a second level cache for an entity that is mapped onto the cache using aop. How is a change to a leaf node going to be reflected in terms of the overall object dirty state? Such a graph certainly cannot have two txs write to different leaf nodes with two writes of the object to a db that results in a last commit wins mode to the db. If just the changed columns were written this actually would not be a conflicting concurrent writes, but if the full state was written back to the db it would be.

              • 4. Re: Improving node versioning semantics in optimistic lockin
                steview

                See below for email conversation with Manik by email on this topic

                > Hi manik,
                > Yes I agree and in an optimistic sense that is really to do with the
                > validation stage. There are a few strategies that you can apply
                > (principally forward and backward validation with merging to make
                > things
                > more flexible).
                >
                > So, one approach (which is to say these two threads both changed the
                > parent - therefore one is rolledback).
                >
                > However, in validation we can take a slightly more permissive approach
                > and say - is this particular node incompatible with the other
                > transaction (with incompatible being defined in the code) - so if two
                > children are put in under different keys perhaps this is not
                > incompatible (although it can become very difficult to cover all bases
                > here).
                >
                > The real problem I think is that because the structure is a tree
                > and if
                > there is a lot of nodes put in under the root node then this is a
                > problem (lots of contention) - if the same contention was happening
                > on a
                > node much further down the branches then we might say that this is
                > fine
                > as the optimistic stuff is guaranteeing that the second change is
                > rolled
                > back as it has optimistically failed (the point of the optimistic
                > stuff)
                >
                > We can look at some slightly more permissive strategies if you want (I
                > even mention this in my roadmap docs that I sent along with the
                > original
                > code)
                >
                > Let me know what you think
                >
                > steve
                >
                >
                > -----Original Message-----
                > From: Manik Surtani
                > To: Steve Woodcock
                > Subject: Re: Opt locking - node versioning and merging
                >
                > Yes, the versioning and dirty marking does take place on the
                > workspace node only, but on commit (after successful validation) this
                > version is written back to the underlying node so that concurrent
                > threads will be able to validate their workspace nodes when they
                > commit.
                >
                > The problem is that parent nodes (which have no changes except that
                > they have gained/lost child nodes) are being marked as dirty as well,
                > and their version sget incremented on commit time. This only causes
                > a problem if a parent has many children (imagine a tree 1-level deep,
                > where all nodes are directly under the root). Completely beats the
                > purpose of opt locking if every tx will need to roll back because the
                > parent (root) is marked as modified just because a new node is added!
                >
                > Cheers,
                > --

                >
                >
                > On 19 Oct 2005, at 14:16, Steve Woodcock wrote:
                >
                >
                >> Manik,
                >> I am pretty sure that the versioning and the dirty marking all take
                >> place in the workspace (on a wrapper of the real node - not the real
                >> node)- the root node's version and state is only updated on commit
                >> after
                >> the validation has been run (at least that is how I remember it
                >> working
                >> ) - so I do not think the rollback is an issue as it only affects the
                >> copies - not the real node
                >>
                >> So I am not sure the description is entirely accurate

                • 5. Re: Improving node versioning semantics in optimistic lockin
                  steview

                  I think that the versioning is actually a node level issue - not an overall tree level one.

                  It just happens to be a tree in this case but could easily be anything else - such as a Map or we could even (at a stretch) imagine a set of leaves under a parent to be roughly analogous to a table. In these cases the optimistic versions tend to apply at a row level rather than a table level.

                  I think an issue here is that we want the optimistic strategy (which is currently based on the traditional backward validation approach) to somehow work out the compatible operations on the same parent node (in a sense the locking method is not really relevant - the aim is that the operations should produce consistent results as if the operations were run in a serialized manner).

                  Therefore the addition of two nodes in a parent under different keys could be seen to be compatible (as would adding two rows to a table) - but something like the first tx removing a node which is later updated by a second tx would not be.

                  There is some existing research on this sort of conflict resolution and it bears some relation to the algorithms used in version control systems to provide merge and conflict detection functionality.

                  I suppose it depends on how far down this path (in a complexity sense) you want to go.

                  • 6. Re: Improving node versioning semantics in optimistic lockin
                    starksm64

                    It makes sense when the nodes do correlate with rows. The disconnect is when the nodes correlate with columns as could be the case when mapping an entity using aop. In such a case couldn't we just say that a write to any subtree node should require an optimistic lock on the root? The problem would be how to identify such a subtree to the cache.

                    • 7. Re: Improving node versioning semantics in optimistic lockin
                      steview

                      hmmm..... I am not sure I quite understand the analogy to columns. By columns are you suggesting that the same property of many objects are changed?
                      Surely an optimisitic version increase of the root node with any other node change assumes that the root node is always the root of all objects - which I assume not the case in the majority of cases (If I am mistaken as I am not as familiar with the AOP stuff - please correct me).

                      I think what we need to identify in the AOP case is the parent of the root of the object - not necessarily the whole tree (which may be what you are actually getting at) - in which case the issue of merging still becomes the same problem - i.e do we allow the changes to different properties of one object to be merged - as we are effectively treating it as a collection of properties - or do we say this subtree has to be changed as a whole - in which case the AOP code would require a mechanism to update all nodes that make up the object -

                      • 8. Re: Improving node versioning semantics in optimistic lockin
                        starksm64

                         

                        "steview" wrote:

                        I think what we need to identify in the AOP case is the parent of the root of the object - not necessarily the whole tree (which may be what you are actually getting at) - in which case the issue of merging still becomes the same problem - i.e do we allow the changes to different properties of one object to be merged - as we are effectively treating it as a collection of properties - or do we say this subtree has to be changed as a whole - in which case the AOP code would require a mechanism to update all nodes that make up the object

                        Yes, this is what I'm driving at. Some arbitrary subtree in the cache corresponds to a fine-grained mapping of an object that is being updated concurrently. Generally you cannot merge the individual changes so the version of all nodes in the subtree needs to be changed atomically, which certainly does make optimistic locking of such a mapping more expensive.



                        • 9. Re: Improving node versioning semantics in optimistic lockin
                          starksm64

                          Or instead of maintaining the version on every node in the subtree, all nodes point to the root of the subtree and this is the only node used for the optimistic lock check.