15 Replies Latest reply on Oct 16, 2007 1:39 PM by brian.stansberry

    Optimistic locking doesn't scale well with large 'flat' cache

    brian.stansberry

      Customer reported an issue where a 1.4.1.SP3 cache using OPTIMISTIC locking didn't scale well when Hibernate was caching a large number of instances of an entity. All entity instances are stored in child nodes directly under a base node for the entity. They were caching tens of thousands of instances and saw put() performance drop off dramatically as more instances were added.

      I believe the problem is that as the workspace is set up for a put, the base node for the entity is copied. This involves making a copy of the children map. That map copy operation won't scale.

      One solution is to increase the depth of the tree. Customer tried adding 3 more levels below the entity base. This leads to problems with eviction though (all those added "structural" nodes affect eviction algorithms).

      One concept I've been thinking about today is trying to avoid the children map copy. Half-baked thought:

      1) For optimistic cache, regular base node uses a subclass of ConcurrentHashMap for its children map.

      2) Subclass exposes listener registration methods. When a WorkspaceNodeImpl is created, it registers as a listener on the base node's children map, and unregisters as part of workspace reconciliation.

      3) When the put or remove methods are called on the base node's map, those methods are overridden to send notification events to any registered listeners.

      4) WorkspaceNodeImpl tracks any events and uses the info to present a consistent image of the original children. E.g., if a "remove" event is received, the workspace node caches the removed node and uses it to satisfy any internal get calls. If an "add" event is received, the workspace node caches the key to ensure that any get call for that key ignores the base map.


      This might be full of holes or more trouble than it's worth, but wanted to throw it out there.
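
      A rough sketch of how steps 1-4 might fit together. The class and method names below (NotifyingChildMap, WorkspaceChildView) are hypothetical, not actual JBoss Cache types; this is just the listener idea in minimal form:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

interface ChildMapListener {
    void childAdded(Object name);
    void childRemoved(Object name, Object removedChild);
}

// Steps 1 and 3: a ConcurrentHashMap subclass whose put/remove notify listeners.
class NotifyingChildMap extends ConcurrentHashMap<Object, Object> {
    private final List<ChildMapListener> listeners = new CopyOnWriteArrayList<>();

    // Step 2: workspace nodes register here; unregister at reconciliation.
    void addListener(ChildMapListener l) { listeners.add(l); }
    void removeListener(ChildMapListener l) { listeners.remove(l); }

    @Override
    public Object put(Object name, Object child) {
        Object prev = super.put(name, child);
        for (ChildMapListener l : listeners) l.childAdded(name);
        return prev;
    }

    @Override
    public Object remove(Object name) {
        Object removed = super.remove(name);
        if (removed != null) {
            for (ChildMapListener l : listeners) l.childRemoved(name, removed);
        }
        return removed;
    }
}

// Step 4: the workspace tracks events to present a consistent original image.
class WorkspaceChildView implements ChildMapListener {
    private final Map<Object, Object> base; // shared with the real node, never copied
    private final Map<Object, Object> removedSinceStart = new ConcurrentHashMap<>();
    private final Map<Object, Boolean> addedSinceStart = new ConcurrentHashMap<>();

    WorkspaceChildView(NotifyingChildMap base) {
        this.base = base;
        base.addListener(this);
    }

    public void childAdded(Object name) { addedSinceStart.put(name, Boolean.TRUE); }

    public void childRemoved(Object name, Object removedChild) {
        removedSinceStart.put(name, removedChild); // cache the removed node
    }

    // Reads see the children as they were when this workspace was created.
    Object getChild(Object name) {
        Object removed = removedSinceStart.get(name);
        if (removed != null) return removed;                // serve the cached removal
        if (addedSinceStart.containsKey(name)) return null; // ignore concurrent adds
        return base.get(name);
    }
}
```

      Because only the delta of concurrent changes is tracked, the tens-of-thousands-entry children map is never copied into the workspace.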

        • 1. Re: Optimistic locking doesn't scale well with large 'flat' cache
          jason.greene

          A copy-on-read approach would allow for isolation semantics to be maintained. Gets trigger a copy, adds ignore the base table, and deletes would use a tombstone.
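
          A minimal sketch of that copy-on-read idea, with a tombstone marker for deletes (names are illustrative, not JBoss Cache API):

```java
import java.util.HashMap;
import java.util.Map;

// Copy-on-read child view: gets copy individual children into the workspace,
// adds ignore the base table entirely, and deletes leave a tombstone.
class CopyOnReadChildren {
    private static final Object TOMBSTONE = new Object();
    private final Map<Object, Object> base;                  // shared base child map
    private final Map<Object, Object> workspace = new HashMap<>();

    CopyOnReadChildren(Map<Object, Object> base) { this.base = base; }

    Object get(Object name) {
        Object local = workspace.get(name);
        if (local == TOMBSTONE) return null;  // deleted in this transaction
        if (local != null) return local;
        Object copy = base.get(name);         // copy one child on read, not the whole map
        if (copy != null) workspace.put(name, copy);
        return copy;
    }

    void put(Object name, Object child) { workspace.put(name, child); }    // never touches base
    void remove(Object name)           { workspace.put(name, TOMBSTONE); } // tombstone
}
```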

          -Jason

          • 2. Re: Optimistic locking doesn't scale well with large 'flat' cache
            manik

            Listeners may not be necessary - the WorkspaceNodeImpl could do something like:

            1) Load child nodes lazily. I.e., when the base node is loaded into the workspace, don't load child map.

            2) Changes made to the child node map are maintained as a delta in the workspace (e.g., adds, modifications, removes), with lazy loading of children.

            3) Trying to get a child node that doesn't exist caches a null in the workspace, to maintain REPEATABLE_READ.

            4) Merge back changes at commit time.
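
            One possible shape for steps 1-4, assuming a per-workspace delta map plus a cached-null marker for misses (WorkspaceChildDelta and its methods are hypothetical names, not actual classes):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Lazily loaded child view: no bulk copy; the workspace keeps a delta of
// puts/removes plus cached nulls for misses, and replays the delta on commit.
class WorkspaceChildDelta {
    private static final Object REMOVED = new Object();     // delta marker for a remove
    private static final Object NULL_MARKER = new Object(); // cached "no such child"
    private final Map<Object, Object> delta = new LinkedHashMap<>(); // step 2: adds/mods/removes
    private final Map<Object, Object> readCache = new HashMap<>();

    Object get(Map<Object, Object> base, Object name) {
        Object d = delta.get(name);
        if (d == REMOVED) return null;
        if (d != null) return d;
        Object cached = readCache.get(name);
        if (cached == NULL_MARKER) return null; // step 3: repeatable read of a miss
        if (cached != null) return cached;
        Object child = base.get(name);          // step 1: load one child lazily
        readCache.put(name, child == null ? NULL_MARKER : child);
        return child;
    }

    void put(Object name, Object child) { delta.put(name, child); }
    void remove(Object name)           { delta.put(name, REMOVED); }

    // Step 4: merge the delta back into the real child map at commit time.
    void mergeInto(Map<Object, Object> base) {
        for (Map.Entry<Object, Object> e : delta.entrySet()) {
            if (e.getValue() == REMOVED) base.remove(e.getKey());
            else base.put(e.getKey(), e.getValue());
        }
    }
}
```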

            The drawback here is that it allows for a slight difference in semantics from pessimistic locking, due to the lazy locking (copying into workspace) of children ...

            Is this approach workable?

            • 3. Re: Optimistic locking doesn't scale well with large 'flat' cache
              jason.greene


              "manik.surtani@jboss.com" wrote:
              Trying to get a child node that doesn't exist caches a null in the workspace, to maintain REPEATABLE_READ.


              Technically we don't have to do this; repeatable read allows phantoms, and they are still possible if the lazy loading occurs after a new child is created.


              The drawback here is that it allows for a slight difference in semantics from pessimistic locking, due to the lazy locking (copying into workspace) of children.


              I don't think there is a difference here. Pessimistic locking already allows phantoms by not forcing a write lock when nodes are added/removed.


              Is this approach workable?


              Yes, another option is to just lazy copy the entire map the first time it is actually read.

              -Jason

              • 4. Re: Optimistic locking doesn't scale well with large 'flat' cache
                brian.stansberry


                "jason.greene@jboss.com" wrote:
                "manik.surtani@jboss.com" wrote:
                Trying to get a child node that doesn't exist caches a null in the workspace, to maintain REPEATABLE_READ.


                Technically we don't have to do this; repeatable read allows phantoms, and they are still possible if the lazy loading occurs after a new child is created.


                The drawback here is that it allows for a slight difference in semantics from pessimistic locking, due to the lazy locking (copying into workspace) of children.


                I don't think there is a difference here. Pessimistic locking already allows phantoms by not forcing a write lock when nodes are added/removed.


                By default, this is true, but there is a flag to acquire a write lock for node add/remove.


                Is this approach workable?


                Yes, another option is to just lazy copy the entire map the first time it is actually read.


                In the entity caching use case, the map is 100% sure to be read, so a lazy copy doesn't gain anything.

                In terms of entity caching, this should be workable since lockParentForChildInsertRemove=false for this use case.

                • 5. Re: Optimistic locking doesn't scale well with large 'flat' cache
                  jason.greene


                  "bstansberry@jboss.com" wrote:

                  "jason.greene@jboss.com" wrote:

                  I don't think there is a difference here. Pessimistic locking already allows phantoms by not forcing a write lock when nodes are added/removed.

                  By default, this is true, but there is a flag to acquire a write lock for node add/remove.


                  Yes, that's a good point. Since some people might want phantom prevention, and since optimistic locking has it now, we could make the lazy behavior optional, but default to true since isolation doesn't require it.


                  In the entity caching use case, the map is 100% sure to be read, so a lazy copy doesn't gain anything.


                  To clarify, does it ever call Node.getChildren() or Node.getChildrenNames()? What I meant was that a direct node lookup (like "/a/b") can use the actual node's child map; only a direct child map access by the application would require copying.

                  -Jason

                  • 6. Re: Optimistic locking doesn't scale well with large 'flat' cache
                    jason.greene




                    In the entity caching use case, the map is 100% sure to be read, so a lazy copy doesn't gain anything.


                    To clarify, does it ever call Node.getChildren() or Node.getChildrenNames()? What I meant was that a direct node lookup (like "/a/b") can use the actual node's child map; only a direct child map access by the application would require copying.


                    Just realized: I am guessing the case you describe does a lot of writes though, which would have to force a copy, so this is not a good approach.

                    -Jason

                    • 7. Re: Optimistic locking doesn't scale well with large 'flat' cache
                      brian.stansberry

                      There are such calls, although they are invoked via the Hibernate 2nd level cache statistics API rather than the normal caching mechanism. So, presumably their use would not be frequent.

                      • 8. Re: Optimistic locking doesn't scale well with large 'flat' cache
                        brian.stansberry

                        My last post was in response to your getChildren(Names) question.

                        I've gotten a bit lost about what you meant by your lazy copy idea. But, yes, the use case involves a lot of writes.

                        • 9. Re: Optimistic locking doesn't scale well with large 'flat' cache
                          manik

                          I'm not too sure the lazy copy approach helps here. I think it is a good idea, but it won't solve the scalability issue in the original post, since it will still involve a map copy of tens of thousands of entries.

                          And this will happen every time you put an entity in the 2nd level cache.

                          Going back to my suggestion of not actually copying the child map but only loading specific children as they are accessed: this does add a bit more complexity by way of keeping track of each child node loaded and its status, but I think this quite effectively solves your scalability problem.


                          • 10. Re: Optimistic locking doesn't scale well with large 'flat' cache
                            brian.stansberry

                             

                            "bstansberry@jboss.com" wrote:
                            In terms of entity caching, this should be workable since lockParentForChildInsertRemove=false for this use case.


                            This comment was re: your suggestion of not actually copying the child map but only loading specific children as they are accessed. Seems like it should work for cases where phantom reads are acceptable.

                            • 11. Re: Optimistic locking doesn't scale well with large 'flat' cache
                              manik

                              I've created a JIRA for this so it doesn't get "lost"...

                              JBCACHE-1121

                              • 12. Re: Optimistic locking doesn't scale well with large 'flat' cache
                                manik

                                Bringing this up again. I spent some time investigating this, and the way it is currently implemented (in trunk) is as follows:


                                * The child map is copied to the workspace node.
                                * A map of child deltas is ALSO maintained.
                                * Child deltas are used when children are added/removed; the child map is also updated.
                                * When nodes are requested using get(), etc., the actual nodes are queried and copied to the workspace; the child map is not used for this.
                                * When merging back changes at commit time, the child deltas are used; the child map is not used, for performance reasons.



                                So looking at the above, the actual child map is really only needed in two places:

                                1. getChildrenNames()
                                2. move()

                                And I think for both of the above it can be loaded lazily, effectively giving us the scalability we need (except in the cases where those two API calls are made).
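
                                A sketch of that lazy loading, where the expensive copy is deferred until the full child map is actually needed (LazyWorkspaceNode and its methods are hypothetical names, not the trunk implementation):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Defers copying the child map until an operation needs the complete set
// of children; direct child lookups never trigger the copy.
class LazyWorkspaceNode {
    private final Map<Object, Object> realChildren; // the underlying node's map
    private Map<Object, Object> copy;               // built only on demand

    LazyWorkspaceNode(Map<Object, Object> realChildren) {
        this.realChildren = realChildren;
    }

    private Map<Object, Object> childMap() {
        if (copy == null) copy = new HashMap<>(realChildren); // the expensive copy, deferred
        return copy;
    }

    boolean copied() { return copy != null; }

    // Only these two paths pay the copy cost.
    Set<Object> getChildrenNames() { return childMap().keySet(); }
    Object removeForMove(Object name) { return childMap().remove(name); }

    // Direct lookups go straight to the real map and never trigger a copy.
    Object getChild(Object name) { return realChildren.get(name); }
}
```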

                                What do you think?


                                  • 13. Re: Optimistic locking doesn't scale well with large 'flat' cache
                                    brian.stansberry

                                    That sounds good. If that could be done for 2.1.0, that would be great, as it removes the need to consider any workaround in the Hibernate/JBC integration. (Workaround == hack adding extra levels to the tree).

                                    • 14. Re: Optimistic locking doesn't scale well with large 'flat' cache
                                      manik

                                      OK, will work this in, in time for CR1.
