9 Replies Latest reply on Nov 19, 2007 9:49 AM by brian.stansberry

    Optimising contents of a prepare call - JBCACHE-611

    manik

      From Bela's original comments in JBCACHE-611


      When we use transactions, all modifications (PUT, REMOVE) are bundled (keyed by TX-ID) and replicated on TX commit.
      This is very inefficient if (a) we have many modifications in the TX scope and (b) the modifications touch the same parts of the cache over and over again.
      We need to optimize the modification list before replication. (This may be enabled/disabled via an attribute.)

      Example: put(/a/b/c) --> put(/a/b/c/d) --> remove(/a/b/c) --> put(/a/b/c) should simply result in a PUT modification with *all* of the attributes of /a/b/c, so 1 modification instead of 4!

      A general algorithm might be:
      - Maintain a dirty-node map (Map<FQN, status>, where status = "dirty" or "removed")
      - On a put(FQN): add or set FQN in the map, set status="dirty"
      - On a remove(FQN): set FQN in the map to "removed"; this needs to mark all subnodes as well, so maybe the dirty-node map needs to be a tree...

      We might get rid of *constructing the modification lists during modifications*, and only populate the dirty-node map. The modification list would then be created only at *TX commit time*, from the dirty-node map, e.g. in the above example: PUT(/a/b/c).
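
The dirty-node map described above could be sketched roughly as follows. This is a minimal illustration, not JBoss Cache code: the class name DirtyNodeMap, the use of plain strings for Fqns, and the string form of the generated modifications are all assumptions made for the example. A sorted map stands in for the tree, so that the entries under a removed node can be found by path prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the dirty-node map; Fqns are modelled as plain strings.
final class DirtyNodeMap {
    enum Status { DIRTY, REMOVED }

    // Sorted by path, so a parent always comes before its children.
    private final TreeMap<String, Status> nodes = new TreeMap<>();

    // Treat "a/b/c" and "/a/b/c" as the same node.
    private static String normalize(String fqn) {
        String s = fqn.startsWith("/") ? fqn : "/" + fqn;
        return s.length() > 1 && s.endsWith("/") ? s.substring(0, s.length() - 1) : s;
    }

    void put(String fqn) {
        nodes.put(normalize(fqn), Status.DIRTY);
    }

    void remove(String fqn) {
        String path = normalize(fqn);
        String prefix = path.equals("/") ? "/" : path + "/";
        // Entries for subnodes are implied by the parent's removal, so drop them.
        nodes.keySet().removeIf(k -> k.startsWith(prefix));
        nodes.put(path, Status.REMOVED);
    }

    // Built once, at commit time, instead of on every modification.
    List<String> toModifications() {
        List<String> mods = new ArrayList<>();
        for (Map.Entry<String, Status> e : nodes.entrySet()) {
            mods.add((e.getValue() == Status.DIRTY ? "PUT(" : "REMOVE(") + e.getKey() + ")");
        }
        return mods;
    }
}
```

Replaying the four calls from the example leaves a single entry, PUT(/a/b/c). Note that the sketch forgets that the node was removed before being put again; if the node had children on the remote cache before the transaction started, a real implementation would need to keep that information.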

      This only applies to pessimistic locking (does it? Can't we optimize optimistic locking in similar ways?)


        • 1. Re: Optimising contents of a prepare call - JBCACHE-611
          manik

          For optimistic locking, see JBCACHE-331, which talks about mods being replayed on remote caches when we already have a workspace node with updated state. Why not just transmit the updated state from the workspace?

          The algorithm you mentioned works well for pessimistic locking. We'd also need to create another map of 'original state' for each Fqn so a rollback is possible (since we will not create 'equal and opposite' modifications for each modification invoked).
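
One way to picture the 'original state' map is a snapshot taken the first time a transaction touches a node. The sketch below is only an illustration under simplifying assumptions: OriginalStateTracker is a hypothetical name, the cache is modelled as a flat map from Fqn string to attribute map, and removing a node does not cascade to its subnodes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: remembers each node's pre-transaction state on first touch.
final class OriginalStateTracker {
    // Stand-in for the cache: Fqn string -> attribute map.
    private final Map<String, Map<String, Object>> cache;
    // Fqn -> copy of the original attributes; a null value means the node did not exist.
    private final Map<String, Map<String, Object>> originals = new HashMap<>();

    OriginalStateTracker(Map<String, Map<String, Object>> cache) {
        this.cache = cache;
    }

    // Only the first modification of a node records its original state.
    private void snapshot(String fqn) {
        if (!originals.containsKey(fqn)) {
            Map<String, Object> current = cache.get(fqn);
            originals.put(fqn, current == null ? null : new HashMap<>(current));
        }
    }

    void put(String fqn, String key, Object value) {
        snapshot(fqn);
        cache.computeIfAbsent(fqn, f -> new HashMap<>()).put(key, value);
    }

    void remove(String fqn) {
        snapshot(fqn);
        cache.remove(fqn);
    }

    // Restores every touched node; no 'equal and opposite' modifications are needed.
    void rollback() {
        for (Map.Entry<String, Map<String, Object>> e : originals.entrySet()) {
            if (e.getValue() == null) {
                cache.remove(e.getKey());
            } else {
                cache.put(e.getKey(), e.getValue());
            }
        }
        originals.clear();
    }
}
```

The memory cost is one attribute-map copy per touched node, regardless of how many times that node is modified within the transaction.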

          On the remote caches, the 'dirty node map' + a new 'state map' could be broadcast with a prepare, and the TxInterceptor could convert these to MethodCalls (puts or removes) and push them up the interceptor chain.

          • 2. Re: Optimising contents of a prepare call - JBCACHE-611

            Yes, for a long-running tx, this can indeed provide extra savings in replication message size and also speed!

            Currently, in PojoCache putObject and removeObject, I have many modifications that all use the same tx (as batch processing). Doing it in one shot is important.

            • 3. Re: Optimising contents of a prepare call - JBCACHE-611
              manik

              The only drawback is that for simple txs with very few operations that cannot be optimised, we would end up wasting both memory and CPU cycles for zero gain. But then these are the tradeoffs. Overall I still think this is a good thing.

              • 4. Re: Optimising contents of a prepare call - JBCACHE-611
                belaban

                That's what the EnableCompaction attribute is there for; you can turn it off and fall back to the old mechanism.

                • 5. Re: Optimising contents of a prepare call - JBCACHE-611
                  manik

                  Bringing this back up, since this is scheduled for inclusion in 2.2.0.

                  As a basic design, I think we should look at a consistent "replay" approach for both pessimistic and optimistic locking (rather than shipping the workspace deltas across for optimistic locking), so we can stick with a single TxInterceptor. Shipping the workspace across will require different TxInterceptor behaviour when receiving a prepare.

                  The actual approach could be quite simple - the TransactionEntry currently maintains a List of MethodCalls. This list is populated by TransactionEntry.addModification(MethodCall m).

                  This method could be modified to perform any compacting on the modification list at this time (if compacting is enabled).

                  Also, as a default, I think compacting *should* be enabled, since most transactional calls would involve > 1 method invocation, and even if they didn't, the compacting algorithm could be made to start compacting only if there are, say, > 3 modifications at the time.
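
Compacting inside addModification could look something like the sketch below. It does not use the real TransactionEntry or MethodCall classes: CompactingModificationList and Mod are hypothetical stand-ins, and the single compaction rule shown (a new put to the same Fqn and key supersedes earlier puts to that Fqn and key) is just one assumed rule among many possible ones. The enable flag and the minimum list size are passed in as constructor parameters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of compacting at the time a modification is recorded.
final class CompactingModificationList {
    // Stand-in for a MethodCall: key == null means remove(fqn), otherwise put(fqn, key, value).
    static final class Mod {
        final String fqn;
        final String key;
        final Object value;

        Mod(String fqn, String key, Object value) {
            this.fqn = fqn;
            this.key = key;
            this.value = value;
        }

        boolean isPut() {
            return key != null;
        }
    }

    private final List<Mod> mods = new ArrayList<>();
    private final boolean compactingEnabled;
    private final int minSizeToCompact; // e.g. 3, so very short lists are never scanned

    CompactingModificationList(boolean compactingEnabled, int minSizeToCompact) {
        this.compactingEnabled = compactingEnabled;
        this.minSizeToCompact = minSizeToCompact;
    }

    void addModification(Mod m) {
        if (compactingEnabled && m.isPut() && mods.size() >= minSizeToCompact) {
            // The new put overwrites the same key on the same node, so earlier ones are redundant.
            mods.removeIf(old -> old.isPut() && old.fqn.equals(m.fqn) && old.key.equals(m.key));
        }
        mods.add(m);
    }

    List<Mod> getModifications() {
        return mods;
    }
}
```

Each call scans the existing list, so the cost grows with the number of modifications already recorded; indexing the list by Fqn and key would avoid the scan at the price of extra memory.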

                  • 6. Re: Optimising contents of a prepare call - JBCACHE-611
                    mircea.markus

                    What about compacting only at replication time? Iterating over the list of changes (method calls) in the transaction entry only once, at replication time, might reduce the overall time consumption by optimizing the processing (e.g. if there are only 3 method calls then don't use this pattern; start processing the removals first and ignore the corresponding puts, etc.). This way we can calculate the drawback introduced by compacting, as it only happens once, in one place.
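
A single pass at replication time, handling removals first, might be sketched as below. CommitTimeCompactor and Mod are hypothetical names, Fqns are plain strings, and the rule is an assumption: walking the list backwards, any modification that targets a node (or subnode) removed later in the transaction is dropped. The sketch ignores the side effect that a put may implicitly create parent nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a one-off compaction pass run just before replication.
final class CommitTimeCompactor {
    // Stand-in for a MethodCall: either remove(fqn) or put(fqn).
    static final class Mod {
        final boolean remove;
        final String fqn;

        Mod(boolean remove, String fqn) {
            this.remove = remove;
            this.fqn = fqn;
        }

        @Override
        public String toString() {
            return (remove ? "remove(" : "put(") + fqn + ")";
        }
    }

    // Assumption: lists of 3 or fewer calls are not worth compacting.
    static final int MIN_SIZE = 4;

    static List<Mod> compact(List<Mod> mods) {
        if (mods.size() < MIN_SIZE) {
            return new ArrayList<>(mods);
        }
        Set<String> removedLater = new HashSet<>();
        List<Mod> kept = new ArrayList<>();
        // Walk backwards so each removal is seen before the calls it makes redundant.
        for (int i = mods.size() - 1; i >= 0; i--) {
            Mod m = mods.get(i);
            if (isCovered(m.fqn, removedLater)) {
                continue; // wiped out by a later removal of this node or an ancestor
            }
            kept.add(m);
            if (m.remove) {
                removedLater.add(m.fqn);
            }
        }
        Collections.reverse(kept); // restore original invocation order
        return kept;
    }

    private static boolean isCovered(String fqn, Set<String> removed) {
        for (String r : removed) {
            if (fqn.equals(r) || fqn.startsWith(r + "/")) {
                return true;
            }
        }
        return false;
    }
}
```

For the sequence put(/a/b/c), put(/a/b/c/d), remove(/a/b/c), put(/a/b/c), this pass keeps remove(/a/b/c) followed by put(/a/b/c), so two calls are replicated instead of four.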

                    • 7. Re: Optimising contents of a prepare call - JBCACHE-611
                      genman

                      Having looked at this bug, I'm curious whether there are real-world application use cases that would benefit from transaction compacting. For those applications that don't use JBoss Cache directly, would any of the following benefit from this: Hibernate 2nd level cache, HTTP session cache, POJO cache?

                      I guess what I wonder is whether, if there is a common use case, you could benchmark it, make some compacting optimizations, then see if the changes actually had any significant impact.

                      My money is on not much benefit from this.

                      • 8. Re: Optimising contents of a prepare call - JBCACHE-611
                        manik

                        I expect that this could help HTTP session replication, since invocation batching is done (achieved currently via a transaction). Brian?

                        You're probably right that such optimisations are already done for Hibernate/EJB3 though.

                        Either way, I don't think the cost/complexity of this is that great.

                        • 9. Re: Optimising contents of a prepare call - JBCACHE-611
                          brian.stansberry

                          For web session replication, this would only be useful for FIELD granularity. SESSION is a single put; with ATTRIBUTE, the session manager already optimizes (it tracks which attributes are dirty and writes the final state of those to the cache).