26 Replies Latest reply on Oct 24, 2005 6:51 PM by Brian Stansberry

    JBAS-2142 and JBCACHE-273 Discussion Thread

    Brian Stansberry Master

      Ben, Bela and I have been discussing how to enable "partial" fetching of TreeCache state. This feature would allow delayed (i.e. after the normal full state transfer that occurs when a cache is started) fetching of a portion of a cache's state.

      The need for this feature specifically arises from the requirements of HttpSession replication, but once implemented it should be generally useful.

      For HttpSession replication:


      1) The same TreeCache instance is used to store sessions from multiple webapps. Each webapp stores its sessions in a separate branch (aka region) of the tree.
      2) Proper unmarshalling of replicated sessions for a particular region requires the use of the appropriate webapp classloader. To support this, methods have been added to TreeCache to allow applications to register/unregister a classloader for a particular region.
      3) The web service depends on the TreeCache, so the TreeCache will be started before webapps can register their classloaders. This makes using the existing state transfer mechanism to load state in a new cache problematic. (see JBAS-2142).
      4) The webapps have a separate lifecycle from the cache. They can be undeployed, redeployed, new apps added, etc. The TreeCache marshalling layer needs to be able to handle this. Specifically, if a given webapp is not currently deployed on a particular cluster node, that node's cache will not have access to the classloader needed to unmarshal replication events related to the app's cache region. The marshalling layer needs to be able to detect this situation and silently ignore replication messages for cache regions it can't handle.


      For background on the issue of using different classloaders to unmarshal different portions of the cache tree, see:

      http://www.jboss.com/index.html?module=bb&op=viewtopic&t=67858

      To properly handle deployment of a webapp, the cache needs to expose an API allowing the app's JBossCacheManager to:

      1) Register a classloader for a region.
      2) "Activate" the region. This means:

      a) Lock the region's parent node.
      b) Fetch any existing state for the region from across the cluster and integrate it into the cache (i.e. add it to the parent).
      c) Notify the TreeCacheMarshaller that it can begin normal handling of replication messages for the region.
      d) Unlock the parent node.



      To properly handle undeployment of a webapp, the cache needs to expose an API allowing the app's JBossCacheManager to:

      1) "Inactivate" the region. This means:

      a) Notify the TreeCacheMarshaller that it should begin suppressing handling of replication messages for the region.
      b) Evict the region from the cache.

      2) Unregister the webapp classloader from the region (to prevent leaking the classloader).
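
The deploy and undeploy sequences above can be sketched as a small piece of per-region bookkeeping. This is only a minimal illustration, not the TreeCache API: the RegionRegistry class, its method names, and the rule that activation is refused until a classloader is registered are all assumptions made for the example, and regions are identified by plain strings rather than Fqn objects.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-region bookkeeping; a stand-in, not the TreeCache API. */
class RegionRegistry {
    private final Map<String, ClassLoader> loaders = new ConcurrentHashMap<>();
    private final Set<String> activeRegions = ConcurrentHashMap.newKeySet();

    // Deployment step 1: make the webapp classloader available for unmarshalling.
    void registerClassLoader(String fqn, ClassLoader loader) {
        loaders.put(fqn, loader);
    }

    // Deployment step 2: activate only once a classloader is in place (assumed rule).
    void activateRegion(String fqn) {
        if (!loaders.containsKey(fqn)) {
            throw new IllegalStateException("no classloader registered for " + fqn);
        }
        // Fetching and integrating existing cluster state would happen here.
        activeRegions.add(fqn);
    }

    // Undeployment step 1: stop handling replication messages for the region.
    void inactivateRegion(String fqn) {
        activeRegions.remove(fqn);
        // Evicting the region's nodes from the cache would happen here.
    }

    // Undeployment step 2: drop the classloader reference so it cannot leak.
    void unregisterClassLoader(String fqn) {
        if (activeRegions.contains(fqn)) {
            throw new IllegalStateException("inactivate " + fqn + " first");
        }
        loaders.remove(fqn);
    }

    boolean isActive(String fqn) { return activeRegions.contains(fqn); }

    boolean hasClassLoader(String fqn) { return loaders.containsKey(fqn); }
}
```

Enforcing the ordering in the registry itself is a design choice for the sketch; the same ordering could equally be left to the caller (the JBossCacheManager in the webapp case).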


      To support this, I propose adding the following methods to TreeCache:

      public void activateRegion(String fqn) throws RegionNotEmptyException
      
      public void inactivateRegion(String fqn)
       throws RegionNameConflictException, CacheException
      
      public void fetchState(String fqn) throws RegionNotEmptyException
      


      The methods needed for registering/unregistering ClassLoaders have already been added by Ben.

      The fetchState(String fqn) method handles the cluster calls needed for partial state transfer. I'm not certain about exposing this method as public at this time. It doesn't need to be exposed to support the HttpSession replication use case, as a call to activateRegion calls through to fetchState. Actually, as I write this I don't see any use case for exposing it, but if anyone can, please let me know :)

      I'll follow up with separate posts on various detail points.


      Brian Stansberry
      Developer, JBoss Clustering
      JBoss, Inc.

        • 1. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
          Brian Stansberry Master

          Partial State Transfer

          In many respects, partial state transfer should work very similarly to the existing "complete" state transfer mechanism (fetchStateAtStartup). In both cases a remote method call is sent across the cluster, and the responder sends back a byte[][], where the first element in the outer array is the marshalled representation of a DataNode and the second element is whatever was returned by a loadState call on the responder's CacheLoader.

          There are some key differences:


          1) fetchStateAtStartup() is handled asynchronously via the JGroups state transfer mechanism. Partial state transfer would be handled synchronously via the TreeCache's callRemoteMethods mechanism.

          2) The fetchStateAtStartup() mechanism only queries the coordinator for state. Partial state transfer must handle the fact that the relevant region may be inactive on the coordinator. To handle this, the new fetchState(String fqn) method will iterate through the cache's members Vector until it gets a valid response.

          3) The fetchStateAtStartup() mechanism integrates the received state by

          a) Locking the existing root node of the tree.
          b) Assigning the newly unmarshalled root node to the cache's "root" member variable.
          c) Unlocking the old root node of the tree.

          fetchState(String fqn) will need to:

          a) Check the fqn doesn't exist (throw RegionNotEmptyException if it does).
          Do this before doing anything expensive.
          b) Get a reference to "fqn"'s parent node. (Q: What if it doesn't exist?)
          c) Make the remote call, getting back the byte[][].
          d) Lock the parent.
          e) Recheck that the fqn doesn't exist, in case it was added between a) and d).
          f) Use any classloader registered for the region to unmarshall the transient state from the byte[][] into a DataNode.
          g) Add the new DataNode to the parent.
          h) Unlock the parent.

          The above skips some steps that both mechanisms have in common.


          4) How to handle persistentState for a partial state transfer is an open question. The CacheLoader interface doesn't seem to expose a method that fits. storeEntireState obviously doesn't fit, and the put() methods are designed for persisting changes to existing data, not for wholesale replacement. One possibility is to create a subinterface of CacheLoader that adds the needed methods (storeRegionState(Fqn) and loadRegionState(Fqn)?) and then support persistentState only if the cache loader instance implements the extended interface?
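
Steps a) through h) of item 3, together with the member iteration from item 2, can be sketched roughly as follows. This is a simplified model under several assumptions: each cluster member is represented by a hypothetical Function that returns the region's state or null when the region is inactive there, regions are keyed by plain strings, a single lock stands in for the parent node's lock, unmarshalling is elided, and RegionNotEmptyException is modelled as unchecked for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Modelled as unchecked here for brevity; the proposal does not say which it is. */
class RegionNotEmptyException extends RuntimeException {
    RegionNotEmptyException(String fqn) { super("region already present: " + fqn); }
}

/** Hypothetical model of the fetch sequence; not the TreeCache implementation. */
class PartialStateFetcher {
    private final Map<String, byte[]> regions = new ConcurrentHashMap<>();
    private final ReentrantLock parentLock = new ReentrantLock(); // stands in for the parent node lock

    /** Returns true if some member supplied state, false if none could. */
    boolean fetchState(String fqn, List<Function<String, byte[]>> members) {
        // a) Cheap existence check before doing anything expensive.
        if (regions.containsKey(fqn)) throw new RegionNotEmptyException(fqn);

        // c) Ask each member in turn until one returns a valid (non-null) response.
        byte[] state = null;
        for (Function<String, byte[]> member : members) {
            state = member.apply(fqn);
            if (state != null) break;
        }
        if (state == null) return false;

        parentLock.lock(); // d) Lock the parent.
        try {
            // e) Recheck, in case the region was added between a) and d).
            if (regions.containsKey(fqn)) throw new RegionNotEmptyException(fqn);
            // f) and g) Unmarshalling is elided; the raw bytes are attached as the region.
            regions.put(fqn, state);
            return true;
        } finally {
            parentLock.unlock(); // h) Unlock the parent.
        }
    }

    byte[] stateOf(String fqn) { return regions.get(fqn); }
}
```

Making the remote call outside the lock keeps the parent locked only for the recheck and the integration, which is why the existence check has to be done twice.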


          Brian Stansberry
          Developer, JBoss Clustering
          JBoss, Inc.

          • 2. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
            Manik Surtani Master

            I take it this refers to replicating only a part of the tree across the entire cluster, as opposed to replicating a part of the tree across a subset of the cluster?

            Just wondering how this would work in relation to the cache partitioning/buddy replication bits (JBCACHE-60, JBCACHE-61) I would be starting on soon.

            • 3. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
              Brian Stansberry Master

              It's more replicating a part of the tree from just one node to another, with replication occurring at the request of the receiving node, rather than being initiated by some change on the sending node. The idea is to support nodes that can't/don't want to keep a portion of the tree active by 1) giving them a mechanism to get the current state for that portion once they decide they want it and 2) making it possible for them to ignore replication events for that portion of the tree.

              Good question how this all exactly relates to JBCACHE-60 and JBCACHE-61 (which I confess to not being too familiar with :) ); something we'll need to sort out with Ben and Bela.

              Not sure exactly what JBCACHE-60 means; is the idea that different nodes in the cluster are (solely??) responsible for different portions of the tree? This seems like it could have overlap with JBCACHE-273.

              When I look at JBCACHE-61, I think its focus is on limiting normal replication traffic to just a subset of the cluster nodes, but between those nodes the entire tree would be kept consistent. Correct? We'd certainly need to think this through if we also allow nodes to inactivate portions of the tree -- if a node's buddy inactivated part of the tree, there would then be no backup of that part of the tree.

              • 4. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                Brian Stansberry Master

                "Swallowing" Events

                If a cluster node inactivates a region of the tree, it wants to ignore replication events for that region. However, it can't simply throw an exception when it gets a replication message, as in a REPL_SYNC config, an exception will cause the overall replication to fail.

                TreeCache replication events are managed by an RpcDispatcher, which delegates the task of unmarshalling the message body to TreeCacheMarshaller. RpcDispatcher expects TreeCacheMarshaller to return an instance of MethodCall; if it does not (e.g. returns null), RpcDispatcher treats this as an error condition, and again REPL_SYNC replications will fail.

                So, if TreeCacheMarshaller detects that a region is inactivated when asked to unmarshal, it cannot throw an exception or return null.

                I propose we add a no-op "public void _swallow()" method to TreeCache. If TreeCacheMarshaller.objectFromByteBuffer() detects that a region is inactivated, it will return a MethodCall targeting the _swallow method. RpcDispatcher will then invoke that method on the cache.

                This seems a bit of a hack; anyone have any better ideas or see any reasons why the "_swallow" idea will not work?
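
To make the "_swallow" idea concrete, here is a minimal sketch of the dispatch pattern. It does not use the real JGroups or TreeCache classes: CacheStub, MethodCallStub and SwallowingMarshaller are hypothetical stand-ins, the message is reduced to its target Fqn as a string, and region membership is decided by a simple path-prefix test, which is an assumption of the example.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the cache; counts the replication calls actually applied. */
class CacheStub {
    int applied = 0;

    void _put(String fqn) { applied++; }

    void _swallow() { /* intentionally a no-op */ }
}

/** Hypothetical stand-in for a MethodCall: something the dispatcher can invoke on the cache. */
interface MethodCallStub {
    void invokeOn(CacheStub cache);
}

/** Never returns null and never throws for an inactive region. */
class SwallowingMarshaller {
    // One shared instance, so swallowing a message allocates nothing.
    static final MethodCallStub SWALLOW = CacheStub::_swallow;

    private final Set<String> inactiveRegions = new HashSet<>();

    void inactivate(String regionFqn) { inactiveRegions.add(regionFqn); }

    MethodCallStub unmarshal(String targetFqn) {
        for (String region : inactiveRegions) {
            // Assumed rule: a node belongs to a region if it is the region root or below it.
            if (targetFqn.equals(region) || targetFqn.startsWith(region + "/")) {
                return SWALLOW;
            }
        }
        return cache -> cache._put(targetFqn);
    }
}
```

The dispatcher side stays unchanged in this model: it always gets a call object back and always invokes it, so a synchronous replication sees a normal, successful response from the node that ignored the message.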

                • 5. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                  Brian Stansberry Master

                  TreeCacheAOP Issues

                  In TreeCacheAOP, if 2 objects hold a reference to a 3rd object, that 3rd object is stored in a special, internal area of the cache. This could potentially cause problems with the region-based replication scheme discussed above. The following snippet from an e-mail between myself and Ben Wang sums up our thinking:


                  Brian: What about shared references in TreeCacheAOP?

                  Ben: This is a problem in general that we have no control of. We will have to say the shared references should occur under the same region (same as class loader actually). For a web app, for example, you can share within the web app but not across web app. What do you think?

                  Brian: Seems reasonable. I think the biggest issue will be documenting/explaining this correctly, as it's a pretty subtle point.


                  • 6. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                    Bela Ban Master

                    why is there a need for activateRegion() *and* fetchState() ? Can't they be clubbed together ?

                    • 7. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                      Bela Ban Master

                      Looks as if we need a subinterface ExtendedCacheLoader with methods

                      byte[] loadState(Fqn subtree) throws Exception;
                      void storeState(byte[] state, Fqn subtree) throws Exception;
                      


                      This would be integrated into the CacheLoader interface proper in 1.3, where we break the existing APIs anyway.

                      However, what do we do with CacheLoaders that don't support these new methods ? I suggest they need to throw an UnsupportedOperationException, rather than silently return null.
                      In 1.3, we can probably move {load/store}EntireState() into {load/store}State(Fqn), with an Fqn of "/" meaning get/store the *entire* state.
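
A rough sketch of how the subinterface and the "/" convention could fit together is below. It is an illustration under assumptions, not the actual CacheLoader hierarchy: BasicLoader is a hypothetical one-method stand-in for the existing interface, subtrees are identified by strings instead of Fqn, and MapBackedLoader and PersistentStateSupport are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the existing loader interface (only one method shown). */
interface BasicLoader {
    byte[] loadEntireState() throws Exception;
}

/** The proposed extension, with String standing in for Fqn. */
interface ExtendedCacheLoader extends BasicLoader {
    byte[] loadState(String subtree) throws Exception;

    void storeState(byte[] state, String subtree) throws Exception;

    // The 1.3 idea: the entire state is just the state of the subtree rooted at "/".
    @Override
    default byte[] loadEntireState() throws Exception {
        return loadState("/");
    }
}

/** Toy loader: each subtree's state is an opaque blob keyed by its Fqn string. */
class MapBackedLoader implements ExtendedCacheLoader {
    private final Map<String, byte[]> states = new HashMap<>();

    @Override
    public byte[] loadState(String subtree) { return states.get(subtree); }

    @Override
    public void storeState(byte[] state, String subtree) { states.put(subtree, state); }
}

final class PersistentStateSupport {
    private PersistentStateSupport() { }

    /** Fails loudly, rather than returning null, when the loader lacks the new methods. */
    static byte[] loadRegionState(BasicLoader loader, String subtree) throws Exception {
        if (!(loader instanceof ExtendedCacheLoader)) {
            throw new UnsupportedOperationException(
                    loader.getClass().getName() + " does not support partial state transfer");
        }
        return ((ExtendedCacheLoader) loader).loadState(subtree);
    }
}
```

Throwing from the helper keeps legacy loaders untouched while still giving the caller a clear signal that persistent state could not be transferred for the region.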




                      • 8. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                        Bela Ban Master

                         

                        "manik.surtani@jboss.com" wrote:
                        I take it this refers to replicating only a part of the tree across the entire cluster, as opposed to replicating a part of the tree across a subset of the cluster?

                        Just wondering how this would work in relation to the cache partitioning/buddy replication bits (JBCACHE-60, JBCACHE-61) I would be starting on soon.


                        It is completely orthogonal. Buddy Replication (BR) hides the fact that we replicate to only a subset of the cluster; if we are on a node that doesn't have the data, BR will transparently fetch it from someone else.

                        • 9. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                          Bela Ban Master

                           

                          "bstansberry@jboss.com" wrote:
                          It's more replicating a part of the tree from just one node to another, with replication occurring at the request of the receiving node, rather than being initiated by some change on the sending node. The idea is to support nodes that can't/don't want to keep a portion of the tree active by 1) giving them a mechanism to get the current state for that portion once they decide they want it and 2) making it possible for them to ignore replication events for that portion of the tree.


                          Partial state transfer is *not* activation across a cluster (that could be done with activation/passivation enabled plus a remote CacheLoader); it is fetching of the state of an *entire* subtree for the purpose of reloading it in the case when a package has been redeployed. This came from the need to be able to (re)deploy a webapp.

                          • 10. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                            Ben Wang Master

                            1. Just want to point out that this feature can be used in ejb/sfsb as well when we are dealing with marshalling and redeployment.

                            2. If there is buddy replication, it will be even better. All we probably need to do is ask the coordinator for the data; it should first obtain it (from some other buddy replication node) and then return it.

                            • 11. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                              Scott Stark Master

                               

                              "bstansberry@jboss.com" wrote:
                              "Swallowing" Events

                              It's not clear that the replication msg should even be seen by the higher layers if a node does not want to participate, as we should be limiting the propagation of data as soon as possible. I would look at this notion in the context of a complete state machine where an invalidated node has a truncated processing of the replication msg. It certainly could be that a notification might be needed, but a more descriptive name than swallow is in order.


                              • 12. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                                Scott Stark Master

                                 

                                "bstansberry@jboss.com" wrote:
                                TreeCacheAOP Issues

                                In TreeCacheAOP, if 2 objects hold a reference to a 3rd object, that 3rd object is stored in a special, internal area of the cache. This could potentially cause problems with the region-based replication scheme discussed above. ...

                                Since class loading is such a pain to debug, especially in an environment with redeployment, it would be good if there was some optional ability to validate the type system when a component wants to join a shared region. Maybe some type of region visitor implementation that simply validated that the currently loaded types were consistent with the class loader associated with the component wishing to join.
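
One possible shape for such a check is sketched below. It is purely illustrative: RegionTypeValidator is a hypothetical name, the region is reduced to a flat collection of already loaded values, and "consistent" is assumed to mean that a value's defining class loader is the bootstrap loader or appears in the joining component's parent chain. Web application class loaders that do not follow strict parent-first delegation would need a different rule.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical validator; visits the values in a region and reports inconsistent types. */
final class RegionTypeValidator {

    /** True if the type was defined by the bootstrap loader or by a loader in the component's chain. */
    static boolean isConsistent(Class<?> type, ClassLoader componentLoader) {
        ClassLoader defining = type.getClassLoader();
        if (defining == null) {
            return true; // bootstrap classes such as java.lang.String are visible to every loader
        }
        for (ClassLoader cl = componentLoader; cl != null; cl = cl.getParent()) {
            if (cl == defining) {
                return true;
            }
        }
        return false;
    }

    /** Returns the class names of all values whose types are not consistent with the loader. */
    static List<String> validate(Collection<?> regionValues, ClassLoader componentLoader) {
        List<String> offenders = new ArrayList<>();
        for (Object value : regionValues) {
            if (value != null && !isConsistent(value.getClass(), componentLoader)) {
                offenders.add(value.getClass().getName());
            }
        }
        return offenders;
    }
}
```

Returning the list of offending class names, rather than a plain boolean, is meant to help with exactly the debugging pain described above.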


                                • 13. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                                  Brian Stansberry Master

                                   

                                  "bela@jboss.com" wrote:
                                  why is there a need for activateRegion() *and* fetchState() ? Can't they be clubbed together ?


                                  Yes. Initially thought there might be a use case for exposing fetchState(), but I don't think there is.

                                  • 14. Re: JBAS-2142 and JBCACHE-273 Discussion Thread
                                    Brian Stansberry Master

                                     

                                    "scott.stark@jboss.org" wrote:

                                    It's not clear that the replication msg should even be seen by the higher layers if a node does not want to participate, as we should be limiting the propagation of data as soon as possible. I would look at this notion in the context of a complete state machine where an invalidated node has a truncated processing of the replication msg.


                                    The processing is largely truncated. Basically the JGroups RpcDispatcher gets a byte[] from the message and passes it to TreeCacheMarshaller. The marshaller reads the first bytes to get the Fqn of the target node and does a lookup to see if the node is inactivated. If so, it reads no more data and returns a MethodCall (from a static variable) pointing to the _swallow method. RpcDispatcher invokes the method on the TreeCache. The method takes no arguments and is a no-op.

                                    The inefficiency is the invocation of the MethodCall, which I'd certainly like to avoid. But to do this we'd need to change the functioning of the RpcDispatcher so a marshaller can return something to indicate a message should not be processed but that it's not an error condition.

                                    BTW, I misread RpcDispatcher. It doesn't throw an exception if the marshaller returns null, it just logs an error message and returns null. But, logging an error is not OK either.
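
The "read the Fqn first, then stop" part of the processing can be illustrated with a small sketch. The wire format here is an assumption made for the example (a UTF-encoded Fqn string, then a length-prefixed payload) and is not the actual TreeCacheMarshaller format; FqnFirstCodec and its methods are hypothetical names, and an empty Optional stands in for the swallow MethodCall.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Optional;
import java.util.Set;

/** Hypothetical codec that puts the target Fqn at the front of every message. */
final class FqnFirstCodec {

    /** Assumed layout: UTF Fqn, int payload length, payload bytes. */
    static byte[] encode(String fqn, byte[] payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(fqn);
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
        return bytes.toByteArray();
    }

    /** Returns the payload, or empty if the target Fqn is inactive (the payload is never read). */
    static Optional<byte[]> decode(byte[] message, Set<String> inactiveFqns) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(message));
        String fqn = in.readUTF();
        if (inactiveFqns.contains(fqn)) {
            // Truncated processing: no further bytes are read and no classloader is needed.
            return Optional.empty();
        }
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        return Optional.of(payload);
    }
}
```

Because the Fqn is read before anything that needs the webapp classloader, a node with an inactive region never attempts to deserialize application classes it cannot load.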

                                    It certainly could be that a notification might be needed, but a more descriptive name than swallow is in order.


                                    TreeCache doesn't need a notification; it just has to expose a method that RpcDispatcher can call. (although if someone can think of a use case for a notification I'd feel better about the call).

                                    I'll come up with a better name.
