22 Replies Latest reply on Nov 29, 2006 7:38 PM by galder.zamarreno

    Default classloader for deserialization

    brian.stansberry

      Right now the default classloader JBC uses for deserializing RPC calls is the CL that loads RpcDispatcher. This thread is to discuss if that's the correct choice.

      Would the TCCL in effect when the cache was deployed be a better option? In the AS, this would likely be the default UCL for the deploy dir. If the cache was specifically deployed as part of a scoped ear or sar or even a webapp, it would be the deployment's CL.

      The same effect can be achieved by using the region API and registering the deployment CL for region "/", but:

      1) That requires programmatic intervention; a simple -service.xml deployment is insufficient.
      2) It forces use of VersionAwareMarshaller, i.e. serializing a region Fqn at the head of each RPC call. Extra overhead.

        • 1. Re: Default classloader for deserialization
          starksm64

          We need to separate services from runtime aspects tied to what are in general type scoped contexts. If an aspect needs to deal with typed data associated with a potentially scoped deployment, that aspect has to be presented the class loader to use. It also needs to have a life-cycle associated with the deployment as it needs to be recycled if the type system needs to be updated.

          In the past we hid these type issues by encapsulating the data in MarshalledObject/MarshalledValue instances so that the common service layers did not in fact see the type.

          • 2. Re: Default classloader for deserialization
            brian.stansberry

            JBC does have a mechanism for encapsulating the classloading aspect (TreeCacheMarshaller) and an API for code that uses it to register the classloader and manage the lifecycle. We need to be sure to use this mechanism for our own services that use JBC (i.e. doing that in Hibernate is the fix for JBCLUSTER-150.)

            Problem with this is it requires an integration layer to register the classloader, manage the lifecycle, etc. It's also built around an assumption that different regions of the cache use different classloaders. Overly complex and painful for a simple ear that wants to deploy a cache from the ear and use it.

            It would be nice to make it easy for a user to specify during the configuration stage that the TCCL should be registered as the classloader for the entire cache and that the ordinary create/destroy lifecycle methods should deal with the classloader.

            • 3. Re: Default classloader for deserialization
              starksm64

               

              "bstansberry@jboss.com" wrote:

              Problem with this is it requires an integration layer to register the classloader, manage the lifecycle, etc. It's also built around an assumption that different regions of the cache use different classloaders. Overly complex and painful for a simple ear that wants to deploy a cache from the ear and use it.

              It would be nice to make it easy for a user to specify during the configuration stage that the TCCL should be registered as the classloader for the entire cache and that the ordinary create/destroy lifecycle methods should deal with the classloader.

              To me this is another wrapper that integrates well with this usage. It's an mbean service/mc bean that performs the same deployment level integration as the other jboss cache based aspects. This is an independent distributed cache usage outside of any ejb3/hibernate/session usage?

              I guess I'm not clear on what the point of this thread is. The only concern it raises is that if we are promoting independent use of jboss cache, it leads to an obvious conflict of versions when an ear is trying to deploy a version of jboss cache that differs from that bundled with the server. This either requires obfuscation of the package names or very careful control of the container class loading setup.


              • 4. Re: Default classloader for deserialization
                brian.stansberry

                This thread is about independent usage of JBC, not as part of ejb3/hibernate/httpsession repl where we have an integration layer that can take advantage of the existing region-based marshalling API.

                Here's the problem:

                EAR packages class Foo, and also deploys a JBC instance via a -service.xml or -beans.xml. Application places instances of Foo in the cache, which then get replicated.

                When the replication message is received, Foo needs to be deserialized. This is done by the thread coming up from JGroups. This deserialization is done using the classloader for server/all/lib, which is where the JBC and JGroups jars are located. Thus class Foo is not visible to the classloader. Replication fails with a CNFE. Same problem occurs with state transfer when a 2nd node joins a cluster where there's existing data.
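The failure mode described above can be reproduced outside the AS. The following is a minimal sketch, not JBC or JGroups code: `Foo`, `LoaderAwareInputStream` and `ReplicationDemo` are hypothetical names, and the stream simply resolves classes through whichever loader it is handed, the way a receiving thread would be stuck with the server/all/lib loader.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Stand-in for the application class packaged in the EAR (hypothetical). */
class Foo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    Foo(String name) {
        this.name = name;
    }
}

/** ObjectInputStream that resolves classes only through the given loader. */
class LoaderAwareInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc) throws ClassNotFoundException {
        // Deliberately no fallback to super.resolveClass(): the point is to show
        // what happens when the chosen loader cannot see the class.
        return Class.forName(desc.getName(), false, loader);
    }
}

final class ReplicationDemo {
    /** Serializes a value the way a sending node would before replication. */
    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the payload can be deserialized with the given loader. */
    static boolean canDeserialize(byte[] payload, ClassLoader loader) {
        try (ObjectInputStream in =
                 new LoaderAwareInputStream(new ByteArrayInputStream(payload), loader)) {
            in.readObject();
            return true;
        } catch (ClassNotFoundException e) {
            return false; // the CNFE seen on the receiving node
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing `Foo.class.getClassLoader()` to `canDeserialize` succeeds, while a loader that cannot see `Foo` (for example a `URLClassLoader` with no URLs and a null parent) makes it return false.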

                Solutions to this problem are:

                1) Place Foo.class in server/all/lib.
                2) Turn on region-based marshalling and register a classloader with the cache, effectively saying, for example: for any replication traffic related to Fqns below "/a", use this classloader to deserialize the message. If this feature is turned on, all replication messages consist of two components -- first a serialized Fqn that identifies which region the message pertains to (used by the recipient to look up the classloader) and then the regular serialized MethodCall.
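To make the cost of option 2 concrete, here is a rough sketch of the two-part message and the per-region lookup. It is an illustration only: `RegionClassLoaders`, `frame` and `fqnOf` are made-up names, Fqns are simplified to slash-separated strings, and the real JBC wire format is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical region registry; not thread-safe, for illustration only. */
final class RegionClassLoaders {
    private final Map<String, ClassLoader> regions = new HashMap<>();
    private final ClassLoader defaultLoader;

    RegionClassLoaders(ClassLoader defaultLoader) {
        this.defaultLoader = defaultLoader;
    }

    void register(String regionFqn, ClassLoader loader) {
        regions.put(regionFqn, loader);
    }

    /** Walks up from the Fqn to "/" and returns the nearest registered loader. */
    ClassLoader lookup(String fqn) {
        String path = fqn;
        while (true) {
            ClassLoader loader = regions.get(path);
            if (loader != null) {
                return loader;
            }
            if (path.equals("/")) {
                return defaultLoader;
            }
            int slash = path.lastIndexOf('/');
            path = slash <= 0 ? "/" : path.substring(0, slash);
        }
    }

    /** Sender side: prefix the already-serialized call with its region Fqn. */
    static byte[] frame(String fqn, byte[] serializedCall) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(fqn);          // extra bytes paid on every message
            out.write(serializedCall);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Receiver side: read the Fqn first so the right loader can be chosen. */
    static String fqnOf(byte[] message) {
        try {
            return new DataInputStream(new ByteArrayInputStream(message)).readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A receiver would call `lookup(fqnOf(message))` and then deserialize the remainder of the message with the returned loader. If only one loader is ever registered, the Fqn prefix is pure overhead.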

                To me this is another wrapper that integrates well with this usage. It's an mbean service/mc bean that performs the same deployment level integration as the other jboss cache based aspects.


                This is one possible solution, and perhaps the best. The downside to it is it adds the overhead of including the region-identifier Fqn in each replication message -- which isn't really needed if there is only one correct classloader for the whole cache.

                So, the purpose of this thread is to explore that and possibly other solutions.

                • 5. Re: Default classloader for deserialization
                  starksm64

                  Ok, then the disconnect is why control of serialization requires a region based configuration. The first problem would seem to be that even the typed rpc functionality of jgroups should have a serialization api that allows shared use of a channel with marshalling/unmarshalling of types based on application endpoints with different type systems. Serialization is an aspect of the user, not an inherent function of the jbc or jgroups. The default configuration of the serialization aspect needs to be triggered by an application level aspect, which could pick up the app thread context class loader.

                  It seems we are picking up a naive standalone component behavior where serialization is an inherent function of the messaging layer. That is not the case for any service that needs to integrate into jbossas, esb, etc.

                  • 6. Re: Default classloader for deserialization
                    brian.stansberry

                    The region-based configuration thing is somewhat of a holdover from the pre-JGroups-multiplexer days, when a cache was an expensive object due to the need for its own JChannel. It allows multiple deployments with different type systems to use the same underlying cache, as long as they store their data in separate regions. HTTP session repl with FIELD granularity uses this, since MarshalledValue wouldn't work properly with what PojoCache needs.

                    My instinct is that with the multiplexer it makes more sense to have an architecture where deployments that need a cache instantiate their own -- see comments on the second half of http://www.jboss.com/index.html?module=bb&op=viewtopic&t=93825 for some thoughts on this. But this takes some thought. It goes in the opposite direction, for example, from an idea Sacha raised last winter about having all data related to a Seam app be stored in the same cache so there could just be one replication event per request.

                    JGroups does provide the needed hooks for applications to do marshalling/unmarshalling of messages as they see fit. JBC makes extensive use of these (as does HAPartition). So JGroups isn't naive in this respect. It's more that JBC is naive, if the overly complex region-based thing isn't used.

                    The default configuration of the serialization aspect needs to be triggered by an application level aspect, which could pick up the app thread context class loader.


                    OK, sounds like we're moving toward something like:

                    1) JBC exposes an API that allows registration of a default classloader but which doesn't trigger the whole region-based marshalling overhead. I suppose registration can be a simple setter in the Configuration object graph.
                    2) We have a wrapper class that in create() picks up the TCCL and registers it, then calls create on the cache. In destroy() it calls destroy() on the cache and then unregisters the classloader.
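The two points above can be sketched roughly as follows. Nothing here is an existing JBC API: `ClassLoaderAwareCache`, its `setDefaultClassLoader` setter, `TcclCacheWrapper` and `RecordingCache` are all hypothetical names used to show the create/destroy ordering.

```java
/** Hypothetical slice of a cache API with the setter from point 1. */
interface ClassLoaderAwareCache {
    void setDefaultClassLoader(ClassLoader loader);
    void create();
    void destroy();
}

/** Wrapper from point 2: ties the default classloader to the cache lifecycle. */
final class TcclCacheWrapper {
    private final ClassLoaderAwareCache cache;

    TcclCacheWrapper(ClassLoaderAwareCache cache) {
        this.cache = cache;
    }

    void create() {
        // Assumes the deployer has set the TCCL to the deployment's loader
        // before calling create().
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        cache.setDefaultClassLoader(tccl);
        cache.create();
    }

    void destroy() {
        try {
            cache.destroy();
        } finally {
            // Unregister so the deployment's classes can be garbage collected.
            cache.setDefaultClassLoader(null);
        }
    }
}

/** Trivial stand-in cache that only records what the wrapper did. */
final class RecordingCache implements ClassLoaderAwareCache {
    ClassLoader defaultLoader;
    boolean created;

    public void setDefaultClassLoader(ClassLoader loader) {
        this.defaultLoader = loader;
    }

    public void create() {
        created = true;
    }

    public void destroy() {
        created = false;
    }
}
```

Clearing the loader in a `finally` block is a design choice of this sketch: a failed destroy should still not pin the deployment's classloader.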

                    • 7. Re: Default classloader for deserialization
                      starksm64

                      That still sounds potentially too coarse. I would look at the cache as a message-oriented tree where a usage context has to bind a serialization manager to obtain a typed view of the cache data. When there are puts/gets to the cache, it needs to be done through the context "connection" to the cache messaging layer that has the correct serialization manager.

                      I view a deployment wanting their own cache as wanting their own typed view, and namespace, but they may want to automagically integrate into the deployment env transport that binds distributed cache contexts together. Not overriding the transport/clustered configuration gets into being able to take a deployment and have it inherit aspects of its environment.
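One possible reading of the "typed view" idea, as a minimal sketch: it assumes the shared core holds only byte[] values and that each usage context wraps it with its own classloader. `ByteStore` and `TypedView` are hypothetical names, and keys are plain strings here even though real Fqn elements may themselves be typed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Untyped core: knows nothing about application classes. */
final class ByteStore {
    final Map<String, byte[]> data = new ConcurrentHashMap<>();
}

/** Per-deployment "connection" that binds a classloader to the untyped core. */
final class TypedView {
    private final ByteStore store;
    private final ClassLoader loader;

    TypedView(ByteStore store, ClassLoader loader) {
        this.store = store;
        this.loader = loader;
    }

    void put(String fqn, Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            store.data.put(fqn, bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the value deserialized with this view's loader, or null if absent. */
    Object get(String fqn) {
        byte[] raw = store.data.get(fqn);
        if (raw == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw)) {
            @Override
            protected Class<?> resolveClass(ObjectStreamClass desc)
                    throws ClassNotFoundException {
                // Types are resolved by the view, never by the shared core.
                return Class.forName(desc.getName(), false, loader);
            }
        }) {
            return in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class not visible to this view's loader", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two deployments could share one `ByteStore` and each construct a `TypedView` with its own loader. The obvious price is a serialization round trip on every put and get, even in purely local use.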

                      • 8. Re: Default classloader for deserialization
                        brian.stansberry

                        If I understand you correctly, that implies that internally the cache stores everything as a byte[] or MarshalledValue or some such. This includes the elements of Fqns, as these can also be typed.

                        • 9. Re: Default classloader for deserialization

                           

                          "bstansberry@jboss.com" wrote:

                          Would the TCCL in effect when the cache was deployed be a better option? In the AS, this would likely be the default UCL for the deploy dir. If the cache was specifically deployed as part of a scoped ear or sar or even a webapp, it would be the deployment's CL.


                          I don't quite understand your proposal to handle "scoped" deployment though? Each scoped application will have its own UCL, won't it? And this is deployment specific.

                          • 10. Re: Default classloader for deserialization
                            starksm64

                             

                            "bstansberry@jboss.com" wrote:
                            If I understand you correctly, that implies that internally the cache stores everything as a byte[] or MarshalledValue or some such. This includes the elements of Fqns, as these can also be typed.


                            At some level, but it does not have to be the core. Having the namespace be typed and potentially scoped I'm sure complicates interaction of what might be shared/common aspects. If the cache is used in pure in-memory mode, there should be no need to serialize the data. If you throw in a cache loader, then you have another serialization path. It certainly needs the correct class loader, but it's separate from the current jgroups message issue. Is there a common serialization abstraction for these two cases?


                            • 11. Re: Default classloader for deserialization
                              brian.stansberry

                               

                              "ben.wang@jboss.com" wrote:

                              I don't quite understand your proposal to handle "scoped" deployment though? Each scoped application will have its own UCL, won't it? And this is deployment specific.


                              The concept there was that if a cache was deployed as part of an ear with a scoped classloader, the default classloader for deserializing messages would be the ear's classloader. So the cache would be able to deserialize any class visible to the ear.

                              • 12. Re: Default classloader for deserialization
                                brian.stansberry

                                 

                                "scott.stark@jboss.org" wrote:
                                At some level, but it does not have to be the core. Having the namespace be typed and potentially scoped I'm sure complicates interaction of what might be shared/common aspects. If the cache is used in pure in-memory mode, there should be no need to serialize the data. If you throw in a cache loader, then you have another serialization path. It certainly needs the correct class loader, but it's separate from the current jgroups message issue. Is there a common serialization abstraction for these two cases?


                                Currently no. There is some recent discussion around unifying this somewhat (see http://jira.jboss.com/jira/browse/JBCACHE-879.)

                                I'm not aware of any serialization problems related to classloaders, presumably because cacheloader operations are typically performed by application threads where the TCCL has visibility to the required classes. That could certainly break down though in a more exotic use case.

                                • 13. Re: Default classloader for deserialization
                                  manik

                                  Well, to boil it down (from what I understand): the default CL is the one used by JGroups, in the RpcDispatcher, when deserializing MethodCall objects streamed over the wire. And you are talking about overriding this with the TCCL in effect when JBossCache is instantiated?

                                  • 14. Re: Default classloader for deserialization
                                    maxandersen

                                    It should be the TCCL that is in effect for the application that needs to read/write to the cache.

                                    Whether that is actually possible, given the asynchronous ways JBC is used... that is the question. (I guess this is why bstansberry talked about classloader per region)
