7 Replies Latest reply on Dec 1, 2006 11:42 AM by brian.stansberry

    Problems with replicating entity queries via JBC

    brian.stansberry

      Using JBC as a 2nd level cache leads to problems deserializing user classes when the Hibernate query cache is enabled. See http://jira.jboss.com/jira/browse/JBCLUSTER-150 for details.

      This is really a JBC/Hibernate issue, although workarounds are possible in EJB3. Since any solution impacts EJB3, and any very short term workaround (i.e. for a December stacks release) is only possible in EJB3, I'm discussing the issue here.

      The issue is that JBC's classloader is the one for server/all/lib, so it can't see (and therefore can't deserialize) classes that are loaded from deploy. Deserialization happens when Hibernate's use of JBC as a 2nd level cache causes replication. Specifically:

      1) The Hibernate query cache is used. Here a user class could end up being used as an argument in a query. Hibernate replicates the actual query as part of the Fqn of a node in JBC.
      2) A custom class is used as a field in an entity, and it's not mapped as a component; instead Hibernate treats it as a BLOB. The custom class would be replicated. (I haven't actually confirmed this failure mode, but it quite likely exists.)

      Solutions:

      A) Register a classloader with JBC for a region of the cache. JBC exposes an API to allow this to happen. The Hibernate/JBC integration code (part of Hibernate) can be updated to take advantage of this API. Hibernate already has the logical concept of different cache regions; the integration code can be updated to register the thread context classloader with JBC when a region is created.

      EJB3 already partially overrides the standard Hibernate/JBC integration code, so as a short term workaround it's possible to fix this in EJB3, pending a later Hibernate release with this fixed.
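To make the idea in A) concrete, here is a minimal sketch of a per-region classloader table. It is a self-contained model, not JBC's API: the class name RegionClassLoaderRegistry and its methods are made up for illustration, and the real integration code would call whatever JBC exposes for this. It assumes a node is resolved against the deepest registered region that contains it, falling back to the cache's own classloader otherwise.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical stand-in for a cache's per-region classloader table.
 * This is NOT JBC's API; it only models the lookup described above.
 */
final class RegionClassLoaderRegistry {
    // Region path (e.g. "/com/titan/Foo") -> classloader registered for it.
    private final Map<String, ClassLoader> loaders = new TreeMap<>();
    // Loader used when no region matches (models the cache's own loader).
    private final ClassLoader fallback;

    RegionClassLoaderRegistry(ClassLoader fallback) {
        this.fallback = fallback;
    }

    /** Hibernate names regions with dots; the cache tree uses slashes. */
    static String toRegionPath(String regionName) {
        return "/" + regionName.replace('.', '/');
    }

    /** Called when a region is created, typically with the deployer's TCCL. */
    void register(String regionName, ClassLoader loader) {
        loaders.put(toRegionPath(regionName), loader);
    }

    /** Finds the loader of the deepest registered region containing nodePath. */
    ClassLoader loaderFor(String nodePath) {
        String path = nodePath;
        while (!path.isEmpty()) {
            ClassLoader cl = loaders.get(path);
            if (cl != null) {
                return cl;
            }
            // Strip the last path element and try the parent.
            int slash = path.lastIndexOf('/');
            path = path.substring(0, Math.max(slash, 0));
        }
        return fallback;
    }
}
```

With this model, registering the TCCL for "com.titan.Foo" means a node such as /com/titan/Foo/42 resolves to the deployment's loader, while a node outside any registered region still gets the shared one.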

      Problems with solution A):

      i) Hibernate has a "default" query cache region that will be shared by all queries that don't specifically name a region. There is no foolproof way to assign a specific classloader to this region, since it's storing to a JBC instance that may be shared between multiple EJB deployments. The cache region can be specified by users, although it's ugly:

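      import java.io.Serializable;
      import javax.persistence.Entity;
      import javax.persistence.NamedQueries;
      import javax.persistence.NamedQuery;
      import javax.persistence.QueryHint;
      import org.hibernate.annotations.Cache;
      import org.hibernate.annotations.CacheConcurrencyStrategy;

      @Entity
      @Cache (usage=CacheConcurrencyStrategy.TRANSACTIONAL)
      @NamedQueries({
       @NamedQuery(name="account.highbalance.default",query="select account.balance from Account as account where account.accountHolder = ?1 and account.balance > ?2"),
       @NamedQuery(name="account.highbalance.namedregion",query="select account.balance from Account as account where account.accountHolder = ?1 and account.balance > ?2",
       hints={@QueryHint(name="org.hibernate.cacheRegion",value="AccountRegion")})
      })
      public class Account implements Serializable
      {
      ...
      }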
      

      The second query above uses the named region.


      ii) With the query cache, the custom class is actually part of the Fqn of a JBC node, rather than part of its data map. I realized over the weekend that the existing JBC code for handling region-based classloaders doesn't handle custom classes in an Fqn. JIRA to remove this limitation is http://jira.jboss.com/jira/browse/JBCACHE-876, which should be pretty straightforward. But, fixing it requires JBC 1.4.1.GA, which is likely at least a couple weeks away.


      Bringing us to solution..

      B) Don't replicate the query cache.

      Hibernate provides a hook for specifying the factory class used to set up any query cache. In persistence.xml we can specify a custom query cache factory that integrates with JBC, but which instructs JBC not to replicate any writes for that region of the cache. Thus the query cache is a local-only cache, while entities are replicated. Hibernate also has an "UpdateTimestampsCache", which is used to trigger invalidations of query cache entries. AFAIK, there is no replication problem with this cache, and it needs to remain replicated.
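A rough sketch of what "local only for the query cache, replicated for everything else" means in practice. This is a toy model under stated assumptions: RegionCache is a hypothetical class, and the List passed in merely stands in for the cluster channel. It does not show Hibernate's factory hook or JBC's actual option for suppressing replication.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a cache region whose writes may be kept local. */
final class RegionCache {
    private final String region;
    private final boolean localOnly;
    private final Map<Object, Object> data = new HashMap<>();
    // Stands in for the cluster channel: each entry is one replicated write.
    private final List<String> replicated;

    RegionCache(String region, boolean localOnly, List<String> replicated) {
        this.region = region;
        this.localOnly = localOnly;
        this.replicated = replicated;
    }

    /** Always stores locally; only non-local-only regions notify the cluster. */
    void put(Object key, Object value) {
        data.put(key, value);
        if (!localOnly) {
            replicated.add(region + ":" + key);
        }
    }

    Object get(Object key) {
        return data.get(key);
    }
}
```

A query region created with localOnly=true still serves hits on the node that ran the query, but nothing reaches the other nodes, so nothing has to be deserialized there. The timestamps region would be created with localOnly=false.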

      I've pinged the Hibernate guys re: any issues w/ not replicating the query cache.

      Solution B can also be implemented in the EJB3 code base pending a better solution in a Hibernate release. If there is an immediate requirement to get a workaround for this problem, it's the only way I see to go.


      Another long term possibility is solution...

      C) Like A), but now we create one JBC entity cache per deployment (i.e. per Hibernate session factory) rather than a single shared cache. This avoids the problem of deciding what classloader to use for the "default" query cache -- there's only one classloader per cache instance. With the JGroups multiplexer, having numerous JBC instances is a workable option. But going this route requires more thought.

        • 1. Re: Problems with replicating entity queries via JBC
          maxandersen

          Hi Brian,

          The important part of query caching is the UpdateTimestampsCache, which does require replication - otherwise Hibernate will not be able to decide whether a result in the query cache is out-of-date or not.

          So it should be ok (i guess) :)

          Manik, Bela, Galder and I had a talk during JBW which I hope Manik will soon send out his notes for.
          The results of that meeting would probably help fix some of these things too.

          Saying that, I'm surprised to hear that JBC isn't able to use the application's current thread context classloader?! I thought this was the way these serialization issues were solved before?

          Note, I don't think the serialization issues only occur for querycache - it would AFAIK also occur if a user has his own UserType in the id of his entities.

          /max



          • 2. Re: Problems with replicating entity queries via JBC
            brian.stansberry

            Thanks, Max.

            I figured if update timestamps was replicated it should be OK. I'll of course test it out :).


            Saying that, I'm surprised to hear that JBC isn't able to use the application's current thread context classloader?! I thought this was the way these serialization issues were solved before?


            Problem arises because the cache is not deployed as part of the application -- it's a shared resource. Deserialization is done by a thread coming up from the JGroups layer, not an "application" thread, so there is no single correct TCCL to use.

            But, you raise an interesting point about when the cache is not a shared resource. In that case it still uses the classloader that loaded JGroups to deserialize messages, rather than the TCCL that was in effect when the cache was deployed. I'll open a forum thread about that.
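For anyone unfamiliar with why the deserializing thread matters: plain Java deserialization resolves classes relative to the caller, so the loader has to be supplied explicitly when the caller is a shared component. A minimal sketch using only the JDK follows. LoaderAwareObjectInputStream and Marshalling are illustrative names, not classes from JBC or JGroups, and how the right loader gets picked per message is exactly the open question in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

/** Resolves classes against an explicitly supplied loader, not the caller's. */
final class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup, which also handles primitive types.
            return super.resolveClass(desc);
        }
    }
}

/** Illustrative helpers: serialize with the JDK, deserialize with a chosen loader. */
final class Marshalling {
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new LoaderAwareObjectInputStream(new ByteArrayInputStream(data), loader)) {
            return in.readObject();
        }
    }
}
```

If the loader passed to fromBytes cannot see the user's class, readObject fails with ClassNotFoundException, which is the failure mode described in JBCLUSTER-150.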


            Note, I don't think the serialization issues only occur for querycache - it would AFAIK also occur if a user has his own UserType in the id of his entities.


            Yep. That requires JBCACHE-876 to fix, since the user class will be in the Fqn rather than the data map. :(

            • 3. Re: Problems with replicating entity queries via JBC
              brian.stansberry
              • 4. Re: Problems with replicating entity queries via JBC
                brian.stansberry

                Seems this is only an issue when a scoped classloader is used. If one isn't used, the classloader that loaded JGroups sees the classes without any problem.

                • 5. Re: Problems with replicating entity queries via JBC
                  maxandersen

                  But scoped classloading is what we want, to allow deployment of apps that use different versions of classes.

                  • 6. Re: Problems with replicating entity queries via JBC
                    brian.stansberry

                    Absolutely. I didn't mean to imply that this meant it wasn't a serious problem.

                    • 7. Re: Problems with replicating entity queries via JBC
                      brian.stansberry

                      A temporary workaround for this has been added as described in http://jira.jboss.com/jira/browse/EJBTHREE-798.

                      To give more details about how this works to those unfamiliar with the JBC marshalling API, I'll go through an example here where 2 different EJB deployments (i.e. different jars with their persistence.xml files) are deployed and used on 2 different servers in a cluster.

                      I) Startup of Server 1.

                      A) Shared JBoss Cache instance used for entity replication is started. JBC is configured for region based marshalling. JBC is configured "inactive at startup". This means the cache will ignore any replication traffic it receives from across the cluster, until the "region" (i.e. branch of the cache's tree) to which the traffic pertains is activated.

                      B) EJB deployment A.jar is deployed on Server 1.

                      1) Hibernate discovers @Entity com.titan.Foo in A.jar.
                      a) Hibernate calls TreeCacheProviderHook asking for an org.hibernate.cache.Cache instance for "com.titan.Foo".
                      b) TreeCacheProviderHook instantiates an instance of o.j.e.entity.JBCCache.
                      c) JBCCache registers the TCCL with JBC as the classloader for /com/titan/Foo.
                      d) JBCCache activates region /com/titan/Foo.
                      e) JBC requests a transfer from the cluster of the state for region /com/titan/Foo. Nothing is returned since Server 1 is the only active server.
                      f) JBC instructs its replication layer to begin accepting replicated messages related to /com/titan/Foo.

                      2) Hibernate sees that the query cache is enabled, so it calls TreeCacheProviderHook asking for an org.hibernate.cache.Cache instance for "org.hibernate.cache.StandardQueryCache".
                      a) Same process as 1) above occurs, except:
                      i) JBCCache recognizes that "org.hibernate.cache.StandardQueryCache" is a special region that multiple deployments can share, so it doesn't register its classloader.
                      ii) No registered classloader means data can't be safely replicated, so JBCCache sets a flag telling itself to make any writes to the cache "local only", i.e. non-replicated.

                      3) Hibernate sees that the query cache is enabled, so it calls TreeCacheProviderHook asking for an org.hibernate.cache.Cache instance for "org.hibernate.cache.UpdateTimestampsCache".
                      a) Same process as 1) above occurs, except:
                      i) JBCCache recognizes that "org.hibernate.cache.UpdateTimestampsCache" is a special region that multiple deployments can share, so it doesn't register its classloader.
                      ii) Only Strings and longs are used in the "org.hibernate.cache.UpdateTimestampsCache", and this data *must* be replicated, so JBCCache does *not* set the flag telling itself to make any writes to the cache "local only".

                      C) EJB deployment B.jar is deployed on Server 1. Jar contains entity com.titan.Account. Same process as B) above occurs except:
                      1) When JBCCache instance for StandardQueryCache is created, it doesn't bother activating the region, as it sees that it is already active.
                      2) Same for UpdateTimestampsCache.

                      D) Some instances of Foo and Account get cached. These would replicate (if there were other servers in the cluster.)

                      E) A cacheable query is executed, with no special cache region specified. JBCCache for the StandardQueryCache region puts it into JBC, but sets an option so it doesn't replicate.

                      F) A cacheable query is executed, with special cache region "QueriesA" specified.
                      1) Hibernate sees it needs to instantiate a Cache for region "QueriesA".
                      2) Same process as B1) above occurs.
                      3) Query is put into the cache, and would replicate if there were other cluster members.
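The per-region decisions in section I) boil down to a small table, sketched below. RegionPolicy is a hypothetical class written for this post, not the actual JBCCache code; it only encodes the rules as described above: the two well-known shared regions get no classloader, only the default query cache is kept local, and every other region (entities, named query regions such as "QueriesA") gets the TCCL and replicates.

```java
/** Hypothetical summary of how each region kind is treated at creation time. */
final class RegionPolicy {
    static final String QUERY_CACHE = "org.hibernate.cache.StandardQueryCache";
    static final String TIMESTAMPS = "org.hibernate.cache.UpdateTimestampsCache";

    /** Whether the deployment's TCCL is registered for the region. */
    final boolean registerClassLoader;
    /** Whether writes to the region are kept off the cluster. */
    final boolean localOnly;

    private RegionPolicy(boolean registerClassLoader, boolean localOnly) {
        this.registerClassLoader = registerClassLoader;
        this.localOnly = localOnly;
    }

    static RegionPolicy forRegion(String regionName) {
        if (QUERY_CACHE.equals(regionName)) {
            // Shared by deployments and may hold user classes: keep writes local.
            return new RegionPolicy(false, true);
        }
        if (TIMESTAMPS.equals(regionName)) {
            // Shared, but only Strings and longs: safe, and required, to replicate.
            return new RegionPolicy(false, false);
        }
        // Entity regions and named query regions belong to a single deployment.
        return new RegionPolicy(true, false);
    }
}
```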

                      II) Server 2 starts

                      A) Same as IA) above.
                      B/C) Same as IB/C) above, except:

                      B1) now when the region is activated, a state transfer does occur, bringing over the cached queries from Server 1.
                      B2) Same as IB2) above, but again a state transfer will occur.

                      ISSUE: This would be a problem if custom classes were stored from multiple deployments, as the TCCL will only be able to deserialize classes for the current deployment (A.jar). I need to think if there is any workaround for this. Note that this only causes a problem if an instance of a custom class is directly stored as a query parameter or as a single result. If the custom class is broken down by Hibernate into primitives, it's not a problem.

                      B3) Same as IB3) above, but again a state transfer will occur. This won't be an issue as no custom classes are stored.

                      III) EJB deployment A.jar is undeployed on Server 1. (Not sure of the exact order of this, but shouldn't matter).

                      A) Hibernate calls destroy() on JBCCache for "QueriesA" region.
                      1) All data in the region is evicted from the local cache. This isn't replicated.
                      2) inactivateRegion() is called on JBC for "QueriesA". JBC starts ignoring replication traffic for nodes under Fqn /QueriesA. All data under the region is then evicted again. This ensures no classloader leakage from objects stored in the cache.
                      3) JBCCache unregisters the classloader from JBC for "QueriesA".

                      B) Same as A) for the "com.titan.Foo" region.

                      C) Hibernate calls destroy() on JBCCache for "org.hibernate.cache.StandardQueryCache" region.
                      1) All data in the region is evicted from the local cache. This isn't replicated. But, any cached queries for B.jar will be evicted as well. (Just means they need to be executed against DB again if needed.)
                      2) inactivateRegion() is *not* called on JBC, since JBCCache has no way to know if other deployments are still using it.

                      D) Same as C) above for "org.hibernate.cache.UpdateTimestampsCache" region.
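The undeploy sequence in III) can be summarized the same way. Again this is only a sketch: RegionTeardown and the step names are invented for illustration and just list, in order, what destroy() does for a deployment-private region versus a region that may still be in use by another deployment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical outline of destroy() for private vs. shared regions. */
final class RegionTeardown {
    static List<String> destroySteps(boolean sharedRegion) {
        List<String> steps = new ArrayList<>();
        // Always drop local data; this eviction is not replicated.
        steps.add("evict-local");
        if (!sharedRegion) {
            // Only safe when no other deployment can be using the region.
            steps.add("inactivate-region");
            steps.add("unregister-classloader");
        }
        return steps;
    }
}
```

For "QueriesA" or "com.titan.Foo" all three steps run. For StandardQueryCache and UpdateTimestampsCache only the local eviction runs, which is why B.jar's cached queries disappear on that node but the region stays active.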