Clustered JPA/Hibernate Second Level Caching in JBoss AS 5

Version 6

    Overview

    One of the big improvements in the clustering area in JBoss AS 5 is the use of the new Hibernate/JBoss Cache integration for second level caching that was introduced in Hibernate 3.3.  Used along with AS 5's new CacheManager service, the combination provides a flexible framework for caching entities and query results.

     

    In the JPA/Hibernate context, a second level cache refers to a cache whose contents are retained beyond the scope of a transaction.  Because items are maintained in memory after any transactional locks in the underlying database are released, a second level cache needs to be very careful to ensure that cached items are either updated or invalidated out of the cache whenever the contents of the underlying database are changed. If you use more than one JBoss AS instance to run your JPA/Hibernate application and you use second level caching for read-write entities, you must use a cluster-aware cache. Otherwise, the cache on server A will still hold out-of-date data after activity on server B updates some entities. JBoss AS provides a cluster-aware second level cache based on JBoss Cache.
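
    The failure mode described above can be illustrated with a small, self-contained Java sketch. It is a toy model, not JBoss Cache or Hibernate code: the class StaleCacheSketch, its Node type and the map standing in for the database are all hypothetical names introduced for illustration. Each node keeps its own cache; only when the nodes are linked does an update on one node invalidate the entry held by the other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: the "database" is a shared map and each "server" has its own cache. */
class StaleCacheSketch {

    /** One application server with a private second level cache. */
    static class Node {
        private final Map<Long, String> database;            // shared by all nodes
        private final Map<Long, String> cache = new HashMap<>();
        private final List<Node> peers = new ArrayList<>();  // empty = not cluster-aware

        Node(Map<Long, String> database) {
            this.database = database;
        }

        /** Makes the two nodes cluster-aware by letting them invalidate each other. */
        static void cluster(Node a, Node b) {
            a.peers.add(b);
            b.peers.add(a);
        }

        /** Reads through the cache, loading from the database on a miss. */
        String read(Long id) {
            return cache.computeIfAbsent(id, database::get);
        }

        /** Writes to the database, updates the local cache and invalidates peers. */
        void update(Long id, String value) {
            database.put(id, value);
            cache.put(id, value);
            for (Node peer : peers) {
                peer.cache.remove(id);
            }
        }
    }

    /** Returns what server A reads after server B has changed the row. */
    static String readOnAAfterUpdateOnB(boolean clusterAware) {
        Map<Long, String> database = new HashMap<>();
        database.put(1L, "balance=100");
        Node a = new Node(database);
        Node b = new Node(database);
        if (clusterAware) {
            Node.cluster(a, b);
        }
        a.read(1L);                  // A caches the original value
        b.update(1L, "balance=50");  // B changes the row
        return a.read(1L);
    }

    public static void main(String[] args) {
        System.out.println("independent caches: " + readOnAAfterUpdateOnB(false));
        System.out.println("cluster-aware:      " + readOnAAfterUpdateOnB(true));
    }
}
```

    With independent caches, server A keeps serving balance=100 after server B has written balance=50; once the nodes are linked, A reloads the row and sees balance=50. Invalidation is used here for brevity; updating the cached item in place is the other option mentioned above.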

     

    There are actually four distinct types of items that can be cached in the second level cache: entities, collections, query results and entity timestamps. (Collections are the primary keys of entities stored in a collection field of another entity. Timestamps are used in conjunction with query result caching to ensure out-of-date query results are ignored.)  The optimal cache configuration for entities/collections is different from what is optimal for query results, and what's optimal for timestamps is different from both. The new Hibernate/JBoss Cache integration and the CacheManager service make it relatively easy to give each type a suitable cache configuration; in AS 4 only a single cache could be used, forcing a suboptimal lowest-common-denominator configuration.
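
    To make the role of timestamps more concrete, here is a minimal Java sketch of a query result cache guarded by per-table update timestamps. It is a simplified model under stated assumptions, not Hibernate's implementation: QueryTimestampSketch and its methods are hypothetical names, each query is assumed to read a single table, and a logical counter stands in for real timestamps.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of query result caching guarded by per-table update timestamps. */
class QueryTimestampSketch {

    /** A cached result plus the logical time at which it was stored. */
    static final class CachedResult {
        final Object value;
        final long cachedAt;

        CachedResult(Object value, long cachedAt) {
            this.value = value;
            this.cachedAt = cachedAt;
        }
    }

    private long clock = 0;  // logical clock, stands in for real timestamps
    private final Map<String, Long> lastUpdate = new HashMap<>();       // table -> time
    private final Map<String, CachedResult> results = new HashMap<>();  // query -> result

    /** Records that a table was modified (the role of the timestamps cache). */
    void tableUpdated(String table) {
        lastUpdate.put(table, ++clock);
    }

    /** Stores a query result together with the current logical time. */
    void putResult(String query, Object value) {
        results.put(query, new CachedResult(value, ++clock));
    }

    /** Returns the cached result, or null if it is missing or out of date. */
    Object getResult(String query, String table) {
        CachedResult cached = results.get(query);
        if (cached == null) {
            return null;
        }
        long updatedAt = lastUpdate.getOrDefault(table, 0L);
        return updatedAt > cached.cachedAt ? null : cached.value;
    }

    public static void main(String[] args) {
        QueryTimestampSketch cache = new QueryTimestampSketch();
        cache.putResult("balances", 100);
        System.out.println(cache.getResult("balances", "Account")); // still valid
        cache.tableUpdated("Account");
        System.out.println(cache.getResult("balances", "Account")); // ignored as stale
    }
}
```

    The sketch also hints at why timestamps call for their own configuration: a cached query result is only as trustworthy as the timestamp it is compared against, so in a cluster every node that caches query results needs to see table updates made on the other nodes.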

     

    Configuration

     

    Configuration of a JPA second level cache is done via your deployment's persistence.xml.

     

    Here's an example persistence.xml that configures a clustered second level cache with query result caching enabled.[1]

     

    <?xml version="1.0" encoding="UTF-8"?>
    <persistence xmlns="http://java.sun.com/xml/ns/persistence"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
       http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd"
       version="1.0">

       <persistence-unit name="tempdb" transaction-type="JTA">
          <jta-data-source>java:/DefaultDS</jta-data-source>
          <properties>
             <property name="hibernate.cache.use_second_level_cache" value="true"/>
             <property name="hibernate.cache.use_query_cache" value="true"/>

             <property name="hibernate.cache.region.factory_class" value="org.hibernate.cache.jbc2.JndiMultiplexedJBossCacheRegionFactory"/>
             <property name="hibernate.cache.region.jbc2.cachefactory" value="java:CacheManager"/>
             <property name="hibernate.cache.region.jbc2.cfg.entity" value="mvcc-entity"/>
             <property name="hibernate.cache.region.jbc2.cfg.query" value="local-query"/>

             <property name="hibernate.cache.region_prefix" value="tempdb"/>
          </properties>
       </persistence-unit>
    </persistence>

     

    Some comments on the above:

     

    • The hibernate.cache.use_second_level_cache setting tells Hibernate to enable entity and collection caching.
    • The hibernate.cache.use_query_cache setting tells Hibernate to enable query result set caching.[1]
    • The hibernate.cache.region.factory_class setting tells Hibernate to use the JBC second level caching integration, using a variant that uses a CacheManager found in JNDI as the source for JBC instances.

    • The hibernate.cache.region.jbc2.cachefactory setting tells JndiMultiplexedJBossCacheRegionFactory where to find the CacheManager in JNDI.
    • The hibernate.cache.region.jbc2.cfg.entity setting tells JndiMultiplexedJBossCacheRegionFactory to use the standard mvcc-entity configuration for entity and collection caching. See the CacheManager service standard cache configurations for more information on this configuration.
    • The hibernate.cache.region.jbc2.cfg.query setting tells JndiMultiplexedJBossCacheRegionFactory to use the standard local-query configuration for query result set caching. See the CacheManager service standard cache configurations for more information on this configuration.

    • By default, JndiMultiplexedJBossCacheRegionFactory will use the standard timestamps-cache configuration for timestamps caching, so no specific configuration is needed in persistence.xml.
    • The hibernate.cache.region_prefix setting provides a unique name that Hibernate will use to scope cache entries related to this persistence unit. This value must be unique across all persistence units that will use a particular JBoss Cache instance. In the example above we use the simple name of the persistence unit. Configuring hibernate.cache.region_prefix is optional. However, in a JPA environment, setting the property is highly recommended, since if it is not set the JPA deployer will generate a lengthy synthetic one, e.g.  /persistence.unit:unitName=#tempdb or /persistence.unit:unitName=foo.ear/bar.jar#tempdb for a persistence unit packaged in an ear. That kind of lengthy name increases the cost of cache operations and makes configuring things like eviction more difficult. In a non-JPA Hibernate case, setting hibernate.cache.region_prefix is optional if only one SessionFactory will be using any JBoss Cache instance. If multiple session factories will be sharing a JBC instance, setting the region prefix in your cfg.xml is required. For non-JPA Hibernate there is no equivalent to the JPA deployer that generates a synthetic prefix if one is not configured.
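
    The effect of the region prefix on region names can be sketched as follows. This is an illustration only: the helper qualify is a hypothetical name, and joining the prefix and the region name with a dot is an assumption about how the names are composed rather than something this page states. The point is simply that a short prefix keeps names compact, while the synthetic prefix quoted above makes every region name considerably longer.

```java
/** Illustrates how a region prefix scopes region names; helper names are hypothetical. */
class RegionPrefixSketch {

    /**
     * Joins prefix and region name with a dot (an assumed convention);
     * a null or empty prefix leaves the name unchanged.
     */
    static String qualify(String prefix, String regionName) {
        if (prefix == null || prefix.isEmpty()) {
            return regionName;
        }
        return prefix + "." + regionName;
    }

    public static void main(String[] args) {
        String entity = "org.example.entities.Account";
        // Explicit prefix, as in the persistence.xml example above
        System.out.println(qualify("tempdb", entity));
        // Synthetic prefix of the kind generated when none is configured
        System.out.println(qualify("persistence.unit:unitName=foo.ear/bar.jar#tempdb", entity));
        // Two units caching the same entity class end up in different regions
        System.out.println(qualify("tempdb", entity).equals(qualify("otherdb", entity)));
    }
}
```

    Here "otherdb" is a made-up second persistence unit; because each unit has its own prefix, the two units can share one JBoss Cache instance without their entries colliding.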

     

    For non-JPA Hibernate second level caching, the same configuration properties and values are used; they are just declared in a Hibernate SessionFactory cfg.xml file using its syntax (and with "hibernate." removed from the property names):

     

    <?xml version='1.0' encoding='utf-8'?>
    <!DOCTYPE hibernate-configuration PUBLIC
            "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
            "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
    
    <hibernate-configuration>
    
        <session-factory>
     
             <property name="cache.use_second_level_cache">true</property>
             <property name="cache.use_query_cache">true</property>
             <property name="cache.region.factory_class">org.hibernate.cache.jbc2.JndiMultiplexedJBossCacheRegionFactory</property>
             <property name="cache.region.jbc2.cachefactory">java:CacheManager</property>
             <property name="cache.region.jbc2.cfg.entity">mvcc-entity</property>
             <property name="cache.region.jbc2.cfg.query">local-query</property> 
             <property name="cache.region_prefix">tempdb</property>
     
             <!-- ... other non-caching related configuration -->
    
        </session-factory>
    
    </hibernate-configuration>

     

    Note that the above examples only tell JPA/Hibernate how to use JBC as a second level cache. They don't tell it what to cache; i.e. what entities or query results. Since indiscriminate caching can actually hurt performance, caching of items is only done if specifically enabled. The following example shows how to do this for JPA entities using annotations.

     

    package org.example.entities;
     
    import java.io.Serializable;
     
    import javax.persistence.Entity;
    import javax.persistence.NamedQueries;
    import javax.persistence.NamedQuery;
    import javax.persistence.QueryHint;
     
    import org.hibernate.annotations.Cache;
    import org.hibernate.annotations.CacheConcurrencyStrategy;
     
    @Entity
    @Cache (usage=CacheConcurrencyStrategy.TRANSACTIONAL)
    @NamedQueries({   
       @NamedQuery(name="account.totalbalance",
                   query="select account.balance from Account as account where account.accountHolder = ?1",
                   hints={@QueryHint(name="org.hibernate.cacheRegion", value="ExampleRegion"),
                          @QueryHint(name="org.hibernate.cacheable", value="true")})
    })
    public class Account implements Serializable
    {
       // ... fields, identifier and accessors omitted
    }

     

    For full details on the javax.persistence annotations above, see the Java EE 5 javadocs. Here we'll focus on the Hibernate extensions to JPA that configure second level caching. (Second level caching itself is not part of the JPA spec.) For non-JPA Hibernate usage, see the Hibernate Reference Documentation for information on including cache elements in your entity mappings and programmatically configuring query result caching.

     

    • @org.hibernate.annotations.Cache tells Hibernate that entities of this type should be cached in the second level cache. The usage attribute indicates which strategy Hibernate should use to control concurrent access to cache contents. When JBoss Cache is used as the caching provider, the only valid choices are CacheConcurrencyStrategy.TRANSACTIONAL and CacheConcurrencyStrategy.READ_ONLY. Typically CacheConcurrencyStrategy.TRANSACTIONAL is used. For more on cache concurrency strategies, see the Hibernate Reference Documentation. For full details on the @Cache annotation, see the Hibernate Annotations Reference Documentation.
    • @javax.persistence.NamedQuery allows declaration of an EJBQL query.  The hints attribute allows vendor-specific configuration related to the query via a comma-separated list of @javax.persistence.QueryHint annotations. This is where configuration of caching query results comes in:
      • @QueryHint(name="org.hibernate.cacheable", value="true") tells Hibernate to cache the results of executing this query.
      • @QueryHint(name="org.hibernate.cacheRegion", value="ExampleRegion") tells Hibernate to store the query results in an area of the cache named "ExampleRegion".  This query hint is optional; if it is not specified, Hibernate will create a synthetic region based on the name of the deployment and the bean's type. The advantages of specifying a region are: 1) you can group queries declared in multiple beans in the same region, making it easier to control memory usage in the cache by configuring eviction (see the docs); and 2) the synthetic region name Hibernate creates if you don't specify one is long, which can be a minor performance drag, particularly if you replicate query results.[2]

     

    There's a lot of power in the clustered second level caching in AS 5/Hibernate 3.3, far more than can adequately be discussed here. For complete details, see the Using JBoss Cache as a Hibernate Second Level Cache reference manual.

     


    [1] Query result caching (or for that matter entity caching) may not improve application performance. Be sure to benchmark your application with and without caching.

     

    [2] Replicating query results is not recommended; use of the local-query cache configuration for query result caching is advised.  Replicating query results only makes sense if 1) the result set is small enough that the cost of replicating it doesn't outweigh the benefit of only executing it from one AS instance; and 2) the given result set is likely to be used on other nodes. Note also that all query results associated with a session factory are stored in the same cache, so you can't replicate one query and not another.