4 Replies Latest reply on Sep 23, 2009 7:37 AM by mircea.markus

    JBoss 4.2.2 with TreeCache 1.4.1.SP5: nodes, attributes not

    the_olo

      Hi!

      I have a clustered TreeCache set up on the standard JBoss 4.2.2 GA release (for caching jBPM-BPEL data).

      It seems that the contents of the cache are properly replicated only during the initial state transfer - when a node joins the cluster, the logs show:

      2009-09-18 13:44:41,932 INFO [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
      2009-09-18 13:44:41,936 INFO [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
      2009-09-18 13:44:41,939 INFO [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
      2009-09-18 13:44:41,942 INFO [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
      2009-09-18 13:44:41,943 DEBUG [org.jboss.hibernate.jbc.cacheprovider.OptimisticJBCCache] activateCacheRegion(): Region /org/jbpm/graph/def/ProcessDefinition/a
      ctions is already active
      2009-09-18 13:44:41,945 INFO [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
      2009-09-18 13:44:41,946 DEBUG [org.jboss.hibernate.jbc.cacheprovider.OptimisticJBCCache] activateCacheRegion(): Region /org/jbpm/graph/def/ProcessDefinition/d
      efinitions is already active



      While on the node that supplies the state transfer data, I can see in the logs:

      2009-09-18 13:44:41,941 INFO [org.jboss.cache.interceptors.OptimisticNodeInterceptor] read Method _getState; id:19(/org/jbpm/file/def/FileDefinition/processF
      iles, 400, false, false) called - don't know how to handle, passing on!
      2009-09-18 13:44:41,941 INFO [org.jboss.cache.statetransfer.StateTransferGenerator_140] returning the state for tree rooted in /org/jbpm/file/def/FileDefinit
      ion/processFiles(1024 bytes)
      2009-09-18 13:44:41,944 INFO [org.jboss.cache.interceptors.OptimisticNodeInterceptor] read Method _getState; id:19(/org/jbpm/bpel/graph/scope/Scope/onEvents,
       400, false, false) called - don't know how to handle, passing on!
      2009-09-18 13:44:41,944 INFO [org.jboss.cache.statetransfer.StateTransferGenerator_140] returning the state for tree rooted in /org/jbpm/bpel/graph/scope/Sco
      pe/onEvents(1024 bytes)
      2



      During normal operation, however, it seems that cache contents aren't replicated.

      I'm looking at the attributes of the "jboss.cache:service=MyTreeCache" MBean.

      When I execute a business process on node A, then I can see rapid growth of the "NumberOfAttributes" and "NumberOfNodes" attribute values (until the cache gets saturated).

      But the expected synchronous growth on node B doesn't happen. When I do the same on node B, the counts rise on that node, but node A remains uninfluenced.

      So I suspect that cache replication does not work. Or maybe it should work that way? Maybe I am looking at the wrong MBean attributes?

      Here's the configuration of my TreeCache:

      <?xml version="1.0" encoding="UTF-8"?>
      <server>
       <mbean code="org.jboss.cache.TreeCache" name="jboss.cache:service=MyTreeCache">
       <depends>jboss:service=Naming</depends>
       <depends>jboss:service=TransactionManager</depends>
      
       <!-- Configure the TransactionManager -->
       <attribute name="TransactionManagerLookupClass">org.jboss.cache.JBossTransactionManagerLookup
       </attribute>
      
       <!--
       Node locking scheme :
       PESSIMISTIC (default)
       OPTIMISTIC
       -->
       <attribute name="NodeLockingScheme">OPTIMISTIC</attribute>
      
       <!--
       Node locking isolation level : SERIALIZABLE REPEATABLE_READ (default) READ_COMMITTED READ_UNCOMMITTED NONE
       (ignored if NodeLockingScheme is OPTIMISTIC)
       -->
       <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
      
       <!--
       Valid modes are LOCAL REPL_ASYNC REPL_SYNC INVALIDATION_ASYNC INVALIDATION_SYNC
       -->
       <attribute name="CacheMode">REPL_ASYNC</attribute>
      
       <!-- Whether each interceptor should have an mbean
      registered to capture and display its statistics. -->
       <!-- javax.management.AttributeNotFoundException: Attribute 'UseInterceptorMbeans' is not writable -->
       <!--
       <attribute name="UseInterceptorMbeans">true</attribute>
       -->
      
       <attribute name="UseRegionBasedMarshalling">true</attribute>
       <attribute name="InactiveOnStartup">true</attribute>
      
       <!-- Name of cluster. Needs to be the same for all clusters, in order
      to find each other -->
       <attribute name="ClusterName">${jboss.partition.name:DefaultPartition}-MyTreeCache
       </attribute>
      
       <attribute name="ClusterConfig">
       <config>
       <UDP mcast_addr="${jboss.partition.udpGroup:228.1.4.5}" mcast_port="${jboss.hapartition.mcast_port:48566}"
       tos="8" ucast_recv_buf_size="20000000" ucast_send_buf_size="640000" mcast_recv_buf_size="25000000"
       mcast_send_buf_size="640000" loopback="false" discard_incompatible_packets="true" enable_bundling="false"
       max_bundle_size="64000" max_bundle_timeout="30" use_incoming_packet_handler="true"
       use_outgoing_packet_handler="false" ip_ttl="${jgroups.udp.ip_ttl:2}" down_thread="false" up_thread="false" />
       <PING timeout="2000" down_thread="false" up_thread="false" num_initial_members="3" />
       <MERGE2 max_interval="100000" down_thread="false" up_thread="false" min_interval="20000" />
       <FD_SOCK down_thread="false" up_thread="false" />
       <FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true" />
       <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
       <pbcast.NAKACK max_xmit_size="60000" use_mcast_xmit="false" gc_lag="0" retransmit_timeout="300,600,1200,2400,4800"
       down_thread="false" up_thread="false" discard_delivered_msgs="true" />
       <UNICAST timeout="300,600,1200,2400,3600" down_thread="false" up_thread="false" />
       <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" down_thread="false" up_thread="false"
       max_bytes="400000" />
       <pbcast.GMS print_local_addr="true" join_timeout="3000" down_thread="false" up_thread="false"
       join_retry_timeout="2000" shun="true" view_bundling="true" />
       <FRAG2 frag_size="60000" down_thread="false" up_thread="false" />
       <pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false" />
       </config>
       </attribute>
      
       <!--
       The max amount of time (in milliseconds) we wait until the initial state (ie. the contents of the cache) are
       retrieved from existing members in a clustered environment
       -->
       <attribute name="InitialStateRetrievalTimeout">5000</attribute>
      
       <!-- Number of milliseconds to wait until all responses for a
       synchronous call have been received.
       -->
       <attribute name="SyncReplTimeout">10000</attribute>
      
       <!-- Max number of milliseconds to wait for a lock acquisition -->
       <attribute name="LockAcquisitionTimeout">15000</attribute>
      
       <!-- Name of the eviction policy class. -->
       <attribute name="EvictionPolicyClass">org.jboss.cache.eviction.LRUPolicy</attribute>
      
       <!-- Specific eviction policy configurations. This is LRU -->
       <attribute name="EvictionPolicyConfig">
       <config>
       <attribute name="wakeUpIntervalSeconds">5</attribute>
       <!-- Cache wide default -->
       <region name="/_default_">
       <attribute name="maxNodes">5000</attribute>
       <attribute name="timeToLiveSeconds">1000</attribute>
       <!-- Maximum time an object is kept in cache regardless of idle time -->
       <attribute name="maxAgeSeconds">120</attribute>
       </region>
      
       <region name="/org/jboss/data">
       <attribute name="maxNodes">5000</attribute>
       <attribute name="timeToLiveSeconds">1000</attribute>
       </region>
      
       <region name="/org/jboss/test/data">
       <attribute name="maxNodes">5</attribute>
       <attribute name="timeToLiveSeconds">4</attribute>
       </region>
      
       <region name="/org/jbpm/graph">
       <attribute name="maxNodes">5000</attribute>
       <!-- 259200 secs = 3 * 24 hours in order to survive the weekend: -->
       <attribute name="timeToLiveSeconds">259200</attribute>
       </region>
       <region name="/org/jbpm/bpel/graph">
       <attribute name="maxNodes">5000</attribute>
       <!-- 259200 secs = 3 * 24 hours in order to survive the weekend: -->
       <attribute name="timeToLiveSeconds">259200</attribute>
       </region>
       <region name="/org/jbpm">
       <attribute name="maxNodes">10000</attribute>
       <attribute name="timeToLiveSeconds">7200</attribute>
       <attribute name="maxAgeSeconds">259200</attribute>
       </region>
      
       </config>
       </attribute>
       </mbean>
      </server>
      
      I've tried both with REPL_ASYNC and REPL_SYNC CacheMode and the result is the same.


        • 1. Re: JBoss 4.2.2 with TreeCache 1.4.1.SP5: nodes, attributes
          the_olo

          BTW, anticipating th question: I can see that indeed the cache cluster gets formed, I can see the IP addresses of all the members in the "Members" attribute of the TreeCache MBean, the list of node IP addresses and their ports is identical on all cluster members.

          Only one of all the cluster nodes has a TreeCache MBean with Coordinator=true.

          The exact TreeCache version reported by the MBean is:

          JBossCache 'Cayenne' 1.4.1.SP5[ $Id: Version.java 4510 2007-09-26 18:45:19Z manik.surtani@jboss.com $]


          The only problem I see is that the number of nodes on the caches seems to be indepentent of each other, which suggests that they don't really replicate.

          • 2. Re: JBoss 4.2.2 with TreeCache 1.4.1.SP5: nodes, attributes
            the_olo

            What's very curious is the fact that logs on both cluster nodes indicate that the elements are added to the cache. E.g., after the BPEL process is executed and transaction gets committed:

            node 1 (initiator):

            2009-09-22 15:56:04,686 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/ProcessInstance/org.jbpm.graph.exe.ProcessInstance#15967 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:04,686 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15969 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:04,686 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15971 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:04,686 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15972 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:04,686 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15973 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:04,686 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15974 for a node that doesn't exist yet. Process as an add.
            


            node 2 (acceptor):

            2009-09-22 15:56:03,536 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/ProcessInstance/org.jbpm.graph.exe.ProcessInstance#15967 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:03,536 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15969 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:03,536 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15971 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:03,536 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15972 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:03,536 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15973 for a node that doesn't exist yet. Process as an add.
            2009-09-22 15:56:03,536 DEBUG [org.jboss.cache.eviction.BaseEvictionAlgorithm] Adding element /org/jbpm/graph/exe/Token/org.jbpm.graph.exe.Token#15974 for a node that doesn't exist yet. Process as an add.
            


            The problem is that on node 2, the number of cache contents does not go up like it does on node 1. Cache contents on node 2 remain the same after the operation.

            If I execute the same operation on node 2, then the execution goes as slow as on node 1 before it had populated its cache. Node 2's cache contents go up then, and reach the same level as on node 1.

            So it looks like the caches of both nodes operate independently, regardless of what the logs say.

            • 3. Re: JBoss 4.2.2 with TreeCache 1.4.1.SP5: nodes, attributes
              the_olo

              OK, I've found the solution to my problem.

              First, it wasn't in the configuration of the TreeCache itself, but of the Hibernate Cache provider.

              It turns out that this behaviour is by design the default in the Hibernate JBoss Cache provider (hibernate-jbc-cacheprovider-1.0.1.GA.jar).

              See: http://www.jboss.org/community/wiki/NewJBossCache14xBasedHibernate32CacheProvider.

              The option responsible for this behaviour is hibernate.treecache.local_puts_only:

              The aim of this property, whose default value is true, is to makes sure any put calls on the Second Level Cache as a result of a read operation from the database are done locally which means that data read from database is not replicated to other nodes.


              So placing the following line in my hibernate.cfg.xml has enabled full cache replication across the cluster:

              <property name="hibernate.treecache.local_puts_only">false</property>
              


              The relevant portion of my Hibernate session factory configuration now looks like this:

              <property name="hibernate.cache.provider_class">org.jboss.hibernate.jbc.cacheprovider.JmxBoundTreeCacheProvider
               </property>
               <property name="hibernate.treecache.mbean.object_name">jboss.cache:service=MyTreeCache</property>
               <property name="hibernate.treecache.local_puts_only">false</property>
               <property name="hibernate.cache.use_second_level_cache">true</property>
               <property name="hibernate.cache.use_query_cache">false</property>
              
               <!-- Tuning -->
               <property name="hibernate.jdbc.batch_size">32</property>
               <property name="hibernate.jdbc.fetch_size">32</property>
               <property name="hibernate.default_batch_fetch_size">32</property>
               <property name="hibernate.jdbc.batch_versioned_data">true</property>


              • 4. Re: JBoss 4.2.2 with TreeCache 1.4.1.SP5: nodes, attributes
                mircea.markus

                thanks for sharing!