3 Replies Latest reply on Aug 6, 2007 2:08 PM by manik

    Data integrity in a clustered jboss cache

    itchy75

      Hello,

      I'm using JBoss Cache (TreeCache) in a clustered environment. I am trying to do the following:

      1. get a value from the cache = get(Fqn, key)
      2. modify this value
      3. put the value in cache = put(Fqn, key, value)

      I use the SERIALIZABLE isolation level with PESSIMISTIC locking.

      The problem occurs when two programs running on different servers modify the value at the same time.

      Server 1              Server 2
      x = getValue();
      x++                   x = getValue();
      put x in cache        x++
                            put x in cache
      


      If the value starts at 0, it should be 2 at the end (serializable), but x = 1 at the end of the execution.
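      The interleaving above can be replayed deterministically in a minimal sketch. It assumes a plain HashMap as a stand-in for the cache (it does not use the JBoss Cache API), and the class and method names are hypothetical. It only illustrates why the second write hides the first when nothing holds a lock across the read and the write:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: a HashMap stands in for the clustered cache.
final class LostUpdateDemo {

    /** Replays the two-server interleaving from the table, step by step. */
    static int replayInterleaving() {
        Map<String, Integer> cache = new HashMap<>();
        cache.put("x", 0);

        int server1 = cache.get("x");   // server 1 reads 0
        server1++;                      // server 1 computes 1
        int server2 = cache.get("x");   // server 2 reads 0, before server 1 has written
        cache.put("x", server1);        // server 1 writes 1
        server2++;                      // server 2 computes 1 from its stale read
        cache.put("x", server2);        // server 2 overwrites with 1

        return cache.get("x");
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving()); // prints 1, not 2
    }
}
```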

      My questions are:

      - How can I serialize an operation in a clustered cache (i.e. reading, modifying, and persisting a value in the cache)? I tried to manage the transaction myself with the transaction manager, but it doesn't work.

      - I also understood from the documentation that the SERIALIZABLE isolation level locks the node when you read or write a value in the cache. I am using JBoss Cache in a Spring context and no transaction attributes are defined. In this case, when is the transaction initialized and committed if SERIALIZABLE is used?

      - Is it possible to use Spring to declare a cache transaction on a class's methods?

      Thanks a lot.










        • 1. Re: Data integrity in a clustered jboss cache
          manik

          You would need to use transactions and REPL_SYNC if you want to achieve proper coherence, although this may not perform well with SERIALIZABLE. You could use REPEATABLE_READ, but some transactions may roll back due to upgrade exceptions and the like when you have a write collision, which you would have to handle with a retry.

          Regarding Spring declaring transactions on the cache, I'm afraid I cannot help you here as I do not know Spring that well. You would have to declare a TransactionManagerLookup in the cache configuration, and you could do something like:

          cache.getTransactionManager().begin();
          // .. do stuff with your cache ..
          cache.getTransactionManager().commit();
          


          with the cache instance that Spring injects into your code.
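          As a rough sketch of the retry idea, assuming the begin / get / modify / put / commit sequence (with a rollback on failure) is wrapped in a Callable: the helper below reruns that unit of work a bounded number of times with a small random delay between attempts. The class name, method name, and parameters are hypothetical and not part of JBoss Cache; real code would probably retry only on lock timeout or rollback exceptions rather than on every Exception.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not a JBoss Cache API: reruns a unit of work that may
// fail because of a write collision.
final class RetryingUpdate {

    /**
     * Calls work up to maxAttempts times and returns its first successful result.
     * The Callable is expected to begin the transaction, do the cache work, and
     * commit, rolling back before it rethrows on failure.
     */
    static <T> T withRetry(Callable<T> work, int maxAttempts, long baseDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                last = e; // assumption: every exception is treated as retryable
                if (attempt < maxAttempts) {
                    // Random delay that grows with the attempt number, so that
                    // colliding threads do not all retry at the same moment.
                    long cap = baseDelayMs * attempt;
                    if (cap > 0) {
                        Thread.sleep(ThreadLocalRandom.current().nextLong(cap + 1));
                    }
                }
            }
        }
        throw last; // all attempts failed: surface the last failure
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        Integer result = RetryingUpdate.<Integer>withRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IllegalStateException("simulated write collision");
            }
            return 42;
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "42 after 3 attempts"
    }
}
```

          Bounding the number of attempts keeps a permanently contended node from blocking a caller forever; the caller still has to decide what to do when the last attempt fails.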


          • 2. Re: Data integrity in a clustered jboss cache
            itchy75

            I am already using REPL_SYNC and I have already tested the transaction manager as you said.

            But I get a lot of timeout exceptions when I use the SERIALIZABLE isolation level (Fail to acquire lock after 15000ms). So it doesn't work in a cluster.

            The problem is that I have a very high load on the two servers. When I run 100 requests (50 per server), the first requests complete successfully, but after 10 requests everything breaks.

            I tried to implement the solution with REPEATABLE_READ and a retry policy. I tested it on one server and it works well, but in a cluster it doesn't work.

            There is always a thread that wants to modify the value in the cache, so all threads get errors, and the last thread must wait a very long time before acquiring the lock and committing the value in the cache. I tried to reduce the lock acquisition timeout, but with a large number of threads it doesn't really change anything.

            Does JBoss Cache in a cluster support concurrency under very high load?


            The configuration of the cache:

            
            <server>
             <mbean code="org.jboss.cache.TreeCache"
             name="Billetel.cache:service=ConcurencyTreeCache" >
            
             <!-- Used inside JBoss AS -->
             <depends>jboss:service=Naming</depends>
             <depends>jboss:service=TransactionManager</depends>
            
             <!-- Configure the TransactionManager -->
             <attribute name="TransactionManagerLookupClass">org.jboss.cache.JBossTransactionManagerLookup</attribute>
            
             <!--
             Node locking scheme:
             OPTIMISTIC
             PESSIMISTIC (default)
             -->
             <attribute name="NodeLockingScheme">PESSIMISTIC</attribute>
            
             <!--
             Note that this attribute is IGNORED if your NodeLockingScheme above is OPTIMISTIC.
            
             Isolation level : SERIALIZABLE
             REPEATABLE_READ (default)
             READ_COMMITTED
             READ_UNCOMMITTED
             NONE
             -->
             <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
            
             <!--
             Valid modes are LOCAL
             REPL_ASYNC
             REPL_SYNC
             INVALIDATION_ASYNC
             INVALIDATION_SYNC
             -->
             <attribute name="CacheMode">REPL_SYNC</attribute>
            
            
             <!--
             Just used for async repl: use a replication queue
             -->
             <attribute name="UseReplQueue">false</attribute>
            
             <!--
             Replication interval for replication queue (in ms)
             -->
             <attribute name="ReplQueueInterval">100</attribute>
            
             <!--
             Max number of elements which trigger replication
             -->
             <attribute name="ReplQueueMaxElements">1</attribute>
            
             <!-- Name of cluster. Needs to be the same for all clusters, in order
             to find each other
             -->
             <attribute name="ClusterName">${jboss.partition.name:DefaultPartition}</attribute>
            
             <attribute name="ClusterConfig">
             <Config>
             <UDP mcast_addr="${jboss.cachecluster.udpGroup:228.1.2.41}"
             mcast_port="${jboss.cachecluster.mcast_port:47566}"
             ip_ttl="${jgroups.cachecluster.mcast.ip_ttl:0}"
             ip_mcast="true" mcast_send_buf_size="800000"
             mcast_recv_buf_size="150000" ucast_send_buf_size="800000"
             ucast_recv_buf_size="150000" loopback="false" />
             <PING timeout="2000" num_initial_members="3"
             up_thread="true" down_thread="true" />
             <MERGE2 min_interval="10000" max_interval="20000" />
             <FD_SOCK down_thread="false" up_thread="false" />
             <FD shun="true" up_thread="true" down_thread="true"
             timeout="2500" max_tries="5" />
             <VERIFY_SUSPECT timeout="3000" num_msgs="3"
             up_thread="true" down_thread="true" />
             <pbcast.NAKACK gc_lag="50"
             retransmit_timeout="300,600,1200,2400,4800" max_xmit_size="8192"
             up_thread="true" down_thread="true" />
             <UNICAST timeout="300,600,1200,2400,4800"
             window_size="100" min_threshold="10" down_thread="true" />
             <pbcast.STABLE desired_avg_gossip="20000"
             max_bytes="400000" up_thread="true" down_thread="true" />
             <FRAG frag_size="8192" down_thread="true"
             up_thread="true" />
             <pbcast.GMS join_timeout="5000"
             join_retry_timeout="2000" shun="true" print_local_addr="true" />
             <pbcast.STATE_TRANSFER up_thread="true"
             down_thread="true" />
             </Config>
             </attribute>
            
            
             <!--
             Whether or not to fetch state on joining a cluster
             NOTE this used to be called FetchStateOnStartup and has been renamed to be more descriptive.
             -->
             <attribute name="FetchInMemoryState">true</attribute>
            
             <!--
             The max amount of time (in milliseconds) we wait until the
             initial state (ie. the contents of the cache) are retrieved from
             existing members in a clustered environment
             -->
             <attribute name="InitialStateRetrievalTimeout">30000</attribute>
            
             <!--
             Number of milliseconds to wait until all responses for a
             synchronous call have been received.
             -->
             <attribute name="SyncReplTimeout">15000</attribute>
            
             <!-- Max number of milliseconds to wait for a lock acquisition -->
             <attribute name="LockAcquisitionTimeout">5000</attribute>
            
             <!-- Specific eviction policy configurations. This is LRU -->
             <attribute name="EvictionPolicyConfig">
             <config>
             <attribute name="wakeUpIntervalSeconds">5</attribute>
             <!-- Cache wide default -->
             <region name="/_default_" policyClass="org.jboss.cache.eviction.LRUPolicy">
             <attribute name="timeToLiveSeconds">1800</attribute>
             </region>
             <region name="/reservationConcurency/" policyClass="org.jboss.cache.eviction.LRUPolicy">
             <attribute name="timeToLiveSeconds">0</attribute>
             </region>
             </config>
             </attribute>
            
             <attribute name="CacheLoaderConfiguration">
             <config>
            
             <!-- if passivation is true, only the first cache loader is used; the rest are ignored -->
             <passivation>false</passivation>
            
             <!-- comma delimited FQNs to preload -->
             <preload>/</preload>
            
             <!-- are the cache loaders shared in a cluster? -->
             <shared>true</shared>
            
             <!-- we can now have multiple cache loaders, which get chained -->
             <!-- the 'cacheloader' element may be repeated -->
             <cacheloader>
             <class>org.jboss.cache.loader.ClusteredCacheLoader</class>
            
             <!-- same as the old CacheLoaderConfig attribute -->
             <properties>
             timeout=60000
             </properties>
            
             <async>false</async>
            
             <!-- only one cache loader in the chain may set fetchPersistentState to true. An exception is thrown if more than one cache loader sets this to true. -->
             <fetchPersistentState>false</fetchPersistentState>
            
             <!-- determines whether this cache loader ignores writes - defaults to false. -->
             <ignoreModifications>false</ignoreModifications>
            
             <!-- if set to true, purges the contents of this cache loader when the cache starts up. Defaults to false. -->
             <purgeOnStartup>true</purgeOnStartup>
             </cacheloader>
            
             </config>
             </attribute>
             </mbean>
            
            </server>
            


            • 3. Re: Data integrity in a clustered jboss cache
              manik

              We do support high concurrency under load, even in a cluster, but keep in mind that this is a cache, and caches are tuned for high READ concurrency. That is, for optimal performance, 90% or more of the operations should be reads.