2 Replies Latest reply on Jun 5, 2008 8:16 AM by Ken Ringdahl

    Recommendation for IsolationLevel

    Ken Ringdahl Newbie

      We are seeing lots of replication timeout exceptions and have experimented extensively with the different isolation levels and locking schemes, with little success. Everything works with a single-node cluster, but once we add a second node and attempt concurrent writes to the same node in the tree cache, we see frequent timeout exceptions. I believe we need SERIALIZABLE as the IsolationLevel since we need to ensure global synchronization, but it does not seem to be locking the nodes appropriately. The environment is JBoss AS 4.2.2.GA and JBoss Cache 2.0.0.GA. A few questions about locking and transactions:

      - With the SERIALIZABLE IsolationLevel, shouldn't reads of any node touched in the cache be blocked until the transaction commits? When is the lock acquired?
      - Can you recommend an appropriate configuration for a reasonably high-transaction environment? Basically, we are looking for the ability to synchronize across the entire boundary of a transaction. In general, a transaction would take 10 seconds or less.
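      The difference I am asking about can be sketched as a single-JVM analogue using plain java.util.concurrent (this is not the JBoss Cache API; the class and method names are illustrative only). With pessimistic locking, REPEATABLE_READ-style reads take a shared lock, so a second reader is admitted, while SERIALIZABLE-style access takes an exclusive lock, so a second transaction blocks until the first commits:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class IsolationDemo {
    // Attempt to take the given lock from a second thread, as a second
    // transaction would, waiting up to 100 ms before giving up.
    static boolean tryFromOtherThread(Lock lock) throws InterruptedException {
        final boolean[] got = {false};
        Thread t = new Thread(() -> {
            try {
                got[0] = lock.tryLock(100, TimeUnit.MILLISECONDS);
                if (got[0]) lock.unlock();
            } catch (InterruptedException ignored) {}
        });
        t.start();
        t.join();
        return got[0];
    }

    public static void main(String[] args) throws InterruptedException {
        // REPEATABLE_READ analogue: reads take a shared lock, so a second
        // reading transaction is admitted while the first still holds it.
        ReentrantReadWriteLock node = new ReentrantReadWriteLock();
        node.readLock().lock();   // txn 1 reads the node
        System.out.println("second reader admitted: "
                + tryFromOtherThread(node.readLock()));   // true

        // SERIALIZABLE analogue: every access takes an exclusive lock, so
        // the second transaction blocks until txn 1 commits (here it simply
        // times out after 100 ms).
        ReentrantLock exclusive = new ReentrantLock();
        exclusive.lock();         // txn 1 touches the node
        System.out.println("second reader admitted: "
                + tryFromOtherThread(exclusive));         // false
    }
}
```

      If JBoss Cache behaves like the second case, every concurrent access to the same node contends for that exclusive lock until commit, which would be consistent with the timeouts we see once the second cluster node starts writing.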

      Here is our existing config:

      <?xml version="1.0" encoding="UTF-8"?>
      
      <!-- ===================================================================== -->
      <!-- -->
      <!-- Sample TreeCache Service Configuration -->
      <!-- -->
      <!-- ===================================================================== -->
      
      <server>
      
       <!-- ==================================================================== -->
       <!-- Defines TreeCache configuration -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.cache.pojo.jmx.PojoCacheJmxWrapper"
       name="jboss.cache:service=TreeCache">
      
       <depends>jboss:service=Naming</depends>
       <depends>jboss:service=TransactionManager</depends>
      
       <!--
       Configure the TransactionManager
       -->
       <attribute name="TransactionManagerLookupClass">org.jboss.cache.transaction.GenericTransactionManagerLookup</attribute>
      
       <!--
       Isolation level : SERIALIZABLE
       REPEATABLE_READ (default)
       READ_COMMITTED
       READ_UNCOMMITTED
       NONE
       -->
       <attribute name="IsolationLevel">SERIALIZABLE</attribute>
      
       <!--
       Valid modes are LOCAL
       REPL_ASYNC
       REPL_SYNC
       INVALIDATION_ASYNC
       INVALIDATION_SYNC
       -->
       <attribute name="CacheMode">REPL_SYNC</attribute>
      
       <!--
       Node locking scheme:
       OPTIMISTIC
       PESSIMISTIC (default)
       -->
       <attribute name="NodeLockingScheme">PESSIMISTIC</attribute>
      
       <!--
       Just used for async repl: use a replication queue
       -->
       <attribute name="UseReplQueue">false</attribute>
      
       <!--
       Replication interval for replication queue (in ms)
       -->
       <attribute name="ReplQueueInterval">0</attribute>
      
       <!--
       Max number of elements which trigger replication
       -->
       <attribute name="ReplQueueMaxElements">0</attribute>
      
 <!-- Name of the cluster. Must be the same for all TreeCache nodes in a
 cluster so they can find each other, and different for caches that should
 remain separate.
 -->
       <attribute name="ClusterName">kr-dtFabricCache</attribute>
      
       <!--Uncomment next three statements to enable JGroups multiplexer.
      This configuration is dependent on the JGroups multiplexer being
      registered in an MBean server such as JBossAS. -->
       <!--
       <depends>jgroups.mux:name=Multiplexer</depends>
       <attribute name="MultiplexerService">jgroups.mux:name=Multiplexer</attribute>
       <attribute name="MultiplexerStack">fc-fast-minimalthreads</attribute>
       -->
      
       <!-- JGroups protocol stack properties.
       ClusterConfig isn't used if the multiplexer is enabled and successfully initialized.
       -->
       <attribute name="ClusterConfig">
       <config>
       <UDP mcast_addr="228.10.10.10"
       mcast_port="50008"
       tos="8"
       ucast_recv_buf_size="20000000"
       ucast_send_buf_size="640000"
       mcast_recv_buf_size="25000000"
       mcast_send_buf_size="640000"
       loopback="false"
       discard_incompatible_packets="true"
       max_bundle_size="64000"
       max_bundle_timeout="30"
       use_incoming_packet_handler="true"
       ip_ttl="2"
       enable_bundling="false"
       enable_diagnostics="true"
      
       use_concurrent_stack="true"
      
       thread_naming_pattern="pl"
      
       thread_pool.enabled="true"
       thread_pool.min_threads="1"
       thread_pool.max_threads="25"
       thread_pool.keep_alive_time="30000"
       thread_pool.queue_enabled="true"
       thread_pool.queue_max_size="10"
       thread_pool.rejection_policy="Run"
      
       oob_thread_pool.enabled="true"
       oob_thread_pool.min_threads="1"
       oob_thread_pool.max_threads="4"
       oob_thread_pool.keep_alive_time="10000"
       oob_thread_pool.queue_enabled="true"
       oob_thread_pool.queue_max_size="10"
       oob_thread_pool.rejection_policy="Run"/>
      
       <PING timeout="2000" num_initial_members="3"/>
       <MERGE2 max_interval="30000" min_interval="10000"/>
       <FD_SOCK/>
       <FD timeout="10000" max_tries="5" shun="true"/>
       <VERIFY_SUSPECT timeout="1500"/>
       <pbcast.NAKACK max_xmit_size="60000"
       use_mcast_xmit="false" gc_lag="0"
       retransmit_timeout="300,600,1200,2400,4800"
       discard_delivered_msgs="true"/>
       <UNICAST timeout="300,600,1200,2400,3600"/>
       <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
       max_bytes="400000"/>
       <AUTH auth_class="org.jgroups.auth.MD5Token"
       auth_value="desktone"
       token_hash="MD5"/>
       <pbcast.GMS print_local_addr="true" join_timeout="5000"
       join_retry_timeout="2000" shun="false"
       view_bundling="true" view_ack_collection_timeout="5000"/>
       <FRAG2 frag_size="60000"/>
       <pbcast.STREAMING_STATE_TRANSFER use_reading_thread="true"/>
       <!-- <pbcast.STATE_TRANSFER/> -->
       <pbcast.FLUSH timeout="0"/>
       </config>
       </attribute>
      
      
       <!--
       Whether or not to fetch state on joining a cluster
       NOTE this used to be called FetchStateOnStartup and has been renamed to be more descriptive.
       -->
       <attribute name="FetchInMemoryState">false</attribute>
      
 <!--
 The max amount of time (in milliseconds) we wait until the
 state (i.e. the contents of the cache) is retrieved from
 existing members in a clustered environment
 -->
       <attribute name="StateRetrievalTimeout">15000</attribute>
      
       <!--
       Number of milliseconds to wait until all responses for a
       synchronous call have been received.
       -->
       <attribute name="SyncReplTimeout">15000</attribute>
      
       <!-- Max number of milliseconds to wait for a lock acquisition -->
       <attribute name="LockAcquisitionTimeout">30000</attribute>
      
       <!--
       Indicate whether to use region based marshalling or not. Set this to true if you are running under a scoped
       class loader, e.g., inside an application server. Default is "false".
       -->
       <attribute name="UseRegionBasedMarshalling">false</attribute>
      
       <!-- Cache Loader configuration block -->
       <attribute name="CacheLoaderConfig">
       <config>
       <!-- if passivation is true, only the first cache loader is used; the rest are ignored -->
       <passivation>false</passivation>
       <preload>/</preload>
       <shared>true</shared>
      
       <!-- we can now have multiple cache loaders, which get chained -->
       <cacheloader>
       <class>org.jboss.cache.loader.JDBCCacheLoader</class>
      
       <properties>
       cache.jdbc.table.name=dht
       cache.jdbc.table.primarykey=dht_pk
       cache.jdbc.table.create=true
       cache.jdbc.table.drop=false
       cache.jdbc.fqn.column=fqn
       cache.jdbc.fqn.type=varchar(255)
       cache.jdbc.node.column=value
       cache.jdbc.node.type=LONGBLOB
       cache.jdbc.parent.column=parent_fqn
       cache.jdbc.datasource=java:/jdbc/FabricDS
       cache.jdbc.sql-concat=concat(1,2)
       </properties>
      
       <!-- whether the cache loader writes are asynchronous -->
       <async>false</async>
      
       <!-- only one cache loader in the chain may set fetchPersistentState to true.
       An exception is thrown if more than one cache loader sets this to true. -->
       <fetchPersistentState>false</fetchPersistentState>
      
       <!-- determines whether this cache loader ignores writes - defaults to false. -->
       <ignoreModifications>false</ignoreModifications>
      
       <purgeOnStartup>false</purgeOnStartup>
       </cacheloader>
      
       </config>
       </attribute>
      
       <!-- Buddy Replication config -->
       <attribute name="BuddyReplicationConfig">
       <config>
      
       <!-- Enables buddy replication. This is the ONLY mandatory configuration element here. -->
       <buddyReplicationEnabled>false</buddyReplicationEnabled>
      
       <!-- These are the default values anyway -->
       <buddyLocatorClass>org.jboss.cache.buddyreplication.NextMemberBuddyLocator</buddyLocatorClass>
      
       <!-- numBuddies is the number of backup nodes each node maintains. ignoreColocatedBuddies means that
       each node will *try* to select a buddy on a different physical host. If not able to do so though,
       it will fall back to colocated nodes. -->
       <buddyLocatorProperties>
       numBuddies = 1
       ignoreColocatedBuddies = true
       </buddyLocatorProperties>
      
 <!-- A way to specify a preferred replication group. If specified, we try to pick a buddy that shares
 the same pool name (falling back to other buddies if not available). This allows the sysadmin to hint at
 how backup buddies are picked; for example, nodes may be hinted to pick buddies on a different physical rack
 or power supply for added fault tolerance. Note: to override this value, use system property desktone.cache.buddyName -->
       <buddyPoolName>myBuddyPoolReplicationGroup</buddyPoolName>
      
 <!-- Communication timeout for inter-buddy group organisation messages (such as assigning to and removing
 from groups); defaults to 1000. -->
       <buddyCommunicationTimeout>2000</buddyCommunicationTimeout>
      
       <!-- Whether data is removed from old owners when gravitated to a new owner. Defaults to true. -->
       <dataGravitationRemoveOnFind>true</dataGravitationRemoveOnFind>
      
       <!-- Whether backup nodes can respond to data gravitation requests, or only the data owner is supposed to respond.
       defaults to true. -->
       <dataGravitationSearchBackupTrees>true</dataGravitationSearchBackupTrees>
      
       <!-- Whether all cache misses result in a data gravitation request. Defaults to false, requiring callers to
       enable data gravitation on a per-invocation basis using the Options API. -->
       <autoDataGravitation>false</autoDataGravitation>
      
       </config>
       </attribute>
      
       </mbean>
      
       <!-- Uncomment to get a graphical view of the TreeCache MBean above -->
       <!-- <mbean code="org.jboss.cache.TreeCacheView" name="jboss.cache:service=TreeCacheView">-->
       <!-- <depends>jboss.cache:service=TreeCache</depends>-->
       <!-- <attribute name="CacheService">jboss.cache:service=TreeCache</attribute>-->
       <!-- </mbean>-->
      
      
      </server>
      


        • 1. Re: Recommendation for IsolationLevel
          Manik Surtani Master

           When you say "high transaction", what do you mean? A high degree of writes? If you have a write-mostly setup, then using a cache in the first place is really not a good idea, since caches are optimised for read-mostly situations.

           SERIALIZABLE will ensure that you cannot have any concurrency in your system, which may not be a good thing at all. REPEATABLE_READ or READ_COMMITTED at least allow concurrent readers.

          • 2. Re: Recommendation for IsolationLevel
            Ken Ringdahl Newbie

             Well, "high transaction" may have been overshooting a little bit. What we really need is guaranteed consistency of writes. We chose a caching solution because we have a distributed application where reads need to be very fast and writes typically happen outside the normal user data path (e.g. we have inventories that are collected on a periodic basis). Anyway, the scenario I describe is actually very similar to what's described here:

            http://www.jboss.com/index.html?module=bb&op=viewtopic&t=114513

             That ultimately means we're looking at MVCC. In the meantime, I am thinking of implementing our own mutex lock within the cache itself to help with synchronization across the cluster. If we do this, we can change the IsolationLevel back to something more reasonable like REPEATABLE_READ, as you say. My interpretation of SERIALIZABLE was that access to a particular node would be locked when it is first read inside a transaction, but the locking appears to happen during the commit phase, and we still see resource contention.
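             A sketch of that mutex idea, assuming the lock is represented as an owner token written under a well-known key (in JBoss Cache this would be a node under a reserved Fqn such as /locks, written in its own short transaction; the Fqn, class, and method names here are hypothetical, and a ConcurrentMap stands in for the replicated cache):

```java
import java.util.concurrent.ConcurrentMap;

// Hypothetical cluster mutex: the lock is an owner token stored under a
// reserved key. A ConcurrentMap stands in for the replicated cache; with
// JBoss Cache the put/remove would target a node under a reserved Fqn
// (e.g. /locks) inside its own short transaction.
public class CacheMutex {
    private final ConcurrentMap<String, String> store;

    public CacheMutex(ConcurrentMap<String, String> store) {
        this.store = store;
    }

    // Claim /locks/<name> by writing an owner token if absent, retrying
    // with a short backoff until the deadline expires.
    public boolean acquire(String name, String owner, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        do {
            if (store.putIfAbsent("/locks/" + name, owner) == null) {
                return true;      // token written: we own the mutex
            }
            Thread.sleep(50);     // another node holds it; back off
        } while (System.currentTimeMillis() < deadline);
        return false;
    }

    // Release only if we still own the token (conditional remove).
    public void release(String name, String owner) {
        store.remove("/locks/" + name, owner);
    }
}
```

             With the token replicated synchronously under REPL_SYNC, only one node can claim it at a time, and the cache itself can then run at REPEATABLE_READ.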

             Also, we have tried OPTIMISTIC locking, but it seems to have problems with our cache loader. We use a JDBC cache loader and preload the / node, and we see exceptions when the cache first initializes and tries to preload. Is this a known problem? I can certainly recreate it and provide the stack trace if that helps.