2 Replies Latest reply on Jun 5, 2008 8:16 AM by Ken Ringdahl

    Recommendation for IsolationLevel

    Ken Ringdahl Newbie

      We are seeing lots of replication timeout exceptions and have experimented extensively with the different isolation levels and locking schemes, with little success. Everything works with a single-node cluster, but once we add a second node and attempt concurrent writes to the same node in the tree cache, we see frequent timeout exceptions. I believe we need SERIALIZABLE as the IsolationLevel since we need to ensure global synchronization, but it does not seem to be locking the nodes appropriately. The environment is JBoss AS 4.2.2.GA and JBoss Cache 2.0.0.GA. A few questions about locking and transactions:

      - With the SERIALIZABLE IsolationLevel, shouldn't reads of any node touched in the cache be blocked until the transaction commits? When is the lock acquired?
      - Can you recommend an appropriate configuration for a reasonably high-transaction environment? Basically, we are looking for the ability to synchronize across the entire boundary of a transaction. In general, a transaction would take 10 seconds or less.
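      The difference I am asking about can be sketched as a single-JVM analogue using plain java.util.concurrent (this is not the JBoss Cache API; the class and method names are illustrative only). With pessimistic locking, REPEATABLE_READ-style reads take a shared lock, so a second reader is admitted, while SERIALIZABLE-style access takes an exclusive lock, so a second transaction blocks until the first commits:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class IsolationDemo {
    // Attempt to take the given lock from a second thread, as a second
    // transaction would, waiting up to 100 ms before giving up.
    static boolean tryFromOtherThread(Lock lock) throws InterruptedException {
        final boolean[] got = {false};
        Thread t = new Thread(() -> {
            try {
                got[0] = lock.tryLock(100, TimeUnit.MILLISECONDS);
                if (got[0]) lock.unlock();
            } catch (InterruptedException ignored) {}
        });
        t.start();
        t.join();
        return got[0];
    }

    public static void main(String[] args) throws InterruptedException {
        // REPEATABLE_READ analogue: reads take a shared lock, so a second
        // reading transaction is admitted while the first still holds it.
        ReentrantReadWriteLock node = new ReentrantReadWriteLock();
        node.readLock().lock();   // txn 1 reads the node
        System.out.println("second reader admitted: "
                + tryFromOtherThread(node.readLock()));   // true

        // SERIALIZABLE analogue: every access takes an exclusive lock, so
        // the second transaction blocks until txn 1 commits (here it simply
        // times out after 100 ms).
        ReentrantLock exclusive = new ReentrantLock();
        exclusive.lock();         // txn 1 touches the node
        System.out.println("second reader admitted: "
                + tryFromOtherThread(exclusive));         // false
    }
}
```

      If JBoss Cache behaves like the second case, every concurrent access to the same node contends for that exclusive lock until commit, which would be consistent with the timeouts we see once the second cluster node starts writing.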

      Here is our existing config:

      <?xml version="1.0" encoding="UTF-8"?>
      
      <!-- ===================================================================== -->
      <!-- -->
      <!-- Sample TreeCache Service Configuration -->
      <!-- -->
      <!-- ===================================================================== -->
      
      <server>
      
       <!-- ==================================================================== -->
       <!-- Defines TreeCache configuration -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.cache.pojo.jmx.PojoCacheJmxWrapper"
       name="jboss.cache:service=TreeCache">
      
       <depends>jboss:service=Naming</depends>
       <depends>jboss:service=TransactionManager</depends>
      
       <!--
       Configure the TransactionManager
       -->
       <attribute name="TransactionManagerLookupClass">org.jboss.cache.transaction.GenericTransactionManagerLookup</attribute>
      
       <!--
       Isolation level : SERIALIZABLE
       REPEATABLE_READ (default)
       READ_COMMITTED
       READ_UNCOMMITTED
       NONE
       -->
       <attribute name="IsolationLevel">SERIALIZABLE</attribute>
      
       <!--
       Valid modes are LOCAL
       REPL_ASYNC
       REPL_SYNC
       INVALIDATION_ASYNC
       INVALIDATION_SYNC
       -->
       <attribute name="CacheMode">REPL_SYNC</attribute>
      
       <!--
       Node locking scheme:
       OPTIMISTIC
       PESSIMISTIC (default)
       -->
       <attribute name="NodeLockingScheme">PESSIMISTIC</attribute>
      
       <!--
       Just used for async repl: use a replication queue
       -->
       <attribute name="UseReplQueue">false</attribute>
      
       <!--
       Replication interval for replication queue (in ms)
       -->
       <attribute name="ReplQueueInterval">0</attribute>
      
       <!--
       Max number of elements which trigger replication
       -->
       <attribute name="ReplQueueMaxElements">0</attribute>
      
 <!-- Name of the cluster. Must be the same for all TreeCache nodes in a
 cluster so they can find each other, and different for caches that should
 remain separate.
 -->
       <attribute name="ClusterName">kr-dtFabricCache</attribute>
      
       <!--Uncomment next three statements to enable JGroups multiplexer.
      This configuration is dependent on the JGroups multiplexer being
      registered in an MBean server such as JBossAS. -->
       <!--
       <depends>jgroups.mux:name=Multiplexer</depends>
       <attribute name="MultiplexerService">jgroups.mux:name=Multiplexer</attribute>
       <attribute name="MultiplexerStack">fc-fast-minimalthreads</attribute>
       -->
      
       <!-- JGroups protocol stack properties.
       ClusterConfig isn't used if the multiplexer is enabled and successfully initialized.
       -->
       <attribute name="ClusterConfig">
       <config>
       <UDP mcast_addr="228.10.10.10"
       mcast_port="50008"
       tos="8"
       ucast_recv_buf_size="20000000"
       ucast_send_buf_size="640000"
       mcast_recv_buf_size="25000000"
       mcast_send_buf_size="640000"
       loopback="false"
       discard_incompatible_packets="true"
       max_bundle_size="64000"
       max_bundle_timeout="30"
       use_incoming_packet_handler="true"
       ip_ttl="2"
       enable_bundling="false"
       enable_diagnostics="true"
      
       use_concurrent_stack="true"
      
       thread_naming_pattern="pl"
      
       thread_pool.enabled="true"
       thread_pool.min_threads="1"
       thread_pool.max_threads="25"
       thread_pool.keep_alive_time="30000"
       thread_pool.queue_enabled="true"
       thread_pool.queue_max_size="10"
       thread_pool.rejection_policy="Run"
      
       oob_thread_pool.enabled="true"
       oob_thread_pool.min_threads="1"
       oob_thread_pool.max_threads="4"
       oob_thread_pool.keep_alive_time="10000"
       oob_thread_pool.queue_enabled="true"
       oob_thread_pool.queue_max_size="10"
       oob_thread_pool.rejection_policy="Run"/>
      
       <PING timeout="2000" num_initial_members="3"/>
       <MERGE2 max_interval="30000" min_interval="10000"/>
       <FD_SOCK/>
       <FD timeout="10000" max_tries="5" shun="true"/>
       <VERIFY_SUSPECT timeout="1500"/>
       <pbcast.NAKACK max_xmit_size="60000"
       use_mcast_xmit="false" gc_lag="0"
       retransmit_timeout="300,600,1200,2400,4800"
       discard_delivered_msgs="true"/>
       <UNICAST timeout="300,600,1200,2400,3600"/>
       <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
       max_bytes="400000"/>
       <AUTH auth_class="org.jgroups.auth.MD5Token"
       auth_value="desktone"
       token_hash="MD5"/>
       <pbcast.GMS print_local_addr="true" join_timeout="5000"
       join_retry_timeout="2000" shun="false"
       view_bundling="true" view_ack_collection_timeout="5000"/>
       <FRAG2 frag_size="60000"/>
       <pbcast.STREAMING_STATE_TRANSFER use_reading_thread="true"/>
       <!-- <pbcast.STATE_TRANSFER/> -->
       <pbcast.FLUSH timeout="0"/>
       </config>
       </attribute>
      
      
       <!--
       Whether or not to fetch state on joining a cluster
       NOTE this used to be called FetchStateOnStartup and has been renamed to be more descriptive.
       -->
       <attribute name="FetchInMemoryState">false</attribute>
      
 <!--
 The max amount of time (in milliseconds) we wait until the
 state (i.e. the contents of the cache) is retrieved from
 existing members in a clustered environment
 -->
       <attribute name="StateRetrievalTimeout">15000</attribute>
      
       <!--
       Number of milliseconds to wait until all responses for a
       synchronous call have been received.
       -->
       <attribute name="SyncReplTimeout">15000</attribute>
      
       <!-- Max number of milliseconds to wait for a lock acquisition -->
       <attribute name="LockAcquisitionTimeout">30000</attribute>
      
       <!--
       Indicate whether to use region based marshalling or not. Set this to true if you are running under a scoped
       class loader, e.g., inside an application server. Default is "false".
       -->
       <attribute name="UseRegionBasedMarshalling">false</attribute>
      
       <!-- Cache Loader configuration block -->
       <attribute name="CacheLoaderConfig">
       <config>
       <!-- if passivation is true, only the first cache loader is used; the rest are ignored -->
       <passivation>false</passivation>
       <preload>/</preload>
       <shared>true</shared>
      
       <!-- we can now have multiple cache loaders, which get chained -->
       <cacheloader>
       <class>org.jboss.cache.loader.JDBCCacheLoader</class>
      
       <properties>
       cache.jdbc.table.name=dht
       cache.jdbc.table.primarykey=dht_pk
       cache.jdbc.table.create=true
       cache.jdbc.table.drop=false
       cache.jdbc.fqn.column=fqn
       cache.jdbc.fqn.type=varchar(255)
       cache.jdbc.node.column=value
       cache.jdbc.node.type=LONGBLOB
       cache.jdbc.parent.column=parent_fqn
       cache.jdbc.datasource=java:/jdbc/FabricDS
       cache.jdbc.sql-concat=concat(1,2)
       </properties>
      
       <!-- whether the cache loader writes are asynchronous -->
       <async>false</async>
      
       <!-- only one cache loader in the chain may set fetchPersistentState to true.
       An exception is thrown if more than one cache loader sets this to true. -->
       <fetchPersistentState>false</fetchPersistentState>
      
       <!-- determines whether this cache loader ignores writes - defaults to false. -->
       <ignoreModifications>false</ignoreModifications>
      
       <purgeOnStartup>false</purgeOnStartup>
       </cacheloader>
      
       </config>
       </attribute>
      
       <!-- Buddy Replication config -->
       <attribute name="BuddyReplicationConfig">
       <config>
      
       <!-- Enables buddy replication. This is the ONLY mandatory configuration element here. -->
       <buddyReplicationEnabled>false</buddyReplicationEnabled>
      
       <!-- These are the default values anyway -->
       <buddyLocatorClass>org.jboss.cache.buddyreplication.NextMemberBuddyLocator</buddyLocatorClass>
      
       <!-- numBuddies is the number of backup nodes each node maintains. ignoreColocatedBuddies means that
       each node will *try* to select a buddy on a different physical host. If not able to do so though,
       it will fall back to colocated nodes. -->
       <buddyLocatorProperties>
       numBuddies = 1
       ignoreColocatedBuddies = true
       </buddyLocatorProperties>
      
 <!-- A way to specify a preferred replication group. If specified, we try to pick a buddy that shares
 the same pool name (falling back to other buddies if not available). This allows the sysadmin to hint at
 how backup buddies are picked; for example, nodes may be hinted to pick buddies on a different physical rack
 or power supply for added fault tolerance. Note: to override this value, use system property desktone.cache.buddyName -->
       <buddyPoolName>myBuddyPoolReplicationGroup</buddyPoolName>
      
 <!-- Communication timeout for inter-buddy group organisation messages (such as assigning to and removing
 from groups); defaults to 1000. -->
       <buddyCommunicationTimeout>2000</buddyCommunicationTimeout>
      
       <!-- Whether data is removed from old owners when gravitated to a new owner. Defaults to true. -->
       <dataGravitationRemoveOnFind>true</dataGravitationRemoveOnFind>
      
       <!-- Whether backup nodes can respond to data gravitation requests, or only the data owner is supposed to respond.
       defaults to true. -->
       <dataGravitationSearchBackupTrees>true</dataGravitationSearchBackupTrees>
      
       <!-- Whether all cache misses result in a data gravitation request. Defaults to false, requiring callers to
       enable data gravitation on a per-invocation basis using the Options API. -->
       <autoDataGravitation>false</autoDataGravitation>
      
       </config>
       </attribute>
      
       </mbean>
      
       <!-- Uncomment to get a graphical view of the TreeCache MBean above -->
       <!-- <mbean code="org.jboss.cache.TreeCacheView" name="jboss.cache:service=TreeCacheView">-->
       <!-- <depends>jboss.cache:service=TreeCache</depends>-->
       <!-- <attribute name="CacheService">jboss.cache:service=TreeCache</attribute>-->
       <!-- </mbean>-->
      
      
      </server>
      


        • 1. Re: Recommendation for IsolationLevel
          Manik Surtani Master

           When you say "high transaction", what do you mean? A high degree of writes? If you have a write-mostly setup, then using a cache in the first place is really not a good idea, since caches are optimised for read-mostly situations.

           SERIALIZABLE will ensure that you cannot have any concurrency in your system, which may not be a good thing at all. REPEATABLE_READ or READ_COMMITTED at least allow concurrent readers.

          • 2. Re: Recommendation for IsolationLevel
            Ken Ringdahl Newbie

             Well, "high transaction" may have been overshooting a little bit. What we really need is guaranteed consistency of writes. We chose a caching solution because we have a distributed application where reads need to be very fast and writes typically happen outside the normal user data path (e.g. we have inventories that are collected on a periodic basis). Anyway, the scenario I describe is actually very similar to what's described here:

            http://www.jboss.com/index.html?module=bb&op=viewtopic&t=114513

             That ultimately means we're looking at MVCC. In the meantime, I am thinking of implementing our own mutex lock within the cache itself to help with synchronization across the cluster. If we do this, we can change the IsolationLevel back to something more reasonable like REPEATABLE_READ, as you say. My interpretation of SERIALIZABLE was that access to a particular node would be locked when it is first read inside a transaction, but the locking appears to happen during the commit phase, and we still see resource contention.
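             A sketch of that mutex idea, assuming the lock is represented as an owner token written under a well-known key (in JBoss Cache this would be a node under a reserved Fqn such as /locks, written in its own short transaction; the Fqn, class, and method names here are hypothetical, and a ConcurrentMap stands in for the replicated cache):

```java
import java.util.concurrent.ConcurrentMap;

// Hypothetical cluster mutex: the lock is an owner token stored under a
// reserved key. A ConcurrentMap stands in for the replicated cache; with
// JBoss Cache the put/remove would target a node under a reserved Fqn
// (e.g. /locks) inside its own short transaction.
public class CacheMutex {
    private final ConcurrentMap<String, String> store;

    public CacheMutex(ConcurrentMap<String, String> store) {
        this.store = store;
    }

    // Claim /locks/<name> by writing an owner token if absent, retrying
    // with a short backoff until the deadline expires.
    public boolean acquire(String name, String owner, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        do {
            if (store.putIfAbsent("/locks/" + name, owner) == null) {
                return true;      // token written: we own the mutex
            }
            Thread.sleep(50);     // another node holds it; back off
        } while (System.currentTimeMillis() < deadline);
        return false;
    }

    // Release only if we still own the token (conditional remove).
    public void release(String name, String owner) {
        store.remove("/locks/" + name, owner);
    }
}
```

             With the token replicated synchronously under REPL_SYNC, only one node can claim it at a time, and the cache itself can then run at REPEATABLE_READ.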

             Also, we have tried OPTIMISTIC locking, but it seems to have problems with our cache loader. We use a JDBC cache loader and preload the / node, and we see exceptions when the cache first initializes and tries to preload. Is this a known problem? I can certainly recreate it and provide the stack trace if that helps.