Rollback during session replication
fotero May 29, 2006 5:13 PM
Hi,
I'm trying to set up a 2-node cluster with HTTP session replication. Everything seems correct: the applications get deployed, the users can access them, and so on. The problem is that I'm receiving rollback exceptions during session replication:
10:04:28,371 ERROR [DummyTransaction] beforeCompletion() failed for tx=org.jboss.cache.transaction.DummyTransaction@e3e4fafe, handlers=[TxInterceptor.LocalSynchronizationHandler(gtx=GlobalTransaction:<172.31.5.65:37591>:167, tx=org.jboss.cache.transaction.DummyTransaction@e3e4fafe)]
java.lang.RuntimeException:
    at org.jboss.cache.interceptors.TxInterceptor$LocalSynchronizationHandler.beforeCompletion(TxInterceptor.java:1065)
    at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
    at org.jboss.cache.transaction.DummyTransaction.notifyBeforeCompletion(DummyTransaction.java:247)
    at org.jboss.cache.transaction.DummyTransaction.commit(DummyTransaction.java:54)
    at org.jboss.cache.transaction.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:61)
    at org.jboss.web.tomcat.tc5.session.JBossCacheManager.endTransaction(JBossCacheManager.java:1038)
    at org.jboss.web.tomcat.tc5.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1017)
    at org.jboss.web.tomcat.tc5.session.JBossCacheManager.storeSession(JBossCacheManager.java:637)
    at org.jboss.web.tomcat.tc5.session.InstantSnapshotManager.snapshot(InstantSnapshotManager.java:52)
    at org.jboss.web.tomcat.tc5.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:105)
    at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:74)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
    at java.lang.Thread.run()V(Unknown Source)
Caused by: org.jboss.cache.ReplicationException: rsp=sender=172.31.5.66:37857, retval=null, received=false, suspected=false
    at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3747)
    at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3672)
    at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3770)
    at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:87)
    at org.jboss.cache.interceptors.ReplicationInterceptor.runPreparePhase(ReplicationInterceptor.java:143)
    at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:61)
    at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:67)
    at org.jboss.cache.interceptors.TxInterceptor.runPreparePhase(TxInterceptor.java:781)
    at org.jboss.cache.interceptors.TxInterceptor$LocalSynchronizationHandler.beforeCompletion(TxInterceptor.java:1043)
    at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
    at org.jboss.cache.transaction.DummyTransaction.notifyBeforeCompletion(DummyTransaction.java:247)
    at org.jboss.cache.transaction.DummyTransaction.commit(DummyTransaction.java:54)
    at org.jboss.cache.transaction.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:61)
    at org.jboss.web.tomcat.tc5.session.JBossCacheManager.endTransaction(JBossCacheManager.java:1038)
    at org.jboss.web.tomcat.tc5.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1017)
    at org.jboss.web.tomcat.tc5.session.JBossCacheManager.storeSession(JBossCacheManager.java:637)
    at org.jboss.web.tomcat.tc5.session.InstantSnapshotManager.snapshot(InstantSnapshotManager.java:52)
    at org.jboss.web.tomcat.tc5.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:105)
    at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:74)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
Caused by: org.jboss.cache.lock.TimeoutException: timeout for 172.31.5.66:37857
    at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:3745)
    ... 25 more
10:04:28,387 WARN [JBossCacheManager] JBossCacheManager.endTransaction(): rolling back transaction with exception: javax.transaction.RollbackException: outcome is false stats: 1
I'm using JBoss 4.0.4 GA with JBoss Cache 1.3.0 and JGroups 2.2.9.2, Solaris 10 SPARC and JRockit VM 1.5.0_06. I changed the JGroups configuration in tc-cluster.sar/META-INF/jboss-service.xml based on the fc-fast.xml, as follows:
<?xml version="1.0" encoding="UTF-8"?>
<!-- ===================================================================== -->
<!-- -->
<!-- Customized TreeCache Service Configuration for Tomcat 5 Clustering -->
<!-- -->
<!-- ===================================================================== -->
<server>
  <!-- ==================================================================== -->
  <!-- Defines TreeCache configuration -->
  <!-- ==================================================================== -->
  <!-- Note we are using TeeCacheAop -->
  <mbean code="org.jboss.cache.aop.TreeCacheAop" name="jboss.cache:service=TomcatClusteringCache">
    <depends>jboss:service=Naming</depends>
    <depends>jboss:service=TransactionManager</depends>
    <!-- We need the AspectDeployer to deploy our FIELD granularity aspects -->
    <depends>jboss.aop:service=AspectDeployer</depends>
    <!-- Name of cluster. Needs to be the same for all nodes in the cluster,
         in order to find each other -->
    <attribute name="ClusterName">Tomcat-${jboss.partition.name:Cluster}</attribute>
    <!-- Isolation level :
         SERIALIZABLE
         REPEATABLE_READ (default)
         READ_COMMITTED
         READ_UNCOMMITTED
         NONE -->
    <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
    <!-- Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC
         If you use REPL_SYNC and a UDP-based ClusterConfig we recommend you
         comment out the FC (flow control) protocol in the ClusterConfig
         section below. -->
    <attribute name="CacheMode">REPL_SYNC</attribute>
    <!-- Configuration options for use with JBossCache 1.2.4 and later.
         Comment out and replace with the JBossCache 1.2.3 options below if
         you are using JBossCache version 1.2.3.1 or earlier.

         UseMarshalling
           Indicates whether to the cache should unmarshall objects replicated
           from other cluster nodes, or store them internally as a byte[]
           until a web app requests them. Must be "true" if session
           replication granularity "FIELD" is used in any webapp, otherwise
           "false" is recommended.
         InactiveOnStartup
           Whether or not the entire tree is inactive upon startup, only
           responding to replication messages after activateRegion() is
           called to activate one or more parts of the tree when a webapp is
           deployed. Must have the same value as "UseMarshalling".
         TransactionManagerLookupClass
           Make sure to specify BatchModeTransactionManager only! -->
    <attribute name="UseMarshalling">false</attribute>
    <attribute name="InactiveOnStartup">false</attribute>
    <attribute name="TransactionManagerLookupClass">org.jboss.cache.BatchModeTransactionManagerLookup</attribute>
    <!-- Configuration to use with JBossCache 1.2.3 and earlier.
         Uncomment and comment out the JBossCache 1.2.4 options above if you
         are using JBossCache version 1.2.3.1 or earlier.
         Any valid implementation of TransactionManagerLookup can be used.
    <attribute name="TransactionManagerLookupClass">org.jboss.cache.JBossTransactionManagerLookup</attribute>
    -->
    <!-- JGroups protocol stack properties. Can also be a URL,
         e.g. file:/home/bela/default.xml
    <attribute name="ClusterProperties"></attribute>
    -->
    <attribute name="ClusterConfig">
      <Config>
        <UDP bind_addr="${jboss.sync.bind.address}"
             mcast_send_buf_size="10000000"
             mcast_addr="${jboss.partition.udpGroup}"
             mcast_port="45577"
             tos="16"
             ucast_recv_buf_size="10000000"
             receive_on_all_interfaces="false"
             loopback="false"
             mcast_recv_buf_size="10000000"
             max_bundle_size="64000"
             max_bundle_timeout="30"
             use_incoming_packet_handler="false"
             use_outgoing_packet_handler="true"
             ucast_send_buf_size="10000000"
             ip_ttl="32"
             enable_bundling="true"/>
        <PING timeout="2000" down_thread="false" num_initial_members="3"/>
        <MERGE2 max_interval="10000" down_thread="false" min_interval="5000"/>
        <FD_SOCK srv_sock_bind_addr="${jboss.sync.bind.address}" down_thread="false"/>
        <VERIFY_SUSPECT timeout="1500" down_thread="false"/>
        <pbcast.NAKACK max_xmit_size="60000" down_thread="false" use_mcast_xmit="true" gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"/>
        <UNICAST timeout="300,600,1200,2400,3600" down_thread="false"/>
        <pbcast.STABLE stability_delay="1000" desired_avg_gossip="5000" down_thread="false" max_bytes="250000"/>
        <VIEW_SYNC avg_send_interval="60000" down_thread="false" up_thread="false" />
        <pbcast.GMS print_local_addr="true" join_timeout="3000" down_thread="false" join_retry_timeout="2000" shun="true"/>
        <!--FC max_credits="1000000" down_thread="false" min_threshold="0.10"/-->
        <FRAG2 frag_size="60000" down_thread="false" up_thread="true"/>
        <!--COMPRESS down_thread="false" min_size="500" compression_level="3" up_thread="true"/-->
        <pbcast.STATE_TRANSFER down_thread="false" up_thread="false"/>
      </Config>
    </attribute>
    <!-- Number of milliseconds to wait until all responses for a
         synchronous call have been received. -->
    <attribute name="SyncReplTimeout">5000</attribute>
    <!-- Max number of milliseconds to wait for a lock acquisition -->
    <attribute name="LockAcquisitionTimeout">15000</attribute>
  </mbean>
</server>
After a while of getting this exception, the server slows down and I receive the following messages:
16:19:38,401 WARN [TimeScheduler] task org.jgroups.protocols.TP$Bundler$BundlingTimer@ebb8f15c took 6783ms to execute, please check why it is taking so long. It is delaying other tasks
16:21:50,139 WARN [TimeScheduler] task org.jgroups.protocols.pbcast.STABLE$StabilitySendTask@ebc73c3f took 6187ms to execute, please check why it is taking so long. It is delaying other tasks
Am I missing something or doing something wrong?
Thanks in advance,
Fernando