org.infinispan.CacheException: Could not prepare - caused by XAException
apatispelikan Jun 20, 2012 7:40 AM
Hello,
I am trying to tune JBoss AS 7.1.2 for higher throughput and am getting errors whose cause I cannot find:
09:59:17,123 WARN [com.arjuna.ats.arjuna] (http-executor-threads - 41) ARJUNA012125: TwoPhaseCoordinator.beforeCompletion - failed for SynchronizationImple< 0:ffffc29e84b8:-4561d3a3:4fe1818d:43524, SynchronizationAdapter{localTransaction=LocalTransaction{remoteLockedNodes=null, isMarkedForRollback=false, transaction=TransactionImple < ac, BasicAction: 0:ffffc29e84b8:-4561d3a3:4fe1818d:4325a status: ActionStatus.ABORT_ONLY >, lockedKeys=null, backupKeyLocks=null, viewId=1} org.infinispan.transaction.synchronization.SyncLocalTransaction@de0b} org.infinispan.transaction.synchronization.SynchronizationAdapter@de2a >: org.infinispan.CacheException: Could not prepare.
at org.infinispan.transaction.synchronization.SynchronizationAdapter.beforeCompletion(SynchronizationAdapter.java:70) [infinispan-core-5.1.4.FINAL.jar:5.1.4.FINAL]
at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:76)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:273)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:93)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:164)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1165)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:117)
...
Caused by: javax.transaction.xa.XAException
at org.infinispan.transaction.TransactionCoordinator.prepare(TransactionCoordinator.java:160) [infinispan-core-5.1.4.FINAL.jar:5.1.4.FINAL]
at org.infinispan.transaction.TransactionCoordinator.prepare(TransactionCoordinator.java:122) [infinispan-core-5.1.4.FINAL.jar:5.1.4.FINAL]
at org.infinispan.transaction.synchronization.SynchronizationAdapter.beforeCompletion(SynchronizationAdapter.java:68) [infinispan-core-5.1.4.FINAL.jar:5.1.4.FINAL]
... 85 more
These exceptions only occur while I stress JBoss with a test. The test sends a configurable number of parallel requests to different web service methods (implemented by an SLSB), but only to one node (this is a two-node cluster in domain mode - please don't ask: yes, many more nodes will follow :-) ).
This is my cache configuration:
<cache-container name="hibernate" default-cache="local-query" module="org.jboss.as.jpa.hibernate:4" eviction-executor="infinispan-eviction">
<transport lock-timeout="60000"/>
<invalidation-cache name="local-query" mode="SYNC">
<transaction mode="NONE"/>
<eviction strategy="LRU" max-entries="5000"/>
<expiration max-idle="660000"/>
<locking concurrency-level="100"/>
</invalidation-cache>
<invalidation-cache name="entity" mode="SYNC">
<transaction mode="NON_XA"/>
<eviction strategy="LRU" max-entries="100000"/>
<expiration max-idle="3600000"/>
<locking concurrency-level="100"/>
</invalidation-cache>
<replicated-cache name="timestamps" mode="ASYNC">
<transaction mode="NONE"/>
<eviction strategy="NONE"/>
</replicated-cache>
</cache-container>
Further changes:
- I use an Apache server in front of JBoss, configured to serve 600 requests concurrently
- I added a separate thread pool for the web module
<bounded-queue-thread-pool name="http-executor">
<core-threads count="58"/>
<queue-length count="148"/>
<max-threads count="58"/>
<keepalive-time time="10" unit="seconds"/>
</bounded-queue-thread-pool>
- I increased my database-pool (XA-datasource to Oracle)
<xa-pool>
<min-pool-size>2</min-pool-size>
<max-pool-size>100</max-pool-size>
<is-same-rm-override>false</is-same-rm-override>
<interleaving>false</interleaving>
<pad-xid>false</pad-xid>
<wrap-xa-resource>false</wrap-xa-resource>
</xa-pool>
- I adapted infinispan-configuration (cache-type, concurrency-level)
When this exception occurs, JBoss seems to block and no requests are served for approximately 10-12 seconds. Once the requests that got those exceptions are done, it works for a while until the next 10-second stall arrives.
I did some changes and got different results:
- Changing the query cache from invalidation to a local cache delays those exceptions for a long while - but I still get them eventually
- Depending on the concurrency level of my tests, those exceptions arrive earlier or later:
- invalidation-cache/concurrency=25: after 30sec (throughput: ~300 requests/second)
- local-cache/concurrency=100: after 2-4 minutes (throughput: ~1500 requests/second)
- local-cache/concurrency=200: after 1-2 minutes (throughput: ~1000 requests/second)
- Running the stress test while the second node is offline also delays the occurrence of those exceptions (but does not avoid them)
- System load increases during my tests (2-4; of course this is normal), but suddenly it rises to a very high value (24-40), and at that moment the 10-second stall and those exceptions occur.
- The duration of the requests increases steadily, starting at 300ms and ending at 800ms (although those requests do the same work on different database records)
- One type of request checks for a certain database record: if this record does not exist, it will be created. If I force this situation (by preparing data for the test), those exceptions occur earlier.
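For reference, a sketch of what the local-cache variant of the query cache would look like, assuming only the cache type changes and all other settings are carried over unchanged from the invalidation-cache definition above (a local cache has no mode attribute, so it is dropped):

```xml
<!-- Assumption: same settings as the "local-query" invalidation-cache above, only the cache type differs -->
<local-cache name="local-query">
    <transaction mode="NONE"/>
    <eviction strategy="LRU" max-entries="5000"/>
    <expiration max-idle="660000"/>
    <locking concurrency-level="100"/>
</local-cache>
```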
So my questions:
- What causes this exception?
- What kind of logging can I enable to get further information?
- It seems that something is too slow or sized too small. I tried to increase everything in the chain of resources (HTTP, SLSB pool, DB pool). Did I forget anything?
- I use UDP for cluster-communication. Could this be a bottleneck on the network? (the server has two 1-gigabit network-cards!)
- Can this be a deadlock on a resource? I noticed Infinispan has deadlock detection, but I cannot find any information on how to configure it in AS7.
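Regarding the logging question, the kind of thing I have in mind is a pair of logger entries in the logging subsystem of the profile, roughly as sketched below. The two categories are simply taken from the package names in the stack trace above, and TRACE is an assumption on my side; I do not know whether this is the right level of detail or the right set of categories.

```xml
<!-- Sketch only: categories guessed from the stack trace, level chosen as an assumption -->
<logger category="org.infinispan">
    <level name="TRACE"/>
</logger>
<logger category="com.arjuna">
    <level name="TRACE"/>
</logger>
```

As far as I understand, the log handlers must also let TRACE through, otherwise these messages are filtered out before they reach the log file.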
I hope someone can help me. This is the last hurdle to get my migration from 5.1 to 7.1 done!
Thanks,
Stephan