Failover in Clustered Deployment of RiftSaw
anujbhatia Jul 21, 2011 7:04 AM
Hello,
If one of the nodes in a clustered RiftSaw deployment dies, is it possible for BPEL process instances running on that node to fail over to the other nodes in that cluster?
I'm trying to test this using RiftSaw 2.3.0 deployed on a two-node JBoss 5.1.0 cluster, with both RiftSaw instances pointing to the same MySQL instance. I start a BPEL process that has a wait activity between a couple of invoke activities, and then shut down the node it is executing on. On the surviving node I get this error:
14:56:12,534 INFO [org.jboss.soa.bpel.clustering.ODEJobClusterListener] The available nodes now are [127.0.0.1:1099]
14:56:12,534 ERROR [org.hibernate.ejb.AbstractEntityManagerImpl] Unable to mark for rollback on PersistenceException:
java.lang.IllegalStateException: [com.arjuna.ats.internal.jta.transaction.arjunacore.nosuchtx] [com.arjuna.ats.internal.jta.transaction.arjunacore.nosuchtx] No such transaction!
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.setRollbackOnly(BaseTransaction.java:191)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.setRollbackOnly(BaseTransactionManagerDelegate.java:123)
at org.hibernate.ejb.AbstractEntityManagerImpl.markAsRollback(AbstractEntityManagerImpl.java:421)
at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:576)
at org.hibernate.ejb.QueryImpl.executeUpdate(QueryImpl.java:48)
at org.apache.ode.dao.jpa.scheduler.SchedulerDAOConnectionImpl.updateReassign(SchedulerDAOConnectionImpl.java:177)
at org.jboss.soa.bpel.clustering.ODEJobClusterListener.membershipChanged(ODEJobClusterListener.java:79)
at org.jboss.ha.framework.server.ClusterPartition.notifyListeners(ClusterPartition.java:1589)
at org.jboss.ha.framework.server.ClusterPartition.processEvent(ClusterPartition.java:1437)
at org.jboss.ha.framework.server.AsynchEventHandler.run(AsynchEventHandler.java:108)
at java.lang.Thread.run(Thread.java:619)
14:56:12,549 WARN [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] HAMembershipListener callback failure: org.jboss.soa.bpel.clustering.ODEJobClusterListener@125310f
javax.persistence.TransactionRequiredException: Executing an update/delete query
at org.hibernate.ejb.QueryImpl.executeUpdate(QueryImpl.java:48)
at org.apache.ode.dao.jpa.scheduler.SchedulerDAOConnectionImpl.updateReassign(SchedulerDAOConnectionImpl.java:177)
at org.jboss.soa.bpel.clustering.ODEJobClusterListener.membershipChanged(ODEJobClusterListener.java:79)
at org.jboss.ha.framework.server.ClusterPartition.notifyListeners(ClusterPartition.java:1589)
at org.jboss.ha.framework.server.ClusterPartition.processEvent(ClusterPartition.java:1437)
at org.jboss.ha.framework.server.AsynchEventHandler.run(AsynchEventHandler.java:108)
at java.lang.Thread.run(Thread.java:619)
The process does not execute further on the second node. It looks like there is some failover support, but it is not working because of the TransactionRequiredException. Even after I restart the first node, the process does not seem to resume execution there.
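To make my reading of the trace concrete, here is a toy Java sketch of what I think is happening: the membership callback calls the scheduler's reassign update on the cluster event thread with no transaction active, so the update is rejected. This is only an assumption based on the stack trace. FakeTxManager, FakeSchedulerDao, and the node names are hypothetical stand-ins, not RiftSaw, ODE, or JBoss classes.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a transaction manager; not the JBoss/Arjuna API. */
final class FakeTxManager {
    // Tracks whether the current thread is inside a (pretend) transaction.
    private final ThreadLocal<Boolean> active = ThreadLocal.withInitial(() -> false);

    boolean isActive() {
        return active.get();
    }

    /** Runs the work with the transaction flag set, then clears the flag. */
    <T> T inTransaction(Supplier<T> work) {
        active.set(true);
        try {
            return work.get();
        } finally {
            active.set(false);
        }
    }
}

/** Hypothetical stand-in for the scheduler DAO that reassigns jobs. */
final class FakeSchedulerDao {
    private final FakeTxManager tx;

    FakeSchedulerDao(FakeTxManager tx) {
        this.tx = tx;
    }

    /** Pretends to move jobs from a dead node to a live one; needs a transaction. */
    int updateReassign(String deadNode, String liveNode) {
        if (!tx.isActive()) {
            // Mirrors the message seen in the log, using a plain JDK exception.
            throw new IllegalStateException("Executing an update/delete query");
        }
        return 1; // pretend one job was reassigned
    }
}

public class FailoverSketch {
    public static void main(String[] args) {
        FakeTxManager tx = new FakeTxManager();
        FakeSchedulerDao dao = new FakeSchedulerDao(tx);

        // Case 1: callback runs with no transaction, as the trace suggests.
        try {
            dao.updateReassign("nodeA", "nodeB");
        } catch (IllegalStateException e) {
            System.out.println("Rejected without transaction: " + e.getMessage());
        }

        // Case 2: the same call wrapped in a transaction goes through.
        int moved = tx.inTransaction(() -> dao.updateReassign("nodeA", "nodeB"));
        System.out.println("Jobs reassigned inside transaction: " + moved);
    }
}
```

If this picture is right, the jobs owned by the dead node are never reassigned, which would explain why the instance stays stuck.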
Does anybody know whether this error is what prevents failover from working, or is this not a valid test case in the first place?
Thanks
Anuj