4 Replies Latest reply on Jul 22, 2011 2:15 AM by anujbhatia

Failover in Clustered Deployment of RiftSaw

anujbhatia Jul 21, 2011 7:04 AM

Hello,

If one of the nodes in a clustered RiftSaw deployment dies, is it possible for BPEL process instances running on that node to fail over to the other nodes in that cluster?

I'm trying to test this using RiftSaw 2.3.0 deployed on a two-node JBoss 5.1.0 cluster, with both RiftSaw instances pointing to the same MySQL instance. I initiate a BPEL process that has a wait activity between a couple of invoke activities, and then shut down the node it is executing on. On the surviving node I get this error message:

14:56:12,534 INFO [org.jboss.soa.bpel.clustering.ODEJobClusterListener] The available nodes now are [127.0.0.1:1099]
14:56:12,534 ERROR [org.hibernate.ejb.AbstractEntityManagerImpl] Unable to mark for rollback on PersistenceException:
java.lang.IllegalStateException: [com.arjuna.ats.internal.jta.transaction.arjunacore.nosuchtx] [com.arjuna.ats.internal.jta.transaction.arjunacore.nosuchtx] No such transaction!
    at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.setRollbackOnly(BaseTransaction.java:191)
    at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.setRollbackOnly(BaseTransactionManagerDelegate.java:123)
    at org.hibernate.ejb.AbstractEntityManagerImpl.markAsRollback(AbstractEntityManagerImpl.java:421)
    at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:576)
    at org.hibernate.ejb.QueryImpl.executeUpdate(QueryImpl.java:48)
    at org.apache.ode.dao.jpa.scheduler.SchedulerDAOConnectionImpl.updateReassign(SchedulerDAOConnectionImpl.java:177)
    at org.jboss.soa.bpel.clustering.ODEJobClusterListener.membershipChanged(ODEJobClusterListener.java:79)
    at org.jboss.ha.framework.server.ClusterPartition.notifyListeners(ClusterPartition.java:1589)
    at org.jboss.ha.framework.server.ClusterPartition.processEvent(ClusterPartition.java:1437)
    at org.jboss.ha.framework.server.AsynchEventHandler.run(AsynchEventHandler.java:108)
    at java.lang.Thread.run(Thread.java:619)
14:56:12,549 WARN [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] HAMembershipListener callback failure: org.jboss.soa.bpel.clustering.ODEJobClusterListener@125310f
javax.persistence.TransactionRequiredException: Executing an update/delete query
    at org.hibernate.ejb.QueryImpl.executeUpdate(QueryImpl.java:48)
    at org.apache.ode.dao.jpa.scheduler.SchedulerDAOConnectionImpl.updateReassign(SchedulerDAOConnectionImpl.java:177)
    at org.jboss.soa.bpel.clustering.ODEJobClusterListener.membershipChanged(ODEJobClusterListener.java:79)
    at org.jboss.ha.framework.server.ClusterPartition.notifyListeners(ClusterPartition.java:1589)
    at org.jboss.ha.framework.server.ClusterPartition.processEvent(ClusterPartition.java:1437)
    at org.jboss.ha.framework.server.AsynchEventHandler.run(AsynchEventHandler.java:108)
    at java.lang.Thread.run(Thread.java:619)

The process does not execute any further on the second node. It looks like there is some failover support, but it is not working because of the TransactionRequiredException. Even after I restart the first node, the process does not seem to resume execution there.
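
To illustrate what the trace seems to point at (a bulk update issued from the membership callback with no transaction active), here is a minimal, self-contained sketch of the general pattern of wrapping such an update in an explicit transaction boundary. This is not RiftSaw or ODE code: `FakeTxManager`, `FakeSchedulerDao`, `reassignInTx` and the node names are hypothetical stand-ins, and whether the real listener should be fixed this way is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class TxCallbackSketch {

    /** Hypothetical stand-in for a transaction manager with begin/commit/rollback. */
    static class FakeTxManager {
        private boolean active = false;
        final List<String> events = new ArrayList<>();

        void begin() { active = true; events.add("begin"); }
        void commit() { active = false; events.add("commit"); }
        void rollback() { active = false; events.add("rollback"); }
        boolean isActive() { return active; }
    }

    /** Hypothetical stand-in for a scheduler DAO that refuses updates outside a transaction. */
    static class FakeSchedulerDao {
        private final FakeTxManager tx;

        FakeSchedulerDao(FakeTxManager tx) { this.tx = tx; }

        int updateReassign(String deadNode, String liveNode) {
            if (!tx.isActive()) {
                // Mirrors the symptom in the log: an update with no active transaction.
                throw new IllegalStateException("Executing an update/delete query without a transaction");
            }
            tx.events.add("reassign " + deadNode + " -> " + liveNode);
            return 1; // pretend one job row was reassigned
        }
    }

    /** Runs the reassignment inside an explicit transaction; rolls back on failure. */
    static int reassignInTx(FakeTxManager tx, FakeSchedulerDao dao, String deadNode, String liveNode) {
        tx.begin();
        try {
            int updated = dao.updateReassign(deadNode, liveNode);
            tx.commit();
            return updated;
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }

    public static void main(String[] args) {
        FakeTxManager tx = new FakeTxManager();
        FakeSchedulerDao dao = new FakeSchedulerDao(tx);

        // Called directly from a "callback" with no transaction: fails.
        try {
            dao.updateReassign("nodeA", "nodeB");
        } catch (IllegalStateException e) {
            System.out.println("without tx: " + e.getMessage());
        }

        // Same call wrapped in a transaction boundary: succeeds.
        int updated = reassignInTx(tx, dao, "nodeA", "nodeB");
        System.out.println("with tx: " + updated + " job(s) reassigned");
    }
}
```

In a real container the begin/commit calls would go through the server's transaction manager rather than a fake; the sketch only shows why the same update can fail in the callback and succeed once a transaction is active.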

Does anybody know whether this error is what prevents failover from working, or is this not a valid test case in the first place?

Thanks

Anuj

1. Re: Failover in Clustered Deployment of RiftSaw

objectiser Jul 21, 2011 7:44 AM (in response to anujbhatia)

This sounds like a valid use case. Could you raise a JIRA issue, attach your example and logs, and describe the steps to reproduce the problem?

Regards
Gary
2. Re: Failover in Clustered Deployment of RiftSaw

anujbhatia Jul 21, 2011 9:29 AM (in response to objectiser)

JIRA created: https://issues.jboss.org/browse/RIFTSAW-404

I found that the problem happens even with an empty installation, i.e. without any BPEL process deployed, so I have not attached an example BPEL file.

Thanks
Anuj
3. Re: Failover in Clustered Deployment of RiftSaw

objectiser Jul 21, 2011 9:57 AM (in response to anujbhatia)

OK, thanks. It would still be good to include the BPEL process as well, so that it can be tried once the transaction exception has been sorted out.

Regards
Gary
4. Re: Failover in Clustered Deployment of RiftSaw

anujbhatia Jul 22, 2011 2:15 AM (in response to objectiser)

Added a sample BPEL process to the JIRA as well.