JobExecutor becomes broken and cannot recover (JBPM3.2.2)
soft Dec 22, 2010 5:49 AMProblem: JobExecutor thread becomes broken due to unknown transactional problem. As a reason no tasks are executed and only JBoss restart helps.
Configuration and environment
- components stack: JBOSS 4.2.3, JBOSS TS 4.2.3, JBPM 3.2.2, Hibernate 3.2.4.sp1.cp04
- database server: MySQL 5.1.47
- distributed transaction only within the one database server
- operated on several Linux environments
The problem occurs irregurarly (once per several days and only on some environments). When it happens following errors are logged by JBPMExecutor thread:
2010-12-15 15:53:20,566 WARN [com.arjuna.ats.arjuna.logging.arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator_2
TwoPhaseCoordinator.beforeCompletion - failed for null
java.lang.NullPointerException
at org.hibernate.connection.DatasourceConnectionProvider.closeConnection(DatasourceConnectionProvider.java:74)
at org.hibernate.jdbc.ConnectionManager.closeConnection(ConnectionManager.java:451)
at org.hibernate.jdbc.ConnectionManager.aggressiveRelease(ConnectionManager.java:406)
at org.hibernate.jdbc.ConnectionManager.afterStatement(ConnectionManager.java:281)
at org.hibernate.jdbc.ConnectionManager.flushEnding(ConnectionManager.java:480)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:305)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:27)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:338)
at org.hibernate.transaction.CacheSynchronization.beforeCompletion(CacheSynchronization.java:59)
at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:114)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:247)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)
at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)
at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)
at org.jbpm.svc.Services.close(Services.java:225)
at org.jbpm.JbpmContext.close(JbpmContext.java:139)
at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)
at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
2010-12-15 15:53:20,566 WARN [com.arjuna.ats.arjuna.logging.arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic actiona471f1f:c46e:4d08ce0f:c5a aborting with 1 threads active!
2010-12-15 15:53:20,582 ERROR [org.jbpm.svc.Services] problem closing service 'persistence'
org.jbpm.JbpmException: couldn't commit JTA transaction
at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:82)
at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)
at org.jbpm.svc.Services.close(Services.java:225)
at org.jbpm.JbpmContext.close(JbpmContext.java:139)
at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)
at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
Caused by: java.lang.NullPointerException
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:235)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)
at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)
... 5 more
2010-12-15 15:53:20,589 ERROR [org.jbpm.job.executor.JobExecutorThread] exception in job executor thread. waiting 5000 milliseconds
org.jbpm.JbpmException: problem closing services {persistence=org.jbpm.JbpmException: couldn't commit JTA transaction}
at org.jbpm.svc.Services.close(Services.java:245)
at org.jbpm.JbpmContext.close(JbpmContext.java:139)
at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)
at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
Caused by: org.jbpm.JbpmException: couldn't commit JTA transaction
at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:82)
at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)
at org.jbpm.svc.Services.close(Services.java:225)
... 3 more
Caused by: java.lang.NullPointerException
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:235)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)
at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)... 5 more
From now all the successive iterations of JobExecutor always end up with following error (and the waiting time doubles after error):
2010-12-15 15:53:23,524 ERROR [org.jbpm.job.executor.JobExecutorThread] exception in job executor thread. waiting 10000 milliseconds
org.jbpm.JbpmException: couldn't start JTA transaction
at org.jbpm.persistence.jta.JtaDbPersistenceService.beginJtaTransaction(JtaDbPersistenceService.java:55)
at org.jbpm.persistence.jta.JtaDbPersistenceService.<init>(JtaDbPersistenceService.java:28)
at org.jbpm.persistence.jta.JtaDbPersistenceServiceFactory.openService(JtaDbPersistenceServiceFactory.java:17)
at org.jbpm.svc.Services.getService(Services.java:144)
at org.jbpm.svc.Services.getPersistenceService(Services.java:183)
at org.jbpm.JbpmContext.getPersistenceService(JbpmContext.java:628)
at org.jbpm.JbpmContext.getJobSession(JbpmContext.java:561)
at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:112)
at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
Caused by: javax.transaction.NotSupportedException
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.begin(BaseTransaction.java:79)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.begin(BaseTransactionManagerDelegate.java:77)
at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.begin(ServerVMClientUserTransaction.java:124)
at org.jbpm.persistence.jta.JtaDbPersistenceService.beginJtaTransaction(JtaDbPersistenceService.java:53)
... 8 more
I suppose that the root cause is some database connection problem or network problem (maybe firewall). The problem occurs irregurarly only on production environments and I wasn't able to find out what it coudl be.
Anyway, I would expect that regardless of such low-level errors, the JBPM (or Hibernate, Arjuna or anything else inside) should be able to fully recover.
Locally I can reproduce problem by locking Mysql database tables using scenario:
1) deployed testing jbpm process whose transitions are changed automatically every minute
2) locked mysql database tables using "FLUSH TABLES WITH READ LOCK \G;" command
3) wait 1 hour
4) unlock database tables using "UNLOCK TABLES;" command
5) now errors (see above) appear in the log
Can you please give me any advice how to ensure that JBPM recovers correctly?
Maybe I omited some important configuration in Hibernate/Arjuna/JBPM, Jboss (the JBPM datasource is configured correctly now).
I cannot belive that the JBOSS restart is sufficent solution.
Thanks a lot.
Petr