1 Reply Latest reply on Jan 25, 2011 4:29 AM by Chatura kalansooriya

    JobExecutor becomes broken and cannot recover (JBPM3.2.2)

    soft Newbie

      Problem: JobExecutor thread becomes broken due to unknown transactional problem. As a reason no tasks are executed and only JBoss restart helps.

       

      Configuration and environment
      - components stack: JBOSS 4.2.3, JBOSS TS 4.2.3, JBPM 3.2.2, Hibernate 3.2.4.sp1.cp04
      - database server: MySQL 5.1.47
      - distributed transaction only within the one database server
      - operated on several Linux environments

       

       

      The problem occurs irregurarly (once per several days and only on some environments). When it happens following errors are logged by JBPMExecutor thread:

      2010-12-15 15:53:20,566 WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator_2

      TwoPhaseCoordinator.beforeCompletion - failed for null
      java.lang.NullPointerException
      at org.hibernate.connection.DatasourceConnectionProvider.closeConnection(DatasourceConnectionProvider.java:74)
      at org.hibernate.jdbc.ConnectionManager.closeConnection(ConnectionManager.java:451)
      at org.hibernate.jdbc.ConnectionManager.aggressiveRelease(ConnectionManager.java:406)
      at org.hibernate.jdbc.ConnectionManager.afterStatement(ConnectionManager.java:281)
      at org.hibernate.jdbc.ConnectionManager.flushEnding(ConnectionManager.java:480)
      at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:305)
      at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:27)
      at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)
      at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:338)
      at org.hibernate.transaction.CacheSynchronization.beforeCompletion(CacheSynchronization.java:59)
      at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:114)
      at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:247)
      at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
      at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
      at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)
      at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
      at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
      at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)
      at org.jbpm.svc.Services.close(Services.java:225)
      at org.jbpm.JbpmContext.close(JbpmContext.java:139)
      at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)
      at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
      2010-12-15 15:53:20,566 WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic action

      a471f1f:c46e:4d08ce0f:c5a aborting with 1 threads active!
      2010-12-15 15:53:20,582 ERROR [org.jbpm.svc.Services] problem closing service 'persistence'
      org.jbpm.JbpmException: couldn't commit JTA transaction
      at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:82)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)
      at org.jbpm.svc.Services.close(Services.java:225)
      at org.jbpm.JbpmContext.close(JbpmContext.java:139)
      at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)
      at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
      Caused by: java.lang.NullPointerException
      at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:235)
      at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
      at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
      at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)
      at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
      at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
      at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)
      ... 5 more
      2010-12-15 15:53:20,589 ERROR [org.jbpm.job.executor.JobExecutorThread] exception in job executor thread. waiting 5000 milliseconds
      org.jbpm.JbpmException: problem closing services {persistence=org.jbpm.JbpmException: couldn't commit JTA transaction}
      at org.jbpm.svc.Services.close(Services.java:245)
      at org.jbpm.JbpmContext.close(JbpmContext.java:139)
      at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)
      at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
      Caused by: org.jbpm.JbpmException: couldn't commit JTA transaction
      at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:82)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)
      at org.jbpm.svc.Services.close(Services.java:225)
      ... 3 more
      Caused by: java.lang.NullPointerException
      at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:235)
      at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
      at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
      at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)
      at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
      at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
      at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)

      ... 5 more

       

      From now all the successive iterations of JobExecutor always end up with following error (and the waiting time doubles after error):

      2010-12-15 15:53:23,524 ERROR [org.jbpm.job.executor.JobExecutorThread] exception in job executor thread. waiting 10000 milliseconds
      org.jbpm.JbpmException: couldn't start JTA transaction
      at org.jbpm.persistence.jta.JtaDbPersistenceService.beginJtaTransaction(JtaDbPersistenceService.java:55)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.<init>(JtaDbPersistenceService.java:28)
      at org.jbpm.persistence.jta.JtaDbPersistenceServiceFactory.openService(JtaDbPersistenceServiceFactory.java:17)
      at org.jbpm.svc.Services.getService(Services.java:144)
      at org.jbpm.svc.Services.getPersistenceService(Services.java:183)
      at org.jbpm.JbpmContext.getPersistenceService(JbpmContext.java:628)
      at org.jbpm.JbpmContext.getJobSession(JbpmContext.java:561)
      at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:112)
      at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)
      Caused by: javax.transaction.NotSupportedException
      at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.begin(BaseTransaction.java:79)
      at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.begin(BaseTransactionManagerDelegate.java:77)
      at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.begin(ServerVMClientUserTransaction.java:124)
      at org.jbpm.persistence.jta.JtaDbPersistenceService.beginJtaTransaction(JtaDbPersistenceService.java:53)
      ... 8 more

       

       

      I suppose that the root cause is some database connection problem or network problem (maybe firewall). The problem occurs irregurarly only on production environments and I wasn't able to find out what it coudl be.

       

      Anyway, I would expect that  regardless of such low-level errors, the JBPM (or Hibernate, Arjuna or anything else inside) should be able to fully recover.

       

      Locally I can reproduce problem by locking Mysql database tables using scenario:

      1) deployed testing jbpm process whose transitions are changed automatically every minute
      2) locked mysql database tables using "FLUSH TABLES WITH READ LOCK \G;" command
      3) wait 1 hour
      4) unlock database tables using "UNLOCK TABLES;" command
      5) now errors (see above) appear in the log


      Can you please give me any advice how to ensure that JBPM recovers correctly?

      Maybe I omited some important configuration in Hibernate/Arjuna/JBPM, Jboss (the JBPM datasource is configured correctly now).

      I cannot belive that the JBOSS restart is sufficent solution.


      Thanks a lot.
      Petr

        • 1. Re: JobExecutor becomes broken and cannot recover (JBPM3.2.2)
          Chatura kalansooriya Newbie

          Hi All,

           

          I had similar issue with jbpm in my production environment. it’s very difficult pinpoint the real issue.

           

          10:07:04,782 WARN  [arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator_2] TwoPhaseCoordinator.beforeCompletion - failed for com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple@30f230f2

          org.hibernate.exception.GenericJDBCException: could not update: [org.jbpm.job.ExecuteActionJob#591486]

                  at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:126)

                  at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:114)

                  at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)

                  at org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:2453)

                  at org.hibernate.persister.entity.AbstractEntityPersister.updateOrInsert(AbstractEntityPersister.java:2335)

                  at org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:2635)

                  at org.hibernate.action.EntityUpdateAction.execute(EntityUpdateAction.java:115)

                  at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:279)

                  at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:263)

                  at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:168)

                  at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)

                  at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)

                  at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)

                  at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)

                  at org.hibernate.transaction.CacheSynchronization.beforeCompletion(CacheSynchronization.java:88)

                  at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:114)

                  at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:247)

                  at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)

                  at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)

                  at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)

                  at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)

                  at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)

                  at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)

                  at org.jbpm.persistence.jta.JtaDbPersistenceService.commit(JtaDbPersistenceService.java:98)

                  at org.jbpm.persistence.db.DbPersistenceService.endTransaction(DbPersistenceService.java:135)

                  at org.jbpm.persistence.db.DbPersistenceService.close(DbPersistenceService.java:201)

                  at org.jbpm.svc.Services.close(Services.java:247)

                  at org.jbpm.JbpmContext.close(JbpmContext.java:131)

                  at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)

                  at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:60)

          Caused by:

          java.sql.SQLException: ORA-01591: lock held by in-doubt distributed transaction 10.23.24923

           

                  at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

                  at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:331)

                  at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:288)

                  at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:745)

                  at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:219)

                  at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:970)

                  at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1190)

                  at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3370)

                  at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:3454)

                  at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeUpdate(WrappedPreparedStatement.java:365)

                  at org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:2435)

                  ... 26 more

          10:07:04,784 ERROR [JobExecutorThread] exception in JbpmJobExector:172.26.23.52:1, waiting 3600000 ms

          java.lang.IllegalStateException: BaseTransaction.rollback - [com.arjuna.ats.internal.jta.transaction.arjunacore.notx] [com.arjuna.ats.internal.jta.transaction.arjunacore.notx] no transaction!

                  at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.rollback(BaseTransaction.java:158)

                  at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.rollback(BaseTransactionManagerDelegate.java:126)

                  at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.rollback(ServerVMClientUserTransaction.java:148)

                  at org.jbpm.persistence.jta.JtaDbPersistenceService.rollback(JtaDbPersistenceService.java:119)

                  at org.jbpm.persistence.db.DbPersistenceService.endTransaction(DbPersistenceService.java:137)

                  at org.jbpm.persistence.db.DbPersistenceService.close(DbPersistenceService.java:201)

                  at org.jbpm.svc.Services.close(Services.java:247)

                  at org.jbpm.JbpmContext.close(JbpmContext.java:131)

                  at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)

                  at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:60)

           

          environmet -jbpm 3.2.7(jta enable persistence service)

                             -jboss 4.2.3 databace

                             -oracle 10g(XA ds)

           

          Pls help me to solve this.

           

          Thanks in Advance !!!!!!!!