1 2 Previous Next 16 Replies Latest reply on Oct 13, 2008 4:20 PM by Ronald van Kuijk

    Action not executing in timer

    James Cook Newbie

      I am having some trouble getting a workflow to function the way I want it to. I have stripped it down to a very simple example.

      I have a node that will attempt to process a credit card. If it fails to communicate to the server I would like it to pause for a couple minutes, then try again. This example is bare bones and node1 represents the initial attempt anf state1 represents a delay. After the delay, the next node will eventually be a node that retries the communication, but for now it simply ends.

      <process-definition xmlns="" name="examplePause">
       <start-state name="startPauseExample">
       <transition to="node1"></transition>
       </start-state>
      
       <node name="node1">
       <script>
       System.out.println("Node: " + node);
       token.signal();
       </script>
       <transition to="state1"></transition>
       </node>
      
       <state name="state1">
       <script>
       System.out.println("Node: " + node);
       </script>
       <timer name="timer1" duedate="10 seconds" transition="end" cancel-event="node-left">
       <script>
       System.out.println("Timer executed.");
       token.signal();
       </script>
       </timer>
       <transition to="endPauseExample" name="end"></transition>
       </state>
      
      
       <end-state name="endPauseExample"></end-state>
      </process-definition>
      


      My unit test executes the workflow by calling getRootToken.signal(). It then sleeps for 15 seconds. At the end of 15 seconds, I was hoping that the timer would of fired and I would transition to the next node (the end state). But when the 15 seconds ends, the token is still in 'state1'.

      I'm not sure if the following logs show the appropriate steps, but the major players seem to be there:

      [org.jbpm.graph.def.GraphElement] - event 'process-start' on 'ProcessDefinition(examplePause)' for
       'Token(/)'
      [org.jbpm.graph.def.GraphElement] - event 'before-signal' on 'StartState(startPauseExample)' for
       'Token(/)'
      [org.jbpm.graph.def.GraphElement] - event 'node-leave' on 'StartState(startPauseExample)' for
       'Token(/)'
      [org.jbpm.graph.def.GraphElement] - event 'transition' on 'Transition(7857a5)' for 'Token(/)'
      [org.jbpm.graph.def.GraphElement] - event 'node-enter' on 'Node(node1)' for 'Token(/)'
      [org.jbpm.graph.def.GraphElement] - executing action 'Script(19deddb)'
      [org.jbpm.graph.def.GraphElement] - event 'before-signal' on 'Node(node1)' for 'Token(/)'
      [org.jbpm.graph.def.GraphElement] - event 'node-leave' on 'Node(node1)' for 'Token(/)'
      [org.jbpm.graph.def.GraphElement] - event 'transition' on 'Transition(3744bc)' for 'Token(/)'
      [org.jbpm.graph.def.GraphElement] - event 'node-enter' on 'State(state1)' for 'Token(/)'
      [org.jbpm.graph.def.GraphElement] - executing action 'CreateTimerAction(8032df)'
      [org.jbpm.graph.def.GraphElement] - event 'timer-create' on 'State(state1)' for 'Token(/)'
      I see the database insert statement...
      [org.jbpm.graph.def.GraphElement] - event 'after-signal' on 'Node(node1)' for 'Token(/)'
      [ org.jbpm.graph.action.Script] - script output: {}
      [org.jbpm.graph.def.GraphElement] - event 'after-signal' on 'StartState(startPauseExample)' for
       'Token(/)'
      I see the job being pulled from the database by the JobExecutor...
      [ org.hibernate.pretty.Printer] - org.jbpm.job.Timer{isExclusive=false, lockTime=null,
       isSuspended=false, exception=null,
       graphElement=org.jbpm.graph.node.State#18,
       processInstance=org.jbpm.graph.exe.ProcessInstance#1,
       repeat=null, version=0, id=1, lockOwner=null,
       token=org.jbpm.graph.exe.Token#1, name=timer1,
       action=org.jbpm.graph.def.Action#20, retries=1,
       dueDate=2008-10-12 02:56:25, transitionName=end,
       taskInstance=null}
      


      The database record references the correct action in the database (System.out.println("Timer executed...) but the action is never initiated.

      Any ideas what I might try? Is the jpdl correct for what I intend to do?

      Thanks.



        • 1. Re: Action not executing in timer
          Ronald van Kuijk Master

          remove the token.signal.... there is already a transition declared on the timer. Signalling is superfluous.

          • 2. Re: Action not executing in timer
            James Cook Newbie

            Thanks, I'll do that. The problem seems to be that the script never gets invoked, so the signal isn't called anyway. Is there some reason why my process definition doesn't work as I expect even though I can see that the JobExecutor reads the job from the database?

            • 3. Re: Action not executing in timer
              Ronald van Kuijk Master

              Try the change I proposed and see. There have been (very recently) similar 'problems' due to this 'misconfiguration'

              • 4. Re: Action not executing in timer
                James Cook Newbie

                I removed the signal call, with the same results.

                I can trace the problem to the JobExecutorThread's acquireJobs() method. The job is read from the db, and it is not "exclusive". Because of this, a lock is acquired on the job. I'm perplexed how this will have any bearing on whether the job is executed later or not. It doesn't really matter however, because the next step is to close the jbpmContext. During this process, the thread doesn't return for a minute or so. When it does eventually return, I receive the following exception:

                update JBPM_JOB set VERSION_=1, DUEDATE_=2008-10-12 15:41:51.0, PROCESSINSTANCE_=3, TOKEN_=3,
                TASKINSTANCE_=null, ISSUSPENDED_=false, ISEXCLUSIVE_=false, LOCKOWNER_='JbpmJobExecutor:10.1.1.22:1',
                LOCKTIME_=2008-10-12 15:41:51.009, EXCEPTION_=null, RETRIES_=1, NAME_='timer1', REPEAT_=null,
                TRANSITIONNAME_='end', ACTION_=80, GRAPHELEMENTTYPE_='org.jbpm.graph.node.State', GRAPHELEMENT_=82
                where ID_=3 and VERSION_=0
                java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
                 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
                 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
                 at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3491)
                 at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3423)
                 at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1936)
                 at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2060)
                 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2542)
                 at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1734)
                 at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2019)
                 at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1937)
                 at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1922)
                 at net.sf.log4jdbc.PreparedStatementSpy.executeUpdate(PreparedStatementSpy.java:1006)
                 at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:102)
                 at org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:2435)
                 at org.hibernate.persister.entity.AbstractEntityPersister.updateOrInsert(AbstractEntityPersister.java:2335)
                 at org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:2635)
                 at org.hibernate.action.EntityUpdateAction.execute(EntityUpdateAction.java:115)
                 at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:279)
                 at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:263)
                 at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:168)
                 at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
                 at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
                 at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
                 at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
                 at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:137)
                 at org.jbpm.persistence.db.DbPersistenceService.commit(DbPersistenceService.java:263)
                 at org.jbpm.persistence.db.DbPersistenceService.close(DbPersistenceService.java:218)
                 at org.jbpm.svc.Services.close(Services.java:236)
                 at org.jbpm.JbpmContext.close(JbpmContext.java:136)
                 at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:138)
                 at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:52)
                


                I don't yet understand the transaction boundaries in place in jBPM. Any time DbPersistenceService.getSession() is called, a new transaction is started. When jbpmContext is closed, the sql is flushed, and transactions are commited. Is it possible that the transactions are started, but do not commit in a timely (or balanced) manner?

                • 5. Re: Action not executing in timer
                  James Cook Newbie

                  I am noticing that after the insert of the job by the main thread, the sql is issued but there is no flush from hibernate or subsequent commit. When the JobExecutorThread fires it loads the job, presumably from the second-level cache. When it updates the job and attempts to commit its transaction, I am assuming it is blocked from doing so by the main thread. In fact, the main thread only issues one commit when the process definition is inserted into the database at startup.

                  Is there a way to cause jBPM to flush and commit transactions more frequently, like with every transition? How about after the creation of a job?

                  • 6. Re: Action not executing in timer
                    James Cook Newbie

                    I have confirmed the above suspicion. By adding a context.close() call before I go to sleep, the job is flushed to the database and the JobExecutorThread can update it without the Lock wait timeout exception.

                    Is this a bug or an accepted workaround?

                    I would expect that when I close the context in my unit test, I will now have to reopen it again to lookup a process definition or other such tasks. I feel that the engine should take care of properly commiting as necessary. I can see why commits may want to be deferred for performance reasons, but that strategy falls apart when there are other threads of execution (ala JobExecutor, custom web service, etc.) that is needing to update the database.

                    • 7. Re: Action not executing in timer
                      Ronald van Kuijk Master

                       

                      By adding a context.close() call before I go to sleep

                      ??? You go to sleep? like in a thread? Wrong usage.

                      Is this a bug or an accepted workaround?

                      Neither, it is how you *should* do things.

                      I will now have to reopen it again to lookup a process definition or other such tasks.

                      Correct.. but in normal cases you do not keep it open and a process goes in a waitstate. You only open it again if e.g. an external system or person acts on something.

                      but that strategy falls apart when there are other threads of execution (ala JobExecutor, custom web service, etc.) that is needing to update the database.

                      In most cases there only is one real thread of execution. Sure, you can have parallel things in forks, but they seldom act on the same thing at the same time. There are some things being looked into regarding locks/transactions and a join (see the jira)

                      In fact, the main thread only issues one commit when the process definition is inserted into the database at startup.


                      Show us your unittest... might be 'wrong' usage in there

                      • 8. Re: Action not executing in timer
                        James Cook Newbie

                        I go to sleep in my unit test. I have to give the timer a chance to do its thing. I suppose I could add a listener to some transition events but they would only fire if everything is working correctly. Defeats the purpose of the unit test.

                        If you are saying that I have to manually close a context in order to flush/commit, I would say that's a very bad design. Sure, while I am in my unit test I have the option of calling context.close(), a longer running process with various wait states does not have that luxury. It has to be handled internally by the engine in these cases.


                        but in normal cases you do not keep it open and a process goes in a waitstate. You only open it again if e.g. an external system or person acts on something. ... In most cases there only is one real thread of execution. Sure, you can have parallel things in forks, but they seldom act on the same thing at the same time. There are some things being looked into regarding locks/transactions and a join (see the jira)


                        Yes, and the JobExecutor is this external system that acts on something. Any time you are using timers, you have multiple threads of execution. Therefore the engine must have some means of flushing & commit when a timer is created, but it looks to me that it has to be done manually.

                        I will open a Jira if you can confirm that some automatic commit/flush does not exist.

                        BTW, here is my unit test. It works now as long as I commit manually from the main thread. I get the lock exception if I do not.

                         @Test
                         public void test() {
                         // Create an instance of the order workflow process
                         JbpmContext context = _jbpmConfiguration.createJbpmContext();
                         final GraphSession graphSession = context.getGraphSession();
                         final ProcessDefinition process = graphSession.findLatestProcessDefinition("examplePause");
                         ProcessInstance processInstance = process.createProcessInstance();
                        
                         // Kick off the process
                         Token token = processInstance.getRootToken();
                         token.signal();
                        
                         Assert.assertEquals(token.getNode().getName(), "state1");
                        
                         // Force the timer task to be committed (Note: this is where I don't understand
                         // why this is a manual step. If this was running outside of a unit test,
                         // Where would this commit go?
                         context.close();
                        
                         // Sleep. Timer should fire within 10 seconds. At wakeup, we should be in the end node.
                         try {
                         Thread.sleep(15000);
                         } catch (InterruptedException e) {
                         e.printStackTrace();
                         }
                        
                         // note the need to recreate the context and refetch the token...
                         context = _jbpmConfiguration.createJbpmContext();
                         token = context.getToken(token.getId());
                        
                         Assert.assertEquals(token.getNode().getName(), "endPauseExample");
                         }
                        


                        Thanks Ronald for your assistance throughout this process. Seems like you are a one-man show. :)

                        • 9. Re: Action not executing in timer
                          Joram Barrez Master

                           

                          If you are saying that I have to manually close a context in order to flush/commit, I would say that's a very bad design. Sure, while I am in my unit test I have the option of calling context.close(), a longer running process with various wait states does not have that luxury. It has to be handled internally by the engine in these cases.


                          The context is *always* closed by yourself. In a longer running process, the wait state will return the thread of control to the caller, which means that you can close the context if needed.

                          This is not a bad design. The Hibernate session mechanism works exactly the same way.

                          • 10. Re: Action not executing in timer
                            Ronald van Kuijk Master

                            And for the unittest, the AbstractDBTestCase (in jBPM) has the option to start the jobexecutor and wait a number of miliseconds. That closes and opens the context again for you. You only have to 'keep' the processId and reload the instance again. I'm currently working on some even more convenient methods that even reload the processInstance again, so your testcases become very clean.

                            • 11. Re: Action not executing in timer
                              James Cook Newbie

                               


                              The context is *always* closed by yourself. In a longer running process, the wait state will return the thread of control to the caller, which means that you can close the context if needed.

                              This is not a bad design. The Hibernate session mechanism works exactly the same way.


                              Usually, commits occur along transactional boundaries, be it at the end of a service call or the end of a web request, etc. As part of committing the transaction, Hibernate's flush/commit occurs. Note: flush can occur at any time if Hibernate deems it necessary.

                              My point is that if jBPM creates a Timer object but does not commit the transaction then the JobExecutor will find the transaction by means of the second level cache and update it. At this point the JobExecutor will attempt a commit, but the main thread will not. This causes a lock exception in jBPM.

                              Also, this would not occur if the main thread committed after the creation of a timer, or better yet, any change of state in the process execution.

                              I'll file it as a bug and see what happens.

                              What did you mean by:

                              the wait state will return the thread of control to the caller



                              • 12. Re: Action not executing in timer
                                James Cook Newbie

                                 


                                And for the unittest, the AbstractDBTestCase (in jBPM) has the option to start the jobexecutor and wait a number of miliseconds. That closes and opens the context again for you. You only have to 'keep' the processId and reload the instance again. I'm currently working on some even more convenient methods that even reload the processInstance again, so your testcases become very clean.

                                That's a good class to have as a reference. I didn't notice it before, because it isn't used in any of the test cases I have seen.

                                • 13. Re: Action not executing in timer
                                  Ronald van Kuijk Master

                                   


                                  What did you mean by:
                                  Quote:

                                  the wait state will return the thread of control to the caller


                                  That is when e.g. a call token.signal() or task.end() really returns. If there are no wait-states, the process continues to the end.

                                  signalling tokens happens in (mainly) by a caller:
                                  - a webapp e.g. ends a task
                                  - a timer goes off
                                  - MDB
                                  - webservice

                                  In both cases after signalling you can close the context unless you want STP like functionality and want to continue signalling (in that case you would not have needed a waitstate (state, task-node) but could have used a regular node.


                                  • 14. Re: Action not executing in timer
                                    Ronald van Kuijk Master

                                    I don't think you will have any luck with your Jira issue. Let me explain a little more (this can be read in other articles as well, even the jBPM docs I think)

                                    Normally the client only has 'control' for a very small amount of time. A second is a lot already. Timers that need to go off within a second are mostly useless. Remember, this is mainly workflow!!! so clashes occur rarely

                                    If a transition is taken and after that something goes wrong, the best thing to do is to go back to the last wait-state. That is the situation where external action was needed. That external action can be repeated (e.g. task not ended, user has to do it again) This is easiest done with a transaction roll-back in the database. Committing on each node boundary or even within nodes (creation of timers) would make this impossible to do and result in a lot more complicated system.

                                    If you want to have a STP system with (probably) persistency on nodes etc... you can have a look at the PVM and simply develop your own language and behaviour on top of that. Might also be that an 'stp' language is developed, though I'm not sure.

                                    1 2 Previous Next