4 Replies Latest reply on Oct 5, 2007 7:30 AM by gogoasa

    async processes : favour termination instead of inception

    gogoasa

      Hello,

      I have a process that contains several nodes that may take a long time to execute (hours). In order to have a trace of what's happening, I use asynchronous continuations (async="true" for the slow nodes).

      The whole thing goes like this: a process initiator sends lots (hundreds) of JMS StartProcessInstanceCommands to the jms/JbpmCommandQueue. The command queue is rapidly depleted by the CommandListenerBean, which starts a new process and then starts the execution of the first node by sending a message to jms/JmsJobQueue. What happens next is that lots of processes start and very few (if any) finish. When the first node finishes, because of the async="true" behaviour, another JMS message is sent to the JobQueue, at the end of the queue.

      This way, for async processes, starting new processes is favoured over finishing those already started.

      Do you have any idea on how to tackle this?

      As for me, I came up with the idea of associating a priority with a node, which would be transferred to the JobCommand and then to the JMS message as the JMS priority parameter. Some modifications of jBPM are required though, so that the JMS message is sent with the proper priority. You guys might have better ideas, I am eager to hear them...

      Thank you.
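
      The starvation effect described above, and the priority idea, can be illustrated without any JMS broker. What follows is only a plain-Java simulation under stated assumptions, not jBPM code: `StageJob`, `QueueSimulation` and the stage numbering are hypothetical names invented for the illustration, and a single worker is assumed. It compares a FIFO queue with a queue in which jobs for later nodes outrank jobs for earlier ones.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

/** Hypothetical stand-in for one async node execution; not a jBPM class. */
final class StageJob {
    final int processId; // process instance the job belongs to
    final int stage;     // 1-based index of the node within the process
    final long seq;      // enqueue order, used as a FIFO tie-breaker

    StageJob(int processId, int stage, long seq) {
        this.processId = processId;
        this.stage = stage;
        this.seq = seq;
    }
}

final class QueueSimulation {
    /** Plain FIFO queue: models messages appended to the end of the job queue. */
    static Queue<StageJob> fifo() {
        return new ArrayDeque<>();
    }

    /** Jobs for later nodes are served first; ties are served in FIFO order. */
    static Queue<StageJob> laterStagesFirst() {
        return new PriorityQueue<>(
            Comparator.comparingInt((StageJob j) -> -j.stage)
                      .thenComparingLong(j -> j.seq));
    }

    /**
     * Starts `processes` instances of `stages` async nodes each, then lets a
     * single worker drain the queue. Returns how many jobs were executed up
     * to and including the one that completed the first process instance.
     */
    static int jobsUntilFirstCompletion(Queue<StageJob> queue, int processes, int stages) {
        long seq = 0;
        for (int p = 1; p <= processes; p++) {
            queue.add(new StageJob(p, 1, seq++)); // all instances start up front
        }
        int executed = 0;
        while (!queue.isEmpty()) {
            StageJob job = queue.poll();
            executed++;
            if (job.stage == stages) {
                return executed; // a process instance has finished
            }
            // Async continuation: the next node is queued as a new job.
            queue.add(new StageJob(job.processId, job.stage + 1, seq++));
        }
        return executed;
    }
}
```

      With 100 process instances of 4 nodes each, `jobsUntilFirstCompletion` returns 301 for the FIFO queue and 4 for the prioritised one. Keep in mind that the JMS specification only asks providers to do their best to deliver higher-priority messages first, so a real broker may order less strictly than this simulation.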

        • 1. Re: async processes : favour termination instead of inceptio
          jeffj55374

          Hi,
          I'm facing a similar issue. We need to process hundreds of groups of files. Each file group results in the creation of a process instance. Each process has a number of steps. Some of these steps take tens of minutes or hours to process. Therefore we are using the embedded async continuation method to allow nodes to be processed in parallel and so leverage our multi-CPU system. When a new job is created for each async node, it is given a due date of the current system time.

          See Node.createAsyncContinuationJob:
          
          protected ExecuteNodeJob createAsyncContinuationJob(Token token) {
            ExecuteNodeJob job = new ExecuteNodeJob(token);
            job.setNode(this);
            job.setDueDate(new Date());
            job.setExclusive(isAsyncExclusive);
            return job;
          }



          The job executor thread acquires jobs by querying the JBPM_JOB table and sorting by due date. The JobExecutorThread calls JobSession.getFirstAcquirableJob(), which executes the query below. Note that the results are ordered by due date.
          
          <query name="JobSession.getFirstAcquirableJob">
           <![CDATA[
           select job
           from org.jbpm.job.Job as job
           where ( (job.lockOwner is null) or (job.lockOwner = :lockOwner) )
           and job.retries > 0
           and job.dueDate <= :now
           and job.isSuspended != true
           order by job.dueDate asc
           ]]>
           </query>


          Essentially, async nodes are processed in the order in which they were created, which is completely different from the order in which the process instances were created. This means that all the early nodes of all process instances are executed before the later nodes of the first process instance. I would really like the job executor to work on nodes associated with the first process instance before working on nodes associated with subsequent process instances, i.e. if possible complete ready process instance 1 jobs before working on process instance 2 jobs.

          Options I'm considering:

          1. Modify Node.createAsyncContinuationJob() to use the creation date of the process instance as the job due date rather than just using new Date().
          2. Create my own subclass of org.jbpm.job.Job to add another member that stores my priority, and modify the JobSession.getFirstAcquirableJob query accordingly.
          3. Modify the JobSession.getFirstAcquirableJob query to order the list by process instance ID first and then by due date. At the moment this seems like the most straightforward approach, and it doesn't require me to modify or extend jBPM code or modify the schema.


          Anyone have other ideas?
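
          Option 3 can be sketched in plain Java to show the effect of the changed sort order. This is an illustration under assumptions, not jBPM code: `JobRow` and `AcquisitionOrder` are hypothetical names, `JobRow` is a simplified stand-in for a row in JBPM_JOB, and process instance IDs are assumed to increase in creation order. In the real query the change would presumably be limited to the `order by` clause, with the exact property path depending on how Job is mapped in your jBPM version.

```java
import java.util.Comparator;
import java.util.Date;
import java.util.List;

/** Hypothetical, simplified view of a JBPM_JOB row; not org.jbpm.job.Job. */
final class JobRow {
    final long processInstanceId;
    final Date dueDate;
    final String nodeName;

    JobRow(long processInstanceId, Date dueDate, String nodeName) {
        this.processInstanceId = processInstanceId;
        this.dueDate = dueDate;
        this.nodeName = nodeName;
    }
}

final class AcquisitionOrder {
    /** Behaviour of the query as posted: oldest due date first. */
    static final Comparator<JobRow> BY_DUE_DATE =
        Comparator.comparing((JobRow j) -> j.dueDate);

    /** Option 3: oldest process instance first, then oldest due date. */
    static final Comparator<JobRow> BY_PROCESS_THEN_DUE_DATE =
        Comparator.comparingLong((JobRow j) -> j.processInstanceId)
                  .thenComparing(j -> j.dueDate);

    /** Returns the job a single executor would pick next, or null if none. */
    static JobRow firstAcquirable(List<JobRow> jobs, Comparator<JobRow> order) {
        return jobs.stream().min(order).orElse(null);
    }
}
```

          Given a first-node job of instance 2 that became due before a second-node job of instance 1, `BY_DUE_DATE` picks the instance 2 job, while `BY_PROCESS_THEN_DUE_DATE` picks the instance 1 job.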

          • 2. Re: async processes : favour termination instead of inceptio
            kukeltje

             

            if possible complete ready process instance 1 jobs before working on process instance 2 jobs.


            sounds like serial processing. Then why not delay the start of the processes? Start process 2 once process 1 is finished? I know, it does not sound realistic, but how should a 'scheduler' know in advance how long jobs will take? Only then can it decide to start a relatively short job of process 2 while waiting for one or more jobs of process 1. So just start as many processes as you can run parallel actions for. Any other way, it really becomes complicated quickly. The funny thing is that you then do not even need async continuations. Just prevent the front-end from getting saturated (always better than 'fixing' it further on in the system).

            <<Me Just thinking out loud.... >>
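
            The throttling idea above could look roughly like the following. It is a minimal sketch, assuming the initiator and the end-of-process notification live in the same JVM; `ProcessThrottle` and its methods are hypothetical names, not part of jBPM, and how `processEnded()` gets called (for example from an action on the end node) is left open.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical front-end throttle; not part of jBPM. */
final class ProcessThrottle {
    private final Semaphore slots;

    ProcessThrottle(int maxRunningProcesses) {
        this.slots = new Semaphore(maxRunningProcesses);
    }

    /**
     * Runs the start action only if a slot is free.
     * Returns false when the limit is reached, so the caller can retry later.
     */
    boolean tryStart(Runnable startProcess) {
        if (!slots.tryAcquire()) {
            return false;
        }
        boolean started = false;
        try {
            startProcess.run(); // e.g. send one start command
            started = true;
            return true;
        } finally {
            if (!started) {
                slots.release(); // start failed, give the slot back
            }
        }
    }

    /** Blocking variant: waits until a slot is free before starting. */
    void startWhenSlotFree(Runnable startProcess) throws InterruptedException {
        slots.acquire();
        boolean started = false;
        try {
            startProcess.run();
            started = true;
        } finally {
            if (!started) {
                slots.release();
            }
        }
    }

    /** Must be called exactly once for every process instance that ends. */
    void processEnded() {
        slots.release();
    }

    int freeSlots() {
        return slots.availablePermits();
    }
}
```

            An in-memory semaphore only works while initiator and processes share one JVM; in a clustered deployment the count of running instances would have to come from somewhere shared, such as the database.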

            • 3. Re: async processes : favour termination instead of inceptio
              gogoasa

              Jeff, if I understand correctly the difference between my installation and yours is that I use the JMS system for async execution while you use the built-in database-based messaging system.

              For what it's worth, in my case I don't need asynchronous continuations for parallel processing (the multithreading is done by the app server and I can start multiple process instances in parallel) -- I need them because, having a series of nodes that take a long time to execute, I want shorter, one-node transactions so that I can see the status of the process in a web console. I only use async to get shorter transactions around each node, and I must say that I would find it very useful if jBPM did its database access behind a RequiresNew EJB (in "enterprise" deployment). That would allow people to have long-running series of nodes in non-async processes.

              Ok, as to the solution, I added an attribute to node :

              <node name="1st-slow-node" async="true" async-priority="1">
              ...
              </node>
              <node name="2nd-slow-node" async="true" async-priority="2">
              ...
              </node>
              <node name="3rd-slow-node" async="true" async-priority="3">
              ...
              </node>
              <node name="4th-slow-node" async="true" async-priority="4">
              ...
              </node>
              


              and a field to Node.java :

              protected int asyncPriority;

              and the corresponding mapping to the Hibernate file.

              When a JMS message is sent, its priority is set from the node attribute.

              I haven't checked, but I suppose this should work in your case too, by simply changing the HQL query to do a join with Node and order by node.asyncPriority.
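
              One detail worth guarding against: JMS priorities run from 0 (lowest) to 9 (highest), with 4 as the default, so an arbitrary node attribute value needs to be kept inside that range before it is handed to the producer. A minimal sketch, in which `JmsPriorityMapping` is a hypothetical helper and not part of jBPM or the patch described above:

```java
/** Hypothetical helper; not part of jBPM. */
final class JmsPriorityMapping {
    static final int JMS_MIN_PRIORITY = 0; // lowest priority defined by JMS
    static final int JMS_MAX_PRIORITY = 9; // highest priority defined by JMS

    /**
     * Clamps a node's async-priority value into the JMS range 0..9.
     * A node without the attribute has the int default 0 and therefore
     * ends up with the lowest JMS priority.
     */
    static int toJmsPriority(int asyncPriority) {
        return Math.max(JMS_MIN_PRIORITY, Math.min(JMS_MAX_PRIORITY, asyncPriority));
    }
}
```

              The clamped value can then be passed as the priority argument of `MessageProducer.send(message, deliveryMode, priority, timeToLive)`. For the database-based job executor, ordering by the node priority would need to be descending if, as in the example above, later nodes carry higher numbers.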


              • 4. Re: async processes : favour termination instead of inceptio
                gogoasa

                Do you think there should be a fix for this issue? Should I file a Jira?

                I personally think this will be an issue for everyone who uses jBPM to orchestrate remote services that take a long time to execute.