24 Replies Latest reply on Mar 25, 2009 8:20 AM by kukeltje

Failure handling

bill.burke Jul 31, 2007 2:47 PM

I'm thinking of a scenario of asynchronous continuations that travel over JMS. I don't see a way of handling a transaction rollback and transitioning to a new state based on that event. Yes, jBPM has exception handlers, but these seem to run within the JTA transaction started by the command service MDB. Since this transaction would be rolling back, you wouldn't be able to transition out of that state, right? Or am I missing something fundamental?

So, what I'm saying is, shouldn't there be the ability to roll back the transaction, catch that, then transition to a failure state?

1. Re: Failure handling

estaub Jul 31, 2007 3:40 PM (in response to bill.burke)

Yeah, I've written a couple of posts about this too. No one responded with a solution.

I intend to run actionhandlers that may call setRollbackOnly() in another transaction. I'm running in a J2EE CMT environment, so I've got a local bean with a command-pattern interface for performing the action, set to REQUIRES_NEW. In practice, it looks like ActionHandlers using this pattern will mostly consist of a static inner command class.

If my description isn't clear, let me know and I'll elaborate.

I don't have any experience with this approach yet, so it may fall apart for some reason I haven't thought of.

If you think of something better, please sing out!

-Ed Staub
2. Re: Failure handling

bill.burke Jul 31, 2007 4:03 PM (in response to bill.burke)

I think I have a different solution to yours: dead letter queue processing.

You can take the code as-is and add an MDB that listens on the dead letter queue. The node could have a "failure" transition. The dead letter MDB could check whether the node has that transition and, if so, simply take it. Thoughts?

I'm thinking that JBPM should have a failure-transition that could be described in the process definition. A failure-transition differs from exception handling: exception handling allows you to recover, whereas failure handling changes the state of the PI to a failure state that you can do BPM on.
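
A rough sketch of the dead letter idea, for illustration only. It uses a toy in-memory model instead of the real jBPM and JMS APIs, so DeadLetterSketch, its maps, and the node names are all assumptions made up for this example. The point is just the check: when a message lands on the dead letter queue, look at the node the token is waiting in and take its "failure" transition if it declares one.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy process model; a stand-in for illustration, not the jBPM API. */
final class DeadLetterSketch {
    // node name -> (transition name -> destination node)
    private final Map<String, Map<String, String>> transitions = new HashMap<>();
    // token id -> node the token is currently waiting in
    private final Map<Long, String> tokenNode = new HashMap<>();

    void addTransition(String from, String name, String to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(name, to);
    }

    void placeToken(long tokenId, String node) {
        tokenNode.put(tokenId, node);
    }

    String nodeOf(long tokenId) {
        return tokenNode.get(tokenId);
    }

    /**
     * What a dead letter listener would do for one dead message:
     * take the "failure" transition if the current node declares one.
     * Returns true if the token moved.
     */
    boolean onDeadLetter(long tokenId) {
        String node = tokenNode.get(tokenId);
        if (node == null) {
            return false; // unknown token, nothing to do
        }
        Map<String, String> leaving = transitions.get(node);
        String target = (leaving == null) ? null : leaving.get("failure");
        if (target == null) {
            return false; // node has no failure transition, leave the token alone
        }
        tokenNode.put(tokenId, target);
        return true;
    }
}
```

In a real deployment the body of onDeadLetter would run inside the dead letter MDB's own transaction, which is what lets the move to the failure state commit even though the original delivery rolled back.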
3. Re: Failure handling

bill.burke Jul 31, 2007 4:25 PM (in response to bill.burke)

"estaub" wrote:

If my description isn't clear, let me know and I'll elaborate.

Hey Ed, can you elaborate? You're doing async actions? Not async nodes? Async actions seem pretty scary to me, mainly because of failures and your PI could be in an inconsistent state of some actions succeeding and some failing. I've never really written a jBPM application, so maybe my fears are just superstition. You tell me...
4. Re: Failure handling

tom.baeyens Aug 1, 2007 8:08 AM (in response to bill.burke)

exception handlers indeed don't solve your scenario since you're right that they happen inside of the transaction.

i think this is out of scope for jBPM. the client controls the transaction (either through hibernate in standard java or through JTA in enterprise).

In enterprise environments, you could use a Synchronization, no?

in the POJO implementation, i have done something similar to process exceptions in asynchronous continuations. There i had to implement the scheme that you mention: in case of an exception, roll back the transaction, and then start a new one to decrement the retry counter. If the counter reaches 0, mark it so that it doesn't keep on executing.

But i didn't give any facility for automatically processing dead messages. But you can see them in the newest console.
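
The retry scheme described above could look roughly like this. It is a sketch under assumptions, not the actual job executor code: RetrySketch, Job, and the txLog entries are invented for illustration, and the "transactions" are only logged, not real. What it shows is the ordering: the failing work is rolled back first, and the retry counter is decremented in a separate, second transaction so that the decrement survives the rollback.

```java
import java.util.ArrayList;
import java.util.List;

final class RetrySketch {
    /** Hypothetical job record; field names are illustrative only. */
    static final class Job {
        int retries;
        boolean dead; // true once the retries are used up
        final Runnable work;

        Job(int retries, Runnable work) {
            this.retries = retries;
            this.work = work;
        }
    }

    final List<String> txLog = new ArrayList<>();

    /** Runs one attempt; returns true if the job's work completed. */
    boolean executeOnce(Job job) {
        if (job.dead) {
            return false; // marked jobs are never picked up again
        }
        txLog.add("begin work tx");
        try {
            job.work.run();
            txLog.add("commit work tx");
            return true;
        } catch (RuntimeException e) {
            txLog.add("rollback work tx");
        }
        // Separate transaction: this bookkeeping must not be rolled back.
        txLog.add("begin bookkeeping tx");
        job.retries--;
        if (job.retries <= 0) {
            job.dead = true;
        }
        txLog.add("commit bookkeeping tx");
        return false;
    }

    /** Drives a job that always fails until it is marked dead. */
    static Job runUntilDead(int retries) {
        RetrySketch executor = new RetrySketch();
        Job job = new Job(retries, () -> {
            throw new IllegalStateException("always fails");
        });
        for (int attempt = 0; attempt < 100 && !job.dead; attempt++) {
            executor.executeOnce(job);
        }
        return job;
    }
}
```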
5. Re: Failure handling

estaub Aug 1, 2007 8:34 AM (in response to bill.burke)

Bill,

In your original scenario, I'm picturing the rollback as originating in an actionhandler performing a non-JBPM-related action. For these cases, I'm dealing with the transactional difficulties by isolating them in their own transaction. Once this is done, exception handlers should be usable.

You should probably also be aware that JBPM uses recursive execution of the graph. I mean that, when a node actionhandler calls leaveNode() or, less properly, signal(), the downstream execution of subsequent nodes happens within that method call; the execution of later nodes happens within the execution scope of the first actionhandler. This can lead to surprising behavior in error conditions, at least for me. I have a fix for this, but I'm not sure that anyone else believes it's a problem!

-Ed Staub
6. Re: Failure handling

bill.burke Aug 1, 2007 9:25 AM (in response to bill.burke)

"estaub" wrote:
Bill,

In your original scenario, I'm picturing the rollback as originating in an actionhandler performing a non-JBPM-related action. For these cases, I'm dealing with the transactional difficulties by isolating them in their own transaction. Once this is done, exception handlers should be usable.

But, in this scenario, you wouldn't be able to do JBPM related actions in this isolated transaction, specifically the setting of variables, correct? Kind of making things like Seam unusable.

You should probably also be aware that JBPM uses recursive execution of the graph. I mean that, when a node actionhandler calls leaveNode() or, less properly, signal(), the downstream execution of subsequent nodes happens within that method call; the execution of later nodes happens within the execution scope of the first actionhandler. This can lead to surprising behavior in error conditions, at least for me. I have a fix for this, but I'm not sure that anyone else believes it's a problem!

-Ed Staub

Ed, can you elaborate why this is a problem? I just want to learn something. IMO, this seems like an advantage. And, can't you just fix your issue with an asynchronous continuation?
7. Re: Failure handling

estaub Aug 1, 2007 9:37 AM (in response to bill.burke)

>> you wouldn't be able to do JBPM related actions in this isolated transaction

Correct, but I'm not sure it's a problem. The actionhandler calls a local EJB to perform the transactional behavior unrelated to JBPM. The return value from the local EJB is then used to perform whatever JBPM-specific behavior is necessary, on the JBPM transaction.

>> Kind of making things like Seam unusable.
I don't know enough about Seam to have an opinion.

--

Re recursive execution:

I can't really point at a specific "this test fails"... it's more that it's a likely source of bugs due to incorrect analysis, both for JBPM developers and JBPM users.

For example, an actionhandler might be incorrectly written:

acquireSomeResource()
try
{
useTheResource()
leaveNode();
}
finally
{
releaseSomeResource()
8. Re: Failure handling

estaub Aug 1, 2007 9:58 AM (in response to bill.burke)
...Sorry, I miskeyed and posted my last post prematurely.

Re recursive execution:

I can't really point at a specific "this test fails"... it's more that it's a likely source of bugs due to incorrect analysis, both for JBPM developers and JBPM users.

For example, an actionhandler might be incorrectly written:

acquireSomeResource();
try {
    useTheResource();
    leaveNode();
} finally {
    releaseSomeResource();
}

Can you spot the bug?
The release is happening after all subsequent nodes that can be executed are executed. If any of them needed the same resource, they would be surprised to find that it was already in use. Billy Pilgrim would be at home ;-).

Less importantly, recursive evaluation causes resources to be tied up unnecessarily. I've seen posts on this forum complaining about a "memory leak in JBPM" - they perform a process loop a zillion times and watch the heap grow without bound. This would be silly as a production scenario, but it's a very typical kind of test to run.

The fix is to raise a flag in the token while a node's executionHandler is being performed, so that any internal leaveNode() or signal() calls know that they are being performed within this context. If an actionHandler performs a leaveNode or signal, the transition to take is recorded, but not actually taken until after the actionHandler exits.
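
To make the idea concrete, here is a minimal sketch of deferred transitions. It is not the jBPM code and not the actual patch: DeferredToken, onNode, and run are names invented for this illustration. A flag is raised while a handler runs, leaveNode() only records the destination, and the move happens in a loop after the handler has returned.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy token; a stand-in for illustration, not the jBPM Token class. */
final class DeferredToken {
    final List<String> log = new ArrayList<>();
    private final Map<String, Consumer<DeferredToken>> handlers = new HashMap<>();
    private boolean inHandler;  // raised while a node's handler is executing
    private String pendingNode; // transition recorded but not yet taken

    void onNode(String node, Consumer<DeferredToken> handler) {
        handlers.put(node, handler);
    }

    /** Called from inside a handler: records the move instead of performing it. */
    void leaveNode(String nextNode) {
        if (!inHandler) {
            throw new IllegalStateException("leaveNode called outside a handler");
        }
        pendingNode = nextNode;
    }

    /** Iterative driver: each handler fully exits before the next node starts. */
    void run(String startNode) {
        String current = startNode;
        while (current != null && handlers.containsKey(current)) {
            pendingNode = null;
            inHandler = true;
            try {
                handlers.get(current).accept(this);
            } finally {
                inHandler = false;
            }
            current = pendingNode; // take the recorded transition, if any
        }
    }

    /** Replays the acquire/use/release example from this post. */
    static List<String> demo() {
        DeferredToken token = new DeferredToken();
        token.onNode("a", t -> {
            t.log.add("acquire");
            try {
                t.log.add("use");
                t.leaveNode("b");
            } finally {
                t.log.add("release");
            }
        });
        token.onNode("b", t -> t.log.add("node b runs"));
        token.run("a");
        return token.log;
    }
}
```

With this driver the log reads acquire, use, release, node b runs. Under recursive execution "node b runs" would appear before "release", which is the surprise described above. The loop also keeps the call stack flat, so a process that cycles many times does not pile up stack frames.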

-Ed Staub
9. Re: Failure handling

bill.burke Aug 1, 2007 10:59 AM (in response to bill.burke)

Ed, thought of a possible problem with your approach. What if you have a failure after your isolated business transaction completes? Like, for instance, a crash before the JBPM context has a chance to commit. Then the business process would be in an inconsistent state, wouldn't it?
10. Re: Failure handling

kukeltje Aug 1, 2007 11:45 AM (in response to bill.burke)

Less importantly, recursive evaluation causes resources to be tied up unnecessarily. I've seen posts on this forum complaining about a "memory leak in JBPM" - they perform a process loop a zillion times and watch the heap grow without bound. This would be silly as a production scenario, but it's a very typical kind of test to run.

This is 'fixed' in the PVM (the upcoming core of jBPM 4.0). Koen gave Tom some good spanking and in the meantime came up with a solution.

With regard to the discussion about exception handlers or the 'failure transitions', this is a hot topic in all process languages. I tried this once in a small test example where the actionhandler, on getting an exception, did not throw it but set a variable (which is possible if no exception handlers are used). Transitions can be 'guarded' by expressions/conditions, so the presence of a value in a certain variable can make the process leave on a certain transition. This can be decided at design time. On this leaving node you can e.g. use a compensating action that undoes previous actions. Kind of complex though, but that is the case with all (afaik) 'languages' like bpel, ebbp, bpmn etc. Maybe Tom or Alex could comment on this as well.
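
A small sketch of the variable-plus-guard approach, with the usual caveat that this is a toy and not jPDL or the jBPM API. GuardedTransitionSketch, the "lastError" variable name, and the transition names are assumptions chosen for the example. The action swallows the exception and records it in a process variable; the guards are then evaluated in declaration order to pick the leaving transition.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Toy node with guarded leaving transitions; not the jBPM API. */
final class GuardedTransitionSketch {
    final Map<String, Object> variables = new HashMap<>();
    // LinkedHashMap keeps declaration order, which is the evaluation order here.
    private final Map<String, Predicate<Map<String, Object>>> guards = new LinkedHashMap<>();

    void addTransition(String name, Predicate<Map<String, Object>> guard) {
        guards.put(name, guard);
    }

    /** Runs the action; on failure, records the error instead of rethrowing it. */
    void runAction(Runnable action) {
        try {
            action.run();
        } catch (RuntimeException e) {
            variables.put("lastError", String.valueOf(e.getMessage()));
        }
    }

    /** Returns the name of the first transition whose guard accepts the variables. */
    String chooseTransition() {
        for (Map.Entry<String, Predicate<Map<String, Object>>> entry : guards.entrySet()) {
            if (entry.getValue().test(variables)) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("no transition guard matched");
    }

    static String demo(boolean fail) {
        GuardedTransitionSketch node = new GuardedTransitionSketch();
        node.addTransition("compensate", vars -> vars.containsKey("lastError"));
        node.addTransition("continue", vars -> true);
        node.runAction(() -> {
            if (fail) {
                throw new IllegalStateException("seat already taken");
            }
        });
        return node.chooseTransition();
    }
}
```

Note that this only works while the surrounding transaction is still usable. If the failing call has already marked the transaction for rollback, the variable write is lost along with everything else, which is the case the rest of this thread is about.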
11. Re: Failure handling

estaub Aug 1, 2007 2:46 PM (in response to bill.burke)

Bill,

>> Then the business process would be in an inconsistent state, wouldn't it?

Yes.

I suppose yet another option is to commit the transaction that entered the node, then start a new transaction for all db work thenceforth (including the actionhandler), both JBPM and not. If, when the actionhandler returns, a rollback has occurred, a third transaction could be started to do exception handling and any subsequent behavior. Incidentally, this is another area where the current recursive-traversal model makes correct implementation difficult.

-Ed Staub
12. Re: Failure handling

tom.baeyens Aug 2, 2007 4:10 AM (in response to bill.burke)

i think you already got the first part, but i'll reformulate to build up the complete response.

whenever an exception comes out of a process execution, the process instance can be in an invalid state. so you (read: the client) *must* rollback your transaction.

every exception that is thrown into the engine, comes back out at the client side (usually wrapped in a JbpmException)

exception handlers only can 'overwrite' exceptions that occur in user code and cause the process execution to continue even when an exception occurred.

in case you want to roll back the workflow transaction and do something else, there is only one option: you start a new transaction as indicated before in this thread. but note that it is not a good idea to start working on the process execution directly in this thread, as the failing transaction will probably still have database locks on the process execution rows. therefore, i think the best way to handle this is by sending a new command as a job over the async messaging queue to the job executor. the command will do what you want to do (set variables as bill mentioned or take a 'failure' transition,...) in yet another (3rd) transaction.

in jee, you might want to send the command job in the afterCompletion of a synchronization. in that case, you're sure that the database locks of the failing transaction are released when the async job executor executes the 3rd transaction with the fail-action.
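
A sketch of the afterCompletion idea, under heavy assumptions. It does not use the real JTA types: CompletionListener is a hand-written stand-in for a JTA-style synchronization callback, ToyTransaction only simulates a lock, and the queue is a plain in-memory queue rather than a JMS destination. It shows the ordering that matters: the failing transaction finishes and lets go of its locks first, and only then is the failure command queued for a later transaction to execute.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class AfterCompletionSketch {
    enum Outcome { COMMITTED, ROLLED_BACK }

    /** Hand-written stand-in for a JTA-style synchronization callback. */
    interface CompletionListener {
        void afterCompletion(Outcome outcome);
    }

    /** Toy transaction that holds a simulated row lock until it completes. */
    static final class ToyTransaction {
        private boolean lockHeld = true;
        private final List<CompletionListener> listeners = new ArrayList<>();

        void register(CompletionListener listener) {
            listeners.add(listener);
        }

        boolean isLockHeld() {
            return lockHeld;
        }

        void commit() {
            complete(Outcome.COMMITTED);
        }

        void rollback() {
            complete(Outcome.ROLLED_BACK);
        }

        private void complete(Outcome outcome) {
            lockHeld = false; // locks are released before listeners are notified
            for (CompletionListener listener : listeners) {
                listener.afterCompletion(outcome);
            }
        }
    }

    /** Returns the commands queued for the job executor after the transaction ends. */
    static Queue<String> demo(boolean fail) {
        Queue<String> jobQueue = new ArrayDeque<>();
        ToyTransaction tx = new ToyTransaction();
        tx.register(outcome -> {
            // Only queue the failure command once the rollback is complete
            // and the failing transaction no longer holds its lock.
            if (outcome == Outcome.ROLLED_BACK && !tx.isLockHeld()) {
                jobQueue.add("take-failure-transition");
            }
        });
        if (fail) {
            tx.rollback();
        } else {
            tx.commit();
        }
        return jobQueue;
    }
}
```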
13. Re: Failure handling

estaub Aug 3, 2007 9:08 AM (in response to bill.burke)

Tom,

I'm not at all sure, but I suspect we're thinking about different problems, or at least approaching them from different perspectives.

I'm not sure, but I suspect this is the context for the scenario Bill originally described. If not, I am guilty of 3rd degree thread-hijacking ;-)

An application has a set of preexisting EJBs (or other analogous components). These use CMT with a REQUIRED transaction attribute, so that multiple EJB calls can share a transaction. The EJBs frequently throw exceptions and call setRollbackOnly(), most often because of gross concurrency issues. As an example, think of trying to reserve a seat on an airplane, but someone else got the last seat just before you clicked.

Now imagine a business process that tries to explicitly model the failure recovery in this case, either using exception handlers or anything else that can be expressed in JPDL.

How would you make the process model behave well, i.e., fairly WYSIWYG? Would you isolate the business logic (calling these EJBs) in a separate transaction in the actionhandler, as I was thinking?

I'm a little wary of trying to use fine-grained control of transactions (e.g. inserting end/begin pairs via TransactionManager when exceptions occur), because we run in multiple appservers (WebLogic, WebSphere, and now maybe JBoss) and I understand that they tend to behave differently. If you think I'm making a mountain out of a molehill, please say so - I'm just repeating hearsay!

I also suspect that in order to use TransactionManager, we'd need to avoid participating in any surrounding CMT transaction - I'd think that the container would freak out if we ended its container-managed transaction and started another within a CMT EJB. Do you have any experience with this?

-Ed Staub
14. Re: Failure handling

bill.burke Aug 3, 2007 9:30 AM (in response to bill.burke)
Ed, with the disclaimer that I've never written a jBPM application, doesn't it really depend on who is driving the process? If user code is managing both the transaction and signaling the process, then you can handle this quite easily:

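userTransaction.begin();
boolean rollback = false;
try {
    process.signal();
    // Status is javax.transaction.Status
    if (userTransaction.getStatus() == Status.STATUS_MARKED_ROLLBACK) {
        rollback = true;
        userTransaction.rollback();
    } else {
        userTransaction.commit();
    }
} catch (Exception ex) {
    rollback = true;
    userTransaction.rollback();
}
if (rollback) {
    // second transaction: move to the failure state
    userTransaction.begin();
    process.signal("failure");
    userTransaction.commit();
}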

If the process is being driven by an asynchronous continuation or a user MDB, then you can use the DLQ or Transaction Synchronization ideas stated. I'm writing a blog about this very case. I'll link when I'm done.

As far as DLQ goes, most (all?) major JMS providers have DLQ support. As for transaction synchronizations, you'd need access to the TM, yes. But there are well documented ways of obtaining the TM on each major application server. You just have to look. If you're in EE5, there is a TransactionSynchronizationRegistry that all vendors must provide.

Answer your questions?
