This is a design discussion for how to implement [JBTM-1099].
Consider the following common use case for WS-BA:
A WS-BA enabled service offers an operation for doing some work, for example making a booking or purchasing an item. When this operation is invoked, a new order is committed to the database. If the business activity is later canceled, the 'compensate' operation is invoked on the service. This compensation action takes the necessary steps to undo the work and will likely change the database content, for example by deleting the order or marking it as canceled. Alternatively, if the activity is successful, the 'close' operation is invoked on the service as a confirmation. The service may also update the database at this point, to mark the order as confirmed.
Two of the key benefits of using WS-BA in this scenario are that: i) no locks are held between the doWork and close/compensate phases, and ii) the service is guaranteed to receive the outcome of the activity even in the presence of failures.
If the service wants to update the database within a transaction, it has these options (the ones I can currently come up with; there may be others):
Manage a single subordinate JTA transaction (Poor choice)
Here the service would begin a subordinate JTA transaction in the 'doWork' method, prepare it during 'confirmCompleted', and then commit it during the 'close' or rollback during the 'compensate'. Essentially the application is mapping the WS-BA lifecycle onto a 2PC ACID lifecycle.
The main problems with this approach are that:
- Locks are held between the two phases of the protocol. Violates i) from above.
- It's tricky/error-prone for the developer as they need to manage a subordinate transaction.
- There are failure windows which violate ii) from above.
Use multiple JTA transactions (Better choice, but not perfect)
Here the application begins a transaction at the start of the doWork method and commits it just before the method returns. A separate JTA transaction is begun at the start of the close/compensate method and committed just before it returns. This approach is more likely to be used than the previous one, as it can be achieved with just JPA and transactional annotations. On the surface it looks like it should just work. However, I think it is susceptible to a failure window:
Failure window in first phase
Consider the following scenario:
- Client begins a BA
- Client invokes service
- Service begins a JTA transaction
- Service updates the DB
- Service commits the JTA
- Service crashes
- Service resumes and does nothing as the BA did not progress far enough to produce a log
- Client receives a failure from the service invocation and either cancels the BA or attempts to close it, which the coordinator will fail because the participant did not complete.
Here the BA was unsuccessful, but the JTA transaction committed. We have an inconsistent state.
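The scenario above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: `DurableState`, `CrashException`, `doWork`, `recoverable` and `orphaned` are hypothetical names, not XTS or JTA APIs, and the model assumes the participant log record is only produced after doWork returns normally.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-ins only; none of these types are XTS or JTA APIs. */
class FailureWindowDemo {
    /** State that survives a crash: committed DB rows and the BA participant log. */
    static final class DurableState {
        final Set<String> committedOrders = new HashSet<>();
        final Set<String> participantLog = new HashSet<>();
    }

    /** Models the service process dying at a given point. */
    static final class CrashException extends RuntimeException {
        CrashException(String message) { super(message); }
    }

    /** doWork under the "multiple JTA transactions" approach. */
    static void doWork(DurableState state, String orderId, boolean crashAfterCommit) {
        // begin JTA, update the DB, commit JTA: modelled as one durable write
        state.committedOrders.add(orderId);
        if (crashAfterCommit) {
            throw new CrashException("service crashed after JTA commit");
        }
        // Assumption of this model: the log record exists only once the
        // participant has completed, which needs doWork to return normally.
        state.participantLog.add(orderId);
    }

    /** Recovery can only drive participants that left a log record. */
    static boolean recoverable(DurableState state, String orderId) {
        return state.participantLog.contains(orderId);
    }

    /** Committed in the DB, but no close/compensate can ever reach it. */
    static boolean orphaned(DurableState state, String orderId) {
        return state.committedOrders.contains(orderId) && !recoverable(state, orderId);
    }

    public static void main(String[] args) {
        DurableState state = new DurableState();
        try {
            doWork(state, "order-1", true);
        } catch (CrashException e) {
            System.out.println("crash: " + e.getMessage());
        }
        System.out.println("orphaned = " + orphaned(state, "order-1"));
    }
}
```

In this model the crashed invocation leaves a committed order that recovery knows nothing about, which is the inconsistent state described above.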
No failure window in the second phase
I don't think a failure window exists for the close or compensate methods. This is because these methods are invoked after a log has been written, and so they survive failure. They are also invoked periodically until the coordinator receives an acknowledgement. For this to work, the application logic needs to tolerate retries, which is the case for plain WS-BA anyway. Consider this example:
- Coordinator sends 1st compensate message
- Compensate invoked for 1st time (C1)
- C1 begins JTA1
- C1 reads from the DB to check not compensated (Obtains a read lock)
- C1 discovers that not yet compensated so writes the change to the DB to do the compensate (Obtains write lock)
- Coordinator sends 2nd compensate message as it has not received the ack to the first in a timely manner
- Compensate is invoked for the 2nd time (C2)
- C2 begins JTA2
- C2 reads from the DB to check not compensated (Waits to obtain the read lock)
- C1 commits JTA1
- First compensate message is acked
- C2 now obtains the read lock on the data and discovers that already compensated
- C2 commits JTA2
- Second compensate message is acked.
Provided the same transaction is used both to i) detect compensation and ii) compensate, I believe we are safe.
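A minimal sketch of that check-then-compensate pattern follows. It is not real JTA code: the `ReentrantLock` is an assumed stand-in for the database lock that a transaction would hold from its first read until commit, and `OrderRow`, `Status` and `compensate` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model; the lock stands in for locks held by one JTA transaction. */
class IdempotentCompensation {
    enum Status { BOOKED, CANCELLED }

    /** One order row in the simulated database. */
    static final class OrderRow {
        final ReentrantLock rowLock = new ReentrantLock();
        final AtomicInteger compensationsApplied = new AtomicInteger();
        Status status = Status.BOOKED;
    }

    /** Returns true if this call did the compensation, false if it was already done. */
    static boolean compensate(OrderRow row) {
        row.rowLock.lock();              // "begin JTA" and take the lock on first read
        try {
            if (row.status == Status.CANCELLED) {
                return false;            // a retry: nothing left to undo
            }
            row.status = Status.CANCELLED;
            row.compensationsApplied.incrementAndGet();
            return true;
        } finally {
            row.rowLock.unlock();        // "commit JTA": locks are released
        }
    }

    public static void main(String[] args) throws InterruptedException {
        OrderRow row = new OrderRow();
        // C1 and C2 model two deliveries of the compensate message.
        Thread c1 = new Thread(() -> compensate(row));
        Thread c2 = new Thread(() -> compensate(row));
        c1.start();
        c2.start();
        c1.join();
        c2.join();
        System.out.println("applied = " + row.compensationsApplied.get());
    }
}
```

Because the check and the update happen under the same lock, whichever invocation arrives second sees the order as already cancelled and applies nothing, so both messages can be acknowledged safely.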
In this solution we configure an interceptor on the service request that automatically enlists a participant in the incoming WS-BA activity and maps the events onto calls to a subordinate JTA transaction. This is a similar approach to that used for WS-AT to JTA mapping. This is the mapping that I propose:
|Application request arrives||Begin subordinate JTA1|
|completed||Prepare JTA1 (ensures that we can later complete. Notify 'cannotComplete' if JTA1 fails to prepare)|
|close arrives||begin JTA2 and associate with thread that calls 'close' on the WS-BA participant|
|closed leaves||commit JTA2. Failure here will cause the 'close' to be retried.|
|compensate arrives||begin JTA2 and associate with thread that calls 'compensate' on the WS-BA participant|
|compensated leaves||commit JTA2. Failure here will cause the 'compensate' to be retried.|
The crux of the solution is:
- Don't commit JTA1 until the transaction log has been written. Do this by delaying the commit until the confirmCompleted event is raised.
- 'Surround' the calls to the close and compensate invocations on the WS-BA participant with JTA2. I don't think this can be done through the XTS API, so it would require some internal changes.
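The proposed mapping can be sketched as a small event handler that records the JTA action taken for each WS-BA event. This is a rough illustration, not the XTS implementation: `BaJtaBridgeSketch` and all of its methods are hypothetical, no real transaction manager is called, and the boolean parameters stand in for prepare and commit outcomes. The commit of JTA1 on confirmCompleted follows the first bullet above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bridge; it records JTA actions instead of calling a transaction manager. */
class BaJtaBridgeSketch {
    final List<String> actions = new ArrayList<>();

    /** Application request arrives: start the subordinate transaction for the work. */
    void requestArrives() {
        actions.add("begin JTA1");
    }

    /** Participant completes: prepare JTA1 and choose the WS-BA message to send. */
    String completed(boolean prepareSucceeds) {
        actions.add("prepare JTA1");
        return prepareSucceeds ? "completed" : "cannotComplete";
    }

    /** confirmCompleted: the log has been written, so JTA1 may now commit. */
    void confirmCompleted() {
        actions.add("commit JTA1");
    }

    /** Returns true if 'closed' can be sent, false if the close must be retried. */
    boolean close(Runnable participantClose, boolean commitSucceeds) {
        return surroundWithJta2("close", participantClose, commitSucceeds);
    }

    /** Returns true if 'compensated' can be sent, false if the compensate must be retried. */
    boolean compensate(Runnable participantCompensate, boolean commitSucceeds) {
        return surroundWithJta2("compensate", participantCompensate, commitSucceeds);
    }

    /** Runs the participant callback between the begin and commit of JTA2. */
    private boolean surroundWithJta2(String name, Runnable callback, boolean commitSucceeds) {
        actions.add("begin JTA2");
        callback.run();
        actions.add("invoke " + name);
        if (!commitSucceeds) {
            actions.add("commit JTA2 failed");
            return false;
        }
        actions.add("commit JTA2");
        return true;
    }

    public static void main(String[] args) {
        BaJtaBridgeSketch bridge = new BaJtaBridgeSketch();
        bridge.requestArrives();
        System.out.println(bridge.completed(true));
        bridge.confirmCompleted();
        System.out.println(bridge.compensate(() -> { }, true));
        System.out.println(bridge.actions);
    }
}
```

A failed commit of JTA2 simply returns false here, modelling the case where no acknowledgement is sent and the coordinator delivers the close or compensate message again.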
Feedback on this issue would be much appreciated.