This article is currently a place for me to jot down my understanding of how the XTS recovery code works. It is very likely to contain errors and should not be relied upon. I hope that one day it will evolve into something a contributor can use to understand what is going on.
TX Object Store
For XTS, a coordinator record is written for each transaction and a participant record is written for each participant. Each record is written to the object store local to the associated coordinator or participant. The record is written at a particular point in the transaction and is removed when the transaction completes. An entry is only ever rewritten if a heuristic outcome occurs. Entries in the object store represent either an inflight transaction or a transaction that was inflight when the server crashed. An entry that lives longer than the transaction timeout (plus some grace period) is suspected to be the result of a crash, and the recovery manager attempts recovery.
The recovery manager runs every 120 seconds by default. This period is configurable, but too low a value can result in strange behavior. It is recommended that you don't go any lower than 10 seconds. Each time the recovery manager fires, it runs two scans. The first scan obtains all the in-doubt transactions (those that currently have an entry in the object store). It is not yet known whether these transactions need recovering or are simply inflight. The second scan occurs after the inflight transactions should have completed. At this point any remaining entries that were also present in the first pass are strong candidates for recovery.
In particular, the following steps occur:
- com.arjuna.ats.internal.arjuna.recovery#run() is the run method of the recovery manager. It loops until the thread is marked as finished.
- The loop begins by obtaining a lock that allows it to work on the object store.
- Once it has the lock, it does something around checking for other workers, before calling doWorkInternal() TODO: find out what it's doing here.
- doWorkInternal() carries out the two scans.
- It starts by calling periodicWorkFirstPass on all registered recovery modules, passing in the associated classloader for the module. This ensures the recovery module is able to instantiate the classes it requires.
- The thread waits, to give currently active transactions time to complete.
- periodicWorkSecondPass is then called on each registered recovery module in turn.
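The steps above can be sketched as follows. This is a minimal illustration of the two-pass order only, not the actual recovery manager code: SketchRecoveryModule, TwoPassScanner and the backoffMillis parameter are hypothetical names invented for the sketch, and the locking, worker checks and classloader handling are left out. The method names follow the periodicWorkFirstPass()/periodicWorkSecondPass() pair used elsewhere in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a recovery module (not the real interface).
interface SketchRecoveryModule {
    void periodicWorkFirstPass();
    void periodicWorkSecondPass();
}

// Hypothetical driver showing the order of operations in one recovery scan.
class TwoPassScanner {
    private final List<SketchRecoveryModule> modules = new ArrayList<>();
    private final long backoffMillis; // assumed wait between the two passes

    TwoPassScanner(long backoffMillis) {
        this.backoffMillis = backoffMillis;
    }

    void addModule(SketchRecoveryModule module) {
        modules.add(module);
    }

    // One scan: first pass on every module, wait, then second pass on every module.
    void doWorkInternal() {
        for (SketchRecoveryModule module : modules) {
            module.periodicWorkFirstPass();
        }
        try {
            // Give transactions that are still inflight time to complete.
            Thread.sleep(backoffMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return; // interrupted (for example on shutdown): skip the second pass
        }
        for (SketchRecoveryModule module : modules) {
            module.periodicWorkSecondPass();
        }
    }
}
```

Note that every module completes its first pass before any module starts its second pass, so the wait is shared by all modules rather than repeated per module.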
Many recovery modules will be registered, and each will be responsible for its own type of log record. As this article is about XTS, we will focus on the XTS recovery modules, which are listed in xts-properties.xml. Two classes are responsible for registering these modules: org.jboss.jbossts.xts.recovery.coordinator.CoordinatorRecoveryInitialisation and org.jboss.jbossts.xts.recovery.participant.ParticipantRecoveryInitialisation. The former registers recovery modules for recovering coordinators and the latter registers recovery modules for recovering participants. Each startup method looks up the class names of the associated recovery modules in xts-properties.xml and then registers them with the RecoveryManager.
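The lookup-then-register pattern described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: SketchModule, SketchCoordinatorModule, SketchRecoveryInitialisation and the property key prefix are hypothetical, the REGISTERED list stands in for the RecoveryManager, and a java.util.Properties object stands in for xts-properties.xml. The real initialisation classes may differ in every detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical marker type standing in for a recovery module.
interface SketchModule {
}

// Example module with the public no-argument constructor that reflection needs.
class SketchCoordinatorModule implements SketchModule {
    public SketchCoordinatorModule() {
    }
}

// Hypothetical initialiser: resolves class names from configuration and registers instances.
class SketchRecoveryInitialisation {
    // Stand-in for the recovery manager's list of registered modules.
    static final List<SketchModule> REGISTERED = new ArrayList<>();

    // For every property whose key starts with keyPrefix, load the named class,
    // instantiate it and register the instance. Keys are visited in sorted order.
    static void startup(Properties config, String keyPrefix) {
        for (String key : new TreeSet<>(config.stringPropertyNames())) {
            if (!key.startsWith(keyPrefix)) {
                continue; // not a recovery module entry
            }
            String className = config.getProperty(key).trim();
            try {
                Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
                REGISTERED.add((SketchModule) instance);
            } catch (ReflectiveOperationException | ClassCastException e) {
                throw new IllegalStateException("Cannot register recovery module " + className, e);
            }
        }
    }
}
```

Keeping the class names in configuration means the set of modules can be changed without recompiling the initialisation class, which is presumably why the lookup goes through the properties file.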
TODO: find out the differences between these modules. The subordinate ones are different as (I think) they are aware that the parent coordinator will also be in the process of recovering. This needs more investigation.
Coordinator Recovery Modules
A coordinator recovery module exists for each type of XTS coordinator. These are as follows:
During the first pass, periodicWorkFirstPass() is invoked. This invokes processTransactions() to obtain all the logs in the tx-object-store of the type this recovery module is responsible for recovering.
During the second pass, periodicWorkSecondPass() is invoked. This invokes processTransactionsStatus(), which attempts to recover all the transactions with entries in the tx-object-store that were found during the first pass. This method iterates over the ids of these transactions, checking with the recoveryStore that the corresponding transaction has not yet completed. For those still present, doRecoverTransaction is invoked, which does the following:
- Checks that the transaction is not in a state that can only be held by an inflight transaction.
- Creates a recovery coordinator to recover this transaction. This class extends the associated coordinator class and implements the recovery features.
- Invokes replayPhase2 on the recovery coordinator to do the recovery.
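The two passes of a coordinator recovery module can be sketched as below. This is a simplified model under stated assumptions, not the real module: SketchStatus, SketchCoordinatorRecoveryModule and the in-memory store map are hypothetical, RUNNING is assumed to be the only state that can be held solely by an inflight transaction, and the creation of the recovery coordinator is reduced to recording the transaction id.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical transaction states; the real status values are not reproduced here.
enum SketchStatus { RUNNING, PREPARED, COMMITTING, ABORTING }

class SketchCoordinatorRecoveryModule {
    // In-memory stand-in for the object store: transaction id -> logged status.
    final Map<String, SketchStatus> store = new LinkedHashMap<>();
    // Ids of the transactions handed to recovery, kept for inspection.
    final List<String> recovered = new ArrayList<>();
    private List<String> firstPassIds = new ArrayList<>();

    // First pass: remember which entries exist right now.
    void periodicWorkFirstPass() {
        firstPassIds = new ArrayList<>(store.keySet());
    }

    // Second pass: only entries seen in the first pass and still present are recovered.
    void periodicWorkSecondPass() {
        for (String id : firstPassIds) {
            SketchStatus status = store.get(id);
            if (status != null) { // null means the transaction completed in the meantime
                doRecoverTransaction(id, status);
            }
        }
    }

    private void doRecoverTransaction(String id, SketchStatus status) {
        if (status == SketchStatus.RUNNING) {
            return; // assumed inflight-only state: leave the transaction alone
        }
        // A real module would create a recovery coordinator here and call replayPhase2 on it.
        recovered.add(id);
    }
}
```

An entry that appears only after the first pass is ignored until the next scan, and an entry removed between the passes is treated as a transaction that completed normally.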
Recovery Coordinator's replayPhase2()
- Decides, based on the status of the transaction, whether to commit or roll back.
- Invokes the parent Coordinator's phase2Commit() or phase2Abort() which replays the corresponding commit or rollback message.
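A minimal sketch of this decision is shown below. The class names, the ReplayStatus values and the mapping from status to outcome are all assumptions chosen for illustration; the real recovery coordinators work with a larger set of states, including heuristic ones, which this sketch ignores.

```java
// Hypothetical subset of transaction states used only for this sketch.
enum ReplayStatus { PREPARED, COMMITTING, ABORTING, ABORT_ONLY }

// Stand-in for the coordinator class that owns the phase-two operations.
class SketchCoordinator {
    String lastReplayed = "none";

    void phase2Commit() {
        lastReplayed = "commit"; // a real coordinator would resend commit to its participants
    }

    void phase2Abort() {
        lastReplayed = "rollback"; // a real coordinator would resend rollback to its participants
    }
}

// Recovery subclass: extends the coordinator and adds the replay decision.
class SketchRecoveryCoordinator extends SketchCoordinator {
    private final ReplayStatus status;

    SketchRecoveryCoordinator(ReplayStatus status) {
        this.status = status;
    }

    // Assumed mapping: states on the commit path replay commit, the rest replay rollback.
    void replayPhase2() {
        switch (status) {
            case PREPARED:
            case COMMITTING:
                phase2Commit();
                break;
            case ABORTING:
            case ABORT_ONLY:
                phase2Abort();
                break;
        }
    }
}
```

Because the recovery coordinator inherits the phase-two methods from the coordinator class, recovery replays exactly the same messages that the original transaction would have sent.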
Participant Recovery Modules
A separate participant recovery module exists for AT and BA participants. These are as follows:
WS-AT to JTA Bridge recovery
Recovery is done by the InboundBridgeRecoveryManager, which implements three interfaces: XTSATRecoveryModule, RecoveryModule and XAResourceOrphanFilter. Implementing these interfaces provides the following functionality:
XTSATRecoveryModule: Only deserialize is supported, as BridgeDurableParticipant is Serializable. deserialize starts by checking that it has a BridgeDurableParticipant. If it does, it deserializes it to an instance of BridgeDurableParticipant, adds that instance to the participantsAwaitingRecovery list and returns it.
RecoveryModule: periodicWorkSecondPass is triggered after the XTS recovery module has recovered the BridgeDurableParticipants, so the first thing it does is clean up any participants that are not awaiting recovery. orphanedXAResourcesAreIdentifiable is then set to true: all the participants known to the XTS recovery system have been recovered, so any that remain must be orphans. The method then obtains all in-doubt JTA subordinate transactions and calls checkXid(Xid) to find out what to do with each of them. All those that need to be rolled back are driven to rollback through the XATerminator.
XAResourceOrphanFilter: Offers a single method, checkXid(Xid). It returns Vote.ABSTAIN if the participant is not a BridgeDurableParticipant. It returns Vote.LEAVE_ALONE if periodicWorkSecondPass has not yet occurred (or progressed far enough), or if the participant is in the participantsAwaitingRecovery list and thus could still be recovered. Otherwise Vote.ROLLBACK is returned, due to the presumed abort semantics.
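The voting rules, and the way the second pass uses them, can be sketched as follows. This is an illustration under assumptions rather than the InboundBridgeRecoveryManager itself: SketchVote and SketchOrphanFilter are hypothetical, Xids are modelled as plain strings, the belongsToBridge flag stands in for whatever test identifies a BridgeDurableParticipant, and the returned list stands in for driving rollback through the XATerminator.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical mirror of the three votes described above.
enum SketchVote { ABSTAIN, LEAVE_ALONE, ROLLBACK }

class SketchOrphanFilter {
    // Participants that deserialize has seen and that may still be recovered.
    final Set<String> participantsAwaitingRecovery = new HashSet<>();
    // Becomes true once the second pass has progressed far enough.
    boolean orphanedXAResourcesAreIdentifiable = false;

    SketchVote checkXid(String xid, boolean belongsToBridge) {
        if (!belongsToBridge) {
            return SketchVote.ABSTAIN; // not ours: let another filter decide
        }
        if (!orphanedXAResourcesAreIdentifiable) {
            return SketchVote.LEAVE_ALONE; // too early to tell orphans apart
        }
        if (participantsAwaitingRecovery.contains(xid)) {
            return SketchVote.LEAVE_ALONE; // could still be recovered
        }
        return SketchVote.ROLLBACK; // presumed abort: no log entry, so roll back
    }

    // Second pass: orphans are now identifiable; returns the xids that should be rolled back.
    List<String> periodicWorkSecondPass(List<String> inDoubtBridgeXids) {
        orphanedXAResourcesAreIdentifiable = true;
        List<String> toRollBack = new ArrayList<>();
        for (String xid : inDoubtBridgeXids) {
            if (checkXid(xid, true) == SketchVote.ROLLBACK) {
                toRollBack.add(xid);
            }
        }
        return toRollBack;
    }
}
```

The order of the checks matters: until the second pass has run, the filter cannot distinguish an orphan from a participant whose log has simply not been read yet, so it must not vote for rollback.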