Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion
tomjenkinson Jun 22, 2015 12:03 PM
We have recently been asked for guidance on the best way to structure transaction manager deployments in environments where nodes are provisioned and fully discarded dynamically.
Here are the restrictions that have been expressed for the cluster:
1. Nodes can be added dynamically
2. Nodes can be removed dynamically
3. Special nodes can be defined as "sticky" - i.e. can be reloaded
4. Nodes that are removed on failure may be restarted but they may also be aggressively removed by the cluster manager and never reappear
5. Nodes may have unique configuration provided but new nodes have no knowledge of previous nodes' configuration
6. Nodes do not have reliable local storage but may access reliable databases
The constraint that Narayana adds is that transactions are ultimately safely partitioned by a "node identifier". It is not recommended to configure two transaction managers with the same node identifier when:
1. They talk to the same resource managers
2. They inhabit the same object store
3. They participate in propagated transactions (JTS or JTA/JBoss Remoting)
I have assumed that there is a set of persistent resource managers that have a lifespan beyond the application servers and are shared by the cluster of nodes - this requirement was implied.
Immediate observation:
Because nodes have no reliable local storage but can access reliable databases (restriction 6), the ObjectStore can be deployed safely onto a database.
As it is an (assumed) requirement of the environment that the same set of resource managers is accessed from different nodes, it would not be recommended for the transaction managers to share the same node identifier. Doing so could result in global transaction identifier collision, in which branches from two different transactions are associated together and treated as a single unit for the purposes of the ACID guarantees. This is seen below in the composition of an Xid:
* host address (InetAddress.getLocalHost)
* process id (may be configured)
* time the process started
* incrementing counter (initialized at zero)
* Node Name
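To make the collision risk concrete, here is a minimal sketch that models the five components listed above as a plain Java class. The class `GlobalIdSketch` and its field names are hypothetical and are not Narayana's actual Xid or Uid types. The scenario is also an assumption: two freshly provisioned nodes that happen to agree on host address, process id, start time and counter. Whether that can occur depends on the environment.

```java
import java.util.Objects;

// Hypothetical model of the identifier components listed above.
// This is NOT Narayana's Xid or Uid class; names and types are assumptions.
final class GlobalIdSketch {
    final String hostAddress;  // host address of the node
    final int processId;       // process id (may be configured)
    final long startTimeSecs;  // time the process started
    final int counter;         // incrementing counter, initialized at zero
    final String nodeName;     // the node identifier

    GlobalIdSketch(String hostAddress, int processId, long startTimeSecs,
                   int counter, String nodeName) {
        this.hostAddress = hostAddress;
        this.processId = processId;
        this.startTimeSecs = startTimeSecs;
        this.counter = counter;
        this.nodeName = nodeName;
    }

    // Two identifiers collide only when every component matches.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GlobalIdSketch)) return false;
        GlobalIdSketch other = (GlobalIdSketch) o;
        return hostAddress.equals(other.hostAddress)
                && processId == other.processId
                && startTimeSecs == other.startTimeSecs
                && counter == other.counter
                && nodeName.equals(other.nodeName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(hostAddress, processId, startTimeSecs, counter, nodeName);
    }

    public static void main(String[] args) {
        // Assumed scenario: the first four components coincide on two nodes.
        GlobalIdSketch first = new GlobalIdSketch("127.0.0.1", 1, 1434974580L, 0, "node-1");
        GlobalIdSketch sharedName = new GlobalIdSketch("127.0.0.1", 1, 1434974580L, 0, "node-1");
        GlobalIdSketch uniqueName = new GlobalIdSketch("127.0.0.1", 1, 1434974580L, 0, "node-2");

        System.out.println("shared node name collides: " + first.equals(sharedName)); // true
        System.out.println("unique node name collides: " + first.equals(uniqueName)); // false
    }
}
```

In this model the node name is the only component left to tell the two identifiers apart, which is why a unique node identifier per transaction manager matters.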
The initial options that might be considered are:
1. Deploy all transaction managers with unique node identifiers, disable recovery on all these nodes and then deploy a single node to be the recovery manager for the entire cluster (recover for "*" nodes), relying on the orphan safety interval to ensure that in-flight transactions are not interfered with. The safety interval is intended to cater for the situation where communication between transaction manager and recovery manager is impaired but both processes have remained alive. This option is not ideal because of its exposure to IPC failure and its reliance on the orphan safety interval, as discussed below. Furthermore, it is not ideal from the point of view of the environment, as it requires a special node to be defined and administered separately.
2. Deploy all transaction managers with unique node identifiers, enable recovery on all these nodes for "*". This is a derivative of option 1. The advantage over option 1 is that the operating environment does not need any special considerations for a recovery manager node. The disadvantage is that the different recovery managers are likely to attempt to complete branches in resource managers in parallel, resulting in many confusing ERROR/WARN messages.
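Both options lean on two mechanisms: the list of node names a recovery manager recovers for ("*" meaning all nodes) and the orphan safety interval. The sketch below is a simplified, hypothetical model of how those two checks might combine when a recovery manager finds a prepared branch with no matching transaction log. It is not Narayana's recovery code; the class, method and parameter names are invented for illustration, and the 20-second interval is only an example value.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of orphan handling; not Narayana's recovery code.
final class OrphanFilterSketch {

    // True if this recovery manager is configured to recover for the node name.
    static boolean recoversFor(List<String> recoveryNodes, String nodeName) {
        return recoveryNodes.contains("*") || recoveryNodes.contains(nodeName);
    }

    // True if a prepared branch would be treated as an orphan and rolled back.
    static boolean wouldRollBack(List<String> recoveryNodes, String nodeName,
                                 boolean hasLogRecord, long preparedAgeMillis,
                                 long safetyIntervalMillis) {
        if (hasLogRecord) {
            return false; // a logged transaction is completed from its log instead
        }
        if (!recoversFor(recoveryNodes, nodeName)) {
            return false; // branch belongs to a node this manager does not recover for
        }
        // Only branches older than the safety interval are presumed orphaned.
        return preparedAgeMillis > safetyIntervalMillis;
    }

    public static void main(String[] args) {
        List<String> allNodes = Arrays.asList("*");
        long safetyIntervalMillis = 20_000L; // example value only

        // Recently prepared, in-flight branch: left alone.
        System.out.println(wouldRollBack(allNodes, "node-1", false, 5_000L, safetyIntervalMillis));  // false
        // Slow prepare during a partition: older than the interval, so rolled back.
        System.out.println(wouldRollBack(allNodes, "node-1", false, 60_000L, safetyIntervalMillis)); // true
        // Same branch seen by a manager that only recovers for node-2: left alone.
        System.out.println(wouldRollBack(Arrays.asList("node-2"), "node-1", false, 60_000L, safetyIntervalMillis)); // false
    }
}
```

The second call is the case to worry about: the owning transaction manager is still alive, but from the recovery manager's side the branch is indistinguishable from an orphan once the interval has elapsed.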
Both of those solutions rely on IPC availability, or on the administrator being satisfied that reliance on safety intervals is acceptable. As a note on the safety interval, the worst-case scenario is as follows. If you have a transaction with resource managers which take a long time to prepare (and the resource manager can't access the transaction manager due to a network partition), the external recovery manager will roll back Xids that were prepared more than the safety interval ago. If the resource manager for these Xids then fails, the transaction manager may receive RMFAIL when it eventually moves to commit those branches. The transaction manager would then continue to commit the rest of the branches, and an unreported heuristic would occur. The only way to detect that this happened is if:
* JBTM-860 is not enabled
* The recovery manager has INFO logging on
* The user cross-references the rolling back log message from XARecoveryModule with the noxarecovery warnings.
This discussion may be used by interested parties to discuss these considerations and make further recommendations.