6 Replies Latest reply on Jun 26, 2015 7:45 AM by marklittle

Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion

tomjenkinson Jun 22, 2015 12:03 PM

We were recently asked for guidance on the best way to structure transaction manager deployments in environments where nodes are provisioned and fully discarded dynamically.

Here are the restrictions that have been expressed for the cluster:

1. Nodes can be added dynamically

2. Nodes can be removed dynamically

3. Special nodes can be defined as "sticky" - i.e. can be reloaded

4. Nodes that are removed on failure may be restarted but they may also be aggressively removed by the cluster manager and never reappear

5. Nodes may have unique configuration provided but new nodes have no knowledge of previous nodes' configuration

6. Nodes do not have reliable local storage but may access reliable databases

The constraints that Narayana adds are that transactions are ultimately safely partitioned by a "node identifier". It is not recommended to configure two transaction managers with the same node identifier when:

1. They talk to the same resource managers

2. They inhabit the same object store

3. They participate in propagated transactions (JTS or JTA/JBoss Remoting)

I have assumed that there is a set of persistent resource managers that have a lifespan beyond the application servers and are shared by the cluster of nodes - this requirement was implied.

Immediate observation:

Given the restrictions of the environment (particularly 6), the ObjectStore can be deployed safely onto a database, as the environment does provide reliable database storage.

As it is an (assumed) requirement of the environment that the same set of resource managers is accessed from different nodes, it would not be recommended for the transaction managers to share the same node identifier. Doing so could result in a global transaction identifier collision, which would cause branches from two different transactions to be associated together for the purposes of the ACID guarantees. This is seen below in the composition of an Xid:

* host address (InetAddress.getLocalHost)

* process id (may be configured)

* time the process started

* incrementing counter (initialized at zero)

* Node Name
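To make the collision risk concrete, here is a minimal sketch of an identifier built from the five components listed above. It is a hypothetical model only: `GlobalIdSketch` and its fields are names made up for illustration and are not Narayana's Uid or Xid classes. The values used assume a worst case in which two freshly provisioned nodes happen to agree on the first four components (for example, the local host resolving to a loopback address and a fixed process id), so that the node name is the only thing left to tell them apart.

```java
import java.util.Objects;

/**
 * Hypothetical model of the identifier components listed above.
 * This is NOT Narayana's Uid or Xid implementation; all names are illustrative.
 */
final class GlobalIdSketch {
    final String hostAddress;      // e.g. the result of InetAddress.getLocalHost
    final int processId;           // may be configured
    final long processStartMillis; // time the process started
    final int counter;             // incrementing counter, starts at zero
    final String nodeName;         // the node identifier

    GlobalIdSketch(String hostAddress, int processId, long processStartMillis,
                   int counter, String nodeName) {
        this.hostAddress = hostAddress;
        this.processId = processId;
        this.processStartMillis = processStartMillis;
        this.counter = counter;
        this.nodeName = nodeName;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GlobalIdSketch)) {
            return false;
        }
        GlobalIdSketch other = (GlobalIdSketch) o;
        return hostAddress.equals(other.hostAddress)
                && processId == other.processId
                && processStartMillis == other.processStartMillis
                && counter == other.counter
                && nodeName.equals(other.nodeName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(hostAddress, processId, processStartMillis, counter, nodeName);
    }

    public static void main(String[] args) {
        // Assumed worst case: both nodes agree on host, pid, start time and counter.
        GlobalIdSketch a = new GlobalIdSketch("127.0.0.1", 1, 1435000000000L, 0, "node1");
        GlobalIdSketch b = new GlobalIdSketch("127.0.0.1", 1, 1435000000000L, 0, "node1");
        GlobalIdSketch c = new GlobalIdSketch("127.0.0.1", 1, 1435000000000L, 0, "node2");

        System.out.println("same node name collides: " + a.equals(b));      // true
        System.out.println("distinct node names collide: " + a.equals(c)); // false
    }
}
```

With a unique node name per transaction manager, the identifiers stay distinct even in that assumed worst case.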

The initial options that might be considered are:

1. Deploy all transaction managers with unique node identifiers, disable recovery on all these nodes, and then deploy a single node to be the recovery manager for the entire cluster (recover for "*" nodes), relying on the orphan safety interval to ensure that in-flight transactions are not interfered with. This would cater for the situation where communication between the transaction manager and the recovery manager is impaired but both processes have remained alive. It is not ideal because of the IPC failure and the reliance on the orphan safety interval discussed below. Furthermore, it is not ideal from the point of view of the environment, as it requires a special node to be defined and administered specially.

2. Deploy all transaction managers with unique node identifiers and enable recovery on all these nodes for "*". This is a derivative of option 1. The advantage over option 1 is that the operating environment does not need any special considerations for a recovery manager node. The disadvantage is that the different recovery managers are likely to attempt to complete branches in resource managers in parallel, resulting in many confusing ERROR/WARN messages.
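As a rough illustration of what both options ask of a recovery manager, the sketch below models the decision for a single in-doubt branch. It is an assumption-laden simplification, not Narayana's orphan filter code: `OrphanDecisionSketch`, `decide`, and all parameter names are hypothetical, and the age is taken to mean how long the branch has been observed in doubt.

```java
import java.util.Collections;
import java.util.Set;

/**
 * Hypothetical sketch of the decision a recovery manager configured for "*"
 * has to make about an in-doubt branch. Not Narayana's actual code.
 */
final class OrphanDecisionSketch {
    enum Action { LEAVE_ALONE, ROLL_BACK }

    static Action decide(String xidNodeName,
                         Set<String> recoveryNodes,
                         boolean hasTransactionLog,
                         long inDoubtAgeMillis,
                         long safetyIntervalMillis) {
        // "*" means this recovery manager accepts branches from every node.
        boolean responsible = recoveryNodes.contains("*")
                || recoveryNodes.contains(xidNodeName);
        if (!responsible) {
            return Action.LEAVE_ALONE;
        }
        // A branch with a matching log record is completed by normal recovery.
        if (hasTransactionLog) {
            return Action.LEAVE_ALONE;
        }
        // No log record: the branch may be an orphan, or may belong to a
        // transaction that is still in flight. Only the interval separates them.
        if (inDoubtAgeMillis <= safetyIntervalMillis) {
            return Action.LEAVE_ALONE;
        }
        return Action.ROLL_BACK;
    }

    public static void main(String[] args) {
        Set<String> all = Collections.singleton("*");
        long interval = 20000L; // illustrative value only

        // Young branch without a log record: assumed to be in flight.
        System.out.println(decide("node7", all, false, 5000L, interval));  // LEAVE_ALONE
        // Same branch after a slow prepare: now treated as an orphan.
        System.out.println(decide("node7", all, false, 25000L, interval)); // ROLL_BACK
    }
}
```

The second call shows the weakness: without a way to ask the owning transaction manager, a slow but live transaction is indistinguishable from an orphan once the interval has elapsed.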

Both of those solutions rely on IPC availability, or on the administrator being satisfied that reliance on safety intervals is acceptable. A note on the safety interval: the worst-case scenario is a transaction with resource managers which take a long time to prepare (and the resource manager can't access the transaction manager due to a network partition). The external recovery manager will roll back Xids that were prepared more than the "safety interval" ago. If the resource manager for these Xids then fails, when the transaction manager eventually moves to commit those branches it may receive RMFAIL, which means that the transaction manager would continue to commit the rest of the branches and an unreported heuristic would occur. The only way to detect that this happened is if:

* JBTM-860 is not enabled

* The recovery manager has INFO logging on

* The user cross references the rolling back log message from XARecoveryModule with the noxarecovery warnings.

This discussion may be used by interested parties to discuss these considerations and make further recommendations.

1. Re: Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion

marklittle Jun 22, 2015 1:34 PM (in response to tomjenkinson)

Of course there are other possible implementations not covered here, but they would require much more invasive changes. For example, you could imagine implementing a cluster of cooperating recovery managers using something like jGroups, such that they communicate with each other before attempting to recover any state.
2. Re: Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion

mmusgrov Jun 25, 2015 4:40 AM (in response to tomjenkinson)

Option 1 sounds dangerous. How about something in between and deploy an "HA Recovery Manager" (we already have a prototype for one of these - see JBTM-1359). This solution should be safe, particularly if the cluster is on the same subnet so that we do not need to worry about network partitions.
3. Re: Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion

tomjenkinson Jun 25, 2015 6:59 AM (in response to mmusgrov)

I think that is what Mark has mentioned too. Another option that relies on IPC could be to introduce an orphan filter that can ping back to the remote transaction manager to check the status of the transactions using TransactionStatusConnectionManager (narayana/TransactionStatusConnectionManager.java at master · jbosstm/narayana · GitHub). It is still reliant on IPC, though, so it would only serve to close the window a little further.
4. Re: Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion

mmusgrov Jun 25, 2015 7:24 AM (in response to tomjenkinson)

I think that is what Mark has mentioned too.

Mark said his proposal was invasive and involved multiple cooperating recovery managers operating simultaneously. My proposal is not invasive since we only ever have a single recovery manager running (similar to what we have now) and we rely on a clustered singleton to do the failover.
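For readers unfamiliar with the clustered singleton idea, a minimal sketch of the "only one active recovery manager, with failover" behaviour follows. It is not the JBTM-1359 prototype and not any cluster framework's API: `RecoveryLeaseSketch` and `tryAcquire` are hypothetical names, and the in-memory lease stands in for whatever shared, reliable store or singleton service the environment would really provide.

```java
/**
 * Hypothetical lease that lets at most one node act as the recovery manager.
 * The in-memory state is a stand-in for a shared store; this is an
 * illustration only, not the JBTM-1359 prototype.
 */
final class RecoveryLeaseSketch {
    private String holder;        // node currently allowed to run recovery
    private long expiresAtMillis; // when the current lease lapses

    /** Returns true if nodeId holds the lease after this call. */
    synchronized boolean tryAcquire(String nodeId, long nowMillis, long leaseMillis) {
        boolean free = holder == null || nowMillis >= expiresAtMillis;
        if (free || holder.equals(nodeId)) {
            // Take over a free or lapsed lease, or renew our own.
            holder = nodeId;
            expiresAtMillis = nowMillis + leaseMillis;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        RecoveryLeaseSketch lease = new RecoveryLeaseSketch();
        System.out.println(lease.tryAcquire("nodeA", 0L, 1000L));    // true: first claimant
        System.out.println(lease.tryAcquire("nodeB", 500L, 1000L));  // false: nodeA still active
        System.out.println(lease.tryAcquire("nodeB", 1500L, 1000L)); // true: nodeA's lease lapsed
    }
}
```

A node would run recovery scans only while `tryAcquire` keeps returning true for it. As with the safety interval, a lease like this assumes that the holder notices it has lost the lease before another node starts work.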
5. Re: Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion

tomjenkinson Jun 25, 2015 7:48 AM (in response to mmusgrov)

I read "co-operating" to mean that they would identify who was the leader (i.e. enabled); I guess you read it as partitioning the work.
6. Re: Running the transaction manager (and recovery manager) in an environment with automatic provisioning - a discussion

marklittle Jun 26, 2015 7:45 AM (in response to mmusgrov)

Yes, the difference is primary/copy (passive replication) versus active replication. There are downsides to both types of replication protocol based upon the assumptions they make.