-
30. Re: Remote txinflow: XID changes
tomjenkinson Oct 19, 2011 12:26 PM (in response to dmlloyd)
Just to be clear, I am still not sold on the idea of touching EIS name as the API is "in the wild". That said, and terminology aside, what we need is some way to know at which entity an XID was generated so that the recovery manager can detect it is an orphan. I have been calling these node names, as the existing terminology is "nodeIdentifier". Remember though, these are transaction manager "nodes", not nodes as in hosts. Potentially there can be more than one transaction manager node per host and each of these will need to be given a unique persistent identifier. This is an existing and established requirement of JBoss TS.
I am not sure if I am misinterpreting Mark or David here but I think Mark was suggesting putting the EIS name to EIS XID key into a configuration file so that we can then fit in the node name as a String as David would like. I was saying that is not great as either the established API needs to change to pass us the key or we take a performance hit of resolving the key in the configuration file. Plus at the moment we are preferably going to put two node identifiers in the bqual.
In either scenario, the administrator is going to need to keep a mapping file. Either remoting node identifier to transaction manager node name (my preference) or EIS name to EIS int key.
-
31. Re: Remote txinflow: XID changes
jhalliday Oct 19, 2011 12:34 PM (in response to tomjenkinson)
> In either scenario, the administrator is going to need to keep a mapping file. Either remoting node identifier to transaction manager node name (my preference) or EIS name to EIS int key.
That's not quite right. EIS names are scoped to the node and the mapping can be maintained programmatically by e.g. the JCA xa recovery plugin registration code. The nodeIdentifier is potentially enterprise (or at least data centre) scoped and needs manual coordination and maintenance or some centralized/hierarchic registry service to manage uniqueness. Then again so does allocation of a uniq string node name in the first place.
-
32. Re: Remote txinflow: XID changes
dmlloyd Oct 19, 2011 12:47 PM (in response to jhalliday)
I can see wanting the node name for debugging purposes in the gtid (for the originator) and bqual (for the branch node). We'd have to make sure that we truncate names which are too long though as the limit for node names is currently 255 characters (though we could reduce it further, as the maximum length of a host name label is 63 (recommended maximum length is 24)).
But what I'm not clear on is, why exactly do we need the other node names in the XID? I think that came out of off-forum discussions and without that information I can't really offer up any ideas. It seems like we should be able to take advantage of the unique topography of this scheme to provide some of the information.
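A minimal sketch of the truncation mentioned above, assuming node names are written into the XID as UTF-8 bytes. The class name, the helper `truncateUtf8`, and the byte budget passed to it are hypothetical; nothing here is taken from JBoss TS or Remoting. The only subtlety it shows is cutting on a character boundary so the stored name still decodes cleanly.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class NodeNameTruncation {
    /**
     * Hypothetical helper: returns the UTF-8 bytes of name, cut to at most
     * maxBytes without splitting a multi-byte character.
     */
    static byte[] truncateUtf8(String name, int maxBytes) {
        byte[] all = name.getBytes(StandardCharsets.UTF_8);
        if (all.length <= maxBytes) {
            return all;
        }
        int end = maxBytes;
        // If the first dropped byte is a continuation byte (10xxxxxx), the cut
        // would land inside a character, so back up to that character's start.
        while (end > 0 && (all[end] & 0xC0) == 0x80) {
            end--;
        }
        return Arrays.copyOf(all, end);
    }
}
```

Note that truncation can make two distinct names collide, so whoever allocates node names would have to guarantee uniqueness within the truncated prefix, not just within the full name.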
-
33. Re: Remote txinflow: XID changes
tomjenkinson Oct 19, 2011 12:49 PM (in response to jhalliday)
Jonathan Halliday wrote:
> > In either scenario, the administrator is going to need to keep a mapping file. Either remoting node identifier to transaction manager node name (my preference) or EIS name to EIS int key.
> That's not quite right. EIS names are scoped to the node and the mapping can be maintained programmatically by e.g. the JCA xa recovery plugin registration code. The nodeIdentifier is potentially enterprise (or at least data centre) scoped and needs manual coordination and maintenance or some centralized/hierarchic registry service to manage uniqueness. Then again so does allocation of a uniq string node name in the first place.
Agree with you here, the scope of the EIS name needs to be unique within the node, so it can be shorter. I think I maybe wasn't 100% clear but what I mean for EIS name was administrator would still need to keep a mapping of EIS JNDI name to EIS "short" name which is what I meant by keeping a mapping file in that scenario. At the moment the EIS name is read from the JCA configuration that says "jndi-data-source-name" basically (typically). If we no longer use that value then basically an extra bit of configuration (which I was terming mapping) must be held to say for XYZ data source it has a JNDI name of "foo" and a XAResourceWrapper identifier of "bar" - well, most likely a shorter version of foo.
Also agree with you when you say "the nodeIdentifier is potentially enterprise (or at least data centre) scoped and needs manual coordination and maintenance or some centralized/hierarchic registry service to manage uniqueness. Then again so does allocation of a uniq string node name in the first place." Basically, what this says is we should not use the complexity of managing the node name as an argument for/against Strings or ints.
-
34. Re: Remote txinflow: XID changes
tomjenkinson Oct 19, 2011 12:53 PM (in response to dmlloyd)
To be clear, the EIS name is the only debugging information in the XID.
XID:
formatId
gtrid { sequence, root node name }
bqual { sequence, subordinate node name, parent node name, eis name }
sequences are vital naturally
root node name is vital to detect orphans at the root in recovery
subordinate node name is vital to detect orphans at subordinates
parent node name is optional to filter list of recovered XIDs at a remote server by the caller
eis name is debugging
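To make the layout above concrete, here is a rough sketch of how the two parts could be packed. The wire format is an assumption made purely for illustration (an 8-byte sequence followed by length-prefixed UTF-8 names); it is not the JBoss TS encoding, and the class and method names are hypothetical. The 64-byte cap reflects the XA limit on the size of the gtrid and of the bqual, which is what makes the length of the node names matter.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class XidLayoutSketch {
    // XA allows at most 64 bytes each for the gtrid and the bqual.
    static final int MAX_PART_BYTES = 64;

    /** gtrid { sequence, root node name } */
    static byte[] gtrid(long sequence, String rootNode) {
        return pack(sequence, rootNode);
    }

    /** bqual { sequence, subordinate node name, parent node name, eis name } */
    static byte[] bqual(long sequence, String subordinateNode, String parentNode, String eisName) {
        return pack(sequence, subordinateNode, parentNode, eisName);
    }

    // Assumed format: 8-byte sequence, then each name as 1 length byte + UTF-8 bytes.
    private static byte[] pack(long sequence, String... names) {
        ByteBuffer buf = ByteBuffer.allocate(MAX_PART_BYTES);
        buf.putLong(sequence);
        for (String name : names) {
            byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
            if (buf.remaining() < bytes.length + 1) {
                throw new IllegalArgumentException("names do not fit in " + MAX_PART_BYTES + " bytes");
            }
            buf.put((byte) bytes.length);
            buf.put(bytes);
        }
        byte[] out = new byte[buf.position()];
        buf.flip();
        buf.get(out);
        return out;
    }

    /** Reads the index-th name back out; throws BufferUnderflowException if there is no such field. */
    static String nameAt(byte[] part, int index) {
        ByteBuffer buf = ByteBuffer.wrap(part);
        buf.getLong(); // skip the sequence
        for (int i = 0; ; i++) {
            byte[] bytes = new byte[buf.get() & 0xFF];
            buf.get(bytes);
            if (i == index) {
                return new String(bytes, StandardCharsets.UTF_8);
            }
        }
    }
}
```

With this format, three names in the bqual leave 64 - 8 - 3 = 53 bytes to share between them, which is one way to see why short node and EIS names are attractive.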
-
35. Re: Remote txinflow: XID changes
jhalliday Oct 19, 2011 12:55 PM (in response to tomjenkinson)
> for EIS name was administrator would still need to keep a mapping of EIS JNDI name to EIS "short" name
No, the server can simply allocate and store a new uniq short EIS name for any long name it has not seen before. Because the server has view over the entire scope, so that mapping process can be automated. That's the key difference with nodeIdentifier - a server does not have sufficient local information to allocate itself a globally uniq node id.
-
36. Re: Remote txinflow: XID changes
dmlloyd Oct 19, 2011 1:04 PM (in response to tomjenkinson)
Tom Jenkinson wrote:
> Jonathan Halliday wrote:
> > > In either scenario, the administrator is going to need to keep a mapping file. Either remoting node identifier to transaction manager node name (my preference) or EIS name to EIS int key.
> > That's not quite right. EIS names are scoped to the node and the mapping can be maintained programmatically by e.g. the JCA xa recovery plugin registration code. The nodeIdentifier is potentially enterprise (or at least data centre) scoped and needs manual coordination and maintenance or some centralized/hierarchic registry service to manage uniqueness. Then again so does allocation of a uniq string node name in the first place.
> Agree with you here, the scope of the EIS name needs to be unique within the node, so it can be shorter. I think I maybe wasn't 100% clear but what I mean for EIS name was administrator would still need to keep a mapping of EIS JNDI name to EIS "short" name which is what I meant by keeping a mapping file in that scenario. At the moment the EIS name is read from the JCA configuration that says "jndi-data-source-name" basically (typically). If we no longer use that value then basically an extra bit of configuration (which I was terming mapping) must be held to say for XYZ data source it has a JNDI name of "foo" and a XAResourceWrapper identifier of "bar" - well, most likely a shorter version of foo
From the Remoting perspective, the EIS name would most likely just be the full node name of the calling node. If I mix these two up in the future, this is why.
-
37. Re: Remote txinflow: XID changes
tomjenkinson Oct 19, 2011 1:09 PM (in response to jhalliday)
Ah, OK, I can see what you mean now.
I am a bit concerned about that. Let's say the implementation of the compress routine was (I am going to basically compress each name to an int as it seems logical to do it this way instead of based on the length of the provided name):
// needs java.util.Map and java.util.HashMap
static int counter;
static final Map<String, Integer> names = new HashMap<String, Integer>();

String jndiName = xaResourceWrapper.getJndiName();
Integer xidEisName; // boxed so that a missing entry shows up as null
synchronized (names) {
    xidEisName = names.get(jndiName);
    if (xidEisName == null) {
        xidEisName = counter++;
        names.put(jndiName, xidEisName);
        System.out.println("For the purpose of this run JNDI name: " + jndiName + " is " + xidEisName);
    }
}
// write XID out using the xidEisName instead of what was provided
A> It would be a performance hit
B> We couldn't guarantee the name was the same between runs unless we used a file instead of the static Map<String, Integer> names, which would be even more of a performance hit, as we would have to write it out each time a new datasource is presented to us.
Jonathan Halliday wrote:
> > for EIS name was administrator would still need to keep a mapping of EIS JNDI name to EIS "short" name
> No, the server can simply allocate and store a new uniq short EIS name for any long name it has not seen before. Because the server has view over the entire scope, so that mapping process can be automated. That's the key difference with nodeIdentifier - a server does not have sufficient local information to allocate itself a globally uniq node id.
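On point A, one possible way to reduce the locking cost of the in-memory lookup is sketched below. It assumes Java 8 or later (for `ConcurrentHashMap.computeIfAbsent`); the class and method names are hypothetical. It does nothing about point B: the ids are still only stable for the lifetime of the process, so keeping them the same between runs would still need some persistent store.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class EisNameAllocator {
    private final ConcurrentHashMap<String, Integer> ids = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    /**
     * Returns the short int id for a JNDI name, allocating one the first time
     * the name is seen. ConcurrentHashMap.computeIfAbsent runs the mapping
     * function at most once per key, so each name gets exactly one id.
     */
    int idFor(String jndiName) {
        return ids.computeIfAbsent(jndiName, k -> next.getAndIncrement());
    }
}
```

A caller would do something like `int xidEisName = allocator.idFor(xaResourceWrapper.getJndiName());` where `allocator` is a shared instance. Lookups of names that have already been seen do not take a global lock, which is the common case once all datasources have been presented.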
-
38. Re: Remote txinflow: XID changes
dmlloyd Oct 19, 2011 1:09 PM (in response to tomjenkinson)
Tom Jenkinson wrote:
> To be clear, the EIS name is the only debugging information in the XID.
> XID:
> formatId
> gtrid { sequence, root node name }
> bqual { sequence, subordinate node name, parent node name, eis name }
> sequences are vital naturally
> root node name is vital to detect orphans at the root in recovery
> subordinate node name is vital to detect orphans at subordinates
> parent node name is optional to filter list of recovered XIDs at a remote server by the caller
> eis name is debugging
Okay so here's where my confusion came in.
When I suggested that sequence idea, the thought was that root node + sequence is basically equivalent to (subordinate node name + parent node name) in that any participant should be able to tell whether a branch belonged to it or its descendants, and also should be adequate for filtering recovered XIDs. But I'm not sure what you mean by "orphans" exactly - aren't all branches normally recovered by the root? If so, does the subordinate path actually matter?
-
39. Re: Remote txinflow: XID changes
jhalliday Oct 19, 2011 1:11 PM (in response to dmlloyd)
> From the Remoting perspective, the EIS name would most likely just be the full node name of the calling node
Nope, it's the name of the target node, not the originating node. If you have one parent server talking to two subordinate servers, that's two distinct EIS names, because the parent has to be able to tell the subordinates apart for recovery purposes, just as it has to be able to tell apart e.g. the Oracle and MSSQL server dbs it's talking to through the same mechanism. Which brings us back to the other list the parent needs to maintain: a set of all the subordinates it's talked to. Well actually it's the set of all subordinates with outstanding tx, but deleting items from the list frequently is probably more trouble than it's worth.
-
40. Re: Remote txinflow: XID changes
dmlloyd Oct 19, 2011 1:17 PM (in response to jhalliday)
Jonathan Halliday wrote:
> > From the Remoting perspective, the EIS name would most likely just be the full node name of the calling node
> Nope, it's the name of the target node, not the originating node. If you have one parent server talking to two subordinate servers, that's two distinct EIS names, because the parent has to be able to tell the subordinates apart for recovery purposes, just as it has to be able to tell apart e.g. the Oracle and MSSQL server dbs it's talking to through the same mechanism. Which brings us back to the other list the parent needs to maintain: a set of all the subordinates it's talked to. Well actually it's the set of all subordinates with outstanding tx, but deleting items from the list frequently is probably more trouble than it's worth.
Ah, you're just looking at it from the opposite end; however the point still stands that the EIS name is equal to the node name. If I set up a Remoting connection into the EJB client, the node (and cluster) name of the remote side is noted for various purposes. This also has the advantage that a node can be contacted via a different configuration, socket, or protocol and still be recognized which allows us to port over UserTransaction as well as stateful EJBs to new connections. It should also help us with this issue.
-
41. Re: Remote txinflow: XID changes
jhalliday Oct 19, 2011 1:23 PM (in response to dmlloyd)
> But I'm not sure what you mean by "orphans" exactly - aren't all branches normally recovered by the root? If so, does the subordinate path actually matter?
An orphan is a prepared RM whose immediate parent has died without writing a log. Under presumed abort it has got to be rolled back. Only the immediate parent can do that - NOT the root (unless they are the same). The reason is simple: only the immediate parent can recognize the orphan status, as it's a function of the absence of a log record at that location. Also, in some deployments only the immediate parent may have the driver and connection details to reach that orphan - they are not guaranteed to be available at the root. Part of the reason that nodes need a uniq id is that encoding it into the Xid allows them to recognize their own orphaned branches and likewise avoid tampering with ones they don't own. You can actually eliminate the need for a globally uniq id for this purpose by writing to the log twice - once before and once after preparing the RM, in which case you don't get orphans any more. The cure is worse than the disease though - those log writes to disk drop your performance to unacceptable levels and you still need a globally uniq value of some form for generating uniq tx ids anyhow.
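The orphan rule described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the JBoss TS recovery code: the `Branch` type, its fields and `findOrphans` are hypothetical, each prepared branch is assumed to carry the node id of its immediate parent (decoded from the Xid), and the local log is reduced to a set of transaction ids. A real recovery pass would also have to avoid treating transactions that are still in flight as orphans.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class OrphanFilterSketch {
    /** Hypothetical view of a prepared branch reported by an RM during recovery. */
    static final class Branch {
        final String ownerNode; // node id encoded in the Xid by the immediate parent
        final String txId;      // transaction id the branch belongs to

        Branch(String ownerNode, String txId) {
            this.ownerNode = ownerNode;
            this.txId = txId;
        }
    }

    /**
     * Returns the branches this node should roll back under presumed abort:
     * those it created itself (owner matches) for which it holds no log record.
     * Branches owned by other nodes are never touched.
     */
    static List<Branch> findOrphans(String localNode, List<Branch> prepared, Set<String> loggedTxIds) {
        List<Branch> orphans = new ArrayList<>();
        for (Branch b : prepared) {
            if (b.ownerNode.equals(localNode) && !loggedTxIds.contains(b.txId)) {
                orphans.add(b);
            }
        }
        return orphans;
    }
}
```

The owner check is the part that depends on the node id being unique: if two nodes shared an id, each could roll back prepared branches that the other has logged.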
-
42. Re: Remote txinflow: XID changes
dmlloyd Oct 19, 2011 1:22 PM (in response to dmlloyd)
Ah I see I said it backwards, my bad.
-
43. Re: Remote txinflow: XID changes
tomjenkinson Oct 19, 2011 1:36 PM (in response to dmlloyd)
Ideally the transactions would be recovered by the root TM. But you can have failure scenarios which orphan a subordinate.
HOWTO orphan a subordinate
We only write out information about a transaction after the prepare phase.
Proxy cascades the prepare to the subordinate
--- Crash at root node ---
You now have an orphan!
-
44. Re: Remote txinflow: XID changes
dmlloyd Oct 19, 2011 4:50 PM (in response to tomjenkinson)
Okay so the point is to avoid recovering orphans until the parent comes back online, and then letting the parent do it? Because the hierarchical sequence thing works for that too (it's easy to tell when there's an orphan because every node knows exactly which subordinates it "owns").