transaction support in JBossAS 5.0 | JBoss.org Content Archive (Read Only)

15. Re: transaction support in JBossAS 5.0

marklittle Oct 9, 2006 11:29 AM (in response to jhalliday)

Yes, but in TP systems you never take a chance, no matter how small. Plus, I'm sure someone made a similar comment about date fields back in the 1960s, and look what happened come the year 2000! ;-)

16. Re: transaction support in JBossAS 5.0

marklittle Oct 9, 2006 11:39 AM (in response to jhalliday)

Oh, and I assume you have to durably record the mapping between the short-hand notation (long) and the real transaction id. Essentially keep a reference as to where your index (the long) has reached so that, upon failure and recovery, you can continue counting from where you left off (or you risk generating a new mapping using the same counter before half a million years is up ;-). This durable recording is going to impose some overhead: every time you increment the counter you have to record it to disk. Plus, it probably needs to be managed transactionally too. Or did I miss something?

17. Re: transaction support in JBossAS 5.0

reverbel Oct 9, 2006 5:14 PM (in response to jhalliday)

"reverbel" wrote:

60 seconds/min * 60 mins/hr * 24 hrs/day * 365 days/year = 31536000 seconds/year

2^64 transactions/wrap * 10^-6 seconds/transaction * 31536000 seconds/year = 584942.41 years/wrap

Oops, wrong math:

2^64 transactions/wrap * 10^-6 seconds/transaction * (1/31536000) years/second = 584942.41 years/wrap

Still, the result was correct.
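For anyone who wants to check the arithmetic, here is a minimal sketch of the corrected calculation. The class and method names are made up for illustration, and the one-microsecond-per-transaction rate (one million TPS) is the assumption used in the posts above.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Illustrative only: WrapTime and yearsToWrap are hypothetical names.
class WrapTime {
    // 60 s/min * 60 min/hr * 24 hr/day * 365 days/year
    static final long SECONDS_PER_YEAR = 60L * 60 * 24 * 365;

    /** Years until a counter of the given width wraps at the given rate. */
    static BigDecimal yearsToWrap(int bits, long transactionsPerSecond) {
        // 2^bits ids are available before the counter wraps.
        BigInteger ids = BigInteger.ONE.shiftLeft(bits);
        // ids / rate = seconds until wrap (kept at 10 decimal places).
        BigDecimal seconds = new BigDecimal(ids)
                .divide(BigDecimal.valueOf(transactionsPerSecond), 10, RoundingMode.HALF_UP);
        // seconds / (seconds per year) = years until wrap.
        return seconds.divide(BigDecimal.valueOf(SECONDS_PER_YEAR), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(yearsToWrap(64, 1_000_000L) + " years/wrap");
    }
}
```

BigInteger is used because 2^64 does not fit in a signed Java long. The result comes out at roughly 584,942 years, in line with the figure quoted above.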

Regards,

Francisco

18. Re: transaction support in JBossAS 5.0

reverbel Oct 9, 2006 5:23 PM (in response to jhalliday)

"mark.little@jboss.com" wrote:
Yes, but in TP systems you never take a chance, no matter how small. Plus, I'm sure someone made a similar comment about date fields back in the 1960s, and look what happened come the year 2000! ;-)

Now your argument sounds a bit like a rhetorical one, as the year 2000 issue was an entirely different problem! ;-)

Unless you use unbounded transaction ids you will always be taking a chance, which of course should be very, very, very small. Since the XA spec places an upper limit on the size of Xids, there is no way around this in the XA design space. It is really a matter of reaching a point that is safe enough.

Half a million years at one million TPS is certainly safe enough. Note that this would not be an upper limit on the entire lifetime of the TP system, but just on the time a transaction may remain in the active state or in a heuristically completed (and not yet forgotten) state. In other words, it is just an upper limit on the time any given transaction may exist in the current set of transaction log files. Half a million years is actually way over the top for that; a few months would probably be enough.

I understand "long transactions" are becoming increasingly important, but... Do you really expect them to be that long? ;-)

Regards,

Francisco

19. Re: transaction support in JBossAS 5.0

reverbel Oct 9, 2006 5:41 PM (in response to jhalliday)

"mark.little@jboss.com" wrote:
Oh, and I assume you have to durably record the mapping between the short-hand notation (long) and the real transaction id. Essentially keep a reference as to where your index (the long) has reached so that, upon failure and recovery, you can continue counting from where you left off (or you risk generating a new mapping using the same counter before half a million years is up ;-). This durable recording is going to impose some overhead: every time you increment the counter you have to record it to disk. Plus, it probably needs to be managed transactionally too. Or did I miss something?

Very good points. Generating local ids with a linear, one-dimensional transaction counter would have these problems indeed. Fortunately, a high/low approach works well here. Below I describe the one implemented by the old TM code in JBoss AS. (That part of the code is not actually old, BTW. It was written about one year ago).

Split the 64-bit counter into two parts: a high part with (say) 20 bits, and a low part with 44 bits. Whenever you need to generate a new local transaction id, you increment just the low part, without durably recording its value. Whenever the TM (re)starts, you increment the high part and durably record its new value. Thus the (durable) high part counts TM restarts and the (non-durable) low part counts transactions in a TM run. With this solution, the overhead per transaction is just the cost of an in-memory increment. The downside, of course, is a significant reduction in the maximum time any given transaction may exist in the current set of transaction log files, but half a million years was too much anyway. :-)
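A minimal sketch of the high/low scheme described above, assuming the 20/44 bit split from the post. This is not the actual JBoss AS TM code: the class name, the RestartCountStore interface, and the in-memory store are all hypothetical stand-ins, and a real store would have to force the restart count to disk before the constructor returns.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: not the JBoss AS implementation.
class HighLowIdGenerator {
    static final int LOW_BITS = 44;              // counts transactions in one TM run
    static final int HIGH_BITS = 64 - LOW_BITS;  // 20 bits: counts TM restarts
    static final long LOW_MASK = (1L << LOW_BITS) - 1;

    /** Hypothetical abstraction over durable storage of the restart count. */
    interface RestartCountStore {
        long load();
        void store(long value); // a real implementation must sync to disk here
    }

    /** Demo-only store that keeps the count in memory (NOT durable). */
    static class InMemoryStore implements RestartCountStore {
        private long value;
        public long load() { return value; }
        public void store(long value) { this.value = value; }
    }

    private final long highPart;                          // fixed for this TM run
    private final AtomicLong lowPart = new AtomicLong();  // never written to disk

    /** Called once per TM (re)start: the only durable write in the scheme. */
    HighLowIdGenerator(RestartCountStore store) {
        long next = store.load() + 1;
        if (next >= (1L << HIGH_BITS)) {
            throw new IllegalStateException("restart counter exhausted");
        }
        store.store(next);
        this.highPart = next;
    }

    /** Per-transaction cost is a single in-memory increment. */
    long nextId() {
        long low = lowPart.incrementAndGet();
        if (low > LOW_MASK) {
            throw new IllegalStateException("low part exhausted in this TM run");
        }
        return (highPart << LOW_BITS) | low;
    }

    public static void main(String[] args) {
        RestartCountStore store = new InMemoryStore();
        HighLowIdGenerator firstRun = new HighLowIdGenerator(store);
        System.out.println(Long.toHexString(firstRun.nextId()));
        // Simulate a TM restart: the high part changes, the low part starts over.
        HighLowIdGenerator secondRun = new HighLowIdGenerator(store);
        System.out.println(Long.toHexString(secondRun.nextId()));
    }
}
```

With 44 low bits and the one million TPS rate assumed earlier in the thread, a single TM run would last about 2^44 * 10^-6 seconds, or roughly 203 days, before the low part is exhausted.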

Regards,

Francisco

20. Re: transaction support in JBossAS 5.0

marklittle Oct 10, 2006 5:35 AM (in response to jhalliday)

"reverbel" wrote:
"mark.little@jboss.com" wrote:
Oh, and I assume you have to durably record the mapping between the short-hand notation (long) and the real transaction id. Essentially keep a reference as to where your index (the long) has reached so that, upon failure and recovery, you can continue counting from where you left off (or you risk generating a new mapping using the same counter before half a million years is up ;-). This durable recording is going to impose some overhead: every time you increment the counter you have to record it to disk. Plus, it probably needs to be managed transactionally too. Or did I miss something?

Very good points. Generating local ids with a linear, one-dimensional transaction counter would have these problems indeed. Fortunately, a high/low approach works well here. Below I describe the one implemented by the old TM code in JBoss AS. (That part of the code is not actually old, BTW. It was written about one year ago).

Split the 64-bit counter into two parts: a high part with (say) 20 bits, and a low part with 44 bits. Whenever you need to generate a new local transaction id, you increment just the low part, without durably recording its value. Whenever the TM (re)starts, you increment the high part and durably record its new value. Thus the (durable) high part counts TM restarts and the (non-durable) low part counts transactions in a TM run. With this solution, the overhead per transaction is just the cost of an in-memory increment. The downside, of course, is a significant reduction in the maximum time any given transaction may exist in the current set of transaction log files, but half a million years was too much anyway. :-)

Regards,

Francisco

Funnily enough we used to do something very similar in the Rajdoot RPC mechanism back in 1984 for orphan detection and elimination in the distributed environment.

21. Re: transaction support in JBossAS 5.0

marklittle Oct 10, 2006 5:42 AM (in response to jhalliday)

"reverbel" wrote:
"mark.little@jboss.com" wrote:
Yes, but in TP systems you never take a chance, no matter how small. Plus, I'm sure someone made a similar comment about date fields back in the 1960s, and look what happened come the year 2000! ;-)

Now your argument sounds a bit like a rhetorical one, as the year 2000 issue was an entirely different problem! ;-)

Yes and no.

Unless you use unbounded transaction ids you will always be taking a chance, which of course should be very, very, very small. Since the XA spec places an upper limit on the size of Xids, there is no way around this in the XA design space. It is really a matter of reaching a point that is safe enough.

This is interesting ;-) If we used 128-bit encoding then we could address every atom in the universe uniquely. Most schemes don't use that approach, XA being one of them. Most Uids are federated implementations, where uniqueness is guaranteed within the scope of a domain. The way XA was supposed to work was to use an approach similar to DNS, and approach a body like IANA where companies could get their own high-order bits for the Xid that would be unique to them. Then they only had to ensure that the low-order bits were unique within their own domain. Unfortunately that didn't really take off.

Half a million years at one million TPS is certainly safe enough. Note that this would not be an upper limit on the entire lifetime of the TP system, but just on the time a transaction may remain in the active state or in a heuristically completed (and not yet forgotten) state. In other words, it is just an upper limit on the time any given transaction may exist in the current set of transaction log files. Half a million years is actually way over the top for that; a few months would probably be enough.

I understand "long transactions" are becoming increasingly important, but... Do you really expect them to be that long? ;-)

Regards,

Francisco

Hey, we shouldn't rule anything out ;-)

However, although I understand why you're doing this reference approach, I'm not convinced it buys you much. The size of an IOR is so large anyway when compared to an Xid, that saving a few bytes is unlikely to make much of a difference on the number of blocks that get written in the log. It's the number of physical disk blocks and not the amount of information, that makes a difference.

22. Re: transaction support in JBossAS 5.0

reverbel Oct 13, 2006 7:12 PM (in response to jhalliday)

"mark.little@jboss.com" wrote:
However, although I understand why you're doing this reference approach, I'm not convinced it buys you much. The size of an IOR is so large anyway when compared to an Xid, that saving a few bytes is unlikely to make much of a difference on the number of blocks that get written in the log. It's the number of physical disk blocks and not the amount of information, that makes a difference.

The difference needs to be measured. An Xid can take up to 128 bytes, but in an IOR it will take up to 256 bytes, due to the encoding of bytes as pairs of ASCII characters that represent hex digits. This looks like a significant increase in the size of IORs, but it may or may not have a significant impact on the performance of marshalling, transaction context propagation, and logging tasks. The impact is more likely to be significant in the case of JBossRemoting, whose URIs are much smaller than IORs and WS-Addressing endpoint references. But it needs to be measured anyway.
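The doubling mentioned above can be illustrated with a short sketch. It assumes only what the post states: raw bytes are written as pairs of ASCII hex digits, and 128 bytes is the upper bound on Xid size. The class and method names are hypothetical and do not correspond to any ORB or JBoss API.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: shows why hex-encoding doubles the byte count.
class XidHexSize {
    static final int MAX_XID_BYTES = 128; // upper bound quoted in the post

    /** Encodes each byte as two lowercase hex digits. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16)); // high nibble
            sb.append(Character.forDigit(b & 0xF, 16));        // low nibble
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] xid = new byte[MAX_XID_BYTES]; // placeholder content (all zeros)
        byte[] encoded = toHex(xid).getBytes(StandardCharsets.US_ASCII);
        System.out.println(xid.length + " raw bytes -> " + encoded.length + " ASCII bytes");
    }
}
```

This only shows the size overhead of the encoding itself. Whether it matters for marshalling or logging performance still has to be measured, as the post says.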

It appears that we are in agreement that this is a reasonable approach, which has a conceptual advantage (it avoids nesting of globally unique identifiers), but whose practical benefits need to be validated by measurements.

Regards,

Francisco

23. Re: transaction support in JBossAS 5.0

marklittle Oct 14, 2006 4:36 AM (in response to jhalliday)

"reverbel" wrote:
"mark.little@jboss.com" wrote:
However, although I understand why you're doing this reference approach, I'm not convinced it buys you much. The size of an IOR is so large anyway when compared to an Xid, that saving a few bytes is unlikely to make much of a difference on the number of blocks that get written in the log. It's the number of physical disk blocks and not the amount of information, that makes a difference.

The difference needs to be measured. An Xid can take up to 128 bytes, but in an IOR it will take up to 256 bytes, due to the encoding of bytes as pairs of ASCII characters that represent hex digits. This looks like a significant increase in the size of IORs,

For small IORs, yes. However, it really depends on the ORB implementation. I've used pretty much every C++ and Java ORB that has been around since the mid-1990s and I've seen several (names withheld to protect the innocent!) where the IOR is in excess of 4K.

but it may or may not have a significant impact on the performance of marshalling, transaction context propagation, and logging tasks. The impact is more likely to be significant in the case of JBossRemoting, whose URIs are much smaller than IORs and WS-Addressing endpoint references. But it needs to be measured anyway.

Agreed. Ultimately this approach may be something that is best chosen at runtime (dynamically) by a deployer/sys admin when all of the related implications are known. Plus it may depend on what implementation of logging you're using. For example, if you use replicated NVRAM, then there's no notion of a physical block size and it may be that my previous comment doesn't apply.

It appears that we are in agreement that this is a reasonable approach, which has a conceptual advantage (it avoids nesting of globally unique identifiers), but whose practical benefits need to be validated by measurements.

Regards,

Francisco

Yes, now that I have all of the information ;-) I think it's actually quite an interesting research topic. If you need any help/input, let me know.