-
15. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
marklittle Oct 1, 2014 11:03 AM (in response to dmlloyd)Not sure if you're deliberately misunderstanding me or just stuck in some rut. I didn't say the identity of a thread impacted whether it could be interrupted - of course any thread can be interrupted. What I was suggesting was that we *could* try to limit our automatic interruption of certain threads WITHIN THE SCOPE OF A TRANSACTION BY THE REAPER to those marked in some way by the application developer, to force that developer to really think about whether they want to take the chance of data corruption, a driver being left in some inconsistent state, etc.
-
16. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
marklittle Oct 1, 2014 11:14 AM (in response to marklittle)I also strongly recommend that anyone who thinks arbitrarily interrupting application threads within the scope of a transaction (at least by the TM) is a good idea should go through the evolution of JBossTS/Narayana over the past 7 years and pay particular attention to the classes involved in the reaper, e.g., TransactionReaper.java. Much of that code is well documented and should help explain why we do what we do in the case of timeouts. The job of a transaction manager is to maintain consistency of state, and one that doesn't do this is not doing its job. We should always err on the side of safety and consistency. Of course what happens above/outside of the transaction manager is a very different story.
-
17. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
dmlloyd Oct 1, 2014 11:17 AM (in response to marklittle)My point is that since these failing resources are the outliers, they should be the exception, rather than the motivating factor for policy - for example we can easily add an "insulating" interrupt handler to the thread (for which a usable framework already exists) which protects the thread from interruption until after an operation is complete (rather the exact opposite of what I was proposing above, ironically). In other words, fix (guaranteed) the problem in a targeted manner, and only for the specific problematic drivers, rather than making a more blanket policy decision (which in this case can't really be universally enforced anyway since, as I said, we (and other frameworks and anyone else) can interrupt threads any time for any reason). This would mean that Narayana doesn't have to worry about it - it's the data source layer's job to perform this function.
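To make the "insulating" idea above concrete, here is a minimal sketch in plain Java. It is not the framework referred to in the post; the class and method names (InterruptInsulation, callInsulated) are hypothetical. The assumption is that the data source layer runs the fragile driver call on a helper thread that nobody else holds a reference to, while the calling thread waits and defers any interrupt it receives until the call has finished.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: shields one operation from interruption of the calling thread. */
final class InterruptInsulation {
    // Daemon workers so that this sketch never keeps the JVM alive.
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "insulated-worker");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the operation on a worker thread and waits for it to finish.
     * An interrupt aimed at the caller is remembered and re-asserted only
     * after the operation has completed, so the operation never sees it.
     */
    static <T> T callInsulated(Callable<T> operation) {
        Future<T> future = WORKERS.submit(operation);
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return future.get();
                } catch (InterruptedException e) {
                    interrupted = true; // defer: keep waiting for the operation
                } catch (ExecutionException e) {
                    throw new IllegalStateException("insulated operation failed", e.getCause());
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // deliver the deferred interrupt
            }
        }
    }
}
```

One caveat with this approach: thread-local state, including any transaction association, does not follow the call onto the worker thread, so a real data source layer would have to deal with that separately.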
-
18. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
dmlloyd Oct 1, 2014 12:33 PM (in response to dmlloyd)What if there was a configuration option in the transaction manager for enabling interruption on transaction timeout (and other similar situations, if any)? Then, WildFly could opt in to responsibility for ensuring correct behavior of interruption by enabling this option, since WildFly is uniquely in a position to provide such guarantees.
As an aside, if there are JDBC drivers that do indeed fail under interruption, then regardless of the outcome of this discussion, we need to proactively take measures in WildFly to ensure that they aren't interrupted. Do you have anywhere a list of what drivers/versions were observed to be affected, or is that information lost to history?
-
19. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
marklittle Oct 1, 2014 4:25 PM (in response to dmlloyd)These drivers are not the motivating factor for the policy. The policy exists because interrupting a thread/process at arbitrary points in the scope of a transaction is generally a bad/risky thing to do. Even ignoring transactions it's risky. Yes I know it's possible because the language allows it, but that's hardly justification for why it should be supported. The drivers we've encountered which behave badly during interrupt happen to prove the rule. I haven't heard anything that would convince me that changing this rule for the transaction manager makes sense. As I said earlier, something else (e.g., WF) may allow this to happen and that's fine because it's a decision the container is going to make because it may have more semantic knowledge about the thread or may have a more global view of what's going on. But in terms of the TM it continues to make sense that it does not interrupt anything during the transaction timeout.
-
20. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
marklittle Oct 1, 2014 4:26 PM (in response to dmlloyd)Tom's already mentioned how WF could do this without any impact on the TM: simply add a CheckedAction implementation which can interrupt threads when the transaction terminates.
As for which drivers have been problematic in the past, we'd have to go back through support logs but we do have that data.
-
21. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
dmlloyd Oct 1, 2014 4:55 PM (in response to marklittle)Great, thanks.
-
22. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
marklittle Oct 1, 2014 6:38 PM (in response to dmlloyd)With the CheckedAction you get to add a specific instance per transaction or, obviously, the same one for every transaction. It's meant to be stateless and when the transaction ends you'll see that all of the threads enrolled with the transaction are passed to the instance you provide. What you do with that information is up to you. This was intended to solve the more generic problem of what to do with multiple threads/processes in a distributed system (http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=666792&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D666792)
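As a rough illustration of the pattern described above, the following self-contained sketch uses a hypothetical stand-in interface (TransactionEndHook) rather than Narayana's actual CheckedAction class, whose real signature should be taken from the Narayana sources. It assumes the hook is handed the threads still associated with the transaction when it ends, and that the container chooses to interrupt them only when the transaction did not commit.

```java
import java.util.Collection;

/** Hypothetical stand-in for a CheckedAction-style hook; not Narayana's real API. */
interface TransactionEndHook {
    void onTransactionEnd(boolean committed, Collection<Thread> activeThreads);
}

/** Stateless, so one instance can be shared by every transaction. */
final class InterruptingHook implements TransactionEndHook {
    @Override
    public void onTransactionEnd(boolean committed, Collection<Thread> activeThreads) {
        if (committed) {
            return; // nothing to do on a normal commit
        }
        Thread self = Thread.currentThread();
        for (Thread t : activeThreads) {
            // Never interrupt the thread that is running the termination itself.
            if (t != self && t.isAlive()) {
                t.interrupt();
            }
        }
    }
}
```

Because the hook holds no state, whether it is registered once per transaction or shared across all of them makes no difference to its behaviour; what to do with the thread list remains a container-level decision.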
-
23. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
dmlloyd Oct 1, 2014 6:53 PM (in response to marklittle)Is it all threads presently enrolled, or all threads ever enrolled? The former would be preferable for this particular case.
-
24. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
dmlloyd Oct 1, 2014 7:01 PM (in response to dmlloyd)Reading the code, it says "active threads" which implies the former, to answer my own question.
-
25. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
tomjenkinson Oct 2, 2014 7:24 AM (in response to dmlloyd)Hi David,
David Lloyd wrote:
Reading the code, it says "active threads" which implies the former, to answer my own question.
As you say, it is all threads that have not had the transaction disassociated from them (TM::suspend()).
Regarding [WFLY-3922] Add a CheckedAction to each transaction upon creation/inflow to interrupt the transaction's associated threa…, I just wanted to check this is definitely something you want me to add to the subsystem for all transactions? I can port Comparing jbosstm:master...tomjenkinson:exampleOfCAF · tomjenkinson/narayana · GitHub into WFLY really easily and it should satisfy Ken's original request, but I just want to make sure you really want this before it gets merged. I do know that some drivers "wedge" when we call xa_rollback from a reaper thread, so we have additional reaper threads to mark those as zombies - i.e. interrupt is not a silver bullet for all drivers.
Please let me know whether you want me to PR WFLY with the CAF,
Tom
-
26. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
dmlloyd Oct 2, 2014 8:42 AM (in response to tomjenkinson)Yes, however I would consider [WFLY-3923] Option for suppressing thread interruption during JDBC driver operations - JBoss Issue Tracker to be a mandatory prerequisite. I'll update the JIRA accordingly.
-
27. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
marklittle Oct 2, 2014 11:01 AM (in response to dmlloyd)Yes. We don't track all threads ever enrolled because a thread could be involved in many transactions during its lifetime.
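The "presently enrolled" semantics discussed in the last few posts can be pictured with a small sketch. This is a hypothetical bookkeeping class (ActiveThreadRegistry), not Narayana's internal data structure: a thread is added when a transaction is associated with it and removed again when the transaction is disassociated (for example on suspend), so a snapshot only ever contains the currently active threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-transaction registry of currently associated threads. */
final class ActiveThreadRegistry {
    private final Set<Thread> active = ConcurrentHashMap.newKeySet();

    /** Called when the transaction is begun or resumed on a thread. */
    void associate(Thread t) {
        active.add(t);
    }

    /** Called when the transaction is suspended or completed on a thread. */
    void disassociate(Thread t) {
        active.remove(t);
    }

    /** Immutable copy of the threads associated right now. */
    Set<Thread> snapshot() {
        return Set.copyOf(active);
    }
}
```

Since a disassociated thread is dropped immediately, the same thread can take part in many transactions over its lifetime without any registry growing unboundedly.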
-
28. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
marklittle Oct 2, 2014 11:04 AM (in response to dmlloyd)3923 mentions a list of problem drivers. I assume someone is following up with support? We should also go through the TS team's archives and wiki. The info does exist - it may be spread around though. (Andrew Dinn may also recall.)
-
29. Re: Algorithm for catching and handling TransactionReaper timeouts in EJB methods
dmlloyd Oct 2, 2014 12:25 PM (in response to marklittle)Yeah I'm talking to support and chasing whatever leads I can find. I'll bug Andrew next.