1 2 3 4 5 6 Previous Next 75 Replies Latest reply on Feb 25, 2015 5:31 AM by mmusgrov Go to original post
      • 45. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
        smarlow

        Hi Tom,

         

        Sure, that can also be included (similar to #2) in the list of possible solutions.  I'll edit the above list and slip that in, so that we have a single list of possible solutions.

         

        Scott

        • 46. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
          marklittle

          Adding the new status value would really require changes in the OTS/JTS and JTA standards. It's possible to do this with JTA, though it will take time. It's going to be almost impossible to do this with JTS/OTS these days. This change would break portability and interoperability.

          • 47. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
            smarlow

            Instead of adding a new status value, how about a new method?  Something like boolean wasTimedOut()? 

            • 48. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
              tomjenkinson

              I am still not clear what knowing if the transaction has "timed out" would help you to do?

              • 49. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                marklittle

                A new method on what, precisely?

                • 50. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                  smarlow

                  How about the TransactionSynchronizationRegistry class? 

                  • 51. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                    smarlow

                    For considering a few different paths mentioned earlier, knowing if it is possible to have a wasRolledBack method, will help us with understanding which paths are impossible or just hard.

                    • 52. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                      marklittle

                      OK, that removes the OTS/JTS from the picture but does remain as a change to JTA. We could drive this within the standards group but we'd need a clearly articulated reason for its inclusion. In the meantime (the 2 years it's likely to take to get adopted) would we be willing to make such a change that precludes other implementations being added into WF/EAP?

                      • 53. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                        tomjenkinson

                        Hi Scott,

                         

                        I have prototyped the association handling code. As discussed it required a new SPI listener: https://github.com/tomjenkinson/jboss-transaction-spi/compare/threadDisassociationListener?expand=1

                         

                        Narayana needed to be modified to use it to call back on the listener from within our app server integration layer (if the JBoss Transaction SPI provided a TransactionManager wrapper as it does the ServerVMUserTransaction listener the mod wouldn't have required any changes in Narayana even as these could have been done there):

                        https://github.com/tomjenkinson/narayana/compare/threadDisassociationListener?expand=1

                         

                        I added a test to show how it can work: narayana/TimeoutTest.java at threadDisassociationListener · tomjenkinson/narayana · GitHub - it works through a few scenarios.

                         

                        Its not final as there are a couple of issues to resolve:

                        1. Its hard coded to local JTA only (i.e. not our JTS impl) so that would need a tweak

                        2. Communicating to Narayana which listeners to call when a thread disassociates is via a keyed entry into the TSR (could be made a list of course). This is probably not the best way to do it but I just wanted it to be clear how the two could communicate. It may be possible to intercept the register*Sync methods and check them for instanceof for example.

                         

                        This change does not require any modifications to OTS/JTS/JTA even and ensures that syncs are free to avoid concurrent access to other resources such as the EM. Therefore I think we should look at trying to work out if this approach addresses all your requirements?

                         

                        Tom

                        • 54. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                          smarlow

                          Mark Little wrote:

                           

                          OK, that removes the OTS/JTS from the picture but does remain as a change to JTA. We could drive this within the standards group but we'd need a clearly articulated reason for its inclusion. In the meantime (the 2 years it's likely to take to get adopted) would we be willing to make such a change that precludes other implementations being added into WF/EAP?

                          I meant to say wasTimedOut, instead of wasRolledBack.  That was not clearly articulated by me.  I think that we can get a coherent and clearly articulated description of this issue (and how wasTimedOut helps), it will just take some effort (and some pictures that illustrate some worse case scenarios).

                           

                          The other implementation, would either include the ability to handle wasTimedOut or not.  If not, perhaps WF/EAP could have a soft dependency on the wasTimedOut method, that assumes that no time out occurred. 

                           

                          I would like to bring this issue over to the JPA expert group and reopen, for the EE 8 targeted JPA spec (from yesterdays discussion, it will be a small JSR or MR).  I'm thinking that the question for the JPA expert group, will be how to better expose "tx timeout" to the JPA application.  I believe that the wasTimedOut method would be used in the JPA Synchronization.afterCompletion(int), so that steps can be taken to inform the application that the tx timedout.

                           

                          As a strong advocate of standards (especially EE), I will be disappointed if we cannot come up with a way for (JPA.nextVersion) 3rd party persistence providers to better handle the transaction timeout case (for the benefit of applications).

                          • 55. Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                            smarlow

                            Hi Tom,

                             

                            With this approach (#4 from the earlier summary), we might have some Synchronization callbacks running in the reaper thread (e.g. IronJacamar which would defer database connection closing to the TxAssociationListener based JPA callback).  The JPA callbacks (the ones that are modified to depend on a WildFly/someProject TxAssociationListener extension), may be invoked from the application thread if possible.  In the cases where the application thread doesn't call the TxAssociationListener callback, the top level thread in control (e.g EE container), would then need to call the TxAssociationListener callbacks.  Deferring the calling of the TxAssociationListener callbacks, could lead to memory issues or delay in releasing database connections back to the pool (especially if the application logic is looping and creating different UTs within one invocation but the application fails to properly end the TX or other application bugs like that). 

                             

                            Recently, we found that the persistence provider needs to release the database connection to the pool in the reaper thread, to avoid exhausting the connection pool for a specific use case where no further calls to the persistence context are made after tx timeout.  In theory, with approach #4, we hope not to see the same issue but from a timing perspective, I don't see how we can avoid it.  For example, what happens if the application thread and the reaper thread try to rollback the transaction around the same time.  Which ever thread gets the transaction level lock (assuming we get one), will do the actual rollback.  In this scenario, the application thread will have already called transaction rollback (even if the reaper thread gets the transaction level lock first).  We would need to ensure that the application thread does release the database connections back to the pool in a timely manner.

                             

                            Comparing this approach to #2 (WildFly wraps the TM/TSR), to have WildFly coordinate the running of all Synchronization callbacks.  #2 would at least be standards based and more inclusive of which Synchronizations to always run in the application thread.  Either approach sounds risky, but #2 is at least more complete (works with any persistence provider).  From a performance point of view, I'm not sure of the differences between the two approaches.

                             

                            Of the four approaches, I know that we identified a bug that could occur with approach #1 (set roll back only), which might be impossible to fix.  I don't agree that it is impossible to fix.  I think that we just need to decide between the different approaches here, none of which are perfect. 

                             

                            Approach #3 will need discussion on the JPA expert group, to see if we can agree how we would use the TransactionSynchronizationRegistry.wasTimedOut, to improve how applications deal with transactions that time out in the background.

                             

                            If we can eliminate approach #1 (configuration decides between set roll back only or immediate background roll back) from this discussion, because we know that I'm wrong and its impossible to do, that will leave fewer options to choose from.

                             

                            Scott

                            • 56. Re: Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                              tomjenkinson

                              Scott Marlow wrote:

                               

                              In the cases where the application thread doesn't call the TxAssociationListener callback, the top level thread in control (e.g EE container), would then need to call the TxAssociationListener callbacks.

                              Unless it is an error in the user logic I can't see a scenario where the application thread won't call one of commit/rollback/suspend on the BaseTransactionManagerDelegate?

                               

                              Scott Marlow wrote:

                               

                              Recently, we found that the persistence provider needs to release the database connection to the pool in the reaper thread, to avoid exhausting the connection pool for a specific use case where no further calls to the persistence context are made after tx timeout.  In theory, with approach #4, we hope not to see the same issue but from a timing perspective, I don't see how we can avoid it.  For example, what happens if the application thread and the reaper thread try to rollback the transaction around the same time.  Which ever thread gets the transaction level lock (assuming we get one), will do the actual rollback.  In this scenario, the application thread will have already called transaction rollback (even if the reaper thread gets the transaction level lock first).  We would need to ensure that the application thread does release the database connections back to the pool in a timely manner.

                              I don't think this situation can happen as the BaseTransactionManagerDelegate will always call the thread association listener. For example in rollback: https://github.com/tomjenkinson/narayana/blob/threadDisassociationListener/ArjunaJTS/integration/src/main/java/com/arjuna/ats/jbossatx/BaseTransactionManagerDelegate.java#L137

                              The reaper never calls that particular class (and thereby never invokes the thread association listener), it only invokes rollback on the internal representation of the coordinator. The application always accesses the internal representation via this delegate.

                              • 57. Re: Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                                marklittle

                                Sounds like a plan. Let us know if there's a need for the JTA group to look into this too.

                                • 58. Re: Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                                  marklittle

                                  Assuming the thread remains alive (the process doesn't crash), you're right: it doesn't make sense for the application thread to ignore a running transaction; it should either commit, abort or suspend it.

                                  • 59. Re: Re: Design Discussion: Changing the reaper to use setRollbackOnly() instead of rollback()
                                    smarlow
                                    Unless it is an error in the user logic I can't see a scenario where the application thread won't call one of commit/rollback/suspend on the BaseTransactionManagerDelegate?


                                    Yes, the application logic could have a bug in their user transaction handling logic, this is easy to get wrong.  For example, the application might notice that the transaction status is STATUS_ROLLEDBACK and leave the rolled back transaction associated with the thread (causing a leak of the database connection and persistence context I think).  This could be a very large application that seemed to work before with the bug but no longer functions as it used to.