10 Replies Latest reply on Mar 13, 2015 11:47 AM by tomjenkinson

    XTS behaviour when a server is suspending

    mmusgrov

      The WildFly team is adding a graceful shutdown feature: https://developer.jboss.org/wiki/WildflySuspendAndResumeakaGracefulShutdown. The purpose of this suspend and resume feature is to allow a server to be taken out of service gracefully, letting all current requests finish as normal whilst rejecting new incoming requests.

       

      Ideally we would allow web services requests that are associated with an existing XTS transaction to continue, but this is tricky to implement in a performant fashion. The proposal is therefore to reject such incoming XTS requests as well (probably with a code such as 503 Service Unavailable). I am raising this issue to give people a chance to discuss the ramifications of this new behaviour.

        • 1. Re: XTS behaviour when a server is suspending
          tomjenkinson

          Thanks for raising the discussion, Mike. I am not an XTS expert, but have you looked at the spec to verify that a 503 won't be mistaken for a rollback, for example?

          • 2. Re: XTS behaviour when a server is suspending
            marklittle

            I assume you'll also use the enable/disable feature that's already in the transaction system to prevent new transactions from being created but let existing ones complete?

             

            What are the issues specifically around XTS? You mention performance.

            • 3. Re: XTS behaviour when a server is suspending
              marklittle

              Agreed, we'd need to be sure that clients don't mistake what's going on. Any transactions in flight should be allowed to complete, and their status should still be available until they've ended. And it should be possible to differentiate a failed transaction from a transaction service that is available but not fielding any new transactions at the moment.

              • 4. Re: XTS behaviour when a server is suspending
                tomjenkinson

                Mark: I will let Mike tackle the performance question, but I expect it is the overhead of checking requests before dropping them, so it relates (as I understand it) to one possible implementation approach.

                 

                Mike: do you know whether it would be acceptable for WFLY to let XTS itself be responsible for dropping requests for new transactions, using the functionality Mark mentioned?

                • 5. Re: XTS behaviour when a server is suspending
                  mmusgrov

                  marklittle wrote: "I assume you'll also use the enable/disable feature that's already in the transaction system to prevent new transactions from being created but let existing ones complete? What are the issues specifically around XTS? You mention performance."

                  It is more than simply disabling new transactions. Assume servers A and B:

                   

                  A begins an XTS transaction, T1, and issues a web services call to B.

                  A is told to begin suspending.

                  B makes a web services call to A that includes the transaction context T1.

                   

                  The naive implementation of suspend will reject this last web services call from B to A.

                  A better approach (for XTS) is to allow calls from B to A if they include the context T1. But this would entail running the handler chain to determine whether one of the SOAP headers contains the context T1. This is what I meant by the performance overhead (since in the naive approach the request is rejected outright). Ideally, we would like to reject the last call and allow the application to handle the error.
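The trade-off between the two approaches can be sketched as a small model. This is only an illustration in Python, not WildFly, Undertow or XTS code: `SuspendGate`, `Request`, `admit` and the `tx_context` field are hypothetical names, and 503 is simply the status code proposed earlier in this thread.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

OK = 200
SERVICE_UNAVAILABLE = 503


@dataclass
class Request:
    # Hypothetical stand-in for an incoming SOAP request. tx_context models
    # the transaction context carried in a SOAP header (None if absent).
    tx_context: Optional[str] = None


@dataclass
class SuspendGate:
    # Hypothetical model of the server's admission decision while suspending.
    suspending: bool = False
    inspect_headers: bool = True  # False models the naive approach
    active_contexts: Set[str] = field(default_factory=set)

    def admit(self, request: Request) -> int:
        if not self.suspending:
            return OK
        if not self.inspect_headers:
            # Naive approach: reject outright, without looking at the request.
            return SERVICE_UNAVAILABLE
        # Context-aware approach: the request has to be partially processed
        # (the handler chain runs) before it can be admitted or rejected.
        if request.tx_context in self.active_contexts:
            return OK
        return SERVICE_UNAVAILABLE


gate = SuspendGate(active_contexts={"T1"})
gate.suspending = True           # A is told to begin suspending
print(gate.admit(Request("T1")))  # call from B carrying context T1
print(gate.admit(Request()))      # call with no transaction context
```

With `inspect_headers=False` the same call from B carrying T1 is rejected, which is the case where the application on B has to handle the error.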

                  • 6. Re: XTS behaviour when a server is suspending
                    mmusgrov

                    marklittle wrote: "Agreed, we'd need to be sure that clients don't mistake what's going on. Any transactions in flight should be allowed to complete, and their status should still be available until they've ended. And it should be possible to differentiate a failed transaction from a transaction service that is available but not fielding any new transactions at the moment."

                     

                    OK. In that case we cannot take the naive approach and simply reject incoming requests, since an incoming request is the only way to ask for the status or to trigger completion. I will continue the discussion with Stuart about how to allow incoming requests whose headers include an existing transaction context.

                    • 7. Re: XTS behaviour when a server is suspending
                      marklittle

                      OK, but my next question is: why is this specific to XTS? Surely it's the same problem with any distributed transaction implementation we support, e.g., JTS?

                      • 8. Re: XTS behaviour when a server is suspending
                        tomjenkinson

                        Mike will need to confirm, but my recollection of the WFLY feature is that the ORB is not going to have graceful shutdown semantics applied to it.

                         

                        The idea is that the front end HTTP server is the thing that will get the graceful shutdown applied and then the server will quiesce as the external load stops.

                        • 9. Re: XTS behaviour when a server is suspending
                          marklittle

                          Why don't we want consistency across distributed transactions? Telling someone we support graceful shutdown for transactions should not require them to figure out what kind of transactions they're using in order for them to then understand precisely what the definition of "graceful shutdown" actually means.

                          • 10. Re: XTS behaviour when a server is suspending
                            tomjenkinson

                            Mike has put this point to the WFLY team and shared the link to where we are discussing it. It sounds like the best approach is to enable the facility you mentioned, which prevents new transactions from being created across the board. For XTS specifically, Undertow can then continue to route all requests to XTS and we can be responsible for rejecting the new ones. The performance overhead is that requests that won't be served are still partially processed.

                             

                            On your point about consistency, the theoretical difference between XTS and CORBA is that the load on the app server from XTS requests is more likely to come from external clients, whereas load arriving via the ORB port is more likely to originate inside the enterprise and can be stopped manually. That said, I think allowing XTS to reject the new transaction requests is going to be best.
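To make the "reject new transactions, let existing ones complete" behaviour concrete, here is a minimal sketch in Python. It does not use the actual enable/disable facility in the transaction system; `TxService`, `TxServiceDisabled`, `Status` and `quiesced` are hypothetical names chosen for illustration. It also shows one way a refusal to begin could be kept distinct from a rolled-back transaction, as asked for earlier in the thread.

```python
from enum import Enum


class TxServiceDisabled(Exception):
    """begin() was refused because no new transactions are being accepted."""


class Status(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled_back"


class TxService:
    # Hypothetical model of a transaction service with an enable/disable switch.
    def __init__(self) -> None:
        self.enabled = True
        self._status = {}
        self._next_id = 1

    def begin(self) -> str:
        if not self.enabled:
            # A distinct error, so callers cannot confuse it with a rollback.
            raise TxServiceDisabled("not accepting new transactions")
        tx_id = f"T{self._next_id}"
        self._next_id += 1
        self._status[tx_id] = Status.ACTIVE
        return tx_id

    def commit(self, tx_id: str) -> None:
        # Completion of an existing transaction is allowed even when disabled.
        if self._status[tx_id] is not Status.ACTIVE:
            raise ValueError(f"{tx_id} is not active")
        self._status[tx_id] = Status.COMMITTED

    def status(self, tx_id: str) -> Status:
        # Status queries stay available while the service is disabled.
        return self._status[tx_id]

    def quiesced(self) -> bool:
        # True once disabled and no transaction is still in flight.
        return not self.enabled and all(
            s is not Status.ACTIVE for s in self._status.values()
        )
```

In this model a suspending server would set `enabled = False`, keep answering `status` and `commit` for transactions such as T1, and treat `quiesced()` returning true as the signal that suspension is complete.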