44 Replies. Latest reply on Sep 13, 2013 7:29 AM by Mark Little

    Transactional MSC

    Paul Robinson Master

      Introduction

      This discussion covers the "Transactional MSC" feature scheduled for a future release of WildFly. Although the feature is discussed in general, the focus of this document is to establish the requirements from Narayana and to discuss implementation details.

       

      Transactional MSC Requirements

       

      There should be a way to manipulate the container atomically in order to guarantee integrity of the container in both standalone and domain managed modes. The types of operations that must be transactional are:

      • Modification of configuration values
      • Lifecycle management of subsystems.

      Each operation has a runtime effect on the in-memory configuration database in the MSC, plus a durable update to a configuration file (e.g. "standalone*.xml").
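      As an illustrative sketch (all names here are hypothetical, not the actual MSC API), each such operation can be modelled as an apply/undo pair, with a failure at any point undoing everything applied so far:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical model: each management operation pairs its runtime effect
// (in-memory change plus durable config write) with a compensating undo.
interface ManagementOp {
    void apply() throws Exception;
    void undo();
}

final class AtomicUpdate {
    // Apply all operations; on any failure, undo the applied ones in
    // reverse order so the container is left in its original state.
    static void run(List<ManagementOp> ops) throws Exception {
        Deque<ManagementOp> applied = new ArrayDeque<>();
        try {
            for (ManagementOp op : ops) {
                op.apply();
                applied.push(op);
            }
        } catch (Exception e) {
            while (!applied.isEmpty()) {
                applied.pop().undo();
            }
            throw e;
        }
    }
}
```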

       

      Failures

      Failure to update configuration files (disk space, file system permissions) or modify configuration values must cause the operation to undo updates to all members of the domain.

       

      Failure to deploy subsystems in a server farm is occasionally expected, due to port conflicts and other environment issues. It should be possible to report these failures to the user. The user is then expected to provide a policy (potentially including manual intervention) that determines whether to complete the update based on the number of expected failures. For example, it may be acceptable to continue the deployment of an application where 30% of application servers have failed to deploy it.

       

      The ability to undo specific branches of the update should not be required, assuming the MSC can tolerate all classes of subsystem failure mentioned above.

      A note on isolation

      Elements of the operations are permitted to be visible to other updates before their encompassing transaction is completed. For example, reserving a port will require creating a server socket binding.
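      For instance (a hypothetical sketch, not WildFly code), a port reservation could bind the server socket as soon as the operation runs, making it observable to other updates before the transaction completes, while still being releasable if the transaction rolls back:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical sketch: the reservation takes effect immediately (no
// isolation), but rollback can still release the bound socket.
final class PortReservation {
    private final ServerSocket socket;

    PortReservation(int port) throws IOException {
        this.socket = new ServerSocket(port); // visible to other updates at once
    }

    int port() {
        return socket.getLocalPort();
    }

    void rollback() throws IOException {
        socket.close(); // compensate: release the reservation
    }
}
```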

      Transports

      The solution must run over JBoss Remoting. It is not possible to add further transport dependencies such as those required from JTS, RTS or WS-AT.

      Audit

      The auditing of MSC's transactions is out of scope for this discussion.

       

      Performance

      [Q] How much of an impact on performance can the solution impose? I appreciate this is probably hard/impossible to quantify.

      [Q] What number of servers, present in the domain, should the solution scale to, whilst still maintaining an acceptable level of performance? I appreciate this is probably hard/impossible to quantify.

       

      Transaction-management related requirements

      Tx-logs

      1. The user must not see MSC's transaction logs alongside theirs when doing administration. This needs to apply to all administration mechanisms available to the user now and in the future. In particular, when using the object-store browser or CLI, only the applications' transaction logs should be visible. The same applies to the medium used to store the logs (e.g. filesystem, HQ Journal, DB).

       

      Recovery

      1. Recovery of MSC's transactions must complete during some (currently unspecified) phase of the server's boot process. This is different from the applications' transactions, which will be recovered after the server boots.
      2. It should not be possible to initiate a recovery scan of MSC's transactions via a network request.

       

      Stats

      1. The aggregate transaction statistics presented to the user should be based solely on application transactions, not on any initiated by MSC.

       

      [Q] Do statistics need to be available (somehow) for MSC's transactions or can they simply not be gathered?

       

      Logging

      [Q] I think a separate logger should be used for transactions initiated by MSC. Otherwise, when the user increases the log verbosity, they will start to see logs for transactions they do not recognize. This may not be a problem, as this currently happens if you have more than one application deployed; it's not clear which transaction log lines belong to which application.

       

      Configuration

      1. MSC requires its own set of configuration values for the TM. This is to prevent the user from setting configuration values for their application(s) that are not compatible with the correct functionality of MSC. This applies to all those exposed via the ArjunaCore Environment Beans, not just those exposed via the WildFly management API.

       

       

      The solution (provided by WildFly)

       

      Prototype Solution

      A prototype of the proposed solution is available here: https://github.com/paulrobinson/txmsc-prototype

       

      The prototype shows how ArjunaCore can be extended to provide the functionality required for Transactional MSC. In particular it provides a Root Transaction that can enlist Subordinate Transactions that are located on a remote VM. The example also shows how crash recovery and orphan detection can be achieved. The readme.asciidoc file in the root of the project explains how to run the tests and the examples.

       

      prototyped-overlay.png

      The diagram above shows how the pieces of the prototype fit in with the architecture proposed by the MSC team. The red and yellow shapes represent the pieces in the prototype. Notice that only one level of subordination is demonstrated, but it should be straightforward to extend to the server level and to support multiple hosts.

       

      The ellipses represent the two transaction types. The RootTransaction runs with the Domain Coordinator, and the Subordinate Transaction runs on the Host Controller(s). The rounded boxes represent participants. There is one participant for the Config Service (used to update the Config Store). This resource is just mocked-up in the prototype. In the final implementation the resource will need to be a robust implementation for managing the transactional update to the Config Store. The "Subordinate Participant Stub" is another participant that is enlisted with the Root Transaction and makes remote calls to drive the Subordinate Transaction running in the Host Controller.
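      As a rough model (the interfaces below are illustrative, not the ArjunaCore API), the "Subordinate Participant Stub" is simply a participant whose two-phase callbacks forward each phase to the remote Subordinate Transaction over a channel:

```java
// Illustrative two-phase participant, as enlisted with the Root Transaction.
interface Participant {
    boolean prepare(); // phase 1: vote
    void commit();     // phase 2
    void rollback();
}

// Stands in for the remote call channel to a Host Controller.
interface SubordinateChannel {
    boolean remotePrepare(String txId);
    void remoteCommit(String txId);
    void remoteRollback(String txId);
}

// Enlisted with the Root Transaction; drives the remote Subordinate
// Transaction by forwarding each phase across the channel.
final class SubordinateParticipantStub implements Participant {
    private final SubordinateChannel channel;
    private final String txId;

    SubordinateParticipantStub(SubordinateChannel channel, String txId) {
        this.channel = channel;
        this.txId = txId;
    }

    @Override public boolean prepare() { return channel.remotePrepare(txId); }
    @Override public void commit()     { channel.remoteCommit(txId); }
    @Override public void rollback()   { channel.remoteRollback(txId); }
}
```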

       

      The prototype only mocks-up the remote calls. The MSC team shall be responsible for implementing the remoting requirement (paying particular attention to correct handling of any new transport protocol exceptions introduced). It should be possible to apply techniques similar to those used for distributed JTA over JBoss Remoting in the EJB container of the WildFly application server. The following fundamental differences should be considered:

      • The EJB container used an XAResource implementation rather than an AbstractRecord.
      • The EJB container made use of a subordinate transaction technique, which is out of scope for this solution.

       

      The object-store is not shown on the diagram. However, all participants and transactions are recoverable, and state is persisted to the object-store local to the VM in which they run.

       

      Recovery

      Recovery (not shown on the diagram yet) is driven from the Domain Controller. A recovery module (RootTransactionRecoveryModule.java) periodically scans for failed transactions. On finding one, the "Root Transaction", "Config Participant" and "Subordinate Transaction Stub" are restored from the log. When recreating the "Subordinate Transaction Stub", it connects to the associated Host Controller and requests that it restore the associated Subordinate Transaction from the log. The second phase of the transaction is then replayed.
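      Sketched in illustrative Java (not the prototype's actual classes), and assuming a presumed-abort protocol in which a root transaction only writes its log after a successful prepare, the recovery pass amounts to replaying the second phase for every root found in the log:

```java
import java.util.List;

// Hypothetical sketch of the recovery scan: any root transaction present in
// the log must have prepared, so its restored participants (the Config
// Participant and the Subordinate Transaction Stub) are told to commit.
final class RootTransactionRecovery {

    interface RecoverableParticipant {
        void commit();
    }

    record LoggedRoot(String txId, List<RecoverableParticipant> participants) {}

    static void recover(List<LoggedRoot> log) {
        for (LoggedRoot root : log) {
            for (RecoverableParticipant p : root.participants()) {
                p.commit(); // replay the second phase
            }
        }
    }
}
```

      Roots that crashed before logging never appear in this scan; their subordinates are handled by orphan detection.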

       

      As with the previously mentioned WildFly EJB container implementation, the resource manager (in this case the MSC container) is responsible for re-establishing remote connections to domain members required during recovery and for correctly converting any transport protocol exceptions to the exceptions ArjunaCore expects. Some work (such as reserving a socket binding) would be associated with an abstract record so that, if the transaction rolls back, these resources can be released. As mentioned previously, these operations have no isolation guarantees.

       

      Orphan Detection

      It is possible that a failure can occur at such a time that the Subordinate Transaction is logged without the associated Root Transaction being logged. This happens if the failure occurs after the Subordinate Transaction prepares (and thus writes its log), but before the Root Transaction prepares (and thus does not write its log). This causes an orphan transaction to be present on the Host Controller. As orphans relate to a Root Transaction that never committed, they can simply be rolled back. Detection of orphan Subordinate Transactions is driven by the Domain Controller. Once the recovery of known Root Transactions has completed successfully, it is known that any other Subordinate Transactions, started by that Domain Controller, can be rolled back. This is driven by the Domain Controller, which makes a remote call to the Host Controller requesting that it roll back all such transactions (not in-flight). Note it is important to distinguish between transactions that are in-flight for a failed Root Transaction and those in-flight for a new Root Transaction that was started since recovery completed. The prototype doesn't currently make this distinction.
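      The orphan rule can be stated compactly. The sketch below is hypothetical and includes the in-flight distinction the prototype does not yet make: a subordinate is an orphan only if its root is neither among the recovered roots nor among new roots started since recovery completed.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of orphan detection on the Domain Controller.
final class OrphanDetector {
    // subordinateRoots: root ids referenced by logged Subordinate Transactions
    // recoveredRoots:   roots successfully recovered from the root log
    // inFlightRoots:    new roots started since recovery completed
    static List<String> orphans(Set<String> subordinateRoots,
                                Set<String> recoveredRoots,
                                Set<String> inFlightRoots) {
        return subordinateRoots.stream()
                .filter(id -> !recoveredRoots.contains(id))
                .filter(id -> !inFlightRoots.contains(id))
                .toList(); // safe to roll back: their root never committed
    }
}
```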

       

      Sequence Diagram (NEEDS UPDATING)

       

      The following diagram shows the sequence of messages that would occur with this solution. In particular it is simplified to just two levels (IIRC there is a third level, but I don't recall the details). The first 'level' contains the root MSC coordinator (Actors: Client, MSC, AtomicAction, ServiceProxy, ConfigProxy) and the second 'level' contains all the servers in the domain requiring the update (Actors: Service, Config). The diagram also omits any interposition techniques, which could be used to reduce the number of messages sent between the MSC coordinator and each server.

      txmsc.png

       

       

      Changes required in Narayana

      Our current understanding is that all the changes in Narayana are focused around isolating MSC's usage of ArjunaCore from the applications' and users' usage. Therefore it should be possible to prototype transactional MSC with the latest Narayana 5.x release.

       

      Proposed short-term solution

      WildFly provides two instances of Narayana, each in its own classloader:

       

      Application Instance

      The first instance is provided for applications deployed to WildFly and appears to be identical to the current offering available in WildFly today (from the user and applications' point of view).

       

      MSC Instance

      The second instance of Narayana is provided in a different classloader and has the following attributes:

       

      1. It only has ArjunaCore available.
      2. It has its own instances of the ObjectStore, Recovery Manager and Transaction Reaper.
      3. It gathers its own statistics or no statistics.
      4. The recovery manager cannot be driven over the network.
      5. This instance cannot be accessed via WildFly management.
      6. Its statistics are not available when viewing the application-transactions' statistics.
      7. It uses a different logger category (or some other mechanism) to distinguish its log entries from the first Narayana instance's. This also allows different log-levels to be set.
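      Item 7 could be as simple as reserving a distinct logger category for the MSC instance. The sketch below uses java.util.logging purely for illustration, and the category names are hypothetical:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical categories: separate loggers let the user raise the
// verbosity of application transactions without seeing MSC's entries.
final class TxLoggers {
    static final Logger APP = Logger.getLogger("com.arjuna.app");
    static final Logger MSC = Logger.getLogger("org.wildfly.msc.tx");

    static void appOnlyVerbose() {
        APP.setLevel(Level.FINE);    // detailed application tx logging
        MSC.setLevel(Level.WARNING); // MSC transactions stay quiet
    }
}
```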

       

       

      Proposed long-term solution

      Update Narayana to support isolation of applications. Each application can have its own configuration and will see an isolated view of statistics. The current view is that this is possible to achieve but will be a lot of work. We plan to revisit this solution once we are further down the path with the Transactional MSC implementation.

       

      Appendix: WildFly Architecture Diagram

      Transactional WildFly - New Page.png

        • 1. Re: Transactional MSC
          Tom Jenkinson Master

          I think this:

          [Q] Does it matter if the object store contains transaction logs for both Transactional MSC and applications?

          Should be:

          [Q] Can the ObjectStore that the application user configures be used by the MSC?

          [Q] Must the transaction/recovery manager allow separate object stores to be configured?

          [Q] When using the CLI to browse the object store to resolve transactions manually, must the MSC and application txs both be visible?

          • 2. Re: Transactional MSC
            Paul Robinson Master

            Tom,

            Tom Jenkinson wrote:

             

            I think this:

            [Q] Does it matter if the object store contains transaction logs for both Transactional MSC and applications?

            Should be:

            [Q] Can the ObjectStore that the application user configures be used by the MSC?

            [Q] Must the transaction/recovery manager allow separate object stores to be configured?

            [Q] When using the CLI to browse the object store to resolve transactions manually, must the MSC and application txs both be visible?

             

            Done.

            • 3. Re: Transactional MSC
              David Lloyd Master

              OK I will attempt to answer as many questions as I can.

              [Q] Is it correct that the audit should contain the update (and its outcome), even if the transaction failed and also in the presence of a crash?

              Starting off with the tough ones I see.

               

              Currently our audit requirements are met by using syslog-style remote logging, which is done in a very ad-hoc manner (i.e. without sensitivity to crashes).  It will be difficult, regardless of the answer to this question, to both meet the remote log requirement as well as dealing with the possibility of crashing.  AFAIK there is no way to log to syslog transactionally.

               

              Ignoring that problem though, my feeling is that we are only required to Audit (with a capital A) changes that were successfully made, but we do want to at least locally log (in a human-readable fashion) failures as well.

              [Q] Can the ObjectStore that the application user configures be used by the MSC?

              I think not.  I think we will want to isolate the administration actions from the user's transactions as completely as possible.

              [Q] Must the transaction/recovery manager allow separate object stores to be configured?

              Only insofar as the previous requirement can be met.

              [Q] When using the CLI to browse the object store to resolve transactions manually, must the MSC and application transactions both be visible?

              Just application transactions.  The MSC/management transactions should be separately recoverable, as the management system cannot be initialized without its database.

              [Q] Should the transactions initiated by Transactional MSC be present in the user's view of the aggregate statistics gathered by Narayana?

              No.

              [Q] Does it matter if the recovery manager recovers both Transactional MSC's transactions and the applications' transactions?

              Yes (see above).

              [Q] Do MSC's transactions need to be recovered before the server finishes booting?

              Yes.

              [Q] Are there any configuration options in Narayana that cannot be shared between MSC and the applications? The issue here being that the user could make a configuration change that would be detrimental to the operation of MSC. Do we need to consider all those exposed via the ArjunaCore Environment Beans, or just those exposed via the WildFly management API?

              Yes.  The object store should be separate as mentioned above.  The recovery system should probably not be (directly) network-enabled for MSC.  Looking over the environment beans, I'm going to say "yes these need to be considered".  I see many parameters that seem likely to be able to cause trouble for the management system.

              • 4. Re: Transactional MSC
                Paul Robinson Master

                David,

                 

                Thanks for the update. Tom and I anticipated your answers in our meeting on Tuesday. I'll update the doc and add our thoughts on what the implications are.

                • 5. Re: Transactional MSC
                  Paul Robinson Master

                  David,

                  David Lloyd wrote:

                   

                  OK I will attempt to answer as many questions as I can.

                  [Q] Is it correct that the audit should contain the update (and its outcome), even if the transaction failed and also in the presence of a crash?

                  Starting off with the tough ones I see.

                   

                  Currently our audit requirements are met by using syslog-style remote logging, which is done in a very ad-hoc manner (i.e. without sensitivity to crashes).  It will be difficult, regardless of the answer to this question, to both meet the remote log requirement as well as dealing with the possibility of crashing.  AFAIK there is no way to log to syslog transactionally.

                   

                  Ignoring that problem though, my feeling is that we are only required to Audit (with a capital A) changes that were successfully made, but we do want to at least locally log (in a human-readable fashion) failures as well.

                   

                  Given the limitations of the syslog-style logging, is it sufficient to simply log successful operations immediately after they occur? This raises a number of possible issues:

                   

                  • There's a window between the transaction completing and the audit being written. A failure here would result in an un-audited successful action.
                  • Some transactions will be completed by the recovery manager. I don't think the audit would be written for these under the current solution.

                   

                  The problem with these two issues is that I don't think there is an easy way for the user to know that there are some potentially missing entries. We could solve this by always logging the intent of the transaction to the audit prior to beginning it. Then by taking all 'intent' entries without a corresponding 'outcome' entry, you get a list of items to investigate. With this approach you would also need to log failures, in order to ensure that the 'outcome' is always present.
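                  A minimal sketch of that intent/outcome pairing (the names are hypothetical, not a proposal for the real audit format):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: log an 'intent' entry before beginning the
// transaction and an 'outcome' entry (success or failure) afterwards;
// intents with no matching outcome are the entries to investigate.
final class AuditLog {
    record Entry(String txId, String kind) {}

    private final List<Entry> entries = new ArrayList<>();

    void intent(String txId)  { entries.add(new Entry(txId, "intent")); }
    void outcome(String txId) { entries.add(new Entry(txId, "outcome")); }

    // Intent entries with no corresponding outcome entry.
    List<String> unresolved() {
        Set<String> resolved = new HashSet<>();
        for (Entry e : entries) {
            if (e.kind().equals("outcome")) resolved.add(e.txId());
        }
        List<String> open = new ArrayList<>();
        for (Entry e : entries) {
            if (e.kind().equals("intent") && !resolved.contains(e.txId())) {
                open.add(e.txId());
            }
        }
        return open;
    }
}
```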

                   

                  How important is it that the audit be complete? If we can't provide strong enough guarantees, maybe we need to consider using a transactional audit?

                  • 6. Re: Transactional MSC
                    Paul Robinson Master

                    David,

                     

                    Did you have any input on this Q?

                    [Q] Does Narayana currently do everything needed to prototype this? I think it makes sense to think about what changes would be needed in Narayana now, but delay implementation of them until the prototype is sufficiently well underway. This would ensure that the changes are very likely needed before we spend development time on them.

                    • 7. Re: Transactional MSC
                      Paul Robinson Master

                      David,

                       

                      I've updated the requirements based on your feedback. There are still some outstanding questions and a few new ones. I'll start work on the "Changes required in Narayana" section early next week.

                       

                      Thanks,

                       

                      Paul.

                      • 8. Re: Transactional MSC
                        David Lloyd Master

                        Paul Robinson wrote:

                         

                        David,

                        David Lloyd wrote:

                         

                        OK I will attempt to answer as many questions as I can.

                        [Q] Is it correct that the audit should contain the update (and its outcome), even if the transaction failed and also in the presence of a crash?

                        Starting off with the tough ones I see.

                         

                        Currently our audit requirements are met by using syslog-style remote logging, which is done in a very ad-hoc manner (i.e. without sensitivity to crashes).  It will be difficult, regardless of the answer to this question, to both meet the remote log requirement as well as dealing with the possibility of crashing.  AFAIK there is no way to log to syslog transactionally.

                         

                        Ignoring that problem though, my feeling is that we are only required to Audit (with a capital A) changes that were successfully made, but we do want to at least locally log (in a human-readable fashion) failures as well.

                         

                        Given the limitations of the syslog-style logging, is it sufficient to simply log successful operations immediately after they occur? This raises a number of possible issues:

                         

                        • There's a window between the transaction completing and the audit being written. A failure here would result in an un-audited successful action.

                        Yeah it's a tradeoff between logging things before the transaction is committed, and potentially losing stuff.  And syslog itself is not exactly super-robust.  But I think that the limitations were known and accepted when this solution was designed.

                         

                        • Some transactions will be completed by the recovery manager. I don't think the audit would be written for these under the current solution.

                         

                        The problem with these two issues is that I don't think there is an easy way for the user to know that there are some potentially missing entries. We could solve this by always logging the intent of the transaction to the audit prior to beginning it. Then by taking all 'intent' entries without a corresponding 'outcome' entry, you get a list of items to investigate. With this approach you would also need to log failures, in order to ensure that the 'outcome' is always present.

                         

                        How important is it that the audit be complete? If we can't provide strong enough guarantees, maybe we need to consider using a transactional audit?

                        I think it's pretty important that it's complete, and we probably will want to look into a real transactional audit at some point (not today though as the current solution was deemed good enough by its implementers).

                        • 9. Re: Transactional MSC
                          David Lloyd Master

                          Paul Robinson wrote:

                           

                          David,

                           

                          Did you have any input on this Q?

                           

                          [Q] Does Narayana currently do everything needed to prototype this? I think it makes sense to think about what changes would be needed in Narayana now, but delay implementation of them until the prototype is sufficiently well underway. This would ensure that the changes are very likely needed before we spend development time on them.

                           

                          Well, honestly I've been hesitant to prototype until I know that we can support multiple Narayana instances, for very similar reasons.

                          • 10. Re: Transactional MSC
                            Paul Robinson Master

                            David Lloyd wrote:

                             

                            Paul Robinson wrote:

                             

                            David,

                            David Lloyd wrote:

                             

                            OK I will attempt to answer as many questions as I can.

                            [Q] Is it correct that the audit should contain the update (and its outcome), even if the transaction failed and also in the presence of a crash?

                            Starting off with the tough ones I see.

                             

                            Currently our audit requirements are met by using syslog-style remote logging, which is done in a very ad-hoc manner (i.e. without sensitivity to crashes).  It will be difficult, regardless of the answer to this question, to both meet the remote log requirement as well as dealing with the possibility of crashing.  AFAIK there is no way to log to syslog transactionally.

                             

                            Ignoring that problem though, my feeling is that we are only required to Audit (with a capital A) changes that were successfully made, but we do want to at least locally log (in a human-readable fashion) failures as well.

                             

                            Given the limitations of the syslog-style logging, is it sufficient to simply log successful operations immediately after they occur? This raises a number of possible issues:

                             

                            • There's a window between the transaction completing and the audit being written. A failure here would result in an un-audited successful action.

                            Yeah it's a tradeoff between logging things before the transaction is committed, and potentially losing stuff.  And syslog itself is not exactly super-robust.  But I think that the limitations were known and accepted when this solution was designed.

                             

                            • Some transactions will be completed by the recovery manager. I don't think the audit would be written for these under the current solution.

                             

                            The problem with these two issues is that I don't think there is an easy way for the user to know that there are some potentially missing entries. We could solve this by always logging the intent of the transaction to the audit prior to beginning it. Then by taking all 'intent' entries without a corresponding 'outcome' entry, you get a list of items to investigate. With this approach you would also need to log failures, in order to ensure that the 'outcome' is always present.

                             

                            How important is it that the audit be complete? If we can't provide strong enough guarantees, maybe we need to consider using a transactional audit?

                            I think it's pretty important that it's complete, and we probably will want to look into a real transactional audit at some point (not today though as the current solution was deemed good enough by its implementers).

                             

                            So it looks like we mark "Audit" as "out of scope" for this discussion. We can address this at some point in the future. Agreed?

                            • 11. Re: Transactional MSC
                              David Lloyd Master

                              Paul Robinson wrote:

                               

                              So it looks like we mark "Audit" as "out of scope" for this discussion. We can address this at some point in the future. Agreed?

                              Yes I think so.

                              • 12. Re: Transactional MSC
                                Paul Robinson Master

                                David Lloyd wrote:

                                 

                                Paul Robinson wrote:

                                 

                                David,

                                 

                                Did you have any input on this Q?

                                 

                                [Q] Does Narayana currently do everything needed to prototype this? I think it makes sense to think about what changes would be needed in Narayana now, but delay implementation of them until the prototype is sufficiently well underway. This would ensure that the changes are very likely needed before we spend development time on them.

                                 

                                Well, honestly I've been hesitant to prototype until I know that we can support multiple Narayana instances, for very similar reasons.

                                 

                                Ha ha, well it looks like we are deadlocked then. ;-)

                                 

                                How about both teams complete enough of the design and discussion work to be reasonably sure that our approaches are going to work? Then Narayana can commit to developing the features you require, and you could start developing with a single TM whilst you are waiting for the Narayana side of things to be completed.

                                • 13. Re: Transactional MSC
                                  Paul Robinson Master

                                  David,

                                   

                                  My current thoughts are that we can get classloading to do most of the heavy lifting here and prevent too much disruption of the Narayana code-base. Are you able to take a look at the "Overview of the solution" section and let me know if you think I am heading in the right direction? I suspect the main problem would be that we are simplifying the development work at the cost of performance and memory footprint, due to there being two TMs running.

                                   

                                  Thanks,

                                   

                                  Paul.

                                  • 14. Re: Transactional MSC
                                    David Lloyd Master

                                    Paul Robinson wrote:

                                     

                                    David,

                                     

                                    My current thoughts are that we can get classloading to do most of the heavy lifting here and prevent too much disruption of the Narayana code-base. Are you able to take a look at the "Overview of the solution" section and let me know if you think I am heading in the right direction? I suspect the main problem would be that we are simplifying the development work at the cost of performance and memory footprint, due to there being two TMs running.

                                     

                                    Thanks,

                                     

                                    Paul.

                                    The solution is not what I'd call "elegant".  It does introduce duplication, which is annoying (at best) for packagers and at least theoretically problematic if the two coordinators ever interact in any way, and doesn't scale well (especially if we want to start supporting multiple transaction management configurations).  It is what I would call a "short term hack": it'll solve our short-term problem well enough, but it is very unlikely that these efforts will be usefully reusable in any other context.

                                     

                                    Honestly the structure of the code is very archaic in this regard.  While I appreciate and respect the very conservative approach that the team has historically taken with things like this, I feel that modernizing the code base is still the best move because it addresses many likely use cases without a particularly high level of complexity, and also without wandering down any seldom-trod architectural paths (i.e. componentizing on a POJO basis is a very well-established, safe practice, whereas relying on global state has many known weaknesses relative to this).

                                     

                                    But, that is just my recommendation, and I recognize that my perspective is quite different from yours and that my recommendation goes beyond our actual concrete requirements, so take it as advice and not any kind of mandate.
