Design notes for connection / re-connection to Domain Controller when using --cached-dc

Version 4

    Issue analysis and design document for the feature request to support connection / reconnection to Domain Controller when using --cached-dc.

     

    Overview

     

    Currently, when a domain slave is started with the --cached-dc command line parameter and the Domain Controller (DC) is unavailable but a copy of the domain configuration is available locally, the slave will start but never reconnect to the DC. It is desirable to improve this functionality to allow reconnection once the DC becomes available again.

     

    There are currently some confusing interactions between --backup and --cached-dc, and this change should also clarify those.

     

    Background

     

    When a slave host controller is started, it normally registers with the DC and obtains the required domain configuration. In WF10 we enhanced this behavior to default to obtaining only the portions of the configuration needed for the host controller to function, ignoring any unused configuration (referred to below as 'ignore-unused-configuration'). This allows very large domains to distribute resources such as deployments more efficiently: if a deployment is not active on the slave host controller, it is not transferred, potentially enabling faster domain management.

     

    When --backup is provided, a value of ignore-unused-configuration that is explicitly configured in host.xml is used as-is. If there is no explicitly configured value in host.xml and --backup is used, the value is set to false (instead of the implicit default of true) and all portions of the domain configuration are transferred. This enables the slave to act as a standby DC: if the DC were to become unavailable (for example, through hardware failure), the slave can be promoted to master and restarted, becoming the DC. Without a complete copy of the domain configuration this would not be possible.
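
    The defaulting rule above can be sketched in a few lines of Java. This is only an illustration: the class IgnoreUnusedConfigResolver and its resolve method are hypothetical names, not part of the host controller code, and the Optional argument stands for a value that may or may not be configured in host.xml.

```java
import java.util.Optional;

/** Hypothetical helper illustrating the defaulting rule; not WildFly code. */
final class IgnoreUnusedConfigResolver {

    private IgnoreUnusedConfigResolver() {}

    /**
     * @param configured value explicitly set in host.xml, or empty if absent
     * @param backup     true when --backup was passed on the command line
     * @return true if unused domain configuration should be ignored
     */
    static boolean resolve(Optional<Boolean> configured, boolean backup) {
        // An explicit host.xml value always wins.
        // With no explicit value, --backup flips the default from true to false.
        return configured.orElse(!backup);
    }
}
```

    Under these assumptions, resolve(Optional.empty(), true) yields false, so a slave started with --backup and no explicit setting receives the full domain configuration.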

     

    Currently, when --cached-dc is used, a cached copy of the domain configuration is required for the host controller to boot, usually obtained by using --backup on a previous occasion. The current behavior makes no attempt to contact the DC after a successful boot if the DC was unavailable at boot time. This enhancement seeks to improve this behavior.

     

    Issue Metadata

     

    https://issues.jboss.org/browse/EAP7-496

    [WFCORE-316] Slave host does not register to DC when starting with --cached-dc and DC is available

    [WFCORE-317] Failed to start server when -backup -cached-dc are used together

    [WFCORE-324] Resolve startup dependency between master hostController and slave hostControllers.

     

    Requirements

     

    Hard Requirements

     

    These items must be satisfied in order to have a satisfactory feature.

     

    • When --cached-dc is provided the HC will make an initial attempt to contact the DC and obtain the current domain configuration.
      • If the DC is unavailable, then a locally cached copy of the last known domain configuration must be used.
      • If the DC is unavailable and the locally cached domain configuration does not exist or is invalid for any reason, the boot must fail.
    • If a cached boot is completed, the HC must periodically poll for the DC to become available.
      • Once the DC becomes available again, the slave HC must reconnect and obtain the current domain configuration.
      • The current configuration must be persisted to the cached copy of the domain configuration.
    • Any domain configuration changes performed during normal operation (DC available) must be persisted to the cached domain configuration.
      • If the DC becomes unavailable, and the slave HC is restarted, the slave must use the last obtained up-to-date domain configuration.
    • Interaction of --backup and --cached-dc
      • If --backup is used in addition to --cached-dc, then the entire domain configuration shall be persisted to the local domain configuration cache. If a value of ignore-unused-configuration {true | false} is present in host.xml on the slave, that value will be used when creating the backup. If there is no configured value, the value will be false when --backup is present and true otherwise.
      • --backup and --cached-dc will utilize the same local configuration cache.
    • The expected behavior of a HC reconnecting to a previously unavailable DC shall be the same as the current behavior when a slave HC falls out of contact with a DC (due to a network failure etc.), then reconnects and syncs the domain configuration. (Note: in some cases this means that if the configuration was changed on the DC while the slave HC was disconnected, the changes received by the slave HC would not become active until a restart / reload of the slave is performed.)
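
    The boot-time portion of the requirements above can be read as a small decision procedure. The following Java sketch is one possible reading, not the actual implementation: CachedDcBoot, its Result type, and the three callbacks (fetchFromDc, readCache, writeCache) are hypothetical stand-ins for the real DC connection and configuration persister, and the domain configuration is modeled as a plain String.

```java
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch of the --cached-dc boot decision; not WildFly code. */
final class CachedDcBoot {

    enum Source { DOMAIN_CONTROLLER, LOCAL_CACHE }

    /** Outcome of a boot attempt: where the configuration came from, and its content. */
    static final class Result {
        final Source source;
        final String config;

        Result(Source source, String config) {
            this.source = source;
            this.config = config;
        }

        /** A cached boot means the HC must keep polling for the DC. */
        boolean needsPolling() {
            return source == Source.LOCAL_CACHE;
        }
    }

    private CachedDcBoot() {}

    /**
     * @param fetchFromDc returns the current domain config, or empty if the DC is unreachable
     * @param readCache   returns the cached config, or empty if it is missing or invalid
     * @param writeCache  persists a freshly obtained config to the local cache
     */
    static Result boot(Supplier<Optional<String>> fetchFromDc,
                       Supplier<Optional<String>> readCache,
                       Consumer<String> writeCache) {
        // Always make an initial attempt to contact the DC.
        Optional<String> remote = fetchFromDc.get();
        if (remote.isPresent()) {
            // Keep the cache up to date so a later cached boot uses the latest config.
            writeCache.accept(remote.get());
            return new Result(Source.DOMAIN_CONTROLLER, remote.get());
        }
        // DC unavailable: fall back to the cache, or fail the boot if there is none.
        String cached = readCache.get().orElseThrow(() ->
                new IllegalStateException("DC unavailable and no valid cached domain configuration"));
        return new Result(Source.LOCAL_CACHE, cached);
    }
}
```

    In this reading, a Result whose needsPolling() is true is the signal to start polling for the DC; reconnection and the subsequent sync would then follow the existing reconnect path.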

     

    Nice-to-have Requirements

    N/A

     

    Non-Requirements

    N/A

     

    Design Details


    • Changes to introduce a polling connection to the DC in the case of initial connection failure with --cached-dc.
    • Allow the existing --cached-dc persister to update the cached domain config.
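
    A minimal sketch of the polling idea in the first bullet, assuming a fixed polling interval (this document does not specify one) and a hypothetical tryReconnect callback that registers with the DC, syncs the domain configuration, and persists it to the cache. DcReconnectPoller is an illustrative name only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a DC reconnect poller; not WildFly code. */
final class DcReconnectPoller {

    // Single daemon thread so the poller never keeps the JVM alive on its own.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "dc-reconnect-poller");
                t.setDaemon(true);
                return t;
            });
    private final CountDownLatch connected = new CountDownLatch(1);

    /**
     * Starts polling. tryReconnect should return true once the HC has
     * re-registered with the DC and synced the domain configuration.
     */
    void start(BooleanSupplier tryReconnect, long interval, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            boolean ok;
            try {
                ok = tryReconnect.getAsBoolean();
            } catch (RuntimeException e) {
                // Treat any failure as "DC still unavailable"; an uncaught
                // exception would silently cancel the periodic task.
                ok = false;
            }
            if (ok) {
                connected.countDown();
                scheduler.shutdown(); // stop polling once reconnected
            }
        }, interval, interval, unit);
    }

    /** Waits for a successful reconnect; returns false on timeout or interrupt. */
    boolean awaitConnected(long timeout, TimeUnit unit) {
        try {
            return connected.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

    A fixed delay between attempts is used here for simplicity; a real implementation might prefer a configurable or backing-off interval, which is a design decision this document leaves open.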