Version 6

    Issue analysis and design document for the feature request to make the Domain Controller more highly available (HA).

     

    Overview

     

    The intent of this feature is to make the management of a WildFly Core-based managed domain more resistant to disruption caused by the failure of the master Host Controller (aka the Domain Controller). The goal is to make it possible for another, appropriately configured Host Controller to detect the failure of the master and automatically promote itself to master, with other Host Controllers being able to discover the new master and automatically connect to it.

     

    Background

     

    A WildFly Domain Controller is currently a single point of failure. If the DC fails or is shut down, then in order to restore centralized management the domain wide configuration (domain.xml) needs to be available on another WildFly installation, and a Host Controller running from that installation must be configured to run as the master. The slaves in the domain then need to know how to connect to that master.

     

    There are a number of aspects of WildFly's behavior that help mitigate the issues caused by the above:

    • A running slave HC does not need a connection to a DC for it or its servers to function properly.
    • Management clients (but not the web console) can connect directly to slave HCs, making it possible to manage them without going through the DC. However, no writes to resources that are part of the domain wide configuration are allowed when connected to a slave.
    • Even new servers can be configured and launched on the slave, provided the slave HC has in local memory all required parts of the domain wide configuration. (See https://docs.jboss.org/author/display/WFLY10/Admin+Guide#AdminGuide-Ignoringdomainwideresources for more on why a slave might not have some parts of the domain wide configuration in local memory.)
    • A slave can be launched with the --backup command line argument, causing it to persist a local copy of the domain wide configuration to xml. This is a mechanism for ensuring up-to-date copies of the configuration exist off the DC host, and it also makes it fairly straightforward to manually reconfigure a slave to act as master.
    • Slaves can be configured with multiple possible options to discover the master, and can iterate through those options until they find a master (see the example below this list). This is the "other Host Controllers being able to discover the new master and automatically connect to it" part of the goal here. That part has existed since WildFly 8; this doc simply records the requirements around it in the same place as the related requirements around DC auto-promotion, since the two are aspects of the same overall goal.
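
    For illustration, the existing discovery configuration on a slave might look roughly like the following in its host.xml (WildFly 8+ syntax; the names, hosts and ports are placeholders). The slave iterates through the static-discovery entries in order until one of them yields a master it can register with:

        <domain-controller>
            <remote security-realm="ManagementRealm">
                <discovery-options>
                    <static-discovery name="primary" protocol="remote" host="dc1.example.com" port="9999"/>
                    <static-discovery name="secondary" protocol="remote" host="dc2.example.com" port="9999"/>
                </discovery-options>
            </remote>
        </domain-controller>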

     

    Still, despite the above mitigating aspects, it is desirable to make this more automatic.

     

    Terms

     

    Some terms used in this document; these may or may not ever become useful outside this doc:

     

    • DCC -- Domain Controller Candidate -- a Host Controller that has been configured to possibly become the master. The master is itself a DCC. (Otherwise something impossible occurred!)

    Issue Metadata

     

    EAP ISSUE: https://issues.jboss.org/browse/EAP7-99

     

    RELATED ISSUES:

    [WFCORE-338] Auto-promotion of slave HC to master

    [WFLY-424] DomainController discovery system

     

    DEV CONTACTS: Brian Stansberry, Paul Ferraro, Ken Wills

     

    QE CONTACTS: Martin Simka

     

    AFFECTED PROJECTS OR COMPONENTS: WildFly Core kernel

     

    OTHER INTERESTED PARTIES:

     

     

    Requirements

     

    Hard Requirements

    • The software must enforce that only HCs running the most up-to-date version in the domain can be a DCC.
      • We'll need to work through patching/upgrade scenarios, i.e. how patched (and hence newer) DCCs can integrate with the domain.
    • The software must enforce that only HCs that are not ignoring any part of the domain wide configuration can become the master.
    • Understandable behavior in split brain situations is a requirement.
      • A configurable quorum policy is required, and if no quorum exists overall, no HC can become master.
        • Since an effective quorum policy typically requires at least 3 voters to be robust, the minimum number of DCCs in the environment should be 3. A simple master + single backup setup is not sufficient: with only 2 DCCs a majority is still 2, so the loss of either one would prevent any election.
      • In a split brain situation, if hosts find themselves in a minority partition, none can become master.
      • Here too we'll need to evaluate patching scenarios, where DCCs are being upgraded. How does this affect the quorum?
    • A DCC can only be elected master if it was connected to the master as a DCC the last time the domain wide configuration was updated, or if it successfully connected to a master after that point and received the full current domain wide configuration, including all deployments and other items in the content repository. This ensures that only DCCs with the current correct configuration can be elected.
      • The record of when changes were made, and of which DCCs were connected at that time, must be persistent. In the following scenario, an HC with the current configuration must be elected:
        • All DCCs are taken offline, all with the current configuration persisted. These DCCs are referred to as set {A}
        • A set of DCCs are started, consisting of two subsets
          • a number of DCCs from {A}
          • a number of DCCs that were not running when the {A} set was stopped and that do not have the current configuration
      • However, users can configure as a DCC an HC that does not have the current configuration and all deployments. Such an HC can join the domain and function as a normal DCC once it has synced with the domain.
    • An HC must be specifically configured to act as a DCC. Simply meeting the eligibility requirements listed above does not make an HC a DCC; the user must specifically configure this.
    • If the CLI is connected to a given URL and that connection closes, the CLI should reconnect to that URL once a process is again listening at it. The reconnect is triggered when the user next takes some action that requires the CLI to contact the server side.
      • This is actually the CLI's behavior since AS 7.0. It's just recorded here for conceptual completeness.
    • It shall be possible for a user to configure a DCC such that it will not open any HTTP management interface unless it has been elected as the master. Having only the master available via HTTP makes it possible to have an HTTP reverse proxy in front of the domain, providing a consistent address/port to management clients as the actual DC changes behind the scenes. The proxy knows how to reach all of the DCCs, but since only one DCC will actually have a listening socket open, all requests will go to the master.
      • If this configuration option is used, it is the user's responsibility to configure a native management interface (i.e. remoting protocol "remote://", standard port 9999, non-HTTP Upgrade) for the DCC, which can be used for intra-domain communication and for direct management client connections to that host for host-specific administration. (See the example after this list.)
      • Possibly a configuration like this could result in the socket being open on non-masters, with all but the master responding to requests with a 503. This can be considered if it provides better behavior during failover: while the master is down only DCCs responding with 503s are available, so the client gets a 503, which is accurate; as soon as a master is elected the proxy knows and the client begins to get normal responses.
    • The existing mechanism for configuring an HC to function as the master should still work, in order to provide scripting compatibility.
      • The configuration needed to specify that an HC is a DCC will almost certainly be different from the existing one. If both mechanisms are used in the same config, the software should detect that and fail to boot the HC.
      • If the user installs a set of DCCs in their environment using the new configuration and then adds a DC with an old-style configuration, that old-style DC should not function in any way as a DCC or HC in the domain formed by the DCC set. Doing this is equivalent to setting up 2 domains.
        • If after doing this the user configures the DC discovery on some slaves to point to both new set DCCs and the old style master, that is a user error but it is not the responsibility of the software to detect or correct this error.
    • The existing mechanism of using --backup on a slave HC in order to keep a copy of the domain config will still work, as will using --cached-dc to start the slave from that backup copy (see the examples after this list).
      • However, if a slave uses --cached-dc it will still attempt to discover and connect to a master, and if a master is present the slave will function as a normal slave, not using the cached domain config. This is different behavior from previous versions.
      • If a slave has kept a backup domain config it will still be possible to manually promote that slave to act as master, using the old-style configuration for declaring a host as master.
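
    Relating to the HTTP-interface requirement above, a sketch of what a DCC's native-only management interface configuration might look like in its host.xml follows; the realm name and port shown are the usual defaults and should be treated as placeholders:

        <management-interfaces>
            <native-interface security-realm="ManagementRealm">
                <socket interface="management" port="${jboss.management.native.port:9999}"/>
            </native-interface>
        </management-interfaces>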
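
    For reference, the existing mechanisms referred to in the final requirements above look roughly like this today (installation layout, file names and host names are placeholders):

        # Slave HC keeping a local backup copy of the domain wide configuration
        $ bin/domain.sh --host-config=host-slave.xml --backup

        # Starting that slave later from the cached copy
        $ bin/domain.sh --host-config=host-slave.xml --cached-dc

        # Existing way to declare a host as the master, in its host.xml
        <domain-controller>
            <local/>
        </domain-controller>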

     

    Nice-to-Have Requirements

     

    • Ability to configure a DCC such that it must either be elected master in the first election after it starts, or it must fail to start.
      • If a DCC so configured is present during the initial election of a master (i.e. the first time a quorum has been reached since the number of active hosts reached zero), then it must be elected master, provided it meets all requirements for being a master other than being known to have the latest configuration
        • If more than one such DCC is present, election should fail.
      • Any such configuration is only valid for that single initial election. Once the master has been duly elected it ceases to matter in any future election.
        • Handling of :shutdown(restart=true) needs to account for this fact; i.e. if the configuration involved is a command line parameter, that parameter should only be relevant for one launch of the JVM. (See the hypothetical sketch after this list.)
      • The purpose of this is to allow the user to force election of a particular master, e.g. because they have done offline modifications to the domain wide configuration present on that host and wish to push that configuration into the domain.
    • Allowing a DCC to directly manage servers. We actually expect this to be supported, but we are listing it as a nice-to-have at this point in case some sort of unexpected problem surfaces preventing support for this.
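
    As a purely hypothetical sketch of the restart concern above (the --require-initial-election flag and the host-dcc.xml file name are invented for illustration and are not existing options; the real mechanism is yet to be designed):

        # Hypothetical: force this DCC to win the initial election or fail to start
        $ bin/domain.sh --host-config=host-dcc.xml --require-initial-election

        # If the host is later restarted via the management API, e.g.
        #   /host=dcc-one:shutdown(restart=true)
        # the hypothetical flag must not carry over to the relaunched process.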

     

    Non-Requirements

     

    • Discovery by clients of the master. Clients are given by their user the protocol, address and port to use to connect, and it is the user's responsibility to provide the correct data. If the correct data changes it is the user's responsibility to provide new information and establish a new connection.
      • However, the standard protocols for management communications are HTTP based (i.e. they are either standard HTTP/HTTPS or JBoss Remoting protocols that use HTTP Upgrade). So if the user has an HTTP reverse proxy that works properly with HTTP Upgrade connections, the address/port of the proxy can become a reliable well-known endpoint (see the example after this list).
      • Providing such an HTTP reverse proxy is not a requirement for this feature; it is a task for the user.
    • Storage of topology information in the CLI and web console such that upon DC failure they begin trying to communicate with the other DCCs. Again, an HTTP reverse proxy is the client failover mechanism.
      • As a logical follow on to this point, transparent failover of in-process requests when the DC fails or is shut down is not a requirement.
    • Detection that the domain.xml present on the disk of a newly started DCC differs from what was most recently persisted by the WildFly management layer when the DCC was connected to the domain. If users update the domain.xml offline and wish that updated config to be used, it is their responsibility to understand how DC election works and to ensure that the desired domain.xml is present on the host that will be elected.
    • Re-election of the same master at the end of a reload of that master.
    • Dynamic registration by DCCs with an HTTP reverse proxy that is running mod_cluster, saving the user the need to configure the reverse proxy with information on how to connect to the DCCs. Configuring the reverse proxy to know how to reach all DCCs is a task for the user.
      • Future support for mod_cluster integration should be investigated though.
    • Software checking for presence of all necessary static modules before electing a DCC as the master. It is the user's responsibility to provision all needed modules on each host configured as a DCC.
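
    To illustrate the well-known endpoint idea from the first non-requirement above, a management client such as the CLI is simply pointed at the proxy's address; the host name and port are placeholders for whatever the user exposes on the proxy:

        $ bin/jboss-cli.sh --connect --controller=dc-proxy.example.com:9990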

     

     

    Design Details

     

    Design information should be added to this section once the analysis phase is complete.

     

    Some random thoughts.

     

    • Use Raft for tracking what the topology was at the point of each domain wide config change.

     

    Decomposition

     

    Ideally this work can be decomposed somewhat so various people can take on different pieces and we can make more effective use of overall resources.

     

    Possible pieces:

     

    • Discovery of DCCs by other DCCs (we already have discovery of the DC by ordinary slaves)
    • Group communication amongst the possible masters.
    • Leader election
    • DCC behavior as state transitions occur.
      • During boot
      • Post-boot
    • Notification of slaves of the current master
      • Slaves are configured to discover the master, but in the basic case this is done by trying a static list of DCCs. So if a slave tries DCC A, and DCC A knows the master is DCC B, it would be nice if DCC A told the slave that, rather than just rejecting the registration and making the slave keep searching.