Taking advantage of a git backend for WildFly configuration files

Version 2

    This article aims to synthesise the various discussions and exchanges around WFCORE-433 and how we could make WildFly more cloud-friendly.

    Adding a git backend to manage the history of WildFly configuration files would bring several advantages:

    - the ability to get/share configuration files across multiple instances

    - promoting the good sysadmin practice of using a VCS to manage configuration files

    - some cloud infrastructure tools (like Kubernetes) understand git natively

     

    The first use-case is the ability to get configuration files from a git repository, start a WildFly instance from them, commit configuration changes, and eventually push those changes back to the original repository.
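As a rough illustration, the whole loop can be sketched with plain git commands, using a local bare repository to stand in for the remote (paths, file names, and the configuration content are placeholders, not actual WildFly defaults):

```shell
# Local bare repository standing in for the remote configuration repository.
workdir=$(mktemp -d)
git init -q --bare "$workdir/config-repo.git"

# 1. Get the configuration files from the repository.
git clone -q "$workdir/config-repo.git" "$workdir/configuration" 2>/dev/null
cd "$workdir/configuration"
git config user.name wildfly && git config user.email wildfly@localhost

# 2. The instance starts from this directory; a management operation
#    changes the model and the file is saved.
echo '<server/>' > standalone.xml

# 3. Commit the configuration change locally...
git add standalone.xml
git commit -qm "Configuration change"

# 4. ...and eventually push it back to the original repository.
git push -q origin HEAD
```

With WFCORE-433 these steps would be driven by the server itself rather than by an administrator running git by hand.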

    This "simple" use-case brings some questions with it:

    1. How do we handle the authentication required to pull/push changes?
    2. Since conflicts may arise, how do we manage them?
    3. When do we commit? When do we push?
    4. How do we keep a running instance in sync with a remote repository?

     

    1. How do we handle the authentication required to pull/push changes?

     

    Currently we don't have anything to pass the authentication identities. Since we could use SSH keys, a user/password pair, or an OAuth token, we need something like the vault mechanism to protect those values. We might use Secrets in a Kubernetes environment, but we would still need to be able to encrypt and decrypt them.
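For illustration, the standard git mechanisms these identities would feed into look like this (the key path is a placeholder; note that the stock `store` credential helper writes secrets in clear text, which is exactly why a vault-like mechanism is needed):

```shell
authdir=$(mktemp -d)
git init -q "$authdir/repo" && cd "$authdir/repo"

# HTTPS with user/password or an OAuth token goes through a credential
# helper. The built-in 'store' helper saves credentials unencrypted,
# hence the need for a vault mechanism (or decrypted Kubernetes Secrets)
# to feed it safely.
git config credential.helper store

# SSH with a deploy key (the key path is a placeholder).
export GIT_SSH_COMMAND='ssh -i /path/to/deploy_key -o IdentitiesOnly=yes'
```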

     

    2. Managing conflicts

     

    Conflicts may arise, so there are two strategies we can choose from:

    - avoid conflicts: we may choose to have only one instance that can push to the git repository, all the others being read-only.

    We could add a leader-elector container to each pod; once a leader is elected, a label would be added to its pod matching the selector of a management service, so that management requests are redirected to that instance. The only missing part is a way to notify the WildFly instance when it becomes the new leader, so it can restart with the updated configuration and become read-write.

    Another way would be to use a 'token' to elect a git repository leader using JGit Ketch (Re: [jgit-dev] Ketch: multi-master replicated Git, Gerrit Code Review), so we can ensure that all the local repositories stay in sync. But this would require some mechanism to load the updated configuration, otherwise we would just overwrite the changes (which leads to question 4).

    - manage conflicts locally: we could add a set of commands to manage conflicts (to select which version to keep, theirs/ours, and maybe display the diff)

     

    N.B.: To facilitate all this, pulls should be done with the --rebase option.
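A sketch of what such conflict-management commands would wrap, on a throwaway repository with a deliberately conflicting change (file and branch names are illustrative):

```shell
confdir=$(mktemp -d)
git init -q "$confdir/repo" && cd "$confdir/repo"
git config user.name demo && git config user.email demo@localhost

echo 'port=8080' > standalone.properties
git add standalone.properties && git commit -qm base

# A "remote" change and a divergent local change to the same line.
git checkout -qb remote-change
echo 'port=9090' > standalone.properties && git commit -qam remote
git checkout -q -
echo 'port=9990' > standalone.properties && git commit -qam local

# Replaying the local change on top of the remote one (what a pull
# --rebase would do) stops on the conflict.
git rebase remote-change || true
git diff                                  # display the conflicting hunks

# Beware: during a rebase, --ours is the remote side being rebased onto,
# and --theirs is the local commit being replayed. Here we keep local:
git checkout --theirs standalone.properties
git add standalone.properties
GIT_EDITOR=true git rebase --continue
```

A "keep the remote version" command would do the same with `--ours` instead.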

     

    3. When do we commit? When do we push?

     

    Currently the configuration files are saved when the model is updated, and we can take "snapshots" of the configuration. We may create a new commit at each of these "saves", or maybe check whether the files have actually changed before creating the commit.
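The "only commit when something actually changed" variant can be sketched with `git status --porcelain` (the directory layout and commit message format are illustrative):

```shell
savedir=$(mktemp -d)
git init -q "$savedir/cfg" && cd "$savedir/cfg"
git config user.name demo && git config user.email demo@localhost

# The model is persisted to disk...
echo '<server/>' > standalone.xml

# ...and a commit is created only if the working tree actually differs.
if [ -n "$(git status --porcelain)" ]; then
  git add -A
  git commit -qm "Configuration save $(date -u +%Y-%m-%dT%H:%M:%SZ)"
fi
```

Running the same check again without a new save is a no-op, so snapshots that change nothing do not pollute the history.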

    Also we could add a "publish" command which would be in charge of sharing the configuration (i.e. pushing it to the remote repository). This could be used if we want to support other ways of sharing files (copying to an NFS mount point, using ConfigMaps, etc.)
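Such a "publish" command amounts to isolating the sharing step behind a single entry point. A minimal sketch with git push as the default implementation (a local bare repository stands in for the remote, and the `publish` function name is hypothetical):

```shell
pubdir=$(mktemp -d)
git init -q --bare "$pubdir/remote.git"
git clone -q "$pubdir/remote.git" "$pubdir/local" 2>/dev/null
cd "$pubdir/local"
git config user.name demo && git config user.email demo@localhost
echo '<server/>' > standalone.xml && git add -A && git commit -qm save

# The only place that knows how the configuration is shared; swapping
# the body for an NFS copy or a ConfigMap update would not affect callers.
publish() { git push -q origin HEAD; }
publish
```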

     

    4. How do we keep a running instance in sync with a remote repository?

     

    We have two issues to solve here:

    - update the local configuration files from the remote repository

    - update the configuration of the local running instance

     

    While the first issue can be solved by scheduled pulls, we need some mechanism to update the running configuration with the new one. Taking advantage of what was done for domain mode is one way to do it; alternatively, we can just put the server in restart-required mode and pull the updates on restart.
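The scheduled-pull side reduces to "fetch, compare, and flag restart-required when the remote moved". A self-contained sketch (repository names and the state flag are illustrative):

```shell
syncdir=$(mktemp -d)
git init -q "$syncdir/remote" && cd "$syncdir/remote"
git config user.name demo && git config user.email demo@localhost
echo v1 > standalone.xml && git add -A && git commit -qm v1
git clone -q "$syncdir/remote" "$syncdir/instance"
echo v2 > standalone.xml && git commit -qam v2   # the remote moves ahead

# The scheduled job running on the instance:
cd "$syncdir/instance"
git fetch -q origin
state=in-sync
if [ "$(git rev-parse HEAD)" != "$(git rev-parse FETCH_HEAD)" ]; then
  state=restart-required   # pull the updates on the next restart
fi
echo "$state"
```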

    On cloud platforms we could use existing mechanisms, like Kubernetes rolling updates, to update instances by restarting them.