WildFly deployment architecture discussions

Version 1

    Introduction

     

    This document contains notes on the future of the WildFly deployment architecture, as discussed at the EAP meeting in Brno in January 2014.

     

    At the moment none of the changes discussed here have been agreed upon, and there is no real timeline in place for their implementation.

     

    Current Issues

     

    The main issue with the current architecture is that deployers run in phases, with a specific ordering provided via a number in Phase.java. This makes it very easy to see when a deployer will run in relation to other deployers, but it provides no way to determine why it runs at that point. The only way to find out is to examine the code of every deployer in the code base and see which structures modified by earlier/later deployers are also modified by this one. This has the potential to become a maintainability nightmare, as no one really knows why a given deployer runs at a specific time.
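    To illustrate the problem, the scheme boils down to hard-coded integer priorities. The sketch below is simplified and illustrative only; the constant names are modelled on Phase.java but the values are invented, and the real class contains far more constants:

```java
// Simplified, illustrative sketch of the Phase.java priority scheme.
// The values are invented; the real class has many more constants.
public final class Phase {

    // Deployers registered in a phase run in ascending priority order.
    // Nothing in these numbers records WHY web.xml parsing must precede
    // annotation scanning; the relationship is implicit in the constants.
    public static final int PARSE_WEB_DEPLOYMENT = 0x0100;
    public static final int PARSE_WEB_ANNOTATIONS = 0x0200;
    public static final int PARSE_EJB_DEPLOYMENT = 0x0300;

    private Phase() {
    }

    public static void main(String[] args) {
        // The only visible relationship between deployers is numeric.
        System.out.println(Phase.PARSE_WEB_DEPLOYMENT < Phase.PARSE_WEB_ANNOTATIONS);
    }
}
```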

     

    In addition to this, once the code base is modularised there will no longer be a single place where all these numbers can be kept.

     

    Another problem is that even though deployments are MSC services they are not restartable; at the moment a hack is in place that performs a complete redeploy if a restart is detected.

     

    Possible Solutions

     

    We discussed a number of possible approaches:

     

    Using fine grained MSC services to perform the deployment

     

    The general idea with this approach is that deployers would be MSC services that wire up dependencies between themselves. For example, the web metadata merging processor would depend on the parsed web metadata and the parsed web annotation metadata; once those dependency services had started, the merging processor would run and produce the merged metadata.
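    A hypothetical sketch of what this might have looked like, using CompletableFuture to stand in for MSC service wiring (the service names and data are invented; this is not MSC code):

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of the rejected fine-grained-services model, with
// CompletableFuture standing in for MSC dependency wiring.
public class FineGrainedDeploymentSketch {

    /** The "merging service": runs only after both parsing "services" complete. */
    static String mergedMetadata() {
        // Two independent "parsing services", each producing an immutable result.
        CompletableFuture<String> webMetadata =
                CompletableFuture.supplyAsync(() -> "web.xml metadata");
        CompletableFuture<String> annotationMetadata =
                CompletableFuture.supplyAsync(() -> "annotation metadata");

        // The merge depends on both results. Note the costs raised at the
        // meeting: every intermediate result stays in memory until its
        // consumers have run, and any shared structures would have to be
        // thread safe.
        return webMetadata
                .thenCombine(annotationMetadata,
                        (web, ann) -> "merged[" + web + " + " + ann + "]")
                .join();
    }

    public static void main(String[] args) {
        System.out.println(mergedMetadata());
    }
}
```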

     

    After a brief discussion it was decided that this approach was not workable, for a number of reasons. The main problems are that all our deployment structures would need to be made thread safe; that all intermediate deployment state would have to be held in memory, which would significantly increase memory usage; and that this model does not map well to deployers that simply modify existing data rather than producing a new data structure.

     

    Creating a new dependency resolution mechanism that is used to determine ordering

     

    Instead of using MSC, with this approach we would create a separate dependency resolution mechanism that is used to determine the correct deployer ordering. This mechanism would assemble deployment chains that work similarly to our current chains, but the order would be determined via a dependency resolution algorithm. Dependencies would include relationships such as 'contributes to', to allow for situations where deployers simply mutate an existing set of data.
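    A minimal sketch of how such a resolution step could work, under the assumption that each deployer declares what it consumes and produces (the names and the consumes/produces model are illustrative, not an agreed design). A topological sort over those declarations then replaces the hard-coded numbers; a 'contributes to' relationship could be modelled as consuming and re-producing the same key:

```java
import java.util.*;

// Sketch of dependency-based deployer ordering (an assumption of how the
// mechanism could work, not an agreed design). Each deployer declares what
// it consumes and produces; a topological sort replaces the Phase.java
// numbers, and the WHY of the ordering is recorded in the declarations.
public class DeployerChainSketch {

    record Deployer(String name, Set<String> consumes, Set<String> produces) {}

    /** Kahn's algorithm: order deployers so producers run before consumers. */
    static List<String> order(List<Deployer> deployers) {
        Map<String, List<Deployer>> consumersOf = new HashMap<>();
        Map<Deployer, Integer> pending = new HashMap<>();
        for (Deployer d : deployers) {
            pending.put(d, d.consumes().size());
            for (String key : d.consumes()) {
                consumersOf.computeIfAbsent(key, k -> new ArrayList<>()).add(d);
            }
        }
        Deque<Deployer> ready = new ArrayDeque<>();
        for (Deployer d : deployers) {
            if (d.consumes().isEmpty()) {
                ready.add(d);
            }
        }
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            Deployer d = ready.poll();
            result.add(d.name());
            for (String key : d.produces()) {
                for (Deployer c : consumersOf.getOrDefault(key, List.of())) {
                    if (pending.merge(c, -1, Integer::sum) == 0) {
                        ready.add(c);
                    }
                }
            }
        }
        if (result.size() != deployers.size()) {
            throw new IllegalStateException("unsatisfied or cyclic dependencies");
        }
        return result;
    }

    public static void main(String[] args) {
        // The merger consumes what the two parsers produce, so it is
        // automatically ordered after them.
        System.out.println(order(List.of(
                new Deployer("merge-web-metadata",
                        Set.of("web-md", "web-ann-md"), Set.of("merged-md")),
                new Deployer("parse-web-xml", Set.of(), Set.of("web-md")),
                new Deployer("scan-web-annotations", Set.of(), Set.of("web-ann-md")))));
    }
}
```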

     

    This idea seems OK on the face of it, however it would be quite a bit of work to implement, and a lot of people had reservations about it, as it seemed too similar to the system present in EAP 5. A system like this is probably worth investigating at some point in the future, but it is unlikely to be a silver bullet.

     

    Introduce a global registry for deployer ordering

     

    This was discussed as a solution to the problem that results from modularising WildFly. Basically we would create a global registry, and each extension would be added to it in the correct location. This could also include ordering for third-party deployers such as TorqueBox and CapeDwarf. The solution will probably take the form of a project with its own release schedule that contains a single class with the deployer ordering.
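    A sketch of what such a registry could look like (the class and method names are invented; no API was designed at the meeting). The key point is that third parties register relative to known entries rather than picking magic numbers:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a global deployer-ordering registry (names invented; no API was
// agreed at the meeting). A single, separately released class would hold
// the canonical ordering, and extensions or third-party deployers such as
// TorqueBox would register against existing entries.
public class DeployerRegistrySketch {

    private final List<String> ordering = new ArrayList<>();

    /** Append a deployer at the end of the global ordering. */
    public void register(String deployer) {
        ordering.add(deployer);
    }

    /** Insert a (possibly third-party) deployer before an existing one. */
    public void registerBefore(String existing, String deployer) {
        int i = ordering.indexOf(existing);
        if (i < 0) {
            throw new IllegalArgumentException("unknown deployer: " + existing);
        }
        ordering.add(i, deployer);
    }

    /** The resolved global ordering, as an immutable snapshot. */
    public List<String> ordering() {
        return List.copyOf(ordering);
    }

    public static void main(String[] args) {
        DeployerRegistrySketch registry = new DeployerRegistrySketch();
        registry.register("parse-web-xml");
        registry.register("merge-web-metadata");
        // A third-party deployer slots itself in relative to a known entry.
        registry.registerBefore("merge-web-metadata", "torquebox-ruby-support");
        System.out.println(registry.ordering());
    }
}
```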

     

    Possible solutions to the deployment restart problem

     

    We discussed a number of possible solutions to the deployment restart problem, however they all suffered from the same drawback: in order for a deployer to be able to restart cleanly it needs to store the previous runtime state. This results in a much larger runtime memory footprint (possibly hundreds of megabytes for large deployments). Given that the current hack works, it was decided that this was not worth spending any time on at the moment.