On High Availability, Persistence Strategies and Transactions

Version 1
    1. Restore basic Session from disk (80h):
      1. Simple : ObjectStore and Agenda only, single entry-point
      2. Incremental streaming of ObjectStore and Agenda actions
        Create a transaction-aware journal that can be replayed.
        1. Safe-point boundaries:
          1. Shallow user transactions that stage WM actions
          2. Rule network evaluation
          3. Rule Firing
      3. Read the journal back (as a stream) to build an in-memory representation of the minimal state required to replay the engine
        1. Items:
          1. Objects
          2. Store / Handles
          3. Activations
        2. There will be dependencies between items that need to be preserved
        3. The in-memory representation will normalize the items (e.g. insert/delete cancel each other)
        4. (** future : the journal itself needs to be normalized as well, but this will be done at a later stage)
      4. Rebuild the session from the in-memory representation
        1. Restore the objects (will be done lazily in a later version)
        2. Rebuild the object store by reinserting the facts
        3. Restore the agenda
        4. Replay the history
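
A minimal sketch of the normalization step in 1.3.3, assuming each journal record carries an operation, a fact handle id and a payload. The names Op, JournalEntry and JournalNormalizer are hypothetical and are not engine API; activations and the dependencies between items are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical journal record types, for illustration only.
enum Op { INSERT, UPDATE, DELETE }

final class JournalEntry {
    final Op op;
    final long handleId;   // identifies the fact handle the action applies to
    final Object fact;     // payload; null for DELETE

    JournalEntry(Op op, long handleId, Object fact) {
        this.op = op;
        this.handleId = handleId;
        this.fact = fact;
    }
}

final class JournalNormalizer {
    // Insertion-ordered, so facts can be reinserted in their original order.
    private final Map<Long, Object> live = new LinkedHashMap<>();

    // Fold one journal record into the minimal state.
    void apply(JournalEntry e) {
        switch (e.op) {
            case INSERT:
            case UPDATE:
                live.put(e.handleId, e.fact);   // keep only the latest value
                break;
            case DELETE:
                live.remove(e.handleId);        // cancels the earlier insert
                break;
        }
    }

    // Facts that still exist once the whole stream has been read.
    Map<Long, Object> liveFacts() {
        return live;
    }
}
```

Only the entries left in liveFacts() would need to be reinserted when the object store is rebuilt; a fact that was inserted and later deleted never reaches the rebuilt session.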
    2. Slaves (80h)
      1. Send the journal to 1+ slaves with 2-phase commits
        1. Asynchronous streaming
      2. Slaves will replay the journal stream, asynchronously
        1. Will replay batches only when a safe point is reached
      3. Add / remove a slave gracefully
      4. (** future: Add a slave without stopping the master)
      5. (** Test : master fails during rule evaluation; Test : master fails during rule firing)
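
The batching rule in 2.2.1 could look roughly like the sketch below: the slave buffers incoming records and applies them only when a safe-point marker arrives. It assumes records are plain strings and that the master emits a SAFE_POINT marker; SlaveReplayer and the marker are hypothetical, and the 2-phase commit is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

final class SlaveReplayer {
    // Hypothetical marker written by the master at each safe-point boundary.
    static final String SAFE_POINT = "SAFE_POINT";

    private final List<String> pending = new ArrayList<>();  // received, not yet applied
    private final List<String> applied = new ArrayList<>();  // stands in for the replayed session

    // Called for every record that arrives on the journal stream.
    void receive(String record) {
        if (SAFE_POINT.equals(record)) {
            applied.addAll(pending);   // the whole batch becomes visible at once
            pending.clear();
        } else {
            pending.add(record);       // hold back until the next safe point
        }
    }

    List<String> applied() {
        return applied;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

If the master fails during rule evaluation or rule firing, the slave is left with a partial batch in pending and can discard it, which keeps its applied state at the last safe point.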
    3. Support the remaining internal data structures : TMS, accumulates, globals, ... ( 300h )
    4. Normalization of the journal on disk ( 300h )
      1. How do we prevent the journal from growing indefinitely?
    5. Create a dedicated service that incrementally creates a session snapshot byte array from the journal stream,
      so that a slave can be rehydrated from that directly ( 400h ).
      1. We don't want to use the full protobuf serialization
      2. We don't want to replay the whole history
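
One possible shape for the snapshot service in item 5, as a sketch only: it folds journal records into a running state, remembers the last journal offset it has seen, and writes both into a byte array. The class name, the record layout (offset, handle id, string value) and the binary format are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

final class SnapshotService {
    private final Map<Long, String> facts = new TreeMap<>();
    private long lastOffset = -1;   // offset of the last journal record folded in

    // Fold one journal record into the state; a null value means delete.
    void onRecord(long offset, long handleId, String value) {
        if (value == null) {
            facts.remove(handleId);
        } else {
            facts.put(handleId, value);
        }
        lastOffset = offset;
    }

    // Serialize the current state together with the offset it corresponds to.
    byte[] snapshot() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(lastOffset);
            out.writeInt(facts.size());
            for (Map.Entry<Long, String> e : facts.entrySet()) {
                out.writeLong(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Rebuild the state on a slave directly from the byte array.
    static SnapshotService rehydrate(byte[] snapshot) {
        SnapshotService s = new SnapshotService();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(snapshot))) {
            s.lastOffset = in.readLong();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                long handleId = in.readLong();
                s.facts.put(handleId, in.readUTF());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return s;
    }

    long lastOffset() {
        return lastOffset;
    }

    Map<Long, String> facts() {
        return facts;
    }
}
```

A slave rehydrated this way would only need the journal records after lastOffset(), so the full history does not have to be replayed.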
    6. Develop a transaction model (possibly around rule modules) with rollback and integrate it with the replay mechanism ( 750h + )
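
To make the idea of staged WM actions (1.2.1.1) and rollback (item 6) concrete, here is a deliberately small sketch. It assumes actions are plain strings and that the journal is an append-only list closed by a COMMIT marker; StagedTransaction and the marker are hypothetical, and rule modules are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

final class StagedTransaction {
    // Hypothetical marker that closes a committed batch in the journal.
    static final String COMMIT = "COMMIT";

    private final List<String> journal;                     // shared, append-only
    private final List<String> staged = new ArrayList<>();  // not yet visible to replay

    StagedTransaction(List<String> journal) {
        this.journal = journal;
    }

    // Stage a working-memory action without touching the journal.
    void stage(String action) {
        staged.add(action);
    }

    // Append the staged actions and a commit marker, then reset.
    void commit() {
        journal.addAll(staged);
        journal.add(COMMIT);
        staged.clear();
    }

    // Discard the staged actions; the journal is left untouched.
    void rollback() {
        staged.clear();
    }
}
```

Because a rolled-back transaction never reaches the journal, a replay of the journal would see committed batches only.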