Version 5

    Roll, baby roll!

    Introduction

    This document details the design of a rolling upgrade scheme for Infinispan.  The motivation is to allow users to upgrade an Infinispan cluster from one version to a new version without experiencing any downtime.

    JIRAs

    ISPN-1410 tracks this effort.

    Documentation

    This effort is more process-intensive than code-intensive. As such, the process needs to be documented very clearly by our professional docs team, since it will be run by system administrators. Clear documentation is what will give users the confidence needed to perform a rolling upgrade.  If possible, the docs team should also help with upstream community documentation of this capability, as it will help community uptake, testing and feedback.

     

    Design

    We define four distinct cases of rolling upgrades, each treated (and implemented) separately.

     

    1.  Rolling upgrades for remote clients using Hot Rod

     

    This section has been implemented and is described in a separate document: https://community.jboss.org/wiki/RollingUpgradesForRemoteClientsUsingHotRod

    Development work

    • Add the ability to enable/disable a cache store via JMX, to support step 6 of the process described in the document linked above.
    • Add the ability to check via JMX that there are no more open connections in a Server endpoint, to support step 4 of that process.
    • Add the ability to dump the entire locally known keyset into a well-known key via JMX, to support step 4 of that process.
    • Implement a RollingUpgradeSynchronizer.
      • A standalone process (run on command-line and JMX)
      • Uses Hot Rod to connect to Cluster A and request the contents of the well-known key containing the entire keyset.
      • Then does a cache.get (or cache.exists) on Cluster B for each of the keys (in parallel).
      • This allows Cluster B to fully populate itself from Cluster A.
      • JIRA: ISPN-2346
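
    The synchronizer steps listed above can be sketched as follows. This is a minimal illustration only, not the ISPN-2346 implementation: the CacheHandle interface, the KEYSET_KEY name and the readThrough helper are hypothetical stand-ins for a Hot Rod client, the well-known key, and a Cluster B whose cache store reads through to Cluster A. In-memory maps replace the real clusters so that the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RollingUpgradeSynchronizerSketch {

    /** Hypothetical stand-in for a remote cache handle, e.g. a Hot Rod client. */
    public interface CacheHandle {
        Object get(Object key);
    }

    /** Hypothetical name of the well-known key holding Cluster A's dumped keyset. */
    public static final String KEYSET_KEY = "___known_keys___";

    /** Cluster B stand-in: serves local data and reads through to the old cluster on a miss. */
    public static CacheHandle readThrough(Map<Object, Object> local, CacheHandle oldCluster) {
        return key -> local.computeIfAbsent(key, oldCluster::get);
    }

    /** Reads the keyset from Cluster A, then touches every key on Cluster B in parallel. */
    public static int synchronize(CacheHandle clusterA, CacheHandle clusterB, int threads)
            throws Exception {
        Object dumped = clusterA.get(KEYSET_KEY);
        if (!(dumped instanceof Set)) {
            throw new IllegalStateException("Keyset has not been dumped on Cluster A");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Object>> pending = new ArrayList<>();
            for (Object key : (Set<?>) dumped) {
                // Each get on Cluster B triggers a read-through load from Cluster A.
                Callable<Object> touch = () -> clusterB.get(key);
                pending.add(pool.submit(touch));
            }
            for (Future<Object> result : pending) {
                result.get(); // wait for completion and propagate any failure
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<Object, Object> dataA = new ConcurrentHashMap<>();
        dataA.put("k1", "v1");
        dataA.put("k2", "v2");
        dataA.put(KEYSET_KEY, Set.of("k1", "k2")); // what the JMX keyset dump would produce

        Map<Object, Object> dataB = new ConcurrentHashMap<>();
        CacheHandle clusterA = dataA::get;
        CacheHandle clusterB = readThrough(dataB, clusterA);

        int touched = synchronize(clusterA, clusterB, 4);
        System.out.println(touched + " keys touched; Cluster B now holds " + dataB);
    }
}
```

    The synchronizer itself never writes to Cluster B. It only issues reads, and the population of Cluster B is a side effect of the cache store loading each missing entry from Cluster A.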

    Considerations

    • Any cache stores configured on Cluster A should also be configured on Cluster B.  If the cache store is shared, make sure it is not shared between the two clusters.

    2.  Rolling upgrades for remote clients using REST or memcached

    This process is used for installations making use of Infinispan as a remote grid, via REST or memcached, and is similar to the process defined above.  This also assumes an upgrade of the Infinispan grid, and not the client application.

    Steps

    Identical to the Hot Rod process above, except that a MemcachedCacheStore or RESTCacheStore is used instead of a RemoteCacheStore.

    Development work

    • A MemcachedCacheStore and RESTCacheStore will need to be implemented.
      • Identical to the RemoteCacheStore except that it uses REST or memcached as a wire protocol.
      • JIRA: <TBD>
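
    The idea that these stores differ from the RemoteCacheStore only in their wire protocol can be illustrated with a small sketch. Everything named here (WireProtocol, ReadThroughStore, the rest factory and its URL layout) is hypothetical and is not the Infinispan cache store SPI; the sketch only shows a read-through store whose remote fetch is pluggable. The REST variant uses the JDK's java.net.http client (Java 11+) and assumes that keys are URL-safe and that the base URI ends with a slash.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class WireProtocolStoreSketch {

    /** Hypothetical abstraction: fetch a single entry from the old cluster. */
    public interface WireProtocol {
        Optional<String> fetch(String key) throws Exception;
    }

    /** Serves entries locally and falls back to the old cluster on a miss. */
    public static final class ReadThroughStore {
        private final Map<String, String> local = new ConcurrentHashMap<>();
        private final WireProtocol oldCluster;

        public ReadThroughStore(WireProtocol oldCluster) {
            this.oldCluster = oldCluster;
        }

        public Optional<String> load(String key) throws Exception {
            String cached = local.get(key);
            if (cached != null) {
                return Optional.of(cached);
            }
            Optional<String> remote = oldCluster.fetch(key);
            remote.ifPresent(value -> local.put(key, value)); // keep a local copy
            return remote;
        }
    }

    /** REST flavour; the "base URI + key" layout is an assumption made for this sketch. */
    public static WireProtocol rest(HttpClient client, URI cacheBaseUri) {
        return key -> {
            HttpRequest request = HttpRequest.newBuilder(cacheBaseUri.resolve(key)).GET().build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200
                    ? Optional.of(response.body())
                    : Optional.empty();
        };
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for the old cluster, so the sketch runs without a server.
        Map<String, String> clusterA = Map.of("k1", "v1");
        WireProtocol inMemory = key -> Optional.ofNullable(clusterA.get(key));

        ReadThroughStore store = new ReadThroughStore(inMemory);
        System.out.println(store.load("k1").isPresent());      // prints true
        System.out.println(store.load("missing").isPresent()); // prints false
    }
}
```

    A memcached flavour would be another WireProtocol implementation; the read-through logic would stay unchanged.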

    3.  Rolling upgrades for embedded clients

    This is for clients using Infinispan in Library mode.  The typical use case detailed here is, for example, a webapp using Infinispan jars.  Clients still connect to this webapp over a network, and typically via a load balancer such as mod_cluster.

    Steps

    Identical to the Hot Rod process above, except that the load balancer is used to hard-switch all client requests from the old cluster to the new one.  HTTP sessions will not be lost, since the new cluster will still be able to locate existing sessions via the EmbeddedRollingUpgradeCacheStore.  Before starting the process, an EmbeddedRollingUpgradeEndpoint will need to be started on each of the existing embedded nodes via JMX.

    Development work

    • The EmbeddedRollingUpgradeEndpoint will need to be developed.
      • Will need the ability to be started via JMX, on demand.
      • This is a subclass of the Hot Rod endpoint, with the added ability to translate the byte array key format used by Hot Rod into the Java objects used by the embedded clients.
      • JIRA: <TBD>
    • The EmbeddedRollingUpgradeCacheStore will need to be implemented.
      • A subclass of the Hot Rod based RemoteCacheStore, with the added ability to translate Java objects into the byte array key format used by Hot Rod.
      • JIRA: <TBD>
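
    The key translation that both components need can be sketched as a round trip between a Java object and a byte array. This is an illustration under a stated assumption: plain JDK serialization is used here as a stand-in, and the marshalling actually used by Hot Rod and by Infinispan may differ. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class KeyTranslationSketch {

    /** Cache store direction: Java key object to the byte array sent over the wire. */
    public static byte[] toBytes(Serializable key) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(key);
        }
        return buffer.toByteArray();
    }

    /** Endpoint direction: byte array received over the wire back to a Java key object. */
    public static Object fromBytes(byte[] raw) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] wireKey = toBytes("session-42");
        Object restored = fromBytes(wireKey);
        // The endpoint can only find the entry if the restored key equals the original.
        System.out.println(restored.equals("session-42")); // prints true
    }
}
```

    Whatever encoding is chosen, both sides must agree on it, and the restored key must be equal (in the equals/hashCode sense) to the key the embedded application originally stored.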

    Considerations

    • Transactions need to be considered.  Locks acquired on the same key on both the old and the new cluster may cause problems, as they won't be seen across the cluster boundary and could lead to data inconsistency.  Hence the need for a hard-switch on the load balancer.
    • Client applications that are not able to do a hard, atomic switch may have issues, unless the client application can go into read-only mode prior to such a migration.

    4.  Rolling upgrades of user code (embedded mode)

    This is for an upgrade of the user application.  Infinispan itself will not be upgraded at this time, and this process is to deal with different serialization/externalization formats of data stored in the grid.

    Steps

    Identical to the process for rolling upgrades for embedded clients (case 3 above), except that the Infinispan version is kept the same.

    Development work

    • The EmbeddedRollingUpgradeEndpoint should also have the ability to serialize data using a portable serialization mechanism such as Apache Avro, a JSON serializer or XStream.
      • JIRA: <TBD>
    • The EmbeddedRollingUpgradeCacheStore should also have the ability to serialize data using a portable serialization mechanism such as Apache Avro, a JSON serializer or XStream.
      • JIRA: <TBD>
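
    The reason for a portable format can be shown with a small sketch: the old application version writes its data as named fields, and the new version reads those fields and supplies a default for anything the old version did not know about. As a loudly stated assumption, java.util.Properties is used here purely as a dependency-free stand-in for Avro, JSON or XStream, and the UserV1/UserV2 classes are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

public class PortableFormatSketch {

    /** Hypothetical user class as shipped in the old application version. */
    public static final class UserV1 {
        public final String name;

        public UserV1(String name) {
            this.name = name;
        }
    }

    /** The same class in the new application version, with an added field. */
    public static final class UserV2 {
        public final String name;
        public final String email;

        public UserV2(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    /** Old side: write named fields rather than a class-specific binary layout. */
    public static String writeV1(UserV1 user) throws IOException {
        Properties fields = new Properties();
        fields.setProperty("name", user.name);
        StringWriter out = new StringWriter();
        fields.store(out, null);
        return out.toString();
    }

    /** New side: read the fields it knows and default the ones that are absent. */
    public static UserV2 readV2(String portable) throws IOException {
        Properties fields = new Properties();
        fields.load(new StringReader(portable));
        return new UserV2(fields.getProperty("name"), fields.getProperty("email", "unknown"));
    }

    public static void main(String[] args) throws IOException {
        String wire = writeV1(new UserV1("alice"));
        UserV2 upgraded = readV2(wire);
        System.out.println(upgraded.name + " / " + upgraded.email); // prints alice / unknown
    }
}
```

    Formats such as Avro add schemas and explicit evolution rules on top of this idea; the sketch only shows why field-based encoding tolerates a class change that a class-bound binary format may not.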

    Sizing considerations

    In all cases above, we assume that the existing cluster (Cluster A) is close to full capacity.  As such, to be able to perform a live, on-line migration, a new cluster (Cluster B) of the same size and capacity will be necessary.