HotRodBulkGet - Design

Version 13

    Introduction

    The current HotRod protocol specification does not support bulk retrieval of data, e.g. retrieval of all key/value pairs that reside on a node in a single request: ISPN-516. This functionality is required by RemoteCacheStore in order to be a fully fledged CacheLoader implementation. Namely, the following API methods require this functionality:

     

    public interface CacheLoader {
       ....
       public void toStream(ObjectOutput outputStream);
       public Set<InternalCacheEntry> loadAll();
       public Set<InternalCacheEntry> load(int numEntries);
       ...
    }
    

     

    The following method won't be supported by RemoteCacheStore (see Manik's first comment for more details on why this is not supported):

    public Set<Object> loadAllKeys(Set<Object> keysToExclude);

     

    This document discusses various aspects of extending HotRod with BulkGet.

     

    Operation description

    BulkGet request

       [header] [entry count]

       - header : request header as specified here.

       - entry count [vint] : maximum number of Infinispan entries to be returned by the server (entry == key + associated value). Needed to support CacheLoader.load(int). If 0 then all entries are returned (needed for CacheLoader.loadAll()).
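
    As an illustration of the [entry count] field, here is a minimal sketch of how a client might encode it. It assumes the usual HotRod vint layout (7 bits per byte, least significant group first, high bit set when more bytes follow). The class and method names are hypothetical, and the [header] is assumed to be written elsewhere.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper for illustration; not part of any Infinispan API. */
final class BulkGetRequestSketch {

    /** Writes a non-negative int as a vint (assumed layout: low 7 bits first, high bit = "more bytes follow"). */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        if (value < 0) {
            throw new IllegalArgumentException("vint must be non-negative");
        }
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // continuation bit set
            value >>>= 7;
        }
        out.write(value); // last byte, continuation bit clear
    }

    /** Encodes only the [entry count] field; 0 asks the server for all entries. */
    static byte[] encodeEntryCount(int maxEntries) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, maxEntries);
        return out.toByteArray();
    }
}
```

    With this encoding, a request for all entries (count 0) carries a single zero byte after the header, while CacheLoader.load(300) would carry two bytes.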

     

    BulkGet response

      [header] [more] [key size 1] [key 1] [value size 1] [value 1] [more] [key size 2] [key 2] [value size 2] [value 2] [more]...

      - header : response header as specified here

      - more [1 byte, 0 or 1] : specifies whether the end of the stream has been reached or not (i.e. are there more entries to be read?)

      - key size 1 [vint] : the size in bytes of the key "key 1"

      - key 1 [byte array] : the key as a byte array of length "key size 1"

      - value size 1 [vint] : the size in bytes of the value "value 1"

      - value 1 [byte array] : the value as a byte array of length "value size 1"

    Note: another approach is to send the number of entries in advance and not use the "more" flag. This would be less flexible though, especially given the fact that we don't lock the data before sending it, so its size might vary (see section on Replicated clusters).
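
    The response layout above can be read with a simple loop driven by the "more" flag. The following is a minimal client-side sketch, assuming the [header] has already been consumed and that vints use the usual HotRod layout (7 bits per byte, low group first). The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical decoder for illustration; not part of any Infinispan API. */
final class BulkGetResponseSketch {

    /** Reads a vint (assumed layout: low 7 bits first, high bit = "more bytes follow"). */
    static int readVInt(ByteBuffer in) {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.get() & 0xFF;
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalStateException("vint longer than 5 bytes");
    }

    /** Reads [more][key size][key][value size][value] groups until [more] is 0. */
    static List<Map.Entry<byte[], byte[]>> decodeEntries(ByteBuffer in) {
        List<Map.Entry<byte[], byte[]>> entries = new ArrayList<>();
        while (in.get() == 1) {
            byte[] key = new byte[readVInt(in)];
            in.get(key);
            byte[] value = new byte[readVInt(in)];
            in.get(value);
            entries.add(new SimpleImmutableEntry<>(key, value));
        }
        return entries;
    }
}
```

    Because the loop never needs the total entry count up front, the server is free to stop whenever it runs out of entries, which is the flexibility the note above refers to.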

    Server side

    Unlike the existing HotRod operations, which map straightforwardly to Infinispan's java.util.Map API, BulkGet is (or might be) influenced by the Infinispan mode: replicated or distributed. Below are some considerations regarding BulkGet semantics in each mode.

     

    Replicated (ISPN cluster in REPL mode)

    The same logic and code used for state generation during in-memory state retrieval can be reused:

     <stateRetrieval fetchInMemoryState="true"/>

    This is what "fetchInMemoryState" does behind the scenes (at the state generator's end): it iterates over the DataContainer and writes the state into an ObjectOutput (this might need an enhancement to be aware of key filters, or we can write an ObjectOutput implementation that is aware of filtering). No locking is performed: this is fast but consistency might suffer. The code is here: StateTransferManagerImpl.generateInMemoryState.
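
    To make the lock-free iteration concrete, here is a minimal server-side sketch that walks a container and emits the BulkGet response body. It is not the StateTransferManagerImpl code: a plain Map stands in for the DataContainer, the output is a raw byte stream rather than an ObjectOutput, the vint layout is assumed to be the usual HotRod one, and all names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;

/** Hypothetical encoder for illustration; a Map stands in for Infinispan's DataContainer. */
final class BulkGetServerSketch {

    /** Writes a non-negative int as a vint (assumed layout: low 7 bits first). */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /**
     * Iterates the container without taking any locks and writes
     * [more=1][key size][key][value size][value] per entry, then [more=0].
     * maxEntries == 0 means "no limit", mirroring the request's entry count.
     */
    static byte[] encodeEntries(Map<byte[], byte[]> container, int maxEntries) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int written = 0;
        for (Map.Entry<byte[], byte[]> e : container.entrySet()) {
            if (maxEntries > 0 && written == maxEntries) {
                break;
            }
            out.write(1); // more entries follow
            writeVInt(out, e.getKey().length);
            out.write(e.getKey(), 0, e.getKey().length);
            writeVInt(out, e.getValue().length);
            out.write(e.getValue(), 0, e.getValue().length);
            written++;
        }
        out.write(0); // end of stream
        return out.toByteArray();
    }
}
```

    Since nothing is locked, entries added or removed during the iteration may or may not appear in the output; the trailing "more" flag is what lets the stream end correctly regardless.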

     

    Distribution (ISPN cluster in DIST mode)

    Things are more complicated when it comes to distribution, where data resides on multiple nodes.

    Possible solutions:

    1. Only return locally stored data. This would break the cache loader's contract, in the sense that preloading from a distributed ISPN cluster running HotRod servers might (and most likely will) return less data than was originally written to the store. This should be clearly documented. A pro for this solution is that the code is already there: it is the same logic and code as in the Replicated case discussed above. It would also fit the BulkGet(max entry) request: this corresponds to remote caches that have eviction enabled, so there is nothing incorrect here.

    2. Move the hard work to the client: the HR client connects to all existing HR servers and fetches the entire state from each one of them (possibly in parallel). More difficult to implement, but doable.
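
    Option 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: the per-server BulkGet call is passed in as a hypothetical function, servers are identified by plain strings, and keys and values are shown as Strings for brevity. None of these names exist in the HotRod client.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical client-side aggregation for illustration; not part of the HotRod client. */
final class ParallelBulkGetSketch {

    /**
     * Issues one BulkGet per server in parallel and merges the results.
     * bulkGet is an assumed callback that returns the locally stored entries of one server.
     */
    static Map<String, String> fetchAll(List<String> servers,
                                        Function<String, Map<String, String>> bulkGet) {
        // Start all requests before waiting on any of them.
        List<CompletableFuture<Map<String, String>>> pending = servers.stream()
                .map(server -> CompletableFuture.supplyAsync(() -> bulkGet.apply(server)))
                .collect(Collectors.toList());

        Map<String, String> merged = new HashMap<>();
        for (CompletableFuture<Map<String, String>> f : pending) {
            // A key held by several servers is kept once; the last response read wins.
            merged.putAll(f.join());
        }
        return merged;
    }
}
```

    The per-server call here is exactly option 1, which is why option 2 can be layered on top of it later. How to reconcile differing values for a key returned by more than one server is left open in this sketch.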

     

    My vote is for 1, for its simplicity. If the community asks for 2 we can do that as well (it is actually an extension of 1).