12 Replies Latest reply on Jul 5, 2009 11:46 PM by akluge

    Evolution of TcpDelegatingCacheLoaders/TcpCacheServers



      I am currently investigating the incorporation of JBC into an existing, large environment. In the near future, I expect to touch on many issues from JBCACHE-789 (Proxy), 1259 (HA), and possibly 1262 (Executor).

      The use of JBC centers on the FarCaching pattern mentioned in the wiki, but with some additions. As a proof of concept, I have a cluster of TcpCacheServers, along with several standalone caches that use a custom cache loader. The cache loader extends AbstractCacheLoader to implement a distributed hash table: it contains several TcpDelegatingCacheLoaders and delegates each request to the appropriate one, as determined by a consistent hashing algorithm.
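      The delegation-by-consistent-hash idea can be sketched in plain Java. This is only an illustration of the technique, not the actual loader: the ring structure, the virtual-node count, and the MD5-based hash are all assumed choices.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative consistent-hash ring: keys map to servers, and adding or
// removing a server only remaps the keys adjacent to it on the ring.
class ConsistentHashRing {
    // Each server appears at several points on the ring to smooth the distribution.
    private static final int VIRTUAL_NODES = 100;
    private final SortedMap<Long, String> ring = new TreeMap<>();

    void addServer(String serverAddress) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(serverAddress + "#" + i), serverAddress);
        }
    }

    // The server owning a key is the first ring entry at or after the key's hash.
    String serverFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xff); // fold first 8 digest bytes
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      The custom cache loader would consult such a ring to pick which TcpDelegatingCacheLoader receives a given request.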

      I am looking at a number of issues now, such as event notifications, failovers and recoveries, and generally improving the structure.

      I thought it would be useful to touch base with whoever is active on these fronts on the JBC side, and hopefully some of what I do will be useful for a wider audience over time. If I know the JBC design goals, I might be able to work them in while addressing my own goals.

      One of the first things I want to look at is using reader and writer threads for the TcpDelegatingCacheLoaders and possibly the TcpCacheServers. My main desire here is to allow these classes to be easily extended to handle events. This leads to an immediate question: how sensitive should I be to backwards compatibility issues? I am quite tempted to rebuild this CacheServer/CacheLoader pair from the ground up.

      Thanks for sharing your thoughts,

        • 1. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers

          Rebuilding the cache loader/cache server pair is definitely on the cards, and if you want to get involved you are more than welcome.

          My goals for this redesign are:

          1. Treat the TcpCacheServer (TCS) as a standalone module. I.e., it could be used directly, without the TcpCacheLoader (TCL).
          2. The TCS would be able to "speak" 3 different protocols (and should detect which protocol is in use and select the appropriate handler).
          3. These protocols are: Memcached ASCII protocol, Memcached binary protocol, as well as a custom designed binary protocol.
          4. The first two would allow people to reuse existing Memcached client libraries with JBC.
          5. The third one will allow us to encode additional information - such as cluster topology and failover information - when serving responses to clients that understand this.
          6. The TCL would be one such client, speaking the JBC custom protocol.
          7. Both the TCS and TCL would be heavily multithreaded, minimally synchronized, probably making use of Netty or Apache MINA as a layer over NIO sockets.
          8. The TCL should support reconnects (this is already there; some of this code could be reused).
          9. The TCL should support failover (responses from the TCS could contain piggybacked info on TCS cluster topology). TCL would then be able to fail over to alternate TCS instances.
          10. TCL *may* even be able to support load balancing via pluggable LB policies.

          Pretty big stuff. :)

          I'm not so keen on consistent hashing on the client (TCL), since with plans I have for data partitioning (JBCACHE-60) this will end up on the server side anyway. The client can then be simpler, just selecting a server and handling failover, possibly load balancing, etc.

          • 2. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers

            Thanks for the comments. I have had a chance to experiment with MINA and build a simple client/server. It did perform better than I expected, which was one of my major concerns with MINA. As a next step I will be implementing a system very similar to the current TCS/TCL (a binary protocol), and see what issues come up.

            I expect this to be very adaptable to additional uses; indeed, the main reason I am doing this is to allow me to provide additional information asynchronously over the client-server connection, which is very difficult with the current close coupling between writing the request and reading the response.

            Right now, the consistent hashing approach seems to meet my needs; however, I am also looking at another approach where the data is partitioned completely on the server side. My main motivation for looking at server-side partitioning is that it would make failure recovery easier. I see that there is another related discussion on this forum under the memcached client/server title. The data distribution approach I have is very similar to what they discuss, except that I divide the data according to the hash of the key in the key/value pair, while they divide the data according to the hash of the FQN. I considered that, but if the cache is used as a flat cache then all of the data would wind up on one node.
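            To make the difference between the two partitioning schemes concrete, here is a toy sketch; the class, the method names, and the modulo-based partition selection are hypothetical, chosen only to illustrate the point.

```java
// Illustrative comparison of the two partitioning schemes discussed above.
class Partitioner {
    // This thread's approach: partition by the hash of the key in the key/value pair.
    static int byKey(Object key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // The other thread's approach: partition by the hash of the FQN.
    // With a flat cache every entry shares one FQN, so everything lands on one node.
    static int byFqn(String fqn, int partitions) {
        return Math.floorMod(fqn.hashCode(), partitions);
    }
}
```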

            It was also interesting to see the way that MINA registers filters, which is very similar to the way that I will allow the registration of request/response types within the MINA filter. This is to allow the kind of extensions to the existing protocol that I mentioned earlier.


            • 3. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers

              I've been busy, but hopefully I can come back to this project now. I have started to build a prototype with MINA. Mostly I am focusing on a JBC binary protocol; however, handling additional protocols should flow naturally using features built into MINA.

              The primary access to the cache will be via a binary protocol. This provides for efficiency as well as extensibility.

               A Prototypical JBoss Cache Request

               Bytes      Description
               Byte 0     JBoss Cache request marker (0x90)
               Bytes 1-4  A Java int containing the request type
               Bytes 5-n  Data specific to the request type
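              A minimal encode/decode sketch of this layout, using plain java.nio; the class and method names are illustrative, not taken from the prototype.

```java
import java.nio.ByteBuffer;

// Illustrative codec for the request layout above.
class JbcRequestCodec {
    static final byte REQUEST_MARKER = (byte) 0x90; // byte 0

    static byte[] encode(int requestType, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + payload.length);
        buf.put(REQUEST_MARKER);
        buf.putInt(requestType); // bytes 1-4: the Java int request type
        buf.put(payload);        // bytes 5..n: type-specific data
        return buf.array();
    }

    // Verifies the marker and returns the request type.
    static int decodeType(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        if (buf.get() != REQUEST_MARKER)
            throw new IllegalArgumentException("not a JBoss Cache request");
        return buf.getInt();
    }
}
```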

              The Initial Byte

              The initial byte marker serves two main purposes. First, it allows the parser to confirm that the currently read byte is the beginning of a JBoss cache binary request.

              Secondly, a JBoss Cache MessageDecoder (a MINA MessageDecoder) can check the value of this byte in its decodable method and return true only if the byte matches the 0x90 JBoss Cache request marker.

              If the JBoss Cache MessageDecoder is used within a DemuxingProtocolDecoder (another MINA class) along with, say, a memcached MessageDecoder, the server can identify the protocol used in each request by examining the lead byte, and invoke the appropriate message decoder.

              Likely values for the first byte identifying the protocol.

              Value Meaning
              0x80 memcached binary request
              0x81 memcached binary response
              0x90 JBoss Cache binary request
              0x91 JBoss Cache binary response
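              The lead-byte dispatch the server performs can be sketched without MINA. This stand-in plays the role that a DemuxingProtocolDecoder and its MessageDecoders' decodable checks would play; the decoder representation (a function from bytes to a string) is simplified for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative demultiplexer: each registered decoder claims messages by lead byte.
class LeadByteDemux {
    private final Map<Byte, Function<byte[], String>> decoders = new HashMap<>();

    void register(byte marker, Function<byte[], String> decoder) {
        decoders.put(marker, decoder);
    }

    // Examine the lead byte and invoke the matching decoder.
    String decode(byte[] message) {
        Function<byte[], String> d = decoders.get(message[0]);
        if (d == null) throw new IllegalArgumentException("unknown protocol marker");
        return d.apply(message);
    }
}
```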

              The Message Type

              The next field is a Java int written into four bytes of the message. This identifies the specific type of message. It is used in a RequestDecoder (another standard MINA class) to select the appropriate builder for the request.

              Data Specific to the Message Type

              The builder reads the remainder of the input stream, and constructs a Request object, which will be passed in to the IoHandler where it will carry out the expected actions on the cache.
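              A rough sketch of that type-to-builder dispatch; the registry, the type codes, and the builder representation are hypothetical stand-ins for the real RequestDecoder wiring.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative registry: the request-type int selects the builder that consumes
// the type-specific remainder of the message and constructs a Request object.
class RequestBuilderRegistry {
    static final int PUT = 1, GET = 2; // made-up type codes

    private final Map<Integer, Function<byte[], Object>> builders = new HashMap<>();

    void register(int requestType, Function<byte[], Object> builder) {
        builders.put(requestType, builder);
    }

    Object build(int requestType, byte[] body) {
        Function<byte[], Object> b = builders.get(requestType);
        if (b == null) throw new IllegalArgumentException("unknown request type " + requestType);
        return b.apply(body);
    }
}
```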

              Right now, I am working through an implementation of a single test flow to be sure that the plumbing can generally work.

              • 4. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers

                Not sure why you'd see two types of requests (memcached and JBoss Cache) made over the same socket once the server-client handshake was complete. Is your protocol designed to be compatible with the existing memcached protocol?

                • 5. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers

                  Yes, I would have thought the protocol is decided upon during handshake, and a server set up to serve on the JBC protocol would not be able to serve on the memcached protocol, and vice versa.

                  • 6. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers


                    Sorry for taking a bit of time to get back on this, but I have been busy.

                    I would expect any given server to respond to both the Memcache and the JBoss Cache protocols. However, I would expect any given client to use only one of them.

                    I was not planning on a handshake, but simply to have the client connect and send requests according to its chosen protocol. Even in this case, you will want to have the magic bytes:

                    - They allow the server and the client to do a consistency check. If they see something else at the beginning of a message, they know quickly that something is wrong.

                    - They make it easier to handle additional message types (I expect to add event notifications over the JBC protocol).

                    - The Memcache binary protocol (http://code.google.com/p/memcached/wiki/MemcacheBinaryProtocol) already uses them, so at least for Memcache requests/responses they are required.

                    - The Memcache protocol spec also makes the point that magic bytes make it easier for protocol analyzers to work with the protocol.


                    • 7. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers


                      I have had a chance to get back to this, and have made some good progress. I have a cache client and server, both built with MINA, that use a binary protocol for put and get requests. Some interesting points:

                      - The performance is good (<1 ms per async request).
                      - A request submission returns a Future, and multiple requests can be sent and their results recovered later. The TcpDelegating cache loader, however, immediately waits on the future.
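                      A stdlib-only sketch of this submission model, with an in-memory map standing in for the remote cache and a single-threaded executor standing in for the MINA connection; all names here are illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative client: requests can be pipelined; each returns a Future.
class AsyncCacheClient {
    private final ExecutorService io = Executors.newSingleThreadExecutor();
    private final ConcurrentMap<String, String> remote = new ConcurrentHashMap<>();

    Future<String> put(String key, String value) {
        return io.submit(() -> remote.put(key, value));
    }

    Future<String> get(String key) {
        return io.submit(() -> remote.get(key));
    }

    // What the delegating cache loader does today: wait on the future immediately.
    String blockingGet(String key) {
        try {
            return get(key).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    void shutdown() {
        io.shutdown();
    }
}
```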

                      I have run into one issue. For a cache miss, the CacheLoaderInterceptor always loads the entire cache under the FQN, even if allKeys is false. To me, this is not optimal for a large-scale system. I would want to load only the specific missed value when a get fails. Is there a reason to always load the entire cache under the FQN on a cache miss?

                      • 8. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers

                        By entire cache under an FQN, do you refer to:

                        - all the keys/values associated with the FQN or

                        - all the keys/values associated with the FQN and any child nodes of that FQN as well?

                        What JBC version are you using?

                        Finally, for the cache miss, what operation were you calling? Were you calling move() by any chance?

                        • 9. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers


                          The test setup is using one cache (L1) connected to a separate cache (L2) with a TcpDelegatingCacheLoader. The second cache is, of course, running with a TcpCacheServer.

                          When I refer to a cache miss, I am speaking of performing a get for a key that does not exist in the L1 cache. The get request is intercepted, as it should be, by the CacheLoaderInterceptor.

                          The visitGetKeyValueCommand method is invoked, and it has the full get command (most importantly, it has the Fqn and the key). The Fqn and the key are passed to the loadIfNeeded method, along with false for the allKeys parameter. So far, everything looks reasonable.

                          However, the loadIfNeeded method does not directly use the allKeys parameter; it simply passes it on to the mustLoad method, and it doesn't concern us any further.

                          As expected, mustLoad returns true if the key is not in the cache, and the loadIfNeeded method then loads the entire node from the loader, which in this case results in the entire node being loaded across the network. This request is delegated to the relevant loader, in this case a TcpDelegatingCacheLoader.

                          This is clearly the way that JBossCache is designed, as the CacheLoader interface defines only a get(Fqn) method, and no get(Fqn, key). The question is, why was this design decision made? Is there something that depends on this behavior?

                          This really hurts in terms of scalability. I am looking at replacing the interceptor and the loader eventually, so that a get does not attempt to load an entire segment of the cache as it does now.


                          • 10. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers


                            "This is clearly the way that JBossCache is designed, as the CacheLoader interface defines only a get(Fqn) method, and no get(Fqn, key). The question is, why was this design decision made? Is there something that depends on this behavior?"

                            The loader is designed to lock and load one node at a time, and this is why it makes sense to load the entire node contents in one go: this marks the node as 'loaded', and subsequent operations on the node need not check the loader for the presence of keys, etc. Agreed that there are other ways to deal with this, and that a get(Fqn, key) could have been used, but this is the way it is for now.
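                            That behavior can be sketched in miniature; the types below are simplified stand-ins, not the actual JBC code. The first miss pulls the whole node map and marks the node loaded, so the loader is consulted at most once per node.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the interceptor behavior described above.
class NodeLoadSketch {
    private final Map<Object, Object> data = new HashMap<>();
    private boolean loaded = false;
    int loaderCalls = 0;

    // Simulates CacheLoader.get(Fqn): returns the node's entire attribute map.
    private Map<Object, Object> loadFromLoader() {
        loaderCalls++;
        Map<Object, Object> m = new HashMap<>();
        m.put("k1", "v1");
        m.put("k2", "v2");
        return m;
    }

    Object get(Object key) {
        if (!loaded && !data.containsKey(key)) {
            data.putAll(loadFromLoader()); // the entire node crosses the network once
            loaded = true;                 // later gets never touch the loader
        }
        return data.get(key);
    }
}
```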

                            Also, the general recommended use of the cache is to group only a few related attributes in a Node. A Node isn't designed to hold a large number of attributes. This minimises the 'pain' of a get(Fqn) call.

                            I would not recommend rewriting the CacheLoaderInterceptor at this stage - if anything, I'd say put more effort into Infinispan instead, where similar functionality will be more efficient.

                            • 11. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers

                              Following on what Manik said, for the Infinispan 4.1.0 release we're gonna be providing a client/server module; see https://jira.jboss.org/jira/browse/ISPN-29

                              If you're interested in helping out in any way, please let us know. We'll be discussing it on https://lists.jboss.org/mailman/listinfo/infinispan-dev pretty soon! :)

                              • 12. Re: Evolution of TcpDelegatingCacheLoaders/TcpCacheServers


                                I have taken a look into the Infinispan project, and it is interesting how much of it parallels what I am doing currently, especially in light of my decision to use a server-side hash to dispatch remote cache requests from any server into the cluster of servers, rather than having each client maintain a connection to each server. However, at this point, it would disrupt an ongoing project to make the shift to Infinispan.

                                That said, I expect that a lot of what I have done can be applied to the Infinispan project as well - and indeed I have joined the project dev mailing list.

                                The current project is at a scale where significant data will be loaded into a single node, and there will be occasional writes of key/value pairs into the clustered far cache. When a write happens, the L1 (in-process) cache will have to reload the altered data from the far cache. This just won't perform very well if it has to load the entire node every time. This actually seems to be working currently, except that it does reload the entire node. A slight alteration of the cache loader interceptor will load only the single affected key/value pair on a get request; I think I'll call it an L2 loader to avoid confusion with the current loader. It really will be a small, easy change, and it won't be required in order to use the rest of the code.