6 Replies Latest reply on Jul 24, 2012 7:14 AM by galder.zamarreno

    How to design a distributed cache topology?

    moia

      Hi,

      I am currently investigating options for our project. I have spent some time playing with Infinispan, but I don't seem to be able to achieve the desired effect. So I would like to ask whether it's actually possible, and how to do it:

      I would like to have a distributed cache consisting of several nodes.

      I need to be able to influence how many elements can be stored on each node. Some of the nodes will behave as 'clients' - they are our applications, which put and retrieve data from the cache. The other nodes will serve only as cache storage.

      I can't use Hot Rod for the clients to connect to the cache, as I need support for transactions. And I don't want the 'client' nodes to store the same amount of data as the other nodes - they already carry quite a heavy memory load from the application's activity.

      The ideal configuration for my purposes would be if the 'client' nodes served only as an L1 cache.

      Does this make any sense, and is it possible with Infinispan?

       

      Best regards,

      Mikolaj

        • 1. Re: How to design a distributed cache topology?
          galder.zamarreno

          The easiest thing is probably to have the client nodes access Infinispan in embedded mode (in-VM), and then adjust the eviction settings per node, depending on how many entries each can hold.

           

          You probably want to use distribution as the clustering mode.

           

          The key thing about eviction is to figure out where to store data once it's evicted (if you want to keep a backup of it). It could be persisted in a shared JDBC database, a file-based cache store, etc.
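          As a rough sketch of that idea, assuming the Infinispan 5.x schema, you could passivate evicted entries to a file-based cache store (the `location` path is just an example):

          ```xml
          <namedCache name="myCache">
            <eviction strategy="LIRS" maxEntries="50" />
            <!-- passivation="true": entries are written to the store only
                 when evicted from memory, and loaded back on access -->
            <loaders passivation="true" shared="false">
              <loader class="org.infinispan.loaders.file.FileCacheStore"
                      fetchPersistentState="false" purgeOnStartup="false">
                <properties>
                  <property name="location" value="/tmp/myCacheStore" />
                </properties>
              </loader>
            </loaders>
          </namedCache>
          ```

          Exact element and attribute names vary between Infinispan versions, so check the schema for the release you're on.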

           

          Another possibility is a near-cache-style setup. This is not natively supported in Hot Rod, but there's a demo I built using JMS that gives clients embedded Infinispan access, with Hot Rod servers behind them. See my presentation at:

          https://www.jboss.org/dms/judcon/2012india/presentations/day1track1session2.pdf

           

          The code for it can be found in our distribution, under the near cache client and server demo code.

          • 2. Re: How to design a distributed cache topology?
            moia

            I'll use a simple example to show my problem here.

            If I have a node C (client) configured as follows:

             

            <namedCache name="myCache">
              <eviction strategy="LIRS" maxEntries="50" />
              <expiration lifespan="-1" maxIdle="-1" reaperEnabled="false" wakeUpInterval="-1" />
              <clustering mode="dist">
                <hash numVirtualNodes="100" numOwners="1" />
                <async/>
              </clustering>
            </namedCache>
            

             

            and another node S (server) configured as follows:

             

            <namedCache name="myCache">
              <eviction strategy="LIRS" maxEntries="10000" />
              <expiration lifespan="-1" maxIdle="-1" reaperEnabled="false" wakeUpInterval="-1" />
              <clustering mode="dist">
                <hash numVirtualNodes="100" numOwners="1" />
                <async/>
              </clustering>
            </namedCache>
            

             

            When I put 1000 elements in through node C, the even hash distribution sends roughly half of them to each node; C then evicts down to its 50-entry limit, so eventually only about 550 elements remain in the cluster. So, even though the cluster's capacity is much larger, nearly half of the elements get evicted.

            I am looking for a configuration where most of the elements end up on node S.

             

            Best regards,

            Mikolaj

            • 3. Re: How to design a distributed cache topology?
              galder.zamarreno

              Right, the only way I can currently think of is to generate the cache keys using the key affinity service.

               

              This lets you generate a key that is guaranteed to be stored on a particular node. So you could code against this service in such a way that node S takes N times more keys than node C.

               

              If you have particular keys that you need to use (in other words, you cannot just use the generated keys), you'd need to map your natural keys to the generated ones. A bit complex, but that's the only way I can think of getting this to work right now.
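              To make the idea concrete, here is a self-contained sketch of what the key affinity service does internally, without any Infinispan dependency - the node names, the hash function, and the 9:1 bias are all illustrative, not Infinispan's actual consistent hash or API: keep generating candidate keys until one hashes to the node you want, and ask for node S far more often than node C.

              ```java
              import java.util.Arrays;
              import java.util.HashMap;
              import java.util.List;
              import java.util.Map;
              import java.util.Random;

              public class KeyAffinitySketch {
                  // Two-node cluster, illustrative only.
                  static final List<String> NODES = Arrays.asList("C", "S");

                  // A (very) simplified stand-in for the consistent hash:
                  // maps a key to the node that would own it.
                  static String ownerOf(String key) {
                      return NODES.get(Math.floorMod(key.hashCode(), NODES.size()));
                  }

                  // Generate a key guaranteed to land on the given node,
                  // by retrying random candidates until one hashes there.
                  static String keyFor(String node, Random rnd) {
                      while (true) {
                          String candidate = "k" + rnd.nextLong();
                          if (ownerOf(candidate).equals(node)) {
                              return candidate;
                          }
                      }
                  }

                  public static void main(String[] args) {
                      Random rnd = new Random(42);
                      Map<String, Integer> perNode = new HashMap<>();
                      // Bias placement: send 9 out of every 10 entries to S.
                      for (int i = 0; i < 1000; i++) {
                          String node = (i % 10 == 0) ? "C" : "S";
                          String key = keyFor(node, rnd);
                          perNode.merge(ownerOf(key), 1, Integer::sum);
                      }
                      // S ends up owning 900 keys, C only 100.
                      System.out.println(perNode);
                  }
              }
              ```

              Infinispan's real KeyAffinityService pre-generates keys per node in the background rather than retrying on demand, but the placement idea is the same.
              
              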

              • 4. Re: How to design a distributed cache topology?
                mircea.markus

                Another approach is to start multiple Infinispan nodes, each in its own VM, on the more powerful machine (S). That way S ends up with more nodes, and proportionally more load, than C.

                • 5. Re: How to design a distributed cache topology?
                  moia

                  Yes, but this will only work as long as I can limit the number of 'C'-type nodes.

                  Is there any other way to influence hash distribution over cluster?

                  I am going to try playing with machineId in transport configuration and numOwners.

                  What if I set numOwners=2, give all my 'C'-type nodes the same machineId, and give each 'S'-type node a different machineId? Would that guarantee that each element I put into the cluster has at least one replica stored on an 'S' node?
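                  Something like the following, assuming the 5.x schema (the clusterName and machineId values are just examples; exact attribute placement may differ by version):

                  ```xml
                  <!-- Shared by every 'C'-type node, so the topology-aware hash
                       treats them as one machine and avoids placing both copies
                       of an entry on them. Each 'S' node gets a unique machineId. -->
                  <global>
                    <transport clusterName="myCluster" machineId="client-machine" />
                  </global>

                  <namedCache name="myCache">
                    <clustering mode="dist">
                      <hash numOwners="2" />
                    </clustering>
                  </namedCache>
                  ```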

                   

                  Best regards,

                  Mikolaj

                  • 6. Re: How to design a distributed cache topology?
                    galder.zamarreno

                    I think that would work too...