
    Distributed Cache

    rs1050

      Hi,

      I am trying to address the following design: I would like to have a set of standalone JVMs (let's say 3 instances) which are the designated storage instances of the distributed cache.

      I also have an app server and other stand-alone processes which need to be able to read data from this distributed cache and update it with new data. Inside these processes I would like to have a near cache, but of a limited size; in other words, if the main cache storage JVMs are configured with a 10,000-object limit, the near cache would have only a 500-object limit. When the app server puts a new object, all members of the cluster should get the new value.

      Importantly, I do not want the app server and other stand-alone processes to be considered storage nodes for the cluster, i.e. when I start them up or bring them down there should be no rebalancing; the data should remain in the main 3 cache storage instances.

      Is it possible to achieve this now with 4.0.0?

      Thank you

        • 1. Re: Distributed Cache
          rs1050

          Having spent some more time reading the documentation, it looks to me that there is no notion of 'storage' vs. 'non-storage' cluster members like there is in Coherence.

          In other words, if a particular JVM is a late starter and wants to get a 'customer' object from the already pre-populated clustered cache, the only way to do it is to make this new JVM a member of the distributed cache, but that implies rehashing. This is especially troublesome if the JVM is short-lived, for example some routine process invoked every 5 minutes by a cron job.

          Is this an accurate description regarding 'storage/non-storage'? I am hoping I missed something, because I would think this is one of the decision-making points for adopting Infinispan vs. Coherence.

          Thank you.

          • 2. Re: Distributed Cache
            manik

            Hi - yes, this is an accurate description. I think what you need is the client/server API which is scheduled for 4.1.0 (or the REST API which is available now). This will let you have the storage nodes effectively as one "cluster", and have your app server nodes use one of the "clients" to query the storage nodes for state, write state there, etc. And if you need a "near cache", you could start a standalone Infinispan instance on the app server node with an aggressive eviction policy, and wrap the REST calls as a CacheStore implementation.
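
            In case it helps, the near cache wrapper could look roughly like this - an untested sketch that sticks to plain JDK classes rather than the CacheStore SPI, with the REST endpoint layout and host name as placeholders you'd adapt:

            import java.io.BufferedReader;
            import java.io.IOException;
            import java.io.InputStreamReader;
            import java.io.OutputStream;
            import java.net.HttpURLConnection;
            import java.net.URL;
            import java.util.LinkedHashMap;
            import java.util.Map;

            // Untested sketch of a size-bounded near cache in front of the
            // Infinispan REST server; endpoint layout and host are placeholders.
            public class RestNearCache {

                private static final int MAX_ENTRIES = 500; // near-cache limit

                private final String baseUrl; // e.g. "http://storage-host:8080/rest/customers"

                // LRU eviction via LinkedHashMap in access-order mode
                private final Map<String, String> near =
                    new LinkedHashMap<String, String>(MAX_ENTRIES, 0.75f, true) {
                        @Override
                        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                            return size() > MAX_ENTRIES;
                        }
                    };

                public RestNearCache(String baseUrl) {
                    this.baseUrl = baseUrl;
                }

                public synchronized String get(String key) throws IOException {
                    String value = near.get(key);
                    if (value == null) {
                        value = httpGet(baseUrl + "/" + key); // miss: ask the storage cluster
                        if (value != null) {
                            near.put(key, value);
                        }
                    }
                    return value;
                }

                public synchronized void put(String key, String value) throws IOException {
                    httpPut(baseUrl + "/" + key, value); // write through to the cluster
                    near.put(key, value);
                }

                private String httpGet(String url) throws IOException {
                    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
                    if (conn.getResponseCode() == 404) {
                        return null; // not in the clustered cache either
                    }
                    BufferedReader reader =
                        new BufferedReader(new InputStreamReader(conn.getInputStream()));
                    try {
                        StringBuilder sb = new StringBuilder();
                        String line;
                        while ((line = reader.readLine()) != null) {
                            sb.append(line);
                        }
                        return sb.toString();
                    } finally {
                        reader.close();
                    }
                }

                private void httpPut(String url, String value) throws IOException {
                    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
                    conn.setRequestMethod("PUT");
                    conn.setDoOutput(true);
                    OutputStream out = conn.getOutputStream();
                    try {
                        out.write(value.getBytes("UTF-8"));
                    } finally {
                        out.close();
                    }
                    conn.getResponseCode(); // force the request to execute
                }
            }

            Whether you then promote this into a proper CacheStore implementation is a design choice; the small eviction bound is the important part.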

            • 3. Re: Distributed Cache
              rs1050

              Is the client/server API still part of 4.1.0? What is your gut feeling for the 4.1.0 GA timeline?

              Thank you.

              • 4. Re: Distributed Cache
                manik

                4.1.0.ALPHA1 is already out, with the client/server stuff speaking the memcached protocol.  ALPHA2, speaking the Hot Rod protocol as well, should be out before Easter.  We're gunning for a final 4.1.0 before the start of summer.
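
                For what it's worth, since the server speaks the standard memcached text protocol, any memcached client should work against it. A rough, untested sketch using spymemcached (the host is a placeholder; 11211 is the conventional memcached port):

                import java.net.InetSocketAddress;

                import net.spy.memcached.MemcachedClient;

                // Untested sketch: smoke-testing the Infinispan memcached
                // endpoint with a plain memcached client.
                public class MemcachedSmokeTest {
                    public static void main(String[] args) throws Exception {
                        MemcachedClient client =
                            new MemcachedClient(new InetSocketAddress("storage-host-1", 11211));
                        client.set("customer-42", 0, "Jane Doe"); // 0 = no expiry
                        System.out.println(client.get("customer-42"));
                        client.shutdown();
                    }
                }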

                • 5. Re: Distributed Cache

                  The functionality described is what I'm interested in as well.

                  Using client-server is not a very attractive option. In that case you don't get clustering, server discovery, etc. for your business instances, and you have to implement all of this yourself.

                  The perfect solution for Java-only systems would be the ability to configure some nodes of the cluster to hold only an L1 cache - these would be the business nodes.

                  Do you have any plans for this functionality?

                  • 6. Re: Distributed Cache
                    manik

                    You mean embedded mode?  This is already there.

                    • 7. Re: Distributed Cache

                      So, can I configure the nodes of the cluster not to store state, and only to have the near cache?

                      • 8. Re: Distributed Cache
                        manik

                        No, they all store state as a "near cache" and share this state across the cluster. 

                        • 9. Re: Distributed Cache
                          galder.zamarreno

                          One thing to note here: using client-server, you do get clustering. The servers can have a clustered configuration, and if you use the Hot Rod client you only need to tell it where one of those instances is; Hot Rod will then query the server and get the list of servers forming the cluster. The Hot Rod client itself will then be able to do load balancing and failover. See http://community.jboss.org/docs/DOC-15356
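
                          Roughly, client code would look something like this (untested sketch against the 4.1 Hot Rod Java client; double-check the property key and defaults against the wiki page above):

                          import java.util.Properties;

                          import org.infinispan.client.hotrod.RemoteCache;
                          import org.infinispan.client.hotrod.RemoteCacheManager;

                          // Untested sketch: one seed server is enough; the Hot Rod
                          // client discovers the rest of the cluster topology itself,
                          // then load-balances and fails over across it.
                          public class HotRodClientSketch {
                              public static void main(String[] args) {
                                  Properties props = new Properties();
                                  props.put("infinispan.client.hotrod.server_list",
                                            "storage-host-1:11222");
                                  RemoteCacheManager rcm = new RemoteCacheManager(props);
                                  RemoteCache<String, String> cache = rcm.getCache();
                                  cache.put("customer-42", "Jane Doe");
                                  System.out.println(cache.get("customer-42"));
                                  rcm.stop();
                              }
                          }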

                          • 10. Re: Distributed Cache

                            This is actually the functionality I was asking about.

                            Why not offer the possibility of nodes that do not share state but get all the other features of cluster nodes? Are there architectural/technological issues I don't see?

                            • 11. Re: Distributed Cache

                              The problem with Hot Rod (at least as I understand it now) is that the client is aware of cluster changes among the storage nodes, but not of other client (= business) nodes! So if I need the clients to be in a sort of cluster too, so that they can communicate, I need to put some additional clustering solution on top of them (e.g. GridGain - but then I could use GridGain's distributed cache instead).

                              • 12. Re: Distributed Cache
                                galder.zamarreno

                                So, you want clients to be in a cluster so that they can communicate - and do what with that information or that communication layer?

                                • 13. Re: Distributed Cache

                                  As always:

                                  (let's set up some terms first: clients = business nodes = bs-nodes for short)

                                  1. Know what other bs-nodes are in the cluster, so that some monitoring node can start new bs-nodes if one crashes or if the load on some kind of bs-node grows, or kill bs-nodes that are no longer needed.

                                  2. Send a message/RPC/MapReduce call to a specific bs-node, a group of bs-nodes, or all bs-nodes, in order to (1) distribute high-load processing between bs-nodes and/or (2) divide processing between nodes functionally, etc.

                                  • 14. Re: Distributed Cache
                                    galder.zamarreno

                                    Re 1. That's outside the scope of Infinispan, which is an in-memory data grid. You can, however, build that functionality with JGroups (a minimal sketch follows below), or you can run your business nodes within clustered JBoss Application Servers, where you can get view information.
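
                                    Something along these lines, say (untested sketch; the channel name is made up and the JGroups default protocol stack is assumed):

                                    import org.jgroups.JChannel;
                                    import org.jgroups.ReceiverAdapter;
                                    import org.jgroups.View;

                                    // Untested sketch: business nodes join a JGroups channel and a
                                    // monitoring node watches the membership view, reacting when
                                    // bs-nodes join, leave or crash.
                                    public class BsNodeMonitor {
                                        public static void main(String[] args) throws Exception {
                                            JChannel channel = new JChannel(); // default stack
                                            channel.setReceiver(new ReceiverAdapter() {
                                                @Override
                                                public void viewAccepted(View view) {
                                                    // fires on every membership change
                                                    System.out.println("bs-node view: " + view.getMembers());
                                                }
                                            });
                                            channel.connect("bs-node-cluster");
                                            // ... application work; membership keeps being tracked ...
                                        }
                                    }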


                                    Re 2. At the Infinispan level, we'll be able to do things like this when https://jira.jboss.org/browse/ISPN-39 has been implemented. Once that's done, we'll look at the possibility of enabling Hot Rod clients to interact with that API, although I'm not sure whether that will be feasible in a protocol-independent manner.
