
    Distributed Cache

    rs1050

      Hi,

      I am trying to address the following design: I would like to have a set of standalone JVMs (let's say 3 instances) which are the designated storage instances of the distributed cache.

      I also have an app server and other stand-alone processes which need to be able to read data from this distributed cache and update it with new data. Inside these processes I would like to have a near cache, but of a limited size; in other words, if the main cache storage JVMs are configured with a 10,000-object limit, the near cache would have only a 500-object limit. When the app server puts a new object, all members of the cluster should get the new value.

      Importantly, I do not want the app server and other stand-alone processes to be considered storage nodes for the cluster, i.e. when I start them up or bring them down there should be no rebalancing; the data should remain in the main 3 cache storage instances.

      Is it possible to achieve this now with 4.0.0?

      Thank you

        • 1. Re: Distributed Cache
          rs1050

          Having spent some more time reading the documentation, it looks to me that there is no notion of 'storage' vs. 'non-storage' cluster members like there is in Coherence.

          In other words, if a particular JVM is a late starter and wants to get a 'customer' object from the already pre-populated clustered cache, the only way to do it is to make this new JVM a member of the distributed cache, but that implies rehashing. This is especially troublesome if the JVM is short-lived, for example some routine process invoked every 5 minutes by a cron job.

          Is this an accurate description regarding 'storage/non-storage'? I am hoping I missed something, because I would think this is one of the decision-making points for adopting Infinispan vs. Coherence.

          Thank you.

          • 2. Re: Distributed Cache
            manik

            Hi - yes, this is an accurate description. I think what you need is the client/server API which is scheduled for 4.1.0 (or the REST API which is available now). This will let you have the storage nodes effectively as one "cluster", and have your app server nodes use one of the "clients" to query the storage nodes for state, write state there, etc. And if you need a "near cache", you could start a standalone Infinispan instance on the app server node with an aggressive eviction policy, and wrap the REST calls as a CacheStore implementation.
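
            In case it helps, the near cache wrapper could look roughly like this - an untested sketch that sticks to plain JDK classes rather than the CacheStore SPI, with the REST endpoint layout and host name as placeholders you'd adapt:

            import java.io.BufferedReader;
            import java.io.IOException;
            import java.io.InputStreamReader;
            import java.io.OutputStream;
            import java.net.HttpURLConnection;
            import java.net.URL;
            import java.util.LinkedHashMap;
            import java.util.Map;

            // Untested sketch of a size-bounded near cache in front of the
            // Infinispan REST server; endpoint layout and host are placeholders.
            public class RestNearCache {

                private static final int MAX_ENTRIES = 500; // near-cache limit

                private final String baseUrl; // e.g. "http://storage-host:8080/rest/customers"

                // LRU eviction via LinkedHashMap in access-order mode
                private final Map<String, String> near =
                    new LinkedHashMap<String, String>(MAX_ENTRIES, 0.75f, true) {
                        @Override
                        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                            return size() > MAX_ENTRIES;
                        }
                    };

                public RestNearCache(String baseUrl) {
                    this.baseUrl = baseUrl;
                }

                public synchronized String get(String key) throws IOException {
                    String value = near.get(key);
                    if (value == null) {
                        value = httpGet(baseUrl + "/" + key); // miss: ask the storage cluster
                        if (value != null) {
                            near.put(key, value);
                        }
                    }
                    return value;
                }

                public synchronized void put(String key, String value) throws IOException {
                    httpPut(baseUrl + "/" + key, value); // write through to the cluster
                    near.put(key, value);
                }

                private String httpGet(String url) throws IOException {
                    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
                    if (conn.getResponseCode() == 404) {
                        return null; // not in the clustered cache either
                    }
                    BufferedReader reader =
                        new BufferedReader(new InputStreamReader(conn.getInputStream()));
                    try {
                        StringBuilder sb = new StringBuilder();
                        String line;
                        while ((line = reader.readLine()) != null) {
                            sb.append(line);
                        }
                        return sb.toString();
                    } finally {
                        reader.close();
                    }
                }

                private void httpPut(String url, String value) throws IOException {
                    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
                    conn.setRequestMethod("PUT");
                    conn.setDoOutput(true);
                    OutputStream out = conn.getOutputStream();
                    try {
                        out.write(value.getBytes("UTF-8"));
                    } finally {
                        out.close();
                    }
                    conn.getResponseCode(); // force the request to execute
                }
            }

            Whether you then promote this into a proper CacheStore implementation is a design choice; the small eviction bound is the important part.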

            • 3. Re: Distributed Cache
              rs1050

              Is the client/server API still part of 4.1.0? What is your gut feeling for the 4.1.0 GA timeline?

              Thank you.

              • 4. Re: Distributed Cache
                manik

                4.1.0.ALPHA1 is already out, with the client/server stuff speaking the memcached protocol.  ALPHA2, speaking the Hot Rod protocol as well, should be out before Easter.  We're gunning for a final 4.1.0 before the start of summer.
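
                For what it's worth, since the server speaks the standard memcached text protocol, any memcached client should work against it. A rough, untested sketch using spymemcached (the host is a placeholder; 11211 is the conventional memcached port):

                import java.net.InetSocketAddress;

                import net.spy.memcached.MemcachedClient;

                // Untested sketch: smoke-testing the Infinispan memcached
                // endpoint with a plain memcached client.
                public class MemcachedSmokeTest {
                    public static void main(String[] args) throws Exception {
                        MemcachedClient client =
                            new MemcachedClient(new InetSocketAddress("storage-host-1", 11211));
                        client.set("customer-42", 0, "Jane Doe"); // 0 = no expiry
                        System.out.println(client.get("customer-42"));
                        client.shutdown();
                    }
                }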

                • 5. Re: Distributed Cache

                  The functionality described is what I'm interested in as well.

                  Using client-server is not a very attractive option. In that case you don't get clustering, server discovery, etc. for your business instances, and you have to implement all of this yourself.

                  The perfect solution for Java-only systems would be the ability to configure some nodes of the cluster to hold only an L1 cache - these would be the business nodes.

                  Do you have any plans for this functionality?

                  • 6. Re: Distributed Cache
                    manik

                    You mean embedded mode?  This is already there.

                    • 7. Re: Distributed Cache

                      So, can I configure the nodes of the cluster not to store state, and only to have the near cache?

                      • 8. Re: Distributed Cache
                        manik

                        No, they all store state as a "near cache" and share this state across the cluster. 

                        • 9. Re: Distributed Cache
                          galder.zamarreno

                          One thing to note here: using client-server, you do get clustering. The servers can have a clustered configuration, and if you use the Hot Rod client you only need to tell it where one of those instances is; Hot Rod will then query the server and get the list of servers forming the cluster. The Hot Rod client itself will then be able to do load balancing and failover. See http://community.jboss.org/docs/DOC-15356
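
                          Roughly, client code would look something like this (untested sketch against the 4.1 Hot Rod Java client; double-check the property key and defaults against the wiki page above):

                          import java.util.Properties;

                          import org.infinispan.client.hotrod.RemoteCache;
                          import org.infinispan.client.hotrod.RemoteCacheManager;

                          // Untested sketch: one seed server is enough; the Hot Rod
                          // client discovers the rest of the cluster topology itself,
                          // then load-balances and fails over across it.
                          public class HotRodClientSketch {
                              public static void main(String[] args) {
                                  Properties props = new Properties();
                                  props.put("infinispan.client.hotrod.server_list",
                                            "storage-host-1:11222");
                                  RemoteCacheManager rcm = new RemoteCacheManager(props);
                                  RemoteCache<String, String> cache = rcm.getCache();
                                  cache.put("customer-42", "Jane Doe");
                                  System.out.println(cache.get("customer-42"));
                                  rcm.stop();
                              }
                          }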

                          • 10. Re: Distributed Cache

                            This is actually the functionality I was asking about.

                            Why not offer the possibility of nodes that do not share state but get all the other features of cluster nodes? Are there architectural/technological issues I don't see?

                            • 11. Re: Distributed Cache

                              The problem with Hot Rod (at least as I understand it now) is that the client is aware of cluster changes among the storage nodes, but not of other client (= business) nodes! So if I need the clients to be in a sort of cluster too, so that they can communicate, I need to put some additional clustering solution on top of them (e.g. GridGain - but then I could use GridGain's distributed cache instead).

                              • 12. Re: Distributed Cache
                                galder.zamarreno

                                So, you want clients to be in a cluster so that they can communicate - and do what with that information or that communication layer?

                                • 13. Re: Distributed Cache

                                  As always:

                                  (let's set up some terms first: clients = business nodes = bs-nodes for short)

                                  1. Know what other bs-nodes are in the cluster, so that some monitoring node can start new bs-nodes if one crashes or if the load on some kind of bs-node grows, or kill bs-nodes that are no longer needed.

                                  2. Send a message/RPC/MapReduce call to a specific bs-node, a group of bs-nodes, or all bs-nodes, in order to (1) distribute high-load processing between bs-nodes and/or (2) divide processing between nodes functionally, etc.

                                  • 14. Re: Distributed Cache
                                    galder.zamarreno

                                    Re 1. That's outside the scope of Infinispan, which is an in-memory data grid. You can, however, build that functionality with JGroups (a minimal sketch follows below), or you can run your business nodes within clustered JBoss Application Servers, where you can get view information.
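
                                    Something along these lines, say (untested sketch; the channel name is made up and the JGroups default protocol stack is assumed):

                                    import org.jgroups.JChannel;
                                    import org.jgroups.ReceiverAdapter;
                                    import org.jgroups.View;

                                    // Untested sketch: business nodes join a JGroups channel and a
                                    // monitoring node watches the membership view, reacting when
                                    // bs-nodes join, leave or crash.
                                    public class BsNodeMonitor {
                                        public static void main(String[] args) throws Exception {
                                            JChannel channel = new JChannel(); // default stack
                                            channel.setReceiver(new ReceiverAdapter() {
                                                @Override
                                                public void viewAccepted(View view) {
                                                    // fires on every membership change
                                                    System.out.println("bs-node view: " + view.getMembers());
                                                }
                                            });
                                            channel.connect("bs-node-cluster");
                                            // ... application work; membership keeps being tracked ...
                                        }
                                    }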


                                    Re 2. At the Infinispan level, we'll be able to do things like this when https://jira.jboss.org/browse/ISPN-39 has been implemented. Once that's done, we'll look at the possibility of enabling Hot Rod clients to interact with that API, although I'm not sure whether that will be feasible in a protocol-independent manner.
