5 Replies Latest reply on Jun 13, 2013 10:27 AM by cdman

Architectural questions about a system to be built on Infinispan

cdman May 27, 2011 4:24 AM

Hello all,

I'm new to Infinispan, so sorry for any stupid questions. I'm evaluating it for a medium-sized processing platform and would like some feedback on the feasibility of the architecture I've come up with after reading the documentation I found.

The system will have two components:

a GUI component which displays (a subset of) data and generates commands
a datastore / processing component which holds the data and changes it by reacting to the commands sent by the GUI

Important considerations are:

high availability in the datastore tier
low latency
optimal data transfer from the data store to the GUI (i.e. only deltas / changed elements should be transferred)

My current ideas are the following:

use a set of hotrod servers with DIST mode and the number of copies set to a value I would be comfortable with (I'm thinking 2 or 3 currently)
use these servers to store both the current state and the commands (this works out nicely, since I need to keep the commands for later auditing)
make hashing such that commands and objects on which the commands operate get to the same subset of servers
Question: how can I control this? I don't want to control the specific node, but just to ensure that objects A and B get to the same subset of servers
on each hotrod server add custom interceptors (1) which listen for the command objects and when one is intercepted modifies the corresponding object accordingly
the GUI would write the commands to the correct HotRod servers through topology-aware clients
the GUI would contain a local cache with a subset of objects. These objects would be synchronised with the HotRod servers (i.e. when the objects change in the datastore tier / HotRod tier, the change is propagated to the GUI)
Question: what is the best way to achieve this (to synchronise a local cache with a subset of data from a set of HotRod servers)? The only option I'm aware of currently is continuous queries (4)
inside the data tier there would be "supporting" information which is needed by nodes, but may not necessarily be on the local node (think for example of configuration which can be updated at runtime, but also more dynamic information). From what I've read, the L1 cache feature (5) would be perfect for this, except for the fact that it uses invalidation when the data changes, rather than sending an update (i.e. if the data changes, it is invalidated and the non-local nodes have to fetch it again)
Question: is it possible to configure the L1 cache mechanism such that the original node sends updates when the data changes rather than invalidations?

How optimal is the solution which I came up with? How could it be improved? I've read about the Distributed Data Stream Processing Framework in Infinispan (3), but it seems to be more of a one-off solution (i.e. generate a report about all the existing objects at a given moment) rather than something which reacts to a new command as soon as it is written to the cache.

I'm looking to implement a data grid where each node contains the data and the code to operate on that data. I will also be evaluating Hazelcast and GigaSpaces, but currently Infinispan seems to be the better alternative since it could be reused in multiple places in the architecture, making it easier to maintain and to understand. The JBoss Data Grid (2) also sounds interesting, but unfortunately it's not available yet.

Best regards,

Attila Balazs

(1) http://community.jboss.org/wiki/InfinispanCustomInterceptors

(2) http://www.jboss.com/edg6-early-access/

(3) http://community.jboss.org/wiki/DistributedDataStreamProcessingFrameworkInInfinispan

(4) http://community.jboss.org/wiki/ContinuousQueryWithInfinispan

(5) http://community.jboss.org/wiki/ClusteringModes#L1

1. Re: Architectural questions about a system to be built on Infinispan

galder.zamarreno Jun 1, 2011 1:05 PM (in response to cdman)

For controlling hashing, you can do several things:
- use the new grouping API (https://issues.jboss.org/browse/ISPN-312)
- or put the commands and objects within an atomic map and that will guarantee that related objects are stored in the same nodes
- or use the key affinity service

The new grouping API would be the preferred method.
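
The idea behind grouping can be illustrated without any Infinispan code. The sketch below is plain Java and is not the Infinispan API or its consistent hash: it simply picks the owner nodes from the hash of a group identifier instead of the full key, so a command and the object it operates on end up on the same subset of nodes. The class name GroupedOwners, the "<group>:<rest>" key convention and the ring-style owner selection are assumptions made for this illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a toy owner-selection scheme, not Infinispan's implementation.
final class GroupedOwners {
    private final List<String> nodes;
    private final int numOwners;

    GroupedOwners(List<String> nodes, int numOwners) {
        if (nodes.isEmpty() || numOwners < 1 || numOwners > nodes.size()) {
            throw new IllegalArgumentException("need 1..nodes.size() owners");
        }
        this.nodes = new ArrayList<>(nodes);
        this.numOwners = numOwners;
    }

    // Hypothetical key convention: "<group>:<rest>", e.g. "order-42:cmd-7".
    // Keys without a ':' form their own group.
    static String groupOf(String key) {
        int idx = key.indexOf(':');
        return idx < 0 ? key : key.substring(0, idx);
    }

    // Owners depend only on the group, never on the rest of the key.
    List<String> ownersOf(String key) {
        int start = Math.floorMod(groupOf(key).hashCode(), nodes.size());
        List<String> owners = new ArrayList<>(numOwners);
        for (int i = 0; i < numOwners; i++) {
            owners.add(nodes.get((start + i) % nodes.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        List<String> cluster = java.util.Arrays.asList("node-a", "node-b", "node-c", "node-d");
        GroupedOwners hash = new GroupedOwners(cluster, 2);
        // Both lines print the same pair of nodes because the keys share the group "order-42".
        System.out.println(hash.ownersOf("order-42:state"));
        System.out.println(hash.ownersOf("order-42:cmd-7"));
    }
}
```

The caller never names a specific node; it only decides which keys belong together, which is what the original question asked for.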

Rather than custom interceptors, I'd just use listeners which are probably better suited for your use case http://community.jboss.org/docs/DOC-14871

For notifications from the backend to the GUI, you can either use continuous queries (this is experimental!), or you could use plain standard JMS notifications from the server back to the clients.

Well, the L1 is just there to support distribution, where only a subset of the nodes contain the data. So, when a non-owner requests the info, it keeps it for a short period of time, but when it is invalidated, it can always go back and retrieve it from the other nodes. So, I don't understand what you want L1 for here. You could have a replicated cache for this information if you want data to be available locally, but replication only works on small clusters.
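
The invalidation behaviour described above can be sketched in a few lines of plain Java. This is a toy model, not Infinispan code: the class L1Sketch, its two maps (one standing in for the owner nodes, one for the L1 copy on a non-owner) and the remoteFetches counter are all hypothetical, and the lifespan-based expiry of L1 entries is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a non-owner node with an L1 copy; not Infinispan's implementation.
final class L1Sketch {
    // Stands in for the nodes that own the data.
    private final Map<String, String> owners = new HashMap<>();
    // Stands in for the short-lived L1 copy kept on a non-owner.
    private final Map<String, String> l1 = new HashMap<>();
    // Counts how often the non-owner had to go back to the owners.
    int remoteFetches = 0;

    // Read from the non-owner: serve from L1 if present, otherwise fetch and keep a copy.
    String get(String key) {
        String cached = l1.get(key);
        if (cached != null) {
            return cached;
        }
        remoteFetches++;
        String value = owners.get(key);
        if (value != null) {
            l1.put(key, value);
        }
        return value;
    }

    // Write to the owners: the L1 copy is dropped (invalidated), not updated.
    void put(String key, String value) {
        owners.put(key, value);
        l1.remove(key);
    }

    public static void main(String[] args) {
        L1Sketch node = new L1Sketch();
        node.put("config", "v1");
        node.get("config"); // remote fetch, fills L1
        node.get("config"); // served from L1
        node.put("config", "v2"); // invalidates the L1 copy
        node.get("config"); // remote fetch again
        System.out.println("remote fetches: " + node.remoteFetches);
    }
}
```

An update-based variant would replace the `l1.remove(key)` call with a write of the new value into the L1 map, which is the behaviour the original question asked about.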
2. Re: Architectural questions about a system to be built on Infinispan

mircea.markus Jun 3, 2011 4:37 AM (in response to galder.zamarreno)

For notifications from the backend to the GUI, you can either use continuous queries (this is experimental!), or you could use plain standard JMS notifications from the server back to the clients.
We plan to support notifications over hotrod as well: https://issues.jboss.org/browse/ISPN-374
Feel free to vote for it if you find it useful, or if you want to contribute the feature just let us know!
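
Whatever transport ends up carrying the notifications (JMS, continuous queries or something else), the GUI side of the original design boils down to a local cache that is told only about changes to the keys it cares about. The following is a minimal plain-Java sketch of that pattern, with no Infinispan, HotRod or JMS calls in it. NotifyingStore, its Subscriber interface and the key names are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustration only: an in-process stand-in for "server pushes changes to interested clients".
final class NotifyingStore {
    interface Subscriber {
        void onChange(String key, String value);
    }

    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Subscriber>> interest = new HashMap<>();

    // A client registers interest in a single key (its subset of the data).
    void subscribe(String key, Subscriber subscriber) {
        interest.computeIfAbsent(key, k -> new ArrayList<>()).add(subscriber);
    }

    // Only real changes are pushed, and only to subscribers of that key.
    void put(String key, String value) {
        String previous = data.put(key, value);
        if (Objects.equals(previous, value)) {
            return; // value unchanged, nothing to transfer
        }
        for (Subscriber s : interest.getOrDefault(key, Collections.<Subscriber>emptyList())) {
            s.onChange(key, value);
        }
    }

    public static void main(String[] args) {
        NotifyingStore store = new NotifyingStore();
        Map<String, String> guiCache = new HashMap<>();
        store.subscribe("order-1", guiCache::put); // the GUI only watches order-1

        store.put("order-1", "OPEN");
        store.put("order-2", "OPEN"); // not watched, never reaches the GUI cache
        store.put("order-1", "FILLED");
        System.out.println(guiCache);
    }
}
```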
3. Re: Architectural questions about a system to be built on Infinispan

mircea.markus Jun 3, 2011 4:42 AM (in response to mircea.markus)

on each hotrod server add custom interceptors (1) which listen for the command objects and when one is intercepted modifies the corresponding object accordingly
You'll be able to solve this more nicely with https://issues.jboss.org/browse/ISPN-1094, but your approach sounds good as well.
4. Re: Architectural questions about a system to be built on Infinispan

pmuir Jun 20, 2011 7:36 AM (in response to galder.zamarreno)

Long time coming, but I just added this article to the docs on the Grouping API :-)
5. Re: Architectural questions about a system to be built on Infinispan

cdman Jun 13, 2013 10:27 AM (in response to cdman)
Hello all,

Thank you for the feedback. It has been a long time coming, but I finally put together a concrete code example using some of the techniques from above. You can read the article about it here: http://www.todaysoftmag.com/article/en/12/High_availability_performance_systems_using_data_grids_in_Java_437 and get the code here: https://github.com/cdman/infinispan-exchange

I would also like to thank Dan Berindei for his help.

Some updates:
hotrod is not used because watching for elements would have needed implementing an interceptor (I understand from Dan that listeners are called before the element is actually committed and using them would have created the risk of double-processing). Instead I just created a REST endpoint
also, hotrod doesn't support continuous queries, so there is no good solution for the output
another issue with HotRod (in my opinion) is that the implementation of non-Java clients seems to have stalled a little bit
the transaction + failover support seems to have some bugs which Dan is ironing out (or has already ironed out?)
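
On the double-processing risk mentioned in the first point: one general way to tolerate a command being delivered more than once is to make command handling idempotent by remembering which command IDs have already been applied. This is a different technique from the interceptor / REST endpoint approach described above and is not taken from the linked repository. The sketch is plain Java, and the names IdempotentProcessor, commandId and the numeric "position" being adjusted are purely hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Illustration only: de-duplicates commands by ID so a repeated delivery has no effect.
final class IdempotentProcessor {
    private final Set<String> processed = new HashSet<>();
    private long position = 0;

    // Applies the command once; returns false if this ID was already handled.
    boolean apply(String commandId, long delta) {
        if (!processed.add(commandId)) {
            return false; // duplicate delivery, ignore
        }
        position += delta;
        return true;
    }

    long position() {
        return position;
    }

    public static void main(String[] args) {
        IdempotentProcessor processor = new IdempotentProcessor();
        processor.apply("cmd-1", 10);
        processor.apply("cmd-1", 10); // same command delivered twice
        processor.apply("cmd-2", -3);
        System.out.println(processor.position());
    }
}
```

In a real grid the set of processed IDs would have to live alongside the object state so that it survives failover; the sketch keeps it in memory for brevity.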

Unfortunately the performance I got out of the system (~1000 TPS) is too low for the kind of solution I was looking for (feel free to look at the code and suggest improvements, but the matching engine itself supports ~400k operations on the same system and a test with Radargun confirmed that this value is in the range of values to be expected).

Thank you,
Attila
