I'm new to Infinispan, so apologies for any basic questions. I'm evaluating it for a medium-sized processing platform and would like some feedback on the feasibility of the architecture I've come up with after reading the documentation I could find.
The system will have two components:
- a GUI component which displays (a subset of) data and generates commands
- a datastore / processing component which holds the data and changes it by reacting to the commands sent by the GUI
Important considerations are:
- high availability in the datastore tier
- low latency
- optimal data transfer from the datastore to the GUI (i.e. only deltas / changed elements should be transferred)
My current ideas are the following:
- use a set of HotRod servers in DIST mode, with the number of copies set to a value I'm comfortable with (I'm currently thinking 2 or 3)
- use these servers to store both the current state and the commands (this works out nicely, since I need to keep the commands for later auditing)
- configure hashing so that commands and the objects they operate on end up on the same subset of servers
Question: how can I control this? I don't want to control the specific node, but just to ensure that objects A and B get to the same subset of servers
- on each HotRod server, add custom interceptors (1) which listen for command objects and, when one is intercepted, modify the corresponding object accordingly
- the GUI would write the commands to the correct HotRod servers through topology-aware clients
- the GUI would contain a local cache with a subset of objects. These objects would be synchronised with the HotRod servers (i.e. when the objects change in the datastore tier / HotRod tier, the change is propagated to the GUI)
Question: what is the best way to achieve this (to synchronise a local cache with a subset of data from a set of HotRod servers)? The only option I'm currently aware of is continuous queries (4)
- inside the data tier there would be "supporting" information which is needed by nodes but does not necessarily reside on the local node (think, for example, of configuration which can be updated at runtime, but also of more dynamic information). From what I've read, the L1 cache feature (5) would be perfect for this, except that it uses invalidation when the data changes rather than sending an update (i.e. if the data changes, it is invalidated and the non-local nodes have to fetch it again)
Question: is it possible to configure the L1 cache mechanism, such that the original node sends updates when the data changes rather than invalidations?
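To make the co-location requirement above more concrete, here is a minimal sketch of the behaviour I'm after. It is plain Java and does not use any Infinispan API; the `#` key convention, the `groupOf` and `owners` helpers and the modulo-based placement are all assumptions made up for illustration, not how Infinispan actually hashes. The point is only that placement depends on a shared group id rather than on the full key:

```java
import java.util.ArrayList;
import java.util.List;

public class CoLocationSketch {

    // Hypothetical key convention: "<entityId>#<suffix>", e.g. "order-42#state"
    // for the object and "order-42#cmd-1" for a command that targets it.
    public static String groupOf(String key) {
        int sep = key.indexOf('#');
        return sep < 0 ? key : key.substring(0, sep);
    }

    // Picks `copies` consecutive nodes out of `nodeCount`, starting at a position
    // derived from the group only, so every key in a group gets the same owners.
    // This is a toy placement function, not Infinispan's consistent hash.
    public static List<Integer> owners(String key, int nodeCount, int copies) {
        int start = Math.floorMod(groupOf(key).hashCode(), nodeCount);
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            result.add((start + i) % nodeCount);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> stateOwners = owners("order-42#state", 5, 2);
        List<Integer> commandOwners = owners("order-42#cmd-1", 5, 2);
        // The object and its command share a group, so they share owners.
        System.out.println(stateOwners.equals(commandOwners)); // prints "true"
    }
}
```

I never choose a specific node here; I only guarantee that the object and its commands map to the same set of owners, which is what I'd like the real hashing to do.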
How optimal is the solution I came up with? How could it be improved? I've read about the Distributed Data Stream Processing Framework in Infinispan (3), but it seems to be more of a one-off solution (i.e. generate a report about all the existing objects at a given moment) rather than something which reacts to a new command as soon as it is written to the cache.
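As an illustration of the "react as soon as the command is written" behaviour, this is roughly what I'd want each data-tier node to do. Again this is a plain Java sketch with no Infinispan API involved: `CommandReactorSketch`, `AddCommand` and `writeCommand` are hypothetical names, and the `HashMap` merely stands in for the node's share of the cache:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CommandReactorSketch {

    // Hypothetical command: add `delta` to the counter stored under `targetKey`.
    public static final class AddCommand {
        final String targetKey;
        final int delta;

        public AddCommand(String targetKey, int delta) {
            this.targetKey = targetKey;
            this.delta = delta;
        }
    }

    // Stand-ins for the node-local state and the stored commands.
    private final Map<String, Integer> state = new HashMap<>();
    private final List<String> auditLog = new ArrayList<>();

    // Invoked for every command write: the command key is kept for auditing
    // and the command is applied to its target object immediately.
    public void writeCommand(String commandKey, AddCommand command) {
        auditLog.add(commandKey);
        state.merge(command.targetKey, command.delta, Integer::sum);
    }

    public int read(String key) {
        return state.getOrDefault(key, 0);
    }

    public List<String> auditLog() {
        return auditLog;
    }

    public static void main(String[] args) {
        CommandReactorSketch node = new CommandReactorSketch();
        node.writeCommand("order-42#cmd-1", new AddCommand("order-42#state", 5));
        node.writeCommand("order-42#cmd-2", new AddCommand("order-42#state", -2));
        System.out.println(node.read("order-42#state")); // prints "3"
    }
}
```

In contrast to a batch job over all existing entries, the state here is updated per command at write time, and the commands remain available afterwards for auditing.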
I'm looking to implement a data grid where each node contains both the data and the code that operates on that data. I will also be evaluating Hazelcast and GigaSpaces, but currently Infinispan seems to be the better alternative, since it could be reused in multiple places in the architecture, making it easier to maintain and to understand. The JBoss Data Grid (2) also sounds interesting, but unfortunately it's not available yet.