6 Replies Latest reply on Dec 3, 2008 5:44 AM by dworq

searchable, pull-based cache solution

dworq Nov 26, 2008 5:26 AM

Hi,

Could anyone please suggest whether JBossCache would be an appropriate solution in the following case?

I am looking for a distributable cache that could be asynchronously propagated to remote clients (over the internet, no VPN available).

The challenge is to have pull-based updates to the clients (that is, the clients should establish the connection, so that they can work behind a firewall). Each client should maintain the latest state of its local cache based on the changes on the server.

And the second thing is to have this cache searchable:
- jofti does not seem to support the latest versions of JBossCache Core, and especially POJO (its website is also unavailable at the moment);
- JBossCache Searchable Edition is based on full-text search which is inappropriate for the project because of speed and size concerns.

Thanks.

1. Re: searchable, pull-based cache solution

manik Nov 26, 2008 5:34 AM (in response to dworq)

jofti has been discontinued AFAIK.

Re: a pull-based solution, look at using a replicated LAN based cluster of caches (or standalone caches in LOCAL mode) on your client side, talking to remote caches across the internet using the TcpDelegatingCacheLoader (configured on the client) and the TcpCacheServer (running on the server side).
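To make the pull direction concrete, here is a minimal sketch in plain Java of a client-side cache that fetches missing entries from a remote source on demand. It is not the JBossCache API: `PullThroughCache`, `remoteLoader`, `evict` and `remoteCalls` are hypothetical names, and the `Function` simply stands in for a network call that the client initiates towards the server.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Client-side cache that pulls missing entries from a remote source on demand. */
class PullThroughCache<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();
    // Hypothetical stand-in for a client-initiated call to the server-side cache.
    private final Function<K, V> remoteLoader;
    private final AtomicInteger remoteCalls = new AtomicInteger();

    PullThroughCache(Function<K, V> remoteLoader) {
        this.remoteLoader = remoteLoader;
    }

    /** Returns the local copy if present, otherwise pulls it from the remote side. */
    V get(K key) {
        V value = local.get(key);
        if (value == null) {
            remoteCalls.incrementAndGet();
            value = remoteLoader.apply(key);
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    /** Drops the local copy so that the next get() pulls again. */
    void evict(K key) {
        local.remove(key);
    }

    int remoteCalls() {
        return remoteCalls.get();
    }

    public static void main(String[] args) {
        // A plain map plays the role of the server-side cache in this demo.
        Map<String, String> server = new HashMap<>();
        server.put("/users/1", "alice");

        PullThroughCache<String, String> client = new PullThroughCache<>(server::get);
        System.out.println(client.get("/users/1")); // pulled from the "server"
        System.out.println(client.get("/users/1")); // served from the local copy

        server.put("/users/1", "bob");
        System.out.println(client.get("/users/1")); // still the old local copy
        client.evict("/users/1");
        System.out.println(client.get("/users/1")); // pulled again after eviction
    }
}
```

Note that in this sketch the client only sees a server-side change after its local copy has been evicted, so an eviction or expiry policy on the client decides how fresh the local state is.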

Re: JBC-Searchable, why do you feel this is inappropriate? You can configure which attributes on your domain model get indexed (using the @Field annotation) to limit the size of indexes generated.
2. Re: searchable, pull-based cache solution

dworq Nov 26, 2008 7:13 AM (in response to dworq)

Thanks for the tip on TcpCacheServer and TcpDelegatingCacheLoader.

Regarding JBC Searchable, won't it be too much of a memory and processor load if all the indexes that are otherwise managed by the database start being indexed by Lucene, in particular foreign keys?

Another reservation is the beta stage of development. Can it be used in an enterprise setting?
3. Re: searchable, pull-based cache solution

manik Nov 26, 2008 3:44 PM (in response to dworq)

Any searchable caching product will have to maintain indexes. I don't think you can get away from it entirely, although you could optimize it to some degree.

And yes, it is still not GA, but we hope to change that soon. :-) In the meantime, any feedback on the current release is always appreciated.
4. Re: searchable, pull-based cache solution

hergaty Nov 27, 2008 7:07 AM (in response to dworq)

Hi Manik,

I think I know what dworq meant when he spoke of memory and performance. The standard indexing mechanism of a database, or of any other storage for relational data, is to build binary trees for those fields that should get indexed. A binary tree for, let's say, a foreign key stored as a 4-byte integer must be as thin as possible to get fast access and low memory consumption.

My question would be how much more memory and time Lucene would take for its full-text indices in comparison to binary trees. Or can Lucene also build such thin indices that are not text (string) based?

regards,
Thomas
5. Re: searchable, pull-based cache solution

epbernard Nov 27, 2008 1:43 PM (in response to dworq)

Lucene is using an inverted index technology, not a b-tree nor bitmap technology. It particularly shines at indexing text documents. It's probably not the best technology to index FKs, though it can do it.
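As a rough illustration of the two index shapes being compared, the sketch below builds both in plain Java. It is only a toy under stated assumptions: `IndexSketch`, `buildTreeIndex` and `buildInvertedIndex` are hypothetical names, `TreeMap` is a red-black tree rather than a database B-tree, and the inverted index is a bare term-to-document map that does not reflect how Lucene stores its data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class IndexSketch {
    /** Sorted-tree index on an integer key (e.g. a foreign key): key -> row numbers. */
    static TreeMap<Integer, List<Integer>> buildTreeIndex(int[] foreignKeys) {
        TreeMap<Integer, List<Integer>> index = new TreeMap<>();
        for (int row = 0; row < foreignKeys.length; row++) {
            index.computeIfAbsent(foreignKeys[row], k -> new ArrayList<>()).add(row);
        }
        return index;
    }

    /** Inverted index over text: lower-cased term -> ids of the documents containing it. */
    static Map<String, SortedSet<Integer>> buildInvertedIndex(String[] docs) {
        Map<String, SortedSet<Integer>> index = new HashMap<>();
        for (int doc = 0; doc < docs.length; doc++) {
            for (String term : docs[doc].toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) {
                    index.computeIfAbsent(term, t -> new TreeSet<>()).add(doc);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        int[] customerIds = {42, 7, 42, 13};
        TreeMap<Integer, List<Integer>> byCustomer = buildTreeIndex(customerIds);
        System.out.println(byCustomer.get(42));                 // [0, 2]
        System.out.println(byCustomer.subMap(10, 50).keySet()); // [13, 42]

        String[] docs = {"Pull based cache", "Searchable cache", "Lucene index"};
        Map<String, SortedSet<Integer>> byTerm = buildInvertedIndex(docs);
        System.out.println(byTerm.get("cache"));                // [0, 1]
    }
}
```

The tree keeps keys ordered, so exact and range lookups on a numeric key come naturally. The inverted index maps each term to the documents containing it, which suits text search; a foreign key can be indexed that way too, by treating the key value as a single term.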

At this stage, I would encourage you to write a small proof of concept. That's the only way to see how much memory the data structure will take and whether Lucene addresses your needs.
6. Re: searchable, pull-based cache solution

dworq Dec 3, 2008 5:44 AM (in response to dworq)

Thanks for the answers. Yes, I was actually looking for a binary search, since a full-text search would compromise speed and memory consumption when applied to thousands or millions of primary keys and to-be-indexed fields. As epbernard says, I will have to try it on a sample application first.