1 Reply Latest reply on Aug 12, 2009 4:35 AM by manik

Best practice for large read-only map?

edwardgao Aug 11, 2009 4:49 PM

Hello, thanks for the great product! I am new to this area, please forgive me if the question looks silly.

The problem I am facing is that I have a very large (say, 20 GB) map that cannot fit into a single machine's memory. The map is read-only, however. I will have multiple clients requesting values from the map, and I want each of these clients to also host a part of the map.

An ideal scenario would be that 200 clients are spawned and each of them hosts 100 MB or 200 MB of the map. Some of the clients may terminate early, in which case the data they host is redistributed to the other clients. New clients may also be spawned at any time, and each should then automatically take over some of the data and serve it to other clients.

I guess the REPLICATE mode will not work, since every node is supposed to host all the data. Which of the other two modes should I use, then? Also, is there any way to take advantage of the read-only nature of the map, since no writes or updates are expected?

Thanks a lot in advance for your help!

1. Re: Best practice for large read-only map?

manik Aug 12, 2009 4:35 AM (in response to edwardgao)

DIST would be what you need. You're correct in that replication will not help you.
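To illustrate the general idea behind distributing keys across nodes, here is a toy consistent-hash ring in plain Java. It is only a sketch under stated assumptions and is not Infinispan's actual DIST implementation: the class `ToyHashRing`, the node names, and the virtual-node count are all hypothetical. It shows the property asked about in the question: when a node leaves, only the keys that node owned are handed to other nodes.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy consistent-hash ring. Hypothetical illustration, NOT Infinispan code. */
final class ToyHashRing {
    // Assumed value: more virtual nodes gives a more even spread of keys.
    private static final int VIRTUAL_NODES = 64;

    // Ring position -> name of the node that owns that position.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    // Simple bit-mixing step on top of String.hashCode().
    private static int hash(String s) {
        int h = s.hashCode();
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h;
    }

    /** A joining node claims several positions on the ring. */
    void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    /** A leaving node gives up all of its positions. */
    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    /** The owner is the first node at or after the key's position, wrapping around. */
    String ownerOf(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in the ring");
        }
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ToyHashRing ring = new ToyHashRing();
        ring.addNode("client-1");
        ring.addNode("client-2");
        ring.addNode("client-3");
        System.out.println("owner before: " + ring.ownerOf("some-key"));
        ring.removeNode("client-2"); // client-2 terminates early
        System.out.println("owner after:  " + ring.ownerOf("some-key"));
    }
}
```

In this sketch a key changes owner only if its previous owner left, which is why clients joining or leaving do not force the whole map to be reshuffled.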

Regarding the read-only nature of the map, ensure you have L1 caching enabled as well so that you can minimise the cost of repeated lookups on a single, remote key.
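As a rough picture of what an L1 (near) cache buys you, here is a minimal sketch in plain Java. It does not use the Infinispan API: `ToyL1Cache` and the `remoteLookup` function (standing in for a network fetch from the node that owns the key) are hypothetical names for illustration. Because the map in this thread is read-only, the sketch skips invalidation and expiry, and it never evicts, all of which a real cache would need to handle.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy near cache in front of a remote lookup. Hypothetical, NOT Infinispan code. */
final class ToyL1Cache<K, V> {
    // Local copies of values that were fetched remotely at least once.
    private final Map<K, V> local = new HashMap<>();

    // Stand-in for "ask the owning node over the network" (an assumption).
    private final Function<K, V> remoteLookup;

    // Counts how often the expensive remote path was taken.
    private int remoteCalls = 0;

    ToyL1Cache(Function<K, V> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    V get(K key) {
        V value = local.get(key);
        if (value == null) {
            // Miss: pay the remote cost once, then keep a local copy.
            remoteCalls++;
            value = remoteLookup.apply(key);
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    int remoteCalls() {
        return remoteCalls;
    }

    public static void main(String[] args) {
        ToyL1Cache<String, String> cache = new ToyL1Cache<>(k -> "value-of-" + k);
        cache.get("k1");
        cache.get("k1"); // served from the local copy
        System.out.println("remote calls: " + cache.remoteCalls());
    }
}
```

Repeated lookups of the same remote key hit the network only the first time, which is the saving described above.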

Keep in mind that as of ALPHA6, DIST is pretty unstable. Trunk is a bit better and I hope to cut a far more complete and stable version as BETA1 soon.