-
1. Re: Inserting millions of records in remote cache
wdfink Mar 20, 2014 3:39 AM (in response to nikhil_joshi)Because the entries are distributed across the cache and the store by the consistent hash, you need to add them via a HotRod client. Whether you benefit from doing that in chunks with different threads depends on the power of your server.
What do you mean by "mysql table which has all these records"? Is that an existing table with the keys/values? AFAIK that is not possible, as ISPN stores additional information with the key, so a simple key/value table cannot be used, but this is subject to change.
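The chunked, multi-threaded insertion wdfink suggests can be sketched roughly as follows. This is an illustrative sketch only, not Infinispan API: the target here is a plain ConcurrentHashMap standing in for a HotRod RemoteCache (which also implements java.util.Map, so the same putAll-per-chunk pattern applies), and ChunkedLoader, loadInChunks, the chunk size and the thread count are all made-up names and values to be tuned for your server.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: split the source entries into fixed-size chunks and let a small
// thread pool write each chunk with one putAll() call.
public class ChunkedLoader {

    public static void loadInChunks(Map<String, String> target,
                                    Map<String, String> source,
                                    int chunkSize, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<String, String> chunk = new HashMap<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            chunk.put(e.getKey(), e.getValue());
            if (chunk.size() == chunkSize) {
                final Map<String, String> batch = chunk;
                pool.submit(() -> target.putAll(batch)); // one bulk write per chunk
                chunk = new HashMap<>();
            }
        }
        if (!chunk.isEmpty()) {
            final Map<String, String> batch = chunk;
            pool.submit(() -> target.putAll(batch));     // trailing partial chunk
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> source = new HashMap<>();
        for (int i = 0; i < 5_000; i++) source.put("key-" + i, "value-" + i);
        // Stand-in for a RemoteCache obtained from a RemoteCacheManager.
        Map<String, String> target = new ConcurrentHashMap<>();
        loadInChunks(target, source, 500, 4);
        System.out.println("loaded " + target.size() + " entries");
    }
}
```

The sweet spot for chunk size and thread count has to be found by benchmarking against your own grid.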
-
2. Re: Inserting millions of records in remote cache
rpelisse Mar 20, 2014 6:05 AM (in response to nikhil_joshi)Hi,
If your data model allows it, you could try to leverage the "concurrent nature" of Infinispan: simply fire several clients and try to find the sweet spot (i.e. the number of concurrent clients that leads to the best performance). Also, if your data lives in a datastore (SQL or otherwise), you could try to load it using the CacheStore API; it might just be more efficient.
Off the top of my head, I'm not sure whether the Batch API is supported remotely, but if so, batching the inserts (in chunks of 1000, 10000 or more) can also help performance.
Other than that, the usual JVM "tricks" may help overall performance (large heap, GC collector settings, large pages, and so on). I can't put my hands on it right now, but there is a nice blog entry from Shane Johnson about those.
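For the loading client, those JVM tweaks might look like this. The flag values below are illustrative placeholders, not recommendations, and bulk-loader.jar is a hypothetical client; heap size and collector choice must be benchmarked for your workload.

```shell
# Illustrative flags only; sizes and collector choice must be benchmarked.
# -Xms/-Xmx: large fixed-size heap to avoid resizing during the load
# -XX:+UseLargePages: back the heap with large memory pages
# -XX:+UseConcMarkSweepGC: one possible GC collector setting among several
java -Xms4g -Xmx4g -XX:+UseLargePages -XX:+UseConcMarkSweepGC -jar bulk-loader.jar
```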
-
3. Re: Inserting millions of records in remote cache
nikhil_joshi Mar 20, 2014 10:50 AM (in response to wdfink)What do you mean by "mysql table which has all these records"? Is that an existing table with the keys/values? AFAIK that is not possible, as ISPN stores additional information with the key, so a simple key/value table cannot be used, but this is subject to change.
Actually the way I imagined this was
1) "string-keyed-jdbc-store" would store the key/values in the table as human-readable strings, not binary or encrypted; the case would be different if I used a binary store.
2) If I explicitly map id, data & timestamp to database columns in the configuration, then Infinispan will not bother about the other columns present in that table.
<string-keyed-jdbc-store datasource="java:jboss/datasources/dbs" passivation="false" preload="true" purge="false">
<string-keyed-table prefix="JDG">
<id-column name="CACHE_KEY" type="VARCHAR(255)"/>
<data-column name="CACHE_DATA" type="VARCHAR(255)"/>
<timestamp-column name="CACHE_ENTRY_TIME" type="BIGINT"/>
</string-keyed-table>
</string-keyed-jdbc-store>
BTW it looks like the table name should be the same as the cache name with the configured "prefix".
Essentially we are looking for something like overriding Infinispan's mechanism to preload the cache from the store (in our case our own database instead of the standard JDBC cache store).
-
4. Re: Inserting millions of records in remote cache
nikhil_joshi Mar 20, 2014 12:53 PM (in response to nikhil_joshi)http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_api
...I guess writing custom store using AdvancedCacheLoader/Writer may work.
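The core job of such a custom store is to walk the external table and hand each row to a callback, possibly in parallel on a supplied executor; that is the pattern the sketch below mimics. CustomStoreSketch, its process() method, and the in-memory map playing the role of the database are hypothetical stand-ins, not the actual Infinispan AdvancedCacheLoader/Writer SPI.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

// Sketch of the "walk the store and feed each entry to a task" pattern that
// a custom loader uses for preloading. The "external database" is just an
// in-memory map of rows here.
public class CustomStoreSketch {

    private final Map<String, String> table; // stand-in for the real DB table

    public CustomStoreSketch(Map<String, String> table) {
        this.table = table;
    }

    // Walk every row and pass it to the task, fanning the work out over the
    // executor; block until all rows have been handed off and processed.
    public void process(BiConsumer<String, String> task, ExecutorService executor)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(table.size());
        for (Map.Entry<String, String> row : table.entrySet()) {
            executor.submit(() -> {
                try {
                    task.accept(row.getKey(), row.getValue()); // e.g. cache.put(...)
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
    }
}
```

A real implementation would additionally read the rows via JDBC and plug into Infinispan's persistence SPI lifecycle.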
-
5. Re: Inserting millions of records in remote cache
wdfink Mar 20, 2014 3:15 PM (in response to nikhil_joshi)Yes, you need to implement a custom store.
The string-keyed store does not store the key/value in a simply readable form. The difference between it and the binary-keyed store is an implementation detail; the string-keyed variant is better for concurrent access.
-
6. Re: Inserting millions of records in remote cache
nikhil_joshi Mar 21, 2014 4:00 PM (in response to nikhil_joshi)Using putAsync() and overriding the async executor factory, I am able to insert records into the grid considerably faster, in batches.
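The putAsync()-with-batches idea described above amounts to firing asynchronous puts, collecting the futures, and waiting for each batch to complete before starting the next, so only a bounded number of writes is in flight. The sketch below simulates putAsync with CompletableFuture over a ConcurrentHashMap; with a real RemoteCache you would call cache.putAsync(key, value) instead, and AsyncBatchPut, putInBatches, and the batch size are illustrative names and values, not Infinispan API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: issue async puts, and every batchSize writes wait for all
// outstanding futures before continuing.
public class AsyncBatchPut {

    public static void putInBatches(Map<String, String> cache,
                                    Map<String, String> entries,
                                    int batchSize, ExecutorService executor) {
        List<CompletableFuture<Void>> inFlight = new ArrayList<>();
        for (Map.Entry<String, String> e : entries.entrySet()) {
            // Simulated putAsync: with HotRod this would be cache.putAsync(k, v).
            inFlight.add(CompletableFuture.runAsync(
                    () -> cache.put(e.getKey(), e.getValue()), executor));
            if (inFlight.size() == batchSize) {
                // Wait for the whole batch before issuing the next one.
                CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
                inFlight.clear();
            }
        }
        // Drain the trailing partial batch.
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
    }
}
```

Bounding the in-flight batch keeps the client from queueing millions of pending writes at once, which is usually where unbatched async inserts fall over.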