A couple thousand puts/reads on 8 indexes: NRT vs RAM directory?
javadevmtl Jan 1, 2014 3:17 PM
Hi, using:
Infinispan-6.0.0
Java 1.7_17
OpenSuse 11
Server is 16 Core 128GB
Network is 1Gigabit. All machines plugged to same switch.
Reading through the Lucene docs, NRT seems to be a wrapper around RAMDirectory, but for some reason enabling NRT is faster than using just the RAM directory.
I.e.:
<namedCache name="myCache">
  <jmxStatistics enabled="true"/>
  <clustering mode="distribution">
    <async/>
    <hash numOwners="1"/>
  </clustering>
  <storeAsBinary enabled="true"/>
  <indexing enabled="true" indexLocalOnly="true">
    <properties>
      <property name="hibernate.search.default.indexmanager" value="near-real-time" />
      <property name="hibernate.search.default.directory_provider" value="filesystem" />
      <property name="hibernate.search.default.indexwriter.merge_factor" value="1024" />
      <property name="hibernate.search.default.indexwriter.ram_buffer_size" value="128" />
      <property name="hibernate.search.default.sharding_strategy.nbr_of_shards" value="8" />
    </properties>
  </indexing>
</namedCache>
is faster than...
<namedCache name="myCache">
  <jmxStatistics enabled="true"/>
  <clustering mode="distribution">
    <async/>
    <hash numOwners="1"/>
  </clustering>
  <storeAsBinary enabled="true"/>
  <indexing enabled="true" indexLocalOnly="true">
    <properties>
      <property name="hibernate.search.default.directory_provider" value="ram" />
      <property name="hibernate.search.default.indexwriter.merge_factor" value="1024" />
      <property name="hibernate.search.default.indexwriter.ram_buffer_size" value="128" />
      <property name="hibernate.search.default.sharding_strategy.nbr_of_shards" value="8" />
    </properties>
  </indexing>
</namedCache>
Note: Even though the cache is set up in distribution mode, I'm only starting a single node to rule out the network as the problem.
So why is NRT faster than RAM, and way faster at that?
In fact, with NRT enabled I can do 16,000 puts per second, while with RAM I get a maximum of 750 puts per second. I checked the JMX stats, and averageWriteTime with NRT is 0ms while with RAM it's 25ms or higher.
This is my model...
@Indexed
@ProvidedId
@SerializeWith(...)
public class MyModel
{
    @DocumentId
    Integer id;

    //@Field
    Integer customerId;

    //@Field
    Integer code;

    @Field(analyze = Analyze.NO)
    String name;

    @Field(analyze = Analyze.NO)
    String acctHash;

    @Field(analyze = Analyze.NO)
    String address;

    @Field(analyze = Analyze.NO)
    String phone; // queried as "phone" in the query below

    @Field(analyze = Analyze.NO)
    String email;

    @Field(analyze = Analyze.NO)
    String shipTo;

    @Field(analyze = Analyze.NO)
    Long ip;

    long lease;

    // Getters, setters and the marshallers here...
}
Also, query performance is about 40 queries per second, which is really slow. I know the query is a heavy one, but I would still expect better performance. I tested the same query using CQEngine, which is a Java collections indexing API, and it was very fast. But I'm not here to compare, because Infinispan has all the extra goodies I need, like distribution. So I want to figure out how to tune it all properly.
The query...
Query query = qf.from(MyModel.class)
    .maxResults(20000)
    .having("acctHash").eq(trxRequest.getAcctHash())
    .or().having("phone").eq(trxRequest.getPhone())
    .or().having("email").eq(trxRequest.getEmail())
    .or().having("ip").eq(trxRequest.getIp())
    .or().having("name").eq(trxRequest.getName())
    .or().having("address").eq(trxRequest.getAddress())
    .or().having("shipTo").eq(trxRequest.getShipTo())
    .toBuilder().build();
How I tested it all...
Basically, I put Infinispan in my vertx.io web application. For each HTTP POST it does the following...
1- Receive POST
2- Parse POST params
3- Cache PUT
4- Build Query
5- Execute query
6- Return response.
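To tell how much of each request goes to step 3 (PUT) versus steps 4-5 (query), the steps could be timed separately inside the handler. The following is only a rough sketch: `StepTimer` is a made-up helper, and a plain `ConcurrentHashMap` with a linear scan stands in for the Infinispan cache and query, so the printed numbers say nothing about Infinispan itself.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class StepTimer {
    // Accumulated nanoseconds and call counts, keyed by step name.
    private final ConcurrentHashMap<String, AtomicLong> totalNanos =
            new ConcurrentHashMap<String, AtomicLong>();
    private final ConcurrentHashMap<String, AtomicLong> calls =
            new ConcurrentHashMap<String, AtomicLong>();

    // Add one measurement for the given step.
    public void record(String step, long nanos) {
        counter(totalNanos, step).addAndGet(nanos);
        counter(calls, step).incrementAndGet();
    }

    // Average time per call in milliseconds; 0.0 if the step was never recorded.
    // The two counters are not updated atomically together, so a reading taken
    // while other threads are recording is approximate.
    public double averageMillis(String step) {
        AtomicLong n = calls.get(step);
        if (n == null || n.get() == 0) {
            return 0.0;
        }
        return totalNanos.get(step).get() / 1e6 / n.get();
    }

    private static AtomicLong counter(ConcurrentHashMap<String, AtomicLong> map, String key) {
        AtomicLong existing = map.get(key);
        if (existing == null) {
            AtomicLong fresh = new AtomicLong();
            existing = map.putIfAbsent(key, fresh);
            if (existing == null) {
                existing = fresh;
            }
        }
        return existing;
    }

    public static void main(String[] args) {
        // Stand-in for the cache; NOT the Infinispan API.
        ConcurrentHashMap<Integer, String> cache = new ConcurrentHashMap<Integer, String>();
        StepTimer timer = new StepTimer();
        for (int i = 0; i < 2000; i++) {
            long t0 = System.nanoTime();
            cache.put(i, "acct-" + i);            // step 3: cache PUT
            long t1 = System.nanoTime();
            cache.containsValue("acct-" + i);     // steps 4-5: stand-in "query" (linear scan)
            long t2 = System.nanoTime();
            timer.record("put", t1 - t0);
            timer.record("query", t2 - t1);
        }
        System.out.printf("avg put:   %.4f ms%n", timer.averageMillis("put"));
        System.out.printf("avg query: %.4f ms%n", timer.averageMillis("query"));
    }
}
```

In the real handler, the two timed calls would be the actual cache put and query execution, and the averages could be logged periodically alongside the JMX stats.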
I got the numbers mentioned above in two ways...
1- Looking at the JMX stats
2- Visually looking at JMeter reports
The test is setup as follows...
JMeter (200 users) ----> Vertx ----> Infinispan embedded in the Vertx app.
1- Puts only (NRT): 16,000 requests/sec (JMeter reports), averageWriteTime 0ms (in JMX)
2- Puts only (RAM): 750 requests/sec (JMeter reports), averageWriteTime 25+ms (in JMX)
3- Puts + Query, either NRT or RAM: 40 requests/sec (JMeter reports). I didn't check JMX because I figured I'd try to tune indexing properly first and hope it works out later...
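As a sanity check on these figures, the throughput can be converted into an implied per-request time with Little's law (concurrent users = throughput x response time). This is only a sketch: it assumes all 200 JMeter users are busy the whole time with no think time, which the test setup above does not confirm, and the class and method names are made up for illustration.

```java
public class ThroughputCheck {
    // Little's law for a closed system: users = throughput * responseTime,
    // so responseTime = users / throughput. Returned in milliseconds.
    public static double impliedResponseMillis(int users, double requestsPerSec) {
        return users * 1000.0 / requestsPerSec;
    }

    // How many times faster one throughput figure is than another.
    public static double speedup(double fastPerSec, double slowPerSec) {
        return fastPerSec / slowPerSec;
    }

    public static void main(String[] args) {
        int users = 200; // JMeter users; assumed always active, zero think time

        System.out.printf("NRT puts:     %.1f ms per request%n",
                impliedResponseMillis(users, 16000));
        System.out.printf("RAM puts:     %.1f ms per request%n",
                impliedResponseMillis(users, 750));
        System.out.printf("Puts + query: %.1f ms per request%n",
                impliedResponseMillis(users, 40));
        System.out.printf("NRT vs RAM put throughput: %.1fx%n",
                speedup(16000, 750));
    }
}
```

Under those assumptions the implied times are about 12.5 ms (NRT puts), 267 ms (RAM puts) and 5,000 ms (puts + query). The RAM figure is well above the 25ms averageWriteTime reported by JMX, which may mean part of the time is spent outside the measured cache write.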
Here are the snapshots...
https://dl.dropboxusercontent.com/u/27413499/NRT.nps