-
1. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
kbachl Nov 22, 2014 5:32 AM (in response to bes82)
The numbers you report are way too low, IMHO.
What Infinispan storage do you use? How is it configured?
Why do you write
"This includes updating the mentioned index synchronously."
vs.
"because with infinispan configured to async write through should not"
?
Are you doing it sync or async?
Maybe you want to post your ModeShape config file as well as your Infinispan config.
Best,
KB
-
2. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
bes82 Nov 24, 2014 2:38 AM (in response to kbachl)
ModeShape local indexes are updated synchronously; the Infinispan cache is set to async. These are two different things.
----
<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="urn:infinispan:config:6.0 http://www.infinispan.org/schemas/infinispan-config-6.0.xsd"
            xmlns="urn:infinispan:config:6.0">
  <global>
    <globalJmxStatistics enabled="false" allowDuplicateDomains="true"/>
  </global>
  <namedCache name="contentRepository">
    <transaction
        transactionManagerLookupClass="org.infinispan.transaction.lookup.GenericTransactionManagerLookup"
        transactionMode="TRANSACTIONAL"
        lockingMode="OPTIMISTIC" />
    <persistence
        passivation="false">
      <singleFile
          preload="false"
          shared="false"
          fetchPersistentState="false"
          purgeOnStartup="false"
          location="${datalocation}">
        <!-- write behind configuration -->
        <async enabled="true"/>
      </singleFile>
    </persistence>
    <!-- limit the number of nodes to hold in memory -->
    <eviction maxEntries="8192" strategy="LIRS" />
  </namedCache>
</infinispan>
----
{
    "name" : "modeshapeRepository",
    "jndiName" : "jcr/modeshapeRepository",
    "monitoring" : {
        "enabled" : true
    },
    "indexProviders" : {
        "local" : {
            "classname" : "org.modeshape.jcr.index.local.LocalIndexProvider",
            "directory" : "${indexlocation}"
        }
    },
    "storage" : {
        "cacheName" : "contentRepository",
        "cacheConfiguration" : "META-INF/infinispan-file-config-6.xml",
        "binaryStorage" : {
            "type" : "file",
            "directory" : "${binarylocation}"
        }
    },
    "workspaces" : {
        "default" : "default",
        "allowCreation" : true
    },
    "security" : {
        "anonymous" : {
            "roles" : ["readonly", "readwrite", "admin"],
            "useOnFailedLogin" : false
        }
    }
}
-
3. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
kbachl Nov 24, 2014 7:44 AM (in response to bes82)
I think your problem is here:
"<eviction maxEntries="8192" strategy="LIRS" />"
You work with well over 10k objects that have to go in, but you limit the maximum number of in-memory entries for Infinispan to about 8k (8192). Remove this and then try again.
Also
<singleFile
preload="false"
leads to slower initial access in exchange for a slightly faster startup time; I would set this to true. Also make sure your RAM is big enough to hold *all* the data (the whole repo), as you use the SingleFileStore of Infinispan 6.
You might also want to give your binary storage a
"minimumBinarySizeInBytes": 1048576
(here 1 MB), so that only big files are written directly to the filesystem (writing every small binary there can lead to slower performance).
Best,
KB
-
4. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
bes82 Nov 25, 2014 3:59 AM (in response to kbachl)
Thanks for the info.
I was talking about an import job on an empty repository. I can see when eviction starts to take place, because then (I guess when nodes have to be reloaded) performance starts to drop a bit (but not much). So 80ms is the performance measured right from the beginning (the test started on an empty repository).
Preload doesn't change anything for my test, I guess again because the repository is empty at the start.
Currently I don't store binaries, but thanks for mentioning this.
What I do store, though, are some nodes with ~1k properties; might this be a problem?
-
5. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
bes82 Dec 4, 2014 9:42 AM (in response to bes82)
I found out where the bottleneck is: LocalIndexProvider using MapDB.
It seems that (at least with synchronous indexes) every node put into an index means direct, synchronous, blocking disk I/O.
So I started with a simple FS-based ramdisk, and suddenly modifying or storing nodes is 100 times (no joke) faster. The ramdisk implementation I used is even able to sync to disk every few seconds.
So I started looking at the code of LocalIndexProvider and tried to play around with DBMaker in order to come up with a pure Java solution.
Currently MapDB is just used as: this.db = DBMaker.newFileDB(file).make();
Adding mmapFileEnableIfSupported() already increased performance by a factor of 10, I guess without any drawbacks. That is still ten times slower than with a ramdisk.
So I'm currently playing around with asyncWriteFlushDelay, asyncWriteEnable and cacheLRUEnable.
asyncWriteFlushDelay seems to totally block everything, which I don't understand, and I don't yet understand the implications of cacheLRUEnable.
If anyone could shed some light on which options I could/should (or better not) use, that would be very helpful.
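The cost of forcing every write to disk, as described above, can be reproduced without MapDB. The following is a minimal sketch that uses only the JDK; it is not ModeShape or MapDB code, and the class name FsyncCost, the 128-byte record size and the record count are assumptions made up for illustration. It appends the same records to two temporary files, forcing the channel to disk after every record in one case and only once at the end in the other, and prints the elapsed time for each. The actual ratio depends heavily on the disk, filesystem and OS.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of ModeShape or MapDB.
class FsyncCost {

    /**
     * Writes `records` fixed-size records to `file` (truncating it first).
     * If syncEach is true the channel is forced to disk after every record,
     * otherwise only once at the end. Returns the elapsed time in nanoseconds.
     */
    static long write(Path file, int records, boolean syncEach) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(128); // assumed record size
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 0; i < records; i++) {
                buf.clear();          // position = 0, limit = capacity
                buf.putInt(0, i);     // absolute put, position stays at 0
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                if (syncEach) {
                    ch.force(false);  // blocking flush of file content to disk
                }
            }
            ch.force(false);          // final flush in both modes
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path each = Files.createTempFile("sync-each", ".bin");
        Path once = Files.createTempFile("sync-once", ".bin");
        try {
            long tEach = write(each, 500, true);
            long tOnce = write(once, 500, false);
            System.out.printf("sync after each record: %d ms%n", tEach / 1_000_000);
            System.out.printf("sync once at the end:   %d ms%n", tOnce / 1_000_000);
        } finally {
            Files.deleteIfExists(each);
            Files.deleteIfExists(once);
        }
    }
}
```

Both runs produce files of identical size; only the number of blocking flushes differs, which is the same trade-off that options such as asyncWriteEnable shift in MapDB.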
-
6. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
rhauch Dec 4, 2014 9:50 AM (in response to bes82)
Obviously we'd like to have it perform as fast as possible, and it sounds like you've found a few switches that make a considerable difference. Feel free to log an enhancement request in JIRA and create a pull request with specific changes to how the MapDB maps are created. First, doing so lets us see what you're proposing as well as run with the same proposed changes. Second, it would allow us to collaborate on which set of switches/options makes the most sense -- there are quite a few added in recent MapDB releases that I wasn't aware of.
-
7. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
ma6rl Aug 13, 2015 3:17 PM (in response to bes82)
bes82, were you able to make any additional progress on improving the MapDB performance?
-
8. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
bes82 Aug 14, 2015 3:50 AM (in response to ma6rl)
I guess the MapDB index implemented some configuration settings in 4.3; mmapFileEnableIfSupported and cacheLRUEnable are the options to go for.
But in general, MapDB single-property indexes were just way too slow for my use case.
So I created my own Lucene-based provider that can handle multiple properties per index. Every nodeType is now an index, and I can do things like "search all from nt:x where a=b and c!=d and e>f" dramatically faster than with MapDB.
However, the Lucene index does not yet support joins and some property types that the MapDB provider supports. It's working very well for me, but I'm not at the point where I think it's bug-free enough to release it to the public.
-
9. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
ma6rl Aug 14, 2015 11:42 AM (in response to bes82)
Thanks for the update.
Bjoern Schmidt wrote:
I guess the MapDb index implemented some configuration settings in 4.3; mmapFileEnableIfSupported and cacheLRUEnable are the options to go for.
I can see the flags in LocalIndexProvider.java, but there does not appear to be any way to set them other than changing the values in the class and rebuilding ModeShape. I don't believe they have been exposed via any of the ModeShape configuration mechanisms.
We are also seeing really poor performance with writes to MapDB, especially under concurrent load. We see our throughput drop from 400 nodes a second without indexes to fewer than 40 nodes a second with a single sync index enabled. While I did expect to see a drop in write performance with indexing, a tenfold reduction does seem a little extreme. I did some profiling using Flame Graphs, and all of the CPU time is spent in LocalIndexProvider and the MapDB classes.
It's encouraging to hear that you are seeing better performance with Lucene, as we have also been looking at alternative index providers. I know there is an open issue, [MODE-2159] Store indexes in local Lucene - JBoss Issue Tracker, to add support for Lucene. It is currently assigned to 4.4, but I am not sure if it is going to be in the release. hchiorean, do you know if MODE-2159 is still planned for 4.4 or is it going to be moved out?
I'm also running into another interesting issue with MapDB indexes when using user transactions and pessimistic locking. I intermittently see concurrent writes (to different parts of the node hierarchy) deadlock, so that only a few complete and the others all sit until the underlying Infinispan locks time out. These issues do not occur with indexing disabled. I'm working on a test case to demonstrate this, and if I can reproduce it I am going to create an issue for it.
hchiorean, rhauch, do you have any suggestions or feedback about the performance we see writing to a MapDB local index?
-
10. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
hchiorean Aug 17, 2015 3:50 AM (in response to ma6rl)
You can configure various MapDB options (see modeshape/LocalIndexProvider.java at modeshape-4.3.0.Final · ModeShape/modeshape · GitHub) both via JSON (modeshape/local-index-provider-with-custom-settings.json at modeshape-4.3.0.Final · ModeShape/modeshape · GitHub) and via the WildFly XML (modeshape/standalone-modeshape.xml at modeshape-4.3.0.Final · ModeShape/modeshape · GitHub).
ma6rl wrote:
It's encouraging to hear that you are seeing better performance with Lucene as we have also been looking at alternative index providers. I know there is an open issue [MODE-2159] Store indexes in local Lucene - JBoss Issue Tracker to add support for Lucene. It is currently assigned to 4.4 but am not sure if it is going to be in the release. hchiorean, do you know if MODE-2159 is still planned for 4.4 or is it going to be moved out?
It is going to be moved out. It's simply too much work to finish it in time for 4.4.
-
11. Re: Modeshape / Infinispan performance [how performant should it be] (ModeShape 4.0)
ma6rl Aug 20, 2015 12:39 PM (in response to hchiorean)
I've experimented with the MapDB configuration and, based on the earlier findings posted by bes82, was able to get a significant performance boost by setting:
cacheLRUEnable="true"
mmapFileEnable="true"
commitFileSyncDisable="true"
and am able to create ~300 nodes a second with synchronous indexes. The majority of the performance boost came from 'commitFileSyncDisable'. It is worth noting that this greatly increases the chance that the index cache becomes corrupted if it is not shut down correctly, so it should be used with caution.
I was also able to figure out the deadlocking issue I was running into above. It turned out to be a result of using the JDBC Infinispan Cache Store and setting the datasource's max connection pool size too low. The significant overhead that indexing added meant that connections were not being released quickly enough: writes were blocked waiting for connections while holding a lock on the Infinispan entries, which in turn blocked writes with open connections from completing and releasing their connections.
Based on metrics from my test environment, I was able to handle node writes with a connection-pool-to-request ratio of 1 connection per 4 concurrent requests with indexing disabled. With indexing enabled I need a ratio of 1 connection per 2 concurrent requests; put more simply, I needed to double the max connections in the pool to cope with indexing.
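The ratios above translate into a simple sizing rule. As a rough sketch (the class and method names are hypothetical, and the figure of 40 concurrent requests is an example rather than a measurement from this thread), the minimum pool size is the number of concurrent requests divided by the requests-per-connection ratio, rounded up:

```java
// Hypothetical helper illustrating the ratio described above;
// not part of ModeShape, Infinispan or any datasource API.
class PoolSizing {

    /**
     * Returns the smallest pool size that gives at least one connection
     * per `requestsPerConnection` concurrent requests (ceiling division).
     */
    static int requiredConnections(int concurrentRequests, int requestsPerConnection) {
        if (concurrentRequests < 0 || requestsPerConnection <= 0) {
            throw new IllegalArgumentException("invalid request count or ratio");
        }
        return (concurrentRequests + requestsPerConnection - 1) / requestsPerConnection;
    }

    public static void main(String[] args) {
        int concurrentRequests = 40; // example load, assumed for illustration
        // Ratio reported with indexing disabled: 1 connection per 4 requests.
        System.out.println("indexing disabled: "
                + requiredConnections(concurrentRequests, 4));
        // Ratio reported with indexing enabled: 1 connection per 2 requests.
        System.out.println("indexing enabled:  "
                + requiredConnections(concurrentRequests, 2));
    }
}
```

For the example load this gives 10 connections without indexing and 20 with it, which matches the doubling described above. The ratios themselves come from one test environment and would need to be measured again for a different workload.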
At this point we can work within the constraints of the local index provider shipped with ModeShape, but we will most likely move to one of the other providers currently being implemented (Lucene or Elasticsearch) once they are available.