wesssel, I'm seeing similar issues with index sizes and share your concerns about the size of the MapDB index.
The other problems I'm running into are:
- the time it takes for a new instance to build its index from scratch when it is added to a cluster.
- the lack of support for LIKE and Full Text Search.
At the moment, adding Lucene as an index provider is scheduled for 4.4, but it does not look like any work has been done on it yet (see [MODE-2159] Store indexes in local Lucene - JBoss Issue Tracker). This may well help with the index size and LIKE/full-text queries, but it most likely will not help with the time it takes for a new cluster instance to build its index.
Given this, I'm currently looking at what it would take to add support for Solr or ElasticSearch as an index provider to ModeShape. I know they are on the roadmap ([MODE-2161] Store indexes in Solr - JBoss Issue Tracker; [MODE-2162] Store indexes in ElasticSearch - JBoss Issue Tracker), but there are currently no resources assigned to work on them. Would adding support for either of these be of use to you and, if so, which one would you prefer? At the moment I'm leaning towards ElasticSearch, as we run our instances in AWS, which provides a hosted ElasticSearch service that would make our lives easier.
I'm having exactly the same problem with index sizes at the moment on ModeShape 4.1.0, and would be interested to know whether there are any best practices we can follow, or whether reverting to Lucene (or to ElasticSearch/Solr) is an option.