Questions regarding indexing
dalbani Oct 23, 2014 4:28 PM
Hello,
Before opening yet another invalid bug report, I have a couple of questions / remarks regarding indexing.
First, is it expected to have *large* index files?
In my setup, I have a LevelDB-based (non-binary) store of around 400 MB.
The MapDB index files, on the other hand, amount to almost 5 GB!? The local-indexes.db.t file is by far the largest.
I have around 20 indexes defined, mainly on STRING columns.
Another question, which is more of a bug report. Here's the exception that I get when I use Workspace.reindex():
2014-10-23 22:00:06,678 ERROR [org.modeshape.jcr.RepositoryIndexManager$ScanningRequest] (modeshape-reindexing-6-thread-2) Error while indexing '/' in workspace 'default': null: java.lang.NullPointerException
    at org.mapdb.DataOutput2.writeUTF(DataOutput2.java:147) [mapdb-1.0.6.jar:]
    at org.mapdb.Serializer$1.serialize(Serializer.java:70) [mapdb-1.0.6.jar:]
    at org.mapdb.Serializer$1.serialize(Serializer.java:67) [mapdb-1.0.6.jar:]
    at org.modeshape.jcr.index.local.MapDB$UniqueKeyBTreeSerializer.serialize(MapDB.java:434) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.mapdb.BTreeMap$NodeSerializer.serialize(BTreeMap.java:385) [mapdb-1.0.6.jar:]
    at org.mapdb.BTreeMap$NodeSerializer.serialize(BTreeMap.java:288) [mapdb-1.0.6.jar:]
    at org.mapdb.Store.serialize(Store.java:154) [mapdb-1.0.6.jar:]
    at org.mapdb.StoreWAL.update(StoreWAL.java:403) [mapdb-1.0.6.jar:]
    at org.mapdb.Caches$HashTable.update(Caches.java:269) [mapdb-1.0.6.jar:]
    at org.mapdb.BTreeMap.put2(BTreeMap.java:746) [mapdb-1.0.6.jar:]
    at org.mapdb.BTreeMap.put(BTreeMap.java:643) [mapdb-1.0.6.jar:]
    at org.modeshape.jcr.index.local.LocalDuplicateIndex.add(LocalDuplicateIndex.java:90) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.index.local.IndexChangeAdapters$SingleValuedPropertyChangeAdapter.addValues(IndexChangeAdapters.java:587) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.index.local.IndexChangeAdapters$AbstractPropertyChangeAdapter.reindexNode(IndexChangeAdapters.java:490) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.spi.index.provider.IndexChangeAdapter.index(IndexChangeAdapter.java:66) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.spi.index.provider.IndexProvider$7.add(IndexProvider.java:802) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.spi.index.provider.IndexProvider$1.add(IndexProvider.java:190) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.RepositoryQueryManager.reindexContent(RepositoryQueryManager.java:484) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.RepositoryQueryManager$2$1.scan(RepositoryQueryManager.java:279) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.RepositoryIndexManager$ScanningRequest.onEachPathInWorkspace(RepositoryIndexManager.java:889) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.RepositoryQueryManager$2.call(RepositoryQueryManager.java:285) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at org.modeshape.jcr.RepositoryQueryManager$2.call(RepositoryQueryManager.java:252) [modeshape-jcr-4.0.0.Final.jar:4.0.0.Final]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_72]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_72]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_72]
    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_72]
What's strange is that the exception doesn't occur when I use Workspace.reindexAsync() instead.
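For what it's worth, the top frames (writeUTF called from a string serializer) look like a null string value reaching the writer. The sketch below is not ModeShape or MapDB code: it uses the JDK's java.io.DataOutputStream as a stand-in, on the unverified assumption that MapDB's DataOutput2.writeUTF treats null the same way. The class and method names (NullUtfSketch, failsWithNpe) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class NullUtfSketch {

    // Hypothetical helper: returns true if writing the value with writeUTF
    // throws a NullPointerException, false if the write succeeds.
    public static boolean failsWithNpe(String value) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            // The JDK's writeUTF reads the string's length first, so a null
            // argument fails here before any bytes are written.
            out.writeUTF(value);
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("non-null value fails: " + failsWithNpe("some value"));
        System.out.println("null value fails: " + failsWithNpe(null));
    }
}
```

If the same holds for MapDB, one of my indexed STRING properties may be producing a null key for some node, but that is only a guess on my part.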
Thanks.