How to configure an actually working search index over Infinispan for a distributed cache
vladsz83 Jan 18, 2017 10:16 AM
Hi, folks!
Can anyone please help me set up a well-working search for replicated/distributed caches?
I need to host a distributed cache on 2-4 nodes and a replicated cache on 2 nodes. Both may be synchronous or asynchronous (not decided yet); I'm using only synchronous ones for now. I also need to search them, so I enabled the index:
ConfigurationBuilder cfg = …;
cfg.clustering().cacheMode(CacheMode.REPL_SYNC)
.transaction().transactionMode(TransactionMode.TRANSACTIONAL)
.indexing().index(Index.ALL).indexing()
.addProperty("default.directory_provider", "infinispan")
.addProperty("default.chunk_size", "524288")
…
and, optionally, near-real-time (NRT) indexing:
.indexing().addProperty("default.indexmanager", "near-real-time");
The most widely used configurations for the index caches are:
"LuceneIndexesData": CacheMode.REPL_ASYNC
"LuceneIndexesMetadata": CacheMode.REPL_SYNC
"LuceneIndexesLocking": CacheMode.LOCAL
I found that it is completely impossible to use DIST_ASYNC or REPL_SYNC for the lock cache ("LuceneIndexesLocking"). As soon as more than one node appears in the cluster, Lucene starts throwing:
ERROR LogErrorHandler HSEARCH000058: Exception occurred org.apache.lucene.store.LockObtainFailedException: lock instance already assigned
Primary Failure:
Entity com.bpcbt.test.cache.Record Id S:61319 Work Type org.hibernate.search.backend.UpdateLuceneWork
Subsequent failures:
Entity com.bpcbt.test.cache.Record Id S:114133 Work Type org.hibernate.search.backend.UpdateLuceneWork
Entity com.bpcbt.test.cache.Record Id S:128795 Work Type org.hibernate.search.backend.UpdateLuceneWork
org.apache.lucene.store.LockObtainFailedException: lock instance already assigned
at org.infinispan.lucene.impl.CommonLockObtainUtils.failLockAcquire(CommonLockObtainUtils.java:33)
at org.infinispan.lucene.impl.CommonLockObtainUtils.attemptObtain(CommonLockObtainUtils.java:20)
at org.infinispan.lucene.impl.BaseLockFactory.obtainLock(BaseLockFactory.java:35)
at org.infinispan.lucene.impl.BaseLockFactory.obtainLock(BaseLockFactory.java:18)
at org.infinispan.lucene.impl.DirectoryLucene.obtainLock(DirectoryLucene.java:152)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:776)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:123)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:89)
at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117)
at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriterDelegate(AbstractWorkspaceImpl.java:203)
at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:80)
at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:46)
at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.applyChangesets(SyncWorkProcessor.java:162)
at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.run(SyncWorkProcessor.java:148)
at java.lang.Thread.run(Thread.java:745)
ERROR LuceneBackendQueueTask HSEARCH000072: Couldn't open the IndexWriter because of previous error: operation skipped, index ouf of sync!
To test and compare search performance with different setups, I had to set the index lock cache ("LuceneIndexesLocking") to LOCAL mode. But I guess it's incorrect to use a local locking mode in a cluster, isn't it?
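To illustrate why LOCAL locking worries me, here is a toy model in plain Java. It is only a sketch of the concern, not how Infinispan or Lucene actually implement the lock: the class LockCacheModel, the Node type and the key format are all made up for illustration. Each node "obtains the write lock" by putting a marker into a map. With one map per node (roughly what a LOCAL cache gives you) both nodes believe they own the lock; with one shared map (roughly what a synchronously replicated cache is supposed to give you) only one does.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LockCacheModel {

    /** Hypothetical cluster node that takes the index write lock via a marker entry. */
    private static final class Node {
        private final String name;
        private final Map<String, String> lockStore;

        Node(String name, Map<String, String> lockStore) {
            this.name = name;
            this.lockStore = lockStore;
        }

        /** Returns true if this node became the lock owner (no marker was present yet). */
        boolean tryObtainWriteLock(String indexName) {
            String key = indexName + "|write.lock"; // made-up key format
            return lockStore.putIfAbsent(key, name) == null;
        }
    }

    /** Two nodes, each with its own private lock store (LOCAL-like). */
    public static int ownersWithLocalStores() {
        Node a = new Node("A", new ConcurrentHashMap<>());
        Node b = new Node("B", new ConcurrentHashMap<>());
        return countOwners(a, b);
    }

    /** Two nodes looking at the same lock store (replicated-like). */
    public static int ownersWithSharedStore() {
        Map<String, String> shared = new ConcurrentHashMap<>();
        Node a = new Node("A", shared);
        Node b = new Node("B", shared);
        return countOwners(a, b);
    }

    /** Counts how many of the given nodes manage to obtain the same index lock. */
    private static int countOwners(Node... nodes) {
        int owners = 0;
        for (Node node : nodes) {
            if (node.tryObtainWriteLock("Record")) {
                owners++;
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        System.out.println("local stores: " + ownersWithLocalStores() + " lock owner(s)");
        System.out.println("shared store: " + ownersWithSharedStore() + " lock owner(s)");
    }
}
```

In this model the LOCAL-like setup ends up with two simultaneous "owners" of the same index lock, which is exactly the situation in which I would expect concurrent writers to corrupt a shared index.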
Moreover, depending on the following parameters:
- Mode of the target cache: DIST_SYNC/REPL_SYNC
- NRT (“near-real-time”): on/off
- Number of nodes: 1 to 4 (on the same machine)
- Mode of the index data cache: DIST_SYNC/REPL_SYNC
I got either successful runs or failures inside the index directory or the locking routines, like the one shown above or:
ERROR LogErrorHandler HSEARCH000058: Exception occurred java.io.FileNotFoundException: Error loading metadata for index file: M|segments_1s|com.bpcbt.test.cache.Record|-1
Primary Failure:
Entity com.bpcbt.test.cache.Record Id S:17510 Work Type org.hibernate.search.backend.UpdateLuceneWork
Subsequent failures:
Entity com.bpcbt.test.cache.Record Id S:42430 Work Type org.hibernate.search.backend.UpdateLuceneWork
java.io.FileNotFoundException: Error loading metadata for index file: M|segments_1s|com.my.infinitest.TestRecord|-1
at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138)
at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102)
at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109)
at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294)
at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:171)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:949)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:123)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:89)
at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117)
at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriterDelegate(AbstractWorkspaceImpl.java:203)
at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:80)
at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:46)
at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.applyChangesets(SyncWorkProcessor.java:162)
at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.run(SyncWorkProcessor.java:148)
at java.lang.Thread.run(Thread.java:745)
ERROR LuceneBackendQueueTask HSEARCH000072: Couldn't open the IndexWriter because of previous error: operation skipped, index ouf of sync!
or
ERROR LogErrorHandler HSEARCH000058: HSEARCH000117: IOException on the IndexWriter
java.io.IOException: Read past EOF
at org.infinispan.lucene.impl.SlicedBufferIndexInput.readByte(SlicedBufferIndexInput.java:64)
at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:194)
at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255)
at org.apache.lucene.codecs.lucene50.Lucene50PostingsReader.<init>(Lucene50PostingsReader.java:93)
at org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat.fieldsProducer(Lucene50PostingsFormat.java:443)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:261)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:341)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:104)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
at org.apache.lucene.index.BufferedUpdatesStream$SegmentState.<init>(BufferedUpdatesStream.java:385)
at org.apache.lucene.index.BufferedUpdatesStream.openSegmentStates(BufferedUpdatesStream.java:417)
at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:262)
at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3161)
at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3147)
at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2809)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2963)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2930)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:146)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:159)
…
I noticed that DIST_SYNC mode for the index data cache significantly reduces search performance compared to REPL_SYNC mode. Disabling NRT also dramatically decreases search performance and hugely increases cache loading and replication time, but it avoids search misses when sharding is actually engaged.
However…
The only acceptable setup I found is:
- target cache is REPL_SYNC. Works much more stably, consistently and faster together with the other options
- NRT is on (works muuuuuch faster)
- index data cache is also REPL_SYNC. Works much more stably and faster.
- lock data cache is LOCAL. The only mode which doesn't crash
I can't say this setup perfectly fits my needs. It just works.
How do I configure search so that it works out of the box with various reasonable cache and index options, without flooding the output with headache-inducing errors?