1 Reply Latest reply on Jun 4, 2014 11:54 AM by ges

Infinispan Search - lucene index properties

ges Apr 16, 2014 12:28 AM

I appreciated the discussion today after the JDG session at the Red Hat Summit. As discussed, here is the Lucene configuration I'm using.

	<indexing enabled="true" indexLocalOnly="true">
	  <properties>
	    <property name="default.indexmanager" value="near-real-time"/>
	    <property name="default.directory_provider" value="infinispan"/>
	    <property name="default.worker.thread_pool.size" value="16"/>
	    <property name="default.worker.execution" value="async"/>
	    <property name="default.indexwriter.ram_buffer_size" value="1024"/>
	    <property name="default.max_queue_length" value="50000"/>
	    <property name="default.indexwriter.merge_factor" value="100"/>
	  </properties>
	</indexing>

With this configuration, indexing 3M entries sometimes takes longer than 1 hour. What do I need to do to ensure that the Mass Indexer is enabled and is using multiple threads? The worker execution is configured to be async, but it doesn't look like the put operation is offloading the indexing from the main thread. It could be that the queue length is insufficient, so the main thread is blocked adding to the queue.
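
To make that last suspicion concrete, here is a minimal sketch in plain Java of what I think may be happening. It uses only java.util.concurrent and is not Infinispan or Hibernate Search code: the class name, the countBlockedPuts method, and the timings are all made up for illustration, and it assumes the async backend behaves like a bounded queue drained by a slower worker.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustration only: a producer ("main thread doing puts") feeding a bounded
// queue that a single slow consumer ("index writer") drains.
class BoundedQueueSketch {

    // Hypothetical helper: returns how many submissions found the queue full
    // and therefore had to wait on the calling thread.
    static int countBlockedPuts(int queueCapacity, int totalPuts, long workMillis) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(queueCapacity);

        Thread worker = new Thread(() -> {
            try {
                for (int i = 0; i < totalPuts; i++) {
                    queue.take();             // pick up the next unit of work
                    Thread.sleep(workMillis); // simulate a slow index write
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        int blocked = 0;
        try {
            for (int i = 0; i < totalPuts; i++) {
                if (!queue.offer(i)) { // non-blocking attempt failed: queue is full
                    blocked++;
                    queue.put(i);      // the caller now waits for free space
                }
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while submitting work", e);
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("small queue, blocked puts: " + countBlockedPuts(2, 50, 2));
        System.out.println("large queue, blocked puts: " + countBlockedPuts(1000, 100, 1));
    }
}
```

If the real backend works like this, a queue larger than the burst of writes never stalls the caller, while a small one makes the "async" put wait for the writer anyway.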

Do you see any obvious configuration issues here?

Thanks,

Gesly George

1. Re: Infinispan Search - lucene index properties

ges Jun 4, 2014 11:54 AM (in response to ges)

Sanne,

This is the indexing configuration that I'm currently using. It is based on infinispan/query/src/test/resources/nrt-performance-writer.xml on the master branch of the infinispan/infinispan GitHub repository. It has improved performance, but indexing still has not reached the ~30 seconds that you mentioned I could expect for a data set such as this. Are there any more tweaks that could be made to the config here?

	<local-cache name="SecurityDataSource">
	  <indexing index="LOCAL">
	    <property name="default.indexmanager">near-real-time</property>
	    <property name="default.directory_provider">infinispan</property>
	    <property name="default.chunk_size">128000</property>
	    <property name="default.metadata_cachename">LuceneIndexesMetadataOWR</property>
	    <property name="default.data_cachename">LuceneIndexesDataOWR</property>
	    <!-- This index is dedicated to the current node -->
	    <property name="default.exclusive_index_use">true</property>
	    <!-- The default is 10, but we don't want to waste many cycles in merging
	         (tune for writes at cost of reader fragmentation) -->
	    <property name="default.indexwriter.merge_factor">30</property>
	    <!-- Never create segments larger than 1GB -->
	    <property name="default.indexwriter.merge_max_size">1024</property>
	    <!-- IndexWriter flush buffer size in MB -->
	    <property name="default.indexwriter.ram_buffer_size">1024</property>
	    <!-- Make sure to use native locking -->
	    <property name="default.locking_strategy">native</property>
	    <!-- Enable sharding on writers -->
	    <property name="default.sharding_strategy.nbr_of_shards">6</property>
	    <!--<property name="default.indexwriter.max_merge_docs">5</property>-->
	    <property name="default.max_queue_length">1000000</property>
	    <property name="default.worker.execution">async</property>
	    <property name="default.worker.thread_pool.size">32</property>
	  </indexing>
	</local-cache>

Thanks,