-
1. Re: LuceneDirectory limited by Memory
sannegrinovero Mar 11, 2010 9:59 AM (in response to theunique89)Hi,
the Infinispan Directory uses an internal transaction that starts at the Directory's lock() and is committed at unlock(), so the relevant index segments are kept in memory for a lifespan tied to the IndexWriter's lifespan: the Lucene IndexWriter acquires the lock at initialization and releases it at close().
This does not depend on your commit(), as that wouldn't really be possible: a commit() in Lucene can't be mapped directly to a transaction, as it might be followed by more changes.
I'd suggest closing the IndexWriter as soon as possible, or frequently during batch operations, to make sure references to segments are cleaned up. You should keep the IndexWriter open for as short a time as possible anyway, or other nodes won't be able to open one and will time out on lock acquires.
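[For illustration, a minimal sketch of the close-early pattern described above; the InfinispanDirectory constructor arguments and the Lucene 3.x-era IndexWriter signature are assumptions, not taken from this thread:]

```java
// Hedged sketch: "cache", "analyzer" and "batch" are assumed to exist;
// the InfinispanDirectory constructor shown here is an assumption.
Directory dir = new InfinispanDirectory(cache, "myIndex");
IndexWriter writer = new IndexWriter(dir, analyzer,
        IndexWriter.MaxFieldLength.UNLIMITED);
try {
    for (Document doc : batch) {
        writer.addDocument(doc);
    }
} finally {
    // close() releases the Directory lock, which ends the internal
    // transaction and lets other nodes acquire an IndexWriter
    writer.close();
}
```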
Sanne
-
2. Re: LuceneDirectory limited by Memory
theunique89 Mar 13, 2010 11:32 AM (in response to sannegrinovero)Hi,
thank you for your answer. Your solution will work if we only index deltas, but we sometimes want to optimize the complete index with
IndexWriter.optimize(). I think this method will read and rewrite the complete index, and so will load the complete index into memory. Is this true?
ciao.frank.
-
3. Re: LuceneDirectory limited by Memory
sannegrinovero Mar 14, 2010 4:14 PM (in response to theunique89)you make a good point: optimizing is definitely going to touch all segments in the same transaction; for batch work of this size you need to back off from the transactional LockFactory.
I just committed https://jira.jboss.org/jira/browse/ISPN-372 in trunk, which makes it possible to use a transactionless Index: it will behave like a filesystem, sending out changes to the cluster as soon as they are done.
If you switch to the new org.infinispan.lucene.locking.BaseLockFactory (in trunk now, and used by default) you can even use the standard merge strategy.
Watch for https://jira.jboss.org/jira/browse/ISPN-250 which is meant to improve performance of the transactionless mode to enable batching on the transport layer.
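[A hedged sketch of wiring in the lock factory mentioned above; the BaseLockFactory constructor arguments are an assumption based on its class name, not verified against trunk. Directory.setLockFactory() is the standard Lucene hook for swapping lock implementations:]

```java
// Assumed API: BaseLockFactory's constructor parameters are a guess;
// check the trunk sources for the actual signature.
InfinispanDirectory dir = new InfinispanDirectory(cache, "myIndex");
dir.setLockFactory(
        new org.infinispan.lucene.locking.BaseLockFactory(cache, "myIndex"));
// Writes now go out to the cluster as they happen, filesystem-style,
// instead of being buffered in one big transaction until unlock().
```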
-
4. Re: LuceneDirectory limited by Memory
sannegrinovero Mar 15, 2010 6:41 AM (in response to sannegrinovero)there's a dirty but effective workaround: if you just commit the transaction used by the cache underlying the Directory, and don't start a new one, subsequent operations will happen out-of-transaction and be more memory-friendly.
This would get you the same behaviour as using the new BaseLockFactory.
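[A sketch of the workaround described above; the TransactionManager lookup shown is an assumption about how the cache is configured, not code from this thread:]

```java
// Hedged sketch: assumes the underlying cache is transactional and
// exposes its TransactionManager via the AdvancedCache API.
TransactionManager tm = cache.getAdvancedCache().getTransactionManager();
// Commit the transaction the Directory started at lock()...
tm.commit();
// ...and do NOT begin a new one: subsequent index operations run
// out-of-transaction, matching the BaseLockFactory behaviour.
```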