0 Replies Latest reply on Apr 7, 2014 3:59 AM by arnaldo.demaio

    LuceneCacheLoader persistence

    arnaldo.demaio

      Hi all,

      In my use case I need to load a preexisting Lucene index into the Infinispan data grid, expose it to many concurrent reads and writes, and persist it (synchronously or asynchronously, it doesn't matter) to the same physical index directory, writing out only the delta between the in-grid image and the content on disk.

      It seems that such a component doesn't exist in Infinispan. I tried two naive solutions:

      1. Directory copy API: simply copy every file in the Infinispan Directory to disk, overwriting any that already exist. This solution is poor because it doesn't lock the index and, although the Lucene index itself stays self-consistent, old files that no longer belong to the index are left behind in the physical directory.

       

       

       

      org.apache.lucene.store.Directory backup = new org.apache.lucene.store.NIOFSDirectory(new File(indexBackupPath));
      if (null != this.dirIndex) {
          // Accept only names that Lucene recognises as index files.
          IndexFileNameFilter filter = IndexFileNameFilter.getFilter();
          // Copy every index file from the grid Directory to disk, overwriting existing ones.
          for (String file : dirIndex.listAll()) {
              if (filter.accept(null, file)) {
                  dirIndex.copy(backup, file, file);
              }
          }
      }
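      To make the "write out only the delta" requirement concrete, here is a minimal sketch of the name-based diff I have in mind. It is plain Java with no Lucene calls; IndexDelta and between are hypothetical names. It assumes that index files are write-once, so a file name already on disk does not need copying again, with segments.gen treated as the exception because, as far as I know, it is rewritten in place. Both name sets are assumed to be already filtered to index files, so unrelated files in the directory are never scheduled for deletion.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: decides which files a delta sync would copy or delete. */
final class IndexDelta {
    final Set<String> toCopy;   // in the grid but missing (or rewritten) on disk
    final Set<String> toDelete; // on disk but no longer part of the grid index

    private IndexDelta(Set<String> toCopy, Set<String> toDelete) {
        this.toCopy = toCopy;
        this.toDelete = toDelete;
    }

    static IndexDelta between(Set<String> gridFiles, Set<String> diskFiles) {
        // New files: present in the grid, absent on disk.
        Set<String> toCopy = new TreeSet<String>(gridFiles);
        toCopy.removeAll(diskFiles);
        // Assumption: segments.gen is rewritten in place, so always refresh it.
        if (gridFiles.contains("segments.gen")) {
            toCopy.add("segments.gen");
        }
        // Stale files: present on disk, absent from the grid.
        Set<String> toDelete = new TreeSet<String>(diskFiles);
        toDelete.removeAll(gridFiles);
        return new IndexDelta(toCopy, toDelete);
    }
}
```

      The two sets could come from listAll() on each Directory. Copying toCopy first and deleting toDelete afterwards would at least keep the on-disk index readable if the sync is interrupted, but it still does nothing about locking.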

       

       

       

      2. IndexWriter deleteAll and addIndexesNoOptimize: this solution creates another Directory instance that points to the disk, deletes everything in it, and finally commits the whole content of the grid to that directory.

       

       

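                  IndexWriter writer = null;
                  boolean bRet = false;
                  IndexWriterConfig idxCfg = new IndexWriterConfig(Version.LUCENE_30,
                          new StandardAnalyzer(Version.LUCENE_30));
                  // APPEND requires an index to already exist in indexDir.
                  idxCfg.setOpenMode(IndexWriterConfig.OpenMode.APPEND);
                  try {
                      org.apache.lucene.store.Directory backup = new org.apache.lucene.store.NIOFSDirectory(new File(indexDir));
                      writer = new IndexWriter(backup, idxCfg);
                      writer.deleteAll();                          // drop everything currently on disk
                      writer.addIndexesNoOptimize(this.dirIndex);  // re-add the whole in-grid index
                      writer.commit();
                      bRet = true;
                  } finally {
                      // Close the writer (releasing the write lock) even if an earlier call failed.
                      if (writer != null) {
                          writer.close();
                      }
                  }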

       

      Neither solution fits my use case exactly, and both are clearly inefficient because they rewrite the whole index every time.

      I was looking at the org.infinispan.lucene package (infinispan/lucene/lucene-v3/src/main/java/org/infinispan/lucene at 6.0.x in infinispan/infinispan on GitHub) and wondering how to implement write-through (or write-behind) to disk. Would it be a good starting point to have DirectoryLuceneV3 extend some concrete Lucene Directory class (not just Directory) and preserve the 'super' behaviour on writes in the overridden methods? For example, something like the following dummy:

       

      MyDirectoryLuceneV3.java:

       

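      /**
       * {@inheritDoc}
       */
      @Override
      public IndexOutput createOutput(final String name) throws IOException {
          // Create the output in the grid-backed implementation...
          IndexOutput io = impl.createOutput(name);
          // ...and create the same file through the concrete superclass.
          // Its IndexOutput is discarded here, so later writes to io never reach it.
          super.createOutput(name);
          return io;
      }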
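      What the dummy above is missing is that the returned output would have to forward every write to both targets. As a rough illustration of that idea only, here is a sketch on plain java.io.OutputStream rather than Lucene's IndexOutput; MirroringOutputStream is a hypothetical name. A real IndexOutput wrapper would also need to mirror the other operations of that class (seeking, for instance), which this sketch does not attempt.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical write-through stream: every byte goes to both delegates. */
final class MirroringOutputStream extends OutputStream {
    private final OutputStream primary;   // stands in for the in-grid output
    private final OutputStream secondary; // stands in for the on-disk output

    MirroringOutputStream(OutputStream primary, OutputStream secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    @Override
    public void write(int b) throws IOException {
        primary.write(b);
        secondary.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        primary.write(buf, off, len);
        secondary.write(buf, off, len);
    }

    @Override
    public void flush() throws IOException {
        primary.flush();
        secondary.flush();
    }

    @Override
    public void close() throws IOException {
        // Close the secondary even if closing the primary fails.
        try {
            primary.close();
        } finally {
            secondary.close();
        }
    }
}
```

      This is synchronous write-through: a slow or failing disk write blocks or fails the grid write too. Write-behind would need a queue between the two, plus some answer for what happens when the disk copy falls behind.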

       

       

      I'm a little confused about how this should be implemented, and about the drawbacks and risks of pursuing this approach. Any help would be really appreciated!

      Arnaldo