5 Replies Latest reply on Jun 16, 2014 2:24 PM by ges

    First Query after loading a local indexed cache

    ges

      Hi,

      The first query (using the Query DSL) that I execute against a local embedded cache on one of the indexed fields always takes a long time (in my case almost 2 minutes). Subsequent queries are fast. Is there something I could do to make the first query fast as well? Of course, I could always execute a query immediately after the data load, so that later queries are fast.

      But I would like to know if there is a better way to do this and also understand the reason for the first query being slow.

       

      Thanks,

      Gesly

        • 1. Re: First Query after loading a local indexed cache
          ges

          Hi,

          Has anyone else seen this issue before? Would be good to know if I'm not the only one who is in the same boat.

           

          Thanks,

          Gesly

          • 2. Re: First Query after loading a local indexed cache
            anistor

            Hi George,

             

            I've not encountered this problem. Could you give a bit more detail about your setup?

             

            Thanks!

            • 3. Re: First Query after loading a local indexed cache
              ges


              Hi Adria,

              My setup is a LOCAL cache with 3.9 to 4 million entries, indexed on the 3 String properties of the value object. These are not large objects (around 600 bytes each). The indexing setup is as follows. This is an ISPN-7 config, but I had the same issue with ISPN-6 as well. After I load the entries into the cache, I always trigger a dummy query on the cache, which can take 2 minutes or so. All subsequent queries are fast. As you can see, the index directory provider is Infinispan, so I don't expect the first query to have to warm up an index cache. It should all be available right away from ISPN.

               

                  <local-cache name="DataSource">
                      <indexing index="LOCAL">
                          <property name="default.indexmanager">near-real-time</property>
                          <property name="default.directory_provider">infinispan</property>
                          <property name="default.chunk_size">128000</property>
                          <property name="default.metadata_cachename">LuceneIndexesMetadataOWR</property>
                          <property name="default.data_cachename">LuceneIndexesDataOWR</property>
                          <!-- This index is dedicated to the current node -->
                          <property name="default.exclusive_index_use">true</property>
                          <!-- The default is 10, but we don't want to waste many cycles in merging
                           (tune for writes at cost of reader fragmentation) -->
                          <property name="default.indexwriter.merge_factor">30</property>
                          <!-- Never create segments larger than 1GB -->
                          <property name="default.indexwriter.merge_max_size">1024</property>
                          <!-- IndexWriter flush buffer size in MB -->
                          <property name="default.indexwriter.ram_buffer_size">1024</property>
                          <!-- Make sure to use native locking -->
                          <property name="default.locking_strategy">native</property>
                          <!-- Enable sharding on writers -->
                          <property name="default.sharding_strategy.nbr_of_shards">6</property>
                          <!--<property name="default.indexwriter.max_merge_docs">5</property>-->
                          <property name="default.max_queue_length">1000000</property>
                          <property name="default.worker.execution">async</property>
                          <property name="default.worker.thread_pool.size">32</property>
                      </indexing>
                  </local-cache>

               

              Thanks.

              • 4. Re: First Query after loading a local indexed cache
                sannegrinovero

                Hi George,

                Yes, this is quite normal in the Lucene world: it is very sensitive to the warm-up of its internal buffers. Search engines based on Lucene, such as Solr, include an "auto-warmup" thread, but it is mostly effective when the queries it runs are similar to the ones you actually intend to run. An auto-warmup feature therefore makes sense for REST-based services, where people define warm-up queries as text in their configuration. In a system like this it is usually simpler to have your application run some queries after boot.

                 

                We could consider an auto-warmup feature as well, if you think the above is not practical?

                • 5. Re: First Query after loading a local indexed cache
                  ges

                  Sanne,

                  We know the query pattern we will be encountering, so we have added our own auto warm up thread. Thanks for the help!

                   

                  Gesly
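
For readers landing on this thread: the approach settled on above (run representative queries on a background thread once the data load finishes) might look roughly like the sketch below. This is not the poster's actual code. `QueryWarmer` and its `start()` method are hypothetical names, and each `Supplier` stands in for one of your own Query DSL calls on an indexed field; the sketch deliberately uses only the JDK so it does not assume any particular Infinispan API version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/**
 * Hypothetical helper: runs a fixed list of representative queries once,
 * on a background daemon thread, so the first real query does not pay the
 * warm-up cost.
 */
class QueryWarmer {

    private final List<Supplier<Integer>> warmupQueries;

    /** Each supplier should execute one query and return its result count. */
    QueryWarmer(List<Supplier<Integer>> warmupQueries) {
        this.warmupQueries =
                Collections.unmodifiableList(new ArrayList<>(warmupQueries));
    }

    /**
     * Starts the warm-up thread and returns immediately. The future completes
     * with the number of warm-up queries that ran without throwing.
     */
    CompletableFuture<Integer> start() {
        CompletableFuture<Integer> done = new CompletableFuture<>();
        Thread worker = new Thread(() -> {
            int succeeded = 0;
            for (Supplier<Integer> query : warmupQueries) {
                try {
                    query.get();          // result is discarded; only the side effect matters
                    succeeded++;
                } catch (RuntimeException e) {
                    // A failed warm-up query should not stop the remaining ones.
                    System.err.println("Warm-up query failed: " + e);
                }
            }
            done.complete(succeeded);
        }, "query-warmup");
        worker.setDaemon(true);           // never keep the JVM alive just for warm-up
        worker.start();
        return done;
    }
}
```

Call `start()` right after the bulk load completes. If callers must not see a slow first query, wait on the returned future (for example with `join()`) before accepting traffic; otherwise let it finish in the background. As noted earlier in the thread, warm-up helps most when these queries resemble the real ones, so the list should exercise the same indexed fields the application will query.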