3 Replies Latest reply on Feb 24, 2012 11:43 AM by Akshay Sachdeva

    Hibernate Search Index build on Application startup

    Akshay Sachdeva Newbie

      Environment

       

      Infinispan 5.1.1

      Hibernate Search 4.0.0

      Hibernate Core 4.0.1

      JGroups 3.0.5

      JTA 1.1

      Bitronix 2.1.2

      XA Datasource setup

      Oracle 10g DB

       

      We have a lot of data in the database, and every time the server starts up Infinispan rebuilds the indexes, which takes a long time. Is there a way to disable or configure this? I have tried adjusting the loader attributes in the config below, with little luck.

       

      I have included the relevant Infinispan config below.

       

      <?xml version="1.0" encoding="UTF-8"?>

      <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

                  xmlns="urn:infinispan:config:5.1"

                  xsi:schemaLocation="urn:infinispan:config:5.1 http://www.infinispan.org/schemas/infinispan-config-5.1.xsd">

       

          <!-- *************************** -->

          <!-- System-wide global settings -->

          <!-- *************************** -->

       

          <global>

       

              <!-- Duplicate domains are allowed so that multiple deployments with default configuration

                  of Hibernate Search applications work - if possible it would be better to use JNDI to share

                  the CacheManager across applications -->

              <globalJmxStatistics

                  enabled="false"

                  cacheManagerName="HibernateSearch"

                  allowDuplicateDomains="true" />

       

              <!-- If the transport is omitted, there is no way to create distributed or clustered

                  caches. There is no added cost to defining a transport but not creating a cache that uses one,

                  since the transport is created and initialized lazily. -->

              <transport

              transportClass="xxx.DexJGroupsTransport"

                  clusterName="Hibernate_Search_Cluster"

                  distributedSyncTimeout="120000">

                  <!-- Note that the JGroups transport uses sensible defaults if no configuration

                      property is defined. See the JGroupsTransport javadocs for more flags -->

              </transport>

       

              <!-- Used to register JVM shutdown hooks. hookBehavior: DEFAULT, REGISTER, DONT_REGISTER.

                  Hibernate Search takes care to stop the CacheManager so registering is not needed -->

              <shutdown

                  hookBehavior="DONT_REGISTER" />

       

          </global>

       

          <!-- *************************** -->

          <!-- Default "template" settings -->

          <!-- *************************** -->

       

          <default>

       

                      <loaders passivation="true" shared="false" preload="true">

                         <loader class="org.infinispan.loaders.file.FileCacheStore"

                            fetchPersistentState="true" ignoreModifications="false"

                            purgerThreads="3" purgeSynchronously="true" purgeOnStartup="false">

                            <properties>

                               <property name="location" value="/var/dex/infinispan/index-store" />

                            </properties>

                            <singletonStore enabled="true" pushStateWhenCoordinator="true" pushStateTimeout="20000"/>

                         </loader>

                      </loaders>

       

       

              <locking

                  lockAcquisitionTimeout="120000"

                  writeSkewCheck="false"

                  concurrencyLevel="5000"

                  useLockStriping="false" />

       

       

              <!-- Invocation batching is required for use with the Lucene Directory -->

              <invocationBatching

                  enabled="true" />

       

              <!-- This element specifies that the cache is clustered. modes supported: distribution

                  (d), replication (r) or invalidation (i). Don't use invalidation to store Lucene indexes (as

                  with Hibernate Search DirectoryProvider). Replication is recommended for best performance of

                  Lucene indexes, but make sure you have enough memory to store the index in your heap.

                  Also distribution scales much better than replication on high number of nodes in the cluster. -->

              <clustering

                  mode="replication">

       

                  <!-- Prefer loading all data at startup than later -->

                  <stateRetrieval

                      timeout="120000"

                      fetchInMemoryState="true"

                      />

       

                  <!-- Network calls are synchronous by default -->

                  <sync

                      replTimeout="20000" />

              </clustering>

       

              <jmxStatistics

                  enabled="false" />

       

              <eviction

                  maxEntries="-1"

                  strategy="NONE" />

       

              <expiration

                  maxIdle="-1" />

       

          </default>

       

          <!-- ******************************************************************************* -->

          <!-- Individually configured "named" caches.                                         -->

          <!--                                                                                 -->

          <!-- While default configuration happens to be fine with similar settings across the -->

          <!-- three caches, they should generally be different in a production environment.   -->

          <!--                                                                                 -->

          <!-- Current settings could easily lead to OutOfMemory exception as a CacheStore     -->

          <!-- should be enabled, and maybe distribution is desired.                           -->

          <!-- ******************************************************************************* -->

       

          <!-- *************************************** -->

          <!--  Cache to store Lucene's file metadata  -->

          <!-- *************************************** -->

          <namedCache

              name="LuceneIndexesMetadata">

              <clustering

                  mode="replication">

                  <stateRetrieval

                      fetchInMemoryState="true"

                      logFlushTimeout="30000" />

                  <sync

                      replTimeout="120000" />

              </clustering>

          </namedCache>

       

          <!-- **************************** -->

          <!--  Cache to store Lucene data  -->

          <!-- **************************** -->

          <namedCache

              name="LuceneIndexesData">

              <clustering

                  mode="replication">

                  <stateRetrieval

                      fetchInMemoryState="true"/>

                  <sync

                      replTimeout="120000" />

              </clustering>

          </namedCache>

       

          <!-- ***************************** -->

          <!--  Cache to store Lucene locks  -->

          <!-- ***************************** -->

          <namedCache

              name="LuceneIndexesLocking">

              <clustering

                  mode="replication">

                  <stateRetrieval

                      fetchInMemoryState="true"/>

                  <sync

                      replTimeout="120000" />

              </clustering>

          </namedCache>

       

      </infinispan>

       


       

        • 1. Re: Hibernate Search Index build on Application startup
          Sanne Grinovero Master

          Hi,

          This:

          <singletonStore enabled="true" pushStateWhenCoordinator="true" pushStateTimeout="20000"/>

          should not be needed, and it is a bottleneck because all your nodes have to load from it.
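
          A minimal sketch of that change, assuming the rest of the loader section from the original post stays as it is. Only the singletonStore element is dropped; whether this alone fixes the startup time is not confirmed in this thread.

          ```xml
          <loaders passivation="true" shared="false" preload="true">
             <loader class="org.infinispan.loaders.file.FileCacheStore"
                fetchPersistentState="true" ignoreModifications="false"
                purgerThreads="3" purgeSynchronously="true" purgeOnStartup="false">
                <properties>
                   <property name="location" value="/var/dex/infinispan/index-store" />
                </properties>
                <!-- singletonStore removed here, so the store is no longer
                     handled only through the cluster coordinator -->
             </loader>
          </loaders>
          ```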

          • 2. Re: Hibernate Search Index build on Application startup
            Sanne Grinovero Master

            Also, FileCacheStore is quite slow. You should try one of the other cache store implementations.

            • 3. Re: Hibernate Search Index build on Application startup
              Akshay Sachdeva Newbie

              Hey Sanne

               

              Thanks for the answers and pointers. 

               

              As an update, we reverted to the filesystem-master and filesystem-slave setup with JGroups (2.12.1.3) for managing the indexes in our clustered environment. While there is some latency before index updates become available on the other nodes in the cluster, the setup and maintenance are a lot easier. It also seems a lot more stable in all of our testing. Our application will also not be very active in terms of writes/updates, so I think the latency we are seeing should be acceptable.
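
              For readers unfamiliar with that setup, here is a rough sketch of the directory provider properties involved, as they would appear in persistence.xml. This is not our actual configuration: the paths are hypothetical placeholders and the refresh value is only an example. The JGroups backend setting is left out because its property name depends on the Hibernate Search version.

              ```xml
              <!-- Master node: writes to its local index and periodically
                   copies it to the shared source directory -->
              <property name="hibernate.search.default.directory_provider" value="filesystem-master"/>
              <property name="hibernate.search.default.indexBase" value="/path/to/local/master/indexes"/>   <!-- hypothetical path -->
              <property name="hibernate.search.default.sourceBase" value="/path/to/shared/indexes"/>        <!-- hypothetical path -->
              <property name="hibernate.search.default.refresh" value="1800"/>                              <!-- seconds; example value -->

              <!-- Slave nodes: periodically copy the shared index into a
                   local directory and search against that copy -->
              <property name="hibernate.search.default.directory_provider" value="filesystem-slave"/>
              <property name="hibernate.search.default.indexBase" value="/path/to/local/slave/indexes"/>    <!-- hypothetical path -->
              <property name="hibernate.search.default.sourceBase" value="/path/to/shared/indexes"/>        <!-- hypothetical path -->
              <property name="hibernate.search.default.refresh" value="1800"/>                              <!-- seconds; example value -->
              ```

              The refresh period is where the update latency mentioned above comes from: slaves only see changes after the next copy cycle.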

               

              There were a lot of variables involved in moving to Infinispan 5.1.1, and they prompted us to switch back.

               

              The variables were:

              Switch to JTA (we are on JPA)

              Switch to using Bitronix

              Switch to XA datasources

              Switch to JGroups 3.0.5

               

              We had gotten over most hurdles, but we hit a big headache in the Oracle 10g XA datasource config (recovery would error out on startup on the Bitronix side). The Oracle XA stuff just seemed to have a lot of headaches in general, and it was the straw that broke the camel's back, especially since we are really close to go-live and needed something in place.

               

              Thanks

              -a