3 Replies Latest reply on Jan 6, 2015 3:16 PM by djchapm

Master/slave setup with jgroups.

jakob.skwarski May 6, 2014 7:56 AM

I want a shared index for a cluster of nodes, and for that one node needs to be the master so that only that node writes to the index. My config is as follows (some of it is copied from examples I found around the web):

<infinispan>
    <global>
        <globalJmxStatistics
            enabled="true"
            cacheManagerName="SomeManager"
            allowDuplicateDomains="true" />
        <transport distributedSyncTimeout="50000">
            <properties>
                <property name="configurationFile" value="jgroups-udp.xml"/>
                <property name="hibernate.search.default.worker.backend" value="jgroups"/>
                <property name="hibernate.search.services.jgroups.clusterName" value="SomeID"/>
            </properties>
        </transport>
    </global>

    <namedCache name="access">
        <clustering mode="distribution">
            <async/>
            <hash numOwners="2">
                <groups enabled="true" >
                    <grouper class="quicksearch.data.KeyGrouper"/>
                </groups>
            </hash>
        </clustering>
        <indexing enabled="true" indexLocalOnly="true">
            <properties>
                <property name="hibernate.search.default.directory_provider" value="infinispan" />
                <property name="hibernate.search.lucene_version" value="LUCENE_36" />
            </properties>
        </indexing>
    </namedCache>

    <namedCache name="LuceneIndexesMetadata">
        <clustering mode="replication">
            <async/>
        </clustering>
    </namedCache>

    <namedCache name="LuceneIndexesData">
        <clustering mode="replication">
            <sync />
        </clustering>
    </namedCache>

    <namedCache name="LuceneIndexesLocking">
        <clustering mode="replication">
            <sync />
        </clustering>
    </namedCache>
</infinispan>

When I run this I start two jbosses on two different JVMs and add objects to the cache on one of them.

13:22:47,231 INFO [stdout] (http--127.0.0.1-8180-1) Initiate Cache

13:22:47,508 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (http--127.0.0.1-8180-1) ISPN000078: Starting JGroups Channel

13:22:51,589 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (http--127.0.0.1-8180-1) ISPN000094: Received new cluster view: [johannes-38255|0] (1) [johannes-38255]

13:22:51,682 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (http--127.0.0.1-8180-1) ISPN000079: Cache local address is johannes-38255, physical addresses are [10.237.245.98:63852]

13:22:51,693 INFO [org.infinispan.factories.GlobalComponentRegistry] (http--127.0.0.1-8180-1) ISPN000128: Infinispan version: Infinispan 'Infinium' 6.0.2.Final

13:22:51,785 INFO [org.infinispan.query.impl.LifecycleManager] (http--127.0.0.1-8180-1) ISPN014003: Registering Query interceptor

13:22:51,797 INFO [org.hibernate.search.Version] (http--127.0.0.1-8180-1) HSEARCH000034: Hibernate Search 4.4.0.Final

13:22:51,811 INFO [org.hibernate.annotations.common.Version] (http--127.0.0.1-8180-1) HCANN000001: Hibernate Commons Annotations {4.0.4.Final}

13:22:51,858 INFO [org.infinispan.jmx.CacheJmxRegistration] (http--127.0.0.1-8180-1) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:22:51,868 INFO [org.infinispan.jmx.CacheJmxRegistration] (http--127.0.0.1-8180-1) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:22:51,885 INFO [stdout] (http--127.0.0.1-8180-1) Persons created...

13:22:51,888 INFO [stdout] (http--127.0.0.1-8180-1) Orders created...

13:22:51,891 INFO [stdout] (http--127.0.0.1-8180-1) 30 accesses created...

13:22:51,953 INFO [org.infinispan.jmx.CacheJmxRegistration] (CacheStartThread,null,LuceneIndexesData) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:22:51,955 INFO [org.infinispan.jmx.CacheJmxRegistration] (CacheStartThread,null,LuceneIndexesMetadata) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:22:51,955 INFO [org.infinispan.jmx.CacheJmxRegistration] (CacheStartThread,null,LuceneIndexesLocking) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:22:52,573 INFO [stdout] (http--127.0.0.1-8180-1) Accesses added to cache...

13:22:52,574 INFO [stdout] (http--127.0.0.1-8180-1) Cache size: Access: 28 took: 691 ms

13:31:28,811 INFO [stdout] (http--127.0.0.1-8180-1) Initiate Cache

13:31:28,814 INFO [stdout] (http--127.0.0.1-8180-1) Persons created...

13:31:28,814 INFO [stdout] (http--127.0.0.1-8180-1) Orders created...

13:31:28,815 INFO [stdout] (http--127.0.0.1-8180-1) 30 accesses created...

13:31:29,117 INFO [stdout] (http--127.0.0.1-8180-1) Accesses added to cache...

13:31:29,118 INFO [stdout] (http--127.0.0.1-8180-1) Cache size: Access: 54 took: 304 ms

13:31:38,031 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-1,johannes-38255) ISPN000094: Received new cluster view: [johannes-38255|1] (2) [johannes-38255, johannes-16825]

I can then perform a search on the other jboss and find objects from the first without any issues.

13:31:36,700 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (http--127.0.0.1-8280-1) ISPN000078: Starting JGroups Channel

13:31:38,052 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (http--127.0.0.1-8280-1) ISPN000094: Received new cluster view: [johannes-38255|1] (2) [johannes-38255, johannes-16825]

13:31:38,158 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (http--127.0.0.1-8280-1) ISPN000079: Cache local address is johannes-16825, physical addresses are [10.237.245.98:59536]

13:31:38,165 INFO [org.infinispan.factories.GlobalComponentRegistry] (http--127.0.0.1-8280-1) ISPN000128: Infinispan version: Infinispan 'Infinium' 6.0.2.Final

13:31:38,263 INFO [org.infinispan.query.impl.LifecycleManager] (http--127.0.0.1-8280-1) ISPN014003: Registering Query interceptor

13:31:38,276 INFO [org.hibernate.search.Version] (http--127.0.0.1-8280-1) HSEARCH000034: Hibernate Search 4.4.0.Final

13:31:38,292 INFO [org.hibernate.annotations.common.Version] (http--127.0.0.1-8280-1) HCANN000001: Hibernate Commons Annotations {4.0.4.Final}

13:31:38,342 INFO [org.infinispan.jmx.CacheJmxRegistration] (http--127.0.0.1-8280-1) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:31:38,599 INFO [org.infinispan.jmx.CacheJmxRegistration] (CacheStartThread,null,LuceneIndexesMetadata) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:31:38,599 INFO [org.infinispan.jmx.CacheJmxRegistration] (CacheStartThread,null,LuceneIndexesLocking) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:31:38,599 INFO [org.infinispan.jmx.CacheJmxRegistration] (CacheStartThread,null,LuceneIndexesData) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:31:38,799 INFO [org.infinispan.jmx.CacheJmxRegistration] (http--127.0.0.1-8280-1) ISPN000031: MBeans were successfully registered to the platform MBean server.

13:31:39,048 INFO [stdout] (http--127.0.0.1-8280-1) ----------------------------------------------

13:31:39,050 INFO [stdout] (http--127.0.0.1-8280-1) 3_1_1 : Access[3, 1, 1] Order[3, C, Kkkka, 30] Person[1, Jakob, Markussson, 37]

13:31:39,052 INFO [stdout] (http--127.0.0.1-8280-1) ----------------------------------------------

13:31:39,053 INFO [stdout] (http--127.0.0.1-8280-1) Query took : 32 ms CacheSize: 54 Resultsize: 1

I can also add a third node without any problems and see how the objects in the cache get moved around since numOwners = 2 etc.

However, if I add objects to the cache again (on the jboss I first added objects with) once more than one node is running, I get the following error:

13:39:32,016 INFO [stdout] (http--127.0.0.1-8180-1) Initiate Cache
13:39:32,017 INFO [stdout] (http--127.0.0.1-8180-1) Persons created...
13:39:32,018 INFO [stdout] (http--127.0.0.1-8180-1) Orders created...
13:39:32,018 INFO [stdout] (http--127.0.0.1-8180-1) 1 accesses created...
13:39:32,484 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index quicksearch.data.Access-1) HSEARCH000058: HSEARCH000117: IOException on the IndexWriter: java.io.FileNotFoundException: Error loading metadata for index file: _1u.nrm|M|quicksearch.data.Access
        at org.infinispan.lucene.InfinispanDirectory.openInput(InfinispanDirectory.java:269) [infinispan-lucene-directory-6.0.2.Final.jar:6.0.2.Final]
        at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:231) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:201) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:604) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3587) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3376) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3485) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3467) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3451) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:158) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:171) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.ExclusiveIndexWorkspaceImpl.afterTransactionApplied(ExclusiveIndexWorkspaceImpl.java:45) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:124) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:67) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_45]
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_45]
        at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45]
        at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]

13:39:32,506 INFO [stdout] (http--127.0.0.1-8180-1) Accesses added to cache...
13:39:32,507 INFO [stdout] (http--127.0.0.1-8180-1) Cache size: Access: 44 took: 490 ms

But the objects are added to the cache!

I can also try to add objects on one of the other running jbosses and then I get the following error:

13:48:40,839 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index quicksearch.data.Access-1) HSEARCH000058: Exception occurred org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: org.infinispan.lucene.locking.BaseLuceneLock@65a2327a
Primary Failure:
        Entity quicksearch.data.Access Id S:10_7_2 Work Type org.hibernate.search.backend.UpdateLuceneWork
: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: org.infinispan.lucene.locking.BaseLuceneLock@65a2327a
        at org.apache.lucene.store.Lock.obtain(Lock.java:84) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1098) [lucene-core-3.6.2.jar:3.6.2 1423725 - rmuir - 2012-12-18 19:45:40]
        at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:146) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:113) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:101) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:67) [hibernate-search-engine-4.4.0.Final.jar:4.4.0.Final]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_45]
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_45]
        at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45]
        at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]

13:48:40,851 ERROR [org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask] (Hibernate Search: Index updates queue processor for index quicksearch.data.Access-1) HSEARCH000072: Couldn't open the IndexWriter because of previous error: operation skipped, index ouf of sync!

Once again, the objects are still added to the cache...

Any ideas what I could be doing wrong? My goal is a master/slave setup with one node as master, but if that node goes down another should take over the master role. According to the Hibernate Search documentation, this should be achieved with this property:

<property name="hibernate.search.default.worker.backend" value="jgroups"/>

but I can't seem to get it to work. The Infinispan documentation is not helping much, other than implying that what I want to achieve can be done (unless I'm just misunderstanding everything). The jgroups-udp.xml config file I use is the one that ships with Infinispan by default; I have made no changes to it. My Infinispan version is 6.0.2.Final.
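One detail in the configuration above may be worth checking: the two hibernate.search.* properties sit under <transport>, whose properties Infinispan hands to the JGroups transport, while Hibernate Search settings are read from the <indexing> block of the indexed cache. A minimal sketch of the same properties moved there follows. It reuses only names from the configuration above, is untested against 6.0.2.Final, and is an assumption about where the settings belong rather than a confirmed fix.

```xml
<!-- Sketch only: the indexing block of the "access" cache, with the
     Hibernate Search backend properties moved here from <transport>.
     Property names and values are copied from the original post. -->
<indexing enabled="true" indexLocalOnly="true">
    <properties>
        <property name="hibernate.search.default.directory_provider" value="infinispan" />
        <property name="hibernate.search.default.worker.backend" value="jgroups" />
        <property name="hibernate.search.services.jgroups.clusterName" value="SomeID" />
        <property name="hibernate.search.lucene_version" value="LUCENE_36" />
    </properties>
</indexing>
```

The <transport> block would then keep only the configurationFile property.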

1. Re: Master/slave setup with jgroups.

jakob.skwarski May 12, 2014 8:46 AM (in response to jakob.skwarski)

I got it to work by switching to Infinispan 7.0.0.Alpha with the following config (only with the XML config, though; I can't seem to "translate" this config into a programmatic one). Using an unstable version won't be an option anyway, so if anybody could help with a configuration for version 6.0.2.Final, that would be nice. The documentation is no help at all.

Here is the working 7.0.x config:

<?xml version="1.0" encoding="UTF-8"?>
<infinispan
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="urn:infinispan:config:7.0 http://www.infinispan.org/schemas/infinispan-config-7.0.xsd"
      xmlns="urn:infinispan:config:7.0">

   <cache-container name="QueryEnabledGrid-Dist" default-cache="default" statistics="true">
      <jmx duplicate-domains="true" />
      <transport cluster="Infinispan-Query-Cluster">
      </transport>

      <distributed-cache name="default" mode="SYNC" owners="2" remote-timeout="20000" statistics="true">
         <locking acquire-timeout="20000" write-skew="false" concurrency-level="500" striping="false" />
         <state-transfer timeout="480000" enabled="true" />
         <eviction max-entries="-1" strategy="NONE" />
         <expiration max-idle="-1" />
         <indexing index="LOCAL">
            <property name="hibernate.search.default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
            <property name="hibernate.search.default.directory_provider">infinispan</property>
            <property name="hibernate.search.default.exclusive_index_use">false</property>
            <property name="hibernate.search.lucene_version">LUCENE_36</property>
         </indexing>
      </distributed-cache>

      <replicated-cache name="LuceneIndexesMetadata" mode="SYNC" remote-timeout="25000">
         <state-transfer enabled="true" />
         <indexing index="NONE" />
      </replicated-cache>

      <distributed-cache name="LuceneIndexesData" mode="SYNC" remote-timeout="25000">
         <state-transfer enabled="true" />
         <indexing index="NONE" />
      </distributed-cache>

      <replicated-cache name="LuceneIndexesLocking" mode="SYNC" remote-timeout="25000">
         <state-transfer enabled="true" />
         <indexing index="NONE" />
      </replicated-cache>
   </cache-container>
</infinispan>
2. Re: Master/slave setup with jgroups.

steljboss Oct 8, 2014 7:23 AM (in response to jakob.skwarski)

I think this is related to https://issues.apache.org/jira/browse/LUCENE-5541

Based on Bug 1140790 – FileNotFoundException in Lucene, it will be fixed in JDG 6.3 CR2, if you can use a supported version.
3. Re: Master/slave setup with jgroups.

djchapm Jan 6, 2015 3:16 PM (in response to jakob.skwarski)

Hey - I like the research you did, but you were mostly concerned with the Lucene File.Exists issue, which was fixed in Lucene 4, which is included in Infinispan 7. The other error you had was:
13:48:40,839 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index quicksearch.data.Access-1) HSEARCH000058: Exception occurred org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: org.infinispan.lucene.locking.BaseLuceneLock@65a2327a

We are having this issue too with a distributed/searchable cluster. Do you know the fix related to this? It's happening when we bounce a node and then try to do a put in the new node.
Wondering if it was related to this property:

<property name="hibernate.search.default.exclusive_index_use">false</property>
Where the documentation simply says 'set to true <default> for better performance'.

Appreciate your response.

Dan C.
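
On the exclusive_index_use question: as the Hibernate Search documentation describes it, with the default of true a node keeps its IndexWriter open, and with it the index write lock, instead of releasing the lock after each batch of work. If that applies here, a second node writing to the same shared index would time out waiting for the lock, which would be consistent with the LockObtainFailedException quoted above. A sketch of the setting in the 6.0-style schema from the first post follows; it is untested against 6.0.2.Final and is an assumption, not a confirmed fix.

```xml
<!-- Sketch only: 6.0-style indexing block with exclusive_index_use disabled,
     so that the index lock is released between batches of index updates.
     Other property names and values are copied from the first post. -->
<indexing enabled="true" indexLocalOnly="true">
    <properties>
        <property name="hibernate.search.default.directory_provider" value="infinispan" />
        <property name="hibernate.search.default.exclusive_index_use" value="false" />
        <property name="hibernate.search.lucene_version" value="LUCENE_36" />
    </properties>
</indexing>
```

Releasing the lock after every batch costs some write performance, which is the trade-off the documentation's "set to true for better performance" remark points at.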