7 Replies Latest reply on Nov 16, 2011 3:37 AM by Galder Zamarreño

    Problem in using Infinispan with Jgroups for Clustering

    Dzung Leonhart Newbie

      Hi Infinispan team,

       

      I'm working on a project whose application uses Hibernate Search and will be deployed to Amazon EC2. So I'm trying to configure Infinispan with a JGroups backend on 2 local nodes (with the firewall disabled) for testing. The problem I'm stuck on occurs when the second node tries to apply the state from the first node. I've spent several days googling and reading this thread, http://community.jboss.org/thread/165126?start=15&tstart=0, but without success:

       

          1. My configurations:

                a. Spring bean configuration

       

                <bean id="sessionFactory"

                  <property name="hibernateProperties">

                  <props>

                      <prop key="hibernate.dialect">org.hibernate.dialect.MySQLDialect</prop>

                      <prop key="hibernate.search.default.directory_provider">infinispan</prop>

                      <prop key="hibernate.search.worker.backend.jgroups.configurationFile">jdbc_ping.xml</prop>               

                  </props>

              </property>

       

              b. jdbc_ping.xml

       

      <config xmlns="urn:org:jgroups"

              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

              xsi:schemaLocation="urn:org:jgroups file:schema/JGroups-2.8.xsd">

         <TCP

              bind_addr="127.0.0.1" bind_port="7800"

              loopback="true" port_range="30" recv_buf_size="20000000" send_buf_size="640000"

              discard_incompatible_packets="true" max_bundle_size="64000"

              max_bundle_timeout="30" enable_bundling="true" use_send_queues="true"

              sock_conn_timeout="300" enable_diagnostics="false"

              thread_pool.enabled="true" thread_pool.min_threads="2"

              thread_pool.max_threads="30"

              thread_pool.keep_alive_time="5000"

              thread_pool.queue_enabled="false"

              thread_pool.queue_max_size="100"

              thread_pool.rejection_policy="Discard"

       

              oob_thread_pool.enabled="true"

              oob_thread_pool.min_threads="2"

              oob_thread_pool.max_threads="30"

              oob_thread_pool.keep_alive_time="5000"

              oob_thread_pool.queue_enabled="false"

              oob_thread_pool.queue_max_size="100"

              oob_thread_pool.rejection_policy="Discard"       

               />

       

         <JDBC_PING id="102" connection_driver="com.mysql.jdbc.Driver"

              connection_username="root" connection_password="root"

              connection_url="jdbc:mysql://localhost/clientdb2"

              level="debug" />

       

         <MERGE2 max_interval="30000"

                 min_interval="10000"/>

         <FD_SOCK/>

         <FD timeout="3000" max_tries="3"/>

         <VERIFY_SUSPECT timeout="1500"/>

         <pbcast.NAKACK

               use_mcast_xmit="false" gc_lag="0"

               retransmit_timeout="300,600,1200,2400,4800"

               discard_delivered_msgs="false"/>

         <UNICAST timeout="300,600,1200"/>

         <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"

                        max_bytes="400000"/>

         <pbcast.GMS print_local_addr="false" join_timeout="7000" view_bundling="true"/>

         <UFC max_credits="2000000" min_threshold="0.10"/>

         <MFC max_credits="2000000" min_threshold="0.10"/>

         <FRAG2 frag_size="60000"/>

         <pbcast.STREAMING_STATE_TRANSFER/>

         <pbcast.FLUSH timeout="0"/>

      </config>

       

          2. My logs:

                a. I start the first node, it works fine:

               

      2011-11-02 09:34:19,920 [main] INFO  org.infinispan.factories.TransactionManagerFactory - Using a batchMode transaction manager

      2011-11-02 09:34:20,099 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Starting JGroups Channel

      2011-11-02 09:34:20,119 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Unable to use any JGroups configuration mechanisms provided in properties {}.  Using default JGroups configuration!

      2011-11-02 09:34:20,207 [main] INFO  org.jgroups.JChannel - JGroups version: 2.12.0.Final

      2011-11-02 09:34:20,492 [main] WARN  org.jgroups.protocols.UDP - send buffer of socket java.net.DatagramSocket@110f850 was set to 640KB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)

      2011-11-02 09:34:20,492 [main] WARN  org.jgroups.protocols.UDP - receive buffer of socket java.net.DatagramSocket@110f850 was set to 20MB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)

      2011-11-02 09:34:20,492 [main] WARN  org.jgroups.protocols.UDP - send buffer of socket java.net.MulticastSocket@11e8d5c was set to 640KB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)

      2011-11-02 09:34:20,492 [main] WARN  org.jgroups.protocols.UDP - receive buffer of socket java.net.MulticastSocket@11e8d5c was set to 25MB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)

      2011-11-02 09:34:23,542 [main] WARN  org.jgroups.protocols.pbcast.GMS - join(localhost-47297) sent to localhost-53070 timed out (after 3000 ms), retrying

      2011-11-02 09:34:27,626 [main] WARN  org.jgroups.protocols.pbcast.GMS - join(localhost-47297) sent to localhost-53070 timed out (after 3000 ms), retrying

      2011-11-02 09:34:30,712 [main] WARN  org.jgroups.protocols.pbcast.GMS - join(localhost-47297) sent to localhost-53070 timed out (after 3000 ms), retrying

      2011-11-02 09:34:33,862 [main] WARN  org.jgroups.protocols.pbcast.GMS - join(localhost-47297) sent to localhost-53070 timed out (after 3000 ms), retrying

      2011-11-02 09:34:38,391 [main] WARN  org.jgroups.protocols.pbcast.GMS - join(localhost-47297) sent to localhost-53070 timed out (after 3000 ms), retrying

      2011-11-02 09:34:41,414 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Received new cluster view: [localhost-47297|0] [localhost-47297]

      2011-11-02 09:34:41,415 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Cache local address is localhost-47297, physical addresses are [192.168.55.37:54819]

      2011-11-02 09:34:41,415 [main] INFO  org.infinispan.factories.GlobalComponentRegistry - Infinispan version: Infinispan 'Ursus' 4.2.1.FINAL

      2011-11-02 09:34:41,531 [OOB-2,localhost-47297] ERROR org.jgroups.protocols.UNICAST - localhost-47297: sender window for localhost-53070 not found

      2011-11-02 09:34:41,576 [main] INFO  org.infinispan.jmx.CacheJmxRegistration - MBeans were successfully registered to the platform mbean server.

      2011-11-02 09:34:41,576 [main] INFO  org.infinispan.factories.ComponentRegistry - Infinispan version: Infinispan 'Ursus' 4.2.1.FINAL

      2011-11-02 09:34:41,585 [main] INFO  org.infinispan.factories.TransactionManagerFactory - Using a batchMode transaction manager

      2011-11-02 09:34:41,769 [main] INFO  org.infinispan.jmx.CacheJmxRegistration - MBeans were successfully registered to the platform mbean server.

      2011-11-02 09:34:41,769 [main] INFO  org.infinispan.factories.ComponentRegistry - Infinispan version: Infinispan 'Ursus' 4.2.1.FINAL

      2011-11-02 09:34:41,775 [main] INFO  org.infinispan.factories.TransactionManagerFactory - Using a batchMode transaction manager

      2011-11-02 09:34:41,961 [main] INFO  org.infinispan.jmx.CacheJmxRegistration - MBeans were successfully registered to the platform mbean server.

      2011-11-02 09:34:41,961 [main] INFO  org.infinispan.factories.ComponentRegistry - Infinispan version: Infinispan 'Ursus' 4.2.1.FINAL

      2011-11-02 09:34:42,232 [main] INFO  org.jgroups.JChannel - JGroups version: 2.12.0.Final

      2011-11-02 09:34:42,320 [main] DEBUG org.jgroups.protocols.JDBC_PING - Registering JDBC Driver named 'com.mysql.jdbc.Driver'

      2011-11-02 09:34:42,320 [main] INFO  org.jgroups.protocols.JDBC_PING - Table creation step skipped: initialize_sql property is missing

      2011-11-02 09:34:42,495 [Timer-2,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Removed c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster from database.

      2011-11-02 09:34:42,548 [Timer-2,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Registered c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster into database.

      2011-11-02 09:34:43,532 [OOB-5,localhost-47297] WARN  org.jgroups.protocols.pbcast.NAKACK - localhost-47297: dropped message from de3a06ce-e0cc-4849-ee10-d33d2afa0c18 (not in table [localhost-47297]), view=[localhost-47297|0] [localhost-47297]

      2011-11-02 09:34:51,627 [OOB-2,localhost-47297] WARN  org.jgroups.protocols.pbcast.NAKACK - localhost-47297: dropped message from 10083683-c381-1e7b-b41e-6682026771dd (not in table [localhost-47297]), view=[localhost-47297|0] [localhost-47297]

      2011-11-02 09:35:08,907 [Timer-3,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Removed c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster from database.

      2011-11-02 09:35:08,952 [Timer-3,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Registered c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster into database.

      2011-11-02 09:35:10,545 [Timer-4,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Removed c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster from database.

      2011-11-02 09:35:10,587 [Timer-4,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Registered c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster into database.

      2011-11-02 09:35:19,120 [OOB-2,localhost-47297] WARN  org.jgroups.protocols.pbcast.NAKACK - localhost-47297: dropped message from localhost-53070 (not in table [localhost-47297]), view=[localhost-47297|0] [localhost-47297]

      2011-11-02 09:35:29,703 [Timer-4,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Removed c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster from database.

      2011-11-02 09:35:29,735 [Timer-4,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Registered c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster into database.

      2011-11-02 09:35:31,355 [Timer-2,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Removed c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster from database.

      2011-11-02 09:35:31,386 [Timer-2,HSearchCluster,localhost-21096] DEBUG org.jgroups.protocols.JDBC_PING - Registered c9ee9675-76f0-5d6e-9bf1-999d0b33a92c for clustername HSearchCluster into database.

      2011-11-02 09:35:32,695 [Incoming-1,localhost-47297] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Received new cluster view: [localhost-47297|1] [localhost-47297, localhost-42372]

       

                b. When I start the second node:

                     * Second node logs:

                    

      2011-11-02 09:35:26,015 [main] INFO  org.infinispan.factories.TransactionManagerFactory - Using a batchMode transaction manager

      2011-11-02 09:35:26,180 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Starting JGroups Channel

      2011-11-02 09:35:26,182 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Unable to use any JGroups configuration mechanisms provided in properties {}.  Using default JGroups configuration!

      2011-11-02 09:35:26,217 [main] INFO  org.jgroups.JChannel - JGroups version: 2.12.0.Final

      2011-11-02 09:35:26,473 [main] WARN  org.jgroups.protocols.UDP - send buffer of socket java.net.DatagramSocket@121a735 was set to 640KB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)

      2011-11-02 09:35:26,473 [main] WARN  org.jgroups.protocols.UDP - receive buffer of socket java.net.DatagramSocket@121a735 was set to 20MB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)

      2011-11-02 09:35:26,473 [main] WARN  org.jgroups.protocols.UDP - send buffer of socket java.net.MulticastSocket@68a2cc was set to 640KB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)

      2011-11-02 09:35:26,473 [main] WARN  org.jgroups.protocols.UDP - receive buffer of socket java.net.MulticastSocket@68a2cc was set to 25MB, but the OS only allocated 131.07KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)

      2011-11-02 09:35:29,509 [main] WARN  org.jgroups.protocols.pbcast.GMS - join(localhost-42372) sent to localhost-53070 timed out (after 3000 ms), retrying

      2011-11-02 09:35:32,670 [main] WARN  org.jgroups.protocols.pbcast.GMS - join(localhost-42372) sent to localhost-53070 timed out (after 3000 ms), retrying

      2011-11-02 09:35:32,728 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Received new cluster view: [localhost-47297|1] [localhost-47297, localhost-42372]

      2011-11-02 09:35:32,743 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - Cache local address is localhost-42372, physical addresses are [192.168.55.38:45737]

      2011-11-02 09:35:32,744 [main] INFO  org.infinispan.factories.GlobalComponentRegistry - Infinispan version: Infinispan 'Ursus' 4.2.1.FINAL

      2011-11-02 09:35:32,829 [main] INFO  org.infinispan.jmx.CacheJmxRegistration - MBeans were successfully registered to the platform mbean server.

      2011-11-02 09:35:32,829 [main] INFO  org.infinispan.remoting.rpc.RpcManagerImpl - Trying to fetch state from localhost-47297

      2011-11-02 09:35:32,970 [Incoming-2,localhost-42372] ERROR org.infinispan.remoting.transport.jgroups.JGroupsTransport - Caught while requesting or applying state

      org.infinispan.statetransfer.StateTransferException: java.io.EOFException: Read past end of file

          at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:332)

          at org.infinispan.remoting.InboundInvocationHandlerImpl.applyState(InboundInvocationHandlerImpl.java:199)

          at org.infinispan.remoting.transport.jgroups.JGroupsTransport.setState(JGroupsTransport.java:595)

          at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:711)

          at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)

          at org.jgroups.JChannel.up(JChannel.java:1441)

          at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)

          at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:477)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:523)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:462)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:223)

          at org.jgroups.protocols.FRAG2.up(FRAG2.java:189)

          at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)

          at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)

          at org.jgroups.protocols.pbcast.GMS.up(GMS.java:891)

          at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)

          at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:613)

          at org.jgroups.protocols.UNICAST.up(UNICAST.java:294)

          at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:703)

          at org.jgroups.protocols.BARRIER.up(BARRIER.java:119)

          at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:177)

          at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:275)

          at org.jgroups.protocols.MERGE2.up(MERGE2.java:209)

          at org.jgroups.protocols.Discovery.up(Discovery.java:291)

          at org.jgroups.protocols.PING.up(PING.java:66)

          at org.jgroups.protocols.TP.passMessageUp(TP.java:1102)

          at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1658)

          at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1640)

          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

          at java.lang.Thread.run(Thread.java:636)

      Caused by: java.io.EOFException: Read past end of file

          at org.jboss.marshalling.AbstractUnmarshaller.eofOnRead(AbstractUnmarshaller.java:184)

          at org.jboss.marshalling.AbstractUnmarshaller.readUnsignedByteDirect(AbstractUnmarshaller.java:319)

          at org.jboss.marshalling.AbstractUnmarshaller.readUnsignedByte(AbstractUnmarshaller.java:280)

          at org.jboss.marshalling.river.RiverUnmarshaller.doStart(RiverUnmarshaller.java:1165)

          at org.jboss.marshalling.AbstractUnmarshaller.start(AbstractUnmarshaller.java:389)

          at org.infinispan.marshall.jboss.GenericJBossMarshaller.startObjectInput(GenericJBossMarshaller.java:177)

          at org.infinispan.marshall.VersionAwareMarshaller.startObjectInput(VersionAwareMarshaller.java:157)

          at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:308)

                    

                * First node logs after starting the second node:


      2011-11-02 09:35:32,971 [STREAMING_STATE_TRANSFER-sender-1,localhost-47297] ERROR org.jgroups - uncaught exception in Thread[STREAMING_STATE_TRANSFER-sender-1,localhost-47297,5,JGroups] (thread group=org.jgroups.util.Util$1[name=JGroups,maxpri=10] )

      java.lang.NullPointerException

          at org.infinispan.marshall.jboss.GenericJBossMarshaller.finishObjectOutput(GenericJBossMarshaller.java:146)

          at org.infinispan.marshall.VersionAwareMarshaller.finishObjectOutput(VersionAwareMarshaller.java:140)

          at org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:178)

          at org.infinispan.remoting.InboundInvocationHandlerImpl.generateState(InboundInvocationHandlerImpl.java:217)

          at org.infinispan.remoting.transport.jgroups.JGroupsTransport.getState(JGroupsTransport.java:578)

          at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:690)

          at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)

          at org.jgroups.JChannel.up(JChannel.java:1484)

          at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)

          at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:477)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderHandler.process(STREAMING_STATE_TRANSFER.java:651)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderThreadSpawner$1.run(STREAMING_STATE_TRANSFER.java:580)

          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

          at java.lang.Thread.run(Thread.java:636)

      Exception in thread "STREAMING_STATE_TRANSFER-sender-1,localhost-47297" java.lang.NullPointerException

          at org.infinispan.marshall.jboss.GenericJBossMarshaller.finishObjectOutput(GenericJBossMarshaller.java:146)

          at org.infinispan.marshall.VersionAwareMarshaller.finishObjectOutput(VersionAwareMarshaller.java:140)

          at org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:178)

          at org.infinispan.remoting.InboundInvocationHandlerImpl.generateState(InboundInvocationHandlerImpl.java:217)

          at org.infinispan.remoting.transport.jgroups.JGroupsTransport.getState(JGroupsTransport.java:578)

          at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:690)

          at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)

          at org.jgroups.JChannel.up(JChannel.java:1484)

          at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)

          at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:477)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderHandler.process(STREAMING_STATE_TRANSFER.java:651)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderThreadSpawner$1.run(STREAMING_STATE_TRANSFER.java:580)

          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

          at java.lang.Thread.run(Thread.java:636)

      -------------------------

       

      It would be really great if I could get your advice on this problem.

      Thanks a lot and Best regards.
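
A note on reading the logs above: the JGroupsTransport line reports "Using default JGroups configuration!", the buffer warnings come from org.jgroups.protocols.UDP, and JDBC_PING only appears on the HSearchCluster channel, whereas jdbc_ping.xml configures TCP. This suggests the Infinispan channel may not have picked up jdbc_ping.xml, although the logs alone do not prove that this causes the state transfer failure. The check can be sketched in a few lines of Java; DefaultStackDetector and its methods are hypothetical names made up for this illustration and are not part of Infinispan or JGroups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not an Infinispan or JGroups class): scans log lines
// for signs that a channel was started with the default JGroups stack.
final class DefaultStackDetector {

    // Message text copied from the JGroupsTransport log line in the post above.
    static final String FALLBACK_MARKER = "Using default JGroups configuration!";

    /** Returns every line that reports a fallback to the default JGroups configuration. */
    static List<String> findFallbackLines(List<String> logLines) {
        List<String> hits = new ArrayList<String>();
        for (String line : logLines) {
            if (line.contains(FALLBACK_MARKER)) {
                hits.add(line);
            }
        }
        return hits;
    }

    /** Returns true if at least one line was written by the given logger, e.g. org.jgroups.protocols.UDP. */
    static boolean loggedBy(List<String> logLines, String loggerName) {
        // Assumes the "LEVEL logger - message" layout seen in the logs above.
        String needle = " " + loggerName + " - ";
        for (String line : logLines) {
            if (line.contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two lines taken from the first node's log, shortened for the example.
        List<String> sample = Arrays.asList(
            "2011-11-02 09:34:20,119 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - "
                + "Unable to use any JGroups configuration mechanisms provided in properties {}.  "
                + "Using default JGroups configuration!",
            "2011-11-02 09:34:20,492 [main] WARN  org.jgroups.protocols.UDP - send buffer of socket ...");
        System.out.println("fallback lines: " + findFallbackLines(sample).size());
        System.out.println("UDP logger seen: " + loggedBy(sample, "org.jgroups.protocols.UDP"));
        System.out.println("TCP logger seen: " + loggedBy(sample, "org.jgroups.protocols.TCP"));
    }
}
```

If a fallback line is found and the UDP logger appears while the intended stack is TCP, the transport configuration is the first thing to re-check.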

        • 1. Re: Problem in using Infinispan with Jgroups for Clustering
          Galder Zamarreño Master

          What version of Hibernate Search and Infinispan are you using?

           

          What is Infinispan being used for?

           

          What is the Infinispan configuration?

          • 2. Re: Problem in using Infinispan with Jgroups for Clustering
            Dzung Leonhart Newbie

            Hi Zamarreno,

            First of all, thanks a lot for your concern with my problem.

            To answer your questions:

                 1. Here is the list of JAR files that I've used for HSearch and Infinispan:

                      hibernate-search-3.4.0.Final.jar

                      hibernate-search-infinispan-3.4.0.Final.jar

                      infinispan-core-4.2.1.FINAL.jar

                      infinispan-lucene-directory-4.2.1.FINAL.jar

                      jgroups-2.12.0.Final.jar

                      marshalling-api-1.2.3.GA.jar

                      river-1.2.3.GA.jar

             

                 2. Infinispan is used as the directory provider for HSearch, and JGroups is the synchronization backend.

             

                 3. I think the default-hibernatesearch-infinispan.xml bundled in hibernate-search-infinispan-3.4.0.Final.jar is used by default when no configuration is explicitly specified.

                

            <?xml version="1.0" encoding="UTF-8"?>

            <infinispan

                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

                xsi:schemaLocation="urn:infinispan:config:4.2 http://www.infinispan.org/schemas/infinispan-config-4.2.xsd"

                xmlns="urn:infinispan:config:4.2">

                <!-- *************************** -->

                <!-- System-wide global settings -->

                <!-- *************************** -->

                <global>

                    <!-- Duplicate domains are allowed so that multiple deployments with default configuration

                        of Hibernate Search applications work - if possible it would be better to use JNDI to share

                        the CacheManager across applications -->

                    <globalJmxStatistics

                        enabled="true"

                        cacheManagerName="HibernateSearch"

                        allowDuplicateDomains="true" />

                    <!-- If the transport is omitted, there is no way to create distributed or clustered

                        caches. There is no added cost to defining a transport but not creating a cache that uses one,

                        since the transport is created and initialized lazily. -->

                    <transport

                        clusterName="HibernateSearch-Infinispan-cluster"

                        distributedSyncTimeout="60000">

                        <!-- Note that the JGroups transport uses sensible defaults if no configuration

                            property is defined. See the JGroupsTransport javadocs for more flags -->

                    </transport>

                    <!-- Used to register JVM shutdown hooks. hookBehavior: DEFAULT, REGISTER, DONT_REGISTER.

                        Hibernate Search takes care to stop the CacheManager so registering is not needed -->

                    <shutdown

                        hookBehavior="DONT_REGISTER" />

                </global>

             

                <!-- *************************** -->

                <!-- Default "template" settings -->

                <!-- *************************** -->

                <default>

                    <locking

                        lockAcquisitionTimeout="20000"

                        writeSkewCheck="false"

                        concurrencyLevel="500"

                        useLockStriping="false" />

                    <lazyDeserialization

                        enabled="false" />

                    <!-- Invocation batching is required for use with the Lucene Directory -->

                    <invocationBatching

                        enabled="true" />

                    <!-- This element specifies that the cache is clustered. modes supported: distribution

                        (d), replication (r) or invalidation (i). Don't use invalidation to store Lucene indexes (as

                        with Hibernate Search DirectoryProvider). Replication is recommended for best performance of

                        Lucene indexes, but make sure you have enough memory to store the index in your heap.

                        Also distribution scales much better than replication on high number of nodes in the cluster. -->

                    <clustering

                        mode="replication">

                        <!-- Prefer loading all data at startup than later -->

                        <stateRetrieval

                            timeout="20000"

                            logFlushTimeout="30000"

                            fetchInMemoryState="true"

                            alwaysProvideInMemoryState="true" />

                        <!-- Network calls are synchronous by default -->

                        <sync

                            replTimeout="20000" />

                    </clustering>

                    <jmxStatistics

                        enabled="true" />

                    <eviction

                        maxEntries="-1"

                        strategy="NONE" />

                    <expiration

                        maxIdle="-1" />

                </default>

             

                <!-- ******************************************************************************* -->

                <!-- Individually configured "named" caches.                                         -->

                <!--                                                                                 -->

                <!-- While default configuration happens to be fine with similar settings across the -->

                <!-- three caches, they should generally be different in a production environment.   -->

                <!--                                                                                 -->

                <!-- Current settings could easily lead to OutOfMemory exception as a CacheStore     -->

                <!-- should be enabled, and maybe distribution is desired.                           -->

                <!-- ******************************************************************************* -->

                <!-- *************************************** -->

                <!--  Cache to store Lucene's file metadata  -->

                <!-- *************************************** -->

                <namedCache

                    name="LuceneIndexesMetadata">

                    <clustering

                        mode="replication">

                        <stateRetrieval

                            fetchInMemoryState="true"

                            logFlushTimeout="30000" />

                        <sync

                            replTimeout="25000" />

                    </clustering>

                </namedCache>

                <!-- **************************** -->

                <!--  Cache to store Lucene data  -->

                <!-- **************************** -->

                <namedCache

                    name="LuceneIndexesData">

                    <clustering

                        mode="replication">

                        <stateRetrieval

                            fetchInMemoryState="true"

                            logFlushTimeout="30000" />

                        <sync

                            replTimeout="25000" />

                    </clustering>

                </namedCache>

                <!-- ***************************** -->

                <!--  Cache to store Lucene locks  -->

                <!-- ***************************** -->

                <namedCache

                    name="LuceneIndexesLocking">

                    <clustering

                        mode="replication">

                        <stateRetrieval

                            fetchInMemoryState="true"

                            logFlushTimeout="30000" />

                        <sync

                            replTimeout="25000" />

                    </clustering>

                </namedCache>

            </infinispan>
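
The configuration quoted above declares a <transport> element with no properties, and its own comment says the JGroups transport falls back to defaults in that case. A minimal sketch of how one might check an Infinispan XML file for an explicit JGroups stack follows. It assumes the stack is supplied through a transport property named configurationFile, as in the working configuration posted later in this thread; TransportConfigCheck is a hypothetical name and the code uses only the JDK's XML parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not an Infinispan class): reports which JGroups file,
// if any, an Infinispan XML configuration passes to its transport.
final class TransportConfigCheck {

    /** Returns the value of the transport's "configurationFile" property, or null if none is set. */
    static String jgroupsConfigurationFile(String infinispanXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the config uses the urn:infinispan:config namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(infinispanXml.getBytes(StandardCharsets.UTF_8)));
            // "*" matches any namespace, so 4.2 and 5.0 schemas are both handled.
            NodeList transports = doc.getElementsByTagNameNS("*", "transport");
            for (int i = 0; i < transports.getLength(); i++) {
                Element transport = (Element) transports.item(i);
                NodeList props = transport.getElementsByTagNameNS("*", "property");
                for (int j = 0; j < props.getLength(); j++) {
                    Element prop = (Element) props.item(j);
                    if ("configurationFile".equals(prop.getAttribute("name"))) {
                        return prop.getAttribute("value");
                    }
                }
            }
            return null; // no explicit JGroups file: the transport would use its defaults
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse Infinispan configuration", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<infinispan xmlns=\"urn:infinispan:config:4.2\"><global>"
                + "<transport clusterName=\"HibernateSearch-Infinispan-cluster\"/>"
                + "</global></infinispan>";
        System.out.println("configurationFile = " + jgroupsConfigurationFile(xml));
    }
}
```

A null result for the file actually loaded at runtime would be consistent with the "Using default JGroups configuration!" message in the first post.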

            • 3. Re: Problem in using Infinispan with Jgroups for Clustering
              Galder Zamarreño Master

              Hmmm, that's a very weird error that I haven't seen before, and that Infinispan version has been out for a while.

               

              I think we can do two things here:

              1. If you can build a test case that we can run to replicate the issue, that'd help us get to the bottom of this.

              2. Alternatively, try using Hibernate Search 4, which is at the release candidate stage now and uses a more recent Infinispan version (5.1.x IIRC).

              • 4. Re: Problem in using Infinispan with Jgroups for Clustering
                Dzung Leonhart Newbie

                Finally, I got it working on my 2 local nodes with the following configuration:

                     1. JAR files:

                     hibernate-search-3.5.0-SNAPSHOT.jar

                     hibernate-search-infinispan-3.4.0.Final.jar

                     infinispan-core-5.0.1.jar

                     jgroups-2.12.1.3.Final.jar

                 

                     2. Spring bean:                  

                     <bean id="sessionFactory"

                             class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean">

                             <property name="hibernateProperties">

                                 <props>

                                       <prop key="hibernate.search.default.directory_provider">infinispan</prop>

                                       <prop key="hibernate.search.infinispan.configuration_resourcename">hibernate-search-infinispan.xml</prop>

                                 </props>

                             </property>

                     </bean>

                 

                     3. hibernate-search-infinispan.xml

                <?xml version="1.0" encoding="UTF-8"?>

                 

                <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

                    xsi:schemaLocation="urn:infinispan:config:5.0 http://www.infinispan.org/schemas/infinispan-config-5.0.xsd"

                    xmlns="urn:infinispan:config:5.0">

                 

                 

                    <!-- *************************** -->

                    <!-- System-wide global settings -->

                    <!-- *************************** -->

                    <global>

                        <!-- Duplicate domains are allowed so that multiple deployments with default

                            configuration of Hibernate Search applications work - if possible it would

                            be better to use JNDI to share the CacheManager across applications -->

                        <globalJmxStatistics enabled="false"

                            cacheManagerName="HibernateSearch" allowDuplicateDomains="true" />

                 

                        <transport clusterName="infinispan-hibernate-search-cluster"

                            distributedSyncTimeout="60000"

                            transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">

                            <properties>

                                <property name="configurationFile" value="jdbc_ping.xml" />

                            </properties>

                        </transport>

                 

                        <!-- Used to register JVM shutdown hooks. hookBehavior: DEFAULT, REGISTER,

                            DONT_REGISTER. Hibernate Search takes care to stop the CacheManager so registering

                            is not needed -->

                        <shutdown hookBehavior="DONT_REGISTER" />

                    </global>

                 

                 

                    <!-- *************************** -->

                    <!-- Default "template" settings -->

                    <!-- *************************** -->

                    <default>

                        <locking lockAcquisitionTimeout="60000" writeSkewCheck="false"

                            concurrencyLevel="500" useLockStriping="false" />

                 

                        <!-- Invocation batching is required for use with the Lucene Directory -->

                        <invocationBatching enabled="true" />

                 

                 

                        <!-- This element specifies that the cache is clustered. modes supported:

                            distribution (d), replication (r) or invalidation (i). Don't use invalidation

                            to store Lucene indexes (as with Hibernate Search DirectoryProvider). Replication

                            is recommended for best performance of Lucene indexes, but make sure you

                            have enough memory to store the index in your heap. Also distribution scales

                            much better than replication on high number of nodes in the cluster. -->

                        <clustering mode="replication">

                            <!-- Prefer loading all data at startup than later -->

                            <stateRetrieval timeout="60000" logFlushTimeout="60000"

                                fetchInMemoryState="true" alwaysProvideInMemoryState="true" />

                            <!-- Network calls are synchronous by default -->

                            <sync replTimeout="60000" />

                        </clustering>

                        <jmxStatistics enabled="false" />

                        <eviction maxEntries="-1" strategy="NONE" />

                        <expiration maxIdle="-1" />

                    </default>

                 

                 

                    <!-- ******************************************************************************* -->

                    <!-- Individually configured "named" caches. -->

                    <!-- -->

                    <!-- While default configuration happens to be fine with similar settings

                        across the -->

                    <!-- three caches, they should generally be different in a production environment. -->

                    <!-- -->

                    <!-- Current settings could easily lead to OutOfMemory exception as a CacheStore -->

                    <!-- should be enabled, and maybe distribution is desired. -->

                    <!-- ******************************************************************************* -->

                 

                 

                    <!-- *************************************** -->

                    <!-- Cache to store Lucene's file metadata -->

                    <!-- *************************************** -->

                    <namedCache name="LuceneIndexesMetadata">

                        <clustering mode="replication">

                            <stateRetrieval fetchInMemoryState="true"

                                logFlushTimeout="60000" />

                            <sync replTimeout="60000" />

                        </clustering>

                    </namedCache>

                 

                 

                    <!-- **************************** -->

                    <!-- Cache to store Lucene data -->

                    <!-- **************************** -->

                    <namedCache name="LuceneIndexesData">

                        <clustering mode="replication">

                            <stateRetrieval fetchInMemoryState="true"

                                logFlushTimeout="60000" />

                            <sync replTimeout="60000" />

                        </clustering>

                    </namedCache>

                 

                 

                    <!-- ***************************** -->

                    <!-- Cache to store Lucene locks -->

                    <!-- ***************************** -->

                    <namedCache name="LuceneIndexesLocking">

                        <clustering mode="replication">

                            <stateRetrieval fetchInMemoryState="true"

                                logFlushTimeout="60000" />

                            <sync replTimeout="60000" />

                        </clustering>

                    </namedCache>

                </infinispan>

                 

                     4. jdbc_ping.xml

                <config xmlns="urn:org:jgroups" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

                    xsi:schemaLocation="urn:org:jgroups JGroups-2.12.xsd">

                    <TCP bind_port="${jgroups.tcp.port:7800}"

                        loopback="true" port_range="30" recv_buf_size="20000000"

                        send_buf_size="640000" discard_incompatible_packets="true"

                        max_bundle_size="64000" max_bundle_timeout="30" enable_bundling="true"

                        use_send_queues="true" sock_conn_timeout="300" enable_diagnostics="false"

                        thread_pool.enabled="true" thread_pool.min_threads="2"

                        thread_pool.max_threads="30" thread_pool.keep_alive_time="5000"

                        thread_pool.queue_enabled="false" thread_pool.queue_max_size="100"

                        thread_pool.rejection_policy="Discard" oob_thread_pool.enabled="true"

                        oob_thread_pool.min_threads="2" oob_thread_pool.max_threads="30"

                        oob_thread_pool.keep_alive_time="5000" oob_thread_pool.queue_enabled="false"

                        oob_thread_pool.queue_max_size="100" oob_thread_pool.rejection_policy="Discard" />

                 

                    <JDBC_PING connection_driver="com.mysql.jdbc.Driver"

                        connection_username="xxx" connection_password="xxx"

                        connection_url="jdbc:mysql://xxx/xxx" level="debug" />

                 

                    <MERGE2 max_interval="30000" min_interval="10000" />

                    <FD_SOCK />

                    <FD timeout="3000" max_tries="3" />

                    <VERIFY_SUSPECT timeout="1500" />

                    <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"

                        retransmit_timeout="300,600,1200,2400,4800" discard_delivered_msgs="false" />

                    <UNICAST timeout="300,600,1200" />

                    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"

                        max_bytes="400000" />

                    <pbcast.GMS print_local_addr="false" join_timeout="7000"

                        view_bundling="true" />

                    <UFC max_credits="2000000" min_threshold="0.10" />

                    <MFC max_credits="2000000" min_threshold="0.10" />

                    <FRAG2 frag_size="60000" />

                    <pbcast.STREAMING_STATE_TRANSFER />

                </config>

                 

                I see that my previous Spring bean configuration:

                      <bean id="sessionFactory"

                            <property name="hibernateProperties">

                            <props>

                                <prop key="hibernate.search.default.directory_provider">infinispan</prop>

                                <prop key="hibernate.search.worker.backend.jgroups.configurationFile">jdbc_ping.xml</prop>               

                            </props>

                        </property>

                didn't work as I expected. As I observed in the logs I posted above, it seems UDP broadcast was still used to discover nodes, and JDBC_PING had no effect on that?

                 

                Furthermore, when I deployed to Amazon EC2, I got this error when starting the second node:

                 

                2011-11-15 01:27:56,623 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - ISPN000078: Starting JGroups Channel

                2011-11-15 01:27:56,727 [main] INFO  org.jgroups.JChannel - JGroups version: 2.12.1.3.Final

                2011-11-15 01:27:57,025 [main] DEBUG org.jgroups.protocols.JDBC_PING - Registering JDBC Driver named 'com.mysql.jdbc.Driver'

                2011-11-15 01:27:57,474 [main] DEBUG org.jgroups.protocols.JDBC_PING - Could not execute initialize_sql statement; not necessarily an error.

                2011-11-15 01:27:57,586 [Timer-2,infinispan-hibernate-search-cluster,ip-10-156-119-140-17503] DEBUG org.jgroups.protocols.JDBC_PING - Removed 2352aa04-bbaf-c079-d806-f87b5c1378ea for clustername infinispan-hibernate-search-cluster from database.

                2011-11-15 01:27:57,595 [Timer-2,infinispan-hibernate-search-cluster,ip-10-156-119-140-17503] DEBUG org.jgroups.protocols.JDBC_PING - Registered 2352aa04-bbaf-c079-d806-f87b5c1378ea for clustername infinispan-hibernate-search-cluster into database.

                2011-11-15 01:27:57,961 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - ISPN000094: Received new cluster view: [ip-10-152-101-144-26718|11] [ip-10-152-101-144-26718, ip-10-156-119-140-17503]

                2011-11-15 01:27:57,965 [main] INFO  org.infinispan.remoting.transport.jgroups.JGroupsTransport - ISPN000079: Cache local address is ip-10-156-119-140-17503, physical addresses are [10.156.119.140:7800]

                2011-11-15 01:27:57,965 [main] INFO  org.infinispan.factories.GlobalComponentRegistry - ISPN000128: Infinispan version: Infinispan 'Pagoa' 5.0.1.FINAL

                2011-11-15 01:27:58,305 [main] INFO  org.infinispan.remoting.rpc.RpcManagerImpl - ISPN000074: Trying to fetch state from ip-10-152-101-144-26718

                2011-11-15 01:28:19,355 [Incoming-2,infinispan-hibernate-search-cluster,ip-10-156-119-140-17503] WARN  org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER - State reader socket thread spawned abnormaly

                java.net.ConnectException: Connection timed out

                    at java.net.PlainSocketImpl.socketConnect(Native Method)

                    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)

                    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)

                    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)

                    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

                    at java.net.Socket.connect(Socket.java:529)

                    at org.jgroups.util.Util.connect(Util.java:276)

                    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:510)

                    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:462)

                    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:223)

                    at org.jgroups.protocols.FRAG2.up(FRAG2.java:189)

                    at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)

                    at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)

                    at org.jgroups.protocols.pbcast.GMS.up(GMS.java:908)

                    at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)

                    at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:613)

                    at org.jgroups.protocols.UNICAST.up(UNICAST.java:294)

                    at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:703)

                    at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:133)

                    at org.jgroups.protocols.FD.up(FD.java:275)

                    at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:275)

                    at org.jgroups.protocols.MERGE2.up(MERGE2.java:209)

                    at org.jgroups.protocols.Discovery.up(Discovery.java:293)

                    at org.jgroups.protocols.TP.passMessageUp(TP.java:1109)

                    at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1665)

                    at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1647)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

                    at java.lang.Thread.run(Thread.java:662)

                2011-11-15 01:28:19,356 [Incoming-2,infinispan-hibernate-search-cluster,ip-10-156-119-140-17503] WARN  org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER - Could not connect to state provider. Closing socket...

                2011-11-15 01:28:57,723 [Timer-2,infinispan-hibernate-search-cluster,ip-10-156-119-140-17503] DEBUG org.jgroups.protocols.JDBC_PING - Removed 2352aa04-bbaf-c079-d806-f87b5c1378ea for clustername infinispan-hibernate-search-cluster from database.

                2011-11-15 01:28:57,727 [Timer-2,infinispan-hibernate-search-cluster,ip-10-156-119-140-17503] DEBUG org.jgroups.protocols.JDBC_PING - Registered 2352aa04-bbaf-c079-d806-f87b5c1378ea for clustername infinispan-hibernate-search-cluster into database.

                -------------------------

                 

                Then I stopped and started both nodes several times and ... it worked. Do you have any advice on how to eliminate this problem?

                 

                Thanks a lot, Infinispan team. Your help is really essential to me.

                 

                Best Regards,

                Dung Ngo.

                • 5. Re: Problem in using Infinispan with Jgroups for Clustering
                  Galder Zamarreño Master

                  In EC2, machines are set up with firewalls, and by default STREAMING_STATE_TRANSFER uses a separate TCP port to do the streaming. So, you can either:

                   

                  - set the use_default_transport property of the STREAMING_STATE_TRANSFER protocol to true, so that the state is streamed through the regular transport.

                   

                  or

                   

                  - set bind_port property in STREAMING_STATE_TRANSFER to a particular value, rather than the default 0, and then open that port in the EC2 instances.
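
                  As a rough sketch of the two options, applied to the last element of the jdbc_ping.xml posted above (use one or the other, not both). The port number 7900 is only an example value I picked for illustration; any free port that you also open in the EC2 security group should do.

```xml
<!-- Option 1: stream the state through the existing transport,
     so no extra port has to be opened beyond the TCP transport port(s) -->
<pbcast.STREAMING_STATE_TRANSFER use_default_transport="true" />

<!-- Option 2: keep the separate state-transfer socket, but pin it to a fixed port.
     7900 is an arbitrary example; open the same port in the EC2 firewall rules -->
<pbcast.STREAMING_STATE_TRANSFER bind_port="7900" />
```

                  With the first option only the ports already used by the TCP protocol (bind_port 7800 plus its port_range in the config above) need to be reachable between the nodes.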

                  • 6. Re: Problem in using Infinispan with Jgroups for Clustering
                    Dzung Leonhart Newbie

                    You pointed out the correct solution for that issue.

                     

                    Thank you very much.

                    • 7. Re: Problem in using Infinispan with Jgroups for Clustering
                      Galder Zamarreño Master

                      Btw, as an FYI, starting with Infinispan 5.1.0, streaming state transfer won't be necessary any more. This change is already present in the latest 5.1.0.BETA4.