2 Replies Latest reply on Sep 13, 2010 4:09 AM by Vladimir Blagojevic

    Getting java.lang.IllegalStateException: Join cannot be complete without rehash to finish

    Savin Seven Newbie

      Dear All,

       

      I was running an Infinispan cluster, and after two or three days I started getting the exception and stack trace below on one server.

       

      Exception:

      : view is MergeView::[lnxdevvm102-45158|3] [lnxdevvm102-45158, lnxdevvm102-4758], subgroups=[[lnxdevvm102-45158|1] [lnxdevvm102-4758], [lnxdevvm102-45158|2] [lnxdevvm102-45158]]
      2010-09-08 04:39:28,194 DEBUG [FD_SOCK] (Incoming-1,Infinispan-Cluster,lnxdevvm102-4758) VIEW_CHANGE received: [lnxdevvm102-45158, lnxdevvm102-4758]
      2010-09-08 04:39:28,234 INFO  [JGroupsTransport] (Incoming-1,Infinispan-Cluster,lnxdevvm102-4758) Received new cluster view: MergeView::[lnxdevvm102-45158|3] [lnxdevvm102-45158, lnxdevvm102-4758], subgroups=[[lnxdevvm102-45158|1] [lnxdevvm102-4758], [lnxdevvm102-45158|2] [lnxdevvm102-45158]]
      2010-09-08 04:39:28,354 DEBUG [JoinTask] (Rehasher-lnxdevvm102-4758) Commencing rehash on node: lnxdevvm102-4758. Before start, dmi.joinComplete = true
      2010-09-08 04:39:28,372 DEBUG [FLUSH] (Incoming-1,Infinispan-Cluster,lnxdevvm102-4758) lnxdevvm102-4758: installing view MergeView::[lnxdevvm102-45158|3] [lnxdevvm102-45158, lnxdevvm102-4758], subgroups=[[lnxdevvm102-45158|1] [lnxdevvm102-4758], [lnxdevvm102-45158|2] [lnxdevvm102-45158]]
      2010-09-08 04:39:28,376 DEBUG [JoinTask] (Rehasher-lnxdevvm102-4758) Commencing rehash on node: lnxdevvm102-4758. Before start, dmi.joinComplete = true
      Commencing rehash on node: lnxdevvm102-4758. Before start, dmi.joinComplete = true
      2010-09-08 04:39:28,476 ERROR [JoinTask] (Rehasher-lnxdevvm102-4758) Caught exception!
      java.lang.IllegalStateException: Join cannot be complete without rehash to finish (node lnxdevvm102-4758 )
              at org.infinispan.distribution.JoinTask.performRehash(JoinTask.java:82)
              at org.infinispan.distribution.RehashTask.call(RehashTask.java:52)
              at org.infinispan.distribution.RehashTask.call(RehashTask.java:32)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
              at java.util.concurrent.FutureTask.run(FutureTask.java:138)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              at java.lang.Thread.run(Thread.java:619)
      2010-09-08 04:39:28,526 ERROR [JoinTask] (Rehasher-lnxdevvm102-4758) Caught exception!
      java.lang.IllegalStateException: Join cannot be complete without rehash to finish (node lnxdevvm102-4758 )
              at org.infinispan.distribution.JoinTask.performRehash(JoinTask.java:82)
              at org.infinispan.distribution.RehashTask.call(RehashTask.java:52)
              at org.infinispan.distribution.RehashTask.call(RehashTask.java:32)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
              at java.util.concurrent.FutureTask.run(FutureTask.java:138)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              at java.lang.Thread.run(Thread.java:619)

       

      --------------------------------------------------------------------------------------------------------------------------

      Configuration File:

       

      <?xml version="1.0" encoding="UTF-8"?>
      <infinispan>
        <global>
          <transport clusterName="infinispan-cluster"
                     distributedSyncTimeout="50000"
                     transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
            <properties>
              <property name="configurationFile" value="../etc/config-samples/jgroups-udp.xml"/>
              <!-- <property name="configurationFile" value="../etc/config-samples/jgroups-tcp.xml"/> -->
            </properties>
          </transport>
        </global>

        <default>
          <clustering mode="distribution">
            <sync/>
            <hash rehashEnabled="true" numOwners="2" rehashWait="120000" rehashRpcTimeout="600000"/>
            <l1 enabled="false" lifespan="600000"/>
          </clustering>
        </default>

        <namedCache name="distributedCache">
          <clustering mode="distribution">
            <sync/>
            <hash rehashEnabled="true" numOwners="2" rehashWait="120000" rehashRpcTimeout="600000"/>
            <l1 enabled="false" lifespan="600000"/>
          </clustering>
        </namedCache>

        <namedCache name="replicationCache">
          <clustering mode="replication">
            <!-- Defines whether to retrieve state on startup -->
            <stateRetrieval timeout="20000" fetchInMemoryState="false"/>
            <!-- Network calls are synchronous. -->
            <sync replTimeout="20000"/>
          </clustering>
        </namedCache>
      </infinispan>

       

       

      ------------------------------------------------------------------------------------------------------------------------------------------

      Server startup Command:

       

      Machine 1: The server is started with the following command:

      startServer.bat -l 192.168.1.7 -p 6904 -m 20 -t 20 -c ../etc/config-samples/minimal.xml -r hotrod -i 600 -n true -s 1024000 -e 1024000 -o 192.168.1.7 -x 6904

       

      Machine 2: The server is started with the following command:

      startServer.bat -l 192.168.1.14 -p 6905 -m 20 -t 20 -c ../etc/config-samples/minimal.xml -r hotrod -i 600 -n true -s 1024000 -e 1024000 -o 192.168.1.14 -x 6905

       

      -----------------------------------------------------------------------------------------------------------------------------------------

       

      Question 1:

      Why is this exception occurring on my Infinispan server?

       

      Question 2:

      How can I prevent this exception in the future? Do I need to change anything in the configuration files or elsewhere on my end?

       

      Thanks & Regards,

      Savin