7 Replies Latest reply on May 18, 2009 9:36 AM by belaban

    Facing Reincarnation Error?

      I am using JBoss 4.0.3SP1 (which ships with JGroups 2.2.7) to form a cluster of 3 nodes on Windows 2003, running on JDK 1.4.2_17.

      I configured a TCP stack for the underlying communication, in both cluster-service.xml and tc5-cluster-service.xml, as follows:

      <Config>
        <TCP bind_addr="10.200.**.1" start_port="7800" loopback="true"/>
        <TCPPING initial_hosts="10.200.**.1[7800],10.200.**.2[7800],10.200.**.3[7800]" port_range="3" timeout="10000"
                 num_initial_members="3" up_thread="true" down_thread="true"/>
        <MERGE2 min_interval="5000" max_interval="10000"/>
        <FD shun="false" timeout="15000" max_tries="5" up_thread="true" down_thread="true"/>
        <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
        <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
                       retransmit_timeout="3000"/>
        <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false"/>
        <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
                    print_local_addr="true" down_thread="true" up_thread="true"/>
        <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
      </Config>



      The problem is that if I shut down one server (say, node B) and restart it immediately, I get ERROR messages:
      on the B side:
      09:07:08|ERROR [ParticipantGmsImpl] handleJoinResponse() should not be invoked on an instance of org.jgroups.protocols.pbcast.participantGmsImpl


      and after a while also:
      ERROR [GMS] [B:7800(additional data:18bytes)] received view <= current view; discarding it (current vid:[A:7800(additional data:18bytes)|2],new vid:[A:7800(additional data:18bytes)|2])


      on the A side:
      09:07:03|ERROR[CoordGmsImp] memeber B:7800 already present; return existing view [A:7800,B:7800]
      09:07:07|ERROR[GMS][A:7800] received view<=current view;descarding it(current vid:[A:7800|1], new vid:[B:7800|1])


      After B has started up, I can see in the web console that B has joined the cluster view. Is this a proper cluster? Can I just ignore these error messages? It would be great if someone could explain exactly what they mean and what is happening behind the scenes.

      Is this a typical reincarnation error?
      Is there a way to solve this problem? It happens a lot and is really annoying.

      My workaround (it works sometimes, but looks stupid):
      After I shut down a node, I wait about 4 minutes before restarting it. The error messages then no longer show up.

      Thanks for any ideas.

        • 1. Re: Facing Reincarnation Error?
          belaban

          Yes, this is most likely a reincarnation error: the time to detect a crash (in FD) and exclude the crashed member is 46.5 seconds, so when you kill and restart a node within ~46 secs, that node will get the same port and be reincarnated: it rejoins under an address that the cluster still lists as a member.

          Workarounds (besides using 2.8):
          - Add FD_SOCK as described on the JGroups wiki
          - Don't set bind_port in TCP
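
          A minimal sketch of the first workaround, assuming the stack posted above and assuming FD_SOCK is placed directly in front of FD. The down_thread/up_thread values on FD_SOCK are an assumption, not something taken from the wiki. FD_SOCK keeps a TCP socket open to a neighbouring member and suspects it as soon as that socket closes, so a killed process can be noticed without waiting for FD's heartbeat timeout:

          ```xml
          <!-- Fragment only: the rest of the stack stays as posted above -->
          <MERGE2 min_interval="5000" max_interval="10000"/>
          <!-- Added: socket-based failure detection (attribute values are an assumption) -->
          <FD_SOCK down_thread="false" up_thread="false"/>
          <!-- Unchanged: heartbeat-based detection still covers hung hosts and network failures -->
          <FD shun="false" timeout="15000" max_tries="5" up_thread="true" down_thread="true"/>
          <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
          ```

          The same change would have to be made in both cluster-service.xml and tc5-cluster-service.xml, and on all three nodes.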

          Bela

          • 2. Re: Facing Reincarnation Error?

            Thanks Bela.

            Does "don't set the bind_port in TCP" mean setting it to "0"?
            Then I can't use TCPPING anymore, right?

            • 3. Re: Facing Reincarnation Error?
              belaban

              Either set it to 0, or omit the parameter altogether.
              You could use TCPGOSSIP or FILE_PING instead of TCPPING.

              • 4. Re: Facing Reincarnation Error?

                Unfortunately, for various reasons, we cannot run an extra GossipRouter, and FILE_PING is not supported by JGroups 2.2.7, right? A pity.

                How about adding the persistent_ports attribute?
                It seems that it is not supported by version 2.2.7 either, because I get an ERROR message saying the attribute is not recognized.

                Should I upgrade JGroups? If so, to which version? I tried 2.4.5GA, but it was unstable. I also want the auto-connection feature to be enabled by default, so I'll try 2.4.2.

                Are there any recommendations or experiences to share?

                Thanks a lot.
                -------------------
                LarryJ

                • 5. Re: Facing Reincarnation Error?
                  belaban

                  Upgrade to 2.6.10; it is backwards compatible with 2.2.7 (which is very old).

                  • 6. Re: Facing Reincarnation Error?

                    Thanks Bela...

                    But version 2.6 requires JDK 1.5. We are on 1.4.2 and cannot upgrade because of other applications.

                    Nevertheless, I have given up on TCP and now use a UDP configuration.
                    It works more smoothly.

                    • 7. Re: Facing Reincarnation Error?
                      belaban

                      OK, good. I strongly suggest upgrading your JDK though... JDK 1.5 is EOL and will no longer be supported by Sun this year, let alone JDK 1.4...