1 Reply Latest reply on Nov 10, 2006 1:52 PM by azhurakousky

    Weird cluster error

    metalhead

      Hello, I'm experiencing some weird errors... my guess is it has something to do with clustering... (using JBoss 4.0.3)

      We're developing in a small team, in one room, using the same switch.

      Now, when my colleague starts JBoss first, and after that I start JBoss, my application fails.

      This is part of the console of my colleague (ELECTRA):

      08:23:01,392 WARN [FD] I was suspected, but will not remove myself from membership (waiting for EXIT message)
      08:23:02,892 WARN [CoordGmsImpl] I am the coord and I m being am suspected -- will probably leave shortly
      08:23:08,970 WARN [FD] I was suspected, but will not remove myself from membership (waiting for EXIT message)
      08:23:10,470 WARN [CoordGmsImpl] I am the coord and I m being am suspected -- will probably leave shortly
      08:23:11,986 WARN [FD] I was suspected, but will not remove myself from membership (waiting for EXIT message)
      08:23:13,486 WARN [CoordGmsImpl] I am the coord and I m being am suspected -- will probably leave shortly
      08:23:13,986 WARN [GMS] checkSelfInclusion() failed, 172.16.3.82:1321 is not a member of view [HYPERION:1234|4] [HYPERION:1234]; discarding view
      08:23:13,986 WARN [GMS] I (172.16.3.82:1321) am being shunned, will leave and rejoin group (prev_members are [172.16.3.82:1321 HYPERION:1096 HYPERION:1234 ])
      08:23:14,861 INFO [STDOUT]
      -------------------------------------------------------
      GMS: address is 172.16.3.82:2157
      -------------------------------------------------------
      08:23:16,564 ERROR [ClientGmsImpl] suspect() should not be invoked on an instance of org.jgroups.protocols.pbcast.ClientGmsImpl
      08:23:19,532 ERROR [ClientGmsImpl] suspect() should not be invoked on an instance of org.jgroups.protocols.pbcast.ClientGmsImpl
      08:23:21,861 WARN [ClientGmsImpl] handleJoin(172.16.3.82:2157) failed, retrying
      08:23:30,861 WARN [ClientGmsImpl] handleJoin(172.16.3.82:2157) failed, retrying
      08:23:36,907 INFO [TreeCache] viewAccepted(): new members: [HYPERION:1234, 172.16.3.82:2157]


      And this is my console (HYPERION):
      08:23:13,015 WARN [FD] ping_dest is null: members=[ELECTRA:1321, 172.16.3.77:1234], pingable_mbrs=[172.16.3.77:1234], local_addr=172.16.3.77:1234
      08:23:16,015 WARN [FD] ping_dest is null: members=[ELECTRA:1321, 172.16.3.77:1234], pingable_mbrs=[172.16.3.77:1234], local_addr=172.16.3.77:1234
      08:23:19,015 WARN [FD] ping_dest is null: members=[ELECTRA:1321, 172.16.3.77:1234], pingable_mbrs=[172.16.3.77:1234], local_addr=172.16.3.77:1234
      08:23:23,343 ERROR [CoordGmsImpl] mbr ELECTRA:1321 is not a member !
      08:23:23,343 INFO [TreeCache] viewAccepted(): new members: [172.16.3.77:1234]
      08:23:23,359 ERROR [CoordGmsImpl] mbr ELECTRA:1321 is not a member !


      Is this problem cluster-related? And if so, how can I fix this?

        • 1. Re: Weird cluster error

          Without getting into too many details, here is a quick explanation and a fix:

          Because you are on the same network, your servers are attempting to join the same cluster (whether you like it or not). You can clearly see that the two JGroups stacks are talking to each other and shunning members, since they belong to different groups.

          Start your JBoss as follows:

          run -c <conf_name> -g ELECTRA_PARTITION -u 230.1.2.10
          run -c <conf_name> -g HYPERION_PARTITION -u 230.1.2.11

          Option -g sets a unique name for the cluster partition, so you won't have a conflict over a shared name such as DefaultPartition.
          Option -u overrides the default multicast address.

          With that, your servers will not even see each other.

          Conversely, if you do want the two of them to join the same cluster, make sure that the values of -g and -u are the same on both machines.
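
          For the shared-cluster case, a minimal sketch of the matching invocation might look like the following. The partition name DEV_PARTITION, the address 230.1.2.20, and the "all" configuration are illustrative assumptions, not values from this thread, and jboss_cmd is a hypothetical helper that only prints the command line so you can check that both machines use identical values.

```shell
# Hypothetical helper: prints (does not execute) the JBoss start command.
# Arguments: <conf_name> <partition_name> <multicast_address>
jboss_cmd() {
  printf 'run -c %s -g %s -u %s\n' "$1" "$2" "$3"
}

# Assumed shared values; use the same pair on both ELECTRA and HYPERION.
PARTITION=DEV_PARTITION
MCAST_ADDR=230.1.2.20

# Print the command to run on each machine that should join the cluster.
jboss_cmd all "$PARTITION" "$MCAST_ADDR"
```

          Keeping the two values in one place makes it harder for the machines to drift apart; if either -g or -u differs, the nodes are meant to end up in separate partitions, as in the isolated setup above.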