
    This node cannot be suspected!

    anovarini

      Hello everybody,

      I'm new to this forum, and after a quick search for something helpful, I still have a question about a configuration issue I'm experiencing.

      I have a cluster with two nodes (IPs are 10.0.0.152 and 10.0.0.153).
      When I start the second node (it doesn't matter which one I start first), I see these messages in its log file:

      09:36:04,940 INFO [TreeCache] viewAccepted(): [10.0.0.153:33063|1] [10.0.0.153:33063, 10.0.0.152:51775]
      09:36:05,081 INFO [TreeCache] locking the subtree at / to transfer state
      09:36:05,097 INFO [StateTransferGenerator_140] returning the state for tree rooted in /(1024 bytes)
      09:36:07,482 INFO [agora-partition-prod] New cluster view for partition agora-partition-prod (id: 1, delta: 1) : [10.0.0.153:4399, 10.0.0.152:4399]
      09:36:07,482 INFO [agora-partition-prod] I am (10.0.0.153:4399) received membershipChanged event:
      09:36:07,483 INFO [agora-partition-prod] Dead members: 0 ([])
      09:36:07,483 INFO [agora-partition-prod] New Members : 1 ([10.0.0.152:4399])
      09:36:07,483 INFO [agora-partition-prod] All Members : 2 ([10.0.0.153:4399, 10.0.0.152:4399])
      09:36:10,789 INFO [agora-partition-prod] Suspected member: 10.0.0.152:51778 (additional data: 15 bytes)
      09:36:10,792 INFO [agora-partition-prod] New cluster view for partition agora-partition-prod (id: 2, delta: -1) : [10.0.0.153:4399]
      09:36:10,793 INFO [agora-partition-prod] I am (10.0.0.153:4399) received membershipChanged event:
      09:36:10,793 INFO [agora-partition-prod] Dead members: 1 ([10.0.0.152:4399])
      09:36:10,793 INFO [agora-partition-prod] New Members : 0 ([])
      09:36:10,793 INFO [agora-partition-prod] All Members : 1 ([10.0.0.153:4399])
      09:36:13,627 INFO [agora-partition-prod] New cluster view for partition agora-partition-prod (id: 3, delta: 1) : [10.0.0.153:4399, 10.0.0.152:4399]
      09:36:13,628 INFO [agora-partition-prod] I am (10.0.0.153:4399) received membershipChanged event:
      09:36:13,628 INFO [agora-partition-prod] Dead members: 0 ([])
      09:36:13,628 INFO [agora-partition-prod] New Members : 1 ([10.0.0.152:4399])
      09:36:13,628 INFO [agora-partition-prod] All Members : 2 ([10.0.0.153:4399, 10.0.0.152:4399])

      -------------------------

      In the log file on the other node, I see this:

      -------------------------------------------------------
      GMS: address is 10.0.0.152:51830 (additional data: 15 bytes)
      -------------------------------------------------------
      09:37:30,714 INFO [agora-partition-prod] New cluster view for partition agora-partition-prod: 27 ([10.0.0.153:4399, 10.0.0.152:4399] delta: 0)
      09:37:30,715 INFO [agora-partition-prod] I am (10.0.0.152:4399) received membershipChanged event:
      09:37:30,715 INFO [agora-partition-prod] Dead members: 0 ([])
      09:37:30,715 INFO [agora-partition-prod] New Members : 0 ([])
      09:37:30,715 INFO [agora-partition-prod] All Members : 2 ([10.0.0.153:4399, 10.0.0.152:4399])
      09:37:34,019 INFO [agora-partition-prod] Suspected member: 10.0.0.152:51830 (additional data: 15 bytes)
      09:37:34,021 WARN [GMS] checkSelfInclusion() failed, 10.0.0.152:51830 (additional data: 15 bytes) is not a member of view [10.0.0.153:33066 (additional data: 15 bytes)|28] [10.0.0.153:33066 (additional data: 15 bytes)]; discarding view
      09:37:34,021 WARN [GMS] I (10.0.0.152:51830 (additional data: 15 bytes)) am being shunned, will leave and rejoin group (prev_members are [10.0.0.153:33066 (additional data: 15 bytes) 10.0.0.152:51830 (additional data: 15 bytes)])

      -------------------
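      If I read the logs correctly, the "Suspected member" line comes from the failure-detection protocols (FD/FD_SOCK) in the JGroups stack, and the "I am being shunned" warning is GMS reacting to the (false) suspicion by forcing the node to leave and rejoin. For reference, this is roughly the failure-detection part of my cluster-service.xml; I haven't modified the stack, so I'm assuming the values below are the stock 4.0.5 defaults:

      <!-- FD_SOCK detects hard crashes through a ring of TCP sockets -->
      <FD_SOCK down_thread="false" up_thread="false"/>
      <!-- FD suspects a member after max_tries missed are-you-alive replies -->
      <FD timeout="10000" max_tries="5" shun="true"
          up_thread="false" down_thread="false"/>
      <!-- VERIFY_SUSPECT double-checks a suspicion before GMS acts on it -->
      <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
      <!-- GMS with shun="true" makes a suspected-but-alive node leave and
           rejoin, which matches the warning above -->
      <pbcast.GMS shun="true" join_timeout="3000" join_retry_timeout="2000"
          print_local_addr="true" down_thread="false" up_thread="false"/>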

      Now some notes about the environment:

      The JBoss version is 4.0.5.GA;
      The JDK is 1.5.0_10 (Itanium build);
      The operating system is Red Hat Linux, 64-bit;
      The two installations were done using a script that worked fine up to JBoss version 4.0.4;
      In the test environment the two nodes seem to work correctly;
      In the test environment, running two 'all' instances works fine, but this doesn't happen on the servers above; this makes me think it's not a problem with the instances, but something related to the environment...
      I'm using the standard UDP protocol stack, as defined in cluster-service.xml (see the sketch below).
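      For completeness, here is a minimal sketch of the transport section of the same file, as I believe it ships in 4.0.5 (the multicast address and port are the defaults; I haven't customized them):

      <!-- UDP transport: all group traffic goes over IP multicast -->
      <UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}"
           mcast_port="45566" ip_ttl="8" loopback="false"
           mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
           ucast_send_buf_size="800000" ucast_recv_buf_size="150000"/>
      <!-- PING discovers the initial membership via multicast -->
      <PING timeout="2000" num_initial_members="3"
           up_thread="false" down_thread="false"/>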

      If anyone can point me in the right direction to fix this, I'd really appreciate it.

      Thank you in advance.
      Ale