1 Reply Latest reply on Oct 13, 2004 10:12 AM by drpizza

    Removing replicants from partition occasionally failing in 3

    drpizza

      Occasionally, when a member leaves the cluster, the departure is not handled properly and we see this in our log files:

      2004-10-13 09:03:20,283 INFO [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.DefaultPartition] Suspected member: 10.3.1.40:32878 (additional data: 14 bytes)
      2004-10-13 09:03:20,289 INFO [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.DefaultPartition] New cluster view (id: 22, delta: -1) : [10.3.1.51:1099]
      2004-10-13 09:03:20,289 INFO [DefaultPartition:ReplicantManager] Dead members: 1
      2004-10-13 09:03:20,290 DEBUG [DefaultPartition:ReplicantManager] trying to remove deadMember 10.3.1.41:1099 for key HAJNDI
      2004-10-13 09:03:20,290 DEBUG [DefaultPartition:ReplicantManager] 10.3.1.41:1099 was NOT removed!!!
      2004-10-13 09:03:20,291 DEBUG [DefaultPartition:ReplicantManager] trying to remove deadMember 10.3.1.41:1099 for key DCacheBridge-DefaultJGBridge
      2004-10-13 09:03:20,303 DEBUG [DefaultPartition:ReplicantManager] 10.3.1.41:1099 was NOT removed!!!
      


      This has frankly disastrous repercussions: everything still tries to use the dead member, clustered singletons don't fail over, and it's all horrible, horrible, horrible.

      The majority of the time things work properly: members leave the partition and everything is fine and dandy.
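
      Since the failure is intermittent, it may help to scan the server logs for it automatically. The following is only a minimal sketch, assuming the two DEBUG message formats shown in the log excerpt above; the function name and the regular expressions are illustrative and are not part of JBoss, and other versions may word these messages differently.

```python
import re

# Patterns are assumptions taken from the log excerpt in this post.
TRY_RE = re.compile(r"trying to remove deadMember (\S+) for key (\S+)")
FAIL_RE = re.compile(r"(\S+) was NOT removed!!!")


def failed_removals(lines):
    """Return (member, key) pairs whose removal attempt was followed
    by a 'was NOT removed!!!' line for the same member."""
    failures = []
    pending = None  # most recent (member, key) removal attempt
    for line in lines:
        attempt = TRY_RE.search(line)
        if attempt:
            pending = (attempt.group(1), attempt.group(2))
            continue
        failure = FAIL_RE.search(line)
        if failure and pending and pending[0] == failure.group(1):
            failures.append(pending)
            pending = None
    return failures
```

      Passing the lines of a log file (for example `failed_removals(open(path))`, where `path` is wherever your server log lives) returns every member/key pair that the ReplicantManager reported it could not remove.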