1 Reply Latest reply on Jun 10, 2004 5:04 AM by Tom

    cluster suddenly failed

    Tom Newbie

      I've been happily running a cluster for several weeks, but yesterday it fell apart and I can't get it working again.
      The first thing that happened was that at least one node stopped listening on port 1100. I can't remember what state the other node was in, but there were no exceptions in the logs and the servers were still performing their scheduled MBean tasks.
      I restarted both nodes in the cluster, but neither seems to come up properly, and each fails in a different way.

      I'm using the default 'all' configuration, including the default cluster-service.xml. When I start the first server I get:

      2004-06-10 09:05:34,142 INFO [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] Initializing
      2004-06-10 09:05:34,163 DEBUG [DefaultPartition:ReplicantManager] registerRPCHandler
      2004-06-10 09:05:34,163 DEBUG [DefaultPartition:ReplicantManager] subscribeToStateTransferEvents
      2004-06-10 09:05:34,163 DEBUG [DefaultPartition:ReplicantManager] registerMembershipListener
      2004-06-10 09:05:34,302 DEBUG [org.javagroups.DefaultPartition] [Thu Jun 10 09:05:34 BST 2004] [ERROR] JChannel.connect(): exception: java.net.BindException: Cannot assign requested address
      2004-06-10 09:05:34,305 ERROR [org.jboss.ha.framework.server.ClusterPartition] Starting failed
      ChannelException: java.net.BindException: Cannot assign requested address
       at org.jgroups.JChannel.connect(JChannel.java:224)
      


      Is there anything I can check on the system to try to fix this? (Running on Linux)
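
      One system-level check, as a rough sketch only: "Cannot assign requested address" is the error the OS generally returns when a socket is bound to an address that is not configured on any local interface. The small Java program below tries to bind a UDP socket to a given address, outside JBoss. The class name BindCheck and the idea of testing the address the hostname resolves to are my own assumptions for illustration; which address the channel actually binds to depends on the protocol stack in cluster-service.xml.

```java
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketException;
import java.net.UnknownHostException;

// Hypothetical helper, not part of JBoss or JGroups.
public class BindCheck {

    /** Tries to bind a UDP socket to an ephemeral port on the given address. */
    static boolean canBind(InetAddress addr) {
        try (DatagramSocket socket = new DatagramSocket(new InetSocketAddress(addr, 0))) {
            return true;
        } catch (SocketException e) {
            // java.net.BindException is a subclass of SocketException.
            System.err.println(addr + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Assumption: with no argument, test whatever the local hostname resolves to.
        InetAddress addr = args.length > 0
                ? InetAddress.getByName(args[0])
                : InetAddress.getLocalHost();
        System.out.println("Testing " + addr);
        System.out.println(canBind(addr) ? "bind OK" : "bind FAILED");
    }
}
```

      Running it with no argument tests the address the machine's hostname resolves to; passing an IP address or hostname as the first argument tests that one instead. If the bind fails here too, the problem is probably in the host's network or name resolution setup rather than in the cluster configuration.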

      The second machine starts the DefaultPartition without complaint but fails to deploy anything in the farm directory. No exceptions are thrown, but I'll need to look at this more closely.

      I started and stopped the cluster regularly during development, so I can't work out what has changed or why it suddenly failed when nothing (as far as I know) changed in the environment.

      Any help would be appreciated. Thanks.