4 Replies Latest reply on Jan 23, 2015 10:31 AM by rafachies

    Slave host controller does not start if domain controller (master) is not available – how to set the number of connection attempts before giving up

    hajue

      JBoss EAP in domain mode... let's say one domain controller and N slave host controllers... all on Windows machines and registered as services that start on server boot...

       

      The good case: the domain controller is always up and running; when a slave host controller starts it can connect to the master and everything is fine.

       

      If the domain controller goes down, no problem: the slaves continue their work and reconnect as soon as the domain controller comes back.

       

      The bad case: what happens if a slave host controller tries to start while the domain controller is down, unreachable, or its host needs more time to boot?

       

      Well, by default the host controller tries 5 times (at roughly 6-second intervals, about 30 seconds in total) to connect to the domain controller, then it gives up:

       

      [Host Controller] 06:21:37,308 WARN  [org.jboss.as.host.controller] (ControllerBoot Thread) JBAS010900: Could not connect to remote domain controller ***.***.*.***:9999: java.net.ConnectException: JBAS012144: Could not connect to remote://***.***.*.***:9999. The connection timed out

      [Host Controller] 06:21:43,376 WARN  [org.jboss.as.host.controller] (ControllerBoot Thread) JBAS010900: Could not connect to remote domain controller ***.***.*.***:9999: java.net.ConnectException: JBAS012144: Could not connect to remote://***.***.*.***:9999. The connection timed out

      [Host Controller] 06:21:49,492 WARN  [org.jboss.as.host.controller] (ControllerBoot Thread) JBAS010900: Could not connect to remote domain controller ***.***.*.***:9999: java.net.ConnectException: JBAS012144: Could not connect to remote://***.***.*.***:9999. The connection timed out

      [Host Controller] 06:21:55,545 WARN  [org.jboss.as.host.controller] (ControllerBoot Thread) JBAS010900: Could not connect to remote domain controller ***.***.*.***:9999: java.net.ConnectException: JBAS012144: Could not connect to remote://***.***.*.***:9999. The connection timed out

      [Host Controller] 06:22:01,613 WARN  [org.jboss.as.host.controller] (ControllerBoot Thread) JBAS010900: Could not connect to remote domain controller ***.***.*.***:9999: java.net.ConnectException: JBAS012144: Could not connect to remote://***.***.*.***:9999. The connection timed out

      [Host Controller] 06:22:07,666 ERROR [org.jboss.as.host.controller] (ControllerBoot Thread) JBAS010901: Could not connect to master. Aborting. Error was: java.lang.IllegalStateException: JBAS010951: Could not connect to master in 5 attempts within 30000 ms

      [Host Controller] 06:22:08,664 INFO  [org.jboss.as] (MSC service thread 1-3) JBAS015950: JBoss EAP 6.1.0.GA (AS 7.2.0.Final-redhat-8) stopped in 520ms
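
      To make the question concrete, the behavior in the log above looks like a fixed-delay retry loop. The following is only a sketch of that pattern, not the actual JBoss code: the class ConnectRetrySketch, the method connectWithRetry and its parameters are hypothetical names, and the assumption that the "30000 ms" in the error message is simply attempts times delay (5 x 6000) is inferred from the timestamps.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration of a fixed-delay connect loop; not JBoss source code.
final class ConnectRetrySketch {

    /**
     * Calls attempt up to maxAttempts times, waiting delayMs after each failure.
     * Returns the number of the attempt that succeeded (1-based).
     * Throws IllegalStateException if every attempt fails.
     */
    static int connectWithRetry(BooleanSupplier attempt, int maxAttempts, long delayMs) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
            try {
                // Wait after every failure, including the last one, which mirrors
                // the ~6 s gap between the final WARN and the ERROR in the log.
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting to retry", e);
            }
        }
        // Assumption: the time budget is attempts * delay, as in "5 attempts within 30000 ms".
        throw new IllegalStateException("Could not connect to master in " + maxAttempts
                + " attempts within " + (maxAttempts * delayMs) + " ms");
    }

    public static void main(String[] args) {
        // Simulated master that becomes reachable on the third attempt.
        int[] calls = {0};
        int used = connectWithRetry(() -> ++calls[0] >= 3, 5, 100L);
        System.out.println("Connected on attempt " + used);
    }
}
```

      In these terms, (a) asks where maxAttempts can be configured, (b) where delayMs can be configured, and (c) whether the final exception can be avoided so the slave boots anyway.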

       

      This is not desirable behavior in production environments; the slaves should start even if the master is [temporarily] not present.

       

      So, how and where can I change (a) the number of attempts to connect to the master and (b) the delay between two attempts? Is it possible to (c) convince the slave to start without a master?