First of all, remember: a domain controller is not essential for a cluster to serve requests. It is only required if you want to perform management operations.
My suggestion is to place the DC on a virtualized host that can be recreated anywhere.
Additionally the domain.sh / domain.bat script has two switches that can help you out:
    --cached-dc    If this host is not the Domain Controller and cannot
                   contact the Domain Controller at boot, boot using a
                   locally cached copy of the domain configuration
                   (see --backup)
    --backup       Keep a copy of the persistent domain configuration
                   even if this host is not the Domain Controller
Okay, it's not essential for serving requests. But I think it is also required while the cluster is initializing; otherwise, how would the HC read the configuration from the DC? That is why the HC stops after a few attempts at startup. Once it has initialized properly it continues to run, although, as you said, it will not be able to perform management operations.
Regarding the second point, I tried that: I went to the bin directory, opened a command prompt, and ran domain.bat --cached-dc --backup, but it threw these errors:
Boot Thread) WFLYHC0149: Option --cached-dc was set; obtaining domain-wide configuration from domain.cached-remote.xml
[Host Controller] 11:01:21,179 WARN [org.jboss.as.host.controller] (ControllerBoot Thread) WFLYHC0031: Cannot load the domain model using using --backup
[Host Controller] 11:01:21,245 ERROR [org.jboss.as.host.controller] (ControllerBoot Thread) WFLYHC0008: Failed to start server (ServerTwo): java.lang.IllegalArgumentException
So I am certainly missing something. Where will this domain.cached-remote.xml be located? Do I have to copy domain.xml from the DC to the HC and rename it to this file?
Apart from this, I also found that I can put a <discovery-options> tag inside the <remote> tag in the host.xml file. I tried that, and it worked. I had two DCs and one HC, and all three had a server node (so 3 nodes in total). I started DC1 and DC2 and then HC1, which connected to DC1. Then I shut down DC1; the HC retried and connected to DC2. What if I make configuration changes on DC1 through the Web Administration Console, such as adding a new cache? It works fine on DC1, but how will the change get replicated to DC2? Or won't it?
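For reference, the discovery setup described above might look roughly like the sketch below in the HC's host configuration file. This assumes a WildFly release whose host schema accepts static-discovery entries with a protocol attribute. The entry names, addresses, and ports are placeholders rather than values from this thread, and any other attributes on <remote> (for example a security realm) are left out.

```xml
<!-- Sketch only: names, hosts and ports are placeholders, not values from this thread -->
<domain-controller>
    <remote>
        <discovery-options>
            <!-- Entries are tried in the order listed: DC1 first, then DC2 -->
            <static-discovery name="dc1" protocol="remote" host="192.0.2.10" port="9999"/>
            <static-discovery name="dc2" protocol="remote" host="192.0.2.11" port="9999"/>
        </discovery-options>
    </remote>
</domain-controller>
```

This only controls which DC the HC connects to. It does not by itself copy configuration changes from one DC to the other.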
The --cached-dc and --backup options are mutually exclusive. First use --backup to create a local copy. Then use --cached-dc to tell an HC to use that backup if it cannot contact the DC.
As for the second part, it would be up to you to ensure that all DCs get a copy of the configuration.
I did exactly what you said.
I first ran it with the --backup option, and it copied domain.xml into a new domain.cached-remote.xml file.
Then I stopped both the DC and the HC and started the HC alone. It tried to connect to the DC about 10 times and then failed to start, exactly the same behavior as before. So what difference did the backup make?