I have created https://bugzilla.redhat.com/show_bug.cgi?id=1026088 for the deployment issue. This will be fixed in 4.10. For the other two issues, would you mind creating separate forum threads, since they are distinct from the deployment issue? I have probably run into BZ 1017961. That sounds familiar. Not sure about the hard link issue. Can you provide any more details on how you hit that?
John Sanda wrote:
Not sure about the hard link issue. Can you provide any more details on how you hit that?
I'm not sure how I hit it. I originally had a four node cluster and one node stopped talking and I had it decommissioned. Since I wasn't terribly methodical about things, maybe something got corrupted. It seems to be working though.
I'll start a few more threads if you like...
Thanks for creating the separate threads. I have created another bug about adding support for deploying multiple nodes simultaneously - https://bugzilla.redhat.com/show_bug.cgi?id=1026128. There is some work that needs to be done in order to support this properly. There is nothing in RHQ 4.9 to prevent you from attempting to deploy multiple nodes simultaneously (other than the bug you hit), but I strongly discourage doing it, as you are likely to run into problems. For example, suppose you have a 3 node cluster with nodes N1, N2, and N3, and then you decide to deploy N4 and N5 at the same time. N4 and N5 will likely not be able to talk to one another, and that ought to lead to a whole host of interesting problems.
For now, the fastest solution is to install multiple nodes before installing the server. It requires more manual steps, but it completely bypasses the deployment process that happens when new storage nodes get imported into inventory.
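For reference, the manual sequence might look roughly like the sketch below. This is a hedged illustration, not an official procedure: the `rhq-storage-installer.sh` option names (`--hostname`, `--seeds`), the install path, and the addresses are assumptions; check `rhq-storage-installer.sh --help` on your installation. The script only echoes the commands (a dry run), so you can review them before running anything. The key point, following the N4/N5 example above, is that every node is installed with the full node list as its seeds, so each node can find all of the others.

```shell
#!/bin/sh
# Dry-run sketch: install every storage node before installing the RHQ server.
# ASSUMPTIONS: the --hostname/--seeds options, RHQ_HOME path, and node
# addresses are placeholders for illustration only.

RHQ_HOME=/opt/rhq-server              # placeholder install location
NODES="10.0.0.1 10.0.0.2 10.0.0.3"    # every node in the planned cluster

# Build a comma-separated seeds list containing all planned nodes,
# so no node ends up unable to see the others.
SEEDS=$(echo $NODES | tr ' ' ',')

for node in $NODES; do
    # Echo instead of executing, so the commands can be reviewed first.
    echo "$RHQ_HOME/bin/rhq-storage-installer.sh --hostname $node --seeds $SEEDS"
done
```

Only after all storage nodes are installed and clustered would you run the server installer, which then discovers the existing nodes instead of deploying them one at a time.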
What I initially tried to do was create one node, and then I added three more. :-(
I've had lots of problems, like bug 1025783, where the installer ends up creating multiple agents in inventory, causing trouble.
What would be helpful is a log showing the steps: what it's doing and what failed. It would also be nice if the state within RHQ reflected the actual state of Cassandra, rather than trying to guess it from some sort of installation operations. For example, I have a node that is working, but it shows up as DOWN, and then it gets kicked out of the cluster.