I - Why don't you use Artemis 1.1? It's much better already.
II - Some high-profile users I know were not using failover at all: they simply trusted the storage, relied on monitoring to restart the instance as soon as something died, and had the clients reconnect.
DR functionality isn't specifically supported by HornetQ. However, if Region A and Region B have a reliable network connection between them, then I think you could do what you want with a 4-node cluster (2 live nodes in Region A, 2 backup nodes in Region B, just as you've described). There are a couple of key issues that I see at this point:
- Ensure the consumers in each region do not use a connection factory with HA = true so that when fail-over occurs the consumers don't connect to the servers in the other region.
- If only one of the live nodes in Region A fails then you would end up with a node in Region A up and a node in Region B up. I'm not sure how to mitigate that one.
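To make the second issue concrete, here is a minimal sketch in plain Java that models the 4-node layout as two live/backup pairs and shows which region serves each pair after a single live node fails. It does not use the HornetQ API; the `FailoverModel` and `Pair` names and the `liveUp` flag are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class FailoverModel {
    // Hypothetical model of one live/backup pair; not a HornetQ class.
    static final class Pair {
        final String liveRegion;
        final String backupRegion;
        boolean liveUp = true;

        Pair(String liveRegion, String backupRegion) {
            this.liveRegion = liveRegion;
            this.backupRegion = backupRegion;
        }

        // Region whose node is currently active for this pair.
        String activeRegion() {
            return liveUp ? liveRegion : backupRegion;
        }
    }

    // One entry per pair: the region currently serving it.
    static List<String> activeRegions(List<Pair> pairs) {
        List<String> regions = new ArrayList<>();
        for (Pair p : pairs) {
            regions.add(p.activeRegion());
        }
        return regions;
    }

    public static void main(String[] args) {
        List<Pair> cluster = new ArrayList<>();
        cluster.add(new Pair("A", "B")); // live in Region A, backup in Region B
        cluster.add(new Pair("A", "B"));

        cluster.get(0).liveUp = false;   // only one live node in Region A fails
        System.out.println(activeRegions(cluster)); // prints [B, A]
    }
}
```

With one pair active in Region B and the other still active in Region A, consumers pinned to a single region (HA disabled) would only see the queues hosted by the node in their own region.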
At the end of the day I think you should look at Apache ActiveMQ Artemis. The HornetQ code-base was donated to Apache ActiveMQ a little over a year ago now and is continuing its life as ActiveMQ Artemis. No further work will be done on HornetQ.
We will be looking at changing our messaging architecture in the future; however, this needs to be phased in due to timelines.
Thanks for the suggestion, will have a review.
Thanks for the feedback and suggestions. After some further testing, we have found that the application does not like connecting to a backup server.
As for the message queue solution overall, we will be assessing other alternatives; however, this needs to be worked into the current project and roadmap.
...after some further testing, we have found that the application does not like connecting to a backup server.
What do you mean, exactly?