0 Replies Latest reply on May 19, 2017 3:13 PM by wwang2016

Failed re-start of a wildfly instance (in a cluster) after it was shut down (related to HA for messaging configuration)

wwang2016 May 19, 2017 3:13 PM

Hi,

I am investigating how to set up an active-active cluster of WildFly 10 instances, with each instance hosting a colocated live-backup pair of messaging servers. The HA approach is data replication.

I followed this document to set up the cluster:

https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_application_platform/7.0/html/configuring_messaging/messaging-ha#colocated_backup_servers

Chapter 29. High Availability - Red Hat Customer Portal

The configuration did not work completely: some unprocessed messages on the WildFly instance that was shut down were not accessible to the other WildFly instance for processing. I made a simple modification, adding a discovery-group to the cluster-connection definition in the backup server, and after that I was able to observe messages failing over to the other WildFly instance, where they got processed. So far, so good.
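
Roughly, the modified backup server definition looks like the sketch below. This is not my exact configuration: the names (my-cluster, group1, dg-group1, activemq-cluster, http-connector-backup) are placeholders in the style of the default standalone-full-ha.xml, and the journal directories, connectors, and acceptors are omitted.

```xml
<!-- Sketch only: backup server inside the messaging-activemq subsystem of one instance. -->
<!-- All names are placeholders; connectors, acceptors and journal paths are omitted. -->
<server name="backup">
    <!-- Replication backup; group-name must match the live server it backs up. -->
    <!-- allow-failback defaults to true, so it is shown here only for clarity. -->
    <replication-slave cluster-name="my-cluster" group-name="group1" allow-failback="true"/>

    <!-- Discovery group used by the backup to find the other cluster members. -->
    <discovery-group name="dg-group1" jgroups-channel="activemq-cluster"/>

    <!-- The modification: the cluster-connection now references the discovery-group. -->
    <cluster-connection name="my-cluster" address="jms"
                        connector-name="http-connector-backup"
                        discovery-group="dg-group1"/>
</server>
```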

Test:

wildfly #1 (live1, backup2)

wildfly #2 (live2, backup1)

However, this change introduced an issue: the WildFly instance that was shut down (wildfly #1) could not restart, and it complains about a missing dependency on a queue. If I shut down the other instance (wildfly #2) first, I can restart wildfly #1 without any problem, but this workaround is not acceptable in production.

Is it possible that the backup server (backup1) on the other instance (wildfly #2) did not shut down once wildfly #1 was restarted? The default value of allow-failback on the backup server is true, so there should be no need to set it explicitly.

Is there any other configuration that would allow wildfly #1 to restart without problems?

Thanks,

Wayne