1 Reply Latest reply on Jun 10, 2014 3:46 AM by jaikiran

Advice please JBOSS timings

rockshore Jun 9, 2014 9:10 AM

Hi all, first timer here, so please be gentle!

We have two machines each running a JBoss EAP 5.1 application server. The servers are in a cluster.

Machine 1 is started with:

~/jboss-eap-5.1/jboss-as/bin/run.sh -c all -g CDMLive -b 0.0.0.0 -Djboss.messaging.ServerPeerID=1

Machine 2 is started with:

~/jboss-eap-5.1/jboss-as/bin/run.sh -c all -g CDMLive -b 0.0.0.0 -Djboss.messaging.ServerPeerID=2

We deploy a web application to the servers as an exploded WAR in the JBoss server/all/deploy directory on both machines. The web application's jboss-web.xml specifies an HA Singleton deployment as follows:

<depends>jboss.ha:service=HASingletonDeployer,type=Barrier</depends>
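For context, here is a minimal sketch of where that element sits in jboss-web.xml. Only the depends line is taken from our actual file; the enclosing root element and the comment are assumed for illustration, and anything else the descriptor contains is left out.

```xml
<jboss-web>
    <!-- Other elements of the descriptor are omitted in this sketch. -->
    <!-- The WAR is only started on the node that currently holds the HA singleton barrier. -->
    <depends>jboss.ha:service=HASingletonDeployer,type=Barrier</depends>
</jboss-web>
```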

It is important that the web application only runs on one node at a time as running two instances of it at the same time causes issues.

Our problem is that the machines are hosted as virtual machines and that a backup operation causes the virtual machines to become heavily loaded, lock up completely, or lose network connectivity for a period of time. We cannot avoid this at the moment. Typically, if the lockup occurs on the first JBoss which is running the web application, the second JBoss will notice and output a message to the log. For example:

2014-04-15 13:22:49,533 WARN [org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager] (Thread-46) Connection failure detected. Clean up and retry connection. maxRetry: -1 retryInterval: 5000

2014-04-15 13:22:51,544 ERROR [org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager] (Thread-46) Retrying ConnectionInfo org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager$ConnectionInfo@481e2e4e failed after maxmum retry: 0

We would therefore like to know which settings to change to increase the time the JBoss servers wait before concluding that a node has dropped out of the cluster, and/or to make them more lenient when communication with the other server fails. This should allow the "lockup" to occur without accidentally starting another instance of the web application. We accept that this will increase the time taken for the cluster to recognise a genuine fault.

Many thanks in advance for any help.

1. Re: Advice please JBOSS timings

jaikiran Jun 10, 2014 3:46 AM (in response to rockshore)

Tony, welcome to the forums!

2014-04-15 13:22:49,533 WARN [org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager] (Thread-46) Connection failure detected. Clean up and retry connection. maxRetry: -1 retryInterval: 5000

2014-04-15 13:22:51,544 ERROR [org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager] (Thread-46) Retrying ConnectionInfo org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager$ConnectionInfo@481e2e4e failed after maxmum retry: 0

Those logs look very specific to JBoss Messaging. I don't know which version is shipped in EAP 5.1, but you could use the community user guide (JBoss Messaging 1.4 User's Guide) to understand the available configuration options. I have no experience with JBoss Messaging, but looking at the docs, my guess is that you have to configure the FailureRetryInterval in the JBoss Messaging configuration file (see Chapter 9, JBoss Messaging Message Bridge Configuration).
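Purely as an illustration of that guess, the retry settings in a bridge MBean descriptor might look something like the fragment below. It assumes the attribute names from the Message Bridge chapter (FailureRetryInterval in milliseconds, and MaxRetries, where -1 means retry forever). The 30000 value is an arbitrary example rather than a recommendation, and I have not verified that these attributes control the ClusterConnectionManager behaviour shown in your logs.

```xml
<!-- Fragment only: these attributes would sit inside the relevant <mbean> definition. -->
<!-- Assumed: wait 30 seconds between reconnection attempts (your log shows 5000). -->
<attribute name="FailureRetryInterval">30000</attribute>
<!-- Assumed: -1 keeps retrying indefinitely, matching the maxRetry: -1 in your log. -->
<attribute name="MaxRetries">-1</attribute>
```

Check the values against the documentation for the exact version shipped with EAP 5.1 before changing anything.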

P.S. Although this forum is named JBoss EAP, it is really a forum for JBoss EAP 6 and higher. As far as I know, for earlier versions like the one you are using, the preferred place to ask these questions is the EAP support portal, using your EAP account. I suggest that you create a support ticket there for this question so that someone with more knowledge in this area can help you out.