Failover Scenario on Sun Solaris Veritas Cluster with failov

sheckler Apr 17, 2008 7:23 AM

Hi,
let me try to describe the architecture at the customer's site and the observed behavior, which the customer has classified as an error.

A JBoss cluster (JBoss version 3.2.8SP1) is running within a Veritas cluster on Solaris.
There are two zones, each containing one application server and one Oracle server. Zone A runs the HAJMS master and the Oracle server used by HAJMS. Zone B runs the second JBoss cluster node, which is not the HAJMS master, and the standby database.

Now zone A is killed, and the following happens on the second JBoss cluster node:

- HAJMS failover of running JMS clients such as MDBs -> ok
- The connection pool's database ping fails -> ok
- A new cluster view with only one member is detected -> ok
- The node does not become HAJMS master -> not ok

After 1 minute the standby DB is available and the connection pool errors stop.

- The node never becomes HAJMS master and has to be restarted, as JMS is no longer available.
This means downtime and manual intervention.

Obviously the HAJMS master change does not work when the database is unavailable at the same time.
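
To illustrate the behavior I would expect instead, here is a minimal sketch. It assumes (I have not verified this in the JBoss source) that the service start is attempted only once at takeover and fails because the database is down. This is not JBoss code: RetryStart, startWithRetry and the simulated database check are hypothetical names used only for illustration. The idea is simply that the start is retried until the standby DB is reachable.

```java
import java.util.function.BooleanSupplier;

public class RetryStart {

    /**
     * Calls startAttempt until it returns true or maxAttempts is reached.
     * Returns the number of attempts used, or -1 if the start never
     * succeeded (or the waiting thread was interrupted).
     * Hypothetical helper, not part of any JBoss API.
     */
    static int startWithRetry(BooleanSupplier startAttempt, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (startAttempt.getAsBoolean()) {
                return attempt; // service started, e.g. the DB was reachable
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis); // wait before the next try
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return -1;
                }
            }
        }
        return -1; // gave up after maxAttempts
    }

    public static void main(String[] args) {
        // Simulated start: fails while the "database" is down (first 3 calls),
        // then succeeds once the standby database has taken over.
        int[] calls = {0};
        BooleanSupplier start = () -> ++calls[0] > 3;

        int used = startWithRetry(start, 10, 10);
        System.out.println("started after attempts: " + used);
    }
}
```

With a delay of a few seconds and enough attempts to cover the roughly one minute the standby DB needs, a start like this would succeed without a manual restart.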

My hope is that some JMS configuration parameter can avoid this situation. Does anyone have an idea?

1. Re: Failover Scenario on Sun Solaris Veritas Cluster with fa

adrian.brock May 16, 2008 12:24 PM (in response to sheckler)

Ask in the clustering forum.
HASingleton master election is a clustering feature.

I do remember one bug after 3.2.8 off the top of my head:
http://jira.jboss.com/jira/browse/JBAS-4229