Let me try to describe the architecture at the customer's site and the observed behavior, which the customer has classified as an error.
A JBoss cluster (JBoss version 3.2.8SP1) is running within a Veritas cluster on Solaris.
There are two zones, each containing one application server and one Oracle server. Zone A hosts the HAJMS master and the Oracle server used by HAJMS. Zone B hosts the second JBoss cluster node, which is not the HAJMS master, and the standby database.
Now zone A is killed, and the following happens on the second JBoss cluster node:
- HAJMS failover of running JMS clients such as MDBs -> ok
- The connection pool's database ping fails -> ok
- A new cluster view with only one member is detected -> ok
- The node does not become HAJMS master -> not ok
After 1 minute the standby DB is available and the connection pool errors stop.
- The node never becomes HAJMS master and has to be restarted, as JMS is no longer available.
This means downtime and manual intervention.
Obviously, the JMS master change does not work when the database is unavailable at the same time.
My hope is that some JMS configuration parameter can avoid this situation. Does anyone have an idea?