Not sure I understood what the problem is from your explanation.
Can you explain in more detail?
When you say "failover does not work as I would expect it" - how would you expect it to work?
If you can give me step-by-step instructions to reproduce it and state what your expected behaviour is, that would be a great help.
Thanks for the reply.
Initially I did the following (all locally on my machine, Windows XP, SQL Server 2005).
1) Start the Cluster.
2) Wait for it to completely come up.
3) Start my 'test' tool
- The test simply sends messages as quickly as possible to a distributed topic, while additionally (in another thread) receiving them again.
- At this point in time both JBoss instances start consuming CPU.
4) Shut down one JBoss instance (with CTRL-C in its command window).
- There are now some exceptions thrown in the background, but the test continues to run.
- There is no 're-initialisation' of any JMS resources whatsoever from my side (the cluster should be transparent to me).
5) Wait for 'my' test to finish.
- At the end the test reports the number of sent and received messages. Some were lost, but ok, not too many.
Now I did the test again, with the following differences:
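
For readers who want to reproduce this, the shape of such a test tool can be sketched roughly as below. This is only an illustration and not the actual tool: the Channel interface and the InMemoryChannel class are hypothetical stand-ins for the distributed topic, so the sketch runs without a cluster. A real test would implement Channel on top of a JMS producer and consumer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class SendReceiveTest {
    // Hypothetical abstraction over the topic; not part of JMS or JBoss.
    interface Channel {
        void send(String message) throws Exception;
        String receive(long timeoutMillis) throws Exception; // null on timeout
    }

    // In-memory stand-in so the sketch runs without a cluster.
    static final class InMemoryChannel implements Channel {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        public void send(String message) {
            queue.add(message);
        }

        public String receive(long timeoutMillis) throws InterruptedException {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Sends count messages while a second thread receives; returns {sent, received}. */
    static long[] run(Channel channel, int count) {
        AtomicLong sent = new AtomicLong();
        AtomicLong received = new AtomicLong();

        Thread receiver = new Thread(() -> {
            try {
                // Stop once nothing has arrived for 500 ms.
                while (channel.receive(500) != null) {
                    received.incrementAndGet();
                }
            } catch (Exception e) {
                System.err.println("receive failed: " + e);
            }
        });
        receiver.start();

        for (int i = 0; i < count; i++) {
            try {
                channel.send("msg-" + i);
                sent.incrementAndGet();
            } catch (Exception e) {
                // Keep going, as the test is meant to survive a node going away.
                System.err.println("send failed: " + e);
            }
        }

        try {
            receiver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new long[] { sent.get(), received.get() };
    }

    public static void main(String[] args) {
        long[] r = run(new InMemoryChannel(), 1000);
        System.out.println("sent=" + r[0] + " received=" + r[1] + " lost=" + (r[0] - r[1]));
    }
}
```

The difference between the two counters is the 'lost' figure mentioned above. Counting only successful sends keeps that figure from being inflated by sends that threw an exception.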
1) Extend the duration of the test.
2) Start everything again.
3) Shut down JBoss instance 'A' (as above)
4) Re-start JBoss instance 'A'
5) Wait for it to come up again.
6) Shut down JBoss instance 'B'
- THE PROBLEM: At this point exceptions start to pour in and the test fails. 'Failover', i.e. the sender connection that was using 'B' switching over to 'A', did not happen for me.
Failover kicks in when a server *dies*.
CTRL-C does not kill a server, it shuts it down cleanly - this won't cause failover.
Kill it using kill, or using Task Manager on Windows.
There have been long discussions on this in other threads.
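
If the kill needs to be scripted as part of a test, something along these lines might do. It is a minimal sketch that assumes a Java 9 or later JVM is available to run it (ProcessHandle does not exist in older JDKs); the HardKill class and killHard method are made-up names, and the pid has to be looked up separately, for example in Task Manager.

```java
import java.util.Optional;

public class HardKill {
    /**
     * Forcibly terminates the process with the given pid.
     * Returns true if termination was requested, false if no such process was found.
     */
    static boolean killHard(long pid) {
        Optional<ProcessHandle> handle = ProcessHandle.of(pid);
        // destroyForcibly() terminates immediately, so the target gets no chance
        // to shut down cleanly, unlike CTRL-C.
        return handle.map(ProcessHandle::destroyForcibly).orElse(false);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: java HardKill <pid>");
            return;
        }
        long pid = Long.parseLong(args[0]); // pid of the server JVM to kill
        System.out.println(killHard(pid) ? "kill requested" : "no such process");
    }
}
```

From the client's point of view this looks like a crashed server rather than a clean shutdown.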