OK... for those not familiar with the setup.
I have two JBoss 4.2.3.GA nodes running clustered. They are dual-homed: eth0 is the external face and eth1 is the common subnet on each of them. eth0 is 192.168.1.6 on node 1 and 192.168.2.7 on node 2, with a 255.255.255.0 netmask.
eth1 on node 1 is 192.168.3.6 and on node 2 it is 192.168.3.7 with the same netmask.
In other words, eth1-eth1 is the subnet on which clustering can occur.
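For anyone who wants to check the addressing, here is a minimal sketch of the netmask arithmetic using the addresses quoted above. SubnetCheck and its methods are made-up names for illustration and are not part of JBoss or HornetQ.

```java
// Illustrative helper only; not part of JBoss or HornetQ.
class SubnetCheck {

    // Packs a dotted-quad IPv4 string into a single int, one octet per byte.
    public static int toInt(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + dotted);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // Two addresses are on the same subnet when their masked network parts match.
    public static boolean sameSubnet(String a, String b, String mask) {
        int m = toInt(mask);
        return (toInt(a) & m) == (toInt(b) & m);
    }

    public static void main(String[] args) {
        String mask = "255.255.255.0";
        // eth0 addresses as quoted in the post
        System.out.println("eth0 same subnet: " + sameSubnet("192.168.1.6", "192.168.2.7", mask));
        // eth1 addresses as quoted in the post
        System.out.println("eth1 same subnet: " + sameSubnet("192.168.3.6", "192.168.3.7", mask));
    }
}
```

With the numbers as given, only the eth1 pair shares a network part under that mask, which is why the cluster traffic has to be pinned to eth1.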
Through judicious tweaking of the command line, the two JBoss instances recognize each other and APPEAR to be a cluster.
They are built on all-with-hornetq profiles.
The HornetQ config is straightforward. There is a local_bind_address in the broadcast groups, and JBoss is invoked with hornetq.remoting.netty.host set to the appropriate value.
All good most of the time... but we are testing and so we tried shutting down an application that is attached to a durable topic... on just one node.
Shutdown appears to work OK. No exceptions after listing the deployed apps, getting the URL, and then calling undeploy. Node 1 now has messages piling up. (not really, we do this quietly, but it doesn't help).
This is the only app attached to the topic. The only other copy of the app is on the other side of the cluster. I wanted to see that the messages went across (I have it set to forward when nothing is attached).
So nothing WAS transferred. This is not good.
The Topic still has a subscriber.
Let's restart the application.
Can't connect to the topic as something is already subscribed.
Looks a lot like the attached snippet.
Try to unsubscribe all. This gathers rather more vehement exceptions related to having the reader still attached. Recall that the app has already been undeployed at this point.
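To make the symptom pattern concrete, here is a toy model of the JMS durable subscription rules: a durable subscription can have only one active consumer at a time, and it cannot be unsubscribed while a consumer is still attached. This is a sketch under the assumption that the server still considers the old consumer attached after the undeploy. It is not HornetQ code, and every name in it is made up.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of the JMS durable subscription rules.
// Not HornetQ code; all class and method names are hypothetical.
class DurableSubscriptionModel {

    // Key: clientId + "/" + subscription name. Value: true while a consumer is attached.
    private final Map<String, Boolean> active = new HashMap<String, Boolean>();

    private static String key(String clientId, String name) {
        return clientId + "/" + name;
    }

    // Attaches a consumer; rejected if one is already attached.
    public void subscribe(String clientId, String name) {
        String k = key(clientId, name);
        if (Boolean.TRUE.equals(active.get(k))) {
            throw new IllegalStateException("already has an active consumer: " + k);
        }
        active.put(k, Boolean.TRUE);
    }

    // Detaches the consumer; the durable subscription itself stays registered.
    public void closeConsumer(String clientId, String name) {
        String k = key(clientId, name);
        if (active.containsKey(k)) {
            active.put(k, Boolean.FALSE);
        }
    }

    // Removes the subscription; rejected while a consumer is still attached.
    public void unsubscribe(String clientId, String name) {
        String k = key(clientId, name);
        if (Boolean.TRUE.equals(active.get(k))) {
            throw new IllegalStateException("consumer still attached: " + k);
        }
        active.remove(k);
    }

    public boolean exists(String clientId, String name) {
        return active.containsKey(key(clientId, name));
    }

    public static void main(String[] args) {
        DurableSubscriptionModel server = new DurableSubscriptionModel();
        server.subscribe("node1-app", "mySub");   // app deployed and listening
        // Assumed failure mode: undeploy happens, but the consumer is never closed.
        try {
            server.subscribe("node1-app", "mySub");   // redeploy attempt
        } catch (IllegalStateException e) {
            System.out.println("resubscribe rejected: " + e.getMessage());
        }
        try {
            server.unsubscribe("node1-app", "mySub"); // "unsubscribe all" attempt
        } catch (IllegalStateException e) {
            System.out.println("unsubscribe rejected: " + e.getMessage());
        }
    }
}
```

In this model both failures go away once closeConsumer is called, so the behaviour described here is at least consistent with the old consumer never being released on the server side.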
So so so.
Try the same drill with node 2 down. That is, ONLY node 1 is available to shut down and turn on.
No exceptions at all.
But this is not a cluster as I know it.
Restarting JBoss puts it all right again. Nothing else appears to work. Once it is broken in this fashion, turning off node 2 does not allow the stuck subscription to be reset; JBoss has to be restarted.
Which is all probably something simple to someone. Right now I am Mystifried... as it is 0130, and nothing seems to make it happy.