Core Bridge Failover issues
artp May 18, 2010 8:21 PM

I'm having issues with a core bridge during failover. I currently have two clusters of JBoss 5.1 servers running HornetQ 2.0. Cluster A contains three nodes (A1, A2, A3) and cluster B has two nodes (B1, B2). Each node in cluster A has a core bridge configured to send messages to cluster B (see below), with B1 as the connector and B2 as the backup.

To test failover, I took down B1, and messages produced on cluster A went to B2 as expected. Then I took down B2, so no nodes were running in cluster B. I waited for some time and then brought B1 back up. After B1 started, I saw an exception on each node in cluster A (see below) and a few exceptions on B1 (one is shown below). I then tried sending messages on node A1, but none were sent over the bridge. Messages were backing up on the bridge's forwarding address queue (jms.topic.WOPEvents).
It looks like my issue is similar to https://community.jboss.org/thread/149213
Also, would it help to upgrade to 2.1?
Bridge on cluster A nodes
<bridge name="ots-bridge">
   <queue-name>jms.queue.OTSForward</queue-name>
   <forwarding-address>jms.topic.WOPEvents</forwarding-address>
   <retry-interval>5000</retry-interval>
   <reconnect-attempts>-1</reconnect-attempts>
   <failover-on-server-shutdown>true</failover-on-server-shutdown>
   <use-duplicate-detection>false</use-duplicate-detection>
   <connector-ref connector-name="B1" backup-connector-name="B2"/>
</bridge>
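For context, B1 and B2 in the connector-ref above are the names of connectors defined in hornetq-configuration.xml on each cluster A node. A rough sketch of what such definitions look like with the Netty transport follows. The host names are placeholders rather than my real ones, and the parameter keys are as I understand them for 2.0, so please check them against the documentation for your release.

```xml
<connectors>
   <!-- Live target of the bridge; b1.example.com is a placeholder host -->
   <connector name="B1">
      <factory-class>org.hornetq.integration.transports.netty.NettyConnectorFactory</factory-class>
      <param key="host" value="b1.example.com"/>
      <param key="port" value="5445"/>
   </connector>
   <!-- Backup target used on failover; b2.example.com is a placeholder host -->
   <connector name="B2">
      <factory-class>org.hornetq.integration.transports.netty.NettyConnectorFactory</factory-class>
      <param key="host" value="b2.example.com"/>
      <param key="port" value="5445"/>
   </connector>
</connectors>
```

Port 5445 is the remote port that shows up in the connection failure log on the cluster A nodes.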
Exception on A1, A2, A3
2010-05-18 23:14:41,522 WARN [org.hornetq.core.remoting.impl.RemotingConnectionImpl] (Thread-19 (group:HornetQ-client-global-threads-968713772)) Connection failure has been detected: Did not receive data from server for org.hornetq.integration.transports.netty.NettyConnection@220334b4[local= /10.20.28.168:56337, remote=euca-10-20-28-165.eucalyptus.ec.company.corp/10.20.28.165:5445] [code=3]
2010-05-18 23:15:11,528 ERROR [org.hornetq.core.client.impl.ClientSessionImpl] (Thread-14 (group:HornetQ-client-global-threads-968713772)) Failed to handle failover
HornetQException[errorCode=3 message=Timed out waiting for response when sending packet 32]
at org.hornetq.core.remoting.impl.ChannelImpl.sendBlocking(ChannelImpl.java:270)
at org.hornetq.core.client.impl.ClientSessionImpl.handleFailover(ClientSessionImpl.java:863)
at org.hornetq.core.client.impl.FailoverManagerImpl.reconnectSessions(FailoverManagerImpl.java:785)
at org.hornetq.core.client.impl.FailoverManagerImpl.failoverOrReconnect(FailoverManagerImpl.java:686)
at org.hornetq.core.client.impl.FailoverManagerImpl.handleConnectionFailure(FailoverManagerImpl.java:548)
at org.hornetq.core.client.impl.FailoverManagerImpl.access$600(FailoverManagerImpl.java:69)
at org.hornetq.core.client.impl.FailoverManagerImpl$DelegatingFailureListener.connectionFailed(FailoverManagerImpl.java:1111)
at org.hornetq.core.remoting.impl.RemotingConnectionImpl.callFailureListeners(RemotingConnectionImpl.java:445)
at org.hornetq.core.remoting.impl.RemotingConnectionImpl.fail(RemotingConnectionImpl.java:250)
at org.hornetq.core.client.impl.FailoverManagerImpl$PingRunnable$1.run(FailoverManagerImpl.java:1169)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Exception on B1
7:39,274 ERROR [org.hornetq.core.client.impl.ClientSessionImpl] (Thread-6 (group:HornetQ-client-global-threads-1492690777)) Failed to handle failover
HornetQException[errorCode=3 message=Timed out waiting for response when sending packet 32]
at org.hornetq.core.remoting.impl.ChannelImpl.sendBlocking(ChannelImpl.java:270)
at org.hornetq.core.client.impl.ClientSessionImpl.handleFailover(ClientSessionImpl.java:863)
at org.hornetq.core.client.impl.FailoverManagerImpl.reconnectSessions(FailoverManagerImpl.java:785)
at org.hornetq.core.client.impl.FailoverManagerImpl.failoverOrReconnect(FailoverManagerImpl.java:686)
at org.hornetq.core.client.impl.FailoverManagerImpl.handleConnectionFailure(FailoverManagerImpl.java:548)
at org.hornetq.core.client.impl.FailoverManagerImpl.access$600(FailoverManagerImpl.java:69)
at org.hornetq.core.client.impl.FailoverManagerImpl$DelegatingFailureListener.connectionFailed(FailoverManagerImpl.java:1111)
at org.hornetq.core.remoting.impl.RemotingConnectionImpl.callFailureListeners(RemotingConnectionImpl.java:445)
at org.hornetq.core.remoting.impl.RemotingConnectionImpl.fail(RemotingConnectionImpl.java:250)
at org.hornetq.core.client.impl.FailoverManagerImpl$PingRunnable$1.run(FailoverManagerImpl.java:1169)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)