5 Replies Latest reply on Jul 18, 2013 9:39 PM by clebert.suconic

    Fixing JMSBridge with HA connection issue

    gaohoward

      If you configure and deploy a HornetQ JMS bridge whose source and/or target connection factory is HA, it fails to work when a failover happens on either connection. The problem is that when failover happens, the bridge receives the failure notification through its registered listeners and, not knowing that a failover is in progress, starts the whole retry process, beginning with the JNDI lookups and continuing until the connections, sessions, etc. are fully re-created. The issue is better described here:

      https://bugzilla.redhat.com/show_bug.cgi?id=963215

      The first thing we should do to solve this is to let the bridge know that it doesn't need to retry the connections when a failover happens. Then we need to make sure the bridge won't lose or duplicate messages during failover. So, to make the JMS bridge work with HornetQ HA connection factories, a failover event listener is added to the bridge's source and target connections. When failover happens, the bridge waits for this listener to report the failover event. If the failover completes successfully, the bridge doesn't retry the connections.
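
      To make the decision above concrete, here is a minimal sketch of the "wait for the failover outcome before retrying" idea. It is not the code from the pull request: FailoverEvent and FailoverAwareFailureHandler are hypothetical stand-ins, and the real bridge would receive these notifications from the HornetQ connection rather than from a plain method call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical event type for this sketch; not taken from the HornetQ API.
enum FailoverEvent { FAILURE_DETECTED, FAILOVER_COMPLETED, FAILOVER_FAILED }

// Hypothetical helper: decides whether the bridge must run its full retry
// (JNDI lookup, new connection, new session) after a connection failure.
class FailoverAwareFailureHandler {
    private final CountDownLatch outcome = new CountDownLatch(1);
    private volatile boolean failoverSucceeded;

    // Called by the failover listener registered on the source or target connection.
    void onFailoverEvent(FailoverEvent event) {
        if (event == FailoverEvent.FAILOVER_COMPLETED) {
            failoverSucceeded = true;
            outcome.countDown();
        } else if (event == FailoverEvent.FAILOVER_FAILED) {
            failoverSucceeded = false;
            outcome.countDown();
        }
        // FAILURE_DETECTED: failover has only started, so keep waiting.
    }

    // Called from the bridge's failure handling. Returns true when the bridge
    // should fall back to its full retry, false when failover already
    // restored the connection.
    boolean needsFullRetry(long timeoutMillis) {
        try {
            boolean signalled = outcome.await(timeoutMillis, TimeUnit.MILLISECONDS);
            return !(signalled && failoverSucceeded);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return true; // be conservative if interrupted while waiting
        }
    }
}
```

      The timeout is an assumption of this sketch: if no outcome arrives in time, it treats the connection as lost and lets the normal retry run.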

      Some changes have been made to guarantee correct handling during failover. A new CreateSessionMessageV2 packet type has been added so that XA transactions can be rolled back correctly during failover. It passes the current Xid to the new failover connection so that the transaction can be rolled back on that connection. Otherwise, when the rollback comes in, the session can't find the Xid and the rollback fails.
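
      A toy model may help show why the Xid has to travel with the session re-creation. This is only an illustration under simplifying assumptions: ToyServerSession and its methods are hypothetical, Xids are reduced to strings, and nothing here reflects the actual packet layout or server code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, heavily simplified stand-in for a server-side session.
class ToyServerSession {
    // Transactions this session knows about, keyed by a string form of the Xid.
    private final Map<String, String> transactions = new HashMap<>();

    // Re-creation without an Xid: the new session knows nothing about
    // the transaction that was in progress before failover.
    static ToyServerSession recreate() {
        return new ToyServerSession();
    }

    // Re-creation that carries the current Xid, as the V2 packet is described
    // as doing: the new session registers the in-progress transaction.
    static ToyServerSession recreateWithXid(String currentXid) {
        ToyServerSession session = new ToyServerSession();
        if (currentXid != null) {
            session.transactions.put(currentXid, "ACTIVE");
        }
        return session;
    }

    // Returns true if the transaction was found and rolled back,
    // false if this session has never heard of the Xid.
    boolean rollback(String xid) {
        return transactions.remove(xid) != null;
    }
}
```

      In this model, ToyServerSession.recreate().rollback("xid-1") returns false, while ToyServerSession.recreateWithXid("xid-1").rollback("xid-1") returns true.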

      To avoid message duplication in AT_MOST_ONCE mode, a hash set is used to store the message IDs of the current batch. The bridge uses it to detect messages that may have been re-delivered after failover and discards them.
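
      The duplicate check can be sketched roughly as follows. BatchDuplicateFilter is a hypothetical helper, not the class used in the bridge, and it assumes every message carries a unique string ID and that the set of IDs is cleared once the batch has been sent.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for this sketch; not the actual bridge code.
class BatchDuplicateFilter {
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> batch = new ArrayList<>();

    // Adds the message to the current batch unless its ID was already seen.
    // Returns true if accepted, false if discarded as a duplicate.
    boolean offer(String messageId, String body) {
        if (!seenIds.add(messageId)) {
            return false; // re-delivered after failover, drop it
        }
        batch.add(body);
        return true;
    }

    // Hands the batch over for sending and starts a fresh one.
    List<String> drainBatch() {
        List<String> out = new ArrayList<>(batch);
        batch.clear();
        seenIds.clear();
        return out;
    }
}
```

      Set.add returns false when the element is already present, so a single call both records the ID and reports whether the message is a duplicate.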

      Tests are provided for each of the scenarios.

      https://github.com/hornetq/hornetq/pull/1163