We still see threads stuck in socketWrite on both the broker and the clients during failover, and the failover does not succeed. The sockets are not released until we recycle the broker and the client.
We are using AMQ 5.4.2 and, for performance reasons, straight-through session consumption, i.e. alwaysSessionAsync set to false. The problem does not occur when alwaysSessionAsync is set to true.
From the JIRA item https://issues.apache.org/jira/browse/AMQ-2693 we understood that this socketWrite issue was resolved in 5.4, but that isn't the case. The issue is highly repeatable. I have attached thread dumps from both the broker and the clients connected to it. Let me know if you need more details.
I think you need to use the WriteTimeoutFilter so that you can short-circuit the OS TCP-level detection of a broken socket connection.
Configure on the broker for the broker dispatch thread:
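As a rough sketch of the broker side (not necessarily the exact snippet originally posted here): the WriteTimeoutFilter is, as far as I know, enabled through the soWriteTimeout TCP transport option, which on a transportConnector takes the transport. prefix so that it applies to the accepted sockets. The connector name, bind address, port and the 15000 ms value below are assumptions for illustration only.

```xml
<!-- activemq.xml, broker side. Name, address, port and timeout are illustrative assumptions. -->
<transportConnectors>
  <!-- transport.soWriteTimeout: how long (in ms) a blocked socket write may last
       before the broker closes the socket instead of waiting for the OS to notice -->
  <transportConnector name="openwire"
      uri="tcp://0.0.0.0:61616?transport.soWriteTimeout=15000"/>
</transportConnectors>
```

A value of 0 means no write timeout, so the option has to be set explicitly to have any effect.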
and on the client, for acks:
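A comparable sketch for the client side, assuming the client builds its connection factory from a Spring bean definition (if it does not, the same URL can be passed to the ActiveMQConnectionFactory constructor). On the client the option goes on the inner tcp:// URI without the transport. prefix. The bean id, broker-host, port and the timeout value are hypothetical.

```xml
<!-- Client side. Bean id, host, port and timeout are illustrative assumptions. -->
<bean id="connectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
  <!-- soWriteTimeout on the nested tcp URI: a write (e.g. an ack) blocked longer
       than this many ms closes the socket, so the failover transport can reconnect -->
  <property name="brokerURL"
      value="failover:(tcp://broker-host:61616?soWriteTimeout=15000)"/>
</bean>
```

If more options are appended to the URI inside XML, remember to write the separator as &amp;amp; rather than a bare ampersand.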