On WildFly 10.1.0.Final installations under high load (no cluster), and previously on 8.2.1.Final as well, we often see the same JMS connection obtained from the "java:/JmsXA" connection pool being used by two threads at the same time. This corrupts the communication on that connection, which eventually fails with errors like
ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 53]
but also other packets like 43, 51 and 63.
We have triple-checked that the way we send the JMS messages is correct: each send creates its own TopicConnection and closes it in a finally block (in a SLSB).
As a consequence, these broken connections stay in the pool, which eventually contains only broken connections. To avoid that, we are applying this workaround: https://developer.jboss.org/message/975486#975486
This, however, doesn't address what seems to be the underlying cause.
For now we have resorted to the unpooled connection factory "java:/ConnectionFactory". Since those connections don't participate in the transaction, we commit or roll back manually using CDI events.
This solves the issue, but of course it is just a hack.
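The general shape of such a workaround can be sketched in plain Java. This is only an illustration under assumptions, not the production code: LocalTxSession and TxOutcomeBridge are hypothetical names, and the in-memory lists stand in for a locally transacted JMS session. In the container, the two callbacks would presumably be CDI transactional observer methods, e.g. @Observes(during = TransactionPhase.AFTER_SUCCESS) and @Observes(during = TransactionPhase.AFTER_FAILURE).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a locally transacted JMS session. */
final class LocalTxSession {
    private final List<String> uncommitted = new ArrayList<>();
    private final List<String> delivered = new ArrayList<>();

    /** Buffers a message; it is not visible to consumers yet. */
    void send(String message) {
        uncommitted.add(message);
    }

    /** Makes all buffered messages visible. */
    void commit() {
        delivered.addAll(uncommitted);
        uncommitted.clear();
    }

    /** Discards all buffered messages. */
    void rollback() {
        uncommitted.clear();
    }

    /** Returns a copy of the messages that have been committed so far. */
    List<String> delivered() {
        return new ArrayList<>(delivered);
    }
}

/** Hypothetical bridge that forwards the container transaction outcome to the session. */
final class TxOutcomeBridge {
    private final LocalTxSession session;

    TxOutcomeBridge(LocalTxSession session) {
        this.session = session;
    }

    // Assumption: in a container this would observe TransactionPhase.AFTER_SUCCESS.
    void afterSuccess() {
        session.commit();
    }

    // Assumption: in a container this would observe TransactionPhase.AFTER_FAILURE.
    void afterFailure() {
        session.rollback();
    }
}
```

The point of the bridge is that the local session is only committed once the outcome of the surrounding container transaction is known, which approximates what the XA-enabled pool would otherwise do.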
It appears that under high load the connection pool sometimes hands out the same connection to two clients/threads. So far we have been unable to reproduce this: for example, we have sent millions of messages concurrently from several threads, using the same SLSB as in production, without a single error. It only happens at some high-load installations, and normally only a few times per day, but at one installation it occurs so often that the system is almost unusable.
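A diagnostic along the following lines might help to catch the overlap when it happens in production. It is a minimal sketch, assuming one guard instance can be associated with each physical connection (how to identify the physical connection behind the pooled handle is left open here). ConcurrentUseGuard is a hypothetical helper, not part of WildFly or ActiveMQ Artemis.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical diagnostic helper: flags overlapping use of one object by two threads. */
final class ConcurrentUseGuard {
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final AtomicInteger violations = new AtomicInteger();

    /**
     * Call just before using the guarded connection.
     * Returns false if a thread is already inside (not reentrant, so a
     * nested enter() by the owning thread is also reported).
     */
    boolean enter() {
        if (owner.compareAndSet(null, Thread.currentThread())) {
            return true;
        }
        // A real deployment would log both thread names and stack traces here.
        violations.incrementAndGet();
        return false;
    }

    /** Call in a finally block; releases the guard only if the caller owns it. */
    void exit() {
        owner.compareAndSet(Thread.currentThread(), null);
    }

    /** Number of overlapping uses detected so far. */
    int violations() {
        return violations.get();
    }
}
```

Each send would call enter() before touching the connection and exit() in its finally block. A false return from enter() means another thread is still using the same connection at that moment, which is exactly the situation suspected above.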
Does anyone have an idea what the problem could be or a suggestion on how to reproduce it?
Cheers,
Torsten