Duplicate/retried messages when acknowledge fails due to race condition?
grim_toaster Apr 19, 2013 5:25 AM
Hi,
I'm getting duplicate messages delivered to a consumer, apparently because the acknowledgement fails with the server-side error below. This also produces excessively large log files and sends a lot of messages to the DLQ, but the duplicate deliveries to queue consumers are probably the scariest part.
Apr 19, 2013 9:52:03 AM org.hornetq.core.protocol.core.ServerSessionPacketHandler handlePacket
ERROR: HQ224016: Caught exception
HornetQException[errorType=ILLEGAL_STATE message=HQ119027: Could not find reference on consumer ID=0, messageId = 156 queue = b6f5c32a-3058-4c57-bfd9-3e4bee9e2d7d]
at org.hornetq.core.server.impl.ServerConsumerImpl.acknowledge(ServerConsumerImpl.java:704)
at org.hornetq.core.server.impl.ServerSessionImpl.acknowledge(ServerSessionImpl.java:634)
at org.hornetq.core.protocol.core.ServerSessionPacketHandler.handlePacket(ServerSessionPacketHandler.java:274)
at org.hornetq.core.protocol.core.impl.ChannelImpl.handlePacket(ChannelImpl.java:631)
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.doBufferReceived(RemotingConnectionImpl.java:547)
at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:523)
at org.hornetq.core.remoting.server.impl.RemotingServiceImpl$DelegatingBufferHandler.bufferReceived(RemotingServiceImpl.java:565)
at org.hornetq.core.remoting.impl.netty.HornetQChannelHandler.messageReceived(HornetQChannelHandler.java:72)
at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:281)
at org.hornetq.core.remoting.impl.netty.HornetQFrameDecoder2.decode(HornetQFrameDecoder2.java:169)
at org.hornetq.core.remoting.impl.netty.HornetQFrameDecoder2.messageReceived(HornetQFrameDecoder2.java:134)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.oio.OioWorker.process(OioWorker.java:71)
at org.jboss.netty.channel.socket.oio.AbstractOioWorker.run(AbstractOioWorker.java:73)
at org.jboss.netty.channel.socket.oio.OioWorker.run(OioWorker.java:51)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at org.jboss.netty.util.VirtualExecutorService$ChildExecutorRunnable.run(VirtualExecutorService.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
With a single producer and a single consumer there are no issues: every message sent is consumed exactly once. However, with multiple producers sending in parallel to the single consumer, I get the above stack trace, and although only 2000 messages are sent, over 10000 are normally consumed (the extras being redelivered messages).
I have this issue with a deployed application that was recently upgraded from 2.2.14 (because of HORNETQ-1042), and I have managed to recreate it in the attached test case. At least one of the tests fails on almost every run; increasing MESSAGES_TO_SEND_PER_PRODUCER or MULTIPLE_PRODUCERS_COUNT increases the likelihood of failure.
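For anyone reading without the attachment, the shape of the check can be sketched roughly as follows. This is not the attached test: it is a minimal illustration that uses an in-memory BlockingQueue as a stand-in for the HornetQ queue, and the class name DuplicateDeliveryCheck, the run method, the message id format and the constant values are all assumptions made for the example. Only the two constant names come from the test case.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class DuplicateDeliveryCheck {
    // Constant names are from the post; the values are arbitrary assumptions.
    static final int MULTIPLE_PRODUCERS_COUNT = 4;
    static final int MESSAGES_TO_SEND_PER_PRODUCER = 500;

    /** Runs parallel producers against one consumer; returns {consumed, duplicates}. */
    static int[] run(int producers, final int perProducer) throws InterruptedException {
        // Hypothetical stand-in for the broker queue, NOT a HornetQ session.
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        ExecutorService pool = Executors.newFixedThreadPool(producers);
        for (int p = 0; p < producers; p++) {
            final int producerId = p;
            pool.submit(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < perProducer; i++) {
                        queue.add(producerId + "-" + i); // unique id per message
                    }
                }
            });
        }
        pool.shutdown(); // no new tasks; already submitted producers keep running

        Set<String> seen = new HashSet<String>();
        int consumed = 0;
        int duplicates = 0;
        // Single consumer: keep going while producers are active or messages remain.
        while (!pool.isTerminated() || !queue.isEmpty()) {
            String id = queue.poll(100, TimeUnit.MILLISECONDS);
            if (id == null) {
                continue; // nothing arrived within the timeout, re-check the loop condition
            }
            consumed++;
            if (!seen.add(id)) {
                duplicates++; // the same id was delivered more than once
            }
        }
        return new int[] {consumed, duplicates};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(MULTIPLE_PRODUCERS_COUNT, MESSAGES_TO_SEND_PER_PRODUCER);
        int sent = MULTIPLE_PRODUCERS_COUNT * MESSAGES_TO_SEND_PER_PRODUCER;
        System.out.println("sent=" + sent + " consumed=" + result[0]
                + " duplicates=" + result[1]);
    }
}
```

With the in-memory stand-in the consumed count always equals the sent count and duplicates stays at 0. The failure described above corresponds to the same kind of check reporting far more consumed messages than were sent when the queue is a real HornetQ queue.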
I'm presuming some kind of race condition in HornetQ, but I would appreciate hearing from anyone who knows of this issue (I couldn't find anything relevant in Jira) or who can tell me whether it's a simple configuration issue.
Many Thanks
Andy
Attachment: project.zip (3.6 KB)