Deadlock when using netty NIO acceptor
carl.heymann Jul 9, 2011 2:26 PM
Hi
I've been trying to use netty acceptors with NIO rather than blocking IO. HornetQ doesn't work well in this configuration: it freezes up after running for only a very short while.
Setup
2 cores, 32-bit Linux, AIO persistence.
Acceptor configuration:
<acceptor name="netty">
<factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
<param key="host" value="${hornetq.remoting.netty.host:localhost}"/>
<param key="port" value="${hornetq.remoting.netty.port:5445}"/>
<param key="batch-delay" value="50"/>
<param key="use-nio" value="true" />
</acceptor>
Paging enabled:
<max-size-bytes>104857600</max-size-bytes>
<page-size-bytes>4096</page-size-bytes>
<message-counter-history-day-limit>10</message-counter-history-day-limit>
<address-full-policy>PAGE</address-full-policy>
Results
The server freezes almost immediately after it starts running, and I have to kill it with kill -9. With a profiler, I see a deadlock between these two threads:
New I/O server worker #1-2 [BLOCKED; waiting to lock java.lang.Object@904497]
org.hornetq.core.server.impl.ServerConsumerImpl.promptDelivery(ServerConsumerImpl.java:664)
org.hornetq.core.server.impl.ServerConsumerImpl.readyForWriting(ServerConsumerImpl.java:642)
org.hornetq.core.remoting.impl.netty.NettyConnection.fireReady(NettyConnection.java:264)
org.hornetq.core.remoting.impl.netty.NettyAcceptor$Listener.connectionReadyForWrites(NettyAcceptor.java:695)
org.hornetq.core.remoting.impl.netty.HornetQChannelHandler.channelInterestChanged(HornetQChannelHandler.java:65)
org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:136)
org.jboss.netty.channel.StaticChannelPipeline.sendUpstream(StaticChannelPipeline.java:362)
org.jboss.netty.channel.StaticChannelPipeline$StaticChannelHandlerContext.sendUpstream(StaticChannelPipeline.java:514)
org.jboss.netty.channel.SimpleChannelUpstreamHandler.channelInterestChanged(SimpleChannelUpstreamHandler.java:183)
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:116)
org.jboss.netty.channel.StaticChannelPipeline.sendUpstream(StaticChannelPipeline.java:362)
org.jboss.netty.channel.StaticChannelPipeline.sendUpstream(StaticChannelPipeline.java:357)
org.jboss.netty.channel.Channels.fireChannelInterestChanged(Channels.java:335)
org.jboss.netty.channel.socket.nio.NioSocketChannel$WriteRequestQueue.poll(NioSocketChannel.java:242)
org.jboss.netty.channel.socket.nio.NioSocketChannel$WriteRequestQueue.poll(NioSocketChannel.java:197)
org.jboss.netty.channel.socket.nio.NioWorker.write0(NioWorker.java:455)
org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:388)
org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
org.jboss.netty.channel.StaticChannelPipeline$StaticChannelHandlerContext.sendDownstream(StaticChannelPipeline.java:502)
org.jboss.netty.channel.SimpleChannelHandler.writeRequested(SimpleChannelHandler.java:304)
org.jboss.netty.channel.SimpleChannelHandler.handleDownstream(SimpleChannelHandler.java:266)
org.jboss.netty.channel.StaticChannelPipeline.sendDownstream(StaticChannelPipeline.java:385)
org.jboss.netty.channel.StaticChannelPipeline.sendDownstream(StaticChannelPipeline.java:380)
org.jboss.netty.channel.Channels.write(Channels.java:611)
org.jboss.netty.channel.Channels.write(Channels.java:578)
org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
org.hornetq.core.remoting.impl.netty.NettyConnection.write(NettyConnection.java:211)
org.hornetq.core.protocol.core.impl.ChannelImpl.send(ChannelImpl.java:199)
org.hornetq.core.protocol.core.impl.ChannelImpl.send(ChannelImpl.java:142)
org.hornetq.core.protocol.core.impl.CoreSessionCallback.sendProducerCreditsMessage(CoreSessionCallback.java:87)
org.hornetq.core.server.impl.ServerSessionImpl$2.run(ServerSessionImpl.java:1151)
org.hornetq.core.paging.impl.PagingStoreImpl.executeRunnableWhenMemoryAvailable(PagingStoreImpl.java:741)
org.hornetq.core.server.impl.ServerSessionImpl.requestProducerCredits(ServerSessionImpl.java:1147)
org.hornetq.core.protocol.core.ServerSessionPacketHandler.handlePacket(ServerSessionPacketHandler.java:473)
org.hornetq.core.protocol.core.impl.ChannelImpl.handlePacket(ChannelImpl.java:474)
org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.doBufferReceived(RemotingConnectionImpl.java:496)
org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:457)
org.hornetq.core.remoting.server.impl.RemotingServiceImpl$DelegatingBufferHandler.bufferReceived(RemotingServiceImpl.java:459)
org.hornetq.core.remoting.impl.netty.HornetQChannelHandler.messageReceived(HornetQChannelHandler.java:73)
org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:100)
org.jboss.netty.channel.StaticChannelPipeline.sendUpstream(StaticChannelPipeline.java:362)
org.jboss.netty.channel.StaticChannelPipeline$StaticChannelHandlerContext.sendUpstream(StaticChannelPipeline.java:514)
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:287)
org.hornetq.core.remoting.impl.netty.HornetQFrameDecoder2.decode(HornetQFrameDecoder2.java:169)
org.hornetq.core.remoting.impl.netty.HornetQFrameDecoder2.messageReceived(HornetQFrameDecoder2.java:134)
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
org.jboss.netty.channel.StaticChannelPipeline.sendUpstream(StaticChannelPipeline.java:362)
org.jboss.netty.channel.StaticChannelPipeline.sendUpstream(StaticChannelPipeline.java:357)
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:350)
org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:281)
org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:201)
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
org.jboss.netty.util.VirtualExecutorService$ChildExecutorRunnable.run(VirtualExecutorService.java:181)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
java.lang.Thread.run(Thread.java:636)
Thread-5 (group:HornetQ-server-threads12254719-18227730) [BLOCKED; waiting to lock java.lang.Object@c7d960]
org.hornetq.core.protocol.core.impl.ChannelImpl.send(ChannelImpl.java:161)
org.hornetq.core.protocol.core.impl.ChannelImpl.sendBatched(ChannelImpl.java:147)
org.hornetq.core.protocol.core.impl.CoreSessionCallback.sendMessage(CoreSessionCallback.java:76)
org.hornetq.core.server.impl.ServerConsumerImpl.deliverStandardMessage(ServerConsumerImpl.java:704)
org.hornetq.core.server.impl.ServerConsumerImpl.handle(ServerConsumerImpl.java:291)
org.hornetq.core.server.impl.QueueImpl.handle(QueueImpl.java:2017)
org.hornetq.core.server.impl.QueueImpl.deliver(QueueImpl.java:1587)
org.hornetq.core.server.impl.QueueImpl.doPoll(QueueImpl.java:1472)
org.hornetq.core.server.impl.QueueImpl.access$1100(QueueImpl.java:72)
org.hornetq.core.server.impl.QueueImpl$ConcurrentPoller.run(QueueImpl.java:2299)
org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
java.lang.Thread.run(Thread.java:636)
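Both stack traces pass through ServerConsumerImpl and ChannelImpl, but in opposite order: the NIO worker is inside ChannelImpl.send and blocks in ServerConsumerImpl.promptDelivery, while Thread-5 is inside ServerConsumerImpl.handle and blocks in ChannelImpl.send. So it seems that "lock" and "sendLock" are being acquired in different orders, which causes the deadlock. I'm not sure what can be done to solve this, but I suspect HornetQChannelHandler.channelInterestChanged may have to defer execution to another thread instead of calling back on the thread that is still inside the send.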
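To illustrate what deferring execution could look like, here is a minimal sketch. It is not HornetQ code: the class, the two lock objects, and the method names are hypothetical stand-ins for the monitors and call paths seen in the traces. The delivery path takes the consumer lock and then the send lock, while the send path hands the "ready for writes" callback to an executor instead of running it inline, so no thread ever requests the consumer lock while holding the send lock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DeferredCallbackSketch {
    // Hypothetical stand-ins for the two monitors in the traces.
    private final Object consumerLock = new Object(); // role of "lock"
    private final Object sendLock = new Object();     // role of "sendLock"
    // Assumption: a single-threaded executor is enough to run deferred callbacks.
    private final ExecutorService deferred = Executors.newSingleThreadExecutor();
    private int deliveries;
    private int prompts;

    // Delivery path (like Thread-5): consumer lock first, then send lock.
    void deliver() {
        synchronized (consumerLock) {
            synchronized (sendLock) {
                deliveries++;
            }
        }
    }

    // Send path (like the NIO worker): holds the send lock while the
    // transport reports "ready for writes". Calling promptDelivery() inline
    // here would request consumerLock while holding sendLock, which is the
    // inverted order. Handing it to the executor avoids that.
    void send() {
        synchronized (sendLock) {
            deferred.execute(this::promptDelivery);
        }
    }

    // Runs on the executor thread, which holds no other lock at this point.
    private void promptDelivery() {
        synchronized (consumerLock) {
            prompts++;
        }
    }

    // Runs both paths concurrently; returns true if everything completed.
    public static boolean runDemo(int iterations) {
        DeferredCallbackSketch s = new DeferredCallbackSketch();
        Thread deliverer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.deliver();
        });
        Thread sender = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.send();
        });
        deliverer.start();
        sender.start();
        try {
            deliverer.join();
            sender.join();
            s.deferred.shutdown();
            if (!s.deferred.awaitTermination(30, TimeUnit.SECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return s.deliveries == iterations && s.prompts == iterations;
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runDemo(10_000));
    }
}
```

If send() called promptDelivery() directly, the two threads could each hold one lock and wait for the other, which matches the pattern in the traces. Whether a hand-off like this is acceptable inside HornetQ (ordering, latency) is something I can't judge from the outside.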
Furthermore, the reaper thread is blocked at
org.hornetq.core.server.impl.QueueImpl.expireReferences()
org.hornetq.core.postoffice.impl.PostOfficeImpl$Reaper.run()
java.lang.Thread.run()
and the failure check thread is blocked at
org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.flush()
org.hornetq.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread.run()
I'm not sure whether these are problems in themselves, but this doesn't happen with use-nio=false.
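One way to tell whether the reaper and failure check threads are part of the lock cycle, or only stuck behind it, might be to ask the JVM directly. The sketch below uses the standard java.lang.management API. The class name is hypothetical, and it assumes the code runs inside the server JVM (the same ThreadMXBean can also be reached over JMX). Threads that are merely waiting on a lock held by a deadlocked thread are not reported by this call, only the threads in the cycle are.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockReport {
    // Returns one line per thread that is part of a monitor deadlock cycle,
    // or a fixed message when the JVM finds no cycle.
    public static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers object monitors (synchronized blocks), which is what the
        // java.lang.Object locks in the traces are. Returns null if no cycle.
        long[] ids = mx.findMonitorDeadlockedThreads();
        if (ids == null) {
            return "no deadlock detected";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // thread terminated between the two calls
            }
            sb.append(info.getThreadName())
              .append(" waits for ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

If only the NIO worker and Thread-5 show up, the other two blocked threads would be a consequence of the deadlock rather than a separate issue.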
Regards
Carl