12 Replies Latest reply on Jan 15, 2013 10:34 AM by clebert.suconic

    Possible Bug where Producer threads hang in ClientProducerCreditsImpl.acquireCredits

    manu_1185

      Hi,

       

      We are using HornetQ 2.2.14.Final in our production environment. In the last few days our user count has increased quite a bit, and every day the server hangs because all of the producer threads are blocked. Below is the stack trace (I was also able to reproduce the hang with somewhat larger messages; see further down):

       

      "pool-5-thread-100" prio=10 tid=0x00007fc694112000 nid=0x7599 waiting on condition [0x00007fc62cb4a000]

         java.lang.Thread.State: WAITING (parking)

              at sun.misc.Unsafe.park(Native Method)

              - parking to wait for  <0x000000050098fb68> (a java.util.concurrent.Semaphore$NonfairSync)

              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)

              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)

              at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)

              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)

              at java.util.concurrent.Semaphore.acquire(Semaphore.java:468)

              at org.hornetq.core.client.impl.ClientProducerCreditsImpl.acquireCredits(ClientProducerCreditsImpl.java:74)

              at org.hornetq.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:305)

              at org.hornetq.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:142)

              at org.hornetq.jms.client.HornetQMessageProducer.doSend(HornetQMessageProducer.java:451)

              at org.hornetq.jms.client.HornetQMessageProducer.send(HornetQMessageProducer.java:246)

              at com.bsb.hike.pubsub.jms.JMSProducer.send(JMSProducer.java:129)

              at com.bsb.hike.pubsub.jms.JMSProducer.send(JMSProducer.java:117)

              at com.bsb.hike.pubsub.ProducerPool$MessageSendTask.run(ProducerPool.java:150)

              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

              at java.lang.Thread.run(Thread.java:679)

       

      It starts working again after a restart, but the hang recurs after some time. We are using BLOCK as the address-full-policy, and max-size-bytes is 1 GB for all addresses. Initially we thought the queues might be getting full, but we have verified via the management API that the queue depth is very low (fewer than 100 messages, often 0) at all times, even when the problem occurs (in our application we are always consuming). So this is not a case of the queue filling up.
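
      For reference, the relevant part of our hornetq-configuration.xml looks roughly like the sketch below (the match pattern is illustrative; max-size-bytes is 1 GB in production and 100 MB in the test described further down):

         <address-settings>
            <address-setting match="#">
               <!-- block producers instead of dropping or paging when the address is full -->
               <address-full-policy>BLOCK</address-full-policy>
               <!-- per-address limit: 1 GB in production, 100 MB in the test -->
               <max-size-bytes>104857600</max-size-bytes>
            </address-setting>
         </address-settings>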

       

      Some of our recorded stats showed that the issue tends to appear around the times when users are sending large messages, so I ran some tests with larger messages and was able to reproduce it. In my test, max-size-bytes was configured as 100 MB for all addresses, though I was using only one queue, in persistent mode with persistent messages. 10 clients were sending messages, with an interval of 5 seconds between messages. Each client sent 100 messages, so the total was 1000 messages. I tried different message sizes: it ran fine with sizes of 10 KB and 15 KB, but when I increased the message size to 20 KB, only a few of the 1000 messages were received (ranging from 15 to 150 in different tests). All the producer threads were blocked in acquireCredits. It seems that the server somehow stops sending credits to the producers even though the queue is not full.
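
      For completeness, the reproduction test is essentially the sketch below (the JNDI names, queue name and exact payload construction are placeholders for my actual test setup):

         import java.util.Arrays;
         import javax.jms.Connection;
         import javax.jms.ConnectionFactory;
         import javax.jms.DeliveryMode;
         import javax.jms.MessageProducer;
         import javax.jms.Queue;
         import javax.jms.Session;
         import javax.naming.InitialContext;

         public class ProducerLoadTest {
            public static void main(String[] args) throws Exception {
               InitialContext ctx = new InitialContext();
               // JNDI names are placeholders; substitute the ones bound in your environment
               final ConnectionFactory cf = (ConnectionFactory) ctx.lookup("/ConnectionFactory");
               final Queue queue = (Queue) ctx.lookup("/queue/testQueue");

               // ~20 KB text payload (the size at which the hang shows up; 10 KB and 15 KB were fine)
               char[] chars = new char[20 * 1024];
               Arrays.fill(chars, 'x');
               final String payload = new String(chars);

               Thread[] clients = new Thread[10];
               for (int i = 0; i < clients.length; i++) {
                  clients[i] = new Thread(new Runnable() {
                     public void run() {
                        Connection connection = null;
                        try {
                           connection = cf.createConnection();
                           Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                           MessageProducer producer = session.createProducer(queue);
                           producer.setDeliveryMode(DeliveryMode.PERSISTENT);
                           // 100 messages per client, 5 seconds apart -> 1000 messages in total
                           for (int m = 0; m < 100; m++) {
                              producer.send(session.createTextMessage(payload));
                              Thread.sleep(5000);
                           }
                        } catch (Exception e) {
                           e.printStackTrace();
                        } finally {
                           try {
                              if (connection != null) {
                                 connection.close();
                              }
                           } catch (Exception ignored) {
                           }
                        }
                     }
                  });
                  clients[i].start();
               }
               for (Thread client : clients) {
                  client.join();
               }
            }
         }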

       

      Is this a bug, or can we fix it by changing some configuration? Any help would be appreciated.