12 Replies Latest reply on Jan 15, 2013 10:34 AM by clebert.suconic

    Possible Bug where Producer threads hang in ClientProducerCreditsImpl.acquireCredits

    manu_1185

      Hi,

       

      We are using HornetQ 2.2.14.Final in our production environment. In the last few days our user count has increased quite a bit, and every day the server hangs because all of the producer threads are blocked. Below is the stack trace (I was also able to reproduce the hang with somewhat larger messages; see further down):

       

      "pool-5-thread-100" prio=10 tid=0x00007fc694112000 nid=0x7599 waiting on condition [0x00007fc62cb4a000]

         java.lang.Thread.State: WAITING (parking)

              at sun.misc.Unsafe.park(Native Method)

              - parking to wait for  <0x000000050098fb68> (a java.util.concurrent.Semaphore$NonfairSync)

              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)

              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)

              at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)

              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)

              at java.util.concurrent.Semaphore.acquire(Semaphore.java:468)

              at org.hornetq.core.client.impl.ClientProducerCreditsImpl.acquireCredits(ClientProducerCreditsImpl.java:74)

              at org.hornetq.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:305)

              at org.hornetq.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:142)

              at org.hornetq.jms.client.HornetQMessageProducer.doSend(HornetQMessageProducer.java:451)

              at org.hornetq.jms.client.HornetQMessageProducer.send(HornetQMessageProducer.java:246)

              at com.bsb.hike.pubsub.jms.JMSProducer.send(JMSProducer.java:129)

              at com.bsb.hike.pubsub.jms.JMSProducer.send(JMSProducer.java:117)

              at com.bsb.hike.pubsub.ProducerPool$MessageSendTask.run(ProducerPool.java:150)

              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

              at java.lang.Thread.run(Thread.java:679)

       

      It starts working again after a restart, but the hang recurs after some time. We are using BLOCK as the address-full-policy, and max-size-bytes is 1 GB for all addresses. Initially we thought the queues might be getting full, but we have verified via the management API that the queue depth is very low (fewer than 100 messages, often 0) at all times, even when the problem occurs (in our application we are always consuming). So this is not a case of the queue filling up.
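
      For reference, the relevant part of our hornetq-configuration.xml looks roughly like the sketch below (the match pattern is illustrative; max-size-bytes is 1 GB in production and 100 MB in the test described further down):

         <address-settings>
            <address-setting match="#">
               <!-- block producers instead of dropping or paging when the address is full -->
               <address-full-policy>BLOCK</address-full-policy>
               <!-- per-address limit: 1 GB in production, 100 MB in the test -->
               <max-size-bytes>104857600</max-size-bytes>
            </address-setting>
         </address-settings>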

       

      Some of our recorded stats showed that the issue tends to appear around the times when users are sending large messages, so I ran some tests with larger messages and was able to reproduce it. In my test, max-size-bytes was configured as 100 MB for all addresses, though I was using only one queue, in persistent mode with persistent messages. 10 clients were sending messages, with an interval of 5 seconds between messages. Each client sent 100 messages, so the total was 1000 messages. I tried different message sizes: it ran fine with sizes of 10 KB and 15 KB, but when I increased the message size to 20 KB, only a few of the 1000 messages were received (ranging from 15 to 150 in different tests). All the producer threads were blocked in acquireCredits. It seems that the server somehow stops sending credits to the producers even though the queue is not full.
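
      For completeness, the reproduction test is essentially the sketch below (the JNDI names, queue name and exact payload construction are placeholders for my actual test setup):

         import java.util.Arrays;
         import javax.jms.Connection;
         import javax.jms.ConnectionFactory;
         import javax.jms.DeliveryMode;
         import javax.jms.MessageProducer;
         import javax.jms.Queue;
         import javax.jms.Session;
         import javax.naming.InitialContext;

         public class ProducerLoadTest {
            public static void main(String[] args) throws Exception {
               InitialContext ctx = new InitialContext();
               // JNDI names are placeholders; substitute the ones bound in your environment
               final ConnectionFactory cf = (ConnectionFactory) ctx.lookup("/ConnectionFactory");
               final Queue queue = (Queue) ctx.lookup("/queue/testQueue");

               // ~20 KB text payload (the size at which the hang shows up; 10 KB and 15 KB were fine)
               char[] chars = new char[20 * 1024];
               Arrays.fill(chars, 'x');
               final String payload = new String(chars);

               Thread[] clients = new Thread[10];
               for (int i = 0; i < clients.length; i++) {
                  clients[i] = new Thread(new Runnable() {
                     public void run() {
                        Connection connection = null;
                        try {
                           connection = cf.createConnection();
                           Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                           MessageProducer producer = session.createProducer(queue);
                           producer.setDeliveryMode(DeliveryMode.PERSISTENT);
                           // 100 messages per client, 5 seconds apart -> 1000 messages in total
                           for (int m = 0; m < 100; m++) {
                              producer.send(session.createTextMessage(payload));
                              Thread.sleep(5000);
                           }
                        } catch (Exception e) {
                           e.printStackTrace();
                        } finally {
                           try {
                              if (connection != null) {
                                 connection.close();
                              }
                           } catch (Exception ignored) {
                           }
                        }
                     }
                  });
                  clients[i].start();
               }
               for (Thread client : clients) {
                  client.join();
               }
            }
         }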

       

      Is this a bug, or can we fix it by changing some configuration? Any help would be appreciated.