7 Replies Latest reply on Feb 7, 2008 5:44 AM by timfox

    Threads not being cleaned up when clustered

    chipschoch

      Clustered App servers. JBossAS 4.2.2.GA, JBM 1.4.0SP3

      --------------
      | Appserver1 | <====== ConvServer1
      | Appserver2 | <====== ConvServer2
      --------------
      

      My configuration is as shown above. I have two Linux JBoss servers clustered, and two Windows JBoss servers, not clustered, that act as clients to the clustered queues on the app servers. The conv servers connect using the ClusteredXAConnectionFactory. When I run a bunch of messages through in this configuration, the thread count on both app servers continually increases until eventually I run out of memory.
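
      For illustration, here is a minimal sketch of how a conv server client might obtain the
      clustered factory. The HA-JNDI provider URL and the ClusteredXAConnectionFactory JNDI
      name are assumptions about a typical JBM 1.4 setup, not taken from the actual configuration:

      // Sketch only: the provider URL and JNDI binding name are assumptions.
      import java.util.Properties;
      import javax.jms.XAConnectionFactory;
      import javax.naming.Context;
      import javax.naming.InitialContext;

      public class ClusteredFactoryLookup {
          public static XAConnectionFactory lookup() throws Exception {
              Properties env = new Properties();
              env.put(Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory");
              env.put(Context.URL_PKG_PREFIXES, "org.jboss.naming:org.jnp.interfaces");
              // HA-JNDI provider list, so either clustered node can answer the lookup
              env.put(Context.PROVIDER_URL, "appserver1:1100,appserver2:1100");
              Context ctx = new InitialContext(env);
              try {
                  return (XAConnectionFactory) ctx.lookup("ClusteredXAConnectionFactory");
              } finally {
                  ctx.close();
              }
          }
      }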

      When I shut down one of the app servers and perform the same test, the thread count on the other app server decreases initially and then remains steady. It does not increase when I run a bunch of messages through it.
      --------------
      |            | <====== ConvServer1
      | Appserver2 | <====== ConvServer2
      --------------
                                       DevApp1     DevApp2
                                       ThreadCount ThreadCount

      Cluster with queuing started on DevApp2
      Start                            158         200
      After 4 uSign packages           167         206
      After 4 uSign packages           173         208
      After 4 uSign packages           179         208
      After hundred events             177         212
      After 10 WPS packages            178         212
      After 10 WPS packages            180         212

      DevApp2 only
      Before shutting down DevApp1                 212
      After shutting down DevApp1                  187
      After 4 uSign packages                       187
      After 6 uSign packages                       187

      Restarted with each conv server connecting to a
      different app server (no message sucking)
      Start                            155         225
      After 4 uSign packages           169         228
      After 4 uSign packages           174         232
      After 3 packages one at a time   175         235*
      After 20 uSign packages          187         256
      
      


      Without belaboring the point, in our system a package causes a series of
      messages to be posted to various queues as it moves through the processing chain.

      The packages called uSign require that a conversion be performed by a service
      running on the Windows (conv) servers. These are the ones that cause the increase
      in the thread count. I am convinced that it is not my code that is leaving the
      threads around, because it only happens when the cluster has more than one
      server running.

      * I observed that Appserver1 processed the message, but the thread count increased
      on Appserver2 anyway.


      Here are a couple of the stranded threads from the jmx-console view.

      Thread: Thread-1562 : priority:5, demon:true, threadId:4470, threadState:WAITING, lockName:java.lang.Object@11bad13
      
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:474)
       EDU.oswego.cs.dl.util.concurrent.LinkedQueue.take(LinkedQueue.java:122)
       EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:83)
       java.lang.Thread.run(Thread.java:595)
      
      Thread: Thread-1568 : priority:5, demon:false, threadId:4478, threadState:WAITING, lockName:java.lang.Object@1ffed3a
      
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:474)
       EDU.oswego.cs.dl.util.concurrent.LinkedQueue.take(LinkedQueue.java:122)
       EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:83)
       java.lang.Thread.run(Thread.java:595)
      
      Thread: Thread-1569 : priority:5, demon:true, threadId:4479, threadState:WAITING, lockName:java.lang.Object@12f926d
      
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:474)
       EDU.oswego.cs.dl.util.concurrent.LinkedQueue.take(LinkedQueue.java:122)
       EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:83)
       java.lang.Thread.run(Thread.java:595)
      
      
      
      


      Has anyone seen anything like this before?

        • 1. Re: Threads not being cleaned up when clustered
          timfox

          Please can you post (or mail me) a complete thread dump of the server when this problem occurs? (killall -3 java)

          • 2. Re: Threads not being cleaned up when clustered
            chipschoch

            Tim,
            I emailed a thread dump to tim.fox@jboss.com.

            I have been able to narrow my parameters and get a reproducible environment for this issue.

            I wrote a webapp program that queues up messages to the clustered queue. It uses the default provider, so it is always queuing to its partial queue. This is executed on AppSvr1.

            The consumers are connected to AppSvr2, so all messages posted get sucked from AppSvr1 to AppSvr2. In this configuration both servers leak the same number of threads, one for each message.

            However, when I changed the request message to not specify a temporary queue for the return message, I got no leakage. My consumer has a default queue to which it sends responses if a response queue is not specified in the message. All the response messages end up on the default response queue.
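
            Roughly, the posting code does something like this (a simplified sketch, not the
            actual webapp code; the class name, queue parameter, and helper structure are
            illustrative):

            import javax.jms.*;

            // Illustrative request producer; names are assumptions, not the real code.
            public class RequestSender {
                public void send(QueueConnectionFactory cf, Queue requestQueue, String body)
                        throws JMSException {
                    QueueConnection conn = cf.createQueueConnection();
                    try {
                        QueueSession session = conn.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
                        TextMessage msg = session.createTextMessage(body);
                        // Case (a): a temporary reply queue per message -- the configuration
                        // that leaks one thread per message on both servers.
                        TemporaryQueue replyQueue = session.createTemporaryQueue();
                        msg.setJMSReplyTo(replyQueue);
                        // Case (b): omit setJMSReplyTo and let the consumer fall back to its
                        // default response queue -- no threads are leaked that way.
                        session.createSender(requestQueue).send(msg);
                    } finally {
                        conn.close(); // closing the connection should also remove the temp queue
                    }
                }
            }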

            So to summarize:

            Cluster has 2 JBoss servers, AppSvr1 & AppSvr2.
             AppSvr1 posts messages to [partial queue 1] requestQueue.
             Messages are sucked over to AppSvr2 [partial queue 2] requestQueue.
             Consumers consume from [partial queue 2] requestQueue.
            
             a) When a temporary response queue is specified, all response
             messages end up back on AppSvr1, but both servers leak one
             thread per message.
            
             b) When no response queue is specified, all responses end up on
             AppSvr2 [partial queue 2] responseQueue (the default) and no
             threads are leaked.
            

            It would appear that the issue is somewhere in the code that deals with the temporary queues.

            I hope this helps to resolve this.

            • 3. Re: Threads not being cleaned up when clustered
              timfox

              I would have a look in your code to see where you are creating temporary queues, and make sure you are deleting them when you're finished.

              Also it's worth taking a look in JNDI (use jmx-console) to see if there are a lot of temp queues hanging around.
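
               Something along these lines, i.e. close the consumer and delete the temporary
               queue once the reply has been handled (an illustrative sketch, not your code;
               the 30 second timeout is arbitrary):

               import javax.jms.*;

               // Illustrative request/reply helper; assumes the connection is already started.
               public class TempQueueCleanup {
                   public Message requestReply(QueueSession session, Queue requestQueue, String body)
                           throws JMSException {
                       TemporaryQueue replyQueue = session.createTemporaryQueue();
                       QueueSender sender = session.createSender(requestQueue);
                       QueueReceiver receiver = session.createReceiver(replyQueue);
                       try {
                           TextMessage msg = session.createTextMessage(body);
                           msg.setJMSReplyTo(replyQueue);
                           sender.send(msg);
                           return receiver.receive(30000);   // wait up to 30s for the reply
                       } finally {
                           receiver.close();                 // close consumers before deleting the queue
                           sender.close();
                           replyQueue.delete();              // explicit cleanup when finished
                       }
                   }
               }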

              • 4. Re: Threads not being cleaned up when clustered
                timfox

                 BTW, I would avoid creating a new temporary reply queue for every message you send. This is likely to adversely affect performance.

                • 5. Re: Threads not being cleaned up when clustered
                  chipschoch

                   Deleting the TemporaryQueue has no effect. The JMS API spec for createTemporaryQueue() says:

                   "Create a temporary queue. Its lifetime will be that of the QueueConnection unless deleted earlier."

                   I create a connection, make the call, and close the connection, so I should not need to delete the queue. That said, I put in code to delete it anyway and it makes no difference. Also, I created a temporary queue, then stopped execution in my debugger and went to jmx-console. The JNDIView does not list any temporary queues. Go figure.

                  • 6. Re: Threads not being cleaned up when clustered
                    chipschoch

                     So, I modified my code to reuse the same JMS connection and temporary queue within each process that posts messages, and now the thread leakage is gone.

                     While this change admittedly optimizes the processing, I would still consider it a workaround for a bug that does not dispose of the thread created by the use of a temporary queue.
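
                     The reworked code is roughly of this shape (a simplified sketch, not the
                     actual code; the class name and the 30 second timeout are illustrative):

                     import javax.jms.*;

                     // One connection, session, and temporary reply queue per posting process,
                     // reused for every message instead of being created per message.
                     public class ReusableRequestor {
                         private final QueueConnection conn;
                         private final QueueSession session;
                         private final QueueSender sender;
                         private final TemporaryQueue replyQueue;
                         private final QueueReceiver receiver;

                         public ReusableRequestor(QueueConnectionFactory cf, Queue requestQueue)
                                 throws JMSException {
                             conn = cf.createQueueConnection();
                             session = conn.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
                             sender = session.createSender(requestQueue);
                             replyQueue = session.createTemporaryQueue(); // created once, reused
                             receiver = session.createReceiver(replyQueue);
                             conn.start();
                         }

                         public Message request(String body) throws JMSException {
                             TextMessage msg = session.createTextMessage(body);
                             msg.setJMSReplyTo(replyQueue);
                             sender.send(msg);
                             return receiver.receive(30000); // one synchronous reply per request
                         }

                         public void close() throws JMSException {
                             conn.close(); // also disposes of the temporary queue
                         }
                     }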

                    • 7. Re: Threads not being cleaned up when clustered
                      timfox

                       I agree that, although your application's usage of temporary queues was an anti-pattern, it shouldn't leak threads as long as you were closing the connection.

                      Can you create a JIRA with a small program that demonstrates this issue and we will investigate further?

                       Also, can you first verify you're not just hitting http://jira.jboss.org/jira/browse/JBMESSAGING-1215, an issue that was fixed a while back?