-
1. Re: AJP13 Request Hang Intermittently
erasmomarciano Jan 20, 2010 9:07 AM (in response to edlam_2000)check the mod_jk.log anf if you find some error the past them
-
2. Re: AJP13 Request Hang Intermittently
edlam_2000 Jan 20, 2010 9:26 AM (in response to erasmomarciano)I notice there are some error messages recorded in mod_jk.log:
(node1) can't receive the response header message from tomcat, network problems or tomcat is down
But that wasn't recorded during the time of the request hang, it was recorded at some other time.
When the request was hang, I noticed that both the Apache access_log and JBoss localhost access log didn't record the URL access in the log, however, the URL access record was logged into the log at almost 2 or 3 hours after the request hang.
-
3. Re: AJP13 Request Hang Intermittently
edlam_2000 Jan 21, 2010 12:31 AM (in response to edlam_2000)Usually for a successful AJP13 request, the JBoss returns immediately a 200 OK Header, the body chunk and then ends with an AJP End Response packet.
However, for a hang AJP13 request, the JBoss simply returns a TCP Acknowledgement and nothing else.
I can observe from the http://localhost:8080/status?full=true that the thread connection was in the K (Keep Alive) stage meaning the request has been successfully completed by JBoss even though JBoss returns nothing for the request. The number of active threads reported is always under 10 and the maximum threads is set to 1000.
This is so weird as it appears that the JBoss seems running very normal even though it is returning nothing for a request at time of problem occurence. Usually, 1 out of 4 requests would lead to this problem.
-
4. Re: AJP13 Request Hang Intermittently
edlam_2000 Jan 21, 2010 12:35 AM (in response to edlam_2000)In mod_jk.log, the most often recorded errors and warnings are:-
ajp_connection_tcp_get_message::jk_ajp_common.c (1150): (node1) can't receive the response header message from tomcat, network problems or tomcat (127.0.0.1:8009) is down (errno=104)
[error] ajp_get_reply::jk_ajp_common.c (1962): (node1) Tomcat is down or refused connection. No response has been sent to the client (yet)
[warn] map_uri_to_worker_ext::jk_uri_worker_map.c (962): Uri * is invalid. Uri must start with /
But these are recorded not at time of problem occurence.
-
5. Re: AJP13 Request Hang Intermittently
jfclere Jan 26, 2010 3:57 AM (in response to edlam_2000)It could be a new problem. Try
socket_keepalive=true in worker configuration.
No amswer from the application... The application hangs somewhere (and therefore you have the CLOSE_WAIT) try a stackdump and analyse the stack.
-
6. Re: AJP13 Request Hang Intermittently
edlam_2000 Jan 26, 2010 7:16 AM (in response to jfclere)Thanks, but How to do a stack dump? does the jboss needs to be stopped ?
When I look at the network connections of the hanged request, it's actually always in ESTABLISH state.
However, one intresting observation is that when I do a netstat -an, and it displays the 0 127.0.0.1:8009 127.0.0.1:xxxx, many of the entries have their recv-q filled up with over one hundred value and it just stay there and won't go away. I think this is very abnormal.
-
7. Re: AJP13 Request Hang Intermittently
peterj Jan 26, 2010 11:47 AM (in response to edlam_2000)"How to do a stack dump?"
http://java.sun.com/developer/technicalArticles/Programming/Stacktrace/
Also, JConsole and VisualVM let you take stack dumps of running JVMs.
-
8. Re: AJP13 Request Hang Intermittently
edlam_2000 Jan 27, 2010 12:23 AM (in response to edlam_2000)Thanks all.
Below is a netstat output of the ESTABLISHED connections:-
Most connection's Recv-Q are filled up with data for long period of time, is that the problem?
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 218 0 127.0.0.1:8009 127.0.0.1:35871 ESTABLISHED
tcp 179 0 127.0.0.1:8009 127.0.0.1:35869 ESTABLISHED
tcp 223 0 127.0.0.1:8009 127.0.0.1:35865 ESTABLISHED
tcp 300 0 127.0.0.1:8009 127.0.0.1:35863 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:36362 ESTABLISHED
tcp 300 0 127.0.0.1:8009 127.0.0.1:35841 ESTABLISHED
tcp 179 0 127.0.0.1:8009 127.0.0.1:35875 ESTABLISHED
tcp 223 0 127.0.0.1:8009 127.0.0.1:35874 ESTABLISHED
tcp 300 0 127.0.0.1:8009 127.0.0.1:35873 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:35927 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:35920 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:35939 ESTABLISHED
tcp 297 0 127.0.0.1:8009 127.0.0.1:35742 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:35998 ESTABLISHED
tcp 382 0 127.0.0.1:8009 127.0.0.1:35735 ESTABLISHED
tcp 219 0 127.0.0.1:8009 127.0.0.1:35733 ESTABLISHED
tcp 224 0 127.0.0.1:8009 127.0.0.1:35727 ESTABLISHED
tcp 300 0 127.0.0.1:8009 127.0.0.1:35724 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:35975 ESTABLISHED
tcp 224 0 127.0.0.1:8009 127.0.0.1:35712 ESTABLISHED
tcp 297 0 127.0.0.1:8009 127.0.0.1:35784 ESTABLISHED
tcp 224 0 127.0.0.1:8009 127.0.0.1:35835 ESTABLISHED
tcp 219 0 127.0.0.1:8009 127.0.0.1:35826 ESTABLISHED -
9. Re: AJP13 Request Hang Intermittently
edlam_2000 Jan 27, 2010 4:55 AM (in response to edlam_2000)For a hanged request, ie. a AJP13 request that has its TCP acknowledged but without any AJP13 Response Header received, the nestat network connection is showing its Recv-Q of 175,
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 179 0 127.0.0.1:8009 127.0.0.1:36890 ESTABLISHEDCould this be the problem???
-
10. Re: AJP13 Request Hang Intermittently
jfclere Jan 28, 2010 3:28 AM (in response to edlam_2000)netstat on the JAVA side?
So something is hanging... Stack dump should help you to find where it is hanging.
-
11. Re: AJP13 Request Hang Intermittently
edlam_2000 Feb 4, 2010 2:58 AM (in response to jfclere)Thanks, I haven't been able to take a thread dump / stack dump yet because that involves shuting down the server, but I will do so.
Right now, the server.xml connectionTimeout is 600000 ~ 10 minutes, while the workers.properties connection_pool_timeout is 60 ~ 1 minute. I know there's a mismatch because it is suggested in the document that these 2 values should be equal so that both end closes the connection at about the same time.
Even if there's a mismatch, resulting in a situation in which the mod_jk closes the connection earlier than the JBOSS, the JBOSS should passively close its connection when the mod_jk actively closes the connection at its local end. I mean there should be a TCP handshaking for closing the connection happens when the connection is closed by mod_jk, so that the JBOSS end is aware of the connection closed by the mod_jk, isn't it ?
-
12. Re: AJP13 Request Hang Intermittently
jaikiran Feb 4, 2010 3:11 AM (in response to edlam_2000)edlam_2000 wrote:
Thanks, I haven't been able to take a thread dump / stack dump yet because that involves shuting down the server, but I will do so.
You don't have to shutdown the server. You can use the jmx-console to obtain a thread dump. See this http://community.jboss.org/wiki/ThreadDump
-
13. Re: AJP13 Request Hang Intermittently
edlam_2000 Feb 4, 2010 5:24 AM (in response to jaikiran)Thanks.
Additional information is as follow:-
I noticed that for those empty netstat Recv-Q socket, its threads is actually in the K (Keep Alive) stage in the display of the http://localhost:8080/status?full=true JMX status output, and it's those (K) status thread with empty Recv-Q that is capable of handling AJP request without any problem.
On the other hand, those Recv-Q that is stuck with a positive number are in the R (Ready state). I also notice all those netstat CLOSE_WAIT belongs to those threads in the (R) Ready state.
To me, it appears that only those (K) state thread are able to handle client request successfully, as those (R) state thread are either stuck with Recv-Q packet or has already been in the CLOSE_WAIT state.
-
14. Re: AJP13 Request Hang Intermittently
edlam_2000 Feb 4, 2010 9:31 PM (in response to jaikiran)Thank you.
I have just taken the Thread dump via twiddle. Attached is the Thread dump file, threads.html.
I just added a "ping_mode=I" to the workers.properties file and today I found that the number of ESTABLISH connections has grown to a very large number 113. Before adding the "ping_mode=I", the number of ESTABLISH connection is usually around 10.
The JMX http://localhost:8080/status?full=true displays that there are 224 threads in the (R)eady State and 10 busy threads which are in the (K)eep Alive state.
Max threads: 1000 Current thread count: 224 Current thread busy: 10
Max processing time: 10246 ms Processing time: 343.196 s Request count: 8136 Error count: 250 Bytes received: 0.00 MB Bytes sent: 25.13 MBIt appears that all those threads in the (R)eady state are either being stuck in their Recv-Q or in the CLOSE_WAIT network connections, and it's only those threads which are in the (K)eepalive state is capable of handling client request successfully.
All the Recv-Q of the ESTABLISH connection are now being stuck with data as below:-
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 298 0 127.0.0.1:8009 127.0.0.1:45087 ESTABLISHED
tcp 473 0 127.0.0.1:8009 127.0.0.1:45086 ESTABLISHED
tcp 298 0 127.0.0.1:8009 127.0.0.1:45085 ESTABLISHED
tcp 477 0 127.0.0.1:8009 127.0.0.1:45083 ESTABLISHED
tcp 298 0 127.0.0.1:8009 127.0.0.1:45082 ESTABLISHED
tcp 518 0 127.0.0.1:8009 127.0.0.1:45081 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:43032 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:45080 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42517 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42772 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42000 ESTABLISHED
tcp 298 0 127.0.0.1:8009 127.0.0.1:35855 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:41230 ESTABLISHED
tcp 368 0 127.0.0.1:8009 127.0.0.1:35850 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:35848 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:40455 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:39427 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:40508 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:40507 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:40506 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:41527 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:41526 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:43059 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42801 ESTABLISHED
tcp 298 0 127.0.0.1:8009 127.0.0.1:34353 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:40240 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42799 ESTABLISHED
tcp 302 0 127.0.0.1:8009 127.0.0.1:35887 ESTABLISHED
tcp 472 0 127.0.0.1:8009 127.0.0.1:35885 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:38186 ESTABLISHED
tcp 298 0 127.0.0.1:8009 127.0.0.1:45091 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:40995 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:45090 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:39202 ESTABLISHED
tcp 514 0 127.0.0.1:8009 127.0.0.1:45089 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:39201 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:40284 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:39256 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:39254 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:43347 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:44110 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:39753 ESTABLISHED
tcp 298 0 127.0.0.1:8009 127.0.0.1:44616 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:44614 ESTABLISHED
tcp 447 0 127.0.0.1:8009 127.0.0.1:38782 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42877 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:38781 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:43894 ESTABLISHED
tcp 444 0 127.0.0.1:8009 127.0.0.1:40561 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:41584 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:40558 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:39533 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:41833 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:44645 ESTABLISHED
tcp 300 0 127.0.0.1:8009 127.0.0.1:44643 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:42337 ESTABLISHED
tcp 516 0 127.0.0.1:8009 127.0.0.1:44956 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:43675 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:43673 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42133 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:38292 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:42387 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42131 ESTABLISHED
tcp 444 0 127.0.0.1:8009 127.0.0.1:41361 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:38794 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:38793 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:38788 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:38786 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:44673 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:38784 ESTABLISHED
tcp 570 0 127.0.0.1:8009 127.0.0.1:34495 ESTABLISHED
tcp 448 0 127.0.0.1:8009 127.0.0.1:38330 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:43704 ESTABLISHED
tcp 457 0 127.0.0.1:8009 127.0.0.1:38328 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:41654 ESTABLISHED
tcp 447 0 127.0.0.1:8009 127.0.0.1:38836 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:39860 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42160 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:39344 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:36010 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:40358 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:41122 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:43426 ESTABLISHED
tcp 444 0 127.0.0.1:8009 127.0.0.1:40609 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:40608 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:40159 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:38110 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:39644 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:38362 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:43482 ESTABLISHED
tcp 445 0 127.0.0.1:8009 127.0.0.1:38872 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:42712 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:34518 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:38102 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:44240 ESTABLISHED
tcp 512 0 127.0.0.1:8009 127.0.0.1:37583 ESTABLISHED
tcp 182 0 127.0.0.1:8009 127.0.0.1:44238 ESTABLISHED
tcp 444 0 127.0.0.1:8009 127.0.0.1:38859 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45259 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45258 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45257 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:44745 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45256 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45255 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45254 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45253 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45252 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45251 ESTABLISHED
tcp 0 0 127.0.0.1:8009 127.0.0.1:45250 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:40701 ESTABLISHED
tcp 446 0 127.0.0.1:8009 127.0.0.1:40188 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:43760 ESTABLISHED
tcp 180 0 127.0.0.1:8009 127.0.0.1:34544 ESTABLISHED-
threads.html.zip 9.2 KB
-