1 Reply Latest reply on Feb 6, 2003 11:15 AM by greid

Huge memory usage by httpd

greid Feb 3, 2003 12:25 PM

We've been experiencing a very odd problem with Apache processes going rogue and taking up huge amounts of memory, as well as using large amounts of CPU.

Our setup is as follows:
We've got a RedHat 7.3 (kernel version 2.4.18-3) box running Apache 2.0.43 with mod_jk-2.0.42, sitting in front of several Win2K Server machines running JBoss3.0.0 with bundled Tomcat 4. The Linux box with Apache handles all incoming requests, running several Virtual Hosts for static content, and dispatching all webapp requests to the appropriate Win2k/JBoss/Tomcat server over AJP13. We've also had a Win2K server in the place of the Linux box and experienced the same problem.

Every so often (within every hour under heavy load, less often under a lighter load), one or more httpd processes will begin to take up more and more memory (hundreds of MB), at an alarmingly fast rate. This seems to burn up CPU as well, although that's less of a problem. Apart from these httpd processes going crazy, there do not seem to be many clues as to what is going on. The only way to 'fix' this so far has been killing the memory-eating process(es), or sending Apache a HUP.

As I say, there seem to be very few clues as to how this is occurring. Using netstat, it first appeared that there were a large number of connections in SYN_RECV state, but this does not seem to be very consistent (and I also understand that this is most likely due to IE's non-compliant behaviour). Nothing of interest comes through in /var/log/messages, or the httpd error log or mod_jk log.

I've tried setting MaxRequestsPerChild to a very low number (as low as 5) in httpd.conf, as well as bringing the MaxKeepAliveRequests and KeepAliveTimeout way down. These measures may have had an effect, but if they have, it has been very minor.

Up to this point, I have been unsuccessful in finding any more information about this problem anywhere in the forums, or in searches on the Internet. I can't imagine that we're the only ones to experience this problem, or that I'm missing something that is painfully obvious, but I can't seem to get any closer to the problem either. I'm also pretty sure that it is linked to mod_jk, because other instances of Apache (running on Windows, only serving static content, with a slightly lower load) are not having the same problems. If anyone has any information about this, I would be very thankful if you could share your wisdom. In any case, if I am able to find the problem, I will be sure to post it here immediately.

Gabriel

1. Re: Huge memory usage by httpd

greid Feb 6, 2003 11:15 AM (in response to greid)

Although I still don't know for certain that I've solved the problem from my previous post, I'm posting my findings here in the hope that they may be of some use to someone else.

I eventually found that there appears to be a problem with the tomcat side of the connection, in terms of following the AJP13 protocol. My understanding of the protocol is pretty much entirely based on the explanation at http://jakarta.apache.org/tomcat/tomcat-4.1-doc/jk2/common/AJPv13.html.

Based on the above-mentioned document, when the servlet container requests a body chunk, the packet it sends should be of the following form:

0x41 0x42 [ two bytes for the payload length ] 0x06 [ two bytes for the requested chunk size ]

The web server should respond to this request with a packet carrying the next chunk of the request body, unless there is no further data to send, in which case it should respond with

0x12 0x34 0x00 0x00

which basically says "There is no more data": a body packet with a payload length of zero.
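
To make the two packet layouts above concrete, here is a minimal Python sketch that splits a raw AJP13 packet into its header and payload. It is based only on my reading of the AJPv13 document linked above; the function name `parse_packet` and the dictionary keys are made up for illustration and are not part of mod_jk or Tomcat.

```python
import struct

# Magic bytes as described in the AJPv13 document referenced in this post.
CONTAINER_TO_SERVER = b"AB"          # 0x41 0x42
SERVER_TO_CONTAINER = b"\x12\x34"
GET_BODY_CHUNK = 0x06                # prefix code of a body-chunk request


def parse_packet(raw: bytes) -> dict:
    """Split one AJP13 packet into magic, declared length, and payload.

    Hypothetical helper for illustration only.
    """
    if len(raw) < 4:
        raise ValueError("an AJP13 packet needs at least a 4-byte header")
    magic = raw[:2]
    (length,) = struct.unpack(">H", raw[2:4])   # big-endian 16-bit length
    payload = raw[4:4 + length]
    if len(payload) != length:
        raise ValueError("packet is shorter than its declared length")

    info = {"magic": magic, "length": length, "payload": payload, "kind": "OTHER"}
    if magic == CONTAINER_TO_SERVER and length == 3 and payload[0] == GET_BODY_CHUNK:
        # Container asks the web server for more of the request body.
        info["kind"] = "GET_BODY_CHUNK"
        (info["requested_length"],) = struct.unpack(">H", payload[1:3])
    elif magic == SERVER_TO_CONTAINER and length == 0:
        # Web server answers: there is no more data.
        info["kind"] = "EMPTY_BODY"
    return info


request = parse_packet(bytes.fromhex("41 42 00 03 06 1f fa"))
reply = parse_packet(bytes.fromhex("12 34 00 00"))
print(request["kind"], request["requested_length"])
print(reply["kind"])
```

The first sample packet asks for 0x1ffa (8186) bytes, and the second is the empty "no more data" reply.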

The packet capture below shows this going on, over and over and over, for the process that builds up memory.
[pre]
# T 10.10.4.141:8009 -> 10.10.4.95:37321 [AP]
41 42 00 03 06 1f fa AB.....

# T 10.10.4.95:37321 -> 10.10.4.141:8009 [AP]
12 34 00 00 .4..

# T 10.10.4.141:8009 -> 10.10.4.95:37321 [AP]
41 42 00 03 06 1f fa AB.....

# T 10.10.4.95:37321 -> 10.10.4.141:8009 [AP]
12 34 00 00 .4..

# T 10.10.4.141:8009 -> 10.10.4.95:37321 [AP]
41 42 00 03 06 1f fa AB.....

# T 10.10.4.95:37321 -> 10.10.4.141:8009 [AP]
12 34 00 00 .4..
[/pre]
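
For anyone who wants to check their own captures for the same pattern, here is a rough Python sketch that counts how many times in a row a "give me a body chunk" request is answered with "no more data". It assumes the packets have already been extracted from the capture as raw bytes, in order, for a single connection; the function name `longest_empty_body_loop` is hypothetical. On my reading of the protocol, a run longer than 1 suggests the container kept asking after it had already been told the body was finished.

```python
# 'AB' magic, payload length 3, prefix code 6 (body-chunk request).
GET_BODY_CHUNK_PREFIX = bytes.fromhex("41 42 00 03 06")
# 0x12 0x34 magic with a zero payload length: "there is no more data".
EMPTY_BODY = bytes.fromhex("12 34 00 00")


def longest_empty_body_loop(packets):
    """Return the longest run of request -> empty-reply pairs.

    `packets` is an ordered list of raw AJP13 packets (bytes) from one
    connection. Hypothetical helper for illustration only.
    """
    longest = current = 0
    i = 0
    while i + 1 < len(packets):
        request, reply = packets[i], packets[i + 1]
        is_chunk_request = (
            len(request) == 7 and request.startswith(GET_BODY_CHUNK_PREFIX)
        )
        if is_chunk_request and reply == EMPTY_BODY:
            current += 1
            longest = max(longest, current)
            i += 2          # skip past the matched pair
        else:
            current = 0     # pattern broken, start counting again
            i += 1
    return longest


# The six packets from the capture above.
capture = [bytes.fromhex(h) for h in ["41 42 00 03 06 1f fa", "12 34 00 00"] * 3]
print(longest_empty_body_loop(capture))
```

This only looks at the two packet shapes seen in the capture above; it does not try to decode any other AJP13 message types.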

It seems to me that the catalina connector (org.apache.ajp.tomcat4.Ajp13Connector) is just not properly following the AJP13 protocol, although I don't want to start pointing fingers. If anyone can shed any light on this, I would appreciate it.

In any case, I've upgraded to JK2 on the Apache side, and to the combination of CoyoteConnector and JkCoyoteHandler on the Tomcat/JBoss side, so hopefully this will resolve the problem.