Could you try 1.1.3? It should bring some improvements.
Just tried with a fresh build of 1.1.3, but unfortunately the results are very similar. I've cooked up a patch on top of 1.1.0.Final (attached) that seems to improve things by caching host and context entries on the heap. I am seeing up to 10x improvement in certain cases. Could you please take a look and see whether the approach makes sense and whether it would be worth porting to 1.1.3? (The patch does not apply to 1.1.3 directly because of the significant rework in mod_proxy_cluster.c.)
I have created a JIRA and I will integrate the patch in the next version. Many thanks.
I recently upgraded to 1.2.0.Final with the patch included, and found that lookup performance is still not where I expected: with 200 instances I still see over 90 ms of latency per HTTP transaction on average.
Looking at mod_proxy_cluster.c in 1.2.0.Final, I noticed that worker lookup now touches "node" structures within a tight loop, and those are still read from shared memory. I took the liberty of mirroring the caching approach already applied to "context" and "host", which brought the latency down to about 10 ms in that test. More importantly for scalability, it became much less dependent on the table size.
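For anyone following along, the idea can be sketched as follows: snapshot the shared-memory node table into ordinary heap memory once per request, so the tight worker-selection loop only touches local memory instead of crossing into the shm segment on every iteration. This is a minimal, self-contained illustration with hypothetical names (node_entry, read_node_table); the real nodeinfo_t and table handling in mod_proxy_cluster are considerably more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node record; the real structure in mod_proxy_cluster
 * carries many more fields and lives in shared memory. */
typedef struct {
    int  id;
    char host[64];
} node_entry;

/* Simulated shared-memory table. In the module, each read of an entry
 * goes to the shm segment, which is what makes per-request lookups
 * expensive when done in a tight loop over many nodes. */
static node_entry shm_nodes[200];
static int shm_node_count = 0;

/* Copy the shared table into a heap-allocated snapshot once per
 * request; subsequent lookups in the worker-selection loop then hit
 * only local memory. Caller frees the returned table. */
static node_entry *read_node_table(int *count)
{
    node_entry *copy = malloc(sizeof(node_entry) * shm_node_count);
    if (copy != NULL)
        memcpy(copy, shm_nodes, sizeof(node_entry) * shm_node_count);
    *count = shm_node_count;
    return copy;
}
```

With the snapshot in hand, the per-transaction cost of scanning the node table stops depending on shared-memory access latency, which matches the observation that latency became much less sensitive to the number of instances.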
I was wondering if you could consider adding the extra caching (proxy_node_table/read_node_table) upstream in mod_cluster for future releases?
Adding extra caching for the node looks like a good idea. Create a JIRA and submit a patch ;-)
The initial JIRA was MODCLUSTER-252.