Performance (CPU Utilization costs, for remote calls across
bainwohl Dec 3, 2009 3:43 PM
I ran a simple experiment to measure the performance impact of using a remote session versus a local session within a JBoss cluster. More specifically, I was looking at CPU utilization. It involved the following components:
IEchoServer, IEchoServerLocal, IEchoServerRemote interfaces
- the IEchoServer interface has one method
- MyMessage echo(MyMessage)
- IEchoServerLocal and IEchoServerRemote both extend the IEchoServer Interface, with a @Local and @Remote annotation, respectively
EchoServer
- implementation of the IEchoServerLocal and IEchoServerRemote interfaces with a @Stateless annotation
- one-line implementation of the "MyMessage echo(MyMessage msg)" method: return msg;
Message Producer MBean:
- this MBean is capable of producing messages at a specified rate
- the messages created are instances of MyMessage, a POJO with a number of properties with getters and setters
- the message also contains a byte[] payload
- the message producer can be configured to generate messages with a text payload or a random binary payload, of a specified length
- a custom handler is used to consume the generated messages
- for the purposes of the test, the custom handler calls out to the IEchoServer interface
- it can be configured to call out to the local session or the remote session running on a different box (it is passed the HA-JNDI binding name).
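To make the producer side concrete, here is a minimal sketch of a fixed-rate message source in plain Java. It is not the actual MBean: the class name `RateLimitedProducer`, the `runFor` helper and the `'x'`-filled text payload are all hypothetical, and the handler is just a `Consumer<byte[]>` standing in for the custom handler that calls `echo`.

```java
import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical stand-in for the producer MBean; not the actual test code.
final class RateLimitedProducer {

    /** Builds a text payload of the requested length (ASCII 'x' repeated). */
    static byte[] textPayload(int length) {
        byte[] payload = new byte[length];
        Arrays.fill(payload, (byte) 'x');
        return payload;
    }

    /**
     * Hands payloads to the handler at ratePerSec for roughly durationMillis,
     * then stops and returns how many messages were handed over.
     */
    static long runFor(int ratePerSec, int payloadBytes, long durationMillis,
                       Consumer<byte[]> handler) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicLong produced = new AtomicLong();
        long periodNanos = TimeUnit.SECONDS.toNanos(1) / ratePerSec;
        timer.scheduleAtFixedRate(() -> {
            try {
                handler.accept(textPayload(payloadBytes));
                produced.incrementAndGet();
            } catch (RuntimeException e) {
                // Swallowed so that one failed call does not cancel the schedule.
            }
        }, 0, periodNanos, TimeUnit.NANOSECONDS);
        try {
            Thread.sleep(durationMillis);
            timer.shutdown();
            timer.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            timer.shutdownNow();
        }
        return produced.get();
    }

    public static void main(String[] args) {
        // 400 msg/sec with a 1000-byte payload, as in the test cases.
        long sent = runFor(400, 1000, 1000, payload -> { /* call echo(...) here */ });
        System.out.println("handed over " + sent + " messages in about one second");
    }
}
```

With a single timer thread, a handler that takes longer than the period (2.5 ms at 400 msg/sec) will pull the effective rate below the target, which is worth keeping in mind when the handler makes a blocking remote call.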
We have two test machines, H1 and H2, each running JBoss 5.0.1.GA. Both machines are quad-core, so all CPU utilization numbers are out of 400%.
Deployed on each machine is the following:
- MessageProducer MBean
- EchoServer SLSB (both local and remote)
On both H1 and H2:
- the IEchoServerLocal implementation is bound as EchoServer/local
On H1:
- the IEchoServerRemote implementation is bound as EchoServer_H1
On H2:
- the IEchoServerRemote implementation is bound as EchoServer_H2
Very simple stuff; for each of the following test cases I had the message producer push messages at a specified rate. Each message has a text payload of 1000 bytes.
In the remote test, messages produced on H1 call out to "EchoServer_H2" and messages produced on H2 call out to "EchoServer_H1".
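For readers unfamiliar with the lookup step, the following is a rough sketch of resolving a remote binding such as "EchoServer_H2" through JNDI. It assumes the usual JBoss 5 JNP client settings (`org.jnp.interfaces.NamingContextFactory`); the class name `HaJndiLookup` and the provider URL `h2:1100` are placeholders, not values from the test setup, and the `lookup` method only works with the JBoss client jars on the classpath.

```java
import java.util.Properties;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Hypothetical helper; host, port and class name are illustrative assumptions.
final class HaJndiLookup {

    /** Builds a JNDI environment for the JBoss JNP naming client. */
    static Properties environment(String providerUrl) {
        Properties env = new Properties();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory");
        env.put(Context.URL_PKG_PREFIXES, "org.jboss.naming:org.jnp.interfaces");
        env.put(Context.PROVIDER_URL, providerUrl);
        return env;
    }

    /** Resolves a binding name; needs the JBoss client libraries at runtime. */
    static Object lookup(String providerUrl, String bindingName) throws NamingException {
        Context ctx = new InitialContext(environment(providerUrl));
        try {
            return ctx.lookup(bindingName);
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) {
        // Prints the environment only; no server connection is attempted here.
        System.out.println(environment("h2:1100"));
    }
}
```

If a handler were to repeat this lookup for every message instead of caching the returned proxy, that alone would add per-call work, so it is one thing to rule out when comparing local and remote numbers.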
Test Case One:
H1: MessageProducer, generating MyMessage instances @ 400msg/sec, with a 1000byte payload, calling out to the local EchoServer SLSB
H2: does nothing
Results: H1_CPU_Utilization=2.5%, H2_CPU_Utilization=0%
Test Case Two:
H1: MessageProducer, generating MyMessage instances @ 400msg/sec, with a 1000byte payload, calling out to the remote EchoServer SLSB running on H2
H2: accepts calls to the EchoServer SLSB, from H1
Results: H1_CPU_Utilization=34%, H2_CPU_Utilization=32%
THIS IS NOT A MISTAKE!
Test Case Three:
H1: MessageProducer, generating MyMessage instances @ 200msg/sec, with a 1000byte payload, calling out to the remote EchoServer SLSB running on H2 AND accepts calls to the EchoServer SLSB from H2
H2: MessageProducer, generating MyMessage instances @ 200msg/sec, with a 1000byte payload, calling out to the remote EchoServer SLSB running on H1 AND accepts calls to the EchoServer SLSB from H1
Results: H1_CPU_Utilization=35%, H2_CPU_Utilization=35%
Test Case Four:
H1: MessageProducer, generating MyMessage instances @ 400msg/sec, with a 1000byte payload, calling out to the remote EchoServer SLSB running on H2 AND accepts calls to the EchoServer SLSB from H2
H2: MessageProducer, generating MyMessage instances @ 400msg/sec, with a 1000byte payload, calling out to the remote EchoServer SLSB running on H1 AND accepts calls to the EchoServer SLSB from H1
Results: H1_CPU_Utilization=66%, H2_CPU_Utilization=70%
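One way to compare the four cases is to convert them to CPU time per call. The sketch below sums the utilization of both hosts (where 100% means one fully busy core) and divides by the total call rate across both hosts. The class and method names are made up for illustration, and the figures include whatever the producer and the rest of the server were doing, so treat the result as an upper bound on the cost of the call itself.

```java
// Back-of-the-envelope arithmetic on the reported results; names are illustrative.
final class CpuPerCall {

    /**
     * CPU milliseconds spent per call, given the utilization summed over both
     * hosts (100 = one core fully busy) and the total calls per second.
     */
    static double cpuMillisPerCall(double totalUtilizationPercent, double callsPerSecond) {
        return totalUtilizationPercent / 100.0 * 1000.0 / callsPerSecond;
    }

    public static void main(String[] args) {
        System.out.println("case 1 (local):  " + cpuMillisPerCall(2.5 + 0.0, 400));   // about 0.0625 ms
        System.out.println("case 2 (remote): " + cpuMillisPerCall(34.0 + 32.0, 400)); // about 1.65 ms
        System.out.println("case 3 (remote): " + cpuMillisPerCall(35.0 + 35.0, 400)); // about 1.75 ms
        System.out.println("case 4 (remote): " + cpuMillisPerCall(66.0 + 70.0, 800)); // about 1.70 ms
    }
}
```

Read this way, the three remote cases agree on roughly 1.7 ms of CPU per remote call, split about evenly between caller and callee, against about 0.06 ms for a local call. The cost appears to scale linearly with the call rate rather than being a fixed overhead.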
Of course, the round-trip latency increased by an order of magnitude when comparing local to remote calls; however, this was expected.
I can see via netstat that TCP/IP connections are coming and going during tests 2, 3 and 4.
I'm wondering where all the CPU is going. Is it used opening and closing socket connections or is it the cost of serialization and writing/reading to and from the socket?
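A quick way to estimate the serialization share in isolation is to time an in-memory round trip of a comparable object. The sketch below uses plain `java.io` serialization, which may differ from the marshaller the server actually uses, and `Msg` is a hypothetical stand-in for MyMessage with made-up fields. It is a crude timing loop, not a proper benchmark harness.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Crude estimate of serialization cost; Msg is a stand-in, not the real MyMessage.
final class SerializationCost {

    static final class Msg implements Serializable {
        private static final long serialVersionUID = 1L;
        String id;
        long timestamp;
        byte[] payload;
    }

    static Msg newMsg(int payloadBytes) {
        Msg m = new Msg();
        m.id = "msg-1";
        m.timestamp = System.currentTimeMillis();
        m.payload = new byte[payloadBytes];
        Arrays.fill(m.payload, (byte) 'x');
        return m;
    }

    /** Serializes the message to a byte array and reads it back. */
    static Msg roundTrip(Msg m) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(m);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Msg) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    /** Average wall-clock microseconds per serialize-plus-deserialize pass. */
    static double microsPerRoundTrip(int iterations, int payloadBytes) {
        Msg m = newMsg(payloadBytes);
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            roundTrip(m);
        }
        return (System.nanoTime() - start) / 1000.0 / iterations;
    }

    public static void main(String[] args) {
        microsPerRoundTrip(20_000, 1000); // warm-up so the JIT has compiled the hot path
        System.out.println("micros per round trip: " + microsPerRoundTrip(100_000, 1000));
    }
}
```

An echo call marshals the message once in each direction, so one remote call costs about two of these passes in total across the two hosts. If the measured figure is far below the per-call CPU seen in the remote tests, connection handling becomes the more likely suspect.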
Presumably, I'm already using JBoss Remoting over sockets.
The question is, is this as good as it gets? Can I tweak the configuration to improve this? Any help would be greatly appreciated.
Cheers.