25 Replies Latest reply on May 15, 2008 10:42 AM by timfox

Optimisations: A couple of low hanging fruits going for grab

timfox May 14, 2008 8:58 AM

There are a couple of areas for valuable optimisation if anyone would like to take them:

1)

One is that generating JMS message ids before sending can slow things down significantly (maybe 5%??).

I noticed that if I set producer.setDisableMessageID(true) on the producer before sending, performance increases significantly.

I am sending 1k non-persistent bytes messages from a client on my laptop to a server on another machine over gigabit LAN and I can get about 45K msgs / sec :), but this drops to about 40K or less with disableMessageID set to false.

I suspect this is due to the following in JBossMessageProducer:

String id = UUID.randomUUID().toString();
bm.setJMSMessageID("ID:" + id);

which involves generation and two (?) string copies.

What we need is a way of generating the id without doing String copies - you could take the same UUID algorithm and apply it directly to a SimpleString instance (with the id already included at the beginning).
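
A minimal sketch of that idea, for illustration only: it formats the "ID:" prefix and the canonical UUID text into a single pre-sized char[] in one pass, so no intermediate String is created. The class and method names (MessageIdSketch, formatId) are hypothetical, and a plain char[] stands in for SimpleString, whose API is not shown in this thread.

```java
import java.util.UUID;

final class MessageIdSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Writes "ID:" followed by the 36-char UUID text (8-4-4-4-12) into one array. */
    static char[] formatId(long msb, long lsb) {
        char[] out = new char[3 + 36];
        out[0] = 'I';
        out[1] = 'D';
        out[2] = ':';
        int pos = 3;
        pos = hex(out, pos, msb >>> 32, 8);
        out[pos++] = '-';
        pos = hex(out, pos, msb >>> 16, 4);
        out[pos++] = '-';
        pos = hex(out, pos, msb, 4);
        out[pos++] = '-';
        pos = hex(out, pos, lsb >>> 48, 4);
        out[pos++] = '-';
        hex(out, pos, lsb, 12);
        return out;
    }

    /** Writes the low `digits` hex digits of value at out[pos..], returns the new position. */
    private static int hex(char[] out, int pos, long value, int digits) {
        for (int i = digits - 1; i >= 0; i--) {
            out[pos + i] = HEX[(int) (value & 0xF)];
            value >>>= 4;
        }
        return pos + digits;
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        char[] id = formatId(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits());
        // Should print the same text as "ID:" + uuid
        System.out.println(new String(id));
        System.out.println("ID:" + uuid);
    }
}
```

This only removes the string copies; it does not address the cost of generating the UUID bits themselves.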

2) Text messages. Currently text message bodies are encoded using UTF-8 with the encoding methods on the MINA IoBuffer class. This is slow (as mentioned in another thread).

I'm sure there is great scope to speed this up.

1. Re: Optimisations: A couple of low hanging fruits going for

jmesnil May 14, 2008 11:33 AM (in response to timfox)
"timfox" wrote:
There are a couple of areas for valuable optimisation if anyone would like to take them:

1)

One is that generating JMS message ids before sending can slow things down significantly (maybe 5%??).

I noticed that if I set producer.setDisableMessageID(true) on the producer before sending, performance increases significantly.

I am sending 1k non-persistent bytes messages from a client on my laptop to a server on another machine over gigabit LAN and I can get about 45K msgs / sec :), but this drops to about 40K or less with disableMessageID set to false.

I suspect this is due to the following in JBossMessageProducer:

String id = UUID.randomUUID().toString();
bm.setJMSMessageID("ID:" + id);

which involves generation and two (?) string copies.

It seems the bottleneck is the generation of the UUID.

When using UUID.randomUUID(), the implementation uses a SecureRandom.
If I use Random instead and create the UUID with new UUID(rand.nextLong(), rand.nextLong()), performance increases significantly.

I've isolated this in the following tests:

public void testRandomUUID() throws Exception {
    long start = System.currentTimeMillis();
    for (int i = 0; i < MANY_TIMES; i++) {
        UUID uuid = UUID.randomUUID();
        String id = "ID:" + uuid;
    }
    long duration = System.currentTimeMillis() - start;
    System.out.println(getName() + ": " + duration);
}

public void testSecureRandom() throws Exception {
    doManyRandomLongs(new SecureRandom());
}

public void testRandom() throws Exception {
    doManyRandomLongs(new Random());
}

public void doManyRandomLongs(Random rand) {
    long start = System.currentTimeMillis();
    for (int i = 0; i < MANY_TIMES; i++) {
        UUID uuid = new UUID(rand.nextLong(), rand.nextLong());
        String id = "ID:" + uuid;
    }
    long duration = System.currentTimeMillis() - start;
    System.out.println(getName() + ": " + duration);
}

When running each loop 1 million times, I get (durations in ms):
testRandomUUID: 18625
testSecureRandom: 19430
testRandom: 5059

=> using Random.nextLong() instead of UUID.randomUUID(), we go from about 18 s to about 5 s
2. Re: Optimisations: A couple of low hanging fruits going for

trustin May 14, 2008 11:41 AM (in response to timfox)

"timfox" wrote:
2) Text messages. Currently text message bodies are encoded using UTF-8 with the encoding methods on the MINA IoBuffer class. This is slow (as mentioned in another thread).

One of my friends told me just writing a byte array generated by String.getBytes(enc) performs better. YMMV though. I'd prefer David's suggestion to use UTF-16 although it's not so efficient for ASCII strings.
3. Re: Optimisations: A couple of low hanging fruits going for

timfox May 14, 2008 11:55 AM (in response to timfox)

"jmesnil" wrote:

It seems the bottleneck is the generation of the UUID.

When using UUID.randomUUID(), the implementation uses a SecureRandom.
If I use Random instead and create the UUID with new UUID(rand.nextLong(), rand.nextLong()), performance increases significantly.

I don't think we should be using random UUIDs at all, since they're... well random (so can clash).

Instead I think we should use a variant 2 UUID.
4. Re: Optimisations: A couple of low hanging fruits going for

timfox May 14, 2008 11:57 AM (in response to timfox)

"trustin" wrote:
I'd prefer David's suggestion to use UTF-16 although it's not so efficient for ASCII strings.

I'm not really sure I understand why UTF-16 is going to be more performant than UTF-8.
5. Re: Optimisations: A couple of low hanging fruits going for

timfox May 14, 2008 12:00 PM (in response to timfox)

"timfox" wrote:

Instead I think we should use a variant 2 UUID.

http://www.webdav.org/specs/draft-leach-uuids-guids-01.txt
6. Re: Optimisations: A couple of low hanging fruits going for

trustin May 14, 2008 12:09 PM (in response to timfox)

"timfox" wrote:
"trustin" wrote:
I'd prefer David's suggestion to use UTF-16 although it's not so efficient for ASCII strings.

I'm not really sure I understand why UTF-16 is going to be more performant than UTF-8.

It's because it's as simple as writing a series of short integers? It should look like this for example:

for (int i = 0; i < str.length(); i ++) {
buf.putChar(str.charAt(i));
}
7. Re: Optimisations: A couple of low hanging fruits going for

trustin May 14, 2008 12:11 PM (in response to timfox)

"timfox" wrote:
"timfox" wrote:

Instead I think we should use a variant 2 UUID.

http://www.webdav.org/specs/draft-leach-uuids-guids-01.txt

This might also be useful:

http://jug.safehaus.org/
8. Re: Optimisations: A couple of low hanging fruits going for

timfox May 14, 2008 12:30 PM (in response to timfox)

"trustin" wrote:

This might also be useful:

http://jug.safehaus.org/

Looks interesting.
9. Re: Optimisations: A couple of low hanging fruits going for

timfox May 14, 2008 12:32 PM (in response to timfox)

"trustin" wrote:

for (int i = 0; i < str.length(); i ++) {
buf.putChar(str.charAt(i));
}

Well... UTF-8 is just putting a sequence of bytes generated from bitwise operations on the chars? So should be fast right?

Maybe the overhead is due to the creation of the encoder class etc, I don't really know I haven't profiled it yet...

Maybe we should just write our own encoding that writes directly onto a SimpleString...
10. Re: Optimisations: A couple of low hanging fruits going for

dmlloyd May 14, 2008 12:40 PM (in response to timfox)

UTF-8 is a little more complex than that. Chars 0-0x7F are represented by one byte, 0-0x7F. Beyond that characters are represented with 2 to 4 bytes. This means that for every character there are multiple comparisons and shifts performed, with some extra bits being set or cleared for certain characters.

UTF-16 on the other hand, being the native encoding for Java, is written one char at a time without transcoding - no shifts, no comparisons, no bitmasks. It's just a straight write of chars. You can't possibly do better than that in terms of processing speed.
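
To make the comparison concrete, here is a rough sketch of a hand-rolled UTF-8 encoder showing the per-char comparisons, shifts, and masks described above. It is not the MINA IoBuffer implementation; the names Utf8Sketch and putUtf8 are made up for illustration, and the code assumes the input contains no unpaired surrogates.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8Sketch {
    /** Encodes s as UTF-8 into buf. Assumes s has no unpaired surrogates. */
    static void putUtf8(ByteBuffer buf, String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                // 1 byte: 0xxxxxxx
                buf.put((byte) c);
            } else if (c < 0x800) {
                // 2 bytes: 110xxxxx 10xxxxxx
                buf.put((byte) (0xC0 | (c >> 6)));
                buf.put((byte) (0x80 | (c & 0x3F)));
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // 4 bytes: a surrogate pair is combined into one code point first
                int cp = Character.toCodePoint(c, s.charAt(++i));
                buf.put((byte) (0xF0 | (cp >> 18)));
                buf.put((byte) (0x80 | ((cp >> 12) & 0x3F)));
                buf.put((byte) (0x80 | ((cp >> 6) & 0x3F)));
                buf.put((byte) (0x80 | (cp & 0x3F)));
            } else {
                // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                buf.put((byte) (0xE0 | (c >> 12)));
                buf.put((byte) (0x80 | ((c >> 6) & 0x3F)));
                buf.put((byte) (0x80 | (c & 0x3F)));
            }
        }
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo \u20ac";
        ByteBuffer buf = ByteBuffer.allocate(64);
        putUtf8(buf, s);
        byte[] mine = Arrays.copyOf(buf.array(), buf.position());
        // Compare against the JDK's own encoder
        System.out.println(Arrays.equals(mine, s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Whether these branches are a measurable cost next to encoder setup is exactly the open question in this thread; only profiling would settle it.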
11. Re: Optimisations: A couple of low hanging fruits going for

trustin May 14, 2008 12:44 PM (in response to timfox)

"timfox" wrote:
"trustin" wrote:

for (int i = 0; i < str.length(); i ++) {
buf.putChar(str.charAt(i));
}

Well... UTF-8 is just putting a sequence of bytes generated from bitwise operations on the chars? So should be fast right?

Maybe the overhead is due to the creation of the encoder class etc, I don't really know I haven't profiled it yet...

Maybe we should just write our own encoding that writes directly onto a SimpleString...

Yes, creating a CharsetEncoder takes a lot of time, so it is usually cached somewhere. String.getBytes() uses a ThreadLocal for this purpose, but if you already have some context object that stores the decoding/encoding state, then you can just add a field there.
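
As a sketch of the "field on a context object" option, assuming one context per connection or thread: the EncodingContext class and its putString method are hypothetical names, not part of MINA or JBoss Messaging. Only standard java.nio.charset calls are used.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

final class EncodingContext {
    // Created once and reused. CharsetEncoder is not thread-safe,
    // so a context must not be shared between threads.
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();

    /** Encodes s as UTF-8 into out, reusing the cached encoder. */
    void putString(ByteBuffer out, String s) {
        encoder.reset(); // clear state left over from the previous call
        CoderResult result = encoder.encode(CharBuffer.wrap(s), out, true);
        if (result.isUnderflow()) {
            result = encoder.flush(out);
        }
        if (!result.isUnderflow()) {
            try {
                // Overflow surfaces as BufferOverflowException, bad input as CharacterCodingException
                result.throwException();
            } catch (CharacterCodingException e) {
                throw new IllegalArgumentException("Cannot encode string", e);
            }
        }
    }

    public static void main(String[] args) {
        EncodingContext ctx = new EncodingContext();
        ByteBuffer out = ByteBuffer.allocate(64);
        ctx.putString(out, "hello");
        ctx.putString(out, " world");
        System.out.println(out.position()); // number of bytes written so far
    }
}
```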
12. Re: Optimisations: A couple of low hanging fruits going for

timfox May 14, 2008 1:02 PM (in response to timfox)

"david.lloyd@jboss.com" wrote:

UTF-16 on the other hand, being the native encoding for Java, is written one char at a time without transcoding - no shifts, no comparisons, no bitmasks. It's just a straight write of chars. You can't possibly do better than that in terms of processing speed.

Even for UTF-16, I believe characters in the high planes of Unicode have to be encoded using pairs of chars, so it's not quite a simple write of chars, but pretty close.

But in any case, simple comparisons and bitwise operations are fast. I doubt this is the reason why the encoding is slow.

My bet, as mentioned before is there is some other overhead due to the setup of the codec classes.
13. Re: Optimisations: A couple of low hanging fruits going for

dmlloyd May 14, 2008 1:08 PM (in response to timfox)

Incorrect, Tim. A char holds 16 bits - high-plane code points are *already* split into surrogate pairs in a char[] such as used to back a String. A char represents a UTF-16 code unit, not a Unicode code point.

So again, just write out the chars and you've got UTF-16.
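
A small check of that claim, as a sketch (the class name Utf16Sketch is made up): a String containing a code point outside the BMP already holds it as two chars, and writing every char with ByteBuffer.putChar (big-endian by default) yields the same bytes as the JDK's UTF-16BE encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf16Sketch {
    /** Writes each char of s as two bytes, with no transcoding. */
    static byte[] writeChars(String s) {
        ByteBuffer buf = ByteBuffer.allocate(s.length() * 2); // big-endian by default
        for (int i = 0; i < s.length(); i++) {
            buf.putChar(s.charAt(i));
        }
        return buf.array();
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so it is stored as a surrogate pair
        String s = "a" + new String(Character.toChars(0x1F600));
        System.out.println(s.length());                      // 3 chars
        System.out.println(s.codePointCount(0, s.length())); // 2 code points
        System.out.println(Arrays.equals(writeChars(s),
                s.getBytes(StandardCharsets.UTF_16BE)));     // true
    }
}
```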
14. Re: Optimisations: A couple of low hanging fruits going for

timfox May 14, 2008 1:43 PM (in response to timfox)

"david.lloyd@jboss.com" wrote:
Incorrect, Tim. A char holds 16 bits - high-plane code points are *already* split into surrogate pairs in a char[] such as used to back a String.

This is probably all moot since we're talking about bitwise operations here which are going to be fast :)