
Journal threads blocking on completion means it doesn't scale

timfox Nov 17, 2009 11:22 AM

In theory, the timed buffer using NIO can saturate the disk's write throughput even when the disk write cache is enabled. (I have tested this by tuning the timed buffer size.)

However, there is a problem. Say you have many connections, each doing blocking sends of messages. Currently each thread adds its record to the journal and then blocks waiting for completion.

This blocking simply doesn't scale. We will never get anywhere near disk write throughput by blocking like this.

(In fact we can get to a max of about 2 MiB/s by blocking, as opposed to the max of 24 MiB/s, i.e. we're limited to about 8% of theoretical performance.)

Instead of blocking, the way it should be implemented is as follows:

We append the record to the journal and, instead of blocking until it completes, register a Runnable with the completion. When the completion occurs, it executes the Runnable.

The runnable then sends the response packet back to the client, e.g.

new Runnable()
{
   public void run()
   {
      // Runs once the journal write has completed
      packet.confirm();
      channel.send(response);
   }
}

This is similar to what we currently do in replication.
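
To make the idea concrete, here is a rough, self-contained sketch of the non-blocking pattern. It is not the actual journal code: SketchJournal, its append(record, onComplete) method and sendAll are hypothetical names made up for illustration, and the disk sync is only simulated by a single writer thread. The point is that the caller never waits on an individual record; the response is sent from the completion callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class JournalCallbackSketch {

    /** Hypothetical journal: one writer thread stands in for the timed buffer and disk sync. */
    static final class SketchJournal {
        private final ExecutorService writer = Executors.newSingleThreadExecutor();

        /** Queues the record and returns immediately; onComplete runs after the simulated sync. */
        void append(byte[] record, Runnable onComplete) {
            writer.execute(() -> {
                // A real journal would write the record and sync to disk here (assumption).
                onComplete.run();
            });
        }

        void close() {
            writer.shutdown();
        }
    }

    /** Appends count records without blocking per record; returns responses in completion order. */
    static List<String> sendAll(int count) {
        SketchJournal journal = new SketchJournal();
        List<String> responses = new ArrayList<>(); // only touched by the writer thread until the latch opens
        CountDownLatch done = new CountDownLatch(count);
        for (int i = 0; i < count; i++) {
            final int id = i;
            // Stand-in for packet.confirm() followed by channel.send(response)
            journal.append(new byte[] {(byte) id}, () -> {
                responses.add("response-" + id);
                done.countDown();
            });
        }
        try {
            // The sketch waits once at the very end, only so it can report the results.
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("completions did not arrive in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            journal.close();
        }
        return responses;
    }

    public static void main(String[] args) {
        System.out.println("responses sent: " + sendAll(1000).size());
    }
}
```

In this sketch the loop in sendAll plays the role of the connection threads: each append returns straight away, so the writer is free to batch many records per sync instead of being paced by one blocked thread per record.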

Moral of the story: we shouldn't be blocking in the messaging server. Blocking is bad and stops it scaling.