9 Replies Latest reply on Oct 8, 2008 11:17 PM by clebert.suconic

    Some thoughts on large messages and message chunking

    timfox

      I've been thinking this morning about what we're going to do about large messages and message chunking, as this seems to have become a sticking point.

      I'm conscious that the way we were trying to tackle the problem was perhaps too complex and trying to be "too clever". In large part this was probably my fault, not Andy's.

      To get us moving I propose a solution that is simpler, and will not affect in any way (e.g. performance) the primary use case - that of non-large messages.

      We can define a minimum large message size in bytes, call it ml, above which any message is defined as large.

      When sending a message, we simply check whether the message size is greater than ml. If not, message sending and delivery proceed exactly as normal. If so, we send a standard SendMessage packet, followed by n MessageContinuation packets which contain the rest of the message, split across those packets.
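
      A rough sketch of the send-side check, just to make the idea concrete. The class and field names here (LargeMessageSplitter, Packet) are hypothetical, and it assumes each packet carries at most ml bytes, which is only one possible choice of chunk size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: split a message body into an initial packet plus continuations. */
final class LargeMessageSplitter {

    /** One wire packet; the first one plays the role of SendMessage. */
    static final class Packet {
        final boolean continuation;     // false for the initial SendMessage packet
        final boolean hasContinuations; // only meaningful on the initial packet
        final byte[] body;

        Packet(boolean continuation, boolean hasContinuations, byte[] body) {
            this.continuation = continuation;
            this.hasContinuations = hasContinuations;
            this.body = body;
        }
    }

    /** Splits a message using ml as both the "large" threshold and the chunk size (an assumption). */
    static List<Packet> split(byte[] message, int ml) {
        if (ml <= 0) {
            throw new IllegalArgumentException("ml must be positive");
        }
        List<Packet> packets = new ArrayList<Packet>();
        if (message.length <= ml) {
            // Non-large message: a single packet, no extra copying.
            packets.add(new Packet(false, false, message));
            return packets;
        }
        // Large message: the first ml bytes go in the SendMessage packet...
        packets.add(new Packet(false, true, Arrays.copyOfRange(message, 0, ml)));
        // ...and the remainder is copied into continuation packets.
        for (int offset = ml; offset < message.length; offset += ml) {
            int end = Math.min(offset + ml, message.length);
            packets.add(new Packet(true, false, Arrays.copyOfRange(message, offset, end)));
        }
        return packets;
    }
}
```

      With ml = 16, a 40 byte message would go out as one SendMessage packet plus two continuations (16 + 16 + 8 bytes), while a 10 byte message goes out unchanged.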

      This will involve some extra copying, but will only occur for the large message case, where high performance is probably not expected so much anyway. The user can also configure the value of ml for their system, so if they know they always send 16K messages, they can tune it appropriately to minimise copying.

      In the SendMessage header we write an extra field "hasContinuations". If true it means the message is large and has extra continuation packets to follow.

      Continuation packets can be interleaved on the remoting connection with other packets to prevent head-of-line blocking issues.

      When the server receives a SendMessage it checks the hasContinuations field. If true then the message is large and won't be stored in the journal. Instead the message and its continuations will be stored in a file on disk, and a "pointer" to that file will be stored in the journal.
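
      Something along these lines on the server side, as a sketch only. LargeMessageStore is a made-up name, the journal is stood in for by a plain list of strings, and keying continuations by message id is an assumption needed so that interleaved continuations find the right file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: large message bodies go to a file, the journal keeps a pointer. */
final class LargeMessageStore {

    // Stand-in for the real journal: one record per message.
    final List<String> journal = new ArrayList<String>();

    // Files of large messages that may still receive continuations.
    private final Map<Long, Path> openFiles = new HashMap<Long, Path>();
    private final Path directory;

    LargeMessageStore(Path directory) {
        this.directory = directory;
    }

    /** Handles the initial SendMessage packet. */
    void onSendMessage(long messageId, boolean hasContinuations, byte[] body) throws IOException {
        if (!hasContinuations) {
            // Normal path: the message itself would be journalled.
            journal.add("inline:" + messageId + ":" + body.length);
            return;
        }
        // Large path: write the first part to a file and journal only a pointer.
        Path file = directory.resolve("large-" + messageId + ".msg");
        Files.write(file, body);
        openFiles.put(messageId, file);
        journal.add("pointer:" + messageId + ":" + file.getFileName());
    }

    /** Appends one continuation to the file of the message it belongs to. */
    void onContinuation(long messageId, byte[] body) throws IOException {
        Path file = openFiles.get(messageId);
        if (file == null) {
            throw new IllegalStateException("no large message open for id " + messageId);
        }
        Files.write(file, body, StandardOpenOption.APPEND);
        // Detecting the final continuation and closing the entry is left out of this sketch.
    }

    /** Returns the file backing a large message, or null if none is open. */
    Path fileFor(long messageId) {
        return openFiles.get(messageId);
    }
}
```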

      When a large message is delivered, again, it is delivered as a set of packets rather than one, and the consumer can re-assemble the parts into a single message.
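
      Re-assembly on the consumer could look roughly like the following. This is a sketch under assumptions: LargeMessageAssembler is a hypothetical name, and the "last" flag on a continuation is invented here because the proposal does not yet say how the consumer learns that the final part has arrived.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: rebuild a large message from its packets on the consumer. */
final class LargeMessageAssembler {

    // Partially received large messages, keyed by message id so that
    // continuations of different messages can be interleaved.
    private final Map<Long, ByteArrayOutputStream> partial =
            new HashMap<Long, ByteArrayOutputStream>();

    /** Returns the complete body for a non-large message, or null if more parts will follow. */
    byte[] onSendMessage(long messageId, boolean hasContinuations, byte[] body) {
        if (!hasContinuations) {
            return body; // normal message, nothing to assemble
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(body, 0, body.length);
        partial.put(messageId, buffer);
        return null;
    }

    /** Returns the complete body once the last continuation arrives, else null. */
    byte[] onContinuation(long messageId, boolean last, byte[] body) {
        ByteArrayOutputStream buffer = partial.get(messageId);
        if (buffer == null) {
            throw new IllegalStateException("no SendMessage seen for message " + messageId);
        }
        buffer.write(body, 0, body.length);
        if (!last) {
            return null;
        }
        partial.remove(messageId);
        return buffer.toByteArray();
    }
}
```

      A real consumer handling very large messages would probably want to stream to disk instead of buffering everything in memory; the in-memory buffer is only there to keep the sketch short.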

      I think this solution is simpler and more do-able than the previous one.