7 Replies Latest reply on Aug 29, 2009 10:55 AM by Ross Nicholson

    Messaging Ordering - Is there a better way?

    Ross Nicholson Apprentice

      Currently in the process of moving from JBM to HornetQ and I was wondering if you guys have any advice on preserving the order of messages.

      I'll give an example first. There are several different source servers, let's call them A, B and C, which all send messages to a single queue on X using bridges (will probably be core bridges). For performance reasons it's preferable to use EJBs to read the messages on many threads rather than a single message consumer. It is important that the order in which packets are sent from each source (but not across sources) is preserved, which we lose when using EJBs.

      Currently I generate a sequence number at each source and embed it in the message (along with an ID for the source) so we can order the packets by source once they arrive at X. This works fine, but I am required to store the sequence numbers used both at the sources and at X as each message is sent and received. This of course has a performance overhead, as persistent storage is required.
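For readers following along, the receiving side of this scheme could look roughly like the sketch below. It is only an illustration: the class name `Resequencer`, the assumption that each source numbers its messages from 1, and the use of plain strings for payloads are all hypothetical, and the persistence of sequence numbers mentioned above is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory resequencer: releases payloads per source in sequence order. */
class Resequencer {
    // Next sequence number expected from each source (assumed to start at 1).
    private final Map<String, Long> nextExpected = new HashMap<>();
    // Out-of-order messages held back per source, sorted by sequence number.
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>();

    /** Accepts one message and returns the payloads that are now deliverable, in order. */
    synchronized List<String> accept(String sourceId, long seq, String payload) {
        List<String> ready = new ArrayList<>();
        long expected = nextExpected.getOrDefault(sourceId, 1L);
        if (seq < expected) {
            return ready; // already released earlier: treat as a duplicate and ignore
        }
        TreeMap<Long, String> buffer = pending.computeIfAbsent(sourceId, k -> new TreeMap<>());
        buffer.put(seq, payload);
        // Release every consecutive message starting at the expected number.
        while (!buffer.isEmpty() && buffer.firstKey() == expected) {
            ready.add(buffer.pollFirstEntry().getValue());
            expected++;
        }
        nextExpected.put(sourceId, expected);
        return ready;
    }
}
```

Any number of consumer threads can call `accept`; a message that arrives early is held until the gap before it is filled, and sources never block one another.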

      With the arrival of HornetQ could you tell me if there is any other way to do what it is I'm doing? Or is what I'm doing the best approach possible?

      Cheers,

      Ross

        • 1. Re: Messaging Ordering - Is there a better way?
          Clebert Suconic Master

          I'm not sure what your requirements are. We guarantee message ordering at the producer level.


          Maybe you need something like Message Group?

          http://hornetq.sourceforge.net/docs/hornetq-2.0.0.BETA5/user-manual/en/html/message-grouping.html

          • 2. Re: Messaging Ordering - Is there a better way?
            Ross Nicholson Apprentice

            Another example might help:

            A) sends: MA1, MA2, MA3
            B) sends: MB10, MB11, MB12
            C) sends: MC21, MC22, MC23

            If all these messages are sent at the same time then they can be delivered in any order using an MDB (due to multiple consumers).

            So I'll need to group them and order them correctly when the MDB fires. In the case above I group them by A,B,C and order them correctly so at X the order is preserved after the MDB has finished firing.

            If I was to use groups, are you saying that only three threads would be used by the MDB and all the messages from A would arrive in one thread, all from B would arrive at the second and all from C at the third in the correct order? Is it also true that there will never be overlap between one consumer closing and another being chosen (i.e. if the consumer reading A is closing is it guaranteed that it will finish before another consumer reads the next message from A)?

            I fear I would still suffer from performance problems when using groups, as only one thread is reading for each producer. In the example above, would a maximum of only 3 consumers be running at any one time?
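The concern can be pictured with a plain-Java analogy. This is not HornetQ's implementation; it only assumes, as the linked message-grouping page describes, that all messages sharing a group ID go to the same consumer. The class `GroupedDispatcher` and the hash-based choice of worker are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Plain-Java analogy of group pinning: each group ID always maps to the same worker thread. */
class GroupedDispatcher {
    private final ExecutorService[] workers;

    GroupedDispatcher(int workerCount) {
        workers = new ExecutorService[workerCount];
        for (int i = 0; i < workerCount; i++) {
            // One thread per worker, so tasks queued on a worker run strictly in submission order.
            workers[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Queues the task on the worker that owns this group. */
    void dispatch(String groupId, Runnable task) {
        int index = Math.floorMod(groupId.hashCode(), workers.length);
        workers[index].execute(task);
    }

    /** Stops accepting work and waits (up to 10 seconds per worker) for queued tasks to finish. */
    void shutdownAndWait() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        try {
            for (ExecutorService w : workers) {
                w.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }
}
```

With group IDs A, B and C, at most three workers are ever busy no matter how many are created, which is exactly the limit being asked about: ordering within a source comes at the cost of parallelism within that source.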

            I need to ascertain which of the following has better performance:

            1) Read messages as quickly as possible from the MDB and order them in memory.

            2) Your solution, use message groups - I won't need to order the messages in memory but fewer MDB consumers will be used.

            At what volume of messages will 1) outperform 2) by a significant factor? Or have the advances in HornetQ made my method 1 obsolete (or the difference negligible)?

            As a guideline we would need to process about 200 messages a second where each message is 2K to 5K in size.

            There will only be one application server running that will consume the messages.

            Thanks for your help and guidance,

            Ross

            ;)

            • 3. Re: Messaging Ordering - Is there a better way?
              Clebert Suconic Master

               

              "As a guideline we would need to process about 200 messages a second where each message is 2K to 5K in size."



              I believe it would be possible to process at that rate.

              Your bottleneck here, however, is the transaction around receiving each message. Every message receive will fire a transaction back to the server, which causes a round trip and a wait on the server while the information is written to disk (or synced, if using NIO; see my blog post at http://hornetq.blogspot.com/2009/08/persistence-on-hornetq.html).


              If you batch transactions you would be able to process at even faster rates.

              Currently our JCA doesn't support batching, but that's something we are considering.

              Maybe you shouldn't use MDBs at this point and control the transaction manually.
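As a rough illustration of what controlling the transaction manually with batching could mean, here is a sketch that commits once per batch instead of once per message. `TxSession` and `BatchingReceiver` are hypothetical stand-ins written for this example, not HornetQ or JMS types; in JMS the analogous calls would be on a transacted `Session`, whose `commit()` acknowledges everything received since the previous commit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a transacted session; not a real HornetQ or JMS interface. */
interface TxSession {
    String receive(); // returns the next message, or null when none is available
    void commit();    // acknowledges everything received since the last commit
}

class BatchingReceiver {
    /** Drains the session, committing once per batchSize messages rather than once per message. */
    static List<String> drain(TxSession session, int batchSize) {
        List<String> processed = new ArrayList<>();
        int uncommitted = 0;
        String msg;
        while ((msg = session.receive()) != null) {
            processed.add(msg);
            if (++uncommitted == batchSize) {
                session.commit(); // one round trip and one sync for the whole batch
                uncommitted = 0;
            }
        }
        if (uncommitted > 0) {
            session.commit(); // commit the final, partially filled batch
        }
        return processed;
    }
}
```

The trade-off is that a failure before a commit means the whole uncommitted batch is delivered again, so the processing has to tolerate redelivery.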

              • 4. Re: Messaging Ordering - Is there a better way?
                Tim Fox Master

                Clebert is right.

                If you're just using straight HornetQ - it'll be able to deliver many 1000s of messages per sec.

                But as soon as you put JTA in the picture, your performance will be limited by that, since a JTA transaction needs to sync at several points in the commit protocol. That's nothing to do with HornetQ.

                The MDB and JCA layers themselves will also add some overhead.

                If you're using a database it's likely your perf will be limited by that too. HornetQ storage is much faster than any database we've seen.

                So the bottom line is: if you've got JTA or a database in the picture, don't worry too much about HornetQ performance, as it's not your limiting factor. If you want to speed things up you need to tune your db, tune JTA, or think about handling transactions in a different way.

                • 5. Re: Messaging Ordering - Is there a better way?
                  Ross Nicholson Apprentice

                  Cool,

                  This is really helpful.

                  To be honest I don't really need to use MDBs at all. All I need to store when I receive a message is a unique ID for that message (which I store in a Berkeley DB); the payload of the message is then put in a concurrent data structure and another module takes over from there. I only ever access the data in the Berkeley DB in the case of a failure, where I reload the data from an external source.

                  So if I was to use message groups, and use HornetQ consumers instead of MDBs can the message still be rolled back (or marked for redelivery) if the unique id has not been stored? Once this id has been stored I'm happy that the delivery of the message has been completed. The javadoc is a little sparse on information, but I'm sure you guys will get around to adding that in.

                  How do I assign a set of consumers for processing message groups? Or do I simply create the same number of consumers as message groups I have and leave the choice of consumer up to HornetQ (I'm guessing I can set the MessageHandler on each of my consumers so the call will be similar to an MDB)?

                  Finally, am I right in saying that using JMS to send messages is still OK? This simply maps to a HornetQ send?

                  If I can find a reliable solution avoiding the use of JTA and MDBs I think I would like to use it as performance will always be important. Plus I know as soon as the system is in place they will want to increase traffic many fold (in the past two years messaging traffic has increased by 200%).

                  I do not need to interface with any third party messaging systems so going totally HornetQ is not an issue.

                  • 6. Re: Messaging Ordering - Is there a better way?
                    Tim Fox Master

                    I guess I don't really understand your application architecture, but why do you need to store messages in a database?

                    The messaging system can guarantee persistence for messages.

                    • 7. Re: Messaging Ordering - Is there a better way?
                      Ross Nicholson Apprentice

                      All messages are sent to a central server from many different sources. From here they are sent onto a number of different targets. The data contained in each message is the superset of all data required by any target. At the central server the data is analysed, prioritised (against other messages) and only the data required for a specific target is sent on to the target in question. It's not possible to send all data to all targets as the data for certain targets is private and should never be seen by a rival target. In addition to this the sources do not know anything about the targets. It's up to the central server to decide which targets are to receive what data.

                      So the architecture is a set of bridges from the sources to the central location and another set of bridges from the central server to the targets (all servers are separated by WANs and leased lines).

                      As the central server is essentially a broker for message delivery to the targets it reads messages from an incoming queue and sends the messages on (once processed) to the required outgoing queue(s). As one incoming message can be split into several different outgoing messages I need to persist the fact that the message has come off the incoming queue (no longer persisted by the messaging system) and has not yet been placed on the outgoing queue(s) (persisted once more).

                      So I need to do the following: once the message is delivered to the central server I must persist the fact it has been delivered before the messaging system removes it from its own persistent storage. Otherwise in the case of the central server failing we will not know what messages were delivered to the central server and which of these have been processed and sent on to their targets.
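The ordering described here (persist first, let the messaging system discard its copy second) can be sketched as follows. `IdStore`, `Ack` and `PersistThenAck` are hypothetical names made up for this illustration: `IdStore` stands in for the Berkeley DB write, and `Ack` for the commit and rollback of a transacted session.

```java
/** Hypothetical stand-in for the durable ID store (a Berkeley DB in this setup). */
interface IdStore {
    void save(String messageId) throws Exception;
}

/** Hypothetical stand-in for committing or rolling back the receive on a transacted session. */
interface Ack {
    void commit();
    void rollback();
}

class PersistThenAck {
    /** Returns true if the ID was stored and the receive committed, false if it was rolled back. */
    static boolean handle(String messageId, IdStore store, Ack ack) {
        try {
            store.save(messageId); // 1. record the delivery durably first
        } catch (Exception e) {
            ack.rollback();        // 2a. store failed: leave the message queued for redelivery
            return false;
        }
        ack.commit();              // 2b. only now may the messaging system drop its copy
        return true;
    }
}
```

One consequence worth noting: a crash after `save` but before `commit` leads to redelivery of a message whose ID is already stored, so the store has to recognise duplicates.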

                      The Berkeley DBs at the central server are replicated to DR servers.

                      I hope I'm giving you the right amount of detail? Or am I confusing you further?

                      ;)

                      A possible solution would be to do all processing from the onMessage call in the MDB, basing everything on transactions, with the messaging system providing the persistence (but the delay before reading the next message could be quite large, depending on the number of targets that would receive the processed message). Alternatively, I could persist the messages myself as they arrive, outside the governance of the messaging system, so they can be processed and prioritised. This would be the new approach, allowing us to prioritise and compare received messages for importance.