10 Replies Latest reply on Mar 15, 2011 12:46 PM by joe.developer

Hornetq Evaluation

joe.developer Mar 9, 2011 2:24 PM

I'm busy evaluating HornetQ for use as part of a high-throughput publish-subscribe system and have a few questions:

1. (Most important issue for our requirements) If I use HornetQ for publish-subscribe, how are slow readers/consumers handled? Section 19.1.1 of your user guide states that slow readers can hold up a queue. I'm developing a system with high throughput, approx. 10k messages a second. If one subscriber reads messages slowly, I don't want the other subscribers to be impacted. To be clear, a subscriber does not "remove" items from the queue; each message is published to all interested subscribers. If a subscriber reads too slowly, the requirement is that that subscriber gets booted.

2. What algorithm is used for matching messages to subscriptions? (link to the source would be great).
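To make the slow-reader requirement in question 1 concrete, here is a minimal Java sketch of the desired behaviour. It is not HornetQ code: the class BootingPublisher, its Subscriber type and the buffer capacities are all hypothetical names chosen for illustration. Each subscriber gets its own bounded buffer, the publisher never blocks, and a subscriber whose buffer is full is dropped.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical fan-out publisher illustrating the requirement; not part of HornetQ. */
final class BootingPublisher {

    /** One subscriber with its own bounded buffer of pending messages. */
    static final class Subscriber {
        final String name;
        final BlockingQueue<String> buffer;

        Subscriber(String name, int capacity) {
            this.name = name;
            this.buffer = new ArrayBlockingQueue<>(capacity);
        }
    }

    // Copy-on-write list: safe to remove a subscriber while iterating.
    private final List<Subscriber> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Subscriber s) {
        subscribers.add(s);
    }

    boolean isSubscribed(Subscriber s) {
        return subscribers.contains(s);
    }

    /** Copies the message to every subscriber without ever blocking the publisher. */
    void publish(String message) {
        for (Subscriber s : subscribers) {
            // offer() returns false immediately when the buffer is full.
            if (!s.buffer.offer(message)) {
                subscribers.remove(s); // boot the subscriber that fell behind
            }
        }
    }
}
```

The point of the sketch is that a full buffer affects only its owner: the other subscribers keep receiving messages at the publisher's pace.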

The following may be of interest:

http://www.rabbitmq.com/blog/2010/09/14/very-fast-and-scalable-topic-routing-part-1/

Thanks

1. Hornetq Evaluation

clebert.suconic Mar 9, 2011 4:33 PM (in response to joe.developer)

1. That's fixed in the next version.

2. We distribute the messages on topics through PostOfficeImpl::route.
2. Hornetq Evaluation

joe.developer Mar 10, 2011 2:46 AM (in response to clebert.suconic)

Thanks for the reply Clebert!

I have a few more questions

a) When do you expect 1) to be fixed (which version number, and what is the intended release date)?

b) I notice your matcher uses Java regex and your matching code is in Match.java. It would be interesting to see how the performance of this regex approach compares to:
http://www.rabbitmq.com/blog/2010/09/14/very-fast-and-scalable-topic-routing-part-1/ (they seem to have some good ideas, but I would prefer a Java solution as I can embed it).

c) If I understand your documentation correctly, a subscription is represented by a queue, and only one pattern can be associated with a single queue. Perhaps I misread it, but I need each subscriber to be represented by a queue that can match multiple patterns, e.g. a.*.b AND f.c.*d. Each subscriber needs to subscribe to multiple data items.

d) Does HornetQ support parallelism in cases where reordering doesn't matter? E.g. if I have data.x.abc and data.y.abc, I would like all data containing x to remain ordered with respect to other tuples with x (or any specific ID, really), whereas I don't care if data items identified by 'y' are reordered with respect to those identified by 'x'.

Thanks
3. Hornetq Evaluation

clebert.suconic Mar 10, 2011 9:31 AM (in response to joe.developer)

a) We are closing the latest issues before we can put 2.2 out

c) Look at chapters 12 and 13. The wildcards are at the producer's side.

d) The ordering is defined by the producer. We keep ordering of a single producer. If you have multiple producers they will go in parallel to the queue. Is that what you're asking?
4. Hornetq Evaluation

timfox Mar 10, 2011 3:57 PM (in response to joe.developer)

HornetQ *does not* use regexp for matching messages to subscriptions.
5. Hornetq Evaluation

joe.developer Mar 11, 2011 1:51 AM (in response to timfox)

Hi Tim

Perhaps I'm looking at the wrong class, but Match.java appears to do regex matching for the a.*.c.d.# syntax used for subscriptions (see below). If I'm wrong, could you please point me to the class and line number where the matching algorithm is actually implemented? I would appreciate it.

package org.hornetq.core.settings.impl;

import java.util.regex.Pattern;

/**
    a Match is the holder for the match string and the object to hold against it.
*/
public class Match<T>
{
   public static String WORD_WILDCARD = "*";

   private static String WORD_WILDCARD_REPLACEMENT = "[^.]+";

   public static String WILDCARD = "#";

   private static String WILDCARD_REPLACEMENT = ".+";

   private static final String DOT = ".";

   private static final String DOT_REPLACEMENT = "\\.";
6. Hornetq Evaluation

timfox Mar 12, 2011 11:25 AM (in response to joe.developer)

Yes, but this is only evaluated once for a particular address, not every time a message is routed.

At routing time the hit is basically just a String lookup in a map, IIRC.

One of the HQ team should be able to explain in more detail.
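The idea described above (compile the wildcard once, then answer routing with a map lookup) can be sketched roughly as follows. This is an illustration only and not the actual BindingsImpl or Match code: the class CachedWildcardRouter and its methods are hypothetical, and only the three wildcard replacements are taken from the Match.java excerpt quoted earlier in the thread. The regexEvaluations counter exists purely to show when regexes actually run.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical router: regexes run only on a cache miss; repeat lookups hit a map. */
final class CachedWildcardRouter {
    // Subscription wildcard -> regex compiled once, at subscribe time.
    private final Map<String, Pattern> subscriptions = new LinkedHashMap<>();
    // Concrete address -> wildcards that matched it.
    private final Map<String, List<String>> cache = new HashMap<>();
    // Demonstration counter: how many regex matches were actually evaluated.
    int regexEvaluations = 0;

    /** Translates the wildcard syntax using the replacements shown in Match.java. */
    static Pattern compile(String wildcard) {
        String regex = wildcard
                .replace(".", "\\.")    // escape literal dots first
                .replace("*", "[^.]+")  // '*' matches exactly one word
                .replace("#", ".+");    // '#' matches the rest of the address
        return Pattern.compile(regex);
    }

    synchronized void subscribe(String wildcard) {
        subscriptions.put(wildcard, compile(wildcard));
        cache.clear(); // cached results may now be stale
    }

    synchronized List<String> route(String address) {
        List<String> cached = cache.get(address);
        if (cached != null) {
            return cached; // plain map lookup, no regex work
        }
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Pattern> e : subscriptions.entrySet()) {
            regexEvaluations++;
            if (e.getValue().matcher(address).matches()) {
                matches.add(e.getKey());
            }
        }
        cache.put(address, matches);
        return matches;
    }
}
```

In this sketch the first message to a given address pays for one regex evaluation per subscription, and every later message to the same address costs a single map lookup until the subscriptions change.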
7. Hornetq Evaluation

clebert.suconic Mar 12, 2011 7:38 PM (in response to timfox)

+1

Look at BindingsImpl::route
8. Re: Hornetq Evaluation

joe.developer Mar 15, 2011 10:48 AM (in response to clebert.suconic)

Thanks for the replies guys, I was able to identify BindingsImpl as the code which uses Match.java.

I like the overall idea of using regex and caching the matches so that each message doesn't have to result in a regex match. My only concern is for use cases like ours, where subscribers come and go and change their subscriptions fairly often (where a subscription maps onto a queue in HornetQ terminology). Whenever there is a new subscription, or a subscription changes, BindingsImpl clears the cache. So if subscriptions change fairly frequently, all lookups will have to go through all the regexes to check for matches (which isn't crazy expensive). This is particularly painful if messages are posted to many different/unique topics and there are dozens of subscriptions. It would be great if you had a mitigation strategy to reduce this effect.
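One possible mitigation, sketched here purely as an idea and not as anything HornetQ does, is to update the cache incrementally instead of clearing it. The class IncrementalRouteCache and its methods are hypothetical. A new subscription is tested once against each address already in the cache, and a removed subscription is simply deleted from the cached results, so existing entries stay warm.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical cache that is patched on subscription changes rather than cleared. */
final class IncrementalRouteCache {
    private final Map<String, Pattern> subscriptions = new HashMap<>();
    // Concrete address -> wildcards known to match it.
    private final Map<String, Set<String>> cache = new HashMap<>();

    private static Pattern compile(String wildcard) {
        // Same wildcard translation as in the Match.java excerpt quoted in this thread.
        return Pattern.compile(wildcard
                .replace(".", "\\.")
                .replace("*", "[^.]+")
                .replace("#", ".+"));
    }

    synchronized void subscribe(String wildcard) {
        Pattern p = compile(wildcard);
        subscriptions.put(wildcard, p);
        // Test only the new pattern against the addresses already cached.
        for (Map.Entry<String, Set<String>> e : cache.entrySet()) {
            if (p.matcher(e.getKey()).matches()) {
                e.getValue().add(wildcard);
            }
        }
    }

    synchronized void unsubscribe(String wildcard) {
        subscriptions.remove(wildcard);
        // No regex work needed: drop the wildcard from every cached result.
        for (Set<String> matches : cache.values()) {
            matches.remove(wildcard);
        }
    }

    synchronized Set<String> route(String address) {
        Set<String> matches = cache.get(address);
        if (matches == null) {
            matches = new LinkedHashSet<>();
            for (Map.Entry<String, Pattern> e : subscriptions.entrySet()) {
                if (e.getValue().matcher(address).matches()) {
                    matches.add(e.getKey());
                }
            }
            cache.put(address, matches);
        }
        return new LinkedHashSet<>(matches); // defensive copy for the caller
    }
}
```

The trade-off is that each subscription change now costs one regex evaluation per cached address, and the cache grows with the number of distinct addresses seen, so a real implementation would probably need some eviction policy.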

In terms of my question regarding parallel processing, I have the following...
A single producer is producing a stream of events, and certain events can be reordered based on an id associated with each event. For example, a1, a2 and a3 must remain in order with respect to each other; similarly, b1, b2 and b3 must remain in order. But it doesn't matter if events from the "a" set arrive before events from the "b" set or vice versa; the ordering between a and b doesn't matter. This can be exploited to publish these messages to subscribers in parallel using multiple threads, preserving ordering only where it matters, which leads to higher throughput.

The ability to boot slow readers is a core requirement to ensure the throughput of our application, so we won't be able to consider HQ until at least the next release. Having said that, I do believe that HornetQ is pretty much the queue to beat at the moment. All indications are that RabbitMQ is unsuitable for production use due to instability, particularly under memory pressure (see Reddit discussions about their problems using RabbitMQ, among other sources). One suggestion I would make is that you include more figures in your documentation and on your website regarding the different messaging models supported; for example, see the numbered diagrams with examples on the left of this page:
http://www.rabbitmq.com/tutorials/tutorial-one-python.html

Thanks again
9. Re: Hornetq Evaluation

ataylor Mar 15, 2011 11:19 AM (in response to joe.developer)

joe.developer wrote:

In terms of my question regarding parallel processing I have the following...
A single producer is producing a stream of events, certain events can be reordered based on an id associated with each event. For example a1, a2 and a3 must remain in order with respect to each other, similarly b1, b2, b3 must remain in order, but it doesn't matter if events from the "a" set arrive before events from the "b" set or vice versa, the ordering between a and b doesn't matter. This can be exploited to provide publication of these messages in parallel to subscribers using multiple threads, only preserving ordering where it matters, leading to higher throughput.

I'm not sure I understand this properly. When a producer sends a message, it writes it straight to the channel, so I'm not sure where any reordering could take place. Could you be more explicit, please?

joe.developer wrote:

The ability to boot slow readers is a core requirement to ensure the throughput of our application, so we won't be able to consider HQ till at least the next release. Having said that I do believe that HornetQ is pretty much the queue to beat at the moment. All indications are that RabbitMQ is unsuitable for production use due to instability, particularly under memory pressure (see reddit discussions about their problems using rabbitmq, among other sources). One suggestion I would make is that you include more figures in your documentation and on your website regarding the different messaging models supported, for example see the numbered diagrams with examples on the left of this page:
http://www.rabbitmq.com/tutorials/tutorial-one-python.html

If I ever get time, I may try to add some.
10. Re: Hornetq Evaluation

joe.developer Mar 15, 2011 12:46 PM (in response to ataylor)

Andy Taylor wrote:

In terms of my question regarding parallel processing I have the following...
A single producer is producing a stream of events, certain events can be reordered based on an id associated with each event. For example a1, a2 and a3 must remain in order with respect to each other, similarly b1, b2, b3 must remain in order, but it doesn't matter if events from the "a" set arrive before events from the "b" set or vice versa, the ordering between a and b doesn't matter. This can be exploited to provide publication of these messages in parallel to subscribers using multiple threads, only preserving ordering where it matters, leading to higher throughput.

I'm not sure I understand this properly. When a producer sends a message, it writes it straight to the channel, so I'm not sure where any reordering could take place. Could you be more explicit, please?

I haven't had a chance to check the exact flow that HornetQ uses to process messages, so I'm basing this on a few assumptions. Basically, a producer writes messages to a queue. When delivering those messages to subscribers, a pool of threads could feed off this queue in such a way that messages which can be reordered are pulled off the queue in parallel and posted to the (remote) subscribing application. Doing this in parallel may also be useful where regex matching is required, provided matching can run in parallel for events whose ordering rules are relaxed.
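A rough Java sketch of what I mean, under the same assumptions and not based on HornetQ internals: the class KeyedDispatcher and the notion of an "ordering key" (the "a" or "b" part of the event id) are hypothetical. Each key is hashed to one of several single-threaded lanes, so events with the same key stay in order while different keys may be delivered in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical dispatcher: per-key ordering, parallelism across keys. */
final class KeyedDispatcher {
    // Each lane is a single thread, so tasks within a lane run in submission order.
    private final List<ExecutorService> lanes = new ArrayList<>();

    KeyedDispatcher(int laneCount) {
        for (int i = 0; i < laneCount; i++) {
            lanes.add(Executors.newSingleThreadExecutor());
        }
    }

    /** Events sharing an ordering key always land on the same lane. */
    void dispatch(String orderingKey, Runnable delivery) {
        int lane = Math.floorMod(orderingKey.hashCode(), lanes.size());
        lanes.get(lane).execute(delivery);
    }

    /** Stops accepting work and waits for queued deliveries; false on timeout or interrupt. */
    boolean shutdownAndWait(long timeoutSeconds) {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                if (!lane.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

With this scheme a1, a2 and a3 are always handled by the same thread in the order they were dispatched, and the same holds for b1, b2 and b3, but nothing is guaranteed about how the "a" events interleave with the "b" events.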