
    Does WildFly distribute the messages of a queue among 2 consuming servers half and half?

    inor

      Hi,

      I'm observing some strange behavior with an MDB processing queue messages.

      In my application, a WildFly 10 server instance (I'll call it the "main server") breaks up a job submitted to it into smaller, homogeneous tasks.

      It then sends the task IDs to its local queue so that multiple threads can process the independent tasks in parallel, reducing the total time it takes to complete the job.

      The tasks are consumed and processed via an MDB.
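
      For context, the send side on the main server looks roughly like this (the queue name, class name and message format below are placeholders of mine, not the real ones):

           import java.util.List;
           import javax.annotation.Resource;
           import javax.ejb.Stateless;
           import javax.inject.Inject;
           import javax.jms.JMSContext;
           import javax.jms.Queue;

           // Rough sketch of how the job is fanned out: one message per task id.
           @Stateless
           public class JobSplitterBean {

               @Inject
               private JMSContext jms;

               @Resource(lookup = "java:/jms/queue/JobTaskQueue")   // assumed local queue name
               private Queue taskQueue;

               public void submit(List<Long> taskIds) {
                   for (Long taskId : taskIds) {
                       jms.createProducer().send(taskQueue, taskId.toString());
                   }
               }
           }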

       

      When this runs with a specific job that splits into 615 tasks, it takes 13:40 minutes to complete.

       

      When we add a second WildFly server (I'll call it the "secondary server"), which connects to the [remote] queue on the main server and also consumes messages via an MDB, both servers process the 615 tasks and complete the job in 26:50 minutes.

      Why does it take 2 servers roughly twice as long to complete the job as it takes 1 server?

       

      Now more details:

      1) The MDB on both servers is annotated with @Pool("pool-for-JobTaskMDB"), which is specified in standalone-full.xml as

           <strict-max-pool name="pool-for-JobTaskMDB" max-pool-size="10" instance-acquisition-timeout="120" instance-acquisition-timeout-unit="SECONDS"/>

      and with

           @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "10")

      (a sketch after this list shows how these fit together on the MDB class)

      2) The processing of the tasks involves DB access. A single DB instance, used by both servers, is on the same machine as the main server.

      3) It turns out that, on average, a task running on the main server takes about 10 seconds to complete and a task running on the secondary server takes about 60 seconds.
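
      To make the setup concrete, the consuming MDB on both servers looks roughly like this (the destinationLookup value, class body and message format are simplified placeholders; the pool name, maxSession and pool size are the real values from point 1):

           import javax.ejb.ActivationConfigProperty;
           import javax.ejb.MessageDriven;
           import javax.jms.JMSException;
           import javax.jms.Message;
           import javax.jms.MessageListener;
           import javax.jms.TextMessage;
           import org.jboss.ejb3.annotation.Pool;

           // Simplified sketch; processTask() stands in for the real DB-intensive work.
           @MessageDriven(activationConfig = {
               @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
               @ActivationConfigProperty(propertyName = "destinationLookup", propertyValue = "java:/jms/queue/JobTaskQueue"),
               @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "10")
           })
           @Pool("pool-for-JobTaskMDB")   // bounded by the strict-max-pool above (max-pool-size="10")
           public class JobTaskMDB implements MessageListener {

               @Override
               public void onMessage(Message message) {
                   try {
                       long taskId = Long.parseLong(((TextMessage) message).getText());
                       processTask(taskId);   // ~10s on the main server, ~60s on the secondary
                   } catch (JMSException e) {
                       throw new RuntimeException(e);
                   }
               }

               private void processTask(long taskId) {
                   // application-specific processing against the shared DB
               }
           }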

       

      It's not clear to me why a task takes 6 times longer to run on the secondary server, even if it is very DB-intensive (I also understand that the queue is local to the main server and remote to the secondary server), but let's ignore that for now.

       

      So let's take that as a given: processing a queue message takes 60 seconds on the secondary server vs. 10 seconds on the main server.

       

      So when running with both servers, I would expect that:

      A) in the time it takes to process all the tasks, the faster/main server would process about 6 times as many messages/tasks as the slower server.

      B) the worst-case scenario is that the last message consumed from the queue is consumed by the secondary server, adding an extra minute.

       

      But what I found, to my astonishment, when looking at the results, is that:

      1) contrary to my expectation A above, only 315 messages/tasks were processed by the faster/main server and 300 messages/tasks were processed by the slower/secondary server! Why?

      2) Part of processing a task is logging its start time. Looking at the start times, I discovered that during the last 16 minutes of the job, no tasks were processed by the main server! Why?

       

      So my theory, and I hope I'm wrong, or that this can be controlled via configuration, is this:

      The 615 tasks in the queue were divided up front between the 2 servers, so each server was assigned and processed about 300 or so tasks (and since 300 tasks are processed by the 10 threads on the secondary server in about 30 rounds, where each round takes 1 minute, that comes out to a total of about 30 minutes!).

      Had the servers consumed tasks based on availability (and had my expectation A been met), I would have expected the job to be completed in less than 9 minutes!
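
      (The rough math behind that 9-minute figure, using the per-task times above and 10 concurrent sessions per server:

           main server:      10 threads / 10s per task = ~60 tasks per minute
           secondary server: 10 threads / 60s per task = ~10 tasks per minute
           combined:                                      ~70 tasks per minute
           615 tasks / ~70 tasks per minute             ≈  8.8 minutes)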

       

      Is there a way to configure the 2 servers to consume a message only when there is a thread available to process it, rather than grabbing half of the queue up front?

       

      Finally, in case it matters:

      The main server's MDB uses the default resource adapter.

      The secondary server's MDB uses a pooled-connection-factory with an http-connector, and it does not use a JNDI lookup.
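
      For reference, the secondary server connects to the main server's broker with configuration roughly like this in its standalone-full.xml (the connector/factory names, host, port and credentials below are placeholders, not my exact values):

           <!-- socket-binding-group -->
           <outbound-socket-binding name="remote-artemis">
               <remote-destination host="main-server-host" port="8080"/>
           </outbound-socket-binding>

           <!-- messaging-activemq subsystem -->
           <http-connector name="remote-http-connector" socket-binding="remote-artemis" endpoint="http-acceptor"/>
           <pooled-connection-factory name="remote-artemis" entries="java:/jms/remoteCF" connectors="remote-http-connector"/>

      with the MDB pointed at that resource adapter (e.g. via @ResourceAdapter("remote-artemis") from org.jboss.ejb3.annotation) rather than a JNDI lookup.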

       

      Thanks, I would really appreciate some insight on this.