-
1. Re: Non persistent messages and 2PC multicast
marklittle May 31, 2006 11:47 AM (in response to timfox)
How about a flag on the volatile messages so the sender can say whether or not they are idempotent? If they are, then redoing the work on the backup shouldn't be a problem. If they aren't idempotent, then you need to do something else. Can you multicast the response from the primary so that the backup also sees it?
-
2. Re: Non persistent messages and 2PC multicast
adrian.brock May 31, 2006 12:00 PM (in response to timfox)
How about a flag on the volatile messages so the sender can say whether or not they are idempotent?
It is the receiver that specifies their Idempotency
javax.jms.Session.DUPS_OK_ACKNOWLEDGE
Can you multicast the response from the primary so that the backup also sees it?
Only the sender gets the acks back from the multicast.
Here we are talking about one receiver (the master) knowing
whether another (the buddy) has a copy before it delivers the
message to the client.
Without that you get:
sender -> multicast
master -> receives multicast
master -> ack message to sender
master -> deliver to client
client -> ack to master
master -> replicate client's ack to buddy
buddy -> what are you talking about?
buddy -> finally processes original multicast
This is an example where total ordering is useful.
However 2PR (2 phase replication) also works
sender -> multicast prepare
master -> receives multicast
master -> ack message to sender
master -> wait for confirmation
buddy -> receives multicast
buddy -> ack message to sender
sender -> multicast commit
master -> receives commit and delivers to the client
-
3. Re: Non persistent messages and 2PC multicast
timfox May 31, 2006 12:07 PM (in response to timfox)
"adrian@jboss.org" wrote:
It is the receiver that specifies their Idempotency
javax.jms.Session.DUPS_OK_ACKNOWLEDGE
Right, but I guess we could implement a JBoss Messaging specific feature, where the sender specifies that duplicates might happen.
E.g. DeliveryMode.DUPS_OK
-
4. Re: Non persistent messages and 2PC multicast
marklittle May 31, 2006 12:09 PM (in response to timfox)
Total ordering is necessary; I was just wondering if there was a way to do it only when you have to, by exploiting application-specific semantics.
-
5. Re: Non persistent messages and 2PC multicast
timfox May 31, 2006 12:18 PM (in response to timfox)
Ouch.
I don't even think 2PC is sufficient since that won't guarantee total ordering.
If different nodes multicast their messages via 2PC to the active and the buddy, both the active and the buddy need to receive them in the same order, which won't necessarily be the case.
So this means we'd need multicast + a total order protocol...
-
6. Re: Non persistent messages and 2PC multicast
timfox May 31, 2006 12:28 PM (in response to timfox)
If we did passive replication rather than active we could avoid the total ordering.
So node A sends message to active node. Then active node synchronously sends message to replica(s) before returning.
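A minimal sketch of the passive replication idea in the post above, assuming an in-process stand-in for the network. The class and method names (`PassiveReplicationSketch`, `ActiveNode`, `Replica`, `send`, `store`) are hypothetical and are not JBoss Messaging APIs. Because the active node is the only writer to the replica, the replica sees messages in exactly the order the active node accepted them, whichever node they originally came from.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from JBoss Messaging.
final class PassiveReplicationSketch {

    /** Holds a copy of the active node's queue. */
    static final class Replica {
        private final List<String> messages = new ArrayList<>();

        void store(String message) { messages.add(message); }

        List<String> messages() { return messages; }
    }

    /** The only node that accepts sends and the only node that talks to replicas. */
    static final class ActiveNode {
        private final List<String> messages = new ArrayList<>();
        private final List<Replica> replicas;

        ActiveNode(List<Replica> replicas) { this.replicas = replicas; }

        /** Called by any sending node; returns only after every replica has the message. */
        synchronized void send(String message) {
            messages.add(message);
            for (Replica replica : replicas) {
                replica.store(message); // stands in for a blocking network call
            }
        }

        List<String> messages() { return messages; }
    }
}
```

The `synchronized` keyword is what serialises concurrent senders in this sketch; a real implementation would pay for that ordering with an extra network hop per send.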
This is more network traffic and more latency though...
-
7. Re: Non persistent messages and 2PC multicast
adrian.brock May 31, 2006 12:38 PM (in response to timfox)
"timfox" wrote:
I don't even think 2PC is sufficient since that won't guarantee total ordering.
You don't need total ordering if you have 2PR.
Each node has a view of the state (including persisting it) from the prepare.
Nothing is done externally until the commit.
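The two-phase replication (2PR) exchange discussed in this thread could be sketched roughly as follows. This is an illustration under assumptions, not the JBoss Messaging implementation: `TwoPhaseReplicationSketch`, `Replica`, `prepare`, `commit` and `send` are hypothetical names, and the multicast is simulated with plain method calls. The point it shows is that a prepared message is recorded on every node but becomes deliverable only after the commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of 2PR; none of these names come from JBoss Messaging.
final class TwoPhaseReplicationSketch {

    /** A node holding a copy of the queue (the master or its buddy). */
    static final class Replica {
        final String name;
        // Messages seen in a prepare but not yet committed.
        private final Map<String, String> prepared = new LinkedHashMap<>();
        // Messages that may be delivered to clients.
        private final List<String> deliverable = new ArrayList<>();

        Replica(String name) { this.name = name; }

        /** Phase 1: record the message and acknowledge; nothing is visible yet. */
        boolean prepare(String id, String body) {
            prepared.put(id, body);
            return true; // stands in for the ack sent back to the sender
        }

        /** Phase 2: the message becomes deliverable only now. */
        void commit(String id) {
            String body = prepared.remove(id);
            if (body != null) {
                deliverable.add(body);
            }
        }

        List<String> deliverable() { return deliverable; }
    }

    /** The sender commits only after every replica has acknowledged the prepare. */
    static boolean send(String id, String body, List<Replica> replicas) {
        for (Replica replica : replicas) {
            if (!replica.prepare(id, body)) {
                return false; // a real implementation would also roll back here
            }
        }
        for (Replica replica : replicas) {
            replica.commit(id);
        }
        return true;
    }

    public static void main(String[] args) {
        Replica master = new Replica("master");
        Replica buddy = new Replica("buddy");
        send("m1", "hello", Arrays.asList(master, buddy));
        System.out.println(master.deliverable() + " " + buddy.deliverable());
    }
}
```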
Yes, the commit on the buddy might race with the ack from the client via
the master, but that is easily dealt with if you have a well defined state
machine.
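One way to read "a well defined state machine" is sketched below. It is only an assumption about what such a machine might look like on the buddy; the states `PREPARED`, `COMMITTED` and `ACKED` and the class `BuddyStateMachineSketch` are hypothetical. Since the master delivers only after the commit, an ack replicated from the master implies the commit happened, so the buddy can accept the ack first and ignore the late commit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the states and method names are illustrative only.
final class BuddyStateMachineSketch {

    enum State { PREPARED, COMMITTED, ACKED }

    private final Map<String, State> states = new HashMap<>();

    void onPrepare(String id) {
        states.putIfAbsent(id, State.PREPARED);
    }

    void onCommit(String id) {
        // If the ack overtook the commit, the message is already finished: stay ACKED.
        states.computeIfPresent(id, (key, state) ->
                state == State.PREPARED ? State.COMMITTED : state);
    }

    void onAck(String id) {
        // Under 2PR the buddy has always seen the prepare before the master delivers,
        // so an ack for a message that was never prepared is a protocol violation.
        if (!states.containsKey(id)) {
            throw new IllegalStateException("ack for unknown message " + id);
        }
        states.put(id, State.ACKED);
    }

    State stateOf(String id) {
        return states.get(id);
    }
}
```

With this shape, the sequences prepare, commit, ack and prepare, ack, commit both end in `ACKED`, which is the property the race requires.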
Remember, as far as the client is concerned, it hasn't fully acked the
message until the invocation on the server returns.
client -> ack the message
master -> replicate ack to buddy and update internal state (again 2PR)
master -> return to client
* Only at this point does the client know there won't be a redelivery.
Of course for non persistent messages and DUPS_OK these rules
can be relaxed. But only as long as the server(s) don't get into a
confused state.
-
8. Re: Non persistent messages and 2PC multicast
timfox May 31, 2006 12:50 PM (in response to timfox)
"adrian@jboss.org" wrote:
You don't need total ordering if you have 2PR.
What about the following situation:
I have 4 nodes.
node A and B are inactive, node C has the active, and node D has the replica (buddy)
I send a message m1 from node A.
Prepare(m1) gets multicast.
Commit(m1) gets multicast.
Around the same time I send a message m2 from node B.
Prepare(m2) gets multicast.
Commit(m2) gets multicast.
node C receives this:
prepare (m1)
prepare (m2)
commit (m1)
commit (m2)
node D receives this:
prepare (m2)
prepare (m1)
commit (m2)
commit (m1)
Can this happen?
If so, then we need total ordering surely...
-
9. Re: Non persistent messages and 2PC multicast
adrian.brock May 31, 2006 1:02 PM (in response to timfox)
Yes it can happen, but it is OK.
It doesn't matter that the order changed, the messages came from
different senders. Their order cannot matter. There is a race
condition here anyway, even if you have only one JMS server,
i.e. which of the servers A and B gets to send its message first.
Total ordering across multiple sender sessions is not a required
JMS semantic. It is only required if the sends come from the same
session/transaction.
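The distinction between total order and per-sender order can be made concrete with a small check. This is a sketch under assumptions: log entries are written as `sender:message` strings, and `PerSenderOrderSketch`, `bySender` and `sameForEverySender` are hypothetical helper names. Two nodes whose logs differ in total order can still agree on every individual sender's order, which is the only ordering the post above says is required.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; log entries are assumed to look like "sender:message".
final class PerSenderOrderSketch {

    /** Groups a delivery log by sender, keeping each sender's messages in log order. */
    static Map<String, List<String>> bySender(List<String> log) {
        Map<String, List<String>> view = new TreeMap<>();
        for (String entry : log) {
            String sender = entry.substring(0, entry.indexOf(':'));
            view.computeIfAbsent(sender, key -> new ArrayList<>()).add(entry);
        }
        return view;
    }

    /** True if both logs show every sender's messages in the same relative order. */
    static boolean sameForEverySender(List<String> first, List<String> second) {
        return bySender(first).equals(bySender(second));
    }
}
```

For the scenario in this thread, node C's log `A:m1, B:m2` and node D's log `B:m2, A:m1` differ as lists but pass `sameForEverySender`, because m1 and m2 come from different senders.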
Can we put this in the FAQ somewhere? I keep having to repeat this. ;-)
Like we discussed on a different thread. If somebody does want this
semantic then it is a "value add" configuration, they pay the cost in throughput for this total ordering.
It should not be the default!
-
10. Re: Non persistent messages and 2PC multicast
timfox May 31, 2006 1:16 PM (in response to timfox)
I know it's not a required JMS semantic, and I want to avoid total ordering if at all possible as much as the next man :)
What you're saying is it's ok for the active and the backup to have different states.
E.g. active node has messages A, B, C, D
but backup has messages C, A, B, D
So they're not true replicas. I'm trying to figure out whether this works.
This means when I multicast a receive(), the active node is going to remove the first message - message A and put it in delivering state.
But the backup node is going to remove the first message - message C and put it in the delivering state.
Then I multicast an ack(message A). The active node will receive it and remove message A.
But the backup node will receive it, say "but message A is not in the delivering state", and barf.
-
11. Re: Non persistent messages and 2PC multicast
adrian.brock May 31, 2006 1:34 PM (in response to timfox)
That's what I said about implementing a proper state machine
so the server doesn't get its "knickers in a twist".
If the master is going to deliver a message to the client,
the buddy needs to know that the message should be in the delivered
state. Besides the reason you state, it needs to know this in case
the master crashes at this point.
master -> deliver -> client
master -> crash
client -> ack -> failover -> buddy
If the message isn't in the delivered state then another client could
come in after failover and steal it before the original client acks it.
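A rough sketch of the buddy-side bookkeeping this implies, assuming the master tells the buddy which message it is delivering by id rather than the buddy picking the head of its own queue. `BuddyQueueSketch`, `markDelivering`, `ack` and `receive` are hypothetical names chosen for illustration. Tracking the delivering state by id means the buddy's arrival order may differ from the master's, and a client arriving after failover cannot take a message that is still awaiting an ack.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; class and method names are illustrative only.
final class BuddyQueueSketch {
    // Message ids waiting for a consumer, in the buddy's own arrival order.
    private final Set<String> available = new LinkedHashSet<>();
    // Message ids the master has handed to a client but that are not yet acked.
    private final Set<String> delivering = new HashSet<>();

    void add(String id) {
        available.add(id);
    }

    /** Replicated from the master: this specific message is being delivered. */
    void markDelivering(String id) {
        if (available.remove(id)) {
            delivering.add(id);
        }
    }

    /** Replicated from the master, or sent by the original client after failover. */
    void ack(String id) {
        if (!delivering.remove(id)) {
            throw new IllegalStateException(id + " is not in the delivering state");
        }
    }

    /** Used once the buddy takes over: hands out the next available id, or null. */
    String receive() {
        Iterator<String> it = available.iterator();
        if (!it.hasNext()) {
            return null;
        }
        String id = it.next();
        it.remove();
        delivering.add(id);
        return id;
    }
}
```

If the buddy holds C, A, B, D and the master reports that A is being delivered, a later `receive()` on the buddy hands out C, and the original client's `ack("A")` still succeeds.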
There is no requirement for total ordering here either.
All the "put in delivered state" and ack/nack messages to the buddy
are coming from the master node which provides the ordering.
-
12. Re: Non persistent messages and 2PC multicast
timfox May 31, 2006 1:52 PM (in response to timfox)
"adrian@jboss.org" wrote:
There is no requirement for total ordering here either.
All the "put in delivered state" and ack/nack messages to the buddy
are coming from the master node which provides the ordering.
Ok, I was assuming the "put in delivered state" and ack/nack messages *wouldn't* be coming from the master node, but could be coming from any node.
If they are coming from the master node, then I agree that would be ok.
My previous assumption was that if a client is connected to a node other than the active node, then any send(Message m), ack(MessageID id), receive() calls would be multicast.
My understanding of what you're saying is we should only multicast the send() call to the master and the buddy, and all others (ack, receive) need to be channelled through the master to the buddy to give the ordering guarantee.
We could just channel everything through the master (even the send), which is what I suggested earlier:
So node A sends message to active node. Then active node synchronously sends message to replica(s) before returning.
This is more network traffic and more latency though...
At the expense of latency, ensuring that only the master talks to the replica might make our lives easier.
-
13. Re: Non persistent messages and 2PC multicast
ovidiu.feodorov May 31, 2006 9:12 PM (in response to timfox)
Adrian wrote:
master -> deliver to client
client -> ack to master
master -> replicate client's ack to buddy
buddy -> what are you talking about?
The buddy knows it's a buddy, so it's supposed to hold replicated state. When it receives such an out-of-order acknowledgment, why can't it just store it and then let it be cancelled out by the (eventually) arriving message?
The order in which they arrive doesn't matter; what matters is that eventually they'll cancel each other out (or not, if the master crashes and cannot send the acknowledgment at all).
-
14. Re: Non persistent messages and 2PC multicast
timfox Jun 1, 2006 3:49 AM (in response to timfox)
"ovidiu.feodorov@jboss.com" wrote:
Adrian wrote:
master -> deliver to client
client -> ack to master
master -> replicate client's ack to buddy
buddy -> what are you talking about?
The buddy knows it's a buddy, so it's supposed to hold replicated state. When it receives such an out-of-order acknowledgment, why can't it just store it and then let it be cancelled out by the (eventually) arriving message?
The order in which they arrive doesn't matter; what matters is that eventually they'll cancel each other out (or not, if the master crashes and cannot send the acknowledgment at all).
Well, this is all moot now, if we channel everything through the master.