12 Replies Latest reply on May 11, 2005 2:17 PM by Tim Fox

    Synchronous vs. asynchronous NACK when client refuses delive

    Ovidiu Feodorov Master


      I am starting a new design thread here, to separate it from the Remoting issue that sparked it (http://jira.jboss.com/jira/browse/JBREM-93):


      Use case:

      I am using a push callback to synchronously deliver a message to a JMS Client (MessageConsumer). My server-side consumer endpoint receives the message from the messaging core and synchronously invokes callbackHandler.handleCallback(...). The server-side thread is blocked until the push either succeeds or fails.

      This invocation eventually reaches my callback handler on the client-side. However, the callback handler should be able to refuse the message: for example, nobody is blocked in a receive() and there is no message listener. There should be a way to politely say: yes, I have seen your message, but I don't need it and I don't want it, and here's my NACK.

      With the current InvokerCallbackHandler.handleCallback() signature (public void handleCallback(InvocationRequest invocation) throws HandleCallbackException), the only way for the callback handler to send a NACK is to throw a HandleCallbackException.
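
A minimal sketch of the situation described above. The three types are simplified stand-ins declared locally so the example compiles on its own; they are not the real JBoss Remoting classes. `NackingHandler`, `hasReceiver` and `lastAccepted` are hypothetical names.

```java
// Simplified stand-ins for the remoting types named above (assumptions, not the real API).
class InvocationRequest {
    final Object parameter;
    InvocationRequest(Object parameter) { this.parameter = parameter; }
}

class HandleCallbackException extends Exception {
    HandleCallbackException(String message) { super(message); }
}

interface InvokerCallbackHandler {
    void handleCallback(InvocationRequest invocation) throws HandleCallbackException;
}

// Hypothetical client-side handler: with a void return type, throwing is the only way to NACK.
class NackingHandler implements InvokerCallbackHandler {
    volatile boolean hasReceiver;   // true when a receive() is blocked or a listener is set
    volatile Object lastAccepted;   // last message that was accepted

    @Override
    public void handleCallback(InvocationRequest invocation) throws HandleCallbackException {
        if (!hasReceiver) {
            // Nobody wants the message: refuse it politely.
            throw new HandleCallbackException("NACK: no receiver for message");
        }
        // A normal return is the only signal available for a positive acknowledgment.
        lastAccepted = invocation.parameter;
    }
}
```

The server-side caller has to treat a caught HandleCallbackException as "refused", which is why the exception doubles as a NACK in this sketch.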



        • 1. Re: Synchronous vs. asynchronous NACK when client refuses de
          Ovidiu Feodorov Master

          Adrian's reply:


          I think you are wrong Ovidiu.
          I recently fixed this in JBossMQ to avoid the blocking:
          http://www.jboss.org/index.html?module=bb&op=viewtopic&t=60912

          The client should refuse the delivery with an asynchronous NACK.
          You should not block server threads on communication or slow/unreliable clients.

          OFF-TOPIC:
          I believe this also fixes an issue with distributed deadlocks that were a scourge
          of the basic JBossMQ IL design that required the introduction of a thread pool
          to handoff requests in UIL2.


          • 2. Re: Synchronous vs. asynchronous NACK when client refuses de
            Ovidiu Feodorov Master

            The central issue seems to revolve around the statement: The server threads should not be blocked on communication or slow/unreliable clients

            With the current design, the server-side Consumer endpoint synchronously calls into its corresponding client-side remoting callback handler, every time a new message is pushed to it by the core.

            If we want asynchronous delivery on the client (i.e. the message gets to the client as soon as possible, without polling), I don't think there is a way around that: the server thread will block calling callbackHandler.handleCallback(...) until something happens. That something could be:
            1. a negative acknowledgment - the client says that nobody wants that message
            2. a positive acknowledgment - the client says that it's got the message and now the client endpoint can acknowledge the message to the core.
            3. an unchecked exception generated by a broken client
            4. a communication error of some sort, that bubbles out of remoting

            Case 1 is detected immediately: there is no thread blocked waiting in receive() and there is no message listener. The negative acknowledgment can be generated pretty quickly in this case, and with no client code involvement.

            For case 2, the situation is a little bit more complicated.

            2.1 If there is a client thread blocked in MessageConsumer.receive(), the callback handler will just hand over the message (my solution for this is to use a RendezVous object) and this is the positive acknowledgment. The client-side remoting thread will just unwind and the server will immediately get the positive acknowledgment.

            2.2 However, if there is a message listener, the current implementation uses the remoting thread to invoke onMessage(). This can be a problem, since I have no control over how long onMessage() will take. I can think of several solutions: having a separate thread that invokes onMessage(), for example.
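
A rough sketch of cases 1, 2.1 and 2.2, with hypothetical names throughout (`ClientCallbackHandler`, `Listener`, `handle`). It assumes that handing a message to a dedicated listener thread already counts as acceptance, which is a simplification. A `SynchronousQueue` plays the role of the rendezvous: its `offer()` succeeds only when another thread is already blocked in `take()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.SynchronousQueue;

// Hypothetical listener type, standing in for a JMS MessageListener.
interface Listener {
    void onMessage(String message);
}

// Hypothetical client-side callback handler; all names are illustrative.
class ClientCallbackHandler {
    // offer() succeeds only if a thread is already blocked in take().
    private final SynchronousQueue<String> rendezvous = new SynchronousQueue<>();
    private final ExecutorService listenerThread = Executors.newSingleThreadExecutor();
    private volatile Listener listener;

    void setListener(Listener listener) { this.listener = listener; }

    // Called by a client thread; blocks until a message is handed over.
    String receive() throws InterruptedException { return rendezvous.take(); }

    // Called on the remoting thread. Returns true for ACK, false for NACK.
    boolean handle(String message) {
        Listener l = listener;
        if (l != null) {
            // Case 2.2: run the listener on a separate thread so the remoting thread unwinds at once.
            listenerThread.execute(() -> l.onMessage(message));
            return true;
        }
        // Case 2.1 if a receiver is waiting, otherwise case 1 (immediate NACK).
        return rendezvous.offer(message);
    }

    void close() { listenerThread.shutdown(); }
}
```

In this sketch the remoting thread never runs client code, so how long onMessage() takes no longer affects how long the server thread is blocked.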

            • 3. Re: Synchronous vs. asynchronous NACK when client refuses de
              Tom Elrod Master

              The use case helps me understand what you are after. Thanks.

              I still don't think returning an Object from the handleCallback() method is a good idea since the server side will never be sure of the real behavior (if the client is pulling the callbacks, the return from the handleCallback() method will always be null, even if this is not the value the client intended to return). This is just the nature of allowing a pull model for callbacks.

              I would be happy to (and will) add a getUserException() to the HandleCallbackException, which will be null in the case of pull callbacks. All of this I have commented in JBREM-93.

              What I think you really need is the ability to make return invocations on the client (or to be more specific, the client's server) directly, without having to use the callback API (because of the whole push/pull issue). My initial thought on this is to actually supply a Client object in the InvocationRequest given to the ServerInvocationHandler implementation (via the invoke() method). This way you can keep a reference to this client as you wish and call directly on it. This can be a synchronous or asynchronous (oneway) call... however you want to do it. If it is synchronous, a return value will be supplied, and an exception can be thrown if needed.

              I am just thinking aloud here, so let me know if you think this would be a better fit for you? Adrian, feel free to add your 2 cents on this idea as well.


              • 4. Re: Synchronous vs. asynchronous NACK when client refuses de
                Ovidiu Feodorov Master

                 

                I still don't think returning an Object from the handleCallback() method is a good idea since the server side will never be sure of the real behavior (if the client is pulling the callbacks, the return from the handleCallback() method will always be null, even if this is not the value the client intended to return). This is just the nature of allowing a pull model for callbacks.


                You are right, I totally disregarded pull callbacks, since I don't use them, so far.

                I would be happy to (and will) add a getUserException() to the HandleCallbackException, which will be null in the case of pull callbacks. All of this I have commented in JBREM-93.


                That will do. Will you also avoid generating error messages in case of a user HandleCallbackException and let the client code worry about it?


                What I think you really need is the ability to make return invocations on the client (or to be more specific, the client's server) directly, without having to use the callback API (because of the whole push/pull issue). My initial thought on this is to actually supply a Client object in the InvocationRequest given to the ServerInvocationHandler implementation (via the invoke() method). This way you can keep a reference to this client as you wish and call directly on it. This can be a synchronous or asynchronous (oneway) call... however you want to do it. If it is synchronous, a return value will be supplied, and an exception can be thrown if needed.


                The whole point of using callbacks was to avoid the need to open a second connection from server to client. Opening such a connection won't be possible if the client sits behind a firewall. I know that this is exactly what remoting does behind the scenes right now, but I hope we will have a UIL2-like transport in the future that will make this issue irrelevant.

                • 5. Re: Synchronous vs. asynchronous NACK when client refuses de
                  Tim Fox Master

                  Hi guys, here's my 2c FWIW, apologies if you've already gone over this ground :)

                  In Ovidiu's case 2) above it seems to me that the positive acknowledgement may not be known until some (very long and unknown) time in the future - this might be the case for a transacted JMS Session, where the acknowledgement is done at commit time, not at message receipt time.

                  In that case, blocking the server thread that delivers the message to the client until the ack/nack occurs isn't really an option.

                  I can see two options here:

                  1) Let something on the client side implement a reliable store and take over responsibility for message delivery. It can then respond with an ack immediately. I think this violates any idea of having a thin client though, plus it's just passing the same problem to the next thing in the chain IMO.

                  2) Have some way (callback) for the client to inform the server queue object asynchronously that a message is acked/nacked. This would be called by the client side code when the JMS session (or whatever) commits or rolls back. It's my understanding the core classes currently don't support such functionality - I may well be wrong here.

                  -Tim



                  • 6. Re: Synchronous vs. asynchronous NACK when client refuses de
                    Adrian Brock Master

                     

                    "timfox" wrote:

                    I can see two options here:

                    1) Let something on the client side implement a reliable store and take over responsibility for message delivery. It can then respond with an ack immediately. I think this violates any idea of having a thin client though, plus it's just passing the same problem to the next thing in the chain IMO.


                    Rule 1 in Middleware design: NEVER TRUST A CLIENT TO BE RELIABLE
                    That is a decision for the end user that requires work on their side.


                    2) Have some way (callback) for the client to inform the server queue object asynchronously that a message is acked/nacked. This would be called by the client side code when the JMS session (or whatever) commits or rolls back. It's my understanding the core classes currently don't support such functionality - I may well be wrong here.


                    Correct, you need this anyway for CLIENT_ACKNOWLEDGE/session.close()/etc.
                    You can also "piggy-back" the acks on top of other client->server requests in the case of
                    DUPS_OK_ACKNOWLEDGE, which is a trick that JBossMQ does NOT do.


                    • 7. Re: Synchronous vs. asynchronous NACK when client refuses de
                      Ovidiu Feodorov Master

                      The code that's currently in CVS does not block the server thread while the client processes the message, but it also does not correctly handle acknowledgment (e.g. CLIENT_ACKNOWLEDGE). However, the goal was to first have the infrastructure in place so we can write tests and stress cases and fine-tune the behavior later.

                      This is how I see things happening. While reading the explanations, please refer to the diagram http://wiki.jboss.org/wiki/attach?page=JBossMessagingDesignDiagrams%2F2004.05.05_Consumer+Acknowledgment.jpg, which presents a simple use case of a topic with only one consumer.

                      Message M1 is delivered by a core thread to the topic (1). A topic is made of a LocalPipe and a PointToMultipointRouter. The LocalPipe first attempts synchronous delivery: the router synchronously pushes M1 to the Consumer C1, which invokes handleCallback() on its callback handler (2). The remoting layer forwards the message, which is eventually delivered to the MessageCallbackHandler on the client side.

                      In a colocated configuration, M1 is delivered to the MessageCallbackHandler by the core thread itself. In a remote configuration, the core thread blocks waiting for the remote call to complete.

                      The MessageCallbackHandler accepts M1 by queuing it in an in-memory staging area (currently called RendezVous, but that is a bad name and it will change; I think ClientDeliveryQueue would be more appropriate) and immediately releases the thread, which is either the remoting thread or the core thread, depending on the configuration (3). This is very fast: it does not block the server thread for long, and not much can go wrong here since no client code is involved. M1 is not acknowledged, however. When the Consumer's callbackHandler.handleCallback() completes (4), the Consumer NACKs the message to the topic, which makes the topic store the message M1 in the MessageStore and the NACK (C1-M1) in the AcknowledgmentStore (5). The storage can be reliable (persistent store) or unreliable (memory).

                      Independently of this process, on the client side, a client thread (the thread that invokes MessageConsumer.receive(), or a thread that polls the RendezVous and calls into the listener, if any) consumes M1 (6) and sooner or later produces the positive acknowledgment (7), which is sent over remoting to the Consumer with a direct synchronous call. The Consumer accesses the AcknowledgmentStore (8) and removes the C1-M1 NACK, which triggers the removal of M1 from the MessageStore if there aren't any other NACKs for M1.
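
Steps (5) and (8) can be sketched as follows. This is an in-memory illustration only: `TopicStore` is a hypothetical class that folds the MessageStore and the AcknowledgmentStore into one object, and its method names are assumptions rather than the core API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory combination of the MessageStore and AcknowledgmentStore described above.
class TopicStore {
    private final Map<String, String> messages = new HashMap<>();      // messageId -> body
    private final Map<String, Set<String>> nacks = new HashMap<>();    // messageId -> consumers that have not ACKed

    // Step (5): the consumer returned without acknowledging, so keep the message and record the NACK.
    synchronized void nack(String consumerId, String messageId, String body) {
        messages.put(messageId, body);
        nacks.computeIfAbsent(messageId, k -> new HashSet<>()).add(consumerId);
    }

    // Step (8): a positive ACK removes the NACK; the message is dropped when no NACKs remain.
    synchronized void ack(String consumerId, String messageId) {
        Set<String> pending = nacks.get(messageId);
        if (pending == null) {
            return;   // unknown message, or already fully acknowledged
        }
        pending.remove(consumerId);
        if (pending.isEmpty()) {
            nacks.remove(messageId);
            messages.remove(messageId);
        }
    }

    synchronized boolean holds(String messageId) { return messages.containsKey(messageId); }
}
```

With two consumers C1 and C2 on the same topic, M1 stays in the store after C1 acknowledges and is removed only when C2 acknowledges as well.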



                      There are several interesting special cases worth commenting on:

                      1. What if the client never acknowledges?

                      After a while, the topic will redeliver M1 and the process repeats until message's time to live expires.

                      2. What happens if a message is redelivered before the client acknowledges it?

                      There are two cases. If the message is still in the RendezVous, the redelivery is simply ignored. However, the following situation is also possible: the message is taken out of the queue (step (6)) and, while the acknowledgment is being sent to the Consumer (7), an independent redelivery puts it back in the queue (3). This means duplicate delivery.

                      3. What if RendezVous fills up?

                      There should be a limit of some sort, after which the RendezVous won't even accept messages. From the server side, nothing changes, messages are NACKed as before and go to the store.
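
Cases 2 and 3 can be illustrated with a small bounded queue. The class name follows the ClientDeliveryQueue suggestion above, but the implementation, the `capacity` parameter and the method names are assumptions made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical bounded client-side staging area.
class ClientDeliveryQueue {
    private final int capacity;
    private final Deque<String> queue = new ArrayDeque<>();
    private final Set<String> queuedIds = new HashSet<>();

    ClientDeliveryQueue(int capacity) { this.capacity = capacity; }

    // Returns false when the message is refused because the queue is full.
    synchronized boolean put(String messageId) {
        if (queuedIds.contains(messageId)) {
            return true;    // redelivery of a message that is still queued: ignore it
        }
        if (queue.size() >= capacity) {
            return false;   // full: refuse; on the server the message stays NACKed in the store
        }
        queue.addLast(messageId);
        queuedIds.add(messageId);
        return true;
    }

    // Step (6): once taken, the id is forgotten, so a later redelivery is queued again (a duplicate).
    synchronized String take() {
        String id = queue.pollFirst();
        if (id != null) {
            queuedIds.remove(id);
        }
        return id;
    }

    synchronized int size() { return queue.size(); }
}
```

The duplicate-delivery window from case 2 is visible here: between take() and the arrival of the acknowledgment on the server, a redelivered copy of the same message is accepted again.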

                      • 8. Re: Synchronous vs. asynchronous NACK when client refuses de
                        Adrian Brock Master

                        You are missing the point.

                        1) You don't need to wait for the message to land on the client
                        2) The acks/nacks can come back at any time - including transactionally (based on the client's acknowledgement policy)
                        3) Don't trust the client to give you *any* acks/nacks, to send them in the correct order or without duplicates,
                        or to behave correctly in any other respect.

                        Instead, what you need to do is deliver the message to something on the serverside
                        that you do trust. This can then *asynchronously* send the messages to the client
                        and *asynchronously* wait for the acks/nacks coming back from the client.

                        If the client does it wrong, you know this from your "something".

                        If the client crashes, you know what you need to NACK.
                        If the client closes the session without nacking, you know what you need to NACK.
                        If the client does it wrong, you have a mechanism to trap the problem and not just
                        follow the misbehaving client blindly.
                        If the transaction manager unilaterally rolls back the transaction, you know what you need to NACK.
                        etc....

                        I called this something the Client service in the original design.
                        In JBossMQ, it is a combination of the ClientConsumer/BasicQueue-acknowledgement maps.
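
The bookkeeping role of that trusted server-side "something" might look like the sketch below. `DeliveryTracker` and its methods are hypothetical names for illustration; this is not the JBossMQ ClientConsumer or BasicQueue code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical server-side tracker of what has been sent to one client session.
class DeliveryTracker {
    private final Set<String> inFlight = new LinkedHashSet<>();

    // Record a message as sent to the client and not yet acknowledged.
    synchronized void delivered(String messageId) { inFlight.add(messageId); }

    // Returns false for an ack the server never expected (duplicate or bogus),
    // so a misbehaving client can be trapped instead of followed blindly.
    synchronized boolean ack(String messageId) { return inFlight.remove(messageId); }

    // Client crash, session close or transaction rollback:
    // everything still in flight is what the server needs to NACK.
    synchronized Set<String> abandon() {
        Set<String> toNack = new LinkedHashSet<>(inFlight);
        inFlight.clear();
        return toNack;
    }
}
```

Because the tracker lives on the server, it still knows the outstanding messages when the client disappears without sending anything.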

                        This does not discount the ability to optimize some paths, e.g.
                        MessageConsumer.receiveNoWait(),
                        but those are optimizations for synchronous behaviour;
                        the core path should be aimed at asynchronous throughput.

                        Indeed on the throughput side, you could even do "ReadAhead" to message listeners whereby when you start the message listener, it sends N messages rather than just one.
                        http://jira.jboss.com/jira/browse/JBAS-1343
                        Of course, that is something the user needs to be able to configure.

                        Or the "Something" can be more tolerant of connection failures and hold state
                        allowing the client to reconnect transparently "Persistent Connections":
                        http://jira.jboss.com/jira/browse/JBAS-1345

                        • 9. Re: Synchronous vs. asynchronous NACK when client refuses de
                          Ovidiu Feodorov Master

                          OK.

                          org.jboss.jms.server.endpoint.Consumer is "Something".

                          I will modify it to use a thread pool to deliver messages asynchronously to the remoting callback. This way, core threads will never touch the remoting layer. Acknowledgments will be sent back to the server by the thread that accepts the message.
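
A minimal sketch of that change, assuming a fixed-size pool and a simplified callback interface. `AsyncConsumer`, `CallbackHandler` and the pool size of 4 are illustrative assumptions, not the actual org.jboss.jms.server.endpoint.Consumer code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Simplified stand-in for the remoting callback handler (assumption).
interface CallbackHandler {
    void handleCallback(String message) throws Exception;
}

// Hypothetical server-side consumer endpoint that never lets a core thread touch remoting.
class AsyncConsumer {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final CallbackHandler handler;

    AsyncConsumer(CallbackHandler handler) { this.handler = handler; }

    // Called by a core thread. Returns at once; false means "not acknowledged yet",
    // the positive acknowledgment arrives later over a separate path.
    boolean handle(String message) {
        pool.execute(() -> {
            try {
                handler.handleCallback(message);   // may block on a slow or remote client
            } catch (Exception e) {
                // Delivery failed; the message simply stays NACKed on the server.
            }
        });
        return false;
    }

    void shutdown() { pool.shutdown(); }
}
```

The core thread pays only the cost of queuing a task, regardless of how slow or unreliable the client is.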

                          • 10. Re: Synchronous vs. asynchronous NACK when client refuses de
                            Ovidiu Feodorov Master

                            Tim

                            I've checked in a partial implementation of the asynchronous acknowledgment handling. What Consumer does with the positive acknowledgment it receives is not defined yet .... it will probably just access the acknowledgment store and submit the ack to it.

                            • 11. Re: Synchronous vs. asynchronous NACK when client refuses de
                              Tim Fox Master

                              Ovidiu-

                              Great. I'll update.

                              BTW I had some more thoughts yesterday regarding our conversation and how we can tie up asynch. acknowledgements with queue browsing, possible changes to the Channel interface etc. (Tri-state stuff??)

                              I'll try to work through the ideas some more while you are in training this week, and let's get together as discussed at the end of the week.

                              Cheers

                              Tim

                              • 12. Re: Synchronous vs. asynchronous NACK when client refuses de
                                Tim Fox Master

                                Hi All-

                                Further to the discussion regarding asynchronous ACKs, here are some thoughts on how we can implement QueueBrowsing on the core classes and how it all works together.

                                The assumption here is that we want to be able to browse any messages in a queue where delivery has not been attempted. We don't want to browse those messages where the message has been delivered but the JMS client has NACKed it.

                                Currently the core classes don't allow us to distinguish those messages that have not had delivery attempted and those that have.

                                A proposal would be to modify the return value of Receiver.handle() from a boolean to one of three states:

                                ACK - the message has been handled synchronously and acknowledged.
                                NACK-DeliveryAttempted - the message is not (yet) acknowledged but delivery is being attempted.
                                NACK-DeliveryNotAttempted - the message is not acknowledged and delivery has not been attempted.

                                The acknowledgement store would also be modified to allow us to distinguish between NACK-DeliveryAttempted and NACK-DeliveryNotAttempted. I.e. it needs to store the tuple <receiver_id, message_id, nack_type (- one of delivery attempted or delivery not attempted)>

                                For the case of a queue:

                                If the queue has no consumers and messages arrive, nacks are stored with state NACK-DeliveryNotAttempted and the messages are stored.

                                The QueueBrowser will only browse NACKed messages with state NACK-DeliveryNotAttempted, so it sees these messages.

                                A consumer is now added to the queue, causing deliver() to be triggered. Messages with either of the two NACK states are sent for delivery. The consumer starts to accept messages for delivery, returning NACK-DeliveryAttempted to the router.

                                A QueueBrowser will not see the messages with state NACK-DeliveryAttempted.
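
The proposal for a queue can be sketched as below. It is simplified to a single receiver, so the stored tuple shrinks to message id plus NACK type. The enum constants, `BrowsableQueue` and the simplified `Receiver` interface are assumptions for illustration, not the existing core classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// The three proposed return states of Receiver.handle().
enum DeliveryState { ACK, NACK_DELIVERY_ATTEMPTED, NACK_DELIVERY_NOT_ATTEMPTED }

// Simplified receiver: takes a message id and reports one of the three states.
interface Receiver {
    DeliveryState handle(String messageId);
}

// Hypothetical queue with a built-in acknowledgement store: messageId -> NACK type.
class BrowsableQueue {
    private final Map<String, DeliveryState> nacks = new LinkedHashMap<>();

    // No consumer yet: store the message with "delivery not attempted".
    void send(String messageId) {
        nacks.put(messageId, DeliveryState.NACK_DELIVERY_NOT_ATTEMPTED);
    }

    // A consumer was added: offer every NACKed message, whatever its NACK type.
    void deliver(Receiver receiver) {
        Iterator<Map.Entry<String, DeliveryState>> it = nacks.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, DeliveryState> entry = it.next();
            DeliveryState result = receiver.handle(entry.getKey());
            if (result == DeliveryState.ACK) {
                it.remove();              // handled synchronously, nothing left to track
            } else {
                entry.setValue(result);   // remember whether delivery was attempted
            }
        }
    }

    // Asynchronous acknowledgement from the client removes the NACK.
    void ack(String messageId) { nacks.remove(messageId); }

    // A browser only sees messages whose delivery has not been attempted.
    List<String> browse() {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, DeliveryState> entry : nacks.entrySet()) {
            if (entry.getValue() == DeliveryState.NACK_DELIVERY_NOT_ATTEMPTED) {
                visible.add(entry.getKey());
            }
        }
        return visible;
    }

    int pending() { return nacks.size(); }
}
```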

                                As messages are asynchronously acknowledged, a callback from the client to the server causes the corresponding NACK to be removed from the acknowledgement store. Currently this is done with a callback interface on the Consumer object (AcknowledgementHandler), which gets invoked from the client. I guess it could also be done by implementing this interface on the Destination classes (??)

                                Messages can be asynch. acknowledged due to either auto acknowledgement, manual acknowledgement, lazy acknowledgement or session commit.

                                For session commit, and potentially lazy acknowledgement and manual acknowledgement, I think we also need to extend the Acknowledgement interface to take a batch of acknowledgements as opposed to a single acknowledgement, since the acks must be processed as an atomic unit for transacted sessions, and for the other cases this could help reduce network traffic (although it is probably too early for optimisations at this stage).

                                This also means I think we need to extend the acknowledgement store to be able to process a batch of ACKs (i.e. forget a batch of NACKs) as an atomic unit, otherwise if only some are reliably persisted because of failure in mid operation we have a problem.
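
The all-or-nothing contract for a batch might be expressed as in the sketch below. It only shows the contract in memory; a persistent store would need a real transaction to get the same guarantee across a failure. `BatchAckStore` and `ackBatch` are hypothetical names.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical acknowledgement store that forgets a batch of NACKs as one unit.
class BatchAckStore {
    private final Set<String> nacks = new HashSet<>();

    synchronized void nack(String messageId) { nacks.add(messageId); }

    // Either every NACK in the batch is forgotten, or none is.
    synchronized boolean ackBatch(Collection<String> messageIds) {
        if (!nacks.containsAll(messageIds)) {
            return false;   // reject the whole batch and leave the store untouched
        }
        nacks.removeAll(messageIds);
        return true;
    }

    synchronized int pending() { return nacks.size(); }
}
```

A transacted session commit would then map to a single ackBatch() call carrying every message consumed in the transaction.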

                                Message re-delivery should only happen when the consumer (re)connects to the queue, Session.recover() is invoked or transacted session rollback occurs (Is that all cases?), so there should be no problem with the message being redelivered when it's already in mid delivery.

                                If a client dies, then its consumer can be removed, and when it reconnects the nacked messages are redelivered.

                                The behaviour of a durable subscription on a Topic should be almost identical, I believe. Actually, this should give us the ability to browse durable subscribers too (if we wanted to do some kind of JBoss-specific JMS extension).

                                Non durable subscriptions shouldn't be affected since we don't store anything.

                                -Tim