36 Replies. Latest reply on Jun 1, 2009 9:23 AM by jmesnil

    Messages are lost on Queue?

    clebert.suconic

      I have executed one perfSender/perfListener like this:


      ant perfSender -Dsess.trans=true

      ant perfListener -Dsess.ackmode=AUTO_ACK


      (notice: this is NonPersistent)


      In about 1 in 10 tries, I'm not getting all the messages on the listener.


      I was having problems with messages not being deleted from the journal, so I decided to run a test without persistence, and I'm still losing messages.

      Perhaps messages are sent asynchronously and something is ignoring an eventual socket error? This needs to be investigated.
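
To find out which messages go missing, and not only how many, the sender could number the messages and the listener could track the numbers it sees. The following is a minimal sketch under that assumption: `SequenceTracker` is a hypothetical helper, not part of perfSender/perfListener, and it assumes each message carries a sequence number from 0 to expected-1 (for example in a message property).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper: records which sequence numbers a listener has seen.
final class SequenceTracker {
    private final int expected;
    private final BitSet seen;
    private int duplicates;

    SequenceTracker(int expected) {
        this.expected = expected;
        this.seen = new BitSet(expected);
    }

    // Call from the listener with the sequence number read from the message.
    synchronized void record(int seq) {
        if (seq < 0 || seq >= expected) {
            throw new IllegalArgumentException("unexpected sequence number " + seq);
        }
        if (seen.get(seq)) {
            duplicates++;
        } else {
            seen.set(seq);
        }
    }

    // Sequence numbers that were never recorded, in ascending order.
    synchronized List<Integer> missing() {
        List<Integer> result = new ArrayList<>();
        for (int i = seen.nextClearBit(0); i < expected; i = seen.nextClearBit(i + 1)) {
            result.add(i);
        }
        return result;
    }

    // Number of times a sequence number arrived more than once.
    synchronized int duplicates() {
        return duplicates;
    }
}
```

Calling `missing()` once the listener has gone quiet would show whether the gap is a contiguous block (for example the first or last N messages) or scattered, which may help narrow down where the messages are dropped.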


        • 1. Re: Messages are lost on Queue?
          timfox

          This looks ok now, I guess it was related to the same MINA issue we had the other day.

          • 2. Re: Messages are lost on Queue?
            clebert.suconic

            I just did another test and it happened again.

            It actually looks like AutoACK is losing messages.

            • 3. Re: Messages are lost on Queue?
              clebert.suconic

               

              It actually looks like AutoACK is losing messages.


              Well, actually no: perfListener didn't receive all the messages.

              So something is definitely making it lose messages.

              • 4. Re: Messages are lost on Queue?
                timfox

                It doesn't really help much to say "it's not working" and that's it. At the minimum how to replicate this would be important.

                Reminds me of one of those support posts "JBoss is not working, please fix immediately" ;)

                • 5. Re: Messages are lost on Queue?
                  clebert.suconic

                   

                  "timfox" wrote:
                  It doesn't really help much to say "it's not working" and that's it. At the minimum how to replicate this would be important.

                  Reminds me of one of those support posts "JBoss is not working, please fix immediately" ;)


                  Just read the first post:



                  "Clebert Suconic on the first post" wrote:

                  I have executed one perfSender/perfListener like this:

                  ant perfSender -Dsess.trans=true

                  ant perfListener -Dsess.ackmode=AUTO_ACK


                  (notice: this is NonPersistent)


                  In about 1 in 10 tries, I'm not getting all the messages on the listener.




                  • 6. Re: Messages are lost on Queue?
                    timfox

                    Ok sorry, I should have read your first post more carefully! :)

                    I have just run perfListener/Sender as you mentioned 12 times, but saw no problems. I guess this must be some subtle race condition.

                    • 7. Re: Messages are lost on Queue?
                      clebert.suconic

                      Same with me...

                      The funny thing is, yesterday it happened the first time I tried. I'm sure I had my SVN checkout updated.

                      Now I'm not able to replicate it any more.

                      • 8. Re: Messages are lost on Queue?
                        clebert.suconic

                        Just found what I was doing when this happened.


                        I - Start the server:
                        ant runServer


                        II - Start a consumer:
                        ant perfListener -Ddrain.queue=false

                        III - Kill the consumer, start the consumer again:
                        ant perfListener -Ddrain.queue=false


                        IV - Start the sender
                        ant perfSender


                        You will miss about 20K messages on that run. It doesn't matter if you wait some time before restarting the listener; you will always lose messages.

                        • 9. Re: Messages are lost on Queue?
                          timfox

                          Maybe the "lost" messages are in the killed consumer which still exists on the server until server side cleanup kicks in.

                          This would be correct behaviour.

                          • 10. Re: Messages are lost on Queue?
                            clebert.suconic

                             

                            "timfox" wrote:
                            Maybe the "lost" messages are in the killed consumer which still exists on the server until server side cleanup kicks in.

                            This would be correct behaviour.



                            As I said:

                            "Clebert" wrote:
                            It doesn't matter if you wait some time before restarting the listener; you will always lose messages.


                            And besides, after some time those messages are supposed to come back to the queue when cleanup kicks in, and that's not happening.
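
The expectation discussed here (messages delivered to a killed consumer but never acknowledged should reappear on the queue once server-side cleanup runs) can be sketched with a toy in-memory model. This is not JBoss Messaging code; `ToyQueue` and `ToyConsumer` are hypothetical names used only to illustrate the semantics.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy in-memory model of a queue, for illustration only.
final class ToyQueue {
    private final Deque<String> queue = new ArrayDeque<>();

    void send(String msg) {
        queue.addLast(msg);
    }

    // Hands the next message to a consumer; it is tracked as unacked until ack() is called.
    String deliver(ToyConsumer c) {
        String msg = queue.pollFirst();
        if (msg != null) {
            c.unacked.addLast(msg);
        }
        return msg;
    }

    void ack(ToyConsumer c, String msg) {
        c.unacked.remove(msg);
    }

    // Cleanup of a dead consumer: its unacked messages return to the
    // head of the queue in their original delivery order.
    void cleanup(ToyConsumer c) {
        while (!c.unacked.isEmpty()) {
            queue.addFirst(c.unacked.pollLast());
        }
    }

    int size() {
        return queue.size();
    }
}

// Holds the messages delivered to this consumer but not yet acknowledged.
final class ToyConsumer {
    final Deque<String> unacked = new ArrayDeque<>();
}
```

In this model, messages can only be "lost" between the kill and the call to `cleanup`; after that they must be deliverable again. The behaviour reported in this thread differs from the model in that the messages never come back.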

                            • 11. Re: Messages are lost on Queue?
                              timfox

                              So where are they going then?

                              • 12. Re: Messages are lost on Queue?
                                clebert.suconic

                                 

                                "timfox" wrote:
                                So where are they going then?


                                That's something to be investigated...
                                I can do that later if nobody does it before me.

                                For now I was just updating this thread with how to replicate the issue.

                                • 13. Re: Messages are lost on Queue?
                                  clebert.suconic

                                  There's definitely something going on.

                                  ServerSessionImpl:cancel is called at the end of perfListener to place messages that were delivered but not consumed back on the queues. That's a regular operation, but cancel is also canceling deliveries of messages that were not sent yet.

                                  I had a few problems with reference counting when I ran perfSender/perfListener this way:


                                  ant perfSender -Dmessage.count=400000 -Dmessage.warmup.count=20000

                                  and then:

                                  run perfListener twice, as:

                                  ant perfListener -Ddrain.queue=false
                                  ant perfListener -Ddrain.queue=false


                                  The last one won't receive 20K messages, and I believe ServerSessionImpl.cancel plays a part in this bug.

                                  This is the next thing I'm going to do as soon as I'm done with paging. I just want to make sure there isn't anything bigger than expected going on here.

                                  • 14. Re: Messages are lost on Queue?
                                    clebert.suconic

                                    The problem is that when you CTRL-C the client, the listeners are not informed about the destroy.

                                    (Actually before Tim's commit today, RemotingConnection::destroy was never being called.)

                                    So I was able to fix this by also calling the listeners upon destroy.

                                    A side effect is that I'm now getting this exception on regular closes, as destroy is being called either way:

                                    " Session with name 6dbb02fb-842d-11dd-8672-00007f000001 was already removed"


                                    So, could someone (Tim/Jeff) please take a look at what the proper way of doing this is?

                                    I have committed a fix and a test case, and you can follow the diffs on this JIRA:

                                    https://jira.jboss.org/jira/browse/JBMESSAGING-1421


                                    BTW: I feel the FailureListener API is a bit confusing. Shouldn't it be called ConnectionListener and, instead of only a failure callback, also have a connectionClosed method?
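
As a rough illustration of the two points in this post, the sketch below shows a listener interface with both a failure and a close callback, plus a guard so that cleanup runs only once whether a regular close or a destroy happens first. It is only a sketch under stated assumptions: `ConnectionListener` with these two methods and `ToyConnection` are hypothetical, and this is not the actual `FailureListener` API, `RemotingConnection`, or the fix committed for JBMESSAGING-1421.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical interface sketching the suggestion above.
interface ConnectionListener {
    void connectionFailed(Exception cause);

    void connectionClosed();
}

// Toy connection: notifies listeners exactly once, whichever path runs first.
final class ToyConnection {
    private final List<ConnectionListener> listeners = new CopyOnWriteArrayList<>();
    private final AtomicBoolean finished = new AtomicBoolean(false);

    void addListener(ConnectionListener listener) {
        listeners.add(listener);
    }

    // Regular close requested by the client.
    void close() {
        if (finished.compareAndSet(false, true)) {
            for (ConnectionListener l : listeners) {
                l.connectionClosed();
            }
        }
    }

    // Teardown path, e.g. after the client was killed; a no-op if close() already ran.
    void destroy(Exception cause) {
        if (finished.compareAndSet(false, true)) {
            for (ConnectionListener l : listeners) {
                l.connectionFailed(cause);
            }
        }
    }
}
```

With a guard of this kind, a listener that removes the session would run once per connection, so a destroy following a regular close would not attempt a second removal.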
