47 Replies. Latest reply on Jun 8, 2010 9:10 AM by timfox
      • 30. Re: many topics, paused queues, memory growing
        hantunca

        Tim - as you know, in multi-threaded systems, there sometimes is not a way to 100% reproduce the problem as it is a timing issue - these, of course, are the hardest to debug.  I understand that the code has completely changed (just by looking at it) but I think an assumption was made that this could 'never' happen in production and code was written around that assumption (i.e. 'if refs == 1, then deliverAsync()' as opposed to 'if refs > 0, then deliverAsync()').  From my experience, I'm seeing this happen in production.  I will debug more into it...

         

        Han

        • 31. Re: many topics, paused queues, memory growing
          timfox

          As with any bug report we need hard evidence, so we can investigate.

           

          So far, that has been lacking.

           

          As I mentioned previously, I am not saying there isn't a problem but you need to provide some evidence for it, so we can progress.

           

          I look forward to seeing your new test case.

          • 32. Re: many topics, paused queues, memory growing
            hantunca

            Tim,

             

            ok, I went up the stack a little more.  Here's what I find...

             

            - the producer is running before the client starts up - that's how I start my system.  The producer is using an inVM connection to talk to the embedded server.  The stack trace shows a call on one thread from InVMConnection.run() to ServerSessionImpl.send which eventually gets to the QueueImpl.add method.

             

            - on another thread, I see my consumer connecting via a NettyAcceptor - it calls ServerSessionImpl.createConsumer after the inVM method has called ServerSessionImpl.send on the other thread.

             

            Han

            • 33. Re: many topics, paused queues, memory growing
              timfox

              han tunca wrote:

               

              Tim,

               

              ok, I went up the stack a little more.  Here's what I find...

               

              - the producer is running before the client starts up - that's how I start my system.  The producer is using an inVM connection to talk to the embedded server.  The stack trace shows a call on one thread from InVMConnection.run() to ServerSessionImpl.send which eventually gets to the QueueImpl.add method.

               

              - on another thread, I see my consumer connecting via a NettyAcceptor - it calls ServerSessionImpl.createConsumer after the inVM method has called ServerSessionImpl.send on the other thread.

               

              Han

              That all seems perfectly normal to me. Your point is?

              • 34. Re: many topics, paused queues, memory growing
                hantunca

                Tim,

                 

                attached is a new zip file.  In this version, I've taken out the pause in the 'addConsumer' in QueueImpl - the QueueImpl is basically stock except for some more verbose debugging.  Then, I pumped up the producer to continually produce 50,000 messages in 2 seconds.  Again, to run the test:

                 

                - start the server using server.sh

                - start the producer using producer.sh

                - start the consumer using consumer.sh

                 

                I tried this on my system 5 times and each time I was able to reproduce the problem.  I appreciate you looking into this.

                 

                thanks,

                Han

                • 35. Re: many topics, paused queues, memory growing
                  timfox

                  Thanks. Will take a look tomorrow

                  • 36. Re: many topics, paused queues, memory growing
                    timfox

                    Han, you didn't supply any source code.

                     

                    We need the source of any test program so we can investigate a problem, otherwise we've no idea what your producer and consumer are doing...

                    • 37. Re: many topics, paused queues, memory growing
                      timfox

                      When you provide the source can you also add:

                       

                      1) How to run the test case (exact steps)

                      2) What you would expect to happen

                      3) What actually happens

                       

                      2) and 3) are pretty important so we can see if your expectations are correct or not.

                      • 38. Re: many topics, paused queues, memory growing
                        timfox

                        A couple of observations on your config:

                         

                        1) You're using NIO on both the client and server - this will add extra latency. I remember you mentioned previously you wanted the lowest latency - you need to set NIO to false on both client and server in this case.

                         

                        2) In your server.sh script you're not adding the native AIO libraries to the classpath. This doesn't matter if you don't run on Linux, but on Linux it will mean you fall back to using the slower non-AIO journal.

                         

                        3) In server.sh you're omitting the -XX performance-related params

                         

                        For 2) and 3) take a look at bin/run.sh in the distro to see how it is done.
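For illustration of point 1), here is a minimal sketch of a connector/acceptor parameter map with NIO switched off. It assumes the Netty transport reads the keys "host", "port" and "use-nio"; check these against the TransportConstants of the HornetQ version in use. The class and method names are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of HornetQ.
class TransportParams {
    // Assumed key for the Netty transport's NIO switch; verify for your version.
    static final String USE_NIO = "use-nio";

    // Builds transport parameters that request blocking (non-NIO) I/O.
    static Map<String, Object> blockingIoParams(String host, int port) {
        Map<String, Object> params = new HashMap<>();
        params.put("host", host);
        params.put("port", port);
        params.put(USE_NIO, false);
        return params;
    }

    public static void main(String[] args) {
        // The same kind of map would be used on both the client connector
        // and the server acceptor so that both sides agree.
        System.out.println(blockingIoParams("localhost", 5445));
    }
}
```

In a real setup the map would be handed to the transport configuration on both sides, either programmatically or as the equivalent param entries in the XML config.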

                        • 39. Re: many topics, paused queues, memory growing
                          timfox

                          Some more observations:

                           

                          When running server.sh, it gets an NPE in the security subsystem. This is because you have disabled security but there is a hornetq-users.xml in the classpath.

                           

                          If I open up the jar file hornetq_testcase.jar it actually contains a load of hornetq config including hornetq-configuration.xml, hornetq-queues.xml etc, etc

                           

                          This is all added to the classpath, so when starting your server it will pick up this config rather than the config you specified in the server0 directory.

                           

                          Your jar file should contain only your own classes; it should not contain anything else.

                          • 40. Re: many topics, paused queues, memory growing
                            hantunca

                            Tim,

                             

                            I'm going to answer all of your questions in this one reply - also attached is the source code.

                             

                            - To run the test case, all you need to do is:

                                 - start the server by running server.sh.

                                 - start the producer by running producer.sh

                                 - start the consumer by running consumer.sh

                             

                                 At this point, you should see messages backing up in the queue, i.e. not getting sent to the consumer.  The queue is effectively in a "paused" state but is not marked as paused - to get the messages flowing you would need to call resume on the queue.

                             

                            - What I would expect to happen: I would expect messages to flow to the consumer from the producer - the queue is not in a paused state, so messages should flow.

                             

                            - What actually happens: messages don't flow to the consumer until resume is called on the queue.

                             

                            - Using both NIO on client/server - I understand - I've tested with/without NIO and in my use case it doesn't make that much of a difference, but I will keep this in mind.

                             

                            - AIO libraries - I don't use journaling.

                             

                            - server.sh not using performance params - I've created these scripts for testing only, not for my production system.  On the production system I'm using the performance params.

                             

                            - NPE because of security system - again, because this is a test case, I was not concerned about this.

                             

                            - jar file containing other classes - in order to make the test case as simple as possible, I loaded up all the dependent classes for the test case in one jar file.

                             

                            thanks again for looking into this.

                            Han.

                            • 41. Re: many topics, paused queues, memory growing
                              timfox

                              Han,

                               

                              you need to fix the classpath before we look at it.

                               

                              Having multiple sets of config on the classpath is a recipe for confusion and disaster.

                              • 42. Re: many topics, paused queues, memory growing
                                timfox

                                I don't want to see anything apart from your client classes in your jar

                                • 43. Re: many topics, paused queues, memory growing
                                  hantunca

                                  Attached is a tar file that gets you what you've asked for - the test case has a jar file for just the test case, the 2.1.0 hornetq jars are in the jars directory.  I've modified the classpath for the startup scripts.  Please note in this case that the stock QueueImpl class is being used, so you will not see output from running the server that shows the queue backing up - just use JMX to view the messages in the queue.

                                  • 44. Re: many topics, paused queues, memory growing
                                    timfox

                                    BTW I can see a bug here.

                                     

                                    In the case that you have turned off consumer flow control and started the session before you create the consumer, no flow control message will be sent from the client so no delivery will be prompted on the queue, hence any pre-existing messages won't get delivered.

                                     

                                    It's a trivial fix.

                                     

                                    Workaround is to start the session *after* creating the consumer, OR don't turn off consumer flow control. I'd recommend keeping consumer flow control enabled, it doesn't really have much overhead.
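The ordering problem described above can be illustrated with a toy model. This is not HornetQ code: the class, its fields and its methods are all hypothetical. It assumes only what the post states, plus one simplification: delivery runs only when something prompts it, and the prompts are a session start or a flow control (credits) message sent when the consumer is created.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of "delivery only happens when prompted". Hypothetical, not HornetQ.
class DeliveryPromptModel {
    private final Deque<String> pending = new ArrayDeque<>();
    private final boolean flowControl;
    private boolean started;
    private boolean hasConsumer;
    private int delivered;

    DeliveryPromptModel(boolean flowControl, int preExisting) {
        this.flowControl = flowControl;
        for (int i = 0; i < preExisting; i++) {
            pending.add("m" + i); // messages already in the queue
        }
    }

    void startSession() {
        started = true;
        promptDelivery(); // starting the session prompts delivery
    }

    void createConsumer() {
        hasConsumer = true;
        if (flowControl) {
            promptDelivery(); // the initial credits message prompts delivery
        }
        // With flow control off, nothing prompts delivery here.
    }

    private void promptDelivery() {
        if (!started || !hasConsumer) {
            return; // nobody to deliver to yet
        }
        while (pending.poll() != null) {
            delivered++;
        }
    }

    // Runs one scenario and returns how many pre-existing messages were delivered.
    static int run(boolean flowControl, boolean startFirst, int preExisting) {
        DeliveryPromptModel m = new DeliveryPromptModel(flowControl, preExisting);
        if (startFirst) {
            m.startSession();
            m.createConsumer();
        } else {
            m.createConsumer();
            m.startSession();
        }
        return m.delivered;
    }

    public static void main(String[] args) {
        System.out.println(run(false, true, 3));  // flow control off, start first: 0
        System.out.println(run(false, false, 3)); // flow control off, consumer first: 3
        System.out.println(run(true, true, 3));   // flow control on, start first: 3
    }
}
```

Only the first scenario leaves messages stuck, which matches both workarounds: create the consumer before starting the session, or leave consumer flow control on.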