6 Replies Latest reply on Feb 22, 2008 5:16 AM by timfox

    Message Pulling

      First off, here's what I'm using:
      JBoss AS 4.2.2 GA
      JBoss Messaging 1.4.0 SP3

      I have a cluster of 3 nodes configured and running. The whole cluster runs on a single computer, so I'm using the binding service to give each node its own ports. I have an MDB that was farmed out to all nodes and is running; it listens on a queue named "Test". I have all the clustering turned on in the messaging config files, but when I send 50 messages to the queue, they all land on a single node. That much is expected, since I'm only making 1 connection from the ClusteredConnectionFactory, so all my messages go to just 1 node's queue. The problem is that some of those messages should eventually be pulled over to another, free node if the other nodes are not doing anything, which they are not.
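
      For reference, the sending side described here boils down to something like the following sketch. The JNDI names (/ClusteredConnectionFactory, /queue/Test) and the JNP port are assumptions for a default JBM 1.4 install, not taken from the actual setup:

        import java.util.Properties;
        import javax.jms.Connection;
        import javax.jms.ConnectionFactory;
        import javax.jms.MessageProducer;
        import javax.jms.Queue;
        import javax.jms.Session;
        import javax.naming.InitialContext;

        public class TestProducer {
            public static void main(String[] args) throws Exception {
                // Assumed JNDI settings for a default JBM 1.4 node; adjust host/port as needed.
                Properties env = new Properties();
                env.put("java.naming.factory.initial", "org.jnp.interfaces.NamingContextFactory");
                env.put("java.naming.provider.url", "jnp://localhost:1099");
                InitialContext ic = new InitialContext(env);

                // One connection from the clustered factory, as described above.
                ConnectionFactory cf = (ConnectionFactory) ic.lookup("/ClusteredConnectionFactory");
                Queue queue = (Queue) ic.lookup("/queue/Test");

                Connection conn = cf.createConnection();
                try {
                    Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                    MessageProducer producer = session.createProducer(queue);
                    for (int i = 0; i < 50; i++) {
                        producer.send(session.createTextMessage("message " + i));
                    }
                } finally {
                    conn.close();
                }
            }
        }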

      The MDB that gets all the messages waits 5 seconds on each message and is capped at handling only 3 messages at a time, so there is plenty of time for another node to notice the messages lingering on the single node and pull them over, but it never happens.
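
      The MDB itself is shaped roughly like this. This is a hypothetical reconstruction: the class name and the queue's JNDI name are assumptions, while the 5-second wait and the cap of 3 concurrent messages come from the description above:

        import javax.ejb.ActivationConfigProperty;
        import javax.ejb.MessageDriven;
        import javax.jms.Message;
        import javax.jms.MessageListener;

        @MessageDriven(activationConfig = {
            @ActivationConfigProperty(propertyName = "destinationType",
                                      propertyValue = "javax.jms.Queue"),
            @ActivationConfigProperty(propertyName = "destination",
                                      propertyValue = "queue/Test"),
            // Cap the bean at 3 concurrent sessions, per the description above.
            @ActivationConfigProperty(propertyName = "maxSession",
                                      propertyValue = "3")
        })
        public class TestMDB implements MessageListener {
            public void onMessage(Message msg) {
                try {
                    Thread.sleep(5000); // simulate 5 seconds of work per message
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }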

      Any ideas why things are not pulling as they should? My only guess, after days of playing with this, is to build out the same environment in multiple VMware virtual machines to ensure that each node has its own IP and there aren't any conflicts. Since I'm not getting any errors at all, I'm saving this option for last, as it will be time consuming.

      Thanks in advance for any help someone may be able to provide.

        • 1. Re: Message Pulling
          timfox

          First thing I would do is take MDBs out of the picture - try creating a vanilla JMS consumer on each node.
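
          A bare-bones consumer of that kind might look like the sketch below, assuming the default local factory binding (/ConnectionFactory) and JNP port; the idea would be to run one copy against each node:

            import java.util.Properties;
            import javax.jms.Connection;
            import javax.jms.ConnectionFactory;
            import javax.jms.Message;
            import javax.jms.MessageConsumer;
            import javax.jms.Queue;
            import javax.jms.Session;
            import javax.naming.InitialContext;

            public class PlainConsumer {
                public static void main(String[] args) throws Exception {
                    // Point this at one node; run another copy against each node's JNDI port.
                    Properties env = new Properties();
                    env.put("java.naming.factory.initial", "org.jnp.interfaces.NamingContextFactory");
                    env.put("java.naming.provider.url", "jnp://localhost:1099");
                    InitialContext ic = new InitialContext(env);

                    ConnectionFactory cf = (ConnectionFactory) ic.lookup("/ConnectionFactory");
                    Queue queue = (Queue) ic.lookup("/queue/Test");

                    Connection conn = cf.createConnection();
                    Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                    MessageConsumer consumer = session.createConsumer(queue);
                    conn.start();

                    while (true) {
                        Message m = consumer.receive(); // blocks until a message arrives
                        System.out.println("received: " + m.getJMSMessageID());
                    }
                }
            }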

          The MDB layer buffers some messages - ready to send to its local MDB instances - which will prevent them from being available to other nodes.

          Secondly, JBM (like pretty much any other messaging system) maintains a consumer buffer of messages - the default is 150, I believe.

          Take a look at the prefetchSize param in the user guide.

          Also, have a search in this forum; there have been several threads on this subject.

          • 2. Re: Message Pulling
            sams

            You are right, 150 is the value for the local buffer of messages. My understanding is that this is the actual local queue slice that pulls from the distributed (clustered) queue, and the whole idea behind message pulling is for another node to notice there are messages sitting there not yet being processed and to pull them over to be processed elsewhere. I have even set this value to 5, hoping that was it, and it didn't change anything as far as I can tell.

            I'll try creating just a standard consumer without using an MDB and see if it's just the MDB caching all the messages on a given node. Hopefully I'll be back here soon with some good news. :)

            Thanks!

            • 3. Re: Message Pulling
              sams

              Thanks for the tip about trying things outside of an MDB. The simple consumer test worked and proved that the queues will pull messages around as needed to get the jobs done as quickly as possible. When doing the same type of test with an MDB, it would never do this. I spent hours tweaking little config files and was about to give up and go to bed, but finally a long shot hit me, and it worked...

              It seems that even though I'm adding messages using the ClusteredConnectionFactory, this only works with a standard consumer. When you start using an MDB, it seems to step in the middle of things and redirect it all to the old ConnectionFactory instead of the clustered one. I even have the @Clustered annotation in my MDB and that doesn't make it work correctly.

              The solution was to simply add the following attributes to the standard ConnectionFactory in the connection-factories-service.xml file:

              <attribute name="SupportsFailover">true</attribute>
              <attribute name="SupportsLoadBalancing">true</attribute>
              <attribute name="PrefetchSize">5</attribute>


              I had to lower the PrefetchSize down to a small number instead of the default 150, or else a node will fetch 150 messages at a time and not let them go. I'm considering setting this to 1 to ensure an even distribution of messages. I have no idea what sort of penalty comes with such a low prefetch, but I can't imagine it would be too bad. If someone knows, please enlighten us.

              Thanks

              • 4. Re: Message Pulling
                timfox

                 

                "sams" wrote:
                Thanks for the tip about trying things outside of an MDB. The simple consumer test worked and proved that the queues will pull messages around as needed to get the jobs done as quickly as possible. When doing the same type of test with an MDB, it would never do this. I spent hours tweaking little config files and was about to give up and go to bed, but finally a long shot hit me, and it worked...

                It seems that even though I'm adding messages using the ClusteredConnectionFactory, this only works with a standard consumer.


                The connection factory you use for sending messages has no bearing on the connection factory used for consuming messages.


                When you start using an MDB, it seems to step in the middle of things and redirect it all to the old ConnectionFactory instead of the clustered one.


                There is no redirection occurring.


                I even have the @Clustered annotation in my MDB and that doesn't make it work correctly.

                The solution was to simply add the following attributes to the standard ConnectionFactory in the connection-factories-service.xml file:
                <attribute name="SupportsFailover">true</attribute>
                <attribute name="SupportsLoadBalancing">true</attribute>
                <attribute name="PrefetchSize">5</attribute>



                You shouldn't change the SupportsFailover or SupportsLoadBalancing attributes - MDBs should always consume from the local node.

                As mentioned before, prefetchSize is the parameter you want to change if you don't want to buffer so many messages. Since you have now reduced it to 5, that is why you are seeing the difference in behaviour.


                I had to lower the PrefetchSize down to a small number instead of the default 150, or else a node will fetch 150 messages at a time and not let them go. I'm considering setting this to 1 to ensure an even distribution of messages. I have no idea what sort of penalty comes with such a low prefetch, but I can't imagine it would be too bad. If someone knows, please enlighten us.


                Consumer flow control works a bit like TCP flow control. The server has a certain number of tokens and continues sending messages as long as it has tokens; when its tokens are depleted it won't send any more. As messages are consumed, more tokens are sent to the server (in chunks) so the server can send more. This prevents the consumer having to go to the server for every message, which would involve a network round trip (RTT) and be slow. This is a standard technique that pretty much every messaging system I know of (apart from JBoss MQ) employs. Setting prefetchSize to 1 effectively means the consumer will go to the server each time to get a message.
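
                Purely as an illustration of the token idea (this is a sketch, not JBM's actual code), the mechanism amounts to something like:

                  // Illustration of credit-based ("token") flow control; not JBM source code.
                  // The server sends while it holds credits; the consumer grants more in chunks.
                  class FlowControlSketch {
                      private int credits; // starts at the prefetch size

                      FlowControlSketch(int prefetchSize) {
                          this.credits = prefetchSize;
                      }

                      // Server side: deliver one buffered message only if a credit remains.
                      synchronized boolean tryDeliver() {
                          if (credits == 0) {
                              return false; // depleted: stop sending until the consumer tops up
                          }
                          credits--;
                          return true;
                      }

                      // Consumer side: after consuming a chunk of messages, grant more credits,
                      // avoiding a network round trip for every individual message.
                      synchronized void grant(int chunk) {
                          credits += chunk;
                      }
                  }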

                So there is a trade-off. Depending on how fast your MDBs consume messages you may not notice a difference. You can only tell this by experimentation.

                Thanks

                • 5. Re: Message Pulling
                  sams

                  Actually, it's not the prefetch setting that is causing this. Today I set the prefetch back to 150 and the behavior is still the same: the MDB pulls messages from its local queue, and when it runs out, the local queue pulls more from another node's local queue that is currently busy. Technically the MDB is always pulling from the local queue, but the local queue is now using the load balancing feature to pull more in from overwhelmed remote queues on other nodes.

                  This is EXTREMELY useful, because now if work messages come in and get backed up on a slow node, or on a node that just gets unlucky with longer processing tasks, the idle nodes will pull some of the work over to keep all the nodes working together.

                  Evidently, when JBoss creates the MDB, it creates the MDB's MessageListener from a Session that was created from the standard ConnectionFactory instead of the ClusteredConnectionFactory. I have not seen a setting in any of the config files so far to change this, so it may be hard-coded somewhere; I'm not sure.

                  Adam

                  • 6. Re: Message Pulling
                    timfox

                     

                    "sams" wrote:

                    Evidently, when JBoss creates the MDB, it creates the MDB's MessageListener from a Session that was created from the standard ConnectionFactory instead of the ClusteredConnectionFactory. I have not seen a setting in any of the config files so far to change this, so it may be hard-coded somewhere; I'm not sure.

                    Adam


                    The JMS provider that you use (by default DefaultJMSProvider) is defined in your config - standardjboss.xml - this is just standard JBoss config.

                    DefaultJMSProvider is itself defined in jms-ds.xml by default. That's where the connection factory that gets used is specified.

                    All this has nothing to do with JBM; it is handled in the AS layer and is exactly the same for any other JMS provider that you might configure the AS to use.

                    It's really not a good idea to change your MDBs to use a connection factory with load balancing and failover enabled. MDBs should always consume from the local node.

                    JBM message redistribution should ensure that messages are redistributed between nodes at the queue level.

                    Have you tuned your MDB pool settings?