20 Replies. Latest reply on Aug 1, 2010 6:39 PM by steven.landers

    Message Redistribution of Persisted Messages on Boot

    steven.landers

      I have two servers in a cluster that is defined explicitly (no auto-discovery).

       

      Message Redistribution works great when both servers are alive, and the number of consumers on the primary server drops to zero.  When this occurs, all messages correctly increment 'messagesAdded' on the primary server, and are immediately redistributed to the backup server.  This works in both directions.

       

      HOWEVER:

      If I have 1000 in-flight messages on the primary queue and I restart that server, so that all consumers now consume from the backup server, the primary server does not redistribute the messages once it has rebooted.  The messages stay there indefinitely, until new consumers arrive.

       

      Does redistribution consider persisted messages on boot?  Perhaps I need to set another setting somewhere?  I realize that 'failover' isn't available when clusters are defined statically, but I didn't really consider message redistribution within the precise scope of failover functionality.

       

      Thanks for your time/consideration

      Steven

        • 1. Re: Message Redelivery of Persisted Messages on Boot
          clebert.suconic

          "the primary server does not redistribute the message"

           

           

          Are you talking about redistribution or redelivery?

           

           

          Redelivery is loading messages from disk and sending them to consumers.

           

           

          Redistribution is sending messages from one node to another.

           

           

          I'm not sure what you're talking about here.

          • 2. Re: Message Redistribution of Persisted Messages on Boot
            steven.landers

            Ah - good point.  Sorry for the lingo.   I meant Redistribution.  The persisted messages are not redistributed upon booting.  I'll rename the thread.

            • 3. Re: Message Redelivery of Persisted Messages on Boot
              clebert.suconic

              I just realized what you're talking about. But you're talking about backup nodes and reconnection to the backup node. (That's not redistribution)

               

               

              In the current version, you have to start the live and backup nodes together. Once the backup node becomes live, you can't add the live node as a backup without restarting both servers.

               

               

              This is something that's being worked on at the moment:  http://community.jboss.org/thread/152610?tstart=30

              • 4. Re: Message Redelivery of Persisted Messages on Boot
                steven.landers

                Actually these aren't Live/Backup pairs.   Think of them as identical nodes, each of which includes the other node of the pair in its cluster.  There's no replication going on, just redistribution when the consumer count drops to zero, via the "redistribution-delay" setting.
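
                For reference, a minimal sketch of the address setting behind the "redistribution-delay" flag. The value 0 matches the configuration posted later in this thread; the match pattern "jms.#" follows the wildcard form used in the HornetQ documentation examples and is an assumption here, not a copy of the posted configuration.

                ```xml
                <address-settings>
                   <!-- Assumed match pattern: "jms.#" is meant to cover every address under the jms prefix -->
                   <address-setting match="jms.#">
                      <!-- 0 = redistribute as soon as the last consumer on the queue closes;
                           the default of -1 leaves redistribution disabled -->
                      <redistribution-delay>0</redistribution-delay>
                   </address-setting>
                </address-settings>
                ```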

                • 5. Re: Message Redelivery of Persisted Messages on Boot
                  steven.landers

                  The 'cluster' is defined in the cluster-connections that just points to the other node - for each node. For our purposes, I refer to one as the 'backup node', but again, that was a lingo problem on my part - sorry
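
                  A statically defined cluster connection of this kind might look like the sketch below, shown for one node only. The cluster name and the connector name "backup-queue-server" are taken from the configuration posted later in this thread; the retry-interval, use-duplicate-detection, forward-when-no-consumers and max-hops values are illustrative assumptions, not settings confirmed by the poster.

                  ```xml
                  <cluster-connections>
                     <cluster-connection name="globalQueueCluster">
                        <!-- Only addresses starting with "jms" are handled by this cluster connection -->
                        <address>jms</address>
                        <!-- Assumed value: milliseconds between reconnection attempts -->
                        <retry-interval>500</retry-interval>
                        <use-duplicate-detection>true</use-duplicate-detection>
                        <!-- false: do not forward messages to a node that has no matching consumers -->
                        <forward-when-no-consumers>false</forward-when-no-consumers>
                        <max-hops>1</max-hops>
                        <!-- Static reference to the other node, used instead of a discovery group -->
                        <connector-ref connector-name="backup-queue-server"/>
                     </cluster-connection>
                  </cluster-connections>
                  ```

                  The other node would carry the same block with its connector-ref pointing back at this node.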

                  • 6. Re: Message Redelivery of Persisted Messages on Boot
                    clebert.suconic

                    It's probably a misconfiguration of your cluster properties.

                     

                    If you believe there's a bug, you would need to validate it against the latest version and provide us with a test replicating the issue (or exact instructions on how to replicate it).

                     

                     

                    Even if it's a misconfiguration on your part, I don't have much information to look at here to help you now. A test would be the best way to go, I think.

                    • 7. Re: Message Redistribution of Persisted Messages on Boot
                      steven.landers

                      Here's some more visibility into my configuration.  I included the configuration for both nodes.  I thought I had the acceptors/connectors configured correctly.

                       

                      NODE A:

                      hornetq-configuration.xml

                      
                      <configuration xmlns="urn:hornetq"
                                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                                   xsi:schemaLocation="urn:hornetq /schema/hornetq-configuration.xsd">
                      <clustered>true</clustered>
                      
                      <log-delegate-factory-class-name>org.hornetq.integration.logging.Log4jLogDelegateFactory</log-delegate-factory-class-name>
                      <bindings-directory>${jboss.server.data.dir}/hornetq/bindings</bindings-directory>
                      <journal-directory>${jboss.server.data.dir}/hornetq/journal</journal-directory>
                      <journal-min-files>10</journal-min-files>
                      <large-messages-directory>${jboss.server.data.dir}/hornetq/largemessages</large-messages-directory>
                      <paging-directory>${jboss.server.data.dir}/hornetq/paging</paging-directory>
                      
                      <connectors>
                           <connector name="netty">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                           <param key="host"  value="10.177.160.24"/>
                           <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>
                           </connector>
                      
                           <connector name="netty-throughput">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                           <param key="host"  value="${jboss.bind.address:localhost}"/>
                           <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>
                           <param key="batch-delay" value="50"/>
                           </connector>
                      
                           <connector name="in-vm">
                           <factory-class>org.hornetq.core.remoting.impl.invm.InVMConnectorFactory</factory-class>
                           <param key="server-id" value="${hornetq.server-id:0}"/>
                           </connector>
                      
                           <connector name="backup-queue-server">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                           <param key="host" value="${hornetq.backup.queue:globalqueuenode2}"/>
                           <param key="port" value="${hornetq.remoting.netty.port:5465}"/>
                           </connector>
                      </connectors>
                      
                      <acceptors>
                        <acceptor name="netty">
                            <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
                            <param key="host"  value="${jboss.bind.address:localhost}"/>
                            <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>
                        </acceptor>
                      
                        <acceptor name="netty-throughput">
                            <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
                            <param key="host"  value="${jboss.bind.address:localhost}"/>
                            <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>
                            <param key="batch-delay" value="50"/>
                            <param key="direct-deliver" value="false"/>
                        </acceptor>
                      
                        <acceptor name="in-vm">
                           <factory-class>org.hornetq.core.remoting.impl.invm.InVMAcceptorFactory</factory-class>
                           <param key="server-id" value="0"/>
                        </acceptor>
                      
                        <acceptor name="primary-queue-server">
                            <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
                            <param key="host" value="${hornetq.primary.queue:globalqueuenode1}"/>
                            <param key="port" value="${hornetq.remoting.netty.port:5465}"/>
                        </acceptor>
                      
                      </acceptors>
                      
                      <cluster-connections>
                           <cluster-connection name="globalQueueCluster">
                                <address>jms</address>
                                <connector-ref connector-name="backup-queue-server" />
                           </cluster-connection>
                      </cluster-connections>
                      
                      <security-settings>
                        <security-setting match="#">
                            <permission type="createNonDurableQueue" roles="guest"/>
                            <permission type="deleteNonDurableQueue" roles="guest"/>
                            <permission type="consume" roles="guest"/>
                            <permission type="send" roles="guest"/>
                        </security-setting>
                      </security-settings>
                      
                      <address-settings>
                        <!--default for catch all-->
                        <address-setting match="#">
                            <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                            <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                            <redelivery-delay>0</redelivery-delay>
                            <max-size-bytes>10485760</max-size-bytes>
                            <message-counter-history-day-limit>10</message-counter-history-day-limit>
                            <address-full-policy>BLOCK</address-full-policy>
                        </address-setting>
                      
                       <address-setting match="jms#">
                           <max-delivery-attempts>10</max-delivery-attempts>
                           <redelivery-delay>60000</redelivery-delay>
                           <redistribution-delay>0</redistribution-delay>
                       </address-setting>
                      
                      </address-settings>
                      
                      </configuration>
                      
                      

                       

                       

                      NODE B:

                      hornetq-configuration.xml

                       

                      <configuration xmlns="urn:hornetq"
                      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                      xsi:schemaLocation="urn:hornetq /schema/hornetq-configuration.xsd">
                      
                      <clustered>true</clustered>
                      
                      <log-delegate-factory-class-name>org.hornetq.integration.logging.Log4jLogDelegateFactory</log-delegate-factory-class-name>
                      <bindings-directory>${jboss.server.data.dir}/hornetq/bindings</bindings-directory>
                      <journal-directory>${jboss.server.data.dir}/hornetq/journal</journal-directory>
                      <journal-min-files>10</journal-min-files>
                      <large-messages-directory>${jboss.server.data.dir}/hornetq/largemessages</large-messages-directory>
                      <paging-directory>${jboss.server.data.dir}/hornetq/paging</paging-directory>  
                      
                      <connectors>
                           <connector name="netty">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                           <param key="host"  value="10.177.160.113"/>
                           <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>
                           </connector>
                      
                           <connector name="netty-throughput">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                           <param key="host"  value="${jboss.bind.address:localhost}"/>
                           <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>
                           <param key="batch-delay" value="50"/>
                           </connector>
                      
                           <connector name="in-vm">
                           <factory-class>org.hornetq.core.remoting.impl.invm.InVMConnectorFactory</factory-class>
                           <param key="server-id" value="${hornetq.server-id:0}"/>
                           </connector>
                      
                           <connector name="primary-queue-server">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                           <param key="host" value="${hornetq.primary.queue:globalqueuenode1}"/>
                           <param key="port" value="${hornetq.remoting.netty.port:5465}"/>
                           </connector>
                      </connectors>
                      
                      <acceptors>
                           <acceptor name="netty">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
                           <param key="host"  value="${jboss.bind.address:localhost}"/>
                           <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>
                           </acceptor>
                      
                           <acceptor name="netty-throughput">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
                           <param key="host"  value="${jboss.bind.address:localhost}"/>
                           <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>
                           <param key="batch-delay" value="50"/>
                           <param key="direct-deliver" value="false"/>
                           </acceptor>
                      
                           <acceptor name="in-vm">
                           <factory-class>org.hornetq.core.remoting.impl.invm.InVMAcceptorFactory</factory-class>
                           <param key="server-id" value="0"/>
                           </acceptor>
                      
                           <acceptor name="backup-queue-server">
                           <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
                           <param key="host"  value="${hornetq.backup.queue:globalqueuenode2}"/>
                           <param key="port"  value="${hornetq.remoting.netty.port:5465}"/>
                           </acceptor>
                      </acceptors>
                      
                      <cluster-connections>
                           <cluster-connection name="globalQueueCluster">
                           <address>jms</address>
                           <connector-ref connector-name="primary-queue-server" />
                           </cluster-connection>
                      </cluster-connections>
                      
                      <security-settings>
                           <security-setting match="#">
                           <permission type="createNonDurableQueue" roles="guest"/>
                           <permission type="deleteNonDurableQueue" roles="guest"/>
                           <permission type="consume" roles="guest"/>
                           <permission type="send" roles="guest"/>
                           </security-setting>
                      </security-settings>
                      
                      <address-settings>
                           <!--default for catch all-->
                           <address-setting match="#">
                                <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                                <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                                <redelivery-delay>0</redelivery-delay>
                                <max-size-bytes>10485760</max-size-bytes>
                                <message-counter-history-day-limit>10</message-counter-history-day-limit>
                                <address-full-policy>BLOCK</address-full-policy>
                           </address-setting>
                      
                           <address-setting match="jms#">
                                <max-delivery-attempts>10</max-delivery-attempts>
                                <redelivery-delay>60000</redelivery-delay>
                                <redistribution-delay>0</redistribution-delay>
                           </address-setting>
                      </address-settings>
                      
                      </configuration>
                       
                      
                      
                      
                      • 8. Re: Message Redistribution of Persisted Messages on Boot
                        steven.landers

                        Quick example scenario of my issue, for clarity:

                         

                        This works:

                        1) I send a message to Node A, and it has no consumers

                        2) The message is successfully redistributed to Node B, because it has consumers.

                         

                        This doesn't work:

                        1) I send 1000 messages to Node A, and it HAS consumers.

                        2) I stop Node A's server while there are messages left in the queue

                        3) Node B gets all of Node A's consumers (via consumer spring wiring)

                        4) I start Node A's server again

                        5) Node B keeps its consumers (good) - but Node A does not send its in-flight messages to Node B

                        • 9. Re: Message Redistribution of Persisted Messages on Boot
                          steven.landers

                          Actually, never mind about what works... neither situation works for me right now.

                          • 10. Re: Message Redistribution of Persisted Messages on Boot
                            steven.landers

                            OK, I've restored normal functionality; once again, only the second case above is failing.  Basically, once the messages are persisted, they are stuck on the server and are not redistributed.

                            • 11. Re: Message Redistribution of Persisted Messages on Boot
                              steven.landers

                              I seem to be able to get the desired behavior by doing this:

                               

                              1) Set up a core bridge from Node A to the queue on Node B

                              2) Set up a core bridge from Node B to the queue on Node A

                               

                              Since only one of these nodes has a large set of consumers, and the other only has one consumer (the core bridge), the messages are pushed to the node with the large number of consumers. 
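
                              A rough sketch of what one of the two bridges could look like on Node A; Node B would carry the mirror image with its own connector. The bridge name and the queue name "jms.queue.exampleQueue" are placeholders, and the retry values are assumptions. Only the connector name "backup-queue-server" comes from the configuration posted above.

                              ```xml
                              <bridges>
                                 <!-- Hypothetical bridge name -->
                                 <bridge name="node-a-to-node-b">
                                    <!-- Local queue the bridge consumes from (placeholder name) -->
                                    <queue-name>jms.queue.exampleQueue</queue-name>
                                    <!-- Address on the remote node that receives the forwarded messages -->
                                    <forwarding-address>jms.queue.exampleQueue</forwarding-address>
                                    <!-- Assumed value: milliseconds between reconnection attempts -->
                                    <retry-interval>1000</retry-interval>
                                    <!-- -1 = keep retrying the connection indefinitely -->
                                    <reconnect-attempts>-1</reconnect-attempts>
                                    <!-- true: the bridge adds a duplicate-ID property to each forwarded message -->
                                    <use-duplicate-detection>true</use-duplicate-detection>
                                    <connector-ref connector-name="backup-queue-server"/>
                                 </bridge>
                              </bridges>
                              ```

                              A bridge acts as one more consumer on its source queue, which fits the behavior described above, where the node whose only consumer is the bridge hands its messages to the other node.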

                               

                              Side Effect:

                              Occasionally I see this in the log:

                              2010-07-30 14:33:25,407 WARN  [org.hornetq.core.postoffice.impl.PostOfficeImpl] (Old I/O server worker (parentId: 25663439, channelId: 1509923, null => globalqueuenode1/10.177.160.24:5465)) Duplicate message detected - message will not be routed

                               

                              But...all of the thousands of messages I push through both nodes end up on one node, and out to consumers, as desired.

                               

                              Thoughts?  I would much rather use Redistribution for this. 

                              • 12. Re: Message Redistribution of Persisted Messages on Boot
                                timfox

                                I don't really understand what you're trying to achieve here, e.g. I don't understand what this means:

                                 

                                Node B gets all of Node A's consumers (via consumer spring wiring)

                                 

                                You'd need to replicate the issue without using Spring. I'd also suggest having a look at the wiki article on "how to report an issue".

                                • 13. Re: Message Redistribution of Persisted Messages on Boot
                                  steven.landers

                                  Three ways of explaining the issue, depending on how much reading is desired:

                                   

                                  In a very concise statement:

                                  Messages, once persisted, are not redistributed if the consumers on that node drop to 0.  

                                   

                                  In a step-by-step use case:

                                  Prerequisite:  Two nodes, A & B, in a statically defined HornetQ cluster

                                  1) I send 1000 messages to Node A, and it HAS consumers.

                                  2) I stop Node A's server while there are messages left in the queue

                                  3) Node B gets all of Node A's consumers (via consumer spring wiring).  Who the consumers are, or how they consume, doesn't matter: the consumers failed over, and how they did so is out of scope.

                                  4) I start Node A's server again

                                  5) Node B keeps its consumers (good) - but Node A, with 0 consumers, does not send its in-flight messages to Node B with consumers.

                                  5a)  Any NEW messages ARE redistributed to Node B, even if they try to arrive at Node A.

                                   

                                  A State Diagram with a COLORED HEADER:

                                  STATE | NODE A | NODE B
                                  1 | NODE A has 25 consumers, 0 messages | NODE B has 0 consumers, 0 messages
                                  2 | NODE A has 25 consumers, 1000 messages | NODE B has 0 consumers, 0 messages
                                  3 | NODE A has 25 consumers, 850 messages (processing) | NODE B has 0 consumers, 0 messages
                                  4 | NODE A is turned off, 850 messages are persisted | NODE B has 25 consumers, 0 messages (consumers failed over.  Regardless of how, it has the consumers now.  This is out of scope.)
                                  5 | NODE A is turned on, 0 consumers, 850 messages | NODE B has 25 consumers
                                  6 | NODE A never hands its messages to NODE B | NODE B laughs at NODE A, because NODE B has the consumers, and NODE A has the messages

                                   

                                   

                                  That's my issue.  I'd be happy to answer any questions, provide any configuration files, or give other information as desired. 

                                   

                                  Few Fundamental Questions that may help:

                                  1) Should messages be redistributed once they are persisted?  (I thought they would)

                                  2) Does redistribution only occur some time after consumers DROP to 0?  (Must the server be up when the consumers drop to 0?)

                                  3) Does redistribution occur ONLY on messages the moment they arrive?

                                   

                                  I may have my configuration incorrect.  Helpfully, I included my whole configuration file.  Forums are great for users to help each other find solutions.  I think I've been clear, but if not, do ask questions.  

                                  • 14. Re: Message Redistribution of Persisted Messages on Boot
                                    timfox

                                    You didn't say what version you are running. Please make sure it's the latest.
