3 Replies Latest reply on Sep 30, 2011 5:30 AM by srazza

    Two nodes with standalone HornetQ 2.2.5 in HA - missing messages

    srazza

      Hi,

      I have set up two nodes with standalone HornetQ 2.2.5 in an HA configuration (live/backup).

      For the first one (LIVE), in hornetq-configuration.xml I've added the following lines:

       

      ###### HQ1 (LIVE) #######

         <clustered>true</clustered>

         <shared-store>true</shared-store>

         <persistence-enabled>true</persistence-enabled>

         <failover-on-shutdown>true</failover-on-shutdown>

         <allow-failback>false</allow-failback>

         <jmx-management-enabled>true</jmx-management-enabled>

          <journal-directory><PATH_TO_JOURNAL></journal-directory>

      ############################

       

       

      For the second one (BACKUP), in hornetq-configuration.xml I've added the following lines:

       

      ###### HQ2 (BACKUP) #######

         <clustered>true</clustered>

         <shared-store>true</shared-store>

         <backup>true</backup>

         <persistence-enabled>true</persistence-enabled>

         <failover-on-shutdown>true</failover-on-shutdown>

         <allow-failback>false</allow-failback>

         <jmx-management-enabled>true</jmx-management-enabled>

          <journal-directory><PATH_TO_JOURNAL></journal-directory>

      ############################

       

      Both instances started correctly, the first in live mode and the second in backup mode.

       

      We ran a test in which we sent 1000 messages to a Topic defined on both HQ instances.

      Once the publishing process had finished, we verified that all 1000 messages were in the Topic.

      We then started a consumer process, and while it was running we simulated HornetQ failover by shutting down HQ1 and HQ2 alternately.

      Every time, the backup instance switched correctly to live status.

      At the end of the test, when the Topic was empty, we counted the consumed messages and found only 912!

       

      We repeated the same test many times, and every time some messages were lost.

       

      The HornetQ logs didn't show any errors, so I'm wondering where the missing messages are, since they are not in the Topic under test.

      Are the messages really lost? Is there some other check I can run on HornetQ to find out where they are?

      Is there some other configuration parameter that I forgot to add to both configurations?


      Thanks in advance

       

      Stefano

        • 1. Re: Two nodes with standalone HornetQ 2.2.5 in HA - missing messages
          clebert.suconic

          Maybe you should use XA to validate whether the message was acknowledged or not during the failover process.

           

           

          You can use PrintData. Maybe the system failed after the ACK was persisted and before the client got the confirmation, which could look like a "loss" from your point of view.

          • 2. Re: Two nodes with standalone HornetQ 2.2.5 in HA - missing messages
            henry.g.li

            Can you post the hornetq-configuration.xml for both the primary and the failover server? I am trying to configure a pair of HornetQ servers (2.2.5), but the XSD for connector-ref has changed in 2.2.5, so I cannot use

            <connector-ref connector-name="netty-connector"

                    backup-connector-name="backup-connector"/> in the <broadcast-groups> section to specify the failover server, because connector-ref doesn't support attributes.
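
            For comparison, the configuration files posted in the next reply write connector-ref as a plain element whose text is the connector name, with no backup-connector-name attribute. A minimal sketch of a broadcast group in that form follows; the group name, connector name and multicast values are copied from that reply and are only placeholders for your own environment:

            ```xml
            <broadcast-groups>
               <broadcast-group name="HornetQBroadcast">
                  <group-address>231.7.7.7</group-address>
                  <group-port>9876</group-port>
                  <broadcast-period>5000</broadcast-period>
                  <!-- element form: the text names a connector defined under <connectors> -->
                  <connector-ref>netty</connector-ref>
               </broadcast-group>
            </broadcast-groups>
            ```

            In those files the backup is not named in the broadcast group at all. It is a second server started with <backup>true</backup> and <shared-store>true</shared-store>, pointing at the same journal, bindings, paging and large-messages directories as the live server.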

            • 3. Re: Two nodes with standalone HornetQ 2.2.5 in HA - missing messages
              srazza

              Hi Henry,

              here are the contents of the two configuration files:

               

              ################################

              # hornetq-configuration-LIVE.xml #

              ################################

              <configuration xmlns="urn:hornetq"

                             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

                             xsi:schemaLocation="urn:hornetq /schema/hornetq-configuration.xsd">

               

                 <clustered>true</clustered>

                 <id-cache-size>5000</id-cache-size>

                 <persist-id-cache>true</persist-id-cache>

               

                 <shared-store>true</shared-store>

                

                 <large-messages-directory>/SHARED/HQ/large-messages</large-messages-directory>

                 <bindings-directory>/SHARED/HQ/bindings</bindings-directory>

                 <paging-directory>/SHARED/HQ/paging</paging-directory>

                 <journal-directory>/SHARED/HQ/journal</journal-directory>

                 <journal-min-files>10</journal-min-files>

               

                 <journal-type>NIO</journal-type>

                

                 <persistence-enabled>true</persistence-enabled>

                 <failover-on-shutdown>true</failover-on-shutdown>

                 <allow-failback>false</allow-failback>

                

                 <!-- false to disable JMX management for HornetQ -->

                 <jmx-management-enabled>true</jmx-management-enabled>

               

                 <connectors>     

                    <connector name="netty">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>

                    </connector>

                   

                    <connector name="netty-throughput">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>

                       <param key="batch-delay" value="50"/>

                    </connector>

                 </connectors>

               

                 <acceptors>

                    <acceptor name="netty">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>

                    </acceptor>

                   

                    <acceptor name="netty-throughput">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>

                       <param key="batch-delay" value="50"/>

                       <param key="direct-deliver" value="false"/>

                    </acceptor>

                 </acceptors>

               

                 <broadcast-groups>

                    <broadcast-group name="HornetQBroadcast">

                       <local-bind-address>159.213.227.185</local-bind-address>

                       <local-bind-port>9000</local-bind-port>

                       <group-address>231.7.7.7</group-address>

                       <group-port>9876</group-port>

                       <broadcast-period>5000</broadcast-period>

                       <connector-ref>netty</connector-ref>

                    </broadcast-group>

                 </broadcast-groups>

               

                 <discovery-groups>

                    <discovery-group name="HornetQGroup">

                       <local-bind-address>159.213.227.185</local-bind-address>

                       <group-address>231.7.7.7</group-address>

                       <group-port>9876</group-port>

                       <refresh-timeout>10000</refresh-timeout>

                    </discovery-group>

                 </discovery-groups>

               

                 <cluster-connections>

                    <cluster-connection name="HornetQCluster">

                       <address>jms</address>

                       <connector-ref>netty</connector-ref>

                       <discovery-group-ref discovery-group-name="HornetQGroup"/>

                    </cluster-connection>

                 </cluster-connections>

               

                 <security-settings>

                    <security-setting match="#">

                       <permission type="createNonDurableQueue" roles="guest"/>

                       <permission type="deleteNonDurableQueue" roles="guest"/>

                       <permission type="consume" roles="guest"/>

                       <permission type="send" roles="guest"/>

               

               

                       <permission type="createDurableQueue" roles="guest"/>

               

                    </security-setting>

                 </security-settings>

               

                 <address-settings>

                    <!--default for catch all-->

                    <address-setting match="#">

                       <dead-letter-address>jms.queue.DLQ</dead-letter-address>

                       <expiry-address>jms.queue.ExpiryQueue</expiry-address>

                       <redelivery-delay>0</redelivery-delay>

                       <max-size-bytes>10485760</max-size-bytes>      

                       <message-counter-history-day-limit>10</message-counter-history-day-limit>

                       <address-full-policy>BLOCK</address-full-policy>

                      

                  

                       <max-delivery-attempts>-1</max-delivery-attempts>

               

                    </address-setting>

                 </address-settings>

               

              </configuration>

               

              ###################################

              # hornetq-configuration-BACKUP.xml #

              ###################################

               

              <configuration xmlns="urn:hornetq"

                             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

                             xsi:schemaLocation="urn:hornetq /schema/hornetq-configuration.xsd">

               

                 <clustered>true</clustered>

               

                 <id-cache-size>5000</id-cache-size>

                 <persist-id-cache>true</persist-id-cache>

               

                 <shared-store>true</shared-store>

                 <backup>true</backup>

               

                 <large-messages-directory>/SHARED/HQ/large-messages</large-messages-directory>

                 <bindings-directory>/SHARED/HQ/bindings</bindings-directory>

                 <paging-directory>/SHARED/HQ/paging</paging-directory>

                 <journal-directory>/SHARED/HQ/journal</journal-directory>

                 <journal-min-files>10</journal-min-files>

               

                 <journal-type>NIO</journal-type>

               

                 <persistence-enabled>true</persistence-enabled>

                 <failover-on-shutdown>true</failover-on-shutdown>

                 <allow-failback>false</allow-failback>

               

               

                 <!-- false to disable JMX management for HornetQ -->

                 <jmx-management-enabled>true</jmx-management-enabled>

               

               

                 <connectors>

                    <connector name="netty">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>

                    </connector>

                   

                    <connector name="netty-throughput">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>

                       <param key="batch-delay" value="50"/>

                    </connector>

                 </connectors>

               

                 <acceptors>

                    <acceptor name="netty">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.port:5445}"/>

                    </acceptor>

                   

                    <acceptor name="netty-throughput">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>

                       <param key="host"  value="${hornetq.remoting.netty.host:localhost}"/>

                       <param key="port"  value="${hornetq.remoting.netty.batch.port:5455}"/>

                       <param key="batch-delay" value="50"/>

                       <param key="direct-deliver" value="false"/>

                    </acceptor>

                 </acceptors>

               

               

                 <broadcast-groups>

                    <broadcast-group name="HornetQBroadcast">

                       <local-bind-address>159.213.227.186</local-bind-address>

                       <local-bind-port>9000</local-bind-port>

                       <group-address>231.7.7.7</group-address>

                       <group-port>9876</group-port>

                       <broadcast-period>5000</broadcast-period>

                       <connector-ref>netty</connector-ref>

                    </broadcast-group>

                 </broadcast-groups>

               

                 <discovery-groups>

                    <discovery-group name="HornetQGroup">

                       <local-bind-address>159.213.227.186</local-bind-address>

                       <group-address>231.7.7.7</group-address>

                       <group-port>9876</group-port>

                       <refresh-timeout>10000</refresh-timeout>

                    </discovery-group>

                 </discovery-groups>

               

                 <cluster-connections>

                    <cluster-connection name="HornetQCluster">

                       <address>jms</address>

                       <connector-ref>netty</connector-ref>

                       <discovery-group-ref discovery-group-name="HornetQGroup"/>

                    </cluster-connection>

                 </cluster-connections>

               

                 <security-settings>

                    <security-setting match="#">

                       <permission type="createNonDurableQueue" roles="guest"/>

                       <permission type="deleteNonDurableQueue" roles="guest"/>

                       <permission type="consume" roles="guest"/>

                       <permission type="send" roles="guest"/>

               

                       <permission type="createDurableQueue" roles="guest"/>

               

                    </security-setting>

                 </security-settings>

               

                 <address-settings>

                    <!--default for catch all-->

                    <address-setting match="#">

                       <dead-letter-address>jms.queue.DLQ</dead-letter-address>

                       <expiry-address>jms.queue.ExpiryQueue</expiry-address>

                       <redelivery-delay>0</redelivery-delay>

                       <max-size-bytes>10485760</max-size-bytes>      

                       <message-counter-history-day-limit>10</message-counter-history-day-limit>

                       <address-full-policy>BLOCK</address-full-policy>

                      

               

                       <max-delivery-attempts>-1</max-delivery-attempts>

               

                    </address-setting>

                 </address-settings>