2 Replies Latest reply on Dec 16, 2011 11:09 AM by ejb3workshop

Failover / HA not working in stand-alone mode

ejb3workshop Dec 16, 2011 6:15 AM

I am trying to set up an HA environment consisting of two servers. The data folder, including the journal, is shared between the servers via an NFS share. The directory /mnt/share is common to both systems. After some fiddling with the configuration I managed to get failover working (I think).

When I stop server1 using CTRL+C the other (server1_standby) seems to take over and become the live server.

My test client sends a series of simple text messages to a queue hosted on the server. It uses the JNDI and JMS APIs rather than HornetQ-specific code. It does a standard JNDI lookup and retrieves the SpecialConnectionFactory configured as shown below / attached.

The problem which occurs is that the client (attached) keeps logging the following message

Exception Session is closed during sending

rather than failing over to the backup server. My question is: should I expect it to continue? I don't mind it getting some exceptions while failover takes place, but I am hoping to avoid having to reconnect to the backup server manually. The included examples seem to suggest this is possible, even though they focus on message acknowledgment rather than sending.

Any pointers on what to try to get HA with failover working correctly?

Also, if I restart the stopped server, the client doesn't seem to resume sending messages. Any pointers on how to get this working? Using the supplied examples I managed to get this working. I tried to compare the server configurations, but haven't been able to find the relevant difference.

I am using hornetq-2.2.5.Final on Linux with Java 1.7.0_01 64 bit

Thanks in advance

Here are extracts from my configuration. The complete files are attached to the discussion.

hornetq-jms.xml

<connection-factory name="NettyConnectionFactory">
    <xa>true</xa>
    <ha>true</ha>
    
    <retry-interval>1000</retry-interval>
    
    <retry-interval-multiplier>1.0</retry-interval-multiplier>
    
    <reconnect-attempts>-1</reconnect-attempts>
    <client-failure-check-period>100</client-failure-check-period>
    <failover-on-server-shutdown>true</failover-on-server-shutdown>
    <failover-on-initial-connection>true</failover-on-initial-connection>
    <discovery-group-ref discovery-group-name="dg-group1"/>
    <connectors>
      <connector-ref connector-name="netty"/>
    </connectors>
    <entries>
      <entry name="/SpecialConnectionFactory"/>
    </entries>
    <connection-load-balancing-policy-class-name>org.hornetq.api.core.client.loadbalance.RandomConnectionLoadBalancingPolicy</connection-load-balancing-policy-class-name>
</connection-factory>

My HornetQ configuration file (hornetq-configuration.xml) contains the following details:

<configuration xmlns="urn:hornetq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="urn:hornetq /schema/hornetq-configuration.xsd">

<clustered>true</clustered>
<shared-store>true</shared-store>
<backup>${backup:false}</backup>
<allow-failback>true</allow-failback>
<failover-on-shutdown>true</failover-on-shutdown>
<paging-directory>${data.dir:../data}/paging</paging-directory>
<bindings-directory>${data.dir:../data}/bindings</bindings-directory>
<journal-directory>${data.dir:../data}/journal</journal-directory>
<journal-min-files>10</journal-min-files>

1. Re: Failover / HA not working in stand-alone mode

ataylor Dec 16, 2011 6:39 AM (in response to ejb3workshop)

All the connection factories in your JMS config have the same name, so I'm guessing you aren't actually using the one you've configured.
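
As a rough illustration of that point, and assuming the `name` attribute is what has to be unique per factory, the JMS config could give each factory its own name. The names `SpecialNettyConnectionFactory` and `ExampleNettyConnectionFactory` below are made up for the example; the entries and the `netty` connector are taken from the posted config, and the remaining settings are left out for brevity.

```xml
<!-- Sketch only: the factory names are hypothetical, chosen just to be distinct. -->
<connection-factory name="SpecialNettyConnectionFactory">
    <ha>true</ha>
    <connectors>
      <connector-ref connector-name="netty"/>
    </connectors>
    <entries>
      <!-- JNDI name the client looks up; unchanged from the original post -->
      <entry name="/SpecialConnectionFactory"/>
    </entries>
</connection-factory>

<connection-factory name="ExampleNettyConnectionFactory">
    <connectors>
      <connector-ref connector-name="netty"/>
    </connectors>
    <entries>
      <entry name="/ExampleConnectionFactory"/>
    </entries>
</connection-factory>
```

The JNDI lookup in the client would stay the same, since it uses the entry name rather than the factory name.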
2. Re: Failover / HA not working in stand-alone mode

ejb3workshop Dec 16, 2011 11:09 AM (in response to ataylor)

I thought the name was set in the entry name:
<entry name="/SpecialConnectionFactory"/>
This is also the name I am using for the JNDI lookup. I have several connection factories configured:
<entry name="/SpecialConnectionFactory"/>...
<entry name="/ExampleConnectionFactory"/>...
<entry name="/ConnectionFactory"/>...
<entry name="/XAThroughputConnectionFactory"/>...
I based this on the default configuration (config/stand-alone/clustered/hornetq-jms.xml) included with the download. In that example there are several different connection factories named NettyConnectionFactory but with different "entry names". Please could you confirm that this is the issue? I am not quite sure what NettyConnectionFactory refers to, as I haven't been able to find any references to it outside this file. I thought it was a reference back to the class used for the connection factory.