2 Replies Latest reply on Mar 23, 2012 4:32 AM by Alexander Hartner

    Failover in stand-alone mode

    Alexander Hartner Expert

      My client connects to HornetQ using JNDI (java.naming.provider.url=jnp://10.1.40.27:1099,jnp://10.1.40.26:2099), which lists both servers. With either server running on its own, the client is able to connect and send messages. However, with both servers running and one of them failing, the client can no longer reconnect.

       

      Here is my test scenario:

       

      1. Start both servers
      2. Start the client which sends a series of messages
      3. Similar to the ApplicationLayerFailover example, when an exception occurs my client reconnects to the server by performing a new JNDI lookup
      4. Stop one of the servers, usually the one listed first in the JNDI host list
      5. The client reports the exception and tries to reconnect

       

      At this point it encounters a problem connecting to the first host in the list. Since the first host was stopped, I don't expect it to be able to connect there, but I was hoping the lookup would fail over to the second host in the list.

       

      javax.naming.CommunicationException [Root exception is java.rmi.ConnectException: Connection refused to host: 10.1.40.27; nested exception is:
          java.net.ConnectException: Connection refused: connect]
          at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:839)
          at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:686)
          at javax.naming.InitialContext.lookup(InitialContext.java:392)
          at com.abc.hornetq.ClientSender.connect(ClientSender.java:33)
          at com.abc.hornetq.ClientSender.onException(ClientSender.java:93)
          at org.hornetq.jms.client.HornetQConnection$JMSFailureListener$1.run(HornetQConnection.java:651)
          at java.lang.Thread.run(Thread.java:662)
      Caused by: java.rmi.ConnectException: Connection refused to host: 10.1.40.27; nested exception is:
          java.net.ConnectException: Connection refused: connect
          at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601)
          at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198)
          at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
          at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110)
          at org.jnp.server.NamingServer_Stub.lookup(Unknown Source)
          at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:726)
          ... 6 more
      Caused by: java.net.ConnectException: Connection refused: connect
          at java.net.PlainSocketImpl.socketConnect(Native Method)
          at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
          at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
          at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
          at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
          at java.net.Socket.connect(Socket.java:529)
          at java.net.Socket.connect(Socket.java:478)
          at java.net.Socket.<init>(Socket.java:375)
          at java.net.Socket.<init>(Socket.java:189)
          at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
          at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128)
          at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595)
          ... 11 more

      In the example, the alternate host is specified explicitly. In my case I want to choose the first available host in the provider list. If I stop my application at this point and restart it, everything works correctly again. The problem only occurs once I am already connected and then try to reconnect using multiple hosts in my provider list.

       

      Any thoughts on how I can get the client to reconnect properly?
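
      For illustration, "choose the first available host in the provider list" could be done at the application level by trying each provider URL on its own instead of handing the whole comma-separated list to a single lookup. The following is only a sketch under assumptions: the class FirstAvailableLookup, the SingleHostLookup callback and the JNP_LOOKUP constant are hypothetical names, and JNP_LOOKUP assumes the JBoss naming client (org.jnp.interfaces.NamingContextFactory) is on the classpath. It has not been verified that this avoids the exception above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class FirstAvailableLookup {

    /** Hypothetical callback: performs one lookup against a single provider URL. */
    interface SingleHostLookup {
        Object lookup(String providerUrl, String name) throws NamingException;
    }

    /** Splits "jnp://a:1099,jnp://b:2099" into individual, trimmed URLs. */
    static List<String> splitProviderUrls(String providerUrls) {
        List<String> urls = new ArrayList<String>();
        for (String part : providerUrls.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                urls.add(trimmed);
            }
        }
        return urls;
    }

    /** Tries each URL in list order; returns the first successful lookup result. */
    static Object lookupFirstAvailable(String providerUrls, String name,
                                       SingleHostLookup lookup) throws NamingException {
        NamingException last = null;
        for (String url : splitProviderUrls(providerUrls)) {
            try {
                return lookup.lookup(url, name);
            } catch (NamingException e) {
                last = e; // remember the failure and move on to the next host
            }
        }
        if (last == null) {
            last = new NamingException("No provider URLs configured");
        }
        throw last; // every host failed: report the most recent failure
    }

    /** Assumption: a fresh InitialContext per URL, using the jnp context factory. */
    static final SingleHostLookup JNP_LOOKUP = new SingleHostLookup() {
        public Object lookup(String providerUrl, String name) throws NamingException {
            Properties env = new Properties();
            env.put(Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory");
            env.put(Context.PROVIDER_URL, providerUrl);
            Context ctx = new InitialContext(env);
            try {
                return ctx.lookup(name);
            } finally {
                ctx.close();
            }
        }
    };
}
```

      In the reconnect path the call would look like lookupFirstAvailable("jnp://10.1.40.27:1099,jnp://10.1.40.26:2099", "/SpecialConnectionFactory", JNP_LOOKUP). Keeping the per-host lookup behind a callback makes the ordering logic testable without a running server.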

        • 1. Re: Failover in stand-alone mode
          Yong Hao Gao Master

          I think what you need is an HA connection factory. How is your connection factory configured on the server?

           

          Howard

          • 2. Re: Failover in stand-alone mode
            Alexander Hartner Expert

            In hornetq-jms.xml I define my connection factory as follows. I have XA and HA enabled.

            <connection-factory name="NettyConnectionFactory">
                <xa>true</xa>
                <ha>true</ha>
                <!-- Pause 1 second between connect attempts -->
                <retry-interval>1000</retry-interval>
                <!-- Multiply subsequent reconnect pauses by this multiplier. This can be used to
                  implement an exponential back-off. For our purposes we just set to 1.0 so each reconnect
                  pause is the same length -->
                <retry-interval-multiplier>1.0</retry-interval-multiplier>
                <!-- Try reconnecting an unlimited number of times (-1 means "unlimited") -->
                <reconnect-attempts>-1</reconnect-attempts>
                <client-failure-check-period>100</client-failure-check-period>
                <failover-on-server-shutdown>true</failover-on-server-shutdown>
                <failover-on-initial-connection>true</failover-on-initial-connection>
                <discovery-group-ref discovery-group-name="dg-group1"/>
                <connectors>
                  <connector-ref connector-name="netty"/>
                </connectors>
                <entries>
                  <entry name="/SpecialConnectionFactory"/>
                </entries>
                <connection-load-balancing-policy-class-name>org.hornetq.api.core.client.loadbalance.RandomConnectionLoadBalancingPolicy</connection-load-balancing-policy-class-name>
            </connection-factory>
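
            To make the retry settings concrete, here is a small sketch of the arithmetic described in the comments: each pause is the retry interval multiplied by the multiplier once per previous attempt. RetryIntervalSketch and pauseBeforeAttempt are hypothetical names used only for illustration; this is not HornetQ's own implementation and ignores any other limits the client may apply.

```java
public class RetryIntervalSketch {

    /**
     * Pause in milliseconds before the given reconnect attempt (0-based),
     * assuming pause = retryInterval * multiplier^attempt.
     */
    static long pauseBeforeAttempt(long retryIntervalMs, double multiplier, int attempt) {
        return Math.round(retryIntervalMs * Math.pow(multiplier, attempt));
    }
}
```

            With the values above (retry-interval 1000, retry-interval-multiplier 1.0) every pause comes out as 1000 ms, so the reconnect attempts are evenly spaced. A multiplier of 2.0 would instead give 1000, 2000, 4000 ms for the first three attempts, which is the exponential back-off the comment mentions.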