7 Replies Latest reply on Jun 9, 2016 1:52 AM by meabhi007

    JBoss 7.1 HornetQ cluster using static connectors, client node is failing to connect in some specific scenarios

    meabhi007

      Hi,

       

      I am trying to create a 2-node HornetQ cluster using <static-connectors> (it's an active-active cluster).

      The cluster works fine in the following scenarios:

      • Both HornetQ nodes are up; connect the client.
      • Shut down one HornetQ server.  ***Outcome: the client fails over to the other HornetQ server.
      • Restart the 1st HornetQ server, then shut down the other one.  ***Outcome: the client fails over to the other HornetQ server.
      • Restart the 2nd HornetQ server, then shut down the 1st one.  ***Outcome: the client fails over to the other HornetQ server.

      Conclusion: if the client was started when both HornetQ servers were up and running, failover works fine.

       

       

      The cluster fails in the following scenario:

      • One HornetQ server is up; connect the client.
      • Start the other HornetQ server, then shut down the first one.  ***Outcome: the client keeps logging an exception that the HornetQ server is down and keeps trying to reconnect to it.

      Conclusion: if the client was started when only one HornetQ server was up, then even if the other cluster node is started later, the client is not able to fail over to the active node.

       

       

      On the messaging server node the configuration is as below, where "netty-connector-cluster-node" refers to the FQDN of the other HornetQ server.

      <subsystem xmlns="urn:jboss:domain:messaging:1.2">

      ...

      <connectors>
          <netty-connector name="netty-connector-cluster-node" socket-binding="jms-cluster-node"/>
      </connectors>

      <cluster-connections>
          <cluster-connection name="my-cluster">
              <address>jms</address>
              <connector-ref>netty</connector-ref>
              <static-connectors>
                  <connector-ref>netty-connector-cluster-node</connector-ref>
              </static-connectors>
          </cluster-connection>
      </cluster-connections>

      ...

      </subsystem>

       


      On the client machine the configuration is as below, where "netty-connector-cluster-node-1" and "netty-connector-cluster-node-2" refer to the FQDNs of the two HornetQ servers.

      <subsystem xmlns="urn:jboss:domain:messaging:1.2">

      ...

      <connectors>
          <netty-connector name="netty-connector-cluster-node-1" socket-binding="jms-cluster-node-1"/>
          <netty-connector name="netty-connector-cluster-node-2" socket-binding="jms-cluster-node-2"/>
      </connectors>

      ...

      <jms-connection-factories>
          ...
          <pooled-connection-factory name="hornetq-ra">
              <transaction mode="xa"/>
              <consumer-window-size>0</consumer-window-size>
              <connectors>
                  <connector-ref connector-name="netty-connector-cluster-node-1"/>
                  <connector-ref connector-name="netty-connector-cluster-node-2"/>
              </connectors>
              <entries>
                  <entry name="java:/JmsXA"/>
              </entries>
              <connection-ttl>600000</connection-ttl>
              <client-failure-check-period>120000</client-failure-check-period>
          </pooled-connection-factory>
      </jms-connection-factories>

      ...

      </subsystem>

        • 1. Re: Jboss 7.1 hornetq cluster using static connectors, client node is failing to connect in some specific scenarios
          jbertram

          I'm confused about your use case.  You say you have an active-active configuration (i.e. 2 live cluster nodes), but that your client is failing over in certain circumstances when brokers are shut down.  However, HornetQ doesn't support fail-over in an active-active configuration.  HornetQ only supports fail-over between a live server and a backup (i.e. an active-passive configuration).  Can you clarify your use case and configuration?

           

          Also, when talking about the outcomes of various failure scenarios you say, "client fails over to other client" (emphasis mine).  How is a client failing over to a client?  Or did you mean to say the client is failing over to the other server?

          • 2. Re: Jboss 7.1 hornetq cluster using static connectors, client node is failing to connect in some specific scenarios
            meabhi007

            Hi Justin,

            I might have used the wrong word, "fail-over", so I am trying to explain the use cases once again. The configurations are as mentioned in the original question.

            I have also corrected the typo "client fails over to other client" in the original question.

             

            It's a two-node cluster with an active-active configuration, so both nodes have the topics/queues registered.

            Use case 1 [both servers (Server-A and Server-B) were up when the client was started for the first time]

            • Start both nodes (Server-A and Server-B).
            • Start the client (MDBs) - the client is able to connect to Server-A/Server-B.
            • Shut down Server-A (Server-B is still alive) - the client can connect to Server-B.
            • Shut down Server-B (restart Server-A) - the client can connect to Server-A.
            • Shut down Server-A (restart Server-B) - the client can connect to Server-B.

             

            Use case 2 [only Server-A was up when the client was started]

            • Start one node (Server-A).
            • Start the client (MDBs) - the client is able to connect to Server-A.
            • Start Server-B (keep Server-A alive) - the client is able to connect to Server-A.
            • Shut down Server-A (keep Server-B alive) - the client fails to connect to Server-B.
            • Restart Server-A - the client is able to connect to Server-A.


            My concern with this configuration: if the client gets restarted while one server is down, I will start hitting use case 2 in the production environment.

            • 3. Re: Jboss 7.1 hornetq cluster using static connectors, client node is failing to connect in some specific scenarios
              jbertram

              I believe what you are seeing is the expected behavior with your configuration. 

               

              The key here is to understand that in the first scenario (i.e. when both broker servers are up before the client is started) the client is actually connected to both brokers, since the MDB's sessions will be load-balanced across the 2 nodes.  That means that when one of the broker servers goes down, half of the sessions are still viable while the other half silently attempt to reconnect to the server which is down.  However, in the second scenario (i.e. when only one broker server is up before the client is started) all of the MDB's sessions are connected to that single broker server.  Therefore, when that broker server goes down it doesn't matter that the other one is up; the MDB sessions will keep trying to reconnect to the down server until it comes back up (since the default reconnect-attempts for a pooled-connection-factory is -1, i.e. retry forever).

               

              I think you might be able to change this behavior by setting reconnect-attempts on the pooled-connection-factory to > 0 so that it doesn't retry into infinity.
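              For illustration, a rough sketch of that change against the client configuration posted above (the value 3 is arbitrary, and the exact element ordering should be checked against the messaging schema in use):

              <pooled-connection-factory name="hornetq-ra">
                  ...
                  <reconnect-attempts>3</reconnect-attempts>
                  ...
              </pooled-connection-factory>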

               

              Also, if you really want fail-over then you should configure a live/backup pair.
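              Roughly, a live/backup pair means the backup server's HornetQ configuration carries settings along these lines (a sketch of the shared-store flavor, with both servers pointing at the same journal directories; element names should be verified against the messaging schema version in use):

              ...
              <backup>true</backup>
              <shared-store>true</shared-store>
              <failover-on-shutdown>true</failover-on-shutdown>
              ...

              On the client side the pooled-connection-factory would also set <ha>true</ha> so that its connections follow the live/backup topology.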

              • 4. Re: Jboss 7.1 hornetq cluster using static connectors, client node is failing to connect in some specific scenarios
                meabhi007

                Hi Justin,

                 I tried changing reconnect-attempts to a positive value (1000), but the client still didn't connect to the other JMS server.

                 In the logs I didn't find any trace that it even tried to look for the other JMS server; it just tried 1000 times against the original JMS server and then stopped any further attempts.

                • 5. Re: Jboss 7.1 hornetq cluster using static connectors, client node is failing to connect in some specific scenarios
                  meabhi007

                  Hi Justin,

                  As I mentioned, a positive reconnect-attempts value didn't help.

                  I moved ahead, switched to active-backup mode, and upgraded to WildFly 8.2.0.Final, which seems to be working fine.

                  Thanks for your help.

                  • 6. Re: Jboss 7.1 hornetq cluster using static connectors, client node is failing to connect in some specific scenarios
                    jbertram

                    At first look it seems you're using replication with the "null" persistence manager (i.e. you've disabled persistence).  I doubt the null persistence manager has ever been tested with replication, since the whole point of replication is to replicate persistent data between a live and a backup server, so that could be the problem.  Aside from that, you'd have to provide some more information before I could comment further.

                    • 7. Re: Jboss 7.1 hornetq cluster using static connectors, client node is failing to connect in some specific scenarios
                      meabhi007

                      Hi Justin,

                       

                      Thanks for your reply.

                      That exception resulted from a wrong configuration.

                      By mistake, "persistence-enabled" was set to false.

                      The logs, though, were not clear enough to point to that configuration error.
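                      For anyone else hitting this, the setting lives in the HornetQ server configuration (a sketch, mirroring the snippets earlier in the thread); with replication it has to stay enabled on both the live and the backup server:

                      ...
                      <persistence-enabled>true</persistence-enabled>
                      ...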

                      Once again, thanks for your help.