2 Replies Latest reply on Aug 3, 2016 7:05 PM by meabhi007

    During a network failure of approx. 2 minutes, the active HornetQ server (part of a static-connectors-based cluster) goes down.

    meabhi007

      Hi,

       

      I am using static-connectors to configure the HornetQ cluster.

      I am attaching the standalone-egain.xml file for reference.
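
      For readers without the attachment, a static-connectors cluster connection in the WildFly 8 messaging subsystem generally has the shape sketched below. This is only an illustrative fragment, not a copy of standalone-egain.xml: the connector names, the socket-binding names, and the cluster-connection name are placeholders assumed for the example.

```xml
<!-- Illustrative sketch only; all names below are assumed placeholders,
     not values taken from the attached standalone-egain.xml. -->
<hornetq-server>
    <connectors>
        <!-- Connector that describes how other nodes reach this server -->
        <netty-connector name="netty" socket-binding="messaging"/>
        <!-- Connector that points at the other cluster node -->
        <netty-connector name="netty-remote-node" socket-binding="messaging-remote-node"/>
    </connectors>
    <cluster-connections>
        <cluster-connection name="my-cluster">
            <address>jms</address>
            <connector-ref>netty</connector-ref>
            <!-- A fixed list of connectors replaces multicast discovery -->
            <static-connectors>
                <connector-ref>netty-remote-node</connector-ref>
            </static-connectors>
        </cluster-connection>
    </cluster-connections>
</hornetq-server>
```

      With this style of configuration each node is told explicitly which connectors to use for its peers, instead of finding them through a discovery group.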

       

      When I simulate a network failure of more than 90 seconds on the active node, the backup node becomes active,

      but the active node shuts down and logs the following messages in the console log.

      I am attaching the complete console log for reference.

       

      I never observed this behavior with multicast-based clustering.

      I am looking for some input on how to make sure that the backup node becomes active but the primary node does not shut down automatically.

       

       

      2016-07-13 23:15:10,589 WARN  [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from null. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]

      2016-07-13 23:15:10,589 INFO  [org.hornetq.core.server] (hornetq-failure-check-thread) HQ221021: failed to remove connection

      2016-07-13 23:15:20,653 WARN  [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from null. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]

      2016-07-13 23:15:20,653 INFO  [org.hornetq.core.server] (hornetq-failure-check-thread) HQ221021: failed to remove connection

      2016-07-13 23:15:20,653 WARN  [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from null. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]

      2016-07-13 23:15:20,653 WARN  [org.hornetq.core.server] (hornetq-failure-check-thread) HQ222092: Connection to the backup node failed, removing replication now: HornetQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=HQ119014: Did not receive data from null. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed.]

        at org.hornetq.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread.run(RemotingServiceImpl.java:765) [hornetq-server-2.4.5.Final.jar:]

       

       

      2016-07-13 23:15:20,653 INFO  [org.hornetq.core.server] (hornetq-failure-check-thread) HQ221021: failed to remove connection

      2016-07-13 23:15:22,674 WARN  [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from null. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]

      2016-07-13 23:15:22,674 INFO  [org.hornetq.core.server] (hornetq-failure-check-thread) HQ221021: failed to remove connection

      2016-07-13 23:16:45,362 INFO  [org.jboss.as.connector.deployment] (MSC service thread 1-2) JBAS010410: Unbound JCA ConnectionFactory [java:/JmsXA]

      2016-07-13 23:16:45,362 INFO  [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) JBAS010409: Unbound data source [java:jboss/datasources/ExampleDS]

      2016-07-13 23:16:45,472 INFO  [org.jboss.as.messaging] (ServerService Thread Pool -- 106) JBAS011605: Unbound messaging object to jndi name java:jboss/exported/test.pl.user

      2016-07-13 23:16:45,503 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) JBAS017535: Unregistered web context: /{CONTEXT_ROOT}/jmsController

      2016-07-13 23:16:45,519 INFO  [org.jboss.as.messaging] (ServerService Thread Pool -- 106) JBAS011605: Unbound messaging object to jndi name java:/javax.jms.knowledgeSessionLoggingQueue

      2016-07-13 23:16:45,550 INFO  [org.jboss.modcluster] (ServerService Thread Pool -- 149) MODCLUSTER000002: Initiating mod_cluster shutdown

      2016-07-13 23:16:45,565 INFO  [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-3) JBAS010418: Stopped Driver service with driver-name = h2

      2016-07-13 23:16:45,597 WARN  [org.jboss.modcluster] (ServerService Thread Pool -- 149) MODCLUSTER000033: Failed to interrupt socket reception.: java.net.NoRouteToHostException: No route to host: Datagram send failed

        at java.net.TwoStacksPlainDatagramSocketImpl.send(Native Method) [rt.jar:1.8.0_45]

        at java.net.DatagramSocket.send(DatagramSocket.java:693) [rt.jar:1.8.0_45]

        at org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl.interruptDatagramReader(AdvertiseListenerImpl.java:209)

        at org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl.stop(AdvertiseListenerImpl.java:228)

        at org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl.destroy(AdvertiseListenerImpl.java:244)

        at org.jboss.modcluster.ModClusterService.shutdown(ModClusterService.java:220)

        at org.wildfly.extension.mod_cluster.ContainerEventHandlerService.stop(ContainerEventHandlerService.java:115)

        at org.jboss.as.clustering.msc.AsynchronousService$2.run(AsynchronousService.java:114) [wildfly-clustering-common-8.2.0.Final.jar:8.2.0.Final]

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_45]

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_45]

        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_45]

        at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.1.Final.jar:2.1.1.Final]

       

       

      2016-07-13 23:16:45,597 WARN  [org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl] (ServerService Thread Pool -- 149) error setting options: java.net.SocketException: error setting options

        at java.net.TwoStacksPlainDatagramSocketImpl.leave(Native Method) [rt.jar:1.8.0_45]

        at java.net.AbstractPlainDatagramSocketImpl.leave(AbstractPlainDatagramSocketImpl.java:187) [rt.jar:1.8.0_45]

        at java.net.MulticastSocket.leaveGroup(MulticastSocket.java:358) [rt.jar:1.8.0_45]

        at org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl.destroy(AdvertiseListenerImpl.java:248)

        at org.jboss.modcluster.ModClusterService.shutdown(ModClusterService.java:220)

        at org.wildfly.extension.mod_cluster.ContainerEventHandlerService.stop(ContainerEventHandlerService.java:115)

        at org.jboss.as.clustering.msc.AsynchronousService$2.run(AsynchronousService.java:114) [wildfly-clustering-common-8.2.0.Final.jar:8.2.0.Final]

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_45]

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_45]

        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_45]

        at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.1.Final.jar:2.1.1.Final]

       

       

      2016-07-13 23:16:45,675 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) JBAS017532: Host default-host stopping

      2016-07-13 23:16:45,722 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-3) JBAS017521: Undertow AJP listener ajp suspending

      2016-07-13 23:16:45,847 INFO  [org.hibernate.validator.internal.util.Version] (MSC service thread 1-6) HV000001: Hibernate Validator 5.1.3.Final

      2016-07-13 23:16:45,987 INFO  [org.hornetq.ra] (ServerService Thread Pool -- 151) HQ151003: HornetQ resource adaptor stopped

      2016-07-13 23:16:46,003 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-3) JBAS017520: Undertow AJP listener ajp stopped, was bound to node0076.test.na/10.10.24.76:8009

      2016-07-13 23:16:46,128 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-5) JBAS015974: Stopped subdeployment (runtime-name: egpl_jms.war) in 842ms

      2016-07-13 23:16:46,175 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-5) JBAS015877: Stopped deployment eService.ear (runtime-name: eService.ear) in 887ms

      2016-07-13 23:16:46,331 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 151) HQ221002: HornetQ Server version 2.4.5.FINAL (Wild Hornet, 124) [ea7e128b-3ca5-11e6-8f91-edc33fd9a41e] stopped

      2016-07-13 23:16:46,347 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) JBAS017521: Undertow HTTP listener default suspending

      2016-07-13 23:16:46,347 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) JBAS017520: Undertow HTTP listener default stopped, was bound to node0076.test.na/10.10.24.76:9001

      2016-07-13 23:16:46,347 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-4) JBAS017506: Undertow 1.1.0.Final stopping

      2016-07-13 23:16:46,394 INFO  [org.jboss.as] (MSC service thread 1-1) JBAS015950: WildFly 8.2.0.Final "Tweek" stopped in 1026ms