How often do you see this message? The message itself says, "will attempt reconnect on next pass," so I'm curious if it actually does reconnect after a few warnings.
The JBoss server log is shown below. While attempting to reconnect on the next pass, an XA exception occurs, and the server stops receiving events from the remote HornetQ cluster.
23:57:52,009 WARN [org.hornetq.jms.server.recovery.HornetQXAResourceWrapper] (Thread-197 (HornetQ-client-global-threads-1956891780)) Notified of connection failure in xa recovery connectionFactory for provider ClientSessionFactoryImpl [serverLocator=ServerLocatorImpl [initialConnectors=[org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-252-122-239], discoveryGroupConfiguration=DiscoveryGroupConfiguration [discoveryInitialWaitTimeout=10000, groupAddress=231.7.7.8, groupPort=9879, localBindAddress=null, name=5198dc46-710e-11e3-bfe1-00505689359f, refreshTimeout=10000]], connectorConfig=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-252-122-239, backupConfig=null] will attempt reconnect on next pass: HornetQException[errorCode=2 message=Channel disconnected]
at org.hornetq.core.client.impl.ClientSessionFactoryImpl.connectionDestroyed(ClientSessionFactoryImpl.java:380) [hornetq-core-2.2.13.Final.jar:]
at org.hornetq.core.remoting.impl.netty.NettyConnector$Listener$1.run(NettyConnector.java:711) [hornetq-core-2.2.13.Final.jar:]
at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100) [hornetq-core-2.2.13.Final.jar:]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_21]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_21]
at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_21]
23:57:52,535 WARN [org.hornetq.core.cluster.impl.DiscoveryGroupImpl] (hornetq-discovery-group-thread-51bda25a-710e-11e3-bfe1-00505689359f) There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=ff558ad8-67be-11e3-9ca1-d78e71deb2e5
23:57:52,536 WARN [org.hornetq.core.cluster.impl.DiscoveryGroupImpl] (hornetq-discovery-group-thread-51bd2d28-710e-11e3-bfe1-00505689359f) There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=ff558ad8-67be-11e3-9ca1-d78e71deb2e5
23:57:52,538 WARN [org.hornetq.core.cluster.impl.DiscoveryGroupImpl] (hornetq-discovery-group-thread-5198dc44-710e-11e3-bfe1-00505689359f) There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=ff558ad8-67be-11e3-9ca1-d78e71deb2e5
23:57:57,873 WARN [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016027: Local XARecoveryModule.xaRecovery got XA exception XAException.XAER_RMERR: javax.transaction.xa.XAException: Error trying to connect to any providers for xa recovery
at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.getDelegate(HornetQXAResourceWrapper.java:275) [hornetq-jms-2.2.13.Final.jar:]
at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.recover(HornetQXAResourceWrapper.java:77) [hornetq-jms-2.2.13.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecovery(XARecoveryModule.java:503) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.resourceInitiatedRecoveryForRecoveryHelpers(XARecoveryModule.java:471) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.bottomUpRecovery(XARecoveryModule.java:385) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkSecondPass(XARecoveryModule.java:166) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:789) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:371) [jbossjts-4.16.2.Final.jar:]
Caused by: java.lang.IllegalStateException: Cannot create session factory, server locator is closed (maybe it has been garbage collected)
at org.hornetq.core.client.impl.ServerLocatorImpl.assertOpen(ServerLocatorImpl.java:1823) [hornetq-core-2.2.13.Final.jar:]
at org.hornetq.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:699) [hornetq-core-2.2.13.Final.jar:]
at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.connect(HornetQXAResourceWrapper.java:321) [hornetq-jms-2.2.13.Final.jar:]
at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.getDelegate(HornetQXAResourceWrapper.java:251) [hornetq-jms-2.2.13.Final.jar:]
If you really want failover functionality (which it appears you do), you need to enable it by setting <ha>true</ha> on the pooled-connection-factory.
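For reference, a minimal sketch of what that could look like in the messaging subsystem of standalone.xml. This is an illustration, not your actual configuration: the connector name, JNDI entry, and the reconnect-attempts value are placeholders.

```xml
<!-- Hypothetical pooled-connection-factory; connector and entry names
     are placeholders for your actual configuration. -->
<pooled-connection-factory name="hornetq-ra">
    <!-- Enable client-side high availability / failover -->
    <ha>true</ha>
    <!-- Keep retrying after the live server goes down; -1 = retry forever -->
    <reconnect-attempts>-1</reconnect-attempts>
    <transaction mode="xa"/>
    <connectors>
        <connector-ref connector-name="netty-remote"/>
    </connectors>
    <entries>
        <entry name="java:/JmsXA"/>
    </entries>
</pooled-connection-factory>
```

Without <reconnect-attempts> set to a non-zero value, the factory may give up after the first connection loss even with <ha>true</ha>.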
Even after adding this tag (<ha>true</ha>), the JBoss server stops receiving messages after the very first failover on the HornetQ cluster, i.e., JBoss receives messages only until the first HornetQ failover that happens after JBoss has started.
What resource is it still unable to find? Are you talking about the earlier WARN message from HornetQXAResourceWrapper? Are you seeing any functional impact from that? Are you actually using XA transactions in your application?
Sorry for being unclear. What I meant was: when the resource-adapter tag is not provided, the exception above occurs. When the resource-adapter tag is provided, the XA exception does not occur; instead, the message shown below appears three times, and the server still stops receiving further messages from the HornetQ cluster. The disturbing part is that a failover breaks functionality, and the JBoss server has to be restarted to regain it. I don't think we are using any XA transactions in our application right now, but we may in the future, which was the reason for using XA. If this can be solved by avoiding XA transactions, that would also be fine with us for the time being.
00:35:14,356 WARN [org.hornetq.core.cluster.impl.DiscoveryGroupImpl] (hornetq-discovery-group-thread-013546bf-7114-11e3-9370-00505689359f) There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=ff558ad8-67be-11e3-9ca1-d78e71deb2e5
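If sidestepping XA for now is acceptable, one option is to switch the pooled-connection-factory to local transactions, so the Arjuna periodic-recovery thread no longer needs to register and reconnect an XA resource for HornetQ. A sketch, assuming the standard messaging-subsystem schema; the connector and JNDI names are placeholders:

```xml
<pooled-connection-factory name="hornetq-ra">
    <!-- Local transactions: the XA recovery manager will not attempt
         to connect to HornetQ during its recovery passes -->
    <transaction mode="local"/>
    <ha>true</ha>
    <connectors>
        <connector-ref connector-name="netty-remote"/>
    </connectors>
    <entries>
        <entry name="java:/JmsLocal"/>
    </entries>
</pooled-connection-factory>
```

Note this only removes the XA recovery path; whether the underlying consumer reconnects after failover still depends on the <ha> and reconnect settings.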
After enabling DEBUG logging, the following appeared in the JBoss server.log. The "Node ... going up" line near the end shows that the JBoss server is able to detect the live and backup server details, yet the earlier "backup update" line shows that the backup configuration is null. Is this because backupConfig is being set before the actual details are obtained via discovery?
04:29:05,182 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Periodic Recovery) Trying to connect with connector = org.hornetq.core.remoting.impl.netty.NettyConnectorFactory@7b4ad4ae, parameters = {port=5445, host=10.252.122.239} connector = NettyConnector [host=10.252.122.239, port=5445, httpEnabled=false, useServlet=false, servletPath=/messaging/HornetQServlet, sslEnabled=false, useNio=false]
04:29:05,183 DEBUG [org.hornetq.core.remoting.impl.netty.NettyConnector] (Periodic Recovery) Started Netty Connector version 3.2.5.Final-a96d88c
04:29:05,183 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Periodic Recovery) Trying to connect at the main server using connector :org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-252-122-239
04:29:05,186 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Periodic Recovery) Reconnection successfull
04:29:05,186 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Periodic Recovery) ClientSessionFactoryImpl received backup update for live/backup pair = org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-252-122-239 / null but it didn't belong to org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-252-122-239
04:29:05,189 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Old I/O client worker ([id: 0x6dd12abe, /10.252.122.248:58338 => /10.252.122.239:5445])) Node ff558ad8-67be-11e3-9ca1-d78e71deb2e5 going up, connector = Pair[a=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-252-122-239, b=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5446&host=10-252-122-240], isLast=true csf created at
serverLocator=ServerLocatorImpl [initialConnectors=[org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=10-252-122-239], discoveryGroupConfiguration=DiscoveryGroupConfiguration [discoveryInitialWaitTimeout=10000, groupAddress=231.7.7.8, groupPort=9879, localBindAddress=null, name=f6e95cf7-71fd-11e3-bf8f-00505689359f, refreshTimeout=10000]]: java.lang.Exception
at org.hornetq.core.client.impl.ClientSessionFactoryImpl.<init>(ClientSessionFactoryImpl.java:180) [hornetq-core-2.2.13.Final.jar:]
at org.hornetq.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:732) [hornetq-core-2.2.13.Final.jar:]
at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.connect(HornetQXAResourceWrapper.java:321) [hornetq-jms-2.2.13.Final.jar:]
at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.getDelegate(HornetQXAResourceWrapper.java:251) [hornetq-jms-2.2.13.Final.jar:]
at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.recover(HornetQXAResourceWrapper.java:77) [hornetq-jms-2.2.13.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecovery(XARecoveryModule.java:503) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.resourceInitiatedRecoveryForRecoveryHelpers(XARecoveryModule.java:471) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.bottomUpRecovery(XARecoveryModule.java:385) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkSecondPass(XARecoveryModule.java:166) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:789) [jbossjts-4.16.2.Final.jar:]
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:371) [jbossjts-4.16.2.Final.jar:]