7 Replies Latest reply on Jul 5, 2012 10:36 AM by mrobson

    HornetQ 2.2.10 RecoveryDiscovery Issue

    mrobson

      I start my live server (no issues in the logs) and then start my backup server. It starts fine, announces the backup, etc. No issues.

       

      Once it's started, I start getting this WARN log on the backup every 2m 10s:

       

      2012-07-04 18:56:23,455 WARN  [org.hornetq.jms.server.recovery.RecoveryDiscovery] (HornetQ Recovery Discovery Reinitialization) Couldn't start recovery discovery on XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory], discoveryConfiguration = null, username=null, password=null], we will retry this on the next recovery scan

      2012-07-04 18:58:33,455 WARN  [org.hornetq.jms.server.recovery.RecoveryDiscovery] (HornetQ Recovery Discovery Reinitialization) Couldn't start recovery discovery on XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory], discoveryConfiguration = null, username=null, password=null], we will retry this on the next recovery scan
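      The 2m 10s figure can be checked directly from the two timestamps above; since the message says it will "retry this on the next recovery scan", the gap is presumably the time between recovery scans. A minimal sketch of that arithmetic in Java follows. The class and method names (WarnInterval, between) are made up for illustration and are not part of HornetQ, and the timestamp pattern is an assumption read off the log lines.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class WarnInterval {
    // Assumed layout of the timestamp at the start of each log line,
    // e.g. "2012-07-04 18:56:23,455" (comma before the milliseconds).
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Hypothetical helper: elapsed time between two log timestamps.
    static Duration between(String first, String second) {
        return Duration.between(LocalDateTime.parse(first, LOG_TS),
                                LocalDateTime.parse(second, LOG_TS));
    }

    public static void main(String[] args) {
        // The two WARN timestamps quoted in the post.
        Duration gap = between("2012-07-04 18:56:23,455", "2012-07-04 18:58:33,455");
        System.out.println(gap.getSeconds() + " s"); // prints "130 s", i.e. 2m 10s
    }
}
```

      Running the same helper over a longer run of WARN lines would show whether the interval stays constant, which would point at a fixed recovery-scan schedule rather than anything load-dependent.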

       

      If I fail over from the live server to the backup server, the warning goes away.

       

      During failover I see this failure to connect, and then the backup server becomes live and seems to work fine.

       

      2012-07-04 19:12:40,644 WARN  [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Thread-2 (HornetQ-client-global-threads-1551057432)) Failed to connect to server.

      2012-07-04 19:12:40,654 INFO  [org.hornetq.core.server.impl.HornetQServerImpl] (Activation for server HornetQServerImpl::serverUUID=6aac28d9-c601-11e1-b690-0050563f000b) Backup Server is now live

       

      Then I fail back by starting up the live server again. It fails to announce the backup a few times and then finally announces it.

       

      2012-07-04 19:15:31,861 WARN  [org.hornetq.core.server.cluster.impl.ClusterConnectionImpl] (Thread-1 (HornetQ-server-HornetQServerImpl::serverUUID=6aac28d9-c601-11e1-b690-0050563f000b-543275388)) Unable to announce backup, retrying

      HornetQException[errorCode=3 message=Timed out waiting to receive initial broadcast from cluster]

              at org.hornetq.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:726)

              at org.hornetq.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:603)

              at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$2.run(ClusterConnectionImpl.java:485)

              at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100)

              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

              at java.lang.Thread.run(Thread.java:662)

      2012-07-04 19:15:32,631 INFO  [org.hornetq.core.server.cluster.impl.ClusterConnectionImpl] (Thread-2 (HornetQ-server-HornetQServerImpl::serverUUID=6aac28d9-c601-11e1-b690-0050563f000b-543275388)) backup announced

       

       

      After that happens, I get a warning similar to the one I originally had, but with a lot more to it this time:

       

      2012-07-04 19:15:53,460 WARN  [org.hornetq.jms.server.recovery.RecoveryDiscovery] (HornetQ Recovery Discovery Reinitialization) Couldn't start recovery discovery on XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory], discoveryConfiguration = null, username=null, password=null], we will retry this on the next recovery scan

      2012-07-04 19:15:53,460 WARN  [org.hornetq.jms.server.recovery.HornetQXAResourceWrapper] (Thread-18) Can't connect to XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=5-27-85-135&tcp-send-buffer-size=700000&tcp-no-delay=true&tcp-receive-buffer-size=700000], discoveryConfiguration = null, username=null, password=null] on auto-generated resource recovery

      HornetQException[errorCode=2 message=Cannot connect to server(s). Tried with all available servers.]

              at org.hornetq.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:784)

              at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.connect(HornetQXAResourceWrapper.java:347)

              at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.getDelegate(HornetQXAResourceWrapper.java:262)

              at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.recover(HornetQXAResourceWrapper.java:76)

              at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecovery(XARecoveryModule.java:773)

              at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.resourceInitiatedRecoveryForRecoveryHelpers(XARecoveryModule.java:712)

              at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkSecondPass(XARecoveryModule.java:201)

              at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:799)

              at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:412)

      2012-07-04 19:15:53,461 WARN  [org.hornetq.jms.server.recovery.HornetQXAResourceWrapper] (Thread-18) Can't connect to any hornetq server on recovery [XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=5-27-85-135&tcp-send-buffer-size=700000&tcp-no-delay=true&tcp-receive-buffer-size=700000], discoveryConfiguration = null, username=null, password=null]]

      2012-07-04 19:15:53,467 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-18) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException: Error trying to connect to any providers for xa recovery, XAException.XAER_RMERR

      2012-07-04 19:15:53,468 WARN  [org.hornetq.jms.server.recovery.HornetQXAResourceWrapper] (Thread-18) Can't connect to XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=5-27-85-135&tcp-send-buffer-size=700000&tcp-no-delay=true&tcp-receive-buffer-size=700000], discoveryConfiguration = null, username=null, password=null] on auto-generated resource recovery

      HornetQException[errorCode=2 message=Cannot connect to server(s). Tried with all available servers.]

              at org.hornetq.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:784)

              at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.connect(HornetQXAResourceWrapper.java:347)

              at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.getDelegate(HornetQXAResourceWrapper.java:262)

              at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.recover(HornetQXAResourceWrapper.java:76)

              at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecovery(XARecoveryModule.java:797)

              at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.resourceInitiatedRecoveryForRecoveryHelpers(XARecoveryModule.java:712)

              at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkSecondPass(XARecoveryModule.java:201)

              at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:799)

              at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:412)

      2012-07-04 19:15:53,468 WARN  [org.hornetq.jms.server.recovery.HornetQXAResourceWrapper] (Thread-18) Can't connect to any hornetq server on recovery [XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=5-27-85-135&tcp-send-buffer-size=700000&tcp-no-delay=true&tcp-receive-buffer-size=700000], discoveryConfiguration = null, username=null, password=null]]

       

       

      I can see a lot of changes to RecoveryDiscovery in 2.2.16, but I couldn't find a JIRA for this kind of issue...

       

      Is this a known issue, or something that has already been fixed?

       

      Thanks

      Matt