1 Reply Latest reply on Jun 16, 2011 2:27 AM by Clebert Suconic

    Clean up on BridgeImpl

    Clebert Suconic Master

      I'm changing the retry logic on the bridges a bit...

       

       

      Instead of letting failover handle it automatically, I'm changing the connection factories to try only once and having the bridge retry in case of failure.
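
      As a rough illustration of that split (the connection attempt is made once, and the caller owns the retry loop), here is a minimal sketch. It is not the BridgeImpl code: the class name, the connectWithRetry method, and its maxAttempts and retryIntervalMillis parameters are all hypothetical.

```java
import java.util.concurrent.Callable;

public class BridgeRetrySketch {

    /**
     * Hypothetical helper: invokes a single-attempt connect action until it
     * succeeds or the attempts run out. Returns the number of attempts used,
     * or -1 if every attempt failed or the thread was interrupted.
     */
    public static int connectWithRetry(Callable<Boolean> connectOnce,
                                       int maxAttempts,
                                       long retryIntervalMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // The connect action itself never retries; it reports success or failure once.
                if (connectOnce.call()) {
                    return attempt;
                }
            } catch (Exception e) {
                // An exception is treated the same as a failed attempt.
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(retryIntervalMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated connect action that fails twice and succeeds on the third call.
        int used = connectWithRetry(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("connected after " + used + " attempts");
    }
}
```

      Because the retry decision sits in one place, the caller can stop, cancel, or reschedule between attempts without a lower layer reconnecting behind its back.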

       

      That completely eliminated the issues we were having with locks, since the message is sent inside the queue. Having the handle perform failover was creating several issues, besides the occasional message being lost because of the delivering state and cancel not being called.

       

       

      So far this seems to be a better approach. However, I have one test failure that I didn't have time to resolve today: OnewayTwoNodeClusterTest::testStartSourceServerBeforeTargetServer

       

       

      I have this on a branch: https://svn.jboss.org/repos/hornetq/branches/Branch_2_2_EAP_cluster_clean2/

       

       

      It would be good if Andy Taylor could take a look before I start my day.

       

       

      thanks,

       

       

      Clebert

        • 1. Re: Clean up on BridgeImpl
          Clebert Suconic Master

          There's one deadlock I will have to fix:

           

           

              [junit] Java stack information for the threads listed above:

              [junit] ===================================================

              [junit] "Thread-0 (group:HornetQ-server-threads1795346247-91649332)":

              [junit]     at org.hornetq.core.server.cluster.impl.ClusterManagerImpl.removeClusterTopologyListener(ClusterManagerImpl.java:344)

              [junit]     - waiting to lock <0x00000000ece0ed08> (a org.hornetq.core.server.cluster.impl.ClusterManagerImpl)

              [junit]     at org.hornetq.core.protocol.core.impl.CoreProtocolManager$1$2.connectionClosed(CoreProtocolManager.java:133)

              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.callClosingListeners(RemotingConnectionImpl.java:548)

              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.destroy(RemotingConnectionImpl.java:324)

              [junit]     at org.hornetq.core.remoting.server.impl.RemotingServiceImpl.connectionDestroyed(RemotingServiceImpl.java:409)

              [junit]     at org.hornetq.core.remoting.impl.invm.InVMAcceptor$Listener.connectionDestroyed(InVMAcceptor.java:233)

              [junit]     at org.hornetq.core.remoting.impl.invm.InVMConnection.close(InVMConnection.java:93)

              [junit]     - locked <0x00000000ede903c0> (a org.hornetq.core.remoting.impl.invm.InVMConnection)

              [junit]     at org.hornetq.core.remoting.impl.invm.InVMAcceptor.disconnect(InVMAcceptor.java:206)

              [junit]     at org.hornetq.core.remoting.impl.invm.InVMConnector$Listener.connectionDestroyed(InVMConnector.java:193)

              [junit]     at org.hornetq.core.remoting.impl.invm.InVMConnection.close(InVMConnection.java:93)

              [junit]     - locked <0x00000000ede90188> (a org.hornetq.core.remoting.impl.invm.InVMConnection)

              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.internalClose(RemotingConnectionImpl.java:563)

              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.destroy(RemotingConnectionImpl.java:322)

              [junit]     at org.hornetq.core.client.impl.ClientSessionFactoryImpl.checkCloseConnection(ClientSessionFactoryImpl.java:1019)

              [junit]     at org.hornetq.core.client.impl.ClientSessionFactoryImpl.close(ClientSessionFactoryImpl.java:447)

              [junit]     - locked <0x00000000ede8ef78> (a java.lang.Object)

              [junit]     - locked <0x00000000ede8ef68> (a java.lang.Object)

              [junit]     at org.hornetq.core.server.cluster.impl.BridgeImpl$StopRunnable.run(BridgeImpl.java:798)

              [junit]     at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100)

              [junit]     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

              [junit]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

              [junit]     at java.lang.Thread.run(Thread.java:662)

              [junit] "main":

              [junit]     at org.hornetq.core.client.impl.ClientSessionFactoryImpl.close(ClientSessionFactoryImpl.java:427)

              [junit]     - waiting to lock <0x00000000ede8ef68> (a java.lang.Object)

              [junit]     at org.hornetq.core.client.impl.ServerLocatorImpl.close(ServerLocatorImpl.java:1102)

              [junit]     at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl.stop(ClusterConnectionImpl.java:348)

              [junit]     - locked <0x00000000ece1c4e0> (a org.hornetq.core.server.cluster.impl.ClusterConnectionImpl)

              [junit]     at org.hornetq.core.server.cluster.impl.ClusterManagerImpl.stop(ClusterManagerImpl.java:207)

              [junit]     - locked <0x00000000ece0ed08> (a org.hornetq.core.server.cluster.impl.ClusterManagerImpl)

              [junit]     at org.hornetq.core.server.impl.HornetQServerImpl.stop(HornetQServerImpl.java:659)

              [junit]     - locked <0x00000000ec868380> (a org.hornetq.tests.util.ServiceTestBase$InVMNodeManagerServer)

              [junit]     at org.hornetq.core.server.impl.HornetQServerImpl.stop(HornetQServerImpl.java:637)

              [junit]     at org.hornetq.tests.integration.cluster.distribution.ClusterTestBase.stopServers(ClusterTestBase.java:1866)

              [junit]     at org.hornetq.tests.integration.cluster.distribution.ClusterWithBackupTest.stopServers(ClusterWithBackupTest.java:137)

              [junit]     at org.hornetq.tests.integration.cluster.distribution.ClusterWithBackupTest.tearDown(ClusterWithBackupTest.java:52)

              [junit]     at junit.framework.TestCase.runBare(TestCase.java:136)

              [junit]     at junit.framework.TestResult$1.protect(TestResult.java:106)

              [junit]     at junit.framework.TestResult.runProtected(TestResult.java:124)

              [junit]     at junit.framework.TestResult.run(TestResult.java:109)

              [junit]     at junit.framework.TestCase.run(TestCase.java:120)

              [junit]     at junit.framework.TestSuite.runTest(TestSuite.java:230)

              [junit]     at junit.framework.TestSuite.run(TestSuite.java:225)

              [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)

              [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)

              [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
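
          Reading the dump: "Thread-0" holds the java.lang.Object monitor <0x00000000ede8ef68> taken in ClientSessionFactoryImpl.close and is waiting for the ClusterManagerImpl monitor <0x00000000ece0ed08>, while "main" holds that ClusterManagerImpl monitor from ClusterManagerImpl.stop and is waiting for <0x00000000ede8ef68>. One common way to break this kind of cycle, not necessarily the fix that will go into the branch, is to avoid calling close() on other components while holding the manager's own monitor. The sketch below uses hypothetical stand-in classes (Manager, stopCompletes), not the HornetQ ones.

```java
import java.util.ArrayList;
import java.util.List;

public class LockOrderSketch {

    /** Hypothetical stand-in for a manager that owns listeners and closeable resources. */
    static class Manager {
        private final List<Runnable> resources = new ArrayList<>();
        private final List<String> listeners = new ArrayList<>();

        synchronized void addResource(Runnable closeAction) { resources.add(closeAction); }
        synchronized void addListener(String name) { listeners.add(name); }
        synchronized void removeListener(String name) { listeners.remove(name); }
        synchronized int listenerCount() { return listeners.size(); }

        void stop() {
            List<Runnable> snapshot;
            synchronized (this) {
                // Copy the state under the lock, then release it.
                snapshot = new ArrayList<>(resources);
                resources.clear();
            }
            // The close actions run WITHOUT the manager's monitor held, so a
            // callback on another thread can still call removeListener().
            for (Runnable closeAction : snapshot) {
                closeAction.run();
            }
        }
    }

    /** Returns true if stop() finished and the close callback removed the listener. */
    public static boolean stopCompletes() {
        Manager manager = new Manager();
        manager.addListener("topology");
        manager.addResource(() -> {
            // Simulates a close listener running on another thread and calling
            // back into the manager, as connectionClosed does in the dump.
            Thread callback = new Thread(() -> manager.removeListener("topology"));
            callback.start();
            try {
                callback.join(5000L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        manager.stop();
        return manager.listenerCount() == 0;
    }

    public static void main(String[] args) {
        System.out.println("stop completed cleanly: " + stopCompletes());
    }
}
```

          If stop() were declared synchronized and ran the close actions while holding the monitor, the callback thread would block in removeListener() until the join timed out, which is the same shape as the cycle in the dump.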