1 Reply Latest reply on Jun 16, 2011 2:27 AM by clebert.suconic

    Clean up on BridgeImpl

    clebert.suconic

      I'm changing the retry logic on the bridges a bit...

       

       

      Instead of having failover do it automatically, I'm changing the connection factories to try only once and having the bridge retry in case of failure.

       

      That completely eliminated the issues we were having with locks, since the message is sent inside the queue. Having the handle perform failover was creating several issues, besides messages occasionally being lost due to the delivering state and cancel not being called.
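
      To make the idea above concrete, here is a minimal sketch of "the connection attempt tries once, the caller retries". It is not the code on the branch: SingleAttemptConnector, connectWithRetry, maxAttempts and retryIntervalMillis are hypothetical names made up for the illustration, and the real BridgeImpl presumably schedules its retries on an executor rather than sleeping.

```java
class BridgeRetrySketch {

    /** Hypothetical stand-in for a connection factory that makes exactly one attempt. */
    interface SingleAttemptConnector<T> {
        T connectOnce() throws Exception;
    }

    /**
     * The caller (the bridge, in the post) owns the retry loop.
     * Each call to connectOnce() either succeeds or fails immediately;
     * no failover happens inside the connector itself.
     */
    static <T> T connectWithRetry(SingleAttemptConnector<T> connector,
                                  int maxAttempts,
                                  long retryIntervalMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connector.connectOnce();
            } catch (Exception e) {
                last = e; // remember the failure and fall through to the wait
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(retryIntervalMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated target that is unavailable for the first two attempts.
        String result = connectWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("target not available yet");
            }
            return "connected";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

      The point of the design is that no lock has to be held across a reconnect: every attempt is short, and the decision to try again is made by the caller.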

       

       

      So far this seems to be a better approach. However, I have a test failure I didn't have time to fix today: OnewayTwoNodeClusterTest::testStartSourceServerBeforeTargetServer

       

       

      I have this on a branch: https://svn.jboss.org/repos/hornetq/branches/Branch_2_2_EAP_cluster_clean2/

       

       

      It would be great if Andy Taylor could take a look before I start my day.

       

       

      Thanks,

       

       

      Clebert

        • 1. Re: Clean up on BridgeImpl
          clebert.suconic

          There's one deadlock I will have to fix:

           

           

              [junit] Java stack information for the threads listed above:
              [junit] ===================================================
              [junit] "Thread-0 (group:HornetQ-server-threads1795346247-91649332)":
              [junit]     at org.hornetq.core.server.cluster.impl.ClusterManagerImpl.removeClusterTopologyListener(ClusterManagerImpl.java:344)
              [junit]     - waiting to lock <0x00000000ece0ed08> (a org.hornetq.core.server.cluster.impl.ClusterManagerImpl)
              [junit]     at org.hornetq.core.protocol.core.impl.CoreProtocolManager$1$2.connectionClosed(CoreProtocolManager.java:133)
              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.callClosingListeners(RemotingConnectionImpl.java:548)
              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.destroy(RemotingConnectionImpl.java:324)
              [junit]     at org.hornetq.core.remoting.server.impl.RemotingServiceImpl.connectionDestroyed(RemotingServiceImpl.java:409)
              [junit]     at org.hornetq.core.remoting.impl.invm.InVMAcceptor$Listener.connectionDestroyed(InVMAcceptor.java:233)
              [junit]     at org.hornetq.core.remoting.impl.invm.InVMConnection.close(InVMConnection.java:93)
              [junit]     - locked <0x00000000ede903c0> (a org.hornetq.core.remoting.impl.invm.InVMConnection)
              [junit]     at org.hornetq.core.remoting.impl.invm.InVMAcceptor.disconnect(InVMAcceptor.java:206)
              [junit]     at org.hornetq.core.remoting.impl.invm.InVMConnector$Listener.connectionDestroyed(InVMConnector.java:193)
              [junit]     at org.hornetq.core.remoting.impl.invm.InVMConnection.close(InVMConnection.java:93)
              [junit]     - locked <0x00000000ede90188> (a org.hornetq.core.remoting.impl.invm.InVMConnection)
              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.internalClose(RemotingConnectionImpl.java:563)
              [junit]     at org.hornetq.core.protocol.core.impl.RemotingConnectionImpl.destroy(RemotingConnectionImpl.java:322)
              [junit]     at org.hornetq.core.client.impl.ClientSessionFactoryImpl.checkCloseConnection(ClientSessionFactoryImpl.java:1019)
              [junit]     at org.hornetq.core.client.impl.ClientSessionFactoryImpl.close(ClientSessionFactoryImpl.java:447)
              [junit]     - locked <0x00000000ede8ef78> (a java.lang.Object)
              [junit]     - locked <0x00000000ede8ef68> (a java.lang.Object)
              [junit]     at org.hornetq.core.server.cluster.impl.BridgeImpl$StopRunnable.run(BridgeImpl.java:798)
              [junit]     at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100)
              [junit]     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              [junit]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              [junit]     at java.lang.Thread.run(Thread.java:662)
              [junit] "main":
              [junit]     at org.hornetq.core.client.impl.ClientSessionFactoryImpl.close(ClientSessionFactoryImpl.java:427)
              [junit]     - waiting to lock <0x00000000ede8ef68> (a java.lang.Object)
              [junit]     at org.hornetq.core.client.impl.ServerLocatorImpl.close(ServerLocatorImpl.java:1102)
              [junit]     at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl.stop(ClusterConnectionImpl.java:348)
              [junit]     - locked <0x00000000ece1c4e0> (a org.hornetq.core.server.cluster.impl.ClusterConnectionImpl)
              [junit]     at org.hornetq.core.server.cluster.impl.ClusterManagerImpl.stop(ClusterManagerImpl.java:207)
              [junit]     - locked <0x00000000ece0ed08> (a org.hornetq.core.server.cluster.impl.ClusterManagerImpl)
              [junit]     at org.hornetq.core.server.impl.HornetQServerImpl.stop(HornetQServerImpl.java:659)
              [junit]     - locked <0x00000000ec868380> (a org.hornetq.tests.util.ServiceTestBase$InVMNodeManagerServer)
              [junit]     at org.hornetq.core.server.impl.HornetQServerImpl.stop(HornetQServerImpl.java:637)
              [junit]     at org.hornetq.tests.integration.cluster.distribution.ClusterTestBase.stopServers(ClusterTestBase.java:1866)
              [junit]     at org.hornetq.tests.integration.cluster.distribution.ClusterWithBackupTest.stopServers(ClusterWithBackupTest.java:137)
              [junit]     at org.hornetq.tests.integration.cluster.distribution.ClusterWithBackupTest.tearDown(ClusterWithBackupTest.java:52)
              [junit]     at junit.framework.TestCase.runBare(TestCase.java:136)
              [junit]     at junit.framework.TestResult$1.protect(TestResult.java:106)
              [junit]     at junit.framework.TestResult.runProtected(TestResult.java:124)
              [junit]     at junit.framework.TestResult.run(TestResult.java:109)
              [junit]     at junit.framework.TestCase.run(TestCase.java:120)
              [junit]     at junit.framework.TestSuite.runTest(TestSuite.java:230)
              [junit]     at junit.framework.TestSuite.run(TestSuite.java:225)
              [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
              [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
              [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
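
          Reading the trace above: Thread-0 (BridgeImpl$StopRunnable) holds the two lock objects taken in ClientSessionFactoryImpl.close (<0x00000000ede8ef78> and <0x00000000ede8ef68>) and waits for the ClusterManagerImpl monitor, while "main" holds the ClusterManagerImpl monitor in stop() and waits for <0x00000000ede8ef68>. That is a lock-order inversion. One common way out, sketched below, is to never call into the other object while holding your own monitor. This is only an illustration and not necessarily the fix that will go into the branch; Manager, Factory and all method names are hypothetical stand-ins for ClusterManagerImpl and ClientSessionFactoryImpl.

```java
import java.util.ArrayList;
import java.util.List;

class LockOrderSketch {

    /** Hypothetical stand-in for the cluster manager side of the trace. */
    static class Manager {
        private final List<Factory> factories = new ArrayList<>();
        private final List<String> listeners = new ArrayList<>();

        synchronized void register(Factory f) {
            factories.add(f);
            listeners.add(f.name);
        }

        synchronized void removeListener(String name) {
            listeners.remove(name);
        }

        synchronized int listenerCount() {
            return listeners.size();
        }

        void stop() {
            List<Factory> snapshot;
            synchronized (this) {
                // Only copy state while holding the manager monitor.
                snapshot = new ArrayList<>(factories);
                factories.clear();
            }
            // The manager monitor is released before calling close().
            for (Factory f : snapshot) {
                f.close();
            }
        }
    }

    /** Hypothetical stand-in for the session factory side of the trace. */
    static class Factory {
        final String name;
        private final Manager manager;
        private boolean closed;

        Factory(String name, Manager manager) {
            this.name = name;
            this.manager = manager;
        }

        void close() {
            synchronized (this) {
                if (closed) {
                    return;
                }
                closed = true;
            }
            // The callback into the manager runs without the factory monitor.
            manager.removeListener(name);
        }
    }

    /** Runs stop() and close() on two threads; true if both finish in time. */
    static boolean stopAndCloseConcurrently() {
        Manager manager = new Manager();
        Factory factory = new Factory("bridge-factory", manager);
        manager.register(factory);

        Thread stopper = new Thread(manager::stop, "main-like");
        Thread closer = new Thread(factory::close, "bridge-stop-like");
        stopper.setDaemon(true);
        closer.setDaemon(true);
        stopper.start();
        closer.start();
        try {
            stopper.join(5000);
            closer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !stopper.isAlive() && !closer.isAlive() && manager.listenerCount() == 0;
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + stopAndCloseConcurrently());
    }
}
```

          Because neither monitor is held while calling into the other object, the cycle seen in the dump cannot form in this sketch, whichever thread gets there first.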