3 Replies Latest reply on Dec 18, 2008 11:02 AM by jmesnil

    infinite loop when connecting to a stopped acceptor

    jmesnil

      I'm currently adding management operations to start/stop JBM2 acceptors.

      One of my tests ends up in an infinite loop when connecting to an acceptor that I've just stopped (I was expecting an error when creating the connection instead).
      My code is in the middle of a refactoring, but I'm able to reproduce it using the trunk code:

       public void testStartStop2() throws Exception
       {
          TransportConfiguration acceptorConfig = new TransportConfiguration(NettyAcceptorFactory.class.getName(),
                                                                             new HashMap<String, Object>(),
                                                                             randomString());

          Configuration conf = new ConfigurationImpl();
          conf.setSecurityEnabled(false);
          conf.setJMXManagementEnabled(false);
          conf.getAcceptorConfigurations().add(acceptorConfig);
          MessagingService service = MessagingServiceImpl.newNullStorageMessagingService(conf);
          service.start();

          Connection connection = JMSUtil.createConnection(NettyConnectorFactory.class.getName());
          assertNotNull(connection);
          // create a session to check the connection is working properly
          connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
          connection.close();

          // this simulates the management method to stop an acceptor
          Set<Acceptor> acceptors = service.getServer().getRemotingService().getAcceptors();
          assertEquals(1, acceptors.size());
          Acceptor acceptor = acceptors.iterator().next();
          assertNotNull(acceptor);

          acceptor.stop();
          try
          {
             connection = JMSUtil.createConnection(NettyConnectorFactory.class.getName());
             assertNotNull(connection);
             connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
             fail("acceptor must not accept connections when stopped");
          }
          catch (Exception e)
          {
          }

          /*
          acceptor.start();

          connection = JMSUtil.createConnection(NettyConnectorFactory.class.getName());
          assertNotNull(connection);
          connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
          // create a session to check the connection is working properly
          connection.close();
          */
       }
      


      The test is simple:
      - start JBM2 with a single netty acceptor
      - create a connection and check it works
      - stop the (only) acceptor
      - create a connection <- expected failure
      - start the acceptor
      - create a connection and check it works

      The test also ends up in an infinite loop if I stop the whole messaging service (instead of stopping only the acceptor).

      The logs show that the infinite loop starts when I create the connection after stopping the acceptor:

      11 déc. 2008 18:13:17 org.jboss.messaging.core.logging.Logger info
      INFO: Started messaging server
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Commencing automatic failover / reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Attempting reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Successfully reconnected
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger error
      GRAVE: caught exception java.net.ConnectException: Connection refused for channel NioClientSocketChannel(id: 2709c0a8-011e-1000-8cab-013f31dc9386)
      java.net.ConnectException: Connection refused
       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:527)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:300)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:292)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:231)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
       at java.lang.Thread.run(Thread.java:613)
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger error
      GRAVE: caught exception java.net.ConnectException: Connection refused for channel NioClientSocketChannel(id: 2709c0a8-011e-1000-8cac-013f31dc9386)
      java.net.ConnectException: Connection refused
       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:527)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:300)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:292)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:231)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
       at java.lang.Thread.run(Thread.java:613)
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Commencing automatic failover / reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Attempting reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Successfully reconnected
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger error
      GRAVE: caught exception java.net.ConnectException: Connection refused for channel NioClientSocketChannel(id: 2709c0a8-011e-1000-8cad-013f31dc9386)
      java.net.ConnectException: Connection refused
       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:527)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:300)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:292)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:231)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
       at java.lang.Thread.run(Thread.java:613)
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Commencing automatic failover / reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Attempting reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Successfully reconnected
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Commencing automatic failover / reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Attempting reconnection
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Successfully reconnected
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger error
      GRAVE: caught exception java.net.ConnectException: Connection refused for channel NioClientSocketChannel(id: 2709c0a8-011e-1000-8cae-013f31dc9386)
      java.net.ConnectException: Connection refused
       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:527)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:300)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:292)
       at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:231)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
       at java.lang.Thread.run(Thread.java:613)
      11 déc. 2008 18:13:18 org.jboss.messaging.core.logging.Logger info
      INFO: Commencing automatic failover / reconnection
      


      And it continues like that, on and on.

      The repeated "Successfully reconnected" log entries are odd: there seems to be a code path that we're missing.

        • 1. Re: infinite loop when connecting to a stopped acceptor
          jmesnil

          I've isolated the problem: when there is no server to connect to, we have the infinite loop:

           public void testStartStop3() throws Exception
           {
              /**********************************/
              /* No JBM server has been started */
              /**********************************/

              JBossConnectionFactory cf = new JBossConnectionFactory(new TransportConfiguration(NettyConnectorFactory.class.getName()),
                                                                     null,
                                                                     DEFAULT_CONNECTION_LOAD_BALANCING_POLICY_CLASS_NAME,
                                                                     DEFAULT_PING_PERIOD,
                                                                     DEFAULT_CONNECTION_TTL,
                                                                     DEFAULT_CALL_TIMEOUT,
                                                                     null,
                                                                     DEFAULT_ACK_BATCH_SIZE,
                                                                     DEFAULT_ACK_BATCH_SIZE,
                                                                     DEFAULT_CONSUMER_WINDOW_SIZE,
                                                                     DEFAULT_CONSUMER_MAX_RATE,
                                                                     DEFAULT_SEND_WINDOW_SIZE,
                                                                     DEFAULT_PRODUCER_MAX_RATE,
                                                                     DEFAULT_MIN_LARGE_MESSAGE_SIZE,
                                                                     DEFAULT_BLOCK_ON_ACKNOWLEDGE,
                                                                     DEFAULT_BLOCK_ON_NON_PERSISTENT_SEND,
                                                                     true,
                                                                     DEFAULT_AUTO_GROUP,
                                                                     DEFAULT_MAX_CONNECTIONS,
                                                                     DEFAULT_PRE_ACKNOWLEDGE,
                                                                     DEFAULT_RETRY_INTERVAL,
                                                                     DEFAULT_RETRY_INTERVAL_MULTIPLIER,
                                                                     DEFAULT_MAX_RETRIES_BEFORE_FAILOVER,
                                                                     DEFAULT_MAX_RETRIES_AFTER_FAILOVER);
              Connection connection = cf.createConnection();
              assertNotNull(connection);

              try
              {
                 connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                 fail("acceptor must not accept connections when stopped");
              }
              catch (Exception e)
              {
              }
           }
          


          The code loops forever in ConnectionManagerImpl.getConnectionForCreateSession: the failover() method returns true even though no failover has taken place.

          I'm continuing the investigation.

          • 2. Re: infinite loop when connecting to a stopped acceptor
            jmesnil

            I got it: the infinite loop happens when
            - no backup connector is specified
            - *and* the default values for failover's max retries are used (in fact, any value > 0)

            I'm fixing the failover algorithm to check for this case.
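
To make the condition above concrete, here is a minimal, self-contained sketch of a failover check that reports failure when there is no backup connector to fall back to. It is not the actual ConnectionManagerImpl code: the Connector interface and the failover() and createSession() methods below are hypothetical stand-ins, and the retry policy is an assumption for illustration only.

```java
public class FailoverSketch {
    /** Hypothetical stand-in for a connector; connect() returns true on success. */
    public interface Connector {
        boolean connect();
    }

    /**
     * Retries the live connector up to maxRetries times, then tries the backup
     * if one was configured. Returns true only if a connection was established.
     */
    public static boolean failover(Connector live, Connector backup, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (live.connect()) {
                return true;
            }
        }
        // No backup connector (null): there is nothing to fail over to, so
        // report failure instead of claiming a successful reconnection.
        return backup != null && backup.connect();
    }

    /** The caller surfaces an error instead of looping when failover fails. */
    public static String createSession(Connector live, Connector backup, int maxRetries) {
        if (!failover(live, backup, maxRetries)) {
            throw new IllegalStateException("unable to connect: no server available");
        }
        return "session";
    }

    public static void main(String[] args) {
        Connector refused = () -> false; // behaves like a stopped acceptor
        try {
            createSession(refused, null, 3);
            System.out.println("unexpected success");
        } catch (IllegalStateException e) {
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the "did failover succeed?" answer must be false when every attempt was refused and no backup exists, so the caller gets an exception rather than retrying forever.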

            • 3. Re: infinite loop when connecting to a stopped acceptor
              jmesnil

              I've added a test that reproduces the infinite loop in org.jboss.messaging.tests.integration.basic.ClientSessionFactoryImplTest.

              The test is named _testCreateSessionFailureWithDefaultValuesWhenNoServer, with a leading '_', so that it is not run by our test suites.
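
For readers unfamiliar with the leading-underscore trick: assuming the suite uses JUnit 3-style discovery (public, void, no-argument methods whose names start with "test"), renaming a method to start with '_' hides it from the runner. The sketch below imitates that discovery rule with plain reflection; SampleTest and discover() are hypothetical names, not part of JUnit or JBM.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TestDiscoverySketch {
    /** Hypothetical test class with one active and one disabled test. */
    public static class SampleTest {
        public void testCreateSession() {}
        public void _testCreateSessionFailureWithDefaultValuesWhenNoServer() {}
    }

    /** Returns the names of methods a JUnit 3-style runner would pick up. */
    public static List<String> discover(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            boolean isPublicVoidNoArg = Modifier.isPublic(m.getModifiers())
                    && m.getReturnType() == void.class
                    && m.getParameterCount() == 0;
            // The name must start with "test"; "_test..." does not match.
            if (isPublicVoidNoArg && m.getName().startsWith("test")) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(discover(SampleTest.class));
    }
}
```

Only testCreateSession is discovered here; the underscore-prefixed method stays in the source as documentation of the bug until the fix lands and it can be renamed back.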