4 Replies Latest reply on Aug 11, 2010 2:08 AM by jimmy.hui

    Stuck JMS message queues on unreliable networks?

    jimmy.hui

      Hi,

       

      We're currently having an issue with JMS queues getting "stuck": messages are not delivered, and the JMX console refuses to load the MBean page for the affected queues. We have the following setup:

       

      - JBoss 5.0.1GA Server 1

      - HornetQ 2.1.0 Final

      - Queue1

       

      - JBoss 5.0.1GA Server 2

      - HornetQ 2.1.0 Final

      - Queue1

       

      And then a Core Bridge linking the two queues together.

       

      We're running mostly default configurations for HornetQ besides adding our own JMS queues and core bridges.

       

      This works perfectly well on a reliable network connection (WAN). However, when we move a server onto an unreliable, slow network, we consistently run into this problem where the queues get "stuck". Sometimes they recover on their own (after about 20 minutes) and the queues run again, but sometimes they don't. While a queue is stuck, its MBean page in the JBoss JMX console times out when loading; when the queues are running fine, the MBeans work normally. The MBean for the core bridge can still be accessed, but HornetQ reports a timeout when trying to stop the bridge.

       

      Is this an issue with how HornetQ handles a message that fails to deliver properly on an unreliable network? Or an issue with the core bridges? Or is there something I haven't configured that would help on unreliable network connections?

       

      Cheers,

      Jimmy

        • 1. Re: Stuck JMS message queues on unreliable networks?
          clebert.suconic
          • 2. Re: Stuck JMS message queues on unreliable networks?
            andreas_back

            Hello Jimmy,

             

            Your case sounds different, but you may want to check whether you have multiple consumers with different selectors.

             

            If this is true then

             

                 https://jira.jboss.org/browse/HORNETQ-469

             

            and the related thread could be of interest to you.

             

             

            Best regards,

             

            Andreas

            • 3. Re: Stuck JMS message queues on unreliable networks?
              jimmy.hui

              Ah, unfortunately we restrict each queue to one consumer. The issue I'm experiencing is similar to http://community.jboss.org/message/549197#549197, except that we're using Core Bridges instead of JMS Bridges. We get the same scenario: we can't view the queue in the JMX console and the bridge appears stuck. Since it works reliably on a good network connection, I was wondering what the delivery process looks like when packets are lost, etc. I'll experiment with lowering the TTL and see whether the messages/queues become more responsive/reliable.

              • 4. Re: Stuck JMS message queues on unreliable networks?
                jimmy.hui

                From further testing today: 2 messages, each 4mb in size, were sent to 4 different JBoss servers: one with a 1mb link, one with a 512kbs link, and two with 256kbs links. The messages were delivered to the two servers on the faster links, but they seem to have gotten stuck on the 256kbs links. I left them overnight and the messages still had not been delivered. Sending and receiving small messages on other queues on these two slower servers still worked, however.
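For a sense of scale, the following is a rough, hypothetical estimate of how long a single message of this size would occupy each link. It assumes "4mb" means 4 megabytes and that the link speeds are in kilobits per second (1024-based); the post does not say which units are meant. It also ignores protocol overhead, latency and retransmissions, so real transfers would be slower. The class and method names are illustrative only.

```java
// Back-of-the-envelope lower bound on transfer time for one message.
// Assumptions (not stated in the post): message size is in megabytes,
// link speed is in kilobits per second, and 1 k = 1024.
final class TransferTimeEstimate {

    /** Seconds needed to push messageBytes through a link of linkBitsPerSecond. */
    static double seconds(long messageBytes, long linkBitsPerSecond) {
        return (messageBytes * 8.0) / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        long messageBytes = 4L * 1024 * 1024;   // one 4 MB message
        long[] linksKbps = {1024, 512, 256};    // the three link speeds from the test
        for (long kbps : linksKbps) {
            System.out.printf("%4d kbit/s -> %.0f s per message%n",
                    kbps, seconds(messageBytes, kbps * 1024));
        }
    }
}
```

Under these assumptions one message needs about 32 s, 64 s and 128 s on the three links respectively. If any timeout involved in the transfer is shorter than the time needed on the slowest link, that might be one place to look, though this estimate alone does not show that it is the cause.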

                 

                An exception was thrown because of a Netty connection error:

                 

                ERROR [org.hornetq.core.remoting.impl.netty.NettyConnector] Failed to create netty connection
                java.net.SocketTimeoutException: connect timed out
                     at java.net.PlainSocketImpl.socketConnect(Native Method)
                     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
                     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
                     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
                     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
                     at java.net.Socket.connect(Socket.java:529)
                     at org.jboss.netty.channel.socket.oio.OioClientSocketPipelineSink.connect(OioClientSocketPipelineSink.java:114)
                     at org.jboss.netty.channel.socket.oio.OioClientSocketPipelineSink.eventSunk(OioClientSocketPipelineSink.java:74)
                     at org.jboss.netty.channel.Channels.connect(Channels.java:541)
                     at org.jboss.netty.channel.AbstractChannel.connect(AbstractChannel.java:217)
                     at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:227)
                     at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:188)
                     at org.hornetq.core.remoting.impl.netty.NettyConnector.createConnection(NettyConnector.java:447)
                     at org.hornetq.core.client.impl.FailoverManagerImpl.getConnection(FailoverManagerImpl.java:950)
                     etc...
                


                I'm not sure whether that exception explains why the messages weren't delivered, as I assume the connection would be re-established.
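To illustrate what "re-established" could look like in practice, here is a minimal, generic sketch of a reconnect schedule with a growing retry interval and a cap. It is not HornetQ's implementation; the class name, method name and the numbers used are assumptions chosen for illustration. Whether and how often a bridge actually retries depends on its configured retry and reconnect settings.

```java
import java.util.ArrayList;
import java.util.List;

// Generic reconnect schedule: each failed attempt waits longer than the
// previous one, up to a maximum. Illustrative only, not HornetQ code.
final class ReconnectSchedule {

    /**
     * Returns the delay (in milliseconds) before each of the first
     * `attempts` reconnect attempts.
     */
    static List<Long> delays(long initialMs, double multiplier, long maxMs, int attempts) {
        List<Long> out = new ArrayList<>();
        double next = initialMs;
        for (int i = 0; i < attempts; i++) {
            out.add(Math.min((long) next, maxMs)); // never wait longer than maxMs
            next *= multiplier;                    // grow the interval for the next try
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical values: start at 2 s, double each time, cap at 60 s.
        System.out.println(delays(2000, 2.0, 60000, 6));
    }
}
```

With a finite number of attempts, a schedule like this eventually gives up, which would leave messages sitting in the queue until something restarts the connection; with unlimited attempts it keeps trying but may wait a long time between tries on a link that keeps timing out.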