4 Replies Latest reply on Jun 7, 2012 7:59 PM by clebert.suconic

    cluster queue message distribution stop on failure?

    ekfliu

      I have set up a clustered queue with HA on JBoss 6. One issue I have is that message redistribution stops whenever one of the servers throws an exception during message consumption, usually with a log entry like this:

       

      WARN  [org.hornetq.core.server.impl.QueueImpl] removing consumer which did not handle a message, consumer=org.hornetq.core.server.cluster.impl.ClusterConnectionBridge@5852cfc2, message=Reference[1614]:RELIABLE:ServerMessage[messageID=1614,priority=4,expiration=[null], durable=true, address=jms.queue.luceneQueue,properties=TypedProperties[{_HQ_LARGE_SIZE=179288, _HQ_ROUTE_TOsf.my-cluster.d74ec64b-d424-11e0-9efe-005056a50009=[B@7cdff405}]]: java.lang.NullPointerException

       

      How can I stop this from happening? If I temporarily remove a resource from a server, such as the database or the network, it should not cause the entire queue to shut down. I would expect the message to be sent back to the queue and redelivered later.
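
      To make the expected behaviour concrete, here is a toy in-memory model of "send the message back to the queue and redeliver it later". It is only a sketch: it does not use HornetQ or JMS, and the names RedeliverySketch, Handler, drain and maxAttempts are made up for illustration. It assumes a failed message goes to the tail of the queue and is dropped into a dead-letter list after a fixed number of attempts.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model only: not HornetQ code, all names here are hypothetical.
public class RedeliverySketch {

    /** Stand-in for application code that consumes one message. */
    public interface Handler {
        void handle(String message) throws Exception;
    }

    /** A queued message plus the number of delivery attempts made so far. */
    private static final class Delivery {
        final String body;
        int attempts;

        Delivery(String body) {
            this.body = body;
        }
    }

    /**
     * Delivers every message. A message whose handler throws is put back at
     * the tail of the queue and retried until maxAttempts is reached, after
     * which it is added to deadLetters. The queue itself keeps running.
     * Returns the successfully handled bodies in completion order.
     */
    public static List<String> drain(List<String> messages, Handler handler,
                                     int maxAttempts, List<String> deadLetters) {
        Deque<Delivery> queue = new ArrayDeque<>();
        for (String m : messages) {
            queue.addLast(new Delivery(m));
        }
        List<String> handled = new ArrayList<>();
        while (!queue.isEmpty()) {
            Delivery d = queue.pollFirst();
            d.attempts++;
            try {
                handler.handle(d.body);
                handled.add(d.body);
            } catch (Exception e) {
                if (d.attempts < maxAttempts) {
                    queue.addLast(d);        // redeliver later
                } else {
                    deadLetters.add(d.body); // give up on this message only
                }
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        // Simulate a resource (e.g. a database) that is down for the first two calls.
        final int[] calls = {0};
        Handler flaky = message -> {
            if (++calls[0] <= 2) {
                throw new IllegalStateException("database unavailable");
            }
        };
        List<String> dead = new ArrayList<>();
        List<String> handled = drain(Arrays.asList("a", "b", "c"), flaky, 3, dead);
        System.out.println("handled=" + handled + " dead=" + dead);
        // prints: handled=[c, a, b] dead=[]
    }
}
```

      In this model a consumer failure only delays the affected messages; the consumer is never removed from the queue, which is the behaviour I was expecting from the cluster.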

        • 1. Re: cluster queue message distribution stop on failure?
          ekfliu

          Googling seems to indicate that the cluster core bridge will attempt to connect to a server that is in the cluster. If you shut down the server, it is no longer part of the cluster and the core bridge will no longer connect to it. But if I restart the server, the bridge should attempt to reconnect to the now restarted server, and I don't think this is happening.

           

          However, the error message above seems to indicate that the cluster core bridge itself is being kicked off the queue as a consumer, so I am not sure that a server leaving the cluster really is the reason.

          • 2. Re: cluster queue message distribution stop on failure?
            clebert.suconic

            Coincidentally, I'm currently fixing an issue with LargeMessages in clustering. I should update this soon.

            • 3. Re: cluster queue message distribution stop on failure?
              safetytrick

              I'm seeing a similar error using HornetQ 2.2.13. I have a topic that will occasionally throw the error pasted below. When the error shows up, the connection between the server that threw the error and (I would guess) whichever server it was connected to when the error was thrown is severed. Messages no longer pass from A -> B, while messages from B -> A are processed normally. The problem does not occur often: so far only a few times in weeks of usage. Bouncing server A fixes the problem.

               

              Michael

               

              11:26:00,646 WARN  [org.hornetq.core.server.impl.QueueImpl] (Thread-5 (HornetQ-server-HornetQServerImpl::serverUUID=8893baec-a228-11e1-ab74-00163e27a659-1239330288))  removing consumer which did not handle a message, consumer=ClusterConnectionBridge@3b59d796 [name=sf.default-cluster-connection.88e3d6a5-a228-11e1-89f9-00163e6b2332,  queue=QueueImpl[name=sf.default-cluster-connection.88e3d6a5-a228-11e1-89f9-00163e6b2332,  postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=8893baec-a228-11e1-ab74-00163e27a659]]@758d74b  targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@3b59d796 [name=sf.default-cluster-connection.88e3d6a5-a228-11e1-89f9-00163e6b2332,  queue=QueueImpl[name=sf.default-cluster-connection.88e3d6a5-a228-11e1-89f9-00163e6b2332,  postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=8893baec-a228-11e1-ab74-00163e27a659]]@758d74b  targetConnector=ServerLocatorImpl [initialConnectors=[org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=app01-dot8-va-us-attask-com],  discoveryGroupConfiguration=null]]::ClusterConnectionImpl@629106582[nodeUUID=8893baec-a228-11e1-ab74-00163e27a659,  connector=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=app02-dot8-va-us-attask-com,  address=jms, server=HornetQServerImpl::serverUUID=8893baec-a228-11e1-ab74-00163e27a659]))  [initialConnectors=[org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=app01-dot8-va-us-attask-com],  discoveryGroupConfiguration=null]], message=Reference[827145]:RELIABLE:ServerMessage[messageID=827145,priority=4,  bodySize=24000,expiration=0, durable=true, address=jms.topic.topic/toplinkSynchronization,properties=TypedProperties[{_HQ_ROUTE_TOsf.default-cluster-connection.88e3d6a5-a228-11e1-89f9-00163e6b2332=[B@3dcf748f,  _HQ_ROUTE_TOsf.default-cluster-connection.877245b6-a228-11e1-92e6-52540025082b=[B@7eed77fc}]]@408978046:  
              java.lang.IndexOutOfBoundsException
                  at org.jboss.netty.buffer.AbstractChannelBuffer.setIndex(AbstractChannelBuffer.java:67) [netty-3.2.6.Final.jar:]
                  at org.hornetq.core.buffers.impl.ChannelBufferWrapper.setIndex(ChannelBufferWrapper.java:497) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.message.impl.MessageImpl.<init>(MessageImpl.java:182) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.message.impl.MessageImpl.<init>(MessageImpl.java:146) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.ServerMessageImpl.<init>(ServerMessageImpl.java:90) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.ServerMessageImpl.copy(ServerMessageImpl.java:206) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.cluster.impl.ClusterConnectionBridge.beforeForward(ClusterConnectionBridge.java:184) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:548) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl.handle(QueueImpl.java:2195) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl.deliver(QueueImpl.java:1746) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl.doPoll(QueueImpl.java:1625) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl.access$1300(QueueImpl.java:77) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl$ConcurrentPoller.run(QueueImpl.java:2482) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100) [hornetq-core-2.2.13.Final.jar:]
                  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) [rt.jar:1.6.0_30]
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [rt.jar:1.6.0_30]
                  at java.lang.Thread.run(Unknown Source) [rt.jar:1.6.0_30]

               

              11:26:00,666 ERROR [org.hornetq.utils.OrderedExecutorFactory] (Thread-5 (HornetQ-server-HornetQServerImpl::serverUUID=8893baec-a228-11e1-ab74-00163e27a659-1239330288)) Caught unexpected Throwable: java.util.NoSuchElementException
                  at org.hornetq.utils.PriorityLinkedListImpl$PriorityLinkedListIterator.repeat(PriorityLinkedListImpl.java:189) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl.deliver(QueueImpl.java:1763) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl.doPoll(QueueImpl.java:1625) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl.access$1300(QueueImpl.java:77) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.core.server.impl.QueueImpl$ConcurrentPoller.run(QueueImpl.java:2482) [hornetq-core-2.2.13.Final.jar:]
                  at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:100) [hornetq-core-2.2.13.Final.jar:]
                  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) [rt.jar:1.6.0_30]
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [rt.jar:1.6.0_30]
                  at java.lang.Thread.run(Unknown Source) [rt.jar:1.6.0_30]

              • 4. Re: cluster queue message distribution stop on failure?
                clebert.suconic

                I'm not aware of such an issue. (It's probably similar to your other post here: https://community.jboss.org/thread/170127?tstart=0)