2 Replies Latest reply on Jul 12, 2011 5:01 PM by zohar_melamed

    Receive failure with Large Messages

    zohar_melamed

      Hi

       

      We are seeing a failure scenario in which some messages (it always seems to be 31, for some reason) are left in flight and never processed.

       

      The server thread dump looks fine and the server is able to do new jobs.

       

      The client thread dump shows a thread waiting mid-read:

       

      "SingleThreadedMessageSubscriber-77" - Thread t@101

         java.lang.Thread.State: TIMED_WAITING

                at java.lang.Object.wait(Native Method)

                - waiting on <acafce> (a org.hornetq.core.client.impl.LargeMessageControllerImpl)

                at org.hornetq.core.client.impl.LargeMessageControllerImpl.waitCompletion(LargeMessageControllerImpl.java:304)

                at org.hornetq.core.client.impl.LargeMessageControllerImpl.saveBuffer(LargeMessageControllerImpl.java:283)

                at org.hornetq.core.client.impl.ClientLargeMessageImpl.checkBuffer(ClientLargeMessageImpl.java:201)

                at org.hornetq.core.client.impl.ClientLargeMessageImpl.getBodyBuffer(ClientLargeMessageImpl.java:101)

                at org.hornetq.jms.client.HornetQMessage.doBeforeReceive(HornetQMessage.java:890)

                at org.hornetq.jms.client.HornetQTextMessage.doBeforeReceive(HornetQTextMessage.java:141)

                at org.hornetq.jms.client.HornetQMessageConsumer.getMessage(HornetQMessageConsumer.java:236)

                at org.hornetq.jms.client.HornetQMessageConsumer.receive(HornetQMessageConsumer.java:133)

                at giraffe.messaging.jms.JmsSessionProvider$1.receive(JmsSessionProvider.java:162)

                at giraffe.messaging.jms.MessageSubscriberWithQuota.consume(MessageSubscriberWithQuota.java:39)

                at giraffe.messaging.RoundRobinMessageSubscriber.consume(RoundRobinMessageSubscriber.java:23)

                at giraffe.messaging.SingleThreadedMessageSubscriber$1.run(SingleThreadedMessageSubscriber.java:31)

                at java.lang.Thread.run(Thread.java:662)

       

       

      The majority of the other HornetQ threads look like this (there are many of them, as we have many consumers):

       

       

       

      "SingleThreadedMessageSubscriber-86" - Thread t@110

         java.lang.Thread.State: TIMED_WAITING

                at java.lang.Object.wait(Native Method)

                - waiting on <1af6add> (a org.hornetq.core.client.impl.ClientConsumerImpl)

                at org.hornetq.core.client.impl.ClientConsumerImpl.receive(ClientConsumerImpl.java:239)

                at org.hornetq.core.client.impl.ClientConsumerImpl.receive(ClientConsumerImpl.java:358)

                at org.hornetq.jms.client.HornetQMessageConsumer.getMessage(HornetQMessageConsumer.java:224)

                at org.hornetq.jms.client.HornetQMessageConsumer.receive(HornetQMessageConsumer.java:133)

                at giraffe.messaging.jms.JmsSessionProvider$1.receive(JmsSessionProvider.java:162)

                at giraffe.messaging.jms.MessageSubscriberWithQuota.consume(MessageSubscriberWithQuota.java:39)

                at giraffe.messaging.RoundRobinMessageSubscriber.consume(RoundRobinMessageSubscriber.java:23)

                at giraffe.messaging.SingleThreadedMessageSubscriber$1.run(SingleThreadedMessageSubscriber.java:31)

                at java.lang.Thread.run(Thread.java:662)

       

       

         Locked ownable synchronizers:

                - None
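      With this many consumer threads, tallying the dump by the class each thread is waiting on makes the odd one out easier to spot. A minimal sketch using only the JDK, assuming the dump has been saved as text in the format shown above; `DumpTally` and `tally` are hypothetical names, not part of HornetQ:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts threads in a textual thread dump by the
// class of the object they are waiting on or parked for.
final class DumpTally {

    // Matches lines such as:
    //   - waiting on <acafce> (a org.hornetq.core.client.impl.LargeMessageControllerImpl)
    //   - parking to wait for <1785e26> (a java.util.concurrent.locks...ConditionObject)
    private static final Pattern WAIT = Pattern.compile(
            "-\\s+(?:waiting on|parking to wait for)\\s+<[^>]+>\\s+\\(a\\s+([^)]+)\\)");

    // Returns a sorted map from class name to number of waiting threads.
    static Map<String, Integer> tally(String dump) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        Matcher m = WAIT.matcher(dump);
        while (m.find()) {
            String cls = m.group(1);
            Integer old = counts.get(cls);
            counts.put(cls, old == null ? 1 : old + 1);
        }
        return counts;
    }
}
```

      In a dump like ours, most entries would fall under `ClientConsumerImpl` (idle consumers), and any thread counted under `LargeMessageControllerImpl` is one blocked part-way through a large message.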

       

      Other HornetQ threads:

       

      "Thread-1 (group:HornetQ-client-global-scheduled-threads-2883071)" - Thread t@288

         java.lang.Thread.State: TIMED_WAITING

                at sun.misc.Unsafe.park(Native Method)

                - parking to wait for <1785e26> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

                at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)

                at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)

                at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)

                at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)

                at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)

                at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)

                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)

                at java.lang.Thread.run(Thread.java:662)

       

       

       

      "Thread-0 (group:HornetQ-client-global-scheduled-threads-2883071)" - Thread t@22

         java.lang.Thread.State: TIMED_WAITING

                at sun.misc.Unsafe.park(Native Method)

                - parking to wait for <1785e26> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

                at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)

                at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)

                at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)

                at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)

                at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)

                at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)

                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)

                at java.lang.Thread.run(Thread.java:662)

       

       

         Locked ownable synchronizers:

                - None

       

       

       

       

       

      "Old I/O client worker ([id: 0x0126382b, /159.156.0.180:20918 => slon19p10701b.csfb.cs-group.com/159.156.0.182:61627])" - Thread t@287

         java.lang.Thread.State: RUNNABLE

                at java.net.SocketInputStream.socketRead0(Native Method)

                at java.net.SocketInputStream.read(SocketInputStream.java:129)

                at java.net.SocketInputStream.read(SocketInputStream.java:182)

                at java.io.FilterInputStream.read(FilterInputStream.java:66)

                at java.io.PushbackInputStream.read(PushbackInputStream.java:122)

                at org.jboss.netty.channel.socket.oio.OioWorker.run(OioWorker.java:76)

                at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)

                at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)

                at org.jboss.netty.util.VirtualExecutorService$ChildExecutorRunnable.run(VirtualExecutorService.java:181)

                at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

                at java.lang.Thread.run(Thread.java:662)

       

       

         Locked ownable synchronizers:

                - locked <1d92c7d> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

       

       

       

       

      We cannot see any exception in the log that would lead to this issue.
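      Since the log shows nothing, one way to watch for the hang from inside the client is to scan the live threads for a frame in the method seen in the first dump. A minimal sketch using only the JDK; `StuckThreadFinder` and `threadsIn` are hypothetical names, not part of HornetQ:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: lists live threads that currently have a frame
// in the given class and method anywhere on their stack.
final class StuckThreadFinder {

    static List<String> threadsIn(String className, String methodName) {
        List<String> names = new ArrayList<String>();
        // Snapshot of every live thread and its current stack trace.
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : e.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(e.getKey().getName());
                    break; // one match per thread is enough
                }
            }
        }
        return names;
    }
}
```

      Calling `StuckThreadFinder.threadsIn("org.hornetq.core.client.impl.LargeMessageControllerImpl", "waitCompletion")` at intervals would list the consumer threads inside that call at each check. A thread name that persists across several checks would be consistent with the hang described here, though a single snapshot cannot tell a stuck thread from a slow read.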

       

       

      Any ideas, thoughts or direction welcome.

       

      Zohar

        • 1. Re: Receive failure with Large Messages
          clebert.suconic

          I - Are you acknowledging the messages?

          II - If you are using transactions, how are you committing?

          III - What max-size and page-mode are you using?

           

           

          If you answer these questions, you may find the issue yourself.

          • 2. Re: Receive failure with Large Messages
            zohar_melamed

            Thanks for the reply Clebert.

             

            We use neither transactions nor persistent messages

            We use auto acknowledge

            We use the defaults for most things so we do not set max-size and page-mode

            (I couldn't find a reference to page-mode in the docs.)

             

            We have pre-fetch switched off (consumer-window-size set to 0)

             

            I'll have a read of the details for the latter.

             

            I changed the client (as I did for the server yesterday, due to the deadlock) to not use NIO, and we have not seen this issue today.