5 Replies Latest reply on Jun 10, 2011 11:05 AM by clebert.suconic

    messages stuck in "delivering" state

    carl.heymann

      Hi

       

      We are trying to swap out JBM for HornetQ, and have it running in our system test

      environment at the moment. HornetQ is in standalone, non-clustered mode with most

      default configuration settings still in place.

       

      The system is set up as services, each with a queue and a consumer pool.
      Consumers process messages in JMS transactions (transacted sessions). When a
      problem occurs during processing, the JMS transaction is still completed
      normally, but a message is sent back to the same queue with the
      _HQ_SCHED_DELIVERY property set to a time in the future. The idea is that the
      same request is retried at a later stage. We enforce the maximum number of
      scheduled redelivery attempts ourselves, and we also back off the redelivery
      time manually.
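
      The retry scheme described above could be sketched roughly as follows. This
      is an illustration only, not the actual code from this system: the class
      name, the doubling back-off policy, and the values of MAX_ATTEMPTS,
      BASE_DELAY_MS and MAX_DELAY_MS are all assumptions. It only computes the
      timestamp that would go into _HQ_SCHED_DELIVERY and decides when to give up.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that computes the value for _HQ_SCHED_DELIVERY. */
public final class RedeliveryBackoff {
    // Assumed values, chosen only for illustration.
    static final int MAX_ATTEMPTS = 5;
    static final long BASE_DELAY_MS = TimeUnit.SECONDS.toMillis(30);
    static final long MAX_DELAY_MS = TimeUnit.MINUTES.toMillis(30);

    private RedeliveryBackoff() {}

    /** Delay before the given attempt (1-based): doubles each time, capped at MAX_DELAY_MS. */
    static long delayMillis(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        // Bound the shift so the multiplication cannot overflow a long.
        int shift = Math.min(attempt - 1, 30);
        return Math.min(BASE_DELAY_MS << shift, MAX_DELAY_MS);
    }

    /** True once the attempts are used up and the message should go to the DLQ instead. */
    static boolean exhausted(int attempt) {
        return attempt > MAX_ATTEMPTS;
    }

    /** Absolute time (epoch millis) at which the rescheduled message should be delivered. */
    static long scheduledDeliveryTime(long nowMillis, int attempt) {
        return nowMillis + delayMillis(attempt);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        for (int attempt = 1; attempt <= MAX_ATTEMPTS + 1; attempt++) {
            if (exhausted(attempt)) {
                System.out.println("attempt " + attempt + ": send to DLQ");
            } else {
                System.out.println("attempt " + attempt + ": redeliver at "
                        + scheduledDeliveryTime(now, attempt));
            }
        }
    }
}
```

      In the consumer, the computed value would be set on the resent message
      with the standard JMS call message.setLongProperty("_HQ_SCHED_DELIVERY", time)
      before sending it in the same transacted session.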

       

      We had a lot of these failures today because an external resource was
      failing, which caused many such scheduled messages to pile up in the queue.
      When the resource was restored, the scheduled messages started getting
      processed as their scheduled times arrived. At the end, however, the queue
      still showed a DeliveringCount of 8. Restarting all consumers does not change
      this. The state is currently:

       

      ConsumerCount    1

      DeliveringCount  8

      MessageCount    19

       

       

      What could cause the messages to stay in the "delivering" state, even if all

      consumers are restarted?

       

      Some more background:

      - Initially HornetQ was configured to BLOCK instead of PAGE. This caused
        the system to hang once the dead letter queue filled up, along with the
        service queue that had problems (we manually send to the DLQ once the
        redelivery attempts run out). In this blocked state, many queues had
        high "DeliveringCount" values.
      - I then changed it to PAGE, which loosened up the system: the messages
        could move to the DLQ, and new messages could be accepted into the
        service queues.
      - While the system was processing these messages, I stopped HornetQ,
        increased the max size and max page size, and started HornetQ again.
        There were some warnings like this:

       

      WARNING [org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl]  Couldn't locate page transaction 42734785, ignoring message on position PagePositionImpl [pageNr=1, messageNr=14, recordID=0]

       

      Could this page size increase have caused some messages to be orphaned?

       

      Thanks

      Carl