9 Replies Latest reply on Apr 18, 2017 5:26 AM by coenenpe

    Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'

    coenenpe

      Hi,

      I have the following configuration:

      • Wildfly 10.0.0 - Final
      • Full-HA replication

       

      On our production system I notice that some messages which are available on the queue are never consumed by our MDB. It only happens after a couple of days, and up to now I have been unable to reproduce it in a simple isolated test.

      When restarting the server, it runs fine for a couple of days before getting into the same state.

      Other messages which are sent to the queue, with the same priority and at a later timestamp, are consumed immediately.

      When requesting "list-delivering-messages-as-json" on the queue via the CLI, I can see all my 'lost' messages (see the log below).

       

      In the log I see that 6 consumers are listed as holding some of these messages.

       

      "durable":true,"address":"jms.queue.jms.queue.myQueue","__AMQ_CID":"2a35fea0-e76a-11e6-bb94-cdf9f003cebb","messageID":316711,"expiration":0,"type":2,"priority":4,"userID":"ID:49a381ac-e79e-11e6-bb94-cdf9f003cebb","timestamp":1485857770768},
      {"durable":true,"address":"jms.queue.jms.queue.myQueue","__AMQ_CID":"adabb8f2-e79e-11e6-bb94-cdf9f003cebb","messageID":316847,"expiration":0,"type":2,"priority":4,"userID":"ID:af21de39-e79e-11e6-bb94-cdf9f003cebb","timestamp":148585794104
      <...>
      <...>
      ,"consumerName":"ServerConsumer [id=e0826d31-e707-11e6-bb94-cdf9f003cebb:e0826d32-e707-11e6-bb94-cdf9f003cebb:0, filter=null, binding=LocalQueueBinding [address=jms.queue.jms.queue.myQueue, queue=QueueImpl[name=jms.queue.jms.queueddd.myQueue, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=f2b03d8c-e4a2-11e6-bb94-cdf9f003cebb]]@649168c2]]"},
      
      
      <...>
      <Some more messages with other consumers>
      

       

      In the queue Metrics I see:

      • Consumer count: 15
      • Message count: 1753 (according to my own logs these have never been processed by our MDB yet; all of these messages are listed in the 'delivering' list above)
      • Messages added: 130000
      • Scheduled Count: 0

       

      My MDB config:

      @MessageDriven(mappedName = "ourNotificationDispatcher", activationConfig = {
         @ActivationConfigProperty(propertyName = "acknowledgeMode", propertyValue = "Auto-acknowledge"),
         @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
         @ActivationConfigProperty(propertyName = "destination", propertyValue = "jms/queue/myQueue")
      })
      public class EventDispatcher implements MessageListener {
         @Override
         public void onMessage(Message message) {
            // Processing of the message
         }
      }
      

       

       

      Do you have any idea what could be causing this issue?

       

      Could it be related to this problem: https://developer.jboss.org/message/953165#953165 ? Do you have any thoughts on this, jbertram?

       

      Many thanks in advance!

        • 1. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
          mnovak

          Hi Peter,

           

           there might be a number of problems. The first thing that comes to mind is that some messages got "paged" to disk while the MDBs are processing newer messages which are still in memory. This can happen if there is a constant flow of messages.

           

           Furthermore, Artemis HA with a replicated journal in Wildfly 10.0.0.Final is unstable. A number of fixes have been applied since then, and it's possible that you are hitting one of them. I would suggest updating to a newer version of WF 10.x.

           

           If the problem persists, then try taking a thread dump of the server (jstack -l <pid>) to see if there are any "stuck" threads which might be "holding" the messages. The next step would be to enable trace logging for "org.apache.activemq.artemis" and "io.netty", then search the logs for one "stuck" messageID and see what is happening with that message.

           

          Cheers,

          Mirek

          • 2. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
            coenenpe

            Hi,

             Thank you for your previous reply!

            Unfortunately this problem still persists.

             I upgraded our environment to Wildfly 10.1.0.Final, but some messages still get stuck after some time (I'm unable to reproduce it locally or on a test environment).

             I made a thread dump to search for 'stuck' threads as you suggested, but I'm unable to identify them. Could you maybe have a look at it? I have attached the thread dump.

             What I find noticeable is that I have a lot (317) of threads with the same stack, in a parked state (see below). Are these ActiveMQ-related threads? Is this normal, and why are there so many created when I only have 10 MDBs?

             

             

            pool-2-thread-10 - priority:5 - threadId:0x000000003cc96800 - nativeId:0x1ebc - state:WAITING
            stackTrace:
            java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x0000000541d8efd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
            at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
            at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
            Locked ownable synchronizers:
            - None
            

             

             

            Many thanks in advance!

            • 3. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
              jbertram

              Couple of things:

              • Having lots of parked threads in Wildfly (and in Artemis) is normal.  Threads are used for lots of different tasks, not just delivering messages to MDBs.
              • In general, you can't find a stuck thread using a single thread dump.  You have to compare multiple thread dumps to see it.  Otherwise you're just looking at a snapshot in time.

               

              Your thread dump shows 4 MDBs receiving messages in these threads:

              1. "Thread-63 (ActiveMQ-client-global-threads-1856342127)" #612 daemon prio=5 os_prio=0 tid=0x000000002e7fd000 nid=0x258c runnable [0x0000000060b8b000]
              2. "Thread-33 (ActiveMQ-client-global-threads-1856342127)" #577 daemon prio=5 os_prio=0 tid=0x000000003a35e000 nid=0x96c runnable [0x000000005d86b000]
              3. "Thread-22 (ActiveMQ-client-global-threads-1856342127)" #562 daemon prio=5 os_prio=0 tid=0x000000003a360800 nid=0xd5c runnable [0x000000005e3db000]
              4. "Thread-19 (ActiveMQ-client-global-threads-1856342127)" #551 daemon prio=5 os_prio=0 tid=0x000000003a354800 nid=0x890 runnable [0x0000000052e2c000]

               

              Each of these threads is sitting in an invocation of java.net.SocketInputStream.socketRead() which may be blocking the thread and causing the issue.  You'd have to acquire additional thread dumps and find these threads in each one to know if the thread is actually stuck or not.

              • 4. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
                coenenpe

                Hi,

                First of all, thank you for your comment.

                 I took some additional thread dumps (see attached) as you suggested, and I checked whether the same threads are still in the 'running' state. I think this is the case.

                 I notice that they are 'busy' executing a REST call using the RESTEasy framework. I think they are blocked on a socket read, is this correct?

                 

                2 threads with trace:
                States: {RUNNABLE=2}
                Names:
                - "Thread-19 (ActiveMQ-client-global-threads-1856342127)"
                - "Thread-63 (ActiveMQ-client-global-threads-1856342127)"
                Stack:
                - java.net.SocketInputStream.socketRead0(Native Method)
                - java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
                - java.net.SocketInputStream.read(SocketInputStream.java:170)
                - java.net.SocketInputStream.read(SocketInputStream.java:141)
                - sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
                - sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
                - sun.security.ssl.InputRecord.read(InputRecord.java:532)
                - sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
                - - locked <IGNORED> (a java.lang.Object)
                - sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
                - sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
                - - locked <IGNORED> (a sun.security.ssl.AppInputStream)
                - org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
                - org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
                - org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
                - org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
                - org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
                - org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
                - org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
                - org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
                - org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
                - org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
                - org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
                - org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
                - org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
                - org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
                - org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
                - org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
                - org.jboss.resteasy.client.jaxrs.engines.ApacheHttpClient4Engine.invoke(ApacheHttpClient4Engine.java:283)
                - org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.invoke(ClientInvocation.java:436)
                - org.jboss.resteasy.client.jaxrs.internal.proxy.ClientInvoker.invoke(ClientInvoker.java:102)
                - org.jboss.resteasy.client.jaxrs.internal.proxy.ClientProxy.invoke(ClientProxy.java:76)
                

                 

                 

                 I have a couple of questions regarding the functionality of ActiveMQ:

                 • If the REST call is in fact blocking the ActiveMQ thread, why aren't other messages on the queue getting picked up by the other MDBs? According to the thread dump only 4 ActiveMQ threads are blocked because of the REST problem, so I would think the other 6 MDBs should still be able to process the other messages? At the moment we have >1000 messages stuck.
                 • Why isn't the transaction reaper of ActiveMQ kicking in and killing the thread? I would expect it to enforce some kind of timeout?

                 

                 

                 Some questions regarding the blocked REST call:

                 We have the following dependencies defined for RESTEasy:

                <dependency>
                   <groupId>org.jboss.resteasy</groupId>
                   <artifactId>resteasy-jaxrs</artifactId>
                   <version>3.0.17.Final</version>
                   <scope>provided</scope>
                </dependency>
                <dependency>
                   <groupId>org.jboss.resteasy</groupId>
                   <artifactId>resteasy-client</artifactId>
                   <version>3.0.17.Final</version>
                   <scope>provided</scope>
                </dependency>
                <dependency>
                   <groupId>org.jboss.resteasy</groupId>
                   <artifactId>resteasy-jaxb-provider</artifactId>
                   <version>3.0.17.Final</version>
                   <scope>provided</scope>
                </dependency>
                

                 

                 To set up the connection we produce a proxy, as you can see below.

                 This proxy is injected into a service which is called by the MDBs when processing a message.

                 @Singleton
                 public class RestHttpClient {

                    private ResteasyWebTarget resteasyWebTarget;

                    @PostConstruct
                    public void init() {
                       ResteasyClientBuilder clientBuilder = new ResteasyClientBuilder();
                       clientBuilder.sslContext(sslContextConfig);
                       clientBuilder = clientBuilder.connectionPoolSize(20);
                       clientBuilder = clientBuilder.connectionTTL(10000, TimeUnit.MILLISECONDS);
                       this.resteasyWebTarget = clientBuilder.build().target(url);
                    }

                    @Produces
                    @DataRestServiceProxy
                    public IRestDataService dataRestInterface() {
                       return this.resteasyWebTarget.proxy(IRestDataService.class);
                    }
                 }
                

                 I notice that we didn't specify a socketTimeout; could this be causing the issue? The default seems to be '-1', which I guess means it waits forever?
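
                 For what it's worth, below is a rough sketch of how I intend to add the missing timeouts to the init() method above (the establishConnectionTimeout and socketTimeout calls on ResteasyClientBuilder are my assumption based on the RESTEasy 3.0.x API, and the 10/30-second values are just placeholders):

                 @PostConstruct
                 public void init() {
                    ResteasyClientBuilder clientBuilder = new ResteasyClientBuilder();
                    clientBuilder.sslContext(sslContextConfig);
                    clientBuilder = clientBuilder.connectionPoolSize(20);
                    clientBuilder = clientBuilder.connectionTTL(10000, TimeUnit.MILLISECONDS);
                    // Assumed additions: give up on connecting after 10s and on any blocking socket read after 30s,
                    // so a hung remote endpoint can no longer block the calling (MDB) thread forever
                    clientBuilder = clientBuilder.establishConnectionTimeout(10, TimeUnit.SECONDS);
                    clientBuilder = clientBuilder.socketTimeout(30, TimeUnit.SECONDS);
                    this.resteasyWebTarget = clientBuilder.build().target(url);
                 }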

                 I also found that it might be related to this bug; what do you think?

                [RESTEASY-975] Client Framework not honoring connection timeouts Apache Client 4.3 - JBoss Issue Tracker

                 Should I override the RESTEasy libraries provided in Wildfly with the latest ones?

                 

                Since I can't reproduce it, I just want to make sure I'm checking all possibilities.

                 

                Many thanks in advance!

                • 5. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
                  jbertram

                   If the REST call is in fact blocking the ActiveMQ thread, why aren't other messages on the queue getting picked up by the other MDBs? According to the thread dump only 4 ActiveMQ threads are blocked because of the REST problem, so I would think the other 6 MDBs should still be able to process the other messages? At the moment we have >1000 messages stuck.

                   My guess is that the 4 blocked MDBs have all those messages in their buffer.  Every consumer has a message buffer to mitigate the performance penalty you'd pay if you had to make a network round-trip to fetch every message.  You can control the size of the buffer on the MDB by using the "consumerWindowSize" activation configuration property.  See the documentation for more details.  For local MDBs it's not really necessary to buffer messages, so you can safely set consumerWindowSize to 0.
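
                   For example, on the MDB you posted earlier that would look something like this (just a sketch, untested against your deployment):

                   @MessageDriven(mappedName = "ourNotificationDispatcher", activationConfig = {
                      @ActivationConfigProperty(propertyName = "acknowledgeMode", propertyValue = "Auto-acknowledge"),
                      @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
                      @ActivationConfigProperty(propertyName = "destination", propertyValue = "jms/queue/myQueue"),
                      // disable client-side buffering so a blocked MDB can't hoard messages other consumers could process
                      @ActivationConfigProperty(propertyName = "consumerWindowSize", propertyValue = "0")
                   })
                   public class EventDispatcher implements MessageListener {

                      @Override
                      public void onMessage(Message message) {
                         // Processing of the message
                      }
                   }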

                   

                   Why isn't the transaction reaper of ActiveMQ kicking in and killing the thread? I would expect it to enforce some kind of timeout?

                   The default JTA transaction timeout is 300 seconds (i.e. 5 minutes).  Assuming your MDB is consuming the message in a JTA transaction, the transaction manager will abort the transaction, but it won't "kill" the thread.  That issue has been covered many times on the forum over the years.  Aborting the transaction will cause a rollback and potentially a redelivery (depending on your configuration).
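
                   If the default 300 seconds doesn't fit the work your MDB does, you can override the timeout per bean.  A minimal sketch, assuming the JBoss-specific org.jboss.ejb3.annotation.TransactionTimeout annotation (not standard Java EE) is available to your deployment:

                   import java.util.concurrent.TimeUnit;

                   import javax.ejb.MessageDriven;
                   import javax.jms.Message;
                   import javax.jms.MessageListener;

                   import org.jboss.ejb3.annotation.TransactionTimeout;

                   // Raise the JTA transaction timeout for this bean to 10 minutes (the default is 300 seconds).
                   // Note: when the timeout expires the transaction manager only aborts the transaction; it never kills the thread.
                   @TransactionTimeout(value = 600, unit = TimeUnit.SECONDS)
                   @MessageDriven(mappedName = "ourNotificationDispatcher")
                   public class EventDispatcher implements MessageListener {

                      @Override
                      public void onMessage(Message message) {
                         // long-running processing, e.g. the blocking REST call
                      }
                   }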

                   

                  I can't really address any of your questions about RESTEasy.  I'd recommend you start a new thread to deal with those specific questions.

                  • 6. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
                    jbertram

                    Any progress on this?

                    • 7. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
                      coenenpe

                      Hi Justin,

                       

                       We investigated the issue further and think we have a problem with threads blocking indefinitely because of the timeout problems in RESTEasy.

                       We upgraded the RESTEasy library in Wildfly and specified some extra timeouts in our code.

                       In a couple of days our new version will go to our acceptance environment for further testing. Hopefully this will solve the problem once and for all. I'll let you know whether it's solved.

                      • 8. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
                        jbertram

                        Thanks for the update.  Your conclusion makes sense given the evidence presented so far.

                        • 9. Re: Wildfly 10.0.0-Final  - ActiveMQ messages stuck in 'Delivering'
                          coenenpe

                          Justin,

                           The issue has been resolved. Up to now no messages have become 'stuck' anymore.

                          Thanks for your support!