7 Replies Latest reply on Jul 5, 2012 9:37 AM by bill.burke

    REST API: Changes proposed to avoid leaking server resources

    hstewart

      I would like to propose for discussion two small changes to the Hornet-Q REST API, which I believe are needed to avoid a potentially serious leakage of Hornet-Q resources.

       

      Background:

       

      My group at Juniper Networks is developing a system containing redundant REST clients that can failover from one to another. All of the REST clients use the Hornet-Q REST API to talk to a single Hornet-Q server (on a JBoss cluster). The REST clients create various queues, as well as push-subscriptions that forward messages from topics to queues on that single server, and pull consumers that read from queues.

       

      The problem:

       

      We have observed that under common failure scenarios, such as the abrupt death of a REST client, or the loss of network connectivity between the REST client and the Hornet-Q server, it is almost impossible to ensure that resources allocated on the Hornet-Q server by the REST client are properly deleted. These resources include queues, push-subscriptions, and pull-consumers. The resources will continue to live on the server long after the client that created them has died. We have observed the server become completely unresponsive, possibly as a result of this resource leak. The following are three cases in which we have observed Hornet-Q resources leaking:

       

      Case 1:

      1. A REST client creates a queue, and a pull-consumer to read messages from that queue.
      2. The REST client dies abruptly, without deleting the queue and the pull-consumer.
      3. The queue lives on either forever (durable) or until server restart (non-durable).
      4. Messages accumulate, consuming memory or disk.

      Case 2:

      1. A topic exists on the Hornet-Q server.
      2. The REST client creates a queue with the name "queueX", and a push-subscription that forwards messages from the topic to queueX.
      3. The REST client dies abruptly, without deleting queueX and the push-subscription.
      4. The push-subscription and queueX live on.
      5. Messages accumulate, consuming memory or disk.

      Case 3:

      1. Same as case 2, except the REST client successfully deletes queueX, but fails to delete the push-subscription.
      2. The following error appears many times in the JBoss server.log file, presumably once for each message that the push-subscription finds in the topic: javax.ws.rs.WebApplicationException org.hornetq.rest.queue.QueueDestinationsResource.findQueue(QueueDestinationsResource.java:153)

       

      Why this is a problem:

       

      Depending on the number of REST clients and the number of queues and the rate of message creation, this resource leakage can be significant. We have not found a way to use the Hornet-Q REST API that avoids this leakage. In other words, it is almost impossible for the client or set of redundant clients to reliably delete all queues, push-subscriptions, and pull-consumers that they create on the server, for the following reasons:

      • The client may die abruptly and never restart.
      • If the client does restart, or fails over to a backup redundant REST client, it might know the names of queues created by its predecessor based on its internal application logic, but it will not know the URIs of the push-subscription and pull-consumer created by its predecessor, as these URIs are generated by the server and returned to the REST client after the resources are created on the server. Therefore, it cannot perform the necessary HTTP DELETE operations.
      • If the client attempts to solve this by persistently storing the queue names and URIs that it receives from the HQ server, it may die before receiving them from the server and storing them on disk, and it would need to handle distributed persistence amongst the set of redundant REST clients, with all the network connectivity and transaction issues that entails. Here we are getting well beyond what should be expected of the REST clients.

       

      Proposed changes to the API and Hornet-Q behaviour:

       

      1. Modify the REST API such that when a REST client is POSTing to create a queue on the Hornet-Q server, it can specify an "idle timeout" value. If the queue is not read from for at least an "idle timeout" interval, the Hornet-Q server will delete the queue and all associated pull-consumers.
      2. Modify the REST API such that when a REST client is POSTing to create a push-subscription on the Hornet-Q server, it can specify an "error timeout" value. If the push-subscription fails to publish messages to its target for at least the "error timeout" interval, the Hornet-Q server will delete the push-subscription.
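To make proposal 1 concrete, here is a minimal sketch of the bookkeeping it implies. This is a toy model only, not HornetQ code: the class and method names (`IdleQueueReaper`, `record_read`, `reap`) are hypothetical, and the timeout is assumed to be in seconds. Proposal 2 could follow the same pattern, keyed on the time of the first consecutive push failure instead of the last read.

```python
import time
from dataclasses import dataclass


@dataclass
class TrackedQueue:
    """Hypothetical server-side record for one queue."""
    name: str
    idle_timeout_s: float  # client-supplied "idle timeout" (seconds assumed)
    last_read_at: float    # clock value at the most recent read attempt


class IdleQueueReaper:
    """Toy model of proposal 1: delete queues nobody reads from."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._queues = {}

    def create_queue(self, name, idle_timeout_s):
        # Creation counts as activity, so the idle interval starts now.
        self._queues[name] = TrackedQueue(name, idle_timeout_s, self._clock())

    def record_read(self, name):
        # Called whenever a pull-consumer attempts to read from the queue.
        self._queues[name].last_read_at = self._clock()

    def reap(self):
        """Delete every queue idle for at least its timeout; return their names."""
        now = self._clock()
        expired = [q.name for q in self._queues.values()
                   if now - q.last_read_at >= q.idle_timeout_s]
        for name in expired:
            del self._queues[name]
        return expired

    def names(self):
        return sorted(self._queues)
```

A server would call `reap()` periodically. In this model a queue abandoned by a dead client disappears one timeout interval after its last read, without the client or its successor needing to know any server-generated URIs.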

       

      Is this a real problem, or is there some existing means to avoid these resource leakage issues?

      If it is a real problem, does anyone have other ideas about how to solve these leakage issues?

        • 1. REST API: Changes proposed to avoid leaking server resources
          bill.burke

          Pull queue consumers should idle, timeout, and be garbage collected.  This is configurable.

           

           

          Ok, can look to add some support/fix for this.  Maybe another option for push consumers is not to delete the consumer, but instead just acknowledge the message?  That way the push subscription can still live, but start to ignore messages.  Not sure if you need that or not.

          • 2. REST API: Changes proposed to avoid leaking server resources
            hstewart

            Thanks Bill for correcting me re. the existence of an idle timeout for pull-consumers.

             

            In our particular application, having the target of a push-subscription receive and ignore messages, thus allowing the push-subscription itself to live on, is neither desirable nor practical, but it may well be a highly desirable strategy for other applications. In the failure scenarios specific to our application, we need a mechanism that will delete abandoned queues and push-subscriptions automatically.

             

            Given that the failure scenarios that leave HQ resources abandoned are likely to be application-specific, I think it makes sense for the application (i.e. the REST client) to specify at queue and push-subscription creation time the criteria to be used by the Hornet-Q server to determine when to automatically delete queues that are not being read from, and push-subscriptions whose target is not reachable.

             

            Being able to specify:

            1. a queue "idle timeout" (defined as: a time period in which no attempts are made to read from the queue)
            2. a push-subscription "error timeout" (defined as: a time period in which the push-subscription's target is known to be unreachable)

            would be appropriate criteria for our particular application.

             

            Does anyone have comments on what might be appropriate criteria for other applications?

            • 3. REST API: Changes proposed to avoid leaking server resources
              bill.burke

              Yup sounds good, I'll schedule this to be added.

              • 4. REST API: Changes proposed to avoid leaking server resources
                hstewart

                That is much appreciated, Bill. Thank you.

                • 5. REST API: Changes proposed to avoid leaking server resources
                  bill.burke

                  Ok, here's what I've done (you'll see it with the 2.2.2 release of hornetq)

                   

                  I've added a disableOnFailure element to the push registration xml.  Also maxRetries and retryWaitMillis.  This allows you to specify how many retries you want on a connection failure.  If there are too many connection failures when pushing a specific message, the registration will be disabled.  For topics, this means that the subscription will be deleted, even if it is durable.
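As a rough illustration, a client could build a registration body carrying these three elements as sketched below. Only the element names `disableOnFailure`, `maxRetries` and `retryWaitMillis` come from the post above. The root element `push-registration`, the `link` element with its `href` attribute, and the helper name `build_push_registration` are assumptions for illustration, so check the exact document layout against the REST interface manual for your release.

```python
import xml.etree.ElementTree as ET


def build_push_registration(target_url, max_retries, retry_wait_millis,
                            disable_on_failure=True):
    """Build a push registration XML body (layout partly assumed, see lead-in)."""
    root = ET.Element("push-registration")  # assumed root element name
    link = ET.SubElement(root, "link")      # assumed element naming the push target
    link.set("href", target_url)
    # The three elements below are the ones named in the post.
    ET.SubElement(root, "maxRetries").text = str(max_retries)
    ET.SubElement(root, "retryWaitMillis").text = str(retry_wait_millis)
    ET.SubElement(root, "disableOnFailure").text = (
        "true" if disable_on_failure else "false")
    return ET.tostring(root, encoding="unicode")


# Hypothetical target URL; substitute the real push destination.
body = build_push_registration("http://example.com/target", 5, 1000)
```

With values like these, the intent would be up to 5 retries spaced 1000 ms apart for a given message before the registration is disabled.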

                   

                  For pull topic subscriptions, you can specify a delete-when-idle parameter and an idle-timeout.  It will delete the underlying jms subscription if the idle time is met.  It's not perfect.  The idle timeout will not trigger after a server restart unless the client manually recreates the subscription.

                   

                  I did not implement dead queue reaping. Queue pull consumers already clean themselves up.  Pull topic subscriptions are a little messy still.  Also, when sending a message you can specify an expiration or a ttl on the individual message.  Finally, a producer-time-to-live default config is provided if you want to set the default TTL for every message posted.

                   

                  I don't have something fully complete because I want support for this within core HornetQ and not managed by the REST layer.  I would have had to add a whole persistence layer which is something I didn't want to do.  Also idle config at the REST level doesn't make sense if you're mixing non-REST producers and consumers with REST ones.  Sorry it's not complete, but hopefully it's enough to get by for now.

                  • 6. Re: REST API: Changes proposed to avoid leaking server resources
                    aledb1978

                    Hi Bill,

                     

                    I'm looking for the delete-when-idle and idle-timeout parameters on the HornetQ guide at http://docs.jboss.org/hornetq/2.2.5.Final/rest-interface-manual/html_single/index.html but I can't find them.

                     

                    Have you any idea how to fix these problems on the 2.2.5 version?

                     

                    Thank you

                     

                      Alessandro.

                    • 7. Re: REST API: Changes proposed to avoid leaking server resources
                      bill.burke

                      When you create your topic pull subscription, add the additional delete-when-idle and idle-timeout form parameters to your POST request.  The docs show an example creating a durable subscription, so do the same thing, except add those additional parameters.
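A minimal sketch of such a request, built with the Python standard library and not actually sent. The parameter names `delete-when-idle` and `idle-timeout` are the ones given above. The `durable` parameter name, the value `true` for `delete-when-idle`, the unit of `idle-timeout`, the example URL and the helper name are all assumptions, so take the real creation URL from the link the server returns for the topic and confirm the values against the manual.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_pull_subscription_request(create_url, idle_timeout, durable=True):
    """Build (but do not send) the POST that creates a topic pull subscription."""
    form = {
        "durable": "true" if durable else "false",  # assumed parameter name
        "delete-when-idle": "true",                 # assumed value format
        "idle-timeout": str(idle_timeout),          # unit not stated in the thread
    }
    body = urlencode(form).encode("ascii")
    return Request(
        create_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical URL for illustration only.
req = build_pull_subscription_request(
    "http://localhost:8080/topics/example/pull-subscriptions", 60000)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing `req` to `urllib.request.urlopen` would issue the request against a live server.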