REST API: Changes proposed to avoid leaking server resources
hstewart Feb 9, 2011 4:30 PM
I would like to propose for discussion two small changes to the Hornet-Q REST API, which I believe are needed to avoid a potentially serious leakage of Hornet-Q resources.
Background:
My group at Juniper Networks is developing a system containing redundant REST clients that can failover from one to another. All of the REST clients use the Hornet-Q REST API to talk to a single Hornet-Q server (on a JBoss cluster). The REST clients create various queues, as well as push-subscriptions that forward messages from topics to queues on that single server, and pull consumers that read from queues.
The problem:
We have observed that under common failure scenarios, such as the abrupt death of a REST client or the loss of network connectivity between the REST client and the Hornet-Q server, it is almost impossible to ensure that resources allocated on the Hornet-Q server by the REST client are properly deleted. These resources include queues, push-subscriptions, and pull-consumers. The resources continue to live on the server long after the client that created them has died. We have seen the server become completely unresponsive, possibly as a result of this resource leak. The following are three cases in which we have observed Hornet-Q resources leaking:
Case 1:
- A REST client creates a queue, and a pull-consumer to read messages from that queue.
- The REST client dies abruptly, without deleting the queue and the pull-consumer.
- The queue lives on either forever (durable) or until server restart (non-durable).
- Messages accumulate, consuming memory or disk.
Case 2:
- A topic exists on the Hornet-Q server.
- The REST client creates a queue with the name "queueX", and a push-subscription that forwards messages from the topic to queueX.
- The REST client dies abruptly, without deleting queueX and the push-subscription.
- The push-subscription and queueX live on.
- Messages accumulate, consuming memory or disk.
Case 3:
- Same as case 2, except the REST client successfully deletes queueX, but fails to delete the push-subscription.
- The following error then appears many times in the JBoss server.log file, presumably once for each message that the push-subscription finds in the topic: javax.ws.rs.WebApplicationException org.hornetq.rest.queue.QueueDestinationsResource.findQueue(QueueDestinationsResource.java:153)
Why this is a problem:
Depending on the number of REST clients, the number of queues, and the rate of message creation, this resource leakage can be significant. We have not found a way to use the Hornet-Q REST API that avoids this leakage. In other words, it is almost impossible for a client, or a set of redundant clients, to reliably delete all queues, push-subscriptions, and pull-consumers that they create on the server, for the following reasons:
- The client may die abruptly and never restart.
- If the client does restart, or fails over to a backup redundant REST client, it might know the names of queues created by its predecessor based on its internal application logic. It will not know the URIs of the push-subscription and pull-consumer created by its predecessor, because these URIs are generated by the server and returned to the REST client after the resources are created on the server. Therefore, it cannot perform the necessary HTTP DELETE operations.
- If the client attempts to solve this by persistently storing the queue names and URIs that it receives from the Hornet-Q server, it may die before receiving them from the server and storing them on disk. It would also need to handle distributed persistence amongst the set of redundant REST clients, with all the network connectivity and transaction issues that entails. Here we are getting well beyond what should be expected of the REST clients.
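To make the window in the last point concrete, here is a minimal simulation in Python. It does not talk to a real Hornet-Q server: FakeServer, create_and_record, cleanup_after_failover and the "/hypothetical/..." URI layout are all invented for illustration. It only shows that a client which dies between creating a resource and persisting the server-generated URI leaves something behind that no successor can find.

```python
import itertools


class FakeServer:
    """Stand-in for the server (hypothetical): it assigns resource URIs itself."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.subscriptions = set()

    def create_push_subscription(self):
        # The URI layout is made up; the point is that the server picks it.
        uri = f"/hypothetical/push-subscriptions/{next(self._ids)}"
        self.subscriptions.add(uri)
        return uri

    def delete(self, uri):
        self.subscriptions.discard(uri)


class ClientCrash(Exception):
    """Models the abrupt death of a REST client."""


def create_and_record(server, store, crash_before_persist=False):
    """Create a subscription, then persist its URI in the client's store."""
    uri = server.create_push_subscription()  # resource now exists on the server
    if crash_before_persist:
        # The client dies after the server acted but before the URI was saved.
        raise ClientCrash("died before the URI reached durable storage")
    store.append(uri)
    return uri


def cleanup_after_failover(server, store):
    """A backup client deletes everything it knows about; returns what is left."""
    for uri in store:
        server.delete(uri)
    return sorted(server.subscriptions)


if __name__ == "__main__":
    server, store = FakeServer(), []
    create_and_record(server, store)
    try:
        create_and_record(server, store, crash_before_persist=True)
    except ClientCrash:
        pass
    # The second subscription was never recorded, so it cannot be deleted.
    print(cleanup_after_failover(server, store))
```

The backup client deletes the one URI it can see in the store, and the second subscription stays on the server with nobody able to name it.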
Proposed changes to the API and Hornet-Q behaviour:
- Modify the REST API such that when a REST client is POSTing to create a queue on the Hornet-Q server, it can specify an "idle timeout" value. If the queue is not read from for at least an "idle timeout" interval, the Hornet-Q server will delete the queue and all associated pull-consumers.
- Modify the REST API such that when a REST client is POSTing to create a push-subscription on the Hornet-Q server, it can specify an "error timeout" value. If the push-subscription fails to publish messages to its target for at least the "error timeout" interval, the Hornet-Q server will delete the push-subscription.
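As a rough illustration of the behaviour being proposed, the following Python sketch tracks the two timeouts in memory. It is not Hornet-Q code and not an existing API: ResourceReaper, its method names, and the choice to pass the current time in explicitly are all assumptions made for the example. A real implementation would live inside the server and run the sweep periodically.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrackedQueue:
    name: str
    idle_timeout: Optional[float]  # seconds; None means never expire
    last_read: float               # time of the most recent read
    consumers: List[str] = field(default_factory=list)


@dataclass
class TrackedPushSubscription:
    uri: str
    error_timeout: Optional[float]         # seconds; None means never expire
    failing_since: Optional[float] = None  # None while deliveries succeed


class ResourceReaper:
    """Hypothetical server-side bookkeeping for the two proposed timeouts."""

    def __init__(self):
        self.queues: Dict[str, TrackedQueue] = {}
        self.subscriptions: Dict[str, TrackedPushSubscription] = {}

    def create_queue(self, name, idle_timeout, now):
        self.queues[name] = TrackedQueue(name, idle_timeout, last_read=now)

    def add_consumer(self, queue_name, consumer_uri):
        self.queues[queue_name].consumers.append(consumer_uri)

    def record_read(self, queue_name, now):
        # Any read resets the idle clock for that queue.
        self.queues[queue_name].last_read = now

    def create_push_subscription(self, uri, error_timeout):
        self.subscriptions[uri] = TrackedPushSubscription(uri, error_timeout)

    def record_push_result(self, uri, ok, now):
        sub = self.subscriptions[uri]
        if ok:
            sub.failing_since = None   # a success clears the error window
        elif sub.failing_since is None:
            sub.failing_since = now    # first failure starts the error window

    def sweep(self, now):
        """Delete expired resources and return the names/URIs removed."""
        removed = []
        for name, q in list(self.queues.items()):
            if q.idle_timeout is not None and now - q.last_read >= q.idle_timeout:
                del self.queues[name]
                removed.append(name)
                removed.extend(q.consumers)  # pull-consumers go with the queue
        for uri, sub in list(self.subscriptions.items()):
            if (sub.error_timeout is not None
                    and sub.failing_since is not None
                    and now - sub.failing_since >= sub.error_timeout):
                del self.subscriptions[uri]
                removed.append(uri)
        return removed
```

With this bookkeeping, the queue in Case 1 would disappear once nobody has read from it for the idle timeout, and the push-subscription in Case 3 would disappear once its deliveries have been failing continuously for the error timeout. Neither depends on the client surviving or remembering any URI.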
Is this a real problem, or is there some existing means to avoid these resource leakage issues?
If it is a real problem, does anyone have other ideas about how to solve these leakage issues?