We need to find more use cases so that we can design the data model better. The tags discussion is one aspect. Another is whether we want a way to find out what we store and to have hyperlinking in the REST API, or whether we leave this completely to clients.
To quote a comment in the source:
// The idExists call [...]. We
// need to decided whether or not this check is really necessary. There isn't really
// an efficient way to do this in Cassandra unless we either query all keys or
// introduce new schema to support this method.
Basically we are saying that a user who does not know whether a metric with an id of "lulu" exists will never be able to find out:
snert:~ hrupp$ curl -i http://localhost:8080/rhq-metrics/metrics/lulu
HTTP/1.1 200 OK
I think we should have an index over existing ids that can be searched / browsed, and an API call like the curl example above should correctly return a 404 Not Found if such an id does not exist.
So in this case we may need a different table in Cassandra to record those ids, perhaps along with the tags from the other discussion.
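A minimal sketch of what such an index could look like, written in Python for brevity, with an in-memory set standing in for the extra Cassandra table. The names `MetricIdIndex`, `register`, `browse`, and `lookup_status` are hypothetical and not taken from the existing code:

```python
class MetricIdIndex:
    """Hypothetical index of known metric ids (stand-in for a Cassandra table)."""

    def __init__(self):
        self._ids = set()

    def register(self, metric_id):
        # Would be called whenever data is written for a metric id.
        self._ids.add(metric_id)

    def exists(self, metric_id):
        return metric_id in self._ids

    def browse(self, prefix=""):
        # Sorted listing so clients can search or browse the known ids.
        return sorted(i for i in self._ids if i.startswith(prefix))


def lookup_status(index, metric_id):
    # 200 if the id is known, 404 Not Found otherwise.
    return 200 if index.exists(metric_id) else 404


index = MetricIdIndex()
index.register("cpu.load")
print(lookup_status(index, "cpu.load"))  # 200
print(lookup_status(index, "lulu"))      # 404
```

With an index like this, the existence check becomes a single-key lookup instead of a scan over all keys, which is the cost the source comment is worried about.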
1. Do we want to allow the user to delete metrics? Individual items? Full history?
2. Do we want to add API methods that bucketize metric results (e.g. if we display 60 bars, we need 60 buckets)?
3. How and where do we want to store metadata (units, monotonically increasing/dynamic)?
For 1, I think yes and yes.
For 2, I think yes, as this reduces the need for clients to implement it. It also makes it much easier to compare two metrics with different ids at a certain point in time: metric values are very often not taken at exactly the same instant, so comparing them means comparing the values of one bucket.
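To illustrate the bucketing idea, here is a rough sketch in Python. It assumes integer timestamps (for example epoch milliseconds), equal-width buckets, and averaging as the aggregation; the function name `bucketize` and all of those choices are assumptions for illustration, not an existing API:

```python
def bucketize(points, start, end, num_buckets):
    """Average (timestamp, value) points into num_buckets equal-width buckets.

    Covers the half-open range [start, end); empty buckets yield None.
    """
    if num_buckets <= 0 or end <= start:
        raise ValueError("need num_buckets > 0 and end > start")
    sums = [0.0] * num_buckets
    counts = [0] * num_buckets
    for ts, value in points:
        if start <= ts < end:
            # Map the timestamp to its bucket index in 0..num_buckets-1.
            idx = int((ts - start) * num_buckets // (end - start))
            sums[idx] += value
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Two metrics sampled at slightly different times still line up per bucket.
a = [(0, 1.0), (10, 3.0), (35, 5.0)]
b = [(2, 2.0), (31, 4.0)]
print(bucketize(a, 0, 60, 2))  # [2.0, 5.0]
print(bucketize(b, 0, 60, 2))  # [2.0, 4.0]
```

A client that draws 60 bars would ask for `num_buckets=60` and could compare two metrics bucket by bucket, even though their raw samples never share a timestamp.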
For 3, I think it may make sense to have a table in C* to encode this, perhaps along with the table of ids discussed above, so that the metadata does not need to be stored in the tags on each tuple (I know I am thinking relational).
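As a rough illustration of keeping metadata once per metric id rather than on every tuple, here is a small Python sketch. The `MetricMetadata` type, its fields, and the example ids are hypothetical; a dictionary stands in for the Cassandra table:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricMetadata:
    # Hypothetical fields, based on the examples in question 3.
    unit: str
    monotonic: bool  # True for monotonically increasing counters


# One metadata row per metric id, kept next to the id index.
metadata_by_id = {
    "net.bytes_sent": MetricMetadata(unit="bytes", monotonic=True),
    "cpu.load": MetricMetadata(unit="percent", monotonic=False),
}

# Data points carry only id, timestamp and value; no per-tuple metadata tags.
data_points = [("net.bytes_sent", 1000, 512.0), ("net.bytes_sent", 2000, 1024.0)]

for metric_id, ts, value in data_points:
    meta = metadata_by_id[metric_id]
    print(metric_id, ts, value, meta.unit)
```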