3 Replies Latest reply on Jun 16, 2014 5:00 AM by theute

    Support for aggregated metrics

    theute

      After talking to the Liveoak and Switchyard teams, it seems their use cases (which are actually very similar) would benefit from the ability to aggregate metrics.

       

      I haven't fully wrapped my head around this, but I wanted to share early (it's very likely related to tags as well). Note that in this context these are all counters.

       

      For instance, they would want to keep the following metrics, and then use "alerts" on them, since the idea is to stop a user or an application once it reaches, say, 1,000,000 calls:

      a) number of API calls for a particular user for a particular method

      b) number of API calls for a particular method (or REST endpoint)

      c) number of API calls for a particular user

      PS: It's actually more complex than that, as there can be multiple applications, and in the end they would want to know whether:

      •      A user reached x calls for a particular application
      •      A user reached x calls for the whole set of applications
      •      An application reached x calls
      •      ...

          

      One could argue that it's up to the application writer to make as many store calls as needed, but at the same time RHQ Metrics could derive b) and c) from a), which is the most explicit.
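To illustrate how b) and c) could be derived from a), here is a minimal sketch in Python. It is not RHQ Metrics code: the `roll_up` function and the sample data are hypothetical, and it simply assumes the per-user, per-method counts are available as a dictionary.

```python
from collections import Counter


def roll_up(per_user_method):
    """Derive per-method and per-user totals from (user, method) counts."""
    per_method = Counter()
    per_user = Counter()
    for (user, method), calls in per_user_method.items():
        per_method[method] += calls  # case b): calls per method
        per_user[user] += calls      # case c): calls per user
    return per_method, per_user


# Hypothetical sample data for case a): calls per user per method.
calls = {
    ("jsanda", "Method1"): 3,
    ("jsanda", "Method2"): 1,
    ("theute", "Method1"): 2,
}
per_method, per_user = roll_up(calls)
```

With this approach the application only stores the most explicit counter, and the coarser totals are computed on read.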

       

       

      PS: I don't have a use case for a metric that has a value (as opposed to a counter), but there are probably some, and addition may not be the only way to combine several metrics; an average could be of interest. Maybe the average frequency of a machine's CPU, calculated from the individual core values?

        • 1. Re: Support for aggregated metrics
          john.sanda

          These examples certainly seem like good use cases for counters.

           

          Being able to "alert" when a counter reaches a particular value could involve doing a read before write (or immediately after the write) or maintaining counter values in memory. Cassandra is not a good fit for the former. For the latter Infinispan might be a good fit.
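As a rough sketch of the in-memory option, the check can happen as part of the increment itself. This is plain Python with a dictionary standing in for whatever store would actually be used (it does not use Infinispan's API), the `ThresholdCounter` class is hypothetical, and it is not thread-safe.

```python
class ThresholdCounter:
    """In-memory counters that report when a limit has been reached."""

    def __init__(self, limit):
        self.limit = limit
        self.values = {}

    def increment(self, key, amount=1):
        """Add to the counter; return True once it is at or above the limit."""
        new_value = self.values.get(key, 0) + amount
        self.values[key] = new_value
        return new_value >= self.limit


# Hypothetical usage: alert once a user reaches 3 calls to a method.
counter = ThresholdCounter(limit=3)
results = [counter.increment(("jsanda", "Method1")) for _ in range(4)]
```

Because the new value is already in memory after the write, no separate read against the backing store is needed to decide whether to alert.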

           

          I was not sure whether or not we would need tagging for counters, but I think we will want it based on these use cases. I am not sure how RHQ Metrics could distinguish between b) and c) without applying additional, domain/application-specific semantics to the metrics. Right now the schema for the counters table looks like this:

           

          CREATE TABLE counters (
             group text,
             c_name text,
             c_value counter,
             PRIMARY KEY (group, c_name)
          );
          

           

          If I want to track API calls per method, I could do

           

          UPDATE counters SET c_value = c_value + 1 WHERE group = 'api calls' AND c_name = 'Method1';
          UPDATE counters SET c_value = c_value + 1 WHERE group = 'api calls' AND c_name = 'Method2';
          

           

          And then to track calls per user

           

          UPDATE counters SET c_value = c_value + 1 WHERE group = 'jsanda api calls' AND c_name = 'Method1';
          UPDATE counters SET c_value = c_value + 1 WHERE group = 'jsanda api calls' AND c_name = 'Method2';
          

           

          This is pretty straightforward. If the application writer has some understanding of the schema, then he might realize that all counters in the same group can be fetched in a single query. And if he wants to fetch all user counts, then he might consider doing something like,

           

          UPDATE counters SET c_value = c_value + 1 WHERE group = 'user api calls' AND c_name = 'jsanda:Method1';
          UPDATE counters SET c_value = c_value + 1 WHERE group = 'user api calls' AND c_name = 'theute:Method2';
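
On the read side, an application using this naming convention could fetch the whole group in one query and split the names itself. A minimal sketch in Python, assuming the rows come back as (c_name, c_value) pairs; the `totals_per_user` function and the sample rows are hypothetical.

```python
from collections import defaultdict


def totals_per_user(rows):
    """Sum counter values per user from (c_name, c_value) rows.

    Assumes c_name follows the 'user:method' convention and that the rows
    were fetched with something like:
    SELECT c_name, c_value FROM counters WHERE group = 'user api calls';
    """
    totals = defaultdict(int)
    for c_name, c_value in rows:
        user, _, _method = c_name.partition(":")
        totals[user] += c_value
    return dict(totals)


# Hypothetical rows, shaped like the result of the query above.
rows = [("jsanda:Method1", 3), ("jsanda:Method2", 1), ("theute:Method2", 2)]
user_totals = totals_per_user(rows)
```

Note that the 'user:method' semantics live entirely in the application here, which is the domain-specific knowledge the counters table itself does not have.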
          
          • 2. Re: Support for aggregated metrics
            pilhuhn

            I think counters sound about right, but not entirely. Very often such API usage limits involve a (sliding) window, and apparently our current counters don't do that. See also the other thread about metric types.

            • 3. Re: Support for aggregated metrics
              theute

              Good point, I missed an important element. Usually it's a limit on the number of API calls since a specific date (the first of the month or so).
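
One simple way to get the "since the first of the month" behaviour, sketched in Python, is to make the period part of the counter key so that a new month starts from zero. This is a fixed monthly window rather than a true sliding window, and `period_key` and `MonthlyCounter` are hypothetical names, not part of RHQ Metrics.

```python
from datetime import date


def period_key(day):
    """Return the monthly period a date falls in, e.g. '2014-06'."""
    return f"{day.year:04d}-{day.month:02d}"


class MonthlyCounter:
    """Counters keyed by (period, user), so each month starts at zero."""

    def __init__(self):
        self.values = {}

    def increment(self, user, day, amount=1):
        """Add to the user's counter for the month containing `day`."""
        key = (period_key(day), user)
        self.values[key] = self.values.get(key, 0) + amount
        return self.values[key]


counter = MonthlyCounter()
counter.increment("jsanda", date(2014, 6, 15))
june_total = counter.increment("jsanda", date(2014, 6, 30))
july_total = counter.increment("jsanda", date(2014, 7, 1))
```

With the counters table discussed earlier, the same idea could be expressed by including the period in the group or counter name.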