What is a resource type, anyway?
lkrejci Dec 12, 2014 5:08 PM
Resource Types
In RHQ.next we’re trying to re-think how we handle resource types. We want to make them more dynamic, CRUD-able using an API instead of being "statically" defined by a descriptor as it is in RHQ today.
But before we can ask what is a resource type, we first have to answer the question of what actually is a resource.
For the questions below, assume that we’re talking strictly about server-side features, not anything that should be defined or live on the actual monitored machines/applications.
Only assume that somehow we receive datapoints from the machines/applications (unless additional assumptions are explicitly specified).
Resource
A resource generally speaking can be described as a number of measurements and configuration that logically belong together.
While the measurements and the configuration have meaning on their own, the resource gives them their primary context: together they represent one logical thing.
But there are questions about this:
- Would it be useful to compose new resources ad-hoc from arbitrary measurements/config?
E.g. out of the data that is coming from my infrastructure, I’d be able to compose a logical resource that would represent only stuff I’m really interested in. Say I knew that from time to time my webapp is causing high CPU load. If I were able to have "system load on CPU + number of requests per second on my REST endpoint" as metrics on a single resource, that resource would become a one-stop-shop for me to learn (crudely) the health of my app and the system it’s running on.
- Is a resource something set in stone or a more fluid concept? In other words, would it be useful to dynamically add and remove metrics/config from a resource?
In the example from the first question, I’d be able to add and remove metrics from my logical "health resource" over time as I learned more about the perf indicators.
- Is a measurement or configuration confined to a single resource or can it be part of multiple?
Again, this is really only a useful question to ask if we consider the ad-hoc composition from the first question a viable option. If it is, then the answer is that obviously they could be part of multiple resources.
- Wouldn’t grouping of "whole" resources be enough to learn the same things?
IMHO, the answer is "no", because the information that would be readily available on the resource itself would be buried deeper. In the example given, I’d have a group of a CPU resource and a WAR resource, and I’d have to look at the system load of the CPU and the requests per second on the WAR resource (or maybe on a "sub resource" of it) to learn what I need to know.
- What if we always retained the resources as they are coming from the feeds but would add a concept of a "view" that would pick and choose data from the individual feed-backed resources?
This is not very much different from the ad-hoc composable resource but adds a clear distinction between feed-backed resources and "synthetic" ad-hoc data view.
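To make the "view" idea a bit more tangible, here is a minimal Python sketch. Everything in it (the `Resource` and `View` classes, the resource ids, the metric names and values) is made up for illustration and is not an existing RHQ API. It only shows the distinction: feed-backed resources own the data, while a view merely holds references to individual metrics.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    """A feed-backed resource: a named bag of the latest metric values."""
    resource_id: str
    metrics: dict[str, float] = field(default_factory=dict)


@dataclass
class View:
    """A synthetic, ad-hoc selection of metrics owned by other resources."""
    name: str
    # Each entry is (resource_id, metric_name); the view stores references only.
    members: list[tuple[str, str]] = field(default_factory=list)

    def add(self, resource_id: str, metric_name: str) -> None:
        if (resource_id, metric_name) not in self.members:
            self.members.append((resource_id, metric_name))

    def remove(self, resource_id: str, metric_name: str) -> None:
        self.members.remove((resource_id, metric_name))

    def resolve(self, inventory: dict[str, Resource]) -> dict[str, float]:
        """Look up the current value of every referenced metric."""
        return {
            f"{rid}/{metric}": inventory[rid].metrics[metric]
            for rid, metric in self.members
        }


# Hypothetical inventory: a CPU resource and a WAR resource from some feed.
inventory = {
    "cpu": Resource("cpu", {"systemLoad": 0.93, "idle": 0.07}),
    "war": Resource("war", {"requestsPerSecond": 412.0}),
}
health = View("app-health")
health.add("cpu", "systemLoad")
health.add("war", "requestsPerSecond")
print(health.resolve(inventory))
```

Because the view only stores references, the same metric can appear in any number of views, and adding or removing a member never touches the feed-backed resource.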
Resource Type
Resource types as they are in current RHQ are immensely useful because they provide metadata about the individual measurements/configs/operations/etc. that are applicable to resources. Thus, measurements have units and descriptions, configurations have a defined format with descriptions, too, operations have defined return types and parameter types, etc. It is crucial we keep this kind of information in some form or fashion. At the same time, we’ve known that the current way of defining the resource types is a little bit rigid and not easily understood by users that wish to create their own.
Currently we imagine that the way out of the current "rigid" situation might be the following:
- Still allow for uploading resource type descriptors of some form using an API - the type descriptor either can remain the XML we have right now or we can try to come up with alternative formats like JSON or YAML based definitions. Still, the metadata would be described by some kind of markup.
- The API must allow for renaming the resource types (and their constituent parts).
This can be done either by a special "update resource type" endpoint that would accept a "diff" of sorts or by additional metadata in the descriptors specifying the renames (like "oldNames" attribute or some such).
- There can be multiple versions of the resource type active in the system.
- The resource types can extend others (unless it makes some important use case impossible, I think single inheritance should be enough).
- We should keep the runs-inside semantics from current RHQ because it allows for a dynamic resource tree.
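As a rough illustration of the rename option based on an "oldNames" attribute, the Python sketch below carries already stored data over to the names used by a newer descriptor. The function name, the shape of the stored data and the sample metric names are assumptions made for this example; only the "oldNames" idea itself comes from the list above.

```python
def apply_renames(stored_metrics, descriptor_metrics):
    """Map stored metric data onto the names used by a newer descriptor.

    stored_metrics: dict of metric name -> collected data (hypothetical shape)
    descriptor_metrics: list of dicts with "name" and an optional "oldNames" list
    """
    result = {}
    for metric in descriptor_metrics:
        name = metric["name"]
        # The current name wins; otherwise the first matching old name is used.
        for candidate in [name, *metric.get("oldNames", [])]:
            if candidate in stored_metrics:
                result[name] = stored_metrics[candidate]
                break
    return result


stored = {"load": [0.4, 0.9], "idle": [0.6, 0.1]}
descriptor = [
    {"name": "systemLoad", "oldNames": ["load"]},
    {"name": "idle"},
    {"name": "brandNewMetric"},
]
print(apply_renames(stored, descriptor))
```

Metrics that exist only in the new descriptor simply start out without data, and stored metrics that no descriptor entry claims are dropped from the result; whether the server should keep those around is a separate decision.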
But there is more that should be done to make the work with resource types as flexible as possible while also less cumbersome.
Currently, the resource types are required to be first uploaded to the server and only when they are uploaded and registered there, the agents can download them and start using them.
This is often suboptimal in situations where the feed itself would scan some kind of resource (say an MBean Server) and would like to start reporting on the data it finds without first needing to formally describe it using a resource type.
So, it should be possible for the feeds themselves to define resource types (this actually is kinda implicitly contained in the first bullet point above). But then there is a question of how to match 2 resource types (that may come from different feeds).
Another suboptimal use case is when a "dumb feed" doesn’t even have a notion of a resource type and just sends name=value pairs. Even then we should be able to let the data in and let the server side augment it with metadata afterwards (in the form of a resource type). In this case, the data needs to "sit" somewhere on the server and wait for the server side to assign a resource type to it.
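One way to picture the data "sitting" on the server is a small holding area keyed by feed. The Python sketch below is only an assumption about how that could look: the `UntypedDataStore` class, the feed id and the metric name are invented, and values are kept as raw strings because nothing is known about their type yet.

```python
from collections import defaultdict


class UntypedDataStore:
    """Holds raw name=value data until a resource type is assigned to it."""

    def __init__(self):
        # feed id -> metric name -> list of raw string values, in arrival order
        self.pending = defaultdict(lambda: defaultdict(list))
        # feed id -> resource type id assigned on the server side
        self.assigned_types = {}

    def ingest(self, feed_id, pair):
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {pair!r}")
        self.pending[feed_id][name.strip()].append(value.strip())

    def assign_type(self, feed_id, resource_type_id):
        """Attach a resource type and release the data collected so far."""
        self.assigned_types[feed_id] = resource_type_id
        released = self.pending.pop(feed_id, {})
        return {name: list(values) for name, values in released.items()}


store = UntypedDataStore()
store.ingest("dumb-feed-1", "heapUsed=1024")
store.ingest("dumb-feed-1", "heapUsed=2048")
released = store.assign_type("dumb-feed-1", "rhq://global/<resourceType>")
print(released)
```

What should happen to data that arrives after a type has been assigned (direct interpretation versus another round of waiting) is left open here.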
Also, the current practice of having the "plugins" with the resource types on the server side is actually quite useful, so that the feeds don’t have to repeat themselves and define everything that is already known to the server.
If the feed supports RHQ plugins, it can reference the resource types it uses by their server-side IDs (e.g. "rhq://global/<resourceType>") instead of sending the details specified in the following chapters.
Resource Type Descriptor
The resource type is described by the feeds or users using a structure that resembles today’s resource type definitions in the RHQ agent’s plugin descriptors.
The difference here is that these definitions can be sent up from the feed to the server, and the server then works with this information.
Feed-side Full Resource Type Definition
{
//this is the agent-side ID of the resource type in the predefined format
"id" : "<agentType>://<agentName>/<resourceType>/<version>",
"extends" : "<agentType>://<agentName>/<otherResourceType>/<otherVersion>",
"runsInside" : [
//list of resource type ids
],
"configuration" : {
//json-schema
},
"connection" : {
//json-schema
},
"metrics" : [
{
//no need for an ID of this metric, because it can be deduced from the
//name and the resource type id
"name" : "asdf",
"type" : "numeric",
"unit" : "jiffies"
},
...
],
"operations" : [
{
"name" : "asdf",
"returnType" : {
//json-schema
},
"parameters" : [
{
"name" : "asdf",
"type" : {
//json-schema
}
},
...
]
},
...
]
}
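Since the ID follows a predefined format and a metric ID "can be deduced from the name and the resource type id", a small parser helps to see what that implies. The Python sketch below is an assumption about how such IDs could be handled: the `ResourceTypeId` class and the sample values are hypothetical, and it assumes that none of the four components contains a "/".

```python
from typing import NamedTuple


class ResourceTypeId(NamedTuple):
    agent_type: str
    agent_name: str
    resource_type: str
    version: str

    @classmethod
    def parse(cls, text):
        """Parse '<agentType>://<agentName>/<resourceType>/<version>'."""
        agent_type, sep, rest = text.partition("://")
        parts = rest.split("/")
        # Assumption: no component may be empty or contain a slash.
        if not sep or not agent_type or len(parts) != 3 or not all(parts):
            raise ValueError(f"not a resource type id: {text!r}")
        return cls(agent_type, *parts)

    def __str__(self):
        return (f"{self.agent_type}://{self.agent_name}/"
                f"{self.resource_type}/{self.version}")

    def metric_id(self, name):
        """Deduce the metric id from the resource type id and the metric name."""
        return f"{str(self)}/metric/{name}"


type_id = ResourceTypeId.parse("exampleAgent://agent1/Platform/1.0")
print(type_id.metric_id("asdf"))
```

If resource type names or versions are allowed to contain slashes, the format would need some escaping rule, which this sketch does not attempt.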
Feed-side Datum Definition
For feeds that don’t support resource types, a simplified format might be used.
{
"id" : "<agentType>://<agentName>/<resourceType>/<version>/metric/<name>",
"type" : "numeric",
"unit" : "jiffies"
}
{
"id" : "<agentType>://<agentName>/<resourceType>/<version>/operation/<name>",
"returnType" : {
//json-schema
},
"parameters" : [
{
"name" : "asdf",
"type" : {
//json-schema
}
},
...
]
}
and similarly for configuration and connection.
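The two feed-side formats carry the same information, so a full resource type definition can be flattened into datum definitions mechanically. The following Python sketch shows one possible way; the function name and the sample agent and type names are made up, and only metrics and operations are handled.

```python
def to_datum_definitions(resource_type):
    """Split one full resource type definition into per-datum definitions."""
    base_id = resource_type["id"]
    data = []
    for kind, key in (("metric", "metrics"), ("operation", "operations")):
        for entry in resource_type.get(key, []):
            # Everything except the name is copied; the name moves into the id.
            datum = {k: v for k, v in entry.items() if k != "name"}
            data.append({"id": f"{base_id}/{kind}/{entry['name']}", **datum})
    return data


# Hypothetical resource type with one metric and one operation.
full_definition = {
    "id": "exampleAgent://agent1/Platform/1.0",
    "metrics": [{"name": "asdf", "type": "numeric", "unit": "jiffies"}],
    "operations": [
        {"name": "restart", "returnType": {"type": "boolean"}, "parameters": []}
    ],
}
for datum in to_datum_definitions(full_definition):
    print(datum["id"])
```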
Server-side Datum Definition
Notice that on the server, the datum is represented quite differently:
- i18n definitions
- contentHash for server-side matching of "similar" types
- representedBy for linking "stuff" together (see Resource Type Linking post)
{
//the dataType, name and version got merged into this, resourceType is optional
"id" : "agentType://feedName/resourceType/version/metric/asdf",
"type" : "numeric",
"unit" : "jiffies",
"i18nKey" : "metric.asdf",
"contentHash" : "hash" //sha of "metric" + type + unit
}
or simply
{
"id" : "agentType://feedName/resourceType/version/metric/asdf",
"representedBy" : "otherAgentType://otherFeedName/otherResourceType/otherVersion/metric/otherName"
}
{
//the dataType and name got merged into this, resourceType is optional.
//Yes, this means no operation overloading
"id" : "agentType://feedName/resourceType/version/operation/asdf",
"returnType" : {
//json-schema
},
"parameters" : [
{
"name" : "asdf",
"type" : {
//json-schema
}
},
...
],
"i18nKey" : "operation.asdf",
"contentHash" : "hash" //sha of "operation" + returnType.toString() + parameters.toString()
}
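The contentHash comments above can be turned into a small sketch. The details are assumptions on my part: the comments only say "sha", so SHA-256 is picked arbitrarily here, the parts are concatenated without a separator, and a canonical JSON dump stands in for `toString()`.

```python
import hashlib
import json


def _sha(text):
    # Assumption: SHA-256; the definitions above only say "sha".
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def metric_content_hash(datum):
    """sha of "metric" + type + unit, as described in the comment above."""
    return _sha("metric" + datum["type"] + datum["unit"])


def operation_content_hash(datum):
    """sha of "operation" + returnType + parameters, serialized canonically."""
    return _sha(
        "operation"
        + json.dumps(datum["returnType"], sort_keys=True)
        + json.dumps(datum["parameters"], sort_keys=True)
    )


# Two metrics reported by different (hypothetical) feeds with equal metadata.
first = {"id": "agentA://feed1/Platform/1.0/metric/asdf",
         "type": "numeric", "unit": "jiffies"}
second = {"id": "agentB://feed2/Host/2.3/metric/other",
          "type": "numeric", "unit": "jiffies"}
print(metric_content_hash(first) == metric_content_hash(second))
```

Since the ID is not part of the hashed content, two data with different IDs but the same type and unit end up with the same hash, which is what makes them candidates for "similar" matching. Plain concatenation can in principle make different inputs collide on the same string, so a real implementation would probably want a delimiter between the parts.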
Server-side Resource Type Definition
Notice that on the server, the resource type is represented quite differently:
- i18n definitions
- contentHash for server-side matching of "similar" types
- representedBy for linking "stuff" together (see Resource Type Linking chapter)
{
"id" : "<agentType>://<agentName>/<resourceType>/<version>",
"metrics" : [
//list of metric ids
],
"operations" : [
//list of operation ids
]
}
or simply
{
"id" : "<agentType>://<agentName>/<resourceType>/<version>",
"representedBy" : "<otherAgentType>://<otherAgentName>/<otherResourceType>/<otherVersion>"
}
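With representedBy, finding the definition that actually describes a type means following the links until an entry without representedBy is reached. Below is a minimal Python sketch of that lookup; the in-memory `registry` dict and the type ids are hypothetical, and the guard against circular links is my own addition.

```python
def resolve_representative(type_id, registry):
    """Follow representedBy links until a type with a real definition is found.

    registry: dict of type id -> server-side definition (hypothetical store)
    """
    seen = set()
    current = type_id
    while "representedBy" in registry[current]:
        if current in seen:
            raise ValueError(f"representedBy cycle involving {current!r}")
        seen.add(current)
        current = registry[current]["representedBy"]
    return current


registry = {
    "agentA://feed1/Platform/1.0": {
        "id": "agentA://feed1/Platform/1.0",
        "representedBy": "agentB://feed2/Platform/1.0",
    },
    "agentB://feed2/Platform/1.0": {
        "id": "agentB://feed2/Platform/1.0",
        "metrics": [],
        "operations": [],
    },
}
print(resolve_representative("agentA://feed1/Platform/1.0", registry))
```

A link that points to an ID missing from the registry surfaces as a `KeyError` in this sketch.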
Type Update
{
"removed" : [
//list of datum or resource type IDs
],
"added" : [
//list of resource type or datum definitions as above
],
"changed" : [
{
"id" : "agentType://agentName/resourceType/...",
"newDefinition": {
//datum definition (NOT full resource type) as defined above
}
},
...
]
}
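To show how such an update could be consumed, here is a Python sketch that applies it to an in-memory registry. Several things are assumed rather than specified above: the registry is a plain dict keyed by ID, the sections are processed in the order removed, added, changed, and a "changed" entry whose newDefinition carries a different ID is treated as a rename.

```python
def apply_type_update(registry, update):
    """Return a new registry with the update applied; the input is untouched."""
    result = dict(registry)
    for removed_id in update.get("removed", []):
        result.pop(removed_id, None)
    for definition in update.get("added", []):
        result[definition["id"]] = definition
    for change in update.get("changed", []):
        old_id = change["id"]
        if old_id not in result:
            raise KeyError(f"cannot change unknown id {old_id!r}")
        new_definition = dict(change["newDefinition"])
        # Assumption: a different id inside newDefinition means a rename.
        new_id = new_definition.setdefault("id", old_id)
        del result[old_id]
        result[new_id] = new_definition
    return result


base = "exampleAgent://agent1/Platform/1.0/metric/"
registry = {
    base + "load": {"id": base + "load", "type": "numeric", "unit": "jiffies"},
    base + "idle": {"id": base + "idle", "type": "numeric", "unit": "jiffies"},
}
update = {
    "removed": [base + "idle"],
    "added": [{"id": base + "heap", "type": "numeric", "unit": "bytes"}],
    "changed": [
        {"id": base + "load",
         "newDefinition": {"id": base + "systemLoad",
                           "type": "numeric", "unit": "jiffies"}}
    ],
}
print(sorted(apply_type_update(registry, update)))
```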