ID generation
mazz Nov 14, 2014 2:35 PM
I commented on this in Lukas' doc, but it was long and, well, it is more of a discussion topic, so here it is.
I've been thinking about ID generation (agent IDs, resource IDs, anything that a "feed" is going to need to identify when it sends data for it, like metric data or configuration data).
I think each resource managed by an agent can get assigned an ID by the agent (not the server) - that resource ID will be paired with the agent ID to make it unique server-side (i.e. an ID is therefore really a tuple: <agent-ID>:<resource-ID>). I've been reading up on UUIDs - versions 3 and 5 of UUIDs have the concept of a namespace and a name - and this fits well with this concept. Notice that it would now be the agent's job, not the server's, to make sure it assigns unique IDs to its resources. This could be as simple as generating a name string that is unique among peers and pairing it with the parent UUID as a namespace. This is analogous to resource keys in RHQ today - keys are names (idempotently discovered by plugins) that are unique among peer resources (i.e. all children under a single parent have unique keys), but not unique among other trees in the inventory. Since this is on the agent/feed, each parent resource's ID is a "sub-namespace" - each parent is a namespace for its children's IDs, while the parent itself is in the agent namespace. This "agent-generated IDs" approach removes the need for the server to have idempotent ID generation for resources.
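To make the namespace/name idea concrete, here is a minimal sketch using version 5 UUIDs from Python's standard library. The function name child_id, the agent name and the resource keys are all made up for illustration; only uuid.uuid5 is a real library call.

```python
import uuid


def child_id(parent_id: uuid.UUID, resource_key: str) -> uuid.UUID:
    """Derive a resource ID from the parent's ID (used as the namespace)
    and a key that only needs to be unique among sibling resources."""
    return uuid.uuid5(parent_id, resource_key)


# Hypothetical agent ID; how the agent really obtains it is a separate question.
agent_id = uuid.uuid5(uuid.NAMESPACE_DNS, "agent1.example.com")

# Top-level resources live in the agent's namespace,
# and each resource is in turn the namespace for its own children.
app_server = child_id(agent_id, "JBossAS:/opt/jboss")
datasource = child_id(app_server, "datasource=ExampleDS")

print(agent_id, app_server, datasource)
```

Because the derivation is deterministic, re-running discovery with the same keys yields the same IDs, which is the idempotency the server would otherwise have to provide.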
How do we solve this for agent (or feed) IDs? The agent could suggest or request its own ID (kind of like the agent name today - each agent today tells the server what its agent name should be) - the requirement would be that the agent ID never changes. Since it is the agent that requested the ID, it can theoretically regenerate or rediscover it upon restart or even re-install (thus, it too would be idempotent). It would have to be understood that if the agent ID cannot be rediscovered by the agent, it will not be able to send data up to the server side - effectively, the agent would have forgotten who it is, since it can't remember its ID. We could provide some way for the agent to "recover" the ID through a server-side API, but that will not be 100% foolproof; there will be cases where the agent just flat out cannot get its ID back without manual admin intervention. We'd want to minimize those cases and yet still provide some mechanism to recover (even if it's through manual admin intervention).
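One way an agent could "remember who it is" across restarts is sketched below. This is only an assumption about how it might work: the ID file location, the function name load_or_create_agent_id, and the choice to derive the ID from a suggested name under the DNS namespace are all hypothetical.

```python
import uuid
from pathlib import Path


def load_or_create_agent_id(id_file: Path, suggested_name: str) -> uuid.UUID:
    """Return the agent's ID, rediscovering it from local state if possible."""
    if id_file.exists():
        # Normal restart: the agent reads back the ID it stored earlier.
        return uuid.UUID(id_file.read_text().strip())
    # First start or re-install: derive the ID from a stable name, so the
    # same name always yields the same ID even if the file was lost.
    agent_id = uuid.uuid5(uuid.NAMESPACE_DNS, suggested_name)
    id_file.write_text(str(agent_id))
    return agent_id
```

If both the file and the original name are lost, this sketch cannot recover the ID, which is the case where a server-side recovery API or manual admin intervention would be needed.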
What this means is we need an "ID generation" service, or "Agent registration service" (or call it "Feed registration", if we want to move away from the word "Agent") that will take an agent registration request with the agent's suggested name and return a success/fail message. There would be no need for server-side calls to generate resource IDs, since it's just an algorithm the agent would use to generate IDs agent-side. We could provide a client .jar for Java agents so they would have a local API to call to generate IDs. But the point is there would be no need for the agent to ask the server for IDs, since the agent would now be responsible for ID generation. The agent could then immediately start sending data with its own home-cooked IDs. For example, once an agent has its own ID registered on the server, it can start pumping data to rhq-metrics with the resource IDs it generated (along with the metrics for those resources, obviously).
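The server-side contract described here is small enough to sketch as an in-memory toy. The class name FeedRegistry and its methods are hypothetical, and the sketch deliberately leaves open how a restarted agent would prove that an already-registered ID is its own.

```python
class FeedRegistry:
    """Toy registration service: accepts or rejects a suggested feed ID."""

    def __init__(self) -> None:
        self._registered = set()

    def register(self, suggested_id: str) -> bool:
        """Return True (success) if the ID was free, False (fail) if taken."""
        if suggested_id in self._registered:
            return False
        self._registered.add(suggested_id)
        return True

    def is_registered(self, feed_id: str) -> bool:
        """Check whether data tagged with this feed ID should be accepted."""
        return feed_id in self._registered


registry = FeedRegistry()
if registry.register("agent1.example.com"):
    # From here on the agent generates resource IDs locally and can start
    # sending data without asking the server for any further IDs.
    print("registered; agent may start sending data")
```

Note that this is the only server round-trip in the scheme: resource IDs never pass through the registry.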
I envision using UUIDs because they are easy to generate, obviously make it easy to come up with unique combinations, have a small fixed width (only 16 bytes of binary data, the string form is small and fixed width as well), and have this notion of namespace and name when you generate them, so it should be easy to build unique UUIDs based on the agent/parent/child tuple I discuss above. But this is an implementation detail - if it is easier to just use free-form strings, that's doable (though that has some drawbacks such as variable lengths).
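The fixed-width claim is easy to check with Python's standard uuid module. The names fed in below are arbitrary examples; whatever their length, the resulting ID has the same size.

```python
import uuid

short = uuid.uuid5(uuid.NAMESPACE_DNS, "a")
long_ = uuid.uuid5(uuid.NAMESPACE_DNS, "a-very-long-resource-key/" * 20)

for u in (short, long_):
    # 16 bytes in binary form, 36 characters in canonical string form.
    print(len(u.bytes), len(str(u)))
```

A free-form string ID built by concatenating agent, parent and child names would instead grow with the depth of the resource tree.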