Activity collection configurations to be supported for rtgov 1.0
objectiser May 30, 2013 1:03 PM
I want to discuss how activity collection, for the purpose of runtime governance, should be controlled within a client side environment. This is where activity events are collected in an execution environment (e.g. switchyard) and then reported to a different "governance" server - so basically just looking at the collection mechanism rather than the whole event processing/presentation mechanism.
There are two related topics for this discussion: filtering and client side configuration.
So the issue is how activity collection should be controlled so that, ideally, there is minimal impact on the execution environment, while still ensuring that the relevant information is collected for both immediate (near realtime) analysis and historic analysis.
If we first examine the activity events themselves, then we may wish to filter:
(a) Per technology - decide which technologies to monitor by only installing the interceptors for the technologies of interest. Coarse grained, and generally static. The interceptors could potentially be dynamically installed/uninstalled, but that is not advisable as the way to control activity event collection in a production environment. Should each technology interceptor have an MBean to enable external control? If so, what should the default state be? For development/testing, on would be good, but for production, off may be more appropriate.
(b) Per activity type(s) - we could have ways to categorise activity types, either based on groups or levels, and enable the user to set a configuration value - again probably via MBean? The issue with this type of filtering is the unknown consequences of not recording specific activity types. Depending upon the nature of the information being collected, certain activity types may be required to correlate the events associated with a business transaction. If some events are not collected, it may cause a break in the "correlation chain", and it will then not be possible to rebuild (historically), for example, a call trace of a business transaction.
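To make (a) and (b) a bit more concrete, here is a minimal sketch of what an MBean-controlled filter could look like. Everything in it is an assumption for the sake of discussion, not existing rtgov API: the `CollectionFilter` class, the `Level` names, the `example.rtgov` object name, and the choice that all technologies default to off. The lowest level is treated as "needed for correlation" and is always kept once a technology is enabled, which is one possible way to avoid breaking the correlation chain.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class CollectionFilterSketch {

    /** Hypothetical levels: a lower ordinal means "more essential". */
    public enum Level { CORRELATION, SUMMARY, DETAIL }

    /** Hypothetical management interface exposed over JMX. */
    public interface CollectionFilterMBean {
        void enableTechnology(String technology);
        void disableTechnology(String technology);
        String getMaxLevel();
        void setMaxLevel(String level);
    }

    public static class CollectionFilter implements CollectionFilterMBean {
        // Assumption: nothing is collected until a technology is switched on.
        private final Set<String> enabled = ConcurrentHashMap.newKeySet();
        // Assumption: only correlation-critical events are kept by default.
        private volatile Level maxLevel = Level.CORRELATION;

        @Override public void enableTechnology(String technology) { enabled.add(technology); }
        @Override public void disableTechnology(String technology) { enabled.remove(technology); }
        @Override public String getMaxLevel() { return maxLevel.name(); }
        @Override public void setMaxLevel(String level) { maxLevel = Level.valueOf(level); }

        /** True if an event of this technology and level should be recorded. */
        public boolean accepts(String technology, Level level) {
            // CORRELATION has the lowest ordinal, so it passes for any maxLevel.
            return enabled.contains(technology) && level.compareTo(maxLevel) <= 0;
        }
    }

    public static void main(String[] args) throws Exception {
        CollectionFilter filter = new CollectionFilter();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // The object name is made up for this example.
        server.registerMBean(new StandardMBean(filter, CollectionFilterMBean.class),
                new ObjectName("example.rtgov:type=CollectionFilter"));

        filter.enableTechnology("switchyard");
        filter.setMaxLevel("SUMMARY");
        System.out.println(filter.accepts("switchyard", Level.DETAIL));
    }
}
```

With something along these lines, an interceptor would call `accepts(...)` before recording an event, and an operator could change the settings from a JMX console without reinstalling interceptors.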
As well as controlling what is recorded, we also need to decide what flexibility is required in how the events are reported. The current options are:
1) None (i.e. disabled)
2) Reporting direct to a store (e.g. database, although a file system option could be considered) - the issue with this approach is that the events are only persisted. They are available for historic analysis (e.g. building a call trace) but would not trigger (near) realtime event processing - although potentially a batch process could be created that would poll the db periodically to discover new events and trigger the event processing for them (?).
3) Reporting to a server via REST (although other protocols could also be considered)
NOTE: activity events are not persisted, or sent to the server, within the same thread reporting the event (i.e. the executing business txn) - they are batched and sent/stored periodically.
A question related to these configuration options is, do we need the ability to switch between them dynamically, or would static configuration be acceptable?
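As a rough illustration of the reporting options and the batching described in the NOTE, here is a minimal sketch. It is not the rtgov implementation: `BatchingReporter`, `Sink` and `NONE` are hypothetical names, events are plain strings to keep it short, and the flush period is arbitrary. The business thread only enqueues; a timer thread drains the queue and hands the batch to whichever sink is configured (a store writer or a REST client would each be a `Sink`). The sink is held in a volatile field purely to show what dynamic switching might involve.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BatchingReporterSketch {

    /** Hypothetical destination for a batch: a store, a REST endpoint, etc. */
    public interface Sink {
        void write(List<String> batch);
    }

    /** Option 1, "None": batches are simply dropped. */
    public static final Sink NONE = batch -> { };

    public static class BatchingReporter implements AutoCloseable {
        private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        private volatile Sink sink;

        public BatchingReporter(Sink sink, long periodMillis) {
            this.sink = sink;
            timer.scheduleAtFixedRate(() -> {
                try {
                    flush();
                } catch (RuntimeException e) {
                    // Swallowed so the schedule keeps running; a real
                    // implementation would log and decide whether to retry.
                }
            }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        }

        /** Called on the business thread: enqueue only, no I/O here. */
        public void report(String event) {
            pending.add(event);
        }

        /** Swap the destination at runtime (the "dynamic" variant). */
        public void setSink(Sink sink) {
            this.sink = sink;
        }

        /** Drain everything queued so far and hand it to the current sink. */
        public void flush() {
            List<String> batch = new ArrayList<>();
            pending.drainTo(batch);
            if (!batch.isEmpty()) {
                sink.write(batch);
            }
        }

        @Override
        public void close() {
            timer.shutdown();
            flush(); // send whatever is still queued
        }
    }

    public static void main(String[] args) throws Exception {
        try (BatchingReporter reporter = new BatchingReporter(
                batch -> System.out.println("sending " + batch.size() + " events"), 100)) {
            reporter.report("request received");
            reporter.report("response sent");
            Thread.sleep(250);
        }
    }
}
```

If static configuration turns out to be acceptable, `setSink` and the volatile field could be dropped and the sink chosen once at startup.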
Hopefully this is enough information to get a discussion going - although a subsequent area for discussion may be the impact that these configuration/filtering options have on user defined policies.
Any thoughts appreciated.