RHQ seems to be a great product for monitoring resources. However, I need not only to monitor resources but also to trigger "external actions" based on alert conditions. For example, when a server goes down, RHQ can send an email. What I need instead is, for example, to call a web service that brings up a backup server. Does RHQ provide a plugin concept for alerts, so that an alert can execute an arbitrary plugin or Java code rather than only send an email?
Possible improvement: It seems that RHQ has a 30-second minimum monitoring interval (as mentioned in another posting, this limit can be patched manually). However, I think a large percentage of RHQ users will not have hundreds of thousands of services to monitor. A standard setup for many users might comprise only 5 to 10 servers, monitoring only the most important services. So the amount of data collected is quite limited and will not overload the system. What is often required, though, is a shorter monitoring interval. I have to monitor a web server and bring up an alternative service if it fails, so I need to monitor at a 1-second interval. This is of course limited to one or two critical services per server, so the generated data is limited and could also be discarded or aggregated before being stored in the DB for historical purposes.
=> You are losing a big market share and user base by limiting the polling interval to 30 seconds. Making this configurable would allow many more users to use your system! RHQ is a great platform to integrate further plugins into, but a 30-second monitoring limit is too long for critical services!
Now my question related to monitoring intervals: Does RHQ offer a way (or what would be the best approach to implement this) to have, for example, a monitoring interval of 10 seconds, and if the service goes down (the first check returns a DOWN status), to switch to an interval of 1 second and reconfirm(!!) the DOWN status? I would do, for example, 5 checks (one check every second), and only after these 5 failed checks do I want to report the service as down. Is there a possibility to configure existing plugins (the alerts) to work like this?
In my own plugins (Heiko has written a great tutorial about plugins, thanks!!!), would I have to implement this logic directly in the method that checks for the up and down status? Or is there a clever way to configure this kind of monitoring (changing the monitoring interval for reconfirmation) in the alert logic in the GUI?
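To make the idea concrete, here is a minimal sketch of the reconfirmation logic I have in mind, written in plain Java. It does not use any RHQ API: the class name `ReconfirmingAvailabilityCheck`, the `probe` callback, and the parameters are all my own hypothetical names, and I am only assuming that something like this could be called from the availability method of a custom plugin.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper (not part of RHQ): reports DOWN only after the
 * initial check and a number of quick follow-up checks have all failed.
 */
final class ReconfirmingAvailabilityCheck {
    private final BooleanSupplier probe;   // returns true if the service answered
    private final int confirmations;       // follow-up checks after the first failure, e.g. 5
    private final long retryDelayMillis;   // pause between follow-up checks, e.g. 1000

    ReconfirmingAvailabilityCheck(BooleanSupplier probe, int confirmations, long retryDelayMillis) {
        this.probe = probe;
        this.confirmations = confirmations;
        this.retryDelayMillis = retryDelayMillis;
    }

    /** Returns true as soon as any check succeeds, false if every check failed. */
    boolean isUp() {
        if (probe.getAsBoolean()) {
            return true; // normal case: first check succeeded, no reconfirmation needed
        }
        for (int i = 0; i < confirmations; i++) {
            try {
                Thread.sleep(retryDelayMillis);
            } catch (InterruptedException e) {
                // Stop reconfirming, keep the interrupt flag, report the last known state (down).
                Thread.currentThread().interrupt();
                return false;
            }
            if (probe.getAsBoolean()) {
                return true; // service recovered during reconfirmation
            }
        }
        return false; // initial check and all follow-up checks failed
    }
}
```

With `confirmations = 5` and `retryDelayMillis = 1000`, a failing service would only be reported as down after roughly 5 seconds. One thing I am unsure about is whether the agent tolerates an availability check that blocks for this long, so this may not be the right place for the logic.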
Thank you very much for any advice!!!!