-
1. Re: Jopr Recovery Alert Not working
joe.marques Jun 9, 2009 10:07 AM (in response to amit8484)
I've not seen issues with recovery alerts. Please post the full, exact details of both alert definitions. I'll use that information to try to reproduce locally.
-
2. Re: Jopr Recovery Alert Not working
amit8484 Jun 9, 2009 10:34 AM (in response to amit8484)
JBoss AS Alert
Condition Set
If Condition: Availability goes DOWN
Dampening Rule: Each time condition set is true
Action Filters: Disable alert until re-enabled manually or by recovery alert: true

Recovery Alert for JBoss Server Alert
Condition Set
If Condition: Availability goes UP
Dampening Rule: Each time condition set is true
Recovery Alert: for JBoss AS Alert
Action Filters: Disable alert until re-enabled manually or by recovery alert: false
-
3. Re: Jopr Recovery Alert Not working
joe.marques Jun 9, 2009 10:49 AM (in response to amit8484)
That definition looks correct.
When you check the availability history for this resource (Monitor > Availability subtab if you're using Jopr 2.2.x), what are the timestamps for when it went down and came back up?
If your resource restarts very quickly (say, in 5 seconds), it's possible that the blip was not detected. Jopr polls for availability of the resource once per minute, so if your resource went down and came back up between successive polls, the monitoring system might not see the small window of downtime. Could that be happening here?
-
4. Re: Jopr Recovery Alert Not working
amit8484 Jun 9, 2009 11:12 AM (in response to amit8484)
How confident are you that availability is polled once per minute? Knowing that would help a lot in setting up my alerts.
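To illustrate why the polling interval matters here, the following is a minimal sketch of fixed-interval availability polling. It is not Jopr code: the function names, the 300-second horizon, and the outage times are made up for illustration, and only the 60-second interval comes from the thread.

```python
def observed_states(outage_start, outage_end, poll_interval=60, horizon=300):
    """Return the (time, is_up) samples a fixed-interval poller would record.

    The resource is down for outage_start <= t < outage_end (in seconds).
    poll_interval=60 mirrors the once-per-minute figure quoted in the thread;
    the rest is a hypothetical model, not Jopr's implementation.
    """
    samples = []
    for t in range(0, horizon + 1, poll_interval):
        is_up = not (outage_start <= t < outage_end)
        samples.append((t, is_up))
    return samples


def outage_detected(samples):
    """True if at least one poll saw the resource down."""
    return any(not is_up for _, is_up in samples)


# A 5-second restart that falls between two polls is never sampled as down.
quick = observed_states(outage_start=70, outage_end=75)
# A 90-second outage spans at least one poll, so it is seen.
long_outage = observed_states(outage_start=70, outage_end=160)

print(outage_detected(quick))        # False
print(outage_detected(long_outage))  # True
```

Under this model, neither the DOWN alert nor the recovery alert would have anything to react to for the quick restart, because no poll ever reports the resource as down.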
-
5. Re: Jopr Recovery Alert Not working
amit8484 Jun 9, 2009 11:17 AM (in response to amit8484)
Well, my thinking is that if it goes down, it will fire an alert. If the server is restarting (which doesn't take long), wouldn't the next poll indicate UP, and shouldn't that re-enable the original alert?
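The expected behaviour described here (a DOWN alert that disables itself after firing, plus a recovery alert on UP that re-enables it) can be modelled roughly as follows. This is a toy sketch, not Jopr's implementation: `AlertDefinition`, `process`, `simulate`, and the definition names are all hypothetical.

```python
class AlertDefinition:
    """Hypothetical stand-in for an alert definition on one resource."""

    def __init__(self, name, fires_when, disable_after_fire=False, recovers=None):
        self.name = name
        self.fires_when = fires_when              # "DOWN" or "UP" transition
        self.disable_after_fire = disable_after_fire
        self.recovers = recovers                  # definition this one re-enables
        self.enabled = True


def process(transitions, definitions):
    """Feed availability transitions to the definitions; return names that fired."""
    fired = []
    for transition in transitions:
        for definition in definitions:
            if definition.enabled and definition.fires_when == transition:
                fired.append(definition.name)
                if definition.disable_after_fire:
                    definition.enabled = False
                if definition.recovers is not None:
                    definition.recovers.enabled = True
    return fired


def simulate(transitions, with_recovery):
    down = AlertDefinition("down alert", "DOWN", disable_after_fire=True)
    definitions = [down]
    if with_recovery:
        definitions.append(AlertDefinition("recovery alert", "UP", recovers=down))
    return process(transitions, definitions)


flapping = ["DOWN", "UP", "DOWN", "UP"]
print(simulate(flapping, with_recovery=True))
# ['down alert', 'recovery alert', 'down alert', 'recovery alert']
print(simulate(flapping, with_recovery=False))
# ['down alert']
```

In this model, a DOWN alert that fires once and then stays disabled corresponds to the second case, where the recovery definition never takes effect.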
-
6. Re: Jopr Recovery Alert Not working
amit8484 Jun 9, 2009 11:23 AM (in response to amit8484)
Yeah, I don't think it can be a minute between polls, because if I set the dampening to alert every time it's true, I get like 40 emails every second notifying me it's down.
-
7. Re: Jopr Recovery Alert Not working
joe.marques Jun 9, 2009 2:19 PM (in response to amit8484)
40 emails every second? That sounds like a serious issue. What version of Jopr are you running? Can you give me any other information about your environment?
It most certainly is one minute between polls.
-
8. Re: Jopr Recovery Alert Not working
amit8484 Jun 11, 2009 9:08 AM (in response to amit8484)
Yeah, the JVM free memory one is not working either. I have the original alert fire when memory goes below 50 MB 3 times in the last 5 polls. The recovery alert should fire and re-enable the first one when it goes back up over 50 MB, but it doesn't.
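For reference, a "3 times in the last 5 polls" dampening rule can be sketched as a sliding window over condition results. This is only an illustration of the idea, not how Jopr implements dampening; the class `CountInLastN` and the sample readings are hypothetical, and the sketch does not model any reset after the alert fires.

```python
from collections import deque


class CountInLastN:
    """Satisfied when the condition was true at least `occurrences` times
    within the last `evaluations` checks (hypothetical helper)."""

    def __init__(self, occurrences, evaluations):
        self.occurrences = occurrences
        self.window = deque(maxlen=evaluations)  # oldest result drops off automatically

    def update(self, condition_true):
        self.window.append(bool(condition_true))
        return sum(self.window) >= self.occurrences


THRESHOLD_MB = 50
rule = CountInLastN(occurrences=3, evaluations=5)

# Made-up free-memory readings in MB, one per poll.
readings = [80, 45, 60, 40, 48, 70, 75, 90]
results = [rule.update(mb < THRESHOLD_MB) for mb in readings]

print(results)
# [False, False, False, False, True, True, False, False]
```

The rule first becomes satisfied on the fifth reading, when three of the last five values are below 50 MB, and stops being satisfied once enough low readings have left the window.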
-
9. Re: Jopr Recovery Alert Not working
josh2268 Jun 11, 2009 1:16 PM (in response to amit8484)
I am having the same trouble with recovery alerts. Mine are set up exactly like yours, with availability going down or up. They do not work for me on Postgres, Apache, JBoss, or RHQ agents. I've been looking for any documentation that clarifies what else is needed to set up recovery alerts. The down condition always works and disables the alert, but the up condition never fires, no matter how short or long the servers are down. I'm running Jopr 2.2 with everything on Ubuntu.
-
10. Re: Jopr Recovery Alert Not working
amit8484 Jun 15, 2009 2:13 PM (in response to amit8484)
Actually, I have Jopr version 1.1.0.GA. Could this be the problem?
-
11. Re: Jopr Recovery Alert Not working
desmetch Jul 8, 2009 8:16 AM (in response to amit8484)
Same symptoms on RHEL4, JDK 1.5.0, Oracle, and Jopr 2.2.0.
The down condition always fires and disables the alert, but the up condition never fires.
-
12. Re: Jopr Recovery Alert Not working
joe.marques Jul 8, 2009 2:27 PM (in response to amit8484)
The recovery alerts bug only affected alert templates. If you created an alert definition on a single resource, the recovery rules should work properly.
This issue was tracked here -- http://jira.rhq-project.org/browse/RHQ-2150
Whenever you make any edits to any alert template that has recovery rules, the workaround SQL (listed in that case) needs to be executed. After that, you must do one of the following:
1) create a compatible group of "RHQ Server Alerts Engine Subsystem" services and execute the "reload caches" operation on that group
2) if you have an older version of Jopr that doesn't have that operation available, bounce/restart your Jopr server(s) after executing the SQL
The bug has been fixed, and the fix will be released in the next version of Jopr (2.3).