12 Replies Latest reply on Jul 8, 2009 2:27 PM by joe.marques

    Jopr Recovery Alert Not working

    amit8484

      I currently have this setup.

      I have one alert to fire when availability goes down, and then disable until re-enabled by a recovery alert.

      I have the recovery alert set to fire when availability goes back up and re-enable the first alert, but that's not working.

      My first alert is not being re-enabled by the recovery alert. What's wrong?

        • 1. Re: Jopr Recovery Alert Not working

          I've not seen issues with recovery alerts. Please post the full, exact details of both alert definitions. I'll use that information to try and reproduce locally.

          • 2. Re: Jopr Recovery Alert Not working
            amit8484

            Jboss AS Alert

            Condition Set
            If Condition: Availability goes DOWN

            Dampening Rule: Each time condition set is true


            Action Filters: Disable alert until re-enabled manually or by recovery alert : true

            Recovery Alert for JboSS Server Alert

            Condition Set
            If Condition: Availability goes UP

            Dampening Rule: Each time condition set is true

            Recovery Alert: for Jboss AS Alert

            Action Filters: Disable alert until re-enabled manually or by recovery alert : false
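For readers following along, the intended interplay of the two definitions above can be sketched as a small state model. This is only an illustration of the expected behavior, not Jopr's implementation; `AlertDefinition`, `process`, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlertDefinition:
    """Hypothetical stand-in for an alert definition (not a Jopr class)."""
    name: str
    fires_on: str                                   # availability change: "DOWN" or "UP"
    disable_after_firing: bool = False              # the "Disable alert until re-enabled" filter
    recovers: Optional["AlertDefinition"] = None    # definition this one re-enables
    enabled: bool = True


def process(change: str, definitions: List[AlertDefinition]) -> List[str]:
    """Apply one availability change and return the names of the alerts that fired."""
    fired = []
    for d in definitions:
        if d.enabled and d.fires_on == change:
            fired.append(d.name)
            if d.disable_after_firing:
                d.enabled = False           # stays off until recovered or re-enabled manually
            if d.recovers is not None:
                d.recovers.enabled = True   # the recovery alert switches the original back on
    return fired
```

With a DOWN definition that disables itself and an UP definition that recovers it, the expected sequence is: DOWN fires once, further DOWN changes are ignored, UP fires the recovery alert, and the next DOWN fires again. The symptom reported in this thread is that the last two steps never happen.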

            • 3. Re: Jopr Recovery Alert Not working

              That definition looks correct.

              When you check the availability history for this resource (monitor>availability subtab if you're using Jopr 2.2.x) what are the timestamps for when it went down and came back up?

              It's possible that if your resource restarts very quickly (say in 5 seconds) that the blip would not have been detected. Jopr polls for availability of the resource once per minute, so if your resource went down and came back up in between successive polls, the monitoring system might not see the small window of downtime. Could that be happening here?
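To illustrate the point about missed blips, here is a minimal simulation of a poller sampling once every 60 seconds. The function names, the time horizon, and the example timings are made up for illustration; only the one-minute interval comes from the post above.

```python
def observed_states(outage_start, outage_end, poll_interval=60, horizon=300):
    """Return the (time, state) samples a poller would record.

    The resource is DOWN for outage_start <= t < outage_end (seconds)
    and UP otherwise. Polls happen at t = 0, poll_interval, 2 * poll_interval, ...
    """
    samples = []
    for t in range(0, horizon + 1, poll_interval):
        state = "DOWN" if outage_start <= t < outage_end else "UP"
        samples.append((t, state))
    return samples


def outage_detected(outage_start, outage_end, poll_interval=60, horizon=300):
    """True if at least one poll lands inside the outage window."""
    samples = observed_states(outage_start, outage_end, poll_interval, horizon)
    return any(state == "DOWN" for _, state in samples)
```

A 5-second restart from t=65 to t=70 falls entirely between the polls at t=60 and t=120, so `outage_detected(65, 70)` is false and neither the DOWN alert nor the recovery alert would have anything to react to. An outage from t=55 to t=125 spans two polls and is seen.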

              • 4. Re: Jopr Recovery Alert Not working
                amit8484

                How confident are you that availability is polled once per minute? Knowing that would help a lot in setting up my alerts.

                • 5. Re: Jopr Recovery Alert Not working
                  amit8484

                  Well, my thinking is that if it goes down, it will fire an alert, and if the server is restarting (which doesn't take long), the next poll should indicate up and re-enable the original alert.

                  • 6. Re: Jopr Recovery Alert Not working
                    amit8484

                    Yeah, I don't think it can be a minute between polls, because if I set the dampening to alert every time it's true, I get something like 40 emails every second notifying me it's down.

                    • 7. Re: Jopr Recovery Alert Not working

                      40 emails every second? That sounds like a serious issue. What version of Jopr are you running on? Can you give me any other information about your environment?

                      It most certainly is one minute between polls.

                      • 8. Re: Jopr Recovery Alert Not working
                        amit8484

                        Yeah, the JVM free memory one is not working either. I have the original alert set to fire when free memory goes below 50 MB 3 times in the last 5 polls. The recovery alert should fire and re-enable the first one when it goes back up over 50 MB, but it doesn't.
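The "3 times in the last 5 polls" dampening rule described here can be sketched as a sliding window over recent condition results. This is an illustrative model only: the class name is hypothetical, and clearing the history after firing is an assumption, not documented Jopr behavior.

```python
from collections import deque


class CountInWindowDampener:
    """Fire when the condition was true at least `occurrences` times
    within the last `window` evaluations (hypothetical model)."""

    def __init__(self, occurrences=3, window=5):
        self.occurrences = occurrences
        self.history = deque(maxlen=window)  # keeps only the newest `window` results

    def record(self, condition_met: bool) -> bool:
        """Record one evaluation and return True if the alert should fire now."""
        self.history.append(condition_met)
        if sum(self.history) >= self.occurrences:
            self.history.clear()  # assumption: start counting afresh after firing
            return True
        return False
```

Feeding it free-memory readings of 60, 45, 48, 70, and 40 MB against a 50 MB threshold gives condition results false, true, true, false, true, so the alert would fire on the fifth poll.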

                        • 9. Re: Jopr Recovery Alert Not working

                          I am having the same trouble with recovery alerts. Mine are set up exactly like yours, with availability going down or up. They do not work for me on postgres, apache, jboss, or rhq agents. I've been looking for documentation that clarifies whether anything else is needed to set up recovery alerts. The down condition always works and disables the alert, but the up condition never fires, no matter how short or long the servers are down. I'm running Jopr 2.2 with everything on Ubuntu.

                          • 10. Re: Jopr Recovery Alert Not working
                            amit8484

                            Actually, I have Jopr version 1.1.0.GA. Could this be the problem?

                            • 11. Re: Jopr Recovery Alert Not working
                              desmetch

                              Same symptoms on RHEL4, jdk 1.5.0, oracle and jopr 2.2.0.

                              The down condition always fires and disables the alert, but the up condition never fires.

                              • 12. Re: Jopr Recovery Alert Not working

                                The recovery alerts bug only affected alert templates. If you created an alert definition on a single resource, the recovery rules should work properly.

                                This issue was tracked here -- http://jira.rhq-project.org/browse/RHQ-2150

                                Whenever you make any edits to any alert template that has recovery rules, the workaround SQL (listed in that case) needs to be executed. After that, you must either:

                                1) create a compatible group of "RHQ Server Alerts Engine Subsystem" services and execute the "reload caches" operation on that group, or
                                2) if you have an older version of Jopr that doesn't have that operation available, bounce/restart your Jopr server(s) after executing the SQL

                                It has been fixed and will be released in the next version of Jopr (2.3)