1 Reply Latest reply on Apr 1, 2011 10:17 AM by mazz

    Events and Alerts

    runtis

      Hi Guys

       

      I have set up a bunch of alerts based on events in the server.log. These events typically mean the profile needs to be restarted, so I made the alert do the restart (I use recovery alerts to prevent problems with events being non-deterministic).

       

      Sometimes the event takes quite a long time to trigger an alert (maybe 5 minutes), and by then I have already restarted the profile manually, making the alert pointless.

       

      This wiki page

       

      http://www.rhq-project.org/display/JOPR2/Events

       

      says that events don't have a schedule, and I have not found anywhere in JON to set one. Is there any way to fine-tune how often events are processed?

       

      Thank you

        • 1. Events and Alerts
          mazz

          I'm looking at the code and I see this:

           

          org.rhq.core.pc.event.EventManager:

           

                  this.senderThreadPool.scheduleAtFixedRate(senderRunner, this.pcConfig.getEventSenderInitialDelay(),
                      this.pcConfig.getEventSenderPeriod(), TimeUnit.SECONDS);

           

          So this is configurable. The event sender period defaults to 30 seconds:

           

          org.rhq.core.pc.PluginContainerConfiguration:

           

              public static final long EVENT_SENDER_PERIOD_DEFAULT = 30L; // in seconds
          ...
              public long getEventSenderPeriod() {
                  Long period = (Long) configuration.get(EVENT_SENDER_PERIOD_PROP);
                  return (period == null) ? EVENT_SENDER_PERIOD_DEFAULT : period.longValue();
              }
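To make the two snippets above concrete, here is a minimal, self-contained sketch of the same pattern: look up a period with a fallback default, then hand it to scheduleAtFixedRate. This is not RHQ code. The class name, the map standing in for the plugin container configuration, and the key "sender.period.secs" are all hypothetical; only the 30-second default and the use of scheduleAtFixedRate with TimeUnit.SECONDS come from the code quoted above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in, not the RHQ classes: a period lookup with a
// default, feeding a fixed-rate scheduled task.
class SenderPeriodSketch {
    // Invented key for illustration; the real property constant is not shown above.
    static final String PERIOD_PROP = "sender.period.secs";
    static final long PERIOD_DEFAULT_SECS = 30L; // same default as quoted above

    // Returns the configured period, or the default when the key is absent.
    static long senderPeriod(Map<String, Long> config) {
        Long period = config.get(PERIOD_PROP);
        return (period == null) ? PERIOD_DEFAULT_SECS : period;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Long> config = new HashMap<>();
        System.out.println("default period (s): " + senderPeriod(config));

        config.put(PERIOD_PROP, 1L); // shortened so the demo finishes quickly
        long period = senderPeriod(config);

        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch runs = new CountDownLatch(3);
        // Runs immediately, then once per period, like the sender task above.
        pool.scheduleAtFixedRate(() -> {
            System.out.println("sending event report");
            runs.countDown();
        }, 0L, period, TimeUnit.SECONDS);

        runs.await();       // wait for three runs
        pool.shutdownNow(); // stop the periodic task
    }
}
```

With fixed-rate scheduling like this, something that shows up just after one run waits roughly one period for the next run, so the sender period on its own should add at most about that much delay.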

           

          This is configured in the agent, via the property "rhq.agent.plugins.event-sender.period-secs". In agent-configuration.xml:

           

                         <!--
                         _______________________________________________________________
                         rhq.agent.plugins.event-sender.period-secs

                         Defines how often event reports get sent to the server. The
                         value is specified in seconds.
                         -->
                         <!--
                         <entry key="rhq.agent.plugins.event-sender.period-secs" value="30"/>
                         -->

           

          (Note: please read the comments at the top of agent-configuration.xml before you edit the file on the assumption that the agent will pick up its values.)

           

          So I'm not sure how you'd see a 5-minute delay unless one of the following is true:

          a) you already changed this sender period setting to something higher;
          b) the log message filters/alerts you set up aren't picking up the messages you expect them to pick up;
          c) the log messages aren't actually getting logged at all, or not in a timely manner;
          d) I'm reading this code all wrong or looking at the wrong thing;
          e) there is a bug that causes a delay in sending the messages up to the server.

           

          Therefore, we'd need more information. What is the agent configuration (see the agent prompt command "getconfig")? What do your agent logs say? Maybe put the agent in debug mode and see what the agent log says between the time your event log message actually gets written to your resource's log file and the time the agent detects it and sends it up to the server. What is your alert definition (i.e. what event filters/regex do you use)? When was the log message actually written to the log file, and when did the agent actually detect it?
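One quick way to rule out the filter question is to test your regex against a real line copied from server.log, outside of JON. Below is a rough sketch using plain java.util.regex. The class name, the filter, and the log line are invented examples, so substitute your own. It uses find() (match anywhere in the line); I haven't checked whether the agent applies the expression that way or requires a full-line match, so treat that as an assumption and try both forms if in doubt.

```java
import java.util.regex.Pattern;

// Hypothetical helper for checking a filter expression by hand; not part of RHQ.
class EventFilterCheck {
    // True when the regex matches anywhere in the log line (substring semantics).
    static boolean matches(String regex, String logLine) {
        return Pattern.compile(regex).matcher(logLine).find();
    }

    public static void main(String[] args) {
        // Both values below are made-up examples; paste in your own filter
        // and a line copied from the resource's log file.
        String filter = "OutOfMemoryError";
        String line = "2011-04-01 10:02:11,532 ERROR [org.example.Worker] "
                + "java.lang.OutOfMemoryError: Java heap space";
        System.out.println("filter matches: " + matches(filter, line));
    }
}
```

If the expression does not match the line here, the delay you are seeing may really be the alert firing on a later, different message.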