4 Replies Latest reply on Jun 7, 2007 8:58 AM by marklittle

    ESB Monitoring and Metrics : what should be monitored?

    tcunning

      We're looking to put together a list of what should be monitored in the ESB. Burr put together a preliminary list, which I've posted below, but we'd like to hear what everyone else would like to see.

      Burr's list :
      - What listeners/services are running?
      - Is the listener/service alive?
      - Recycle the listener/service
      - How many messages are waiting to be processed?
      - How old is the oldest message?

      Please feel free to suggest anything that you think would be useful.
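
      The last two items on the list (queue depth and age of the oldest message) could be tracked along these lines. This is only a minimal sketch: PendingQueueMonitor and its methods are hypothetical names, not part of any existing ESB API, and it assumes messages are processed in FIFO order and that the caller supplies the current time in milliseconds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not an ESB API: remembers when each pending message
// was queued so that queue depth and oldest-message age can be reported.
class PendingQueueMonitor {
    // Enqueue timestamps in arrival order (FIFO processing is assumed).
    private final Deque<Long> enqueueTimesMillis = new ArrayDeque<>();

    // Record a message arriving at time nowMillis.
    synchronized void messageQueued(long nowMillis) {
        enqueueTimesMillis.addLast(nowMillis);
    }

    // Record that the oldest pending message has been picked up for processing.
    synchronized void messageDequeued() {
        enqueueTimesMillis.pollFirst();
    }

    // How many messages are waiting to be processed?
    synchronized int pendingCount() {
        return enqueueTimesMillis.size();
    }

    // How old is the oldest waiting message? Returns 0 when nothing is waiting.
    synchronized long oldestAgeMillis(long nowMillis) {
        Long oldest = enqueueTimesMillis.peekFirst();
        return oldest == null ? 0L : nowMillis - oldest;
    }
}
```

      A listener would call messageQueued when a message arrives and messageDequeued when it starts processing one; passing the clock value in keeps the class easy to test.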

        • 1. Re: ESB Monitoring and Metrics : what should be monitored?
          kurtstam

          Monitoring and managing are two different things. I think the list splits like this:

          management:
          - What listeners/services are running?
          - Recycle the listener/service

          both of which are close to what we already have for an .esb archive (see the jmx-console).

          monitoring:
          - Is the listener/service alive?
          - How many messages are waiting to be processed?
          - How old is the oldest message?
          and here I'd like to add:
          - Statistics/history on message processing over time: Messages/min for each service.

          • 2. Re: ESB Monitoring and Metrics : what should be monitored?
            kukeltje

            for each service:
            maximum time of processing a message
            average time of message processing
            standard deviation of message processing time
            messages/min over 1 minute period
            messages/min over 5 minute period
            messages/min over 60 minute period
            maximum wait time for a message to be processed
            ...
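
            A rough sketch of how these per-service numbers might be accumulated. ServiceStats is a hypothetical class, not an existing ESB API; it assumes durations and timestamps are given in milliseconds, uses Welford's online method for the mean and sample standard deviation, and keeps completion timestamps for the last 60 minutes to compute the rates.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical per-service statistics holder (illustration only).
class ServiceStats {
    private static final long RETENTION_MILLIS = 60 * 60_000L; // longest window: 60 min

    private long count;
    private double mean;   // running mean of processing time
    private double m2;     // running sum of squared deviations from the mean
    private long maxMillis;
    private final Deque<Long> completionTimes = new ArrayDeque<>();

    // Record one processed message: how long it took and when it finished.
    synchronized void record(long durationMillis, long nowMillis) {
        count++;
        double delta = durationMillis - mean;
        mean += delta / count;
        m2 += delta * (durationMillis - mean); // Welford update
        maxMillis = Math.max(maxMillis, durationMillis);

        completionTimes.addLast(nowMillis);
        // Drop timestamps that have fallen out of the longest window.
        while (!completionTimes.isEmpty()
                && completionTimes.peekFirst() <= nowMillis - RETENTION_MILLIS) {
            completionTimes.pollFirst();
        }
    }

    synchronized long maxMillis() { return maxMillis; }

    synchronized double averageMillis() { return mean; }

    // Sample standard deviation; 0 until at least two messages are recorded.
    synchronized double stdDevMillis() {
        return count < 2 ? 0.0 : Math.sqrt(m2 / (count - 1));
    }

    // Average messages per minute over the last windowMinutes (e.g. 1, 5 or 60).
    synchronized double messagesPerMinute(int windowMinutes, long nowMillis) {
        long cutoff = nowMillis - windowMinutes * 60_000L;
        long inWindow = 0;
        for (long t : completionTimes) {
            if (t > cutoff) {
                inWindow++;
            }
        }
        return (double) inWindow / windowMinutes;
    }
}
```

            Maximum wait time could be tracked the same way by recording the time between enqueue and start of processing instead of the processing duration.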

            • 3. Re: ESB Monitoring and Metrics : what should be monitored?
              marklittle

              We're really talking about runtime management/governance aspects. That does encapsulate monitoring as well, because in order to manage, you need to know what's going on ;-)

              The sorts of things I'd like to see us be able to do include:

              What services are deployed?
              How long have they been available?
              - MTTF/MTTR

              CBR information
              - What rules have been triggered and how often?
              - What messages have not been delivered and why?

              Service lifecycle
              - Start/stop/suspend/resume

              General monitoring
              - Number of failed requests and why (e.g., endpoint down versus message incompatibility)
              - Number of messages through a service
              - Time for responses

              Supported message set
              Supported message exchange pattern

              Transformation
              - What transforms have been triggered?
              - Time to do a transform (cross-referenced with message size)

              A lot of this can then be cross-referenced to determine things like service/network bottlenecks, availability issues, application deployment problems, scalability requirements, etc.
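
              As one illustration of the "failed requests and why" item, failures could be counted per cause so they can later be cross-referenced against other metrics. This is a sketch only: FailureCause and FailureCounter are made-up names, and the set of causes is an assumption based on the two examples given above.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical failure categories; only the first two come from the examples above.
enum FailureCause { ENDPOINT_DOWN, MESSAGE_INCOMPATIBLE, OTHER }

// Hypothetical per-service counter of failed requests, broken down by cause.
class FailureCounter {
    private final Map<FailureCause, Long> counts = new EnumMap<>(FailureCause.class);

    // Record one failed request with the reason it failed.
    synchronized void recordFailure(FailureCause cause) {
        counts.merge(cause, 1L, Long::sum);
    }

    // Number of failures seen so far for a single cause.
    synchronized long count(FailureCause cause) {
        return counts.getOrDefault(cause, 0L);
    }

    // Total number of failed requests across all causes.
    synchronized long total() {
        long sum = 0;
        for (long c : counts.values()) {
            sum += c;
        }
        return sum;
    }
}
```

              Keeping the cause as an enum rather than free text makes it straightforward to compare, say, endpoint-down failures against availability figures for the same service.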

              • 4. Re: ESB Monitoring and Metrics : what should be monitored?
                marklittle

                Tom, I'd start to make progress based on this feedback. This will always be a moving target to some degree, but what's been mentioned here makes a good first pass.