4 Replies Latest reply on Jun 7, 2007 8:58 AM by marklittle

ESB Monitoring and Metrics : what should be monitored?

tcunning May 31, 2007 1:25 PM

We're putting together a list of what should be monitored in the ESB. Burr put together a preliminary list, which I've posted below, but we'd like to hear what everyone else would like to see.

Burr's list :
- What listeners/services are running?
- Is the listener/service alive?
- Recycle the listener/service
- How many messages are waiting to be processed?
- How old is the oldest message?

Please feel free to suggest anything that you think would be useful.
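
To make the last two items in the list concrete, here is a minimal sketch of how "messages waiting" and "age of the oldest message" could be tracked. It is not ESB code: `PendingQueueMonitor` and its methods are hypothetical names, and it assumes messages are processed in arrival order.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

/** Hypothetical tracker for one service's inbound queue (not part of the ESB). */
class PendingQueueMonitor {
    // Arrival timestamps of messages not yet processed, oldest first.
    private final Deque<Instant> arrivals = new ArrayDeque<>();

    /** Call when a message is placed on the queue. */
    synchronized void onEnqueue(Instant arrivedAt) {
        arrivals.addLast(arrivedAt);
    }

    /** Call when the oldest message is taken off the queue for processing. */
    synchronized void onDequeue() {
        arrivals.pollFirst();
    }

    /** How many messages are waiting to be processed? */
    synchronized int pendingCount() {
        return arrivals.size();
    }

    /** How old is the oldest waiting message? Empty if nothing is waiting. */
    synchronized Optional<Duration> oldestAge(Instant now) {
        Instant oldest = arrivals.peekFirst();
        return oldest == null
                ? Optional.empty()
                : Optional.of(Duration.between(oldest, now));
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the method, keeps the age calculation easy to test.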

1. Re: ESB Monitoring and Metrics : what should be monitored?

kurtstam May 31, 2007 3:33 PM (in response to tcunning)

Monitoring and managing are two different things. I think the list splits like this:

management:
- What listeners/services are running?
- Recycle the listener/service

both of which are close to what we already have for an .esb archive (see the jmx-console).
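
For illustration, the two management items could be exposed through a standard JMX MBean roughly as below. This is only a sketch: `ListenerControlMBean`, `ListenerControl` and the `example.esb` ObjectName domain are made-up names, not the MBeans the ESB actually registers, and `recycle()` here just flips a flag where a real listener would be stopped and restarted.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class EsbJmxSketch {
    /** Standard MBean interface: must be public and named <ImplClass>MBean. */
    public interface ListenerControlMBean {
        boolean isRunning();     // shows up as attribute "Running"
        int getRecycleCount();   // shows up as attribute "RecycleCount"
        void start();
        void stop();
        void recycle();          // shows up as an operation
    }

    /** Hypothetical implementation; a real one would delegate to the listener. */
    public static class ListenerControl implements ListenerControlMBean {
        private volatile boolean running;
        private final AtomicInteger recycleCount = new AtomicInteger();

        public boolean isRunning() { return running; }
        public int getRecycleCount() { return recycleCount.get(); }
        public void start() { running = true; }
        public void stop() { running = false; }
        public void recycle() {
            stop();
            start();
            recycleCount.incrementAndGet();
        }
    }

    /** Registers the bean with the platform MBean server under a made-up name. */
    static ObjectName register(String listenerName, ListenerControl control) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.esb:type=Listener,name=" + listenerName);
        server.registerMBean(control, name);
        return name;
    }
}
```

Once registered, the attributes and the `recycle` operation are visible to any JMX client connected to that MBean server.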

monitoring:
- Is the listener/service alive?
- How many messages are waiting to be processed?
- How old is the oldest message?
and here I'd like to add:
- Statistics/history on message processing over time: Messages/min for each service.
2. Re: ESB Monitoring and Metrics : what should be monitored?

kukeltje May 31, 2007 6:08 PM (in response to tcunning)

for each service:
maximum time of processing a message
average time of message processing
standard deviation of message processing time
messages/min over 1 minute period
messages/min over 5 minute period
messages/min over 60 minute period
maximum wait time for a message to be processed
...
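
A rough sketch of how these per-service numbers could be collected, in case it helps the discussion. `ServiceStats` is a hypothetical class, not an existing ESB API. It uses Welford's online algorithm for the mean and sample standard deviation, and it assumes completion timestamps arrive in non-decreasing order and that windows are at most 60 minutes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical per-service statistics collector (times in milliseconds). */
class ServiceStats {
    private static final long RETAIN_MILLIS = 60 * 60_000L; // keep one hour of history

    private long count;
    private double mean;     // running mean of processing time
    private double m2;       // running sum of squared deviations from the mean
    private long maxMillis;
    private final Deque<Long> completions = new ArrayDeque<>(); // completion timestamps

    /** Record one processed message. */
    synchronized void record(long completedAtMillis, long processingMillis) {
        // Welford's update: numerically stable, no need to store every sample.
        count++;
        double delta = processingMillis - mean;
        mean += delta / count;
        m2 += delta * (processingMillis - mean);
        maxMillis = Math.max(maxMillis, processingMillis);

        completions.addLast(completedAtMillis);
        // Drop timestamps that have fallen out of the one-hour retention window.
        while (!completions.isEmpty()
                && completions.peekFirst() <= completedAtMillis - RETAIN_MILLIS) {
            completions.pollFirst();
        }
    }

    synchronized long maxProcessingMillis() { return maxMillis; }

    synchronized double averageProcessingMillis() { return mean; }

    /** Sample standard deviation; 0 until there are at least two samples. */
    synchronized double stdDevProcessingMillis() {
        return count < 2 ? 0.0 : Math.sqrt(m2 / (count - 1));
    }

    /** Messages per minute averaged over the last windowMinutes (1, 5, 60, ...). */
    synchronized double messagesPerMinute(long nowMillis, int windowMinutes) {
        long cutoff = nowMillis - windowMinutes * 60_000L;
        long inWindow = completions.stream()
                .filter(t -> t > cutoff && t <= nowMillis)
                .count();
        return (double) inWindow / windowMinutes;
    }
}
```

Maximum wait time could be tracked the same way by recording the time between arrival and the start of processing as a second sample.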
3. Re: ESB Monitoring and Metrics : what should be monitored?

marklittle Jun 1, 2007 6:12 AM (in response to tcunning)

We're really talking about runtime management/governance aspects. That does encapsulate monitoring as well, because in order to manage, you need to know what's going on ;-)

The sorts of things I'd like to see us be able to do include:

What services are deployed?
How long have they been available?
- MTF/MTTR

CBR information
- What rules have been triggered and how often?
- What messages have not been delivered and why?

Service lifecycle
- Start/stop/suspend/resume

General monitoring
- Number of failed requests and why e.g., endpoint down versus message incompatibility
- Number of messages through a service
- Time for responses

Supported message set
Supported message exchange pattern

Transformation
- what transforms have been triggered
- time to do a transform (cross referenced with message size)

A lot of this can then be cross referenced to determine things like service/network bottlenecks, availability issues, application deployment problems, scalability requirements etc.
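
As one possible shape for the "number of failed requests and why" item, here is a small sketch that counts messages and failures per service, keyed by reason. `FailureLedger` and the `Reason` values are hypothetical names; the two specific reasons are the ones given as examples in the list above, and `OTHER` is added as a catch-all.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-service counter of successes and categorised failures. */
class FailureLedger {
    enum Reason { ENDPOINT_DOWN, MESSAGE_INCOMPATIBLE, OTHER }

    private final Map<String, LongAdder> successes = new ConcurrentHashMap<>();
    private final Map<String, Map<Reason, LongAdder>> failures = new ConcurrentHashMap<>();

    void recordSuccess(String service) {
        successes.computeIfAbsent(service, s -> new LongAdder()).increment();
    }

    void recordFailure(String service, Reason reason) {
        failures.computeIfAbsent(service, s -> new ConcurrentHashMap<>())
                .computeIfAbsent(reason, r -> new LongAdder())
                .increment();
    }

    /** Number of failed requests for a service with the given reason. */
    long failures(String service, Reason reason) {
        Map<Reason, LongAdder> byReason = failures.get(service);
        if (byReason == null) return 0;
        LongAdder n = byReason.get(reason);
        return n == null ? 0 : n.sum();
    }

    /** Number of failed requests for a service, all reasons combined. */
    long totalFailures(String service) {
        Map<Reason, LongAdder> byReason = failures.get(service);
        if (byReason == null) return 0;
        return byReason.values().stream().mapToLong(LongAdder::sum).sum();
    }

    /** Number of messages through a service, successful or not. */
    long messagesThrough(String service) {
        LongAdder ok = successes.get(service);
        return (ok == null ? 0 : ok.sum()) + totalFailures(service);
    }
}
```

Keeping the reason as a separate dimension is what allows the cross-referencing described above, for example telling an endpoint outage apart from a message format problem.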
4. Re: ESB Monitoring and Metrics : what should be monitored?

marklittle Jun 7, 2007 8:58 AM (in response to tcunning)

Tom, I'd start to make progress based on this feedback. This will always be a moving target to some degree, but what's been mentioned here makes a good first pass.