William Brander and Sean Farmar show how the monitoring game changes when a system becomes distributed and you start delving into the world of microservices.
Learn:
* Why monitoring changes in distributed systems
* A monitoring philosophy that ensures all bases are covered
* The aspects of monitoring that affect asynchronous messaging systems
1.
http://particular.net
What to consider when monitoring
microservices
Sean Farmar holds the world record for answering the most
NServiceBus questions - even more than Udi.
With over 20 years of experience, he specializes in providing
simple solutions for complex business requirements using
NServiceBus and applying SOA principles inspired by Udi
Dahan.
As a solution architect with Particular Software, the creators of
NServiceBus, Sean provides support, training and consulting
for customers using NServiceBus and the Particular Platform.
A professional geek, William works for Particular Software
writing amazing software like NServiceBus. Passionate about
the web and security, he is engaged in a sordid love affair with
JavaScript, and spends most of his free time trying to convince
others of its beauty and elegance.
When not behind his laptop hacking away, this amateur beer
enthusiast can often be found playing boardgames or drinking
cold-brew coffee.
2.
William Brander http://particular.net Sean Farmar
What to consider when monitoring
microservices
3.
Agenda
• Introduction
• A Philosophy on Monitoring
• How things change when they’re distributed
• Monitoring Metrics
• Q & A
4.
An average production system
Database
• Is the web server up?
• Is the database up?
• Can the webserver talk to the db?
5.
What are you actually monitoring?
• Business Capability: Can users access the system?
• Application: Is my application process running?
• Infrastructure: Are my servers running?
6.
A Monitoring Philosophy
Monitoring Area: Business Capability, Application, Infrastructure
Monitoring Concern: Capacity, Performance, Health
7.
Monitoring Concerns
Business Capability
• Health: Can users access the system?
• Performance: Are we meeting our SLAs?
• Capacity: What is the impact of adding another customer?
Application
• Health: Is my application generating exceptions?
• Performance: How quickly is my system processing messages?
• Capacity: Can I handle month end batch jobs?
Infrastructure
• Health: Is the server up?
• Performance: Is there high CPU?
• Capacity: Do I have enough disk space?
8.
A Monitoring Philosophy
Monitoring Area: Business Capability, Application, Infrastructure
Monitoring Concern: Capacity, Performance, Health
Interaction Type: Proactive, Reactive, Passive
9.
An average production system
Database
• Is the web server up?
• Is the database up?
• Can the webserver talk to the db?
Area: Infrastructure, Concern: Health, Interaction: Passive
12.
Monitoring distributed systems
Multiple processes and servers and queues
We want to monitor the time it takes for a message to be processed
We need to monitor the message queues
14.
Queue Length
• Queue length is an indicator of work still outstanding
• High queue length doesn’t necessarily indicate a problem though
Stable or decreasing is good
Increasing is bad
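The trend rule above can be sketched in code. This is a minimal illustration, not something taken from the talk: the function name `queue_length_trend`, the `tolerance` parameter, and the use of a least-squares slope are all assumptions chosen for the example.

```python
from statistics import mean


def queue_length_trend(samples, tolerance=0.0):
    """Classify queue-length samples (oldest first) by their overall trend.

    Hypothetical helper: fits a least-squares line through the samples and
    looks only at the slope, so a high but flat queue is reported as stable.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to see a trend")
    x_mean = (n - 1) / 2
    y_mean = mean(samples)
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    slope = numerator / denominator  # messages gained per sampling interval
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "stable"


if __name__ == "__main__":
    print(queue_length_trend([10, 20, 30, 40]))
    print(queue_length_trend([900, 905, 895, 900], tolerance=2.0))
```

The classification depends on the direction of change rather than the absolute length, which matches the point that a high queue length is not necessarily a problem.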
16.
Processing Time and fault tolerance
• Processing Time does not include error handling time
• Avoid losing data due to exceptions or temporary connectivity issues
• If all else fails, move the message to the error queue
17.
Exception Diagnostics – Immediate Retries
Input Queue → Invoke
On failure: Immediate Retries x n
If all retries fail: Start Delayed Retries (Timeout Queue)
18.
Exception Diagnostics – Delayed Retries
Timeout Queue → Invoke
On failure: return to timeout queue and retry x times
If all fails: move to the Error Queue
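The retry flow on the last two slides can be simulated in a few lines. This is a sketch under stated assumptions, not NServiceBus code: `handle_with_retries` and its parameters are hypothetical names, the timeout queue is stood in for by a `sleep` call, and it is assumed that immediate retries run again after each delayed retry.

```python
import time


def handle_with_retries(message, handler, error_queue,
                        immediate_retries=2, delayed_retries=2,
                        delay_seconds=0.0, sleep=time.sleep):
    """Try a handler with immediate retries, then delayed retries.

    Returns (outcome, attempts). If every attempt fails, the message is
    appended to error_queue (a plain list here) instead of being lost.
    """
    attempts = 0
    for delayed_round in range(delayed_retries + 1):
        if delayed_round > 0:
            # Stand-in for parking the message in a timeout queue.
            sleep(delay_seconds * delayed_round)
        for _ in range(immediate_retries + 1):
            attempts += 1
            try:
                handler(message)
                return ("processed", attempts)
            except Exception:
                continue  # immediate retry
    error_queue.append(message)
    return ("error_queue", attempts)


if __name__ == "__main__":
    errors = []

    def always_fails(msg):
        raise RuntimeError("database unavailable")

    print(handle_with_retries("order-42", always_fails, errors), errors)
```

A real messaging system would also record the exception details alongside the failed message; that part is left out here to keep the sketch short.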
19.
Detecting Connectivity
• Distributed systems typically keep working when other parts aren’t available
• How do you know the endpoint you’re sending messages to is
actually processing messages?
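One common way to answer that question is to have each endpoint report a periodic heartbeat and flag endpoints that go quiet. The slide itself only poses the question, so the heartbeat approach, the function name `silent_endpoints`, and the 30-second threshold below are all assumptions made for illustration.

```python
def silent_endpoints(last_heartbeat, now, max_silence_seconds=30.0):
    """Return the names of endpoints whose last heartbeat is too old.

    last_heartbeat maps endpoint name -> timestamp (seconds) of the most
    recent heartbeat received; now is the current timestamp in seconds.
    """
    return sorted(
        name
        for name, seen_at in last_heartbeat.items()
        if now - seen_at > max_silence_seconds
    )


if __name__ == "__main__":
    heartbeats = {"Sales": 100.0, "Billing": 60.0, "Shipping": 95.0}
    print(silent_endpoints(heartbeats, now=105.0))
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.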
21.
Critical time
Critical time = The entire time taken to process a message successfully
22.
• Critical Time is the total duration from when a message is created until it has been processed
Critical Time = Time in Queue +
Processing Time +
Retry Time +
Network Time
Critical Time
Stable or decreasing could be good
Increasing is bad
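The relationship between Critical Time and Processing Time can be shown with three timestamps per message. This is a minimal sketch: the `MessageTimings` class and its field names are hypothetical, and the time in queue, retry time, and network time are lumped together as whatever remains once processing time is subtracted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class MessageTimings:
    created_at: datetime           # when the sender created the message
    handler_started_at: datetime   # start of the successful handler run
    handler_finished_at: datetime  # end of the successful handler run


def critical_time(t: MessageTimings) -> timedelta:
    """Total duration from message creation to successful processing."""
    return t.handler_finished_at - t.created_at


def processing_time(t: MessageTimings) -> timedelta:
    """Duration of the successful handler invocation only."""
    return t.handler_finished_at - t.handler_started_at


def waiting_time(t: MessageTimings) -> timedelta:
    """Queue + retry + network time: everything that is not processing."""
    return critical_time(t) - processing_time(t)


if __name__ == "__main__":
    timings = MessageTimings(
        created_at=datetime(2024, 1, 1, 12, 0, 0),
        handler_started_at=datetime(2024, 1, 1, 12, 0, 4),
        handler_finished_at=datetime(2024, 1, 1, 12, 0, 5),
    )
    print(critical_time(timings), processing_time(timings), waiting_time(timings))
```

When the sender and receiver run on different machines, clock drift between them will skew a measurement like this.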
23.
Putting these together
• Each of these metrics presents a piece of the puzzle
• Look at them from an endpoint’s perspective, not per message
• Looking at them together gives great insight into your system
[Charts: Critical Time, Processing Time, Queue Length]
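Looking at the metrics per endpoint rather than per message could be done by aggregating individual samples. The sketch below is illustrative only: the function name `summarize_by_endpoint`, the tuple layout of the samples, and the choice of a simple average are assumptions, not part of the talk.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_endpoint(samples):
    """Aggregate per-message timings into per-endpoint averages.

    samples is an iterable of (endpoint, critical_seconds, processing_seconds)
    tuples, one per processed message.
    """
    grouped = defaultdict(list)
    for endpoint, critical, processing in samples:
        grouped[endpoint].append((critical, processing))
    return {
        endpoint: {
            "messages": len(rows),
            "avg_critical_time": mean([c for c, _ in rows]),
            "avg_processing_time": mean([p for _, p in rows]),
        }
        for endpoint, rows in grouped.items()
    }


if __name__ == "__main__":
    report = summarize_by_endpoint([
        ("Sales", 2.0, 0.5),
        ("Sales", 4.0, 1.5),
        ("Billing", 10.0, 2.0),
    ])
    for name, stats in report.items():
        print(name, stats)
```

An endpoint whose average critical time is far above its average processing time is spending most of that time outside the handler, which is where queue length becomes the next metric to check.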
24.
Keeping your eye on everything
• These 5 metrics can give a lot of insight
• Some individual metrics are meaningful
• But most tell a story when combined with others
• Let the monitoring philosophy guide what you focus on
• NServiceBus already provides a lot of these metrics for you!
• Letting you focus on monitoring the metrics that impact your business
25.
Learn more
• Try NServiceBus + the Particular Service Platform
• https://docs.particular.net/tutorials/quickstart/
• Take a look at the NServiceBus.Metrics NuGet package
• Follow us to find out about the next webinar in the series!