Latency: The Silent Monitoring System Killer

Slides from a lightning talk presented at the January Sydney DevOps Meetup.

Latency: The Silent Monitoring System Killer Presentation Transcript

  • LATENCY: THE SILENT MONITORING SYSTEM KILLER (Saturday, 21 January 2012)
  • #MONITORINGSUCKS
  • SCALABILITY
  • execute large volumes of monitoring checks under a variety of conditions (good + bad) with a consistent throughput
  • CONSISTENT THROUGHPUT
  • SHORT CHECK EXECUTION TIME
  • WHAT INTRODUCES VARIABILITY?
  • LATENCY INTRODUCED BY SYNCHRONOUS CALLS
  • [diagram legend] one symbol = 1 monitoring check
  • 150 monitoring checks; each executed every 300 seconds; each takes 1 second; checks are executed serially
  • all checks executed in 150 seconds; monitoring system at 50% capacity
  • DOUBLE THE CHECKS
  • all checks executed in 300 seconds; monitoring system at 100% capacity
  • DOUBLE THE EXECUTION TIME
  • all checks executed in 600 seconds; monitoring system at 200% capacity; only 50% of the checks are “on time”
  • CHECK LATENCY
  • HOW DO WE FIX THIS!?
  • PERFORMANCE ANALYSIS!
  • AN ANALOGY
  • MONITORING CHECK == “ACTION” ON MVC WEB APP
  • SEPARATE DATA COLLECTION FROM THRESHOLDING AND NOTIFICATIONS
  • THIS SHIFTS LATENCY
  • IT DOES NOT ELIMINATE IT!
  • RRDTOOL IS EVIL
  • USE SOMETHING BETTER!
  • USE OPENTSDB OR GANGLIA WITH CHECK_TSDB OR CHECK_GMOND
  • A DIFFERENT SET OF PROBLEMS
  • STORAGE WILL GO AWAY
  • CHAOS WILL ENSUE
  • PAGERS WILL MELT
  • SET UP “META-PARENTING”
  • BUILD A KILL SWITCH
  • READ MORE ABOUT THIS:
  • bit.ly/yN4mdy
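
The capacity arithmetic from the serial-execution slides (150 checks, a 300-second interval, 1 second per check) can be reproduced with a small model. This is only a sketch: the `SerialScheduler` class and its method names are hypothetical, and it assumes one worker running checks strictly one after another, as the slides describe.

```python
from dataclasses import dataclass


@dataclass
class SerialScheduler:
    """Hypothetical model of a scheduler that runs checks one at a time."""

    checks: int          # number of monitoring checks
    interval_s: float    # how often each check is supposed to run
    exec_time_s: float   # how long a single check takes

    def cycle_time_s(self) -> float:
        # Serial execution: one full pass costs the sum of all check run times.
        return self.checks * self.exec_time_s

    def utilization(self) -> float:
        # Share of the interval consumed by one full pass over all checks.
        return self.cycle_time_s() / self.interval_s

    def on_time_fraction(self) -> float:
        # Share of checks that can finish inside the interval.
        return min(1.0, self.interval_s / self.cycle_time_s())


# The three scenarios from the slides.
scenarios = {
    "baseline": SerialScheduler(150, 300.0, 1.0),
    "double the checks": SerialScheduler(300, 300.0, 1.0),
    "double the execution time": SerialScheduler(300, 300.0, 2.0),
}

for name, s in scenarios.items():
    print(
        f"{name}: {s.cycle_time_s():.0f}s per pass, "
        f"{s.utilization():.0%} capacity, "
        f"{s.on_time_fraction():.0%} on time"
    )
```

Running it gives 150s / 50% / 100% for the baseline, 300s / 100% / 100% with twice the checks, and 600s / 200% / 50% once execution time also doubles, matching the figures on the slides.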
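
The advice to separate data collection from thresholding and notifications might look roughly like the following. It is a minimal sketch under stated assumptions: the in-memory `MetricStore` stands in for a real store such as OpenTSDB or Ganglia, `collect` and `check` are hypothetical names rather than the API of check_tsdb or check_gmond, and the status codes follow the common Nagios plugin convention. The staleness test illustrates the point that latency is shifted into the collector rather than eliminated.

```python
import time
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for a metrics store: name -> (timestamp, value).
MetricStore = Dict[str, Tuple[float, float]]

# Status codes following the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def collect(store: MetricStore, name: str, value: float,
            now: Optional[float] = None) -> None:
    """Collector side: record a sample. Slow, synchronous work belongs here."""
    store[name] = (time.time() if now is None else now, value)


def check(store: MetricStore, name: str, warn: float, crit: float,
          max_age_s: float = 60.0, now: Optional[float] = None) -> int:
    """Check side: a cheap lookup and comparison, with no call to the monitored host."""
    now = time.time() if now is None else now
    sample = store.get(name)
    if sample is None:
        return UNKNOWN  # nothing collected yet
    timestamp, value = sample
    if now - timestamp > max_age_s:
        return UNKNOWN  # stale data: the collector is the one running late
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


store: MetricStore = {}
collect(store, "load1", 1.5, now=1000.0)
print(check(store, "load1", warn=1.0, crit=2.0, now=1010.0))  # fresh sample
print(check(store, "load1", warn=1.0, crit=2.0, now=2000.0))  # stale sample
```

The check now runs in roughly constant time regardless of how slow the monitored host is, which is what keeps throughput consistent; the cost is that the store becomes a dependency of every check.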
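
The slides do not say how the kill switch is built. One possible reading, sketched here purely as an assumption, is a flag file that an operator can create to suppress all outgoing notifications when the metrics store disappears and every check starts alerting at once. The `make_notifier` helper and the `notifications.disabled` file name are hypothetical.

```python
import os
import tempfile
from typing import Callable, List


def make_notifier(send: Callable[[str], None],
                  kill_switch_path: str) -> Callable[[str], bool]:
    """Wrap a send function so it is skipped while the kill-switch file exists."""

    def notify(message: str) -> bool:
        if os.path.exists(kill_switch_path):
            return False  # kill switch engaged: suppress the notification
        send(message)
        return True

    return notify


# Demo with a list standing in for a pager gateway.
sent: List[str] = []
switch = os.path.join(tempfile.mkdtemp(), "notifications.disabled")
notify = make_notifier(sent.append, switch)

notify("disk full on db1")   # delivered
open(switch, "w").close()    # operator flips the kill switch
notify("disk full on db2")   # suppressed
print(sent)
```

A file on disk is used only because it needs no extra infrastructure and still works when the metrics store is down; any flag that lives outside the failing component would serve the same purpose.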