Nagios Conference 2012 - Alexis Le Quoc - Deep Dive into Nagios Analytics

Alexis Le Quoc's presentation, "A Deep Dive into Nagios Analytics".
The presentation was given during the Nagios World Conference North America held Sept 25-28th, 2012 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna

Published in: Technology

  1. 1. A Deep Dive into Nagios Analytics. Alexis Lê-Quôc (@alq), http://datadoghq.com
  2. 2. @alq: Dev & Ops, Nagios user since 2008, Datadog co-founder
  3. 3. A little survey
  4. 4. Top 3 failed checks
  5. 5. Top 3 failed checks: That woke me up. That I responded to last week. That I responded to 5 weeks ago. That most of my team responded to at least once. That impacts our business the most?
  6. 6. Top 3 failed checks: That woke me up. That I responded to last week. That I responded to 5 weeks ago. That most of my team responded to at least once. That impacts our business the most?
  7. 7. Using memory to prioritize remediation... At best, finding local optimums. At worst, Brownian motion.
  8. 8. Analytics
  9. 9. Performance Metrics, Nagios, Traffic, Other Sources, In the “Cloud”
  10. 10. Nagios: a “chatty” source, one of the 40+ that Datadog supports
  11. 11. One example
  12. 12. Almost 13,000 Nagios “events” over the past week
  13. 13. Constant stream
  14. 14. 86 notifications!
  15. 15. Pattern
  16. 16. Pattern
  17. 17. More data? More questions.
  18. 18. A dialog with data. Not a scientific study.
  19. 19. Population: 25%: 20, 50%: 93, 75%: 322, 100%: 904
  20. 20. Does size matter?
  21. 21. Weekly count per host split by quartile
  22. 22. Weekly count per host split by quartile. Outliers: sick hosts, silenced checks
  23. 23. Notifications
  24. 24. Notifications: 1-3% of alerts notify. Little difference per quartile.
  25. 25. Does time of day matter?
  26. 26. Mean about the same across quartiles. Time-based deviation?
  27. 27. Does the day of week matter?
  28. 28. Not really
  29. 29. Squeaky wheels? (checks)
  30. 30. Outlier
  31. 31. Outlier in more detail
  32. 32. Long Tail
  33. 33. Squeaky wheel? (hosts)
  34. 34. Same outlier
  35. 35. Similar pattern as checks
  36. 36. Long Tail
  37. 37. Recurring alerts
  38. 38. Axes: Happens often vs. Seldom happens; Young vs. Old
  39. 39. Tolerated: occur often, for a long time. Happen once in a while.
  40. 40. More data? More questions.
  41. 41. HOWTO?
  42. 42. Awk, R, d3, Postgres. Find out tomorrow!
  43. 43. Presentation matters
  44. 44. Take-away?
  45. 45. Take-aways
        • Don’t rely on your memory
        • Your Nagios logs are a treasure trove
        • Have a dialog with your data
        • Presentation matters
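
The slides name Awk, R, d3 and Postgres as the tools but show no code. As a rough illustration of the kind of questions asked above (alerts per host, quartile cut points, share of alerts that notify), here is a minimal Python sketch. It is not the speaker's method. It assumes log lines in the usual Nagios form "[epoch] SERVICE ALERT: host;service;state;..." and "[epoch] SERVICE NOTIFICATION: contact;host;...". The names parse_line and summarize are hypothetical, chosen for this example only.

```python
import re
import statistics
from collections import Counter

# Assumed line shape: "[epoch] SERVICE|HOST ALERT|NOTIFICATION: f1;f2;..."
LINE_RE = re.compile(r"^\[(\d+)\] (SERVICE|HOST) (ALERT|NOTIFICATION): (.*)$")


def parse_line(line):
    """Return (epoch, kind, host) for alert/notification lines, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    epoch, _scope, kind, rest = match.groups()
    fields = rest.split(";")
    # ALERT lines start with the host; NOTIFICATION lines start with the contact.
    host_index = 0 if kind == "ALERT" else 1
    if len(fields) <= host_index:
        return None
    return int(epoch), kind, fields[host_index]


def summarize(lines):
    """Count alerts per host, quartile cut points, and notifications per alert."""
    alerts_per_host = Counter()
    notifications = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _epoch, kind, host = parsed
        if kind == "ALERT":
            alerts_per_host[host] += 1
        else:
            # One log line per notified contact, so this can overcount
            # when several contacts are notified for the same alert.
            notifications += 1
    counts = sorted(alerts_per_host.values())
    total_alerts = sum(counts)
    # statistics.quantiles needs at least two data points.
    cuts = statistics.quantiles(counts, n=4) if len(counts) >= 2 else []
    ratio = notifications / total_alerts if total_alerts else 0.0
    return {
        "alerts_per_host": alerts_per_host,
        "quartile_cuts": cuts,
        "notification_ratio": ratio,
    }
```

Calling summarize(open(path)) on a week of logs, where path is wherever your nagios.log lives, gives per-host counts that can then be bucketed by the quartile cut points. Anything that is not a plain alert or notification line (state dumps, flapping messages, external commands) is skipped.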
