Data Science: The Solution to #monitoringsucks

Ever since the #monitoringsucks trend kicked off a conversation about the state of monitoring tools in 2011, there has been a flurry of activity resulting in new solutions, improved tools, and applications generating tons of data. However, we still face the same issues almost four years later. Alerts still generate far too much noise to be useful. Dashboards aren’t actionable and require human interpretation. The sheer volume of log, time series, and other data makes it difficult to collate, visualize, and interpret that data in the mythical single pane of glass. How do we definitively solve these problems?

Data science. Using advances in data science and machine learning that are already being applied to “sexy” problems at companies around the globe, we can finally reach a tipping point when it comes to #monitoringsucks issues. New data science tools can pinpoint problems before they hit a static threshold, group alerts from a variety of sources into a single logical error, and prevent eye strain from studying hundreds of graphs. In this talk, I will discuss the virtues, and pitfalls, of new monitoring entrants like Kale from Etsy, Bosun from StackExchange, and Twitter’s open source R package AnomalyDetection.
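
To make "pinpoint problems before they hit a static threshold" concrete, here is a minimal Python sketch. It is an illustration only: it uses a plain rolling z-score, which is an assumption chosen for brevity and not the algorithm used by Kale, Bosun, or AnomalyDetection. The function names, the window size, the z-score limit, and the latency samples are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def static_threshold_alerts(values, limit):
    """Return the indices of values above a fixed limit."""
    return [i for i, v in enumerate(values) if v > limit]


def rolling_zscore_alerts(values, window=10, z_limit=3.0):
    """Return the indices of values that deviate strongly from recent history.

    Each value is compared with the mean and standard deviation of the
    previous `window` values. Nothing is flagged until the window is full,
    and a perfectly flat window (standard deviation of zero) is skipped.
    """
    history = deque(maxlen=window)
    alerts = []
    for i, v in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(v - mu) / sigma > z_limit:
                alerts.append(i)
        history.append(v)
    return alerts


# Hypothetical latency samples in milliseconds with one spike at index 10.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 160, 100, 99]

# A static limit of 500 ms never fires on this data.
print(static_threshold_alerts(latency_ms, limit=500))  # -> []

# The rolling baseline flags the spike even though it is far below 500 ms.
print(rolling_zscore_alerts(latency_ms))  # -> [10]
```

A detector like this trades one tuning problem (picking a fixed limit) for another (picking a window and a sensitivity), which is one of the pitfalls worth keeping in mind when evaluating such tools.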

Published in: Data & Analytics

  1. 1. Data Science: The Solution to #monitoringsucks #DevOpsDays Amsterdam 2015
  2. 2. Who am I? ● Operations Engineer @ STYLIGHT GmbH
  3. 3. Who am I? ● Operations Engineer @ STYLIGHT GmbH ● 11+ years as a System Administrator
  4. 4. Who am I? ● Operations Engineer @ STYLIGHT GmbH ● 11+ years as a System Administrator ● Tired of being woken up at 2am by false positives
  5. 5. Who am I? ● Operations Engineer @ STYLIGHT GmbH ● 11+ years as a System Administrator ● Tired of being woken up at 2am by false positives ● It all started with a 14.4Kbps modem
  6. 6. Your online magazine for fashion, beauty, stars, and shopping. STYLIGHT
  7. 7. STYLIGHT
  8. 8. #monitoringsucks: A Brief History ● Started in 2011 ● Loosely organized movement to address the shortcomings of monitoring tools ● Spawned an IRC channel and GitHub repo linking to available tools
  9. 9. Why Does Monitoring Still Suck? ● Alerts generate far too much noise to be useful
  10. 10. Why Does Monitoring Still Suck? ● Alerts generate far too much noise to be useful ● Dashboards aren’t actionable and require human interpretation
  11. 11. Why Does Monitoring Still Suck?
  12. 12. Why Does Monitoring Still Suck?
  13. 13. Why Does Monitoring Still Suck? ● Alerts generate far too much noise to be useful ● Dashboards aren’t actionable and require human interpretation ● Volume of data makes it difficult to collate, visualize, and interpret
  14. 14. How does data science help us? ● Finding relationships and patterns in data ● Predictive analysis ● Anomaly detection in large datasets ● Natural language processing to process and understand unstructured data
  15. 15. What does data science mean to me? ● Pinpoint problems before they hit a static threshold
  16. 16. What does data science mean to me? ● Pinpoint problems before they hit a static threshold ● Group alerts from a variety of sources into a single logical event
  17. 17. What does data science mean to me? ● Pinpoint problems before they hit a static threshold ● Group alerts from a variety of sources into a single logical event ● Prevent eye strain from studying hundreds of graphs
  18. 18. What are the tools of the future? ● Kale - Etsy ● Bosun - StackExchange ● AnomalyDetection - Open source R package from Twitter
  19. 19. Kale ● Composed of Skyline & Oculus
  20. 20. Kale ● Composed of Skyline & Oculus ● Skyline is an anomaly detection system
  21. 21. Kale
  22. 22. Kale ● Composed of Skyline & Oculus ● Skyline is an anomaly detection system ● Oculus is the anomaly correlation component of the Kale system
  23. 23. Kale
  24. 24. Bosun ● Monitoring and alerting system by Stack Exchange
  25. 25. Bosun ● Monitoring and alerting system by Stack Exchange ● Domain Specific Language for alerts and notifications
  26. 26. Bosun ● Monitoring and alerting system by Stack Exchange ● Domain Specific Language for alerts and notifications ● Backtest your alerts against historical data
  27. 27. AnomalyDetection ● Open-source R package created by Twitter
  28. 28. AnomalyDetection ● Open-source R package created by Twitter ● Detects anomalies in time series data and numerical vectors
  29. 29. AnomalyDetection ● Open-source R package created by Twitter ● Detects anomalies in time series data and numerical vectors ● Provides visualization support
  30. 30. AnomalyDetection
  31. 31. Recap
  32. 32. STYLIGHT.COM/JOBS JOIN US
  33. 33. Let’s Get In Touch @patrickroelke @codetailors patrick.roelke@stylight.com patrickroelke.com
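
Slides 16 and 17 mention grouping alerts from a variety of sources into a single logical event. One very simple way to approximate that idea is to merge alerts that arrive close together in time. The Python sketch below assumes exactly that; the `Alert` type, the `group_alerts` function, the 60-second gap, and the sample alerts are hypothetical and are not taken from Kale, Bosun, or AnomalyDetection.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str       # e.g. the monitoring tool that raised the alert
    timestamp: float  # seconds since some reference point
    message: str


def group_alerts(alerts, max_gap=60.0):
    """Group alerts into events by time proximity.

    Alerts are sorted by timestamp. An alert joins the current event if it
    arrives within `max_gap` seconds of the previous alert in that event;
    otherwise it starts a new event.
    """
    events = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if events and alert.timestamp - events[-1][-1].timestamp <= max_gap:
            events[-1].append(alert)
        else:
            events.append([alert])
    return events


# Hypothetical alerts from three different sources.
sample_alerts = [
    Alert("load-balancer", 50.0, "5xx rate elevated"),
    Alert("database", 0.0, "replication lag high"),
    Alert("app-server", 30.0, "request latency high"),
    Alert("disk-check", 500.0, "volume 90% full"),
]

for event in group_alerts(sample_alerts):
    print([a.source for a in event])
```

Time proximity alone will also merge unrelated alerts that happen to coincide, so a real system would likely need extra signals such as host, service, or metric correlation.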
