From MonitoringSucks to Monitoring Love, 2016 Edition

FlossUK 2016, London


  1. From #MonitoringSucks to #MonitoringLove
     Open Source Monitoring in 2016
     @KrisBuytaert, FlossUK 2016, London, UK
  2. Kris Buytaert
     ● I used to be a Dev,
     ● Then became an Op
     ● Chief Trolling Officer and Open Source Consultant @inuits.eu
     ● Everything is an effing DNS Problem
     ● Building Clouds since before the bookstore
     ● Organising Conferences
     ● Evangelizing devops
  3. An opinionated talk about the Open Source Monitoring tooling landscape
     In which I hope to learn from YOU
  4. #devops =~ C(L)AMS
     ● Culture
     ● (Lean)
     ● Automation
     ● Monitoring and Measurement
     ● Sharing
     Damon Edwards and John Willis, Gene Kim
  5. Monitoring is usually an afterthought
     ENOBUDGET, ENOTIME
  6. A 2008 OLS Paper
     ● We have bloated Java tools
     ● Some open core stuff
     ● DIY folks want traditional Nagios
     ● DBA Required
  7. #monitoringsucks
     ● John Vincent (@lusis), June 2011
     ● A sub #devops movement
     ● https://github.com/monitoringsucks/
  8. Why #monitoringsucks
     ● Manual config (GUI)
     ● Not in sync with reality
     ● Hosts only
     ● Services sometimes
     ● Application never
     ● Chaos or out of sync with reality
     ● Alert Fatigue
  9. #monitoringlove
     • Ulf Mansson, #devopsdays Rome 2011
     • A new era of tooling
     • #monitoringlove hacksessions @inuits
     • #monitorama
  10. What we want
     ● Small, well suited components
       • Collect
       • Transport / Mangle
       • Store
       • Analyse
       • Act / Alert
       • Visualize
  11. #monitoringlove
     But the love was about:
  12. Sensu
     ● Awesome for non-static environments
     ● Scaling a clustered RabbitMQ?
     ● This is Europe, U no do cloud
  13. Automation of #monitoring brought back the #love
  14. Automation
  15. Monitoring a service vs Monitoring a Service
  16. Definition of done: monitored and in production
  17. A software project is not done until your last end user is dead
  18. Culture, Automation, Measurement: measure all the things, Sharing
  19. CollectD all the metrics, at high intervals
  20. Oldschool Graphite
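
For readers who have not fed Graphite by hand: carbon accepts a plaintext protocol of one `path value timestamp` line per datapoint, by default on TCP port 2003. The sketch below is not from the slides. The metric path and the `graphite.example.org` host are made-up placeholders.

```python
import socket
import time


def format_graphite_line(path, value, timestamp=None):
    """Format one datapoint as 'path value timestamp\\n' (Graphite plaintext protocol)."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(lines, host="graphite.example.org", port=2003):
    """Send preformatted lines to a carbon plaintext listener.

    The host name is a hypothetical placeholder; 2003 is carbon's usual
    plaintext port, but check your own carbon configuration.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

In practice collectd's Graphite output does this for you; writing the line by hand is mostly useful for one-off metrics from scripts.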
  21. Graphite++
     ● API
     ● Dashboards
       • Grafana
       • Gdash
     ● Engines:
       • InfluxDB
       • Cyanite
  22. Draw as Infinite
     ● Time To Deploy
     ● Deploy Frequency
     ● Lifecycle frequency
     ● Map to other metrics
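
A rough illustration of the idea, assuming deploys are recorded as a Graphite metric: the render API's `drawAsInfinite()` function draws each non-zero datapoint as a vertical line, so deploy events can be overlaid on any other series. The base URL and metric names below are hypothetical.

```python
from urllib.parse import urlencode


def deploy_overlay_url(base_url, metric, deploy_event_metric, window="-24h"):
    """Build a Graphite render URL that overlays deploy events on a metric.

    Both metric names are whatever you chose to send to Graphite; nothing
    here is prescribed by the talk.
    """
    params = [
        ("target", metric),
        # Non-zero datapoints of the event series become vertical lines.
        ("target", f"drawAsInfinite({deploy_event_metric})"),
        ("from", window),
        ("format", "png"),
    ]
    return f"{base_url.rstrip('/')}/render?{urlencode(params)}"


url = deploy_overlay_url(
    "http://graphite.example.org",
    "servers.web01.load.shortterm",
    "events.deploy.webshop",
)
```

Mapping deploys onto load, error rate or response time this way makes "what changed just before it broke" visible at a glance.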
  23. Graph-Explorer
     ● (Vimeo)
     ● Metrics 2.0
     ● Add Events
  24. Grafana
  25. Graphs to Knowledge
     • Skyline
     • Oculus
     • Creating information out of this data
     • Big data
     • Machine Learning
  26. Aggregation
     ● Alert on streams
     ● Alert on aggregated metrics
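
One way to read "alert on aggregated metrics", sketched with the standard library only: average a value across all hosts per interval, then fire only when the rolling mean of those averages stays above a threshold. The class name, threshold and window size are illustrative assumptions, not something the slides specify.

```python
from collections import deque
from statistics import mean


class AggregateAlert:
    """Fire when the cluster-wide average stays above a threshold."""

    def __init__(self, threshold, window=5):
        self.threshold = threshold
        # Keep only the most recent `window` per-interval averages.
        self.samples = deque(maxlen=window)

    def observe(self, per_host_values):
        """Record one interval of {host: value}; return True if the alert fires."""
        self.samples.append(mean(per_host_values.values()))
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and mean(self.samples) > self.threshold
```

A single noisy host or a single bad interval does not page anyone here, which is the point: fewer alerts on individual checks, more on the behaviour of the service as a whole.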
  27. Riemann
     ● I still don't get it?
     ● Distributed Top
     ● Do you like Clojure?
     ● Riemann Health plugin?
     ● s/riemann-health/collectd/g;
     ● Output to Graphite
  28. Prometheus
     ● Started 2012
     ● SoundCloud
     ● Metrics based
     ● Scrapes endpoints
       • Existing endpoints for limited tools
     ● Graphite Exporter
     ● Push Gateway
     ● Great alerting
     ● Might need some coding
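
To give a feel for the "might need some coding" part: a scrape target is just an HTTP endpoint (conventionally `/metrics`) returning Prometheus' text exposition format. The helper below is a simplified sketch that only renders that text for a gauge. The function name and the metric are invented, label values are not escaped, and the official client libraries are the better choice for real use.

```python
def render_gauge(name, help_text, samples):
    """Render a gauge in Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Simplification:
    label values are emitted as-is, without escaping.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


body = render_gauge("queue_depth", "Jobs waiting", [({"queue": "mail"}, 3)])
```

Serving `body` from any web framework is enough for Prometheus to scrape it. For things that cannot be scraped, the Push Gateway and the Graphite Exporter listed above fill the gap.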
  29. But I have log files..
  30. Logs and Metrics
     ● Graylog2
     ● ELSA (Enterprise Log Search and Archive)
     ● ELK Stack
  31. ● Collect from anywhere
     ● Filter
     ● Send anywhere
  32. Infinite Diskspace?
     ● Logstash output
       • Statsd => Graphite
       • Keep patterns around,
       • Selectively purge data
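
What travels between the log pipeline and Graphite in that setup is tiny: statsd takes `bucket:value|type` datagrams over UDP (8125 is the usual default port) and flushes aggregates to Graphite. The bucket names below are invented for illustration; in a real setup Logstash's statsd output emits these for you.

```python
import socket


def statsd_counter(bucket, count=1):
    """Format a statsd counter increment, e.g. one per matching log line."""
    return f"{bucket}:{count}|c"


def statsd_timer(bucket, millis):
    """Format a statsd timing sample in milliseconds."""
    return f"{bucket}:{millis}|ms"


def send_statsd(packet, host="127.0.0.1", port=8125):
    """Fire-and-forget a single statsd datagram over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet.encode("ascii"), (host, port))
```

Counting events this way keeps the trend in Graphite long after the raw log lines have been purged.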
  33. APM
     But what about my apps?
     Half the world cheers about SaaS tools :(
  34. Packetbeat
     ● Traffic flow through network
     ● Transactions causing errors
     ● SQL per HTTP
     ● API call usage
  35. Old Packetbeat
  36. Beats?
     ● Elastic.co
     ● Collect, Parse and Ship
     ● Q: Is all the data you care about suitable for Elasticsearch?
     ● What about Long Term Storage?
     ● Do you even want to build alerting from this?
  37. Checking for Failure
     ● Icinga
       • Automated config generation
     ● Sensu
       • Cloudstyle
     ● Prometheus
       • Metric based
  38. Waking you up at night
     ● Flapjack (flapjack.io): monitoring notification routing + event processing system
     ● OpenDuty (github.com/szechuen/OpenDuty): duty management
  39. Aggregating
     ● Thruk
     ● Grafana
     ● Dashing
  40. Our Current Stack
  41. I love where Monitoring is heading
     We have far fewer false positives these days
  42. Contact
     Kris Buytaert, kris.buytaert@inuits.eu
     Further Reading
     @krisbuytaert
     http://www.krisbuytaert.be/blog/
     http://www.inuits.eu/
     Find Inuits in Brasschaat, Ghent, Rotterdam, Prague, Kiev, Brno