DevOpsCon - Listen to your infrastructure

Monitoring is essential to understand what you are doing and to predict possible patterns. It doesn't really matter whether you are monitoring your infrastructure or your application; what you need is a strong and stable system so that you can trust your changes and your ecosystem. InfluxDB offers an entire stack to collect and store data and to alert on it.

Published in: Software

  1. Listen to your infrastructure. Gianluca Arbezzano, Software Engineer @CurrencyFair
  2. I am an open source developer. I am involved as a speaker, maintainer, and contributor in different projects and communities. https://twitter.com/gianarb https://github.com/gianarb http://gianarb.it
  3. Drive your boat like a Captain. This ebook guides you through managing Docker in production. http://scaledocker.com
  4. MONITORING: because we need to trust someone
  5. 1. To understand what is happening
  6. 2. To predict the future
  7. Because we are not John
  8. Badoo migrated to PHP 7
  9. @dgryski tested a Go service for 10 minutes (it was previously written in Perl)
  10. Sometimes you just need to compare
  11. WIDESPREAD MONITORING TOOL: tail -f /var/log/nginx/live.access.log
  12. 2016/04/15 15:42:46 [warn] 2330#0: *167 using uninitialized variable, client: 10.0.1.1, server: localhost.dev, request: "POST /auth HTTP/1.1", host: "localhost"
      2016/04/15 15:44:44 [error] 2330#0: *171 FastCGI sent in stderr: " PHP message: PHP Fatal error: Uncaught exception 'RuntimeException' with message 'All broken)[500]' in /var/www/my/project.php:237
      Stack trace:
      #0 /var/www/index.php:45 ObjectService->flush()
      #1 [internal function] ->save()
  13. We are here to talk about Time Series: [ { "name": "log_lines", "columns": ["time", "line"], "point": [1400425947368, "here's some useful log info"] } ]
  14. EASY! EASY! EASY! { "name": "cpu_percent_use", "columns": ["value"], "point": 40 }
  15. Time is a perfect sharding key. It means that time series scale really well.
  16. InfluxDB ▸ Optimized to work with time series data ▸ Open source ▸ Big community and huge ecosystem
  17. Easy
      wget https://dl.influxdata.com/influxdb/releases/influxdb_1.0.0_amd64.deb
      sudo dpkg -i influxdb_1.0.0_amd64.deb
      influxd -config /usr/local/etc/influxdb.conf
  18. Easy ▸ HTTP server ▸ UDP server ▸ Admin Panel
  19. Line Protocol, designed to be smart and slim
      [key] [fields] [timestamp]
      temperature,machine=unit internal=3,external=10 1434055562000000035
  20. SQL-like
      SELECT value FROM cpu_load_short WHERE region='us-west'
  21. UDP vs TCP protocol
      CorleyBenchmarksInfluxDBAdapterEvent
      Method Name               Iterations  Average Time     Ops/second
      ------------------------  ----------  ---------------  ------------
      sendDataUsingHttpAdapter  1,000       0.0026700308323  374.52751
      sendDataUsingUdpAdapter   1,000       0.0000436344147  22,917.69026
  22. Continuous Query
      CREATE CONTINUOUS QUERY minnie ON world BEGIN SELECT min(mouse) INTO min_mouse FROM zoo GROUP BY time(30m) END
  23. T-shirts time!
  26. Telegraf (https://github.com/influxdata/telegraf): a collector that grabs data from different sources and sends it to InfluxDB and other databases. Based on an input and output plugin system.
  27. Telegraf Plugins
  28. Kapacitor (https://github.com/influxdata/kapacitor): a framework for processing, monitoring, and alerting on time series data. Trigger notifications and take action in case of specific behaviors.
  29. Kapacitor high CPU alert
      stream
          |from()
              .measurement('cpu_usage_idle')
              .groupBy('host')
          |window()
              .period(1m)
              .every(1m)
          |mean('value')
          |eval(lambda: 100.0 - "mean")
              .as('used')
          |alert()
              .message('{{ .Level}}: {{ .Name }}/{{ index .Tags "host" }} has high cpu: {{ index .Fields "used" }}')
              .warn(lambda: "used" > 70.0)
              .crit(lambda: "used" > 85.0)
              // Send alert to handler of choice.
              // Slack
              .slack()
              .channel('#alerts')
              // PagerDuty
              .pagerDuty()
  30. Demo: https://github.com/gianarb/tick-php
  31. When you start working with "micro"services, understanding the topology of your connections is really important. Time series can help you.
  32. Why InfluxDB and not something else? https://www.influxdata.com/influxdb-is-27x-faster-vs-mongodb-for-time-series-workloads/ ▸ 27x greater write throughput ▸ 84x less disk space
  33. That’s it! A series of great tools to monitor your applications and your infrastructure
  34. A monitoring system isn’t for all
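
The `[key] [fields] [timestamp]` layout on slide 19 can be illustrated with a minimal Python sketch. This is not code from the talk or from any InfluxDB client library: `format_point` and its helpers are hypothetical names, and the escaping and type rules follow the InfluxDB 1.x line protocol as commonly documented (integers carry an `i` suffix, strings are double-quoted, tags are sorted by key).

```python
def _escape(text, chars):
    """Backslash-escape every character of `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field_value(value):
    """Render one field value using the line protocol type conventions."""
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers need the trailing 'i'
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def format_point(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement[,tag=value...] field=value[,...] [timestamp]."""
    if not fields:
        raise ValueError("a point needs at least one field")
    # Key part: measurement name followed by the tags, sorted by tag key.
    key = _escape(measurement, ", ")
    for tag_key in sorted(tags):
        tag_value = _escape(str(tags[tag_key]), ",= ")
        key += "," + _escape(tag_key, ",= ") + "=" + tag_value
    # Field part: comma-separated name=value pairs in insertion order.
    field_set = ",".join(
        _escape(name, ",= ") + "=" + _format_field_value(value)
        for name, value in fields.items()
    )
    line = f"{key} {field_set}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"  # nanoseconds since the Unix epoch
    return line


if __name__ == "__main__":
    print(format_point(
        "temperature",
        {"machine": "unit"},
        {"internal": 3.0, "external": 10.0},
        1434055562000000035,
    ))
    # prints: temperature,machine=unit internal=3.0,external=10.0 1434055562000000035
```

A line built this way is what would be sent to the HTTP or UDP server listed on slide 18. The sketch only builds the string and leaves transport out.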
