Dynamic Infrastructure and Container Monitoring with Prometheus

Presentation on Prometheus, its role in DevOps, and how data science might help in this scenario.

Published in: Data & Analytics
  • Video, sources, and the original presentation can be found here: https://goettl79.github.io/pres17-infracoders-dynamic-infrastructure-monitoring-with-prometheus/

Dynamic Infrastructure and Container Monitoring with Prometheus

  1. Dynamic Infrastructure and Container Monitoring with Prometheus
     Georg Öttl, Infracoders Meetup, 2017, Graz
     Follow @goettl
  2. About me
     ● Enterprise software development
     ● Data science services
     ● Dev / DevOps / Ops
     ● Developer who likes math
     Twitter: @goettl
  3. Overview
     ● Monitoring
     ● Prometheus by example
     ● DevOps demo, scaling Gitblit
     ● Analyze Prometheus metrics like a data scientist
  4. Monitoring
  5. Why is monitoring a DevOps topic?
     ● Check functionality / performance
     ● Analyse behavior
     ● Insight into how the software works
     ● Trend analytics / resources
     You build it, you run it!
  6. Metrics, tracing, logging?
     Blog: Peter Bourgon, Metrics, Tracing and Logging
  7. Well-known monitoring tools
     ● Nagios, Check_MK
     ● OpenTSDB, Graphite
     ● InfluxDB + Kapacitor (similar to Prometheus)
     ● Elasticsearch + Logstash + Kibana + ...
     ● ...
     Hard to use in a DevOps stack
  8. Rule #1
     "Spend more time working on code that analyzes the meaning of metrics, than code that collects, moves, stores and displays metrics", Adrian Cockcroft
  9. Prometheus by example
  10. Demo: app scenario, scaling Gitblit
  11. Demo: exporter / endpoint (Gitblit)
      ...
      # TYPE jvm_memory_pool_bytes_max gauge
      jvm_memory_pool_bytes_max{pool="Code Cache",} 2.5165824E8
      jvm_memory_pool_bytes_max{pool="Metaspace",} -1.0
      jvm_memory_pool_bytes_max{pool="Compressed Class Space",} 1.073741824E9
      jvm_memory_pool_bytes_max{pool="PS Eden Space",} 1.320157184E9
      jvm_memory_pool_bytes_max{pool="PS Survivor Space",} 3.670016E7
      jvm_memory_pool_bytes_max{pool="PS Old Gen",} 2.793406464E9
      # HELP log4j_appender_total Log4j log statements at various log levels
      # TYPE log4j_appender_total counter
      log4j_appender_total{level="debug",} 0.0
      log4j_appender_total{level="warn",} 4.0
      log4j_appender_total{level="trace",} 0.0
      log4j_appender_total{level="error",} 1034.0
      log4j_appender_total{level="fatal",} 0.0
      log4j_appender_total{level="info",} 6049.0
      ...
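To make the endpoint output on slide 11 more concrete, here is a minimal sketch of how a consumer could read such lines into (name, labels, value) tuples. The function name `parse_samples` and both regular expressions are assumptions made for illustration; the sketch skips comment lines, ignores optional timestamps, and does not handle escaped quotes inside label values, so it is not a full parser for the exposition format.

```python
import re

# Matches lines such as: metric_name{label="value",} 1.0
# (hypothetical helper patterns, not part of any Prometheus library)
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>[^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith('#'):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # e.g. the "..." placeholders or lines with timestamps
        labels = {
            m.group('key'): m.group('val')
            for m in LABEL_RE.finditer(match.group('labels') or '')
        }
        samples.append((match.group('name'), labels, float(match.group('value'))))
    return samples
```

Applied to the Log4j lines above, this would yield one tuple per log level, which is enough to eyeball the data before pointing Prometheus at the endpoint.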
  12. Demo: Prometheus out-of-the-box functionality
      ● Scrape raw metrics
      ● Persist metrics
      ● Navigate data / PromQL
      ● Visualisation
  13. Demo: Prometheus advanced visualisation + navigation
      ● Grafana dashboards
      ● Navigation with labels
  14. Demo: monitoring as part of development
      ● Monitoring for verification of load tests
      ● Tests should trigger load similar to production
      ● DevOps is the best way to get high-quality data
      ● Alertmanager as Assert.that
  15. Demo: the admin part of Prometheus
      ● Prometheus time series database
      ● Integration with existing monitoring solutions
      ● How to scale Prometheus
      ● 11 integrations with container orchestrators (k8s, Marathon, DNS, ...)
  16. Whitebox instrumentation in Java
  17. How whitebox monitoring has been done so far
      ● JSON / CSV / SQL view, ...
      ● JMX
      ● Libraries with push hooks (e.g. Datadog, ...)
  18. Prometheus client instrumentation, example Gitblit
      ● Client instrumentation
      ● Default metrics for Log4j
      ● Default metrics for the JDK
      ● Custom metrics for git garbage collection, LDAP sync
  19. Prometheus client metrics: HTTP / servlet
      Configure the Gitblit servlet / Guice WebModule:
      bind(MetricsServlet.class).in(Scopes.SINGLETON);
      serve("/Prometheus").with(MetricsServlet.class);
      ... that's it ...
  20. Prometheus client metrics: JDK
      Register the default JDK metrics:
      DefaultExports.initialize();
      ... that's it ...
  21. Client metrics: Log4j
      Instrument the logger / Log4j:
      log4j.rootCategory=INFO, S, METRICS
      ...
      log4j.appender.METRICS = io.prometheus.client.log4j.InstrumentedAppender
      log4j.appender.METRICS.Append = false
      ... that's it ...
  22. Custom metrics
      private final Counter garbageCollectsTotal = Counter.build()
          .name("GIT_GARBAGE_COLLECTS_TOTAL")
          .help("Number of git garbage collects issued by Gitblit for a repository")
          .register();
      ...
      garbageCollectsTotal.inc();
      ... that's it ...
  23. What did we see?
      Whitebox monitoring won't work without developers!
  24. Analyze Prometheus metrics like a data scientist
  25. ... should I?
      Don't use deep learning and data science when a straightforward 15-minute rule-based system does well.
      Data science can help you detect patterns and facts in your metrics that you can't see.
  26. What is already available
      ● Great architecture to get high-quality data
      ● Numerical data
      ● Mathematical functions can be applied to it
      ● Easy and fast to navigate (PromQL)
      ● Alert / rule model
      ● Chart / histogram visualisation with Grafana
  27. When do I start?
      When you already have working alerts / dashboards that you want to improve
  28. Two ways to get data out of Prometheus
      ● HTTP API (poll)
        ● Exploratory data analysis
      ● Remote API (push)
        ● Streaming analysis
  29. HTTP API - /api/v1/query_range
      requests.get(
          url = 'http://127.0.0.1:9090/api/v1/query_range',
          params = {
              'query': 'sum({__name__=~".+"}) by (__name__,instance)',
              'start': '1502809554',
              'end' : '1502839554',
              'step' : '1m'
          })
      {"data": {..., "resultType": "matrix", "result": [{ "metric": {"method": "GET",...}, "values": [[1500008340,"3"], ... ]},...] }}
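For exploratory analysis, the "matrix" response from slide 29 usually has to be flattened into one row per sample first. The sketch below assumes the response body has already been decoded to a dict (for example with `response.json()`); the function name `matrix_to_rows` and the row layout are illustrative choices, not part of the Prometheus API.

```python
from datetime import datetime, timezone


def matrix_to_rows(payload):
    """Flatten a decoded query_range response into one dict per sample."""
    data = payload["data"]
    if data["resultType"] != "matrix":
        raise ValueError("expected a range-query (matrix) result")
    rows = []
    for series in data["result"]:
        labels = dict(series["metric"])
        for timestamp, value in series["values"]:
            rows.append({
                # Labels are kept in their own dict so that a label called
                # "time" or "value" cannot overwrite the sample fields.
                "labels": labels,
                "time": datetime.fromtimestamp(float(timestamp), tz=timezone.utc),
                # Sample values arrive as strings, e.g. "3".
                "value": float(value),
            })
    return rows
```

The resulting list of dicts can be handed to whatever analysis tool is in use; keeping the timestamps timezone-aware avoids surprises when joining with other data.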
  30. Normalize Prometheus data types
      ● Gauges and histograms are OK
      ● Counters have to be processed
      ● Raw counter values only grow and never repeat, so they carry no statistical value on their own
      ● Use e.g. a derivative function to convert a counter to a gauge equivalent
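The counter-to-gauge conversion from slide 30 can be sketched as a per-second difference between consecutive samples. This is a simplified stand-in, not Prometheus's own `rate()` (which also extrapolates to the window boundaries); the function name `counter_to_rate` is hypothetical, and the reset handling assumes a counter that restarts from zero.

```python
def counter_to_rate(samples):
    """Convert counter samples to per-second rates.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns a list of (timestamp_seconds, rate) with one entry fewer.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # ignore duplicate or out-of-order timestamps
        # A drop in value is treated as a reset: the counter restarted
        # from zero, so everything counted since then is the increase.
        increase = v1 - v0 if v1 >= v0 else v1
        rates.append((t1, increase / elapsed))
    return rates
```

The output behaves like a gauge: values can rise and fall, so means, quantiles and outlier tests become meaningful.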
  31. Example 1
      I can predict the latency of HTTP requests
      ● Can I use the Prometheus function predict_linear?
      ● Are other predictions possible?
      ↡↡ R Notebook predict_linear ↡↡
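As a rough illustration of the idea behind `predict_linear` on slide 31, the sketch below fits an ordinary least-squares line through the samples and extrapolates it. It is an assumption-laden approximation: the name `linear_forecast` is made up, and the forecast is measured from the last sample's timestamp, whereas Prometheus measures from the query evaluation time.

```python
def linear_forecast(samples, seconds_ahead):
    """Extrapolate a least-squares line fitted to (timestamp, value) samples.

    Returns the predicted value `seconds_ahead` seconds after the last sample.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    t_last = samples[-1][0]
    # Shift time so that the last sample sits at x = 0; this keeps the
    # numbers small and makes the intercept the fitted "current" value.
    xs = [t - t_last for t, _ in samples]
    ys = [v for _, v in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("all samples share one timestamp")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return intercept + slope * seconds_ahead
```

A straight line is only one possible model; for latency data with daily patterns, a seasonal model would be one of the "other predictions" the slide asks about.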
  32. Histograms, monitoring for the long tail
      histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{method="GET"}[1m])) by (le))
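To see what a quantile over cumulative `le` buckets means, here is a simplified sketch of the estimate: find the bucket that contains the requested rank and interpolate linearly inside it. The function name `bucket_quantile` is hypothetical, the sketch assumes non-negative bucket bounds with the last bound being +Inf, and it does not reproduce every edge case of the real `histogram_quantile`. In the query on slide 32 the counts would be per-second rates rather than integers, which does not change the arithmetic.

```python
import math


def bucket_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
             with math.inf as the last upper bound.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        raise ValueError("need at least two buckets, the last one being +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations at all
    rank = q * total
    # First bucket whose cumulative count reaches the requested rank.
    for index, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if upper == math.inf:
        # The rank falls beyond the highest finite bound, so that bound
        # is the best available answer.
        return buckets[-2][0]
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    if count == below:
        return upper  # empty bucket (only reachable when q is 0)
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)
```

The linear interpolation explains why bucket boundaries matter for the long tail: a 0.99 quantile can never be more precise than the width of the bucket it lands in.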
  33. Outlier detection algorithms
      https://github.com/twitter/AnomalyDetection
  34. Demo: export from Grafana
      ● Demo API
      ● Export into CSV
  35. Thanks for having me here at the Infracoders Meetup 2017!
      Questions?
      Georg Öttl, Twitter handle: @goettl
