Code instrumentation in Py with Prometheus and Grafana

  1. Code instrumentation in Py with Prometheus & Grafana
     Francois SCHMIDTS, Vlad ZLOTEANU (DOLEAD)
  2. Contents
     - Prometheus & Grafana
     - Code instrumentation example
     - 3 use cases
     - (Dolead’s) push client
  3. Prometheus + Grafana = ❤
     [Architecture diagram. Labels: Target 1, Target 2, Target N; Exporter; Instrumentation; Metrics retrieval (pulls); TimeSeriesDB (multidimensional data model); Querying (PromQL); Alert Manager; Grafana (queries); Dashboard]
  4. Prometheus
     - TSDB (time-series database)
     - Open source
     - Incubated by the CNCF (after Kubernetes)
     - Suited to VM/container monitoring
     - Autodiscovery
     - Pull model
     - Multidimensional data
     - Includes alerting
  5. Grafana
     - Open-source metric analytics / visualisation
     - Multiple providers: CloudWatch, Prometheus, InfluxDB, Elasticsearch, ..
     - Many dashboards already available
     - Works together with Prometheus exporters
  6. Node exporter + Grafana dashboard
  7. MongoDB exporter + Grafana dashboard
  8. Case study: RR Stats import
     ● Metric: duration of execution
     Labels:
     ● Result
       ○ success/failure
     ● Source
       ○ Google Ads, Fb Ads, Bing Ads, Taboola, etc.
     ● Category
       ○ Account, Campaign, Keyword, ..
       ○ Today vs Past
     ● Node
  9. Instrumentation - Code example
  10. Use case 1: Debugging / Gain insight
     "Where does the problem come from / What is going on?"
     ● Segment by source (Google Ads, Fb Ads, Bing Ads, Taboola, etc.)
       ○ Did they slow down? Has the error rate gone up? Are they unavailable?
     ● Segment by category
       ○ Did we introduce a bug in that code?
     ● Segment by node
       ○ Do I have a problem on that node?
  11. All successful stats downloads
  12. All successful stats downloads - vs Bing
  13. Use case 1: Debugging / Gain insight
     Combination with external data / corroboration
     - Deployments
     - CPU/RAM/load on the node
     - “Can we corroborate with a slow query increase in MongoDB?”
  14. Example: Sync activity vs machine load
  15. Use case 2: Alerting
     - Grafana alerts:
       - alerts based on configured data sources
     - Prometheus AlertManager:
       - can alert based on a PromQL query
       - Infrastructure as Code
     Instrument now, decide later
  16. Use case 2: Alerting - Example
  17. Use case 2: Alerting - Graph
  18. Use case 3: Trends / Scale
     ● Trends over time drive scale (technical) / business decisions
       ○ Capacity planning
       ○ "Will I (when will I) have a problem in the future?"
     ● SLA / QoS
  19. And all this is available thanks to this code:
  20. Push (vs pull)
     - Async, short-lived processes
     - The Prometheus way => send metrics to a push gateway
       - One push gateway per process!
       - More infrastructure to set up
     - Our way, the prometheus-distributed-client => send metrics to a database
       - Available from everywhere
       - Consistent in case of concurrent calls
       - Use either
  21. Conclusion
     - Try to always instrument your code
     - Limit the cardinality of the metrics you use
     - Make nice graphs!
     - Use our lib: https://github.com/dolead/prometheus-distributed-client
  22. Thank you! Questions?
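
The code shown on slides 9 and 19 was an image and is not part of this transcript. A minimal sketch of what such instrumentation could look like with the official `prometheus_client` library, using the metric and labels of the case study on slide 8. The metric name, the `timed_import` helper and the dedicated registry are assumptions made for illustration, not the authors' code:

```python
import socket
import time

from prometheus_client import CollectorRegistry, Histogram

# A dedicated registry keeps the sketch self-contained (assumption; the
# default global registry would work as well).
REGISTRY = CollectorRegistry()

# Hypothetical metric name; the labels follow slide 8.
IMPORT_DURATION = Histogram(
    "stats_import_duration_seconds",
    "Duration of a stats import run",
    ["result", "source", "category", "node"],
    registry=REGISTRY,
)

NODE = socket.gethostname()


def timed_import(source, category, do_import):
    """Run do_import() and record its duration, labelled by outcome."""
    start = time.perf_counter()
    result = "failure"
    try:
        value = do_import()
        result = "success"
        return value
    finally:
        # Runs on success and on exception, so failures are timed too.
        IMPORT_DURATION.labels(
            result=result, source=source, category=category, node=NODE
        ).observe(time.perf_counter() - start)
```

Calling `timed_import("bing_ads", "campaign", fetch)` with any callable `fetch` records one observation; segmenting by `source`, `category` or `node` as on slides 10 to 12 then happens at query time rather than in the code.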
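
For the short-lived processes of slide 20, the "Prometheus way" could be sketched with `push_to_gateway` from `prometheus_client`. The gateway address `localhost:9091`, the job name and the metric names below are assumptions; the API of Dolead's prometheus-distributed-client is not shown in the slides and is not reproduced here:

```python
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway


def build_job_registry(duration_seconds):
    """Collect the metrics of one batch run in a fresh registry."""
    registry = CollectorRegistry()
    # Hypothetical metric names, for illustration only.
    Gauge(
        "batch_job_duration_seconds",
        "Duration of the last batch run",
        registry=registry,
    ).set(duration_seconds)
    Gauge(
        "batch_job_last_success_unixtime",
        "Unix time of the last successful batch run",
        registry=registry,
    ).set_to_current_time()
    return registry


def push_job_metrics(registry, gateway="localhost:9091", job="stats_import"):
    """Push once, just before the process exits (needs a reachable gateway)."""
    push_to_gateway(gateway, job=job, registry=registry)
```

The process pushes its final values and exits; Prometheus then scrapes the gateway like any other target, which is the extra infrastructure the slide refers to.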
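
The cardinality advice on slide 21 can be made concrete with a little arithmetic: the number of time series one metric can produce is bounded by the product of the distinct values of each label. A small sketch using the labels of slide 8; the node names and the `account_id` label are hypothetical:

```python
from math import prod


def series_count(label_values):
    """Upper bound on time series for one metric: the product of the
    number of distinct values of each label."""
    return prod(len(values) for values in label_values.values())


labels = {
    "result": ["success", "failure"],
    "source": ["google_ads", "fb_ads", "bing_ads", "taboola"],
    "category": ["account", "campaign", "keyword"],
    "node": ["node-1", "node-2"],  # hypothetical node names
}

bounded = series_count(labels)  # 2 * 4 * 3 * 2 = 48

# Adding a high-cardinality label (here a hypothetical account id with
# 10,000 distinct values) multiplies the bound by 10,000.
with_account_id = series_count({**labels, "account_id": range(10_000)})  # 480000
```

With bounded labels the metric stays at a few dozen series; a single identifier-like label pushes the bound into the hundreds of thousands. For a histogram the count grows further, since each label combination is exposed once per bucket.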
