End-to-end monitoring with the Prometheus Operator - Max Inden

The presentation given by Max Inden of CoreOS at Paris Container Day.

Published in: Technology

  1. End-to-end Monitoring with the Prometheus Operator, by @mxinden
  2. Max Inden, Test Engineer at CoreOS. @mxinden, Max.Inden@CoreOS.com
  3-6. Secure, simplify and automate container infrastructure
  7-9. Why Monitoring? Alerting, long-term trends
  10. What is Prometheus? ● Open-source monitoring ● Built by SoundCloud ● Inspired by Borgmon
  11-14. What is Prometheus? ● Pull-based ● Multi-Dimensional ● Metrics, not logging, not tracing ● No magic!
  15-18. Diagram: Prometheus scrapes three targets, each exposing /metrics, every 15s
  19-22. Target /metrics, with the metric name, label, and value highlighted in turn:
        # HELP http_requests_total Total number of HTTP requests made.
        # TYPE http_requests_total counter
        http_requests_total{code="200",path="/status"} 8
  23-24. Diagram: Prometheus and three /metrics targets, queried via PromQL
  25-27. Current percentage of HTTP errors across all service instances?
        sum by(path) (rate(http_requests_total{status="500"}[5m])) / sum by(path) (rate(http_requests_total[5m]))
        {path="/status"} 0.0039
        {path="/"} 0.0011
        {path="/api/v1/topics/:topic"} 0.087
        {path="/api/v1/topics"} 0.0342
  28-29. Diagram: Prometheus and three /metrics targets; PromQL serves the Web UI and Dashboard
  30-31. Diagram: Prometheus and three /metrics targets, plus an Alert Definition
  32. Is any disk about to run full within 4 hours?
        ALERT DiskWillFillIn4Hours IF predict_linear(node_filesystem_free[1h], 4*3600) < 0
        (Graph: time axis from now-1h to +4h, with the 0 line marked.)
  33-35. Diagram: Prometheus, three /metrics targets, and the Alert Definition, labeled 1m
  36. The same diagram with Alertmanager added
  37-40. Alertmanager: deduplicates alerts, groups them, and routes them to Team A, Team B, and Team C
  41-42. Diagram: Prometheus, three /metrics targets, and Alertmanager
  43. Monitoring
  44. Application Monitoring, Cluster Monitoring
  45. Cluster Monitoring
  46. What is Kubernetes? Platform for running containerized applications
  47. What is Kubernetes? ● Announced 2014 by Google ● Influenced by Borg & Omega ● v1.01 in July 2015 ● Kubernetes joins the CNCF
  48-51. Master: API-Server, etcd, Controller-Manager, Scheduler, Kube-DNS, ... Worker: Kubelet, Kube-Proxy, ...
  52. Application Monitoring
  53-56. Diagram: application instances Location, User, and AppX (two of each), three Services, and Prometheus
  57. The same diagram with a question mark next to Prometheus
  58-59. The same diagram with the K8s-API-Server added
  60. Service Discovery ● Static target list ● DNS discovery ● Kubernetes discovery ● ...
  61. Combined diagram. Cluster Monitoring: Master (API-Server, etcd, Controller-Manager, Scheduler, Kube-DNS, ...) and Worker (Kubelet, Kube-Proxy, ...). Application Monitoring: Location, User, and AppX instances, three Services, Prometheus, and the K8s-API-Server
  62. Problem: Prometheus is stateful and difficult to configure!
  63. Introducing the Prometheus Operator
  64-69. What is a K8s Operator? Application-specific operational knowledge, captured as code (</>) in an Operator
  70. Prometheus Operator ● Kubernetes-native configuration ● Automated management and upgrades of Prometheus & Alertmanager
  71. apiVersion: extensions/v1beta1
        kind: Deployment
        metadata:
          name: my-app
        spec: ...
  72. apiVersion: monitoring.coreos.com/v1alpha1
        kind: Prometheus
        metadata:
          name: prometheus-k8s
        spec: ...
  73. Kube-Prometheus: single command to install ● Prometheus & Alertmanager Cluster ● Alerting rules ● Dashboarding
  74. Demo
  75. Recap
  76. What is Prometheus? ● Pull-based ● Multi-Dimensional ● Metrics, not logging, not tracing ● No magic!
  77. Diagram: Prometheus scrapes three targets, each exposing /metrics, every 15s
  78. Diagram: Prometheus, three /metrics targets, Alert Definition (1m), and Alertmanager
  79. Prometheus-Operator & Kube-Prometheus </> Operator
  80. Where to go from here? Prometheus.io, /coreos/prometheus-operator
  81. San Francisco, New York & Berlin. We are hiring!
  82. Max Inden, Test Engineer at CoreOS. @mxinden, Max.Inden@CoreOS.com
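
The error-ratio query on slides 25-27 can be approximated outside Prometheus to see what it computes. The following is a minimal Python sketch, not Prometheus code: it assumes two hypothetical scrapes of `http_requests_total` taken 300 seconds apart, all sample values are invented for illustration, and the helper name `error_ratio_by_path` is made up here. It uses the `code` label from the sample on slides 19-22, whereas the query on the slide filters on `status`.

```python
from collections import defaultdict

# Hypothetical counter samples: (labels, value at t0, value at t0 + 300s).
# The numbers are invented for illustration only.
SAMPLES = [
    ({"code": "200", "path": "/status"}, 1000.0, 1300.0),
    ({"code": "500", "path": "/status"}, 10.0, 13.0),
    ({"code": "200", "path": "/"}, 500.0, 1100.0),
    ({"code": "500", "path": "/"}, 0.0, 0.0),
]
WINDOW_SECONDS = 300.0


def error_ratio_by_path(samples, window):
    """Approximate sum by(path)(rate(errors)) / sum by(path)(rate(all))."""
    errors = defaultdict(float)
    total = defaultdict(float)
    for labels, first, last in samples:
        # Simple per-second increase; unlike PromQL's rate(), this
        # ignores counter resets and extrapolation at window edges.
        per_second = (last - first) / window
        total[labels["path"]] += per_second
        if labels["code"] == "500":
            errors[labels["path"]] += per_second
    # Skip paths with no traffic to avoid dividing by zero.
    return {path: errors[path] / total[path] for path in total if total[path] > 0}


RATIOS = error_ratio_by_path(SAMPLES, WINDOW_SECONDS)
```

Grouping by `path` before dividing mirrors the `sum by(path)` on both sides of the query: the error rate and the total rate are each aggregated across all instances and status codes first, then divided per path.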

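
The alert on slide 32 relies on `predict_linear`, which fits a straight line to recent samples and extrapolates it. The sketch below illustrates that idea with an ordinary least-squares fit in plain Python. It is a simplification under stated assumptions, not Prometheus's implementation: the function here extrapolates from the last sample's timestamp, and the free-space samples are hypothetical values chosen for illustration.

```python
def predict_linear(points, seconds_ahead):
    """Fit a least-squares line through (timestamp, value) points and
    evaluate it seconds_ahead after the last timestamp."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("samples must have distinct timestamps")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    last_t = points[-1][0]
    return intercept + slope * (last_t + seconds_ahead)


# Hypothetical free-space samples over the last hour, one every 15 minutes.
FREE_BYTES = [
    (0, 5000.0),
    (900, 4000.0),
    (1800, 3000.0),
    (2700, 2000.0),
    (3600, 1000.0),
]

PREDICTED = predict_linear(FREE_BYTES, 4 * 3600)
# Mirrors the alert condition: predict_linear(...[1h], 4*3600) < 0
WILL_FILL = PREDICTED < 0
```

With these samples the disk loses 1000 bytes every 15 minutes, so the fitted line crosses zero well inside the 4-hour horizon and the alert condition holds.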