Monitoring Kubernetes with Prometheus

Monitoring containerised apps creates a whole new set of challenges that traditional monitoring systems struggle with. In this talk, Brice Fernandes from Weaveworks will introduce and demo the open source Prometheus monitoring toolkit and its integration with Kubernetes. After this talk, you'll be able to use Prometheus to monitor your microservices on a Kubernetes cluster. We'll cover:
- An introduction to Kubernetes to manage containers;
- The monitoring maturity model;
- An overview of whitebox and blackbox monitoring;
- Monitoring with Prometheus;
- Using PromQL (the Prometheus Query Language) to monitor your app in a dynamic system

  1. 1. Making sure your containers aren’t on fire: Monitoring microservices with Prometheus. Brice Fernandes, @fractallambda
  2. 2. (1) Getting started with Kubernetes; (2) The monitoring maturity ladder; (3) Whitebox vs blackbox monitoring; (4) Monitoring with Prometheus; (5) Using PromQL
  3. 3. (1) Getting started with Kubernetes; (2) The monitoring maturity ladder; (3) Whitebox vs blackbox monitoring; (4) Monitoring with Prometheus; (5) Using PromQL
  4. 4. How I.T. Was OS App
  5. 5. How I.T. Was OS App Foo v1.1.0
  6. 6. How I.T. Was OS App Foo v1.1.0 Foo v1.5.0
  7. 7. How I.T. Was OS App Foo v1.1.0 Foo v1.5.0 ?
  8. 8. How I.T. Was: Reproducible deployment? Continuous deployment? Fault recovery? Memory & CPU allocation? Managing VMs?
  9. 9. The New Hotness OS Manager Container App
  10. 10. The New Hotness OS Manager Container App Somebody Else’s Problem (SEP)™
  11. 11. The New Hotness OS Manager Container App Somebody Else’s Problem (SEP)™
  12. 12. The New Hotness: Reproducible deployments; Fault recovery; Continuous deployment; Don’t care about machine virtualisation; Memory & CPU multiplexing; Buzzword compliance
  13. 13. But…
  14. 14. But…
  15. 15. But…
  16. 16. But…
  17. 17. But…
  18. 18. Mo’ containers Mo’ problems
  19. 19. Kubernetes – Greek for Helmsman or Pilot
  20. 20. Master: kube-apiserver, kube-controller-manager, kube-scheduler
  21. 21. Node: kubelet, kube-proxy
  22. 22. This is what I want
  23. 23. This is what I want
  24. 24. This is what I want
  25. 25. This is what I want
  26. 26. This is what I want xxx.xxx.xxx.xxx:30003
  27. 27. This is what I want xxx.xxx.xxx.xxx:30003
  28. 28. xxx.xxx.xxx.xxx:30003
  29. 29. Set up Kubernetes
  30. 30. ➤ minikube start
  31. 31. ➤ minikube start Starting local Kubernetes v1.7.5 cluster... Starting VM... Getting VM IP address... Moving files into cluster... Setting up certs... Connecting to cluster... Setting up kubeconfig... Starting cluster components… Kubectl is now configured to use the cluster.
  32. 32. ➤ minikube start Starting local Kubernetes v1.7.5 cluster... Starting VM... Getting VM IP address... Moving files into cluster... Setting up certs... Connecting to cluster... Setting up kubeconfig... Starting cluster components… Kubectl is now configured to use the cluster. Start a local cluster
  33. 33. ➤ minikube start Starting local Kubernetes v1.7.5 cluster... Starting VM... Getting VM IP address... Moving files into cluster... Setting up certs... Connecting to cluster... Setting up kubeconfig... Starting cluster components… Kubectl is now configured to use the cluster. Set up the kubernetes tools to point to our cluster
  34. 34. ➤ kubectl get all
  35. 35. ➤ kubectl get all NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc/kubernetes 10.0.0.1 <none> 443/TCP 5m
  36. 36. ➤ kubectl get all NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc/kubernetes 10.0.0.1 <none> 443/TCP 5m Default kubernetes service
  37. 37. HOST
  38. 38. HOST VM
  39. 39. HOST VM Kubernetes
  40. 40. Deploy an app
  41. 41. ➤ kubectl apply -f https://tinyurl.com/kube-prom-demo-v1
  42. 42. ➤ kubectl apply -f https://tinyurl.com/kube-prom-demo-v1 Our definition manifest
  43. 43. Where to find the image
  44. 44. How many to run
  45. 45. Which port to expose externally
  46. 46. ➤ kubectl apply -f https://tinyurl.com/kube-prom-demo-v1 deployment "mighty-fine-fe" created service "mighty-fine-fe" created Creates our pods
  47. 47. ➤ kubectl apply -f https://tinyurl.com/kube-prom-demo-v1 deployment "mighty-fine-fe" created service "mighty-fine-fe" created Exposes a service
  48. 48. ➤ kubectl get svc NAME CLUSTER-IP EXTERNAL-IP PORT(S) kubernetes 10.0.0.1 <none> 443/TCP mighty-fine-fe 10.0.0.223 <nodes> 3000:30001/TCP Port 3000 of app is visible on port 30001 of cluster
  49. 49. ➤ open http://$(minikube ip):30001
  50. 50. ➤ open http://$(minikube ip):30001
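
The manifest applied above is not reproduced in the slides, which only say that it specifies where to find the image, how many copies to run, and which port to expose externally. As a rough sketch only, the following Python builds a Deployment and a NodePort Service with that shape and prints them as JSON. The name `mighty-fine-fe` and the ports 3000 and 30001 come from the slides; the image reference, replica count, label scheme, and the `apps/v1` API version are assumptions, not the speaker's actual manifest.

```python
import json

APP_NAME = "mighty-fine-fe"                    # from the slides
IMAGE = "registry.example/mighty-fine-fe:v1"   # hypothetical image reference
REPLICAS = 2                                   # hypothetical replica count


def build_manifests(name, image, replicas, container_port, node_port):
    """Return a kubectl-style List holding one Deployment and one Service."""
    labels = {"app": name}  # assumed label scheme tying the Service to the pods
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,                      # how many to run
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,                # where to find the image
                        "ports": [{"containerPort": container_port}],
                    }]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "NodePort",
            "selector": labels,
            "ports": [{
                "port": container_port,
                "targetPort": container_port,
                "nodePort": node_port,                 # port exposed externally
            }],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    manifests = build_manifests(APP_NAME, IMAGE, REPLICAS, 3000, 30001)
    print(json.dumps(manifests, indent=2))
```

If saved as a script, its output could be piped to `kubectl apply -f -`, since kubectl reads JSON as well as YAML.
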
  51. 51. (1) Getting started with Kubernetes; (2) The monitoring maturity ladder; (3) Whitebox vs blackbox monitoring; (4) Monitoring with Prometheus; (5) Using PromQL
  52. 52. Why monitor?
  53. 53. Quality Assurance Continuous Improvement
  54. 54. NOT about collecting data Why vs How
  55. 55. IS relevant outside of IT
  56. 56. Q: What’s the most important metric?
  57. 57. A: What’s the purpose of your organisation?
  58. 58. Maybe: Educational goals # People reached # Papers published
  59. 59. Probably £/$/€
  60. 60. Metrics come from purpose. Monitor your goals
  61. 61. The Monitoring Ladder: 0 Ignorance, 1 Availability, 2 Collection, 3 Aggregation, 4 Analysis, 5 Learning, 6 Automation, 7 Proactivity
  62. 62. The Monitoring Ladder, 0 Ignorance: You don’t know what’s going on.
  63. 63. The Monitoring Ladder, 1 Availability: You know whether your systems are available. You may have alerts.
  64. 64. The Monitoring Ladder, 2 Collection (logging): You collect logs. Forensics is possible. Tags: Alerts
  65. 65. The Monitoring Ladder, 3 Aggregation: You aggregate and persist data in a central place. Correlation is possible. Tags: Alerts, Logs, Forensics
  66. 66. The Monitoring Ladder, 4 Analysis: You actually analyse the aggregated and correlated data. Use it to fix issues. Tags: Alerts, Logs, Forensics, Aggregation, Persistence
  67. 67. The Monitoring Ladder, 5 Learning: Root cause analysis. Strengthening fixes. Antifragile. Still responsive. Tags: Alerts, Logs, Forensics, Aggregation, Persistence
  68. 68. The Monitoring Ladder, 6 Automation: Automated remedial actions. Data collection for analysis. No customer impact. Tags: Alerts, Logs, Forensics, Aggregation, Persistence, Antifragile
  69. 69. The Monitoring Ladder, 6 Automation: Automated remedial actions. Data collection for analysis. No customer impact. Tags: Alerts, Logs, Forensics, Aggregation, Persistence, Antifragile
  70. 70. The Monitoring Ladder, 7 Proactivity: Active strengthening by attacking production systems. Tags: Alerts, Logs, Forensics, Aggregation, Persistence, Antifragile, 0-Impact
  71. 71. The Monitoring Ladder, levels 0 to 7. Tags: Alerts, Logs, Forensics, Aggregation, Persistence, Antifragile, 0-Impact. Monitoring is a broad topic
  72. 72. (1) Getting started with Kubernetes; (2) The monitoring maturity ladder; (3) Whitebox vs blackbox monitoring; (4) Monitoring with Prometheus; (5) Using PromQL
  73. 73. Whitebox vs Blackbox
  74. 74. Whitebox vs Blackbox
  75. 75. Push vs Pull
  76. 76. Push vs Pull
  77. 77. Realtime vs Historic
  78. 78. Realtime vs Historic
  79. 79. Which one is right? Pull or Push? Whitebox or Blackbox?
  80. 80. Which one is right? Pull or Push? Whitebox or Blackbox? Both
  81. 81. Pull based Whitebox Historic
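
The "pull based, whitebox" combination named above can be illustrated with a small, dependency-free sketch: the application keeps its own internal counter (whitebox) and exposes it on a `/metrics` path, and a collector fetches that path on its own schedule (pull). The metric name `app_requests_served_total`, the handler, and the `demo` helper are hypothetical illustrations, not code from the talk, and a real deployment would use a Prometheus client library and a Prometheus server rather than this hand-rolled version.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Whitebox state: a counter only the application itself can observe.
requests_served = 0


def render_metrics():
    """Render the app's internal state in the Prometheus text format."""
    return (
        "# TYPE app_requests_served_total counter\n"
        f"app_requests_served_total {requests_served}\n"
    )


def parse_samples(text):
    """Parse 'name value' lines, skipping '#' comment lines."""
    samples = {}
    for line in text.splitlines():
        if line and not line.startswith("#"):
            name, value = line.rsplit(" ", 1)
            samples[name] = float(value)
    return samples


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global requests_served
        if self.path == "/metrics":
            body = render_metrics().encode()
            content_type = "text/plain; version=0.0.4"
        else:
            requests_served += 1  # ordinary traffic updates internal state
            body = b"hello\n"
            content_type = "text/plain"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    """Serve the app on a free local port, send traffic, then scrape it."""
    server = HTTPServer(("127.0.0.1", 0), AppHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    for _ in range(3):
        urllib.request.urlopen(base + "/").read()
    with urllib.request.urlopen(base + "/metrics") as resp:  # the "pull"
        samples = parse_samples(resp.read().decode())
    server.shutdown()
    server.server_close()
    return samples
```

Calling `demo()` returns the scraped samples as a dictionary. In a push design the application would instead send each update to the collector itself.
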
  82. 82. (1) Getting started with Kubernetes; (2) The monitoring maturity ladder; (3) Whitebox vs blackbox monitoring; (4) Monitoring with Prometheus; (5) Using PromQL
  83. 83. Monitoring infrastructure Key metrics
  84. 84. ➤ kubectl apply -f https://tinyurl.com/kube-prom-monitoring deployment "prometheus" created service "prometheus" created service "internal-prometheus" created deployment "grafana" created service "grafana" created configmap "prometheus-configmap" created Create and expose Prometheus
  85. 85. ➤ kubectl apply -f https://tinyurl.com/kube-prom-monitoring deployment "prometheus" created service "prometheus" created service "internal-prometheus" created deployment "grafana" created service "grafana" created configmap "prometheus-configmap" created Create and expose Grafana
  86. 86. ➤ kubectl apply -f https://tinyurl.com/kube-prom-monitoring deployment "prometheus" created service "prometheus" created service "internal-prometheus" created deployment "grafana" created service "grafana" created configmap "prometheus-configmap" created Configure Prometheus Using a ConfigMap
  87. 87. ➤ kubectl get svc NAME CLUSTER-IP EXTERNAL-IP PORT(S) grafana 10.0.0.120 <nodes> 3000:30002/TCP internal-prometheus 10.0.0.39 <none> 9090/TCP kubernetes 10.0.0.1 <none> 443/TCP mighty-fine-fe 10.0.0.223 <nodes> 3000:30001/TCP prometheus 10.0.0.112 <nodes> 9090:30003/TCP
  88. 88. ➤ kubectl get svc NAME CLUSTER-IP EXTERNAL-IP PORT(S) grafana 10.0.0.120 <nodes> 3000:30002/TCP internal-prometheus 10.0.0.39 <none> 9090/TCP kubernetes 10.0.0.1 <none> 443/TCP mighty-fine-fe 10.0.0.223 <nodes> 3000:30001/TCP prometheus 10.0.0.112 <nodes> 9090:30003/TCP Prometheus internal IP
  89. 89. ➤ kubectl get svc NAME CLUSTER-IP EXTERNAL-IP PORT(S) grafana 10.0.0.120 <nodes> 3000:30002/TCP internal-prometheus 10.0.0.39 <none> 9090/TCP kubernetes 10.0.0.1 <none> 443/TCP mighty-fine-fe 10.0.0.223 <nodes> 3000:30001/TCP prometheus 10.0.0.112 <nodes> 9090:30003/TCP Prometheus external port
  90. 90. ➤ open http://$(minikube ip):30003
  91. 91. ➤ open http://$(minikube ip):30003
  92. 92. ➤ open http://$(minikube ip):30002
  93. 93. ➤ open http://$(minikube ip):30002
  94. 94. Built-in Prometheus provider
  95. 95. Internal Prometheus IP and port
  96. 96. Proxy instead of data from browser
  97. 97. PromQL Query
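
The Grafana setup above points at Prometheus's internal address and sends a PromQL query through it. The sketch below shows what such a request and response could look like against the Prometheus HTTP API's instant-query endpoint, `/api/v1/query`. The address `10.0.0.39:9090` is the `internal-prometheus` service from the listing above; the helper names and the sample response are hand-written illustrations, not captured output.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Extract (labels, value) pairs from an instant-query JSON response."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError(f"query failed: {doc.get('error', 'unknown error')}")
    # Each sample carries its labels and a [timestamp, "value"] pair.
    return [(s["metric"], float(s["value"][1])) for s in doc["data"]["result"]]


# Hand-written example of the response shape (illustrative values only).
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "demo"},
             "value": [1500000000, "1"]},
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url("http://10.0.0.39:9090", "up"))
    print(parse_instant_vector(SAMPLE_RESPONSE))
```

With proxy access, Grafana's backend issues requests of this kind rather than the browser, which is why the cluster-internal address is sufficient.
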
  98. 98. But… The Monitoring Ladder, 3 Aggregation: What about persistence?
  99. 99. Using Weave Cloud’s Hosted Prometheus
  100. 100. Name your cluster
  101. 101. Pick your platform
  102. 102. Choose your environment
  103. 103. Run Command
  104. 104. ➤ kubectl apply -n kube-system -f "<some_url>&t=<some_token>" serviceaccount "weave-flux" created clusterrole "weave-flux" created clusterrolebinding "weave-flux" created secret "flux-git-deploy" created deployment "weave-flux-memcached" created service "weave-flux-memcached" created deployment "weave-flux-agent" created serviceaccount "weave-scope" created clusterrole "weave-scope" created clusterrolebinding "weave-scope" created daemonset "weave-scope-agent" created serviceaccount "weave-cortex" created clusterrole "weave-cortex" created clusterrolebinding "weave-cortex" created deployment "weave-cortex-agent" created service "weave-cortex-agent" created daemonset "weave-cortex-node-exporter" created configmap "weave-cortex-agent-config" created
  105. 105. ➤ kubectl get pods -n kube-system NAME READY STATUS RESTARTS kube-addon-manager-minikube 1/1 Running 1 kube-dns-910330662-bv35c 3/3 Running 3 kubernetes-dashboard-zj028 1/1 Running 1 weave-cortex-agent-815474457-5q0rg 1/1 Running 0 weave-cortex-node-exporter-5tf88 1/1 Running 0 weave-flux-agent-1731903026-d0gw8 1/1 Running 0 weave-flux-memcached-2601059440-f31vp 1/1 Running 0 weave-scope-agent-6fq0b 1/1 Running 0
  106. 106. Go to monitoring
  107. 107. Monitoring infrastructure Key metrics
  108. 108. Adding the Prometheus Agent to our app
  109. 109. ➤ npm install --save epimetheus
  110. 110. ➤ npm install --save epimetheus Client libraries in: Go, Java, Python, Ruby, Bash, C++, Common Lisp, Elixir, Erlang, Haskell, Lua, .NET, PHP, Rust…
  111. 111. Very straightforward in most languages
  112. 112. … Omitted for brevity: Pushing new image to registry Creating new manifest …
  113. 113. ➤ kubectl apply -f https://tinyurl.com/kube-prom-demo-v2
  114. 114. ➤ kubectl apply -f https://tinyurl.com/kube-prom-demo-v2 deployment "mighty-fine-fe" configured service "mighty-fine-fe" configured
  115. 115. ➤ open http://$(minikube ip):30001/metrics
  116. 116. ➤ open http://$(minikube ip):30001/metrics
  117. 117. Getting Prometheus to scrape our app
  118. 118. ➤ kubectl apply -f https://tinyurl.com/kube-prom-monitoring-v2
  119. 119. ➤ kubectl apply -f https://tinyurl.com/kube-prom-monitoring-v2 deployment "prometheus" configured service "prometheus" configured service "internal-prometheus" configured deployment "grafana" configured service "grafana" configured configmap "prometheus-configmap" configured
  120. 120. ➤ curl -X POST http://$(minikube ip):30003/-/reload Tell Prometheus to reload its config
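
The updated ConfigMap that makes Prometheus scrape the app is not shown in the slides. As a sketch under stated assumptions, the snippet below builds a minimal static scrape job using standard Prometheus configuration keys (`scrape_configs`, `job_name`, `metrics_path`, `static_configs`, `targets`) and prints it as JSON, which YAML parsers generally accept. The job name, the 15 second interval, and the target `mighty-fine-fe:3000` (the in-cluster service name and app port) are assumptions; the real demo may well use Kubernetes service discovery instead.

```python
import json


def build_scrape_config(job_name, targets, metrics_path="/metrics",
                        scrape_interval="15s"):
    """Return a minimal Prometheus configuration with one static scrape job."""
    return {
        "global": {"scrape_interval": scrape_interval},  # assumed interval
        "scrape_configs": [{
            "job_name": job_name,
            "metrics_path": metrics_path,
            "static_configs": [{"targets": list(targets)}],
        }],
    }


if __name__ == "__main__":
    # Hypothetical target: the app's Service name and container port.
    config = build_scrape_config("mighty-fine-fe", ["mighty-fine-fe:3000"])
    print(json.dumps(config, indent=2))
```

Prometheus only picks up an edited configuration after a restart or a reload, which is what the `POST` to `/-/reload` above triggers.
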
  121. 121. NodeJS Metrics
  122. 122. Weave discovers the new Metrics too
  123. 123. Defining a custom metric
  124. 124. … Omitted for brevity: Pushing new image to registry Creating new manifest …
  125. 125. ➤ kubectl apply -f https://tinyurl.com/kube-prom-monitoring-v3 deployment "prometheus" configured service "prometheus" configured service "internal-prometheus" configured deployment "grafana" configured service "grafana" configured configmap "prometheus-configmap" configured
  126. 126. ➤ open http://$(minikube ip):30001
  127. 127. ➤ open http://$(minikube ip):30001
  128. 128. ➤ open http://$(minikube ip):30001
  129. 129. Custom Metric
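
The code that defines the custom metric is not included in the transcript. To give a feel for what defining one involves, here is a dependency-free sketch of a labelled counter that renders itself in the Prometheus text exposition format. The `Counter` class, the metric name `page_views_total`, and the `path` label are hypothetical; this is not the API of epimetheus or of any official client library, which would normally be used instead.

```python
from collections import defaultdict


class Counter:
    """A monotonically increasing metric, optionally split by labels."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(float)  # label set -> running total

    def inc(self, amount=1.0, **labels):
        """Add to the total for one label combination."""
        if amount < 0:
            raise ValueError("counters can only go up")
        self._values[tuple(sorted(labels.items()))] += amount

    def render(self):
        """Render HELP/TYPE headers followed by one line per label set."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            selector = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{self.name}{selector} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    page_views = Counter("page_views_total", "Pages served, by path.")
    page_views.inc(path="/")
    page_views.inc(path="/")
    page_views.inc(path="/about")
    print(page_views.render(), end="")
```

An app would call `inc` wherever the event of interest happens and return `render()` from its `/metrics` handler.
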
  130. 130. Monitoring infrastructure Key metrics
  131. 131. (1) Getting started with Kubernetes; (2) The monitoring maturity ladder; (3) Whitebox vs blackbox monitoring; (4) Monitoring with Prometheus; (5) Using PromQL
  132. 132. Joel York’s SaaS Metrics http://chaotic-flow.com
  133. 133. Worked Example: Churn rate. Churn Rate_month = ΔC_cancel / (C × Δt)
  134. 134. Worked Example: Churn rate. Churn Rate_month = ΔC_cancel / (C × Δt), where ΔC_cancel is the number of cancellations in the interval
  135. 135. Worked Example: Churn rate. Churn Rate_month = ΔC_cancel / (C × Δt), where ΔC_cancel is the number of cancellations in the interval and C is the number of customers (at start of interval)
  136. 136. Worked Example: Churn rate. Churn Rate_month = ΔC_cancel / (C × Δt), where ΔC_cancel is the number of cancellations in the interval, C is the number of customers (at start of interval), and Δt is the time interval
  137. 137. Worked Example: Churn rate. Churn Rate_month = ΔC_cancel / (C × Δt)
  138. 138. Worked Example: Churn rate. Assumed metrics: total_signups (counter), total_cancels (counter). Churn Rate_month = ΔC_cancel / (C × Δt)
  139. 139. Worked Example: Churn rate. ΔC_cancel = rate(total_cancels[1m]). Churn Rate_month = ΔC_cancel / (C × Δt)
  140. 140. Worked Example: Churn rate. ΔC_cancel = rate(total_cancels[1m]). Annotations: base metric (scalar). Churn Rate_month = ΔC_cancel / (C × Δt)
  141. 141. Worked Example: Churn rate. ΔC_cancel = rate(total_cancels[1m]). Annotations: base metric (scalar); sample times t0, t1, t2, t3, t4, t5, t6. Churn Rate_month = ΔC_cancel / (C × Δt)
  142. 142. Worked Example: Churn rate. ΔC_cancel = rate(total_cancels[1m]). Annotations: base metric (scalar); data window. Churn Rate_month = ΔC_cancel / (C × Δt)
  143. 143. Worked Example: Churn rate. ΔC_cancel = rate(total_cancels[1m]). Annotations: base metric (scalar); data window; data range (vector). Churn Rate_month = ΔC_cancel / (C × Δt)
  144. 144. Worked Example: Churn rate. ΔC_cancel = rate(total_cancels[1m]). Annotations: base metric (scalar); data window; data range (vector); sample times t0, t1, t2, t3, t4, t5, t6 over counter values 0, 2, 4, 7, 9, 11, … Churn Rate_month = ΔC_cancel / (C × Δt)
  145. 145. Worked Example: Churn rate. ΔC_cancel = rate(total_cancels[1m]). Annotations: base metric (scalar); data window; data range (vector); built-in rate function. Churn Rate_month = ΔC_cancel / (C × Δt)
  146. 146. Worked Example: Churn rate. C = (total_signups offset 1m) - (total_cancels offset 1m). Churn Rate_month = ΔC_cancel / (C × Δt)
  147. 147. Worked Example: Churn rate. C = (total_signups offset 1m) - (total_cancels offset 1m). Churn Rate_month = ΔC_cancel / (C × Δt)
  148. 148. Worked Example: Churn rate. C = (total_signups offset 1m) - (total_cancels offset 1m). Annotation: one month ago. Churn Rate_month = ΔC_cancel / (C × Δt)
  149. 149. Worked Example: Churn rate. Churn Rate_month = rate(total_cancels[1m]) / ((total_signups offset 1m) - (total_cancels offset 1m)). Formula: Churn Rate_month = ΔC_cancel / (C × Δt)
  150. 150. Worked Example: Churn rate. Churn Rate_month = rate(total_cancels[1m]) / ((total_signups offset 1m) - (total_cancels offset 1m)). Formula: Churn Rate_month = ΔC_cancel / (C × Δt)
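
To make the arithmetic of the worked example concrete, here is a small sketch that computes the churn rate directly from two readings of the assumed counters, one at the start of the interval and one at the end. The function name and the sample figures are made up for illustration. Note also that the slides write the window as `1m`; in PromQL the `m` suffix denotes minutes, so a month-long window would normally be spelled in days or weeks (for example `30d`), and `increase()` rather than `rate()` returns the change over the window instead of a per-second average.

```python
def churn_rate(signups_at_start, cancels_at_start, cancels_at_end, months=1.0):
    """Churn rate = cancellations in interval / (customers at start * interval).

    All three inputs are cumulative counter readings, mirroring the assumed
    total_signups and total_cancels metrics.
    """
    customers_at_start = signups_at_start - cancels_at_start  # C
    cancellations = cancels_at_end - cancels_at_start         # delta C_cancel
    if customers_at_start <= 0:
        raise ValueError("no customers at the start of the interval")
    if months <= 0:
        raise ValueError("interval must be positive")
    return cancellations / (customers_at_start * months)


if __name__ == "__main__":
    # Illustrative figures: 1200 signups and 200 cancellations a month ago,
    # 250 cancellations now, so 50 of 1000 customers cancelled.
    print(churn_rate(1200, 200, 250))
```

Because both inputs are counters, the number of customers at the start is the difference of the two readings taken at that time, which is what the `offset` terms express in the PromQL version.
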
  151. 151. (1) Getting started with Kubernetes; (2) The monitoring maturity ladder; (3) Whitebox vs blackbox monitoring; (4) Monitoring with Prometheus; (5) Using PromQL. Review
  152. 152. References & useful links: https://landing.google.com/sre/book/chapters/monitoring-distributed-systems.html ; http://www.ncsysadmin.org/meetings/1010/Monitoring_and_Alerting.pdf ; https://www.oreilly.com/ideas/monitoring-distributed-systems ; https://www.slideshare.net/brianbrazil/monitoring-what-matters-the-prometheus-approach-to-whitebox-monitoring-berlin-ops-summit-2016 Thank You! Brice Fernandes, @fractallambda, @weaveworks. Slides: https://tinyurl.com/prometheus-kubernetes-slides Code: https://tinyurl.com/prometheus-kubernetes-code Video: https://tinyurl.com/cloud-native-2017 https://weave.works