Prometheus Monitoring Mixins (Berlin CNCB Meetup)

  1. 1. Prometheus Monitoring Mixins: Using Jsonnet to Package Together Dashboards, Alerts and Exporters. David Kaltschmidt, @davkals, kausal.co – February 2018
  2. 2. Add Monitoring
 kubectl apply -f https://trust.worthy/prom-n-grafana/config.yml
 kubectl apply -f https://they.knowtheir.stuff/dashboards/config.yml
 #YOLO
  3. 3. I’m looking at you, KOPS!
 scrape_configs+: [
   k8s_pod_scrape("kube-system/kube-apiserver", 443) { scheme: "https" },
   k8s_pod_scrape("kube-system/kube-scheduler", 10251),
   k8s_pod_scrape("kube-system/kube-controller-manager", 10252),
   // kops doesn't configure kube-proxy to listen on non-localhost,
   // can't scrape.
   // k8s_pod_scrape("kube-system/kube-proxy", 10249),
   // kops firewalls etcd on masters off from nodes, so we
   // can't scrape it with Prometheus.
   // k8s_pod_scrape("kube-system/etcd-server", 4001),
   // k8s_pod_scrape("kube-system/etcd-server-events", 4002),
 ],
  4. 4. Selector Mismatch
 node_cpu{job="default/node-exporter"}
 container_cpu_usage_seconds_total{job="kubernetes-nodes",image!=""}
 node_memory_MemFree + node_memory_Cached + node_memory_Buffers
  5. 5. Someone Made Assumptions
  6. 6. Grafana Templating
 container_cpu_usage_seconds_total{namespace="$namespace"}
 container_memory_usage_bytes{namespace="$namespace", pod_name="$pod"}
  7. 7. Templating System at the Config Level
 node_cpu{job="%(node_exporter)s"}
 container_cpu_usage_seconds_total{job="%(cadvisor)s"}
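A minimal Jsonnet sketch of this idea, assuming the job names from the "Selector Mismatch" slide. The `config` object and the output field names are illustrative, not taken from the talk's code.

```jsonnet
// Illustrative config object: one place that records which job label
// each exporter is scraped under in this particular cluster.
local config = {
  node_exporter: 'default/node-exporter',
  cadvisor: 'kubernetes-nodes',
};

{
  // Jsonnet's % operator does Python-style formatting. With an object on
  // the right-hand side, %(name)s is replaced by the field called `name`.
  node_cpu_query: 'node_cpu{job="%(node_exporter)s"}' % config,
  container_cpu_query: 'container_cpu_usage_seconds_total{job="%(cadvisor)s"}' % config,
}
```

Changing a job name in `config` then updates every query that references it, which is the point of templating at the config level rather than inside Grafana.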
  8. 8. Towards JSON templating • Grafana dashboards are JSON • K8s deployment configs are YAML, that’s practically JSON • Prometheus Config and Recording Rules are YAML, that’s practically JSON • Grafana and Prometheus configs can be put into K8s as ConfigMaps, that can be JSON
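As a rough illustration of the last bullet, a dashboard can be embedded in a ConfigMap that is itself generated as JSON. The dashboard body, ConfigMap name and namespace below are placeholders chosen for this sketch.

```jsonnet
// Placeholder dashboard, reduced to two fields for brevity.
local dashboard = {
  title: 'K8s / Compute Resources / Namespace',
  rows: [],
};

{
  apiVersion: 'v1',
  kind: 'ConfigMap',
  metadata: { name: 'dashboards', namespace: 'default' },
  data: {
    // ConfigMap values must be strings, so the dashboard object is
    // serialized to JSON text with a two-space indent.
    'k8s-resources-namespace.json': std.manifestJsonEx(dashboard, '  '),
  },
}
```

Because JSON is accepted wherever Kubernetes expects YAML, the evaluated output can be passed to `kubectl apply -f` directly.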
  9. 9. Jsonnet Primer
 { a: 1, b: 2 } +
 { b: 22, c: 3 } +
 { d:: self.a + 10 } +
 { list: [{ s: "%(var)s" }] } +
 { list+: [{ e: $.d }] } %
 { var: "foo" }
 = { a: 1, b: 22, c: 3, list: [{ s: "foo" }, { e: 11 }] }

  10. 10. Jsonnet Primer
 // Default/Base config
 { a: 1, b: 2 } +
 // Override some values
 { b: 22, c: 3 } +
 // Calculate local fields
 { d:: self.a + 10 } +
 // Base rules
 { list: [{ s: "%(var)s" }] } +
 // Add custom rules
 { list+: [{ e: $.d }] } %
 // Set variables
 { var: "foo" }
 = { a: 1, b: 22, c: 3, list: [{ s: "foo" }, { e: 11 }] }
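The primer uses a trailing `%` and `=` as slide notation rather than literal Jsonnet syntax. One way to express the same steps as a single expression that Jsonnet can evaluate is sketched below. Making `var` a hidden field and formatting inside the list element are choices made for this sketch, not something the slides specify.

```jsonnet
// Default/base config
{ a: 1, b: 2 } +
// Override some values
{ b: 22, c: 3 } +
// Calculate a hidden field (`::` keeps it out of the output)
{ d:: self.a + 10 } +
// Set variables; hidden so that `var` does not appear in the result
{ var:: 'foo' } +
// Base rules, templated from the variable
{ list: [{ s: '%(var)s' % { var: $.var } }] } +
// Add custom rules by appending to the inherited list
{ list+: [{ e: $.d }] }
```

Evaluating this should yield the object on the right-hand side of the slide's `=`: `a` and `c` pass through, `b` is overridden, `d` is computed but hidden, and `list` holds both the templated base entry and the appended one.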

  11. 11. Jsonnet Resources • http://jsonnet.org/ • https://github.com/databricks/jsonnet-style-guide • https://ksonnet.io
  12. 12. Aside: Ksonnet
  13. 13. Aside: Ksonnet K8s Primitives
 local container = $.core.v1.container,
 node_exporter_container::
   container.new("node-exporter", $._images.nodeExporter) +
   container.withPorts($.core.v1.containerPort.new("http-metrics", 9100)) +
   container.withArgs([
     "--path.procfs=/host/proc",
     "--path.sysfs=/host/sys",
     "--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|host|etc)($|/)",
   ]),
 local daemonSet = $.extensions.v1beta1.daemonSet,
 node_exporter_deamonset:
   daemonSet.new("node-exporter", [$.node_exporter_container]) +
   $.util.hostVolumeMount("proc", "/proc", "/host/proc") +
   $.util.hostVolumeMount("sys", "/sys", "/host/sys") +
   $.util.hostVolumeMount("root", "/", "/rootfs"),
  14. 14. Grafana Primitives • https://github.com/kausalco/public/blob/master/klumps/lib/grafana.libsonnet • https://github.com/grafana/grafonnet-lib
  15. 15. Configurable Grafana Configs
 "k8s-resources-namespace.json":
   local tableStyles = {
     pod: {
       alias: "Pod",
       link: "%s/dashboard/file/k8s-resources-pod.json?var-datasource=$datasource&var-namespace=$namespace&var-pod=$__cell" % $._config.grafanaPrefix,
     },
   };
   g.dashboard("K8s / Compute Resources / Namespace")
   .addTemplate("namespace", "kube_pod_info", "namespace")
   .addRow(
     g.row("CPU Usage")
     .addPanel(
       g.panel("CPU Usage") +
       g.queryPanel('sum(irate(container_cpu_usage_seconds_total{namespace="$namespace"}[1m])) by (pod_name)', "{{pod_name}}") +
       g.stack,
     )
   )
  16. 16. Prefab dashboards: kubectl apply -f https://they.knowtheir.stuff/dashboards/config.yml is a good idea; it just needs to be configurable
  17. 17. USE dashboards • USE method to monitor your resources • http://www.brendangregg.com/usemethod.html • “For every resource, check utilization, saturation, and errors.”
  18. 18. K8s Resources • What resources are there? • What is a node? • Someone needs to think about this
  19. 19. Demo KLUMPS • Kubernetes/Linux USE Method with Prometheus • OSS: https://github.com/kausalco/public/tree/master/klumps • Live demo:
 https://dev.kausal.co/admin/grafana/dashboard/file/k8s-cluster-rsrc-use.json?refresh=10s&orgId=1
  20. 20. Anatomy of a Mixin • klumps.libsonnet defines all dashboards • grafana.libsonnet has the Grafana primitives • parts.yaml is metadata for ksonnet • recording_rules.jsonnet?
  21. 21. Assumptions of the KLUMPS Mixin • Grafana, Prometheus, node-exporter, kube-state-metrics • Grafana has Prometheus as a data source • Metrics are available: node, kube-state-metrics, cadvisor • Available as a mixin as well:
 https://github.com/kausalco/public/tree/master/prometheus-ksonnet • Note what’s absent: any mention of namespaces, cluster structure, etc.
  22. 22. Include in your config
 base + prometheus + default + {
   _config+:: {
     namespace: "default",
     grafana_root_url: "https://dev.kausal.co/admin/grafana",
     jobs: { node_exporter: "default/node-exporter" },
   },
   dashboards+:: (import "klumps/klumps.libsonnet"),
   prometheus_rules+:: (import "klumps/recording_rules.jsonnet"),
 }
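To show how such an override might reach the dashboards, here is a self-contained sketch in which a local object stands in for the imported mixin file. The `mixin` and `base` objects, the `node.json` dashboard and its fields are all hypothetical. How klumps.libsonnet actually reads `_config` is not shown on the slide.

```jsonnet
// Hypothetical stand-in for an imported mixin file.
local mixin = {
  // Defaults the mixin ships with; users are expected to override them.
  _config+:: {
    jobs+: { node_exporter: 'node-exporter' },
  },
  dashboards+:: {
    'node.json': {
      title: 'Node',
      // `$` is late-bound, so this sees the final, merged _config.
      query: 'node_cpu{job="%(node_exporter)s"}' % $._config.jobs,
    },
  },
};

// Hypothetical base config with nothing in it yet.
local base = { _config:: { namespace: 'default' }, dashboards:: {} };

// The user's cluster-specific override, applied last.
local combined = base + mixin + {
  _config+:: { jobs+: { node_exporter: 'default/node-exporter' } },
};

// Hidden fields are not emitted, so select the dashboards explicitly.
combined.dashboards
```

The dashboard query picks up `default/node-exporter` even though the mixin was written before the override existed, because Jsonnet resolves `$._config` against the merged object.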
  23. 23. Service Mixins • A service component or library could bring its own mixin • Authors know their components’ failure modes • Components can encode operational practices • Expand instrumentation: not just expose metrics, but also show me how to use them
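A sketch of what a service-provided mixin could look like, assuming a Prometheus 2.x rule-group layout. The service name, the `prometheus_alerts` field, the config keys and the alert itself are invented for illustration. Only the `up` metric is standard Prometheus.

```jsonnet
// Hypothetical mixin shipped by the author of "my-service".
local serviceMixin = {
  _config+:: {
    job: 'default/my-service',  // assumed default job label
    downFor: '5m',              // how long targets may be down before alerting
  },
  prometheus_alerts+:: {
    groups+: [{
      name: 'my-service',
      rules: [{
        alert: 'MyServiceDown',
        // `up` is set by Prometheus for every scrape target.
        expr: 'up{job="%(job)s"} == 0' % $._config,
        // `for` is a Jsonnet keyword, so the field name must be quoted.
        'for': $._config.downFor,
        labels: { severity: 'critical' },
        annotations: {
          summary: 'No healthy targets for job %(job)s.' % $._config,
        },
      }],
    }],
  },
};

// An operator overrides only the configuration, not the alert logic.
local mine = serviceMixin + { _config+:: { job: 'prod/my-service' } };

mine.prometheus_alerts
```

The author's knowledge of the failure mode lives in the rule, while the operator supplies only what differs per cluster.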
  24. 24. Live Demo: Build Consul Mixin • Audience: What does Consul even do? • Audience: What are its failure modes? • Let’s build a Consul Monitoring Mixin together that includes a dashboard and an alert!
  25. 25. • Result:
 https://gist.github.com/davkal/033ad0f739e03b7c0a2ff55950721c99
  26. 26. Takeaways • If you can find a way to codify best practices, do it. • If you can share those, make them reusable and configurable. • Check out Jsonnet!
  27. 27. Thanks for listening. Questions? 
 David Kaltschmidt @davkals kausal.co – February 2018 Photo credits: 
 https://unsplash.com/photos/dmkmrNptMpw https://unsplash.com/photos/nlMYrApFE7s
