Velocity NY 2018: Monitoring Containers Correctly

Michael Kehoe walks you through building a small monitoring utility for cgroup containers to illustrate best practices in container monitoring. You'll explore various cgroup constraints and learn how to specifically monitor for each of them to ensure that your application is behaving as expected. Along the way, Michael shares tricks and tips about monitoring containerized applications.

Published in: Engineering

  1. Monitoring Containers Correctly
     Michael Kehoe, Staff Site Reliability Engineer
     https://github.com/michael-kehoe/container-monitoring-workshop
  2. Getting Started
     • Set up your workshop platform:
       • https://app.strigo.io/event/QXDpmTiRAufQ4LBis
       • Token: F7C7
     • Background slides: https://bit.ly/2NcEBQN
     • Code repo: https://github.com/michael-kehoe/container-monitoring-workshop
     • Please let me know ASAP if you’re
  3. Today’s agenda
     1 Introductions
     2 Container Primitives
     3 What we’ll monitor
     4 Cgroup interface file formats
     5 Exercises
  4. Today’s agenda: Exercises
     • 100 CPU Basics
     • 101 CPU Enhanced
     • 102 CPU Advanced
     • 200 Memory Basics
     • 201 Memory Enhanced
     • 300 IO Basics
     • 400 PID
  5. $ WHOAMI: Michael Kehoe
     • Staff Site Reliability Engineer @ LinkedIn
     • Production-SRE Team
     • Funny accent = Australian + 4 years American
     • Worked on:
       • Networks
       • Micro-services
       • Traffic Engineering
       • Databases
  6. $ WHOAMI: Production-SRE Team @ LinkedIn
     • Disaster Recovery – Planning & Automation
     • Incident Response – Process & Automation
     • Visibility Engineering – Making use of operational data
     • Reliability Principles – Defining best practice & automating it
  7. Container Primitives
  8. Containers
     • cgroups: Limiting the resources that can be used by a process or set of processes
     • Namespaces: Isolating filesystem resources
     • Copy on Write: Implicit sharing or shadowing
     • Linux Security Modules: Locking down container privileges
  9. Cgroup
     • Abbreviation for ‘Control Groups’
     • Provides:
       • Resource Limiting
       • Prioritization
       • Accounting
       • Control
  10. What we’ll monitor
  11. What we’ll monitor: CPU
     • 100: Basic cgroup CPU utilization
     • 101: Enhanced cgroup CPU utilization (with percentiles)
     • 102: cgroup throttles
  12. What we’ll monitor: MEMORY
     • 200: Memory Basics
       • Cgroup utilization
     • 201: Enhanced Memory Metrics
  13. What we’ll monitor: DISK/NETWORK
     • 300: Disk IO Monitoring
  14. What we’ll monitor: PID
     • 400: PID Utilization
  15. Cgroup interface file formats
  16. Cgroup interface file formats
     https://www.kernel.org/doc/Documentation/cgroup-v2.txt
  17. Exercises
  18. 100: CPU Monitoring
  19. 101: Enhanced CPU Monitoring
  20. Enhanced CPU Monitoring
  21. 102: CPU Advanced Monitoring
  22. Advanced CPU Monitoring
  23. 200: Memory Basics
  24. 201: Memory Enhanced
  25. 300: Disk IO Basics
  26. 400: PID Monitoring
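
The exercise slides carry titles only, so the following is a minimal sketch of the kind of measurements the agenda names (CPU utilization, throttles, memory and PID utilization, disk IO). It is not the workshop's own code. It assumes the cgroup v2 interface files described in the kernel document linked on the "Cgroup interface file formats" slide (cpu.stat, io.stat, memory.max, pids.max); all function names are hypothetical and chosen for illustration.

```python
from typing import Dict, Optional


def parse_flat_keyed(text: str) -> Dict[str, int]:
    """Parse a flat keyed file such as cpu.stat: one "KEY VALUE" pair per line."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        if line.strip():
            key, value = line.split()
            stats[key] = int(value)
    return stats


def parse_nested_keyed(text: str) -> Dict[str, Dict[str, int]]:
    """Parse a nested keyed file such as io.stat: "KEY SUB=VAL SUB=VAL" per line."""
    stats: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        pairs = (field.split("=", 1) for field in fields[1:])
        stats[fields[0]] = {sub: int(val) for sub, val in pairs}
    return stats


def parse_limit(text: str) -> Optional[int]:
    """Parse a limit file such as memory.max or pids.max; "max" means unlimited."""
    value = text.strip()
    return None if value == "max" else int(value)


def cpu_utilization(prev: Dict[str, int], curr: Dict[str, int],
                    interval_sec: float) -> float:
    """Average number of CPU cores used between two cpu.stat samples."""
    used_usec = curr["usage_usec"] - prev["usage_usec"]
    return used_usec / (interval_sec * 1_000_000)


def throttle_ratio(prev: Dict[str, int], curr: Dict[str, int]) -> float:
    """Fraction of enforcement periods in which the cgroup was throttled."""
    periods = curr["nr_periods"] - prev["nr_periods"]
    throttled = curr["nr_throttled"] - prev["nr_throttled"]
    return throttled / periods if periods else 0.0


def utilization(current: int, limit: Optional[int]) -> Optional[float]:
    """Usage as a fraction of the limit, or None when no limit is set."""
    return None if limit is None else current / limit
```

In use, a monitor would read each file from the container's cgroup directory at a fixed interval, pass consecutive cpu.stat samples to `cpu_utilization` and `throttle_ratio`, and pass memory.current with memory.max (or pids.current with pids.max) to `utilization`. The parsers take text rather than paths so they can be exercised without a live cgroup.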
