Monitoring and Running Docker Containers at Scale
Docker NYC Meetup
February 25th, 2015
@alq — CTO at Datadog
Datadog
• Monitoring service
• Made for the cloud
• Aggregates everything
• Support for Docker (since 1.0)
Goal of this talk
Rethink the monitoring of Docker containers
Agenda
1. A (very) brief history of containers
2. Operational complexity
3. Monitoring Docker effectively
4. Demo
A brief history of containers
Containers in a nutshell
• Been around for a long time
– jails, zones, cgroups
• No full-virtualization overhead
• Used for runtime isolation (e.g. jails)
• Docker is an Escape from Dependency Hell
Escape from dependency hell
a.out
shared libs
packages
omnibus
Docker ==
?
Mini-host or über-process?
                  Process       Container          Host
Spec              Source        Dockerfile         Kickstart
On disk           .TEXT         /var/lib/docker    /
In memory         PID           Container ID       Hostname
In the network    Socket        veth*              eth*
Runtime context   server core   host               data center
Operational complexity
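Combinatorial multiplication
[Diagram: three stacks side by side.
 1. Bare metal: Hardware > OS > Off-the-shelf software and Your Application.
 2. Virtualized: Hardware > Hypervisor > several OSes, each running off-the-shelf software and an app.
 3. Containerized: Hardware > Hypervisor > several OSes, each running several containers (boxes labeled A and O).]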
Operational complexity
• Average containers per host: N (N=5, 10/2014)
• N-times as many “hosts” to manage
• Affects
– provisioning: prep’ing & building containers
– configuration: passing config to containers
– orchestration: deciding where/when containers run
– monitoring: making sure containers run properly
Complexity increases with...
1. Number of things to measure
2. Velocity of change
Number of things to measure
• 1 Amazon EC2 instance
– 10 CloudWatch metrics
• 1 operating system (e.g. linux)
– 100 metrics
• N containers
– 100*N metrics
• Total: 110 + 100*N metrics per instance
Combinatorial multiplication
100 instances, 500 containers
Assuming only 5 containers per instance
Combinatorial multiplication
160 metrics per host
610 metrics per host
Assuming only 5 containers per instance
Combinatorial multiplication
100 instances, 61,000 metrics
Assuming only 5 containers per instance
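The arithmetic behind these slides can be checked with a few lines of Python. This is only a sketch: the per-layer counts are the round numbers quoted in the talk (10 CloudWatch metrics, 100 OS metrics, 100 metrics per container), and the function names are made up for this example.

```python
# Round numbers taken from the slides; real counts vary by setup.
CLOUDWATCH_METRICS = 10       # per EC2 instance
OS_METRICS = 100              # per operating system
METRICS_PER_CONTAINER = 100   # per container


def metrics_per_instance(containers: int) -> int:
    """110 + 100*N metrics for an instance running N containers."""
    return CLOUDWATCH_METRICS + OS_METRICS + METRICS_PER_CONTAINER * containers


def fleet_totals(instances: int, containers_per_instance: int) -> dict:
    """Scale the per-instance figure up to a whole fleet."""
    per_instance = metrics_per_instance(containers_per_instance)
    return {
        "containers": instances * containers_per_instance,
        "metrics_per_instance": per_instance,
        "metrics": instances * per_instance,
    }


print(fleet_totals(instances=100, containers_per_instance=5))
# {'containers': 500, 'metrics_per_instance': 610, 'metrics': 61000}
```

Changing `containers_per_instance` shows how quickly the total grows: every extra container per instance adds 100 metrics on each of the 100 instances.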
Velocity
• Host half-life: hours, days, months
• Container half-life: minutes, hours, days
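Taken literally, a half-life is the time after which half of a population is gone. The sketch below assumes simple exponential decay, which is an assumption of this example and not a claim from the talk; the two half-life values are hypothetical and chosen only to fall within the ranges on the slide.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Share of the original population still running after elapsed_hours,
    assuming exponential decay (an assumption of this sketch)."""
    return 0.5 ** (elapsed_hours / half_life_hours)


# Hypothetical half-lives, for illustration only.
HOST_HALF_LIFE_HOURS = 24 * 30     # roughly a month
CONTAINER_HALF_LIFE_HOURS = 6      # a few hours

for label, half_life in [("hosts", HOST_HALF_LIFE_HOURS),
                         ("containers", CONTAINER_HALF_LIFE_HOURS)]:
    share = fraction_remaining(24, half_life)
    print(f"{label}: {share:.1%} of today's population still running after one day")
```

Under these assumed numbers almost every host from yesterday is still there, while most of yesterday's containers are not, so any monitoring configuration keyed to individual containers goes stale within a day.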
Aggravating factors
• Registry-based provisioning
– new images as fast as you can git commit
• Autonomic orchestration
– from imperative to declarative
– automated
– individual containers don’t matter
– e.g. kubernetes, mesos
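The shift from imperative to declarative can be sketched as a reconciliation step: you state how many containers of each image should run, and the orchestrator works out the actions. This is a toy illustration, not how Kubernetes or Mesos is implemented; `reconcile` and the image names are hypothetical.

```python
from collections import Counter


def reconcile(desired: dict, running: list) -> list:
    """Return the (action, image) steps that move `running` to `desired`.

    desired: image name -> number of containers wanted
    running: image name of every container currently running
    """
    actual = Counter(running)
    actions = []
    for image in sorted(set(desired) | set(actual)):
        diff = desired.get(image, 0) - actual.get(image, 0)
        verb = "start" if diff > 0 else "stop"
        actions.extend((verb, image) for _ in range(abs(diff)))
    return actions


desired = {"web": 3, "worker": 1}        # declarative: what should be true
running = ["web", "worker", "worker"]    # what is true right now
print(reconcile(desired, running))
# [('start', 'web'), ('start', 'web'), ('stop', 'worker')]
```

Nothing in the desired state names an individual container, which is why individual containers stop mattering to the operator.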
A lot more,
A lot faster.
If your monitoring is still centered on individual hosts or instances…
Host-centric monitoring
[Diagram: the containerized stack (Hypervisor, OSes, containers labeled A and O) with two "Monitor" boxes and a region marked "GAP".]
A lot more pain,
A lot faster.
Monitoring containers effectively
A new approach to container monitoring
Layers + Tags
Layers of monitoring
[Diagram, built up over three slides: the containerized stack (Hypervisor, OSes, containers labeled A and O) with monitors attached alongside it.]
• Monitors: CloudWatch, Infrastructure Monitoring, APM
• Example metrics: cpu/net/io, filesystem, docker mem, docker cpu, db queries, web requests, app throughput
Layers of monitoring
• Access to metrics from all the layers
• Amazon CloudWatch, OS metrics, Docker metrics, app metrics in 1 place
• Shared timeline
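One way to picture a shared timeline is to merge the streams from each layer by timestamp. The sketch below is only an illustration: the `Point` type, the layer names, and the sample values are hypothetical, and each stream is assumed to be sorted by time already.

```python
import heapq
from typing import NamedTuple


class Point(NamedTuple):
    timestamp: int   # seconds since some common start
    layer: str       # which monitoring layer produced the point
    metric: str
    value: float


def shared_timeline(*streams):
    """Merge per-layer streams (each sorted by timestamp) into one timeline."""
    return list(heapq.merge(*streams, key=lambda point: point.timestamp))


# Hypothetical sample data, one stream per layer.
cloudwatch = [Point(0, "cloudwatch", "cpu", 41.0), Point(60, "cloudwatch", "cpu", 88.0)]
docker = [Point(30, "docker", "mem_rss_mb", 512.0)]
app = [Point(45, "app", "web_requests", 120.0)]

for point in shared_timeline(cloudwatch, docker, app):
    print(point.timestamp, point.layer, point.metric, point.value)
```

With every layer on the same time axis, a CPU spike at the instance level can be read next to what the containers and the application were doing at that moment.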
If monitoring does not cover all layers, pain.
Tags (a.k.a. labels)
You (probably) already use them
Tags
• Monitoring is like Auto-Scaling Groups
• Monitoring is like Docker orchestration
• From imperative to declarative
• Query-based
• Queries operate on tags
Monitoring with tags and queries
“Monitor all Docker containers running image web”
“… in region us-west-2 across all availability zones”
“… and make sure resident set size < 1GB on c3.xl”
Monitoring with tags and queries
“Monitor all Docker containers running image web”
“… in region us-west-2 across all availability zones”
“… that use more than 1.5x the average on c3.xl”
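The queries above can be sketched as a tag filter followed by a comparison against the group average. This is a minimal illustration, not Datadog's query language: the `Container` type, the `key:value` tag strings, and the sample fleet are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Container:
    name: str
    tags: frozenset      # e.g. {"image:web", "region:us-west-2"}
    rss_bytes: int       # resident set size


def select(containers, *required_tags):
    """Return the containers that carry every required tag."""
    wanted = set(required_tags)
    return [c for c in containers if wanted <= c.tags]


def above_average(containers, factor=1.5):
    """Return containers whose RSS exceeds `factor` times the group average."""
    if not containers:
        return []
    threshold = factor * mean(c.rss_bytes for c in containers)
    return [c for c in containers if c.rss_bytes > threshold]


MB = 1024 * 1024
WEB_TAGS = frozenset({"image:web", "region:us-west-2", "instance-type:c3.xl"})
fleet = [
    Container("web-1", WEB_TAGS, 300 * MB),
    Container("web-2", WEB_TAGS, 320 * MB),
    Container("web-3", WEB_TAGS, 1000 * MB),
    Container("db-1", frozenset({"image:db", "region:us-west-2"}), 2000 * MB),
]

web = select(fleet, "image:web", "region:us-west-2", "instance-type:c3.xl")
print([c.name for c in above_average(web)])
# ['web-3']
```

The query never names a container or a host, so it keeps working as containers are started and stopped; only the tags have to stay accurate.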
Demo: layers & tags
Take-aways
1. Docker increases operational complexity by an order of magnitude unless…
2. You have layered monitoring, from the instance to the container and to the application, and…
3. You monitor using tags and queries

PyData NYC 2015 - Automatically Detecting Outliers with Datadog
 
Treating Infrastructure as Garbage
Treating Infrastructure as GarbageTreating Infrastructure as Garbage
Treating Infrastructure as Garbage
 
Events and metrics the Lifeblood of Webops
Events and metrics the Lifeblood of WebopsEvents and metrics the Lifeblood of Webops
Events and metrics the Lifeblood of Webops
 
Big (IT) data
Big (IT) dataBig (IT) data
Big (IT) data
 
Deep dive into Nagios analytics
Deep dive into Nagios analyticsDeep dive into Nagios analytics
Deep dive into Nagios analytics
 
Just enough web ops for web developers
Just enough web ops for web developersJust enough web ops for web developers
Just enough web ops for web developers
 
Customer Ops: DevOps &lt;3 customer support
Customer Ops: DevOps &lt;3 customer supportCustomer Ops: DevOps &lt;3 customer support
Customer Ops: DevOps &lt;3 customer support
 
I &lt;3 graphs in 20 slides
I &lt;3 graphs in 20 slidesI &lt;3 graphs in 20 slides
I &lt;3 graphs in 20 slides
 
Effective monitoring with StatsD
Effective monitoring with StatsDEffective monitoring with StatsD
Effective monitoring with StatsD
 
Alerting: more signal, less noise, less pain
Alerting: more signal, less noise, less painAlerting: more signal, less noise, less pain
Alerting: more signal, less noise, less pain
 
Fact based monitoring
Fact based monitoringFact based monitoring
Fact based monitoring
 
Fact-Based Monitoring
Fact-Based MonitoringFact-Based Monitoring
Fact-Based Monitoring
 
Monitoring NGINX (plus): key metrics and how-to
Monitoring NGINX (plus): key metrics and how-toMonitoring NGINX (plus): key metrics and how-to
Monitoring NGINX (plus): key metrics and how-to
 
What’s in this Cookbook? - Mike Fiedler
What’s in this Cookbook? - Mike FiedlerWhat’s in this Cookbook? - Mike Fiedler
What’s in this Cookbook? - Mike Fiedler
 
I Love Graphs - Alexis Lê-Quôc
I Love Graphs - Alexis Lê-QuôcI Love Graphs - Alexis Lê-Quôc
I Love Graphs - Alexis Lê-Quôc
 
Virtualization at Gilt - Rangarajan Radhakrishnan
Virtualization at Gilt - Rangarajan RadhakrishnanVirtualization at Gilt - Rangarajan Radhakrishnan
Virtualization at Gilt - Rangarajan Radhakrishnan
 

Recently uploaded

Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
UiPathCommunity
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 

Recently uploaded (20)

Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 

Monitoring Docker containers - Docker NYC Feb 2015

Editor's Notes

  1. My name is Alexis. I’m the CTO of Datadog. We monitor cloud-based infrastructures. We have been monitoring containers for a few years now (LXC, then Docker).
  2. Datadog is a monitoring service made for cloud environments, such as AWS, Azure, Google Cloud, etc. By that I mean that Datadog understands that your infrastructure can change at any time and deals with it naturally. To be able to monitor effectively, Datadog acts as an aggregator: it aggregates everything, and it speaks native CloudWatch and over 100 other sources, like databases, web servers, etc.
  3. My goals for this talk are three-fold. First, dive into key Docker metrics. Second, explain operational complexity: in other words, I want to take what we have seen in the field and show you where the pain points will be. Third, rethink monitoring of Docker containers. The old tricks won’t work.
  4. Here’s what I would like to talk about today. I will start with a very brief history of containers and Docker. This is a popular topic, so I will only focus on operational matters, including key metrics that containers expose. I will focus on the inherent complexity that comes with running fleets of containers. I will illustrate this with what we see out there, in the real world. We have a particular vantage point that gives us good insight into this.
  5. Containers, as lightweight virtual runtimes, have been around for a while, even without going all the way back to the mainframe. Depending on the operating system, they go by the name of jails, zones, or cgroups. They are like traditional VMs, without the flexibility but also without the overhead. They were initially designed for security reasons (e.g. jails) but most recently have been used to escape dependency hell.
  6. Dependency hell is the state where you end up having tens or hundreds of dependencies on shared code. Before shared libraries, we had compile-time dependencies to build static executables. Shared libraries were a good idea when the size of a library was commensurate with the amount of RAM available in a machine. Now, obviously, there is a lot less memory pressure. Still, that has remained the default way to build software. Then packages came (apt, yum, rvm, virtualenv, etc.) as a partial solution to have a group of binaries that reliably work together. That proved too slow, because you had to wait for upstream updates, so people started to bundle their code and dependencies into /opt. Then came a way to make self-contained packages (omnibus). And now we are back full circle to static binaries, having realized how much baggage we carried in shared code.
  7. When you look at it, a container is a hybrid between a process and a full-blown host. It has a Dockerfile, which is a manifest or a recipe to build the container, much like source code builds a binary and Kickstart, Chef or Puppet build a full-blown host. Then you have the actual binary representation of the container on disk, in /var/lib/docker. For a binary, it’s the .text section. For a host, it’s its filesystem. Finally, when it runs, a container has a unique ID, much like a process has a PID and a host has a hostname. So a container is an intermediary between a single binary and a full-blown host. It’s like a static binary with a fully functioning IP stack. To put it simply: if you look at it from a dev point of view, a container looks like a binary. If you think about it from an operations point of view, a container is closer to a host.
  8. Let’s recap for a minute. We know that a container is a lightweight VM. We know roughly what current deployments look like in number of containers per instance. We know how to measure the performance of a single container. How do we monitor the whole thing? Here I want to make the case that Docker introduces operational complexity.
  9. This is how the stack has evolved over the past 15 years. On the left is the stack without virtualization; off-the-shelf could be your J2EE runtime or your database. In the middle is the stack once virtualization and services like EC2 were introduced. That allowed better utilization and quasi-instant provisioning, but for an engineer few things changed. And now we run Docker containers inside EC2 instances on top of real hardware. There is a clear trend here toward a lot more moving parts than before. It also puts engineering much closer to operations.
  10. Specifically, by an order of magnitude or so, given the average of 5 containers per instance. This affects a lot of different things at run-time: provisioning (Docker); configuration (etcd, confd, Consul, etc.); orchestration (Kubernetes, Mesos); and monitoring, which is where I can contribute the most.
  11. Let’s look at monitoring an EC2 instance. I counted 10 CloudWatch metrics, about 100 metrics coming from the OS, 50 metrics coming from a container, 10-15 of which are critical to monitor, and let’s say 50 metrics for an off-the-shelf component, for instance a database. This is a conservative estimate as we see our customers use many more metrics per instance.
  12. Now let’s plug in some numbers. Assuming you have 100 instances and 5 containers per instance, you have 500 containers to manage and monitor. And remember, from a management standpoint, containers behave like hosts. Single-purpose hosts, but hosts nonetheless.
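  13. So for a given instance, you have moved from 160 metrics (10 from CloudWatch, 100 from the OS, 50 from the off-the-shelf component) to about 410 (160 plus 5 containers at 50 metrics each). Again, this assumes 5 containers per host and is conservative on the number of metrics you need to keep an eye on.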
  14. To recap: 100 instances at 410 metrics each generate 41,000 metrics, up from 16,000 (100 × 160) without containers. That’s already about 2.5x what you had before.
  15. And it gets worse. Much worse. Let’s talk about velocity. Consider the “half-life” of an EC2 instance, by which I mean the median uptime of your instances. You likely have a mix of hourly instances and long-lived instances that will go on for months. Compare this to containers. A container’s half-life can be in minutes, days at the most.
  16. On top of that, you’ll have to layer in much faster provisioning, where new versions of containers are created on a daily basis, so you rotate your container fleet between versions on a daily basis. That is much faster and much more frequent than doing an OS upgrade. And you add autonomic orchestration that goes from imperative to declarative. So you can say, “I need 1 container of this kind per instance per zone, at all times,” and the scheduler makes sure it’s always the case. If you use Mesos or Kubernetes, this is your new reality.
  17. In summary, from a management and monitoring standpoint, it means a lot more and a lot faster. More moving parts that change pretty much all the time with limited predictability.
  18. If your monitoring is still centered around hosts, this is what your world view looks like: complicated. When we talk to customers, they feel that the move to EC2 was a key factor in rethinking their monitoring, because instances come and go, and different groups within their organization spin up new stacks with little advance notice. Imagine if you throw containers in the mix. The old, host-centric monitoring practice, the one that has you track individual hosts, simply stops working altogether. It’s a bit like Ptolemaic astronomy: put the earth at the center of the universe and account for the movement of the planets. It gets pretty complicated.
  19. In other words, host-centric monitoring does not really understand containers. Either you treat them as hosts, and you have a lot of hosts that come and go every few minutes, which makes your life miserable because the host-centric monitoring system thinks half of your infrastructure is on fire. Or you don’t track containers, and you essentially have a gap. You see the OS, you see the app, and what happens in the middle, well…
  20. So in short, if you think about monitoring containers like you’ve monitored hosts before, you’re in for a painful ride very very quickly.
  21. So how do we do it properly?
  22. We need a new approach, that does not treat everything like a host. The picture here, as you’ve guessed, comes from Copernicus. He suggested a radical approach to simplifying the universe. Don’t put the earth at the center of it… Compared to putting the earth at the center of the universe, this one is striking in clarity and simplicity.
  23. So what’s the secret sauce? It’s simple: forget about hosts, think in layers and tags. What do I mean by that?
  24. Using a layered monitoring approach is pretty simple. This is where you want to be: have coverage from the bottom of the stack all the way to the top.
  25. Which means using monitoring tools that don’t leave any gap. At the bottom, CloudWatch to know about the VMs. In the middle, an infrastructure monitoring system that understands containers. And at the top, an application performance monitoring tool.
  26. So in terms of what you can see through these tools: at the bottom, raw resources of the VM like CPU, network, and I/O. In the middle, anything from the OS to Docker metrics. At the top, application throughput.
  27. The key here is to have 1 shared timeline for everything. You want to get CloudWatch metrics, OS metrics, Docker metrics and app metrics, ideally in 1 place, all on the same timeline, so that when things break you can see how changes ripple through the different layers.
  28. That’s the first part of the equation. Layers.
  29. Tags are the second half of the equation. The good news is that you use them already. How are they relevant to monitoring in general and monitoring containers in particular?
  30. Think of monitoring like an Auto Scaling group (ASG). Think of monitoring like container orchestration. Don’t think “imperative”, think “declarative”. Don’t monitor hosts X, Y and Z. Instead, monitor everything that shares a common property, for instance being located in the same AZ. Think in terms of queries and you will see that tags work beautifully, because queries operate on tags.
  31. Here’s an example: Monitor… to make sure a container does not blow up in memory.
  32. You can see the tags: name of container image: web; AWS region: us-west-2; instance type: c3.xlarge. Do you see how powerful this is?
  33. Once you have queries in place, you can express even more interesting things such as: Monitor …
  34. Ok, demo time.
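
The tag-based, declarative approach from notes 30 to 32 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Datadog's API or query language: the `Container` model, the `select` and `over_memory_limit` helpers, the `key:value` tag strings, the container ids, the memory readings, and the 512 MB limit are all hypothetical. The point is that the monitor names a set of tags rather than a list of hosts, so containers that come and go are picked up without any change to the monitor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """One running container: an id, its tags, and a memory reading (hypothetical model)."""
    container_id: str
    tags: frozenset
    memory_mb: float


def select(containers, required_tags):
    """Return every container that carries all of the required tags."""
    required = frozenset(required_tags)
    return [c for c in containers if required <= c.tags]


def over_memory_limit(containers, required_tags, limit_mb):
    """Declarative check: ids of matching containers whose memory exceeds the limit."""
    return [
        c.container_id
        for c in select(containers, required_tags)
        if c.memory_mb > limit_mb
    ]


# Hypothetical fleet; the tag values mirror the ones quoted in note 32.
west_web = frozenset({"image:web", "region:us-west-2", "instance-type:c3.xlarge"})
fleet = [
    Container("a1", west_web, 350.0),
    Container("b2", west_web, 900.0),
    Container("c3", frozenset({"image:web", "region:us-east-1", "instance-type:c3.xlarge"}), 950.0),
    Container("d4", frozenset({"image:db", "region:us-west-2", "instance-type:c3.xlarge"}), 990.0),
]

# The monitor is a query on tags, not a list of hosts or container ids.
query = {"image:web", "region:us-west-2", "instance-type:c3.xlarge"}
alerts = over_memory_limit(fleet, query, limit_mb=512.0)
print(alerts)
```

Only container `b2` matches all three tags and exceeds the limit. If a scheduler replaces `b2` with a new container carrying the same tags, the same query covers it with no edit to the monitor.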