Containers come and go rapidly, which is great for scalable or fast-evolving infrastructure. However, the short lifespan of containers makes them more challenging to monitor, leaving many with questions such as: How many containers can you run on a given Amazon EC2 instance type? Which metric should you look at to measure contention? How do you manage fleets of containers at scale? In this session, we'll present the challenges and benefits of running containers at scale, show how to use quantitative performance patterns to monitor infrastructure of this magnitude and complexity, and discuss proven strategies for monitoring your containerized infrastructure on AWS and ECS.
Learning Objectives:
- Set up the infrastructure to monitor your containers running on AWS
- Understand the metrics available and what they mean
- Define a strategy to monitor your containers
3. Amazon EC2 Container Service (ECS)
Container Management at Any Scale
Flexible Container Placement
Integration with the AWS Platform
4. Components of Amazon ECS
Task: One or more containers running together on an instance
Task Definition: Definition of containers and environment configuration
Cluster: Fleet of EC2 instances on which tasks run
Cluster Manager: Manages cluster resources and the state of tasks
Scheduler: Places tasks onto the cluster
Agent: Coordinates between the EC2 instances and the Cluster Manager
10. Configuring Logging in Task Definition
logConfiguration task definition parameter
Requires version 1.18 or greater of the Docker Remote API
Maps to docker run --log-driver option
Log drivers: json-file, syslog, journald, gelf, fluentd, awslogs
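As a rough illustration of where logConfiguration sits, the sketch below builds a task definition as a Python dictionary and prints it as JSON. The family name, container name, image, log group, and region are placeholder values, not taken from the talk; the awslogs driver and its awslogs-group and awslogs-region options are one possible choice among the drivers listed above.

```python
import json


def container_definition(name, image, log_group, region):
    """Build one container definition with an awslogs logConfiguration.

    All argument values are supplied by the caller; nothing here is
    specific to a real account or cluster.
    """
    return {
        "name": name,
        "image": image,
        "memory": 128,  # placeholder memory limit in MiB
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
            },
        },
    }


# Hypothetical task definition; the names are illustrative only.
task_definition = {
    "family": "web",
    "containerDefinitions": [
        container_definition("web", "nginx:latest", "ecs-web-logs", "us-east-1")
    ],
}

print(json.dumps(task_definition, indent=2))
```

The printed JSON could be saved to a file and registered with the usual task definition tooling; swapping logDriver for syslog or fluentd would change only the options block.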
12. Monitoring with Amazon CloudWatch
Metric data sent to CloudWatch in 1-minute periods and recorded for a period of two weeks
Available metrics: CPUReservation, MemoryReservation, CPUUtilization, MemoryUtilization
Available dimensions: ClusterName, ServiceName
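The metrics and dimensions above can be queried programmatically. The following is a minimal sketch, assuming the third-party boto3 library and AWS credentials are available; the cluster and service names are supplied by the caller, and the helper names ecs_metric_request and fetch_datapoints are made up for this example. The request is built separately from the API call so it can be inspected without touching AWS.

```python
from datetime import datetime, timedelta, timezone


def ecs_metric_request(cluster, service, metric="CPUUtilization",
                       minutes=60, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ECS",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "ServiceName", "Value": service},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # matches the 1-minute reporting period
        "Statistics": ["Average", "Maximum"],
    }


def fetch_datapoints(cluster, service, **kwargs):
    """Query CloudWatch; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request builder works without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **ecs_metric_request(cluster, service, **kwargs)
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Dropping the ServiceName dimension from the request yields the cluster-level series, which is where CPUReservation and MemoryReservation are most useful.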
15. Monitoring with Amazon CloudWatch
Use the Amazon CloudWatch Monitoring Scripts to monitor additional metrics, e.g. disk space:
# Edit crontab
$ crontab -e
# Add command to report disk space utilization to CloudWatch every five minutes
*/5 * * * * <path_to>/mon-put-instance-data.pl --disk-space-util --disk-space-used --disk-space-avail --disk-path=/ --from-cron
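To make the three disk flags concrete, here is a small sketch of how utilization, used space, and available space can be derived for a mount point using only the Python standard library. It is an approximation for illustration, not a reimplementation of the monitoring scripts; the dictionary keys and the rounding are choices made for this example.

```python
import shutil

GIB = 1024 ** 3


def disk_metrics(usage):
    """Derive utilization (%), used and available space (GiB).

    `usage` is any object with total, used and free attributes in bytes,
    such as the result of shutil.disk_usage().
    """
    return {
        "utilization_percent": round(100.0 * usage.used / usage.total, 2),
        "used_gib": round(usage.used / GIB, 2),
        "available_gib": round(usage.free / GIB, 2),
    }


if __name__ == "__main__":
    # Same path as --disk-path=/ in the cron entry above.
    print(disk_metrics(shutil.disk_usage("/")))
```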
19. Datadog Overview
• SaaS-based infrastructure and application monitoring
• Focus on modern environments: cloud, containers, microservices
• Processing nearly a trillion data points per day
• Intelligent alerting and insightful dashboards
20. Monitor Everything
Operating Systems, Cloud Providers (AWS), Containers, Web Servers, Datastores, Caches, Queues and more...
25. Pseudo-files
• Provide visibility into container metrics via the file system.
• Generally under:
/cgroup/<resource>/docker/$CONTAINER_ID/
or
/sys/fs/cgroup/<resource>/docker/$CONTAINER_ID/
26. Pseudo-files: CPU Metrics
$ cat /sys/fs/cgroup/cpuacct/docker/$CONTAINER_ID/cpuacct.stat
> user 2451 # CPU time the container's processes have spent in user mode, in USER_HZ ticks (typically 1/100 s)
> system 966 # CPU time spent in kernel mode (executing system calls), in the same units
Pseudo-files: CPU Throttling
$ cat /sys/fs/cgroup/cpu/docker/$CONTAINER_ID/cpu.stat
> nr_periods 565 # Number of enforcement intervals that have elapsed
> nr_throttled 559 # Number of times the group has been throttled
> throttled_time 12119585961 # Total time that members of the group were throttled, in nanoseconds (12.12 seconds)
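The throttling counters are easier to alert on as a ratio. The sketch below parses the cpu.stat contents shown on the slide and reports the share of enforcement periods in which the container was throttled; the function names and the embedded sample are illustrative, and on a real host the text would be read from the cpu.stat pseudo-file instead.

```python
def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup cpu.stat pseudo-file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttling_summary(stats):
    """Return the throttled share of periods and throttled time in seconds."""
    periods = stats.get("nr_periods", 0)
    throttled = stats.get("nr_throttled", 0)
    return {
        "throttled_ratio": throttled / periods if periods else 0.0,
        "throttled_seconds": stats.get("throttled_time", 0) / 1e9,  # ns -> s
    }


# Values copied from the slide above.
SAMPLE = """nr_periods 565
nr_throttled 559
throttled_time 12119585961
"""

summary = throttling_summary(parse_cpu_stat(SAMPLE))
print(f"throttled in {summary['throttled_ratio']:.1%} of periods, "
      f"{summary['throttled_seconds']:.2f} s in total")
```

With the slide's numbers the container was throttled in 559 of 565 periods, a strong sign that its CPU limit is too low for the workload.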
27. Docker API
• Detailed streaming metrics as JSON over HTTP on the Docker Unix socket
$ curl -v --unix-socket /var/run/docker.sock http://localhost/containers/28d7a95f468e/stats
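Each JSON object streamed by the stats endpoint carries cumulative counters, so percentages have to be derived from the difference between the current and previous readings. The following sketch shows one common way to do that, assuming the cpu_stats, precpu_stats, and memory_stats fields of the stats payload; the sample values are invented for the example. The first object in a stream may have an empty precpu_stats, in which case the function returns 0.0.

```python
def cpu_percent(sample):
    """Compute container CPU usage (%) from one Docker stats JSON object."""
    cpu = sample["cpu_stats"]
    pre = sample["precpu_stats"]
    cpu_delta = (cpu["cpu_usage"]["total_usage"]
                 - pre.get("cpu_usage", {}).get("total_usage", 0))
    system_delta = (cpu.get("system_cpu_usage", 0)
                    - pre.get("system_cpu_usage", 0))
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    # Newer daemons report online_cpus; older ones only list per-CPU usage.
    ncpus = (cpu.get("online_cpus")
             or len(cpu["cpu_usage"].get("percpu_usage") or [])
             or 1)
    return cpu_delta / system_delta * ncpus * 100.0


def memory_percent(sample):
    """Compute memory usage as a percentage of the container's limit."""
    mem = sample["memory_stats"]
    return mem["usage"] / mem["limit"] * 100.0


# Invented sample with only the fields used above.
SAMPLE_STATS = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 1025, "percpu_usage": [600, 425]},
        "system_cpu_usage": 11000,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 1000},
        "system_cpu_usage": 10000,
    },
    "memory_stats": {"usage": 128, "limit": 512},
}

print(cpu_percent(SAMPLE_STATS), memory_percent(SAMPLE_STATS))
```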
28. STATS Command
# Usage: docker stats CONTAINER [CONTAINER...]
$ docker stats $CONTAINER_ID
CONTAINER      CPU %   MEM USAGE/LIMIT      MEM %    NET I/O              BLOCK I/O
ecb37227ac84   0.12%   71.53 MiB/490 MiB    14.60%   900.2 MB/275.5 MB    266.8 MB/872.7 MB
30. Agents and Daemons
• Ideally we'd want to schedule an agent or daemon on each node via ECS Tasks.
• Current Solutions:
1. Bake it into your image.
2. Install on each host at provision time.
3. Automate with user data scripts and launch configurations
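The third option can be sketched as a small generator for an EC2 user data script that joins the instance to a cluster and starts one agent container per host. This is only an outline under stated assumptions: the cluster name and agent image are placeholders passed in by the caller, the container name monitoring-agent is made up, and the mounts simply expose the Docker socket and cgroup pseudo-files discussed earlier. A real agent will document its own required mounts and environment variables.

```python
def render_user_data(cluster_name, agent_image):
    """Render a bash user data script for an ECS container instance.

    cluster_name and agent_image are caller-supplied placeholders.
    """
    lines = [
        "#!/bin/bash",
        "# Register this instance with the ECS cluster.",
        f"echo ECS_CLUSTER={cluster_name} >> /etc/ecs/ecs.config",
        "# Start one monitoring agent container on this host.",
        "docker run -d --name monitoring-agent --restart=always \\",
        "  -v /var/run/docker.sock:/var/run/docker.sock:ro \\",
        "  -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \\",
        f"  {agent_image}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(render_user_data("demo-cluster", "example/monitoring-agent:latest"))
```

The rendered script would be attached to a launch configuration so that every instance the Auto Scaling group starts comes up with the agent already running.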
31. Grant Privileges via IAM
$ aws iam create-role \
    --role-name ecs-monitoring \
    --assume-role-policy-document file://trust.policy
$ aws iam put-role-policy \
    --role-name ecs-monitoring \
    --policy-name ecs-monitoring-policy \
    --policy-document file://ecs.policy
$ aws iam create-instance-profile \
    --instance-profile-name ECSNode
$ aws iam add-role-to-instance-profile \
    --instance-profile-name ECSNode \
    --role-name ecs-monitoring
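The slide does not show the contents of trust.policy and ecs.policy. The sketch below generates plausible versions of both: the trust policy lets EC2 instances assume the role, which is the standard form for an instance profile, while the list of permitted actions is an assumption chosen for a read-only monitoring agent and should be adjusted to what your agent actually needs.

```python
import json

# Lets EC2 instances assume the role (needed for an instance profile).
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Assumed read-only permissions for a monitoring agent; not from the talk.
ECS_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecs:ListClusters",
                "ecs:ListContainerInstances",
                "ecs:DescribeContainerInstances",
                "ecs:ListTasks",
                "ecs:DescribeTasks",
                "cloudwatch:ListMetrics",
                "cloudwatch:GetMetricStatistics",
            ],
            "Resource": "*",
        }
    ],
}


def write_policy(path, policy):
    """Write a policy document as JSON so the AWS CLI can load it via file://."""
    with open(path, "w") as handle:
        json.dump(policy, handle, indent=2)
```

Calling write_policy("trust.policy", TRUST_POLICY) and write_policy("ecs.policy", ECS_POLICY) produces the two files referenced by the commands above.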
39. Service Discovery
[Diagram: a Monitoring Agent container collects the containers list and metadata from the Docker API, additional metadata (tags, etc.) from ECS and CloudWatch, integration configurations from a config backend, and host-level metrics.]
40. Custom Metrics
• Instrument custom applications
• You know your key transactions best.
• Use async protocols like StatsD
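StatsD metrics are plain-text datagrams sent over UDP, so instrumenting a key transaction costs the application almost nothing and never blocks on the monitoring backend. A minimal sketch using only the standard library follows; the metric names are invented, and the host and port assume a StatsD-compatible agent listening on the conventional port 8125 on the local host.

```python
import socket


def format_metric(name, value, metric_type="c", sample_rate=None):
    """Format a StatsD datagram: <name>:<value>|<type>[|@<sample_rate>]."""
    payload = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        payload += f"|@{sample_rate}"
    return payload


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget: send one datagram without waiting for a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))


# Hypothetical metrics for a checkout transaction.
counter = format_metric("checkout.completed", 1)             # counter
timer = format_metric("checkout.latency_ms", 320, "ms")      # timer
print(counter, timer)
```

In application code, send_metric(counter) would be called right after the transaction completes; because UDP is connectionless, a missing or slow agent does not delay the request.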
42. Q&A
If you want to learn more, register for our upcoming DevDay Austin:
Monday, October 24, 2016
JW Marriott Austin
https://aws.amazon.com/events/devday-austin
Free, one-day developer event featuring tracks, labs, and workshops around Serverless, Containers, IoT, and Mobile