This session gives attendees an overview of Amazon EC2 Container Service (Amazon ECS) and the benefits of running a managed container cluster on AWS, viewed from a customer perspective.
4. The Problem
Different application stacks
Different hardware deployment environments
How to run all applications across different environments?
How to easily migrate from one environment to another?
[Diagram: a matrix of application stacks (static website, web front end, background workers, user DB, analytics DB, queue) against deployment environments (development VM, QA server, single prod server, on-site cluster, public cloud, contributor's laptop, customer servers)]
6. Containers
User space running on OS kernel
Little overhead
Guest OS choices limited to host OS kernel
Been around for a while: chroot, FreeBSD jails, Solaris containers, OpenVZ, LXC
9. Benefits
Portable runtime application environment
Package application and dependencies in a single artifact
Run different application versions (different dependencies) simultaneously
Faster development & deployment cycles
Better resource utilization
10. Use Cases
Consistent environment between development & production
Service-oriented architectures / microservices
Short-lived workflows
Isolated environments for testing
11. Services Evolve to Microservices
[Diagram: a monolithic application (Order, User, and Shipping UIs and services over a shared data access layer) evolves into Services A-D spread redundantly across Hosts 1-4]
12. Containers Are Natural for Microservices
Simple to model
Any app, any language
Image is the version
Test & deploy same artifact
Stateless servers decrease change risk
19. Scheduling a Cluster Is Hard
[Diagram: a large fleet of servers, each running its own guest OS, illustrating the scale of the scheduling problem]
20. General Cluster Management: Resource Management
[Diagram: EC2 instances across AZ 1 and AZ 2, each running Docker with tasks and containers]
21. General Cluster Management: Scheduling
[Diagram: the same cluster; the scheduler places tasks onto instances across AZ 1 and AZ 2]
31. Designed for Use with Other AWS Services
Elastic Load Balancing
Amazon Elastic Block Store
Amazon Virtual Private Cloud
Amazon CloudWatch
AWS Identity and Access Management
AWS CloudTrail
41. Create Service
Load balance traffic across containers
Automatically recover unhealthy containers
Discover services
[Diagram: Elastic Load Balancing distributing traffic to containers and shared data volumes on each instance]
42. Scale Service
Scale up
Scale down
[Diagram: Elastic Load Balancing in front of the cluster as container groups are added and removed]
43. Update Service
Deploy new version
Drain connections
[Diagram: new containers launch alongside the old ones behind Elastic Load Balancing]
44. Update Service (cont.)
Deploy new version
Drain connections
[Diagram: traffic shifts to the new containers while connections drain from the old ones]
45. Update Service (cont.)
Deploy new version
Drain connections
[Diagram: only the new containers remain behind Elastic Load Balancing]
48. • California Institute of Technology
• Pasadena, CA
• Top-tier university: #1 in Times Higher Education world rankings
• Small: 6,400 people (1,000 undergrads, 1,200 grads, 300 faculty, 3,900 staff)
• 3:1 undergrad-faculty ratio
• JPL: founded by Caltech in the 1930s, managed for NASA since 1958
49. Academic Development Services
• Part of IMSS, the central IT org
• Lean: 6 people, all developers, even management
• 35 years of collective systems administration experience
• 50 years of collective development experience
• ~130 websites and web applications, including www.caltech.edu and the campus intranet portal
• Much smaller than counterparts at peer institutions
Our job: enable research and instruction through software
50. Cloud Adoption (2010-present)
Upper management, operations, and developers are all pro-cloud
Move all on-premises services to the cloud within 3 years
We've been in production in AWS since 2010
Many Caltech production workloads currently run in AWS
Strategy: DevOps, public data, low-hanging fruit, Field of Dreams model
51. Why cloud?
Leverage AWS scale, expertise, and capabilities
• AZs, APIs, infrastructure as code
• AWS is better than us at many things
• AWS allows us to do things we can't on-premises
• We don't have to run low-level services
Allows us to concentrate on how we add value
53. access.caltech: Caltech's intranet portal
• Distributed system composed of many interconnected systems
• Authenticating proxy server with around 90 applications behind it
• Covers most of the academic and administrative apps people might use
Two parts: core system and proxied apps
56. access.caltech: key requirements
• Needs to be highly available
• Be performant at variable loads
• Typical traffic: 5-10 hits/s
• Must scale to 800 hits/s during registration
• Protect and secure proxied apps and data
• Certain core components should stay up during a disaster
• Be able to easily deploy new versions of core software
• Need many DEV, TEST, QA, and production support environments
59. access.caltech in AWS: phase 1
• Move access.caltech core PROD to a VPC in AWS
• Continuous deployment system based on Jenkins, Docker containers, and Consul
• Be able to build DEV and TEST environments in AWS
• Proxy from AWS to on-premises apps via a VPN tunnel
Later phases: move proxied apps individually to AWS
61. Why Docker?
Need a more rapid, consistent deployment mechanism
• Our current process takes weeks to months to get new versions to production, and deployments are rocky
• Raw vs. cooked. Cooked: build as much as possible before deployment.
• Encapsulation of the entire OS as a software artifact
• Guaranteed same code and OS build for DEV, TEST, and PROD
• Easily replicate whole system architectures in DEV
• Docker image community
62. Deployment pipeline (Jenkins)
QA pipeline: developer pushes code → build test image → run tests → build and push final image → deploy to QA infrastructure → run integration tests → human review
Promote-to-prod pipeline: deploy to prod infrastructure → run integration tests → deploy to prod support infrastructure
63. Why ECS? (vs. Docker Swarm): PROS
No orchestration infrastructure to run
• Container scheduling and placement are implicitly at cloud scale; no need to plan for HA, throughput, etc.
• Built-in monitoring via CloudWatch and the ECS event stream
• Powerful ECS command line tools
AWS API for managing tasks and services
AWS service integration, especially for load balancers and VPCs
ECS repositories
64. Why ECS? (vs. Docker Swarm): CONS
Painful to debug container launch failures
Docker version lags behind current, sometimes significantly
No equivalent to Swarm's overlay network
Different strategies for deploying containers
• Swarm has spread, binpack, and random
• ECS has task and service strategies, which both seem similar to Swarm's "spread" strategy
• ECS does let you develop your own strategies via custom schedulers and the StartTask API
65. Docker/ECS Challenges
The entire container is your software
• Not just your own code
• OS + code becomes a software artifact
The development team will need to have or develop systems experience
• Or work closely with systems people
You'll probably need to remediate your code in order to take advantage of the container environment
66. Docker/ECS Challenges, cont.
Containers are truly disposable and anonymous
• Figuring out which container is having issues is interesting
• The entire OS is destroyed when containers are redeployed
Containers are not VMs
• No SSH interface to containers
• Containers are minimal systems: no ssh, no cron, no syslogd, etc.
Need to change your architecture and practices
Logging, monitoring
Hypervisor virtualization usually means running independent VMs on an intermediate abstraction layer, either on top of an OS or directly on hardware.
Containers differ in that the user space (the memory dedicated to running applications, etc.) runs on top of the OS kernel.
They're generally considered lighter-weight than hypervisor virtualization, since there's a lot less overhead, but they're limited by the underlying kernel. E.g., an Amazon Linux kernel could run Amazon Linux, Ubuntu, and other Linux containers, but not Windows.
In Docker, these containers were originally based on LXC, but are now based on a native “libcontainer” library.
Docker also makes use of cgroups & kernel namespaces to provide isolation of processes, resources, network & filesystems.
Each container has its own process environment, virtual network interface, and root filesystem. The filesystem is copy-on-write, meaning it's layered, very fast, and doesn't require much disk space.
Stdin, stdout, and stderr are collected, logged, and made available to you.
You can also create a pseudo-TTY and log in to an interactive shell on the container.
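For example, a couple of standard Docker CLI invocations that illustrate this (the image names here are just examples):

    # Allocate a pseudo-TTY and open an interactive shell in a container
    docker run -it --rm ubuntu /bin/bash

    # In detached mode, stdout/stderr are collected and can be read back later
    docker run -d --name web nginx
    docker logs web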
You may be thinking: this sounds interesting, but why would I want to use containers? There are four key benefits to using containers.
1.) The first is that containers are portable.
The image is consistent and immutable: no matter where I run it, or when I start it, it's the same.
This makes the dev lifecycle simpler: an image works the same on the developer's desktop and in prod, whether I start it today or scale my environment tomorrow, so there are no surprises.
The entire application is self-contained. The image is the version, which makes deployments and scaling easier because the image includes the dependencies.
Images are small, usually tens of MB, and very sharable.
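A minimal sketch of that portability story, assuming a Dockerfile in the current directory (the image name and port are hypothetical):

    # Build once; the image bundles the app and every dependency
    docker build -t myapp:1.0 .

    # The same image runs identically on a laptop, in test, or in prod
    docker run -d -p 8080:8080 myapp:1.0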
2.) Containers are flexible.
You can create clean, reproducible, and modular environments.
Whereas in the past multiple processes would share the same OS (e.g., Ruby, caching, log pushing), containers make it easy to decompose an app into smaller chunks, like microservices, reducing complexity and letting teams move faster while still running the processes on the same host (e.g., no library conflicts).
This streamlines both code deployment and infrastructure management.
3.) Simply stating that Docker images start fast sells the technology short: the speed shows up both in performance characteristics and in application lifecycle and deployment benefits.
So yes, containers start quickly because the operating system is already running, but
every container can also be a single-threaded dev stream, with fewer interdependencies.
There are ops benefits too. Example: IT updates the base image, and I just do a new docker build. I can focus on my app, meaning it's faster for me to build and release.
4.) Finally, containers are efficient. You can allocate exactly the resources you want: specific CPU, RAM, disk, and network.
Since containers share the same OS kernel and libraries, they use fewer resources than running the same processes on different virtual machines (a different way to get isolation).
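For instance, the Docker CLI lets you cap a container's resources directly (the limits and image name here are illustrative):

    # Cap this container at half a CPU and 256 MB of RAM
    docker run -d --cpus 0.5 --memory 256m myapp:1.0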
It makes it easier to build and deploy things more rapidly, because the environment is the same: build the container in dev, push to test, release to prod. This can also be useful for customers running hybrid environments.
It makes it easier to keep consistent environments for SOAs or microservices. Also, many of these services aren't very resource intensive, so you can place several of them together on one instance.
Sometimes customers have short-lived workflows that need to set up environments (e.g., queue systems, CI jobs, etc.), which don't always map well to EC2's per-hour billing model. Docker can be one workaround, allowing them to push and pop containers onto the instance.
Containers can also be useful for isolated execution when testing user code, e.g., the Go Playground or similar.
So I want to tell you a story about Amazon.com and the evolution of its architecture.
Over 10 years ago, Amazon had a large monolithic application running its website. Everything from its UI, ordering systems, recommendations engine, and shopping cart was one big application with one large code base. The problem with that was that there were a lot of code interdependencies to resolve. Another problem Amazon experienced was that it was hard to scale the website. If one service was memory intensive and another CPU intensive, the servers had to be provisioned with enough memory and CPU to handle that baseline load. So if the CPU-intensive service received a heavy load, you had to provision a large machine and leave a lot of resources underutilized.
In order to scale better, Amazon decomposed its architecture into individual services that could be deployed separately. This allowed it to scale each service independently. It was able to have smaller teams that worked on each of the services and controlled that service's codebase. This allowed the website to evolve faster, because new updates could be delivered independently of other teams. This architecture is what is now known as microservices.
Containers and Docker are natural for this pattern of microservices.
They make services simple to model: the application and all its dependencies are packaged into an image using a Dockerfile.
They support any app, any language.
The image is a versioned artifact that can be stored in a repository, just like your source code.
This makes applications easy to test and deploy, because they are the same artifacts.
Containers also simplify deployment: stateless servers are natural with Docker, and each deployment is a new set of containers.
This decreases the risk of change, since rollback is simple.
All of this makes it easy to decompose applications into microservices. Every microservice is self-contained, allowing you to reduce dependency conflicts and decouple deployments.
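A hedged sketch of that image-as-version workflow (the registry and image names are hypothetical):

    # Build and push a versioned image; the tag is the release artifact
    docker build -t registry.example.com/orders-service:1.4.2 .
    docker push registry.example.com/orders-service:1.4.2

    # Rolling back is just running the previous tag
    docker run -d registry.example.com/orders-service:1.4.1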
Docker is a platform designed to help automate the deployment of application containers.
It was built by Docker, Inc., previously known as dotCloud, a PaaS platform.
So let's talk about scheduling.
The Docker CLI is great if you want to run a container on your laptop, for example "docker run myimage".
But it's challenging to scale to hundreds of containers. Now you're suddenly managing a cluster, and cluster management is hard.
You need a way to intelligently place your containers on the instances that have the resources, and that means you need to know the state of everything in your system. For example:
Which instances have available resources, like memory and ports?
How do I know if a container dies?
How do I hook into other resources like ELB?
Can I extend whatever system I use, e.g., a CD pipeline, third-party schedulers, etc.?
Do I need to operate another piece of software?
These are the questions and challenges our customers had, which led us to build Amazon ECS.
The resource manager is responsible for keeping track of resources like memory, CPU, and storage, and their availability at any given time in the cluster.
Next, the scheduler is responsible for scheduling containers, or tasks, for execution.
The scheduler contains algorithms for assigning tasks to nodes in the cluster based on the resources required to execute the task.
To properly schedule, you need to:
Know your constraints, like memory and CPU
Find resources in your cluster that meet the constraints
Request a resource
Confirm the resource
The scheduler is also responsible for the task execution lifecycle:
Is the task alive or dead, and should it be rescheduled?
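With ECS, that resource accounting is exposed through the API. A sketch with the AWS CLI (the cluster name is hypothetical; describe-container-instances reports registered and remaining CPU, memory, and ports per instance):

    # List the instances registered to a cluster
    aws ecs list-container-instances --cluster demo-cluster

    # Inspect registered vs. remaining resources on one of them
    aws ecs describe-container-instances \
        --cluster demo-cluster \
        --container-instances <container-instance-arn>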
ECS provides a simple solution to cluster management:
We have a cluster management engine that coordinates the cluster of instances, which is just a pool of CPU, memory, storage, and networking resources.
The instances are just EC2 instances running our agent that have been checked into a cluster. You own them and can SSH into them if you want.
It's dynamically scalable: possible to have a 1-instance cluster, and then a 100- or even 1,000-instance cluster.
You can segment clusters for particular purposes, e.g., dev/test.
On each instance, the ECS agent communicates with the engine, processes ECS commands, and turns them into Docker commands.
It instructs the EC2 instance to start and stop containers and monitors the used and available resources.
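As a sketch of how an instance checks into a cluster (the cluster name is hypothetical), the agent reads its cluster assignment from a config file on the instance:

    # /etc/ecs/ecs.config on the EC2 instance
    ECS_CLUSTER=demo-cluster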
It's all open source on GitHub, and we develop in the open, so we'd love to see you involved through pull requests.
To coordinate this cluster, we need a single source of truth for all the instances in the cluster, the tasks running on the instances, the containers that make up the tasks, and the resources available. This is known as cluster state.
So at the heart of ECS is a key/value store that holds all of this cluster state.
To be robust and scalable, this key/value store needs to be distributed for durability and availability.
But because the key/value store is distributed, keeping the data consistent and handling concurrent changes becomes more difficult.
For example, if two developers request all the remaining memory on a certain EC2 instance for their containers, only one container can actually receive those resources, and the other has to be told its request could not be completed.
As such, some form of concurrency control has to be in place to make sure that multiple state changes don't conflict.
But what is unique about ECS is that we decouple container scheduling from cluster management.
We have opened up the Amazon ECS cluster manager through a set of API actions that let customers access all the cluster state information stored in our key/value store.
This set of API actions forms the basis of solutions that customers can build on top of Amazon ECS, such as connecting your CI/CD system or schedulers.
The API allows you to connect different schedulers to ECS.
A scheduler just provides logic around how, when, and where to start and stop containers.
Amazon ECS's architecture is designed to share the state of the cluster and allow customers to run as many varieties of schedulers (e.g., bin packing, spread, etc.) as needed for their applications.
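A custom scheduler built this way reads cluster state through the API, picks an instance with its own logic, and then places the task there. A hedged sketch with the AWS CLI (names are hypothetical):

    # Place a task on the specific instance your scheduler chose
    aws ecs start-task \
        --cluster demo-cluster \
        --task-definition web:1 \
        --container-instances <chosen-container-instance-arn>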
The reason we developed ECS was that customers had been running containers and Docker on EC2 for quite some time.
What customers told us about was the difficulty of running these containers at scale, which generally involved installing and managing cluster management software.
Eliminates cluster management software
Manages cluster state
Manages containers
Control and monitoring
Scale from one to tens of thousands of containers
Earlier this year we ran a load test.
Over a 3-day period we scaled our cluster from 200 to over 1,000 instances, as represented by the purple line.
The green and red lines show the p99 and p50 latencies.
As you can see, they are relatively flat, demonstrating that ECS is stable and will scale regardless of your cluster size.
So Amazon ECS has two built-in schedulers to help find the optimal instance placement based on your resource needs, isolation policies, and availability requirements:
A scheduler for long-running applications and services
A scheduler for short-running tasks like batch jobs
As discussed before, because ECS provides a powerful set of APIs, it also allows you to integrate your own custom scheduler as well as open source schedulers.
All of these give you very flexible ways to do scheduling on ECS.
Amazon ECS is built to work with the AWS services you value. You can set up each cluster in its own Virtual Private Cloud and use security groups to control network access to your EC2 instances. You can store persistent information using EBS, and you can route traffic to containers using ELB. CloudTrail integration captures every API access for security analysis, resource change tracking, and compliance auditing.
As discussed before, ECS has a simple set of APIs that make it very easy to integrate and extend.
You can use your own container scheduler or connect ECS into your existing software delivery process (e.g., continuous integration and delivery systems).
Our container agent and CLI are open source and available on GitHub. We look forward to your input and pull requests.
Summing up, ECS reduces the amount of code you need to go from idea to implementation when building distributed systems.
So, rather than having Mesos or other cluster management software manage a set of machines directly, ECS manages your instances.
Much of the undifferentiated heavy lifting and housekeeping has been abstracted behind a set of APIs.
The ability to run multiple tasks on a shared pool of resources can also lead to higher utilization and faster task completion than if compute resources were statically partitioned.
You can model your app using a file called a task definition.
This file defines the containers you want to run together.
A task definition also lets you specify Docker concepts like links, to establish network channels between the containers, and the volumes your containers need.
Task definitions are tracked by name and revision, just like source code.
To create a task definition, you can use the console to specify the Docker image to use for the containers.
You can specify resources like CPU and memory, ports, and volumes for each container.
You can specify what command to run when the container starts.
And the essential flag specifies whether the task should fail if the container stops running.
You can also type everything as JSON if you want.
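A minimal sketch of such a task definition, registered with the AWS CLI (the family, image, and resource values are hypothetical):

    cat > taskdef.json <<'EOF'
    {
      "family": "web",
      "containerDefinitions": [
        {
          "name": "nginx",
          "image": "nginx:latest",
          "cpu": 256,
          "memory": 128,
          "portMappings": [{"containerPort": 80, "hostPort": 80}],
          "essential": true
        }
      ]
    }
    EOF

    # Each registration of the same family creates a new revision, e.g. web:1
    aws ecs register-task-definition --cli-input-json file://taskdef.json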
Once your task definition is created, scheduling it onto an instance with available resources creates a task.
A task is an instantiation of a task definition.
You can have a task with just one container, or up to ten that work together on a single machine: maybe nginx in front of Rails, or Redis behind Rails.
You can run as many tasks on an instance as will fit.
People often wonder about cross-host links: those don't go in your task definition. Instead, put the pieces behind an ELB or a discovery system, and make multiple tasks.
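Starting tasks directly looks like this with the AWS CLI (the cluster name and count are illustrative):

    # Run two copies of revision 1 of the "web" task definition
    aws ecs run-task --cluster demo-cluster --task-definition web:1 --count 2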
ECS has a scheduler that is good for long-running applications, called the service scheduler.
You reference a task definition and the number of tasks you want to run, and you can optionally place the service behind an ELB.
The scheduler will then launch the number of tasks that you requested.
The scheduler will maintain the number of tasks you want running and keep traffic load balanced across them automatically.
Scaling up and down is simple: you just tell the scheduler how many tasks you need, and it will automatically launch more tasks or terminate tasks.
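A sketch of creating and then scaling a service (names and counts are hypothetical):

    # Create a service that keeps 3 copies of web:1 running
    aws ecs create-service \
        --cluster demo-cluster \
        --service-name web \
        --task-definition web:1 \
        --desired-count 3

    # Scaling is just a change to the desired count
    aws ecs update-service --cluster demo-cluster --service web --desired-count 10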
Updating a service is easy.
You register the new version and update the service, and the scheduler will launch tasks with the new application version.
It will drain connections from the old containers and then remove them.
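A minimal sketch of that rolling update (the revision numbers are hypothetical):

    # Point the service at the new revision; the scheduler starts new tasks,
    # drains connections from the old containers, and removes them
    aws ecs update-service --cluster demo-cluster --service web --task-definition web:2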