These slides cover my experience building Convox, an open-source AWS automation system. They walk through the overall architecture of a simple but robust app deployment, including ECS, CloudFormation and Lambda.
They also share some of the hard parts of running services on ECS in production.
The Good Parts / The Hard Parts
1. ╔══════════════════════════════════════════╗
║ The Good Parts / The Hard Parts ║
║ ║
║ Noah Zoschke ║
║ noah@convox.com ║
║ @nzoschke ║
║ ║
║ 03/01/2016 ║
╚══════════════════════════════════════════╝
CONVOX
Open Source PaaS
https://github.com/convox/rack
2. • Provision new infrastructure
• Update base operating system
• Add capacity with horizontal and vertical scaling
• Monitor health
• Handle failures automatically
• Create new apps
• Deploy new code
• Add capacity with horizontal and vertical scaling
• Configure secrets and services
• Debug problems and tune performance
• Monitor health
• Handle failures automatically
MAKE DEVOPS BORING
15. • Writing templates
• DependsOn
• Transient internal errors
• UPDATE_ROLLBACK_FAILED and DELETE_FAILED
• Migrating custom resources to native resources
• Debugging Lambda
• Sitting helpless during a Lambda outage
• Waiting for things to provision
THE HARD PARTS
CLOUDFORMATION + LAMBDA
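The UPDATE_ROLLBACK_FAILED and DELETE_FAILED states listed above are the ones where a stack stops moving on its own. A minimal sketch of how tooling might sort stack statuses into buckets follows. The status strings are CloudFormation stack status names; the bucket names and the `classify_stack_status` helper are assumptions for illustration, not part of Convox.

```python
# Statuses from the slide where the stack needs manual attention.
# The "stuck" label is a hypothetical bucket name, not an AWS term.
STUCK = {"UPDATE_ROLLBACK_FAILED", "DELETE_FAILED"}


def classify_stack_status(status: str) -> str:
    """Sort a CloudFormation stack status string into a coarse bucket."""
    if status in STUCK:
        return "stuck"
    if status.endswith("_IN_PROGRESS"):
        return "in_progress"  # keep waiting for things to provision
    if status.endswith("_COMPLETE"):
        return "settled"      # includes rollbacks that completed
    if status.endswith("_FAILED"):
        return "failed"
    return "unknown"


if __name__ == "__main__":
    for s in ("UPDATE_IN_PROGRESS", "UPDATE_ROLLBACK_FAILED", "UPDATE_COMPLETE"):
        print(s, "->", classify_stack_status(s))
```

A poller could keep waiting while the result is "in_progress" and alert a human on "stuck".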
17. THE GREAT PARTS
$ convox rack update
$ convox rack scale --type c3.xlarge --count 10
$ convox rack update <previous release>
• Update convox API quickly
• Update cluster AMIs one at a time and with zero downtime
• Resize instances one at a time and with zero downtime
• Roll out new subsystems like ECR, CloudWatch Logs and NAT Gateways
• Fail towards not modifying working infrastructure
• Roll back to previous good state if something truly unexpected happens
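One way to get "one at a time and with zero downtime" replacement of cluster instances is a CloudFormation `UpdatePolicy` with `AutoScalingRollingUpdate` on the Auto Scaling group. The sketch below builds such a fragment. Whether Convox uses exactly these settings is an assumption; the `rolling_update_policy` helper and the five-minute pause are hypothetical.

```python
import json


def rolling_update_policy(desired_count: int, pause: str = "PT5M") -> dict:
    """Build an UpdatePolicy fragment that replaces one instance at a time.

    Assumption: keeping desired_count - 1 instances in service is enough
    capacity for the cluster while a single instance is being replaced.
    """
    if desired_count < 1:
        raise ValueError("desired_count must be at least 1")
    return {
        "AutoScalingRollingUpdate": {
            "MaxBatchSize": 1,  # replace a single instance per batch
            # With desired_count == 1 this is 0, so there would be downtime.
            "MinInstancesInService": desired_count - 1,
            "PauseTime": pause,  # ISO 8601 duration between batches
        }
    }


if __name__ == "__main__":
    print(json.dumps({"UpdatePolicy": rolling_update_policy(10)}, indent=2))
```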
24. • Setting it all up: VPC, ASG, ELBs, health checks
• Managing instances
• Understanding its distributed state machine
• Rolling deploys
• Container scheduling and re-scheduling
• Capacity problems
• Collecting and making sense of logs and events
THE HARD PARTS
ECS
25. • CloudFormation updates
• ECS Task Definition and Service updates
• On-instance observations
  • ecs-agent
  • dockerd
  • convox/agent
• App failures
  • crashes
  • port unresponsive
• Instance failures
  • filesystem lockups
  • kernel panics
• General EC2 / ASG health
THE HARD PARTS
COMPLEX INTERACTIONS AND FEEDBACK LOOPS
[Diagram: an ECS cluster on an ASG of three instances, each running ecs-agent and dockerd, with an api ELB and a rails ELB. Containers shown: api (128 MB), registry (256 MB), rails web.1 to web.3 (1024 MB each), data worker.1 and worker.2 (512 MB each), rails worker.1 to worker.4 (256 MB each).]
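The container sizes in the diagram make the capacity problem concrete: a container only fits if a single instance has enough free memory, no matter how much is free across the cluster. The sketch below places the diagram's containers with a first-fit-decreasing pass. This is an illustration only, not ECS's actual scheduler, and the 2048 MB per-instance figure in the usage example is an assumed number.

```python
from typing import Dict, List, Optional, Tuple

# Container names and sizes as shown in the diagram.
CONTAINERS = [
    ("api", 128), ("registry", 256),
    ("rails web.1", 1024), ("rails web.2", 1024), ("rails web.3", 1024),
    ("data worker.1", 512), ("data worker.2", 512),
    ("rails worker.1", 256), ("rails worker.2", 256),
    ("rails worker.3", 256), ("rails worker.4", 256),
]


def first_fit(free_mb: List[int], need_mb: int) -> Optional[int]:
    """Return the index of the first instance with enough free memory."""
    for index, free in enumerate(free_mb):
        if free >= need_mb:
            return index
    return None


def schedule(
    containers: List[Tuple[str, int]], instance_count: int, instance_mb: int
) -> Tuple[Dict[str, int], List[str], List[int]]:
    """Place the largest containers first; return placements, leftovers, free memory."""
    free_mb = [instance_mb] * instance_count
    placed: Dict[str, int] = {}
    unplaced: List[str] = []
    for name, need in sorted(containers, key=lambda c: -c[1]):
        index = first_fit(free_mb, need)
        if index is None:
            unplaced.append(name)
        else:
            free_mb[index] -= need
            placed[name] = index
    return placed, unplaced, free_mb


if __name__ == "__main__":
    # Hypothetical cluster: three instances with 2048 MB usable each.
    placed, unplaced, free_mb = schedule(CONTAINERS, 3, 2048)
    print("unplaced:", unplaced, "free per instance:", free_mb)
```

Under these assumptions everything fits, but scaling rails web to a fourth 1024 MB container leaves smaller containers with nowhere to go until the cluster grows.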
27. THE GREAT PARTS
$ convox deploy
• Configure desired container formation with one API call
• Watch extremely sophisticated automation execute it
• Assure new containers start and are healthy
• Drain old containers
• Trust automation will try its hardest to keep it running
• Re-schedule on observed failures
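The start-new-then-drain-old behaviour above is bounded by an ECS service's deployment configuration, which has `minimumHealthyPercent` and `maximumPercent` settings. The sketch below computes the task-count bounds they imply, assuming the lower bound rounds up and the upper bound rounds down. The `deployment_bounds` helper and its default values are assumptions for illustration.

```python
from typing import Tuple


def deployment_bounds(
    desired_count: int,
    minimum_healthy_percent: int = 50,   # assumed value, not read from a service
    maximum_percent: int = 200,          # assumed value, not read from a service
) -> Tuple[int, int]:
    """Return (min_running, max_running) task counts during a rolling deploy."""
    # Integer arithmetic avoids float rounding surprises.
    min_running = -(-desired_count * minimum_healthy_percent // 100)  # ceiling
    max_running = desired_count * maximum_percent // 100              # floor
    return min_running, max_running


if __name__ == "__main__":
    low, high = deployment_bounds(3)
    print(f"3 desired tasks: keep at least {low} running, allow at most {high}")
```

With three desired tasks and these percentages, the scheduler may start up to three new containers before draining old ones, while never dropping below two running.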
28. • Provision new infrastructure
• Update base operating system
• Add capacity with horizontal and vertical scaling
• Monitor health
• Handle failures automatically
• Create new apps
• Deploy new code
• Add capacity with horizontal and vertical scaling
• Configure secrets and services
• Debug problems and tune performance
• Monitor health
• Handle failures automatically
CONVOX
MAKE DEVOPS BORING