12. Update EC2 hosts in a ECS cluster: Use cases
➢ You have a custom AMI for your ECS cluster(s).
➢ You want to always rollout the latest ECS-optimized AMIs.
➢ You want to rotate the admin keys.
➢ Change Instance type.
➢ Use an updated user_data script.
13. Update EC2 hosts in a ECS cluster: The process
➢ Terraform emits an AWS cloudwatch event once launch
configuration was created.
➢ Detach “old instances“ from ASG and wait for capacity.
➢ “Move” services from old instances to new instances.
➢ Terminate old instances when no more tasks running.
➢ Alert on failures.
14. Terraform + AWS Events + AWS Step functions =
Awesome
I created a new
launch configuration
lc-1234 for ASG
asg-1234 belonging
to ECS cluster
cluster-A
15. AWS CloudWatch Events
time
Task A
started
bar
Task C
started
Task B
stopped
ECS Host
bla baz
custom event
custom event
custom event
26. #1 ECS agent disconnects - Initial solution
➢ Cron job on ECS hosts to notify via SNS event and restart
ECS agent.
➢ Chances of ECS agent failing again due to some inherent
problem within the instance are high.
28. #1 ECS agent disconnects - Better solution
➢ Detect ECS agent disconnects.
➢ Bootup new ECS host and wait for it to be healthy.
➢ “Move” all the existing containers from the problematic
instance to a new Instance.
➢ Terminate the problematic instance.
➢ Alert on failures.
30. #1 ECS agent disconnects: Detection
How do we detect ECS agent disconnects?
AWS Cloudwatch EVENTS to the
rescue!!!
31. #1 ECS agent disconnects: ECS Events
time
Task A
started
bar
Task C
started
Task B
stopped foo baz
ECS agent
disconnected
ECS agent
connected
ECS agent
disconnected