Automating Docker Enterprise installations and upgrades can provide efficiency, reproducibility, and reliability. The document discusses two approaches: provisioning infrastructure with tools like Packer, Terraform, and Ansible; and using a configuration management system like SaltStack for continuous upgrades. It also covers automating the installation, upgrade, and replacement of key Docker components like the Docker Engine, Universal Control Plane (UCP), and Docker Trusted Registry (DTR). Automating these processes can help scale operations while reducing risks from manual processes.
7. Alm. Brand
It all started back in 1792, … and then, in 2015, we started setting up a Docker cluster:
• Handcrafted VMs with no visibility into versions
• Dependency on specific colleagues
• Rudimentary automation with Puppet
• More handicraft to keep automation tools and scripts from, e.g., upgrading all nodes at once
8. Docker
Timeline, 2014 to 2016:
• Hub built
• Moved to AWS
• Individual engines in ASGs
• Deployed Engine using Ansible
• Developers deployed code using Docker Engine TLS certificate + their own tools
• No automatic replacement, HA only
14. Alm. Brand
Step 1: Change Engine/UCP/DTR version in GitLab
Step 2: Wait for new VM template built with Packer*
Step 3: Switch VM template for one node at a time
Step 4: Run Terraform to destroy/create node
Step 5: Run Ansible playbooks for each node
    1) Wait for business workloads to leave node
    2) Engine version determined by VM template
    3) Install or join UCP and DTR as needed
    4) Configure LDAP, teams, grants, collections
    5) Wait for UCP to reconcile
Step 6: Install base stack of Swarm services
*Only if Engine or other OS packages changed
15. Docker
Each AutoScaling Group:
Step 1: Change Engine version in Salt
Step 2: Salt cron on node, every 5 min
Step 3: If correct Engine version not installed…
Step 4: Get a lock in Consul for that ASG
Step 5: Run engine install script
    1) Install Docker Engine
    2) Start Docker
    3) Wait for Engine and UCP to respond
    4) Sleep 30 to reduce task churn
Step 6: Release lock
Diagram: nodes take turns through Get Lock, Install Script, Release Lock; the other nodes are in Lock Wait.
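The lock-guarded, in-place Engine upgrade on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenters' script: `InMemoryLock` is a hypothetical stand-in for the per-ASG Consul lock, the lock key format is invented, and `install` and `wait_ready` are placeholder callables for the install script steps.

```python
import time
from contextlib import contextmanager


class InMemoryLock:
    """Hypothetical stand-in for a per-ASG Consul lock; not a Consul client."""

    def __init__(self):
        self.held = set()

    @contextmanager
    def hold(self, key):
        # A real distributed lock would block and retry here (the "Lock Wait"
        # phase); this stand-in simply refuses a second holder.
        if key in self.held:
            raise RuntimeError(f"lock {key!r} is already held")
        self.held.add(key)
        try:
            yield
        finally:
            self.held.discard(key)  # always release, even if the install fails


def converge_engine(node, desired, lock, install, wait_ready,
                    settle_seconds=30, sleep=time.sleep):
    """One cron tick on one node: upgrade only if the version differs."""
    if node["engine_version"] == desired:
        return "noop"
    # The key format below is an assumption made for this sketch.
    with lock.hold(f"engine-upgrade/{node['asg']}"):
        install(node, desired)   # install and start Docker Engine
        wait_ready(node)         # wait for Engine and UCP to respond
        sleep(settle_seconds)    # pause before releasing, to reduce task churn
    return "upgraded"
```

Because only one node per ASG can hold the lock, at most one node in each group is upgrading at any time, which is the property the slide relies on.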
16. Comparison - Engine Upgrade
Replacement:
• Container-like (cruft is removed when replaced)
• Atomic
• Migration of running services
• “Slow”
In-place:
• Fast
• One operator step
• Timing complexity
• Risk
20. Comparison - Automation
Centralized / Triggered:
• Full overview of cluster
• Failures stop the pipeline
• Ability to re-run failed tasks
• Single pipeline approach
• Started by a human/schedule
• Non-reactive (but can be)
Decentralized / Continuous:
• Automatic replacement
• Drift correction
• Complexity of order
21. Alm. Brand - UCP Install/Replace
Init:
• Engineer starts pipeline in GitLab
• Terraform creates VMs
• Ansible inventory generated by Terraform
• Terraform launches Ansible playbooks, waits for completion
Create Swarm:
• Check nodes from inventory for existing Swarm managers
• If none found, docker swarm init
Install UCP:
• If no UCP containers (first node): docker run docker/ucp install
• Wait for ucp-reconcile container to complete
• For other UCP manager nodes, run docker swarm join
• Again, wait for ucp-reconcile container to complete (on each manager)
• Configure LDAP
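The manager decision logic on this slide (init if no Swarm exists, install UCP on the first node, join the rest) might be planned along these lines. This is a sketch under assumptions: the `in_swarm` and `has_ucp` inventory fields and the function name are hypothetical, and the action strings are the labels used on the slide, not complete command lines.

```python
def plan_manager_actions(managers):
    """Return an ordered list of (node name, action) for the manager nodes.

    `managers` is a list of dicts with the hypothetical keys
    'name', 'in_swarm' and 'has_ucp', standing in for inventory facts.
    """
    if not managers:
        return []
    plan = []
    # Prefer a node that is already a Swarm member as the "first node".
    first = next((m for m in managers if m["in_swarm"]), None)
    if first is None:
        first = managers[0]
        plan.append((first["name"], "docker swarm init"))
    # Install UCP only when no manager runs UCP containers yet.
    if not any(m["has_ucp"] for m in managers):
        plan.append((first["name"], "docker run docker/ucp install"))
        plan.append((first["name"], "wait for ucp-reconcile"))
    # Every remaining manager that is not in the Swarm joins it.
    for m in managers:
        if m is first or m["in_swarm"]:
            continue
        plan.append((m["name"], "docker swarm join"))
        plan.append((m["name"], "wait for ucp-reconcile"))
    return plan
```

Keeping planning separate from execution makes the same logic usable for a first install and for replacing a single manager, since a replacement run only yields the join steps for the new node.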
22. Alm. Brand - UCP Workers
Join:
• Run docker swarm join
• Wait for ucp-reconcile container to complete
Organize:
In Ansible, based on inventory metadata, call UCP API to:
• Create Collections
• Create Teams
• Create Grants
Label:
• Add Swarm node labels, including assigning to a Collection, which usually corresponds to a deployment stage
23. Alm. Brand - UCP Upgrade
Prep:
• Change UCP version in GitLab
• If upgrading Docker Engine, change its version too
    Produces new VM template
    Pre-pulls UCP/DTR images
Plan:
• Engineer starts pipeline and inspects Terraform plan
Run:
• If no unexpected actions in plan, engineer continues pipeline to run upgrade from Ansible:
    docker run docker/ucp upgrade
24. Docker - UCP Install/Replace (Managers)
Launch:
• Autoscaling group launches new or replacement Manager node
• Salt configures Engine
Create Swarm:
• Consul lock infra/swarm/manager
• If no response from manager ELB:
    docker swarm init
    encrypt swarm tokens
    put tokens in Consul k/v store
• Otherwise:
    get manager token and decrypt
    docker swarm join
• Wait for swarm status “Active”
Install UCP:
• Consul lock infra/swarm/ucp
• If no UCP containers (first node):
    Wait for X swarm managers
    docker run docker/ucp install
• UCP scheduled on every node by Swarm
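The init-or-join branch that runs under the manager lock could look roughly like the sketch below. Everything named here is an assumption for illustration: the key `TOKEN_KEY`, the dict standing in for the Consul k/v store, and the callables for the ELB check and the docker commands. The base64 helpers are placeholders only and are NOT encryption.

```python
import base64

TOKEN_KEY = "infra/swarm/manager-token"  # hypothetical k/v key name


def fake_encrypt(text):
    """Placeholder only: base64 is an encoding, not encryption."""
    return base64.b64encode(text.encode()).decode()


def fake_decrypt(blob):
    """Inverse of fake_encrypt, for the sketch only."""
    return base64.b64decode(blob.encode()).decode()


def bootstrap_manager(kv, elb_responds, swarm_init, swarm_join,
                      encrypt=fake_encrypt, decrypt=fake_decrypt):
    """Run while holding the manager lock: create the Swarm or join it."""
    if not elb_responds():
        token = swarm_init()            # first manager: create the Swarm
        kv[TOKEN_KEY] = encrypt(token)  # store only the protected token
        return "init"
    token = decrypt(kv[TOKEN_KEY])      # later managers: fetch and decode
    swarm_join(token)
    return "join"
```

Holding the lock around this function matters: without it, two managers launched at the same moment could both see no response from the ELB and each initialize a separate Swarm.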
25. UCP Upgrade
Prep:
• Change UCP version in Salt
Pull:
• Salt pulls UCP images
    Necessary because we run private pre-release images
    Not necessary for customers
• Wait until every node has all UCP images
Run:
• Engineer runs “docker run docker/ucp:$version upgrade”
• Monitor status - “docker service inspect ucp-agent”
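The "wait until every node has all UCP images" gate before the upgrade can be sketched as a simple polling check. This is an illustration under assumptions: `list_images` is a hypothetical callable that returns a mapping of node name to the image names present on that node, and the attempt count and delay are arbitrary.

```python
import time


def missing_images(required, images_by_node):
    """Map each node to the sorted list of required images it still lacks."""
    required = set(required)
    gaps = {}
    for node, present in images_by_node.items():
        missing = required - set(present)
        if missing:
            gaps[node] = sorted(missing)
    return gaps


def wait_for_images(required, list_images, attempts=10, delay=30,
                    sleep=time.sleep):
    """Poll until no node is missing a required image, or give up."""
    for _ in range(attempts):
        if not missing_images(required, list_images()):
            return True
        sleep(delay)
    return False
```

Returning the per-node gaps rather than a bare boolean makes it easier to report which node is holding up the upgrade.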
29. Alm. Brand - DTR install
Install:
• If no DTR containers found (first time):
    Run docker/dtr install
• Otherwise:
    Run docker/dtr join
Configure:
• Install CVE file and enable security scanning (this is also done nightly)
Populate:
• Load and push platform-enabling images
    GitLab Runner
    Registrator
    Consul Agent
    ...
30. Alm. Brand - DTR Upgrade/Replace
Prep:
• Change DTR version in GitLab
• If upgrading Docker Engine, change its version too
    Produces new VM template
    Pre-pulls UCP/DTR images
Plan:
• Engineer starts pipeline and inspects Terraform plan
Run:
• If no unexpected actions in plan, engineer continues pipeline to run upgrade from Ansible:
    docker run docker/dtr upgrade
31. Docker - DTR Install
Wait:
• Sleep until UCP containers present on node
Lock:
• Get Consul lock
• Confirm no other DTR replicas
Install:
• docker run docker/dtr install …
Configure:
• Set S3 storage
• Install web certificates
• Add replica id and IP in Consul
32. DTR Join/Replace
Check Replicas:
• Get k/v list from Consul
• Check /health endpoint
Remove Dead Replicas:
• docker node rm
• docker run docker/dtr remove
Join:
• docker run docker/dtr join …
Track:
• Add replica id and IP in Consul
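The check, remove, join, and track steps above can be sketched as one reconciliation function. This is a hedged illustration: the `tracked` dict stands in for the replica list kept in Consul, and `is_healthy`, `remove_replica`, and `join_replica` are hypothetical callables wrapping the /health check and the docker commands named on the slide.

```python
def reconcile_replicas(tracked, my_ip, is_healthy, remove_replica,
                       join_replica):
    """Drop dead replicas, join this node, and return the updated tracking.

    `tracked` maps replica id -> IP, as read from the k/v store.
    """
    alive = {}
    for replica_id, ip in tracked.items():
        if is_healthy(ip):            # e.g. a request to the /health endpoint
            alive[replica_id] = ip
        else:
            remove_replica(replica_id)  # node removal plus DTR replica removal
    if not alive:
        # Joining needs a healthy replica; with none left, this sketch stops
        # and leaves the fresh-install path to the caller.
        raise RuntimeError("no healthy replica to join")
    new_id = join_replica(sorted(alive)[0])  # join via any healthy replica
    alive[new_id] = my_ip                    # track the new replica
    return alive
```

The returned mapping is what would be written back to the k/v store, so the next node to launch sees an accurate replica list.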
34. Service Deployment Automation
“Source of truth” for Swarm and Kube services
Push or pull?
• Push: Git repo → Webhook → Deploy
• Pull: Git repo → Kube Cronjob → Deploy
Client bundle?
Application secrets?
35. Reverse Uptime
• Ensures a limited amount of cruft left by long-running processes
• Ensures packages are at most one or two weeks out of date
• Run upgrades unattended on a weekly or bi-weekly schedule
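One way to picture the reverse uptime idea is a scheduled check that selects nodes older than a maximum age for replacement. This is only a sketch: the function name, the `launch_times` mapping, and the 14-day default are assumptions chosen to match the bi-weekly schedule mentioned above.

```python
from datetime import datetime, timedelta


def nodes_due_for_replacement(launch_times, now, max_age=timedelta(days=14)):
    """Return the names of nodes that have been up for at least max_age."""
    return sorted(
        name for name, launched in launch_times.items()
        if now - launched >= max_age
    )
```

For example, with `now` set to 15 January, a node launched on 1 January is selected while a node launched on 10 January is not.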
36. Thank you! HALLWAY TRACK
Wednesday December 5th at 13:00
hallwaytrack.dockercon.com/topics/30485/