8. ●
What's in it for us?
●
Will it help?
●
Is it hype?
●
Static vs. Cloud
●
Virtualization vs.
Containers
●
Private vs. public
Docker?
9. ●
Gradual adoption
of virtualization over
5 years
●
Explosive adoption
of containers over 2
years
Chart: interest over time for Virtualization, OpenStack and Docker (by Google Analytics)
10. ●
Starting slow
●
Getting used to it
●
Find limitations
●
Isolation of the builds
●
Slow?
●
Container hosts
●
Network vs. Storage
12. Paradigm shift to MicroServices
●
Loosely coupled service oriented architecture with
bounded contexts
From Adrian Cockcroft (ex-Netflix Chief Architect)
13. What is an application?
●
A single container
– Putting multiple processes into a single container simplifies the deployment
– Breaks Docker best-practices model
– monit, supervisord, runsvdir, runit
●
A composition of related containers
– Pod (Kubernetes)
– Task (Amazon AWS ECS – Elastic Container Service)
– Separation of operational concerns
– Not all frameworks understand the container composition
●
A graph of dependent containers
14. Immutable
Artifacts
●
Configuration management doesn't guarantee
immutability
●
Cumulative change/Drift vs. refresh
●
Version everything!
●
Turn your release process into an artifact!
Pipeline Builder http://bit.ly/1Eoz7WV
15. Release Process / Pipeline
1. A developer commits new code to a Repo
2. A build is triggered and creates an app artifact and
pushes it into the artifact repository with metadata:
1. Artifact has a hard version
2. Declares its contracts and contract versions
3. List of dependencies and their versions (Bill-of-materials) attached
3. The build creates a Docker image and pushes it to the Docker
registry
1. Inherits from official base image approved by InfoSec and Systems teams
2. Has exactly the same tag as the version of the app artifact – creates a 1:1 correlation
with the source
4. Deployment ...
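The metadata from step 2 and the tagging rule from step 3 can be sketched as below. This is only an illustration, not the pipeline's actual format: the class `ArtifactMetadata`, its field names, the registry host `registry.example.com` and the sample versions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ArtifactMetadata:
    """Metadata pushed alongside the app artifact (field names are hypothetical)."""
    name: str
    version: str  # hard version, never "latest" or a range
    contracts: dict = field(default_factory=dict)     # contract name -> contract version
    dependencies: dict = field(default_factory=dict)  # bill of materials: dependency -> version

    def image_ref(self, registry: str) -> str:
        """Docker image reference whose tag equals the artifact version."""
        return f"{registry}/{self.name}:{self.version}"


meta = ArtifactMetadata(
    name="orders",
    version="1.4.2",
    contracts={"orders-api": "2"},
    dependencies={"requests": "2.5.1"},
)
print(meta.image_ref("registry.example.com"))  # registry.example.com/orders:1.4.2
```

Deriving the image tag from the artifact version, instead of typing it by hand, is what keeps image, artifact and source in 1:1 correlation.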
16. Release Process Challenges
●
Pick Container Registry:
– Your own
– DockerHub
– Artifactory
●
Registry management is important:
– Disk space, Heavy images
– Tracking of what's in use
– Decommissioning and pruning of the artifacts
– Availability
– Auditing
– Permissions
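Pruning only works if you track what is in use. A minimal sketch of that selection step follows, assuming the list of tags and the set of deployed tags come from your own inventory; the function name `tags_to_prune` and the `keep_latest` policy are hypothetical, not part of any registry API.

```python
def tags_to_prune(all_tags, in_use, keep_latest=2):
    """Return the tags that look safe to delete.

    all_tags: tags ordered oldest to newest.
    in_use: tags currently deployed somewhere (tracked outside the registry).
    keep_latest: number of newest tags always kept, e.g. for rollback.
    """
    newest = all_tags[-keep_latest:] if keep_latest else []
    protected = set(in_use) | set(newest)
    return [tag for tag in all_tags if tag not in protected]


candidates = tags_to_prune(["1.0", "1.1", "1.2", "1.3", "1.4"], in_use={"1.1"})
print(candidates)  # ['1.0', '1.2']
```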
17. Deployment
●
Prepare Docker host (configuration management)
– Fry and not Bake
●
Pull Docker image
– Beware of growing size
– Pre-warm the host with the base image or a previous version
●
Start application
– Single container – easy
– Composition of containers is a challenge (Fig? Your own? ...)
– What configuration (env vars, partitions, etc...) is needed?
●
External HIERARCHICAL config/settings management is the key (Consul,
Zookeeper, Hiera)
– Passing secrets into the containers – think carefully!
●
Secret management is important (Consul, etcd, ...)
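Hierarchical settings can be pictured as layers where the more specific layer wins. The sketch below assumes three hypothetical layers (common, environment, app) held in plain dicts; in practice they would be read from a store such as Consul, Zookeeper or Hiera. The resolved settings are then turned into `-e KEY=VALUE` arguments for `docker run`. Secrets are deliberately left out of this path.

```python
def resolve_config(*layers):
    """Merge config layers; later (more specific) layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


def docker_env_args(settings):
    """Render settings as `docker run` environment arguments, sorted for stable output."""
    args = []
    for key in sorted(settings):
        args += ["-e", f"{key.upper()}={settings[key]}"]
    return args


# Hypothetical layers, from least to most specific.
common = {"log_level": "info", "db_port": "5432"}
production = {"db_host": "db.prod.internal"}
orders_app = {"log_level": "debug"}

settings = resolve_config(common, production, orders_app)
print(docker_env_args(settings))
```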
19. Testing Considerations
●
Not much different from virtualized payloads
●
Spin up sandbox environment
●
Test against API, Mocks, Fakes, Pact
●
Go live?
– Use Blue/Green deployment
●
Pressure testing?
– Simpler and cheaper to do it in production
– Isolate traffic
– Gradually add load to the point of failure
– Monitor and measure
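"Gradually add load to the point of failure" can be sketched as a simple ramp. Everything here is an assumption for illustration: `measure_error_rate` stands in for whatever drives isolated traffic at a given rate and reports the observed error rate, and the step sizes and the 1% threshold are arbitrary.

```python
def find_breaking_point(measure_error_rate, start_rps=100, step_rps=100,
                        max_rps=10_000, threshold=0.01):
    """Raise load step by step; return the last rate that stayed under the threshold.

    Returns None if even the starting rate exceeds the threshold.
    """
    last_ok = None
    rps = start_rps
    while rps <= max_rps:
        if measure_error_rate(rps) > threshold:
            break  # point of failure reached, stop adding load
        last_ok = rps
        rps += step_rps
    return last_ok


# Fake service for demonstration: errors appear above 500 requests per second.
def fake_service(rps):
    return 0.0 if rps <= 500 else 0.2


print(find_breaking_point(fake_service))  # 500
```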
25. Service Discovery
●
No built-in SDN yet, just simple linking
●
Where are my dependencies?
– Eureka
– etcd
– Consul
●
Need to manage state of the App
– Starting
– Running
  ● When do you know that the app is healthy and running?
  ● Health checks
  ● RunScope - tests contracts and validates the payload
– Stopping
– Dead
– Or check the state from the LB – requires extra code
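The four states on this slide can be modelled as a small state machine driven by health checks. The allowed transitions below are an assumption made for the sketch (the slide only names the states), as are the names `AppState`, `transition` and `on_health_check`.

```python
from enum import Enum


class AppState(Enum):
    STARTING = "starting"
    RUNNING = "running"
    STOPPING = "stopping"
    DEAD = "dead"


# Assumed transitions; adjust to your own lifecycle rules.
ALLOWED = {
    AppState.STARTING: {AppState.RUNNING, AppState.DEAD},
    AppState.RUNNING: {AppState.STOPPING, AppState.DEAD},
    AppState.STOPPING: {AppState.DEAD},
    AppState.DEAD: set(),
}


def transition(current, new):
    """Move to a new state, rejecting transitions that are not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {new.name}")
    return new


def on_health_check(current, healthy):
    """Promote STARTING to RUNNING on a passing check; mark RUNNING as DEAD on a failing one."""
    if current is AppState.STARTING and healthy:
        return transition(current, AppState.RUNNING)
    if current is AppState.RUNNING and not healthy:
        return transition(current, AppState.DEAD)
    return current
```

A real system would usually require several consecutive failures before declaring an app dead; a single failed check is used here only to keep the sketch short.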
26. Am I alive?
●
When is the service ready to
receive traffic?
●
How do you know if your service
is alive? Or still alive?
●
When can the service actually start accessing the
linked dependencies/volumes?
●
Introduce delayed initialization or retries
●
Make your orchestration smarter to recognize the
composition time
●
Stagger the start and introduce jitter into the system
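Retries and jitter can be combined in one small helper. The sketch below uses exponential backoff with "full jitter" (a random delay between zero and the current backoff ceiling), which is one common choice and not something the slide prescribes. `is_ready` is a hypothetical callable that probes a dependency; the defaults are arbitrary.

```python
import random
import time


def backoff_delays(attempts, base=0.5, cap=30.0, rng=random.random):
    """Yield jittered delays: a random fraction of min(cap, base * 2**n)."""
    for n in range(attempts):
        yield rng() * min(cap, base * 2 ** n)


def wait_until_ready(is_ready, attempts=8, sleep=time.sleep, **backoff):
    """Poll is_ready(), sleeping a jittered delay between tries.

    Returns True as soon as the dependency is ready, False if it never is.
    """
    for delay in backoff_delays(attempts, **backoff):
        if is_ready():
            return True
        sleep(delay)
    return is_ready()
```

Because every container draws its own random delays, containers that start together do not all hit a shared dependency at the same instant.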
27. Monitoring / Alerting
●
Adds another layer to monitor
●
Monitor both host and the
containers
●
Rate of change is drastically
different
●
Location, Names, Versions – everything in motion
●
Multiple running versions at the same time
●
Multiple locations, regions, zones, DC, HA, etc...
●
Tools start to recognize Docker – DataDog, Librato, NewRelic,
…
●
Composite SLA metrics
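One possible reading of a composite SLA metric is the availability of a whole call chain. The sketch below assumes that every component in the chain is required and that failures are independent, in which case the availabilities multiply. Both assumptions, and the function name, are mine and not from the slide.

```python
import math


def composite_availability(availabilities):
    """Availability of a chain in which every component must be up (independence assumed)."""
    return math.prod(availabilities)


# Two components at 99% each give roughly 98% for the pair.
print(round(composite_availability([0.99, 0.99]), 4))  # 0.9801
```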
28. Reasoning about failure
●
Tools assume containment
hierarchy
●
Most can't reason about
the relationship
●
Your apps span
multiple containers
and hosts
●
Ex: Machine component
(disk?) failure will affect all
instances, VMs, Containers
and Apps
Diagram: containment hierarchy, outermost first: Region, Zone/DC, Environment, Machine, VM/Instance, Container, Process; containers also connect to linked containers, volumes and storage
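Reasoning about the relationship means being able to ask "what is affected if this component fails?". A minimal sketch is below, assuming the containment relationships are recorded as parent-to-child edges; the class `Topology` and all node names are hypothetical. Links between containers on different hosts could be added as extra edges in the same structure.

```python
from collections import defaultdict


class Topology:
    """Directed graph of 'contains' or 'is depended on by' relationships."""

    def __init__(self):
        self.children = defaultdict(list)

    def add(self, parent, child):
        self.children[parent].append(child)

    def affected_by(self, failed):
        """Return everything below the failed component, found by graph traversal."""
        seen, stack = [], [failed]
        while stack:
            node = stack.pop()
            for child in self.children.get(node, []):
                if child not in seen:
                    seen.append(child)
                    stack.append(child)
        return seen


topo = Topology()
topo.add("machine-1", "vm-1")
topo.add("machine-1", "vm-2")
topo.add("vm-1", "container-a")
topo.add("vm-2", "container-b")
topo.add("container-a", "process-a")

print(sorted(topo.affected_by("machine-1")))
```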
29. Failure Detection, Cleanup
●
When to clean up the containers?
●
What does a container failure mean?
●
How to deal with the partial failure of the app
dependencies or linked containers
●
Volume containers filling up the host storage – beware!
●
How to decommission / tear down:
– What?
– In what order?
– How to communicate with the Monitoring/Alerting
– Notify Change Management system
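The "in what order?" question has a mechanical answer once dependencies are written down: start in dependency order and tear down in the reverse. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the container names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

# container -> containers it depends on (hypothetical app)
depends_on = {
    "web": {"api"},
    "api": {"db", "cache"},
    "db": set(),
    "cache": set(),
}

# Dependencies come first in the start order.
start_order = list(TopologicalSorter(depends_on).static_order())

# Tear down in reverse, so nothing loses a dependency while still running.
stop_order = list(reversed(start_order))

print("start:", start_order)
print("stop: ", stop_order)
```

`TopologicalSorter` raises `graphlib.CycleError` if the dependencies are circular, which is a useful signal that no safe order exists.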
30. Container storage
●
Stateful containers are hard for the moment
●
Volumes disappear if the Docker host dies –
especially on the clouds: AWS, OpenStack, etc...
●
Use host mounts, but don't forget where your stuff is
and when to clean it up
●
Interesting: volume relocation by Flocker
31. Log Management
●
Eagerly move logs out – containers are short lived
●
Beware of sheer volume of logs – be smart about what and when
you ship
●
Can't truncate or rotate container STDOUT and STDERR
●
Write to volumes
●
Log rotation – volume rotation?
●
Log analysis
●
Log monitoring & alerting
●
Tools examples:
– Scribe, LogStash
– FluentD
– Splunk (if you can afford it)
33. Mesos
●
Cluster manager that provides efficient, fine-grained
resource sharing and isolation across
distributed applications, or frameworks
●
Distributed resource broker
●
Running in production at Twitter since 2012
●
Became a top-level Apache project in July 2013
36. Missing Mesos features
●
AWS Multi-region?
●
Sticky locations?
●
Persistent volumes?
●
No Pods support (multi-container apps)
●
No REST API to schedule jobs
●
No built-in clean-up
●
Tricky to write frameworks (but getting easier)
●
A lot of work to integrate with the
monitoring/alerting/logging systems
38. What's next?
●
Kubernetes
– What will be the solution for SDN?
– Container dependencies discovery
●
Lambda architecture
– What's an on-prem alternative?
– How do we test apps?
– What is an app?
– Should we just stop using the app concept and move to stream processing?
39. Work in progress
●
Failure tracking
– Correlation does not imply causation (from Wikipedia)
– Derivatives and predictive monitoring
– Machine learning
41. Credits ...
●
Who Moved My Cheese? Movie by Dr. Spencer Johnson
●
Apache Mesos at Twitter (Texas LinuxFest 2014)
●
Containers at Hong Kong commercial port
●
Yes, Prime Minister