Containerization:
The DevOps Revolution
Why do we need containers?
Shipping Containers
• Standardized dimensions
• Mechanized handling system
• Remote sorting and packing
• Remote customs services
• Greatly decreases the cost and increases the speed
of international trade
Software Container is like a VM
• Own Process Space
• Can run commands
• Packages can be installed
• Can run services/daemons
• Isolated root privileges
• Shell access
Software Container is not like a VM
• Uses host kernel
• Restricted to host OS
• Can’t load its own kernel modules
• Is a plain user-space process
VM vs Container
Containers Chronology
• 1982 - chroot
• 2000 - FreeBSD Jail
• 2001 – Linux VServer
• 2004 – Solaris Containers
• 2007 – HP-UX Containers
• 2008 – LXC (Linux Containers)
• 2013 - Docker
Linux cgroups (control groups)
• Resource limiting
• Prioritization
• Accounting
• Control
• Used by
• LXC
• libvirt
• systemd
• Docker
• Kubernetes
• Mesos
Linux namespaces
• Isolate and virtualize resources
• Every process (group) has its own view of the system
• 6 kinds of namespaces:
• mnt – mount points
• pid – process IDs
• net – network stack
• ipc – System V IPC and POSIX message queues
• uts – hostname
• user – users and groups
cgroups (control groups):
• Resource metering and limiting
• CPU and cpusets
• Memory
• Network
• Block I/O
• /dev/* (device access)
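On a Linux host you can see which cgroup the current process is accounted to; a minimal sketch in Python, assuming Linux with /proc mounted:

```python
# Every Linux process belongs to a cgroup; membership is listed in
# /proc/self/cgroup (one "id:controllers:path" line per hierarchy on
# cgroup v1, a single "0::/path" line on cgroup v2). Linux-only sketch.
with open("/proc/self/cgroup") as f:
    for line in f:
        hier_id, controllers, path = line.rstrip("\n").split(":", 2)
        print(f"hierarchy={hier_id} controllers={controllers or '(v2)'} path={path}")
```

The actual limits (memory.max, cpu.max, …) live under /sys/fs/cgroup and are what LXC, Docker, Kubernetes, and Mesos manipulate on your behalf.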
Namespaces:
• Provide containers with their own view of the system
• Limit what you can see (and use)
• Multiple namespace kinds: pid, net, mnt, uts, ipc, user
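The per-process view is visible from userspace: each process carries one symlink per namespace kind under /proc/&lt;pid&gt;/ns/, and two processes share a namespace exactly when those symlinks point at the same inode. A minimal Linux-only check in Python:

```python
import os

# List this process's namespace memberships (Linux only). Processes in
# the same namespace of a given kind resolve to the same inode, e.g.
# "pid:[4026531836]".
for kind in ("mnt", "pid", "net", "ipc", "uts", "user"):
    print(kind, "->", os.readlink(f"/proc/self/ns/{kind}"))
```

Running the same snippet inside a container would print different inode numbers for the namespaces the runtime unshared.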
Copy-on-write storage:
• Create a new container instantly instead of copying the whole system
• Storage tracks what has changed (AUFS, ZFS, etc.)
• Reduces footprint and overhead
• Decreases boot time
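The layering idea can be sketched without any kernel support: reads fall through to a shared read-only image, writes land in a thin per-container layer, so "starting" a container copies nothing. A toy Python model (the class and file names are made up for illustration):

```python
# Toy copy-on-write layering, the idea behind AUFS/overlayfs: a shared
# read-only base image plus a private writable upper layer per container.
class CowFS:
    def __init__(self, base):
        self.base = base          # shared, read-only lower layer
        self.upper = {}           # private writable layer
        self.deleted = set()      # "whiteouts" hiding removed base files

    def read(self, path):
        if path in self.deleted:
            raise FileNotFoundError(path)
        if path in self.upper:
            return self.upper[path]
        return self.base[path]    # fall through to the image

    def write(self, path, data):
        self.deleted.discard(path)
        self.upper[path] = data   # the base image is never modified

    def remove(self, path):
        self.upper.pop(path, None)
        self.deleted.add(path)    # hide the base copy

image = {"/etc/hostname": "base"}
c1, c2 = CowFS(image), CowFS(image)   # two "containers", zero copying
c1.write("/etc/hostname", "web01")
print(c1.read("/etc/hostname"))       # web01
print(c2.read("/etc/hostname"))       # base: c2 still sees the image
```

Real drivers do this per block or per file and add whiteout files for deletions, but the footprint and boot-time wins come from exactly this structure.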
Container Runtimes:
• LXC
• systemd-nspawn
• Docker Engine
• rkt / runC
• OpenVZ
• Jails (FreeBSD), Zones (Solaris)
What’s the difference between them?
• They use the same kernel features => performance will be the same
• What matters is:
• Design
• Features
• Ecosystem (e.g. 100,000+ apps in Docker Hub)
• Support
The Story of Success
Problem & Opportunity
• Rapid innovation in computing and application development
services
• No single service is optimal for all solutions
• Customers want to run multiple services in a single
cluster and run multiple clusters in an Intercloud
environment
...to maximize utilization
...to share data between services
Datacenter and solution today
[Diagram: three separate VM clusters, one each for the Visualization Service, the Data Ingestion Service, and the Analytics Service]
• Configuration and
management
of 3 separate clusters
• Resources stay idle if service
is not active
• Need to move data between
clusters for each service
What do we want to do?
[Diagram: the Data Ingestion, Analytics, and Visualization services move from multiple dedicated clusters onto one shared cluster (VM1–VM5 on AWS)]
...to maximize utilization
...to share data between services
What is in it for customers?
• Maximize utilization: deliver more services with a smaller footprint
• Shared clusters for all services: easier deployment and management with a unified service platform
• Shared data between services: faster and more competitive services and solutions
How does this work?
[Diagram: a Zookeeper quorum coordinates three Mesos Masters. Framework schedulers (Spark Service Scheduler, Marathon Service Scheduler) register with the Masters, which offer resources on Mesos Slaves. Executors on the Slaves (Spark Task Executor, Mesos Executor, Docker Executors) run the actual workloads: Task #1, Task #2, ./python XYZ, java -jar XYZ.jar, ./xyz]
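To make the Marathon side concrete, here is a minimal app definition built in Python. The JSON shape follows Marathon's app format, but every concrete value (the app id, image, resource sizes) is an illustrative placeholder, not something from this deployment:

```python
import json

# Minimal Marathon app definition: Marathon asks Mesos for 0.5 CPU and
# 256 MB per instance and launches the Docker container on matching
# resource offers. All concrete values below are placeholders.
app = {
    "id": "/analytics/api",
    "instances": 2,
    "cpus": 0.5,
    "mem": 256,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "example/analytics-api:1.0", "network": "BRIDGE"},
    },
}
print(json.dumps(app, indent=2))
# Such a document is typically POSTed to Marathon's REST API (/v2/apps).
```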
How does this work?
Mesos provides fine-grained resource isolation
[Diagram: on a compute node, the Mesos Slave process launches executors (a Spark Task Executor running Task #1 and Task #2, a Mesos Executor running ./python XYZ), each inside its own cgroups container]
How does this work?
Mesos provides scalability
[Diagram: on a compute node, when the executor running ./ruby XYZ finishes, its resources become available and the Spark Task Executor scales out from Task #1 and #2 to also run Task #3 and #4, each in its own cgroups container]
How does this work?
Mesos has no single point of failure
[Diagram: three Mesos Masters spread across VM1–VM5]
Services keep running if a VM fails!
How does this work?
The Master node can fail over
[Diagram: three Mesos Masters spread across VM1–VM5; a standby Master takes over leadership]
Services keep running if a Mesos Master fails!
How does this work?
The Slave process can fail over
[Diagram: on a compute node, the Spark Task Executor and its tasks (Task #1–#4, ./ruby XYZ) outlive a restart of the Mesos Slave process]
Tasks keep running if the Mesos Slave process fails!
How does this work?
Can deploy in many environments
• Gets orchestrated by OpenStack, Ansible (scripts), Cloudbreak
• True hybrid-cloud deployment: AWS, CIS, UCS, vSphere, and others
[Diagram: Terraform provisions VM1–VM5 on AWS via a REST API (policy, auto-scaling), a REST API (direct provisioning), or scripted provisioning]
Containers:

Service                          Product
Cloud/Virtualization             AWS / CIS / vSphere / Metacloud / UCS…
Provisioning                     Terraform
Automation                       Ansible
Clustering & Resource Management Mesos, Marathon, Docker
Load Balancing                   Avi Networks
ETL & Data Shaping               StreamSets
Log Data Gathering               Logstash
Metrics Gathering                CollectD, Avi Networks
Messaging                        Kafka, Solace
Data Storing (Batch)             HDFS
Data Storing (OLTP/Real-time)    Cassandra
Data Storing (Indexing)          Elasticsearch
Data Processing                  Apache Spark
Visualization                    Zoomdata

*Subset example
Issues
• Service Discovery
• Networking for Containers
• Persistent Storage
• Docker Performance
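Service discovery is the first of these to bite in practice: the scheduler places containers on arbitrary hosts and ports, so clients must resolve a service name at run time instead of hardcoding an address. A toy in-memory registry sketches the problem (real deployments use Consul, etcd, or Mesos-DNS):

```python
# Toy service registry: containers get dynamic host ports assigned by
# the scheduler, so clients look services up by name instead of
# hardcoding addresses. All names and addresses below are made up.
registry = {}

def register(name, host, port):
    """Record one running instance of a named service."""
    registry.setdefault(name, []).append((host, port))

def lookup(name):
    """Return one instance of the service; real clients load-balance."""
    instances = registry.get(name)
    if not instances:
        raise LookupError(f"no instances of {name}")
    return instances[0]

register("analytics", "10.0.0.5", 31780)  # ports chosen by the scheduler
register("analytics", "10.0.0.7", 31112)
print(lookup("analytics"))
```

The hard parts a real system adds are health checking, deregistration of dead containers, and pushing updates to clients, which is exactly where the networking and persistence issues above intersect.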
More Details
https://mantl.io

Editor's Notes

  • #4 Containerization is a system of intermodal freight transport using intermodal containers (also called shipping containers and ISO containers) made of weathering steel. The containers have standardized dimensions. They can be loaded and unloaded, stacked, transported efficiently over long distances, and transferred from one mode of transport to another (container ships, rail transport flatcars, and semi-trailer trucks) without being opened.
  • #8 Poul-Henning Kamp
  • #9 cgroups (abbreviated from control groups) is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes. Engineers at Google (primarily Paul Menage and Rohit Seth) started the work on this feature in 2006 under the name "process containers".
  • #10 Namespaces are a Linux kernel feature that isolates and virtualizes resources (PID, hostname, userid, network, ipc, filesystem) of a collection of processes. Each process is assigned a symbolic link per namespace kind in /proc/<pid>/ns/. This symlink is handled specially by the kernel, the inode number pointed to by this symlink is the same for each process in this namespace, this way each namespace is uniquely identified by the inode number pointed to by one of its symlinks. Reading the symlink via readlink returns a string containing the namespace kind name and the inode number of the namespace.