Docker & Kubernetes
Dongwon Kim, PhD
Big Data Tech. Lab
SK Telecom
Big Data Tech. Lab in SK telecom
• Discovery Group
• Predictive Maintenance Group
• Manufacturing Solution Group
• Groups making own solutions
• Technology and Architecture Leading Group
• Big data processing engine
• Advanced analytics algorithms
• Systematize service deployment and service operation on cluster
• Docker
• Kubernetes
Prepare for an era of cloud with Docker and Kubernetes
[Figure: technology trend in USA (2004-2017) for oracle, ubuntu, and cloud]
Trend
- Buy both SW & HW
- Buy HW and DIY
- Run your SW on cloud
Ubiquitous cloud services around us
- Cloud services for users: iCloud, OneDrive, Dropbox, Google Drive
Enabling technologies for custom cloud services
- Cloud technologies for service providers: Amazon Web Services, Microsoft Azure
- Major technologies: Docker, Kubernetes
Overview & Conclusion
• Docker to build portable software
• Build your software upon Docker
• Then distribute it anywhere (even on MS Azure and Amazon Web Service)
• Kubernetes to orchestrate multiple Docker instances
• Start using Docker and Kubernetes before it is too late!
• Google has been using container technologies for more than 10 years
[Figures: The Enterprise IT Adoption Cycle; Popularity of Docker and Kubernetes (labels: Docker, Kubernetes, Hadoop)]
Docker
Motivation
Enabling technologies for Docker
How to use Docker
Docker came to save us from the dependency hell
Docker Dependency hell
Portable software
Dependency hell
[Figure: In the development environment, your program depends on program1, program2, and program3 at v2. In the production environment, the customer program depends on v1 of the same three programs. Installing your program's v2 dependencies through the production package manager causes a conflict!]
Few choices left to you
1. Convince your customer (a.k.a. 甲, the party with the upper hand in a contract)
2. Install all the dependencies manually (without the package manager)
3. Modify your program to make it depend on v1
Use Docker for isolating your application
[Figure: Inside a Docker container, the package manager in the guest OS installs your program's dependencies (program1, program2, program3 at v2). On the host, the package manager in the host OS installs the customer program's dependencies (program1, program2, program3 at v1). The Docker engine (daemon) runs on the host operating system, so the two sets of dependencies no longer conflict.]
Linux kernel must be ≥3.10 (as in Ubuntu 14.04 and CentOS 7)
Virtual machines and docker containers
[Figure: Virtual machines run on a hypervisor on top of the host operating system (kernel and device drivers). Each virtual machine (CentOS, Ubuntu) carries its own kernel, device drivers, package manager (yum or apt), libraries, and app. Docker containers run on the Docker engine on top of the host operating system (kernel and device drivers). Each container (CentOS-like, Ubuntu-like) carries only its package manager (yum or apt), libraries, and app.]
Containers share the kernel in the host
Linux namespaces – what makes isolated environments in a host OS
[Figure: the Docker engine runs on the host operating system; each container gets its own set of six namespaces]
• pid: process ID number space (starting from 1)
• ipc: various IPC objects
- POSIX message queues
- System V IPC objects (mq, sem, shm)
• uts: system identifiers
- hostname
- NIS domain name
• net: network resources
- Network devices
- IPv4, IPv6 stacks
- Routing tables, firewall
• mnt: mount points (directory hierarchy)
• user: security-related identifiers
- User IDs
- Group IDs
Six namespaces are enough to give an illusion of running inside a virtual machine
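On a Linux host, the namespaces a process belongs to can be inspected under /proc. The following is a minimal sketch, assuming a Linux /proc filesystem; the function name `current_namespaces` is made up for this illustration, and on other systems it simply returns an empty mapping.

```python
import os

# The six namespace types listed on the slide.
NAMESPACES = ("pid", "ipc", "uts", "net", "mnt", "user")

def current_namespaces(pid="self"):
    """Return {namespace: identifier} for a process, read from /proc/<pid>/ns."""
    result = {}
    for name in NAMESPACES:
        link = os.path.join("/proc", str(pid), "ns", name)
        try:
            # Each entry is a symlink whose target names the namespace instance.
            result[name] = os.readlink(link)
        except OSError:
            # Not Linux, or this namespace type is unavailable here.
            continue
    return result

namespaces = current_namespaces()
for name, ident in namespaces.items():
    print(name, ident)
```

Run on the host and again inside a container, the identifiers typically differ, which is one way to see that the container lives in its own namespaces.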
Analogy between program and docker
Program: Source code → (compile) → Byte/machine code (read only) → (execute) → Process (read-only text segment, plus data, heap, and stack)
Docker: Dockerfile → (build) → Docker image (read-only layers) → (run) → Docker container (read-only layers + writable layer)
How to define an image and run a container from it?
1) Write a Dockerfile
- Install python with pip on ubuntu
- Tell pip to install numpy
2) Build an image from Dockerfile
- Execute each line of Dockerfile to build an image
3) Execute a Docker container from the image
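The three steps can be sketched as follows. This is an illustration under assumptions, not the Dockerfile shown in the talk: the base tag ubuntu:15.04 and the package names come from other slides, while the image name `numpy-demo` and the CMD line are made up here. The script only writes the Dockerfile and prints the `docker build` and `docker run` commands; it does not execute them.

```python
import os
import tempfile

# Step 1: a Dockerfile that installs python with pip on ubuntu, then numpy.
DOCKERFILE = """\
FROM ubuntu:15.04
RUN apt-get update && apt-get install -y python-dev python-pip
RUN pip install numpy
CMD ["python", "-c", "import numpy; print(numpy.__version__)"]
"""

IMAGE = "numpy-demo"  # hypothetical image name

def write_dockerfile(directory):
    """Write the Dockerfile into `directory` and return its path."""
    path = os.path.join(directory, "Dockerfile")
    with open(path, "w") as f:
        f.write(DOCKERFILE)
    return path

def docker_commands(directory):
    """Return the commands for step 2 (build) and step 3 (run)."""
    return [
        ["docker", "build", "-t", IMAGE, directory],
        ["docker", "run", "--rm", IMAGE],
    ]

workdir = tempfile.mkdtemp()
dockerfile_path = write_dockerfile(workdir)
commands = docker_commands(workdir)
for cmd in commands:
    print(" ".join(cmd))
```

Note that ubuntu:15.04 is end-of-life, so a build today would likely need a newer base image and adjusted package names.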
1 to N relationship between image and container
Execute five containers from an image
Q) Do five containers take up 2,445MB (=489MB*5) on the host?
A) No, thanks to image layering & sharing
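The arithmetic behind this answer can be sketched as below. The 489MB image size is from the slide; the 1MB size of each container's own layer is a made-up figure for illustration only.

```python
IMAGE_MB = 489          # image size from the slide
CONTAINERS = 5
WRITABLE_LAYER_MB = 1   # hypothetical size of each container's read/write layer

# Naive estimate: every container carries a full copy of the image.
naive_mb = IMAGE_MB * CONTAINERS

# With sharing: one copy of the image plus a thin layer per container.
shared_mb = IMAGE_MB + WRITABLE_LAYER_MB * CONTAINERS

print(naive_mb, shared_mb)
```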
Images consist of layers, each of which is a set of files
• Instructions (FROM, RUN, CMD, etc) create layers
• Base images (imported by “FROM”) also consist of layers
• If a file exists in multiple layers, the one in the upper layer is seen
[Figure: each Dockerfile instruction produces one image layer (a set of files). From bottom to top: base ubuntu image, layer (apt-get install python-dev python-pip), layer (pip install numpy).]
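The rule that the file in the upper layer is the one seen can be modelled with a stack of dictionaries. This is a toy model rather than Docker's storage driver; the file paths and contents are invented.

```python
from collections import ChainMap

# Each layer is a set of files (path -> content), listed bottom to top.
base_ubuntu = {"/etc/os-release": "ubuntu", "/usr/bin/python": "stub"}
apt_layer = {"/usr/bin/python": "python 2.7", "/usr/bin/pip": "pip"}
pip_layer = {"/usr/lib/python2.7/numpy/__init__.py": "numpy"}

layers = [base_ubuntu, apt_layer, pip_layer]

# ChainMap searches its maps left to right, so the top layer goes first.
image = ChainMap(*reversed(layers))

# The apt layer shadows the base layer's copy of /usr/bin/python.
print(image["/usr/bin/python"])
print(sorted(image))
```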
Docker container
• A container is just a thin read/write layer
• base images are not copied to containers
• Copy-On-Write (COW)
- When a file in the base image is modified, copy the file to the R/W layer and then modify the copied file
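Copy-on-write can be sketched with a small class. This is a simplified model under assumptions: files are strings, the class name `Container` and its methods are hypothetical, and real storage drivers work on files or blocks rather than dictionaries.

```python
class Container:
    """Toy model: a shared read-only image plus a private read/write layer."""

    def __init__(self, image_files):
        self.image = image_files   # shared between containers, never modified
        self.rw_layer = {}         # private to this container

    def read(self, path):
        # The read/write layer sits on top, so it is consulted first.
        if path in self.rw_layer:
            return self.rw_layer[path]
        return self.image[path]

    def append(self, path, text):
        # Copy the file up on first modification, then edit the copy.
        if path not in self.rw_layer:
            self.rw_layer[path] = self.image[path]
        self.rw_layer[path] += text

image = {"/etc/motd": "hello"}
a = Container(image)
b = Container(image)
a.append("/etc/motd", " from a")
print(a.read("/etc/motd"))  # hello from a
print(b.read("/etc/motd"))  # hello
```

Container `b` and the image itself are unaffected by the change made in `a`, which is why the base image can be shared safely.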
Image sharing between containers
The ubuntu:15.04 image (~188MB) is not copied to each container; all containers share it
Layer sharing between images
If multiple Dockerfiles
1. start from the same base image, and
2. share a sequence of instructions (one RUN instruction in the numpy and matplotlib Dockerfiles compared on this slide),
then the Docker engine automatically reuses existing layers
[Figure: numpy Dockerfile and matplotlib Dockerfile side by side]
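Why a shared prefix of instructions leads to shared layers can be sketched with a chained hash. This is a simplified stand-in for Docker's build cache, not its actual cache key; the contents of the two Dockerfiles are assumed, since the slide images are not part of this transcript.

```python
import hashlib

def layer_ids(instructions):
    """Return one id per instruction; each id depends on all earlier instructions."""
    ids = []
    parent = ""
    for instruction in instructions:
        digest = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        ids.append(digest[:12])
        parent = digest
    return ids

# Assumed contents: same base image, same first RUN, different last RUN.
numpy_dockerfile = [
    "FROM ubuntu:15.04",
    "RUN apt-get install -y python-dev python-pip",
    "RUN pip install numpy",
]
matplotlib_dockerfile = [
    "FROM ubuntu:15.04",
    "RUN apt-get install -y python-dev python-pip",
    "RUN pip install matplotlib",
]

numpy_layers = layer_ids(numpy_dockerfile)
matplotlib_layers = layer_ids(matplotlib_dockerfile)
shared = [a for a, b in zip(numpy_layers, matplotlib_layers) if a == b]
print(len(shared), "layers shared")
```

Because each id depends on everything before it, the two images stop sharing layers at the first instruction that differs.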
Example of stacking docker images
[Figure: image stacks for the Kafka broker and the PdM engine]
• Official images: debian:jessie, buildpack-deps:jessie-curl, buildpack-deps:jessie-scm, buildpack-deps:jessie, python:2.7, openjdk:8, zookeeper
• Kafka broker stack: kafka (with scala) on openjdk:8, run as Kafka containers next to Zookeeper containers
• PdM engine stack: scipy (numpy, scipy, matplotlib, ipython, jupyter, pandas, scikit-learn, h5py) on python:2.7, then either theano-cpu (theano, keras) or cuda with theano-gpu (theano, keras), then PdM engine (librdkafka, avro, flask)
• Resulting containers: PdM container (cpu) and PdM container (gpu), each with a Kafka consumer, a Kafka producer, and a web server; a Zookeeper cluster (zk nodes); Kafka brokers
Notes on the figure
• scipy libraries have nothing to do with GPU, so the scipy image is shared
• theano compiles its expression graphs into CPU/GPU instructions
• jessie is the latest stable Debian release
• buildpack-deps contains essential tools to download/compile software
Enabling technologies for docker (wrap-up)
• Linux namespaces (covered)
• To isolate system resources
• pid, net, ipc, mnt, uts, user
• They provide a secure & isolated environment (like a VM)
• Advanced multi-layered unification filesystem (AUFS) (covered)
• Image layering & sharing
• Linux control groups (not covered)
• To track, limit, and isolate resources
• CPU, memory, network, and IO
* https://mairin.wordpress.com/2011/05/13/ideas-for-a-cgroups-ui/
Docker topics not covered here
• How to install Docker engine
• What are the docker instructions other than FROM, RUN, and CMD
• ENV / ADD / ENTRYPOINT / LABEL / EXPOSE / COPY / VOLUME / WORKDIR / ONBUILD
• How to push local Docker images to docker hub
• How to pull remote images from docker hub
• ...
See https://docs.docker.com/engine/getstarted/
Kubernetes
Motivation
A motivating example
Disclaimer
• The purpose of this section is
to briefly explain Kubernetes without details
• For a detailed explanation
with the exact Kubernetes terminology,
see the following slide
• https://www.slideshare.net/ssuser6bb12d/kubernetes-introduction-71846110
What is Kubernetes for?
Container-based virtualization + Container orchestration
To satisfy common needs in production
replicating application instances
naming and discovery
load balancing
horizontal auto-scaling
co-locating helper processes
mounting storage systems
distributing secrets
application health checking
rolling updates
resource monitoring
log access and ingestion
...
from the official site : https://kubernetes.io/docs/whatisk8s/
Why Docker with Kubernetes?
• A mission of our group
• Systematize service deployment and service operation on cluster
• I believe that systematizing something means minimizing the human effort it takes
• How to minimize human efforts on service deployment?
• Make software portable using a container technology
• Docker (chosen for its maturity and popularity)
• Rkt from CoreOS (alternative)
• Build images and run containers anywhere
• Your laptop, servers, on-premise clusters, even cloud
• How to minimize human efforts on service operation?
• Inform a container orchestration runtime of service specification
• Kubernetes from Google (chosen for its maturity and expressivity)
• Docker swarm from Docker
• Define your specification and then the runtime operates your services as you wish
Kubernetes architecture
Server
- REST API server with a K/V store
- Scheduler
  - Find suitable machines for containers
- Controller manager
  - Compare the current state with the desired state
  - Make changes if the state becomes undesirable
Service specification (written in yaml)
- Execute a web-server image
- Two replicas for LB & HA
- 3GB memory each
[Figure: the server hands the specification to nodes, each running a Docker engine and a node agent, which start the 3GB containers]
Ensure that the specified # of replicas is running all the time
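The controller manager's job of driving the current state toward the desired state can be sketched as a reconcile function. This is a toy model, not Kubernetes code; `reconcile`, `start`, and the container names are hypothetical.

```python
def reconcile(desired_replicas, running, start, stop):
    """Compare the current state with the desired state and correct any difference."""
    running = list(running)
    while len(running) < desired_replicas:
        running.append(start())      # too few replicas: start one
    while len(running) > desired_replicas:
        stop(running.pop())          # too many replicas: stop one
    return running

counter = iter(range(1, 100))

def start():
    """Pretend to launch a container and return its name."""
    return "webserver-%d" % next(counter)

stopped = []

# Desired state from the specification: two replicas.
state = reconcile(2, [], start, stopped.append)
print(state)

# Simulate a crash of one replica, then reconcile again.
state.remove("webserver-1")
state = reconcile(2, state, start, stopped.append)
print(state)
```

The same function handles both the initial launch and recovery after a failure, which is the point of describing a desired state instead of issuing individual commands.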
Web server example
Want to launch 3 replicas for high availability and load balancing
[Figure: one webserver replica on each of node 1, node 2, and node 3 (webserver 4bp80, webserver 6dk12, webserver g1sdf), reached through a well-known address]
How to achieve the following?
• Users must be unaware of the replicas
• Traffic is evenly distributed to replicas
It’s a piece of cake with Kubernetes!
How to replicate your service instances
• Specify your Docker image and a replication factor using Deployment
• Specify a common label to group containers with different names
[Figure: the server instructs the node agents and Docker engines on node 1, node 2, and node 3 to run webserver 4bp80, webserver 6dk12, and webserver g1sdf respectively, each labeled app=web1]
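A Deployment with a replication factor and a common label might look like the following, written here as a Python dictionary and serialized to JSON, which kubectl accepts alongside YAML. This is a sketch under assumptions: the label app=web1 and the three replicas come from the slide, while the image name `example/webserver:1.0` and the object name are hypothetical, and `apps/v1` is the current API version rather than the one available when the talk was given.

```python
import json

LABELS = {"app": "web1"}   # common label from the slide

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "webserver"},
    "spec": {
        "replicas": 3,                            # replication factor
        "selector": {"matchLabels": LABELS},      # which pods belong to this Deployment
        "template": {
            "metadata": {"labels": LABELS},       # label stamped on every replica
            "spec": {
                "containers": [
                    {
                        "name": "webserver",
                        "image": "example/webserver:1.0",  # hypothetical image
                        "ports": [{"containerPort": 80}],
                    }
                ]
            },
        },
    },
}

manifest = json.dumps(deployment, indent=2)
print(manifest)
```

The selector and the template labels must match; that shared label is what groups replicas whose generated names differ.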
Define a service to do round-robin forwarding
[Figure: external traffic over the internet enters through <ingress> metatron:80; internal traffic uses <service> webserver:80. The service forwards 33% of the traffic to each of webserver 4bp80, webserver 6dk12, and webserver g1sdf, all labeled app=web1.]
Kubernetes runs its own DNS server for name resolution
Kubernetes manipulates iptables on each node to proxy traffic
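The even 33% split can be illustrated with a toy round-robin chooser. This models only the forwarding policy named on the slide, not how Kubernetes programs iptables; the hyphenated backend names are an assumed rendering of the replica names in the figure.

```python
from collections import Counter
from itertools import cycle

# The three replicas behind the service (names adapted from the figure).
backends = ["webserver-4bp80", "webserver-6dk12", "webserver-g1sdf"]

def round_robin(targets):
    """Return an iterator that yields the targets in turn, forever."""
    return cycle(targets)

chooser = round_robin(backends)

# Forward 300 requests and count how many each replica receives.
hits = Counter(next(chooser) for _ in range(300))
print(hits)
```

Clients only ever address the service name; which replica answers is decided per request.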
How to guarantee a certain # of running containers during maintenance
• Define a disruption budget to specify the requirement for the minimum number of available containers (here minAvailable=2)
[Figure: zk-0, zk-2, and zk-3 run on node1, node2, and node3 respectively, each with its own containers and volumes]
1. Drain node1: the operation is permitted because allowed-disruptions=1
2. 3 replicas have to be running due to StatefulSet, so Kubernetes tries to schedule zk-0 on other nodes
3. Oops! zk-0 cannot be scheduled on node2 or node3 due to anti-affinity
4. Drain node2: the operation is not permitted because allowed-disruptions=0 (note that minAvailable=2)
5. Hold on for a while: please wait until node1 is up and zk-0 is rescheduled!
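The allowed-disruptions numbers in this scenario follow from simple arithmetic: the number of healthy replicas minus minAvailable, never below zero. The sketch below is a toy calculation with hypothetical function names, not the Kubernetes implementation.

```python
def allowed_disruptions(healthy, min_available):
    """Number of replicas that may be evicted voluntarily right now."""
    return max(0, healthy - min_available)

def can_drain(healthy, min_available, replicas_on_node=1):
    """A drain is permitted only if the budget covers the replicas on that node."""
    return allowed_disruptions(healthy, min_available) >= replicas_on_node

MIN_AVAILABLE = 2   # from the slide
healthy = 3         # zk-0, zk-2, zk-3

# Drain node1: 3 - 2 = 1 disruption allowed, so it is permitted.
first_drain = can_drain(healthy, MIN_AVAILABLE)

# zk-0 is evicted and cannot be rescheduled because of anti-affinity.
healthy -= 1

# Drain node2: 2 - 2 = 0 disruptions allowed, so it is refused.
second_drain = can_drain(healthy, MIN_AVAILABLE)

print(first_drain, second_drain)
```

Once node1 is back and zk-0 is running again, the healthy count returns to 3 and the second drain becomes possible.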
PdM Kubernetes cluster
• Zookeeper headless service: a StatefulSet of three Pods, each running QuorumPeerMain and exposing ports 2181, 2888, and 3888
• Kafka headless service: a StatefulSet of three Pods, each running a Kafka broker and exposing port 9092
• PdM service: a Pod (Deployment) running the PdM engine with a Kafka consumer, a Kafka producer, and a web server
• An Ingress rule routes outside traffic to the PdM service (ports 80 and 8080 in the diagram)
• Pods use volumes attached from persistent storage
Overview & Conclusion
• Docker to build portable software
• Build your software upon Docker
• Then distribute it anywhere (even on MS Azure and AWS)
• Kubernetes to orchestrate multiple Docker instances
• Start using Docker and Kubernetes before it is too late!
• Google has been using container technologies for more than 10 years
[Figures: The Enterprise IT Adoption Cycle; Popularity of Docker and Kubernetes (labels: Docker, Kubernetes, Hadoop)]
the end

 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 

Docker and Kubernetes

  • 1. Docker & Kubernetes Dongwon Kim, PhD Big Data Tech. Lab SK Telecom
  • 2. Big Data Tech. Lab in SK Telecom • Discovery Group • Predictive Maintenance Group • Manufacturing Solution Group • Groups making their own solutions • Technology and Architecture Leading Group • Big data processing engine • Advanced analytics algorithms • Systematize service deployment and service operation on clusters • Docker • Kubernetes
  • 3. Prepare for an era of cloud with Docker and Kubernetes • Trend: buy both SW & HW; buy HW and DIY; run your SW on cloud • Cloud services for users (ubiquitous cloud services around us): iCloud, OneDrive, Dropbox, Google Drive • Cloud technologies for service providers (enabling technologies for custom cloud services): Amazon Web Services, Microsoft Azure • Major technologies: Docker, Kubernetes • Chart: technology trend in USA (2004-2017), with labels oracle, ubuntu, cloud
  • 4. Overview & Conclusion • Docker to build portable software • Build your software upon Docker • Then distribute it anywhere (even on MS Azure and Amazon Web Services) • Kubernetes to orchestrate multiple Docker instances • Start using Docker and Kubernetes before it is too late! • Google has been using container technologies for more than 10 years • Figures: The Enterprise IT Adoption Cycle (Docker, Kubernetes, Hadoop); Popularity of Docker and Kubernetes
  • 6. Docker came to save us from dependency hell • Diagram labels: Docker, dependency hell, portable software
  • 7. Dependency hell • Development environment: your program depends on program1 v2, program2 v2, and program3 v2, installed through the package manager • Production environment: your program depends on program1 v2, program2 v2, and program3 v2, but the customer program depends on program1 v1, program2 v1, and program3 v1 under the same package manager: conflict!
  • 8. Few choices left to you 1. Convince your customer (a.k.a. 甲) 2. Install all the dependencies manually (without the package manager) 3. Modify your program to make it depend on v1
  • 9. Use Docker for isolating your application • Host operating system: Linux kernel must be ≥3.10 (such as Ubuntu 14.04 and CentOS 7), running the Docker engine (daemon) • Package manager in host OS: customer program depends on program1 v1, program2 v1, and program3 v1 • Docker container with a package manager in guest OS: your program depends on program1 v2, program2 v2, and program3 v2
  • 10. Virtual machines and Docker containers • Virtual machines: a hypervisor runs on the host operating system (kernel, device drivers); each VM (a CentOS virtual machine with yum, an Ubuntu virtual machine with apt) carries its own kernel, device drivers, libraries, and app • Docker containers: the Docker engine runs on the host operating system (kernel, device drivers); each container (a CentOS-like container with yum, an Ubuntu-like container with apt) carries only libraries and app • Containers share the kernel in the host
  • 11. Linux namespaces – what makes isolated environments in a host OS • Each container under the Docker engine has its own pid, ipc, uts, net, mnt, and user namespaces • pid: process ID number space (starting from 1) • ipc: various IPC objects (POSIX message queue; System V IPC objects: mq, sem, shm) • uts: system identifiers (hostname, NIS domain name) • net: network devices, IPv4 and IPv6 stacks, routing tables, firewall • mnt: mount points (directory hierarchy) • user: security-related identifiers (user IDs, group IDs) • Six namespaces are enough to give an illusion of running inside a virtual machine
  • 12. Analogy between program and Docker • Program: source code, compiled into byte/machine code (read-only), executed as a process (read-only text; data, heap, stack) • Docker: Dockerfile, built into a Docker image (read-only layers), run as a Docker container (read-only layers + writable layer)
  • 13. How to define an image and run a container from it? 1) Write a Dockerfile • Specify installing Python with pip on Ubuntu • Tell pip to install numpy 2) Build an image from the Dockerfile • Each line of the Dockerfile is executed to build the image 3) Run a Docker container from the image
  • 14. 1 to N relationship between image and container • Execute five containers from an image • Q) Do five containers take up 2,445MB (=489MB*5) on the host? A) No, thanks to image layering & sharing
  • 15. Images consist of layers, each of which is a set of files • Instructions (FROM, RUN, CMD, etc.) create layers • Base images (imported by "FROM") also consist of layers • If a file exists in multiple layers, the one in the upper layer is seen • Diagram: the base ubuntu image named in the Dockerfile contributes its own layers (files); "apt-get install python-dev python-pip" and "pip install numpy" each add a layer on top; together the layers form the image
  • 16. Docker container • A container is just a thin read/write layer • Base images are not copied to containers • Copy-On-Write (COW): when a file in the base image is modified, copy the file to the R/W layer and then modify the copied file
  • 17. Image sharing between containers • The ubuntu:15.04 image (~188MB) is not copied to each container
  • 18. Layer sharing between images • If multiple Dockerfiles 1. start from the same base image and 2. share a sequence of instructions (one RUN instruction in the example below), then the Docker engine automatically reuses existing layers • Example: numpy Dockerfile and matplotlib Dockerfile
  • 19. Example of stacking Docker images • Services: Zookeeper cluster (five zk nodes), Kafka brokers (three), PdM engine (Kafka consumer, Kafka producer, web server) • Containers: Zookeeper container, Kafka container, PdM container (cpu), PdM container (gpu) • Images in the stack: kafka (with scala), zookeeper, openjdk:8, PdM engine (librdkafka, avro, flask), theano-cpu (theano, keras), theano-gpu (theano, keras), cuda, scipy (numpy, scipy, matplotlib, ipython, jupyter, pandas, scikit-learn, h5py), python:2.7, buildpack-deps:jessie, buildpack-deps:jessie-scm, buildpack-deps:jessie-curl, debian:jessie • Official images in the diagram: openjdk:8, zookeeper, python:2.7, buildpack-deps:jessie, buildpack-deps:jessie-scm, buildpack-deps:jessie-curl, debian:jessie • The scipy libraries have nothing to do with the GPU, so the scipy image is shared • theano compiles its expression graphs into CPU/GPU instructions • jessie is the latest stable Debian release • buildpack-deps contains essential tools to download/compile software
  • 20. Enabling technologies for Docker (wrap-up) • Linux namespaces (covered): isolate system resources (pid, net, ipc, mnt, uts, user) to make a secure & isolated environment (like a VM) • Advanced multi-layer unification file system (covered): image layering & sharing • Linux control groups (not covered): track, limit, and isolate resources (CPU, memory, network, and IO) * https://mairin.wordpress.com/2011/05/13/ideas-for-a-cgroups-ui/
  • 21. Docker topics not covered here • How to install the Docker engine • What the Docker instructions other than FROM, RUN, and CMD are • ENV / ADD / ENTRYPOINT / LABEL / EXPOSE / COPY / VOLUME / WORKDIR / ONBUILD • How to push local Docker images to Docker Hub • How to pull remote images from Docker Hub • ... Consult https://docs.docker.com/engine/getstarted/
  • 23. Disclaimer • The purpose of this section is to briefly explain Kubernetes without details • For a detailed explanation with the exact Kubernetes terminology, see the following slides • https://www.slideshare.net/ssuser6bb12d/kubernetes-introduction-71846110
  • 24. What is Kubernetes for? • Container-based virtualization + container orchestration • To satisfy common needs in production: replicating application instances; naming and discovery; load balancing; horizontal auto-scaling; co-locating helper processes; mounting storage systems; distributing secrets; application health checking; rolling updates; resource monitoring; log access and ingestion; ... • From the official site: https://kubernetes.io/docs/whatisk8s/
  • 25. Why Docker with Kubernetes? • A mission of our group • Systematize service deployment and service operation on clusters • I believe that systematizing something means minimizing the human effort spent on it • How to minimize human effort in service deployment? • Make software portable using a container technology • Docker (chosen for its maturity and popularity) • Rkt from CoreOS (alternative) • Build images and run containers anywhere • Your laptop, servers, on-premise clusters, even the cloud • How to minimize human effort in service operation? • Inform a container orchestration runtime of the service specification • Kubernetes from Google (chosen for its maturity and expressivity) • Docker Swarm from Docker • Define your specification and then the runtime operates your services as you wish
  • 26. Kubernetes architecture • Server: REST API server with a K/V store; scheduler (finds suitable machines for containers); controller manager (current state → desired state; makes changes if the state becomes undesirable) • Service specification (written in YAML): execute a web-server image; two replicas for LB & HA; 3GB memory each • Nodes in the diagram: three, each with a Docker engine, a node agent, and a container (3GB) • Ensure a specified # of replicas is running all the time
  • 27. Web server example • Want to launch 3 replicas for high availability and load balancing: webserver 4bp80 on node 1, webserver 6dk12 on node 2, webserver g1sdf on node 3 • How to achieve the following? • Users must be unaware of the replicas (a well-known address) • Traffic is evenly distributed to replicas • It's a piece of cake with Kubernetes!
  • 28. How to replicate your service instances • Specify your Docker image and a replication factor using a Deployment • Specify a common label to group containers with different names: webserver 4bp80 (node 1), webserver 6dk12 (node 2), and webserver g1sdf (node 3) all carry app=web1 • Diagram: the server, plus a node agent and a Docker engine on each node
  • 29. Define a service to do round-robin forwarding • <ingress> metatron:80: external traffic over the internet • <service> webserver:80: internal traffic, split 33% / 33% / 33% across webserver 4bp80 (node 1), webserver 6dk12 (node 2), and webserver g1sdf (node 3), all labeled app=web1 • Kubernetes runs its own DNS server for name resolution • Kubernetes manipulates iptables on each node to proxy traffic
  • 30. How to guarantee a certain # of running containers during maintenance • Define a disruption budget to specify the requirement for the minimum available containers • Setup: zk-0 on node1, zk-2 on node2, zk-3 on node3, each with containers and volumes (minAvailable=2) • Drain node1: operation is permitted because allowed-disruptions=1 • Kubernetes: 3 replicas have to be running due to the StatefulSet, so try scheduling zk-0 on other nodes! Oops! Cannot schedule zk-0 on node2 and node3 due to anti-affinity! • Drain node2: operation not permitted because allowed-disruptions=0 (note that minAvailable=2) • Hold on for a while: please wait until node1 is up and zk-0 is rescheduled!
  • 31. PdM Kubernetes cluster • Zookeeper headless service: a StatefulSet of three Pods, each running QuorumPeerMain with ports 2181, 2888, and 3888 • Kafka headless service: a StatefulSet of three Pods, each running Kafka (broker) with port 9092 • PdM service: a Pod (Deployment) running the PdM engine (Kafka consumer, Kafka producer, web server) • Ingress rule (ports 80 and 8080 appear in the diagram) • Persistent storage: volumes attached to the Pods
  • 32. Overview & Conclusion • Docker to build portable software • Build your software upon Docker • Then distribute it anywhere (even on MS Azure and AWS) • Kubernetes to orchestrate multiple Docker instances • Start using Docker and Kubernetes before it is too late! • Google has been using container technologies for more than 10 years • Figures: The Enterprise IT Adoption Cycle (Docker, Kubernetes, Hadoop); Popularity of Docker and Kubernetes
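
Slides 15 and 16 describe image layering and copy-on-write. The following is a minimal Python sketch of that idea, not Docker's actual storage driver: the `Image` and `Container` classes, the file paths, and the file contents are all hypothetical, and layers are modelled as plain dictionaries.

```python
class Image:
    """Read-only stack of layers; each layer maps a file path to its content."""

    def __init__(self, layers):
        # Layers are ordered from bottom (base image) to top (last instruction).
        self.layers = tuple(dict(layer) for layer in layers)

    def read(self, path):
        # If a file exists in several layers, the one in the upper layer is seen.
        for layer in reversed(self.layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)


class Container:
    """A thin read/write layer on top of a shared image."""

    def __init__(self, image):
        self.image = image    # shared with other containers, never copied
        self.rw_layer = {}    # private to this container

    def read(self, path):
        if path in self.rw_layer:
            return self.rw_layer[path]
        return self.image.read(path)

    def append(self, path, text):
        # Copy-on-write: copy the file up on first modification, then edit the copy.
        if path not in self.rw_layer:
            self.rw_layer[path] = self.image.read(path)
        self.rw_layer[path] += text


# Hypothetical three-layer image: base, python install, numpy install.
image = Image([
    {"/etc/motd": "hello\n"},
    {"/usr/bin/python": "python binary"},
    {"/usr/lib/numpy": "numpy files"},
])
containers = [Container(image) for _ in range(5)]
containers[0].append("/etc/motd", "edited\n")

print(repr(containers[0].read("/etc/motd")))
print(repr(containers[1].read("/etc/motd")))
print(all(c.image is image for c in containers))
```

Only the first container's private layer holds a modified copy of `/etc/motd`; the other four still read the file from the single shared image, which is the reason five containers do not cost five times the image size.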
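
The controller manager on slide 26 keeps the current state matching the desired state. Below is a rough sketch of one reconciliation step, assuming a hypothetical `reconcile` function and made-up container names; the real controller works through the API server rather than on a local set.

```python
import itertools


def reconcile(desired, running, prefix="webserver"):
    """Compare the current state with the desired replica count and
    return the corrective actions as (verb, container name) pairs."""
    running = sorted(running)
    actions = []
    if len(running) < desired:
        taken = set(running)
        candidates = (f"{prefix}-{i}" for i in itertools.count())
        fresh = (name for name in candidates if name not in taken)
        for name in itertools.islice(fresh, desired - len(running)):
            actions.append(("start", name))
    else:
        for name in running[desired:]:
            actions.append(("stop", name))
    return actions


def apply_actions(actions, running):
    """Apply the actions to the in-memory set of running containers."""
    for verb, name in actions:
        if verb == "start":
            running.add(name)
        else:
            running.discard(name)


desired_replicas = 2                       # "two replicas for LB & HA"
running = {"webserver-0", "webserver-1"}
running.discard("webserver-1")             # simulate a node failure

actions = reconcile(desired_replicas, running)
apply_actions(actions, running)
print(actions, sorted(running))
```

Running the same step in a loop is what "ensure a specified # of replicas running all the time" amounts to in this toy model: each pass either finds nothing to do or emits the start/stop actions that close the gap.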
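
Slides 28 and 29 group the replicas by a common label and put a service with a well-known address in front of them. A small sketch of that pattern, assuming a hypothetical `Service` class and an extra `database-x1y2z` pod added only to show the selector filtering; the round-robin rotation follows the slide's description and is not a claim about how Kubernetes picks a backend internally.

```python
from collections import Counter
from itertools import cycle

# Pod names, nodes, and labels from slides 28-29, plus one hypothetical outsider.
pods = [
    {"name": "webserver-4bp80", "node": "node 1", "labels": {"app": "web1"}},
    {"name": "webserver-6dk12", "node": "node 2", "labels": {"app": "web1"}},
    {"name": "webserver-g1sdf", "node": "node 3", "labels": {"app": "web1"}},
    {"name": "database-x1y2z", "node": "node 1", "labels": {"app": "db"}},
]


def select(pods, selector):
    """Return the names of pods whose labels match every key in the selector."""
    return [
        pod["name"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


class Service:
    """One well-known address that hides the replicas behind it."""

    def __init__(self, name, port, selector, pods):
        self.address = f"{name}:{port}"
        self._backends = cycle(select(pods, selector))

    def route(self):
        # Round-robin: hand each request to the next backend in turn.
        return next(self._backends)


service = Service("webserver", 80, {"app": "web1"}, pods)
hits = Counter(service.route() for _ in range(9))
print(service.address, dict(hits))
```

Clients only ever see `webserver:80`; the replica names can change freely as long as the `app=web1` label stays the same.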
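
The drain scenario on slide 30 comes down to simple arithmetic: allowed disruptions = healthy replicas − minAvailable. The sketch below replays that scenario under stated assumptions: `allowed_disruptions` and `try_drain` are hypothetical helper names, and anti-affinity is modelled simply by leaving an evicted pod unhealthy until its node comes back.

```python
def allowed_disruptions(healthy_count, min_available):
    """How many more pods may be taken down without breaking the budget."""
    return max(0, healthy_count - min_available)


def try_drain(node, placement, healthy, min_available):
    """Return True and evict the node's pod if the budget permits it."""
    pod = placement[node]
    if pod not in healthy:
        return True  # nothing running on this node to disrupt
    if allowed_disruptions(len(healthy), min_available) < 1:
        return False
    healthy.discard(pod)
    return True


# Placement and minAvailable as on slide 30.
placement = {"node1": "zk-0", "node2": "zk-2", "node3": "zk-3"}
healthy = set(placement.values())
min_available = 2

first = try_drain("node1", placement, healthy, min_available)
# Anti-affinity keeps zk-0 off node2 and node3, so it stays down for now.
second = try_drain("node2", placement, healthy, min_available)

healthy.add("zk-0")  # node1 is back up and zk-0 is rescheduled
third = try_drain("node2", placement, healthy, min_available)
print(first, second, third)
```

The second drain is refused because only two replicas are healthy and the budget requires two; once zk-0 is running again the same request goes through.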