Cluster management with Kubernetes

Cluster Management with
Kubernetes
Please open the gears tab below for the speaker
notes
Satnam Singh satnam@google.com
Work of the Google Kubernetes team and many open source contributors
University of Edinburgh, 5 June 2015

The promise of cloud computing

Cloud software deployment is soul destroying
Typically a cloud cluster node is a VM running a specific version of Linux.
User applications comprise components each of which may have different
and conflicting requirements from libraries, runtimes and kernel features.
Applications are coupled to the version of the host operating system: bad.
Evolution of the application components is coupled to (and in tension with)
the evolution of the host operating system: bad.
Also need to deal with node failures, spinning up and turning down replicas
to deal with varying load, updating components with disruption …
You thought you were a programmer but you are now a sys-admin.

What is Docker?
An implementation of the container idea
A package format
Resource isolation
An ecosystem

Virtual Machines workloads?
We need to isolate the application components from the host environment.

Docker
“build once, run anywhere”

Resource isolation
Implemented by a number of Linux APIs:
• cgroups: Restrict resources a process can consume
• CPU, memory, disk IO, ...
• namespaces: Change a process’s view of the system
• Network interfaces, PIDs, users, mounts, ...
• capabilities: Limits what a user can do
• mount, kill, chown, ...
• chroots: Determines what parts of the filesystem a user can see

We need more than just packing and isolation
Scheduling: Where should my containers run?
Lifecycle and health: Keep my containers running despite failures
Discovery: Where are my containers now?
Monitoring: What’s happening with my containers?
Auth{n,z}: Control who can do things to my containers
Aggregates: Compose sets of containers into jobs
Scaling: Making jobs bigger or smaller
...

Google confidential │ Do not distribute
Everything at Google runs in
containers:
• Gmail, Web Search, Maps, ...
• MapReduce, MillWheel, Pregel, ...
• Colossus, BigTable, Spanner, ...
• Even Google’s Cloud Computing
product GCE itself: VMs run in
containers

Open Source Containers: Kubernetes
Greek for “Helmsman”; also the root of the word
“Governor” and “cybernetic”
• Container orchestrator
• Builds on Docker containers
• also supporting other container technologies
• Multiple cloud and bare-metal environments
• Supports existing OSS apps
• cannot require apps becoming cloud-native
• Inspired and informed by Google’s experiences
and internal systems
• 100% Open source, written in Go
Let users manage applications, not machines

Primary concepts
Container: A sealed application package (Docker)
Pod: A small group of tightly coupled Containers
Labels: Identifying metadata attached to objects
Selector: A query against labels, producing a set result
Controller: A reconciliation loop that drives current state towards desired state
Service: A set of pods that work together

Application Containers
Homogenous Machine Fleet (Virtual or Physical)
Kubernetes API: Unified Compute Substrate

Kubernetes Architecture
etcd API Server
Scheduler
Controller Manager
Kubelet
Service Proxy
kubectl,
ajax, etc

Modularity
Loose coupling is a goal everywhere
• simpler
• composable
• extensible
Code-level plugins where possible
Multi-process where possible
Isolate risk by interchangeable parts
Example: ReplicationController
Example: Scheduler

Reconciliation between declared and actual state

Control loops
Drive current state -> desired state
Act independently
APIs - no shortcuts or back doors
Observed state is truth
Recurring pattern in the system
Example: ReplicationController
observe
diff
act

Atomic storage
Backing store for all master state
Hidden behind an abstract interface
Stateless means scalable
Watchable
• this is a fundamental primitive
• don’t poll, watch
Using CoreOS etcd

Pods: Grouping containers
Container Foo
Namespaces
- Net
- IPC
- ..
Container Bar

Pods: Networking
Container Foo
Container Bar
Namespaces
- Net
- IPC
- ..

Pods: Volumes
Container Foo
Container Bar
Namespaces
- Net
- IPC
- ..

Pods: Labels
Container Foo
Container Bar
Namespaces
- Net
- IPC
- ..

User
owned
Admin
owned
Persistent Volumes
A higher-level abstraction - insulation
from any one cloud environment
Admin provisions them, users claim
them
Independent lifetime and fate
Can be handed-off between pods and
lives until user is done with it
Dynamically “scheduled” and managed,
like nodes and pods Pod
ClaimRef
PVClaim
PersistentVolume
GCE PD AWS ELB
NFSiSCSI

Labels
Arbitrary metadata
Attached to any API object
Generally represent identity
Queryable by selectors
• think SQL ‘select ... where ...’
The only grouping mechanism
Use to determine which objects to apply
an operation to
• pods under a ReplicationController
• pods in a Service
• capabilities of a node (scheduling constraints)
App: Nifty
Phase: Dev
Role: FE
App: Nifty
Phase: Dev
Role: BE
App: Nifty
Phase: Test
Role: FE
App: Nifty
Phase: Test
Role: BE

Selectors
App: Nifty
Phase: Dev
Role: FE
App: Nifty
Phase: Test
Role: FE
App: Nifty
Phase: Dev
Role: BE
App: Nifty
Phase: Test
Role: BE

App == NiftyApp: Nifty
Phase: Dev
Role: FE
App: Nifty
Phase: Test
Role: FE
App: Nifty
Phase: Dev
Role: BE
App: Nifty
Phase: Test
Role: BE
Selectors

App == Nifty
Role == FE
App: Nifty
Phase: Dev
Role: FE
App: Nifty
Phase: Test
Role: FE
App: Nifty
Phase: Dev
Role: BE
App: Nifty
Phase: Test
Role: BE
Selectors

App == Nifty
Role == BE
App: Nifty
Phase: Dev
Role: FE
App: Nifty
Phase: Test
Role: FE
App: Nifty
Phase: Dev
Role: BE
App: Nifty
Phase: Test
Role: BE
Selectors

App == Nifty
Phase == Dev
App: Nifty
Phase: Dev
Role: FE
App: Nifty
Phase: Test
Role: FE
App: Nifty
Phase: Dev
Role: BE
App: Nifty
Phase: Test
Role: BE
Selectors

App == Nifty
Phase == Test
App: Nifty
Phase: Dev
Role: FE
App: Nifty
Phase: Test
Role: FE
App: Nifty
Phase: Dev
Role: BE
App: Nifty
Phase: Test
Role: BE
Selectors

Pod lifecycle
Once scheduled to a node, pods do not move
• restart policy means restart in-place
Pods can be observed pending, running, succeeded, or failed
• failed is really the end - no more restarts
• no complex state machine logic
Pods are not rescheduled by the scheduler or apiserver
• even if a node dies
• controllers are responsible for this
• keeps the scheduler simple
Apps should consider these rules
• Services hide this
• Makes pod-to-pod communication more formal

Replication Controllers
production
backend
production
backendproduction
backend
#N

Replication Controllers
A type of controller (control loop)
Ensure N copies of a pod always running
• if too few, start new ones
• if too many, kill some
• group == selector
Cleanly layered on top of the core
• all access is by public APIs
Replicated pods are fungible
• No implied ordinality or identity
Other kinds of controllers coming
• e.g. job controller for batch
Replication Controller
- Name = “nifty-rc”
- Selector = {“App”: “Nifty”}
- PodTemplate = { ... }
- NumReplicas = 4
API Server
How
many?
3
Start 1
more
OK
How
many?
4

Services
production
backend
production
backend
production
backend
port(s)
name
1.2.3.4
“name”

Services
10.0.0.1 : 9376
Client
kube-proxy
Service
- Name = “nifty-svc”
- Selector = {“App”: “Nifty”}
- Port = 9376
- ContainerPort = 8080
Portal IP is assigned
iptables
DNAT
TCP / UDP
apiserver
watch
10.240.2.2 : 808010.240.1.1 : 8080 10.240.3.3 : 8080
TCP / UDP

A Kubernetes cluster on Google Compute Engine

Node 02ej: logging, monitoring

A counter pod
apiVersion: v1
kind: Pod
metadata:
name: counter
namespace: demo
spec:
containers:
- name: count
image: ubuntu:14.04
args: [bash, -c,
'for ((i = 0; ; i++)); do echo "$i: $(date)"; sleep 1; done']

A counter pod
$ kubectl create -f counter-pod.yaml --namespace=demo
pods/counter
$ kubectl get pods
NAME READY REASON RESTARTS AGE
fluentd-cloud-logging-kubernetes-minion-1xe3 1/1 Running 0 5m
fluentd-cloud-logging-kubernetes-minion-p6cu 1/1 Running 0 5m
fluentd-cloud-logging-kubernetes-minion-s2dl 1/1 Running 0 5m
fluentd-cloud-logging-kubernetes-minion-ypau 1/1 Running 0 5m
kube-dns-v3-55k7n 3/3 Running 0 6m
monitoring-heapster-v1-55ix9 0/1 Running 12 6m

Observing the output of the counter
$ kubectl logs counter --namespace=demo
0: Tue Jun 2 21:37:31 UTC 2015
1: Tue Jun 2 21:37:32 UTC 2015
2: Tue Jun 2 21:37:33 UTC 2015
3: Tue Jun 2 21:37:34 UTC 2015
4: Tue Jun 2 21:37:35 UTC 2015
5: Tue Jun 2 21:37:36 UTC 2015
...

ssh onto node and “ps”
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS
PORTS NAMES
532247036a78 ubuntu:14.04 ""bash -c 'i=0; whi About a minute ago Up About a
minute k8s_count.dca54bea_counter_demo_479b8894-0971-11e5-a784-42010af00df1_f6159d40
8cd07658287d gcr.io/google_containers/pause:0.8.0 "/pause" About a minute ago Up About a
minute k8s_POD.e4cc795_counter_demo_479b8894-0971-11e5-a784-42010af00df1_7de2fec0
b2dc87db6608 gcr.io/google_containers/fluentd-gcp:1.6 ""/bin/sh -c '/usr/ 16 minutes ago Up 16 minutes
k8s_fluentd-cloud-logging.463ca0af_fluentd-cloud-logging-kubernetes-minion-
27gf_default_4ab77985c0cb4f28a020d3b097af9654_3e908886
c5d8641d884d gcr.io/google_containers/pause:0.8.0 "/pause" 16 minutes ago Up 16 minutes
k8s_POD.e4cc795_fluentd-cloud-logging-kubernetes-minion-27gf_default_4ab77985c0cb4f28a020d3b097af9654_2b980b91

Example: Music DB + UI
http://music-db:9200
http://music-ui:5601
music-db
music-db
music-db
music-db
music-ui

Example: Elasticsearch + Kibana Music DB & UI
apiVersion: v1
kind: ReplicationController
metadata:
labels:
app: music-db
name: music-db
spec:
replicas: 4
selector:
app: music-db
template:
metadata:
labels:
app: music-db
spec:
containers:
- name: es
image: kubernetes/elasticsearch:1.0
env:
- name: "CLUSTER_NAME"
value: "mytunes-db"
- name: "SELECTOR"
value: "name=music-db"
- name: "NAMESPACE"
value: "mytunes"
ports:
- name: es
containerPort: 9200
- name: es-transport
containerPort: 9300

Music DB Replication Controller
apiVersion: v1
kind: ReplicationController
metadata:
labels:
app: music-db
name: music-db
spec:
replicas: 4
selector:
app: music-db
template:
metadata:
labels:
app: music-db
spec:
containers:
...

Music DB container
containers:
- name: es
image: kubernetes/elasticsearch:1.0
env:
- name: "CLUSTER_NAME"
value: "mytunes-db"
- name: "SELECTOR"
value: "name=music-db"
- name: "NAMESPACE"
value: "mytunes"
ports:
- name: es
containerPort: 9200
- name: es-transport
containerPort: 9300

Music DB Service
apiVersion: v1
kind: Service
metadata:
app: music-db
labels:
app: music-db
spec:
selector:
app: music-db
ports:
- name: db
port: 9200
targetPort: es

Music DB
music-db
music-db
music-db
music-db

Music UI Pod
apiVersion: v1
kind: Pod
metadata:
name: music-ui
labels:
app: music-ui
spec:
containers:
- name: kibana
image: kubernetes/kibana:1.0
env:
- name: "ELASTICSEARCH_URL"
value: "http://music-db:9200"
ports:
- name: kibana
containerPort: 5601

Music UI Service
apiVersion: v1
kind: Service
metadata:
name: music-ui
labels:
app: music-ui
spec:
selector:
app: music-ui
ports:
- name: kibana
port: 5601
targetPort: kibana
type: LoadBalancer

Music DB + UI
http://music-ui:5601
music-db
music-db
music-db
music-db
music-ui
http://104.197.86.235:5601

Scale DB and UI independently
music-db
music-db
music-db
music-ui
music-ui

Monitoring
Optional add-on to Kubernetes clusters
Run cAdvisor as a pod on each node
• gather stats from all containers
• export via REST
Run Heapster as a pod in the cluster
• just another pod, no special access
• aggregate stats
Run Influx and Grafana in the cluster
• more pods
• alternately: store in Google Cloud Monitoring

Logging
Optional add-on to Kubernetes clusters
Run fluentd as a pod on each node
• gather logs from all containers
• export to elasticsearch
Run Elasticsearch as a pod in the cluster
• just another pod, no special access
• aggregate logs
Run Kibana in the cluster
• yet another pod
• alternately: store in Google Cloud Logging

Example: Rolling Upgrade with Labels
Servers:
Labels:
backend
v1.2
backend
v1.2
backend
v1.2
backend
v1.2
backend
v1.3
backend
v1.3
backend
v1.3
backend
v1.3
backend
Replication
Controller
replicas: 4
v1.2
Replication
Controller
replicas: 1
v1.3
replicas: 3 replicas: 2replicas: 3replicas: 2replicas: 1 replicas: 4replicas: 0

Questions?
Images by Connie Zhou
http://kubernetes.io

Cluster management with Kubernetes

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Cluster management with Kubernetes

Similar to Cluster management with Kubernetes (20)

Recently uploaded

Recently uploaded (20)

Cluster management with Kubernetes