Kubernetes promises to change the datacenter from being a group of computers into "a computer" itself. This presentation outlines the new features in the Kubernetes 1.1 and 1.2 releases.
Kubernetes - State of the Union (Q1-2016)
1. Kubernetes - State of the Union (Q1-2016)
Vadim Solovey - CTO, DoIT International
Google Cloud Developer Expert | Authorized Trainer
vadim@doit-intl.com
2. Agenda
1. Introduction to Containers & Kubernetes
2. What’s new and coming soon
3. Q&A
3. Containers
• Usage of micro-services
• Declarative management
• Highly flexible and scalable
• Automation-friendly
• Good for complex architectures
• Development for “Google scale”
Apps in Containers → Packaging containers → Kubernetes
5. Copyright 2016 Google Inc
How Can We Scale Out Container Workloads?
(Diagram: a cluster of nodes — ???)
How to handle...
• Placement?
• Scale?
• Node failure?
• Container failure?
• Application upgrades?
6. Kubernetes: Cluster scheduling
Node: Managed Base OS, Node Container Manager, Scheduled Containers
Cluster Scheduler:
● Schedule containers across machines
● Replication and resizing
● Service naming and discovery
7. The promise
“A datacenter is not a group of computers; a datacenter is a computer.”
8. Kubernetes: Moving parts
Pods
Pods are ephemeral units that are used to manage one or more tightly coupled containers. They enable data sharing and communication among their constituent components.
Labels
Labels are metadata that are attached to objects, such as pods. They enable organization and selection of subsets of objects within a cluster.
Replication Controllers
Replication controllers create new pod "replicas" from a template and ensure that a configurable number of those pods are running.
Services
A Service offers a low-overhead way to route requests to a logical set of pod backends in the cluster based on a label selector.
9. Kubernetes: More moving parts
Namespaces | Annotations | Secrets | Volumes | Persistent Volumes | Selectors | Load Balancers
10. Kubernetes: New kids on the block
Deployments | Daemon Sets | Jobs | Autoscalers | Ingress
11. Daemon Sets
12. Daemon Sets
A Daemon Set ensures that all (or some) nodes run a copy of a pod.
Node 1 Node 2 Node 3
pod pod pod
Popular use-cases:
● running a cluster storage daemon, such as glusterd or ceph
● running a logs collection daemon on every node, such as fluentd or logstash
● running a node monitoring daemon on every node, such as collectd, New Relic, or Ganglia
Alternatives:
● init scripts of your choice (init, upstart, systemd)
● bare pods
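For the logs-collection use-case above, a minimal Daemon Set manifest might look like the sketch below (the fluentd image tag and label names are illustrative, using the same extensions/v1beta1 API group as the other manifests in this deck):

```yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  template:
    metadata:
      labels:
        app: fluentd       # illustrative label
    spec:
      containers:
      - name: fluentd
        image: fluentd:latest   # illustrative image tag
```

Because there is no replica count, the Daemon Set controller itself decides to run exactly one such pod per (matching) node.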
13. Deployments
14. Deployments
A Deployment provides declarative updates for Pods and ReplicationControllers.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
A typical use case is:
● Create a deployment to bring up a replication controller and pods.
● Later, update that deployment to recreate the pods (for example, to use a new image).
$ kubectl create -f app.yaml
deployment "app" created
$ kubectl get deployments
NAME      UPDATEDREPLICAS   AGE
app       3/3               1m
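For the second step, one sketch (assuming the nginx-deployment manifest shown on this slide) is to bump the image tag in the manifest and re-apply it; the Deployment controller then recreates the pods with the new image:

```yaml
# app.yaml, with only the image line changed
    spec:
      containers:
      - name: nginx
        image: nginx:1.9.1   # was nginx:1.7.9
```

Then `kubectl apply -f app.yaml` submits the updated spec; the hypothetical 1.9.1 tag here just stands in for whatever new image you want rolled out.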
15. Horizontal Pod Autoscaling
16. Pod Autoscaling
Horizontal pod autoscaling allows the number of pods in a replication controller or deployment to scale automatically based on observed CPU utilization.
Details:
● Control loop: targetNumOfPods = ceil(sum(currentPodsCPUUtilization) / target)
● --horizontal-pod-autoscaler-sync-period
● Autoscaling during rolling update
(Diagram: Autoscaler scaling an RC / Deployment across Pod 1 .. Pod N)
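The control-loop formula above can be sketched in a few lines of Python (a toy illustration of the arithmetic only, not the actual controller code):

```python
import math

def target_num_of_pods(current_pods_cpu_utilization, target):
    """Desired replica count: ceil(sum of per-pod CPU utilization / target)."""
    return math.ceil(sum(current_pods_cpu_utilization) / target)

# Three pods running at 90% CPU against a 50% target: ceil(270 / 50) = 6
print(target_num_of_pods([90, 90, 90], 50))  # prints 6
```

Note the ceiling: the autoscaler rounds up, so utilization only slightly above the target still adds a pod.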
17. Ingress
18. The Ingress
An Ingress is a collection of rules that allow inbound connections to reach the cluster services.
(Diagram: Internet reaching Services directly, versus Internet reaching Services through an Ingress)
19. The Ingress Resource
A few potential use-cases include:
● Externally reachable URLs for services
● Traffic load balancing
● SSL termination
● Name-based virtual hosting
● More as it evolves...
Available controllers:
● GCE L7 LB
● nginx
● Write your own
20. The Ingress Resource
A minimal Ingress resource may look like this:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        backend:
          serviceName: test
          servicePort: 80
21. Creating an Ingress Resource
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress
spec:
  backend:
    serviceName: testsvc
    servicePort: 80

$ kubectl get ing
NAME           RULE   BACKEND      ADDRESS
test-ingress   -      testsvc:80   107.178.254.228
23. Simple Fan Out
A simple edge accepting ingress traffic and proxying it to the right endpoints:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test
spec:
  rules:
  - host: foo.bar.com
    http:
      paths:
      - path: /foo
        backend:
          serviceName: s1
          servicePort: 80
      - path: /bar
        backend:
          serviceName: s2
          servicePort: 80

$ kubectl get ing
NAME   RULE          BACKEND   ADDRESS
test   -                       178.91.123.132
       foo.bar.com
       /foo          s1:80
       /bar          s2:80
24. Name based virtual hosting
Name-based virtual hosts use multiple host names for the same IP address:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test
spec:
  rules:
  - host: foo.bar.com
    http:
      paths:
      - backend:
          serviceName: s1
          servicePort: 80
  - host: bar.foo.com
    http:
      paths:
      - backend:
          serviceName: s2
          servicePort: 80

(Diagram: foo.bar.com and bar.foo.com both resolve to 178.91.123.132; the Ingress routes foo.bar.com to s1:80 and bar.foo.com to s2:80)
25. Alternatives
You can expose a Service in multiple ways that don't directly involve the Ingress resource:
● Use Service.Type=LoadBalancer
● Use Service.Type=NodePort (ports 30000-32767)
● Use a Port Proxy
● Deploy the Service Loadbalancer. This allows you to share a single IP among multiple
services and achieve more advanced load balancing through service annotations.
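As a sketch of the NodePort alternative (the service name, selector, and port values here are illustrative), a Service exposed on a port from the node-port range might look like:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: testsvc
spec:
  type: NodePort
  selector:
    app: test            # illustrative pod label
  ports:
  - port: 80             # cluster-internal port
    nodePort: 30080      # must fall in the 30000-32767 range
```

Every node then proxies traffic arriving on port 30080 to the backing pods, with no Ingress controller involved.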
26. Gotchas
● The Ingress resource is not available in Kubernetes < 1.1
● You need an Ingress Controller to satisfy an Ingress.
○ Simply creating the resource will have no effect.
● On GCE/GKE there is a L7 LB controller, on other platforms you either need to write
your own or deploy an existing controller as a pod.
● The resource currently does not support HTTPS, but will do so before it leaves beta
(March/April 2016)
27. Future Work
● Various modes of HTTPS/TLS support (edge termination, SNI, etc.)
● Requesting an IP or Hostname via claims
● Combining L4 and L7 Ingress
● More Ingress controllers (haproxy, vulcan, zuul, etc)
28. Jobs
29. Going forward
30. Jobs
A job creates one or more pods and ensures that a specified number of them successfully
terminate.
Details:
● .restartPolicy, .parallelism & .completions
● replication controller vs jobs
● cron
apiVersion: extensions/v1beta1
kind: Job
metadata:
  name: pi
spec:
  selector:
    matchLabels:
      app: pi
  template:
    metadata:
      name: pi
      labels:
        app: pi
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
$ kubectl create -f ./job.yaml
jobs/pi
$ kubectl logs pi-aiw0a
3.141592653589793238462643383279502884197169399
37510582097494459230781640628620899862803482534
21170679821480865132823066470938446095505822317
25359408128481117450284102701938521105559644622
94895493038196442881097566593344612847564823371
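To run the same pod to completion several times, possibly in parallel, the .completions and .parallelism fields mentioned in the details above can be set in the Job spec like this (the values are illustrative):

```yaml
spec:
  completions: 8    # job is done after 8 successful pod completions
  parallelism: 2    # suggest at most 2 pods running concurrently
```

Parallelism is only a suggestion; the controller may run fewer pods while ramping up or when few completions remain.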
31. Going forward in 2016
● version 1.2 will also enable multi-zone clusters
● version 1.4 will allow multi-cluster federation (Ubernetes)
32. Q & A
Vadim Solovey - CTO, DoIT International
Google Cloud Developer Expert | Authorized Trainer
vadim@doit-intl.com
33. meetup.com/googlecloud
Editor's Notes
Questions to audience:
How many people are using containers in some environment (dev, ci, production)?
How many people are using some container orchestration engine (ecs, k8s, swarm, mesos)?
How many people know a little bit about Kubernetes?
Microservices take the Unix philosophy to your application design. Write programs that do one thing, and do it well. Write programs that work together.
Apps in containers provide an ideal infrastructure for micro-services; they are flexible, very automation-friendly, and built for complex architectures and scale.
So far, sounds familiar, right?
We can then create a node that hosts many containers.
This is much better.
My app & libraries get isolation through their containers,
and the container spins up on the order of a process (not booting a VM per app)
My app is kept portable, as containers that run on any modern linux stack.
We reduce the number of redundant OS kernels.
But
Google could not run if we programmed and operated at the individual node level.
We have to write our apps with a higher level construct, we have to program at the cluster level
As we saw when clusters came into Google, the number of services proliferates as ops & dev get better tools that cleave at the right abstraction layer.
We have to be cluster first.
GCE does not natively support any way to manage deployment, scaling and reliability of container based workloads.
How to handle replication?
What about node failure?
What about container failure?
How do we manage application upgrades?
Managed Base OS
Node Container Manager
Common services: log rotation, watchdog restarting
Containers:
System container for shared daemons. Statically defined.
Dynamically scheduled containers
Cluster Scheduler
Schedules work (tasks) onto nodes
Work specified based on intents
Surfaces data about running tasks, restarts, etc.
Essentially, the promise of Kubernetes is to turn a datacenter from a group of computers into a computer in itself.
Pods are ephemeral units that are used to manage one or more tightly coupled containers.
They enable data sharing and communication among their constituent components.
Labels are metadata that are attached to objects, such as pods.
They enable organization and selection of subsets of objects within a cluster.
Replication controllers create new pod "replicas" from a template and ensure that a configurable number of those pods are running.
A Service offers a low-overhead way to route requests to a logical set of pod backends in the cluster based on a label selector.
Services also provide a mechanism for surfacing legacy components, such as databases, within a cluster.
But there is also new functionality coming in 2016. Most of it is already available as beta features in the 1.1 release, and all of it will be GA with the 1.2 release scheduled for March/April 2016.
In a simple case, one Daemon Set, covering all nodes, would be used for each type of daemon. A more complex setup might use multiple DaemonSets for a single type of daemon, but with different flags and/or different memory and CPU requests for different hardware types.
It is certainly possible to run daemon processes by directly starting them on a node (e.g. using init, upstart, or systemd). This is perfectly fine. However, there are several advantages to running such processes via a DaemonSet:
Ability to monitor and manage logs for daemons in the same way as applications.
Same config language and tools (e.g. pod templates, kubectl) for daemons and applications.
Future versions of Kubernetes will likely support integration between DaemonSet-created pods and node upgrade workflows.
Running daemons in containers with resource limits increases isolation between daemons from app containers. However, this can also be accomplished by running the daemons in a container but not in a pod (e.g. start directly via Docker).
Bare Pods
It is possible to create pods directly which specify a particular node to run on. However, a Daemon Set replaces pods that are deleted or terminated for any reason, such as in the case of node failure or disruptive node maintenance, such as a kernel upgrade. For this reason, you should use a Daemon Set rather than creating individual pods.
We already have cluster resize with the 1.1 release on GCE, and now we are adding pod autoscaling.
Possible use-case for the default backend: serving a 404 page if none of the hosts in your Ingress match the Host in the request header, and/or none of the paths match the URL of the request.
A job creates one or more pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the job tracks the successful completions. When a specified number of successful completions is reached, the job itself is complete. Deleting a Job will cleanup the pods it created.
A simple case is to create 1 Job object in order to reliably run one Pod to completion. A Job can also be used to run multiple pods in parallel.
Multiple Completions
By default, a Job is complete when one Pod runs to successful completion. You can also specify that this needs to happen multiple times by specifying .spec.completions with a value greater than 1. When multiple completions are requested, each Pod created by the Job controller has an identical spec. In particular, all pods will have the same command line and the same image, the same volumes, and mostly the same environment variables. It is up to the user to arrange for the pods to do work on different things. For example, the pods might all access a shared work queue service to acquire work units.
To create multiple pods which are similar, but have slightly different arguments, environment variables or images, use multiple Jobs.
Parallelism
You can suggest how many pods should run concurrently by setting .spec.parallelism to the number of pods you would like to have running concurrently. This number is a suggestion. The number running concurrently may be lower or higher for a variety of reasons. For example, it may be lower if the number of remaining completions is less, or as the controller is ramping up, or if it is throttling the job due to excessive failures. It may be higher for example if a pod is gracefully shutdown, and the replacement starts early.
If you do not specify .spec.parallelism, then it defaults to .spec.completions.
Everyone is invited to the Google Cloud meetup to follow up on upcoming events and workshops.