Kubernetes & Cloud Native
Toronto
Bienvenue ! Welcome!
Thank you to our sponsors!
http://K8scanadaslack.herokuapp.com
Join the K8s Canada Slack
Seattle! Dec 11-13
Join #kubecon-seattle2018
Help us!
● In Montreal, Toronto, Ottawa, Quebec City, Kitchener-Waterloo
● Submit a talk
● Sponsor us! Join us on meetup.com
● Help us organize a meetup
Intro
Archy, Solutions Architect & CNCF Ambassador
Carol Trang
Community Manager
Kubernetes Certification
Hands-on workshops! Montreal and online
Deepen your knowledge of containers and microservices and their ecosystems.
● Docker and Kubernetes
● Advanced Docker and Kubernetes
● CI/CD
● IaC
● Machine Learning
● OpenShift
● Kubernetes on Google Cloud
● Kubernetes on Azure
● Kubernetes on AWS
cloudops.com/docker-and-kubernetes-workshops
info@cloudops.com
K8s 5000 ft. view
Kubernetes - K-10
Kubernetes K-100
● Kubernetes Addons
● CNI (Container Network Interface) (stable)
● CRI (Container Runtime Interface plugins) (alpha)
● CSI (Container Storage Interface plugins) (alpha)
● Scheduler webhook & multiple schedulers (beta)
● Device plugins (e.g. GPUs, NICs, FPGAs, InfiniBand) (alpha)
● External Cloud Provider Integrations (beta)
● API Server authn / authz webhooks (stable)
Extending Kubernetes Platform K-200
● Initializers & admission webhooks (beta)
● Istio sidecar auto-injection via mutating admission webhooks
● API aggregation (beta)
● kubectl plugins (alpha)
● Examples: kubectl ssh, kubectl switch, kubectl ip, kubectl uptime
● CustomResourceDefinitions (beta) (minimal example below)
● Operator framework (Rook, Vault, Prometheus, Kafka)
Extending Apps in Kubernetes K-300
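CustomResourceDefinitions let you add new API types without changing Kubernetes itself; a minimal sketch (the crontabs.stable.example.com name and fields are illustrative, using the apiextensions.k8s.io/v1beta1 schema current for these releases):
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  group: stable.example.com
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
    shortNames:
    - ct
Once applied, kubectl get crontabs works like any built-in resource; operators build their reconciliation logic on top of such custom types.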
K8s 1.12 & 1.13
● The third release of 2018, shipped September 28th!
● Release notes: https://github.com/kubernetes/kubernetes/releases
● Kubernetes 1.13, the fourth release of 2018, lands December 4th!
Kubernetes 1.12
Cluster Bootstrap
Kubernetes The Hard Way
Kelsey Hightower
Developer advocate
Kubernetes The Hard Way
1. Provisioning Compute Resources
2. Provisioning the CA and Generating TLS Certificates
3. Generating Kubernetes Configuration Files for Authentication
4. Generating the Data Encryption Config and Key
5. Bootstrapping the etcd Cluster
6. Bootstrapping the Kubernetes Control Plane
7. Bootstrapping the Kubernetes Worker Nodes
8. Configuring kubectl for Remote Access
9. Provisioning Pod Network Routes
10. Deploying the DNS Cluster Add-on
kops & kubeadm
Kubeadm, kops, and other deployment tools can now benefit from:
● TLS bootstrapping (stable): the kubelet generates a private key and a CSR, and submits the CSR to a cluster-level certificate signing process.
● TLS server certificate rotation (beta): in addition to using self-signed certificates, users can now generate a key locally and issue a CSR to the cluster API server for a certificate signed by the cluster CA, which is rotated before it expires (sketched below).
What’s new in 1.12
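A rough sketch of asking the kubelet to request a CA-signed serving certificate instead of a self-signed one (field names from the kubelet.config.k8s.io/v1beta1 KubeletConfiguration; verify against your cluster's version):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# rotate the kubelet client certificate as it nears expiry
rotateCertificates: true
# request the serving certificate via a CSR to the cluster CA (beta)
serverTLSBootstrap: true
The resulting serving CSRs still have to be approved, e.g. with kubectl certificate approve <csr-name>.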
What’s new in 1.13
● kubeadm is stable (GA)!
● Stable command-line UX (GA)
● Implementation (GA)
● Configuration file schema (beta)
● Upgrades between minor versions (GA)
● Secure etcd bootstrap
● HA (alpha):
kubeadm init
kubeadm join --experimental-control-plane
Example kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: stable
apiServer:
  certSANs:
  - "LOAD_BALANCER_DNS"
controlPlaneEndpoint: "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT"
etcd:
  external:
    endpoints:
    - https://ETCD_0_IP:2379
    - https://ETCD_1_IP:2379
    - https://ETCD_2_IP:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
HA kubeadm topologies
● 3 masters + 3 etcd members, co-located (stacked etcd)
● 3 masters + 3 etcd members, external etcd cluster
Scheduling
Current state of scheduling
● Basic scheduling
● DaemonSets
● Node selectors (e.g. scheduling on nodes with a GPU)
● Advanced scheduling
● Node affinity / priority
● Custom schedulers
● Taints / tolerations (e.g. specialized hardware, or hardware that is failing but has not yet failed) - see the sketch after this list
● Pod disruption budgets (cluster upgrades with stateful workloads)
● Pod priority and preemption (allows assigning priority to specific pods, e.g. to keep debuggers running during overload)
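A minimal sketch of two of these knobs (all names are illustrative): a taint keeps ordinary pods off a GPU node, while the pod that needs the hardware carries a matching toleration, a node selector, and a PriorityClass:
# taint a dedicated node so that only tolerating pods are scheduled there
# kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "For latency-critical workloads"
---
apiVersion: v1
kind: Pod
metadata:
  name: cuda-job
spec:
  priorityClassName: high-priority
  nodeSelector:
    accelerator: nvidia-gpu
  tolerations:
  - key: dedicated
    operator: Equal
    value: gpu
    effect: NoSchedule
  containers:
  - name: cuda
    image: nvidia/cuda
    resources:
      limits:
        nvidia.com/gpu: 1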
What’s new in 1.12
SIG Scheduling updates
● Quota by priority - beta
● Allows different namespaces to be restricted to different priority classes, with quota assigned per priority class (see the sketch below). This enhances the priority and preemption feature delivered in Kubernetes 1.11.
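A hedged sketch of quota by priority (namespace and class names are illustrative): a ResourceQuota whose scopeSelector only counts pods that use the high-priority PriorityClass:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-quota
  namespace: team-a
spec:
  hard:
    pods: "10"
    requests.cpu: "8"
  scopeSelector:
    matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values:
      - high-priority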
What’s new in 1.13
SIG Scheduling updates
● The scheduler can be configured to score only a subset of the cluster's nodes
● The Kubernetes scheduler can be told to consider only a percentage of the nodes, as long as it finds enough feasible nodes in that set (see the sketch below). This improves the scheduler's performance in large clusters.
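A sketch of the relevant component-config knob (the kubescheduler.config.k8s.io API group was still alpha at the time and has changed across releases, so treat this as illustrative):
apiVersion: kubescheduler.config.k8s.io/v1alpha1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
# score roughly 30% of the nodes per scheduling cycle
# (0 lets the scheduler pick a default based on cluster size)
percentageOfNodesToScore: 30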
Container Runtime Interface (CRI)
Container Runtime Interface (CRI): GA since Kubernetes 1.7
Avoid lock-in
Goals of CRI:
● Remove Docker-specific kubelet code out of Kubernetes
● Simplify integrating K8s with other runtimes (see the kubelet flags sketched after the list below)
CRI runtimes
● cri-docker
● rktlet
● cri-o (based on OCI)
● cri-containerd (alpha)
● virtlet (alpha)
● frakti (alpha)
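As a rough sketch, pointing the kubelet at a CRI runtime such as containerd looks like this (flag names as commonly used in this era; verify for your version):
kubelet \
  --container-runtime=remote \
  --container-runtime-endpoint=unix:///run/containerd/containerd.sock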
What’s new in 1.12
SIG Node updates:
● RuntimeClass - alpha (cluster-scoped runtime properties)
● runtimeClassName is a new field on the PodSpec that lets users designate the specific runtime they want to use (sketch below)
● E.g. it allows running Docker and gVisor containers in the same Kubernetes cluster, with runtime-specific parameters
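A hedged sketch of RuntimeClass: the API group and fields changed between the 1.12 alpha and later releases, so the node.k8s.io shape below is illustrative, and the handler name (runsc for gVisor) must match what the node's CRI runtime is configured with:
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-pod
spec:
  runtimeClassName: gvisor
  containers:
  - name: app
    image: nginx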
Kubernetes configuration management
SIG updates:
● Dynamic audit configuration (alpha)
● kubectl diff (beta) - example below
What’s new in 1.13
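kubectl diff shows what would change before you apply a manifest; a typical invocation (assuming a local deployment.yaml):
kubectl diff -f deployment.yaml
kubectl apply -f deployment.yaml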
Networking
Container Network Interface (CNI)
CNI is a specification proposed by CoreOS and adopted by Kubernetes. CNI is currently part of the CNCF.
Goals of CNI:
● Make the network layer easily pluggable
● CNM was not a good option for K8s
● Avoid code duplication
Third-party CNI plugins:
● Flannel
● Weave
● Calico
● Contiv and many more
Pod-to-Pod Communication (continued)
Cloud provider networking (kubenet):
● GCE
● AWS (50 host limit)
Overlay type:
● Flannel
● Weave
Layer 3 via BGP:
● Calico
● Kube-router (new)
Mixed
● Canal=Calico+Flannel
SDN
● Romana, OpenContrail
● Cisco, Openshift-SDN
● OVS
Network Policy
State of Network Policy in Kubernetes
Network Policy has been stable since the Kubernetes 1.7 release.
Features:
● Ingress policies (stable)
● Cross-namespace policies
● Egress policies (beta)
The focus of SIG Network in 1.12 was improving Network Policy features:
● Egress - stable
● Enables administrators to define how network traffic leaves a pod; these rules are added in addition to ingress Network Policy rules.
● ipBlock - stable
● ipBlock allows CIDR ranges to be defined in NetworkPolicy definitions.
What’s new in 1.12
Example of egress and ipBlock
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-block
  namespace: netpol-test
spec:
  podSelector:
    matchLabels:
      role: db
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 192.168.111.0/24
  policyTypes:
  - Egress
Focus of SIG Network:
● CoreDNS is GA and is now the default cluster DNS (Corefile sketch below)
What’s new in 1.13
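For reference, a Corefile close to what kubeadm deploys for CoreDNS (the plugin list varies by version, so treat this as a sketch):
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        upstream
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}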
Storage
● K8s has in-tree volume plugins, but adding support for new “in-tree” plugins is challenging
● CSI makes the Kubernetes volume layer truly extensible (beta)
Current state of Storage
SIG Storage contributed the following enhancements:
● Topology-aware dynamic provisioning - beta
● Topology-aware provisioning lets Kubernetes provision storage more intelligently, preventing the situation where a pod can't start because the storage resources it needs are in a different zone (see the StorageClass sketch below).
What’s new in 1.12
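A minimal sketch of a topology-aware StorageClass (the provisioner and parameters are illustrative for GCE PD): volumeBindingMode: WaitForFirstConsumer delays volume creation until a pod is scheduled, so the disk ends up in the zone where the pod actually lands:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-standard
provisioner: kubernetes.io/gce-pd
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: pd-standard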
SIG Storage contributed the following enhancements:
● Container Storage Interface (CSI) - GA
● Raw block devices via a persistent volume source (beta) - sketch below
● Topology-aware dynamic provisioning (stable)
What’s new in 1.13
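A sketch of a PVC requesting a raw block device (names and sizes are illustrative); the consuming pod then references it under volumeDevices with a devicePath instead of a filesystem mount:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 10Gi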
Autoscaling Features
Autoscaling in Kubernetes
● Horizontal Pod Autoscaling (HPA)
  based on CPU
  based on memory
  based on custom metrics
● Vertical Pod Autoscaling (VPA) - alpha
● Cluster Autoscaling
HPA
Horizontal Pod Autoscaling (HPA)
Kubernetes automatically scales the number of pods in a Deployment.
Metrics for autoscaling:
● observed CPU utilization
● observed memory utilization
● application-provided metrics, a.k.a. custom metrics
(Diagram: the autoscaler adjusts the number of pods, Pod 1 … Pod N, behind an RC / Deployment.)
HPA based on CPU with Metrics Server
Maintain a decent load
● If pods are heavily loaded, starting new pods brings the average load down.
● If pods are barely loaded, stopping some pods frees resources and the deployment should still be OK.
● Specify a target for the load and try to stay as close to it as possible.
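Roughly speaking, the HPA controller chases the target with desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). A quick way to try this against a deployment (assuming metrics-server is installed and the deployment is named web):
kubectl autoscale deployment web --cpu-percent=50 --min=2 --max=10
kubectl get hpa web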
Example of HPA with custom metrics using Prometheus
HPA v2 (beta) based on custom metrics using Prometheus
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: <object name>
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Pods
    pods:
      metricName: <your pod custom metric>
      targetAverageValue: 1k
  - type: Object
    object:
      metricName: <any custom metric>
      target:
        apiVersion: extensions/v1beta1
        kind: <resource type>
        name: <resource name>
      targetValue: 10k
VPA
Vertical Pod Autoscaling (VPA)
How VPA works:
● Resources: CPU / memory
● “Increases CPU/memory resources when necessary”
● Less complicated for an application to design for, since a resource increase is transparent
● Harder to autoscale (sketch of a VPA object below)
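A sketch of a VerticalPodAutoscaler object; VPA is an add-on and its API group/version have moved between releases (autoscaling.k8s.io shown here), so verify against the version you deploy:
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Auto"   # "Off" only produces recommendations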
VPA Architecture
VPA Limitations
● Alpha, so it needs testing to tease out edge cases
● No in-place updates yet (these require support from the container runtime)
● Usage spikes: how best to deal with them?
(Proposal) VPA Architecture with in-place update
SIG Autoscaling made significant improvements to HPA and VPA:
● HPA (Horizontal Pod Autoscaler)
● Scaling via custom metrics (metrics-server) - beta
● Improved scaling algorithm that reaches the target size faster - beta
The algorithm used to determine how many pods should be active has been adjusted to improve time-to-completion.
● VPA (Vertical Pod Autoscaler) - beta
What’s new in 1.12
Cloud Providers
● Runtime options:
● Container-Optimized OS with containerd (cos_containerd) - beta
● gVisor
● Container-native load balancing
● Serverless add-on (Knative)
● Managed Istio
Kubernetes 1.12 (GCP)
● Support for Azure Virtual Machine Scale Sets (VMSS)
● Cluster autoscaler support (Stable)
● Azure availability zone support (alpha)
● In the future, AKS will ship with VMSS support
Kubernetes 1.12 (Azure)
Kubernetes 1.13 (AWS)
● AWS ALB ingress controller (alpha)
● EBS CSI driver (alpha)
CNCF Update
Keynotes - CNCF Project Update
Falco
A runtime security tool developed by Sysdig, designed to
detect anomalous activity and intrusions in Kubernetes
● Abnormal behavior detection for Linux-based containers, hosts, and orchestration platforms
● Commonly referred to as “runtime security”
● Its filter language can easily detect events such as:
○ Shells/processes spawned in a container
○ Unexpected outbound connections
○ Processes listening on unexpected ports
○ Files/binaries changed after container start
○ Container isolation compromised
● Automated action can be taken when abnormal events are detected (sample rule below)
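A simplified rule, loosely modelled on Falco's default rule set (the condition uses standard Falco/Sysdig filter fields; tune it before relying on it):
- rule: Terminal shell in container
  desc: A shell was spawned inside a container
  condition: evt.type = execve and container.id != host and proc.name = bash
  output: "Shell spawned in a container (user=%user.name container=%container.name image=%container.image)"
  priority: WARNING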
Why do you need Falco?
● Image scanning is “point in time” security of choices made by developers
● You need the ability to detect breakdowns in isolation while containers are running
● Falco can detect compromised:
○ Container isolation (vulnerabilities in container runtimes / the Linux kernel)
○ Applications (exploited applications)
○ Orchestration systems (exposed dashboards, API ports)
● Enforces best practices & compliance requirements (PCI, SOC, GDPR)
How does Falco work?
Falco Ecosystem
Integrations with CNCF projects:
- Kubernetes, rkt, containerd, fluentd
Other integrations:
- Sysdig, Mesos, Marathon
Default Rule Set:
- Ships with 25 rules around container best practices.
Example:
- https://sysdig.com/blog/kubernetes-security-logging-fluentd-falco/
Example Falco Use Case
Runtime Security Tools Space
Proprietary
A number of vendors provide runtime security as
part of a broader container security product. These
products bundle capabilities from multiple security
areas - such as image scanning, access control,
and firewalling - to create a more extensive security
product.
- Sysdig Secure: The Falco rules
engine is used along with proprietary
software to create a SaaS based security
product.
- Aqua Security
- Twistlock
Open Source
Falco is one component of a complete security tool
set for Cloud Native platforms. Other
complementary open source projects include
Anchore, Clair, Inspec, Cilium, Notary, TUF, SPIFFE,
Vault, etc. Each project covers a different area of
infrastructure, software build, or runtime security.
- Falco
Incubation
Rook: Sandbox -> Incubation
A cloud-native orchestrator for distributed storage systems
● Cloud-Native Storage Orchestrator
● Automates deployment, bootstrapping,
configuration, provisioning, scaling,
upgrading, migration, disaster recovery,
monitoring, and resource management
● Framework for many storage providers
and solutions
What is Rook?
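A hedged sketch of the custom resource the Rook operator watches for a Ceph cluster (the API group/version differ between Rook releases; ceph.rook.io/v1 and all values below are illustrative):
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^sd[b-d]"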
Rook Design
(Diagram: the Rook operator runs against the Kubernetes API (etcd, kubectl) and watches new objects such as storage clusters, storage pools, object stores, and file stores, exposing management and health APIs. A Rook agent on each node provides a Flexvolume driver that the kubelet uses for attach and mount operations, driven by volume attachment objects.)
Ref: https://rook.io/docs/rook/master/
Rook Design with Ceph
(Diagram: application containers consume volume claims; a Rook operator plus a rook-agent on every node manage the underlying Ceph storage.)
● v0.7 released Feb 21, v0.8 released July 18
○ 545 commits total
● Instituted formalized project governance policies, added a new maintainer
● Rook Framework for Storage Providers
○ Makes Rook a general cloud-native storage orchestrator
○ Supports multiple new storage solutions with reusable specs, logic, policies
○ CockroachDB and Minio orchestration released in v0.8
○ NFS, Cassandra, Nexenta, Alluxio ongoing
● Ceph support graduated to Beta maturity
● Automatic horizontal scaling by the Ceph operator
● Improved security model and support for OpenShift
● Numerous other features and improvements
Progress Since Sandbox Entry
Adopters: Production Usage
There are additional adopters of Rook, especially those with on-premise deployments, that are
not ready to share the details of their usage publicly at this time.
Centre of Excellence in Next Generation Networks
● 20 bare-metal nodes providing 100TB, with more being added
● Heterogeneous mix of nodes with high disk density as well as
compute-focused nodes
● Several databases, web applications, and a self-hosted file sharing
solution
“Rook is giving us a big head start in deploying cloud-native Ceph...having an
operator that can help deploy and manage Ceph in a cloud-native environment
is an ideal solution...gives us the ability to leverage both the storage and the extra
compute capabilities of the storage-dense nodes”
Raymond Maika, Cloud Infrastructure Engineer at CENGN
Harbor: Sandbox -> Incubation
A trusted container registry that stores, signs, and scans Docker images.
An open source trusted cloud native registry project.
vmware.github.io/harbor
What makes a trusted cloud native registry?
− Registry features include
■ Docker and Helm Registry
■ Multi-tenant content signing and validation
■ Security and vulnerability analysis
■ Role-based access control and LDAP/AD support
■ Image deletion & garbage collection
■ Image replication between instances
■ Internationalization (currently English and Chinese)
− Operational experience
■ Deployed in containers
■ Extends, manages, and integrates proven open source components
Architecture
(Diagram: Harbor integrates multiple open source components to provide a trusted registry. Users (GUI/API) and container schedulers/runtimes reach a core service (API/auth/GUI) through an API routing layer, which fronts the image registry, trusted content (signing), vulnerability scanning, job, and admin services. Supporting services include LDAP/Active Directory; persistence is provided by a SQL database, key/value storage, and local or remote block/file/object storage.)
Kubernetes Deployment
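A common way to deploy Harbor on Kubernetes is via its Helm chart; a hedged sketch using Helm 2 syntax (the chart repository and value names have moved between releases, so check the current Harbor docs):
helm repo add harbor https://helm.goharbor.io
helm install --name my-harbor harbor/harbor \
  --set expose.type=ingress \
  --set externalURL=https://harbor.example.com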
Web interface and vulnerability scanning
Graduation
Envoy: Incubation -> Graduation
A modern edge and service proxy
Envoy
● A C++-based L4/L7 proxy
● Low memory footprint
● Battle-tested @ Lyft
○ 100+ services
○ 10,000+ VMs
○ 2M req/s
Features:
● API-driven config updates → no reloads (see the sketch below)
● Zone-aware load balancing w/ failover
● Traffic routing and splitting
● Health checks, circuit breakers, timeouts, retry budgets, fault injection, …
● HTTP/2 & gRPC
● Transparent proxying
● Designed for observability
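A rough sketch of an Envoy static bootstrap from the v2 config era (listener/cluster names and the backend address are illustrative; newer Envoy versions use typed_config instead of config):
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 10000 }
    filter_chains:
    - filters:
      - name: envoy.tcp_proxy
        config:
          stat_prefix: ingress_tcp
          cluster: backend
  clusters:
  - name: backend
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    hosts:
    - socket_address: { address: backend.default.svc.cluster.local, port_value: 8080 }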
Solutions built with Envoy
What’s next?
How to learn more about CNCF projects?
2018-19 KubeCon + CloudNativeCon
• China
– Shanghai: November 14-15, 2018
– General session CFP closed!
– Intro and Deep Dive Sessions CFP
• North America
– Seattle: December 11 - 13, 2018
– CFP open until August 12, 2018
– Intro and Deep Dive Sessions CFP
• Europe
– Barcelona: May 21 - 23, 2019
CNCF Landscape (card mode)
CNCF Landscape
CNCF Trail Map