GitOps
Git push all the things
Alexis Richardson
CEO, Weaveworks
TOC Chair, CNCF
@monadic
May 2018
Hello
● WTF is GitOps
● Why is Cloud Native relevant
● How does GitOps work and in what ways is it different from $MY_DEVOPS
● Tools
● Recap
Meet Qordoba
● SF-based team uses machine learning
to create “local” marketing UX for big
brands
● Rapid iteration while obeying SOC2
compliance
● Google Cloud – Kubernetes & CI
● Weave Cloud – single cont. delivery
& observability pipeline
Over 30 releases per day per team, up from 1-2 per week across all teams
1) Estimated time needed to fix prod software bugs ~60% less time
2) Estimated time to respond to customer requests ~43% less time
3) Uptime 99% → 100% (so far…!)
Impact
Kubernetes: declarative infrastructure & orchestration
Image credit:
Helen Beal,
Ranger4
At least a decade of DevOps best practices
GitOps is
Automation for
Cloud Native
Describe the system
& build to that plan
New ways of working
cloud led us to devops
cloud native leads us to gitops
“push code, not containers”
“operations by pull request”
GitOps follows the Logic of DevOps
• Config is code
• Code must be version controlled
• Config must be version controlled too
• What can be described can be automated
• Describe everything: code, config,
monitoring & policy; and then keep it in
version control
GitOps
• Git as a source of truth for the desired state of the whole system (yes, really,
the whole system)
• Control loop compares desired with actual state to pull changes,
enforce convergent atomic updates and write back to a log in Git
• Diff alerts, eg.:
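The example image for this slide did not survive extraction. As a stand-in, here is a minimal Python sketch of the diff step of such a control loop. It assumes desired and actual state can each be reduced to a flat mapping of service name to image tag; the function name `diff_state` and the sample data are hypothetical and are not taken from any real tool.

```python
from typing import Any, Dict, List


def diff_state(desired: Dict[str, Any], actual: Dict[str, Any]) -> List[str]:
    """Return one alert string per key where the cluster differs from Git."""
    alerts = []
    for key in sorted(set(desired) | set(actual)):
        if key not in actual:
            alerts.append(f"missing in cluster: {key} (desired {desired[key]!r})")
        elif key not in desired:
            alerts.append(f"not in Git: {key} (actual {actual[key]!r})")
        elif desired[key] != actual[key]:
            alerts.append(
                f"drift: {key} desired {desired[key]!r}, actual {actual[key]!r}"
            )
    return alerts


# Hypothetical sample data: image tag per service.
desired = {"frontend": "v2", "ratings": "v1"}
actual = {"frontend": "v1", "ratings": "v1", "debug-pod": "latest"}

for alert in diff_state(desired, actual):
    print(alert)
```

An empty result means the cluster has converged on what Git describes; anything else is a candidate for an alert or for an automated correction.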
Atomic updates for
declarative stack
Developer experience
is just Git push
Best practice for
Continuous Delivery
with Kubernetes
[Diagram: control loop around Kubernetes. Current state via observability tools; desired state in Git; diff; control & operations; release. Loop stages: Observe, Orient, Decide, Act]
What this gets us
• Any developer can use GitHub
• Anyone can join team and ship a new
app or make changes easily
• All changes can be triggered, stored,
audited and validated in Git
And we didn’t have to do anything very
new or clever
“The world is envisioned
as a repo and not as a
Kubernetes installation”
- Kelsey Hightower
Kubernetes ❤ GitOps
Kubernetes is complex, ideally you’d like to…
Make a pull request & just go to a URL to see app change
Avoid kubectl
Have “Bonus points for Metrics… If you give people visibility, they will
stop asking for tools like kubectl to do their job, because now they can
actually observe what’s happening in the cluster”
Who is talking about or doing GitOps?
Weaveworks
Cloudbees
Bitnami
OpenFaaS
Hasura
Ocado
Financial Times
& more!
About Weaveworks
● Founded in 2014, backed by Google Ventures &
Accel Partners
● Mission: help software teams go faster by
providing technologies that support cloud native
development
● 40 people
● Berlin
● London
● San Francisco
Team
Team
Some of us are known for...
● Building cloud-native OSS since 2014
(Weave Net, Moby, Kubernetes, Prometheus)
● Founding member of CNCF
● Alexis Richardson (Weaveworks CEO) is chair of
the CNCF Technical Oversight Committee
● Weave Cloud has run on Kubernetes since 2015
About Weaveworks
• We use declarative infrastructure, i.e.
Kubernetes, Docker, Terraform, … and we
“diff all the things”
• Our entire system including code, config,
monitoring rules, dashboards, is described
in GitHub with full audit trail
• We roll out major or minor changes as pull
requests for any updates, outages and D/R
GitOps at Weaveworks
Cloud Native
Copenhagen: Home of Lego
CNCF in 2016
CNCF in 2018
CNCF is building a cloud platform
● Goal of a Cloud Platform for era of ubiquitous services
→ a bigger deal than the Web
→ open like Linux
→ everyone is on board this time
● Business peeps TL;DR: Cloud Native is Cloud
● Outcome: innovation and new business models to make profit
Velocity
Hadoop
Typical Hadoop Project 2013
2018: Kubeflow
Componentisation
No platform?
Who
wants to
build a
toaster?
Platforms enable Velocity
● Higher speed
● Lower barriers to entry
● Explosion of higher order systems
Velocity is a key metric in Continuous Delivery
High-performing teams deploy
more frequently and have
much faster lead times
They make changes with fewer
failures, and recover faster
from failures
200x more frequent
deployments
2,555x shorter lead
times
3x lower
change failure rate
24x faster
recovery from failures
Source: 2016 State of DevOps Report (Puppet Labs)
Make me a Velocity
Developers write code
that powers Applications
and integrates Services
deployed to a Cloud Platform that is easy, stable & operable
using best practices for Continuous Delivery at high velocity
New Cloud Platform
“Just run my code”
Kubernetes
Infra - Cloud & DCs & Edge
Other CNCF
Projects
Local Services &
Data
Code >>
Containers >>
1000s of ways to “Just Run My Code”
● Serverless: OpenFaaS, Kubeless, OpenEvents, AWS Lambda…
● PaaS (OpenShift, Cloud Foundry…), MBaaS, KMaaS, …
● Kubeflow, Istio, Pachyderm and other k8s-native app frameworks
● Declarative app definition, e.g. Compose, ksonnet, Ballerina
● Native general frameworks: Metaparticle
● Ports: Laravel (PHP!) and other app frameworks to Kube
● Tools: cert-manager, ChaosIQ, …
● Explosion of higher order systems is caused by the platform
Serverless & Kubernetes will converge
● Ubiquity of Kubernetes will pull serverless into the story - from “run my
containers” to “run my code”
● Consumption and packaging of services is where serverless and functions
add value today, and will be part of the Platform. AWS Lambda is a “clue”
not the “answer”.
● Commonly used programming tools will unify Kubernetes, containers,
“serverless”, managed services / APIs
● These models will be cloud agnostic
● The “pay per call” serverless business model will just be a feature of the
cloud platform management layer (eg: AWS Fargate)
Getting to a Cloud Platform
2017 2018-20 2020+
Core Platform
- Kubernetes & containers
Observability / Operability
- monitoring (prom.)
- logging (fluentd)
- tracing (jaeger, OT)
Routing
- mesh (envoy, linkerd)
- messaging (nats)
Security:
Spiffe, OPA, SAFE
Storage:
- orchestration
- CSI
- other
Interfaces:
- OpenMetrics
- OpenEvents
Developer On Ramp:
CICD, Helm packaging, &c
Marketplace of Services
and other Add-ons
“Just run my code” user
experiences for 1000s of
different use cases
>> Towards Ubiquity
Cloud native – just run my code
Practice
Tribes gotta tribe
New ways of working
cloud led us to devops
cloud native leads to gitops
“push code not containers”
“operations by pull request”
Summary
● Cloud Platform powered by CNCF tools, Kubernetes at the core
● Multi Cloud support: Amazon, Azure, OSS
● Explosion of higher order tools and services
● GitOps for high velocity delivery pipeline
So about GitOps
● Why Git
● Examples of what’s in Git (and image repo)
● CICD pipeline
● Security, Compliance & Audit
● Observability & Control
● Tools Overview
GitOps in depth
GitOps builds on DevOps with Git as a single source of truth for the
desired state of the system
● The entire system state is under version control and described in Git (trunk best)
● Operational changes on production clusters are made by pull request
● Rollback and audit logs are provided via Git
● When disaster strikes, the whole infrastructure can be quickly restored from Git
[Diagram, built up over several slides: People and Software Agents arranged around the canonical source of truth]
Canonical
source of truth
Clear model with strong separations of concerns
(safety)
Easy rollbacks and reverts (velocity)
Tapping into existing code review tools and
processes
Great compliance tool
Collaboration point between software and
humans
?
Dashboards
Alerts
Playbook
Kubernetes Manifests
Application configuration
Provisioning scripts
Application checklists
Recording Rules
Sealed Secrets
Grafanalib dashboard library
https://github.com/weaveworks/grafanalib
YAML Service Checklist
Destination config
apiVersion: config.istio.io/v1beta1
kind: DestinationPolicy
metadata:
  name: ratings-lb-policy
  namespace: default
spec:
  destination:
    name: reviews
    labels:
      version: v1
  loadBalancing:
    name: ROUND_ROBIN  # alternatives: RANDOM, LEAST_CONN
  circuitBreaker:
    simpleCb:
      maxConnections: 100
      httpMaxRequests: 1000
      httpMaxRequestsPerConnection: 10
      httpConsecutiveErrors: 7
      sleepWindow: 15m
      httpDetectionInterval: 5m
Limits outgoing connections to
“v1” of the reviews service
● 100 connections
● 1000 concurrent requests
● 10 requests per connection
Load-balances in round-robin
fashion across all reviews “v1”
endpoints
Configures host ejection
● 7 consecutive 5xx errors
● Period of 15 minutes
● Scanned every 5 minutes
Egress config
apiVersion: config.istio.io/v1beta1
kind: EgressRule
metadata:
  name: foo-egress-rule
spec:
  destination:
    # Quoted: a bare leading * would be parsed as a YAML alias
    service: "*.foo.com"
  ports:
  - port: 80
    protocol: http
  - port: 443
    protocol: https
Provides access to a set of
services under the foo.com
domain.
Sidecar will handle automatically
upgrading connection to TLS, if
desired.
● Must access as HTTP
● Example:
http://mail.foo.com:443
Routing config
apiVersion: config.istio.io/v1beta1
kind: RouteRule
metadata:
  name: reviews-rating-jason-rule
  namespace: default
spec:
  destination:
    name: ratings
  route:
  - labels:
      version: v1
    weight: 100
  match:
    source:
      name: reviews
      labels:
        version: v2
    request:
      headers:
        cookie:
          regex: "^(.*?;)?(user=jason)(;.*)?"
        uri:  # value cut off in the source slide
For traffic going to the ratings
service send all of it to “v1” if:
● It is coming from “v2” of the
reviews service
● And the URL path starts
with /ratings/v2
● And the request contains a
cookie with the value
“user=jason”
Redirect Config
Fault Injection
# HTTP Redirect snippet
spec:
  destination:
    name: ratings
  match:
    request:
      headers:
        uri: /v1/getProductRatings
  redirect:
    uri: /v1/bookRatings
    authority: bookratings.default.svc.cluster.local
---
# Fault injection snippet
spec:
  destination:
    name: reviews
  route:
  - labels:
      version: v1
  httpFault:
    abort:
      percent: 10
      httpStatus: 400
HTTP Redirection
● For all requests to
/v1/getProductRatings,
return a 302 with a location
of /v1/bookRatings and
overwrite the
host/authority header.
HTTP Fault injection
● For 10% of requests to v1 of
the reviews service, fail with
a status code of 400
Timeouts, retries, request
rewrites, delays configured
similarly
Pipelines & Security
Pipelines & Control Loops
Deployment
App Dev Build (CI) Containers
Execution
(CD + Release
Automation)
Observe & Control
Typical CICD pipeline
[Diagram: Dev, Code Repo, CI, Image Repo and Cluster, with read-only (RO) and read/write (RW) labels on the connections between them]
There should be a firewall between CI and CD
CI CD
GitOps separation of concerns
CI tooling
Scope: test, build, publish artifacts
● Runs outside the production cluster
● Read access to code repo
● Read/Write access to image repo
● Read/Write access to integration env
● “Push” based
CD tooling
Scope: reconciliation between git and the cluster
● Runs inside the production cluster
● Read/Write access to config repo
● Read access to image repo
● Read/Write access to production cluster
● “Pull” based
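The access split listed above can be written down as data and checked mechanically. The following Python sketch is an illustration only: the table mirrors the bullets on this slide, while the names `ACCESS` and `can` and the resource keys are hypothetical and do not correspond to any real tool's configuration.

```python
# Access granted to each tool, copied from the slide ("r" = read, "w" = write).
ACCESS = {
    "ci": {"code_repo": "r", "image_repo": "rw", "integration_env": "rw"},
    "cd": {"config_repo": "rw", "image_repo": "r", "production_cluster": "rw"},
}


def can(tool: str, action: str, resource: str) -> bool:
    """Return True if the tool holds the given right ("r" or "w") on the resource."""
    if action not in ("r", "w"):
        raise ValueError("action must be 'r' or 'w'")
    return action in ACCESS.get(tool, {}).get(resource, "")


# The "firewall": CI can publish images but cannot touch production,
# while CD can change production but cannot publish images.
print(can("ci", "w", "image_repo"))          # True
print(can("ci", "w", "production_cluster"))  # False
print(can("cd", "w", "production_cluster"))  # True
print(can("cd", "w", "image_repo"))          # False
```

The point of the split is that no single tool holds both the right to produce artifacts and the right to change production.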
GitOps CICD pipeline
[Diagram: Dev, Code Repo, CI, Image Repo, Config Repo, CD Operator and Kubernetes API, with read-only (RO) and read/write (RW) labels on the connections between them]
GitOps enables security
● The CI tooling can be push based but has no production system
access
● The CD tooling is pull based and retains the production
credentials inside the cluster
● Developers can’t push directly to image registry
● Cluster API & credentials are never exposed/cross boundary
● Encrypted API keys and data storage credentials can be stored in
Git and decrypted at deploy time inside the cluster
CI ops
Kubernetes: operator pattern
Git
Config
Kubernetes Cluster
Deployment
Service
Deploy
Operator
Write back from Kubernetes to maintain TX audit log
○ Config is code & everything is config (‘declarative infra’)
○ Code (& config!) must be version controlled
○ Anything that does not record changes in version
control is harmful – Git as Audit Log
Atomic Updates
○ Groups of changes are hard
○ Partial success / failure → redeploy cluster?
○ Want atomic update-in-place
○ Operators can do this. It’s really hard with CI scripts.
○ Git as Transaction Log
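To make "atomic update-in-place" concrete, here is a toy Python sketch of the all-or-nothing idea: a group of changes is applied to a copy of the state and committed only if every change succeeds. It assumes the state fits in an in-memory dictionary, which a real cluster does not, and it is not how Kubernetes operators or Weave Flux are implemented. All names (`apply_atomically`, `set_image`, `failing_change`) are hypothetical.

```python
import copy
from typing import Callable, Dict, List

State = Dict[str, str]
Change = Callable[[State], None]


def apply_atomically(state: State, changes: List[Change]) -> bool:
    """Apply every change or none: mutate a copy, commit only on full success."""
    candidate = copy.deepcopy(state)
    try:
        for change in changes:
            change(candidate)
    except Exception:
        return False  # the original state is left untouched
    state.clear()
    state.update(candidate)
    return True


def set_image(name: str, tag: str) -> Change:
    """Build a change that sets the image tag of one service."""
    def change(s: State) -> None:
        s[name] = tag
    return change


def failing_change(s: State) -> None:
    """Stand-in for a change that is rejected, e.g. an invalid manifest."""
    raise ValueError("invalid manifest")


cluster = {"frontend": "v1", "ratings": "v1"}

# A group containing one bad change leaves the state exactly as it was.
print(apply_atomically(cluster, [set_image("frontend", "v2"), failing_change]))
print(cluster)

# A fully valid group is applied as a whole.
print(apply_atomically(cluster, [set_image("frontend", "v2"), set_image("ratings", "v2")]))
print(cluster)
```

A CI script that applies changes one by one has no equivalent of the discarded copy, which is why a partial failure there can leave the cluster half-updated.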
Example pipeline
Git
Code
Git
Config
Container
Registry
Build
Container
(CI)
Update image in staging config
1/ Code change
2/ Merge
Staging to
Prod
Config Updater
Kubernetes Cluster
Deployment
Service
Deploy
Operator
Typical (not mandatory) Structure of a GitOps repository
● At least 1 repository per application/service
● Config & code in separate repos. Images named via labels.
● Use a separate branch per environment (maps to a Kubernetes
namespace, or cluster)
● Push changes such as the image name, health checks, etc to
staging (or feature) branches first.
● Rolling out to production involves a merge. (use `git merge -s
ours branchname` to skip a set of staging-only changes).
● Use protected branches to enforce code review requirements.
Staging
Use declarative configuration to define your application and services.
All changes need to go through your Git review process; no one should be using
kubectl directly (also: don’t push from CI to prod).
Use an operator in the cluster to drive the observed cluster state to the desired
state, as declared by your configuration in git
Summary: Three core principles of GitOps
Cluster updates are a sequence of atomic transactions which succeed or fail
cleanly, and are so easy to do that your team velocity will rocket up
Git provides a transaction log for rollback, audit, and team work
Config and image repos act as a “firewall” between dev and prod, e.g. so that CI
cannot “own production” if hacked.
Summary: Three technical benefits of GitOps
❯ GitOps operational mindset, all
k8s applications stored in Git.
❯ Securely automate & share
secrets publicly
❯ Asymmetric (public key)
cryptography
❯ Encrypt data up to (and inside)
K8s cluster
Bitnami Sealed Secrets: Encrypt Kubernetes Secrets
Observability &
Control
Validating what happened is PART OF THE DEPLOYMENT
[Diagram, built up over several slides: a loop of Declare, Implement, Monitor / Observe and Plan, annotated with “Automated by software agents”, “Default dashboards” and, in the final build, “Continuous Deployment”]
Improving UX is PART OF DEPLOYMENT
• End user happiness is all
• Integrate GitOps CD pipeline with
tools to observe results of PRs
• Developers have to correlate UX
to operational concepts like
monitoring, tracing, logs
• Like doctors, we must be able to
validate health as well as
diagnose problems
Every service should have a unified interactive dash
(eg. metrics + events + actions; image is from Lyft)
Fundamental
Theorem
ONLY what can be
described and
observed can be
automated and
controlled
Three GitOps Takeaways
• Git push is a great DX: “push code not containers” is best
practice for Kubernetes, Cloud Native & Serverless…
• GitOps is about more than triggering cluster deployment via a
PR, it is a full transactional operating model for the whole
stack. It is “scale invariant” and it uses a control loop to
implement a “joined up” pipeline for delivery and observability
• GitOps is different from CI ops. It is based on ‘firewall’ between
Dev and Ops, it guarantees deployments are correct or fail
cleanly, it integrates with Observability & Control tools
FASTER, BETTER
& SAFER
Tools?
● DIY
● CI ops
● PaaS (Heroku, Cloud Foundry …)
● Dedicated modern CD tools
Choices
Not EITHER / OR
● Spinnaker
● Helm
● Weave Flux / Weave Cloud
● Jenkins X
● Skaffold
● Gitkube
● Harness
Dedicated tools for app dev and/or cicd
● Created by Netflix for Netflix
● Jenkins++ CICD tool, with Pipeline Management and Release Management
● Pipelines GUI, nested pipelines, canary as pipeline…
● Designed for VMs – doesn’t “speak Kubernetes” (also: Terraform?)
● Good if your Release model is “Deploy my VMs and start my cluster”
● “CI Ops”, so Not Good if your Release model is atomic updates pulled by operator
● Does not use Git, uses external DB.
● Audit log & desired state not complete
● Generally complicated with lots of moving parts. Operationally burdensome even if
run in Kubernetes
Spinnaker
● V2 of Kubernetes templating system
● Writes a group of changes as a “chart” – so can be a packaging tool for Kubernetes
● De facto “app API” for Kubernetes – great for getting started
● *** IS NOT A CD TOOL ***
● CI + Helm is a dangerous pattern
● Non-atomic
● Non-deterministic
● Non-compositional
● Tiller
Helm
● Created for Kubernetes by Weaveworks, will go to CNCF
● Only does Release Management: pull based CD, policy, staging, audit trail
● Works with any CI but *** does not connect to CI ***
● Watches repos. Updates on label & config change, no need for a “full rebuild”
● Kubernetes native – all Kube objects, also Helm, CRDs – make Helm do GitOps
● Secure (if cluster is)
● Orchestrator forces convergent atomic updates on cluster even for group of
changes – succeeds or fails cleanly, no need for full cluster reboot
● COMPLETE record in Git kept in sync. Rollback & roll forward
● Diffs – continually monitors cluster & repo to spot drift
Weave Flux
● Simple GitOps model for DEV with Kubernetes
● Push to gitkube remote server that lives in your cluster (ie. runs custom git server
inside Kubernetes cluster)
● Runs build for you, instead of CI. Couples continuous build of Docker images &
continuous deployment to the cluster. These should be decoupled.
● Pushes container into Kubernetes, but not Kube objects, not Helm, not CRDs
● Not atomic or idempotent
● No built in monitoring, so deployments may not converge
● Does not track changes in Git
Gitkube
● Skaffold
● Weave Flux
● Jenkins X
● Minikube
● Docker
GitOps developer toolkit?
Weave Cloud
Commercial
Anything missing?
Developers
(That means YOU)
Thank you!
Alexis Richardson
alexis@weave.works
@monadic
facebook.com/WeaveworksInc/
twitter.com/weaveworks
slack.weave.works/
youtube.com/c/WeaveWorksInc
linkedin.com/company/weaveworks
@weaveworks
https://weave.works

Continuous Lifecycle London 2018 Event Keynote

  • 1.
    GitOps Git push allthe things Alexis Richardson CEO, Weaveworks TOC Chair, CNCF @monadic May 2018
  • 2.
  • 3.
    Hello ● WTF isGitOps ● Why is Cloud Native relevant ● How does GitOps work and in what ways is it different from $MY_DEVOPS ● Tools ● Recap 3
  • 4.
    Meet Qordoba ● SFbased team use machine learning to create ”local” marketing UX for big brands ● Rapid iteration while obeying SOC2 compliance ● Google Cloud – Kubernetes & CI ● Weave Cloud – single cont. delivery & observability pipeline
  • 6.
    Over 30 releasesper day per team, up from 1-2 per week across all teams 1) Estimated time needed to fix prod software bugs ~60% less time 2) Estimated time to respond to customer requests ~43% less time 3) Uptime 99% à 100% (so far…!) Impact
  • 7.
  • 8.
    Image credit: Helen Beal, Ranger4 Atleast a decade of DevOps best practices
  • 9.
    GitOps is Automation for CloudNative Describe the system & build to that plan
  • 10.
    New ways ofworking cloud led us to devops cloud native leads us to gitops “push code, not containers” “operations by pull request”
  • 11.
    • Config iscode • Code must be version controlled • Config must be version controlled too GitOps follows the Logic of DevOps
  • 12.
    GitOps follows theLogic of DevOps • Config is code • Code must be version controlled • Config must be version controlled too • What can be described can be automated • Describe everything: code, config, monitoring & policy; and then keep it in version control
  • 13.
    GitOps • Git asa source of truth for desired state of whole system yes really the whole system • Control loop compares desired with actual state to pull changes, enforce convergent atomic updates and writeback to log in Git • Diff alerts, eg.:
  • 14.
    Atomic updates for declarativestack Developer experience is just Git push Best practice for Continuous Delivery with Kubernetes Kubernetes Current State via Observability Tools Control & Operations Desired State in Git Diff Observe Orient Decide Act Release
  • 15.
    What this getsus • Any developer can use GitHub • Anyone can join team and ship a new app or make changes easily • All changes can be triggered, stored, audited and validated in Git And we didn’t have to do anything very new or clever
  • 16.
    “The world isenvisioned as a repo and not as a kubernetes installation" - Kelsey Hightower Kubernetes ❤ GitOps
  • 17.
    Kubernetes is complex,ideally you’d like to… Make a pull request & just go to a URL to see app change Avoid kubectl Have “Bonus points for Metrics… If you give people visibility, they will stop asking for tools like kubectl to do their job, because now they can actually observe what’s happening in the cluster”
  • 18.
    Who is talkingabout or doing GitOps? Weaveworks Cloudbees Bitnami OpenFaaS Hasura Ocado Financial Times & more!
  • 19.
    19 About Weaveworks ● Foundedin 2014, backed by Google Ventures & Accel Partners ● Mission: help software teams go faster by providing technologies that support cloud native development
  • 20.
    20 ● 40 people ●Berlin ● London ● San Francisco Team
  • 21.
    21 Team Some of usare known for...
  • 22.
    ● Building cloud-nativeOSS since 2014 (Weave Net, Moby, Kubernetes, Prometheus) ● Founding member of CNCF ● Alexis Richardson (Weaveworks CEO) is chair of the CNCF Technical Oversight Committee ● Weave Cloud runs on Kubernetes since 2015 22 About Weaveworks
  • 23.
    • We usedeclarative infrastructure ie. Kubernetes, Docker, Terraform, … and we “diff all the things” • Our entire system including code, config, monitoring rules, dashboards, is described in GitHub with full audit trail • We roll out major or minor changes as pull requests for any updates, outages and D/R GitOps at Weaveworks
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 32.
  • 33.
  • 34.
    CNCF is buildinga cloud platform ● Goal of a Cloud Platform for era of ubiquitous services à a bigger deal than the Web à open like Linux à everyone is on board this time ● Business Peeps TLDR Cloud Native is Cloud ● Outcome: Innovation and new Business Models for make profit
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
    Platforms enable Velocity ●Higher speed ● Lower barriers to entry ● Explosion of higher order systems
  • 42.
    Velocity is akey metric in Continuous Delivery High-performing teams deploy more frequently and have much faster lead times They make changes with fewer failures, and recover faster from failures 200x more frequent deployments 2,555x shorter lead times 3x lower change failure rate 24x faster recovery from failures 200x 2,555x 3x 24x Source: 2016 State of DevOps Report (Puppet Labs)
  • 44.
    Make me aVelocity Developers write code that powers Applications and integrates Services deployed to a Cloud Platform that is easy, stable & operable using best practices for Continuous Delivery at high velocity
  • 45.
    New Cloud Platform “Justrun my code” Kubernetes Infra - Cloud & DCs & Edge Other CNCF Projects Local Services & Data Code >> Containers >>
  • 46.
    1000s of waysto “Just Run My Code” ● Serverless: Openfaas, Kubeless, OpenEvents, AWS Lambda…. ● PaaS (Openshift, Cloud Foundry..), MBaaS, KMaaS, .. ● Kubeflow, Istio, Pachyderm and other k8s native app f/works ● Declarative app def eg compose, ksonnet, ballerina ● Native general frameworks: metaparticle ● Ports: Laravel (PHP!) and other app frameworks to Kube ● Tools: Cert-manager, ChaosIQ, .. ● Explosion of higher order systems is caused by platform
  • 47.
    Serverless & Kuberneteswill converge ● Ubiquity of Kubernetes will pull serverless into the story - from “run my containers” to “run my code” ● Consumption and packaging of services is where serverless and functions add value today, and will be part of the Platform. AWS Lambda is a “clue” not the “answer”. ● Commonly used programming tools will unify Kubernetes, containers, “serverless”, managed services / APIs ● These models will be cloud agnostic ● The “pay per call” serverless business model will just be a feature of the cloud platform management layer (eg: AWS Fargate)
  • 48.
    Getting to aCloud Platform 2017 2018-20 2020+ Core Platform - Kubernetes & containers Observability / Operability - monitoring (prom.) - logging (fluentd) - tracing (jaeger, OT) Routing - mesh (envoy, linkerd) - messaging (nats) Security: Spiffe, OPA, SAFE Storage: - orchestration - CSI - other Interfaces: - OpenMetrics - OpenEvents Developer On Ramp: CICD, Helm packaging, &c Marketplace of Services and other Add-ons “Just run my code” user experiences for 1000s of different use cases >> Towards Ubiquity
  • 49.
    Cloud native –just run my code
  • 50.
  • 51.
    New ways ofworking cloud led us to devops cloud native leads to gitops “push code not containers” “operations by pull request”
  • 52.
    Summary ● Cloud Platformpowered by CNCF tools, Kubernetes at the core ● Multi Cloud support: Amazon, Azure, OSS ● Explosion of higher order tools and services ● GitOps for high velocity delivery pipeline
  • 53.
  • 54.
    ● Why Git ●Examples of what’s in Git (and image repo) ● CICD pipeline ● Security, Compliance & Audit ● Observability & Control ● Tools Overview GitOps in depth 55
  • 55.
    GitOps builds onDevOps with Git as a single source of truth for the desired state of the system ● The entire system state is under version control and described in Git (trunk best) ● Operational changes on production clusters are made by pull request ● Rollback and audit logs are provided via Git ● When disaster strikes, the whole infrastructure can be quickly restored from Git
  • 56.
  • 57.
  • 58.
  • 59.
  • 60.
  • 61.
  • 62.
    63 Canonical source of truth Clearmodel with strong separations of concerns (safety) Easy rollbacks and reverts (velocity) Tapping into existing code review tools and processes Great compliance tool Collaboration point between software and humans
  • 63.
  • 64.
    Dashboards Alerts Playbook Kubernetes Manifests Application configuration Provisioningscripts 65 Application checklists Recording Rules Sealed Secrets
  • 65.
  • 66.
  • 67.
  • 68.
    Destination config apiVersion: config.istio.io/v1beta1 kind: DestinationPolicy metadata: name:ratings-lb-policy namespace: default spec: destination: name: reviews labels: version: v1 loadBalancing: name: ROUND_ROBIN circuitBreaker: simpleCb: maxConnections: 100 httpMaxRequests: 1000 httpMaxRequestsPerConnection: 10 httpConsecutiveErrors: 7 sleepWindow: 15m httpDetectionInterval: 5m RANDOM, LEAST_CONN Limits outgoing connections to “v1” of the reviews service ● 100 connections ● 1000 concurrent requests ● 10 rps Load-balances in round-robin fashion across all reviews “v1” endpoints Configures host ejection ● 7 consecutive 5xx errors ● Period of 15 minutes ● Scanned every 5 minutes
  • 69.
    Egress config apiVersion: config.istio.io/v1beta1 kind:EgressRule metadata: name: foo-egress-rule spec: destination: service: *.foo.com ports: - port: 80 protocol: http - port: 443 protocol: https Provides access to a set of services under the foo.com domain. Sidecar will handle automatically upgrading connection to TLS, if desired. ● Must access as HTTP ● Example: http://mail.foo.com:443
  • 70.
    Routing config apiVersion: config.istio.io/v1beta1 kind:RouteRule metadata: name: reviews-rating-jason-rule namespace: default spec: destination: name: ratings route: - labels: version: v1 weight: 100 match: source: name: reviews labels: version: v2 request: headers: cookie: regex: "^(.*?;)?(user=jason)(;.*)?" uri: For traffic going to the ratings service send all of it to “v1” if: ● It is coming from “v2” the reviews services ● And the URL path starts with /ratings/v2 ● And the request contains a cookie with the value “user=jason”
  • 71.
    Redirect Config Fault Injection #HTTP Redirect snippet spec: destination: name: ratings match: request: headers: uri: /v1/getProductRatings redirect: uri: /v1/bookRatings authority: bookratings.default.svc.cluster.local --- # Fault injection snippet spec: destination: name: reviews route: - labels: version: v1 httpFault: abort: percent: 10 httpStatus: 400 HTTP Redirection ● For all requests to /v1/getProductRatings, return a 302 with a location of /v1/bookRatings and overwrite the host/authority header. HTTP Fault injection ● For 10% of requests to v1 of the reviews service, fail with a status code of 400 Timeouts, retries, request rewrites, delays configured similarly
  • 72.
  • 73.
    Pipelines & ControlLoops Deployment App Dev Build (CI) Containers Execution (CD + Release Automation) Observe & Control
  • 74.
    CI Image RepoCodeRepo Typical CICD pipeline ClusterDev RW RW RWRW RO RW RO
  • 75.
    There should bea firewall between CI and CD CI CD
  • 76.
    GitOps separation ofconcerns CI tooling Scope: test, build, publish artifacts ● Runs outside the production cluster ● Read access to code repo ● Read/Write access to image repo ● Read/Write access to integration env ● “Push” based CD tooling Scope: reconciliation between git and the cluster ● Runs inside the production cluster ● Read/Write access to config repo ● Read access to image repo ● Read/Write access to production cluster ● “Pull” based
  • 77.
    CICode Repo Kubernetes API GitOpsCICD pipeline Dev RO RO CD OperatorRO RW RW RW RW Image Repo Config Repo
  • 78.
    GitOps enables security ●The CI tooling can be push based but has no production system access ● The CD tooling is pull based and retains the production credentials inside the cluster ● Developers can’t push directly to image registry ● Cluster API & credentials are never exposed/cross boundary ● Encrypted API keys and data storage credentials can be stored in Git and decrypted at deploy time inside the cluster
  • 79.
  • 80.
    Kubernetes: operator pattern Git Config KubernetesCluster Deployment Service Deploy Operator
  • 81.
    Write back fromKubernetes to maintain TX audit log ○ Config is code & everything is config (‘declarative infra’) ○ Code (& config!) must be version controlled ○ Anything that does not record changes in version control is harmful – Git as Audit Log
  • 82.
    Atomic Updates ○ Groupsof changes are hard ○ Partial success / failure à redeploy cluster? ○ Want atomic update-in-place ○ Operators can do this. It’s really hard with CI scripts. ○ Git as Transaction Log
  • 83.
    Example pipeline Git Code Git Config Container Registry Build Container (CI) Update imagein staging config 1/ Code change 2/ Merge Staging to Prod Config Updater Kubernetes Cluster Deployment Service Deploy Operator
  • 84.
    Typical (not mandatory)Structure of a GitOps repository ● At least 1 repository per application/service ● Config & code in separate repos. Images named via labels. ● Use a separate branch per environment (maps to a Kubernetes namespace, or cluster) ● Push changes such as the image name, health checks, etc to staging (or feature) branches first. ● Rolling out to production involves a merge. (use `git merge -s ours branchname` to skip a set of staging-only changes). ● Use protected branches to enforce code review requirements.
  • 85.
  • 86.
    Use declarative configurationto define your application and services. All changes need to go through your git review process – noone should be using kubectl directly. (also: don’t push from CI to prod) Use an operator in the cluster to drive the observed cluster state to the desired state, as declared by your configuration in git Summary: Three core principles of GitOps
  • 87.
    Summary: Three technical benefits of GitOps
    ● Cluster updates are a sequence of atomic transactions which succeed or fail cleanly, and are so easy to do that your team velocity will rocket up
    ● Git provides a transaction log for rollback, audit, and team work
    ● Config and image repos act as a “firewall” between dev and prod, e.g. so that CI cannot “own production” if hacked.
  • 88.
    Sealed Secrets
    Bitnami: Encrypt Kubernetes Secrets
    ❯ GitOps operational mindset, all k8s applications stored in Git.
    ❯ Securely automate & share secrets publicly
    ❯ Asymmetric (public key) cryptography
    ❯ Encrypt data up to (and inside) K8s cluster
  • 89.
  • 90.
    Validating what happened is PART OF THE DEPLOYMENT
  • 92.
  • 93.
  • 94.
  • 95.
  • 96.
  • 97.
  • 98.
  • 99.
  • 100.
  • 101.
    Improving UX is PART OF DEPLOYMENT
    • End user happiness is all
    • Integrate GitOps CD pipeline with tools to observe results of PRs
    • Developers have to correlate UX to operational concepts like monitoring, tracing, logs
    • Like doctors, we must be able to validate health as well as diagnose problems
  • 102.
    Every service should have a unified interactive dashboard (e.g. metrics + events + actions; image is from Lyft)
  • 103.
    Fundamental Theorem: ONLY what can be described and observed can be automated and controlled
  • 104.
    Three GitOps Takeaways
    • Git push is a great DX: “push code not containers” is best practice for Kubernetes, Cloud Native & Serverless…
    • GitOps is about more than triggering cluster deployment via a PR; it is a full transactional operating model for the whole stack. It is “scale invariant” and it uses a control loop to implement a “joined up” pipeline for delivery and observability
    • GitOps is different from CI ops. It is based on a “firewall” between Dev and Ops, it guarantees deployments are correct or fail cleanly, and it integrates with Observability & Control tools
  • 105.
  • 106.
  • 107.
    Choices
    ● DIY
    ● CIops
    ● PaaS (Heroku, Cloud Foundry …)
    ● Dedicated modern CD tools
  • 108.
    Dedicated tools for app dev and/or CI/CD
    Not EITHER/OR
    ● Spinnaker
    ● Helm
    ● Weave Flux / Weave Cloud
    ● JenkinsX
    ● Skaffold
    ● Gitkube
    ● Harness
  • 109.
    Spinnaker
    ● Created by Netflix for Netflix
    ● Jenkins++ CICD tool, with Pipeline Management and Release Management
    ● Pipelines GUI, nested pipelines, canary as pipeline…
    ● Designed for VMs – doesn’t “speak Kubernetes” (also: Terraform?)
    ● Good if your Release model is “Deploy my VMs and start my cluster”
    ● “CI Ops”, so Not Good if your Release model is atomic updates pulled by operator
    ● Does not use Git, uses external DB.
    ● Audit log & desired state not complete
    ● Generally complicated with lots of moving parts. Operationally burdensome even if run in Kubernetes
  • 110.
    Helm
    ● V2 of Kubernetes templating system
    ● Writes a group of changes as a “chart” – so can be a packaging tool for Kubernetes
    ● De facto “app API” for Kubernetes – great for getting started
    ● *** IS NOT A CD TOOL ***
    ● CI + Helm is a dangerous pattern
    ● Non-atomic
    ● Non-deterministic
    ● Non-compositional
    ● Tiller
  • 111.
    Weave Flux
    ● Created for Kubernetes by Weaveworks, will go to CNCF
    ● Only does Release Management: pull based CD, policy, staging, audit trail
    ● Works with any CI but *** does not connect to CI ***
    ● Watches repos. Updates on label & config change, no need for a “full rebuild”
    ● Kubernetes native – all Kube objects, also Helm, CRDs – make Helm do GitOps
    ● Secure (if cluster is)
    ● Orchestrator forces convergent atomic updates on cluster even for group of changes – succeeds or fails cleanly, no need for full cluster reboot
    ● COMPLETE record in Git kept in sync. Rollback & roll forward
    ● Diffs – continually monitors cluster & repo to spot drift
  • 112.
    Gitkube
    ● Simple GitOps model for DEV with Kubernetes
    ● Push to gitkube remote server that lives in your cluster (i.e. runs custom git server inside Kubernetes cluster)
    ● Runs build for you, instead of CI. Couples continuous build of Docker images & continuous deployment to the cluster. These should be decoupled.
    ● Pushes container into Kubernetes, but not Kube objects, not Helm, not CRDs
    ● Not atomic or idempotent
    ● No built in monitoring, so deployments may not converge
    ● Does not track changes in Git
  • 113.
    GitOps developer toolkit?
    ● Skaffold
    ● Weave Flux
    ● Jenkins X
    ● Minikube
    ● Docker
  • 114.
  • 115.
  • 116.
  • 117.
  • 118.
  • 119.
  • 120.
  • 121.
  • 122.