Speeding up your team with GitOps
London SEAM – June 2019
Brice Fernandes – brice@weave.works
I’m Brice
I work for Weaveworks.
You can find Weaveworks at https://www.weave.works or @weaveworks
The team at Weaveworks is behind the GitOps model.
We build GitOps tools for enterprise Kubernetes and Cloud Native.
This talk
Tactical
Technical
Process / tools
Principles of Operations
What is GitOps?
GitOps is...
An operating model
Derived from CS and operations knowledge
Technology agnostic (name notwithstanding)
A set of principles (Why instead of How)
(Although Weaveworks can help with how)
A way to speed up your team
The GitOps Model
GitOps on Kubernetes
[Diagram, built up over several slides. Labels: Kubernetes cluster; kubectl / direct access; configuration repository; deployment agent*; security boundary*; image repository; state continuously monitored; control loop.]
The Principles of GitOps
1 The entire system is described declaratively.
2 The desired system state is versioned.
3 Approved changes to the desired state are automatically applied to the system.
4 Software agents ensure correctness and alert on divergence.
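The four principles can be read as a control loop. The following is a minimal Python sketch of that idea, not the implementation of Flux or any other tool. The `State` type, the `diff` and `reconcile` functions, and the service names are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: resource name -> image tag.
State = Dict[str, str]


def diff(desired: State, actual: State) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (desired_value, actual_value)} for every divergence."""
    names = sorted(set(desired) | set(actual))
    return {
        name: (desired.get(name), actual.get(name))
        for name in names
        if desired.get(name) != actual.get(name)
    }


def reconcile(desired: State, actual: State, alerts: List[str]) -> State:
    """One pass of the loop: alert on divergence, then converge on desired."""
    for name, (want, have) in diff(desired, actual).items():
        # Principle 4: software agents alert on divergence.
        alerts.append(f"{name}: want {want!r}, have {have!r}")
    # Principle 3: the approved desired state is applied automatically.
    return dict(desired)


# Principles 1 and 2: the desired state is declarative data that would
# live in a versioned repository (here it is just a literal).
desired = {"web": "web:1.4", "worker": "worker:2.0"}
actual = {"web": "web:1.3", "cron": "cron:0.9"}  # a drifted cluster
alerts: List[str] = []
actual = reconcile(desired, actual, alerts)
```

After one pass the cluster state matches the declared state and `alerts` holds one entry per divergent resource. A real agent would run this continuously rather than once.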
What should be GitOps’ed?
I’m so very sorry
Technical / Application Domain:
Dashboards
Alerts
Playbook
Kubernetes Manifests
Application configuration
Provisioning scripts
Application checklists
Recording Rules
Sealed Secrets
Why should we care?
Typical CICD pipeline
[Diagram: Continuous Integration followed by Continuous Delivery/Deployment. Components: Dev, Code Repo, CI, Container Registry (CR), Cluster API. Connections are labelled with credentials (Git creds, CI creds, CR creds 1-3, API creds) and access levels (RO or RW), with a security boundary marked.]
Shares credentials across several logical security boundaries.
GitOps pipeline
[Diagram, built up over four slides. Components: Dev, Code Repo, CI, Container Registry, Config Repo (the canonical desired state store), Deploy, Operator, Cluster API. Connections are labelled with credentials (Git creds, CI creds, CR creds 1-3, Config repo creds, Cluster API creds) and access levels (RO or RW). Callouts added on later builds: Process & constraints enforcement; Exceptional auditing and attribution*.]
Credentials are never shared across a logical security boundary.
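The difference between the two pipelines can be sketched as a small model of which component holds which credential. This is an illustration only: the assignment of credentials to components below is an assumed reading of the diagrams, and names such as `OUTSIDE`, `deploy-agent`, and `leaks_cluster_access` are hypothetical.

```python
from typing import Dict, Set

# Hypothetical: components that run outside the cluster's security boundary.
OUTSIDE = {"ci"}

# Assumed reading of the typical pipeline: CI pushes straight to the cluster,
# so it must hold cluster API credentials.
typical_cicd: Dict[str, Set[str]] = {
    "ci": {"git:ro", "registry:rw", "cluster-api:rw"},
}

# Assumed reading of the GitOps pipeline: CI only builds and pushes images;
# an agent inside the boundary holds the cluster credentials.
gitops: Dict[str, Set[str]] = {
    "ci": {"git:ro", "registry:rw"},
    "deploy-agent": {"registry:ro", "config-repo:rw", "cluster-api:rw"},
}


def leaks_cluster_access(pipeline: Dict[str, Set[str]]) -> Set[str]:
    """Return the outside components that hold cluster API credentials."""
    return {
        name
        for name, creds in pipeline.items()
        if name in OUTSIDE and any(c.startswith("cluster-api") for c in creds)
    }


typical_leaks = leaks_cluster_access(typical_cicd)
gitops_leaks = leaks_cluster_access(gitops)
```

Under these assumptions the typical pipeline reports CI as holding cluster credentials, while the GitOps pipeline reports nothing, which is the point the captions make.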
● Trivialises rollbacks
● Exceptional auditing and attribution*
● Separation of concerns
● No crossing of security boundaries
● Process & constraints enforcement
● Great Software ↔ Human collaboration point
● Easy to validate for correctness (Policies)
● System can self-heal
* If you’ve secured your Git repositories properly
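Two of the claims above, trivial rollbacks and auditing with attribution, follow from keeping every desired state as a commit. The sketch below shows that idea with a toy in-memory history. It is not how Git or any GitOps tool is implemented, and `ConfigRepo`, `Commit`, and the author names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Commit:
    author: str
    message: str
    state: Dict[str, str]


class ConfigRepo:
    """Hypothetical append-only history of desired states."""

    def __init__(self) -> None:
        self.history: List[Commit] = []

    def commit(self, author: str, message: str, state: Dict[str, str]) -> None:
        # Copy the state so later mutation cannot rewrite history.
        self.history.append(Commit(author, message, dict(state)))

    def head(self) -> Dict[str, str]:
        return dict(self.history[-1].state)

    def rollback(self, author: str, to_index: int) -> None:
        """Roll back by committing an earlier state again; nothing is erased."""
        old = self.history[to_index]
        self.commit(author, f"Revert to: {old.message}", old.state)

    def audit_log(self) -> List[str]:
        return [f"{c.author}: {c.message}" for c in self.history]


repo = ConfigRepo()
repo.commit("alice", "Deploy web 1.3", {"web": "web:1.3"})
repo.commit("bob", "Deploy web 1.4", {"web": "web:1.4"})
repo.rollback("carol", 0)
```

The rollback is itself a recorded, attributed change, so the audit log shows who reverted what as well as who deployed it.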
GitOps at Qordoba
1) Estimated time needed to fix prod software bugs: ~60% less time after switching
2) Estimated time to respond to customer requests: ~43% less time after switching
3) Service uptime went from 99% to 100%
Customer tools: Jenkins, GKE and Weave Cloud. Apps: web, ML.
The GitOps effect
~2 → 150+ deployments per week
Why?
[Diagram: the OODA loop: Observe → Orient → Decide → Act]
[Diagram, built up over several slides: the GitOps loop: Declare → Implement → Monitor / Observe → Modify. Annotations added on later builds: Continuous Deployment; default dashboards; automated by software agents; software making commits; safe and reversible changes; automated, templated dashboards.]
Feedback loop latency.
This is what matters.
Beyond technology
Business Domain:
BI dashboards
BI Reports
Playbook
Team Membership
Org Chart
Operations Manual
Security Policies
Processes
Roles & Authorisation
Hard Technical Problems
Hard Human Problems
What if you could roll back your sales processes?
What if you could create a pull request on your organisation policy?
What if you could get it reviewed and approved in an afternoon?
What if your entire business could be reproduced the same way a Git repository is cloned?
GitOps at Weaveworks
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Automated diff tools (*diff operators, Open Source)
Dashboard definitions in Git (Grafanalib, Open Source)
Alert definitions in Git
Read-only access to production for all developers
Gated, PR-driven changes to production
Deploying a service with Flux
⇒
< 30 minute total disaster recovery
Dozens of changes per day with a very small team
Incredibly fast regression response
Permissive approach to production access
Excellent developer experience
Stress-free on-call*
*“stress-reduced”
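The "promotion from staging to prod" and "automated diff tools" items can be illustrated with a short sketch. This is not the behaviour of Flux or of the *diff operators. It assumes each environment's desired state is a simple mapping from service name to image tag, and `diff_envs` and `promote` are hypothetical helpers.

```python
from typing import Dict, Optional, Tuple

Env = Dict[str, str]  # hypothetical: service name -> image tag


def diff_envs(staging: Env, prod: Env) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {service: (prod_image, staging_image)} where the two differ."""
    return {
        svc: (prod.get(svc), img)
        for svc, img in staging.items()
        if prod.get(svc) != img
    }


def promote(staging: Env, prod: Env, service: str) -> Env:
    """Return a new prod state with one service copied from staging.

    In a gated, PR-driven workflow this new state would be proposed as a
    pull request against the config repository, not applied directly.
    """
    updated = dict(prod)
    updated[service] = staging[service]
    return updated


staging = {"web": "web:1.4", "api": "api:2.1"}
prod = {"web": "web:1.3", "api": "api:2.1"}

pending = diff_envs(staging, prod)
new_prod = promote(staging, prod, "web")
```

Because `promote` returns a new mapping and leaves `prod` untouched, the change stays a reviewable proposal until it is merged.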
Where to find out more
Search for “Weaveworks GitOps” in your favourite search engine
Take a look at our open source work on https://github.com/weaveworks
Questions?
Weaveworks
@weaveworks
https://weave.works
Brice Fernandes
@fractallambda
brice@weave.works
