www.scling.com
Secure software supply chain on a shoestring budget
Lars Albertsson, Founder, Scling
Jfokus, 2022-05-04
Losing battles
https://www.carbonbrief.org/unep-1-5c-climate-target-slipping-out-of-reach
https://www.idea.int/gsod-indices/faqs
"I am here to bring you the bad news,
which is that we are not winning. We are
really losing this battle [on security]."
- Vinton Cerf
What do we contribute?
● Internet, digitalisation + many good little things
● Ability to measure and manipulate populations at scale
● Monetising bad security
○ Stolen CPU cycles → money
○ Ransomware
https://spinbackup.com/blog/24-biggest-ransomware-attacks-in-2019/
https://blog.chainalysis.com/reports/2022-crypto-crime-report-preview-ransomware/
https://www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election
Security vs productivity
● Revenue-generation, features, delivery speed
vs
● Security reviews, pentests, password reauthentication, phishing campaigns, firewalls, …
● Risk-management rarely wins
● Employees have conflicting definitions of success
Security AND productivity
A simple recipe for application security:
- While we value items on the right, we value items on the left more.
- Invent alternatives that are aligned with speed
- Give employees aligned definitions of success
● Left: SSO, password managers, infrastructure as code, hardware MFA, ephemeral containers, …
● Right: security reviews, pentests, password reauthentication, phishing campaigns, firewalls, …
We have been here before
● Quality expectations 1995-2002
● Quality expectations 2022
https://www.cnet.com/culture/windows-may-crash-after-49-7-days/
Quality and ops
Aligning quality with speed:
● TDD
● Continuous delivery
● Agile
● Dev-friendly ops tooling
● Test automation
● XP
● Cross-functional teams
● DevOps
● Trunk-based development
● Continuous integration
● Containers
Manual, mechanised, industrialised
● Manual
○ Muscle-powered
○ Few tools
○ Human touch for every step
● Mechanised
○ Direct human control
○ Machine tools
○ Low investment, direct return
● Industrialised
○ Scaled processes
○ Machine tools
○ Challenges: scale, logistics, legal, organisation, faults, ...
IT craft to factory
● Craft: security, waterfall, application delivery, traditional operations, traditional QA, infrastructure
● Factory: DevSecOps, agile, containers, DevOps, CI/CD, infrastructure as code
Quality, speed - choose two
● Toyota: Low defect rates AND high margins per vehicle
● State of DevOps report: High reliability AND high deployment rate
○ 1000x span in availability metrics
○ We have industrialised software engineering
● Quality vs Speed → Quality AND Speed
Themes of good presentations, IMHO
● We have seen lots of X / X from a different angle. Here are some patterns.
● We have context Y. Here is how we work.
● We did a thing Z. Here is what we learnt.
We need to share how we work in order to make faster progress.
Data factories
● Craft: security, waterfall, application delivery, traditional operations, traditional QA, infrastructure, DB-oriented architecture
● Factory: DevSecOps, agile, containers, DevOps, CI/CD, infrastructure as code, data factories, data pipelines, DataOps
Data industrialisation
DW
~10 year capability gap
"data factory engineering"
Enterprise big data failures
"Modern data stack" - traditional workflows, new technology
4GL / UML phase of data engineering
Data engineering education
How data leaders work
Data processed offline
Online
Data factory
Data platform & lake
data
Data innovation & functionality
100+K daily datasets
30% staff BigQuery daily users
Value from data!
Scling - data-factory-as-a-service
Data value through collaboration
Customer
Data factory
Data platform & lake
data
domain expertise
Value from data!
Rapid data innovation
Learning by doing, in collaboration
Efficiency is sacred
● Productivity is our unique selling point
○ Client value from data is unpredictable
○ Clients don't know what they want
○ Quick experiments & pivot
● Minimal operational overhead
○ Pipelines / person
○ Datasets / day / person
● Nothing must undermine our USP
Our security strategy
● Invest where it improves productivity
○ Cloud single sign on
○ Cloud identity management
○ Workload identities over secret tokens
○ Hardware multifactor authentication
○ Infrastructure as code
○ Patch management *
● Homogeneity over autonomy
○ Few technologies
○ Few processes
○ Processes encoded in code *
● Minimal attack surface *
● Strict asset management
○ Digital assets as code
○ Process to align assets with code
○ Explicit manual asset management
● Lean on Google
Minimising attack surfaces
● Few ecosystems
○ Ubuntu
○ Scala + Spark
○ Python
● Few components
○ Reuse over perfect match
● Few versions
○ Single version per third party component
○ Opens gates to dependency hell *
■ Control or autonomous cells
Our supply chain
● Google cloud
○ Kubernetes, GCS, Cloud SQL, …
● Virtual machine images
○ Ubuntu, Google
● Container base images
○ Ubuntu, phusion, MySQL, …
● Apt packages
● SaaS
○ Google, Atlassian, Gitlab
● Scala (+ other JVM)
○ Maven central
● Python
○ Pypi
● Direct downloads
○ URL + checksum
● Bazel plugins
○ URL + checksum
● Developer devices
○ Ubuntu, MacOS, Android, iOS
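The "URL + checksum" entries above can be illustrated with a minimal sketch, assuming a SHA-256 digest is pinned next to each URL (the slides do not say which hash is used). `verify_download` is a hypothetical helper, and fetching the bytes from the URL is left out.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_download(data: bytes, expected_sha256: str) -> bytes:
    """Return the data unchanged if it matches the pinned checksum.

    Assumption: the pinned value is a hex SHA-256 digest stored in
    version control next to the download URL.
    """
    actual = sha256_of(data)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}"
        )
    return data
```

A mismatch then fails the build instead of silently accepting a changed artifact, which is also what makes an upgrade an explicit, reviewable change to the pinned digest.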
Which version?
● Version specifications
○ Exact version
■ Good for application stability
○ Range
○ Latest
■ Good for patch latency
● Specification choice tradeoffs
○ Provider trust
○ Patch latency
● Upgrade tradeoffs
○ Vulnerability patching
○ Rogue code
○ Bugs fixed
○ Bugs introduced
○ Necessary work
● Our goal:
○ Exact version
○ Transitive dependencies locked
○ Automatically updated
● Let's pursue!
Levels of up to date
● No new version of A exists
● New A version exists. Application verified ok with upgrade.
● New A version exists. Unclear whether upgrade breaks application.
● New A version exists. Upgrade breaks application.
○ We use a deprecated API.
○ New version has bug.
● New A version exists. Upgrade breaks dependency B.
○ New version of B exists.
○ No new version of B exists.
○ A and B must atomically upgrade
A bot friendly task
● There is some order that moves us forward through hell
● Slow trial and error cycle
○ Compile or test takes minutes
● There are bots
○ Dependabot, Scala steward
■ Way too complex (100 / 20 KLOC respectively, 1000s of lines of docs / examples)
○ Do not cover our needs
■ Application correctness
■ Our ecosystems
With a strong process
● we can reason and automate
○ Trial and error forward
● Process strength
○ Faulty change is detected before prod
○ Non-code changes unlikely to affect correctness
○ Self-bootstrapping
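"Trial and error forward" could be sketched roughly as below. This is not Benderbot's algorithm, only an illustration under assumptions: `upgrade_forward` is a hypothetical name, and `passes` stands in for the full build and test suite that a strong process provides. Single upgrades are tried first, then pairs, to cover the case where two dependencies must upgrade atomically.

```python
from itertools import combinations
from typing import Callable, Dict

Versions = Dict[str, str]


def upgrade_forward(
    current: Versions,
    candidates: Versions,
    passes: Callable[[Versions], bool],
) -> Versions:
    """Greedily apply upgrades that keep the (hypothetical) test suite green."""
    state = dict(current)
    progress = True
    while progress:
        progress = False
        pending = [n for n in candidates if state.get(n) != candidates[n]]
        for size in (1, 2):  # singles first, then atomic pairs
            for group in combinations(pending, size):
                trial = dict(state)
                trial.update({name: candidates[name] for name in group})
                if passes(trial):
                    state = trial
                    progress = True
                    break
            if progress:
                break
    return state
```

Each accepted step leaves the repository in a verified state, so an upgrade that breaks the application simply stays pending until a later scan.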
Strong process challenges
● Everything not covered by tests
● Test infrastructure / setup defined by code
○ How to test?
○ How to bootstrap?
● Nondeterministic processes / components
○ Mostly deterministic is ok
Extended test suite:
● Testsuite bootstrap
● Continuous deployment testsuite
● Non-production functionality
○ Dev tooling
○ Web
○ …
Our build process
● Monorepo + trunk-based
○ Platforms + all client code and pipelines
○ Single version of platform
● All tests verified* for every change
○ Tests do not require cloud resources
● Build + test speed challenging
○ Spark → seconds of startup time → slow tests
● Simple recipe for speed:
○ Avoid doing things → caching
○ Do things in parallel
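The recipe for speed can be sketched in a few lines: skip a test when its inputs are unchanged, and run the remaining tests concurrently. This is a toy model, not how Bazel implements it; `CachedTestRunner` and the in-memory cache are assumptions made for illustration.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Inputs = Dict[str, bytes]  # file name -> file content


def input_digest(inputs: Inputs) -> str:
    """Hash file names and contents in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(inputs):
        digest.update(name.encode("utf-8") + b"\0" + inputs[name] + b"\0")
    return digest.hexdigest()


class CachedTestRunner:
    """Runs a test function per target, reusing results for unchanged inputs."""

    def __init__(self, test: Callable[[Inputs], bool]) -> None:
        self._test = test
        self._lock = threading.Lock()
        self._results: Dict[str, bool] = {}
        self.executed = 0  # number of tests actually run

    def run(self, inputs: Inputs) -> bool:
        key = input_digest(inputs)
        with self._lock:
            if key in self._results:  # avoid doing things
                return self._results[key]
        outcome = self._test(inputs)
        with self._lock:
            self._results[key] = outcome
            self.executed += 1
        return outcome

    def run_all(self, targets: Dict[str, Inputs]) -> Dict[str, bool]:
        with ThreadPoolExecutor() as pool:  # do things in parallel
            futures = {n: pool.submit(self.run, i) for n, i in targets.items()}
            return {n: f.result() for n, f in futures.items()}
```

Caching is only safe when a test result depends on nothing but the hashed inputs, which is why isolation matters for reliable caching.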
Bazel
● Designed for monorepos & strong process
○ Lazy tree evaluation
○ Isolated sandboxes
● Unmatched performance features
○ Isolation → reliable caching
○ Test result caching
○ Remote caching
○ Parallelism
○ Remote execution
● Great for stuff used by Google
● Catching up on
○ Docker
○ Scala
○ Third-party dependencies
Dependency version control
● Transitive, locked
○ Python
○ JVM
○ Lock files in version control
● Not transitive, locked
○ Direct downloads
○ Bazel plugins
○ Container base images
○ version.bzl file
■ → bazel, python, bash
● Apt packages
○ Latest*
● Some Google components
○ VM base images, misc
○ Latest
● Employee devices
○ Manual
● Unmanaged leftovers
○ SaaS
○ Otherwise minimal exposure
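One way a single version file could feed Bazel, Python and Bash is to keep it to plain `NAME = "value"` assignments, which are valid Starlark and can be read from Python without executing anything. The sketch below assumes that restricted format; the file contents, `read_versions` and `to_shell` are hypothetical, not taken from the Scling repository.

```python
import ast
import shlex
from typing import Dict


def read_versions(source: str) -> Dict[str, str]:
    """Collect top-level NAME = literal assignments without running code."""
    versions: Dict[str, str] = {}
    for node in ast.parse(source).body:
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        ):
            versions[node.targets[0].id] = str(ast.literal_eval(node.value))
    return versions


def to_shell(versions: Dict[str, str]) -> str:
    """Render the versions as shell variable assignments for Bash to source."""
    return "\n".join(
        f"{name}={shlex.quote(value)}" for name, value in sorted(versions.items())
    )


# Hypothetical file contents, for illustration only.
EXAMPLE = '''
TOOL_VERSION = "1.2.3"
TOOL_SHA256 = "abc123"
'''
```

With one source of truth, a bot only has to edit a single line to propose an upgrade for all three consumers.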
bazel-deps
dependencies.yaml → workspace.bzl
pip-tools
requirements.in → requirements.txt → BUILD.bazel, bootstrap tooling
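A locked requirements.txt, as produced by pip-compile, pins every package with `==`. The sketch below checks that property; it is a simplified parser written for illustration (`parse_lock` is a hypothetical name, and environment markers and URLs are not handled), with example package versions chosen arbitrarily.

```python
import re
from typing import Dict, List, Tuple

# name, optional [extras], then an exact "==" pin
PIN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(\[[^\]]*\])?==([^\s;\\]+)")


def parse_lock(text: str) -> Tuple[Dict[str, str], List[str]]:
    """Split requirement lines into exact pins and lines that are not pinned."""
    pins: Dict[str, str] = {}
    unpinned: List[str] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop "# via ..." comments
        if not line or line.startswith("-"):  # skip options such as --hash
            continue
        match = PIN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(3)
        else:
            unpinned.append(line)
    return pins, unpinned
```

A non-empty `unpinned` list would signal that a range or "latest" specification has slipped into what should be a fully locked file.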
Python vs JVM dependency failure
● pip-compile: failure at build time
● bazel-deps: failure at run time
Bazel & containers
{scala,py}_binary
base image
files / tars
{scala,py}_image
container_run_and_commit_layer
Weak determinism
Apt, files only
Distroless tools
install_pkgs
Can we make apt install deterministic?
● apt-get typically provides latest
○ Determined by Packages.gz
○ Download during build breaks determinism & caching?
● Distroless bazel package_manager:
○ Exact Packages.gz specification
○ Debian: Versioned Packages.gz
○ Ubuntu: Only latest Packages.gz
● Compromise on determinism
○ Download Packages.gz before build
○ Caching still ok
● Not running apt scripts seemed to work. For a while.
○ Subtle low-level container failures
○ Abandoned
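Since the Packages.gz index determines which version apt installs, a snapshot of it taken before the build amounts to a lock file. The sketch below parses such a snapshot into a pin table; it assumes the standard Debian-style stanza fields (`Package`, `Version`, `Filename`, `SHA256`) and one stanza per package. The function names are hypothetical.

```python
import gzip
from typing import Dict, Iterator


def load_packages_gz(data: bytes) -> str:
    """Decompress a downloaded Packages.gz snapshot."""
    return gzip.decompress(data).decode("utf-8")


def parse_packages(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per stanza of a Debian-style Packages index."""
    for stanza in text.strip().split("\n\n"):
        fields: Dict[str, str] = {}
        key = None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key:
                fields[key] += "\n" + line.strip()  # continuation line
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            yield fields


def pin_table(text: str) -> Dict[str, Dict[str, str]]:
    """Map package name to the exact version, file and checksum in the index."""
    return {
        s["Package"]: {
            "version": s["Version"],
            "filename": s["Filename"],
            "sha256": s["SHA256"],
        }
        for s in parse_packages(text)
    }
```

Two builds that use the same snapshot resolve to the same package versions, so caching stays valid until the snapshot is refreshed.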
Scling collaboration models
● Single unified platform
○ Monorepo + trunk-based process
○ Separate instance per client
○ All test suites run on every change
● Factories are adapted to constraints and important properties
○ Ok: Security, risk, quality, availability, compliance
○ No: Preferred technology, work processes
Refinement factory
● Raw data in
● Valuable data out
● Non-technical clients
● "Easy" domain
Joint factory
● Hybrid teams
● Domain experts
● Data apprentices
● Scling runs data platform
Client factory
● Start as joint factory
● Goal: Client independent
Divided, multi-tenant platform
● Orion: base data platform
● GCP (but portable to other clouds)
● Three isolated client instances
● Saturn: non-essential operational tooling
● ion CLI tool
● scli CLI tool
Client exit scenario
● Orion: base data platform
● Client cloud choice
● Isolated client instance
● Client monitoring, logging, identity, etc
● ion CLI tool
Multiphase build bootstrap
Ubuntu
some python
docker
benderbot
python 3.x.y
JVM
bazel
py deps
ion
gcloud
kubectl
scli
hugo
orion/bin/tool.py
versions.bzl
requirements.txt
● Images cached based on content
● Caches shared
Benderbot
● Lazy bot that takes the easy way out
○ Dumb solutions over smart
● Guess, rather than find, next versions
○ 404 not found? Quick failure.
● Mimic developer actions
○ Upgrade source
○ Rerun bazel-deps / pip-compile
○ Run build bootstrap, test suite, dev tooling check
○ Look at logs to classify problem
○ Update checksum if necessary
○ Create merge request on success
● Isolated environment
○ Separate region
○ No internal network access
○ Gitlab + logging bucket credentials
○ Cheap spot instance + NVMe
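"Guess next versions" could look roughly like the sketch below: bump each numeric component of the current version and let a missing artifact fail fast. The guessing scheme, the order in which guesses are tried, and both function names are assumptions for illustration; `exists` stands in for an attempted download that returns 404 when the version is not published.

```python
from typing import Callable, List, Optional


def guess_next_versions(version: str) -> List[str]:
    """Guess successors of a dotted version, smallest bump first.

    Assumption: components after the bumped one are reset to zero.
    Non-numeric components (for example "1-jre") are not bumped.
    """
    parts = version.split(".")
    guesses = []
    for index in reversed(range(len(parts))):
        if not parts[index].isdigit():
            continue
        bumped = (
            parts[:index]
            + [str(int(parts[index]) + 1)]
            + ["0"] * (len(parts) - index - 1)
        )
        guesses.append(".".join(bumped))
    return guesses


def pick_upgrade(version: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first guessed version that exists, or None."""
    for candidate in guess_next_versions(version):
        if exists(candidate):
            return candidate
    return None
```

Guessing avoids writing a client for every registry, at the cost of missing versions that skip numbers; a later scan starting from a newer version would pick those up.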
Benderbot components / efforts
● Months of evening hacking
○ = weeks full time
● benderbot.py: < 1000 LOC
● tool.py: few LOC, brittle
● Classification data pipeline
● Statistics data pipelines
● Reporting dashboard
○ Reevaluation journey: dash + plotly, bokeh + bokeh, streamlit + bokeh
Benderbot reports
Resolution classifications
● No new version of A exists
● New A version exists. Application verified ok with upgrade.
● New A version exists. Unclear whether upgrade breaks application.
● New A version exists. Upgrade breaks application.
○ We use a deprecated API.
○ New version has bug.
● New A version exists. Upgrade breaks dependency B.
○ New version of B exists.
○ No new version of B exists.
○ A and B must atomically upgrade
Resolution labels: not found, success, test failure, transient
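Classifying a run into the four labels (not found, success, test failure, transient) could be sketched as a small rule table over the exit code and log text. The patterns below are hypothetical, since the slides do not show which log markers Benderbot looks for, and treating unrecognised failures as transient is an assumption.

```python
import re
from typing import List, Tuple

# Hypothetical log patterns, checked in order; first match wins.
RULES: List[Tuple[str, "re.Pattern[str]"]] = [
    ("not found", re.compile(r"\b404\b|not found", re.IGNORECASE)),
    (
        "transient",
        re.compile(
            r"timed? ?out|connection reset|temporarily unavailable",
            re.IGNORECASE,
        ),
    ),
    ("test failure", re.compile(r"\bFAILED\b|tests? failed", re.IGNORECASE)),
]


def classify(exit_code: int, log: str) -> str:
    """Map one upgrade attempt to a resolution label."""
    if exit_code == 0:
        return "success"
    for label, pattern in RULES:
        if pattern.search(log):
            return label
    # Assumption: an unrecognised failure is retried on a later scan.
    return "transient"
```

Keeping the rules as data makes it cheap to add a pattern whenever a new failure mode shows up in the reports.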
Our most productive developer
~500 MRs
Benderbot stats - resolutions
Benderbot stats - resolutions
● More hardware
● Process flakiness
● Speculative execution
Resolutions by kind
Chart series: Total, Other, JVM, Python
Last resolution by dependency
Chart series: Total, Other, JVM, Python
Time between scans
Google SLSA evaluation
● Supply-chain Levels for Software Artifacts
○ Maturity model
● SLSA 1: yes
● SLSA 2: yes
● SLSA 3: some
○ Prioritising speed over Ephemeral Environment, Isolated, Non-Falsifiable
● SLSA 4: some
○ Parameterless
○ Dependencies complete (except apt)
Concluding remarks
● Challenges?
○ Operational tuning to balance rate vs €
○ Google cloud_sql_proxy patch update took us down
○ Diva dependencies need custom solutions
○ Which test failure to address?
● Future?
○ Upgrade conditional on container scanning?
○ Dead dependency detection?
● Open source? No.
○ Specific to our environment
○ Bot is easy. Just do it.
○ Strong process challenging. But rewarding.
○ Offer: A copy of the code for a C-level lunch date. :-)
Resources
https://trunkbaseddevelopment.com/
https://reproducible-builds.org/
https://www.scling.com/presentations/

Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptxLORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
 
DDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systemsDDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systems
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 

Secure software supply chain on a shoestring budget

  • 1. Secure software supply chain on a shoestring budget
      Lars Albertsson, Founder, Scling
      Jfokus, 2022-05-04
      www.scling.com
  • 2. Losing battles
      "I am here to bring you the bad news, which is that we are not winning. We are really losing this battle [on security]." - Vinton Cerf
      https://www.carbonbrief.org/unep-1-5c-climate-target-slipping-out-of-reach
      https://www.idea.int/gsod-indices/faqs
  • 3. What do we contribute?
      ● Internet, digitalisation + many good little things
      ● Ability to measure and manipulate populations at scale
      ● Monetising bad security
        ○ Stolen CPU cycles → money
        ○ Ransomware
      https://spinbackup.com/blog/24-biggest-ransomware-attacks-in-2019/
      https://blog.chainalysis.com/reports/2022-crypto-crime-report-preview-ransomware/
      https://www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election
  • 4. Security vs productivity
      ● Risk-management rarely wins
      ● Employees have conflicting definitions of success
      Revenue-generation, Features, Delivery speed
      vs
      Security reviews, Pentests, Password reauthentication, Phishing campaigns, Firewalls, …
  • 5. Security AND productivity
      A simple recipe for application security:
      - While we value items on the right, we value items on the left more.
      - Invent alternatives that are aligned with speed
      - Give employees aligned definitions of success
      Left: SSO, Password managers, Infrastructure as code, Hardware MFA, Ephemeral containers, …
      Right: Security reviews, Pentests, Password reauthentication, Phishing campaigns, Firewalls, …
  • 6. We have been here before
      Quality expectations 1995-2002
      Quality expectations 2022
      https://www.cnet.com/culture/windows-may-crash-after-49-7-days/
  • 7. Quality and ops
      Aligning quality with speed: TDD, Continuous delivery, Agile, Dev-friendly ops tooling, Test automation, XP, Cross-functional teams, DevOps, Trunk-based, Continuous integration, Containers
  • 8. Manual, mechanised, industrialised
      Manual:
      ● Muscle-powered
      ● Few tools
      ● Human touch for every step
      Mechanised:
      ● Direct human control
      ● Machine tools
      ● Low investment, direct return
      Industrialised:
      ● Scaled processes
      ● Machine tools
      ● Challenges: scale, logistics, legal, organisation, faults, ...
  • 9. IT craft to factory
      Security Waterfall Application delivery Traditional operations Traditional QA Infrastructure DevSecOps Agile Containers DevOps CI/CD Infrastructure as code
  • 10. Quality, speed - choose two
      ● Toyota: Low defect rates AND high margins per vehicle
      ● State of DevOps report: High reliability AND high deployment rate
        ○ We have industrialised software engineering
      Quality vs Speed
      Quality AND Speed
      1000x span in availability metrics
  • 11. Themes of good presentations, IMHO
      ● We have seen lots of X / X from a different angle. Here are some patterns.
      ● We have context Y. Here is how we work.
      ● We did a thing Z. Here is what we learnt.
      We need to share how we work in order to make faster progress.
  • 13. Data industrialisation
      DW ~10 year capability gap "data factory engineering" Enterprise big data failures "Modern data stack" - traditional workflows, new technology 4GL / UML phase of data engineering Data engineering education
  • 14. How data leaders work
      Data processed offline Online Data factory Data platform & lake data Data innovation & functionality 100+K daily datasets 30% staff BigQuery daily users Value from data!
  • 15. Scling - data-factory-as-a-service
      Data value through collaboration Customer Data factory Data platform & lake data domain expertise Value from data! Rapid data innovation Learning by doing, in collaboration
  • 16. Efficiency is sacred
      ● Productivity is our unique selling point
        ○ Client value from data is unpredictable
        ○ Clients don't know what they want
        ○ Quick experiments & pivot
      ● Minimal operational overhead
        ○ Pipelines / person
        ○ Datasets / day / person
      ● Nothing must undermine our USP
  • 17. Our security strategy
      ● Invest where it improves productivity
        ○ Cloud single sign on
        ○ Cloud identity management
        ○ Workload identities over secret tokens
        ○ Hardware multifactor authentication
        ○ Infrastructure as code
        ○ Patch management *
      ● Homogeneity over autonomy
        ○ Few technologies
        ○ Few processes
        ○ Processes encoded in code *
      ● Minimal attack surface *
      ● Strict asset management
        ○ Digital assets as code
        ○ Process to align assets with code
        ○ Explicit manual asset management
      ● Lean on Google
  • 18. Minimising attack surfaces
      ● Few ecosystems
        ○ Ubuntu
        ○ Scala + Spark
        ○ Python
      ● Few components
        ○ Reuse over perfect match
      ● Few versions
        ○ Single version per third party component
        ○ Opens gates to dependency hell *
          ■ Control or autonomous cells
  • 19. Our supply chain
      ● Google cloud
        ○ Kubernetes, GCS, Cloud SQL, …
      ● Virtual machine images
        ○ Ubuntu, Google
      ● Container base images
        ○ Ubuntu, phusion, MySQL, …
      ● Apt packages
      ● SaaS
        ○ Google, Atlassian, Gitlab
      ● Scala (+ other JVM)
        ○ Maven central
      ● Python
        ○ Pypi
      ● Direct downloads
        ○ URL + checksum
      ● Bazel plugins
        ○ URL + checksum
      ● Developer devices
        ○ Ubuntu, MacOS, Android, iOS
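The "URL + checksum" pinning used for direct downloads and Bazel plugins can be illustrated with a minimal sketch. The `PinnedDownload` structure, the function names, and the example URL are hypothetical and not taken from the talk; the sketch only assumes that each artifact is pinned with a SHA-256 digest and rejected when the digest differs.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedDownload:
    """A direct download pinned by URL and SHA-256 checksum (hypothetical structure)."""
    url: str
    sha256: str


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify(pin: PinnedDownload, data: bytes) -> bytes:
    """Return the data unchanged if it matches the pinned checksum, else raise."""
    actual = sha256_hex(data)
    if actual != pin.sha256:
        raise ValueError(
            f"checksum mismatch for {pin.url}: expected {pin.sha256}, got {actual}"
        )
    return data


if __name__ == "__main__":
    # Stand-in for bytes fetched from the pinned URL; no network access is needed here.
    payload = b"example artifact contents"
    pin = PinnedDownload(
        url="https://example.invalid/tool.tar.gz",  # placeholder URL
        sha256=sha256_hex(payload),
    )
    verify(pin, payload)  # accepted: content matches the pin
    try:
        verify(pin, payload + b"tampered")
    except ValueError as err:
        print("rejected:", err)
```

With this kind of pin, an upgrade has to change both the URL and the checksum, which is why a checksum update shows up as an explicit, reviewable change.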
  • 20. Which version?
      ● Version specifications
        ○ Exact version
          ■ Good for application stability
        ○ Range
        ○ Latest
          ■ Good for patch latency
      ● Specification choice tradeoffs
        ○ Provider trust
        ○ Patch latency
      ● Upgrade tradeoffs
        ○ Vulnerability patching
        ○ Rogue code
        ○ Bugs fixed
        ○ Bugs introduced
        ○ Necessary work
      ● Our goal:
        ○ Exact version
        ○ Transitive dependencies locked
        ○ Automatically updated
      ● Let's pursue!
  • 21. Levels of up to date
      ● No new version of A exists
      ● New A version exists. Application verified ok with upgrade.
      ● New A version exists. Unclear whether upgrade breaks application.
      ● New A version exists. Upgrade breaks application.
        ○ We use a deprecated API.
        ○ New version has bug.
      ● New A version exists. Upgrade breaks dependency B.
        ○ New version of B exists.
        ○ No new version of B exists.
        ○ A and B must atomically upgrade
  • 22. A bot friendly task
      ● There is some order that moves us forward through hell
      ● Slow trial and error cycle
        ○ Compile or test takes minutes
      ● There are bots
        ○ Dependabot, Scala Steward
          ■ Way too complex (100/20 KLOC, 1000s lines of doc / examples)
        ○ Do not cover our needs
          ■ Application correctness
          ■ Our ecosystems
  • 23. With a strong process
      ● we can reason and automate
        ○ Trial and error forward
      ● Process strength
        ○ Faulty change is detected before prod
        ○ Non-code changes unlikely to affect correctness
        ○ Self-bootstrapping
  • 24. Strong process challenges
      ● Everything not covered by tests
      ● Test infrastructure / setup defined by code
        ○ How to test?
        ○ How to bootstrap?
      ● Indeterministic processes / components
        ○ Mostly deterministic is ok
      Extended test suite:
      ● Testsuite bootstrap
      ● Continuous deployment testsuite
      ● Non-production functionality
        ○ Dev tooling
        ○ Web
        ○ …
  • 25. Our build process
      ● Monorepo + trunk-based
        ○ Platforms + all client code and pipelines
        ○ Single version of platform
      ● All tests verified* for every change
        ○ Tests do not require cloud resources
      ● Build + test speed challenging
        ○ Spark → seconds of startup time → slow tests
      ● Simple recipe for speed:
        ○ Avoid doing things → caching
        ○ Do things in parallel
  • 26. Bazel
      ● Designed for monorepos & strong process
        ○ Lazy tree evaluation
        ○ Isolated sandboxes
      ● Unmatched performance features
        ○ Isolation → reliable caching
        ○ Test result caching
        ○ Remote caching
        ○ Parallelism
        ○ Remote execution
      ● Great for stuff used by Google
      ● Catching up on
        ○ Docker
        ○ Scala
        ○ Third-party dependencies
  • 27. Dependency version control
      ● Transitive, locked
        ○ Python
        ○ JVM
        ○ Lock files in version control
      ● Not transitive, locked
        ○ Direct downloads
        ○ Bazel plugins
        ○ Container base images
        ○ version.bzl file
          ■ → bazel, python, bash
      ● Apt packages
        ○ Latest*
      ● Some Google components
        ○ VM base images, misc
        ○ Latest
      ● Employee devices
        ○ Manual
      ● Unmanaged leftovers
        ○ SaaS
        ○ Otherwise minimal exposure
  • 30. Python vs JVM dependency failure
      pip-compile, build time:
      bazel-deps, run time:
  • 31. Bazel & containers
      {scala,py}_binary base image files / tars {scala,py}_image container_run_and_commit_layer Weak determinism Apt, files only Distroless tools install_pkgs
  • 32. Can we make apt install deterministic?
      ● apt-get typically provides latest
        ○ Determined by Packages.gz
        ○ Download during build breaks determinism & caching?
      ● Distroless bazel package_manager:
        ○ Exact Packages.gz specification
        ○ Debian: Versioned Packages.gz
        ○ Ubuntu: Only latest Packages.gz
      ● Compromise on determinism
        ○ Download Packages.gz before build
        ○ Caching still ok
      ● Not running apt scripts seemed to work. For a while.
        ○ Subtle low-level container failures
        ○ Abandoned
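The idea that the Packages.gz index determines which versions get installed can be sketched as follows. This is a simplified illustration rather than the tooling described in the talk: the function names are hypothetical, the parser handles only the `Package` and `Version` fields of a Debian-style index, and it assumes the first listed version of a package is the one to pin.

```python
import gzip
from typing import Dict, Iterable, List


def parse_packages_index(packages_gz: bytes) -> Dict[str, str]:
    """Map package name to version from a gzip-compressed Debian-style Packages index."""
    text = gzip.decompress(packages_gz).decode("utf-8")
    versions: Dict[str, str] = {}
    # Stanzas are separated by blank lines; each stanza describes one package.
    for stanza in text.split("\n\n"):
        fields: Dict[str, str] = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace and are ignored here.
            if line and not line[0].isspace() and ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "Package" in fields and "Version" in fields:
            # Assumption: keep the first version listed for each package.
            versions.setdefault(fields["Package"], fields["Version"])
    return versions


def pinned_install_args(packages_gz: bytes, names: Iterable[str]) -> List[str]:
    """Return name=version arguments resolved from one fixed index snapshot."""
    index = parse_packages_index(packages_gz)
    return [f"{name}={index[name]}" for name in names]


if __name__ == "__main__":
    # Synthetic index for illustration; package versions are made up.
    snapshot = gzip.compress(
        b"Package: curl\nVersion: 1.0-1\nDescription: example\n more text\n\n"
        b"Package: zlib1g\nVersion: 2.0-3\n"
    )
    print(pinned_install_args(snapshot, ["curl", "zlib1g"]))
```

Given the same downloaded snapshot, the resolved versions are the same on every run, which is the property that keeps caching usable even though the snapshot itself is fetched before the build.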
  • 33. Scling collaboration models
      Refinement factory
      ● Raw data in
      ● Valuable data out
      ● Non-technical clients
      ● "Easy" domain
      Joint factory
      ● Hybrid teams
      ● Domain experts
      ● Data apprentices
      ● Scling runs data platform
      Client factory
      ● Start as joint factory
      ● Goal: Client independent
      All models:
      ● Single unified platform
        ○ Monorepo + trunk-based process
        ○ Separate instance per client
        ○ All test suites run on every change
      ● Factories are adapted to constraints and important properties
        ○ Ok: Security, risk, quality, availability, compliance
        ○ No: Preferred technology, work processes
  • 34. Divided, multi-tenant platform
      Orion base data platform GCP (but portable to other clouds) Isolated client instance Isolated client instance Isolated client instance Saturn non-essential operational tooling ion CLI tool scli CLI tool
  • 35. Client exit scenario
      Orion base data platform Client cloud choice Isolated client instance Client monitoring, logging, identity, etc ion CLI tool
  • 36. Multiphase build bootstrap
      Ubuntu some python docker benderbot python 3.x.y JVM bazel py deps ion gcloud kubectl scli hugo orion/bin/tool.py versions.bzl requirements.txt
      ● Images cached based on content
      ● Caches shared
  • 37. Benderbot
      ● Lazy bot that takes the easy way out
        ○ Dumb solutions over smart
      ● Guess (not find) next versions
        ○ 404 not found? Quick failure.
      ● Mimic developer actions
        ○ Upgrade source
        ○ Rerun bazel-deps / pip-compile
        ○ Run build bootstrap, test suite, dev tooling check
        ○ Look at logs to classify problem
        ○ Update checksum if necessary
        ○ Create merge request on success
      ● Isolated environment
        ○ Separate region
        ○ No internal network access
        ○ Gitlab + logging bucket credentials
        ○ Cheap spot instance + NVMe
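The "guess next versions, fail fast on 404" approach can be sketched in a few lines. The talk does not describe how Benderbot generates its guesses, so this is only one plausible reading: it assumes dotted numeric versions, and `guess_next_versions`, `first_existing`, and the `exists` probe are hypothetical names.

```python
from typing import Callable, List, Optional


def guess_next_versions(current: str) -> List[str]:
    """Guess candidate successors of a dotted numeric version, smallest bump first."""
    parts = [int(p) for p in current.split(".")]
    candidates = []
    # Bump the last component first, then each earlier one, zeroing what follows.
    for i in range(len(parts) - 1, -1, -1):
        bumped = parts[:i] + [parts[i] + 1] + [0] * (len(parts) - i - 1)
        candidates.append(".".join(str(p) for p in bumped))
    return candidates


def first_existing(candidates: List[str], exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that the probe reports as published, if any."""
    for version in candidates:
        # In a real bot the probe might be an HTTP request where 404 means "not found".
        if exists(version):
            return version
    return None


if __name__ == "__main__":
    published = {"1.2.3", "1.3.0"}  # made-up set of released versions
    guesses = guess_next_versions("1.2.3")
    print(guesses)
    print(first_existing(guesses, lambda v: v in published))
```

Guessing keeps the bot free of registry-specific listing logic: a wrong guess costs one cheap failed request, and a right guess feeds straight into the upgrade, build, and test steps.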
  • 38. Benderbot components / efforts
      ● Months of evening hacking
        ○ = weeks full time
      benderbot.py < 1000 LOC Statistics data pipelines Reporting dashboard tool.py few LOCs, brittle Classification data pipeline
      Reevaluation journey:
      ● dash + plotly
      ● bokeh + bokeh
      ● streamlit + bokeh
  • 40. Resolution classifications
      ● No new version of A exists
      ● New A version exists. Application verified ok with upgrade.
      ● New A version exists. Unclear whether upgrade breaks application.
      ● New A version exists. Upgrade breaks application.
        ○ We use a deprecated API.
        ○ New version has bug.
      ● New A version exists. Upgrade breaks dependency B.
        ○ New version of B exists.
        ○ No new version of B exists.
        ○ A and B must atomically upgrade
      Resolution labels on the slide: not found, test failure, success, test failure, test failure, test failure, transient, transient, transient, transient
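The four resolution labels (not found, test failure, success, transient) suggest a small classifier over the outcome of each upgrade attempt. The sketch below is an assumption about how such a classification could work, not the talk's classification pipeline: the inputs, the log markers, and the rule order are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Resolution(Enum):
    NOT_FOUND = "not found"
    SUCCESS = "success"
    TEST_FAILURE = "test failure"
    TRANSIENT = "transient"


# Hypothetical log fragments that indicate an infrastructure hiccup rather than a real failure.
TRANSIENT_MARKERS = ("connection reset", "timed out", "preempted")


def classify(http_status: int, build_exit_code: Optional[int], log: str) -> Resolution:
    """Classify one upgrade attempt from the download status, build result, and log text."""
    if http_status == 404:
        return Resolution.NOT_FOUND  # the guessed version does not exist
    if http_status >= 500 or build_exit_code is None:
        return Resolution.TRANSIENT  # server error, or the build never ran
    if build_exit_code == 0:
        return Resolution.SUCCESS
    lowered = log.lower()
    if any(marker in lowered for marker in TRANSIENT_MARKERS):
        return Resolution.TRANSIENT  # failed, but for a reason worth retrying
    return Resolution.TEST_FAILURE


if __name__ == "__main__":
    print(classify(404, None, "").value)
    print(classify(200, 0, "all tests passed").value)
    print(classify(200, 1, "Spot instance preempted").value)
    print(classify(200, 1, "AssertionError in pipeline test").value)
```

Separating transient outcomes from test failures matters because only the latter say anything about the new dependency version; transient attempts can simply be retried.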
  • 41. Our most productive developer
      ~500 MRs
  • 43. Benderbot stats - resolutions
      Chart annotations: More hardware, Process flakiness, Speculative execution
  • 45. Last resolution by dependency
      Chart panels: Total, Other, JVM, Python
  • 47. Google SLSA evaluation
      ● Supply-chain Levels for Software Artifacts
        ○ Maturity model
      ● SLSA 1: yes
      ● SLSA 2: yes
      ● SLSA 3: some
        ○ Prioritising speed over Ephemeral Environment, Isolated, Non-Falsifiable
      ● SLSA 4: some
        ○ Parameterless
        ○ Dependencies complete (except apt)
  • 48. Concluding remarks
      ● Challenges?
        ○ Operational tuning to balance rate vs €
        ○ Google cloud_sql_proxy patch update took us down
        ○ Diva dependencies need custom solutions
        ○ Which test failure to address?
      ● Future?
        ○ Upgrade conditional on container scanning?
        ○ Dead dependency detection?
      ● Open source? No.
        ○ Specific to our environment
        ○ Bot is easy. Just do it.
        ○ Strong process challenging. But rewarding.
        ○ Offer: A copy of the code for a C-level lunch date. :-)