www.scling.com
The lean principles of DataOps
Berlin Buzzwords, 2020-06-08
Lars Albertsson, Founder, Scling
Christopher Bergh, CEO & Head Chef, DataKitchen
Scling - data-value-as-a-service
Data lake
Stream storage
● Extract value from your data
● Data platform + custom data pipelines
● Imitate data leaders:
○ Quick idea-to-production
○ Operational efficiency
Our marketing strategy:
● Promiscuously share knowledge
○ On slides devoid of glossy polish
1994: OS/2 Warp CID installation
"Grmbl, who reinstalled my machine?"
IT craft to factory
[Diagram: craft-era practices (Security, Waterfall, Application delivery, Traditional operations, Traditional QA, Infrastructure) and their factory-era counterparts (DevSecOps, Agile, Containers, DevOps, CI/CD, Infrastructure as code)]
Data factories
[Diagram: the craft-to-factory picture extended with data. Craft era: Security, Waterfall, Application delivery, Traditional operations, Traditional QA, Infrastructure, DB-oriented architecture. Factory era: DevSecOps, Agile, Containers, DevOps, CI/CD, Infrastructure as code, and data factories, data pipelines, DataOps]
The Toyota Way
Selected lean principles:
● Long-term over short-term
● The right process will produce the right results
● Eliminate waste (muda)
● Continuous improvement (kaizen)
● Use pull systems to avoid unnecessary production
● Quality takes precedence (jidoka)
○ Stop to fix problems
● Standardised tasks and processes
● Reliable technology that serves people and process
● Develop your people
● Decisions slowly by consensus
● Relentless reflection (hansei), organisational learning
Common waste species
● Cognitive waste
● Delivery waste
● Operational waste
● Product waste
Cognitive waste
● Why do we have 25 time formats?
○ ISO 8601, UTC assumed
○ ISO 8601 + timezone
○ Millis since epoch, UTC
○ Nanos since epoch, UTC
○ Millis since epoch, user local time
○ …
○ Float of seconds since epoch, as string.
WTF?!?
● my-kafka-topic-name, your_topic_name
● Definition of an order:
○ Abandoned cart?
○ Payment refused?
○ Returned goods?
○ Free promotion?
● Data entity source of truth
○ MySQL, Kafka, data lake?
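A minimal Python sketch of one way to reduce the time-format waste listed above: convert each variant to a single canonical form (here ISO 8601 with an explicit UTC offset) at ingestion. The helper name `to_canonical_utc`, its `encoding` parameter, and the rule that an ISO 8601 string without offset means UTC are illustrative assumptions, not something prescribed by the slides.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_canonical_utc(value, encoding="iso"):
    """Return `value` as an ISO 8601 string with an explicit UTC offset.

    `encoding` (hypothetical parameter) names the source format:
    "iso", "millis", "nanos", or "float_str".
    """
    if encoding == "iso":
        parsed = datetime.fromisoformat(value)
        if parsed.tzinfo is None:
            # Assumption: an ISO 8601 string without offset is already UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
    elif encoding == "millis":
        parsed = EPOCH + timedelta(milliseconds=value)
    elif encoding == "nanos":
        # datetime has microsecond resolution; finer digits are dropped.
        parsed = EPOCH + timedelta(microseconds=value // 1000)
    elif encoding == "float_str":
        parsed = EPOCH + timedelta(seconds=float(value))
    else:
        raise ValueError(f"unknown timestamp encoding: {encoding}")
    return parsed.astimezone(timezone.utc).isoformat()


print(to_canonical_utc("2020-06-08T02:00:00+02:00"))
print(to_canonical_utc(1591574400000, "millis"))
```

Epoch values in user local time cannot be converted this way without also knowing the user's offset, which is one reason agreeing on a format up front tends to be cheaper than converting later.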
What causes cognitive waste?
● We are autonomous!
○ Teams can choose technology, format, process, ...
● Cognitive debt
○ Short-term over long-term
○ Decisions without consensus
● Recognition and rewards
○ "You have made a similar independent pipeline, great work!"
Avoiding cognitive waste
● Reusing semantic definitions
● Reusing code & technical definitions
○ Code transparency & sharing
○ Standardised technology
○ Document decisions & consensus process
● Read-only sharing not enough
○ Must be empowered to change for reuse and to improve quality
○ Standardised processes
Eliminating cognitive waste
● Refactoring code, semantics, docs
● Low risk - what will I break downstream?
○ Standardised, automated, trusted QA process
○ End-to-end pipeline testing
● "Creating a pipeline - one day! Replace old pipeline - 18 months."
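As a rough illustration of end-to-end pipeline testing, the Python sketch below feeds a small input through a whole (toy) pipeline and asserts only on the final output, so internal steps can be refactored without rewriting the test. The step names, the order fields, and the "payment refused" rule are hypothetical.

```python
def clean_orders(raw_orders):
    """Drop orders whose payment was refused (illustrative business rule)."""
    return [o for o in raw_orders if o["status"] != "payment_refused"]


def revenue_per_user(orders):
    """Sum order amounts per user."""
    totals = {}
    for order in orders:
        totals[order["user"]] = totals.get(order["user"], 0) + order["amount"]
    return totals


def run_pipeline(raw_orders):
    """The whole pipeline: raw orders in, revenue per user out."""
    return revenue_per_user(clean_orders(raw_orders))


def test_pipeline_end_to_end():
    raw = [
        {"user": "alice", "amount": 10, "status": "paid"},
        {"user": "alice", "amount": 20, "status": "paid"},
        {"user": "bob", "amount": 5, "status": "paid"},
        {"user": "bob", "amount": 99, "status": "payment_refused"},
    ]
    # Only input and final output are pinned down, not intermediate steps.
    assert run_pipeline(raw) == {"alice": 30, "bob": 5}


test_pipeline_end_to_end()
```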
Delivery waste
● Friction from code to production
○ Ideal: Idea, research, write code+tests, done. Everything else is friction.
● Code inventory
○ Code not yet fully utilised
● Data inventory
○ Data not yet fully processed
Data product quality assurance
● Product quality = f(code, data)
○ Cannot do full QA on code only
○ Only real data is production data
● Test in production
○ Quick QA cycle = quick production deployment
○ Measure, monitor, validate
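One way to picture "measure, monitor, validate" is a check that runs on each dataset a pipeline produces in production. The Python sketch below is an assumption about what such a check could look like; the function name `validate_output`, its thresholds, and the field names are hypothetical.

```python
def validate_output(records, min_rows, required_fields):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Measure: row count as a basic volume metric.
    if len(records) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(records)}")
    # Validate: every required field must be present and non-null.
    for field in required_fields:
        missing = sum(1 for r in records if r.get(field) is None)
        if missing:
            problems.append(f"{missing} rows lack '{field}'")
    return problems


output = [
    {"user": "alice", "amount": 30},
    {"user": "bob", "amount": None},
]
for problem in validate_output(output, min_rows=1, required_fields=["user", "amount"]):
    print("ALERT:", problem)
```

In a real setup the problems would feed a monitoring or alerting system rather than `print`.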
Eliminating delivery friction
● In theory simple - scrutinise everything
○ Positive engineering: writing code, tests, docs, refactor, improve
○ All else is negative
● You are limited by your assumptions
○ State of practice far from state of art
"But the test suite takes 3 hours."
"We have this checklist."
"Security must approve."
"X must be released before Y."
"That is another team's job."
"We don't have access."
"We must test in staging first."
"We haven't performance tested yet."
So get rid of the waste. Resources:
No tradeoff between speed and quality!
Code inventory
● Code not yet fully utilised
● Code on its way to production
○ In a notebook
○ Waiting for approval
○ Waiting for release
○ Internally released, waiting for dependants to upgrade
● Tests not fully used
○ Cover code (shared component), but not yet executed
Data inventory
● Data collected, but not yet fully processed
○ Traditional lazy joins & SQL processing at runtime
● Eliminate with eager processing = pipeline
○ Process, join, denormalise
● Fatal problems → offline crash
○ "Andon" cord - stop and fix before significant harm is done
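The eager "process, join, denormalise" step and the Andon cord can be sketched together in Python: the join happens once in a batch job, and the job crashes offline if too many records fail to match, before any output is published. The names `AndonStop` and `denormalise_orders`, the record fields, and the 1% threshold are illustrative assumptions.

```python
class AndonStop(Exception):
    """Raised to halt the batch job before bad output is published."""


def denormalise_orders(orders, users_by_id, max_unmatched_ratio=0.01):
    """Eagerly join each order with its user, so readers need no runtime join."""
    joined = []
    unmatched = 0
    for order in orders:
        user = users_by_id.get(order["user_id"])
        if user is None:
            unmatched += 1
            continue
        joined.append({**order, "country": user["country"]})
    # Pull the cord: crash offline instead of shipping a mostly broken dataset.
    if orders and unmatched / len(orders) > max_unmatched_ratio:
        raise AndonStop(f"{unmatched} of {len(orders)} orders have no matching user")
    return joined


users = {1: {"country": "SE"}, 2: {"country": "DE"}}
orders = [{"order_id": "a", "user_id": 1}, {"order_id": "b", "user_id": 2}]
print(denormalise_orders(orders, users))
```

Because the job runs offline, a crash here delays a dataset rather than affecting users directly.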
Operational waste
● Friction in operational manoeuvres
○ Fear of mistakes
● Cost of incidents
○ Time to recovery
○ Impact of incident
○ Frequency of incidents
Separating offline and online
[Diagram: Orders, Replication / Backup, Raw, Fraud model, Fraud service. Two zones are labelled Standard procedures and one Lightweight procedures, with a Careful handover marked twice.]
● QA driven by internal efficiency
● Continuous deployment
● New pipeline < 1 day
● Upgrade < 1 hour
● Bug recovery < 1 hour
Cost of a software error
Online
● User impact
● Data corruption
● Cascading corruption
● Unbounded recovery
Nearline
● Data corruption
● Downstream impact
● Bounded recovery
Offline
● Temporary data corruption
● Downstream impact
● Easy recovery
[Diagram: jobs connected by streams]
Data processing tradeoff
[Diagram: an axis between data speed and innovation speed, with Online, Nearline, and Offline processing placed along it; jobs connected by streams]
Product waste
● Work not driven by use case
● Unrealised data potential due to friction
○ Unawareness of data
○ Difficulty using data
● Hidden quality problems
● Collaboration and communication overhead
Data democratisation - making data accessible and usable
Copyright 2020 by DataKitchen, Inc. All Rights Reserved.
Waste: Your Team’s Time Not Well Spent
[Chart: percentage of time the team spends per week (current state), split into Errors & Operational Tasks, New Features & Data For Customers, and Improvements & Debt]
Challenges:
• Complex roles
• Complex organizations
• Complex toolchains
• Complex data
• Complex collaboration
Waste: Data Analytics is like the US Auto Industry in the 1970s
[Diagram, current state: data analytics team with high production errors and a Dev-to-Prod deployment latency of weeks or months]
Challenges:
• Slow to add new features, rapidly address consumer requests, changing data sets
• Lack of trust by data consumers
• Slow model deployment, slow to move to cloud
• Team morale
Waste: Conway’s Law and Data Pipelines
Data Analytics Follows Conway's Law
The structure of how teams are organized to do Data Science, Data Engineering, Analytics, and Production is reflected in their data pipelines.
Waste: A cornucopia of collaboration complexity
[Diagram: four team topologies drawn from D and P nodes, with team roles labelled DE, DS, BI. D = Development - Data Analytic Team, P = Production - Data Analytic Team]
• Centralized Dev: How do we create together without conflicts? (Data Engineer & Data Scientist)
• Centralized Dev & Prod: How do we deploy safely and rapidly? (Data Team and Production Team)
• Decentralized Dev: How to balance centralized control vs self service freedom? (Home Office Data Team and Line of Business Analysts)
• Decentralized Dev & Prod: How to reuse/incorporate what another team deployed? (Multiple Data & Production Teams in Many Orgs)
Why? Data Teams Are Suffering
Data teams are caught between three competing forces:
• Unaware Data Providers – unaware that they send crappy, late, and error prone data sets
• Demanding Data Consumers – demand trusted, original insight at the speed of Amazon delivery
• Critical Supporting Teams – need flawless ongoing production and collaboration with other teams/people
Make for:
• A beaten down, distraught, disempowered work environment
• Teams that cannot create and innovate
• Lack of trust all around
DataOps – Solution To That Suffering
DataOps – The technical practices, cultural norms, and architecture that enable:
• Rapid cycles of experimentation and innovation to deliver new insights to our customers
• Low error rates
• Collaboration across complex sets of people, technology, and environments
• Clear measurement and monitoring of results
“Organizations that adopt a DevOps- and DataOps-based approach are more successful in implementing end-to-end, reliable, robust, scalable and repeatable solutions.”
Sumit Pal, Gartner, November 2018
[Diagram labels: People, Process, Organization; Technical Environment]
DataOps Benefit: Lower Cost, More Insight
[Chart: percentage of time the team spends per week, before and after DataOps. Before DataOps: Errors & Operational Tasks, New Features & Data For Customers, Improvements & Debt. After DataOps: New Features & Data For Customers, Errors & Operational Tasks, Process Improvements & Tech Debt Reduction]
DataOps Benefit: Faster, Better & Happier
[Diagram: data analytics team before and after DataOps. Before DataOps: high production errors, Dev-to-Prod deployment latency of weeks or months. After DataOps: low production errors, Dev-to-Prod deployment latency of hours and minutes]
DevOps vs DataOps (and all those *Opses)
Business management concept: Lean, Learning Organization, and W Edwards Deming principles. Focus on low errors, cycle time, collaboration, and measurement.
Organization: Industrial manufacturing teams; data science, engineering and analytics teams; IT and software teams
Team management method: Agile, Kanban, Scrum, DA, etc.; Six Sigma, Total Quality Management
Technical environment and process: DevOps, AIOps, DevSecOps, DataOps, ModelOps, MLOps, GitOps, …
What You Do Is Much Less Important Than How You Do It
“We realized that the true problem, the true difficulty, and where the greatest potential is – is building the machine that makes the machine. It’s building the factory.” – Elon Musk
94% of causes were common cause. We often attribute problems to a specific case, and look for a person to blame, rather than focusing on the underlying process – Dr Deming
Questions?
38

More Related Content

What's hot

Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
Denodo
 
Dsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovicDsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovic
Radovan Baćović
 
Testing the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big ProblemsTesting the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big Problems
TechWell
 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
Mihai Criveti
 
Measuring Data Quality with DataOps
Measuring Data Quality with DataOpsMeasuring Data Quality with DataOps
Measuring Data Quality with DataOps
Steven Ensslen
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for Success
Inside Analysis
 
Architecting for analytics
Architecting for analyticsArchitecting for analytics
Architecting for analytics
Rob Winters
 
You're the New CDO, Now What?
You're the New CDO, Now What?You're the New CDO, Now What?
You're the New CDO, Now What?
Caserta
 
Michael Stonebraker: Big Data, Disruption, and the 800 Pound Gorilla in the ...
Michael Stonebraker:  Big Data, Disruption, and the 800 Pound Gorilla in the ...Michael Stonebraker:  Big Data, Disruption, and the 800 Pound Gorilla in the ...
Michael Stonebraker: Big Data, Disruption, and the 800 Pound Gorilla in the ...
TamrMarketing
 
Open Data Science Conference Agile Data
Open Data Science Conference Agile DataOpen Data Science Conference Agile Data
Open Data Science Conference Agile Data
DataKitchen
 
Webinar: Attaining Excellence in Big Data Integration
Webinar: Attaining Excellence in Big Data IntegrationWebinar: Attaining Excellence in Big Data Integration
Webinar: Attaining Excellence in Big Data Integration
SnapLogic
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products
Dataiku
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku
 
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Caserta
 
Data Ops at TripActions
Data Ops at TripActionsData Ops at TripActions
Data Ops at TripActions
Rob Winters
 
MLUC 2011 XQuery Enigma
MLUC 2011 XQuery EnigmaMLUC 2011 XQuery Enigma
MLUC 2011 XQuery Enigma
Peter O'Kelly
 
An Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BIAn Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BI
Inside Analysis
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
Databricks
 
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
Seeling Cheung
 
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data Lake
Caserta
 

What's hot (20)

Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Dsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovicDsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovic
 
Testing the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big ProblemsTesting the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big Problems
 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
 
Measuring Data Quality with DataOps
Measuring Data Quality with DataOpsMeasuring Data Quality with DataOps
Measuring Data Quality with DataOps
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for Success
 
Architecting for analytics
Architecting for analyticsArchitecting for analytics
Architecting for analytics
 
You're the New CDO, Now What?
You're the New CDO, Now What?You're the New CDO, Now What?
You're the New CDO, Now What?
 
Michael Stonebraker: Big Data, Disruption, and the 800 Pound Gorilla in the ...
Michael Stonebraker:  Big Data, Disruption, and the 800 Pound Gorilla in the ...Michael Stonebraker:  Big Data, Disruption, and the 800 Pound Gorilla in the ...
Michael Stonebraker: Big Data, Disruption, and the 800 Pound Gorilla in the ...
 
Open Data Science Conference Agile Data
Open Data Science Conference Agile DataOpen Data Science Conference Agile Data
Open Data Science Conference Agile Data
 
Webinar: Attaining Excellence in Big Data Integration
Webinar: Attaining Excellence in Big Data IntegrationWebinar: Attaining Excellence in Big Data Integration
Webinar: Attaining Excellence in Big Data Integration
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
 
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
 
Data Ops at TripActions
Data Ops at TripActionsData Ops at TripActions
Data Ops at TripActions
 
MLUC 2011 XQuery Enigma
MLUC 2011 XQuery EnigmaMLUC 2011 XQuery Enigma
MLUC 2011 XQuery Enigma
 
An Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BIAn Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BI
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
 
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
 
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data Lake
 

Similar to The lean principles of data ops

DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practices
Lars Albertsson
 
Data ops in practice - Swedish style
Data ops in practice - Swedish styleData ops in practice - Swedish style
Data ops in practice - Swedish style
Lars Albertsson
 
Crossing the data divide
Crossing the data divideCrossing the data divide
Crossing the data divide
Lars Albertsson
 
Holistic data application quality
Holistic data application qualityHolistic data application quality
Holistic data application quality
Lars Albertsson
 
Overcoming Digital Transformation Pain Points
Overcoming Digital Transformation Pain PointsOvercoming Digital Transformation Pain Points
Overcoming Digital Transformation Pain Points
Inductive Automation
 
Developing and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000DDeveloping and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000D
dclsocialmedia
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
Denodo
 
Data engineering in 10 years.pdf
Data engineering in 10 years.pdfData engineering in 10 years.pdf
Data engineering in 10 years.pdf
Lars Albertsson
 
Data Con LA 2022 - Practical Solutions to Complex Supply Chain Problems
Data Con LA 2022 - Practical Solutions to Complex Supply Chain ProblemsData Con LA 2022 - Practical Solutions to Complex Supply Chain Problems
Data Con LA 2022 - Practical Solutions to Complex Supply Chain Problems
Data Con LA
 
Introduction for Embedding Infobright for OEMs
Introduction for Embedding Infobright for OEMsIntroduction for Embedding Infobright for OEMs
Introduction for Embedding Infobright for OEMs
Infobright
 
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
NuoDB
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Denodo
 
Secure software supply chain on a shoestring budget
Secure software supply chain on a shoestring budgetSecure software supply chain on a shoestring budget
Secure software supply chain on a shoestring budget
Lars Albertsson
 
David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...
David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...
David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...
Codemotion
 
Essential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big DataEssential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big Data
Society of Petroleum Engineers
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
DATAVERSITY
 
Cubodrom profile
Cubodrom profileCubodrom profile
Cubodrom profile
cubodrom
 
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Precisely
 
Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven DecisionsPower to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Looker
 
Democratizing Data Science in the Enterprise
Democratizing Data Science in the EnterpriseDemocratizing Data Science in the Enterprise
Democratizing Data Science in the Enterprise
Jesus Rodriguez
 

Similar to The lean principles of data ops (20)

DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practices
 
Data ops in practice - Swedish style
Data ops in practice - Swedish styleData ops in practice - Swedish style
Data ops in practice - Swedish style
 
Crossing the data divide
Crossing the data divideCrossing the data divide
Crossing the data divide
 
Holistic data application quality
Holistic data application qualityHolistic data application quality
Holistic data application quality
 
Overcoming Digital Transformation Pain Points
Overcoming Digital Transformation Pain PointsOvercoming Digital Transformation Pain Points
Overcoming Digital Transformation Pain Points
 
Developing and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000DDeveloping and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000D
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
Data engineering in 10 years.pdf
Data engineering in 10 years.pdfData engineering in 10 years.pdf
Data engineering in 10 years.pdf
 
Data Con LA 2022 - Practical Solutions to Complex Supply Chain Problems
Data Con LA 2022 - Practical Solutions to Complex Supply Chain ProblemsData Con LA 2022 - Practical Solutions to Complex Supply Chain Problems
Data Con LA 2022 - Practical Solutions to Complex Supply Chain Problems
 
Introduction for Embedding Infobright for OEMs
Introduction for Embedding Infobright for OEMsIntroduction for Embedding Infobright for OEMs
Introduction for Embedding Infobright for OEMs
 
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
Secure software supply chain on a shoestring budget
Secure software supply chain on a shoestring budgetSecure software supply chain on a shoestring budget
Secure software supply chain on a shoestring budget
 
David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...
David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...
David García, Rubén Aguilera Díaz-Heredero | A microservices experience in th...
 
Essential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big DataEssential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big Data
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
Cubodrom profile
Cubodrom profileCubodrom profile
Cubodrom profile
 
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
 
Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven DecisionsPower to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions
 
Democratizing Data Science in the Enterprise
Democratizing Data Science in the EnterpriseDemocratizing Data Science in the Enterprise
Democratizing Data Science in the Enterprise
 


The lean principles of DataOps

  • 1. www.scling.com
    The lean principles of DataOps
    Berlin Buzzwords, 2020-06-08
    Lars Albertsson, Founder, Scling
    Christopher Bergh, CEO & Head Chef, DataKitchen
  • 2. Scling - data-value-as-a-service
    [Diagram labels: data lake, stream storage]
    ● Extract value from your data
    ● Data platform + custom data pipelines
    ● Imitate data leaders:
      ○ Quick idea-to-production
      ○ Operational efficiency
    Our marketing strategy:
    ● Promiscuously share knowledge
      ○ On slides devoid of glossy polish
  • 3. 1994: OS/2 Warp CID installation
    "Grmbl, who reinstalled my machine?"
  • 4. IT craft to factory
    Craft: Security, Waterfall, Application delivery, Traditional operations, Traditional QA, Infrastructure
    Factory: DevSecOps, Agile, Containers, DevOps, CI/CD, Infrastructure as code
  • 6. The Toyota Way
    Selected lean principles:
    ● Long-term over short-term
    ● The right process will produce the right results
    ● Eliminate waste (muda)
    ● Continuous improvement (kaizen)
    ● Use pull systems to avoid unnecessary production
    ● Quality takes precedence (jidoka)
      ○ Stop to fix problems
    ● Standardised tasks and processes
    ● Reliable technology that serves people and process
    ● Develop your people
    ● Decisions slowly by consensus
    ● Relentless reflection (hansei), organisational learning
  • 7. Common waste species
    ● Cognitive waste
    ● Delivery waste
    ● Operational waste
    ● Product waste
  • 8. Cognitive waste
    ● Why do we have 25 time formats?
      ○ ISO 8601, UTC assumed
      ○ ISO 8601 + timezone
      ○ Millis since epoch, UTC
      ○ Nanos since epoch, UTC
      ○ Millis since epoch, user local time
      ○ …
      ○ Float of seconds since epoch, as string. WTF?!?
    ● my-kafka-topic-name, your_topic_name
    ● Definition of an order:
      ○ Abandoned cart?
      ○ Payment refused?
      ○ Returned goods?
      ○ Free promotion?
    ● Data entity source of truth
      ○ MySQL, Kafka, data lake?
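One way to contain the time format sprawl listed on slide 8 is to convert every incoming timestamp to a single canonical form at the pipeline boundary. The following is a minimal sketch, not something shown in the talk: the function names and the choice of millisecond-precision ISO 8601 in UTC as the canonical form are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helpers; the names and the canonical format are illustrative assumptions.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(millis: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch, in UTC."""
    return EPOCH + timedelta(milliseconds=millis)


def from_iso8601(text: str, assume_utc: bool = True) -> datetime:
    """Parse an ISO 8601 string; a missing offset is treated as UTC only if allowed."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        if not assume_utc:
            raise ValueError(f"timestamp without timezone: {text!r}")
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def to_canonical(ts: datetime) -> str:
    """Render any timezone-aware datetime as ISO 8601 in UTC, millisecond precision."""
    return ts.astimezone(timezone.utc).isoformat(timespec="milliseconds")


print(to_canonical(from_epoch_millis(1591610400000)))
print(to_canonical(from_iso8601("2020-06-08T12:00:00+02:00")))
```

Both calls describe the same instant, so downstream jobs see one representation regardless of which format the producer chose.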
  • 9. What causes cognitive waste?
    ● We are autonomous!
      ○ Teams can choose technology, format, process, ...
    ● Cognitive debt
      ○ Short-term over long-term
      ○ Decisions without consensus
    ● Recognition and rewards
      ○ "You have made a similar independent pipeline, great work!"
  • 10. Avoiding cognitive waste
    ● Reusing semantic definitions
    ● Reusing code & technical definitions
      ○ Code transparency & sharing
      ○ Standardised technology
      ○ Document decisions & consensus process
    ● Read-only sharing not enough
      ○ Must be empowered to change for reuse and to improve quality
      ○ Standardised processes
  • 11. Eliminating cognitive waste
    ● Refactoring code, semantics, docs
    ● Low risk - what will I break downstream?
      ○ Standardised, automated, trusted QA process
      ○ End-to-end pipeline testing
    ● "Creating a pipeline - one day! Replace old pipeline - 18 months."
  • 12. Delivery waste
    ● Friction from code to production
      ○ Ideal: Idea, research, write code+tests, done. Everything else is friction.
    ● Code inventory
      ○ Code not yet fully utilised
    ● Data inventory
      ○ Data not yet fully processed
  • 13. Data product quality assurance
    ● Product quality = f(code, data)
      ○ Cannot do full QA on code only
      ○ Only real data is production data
    ● Test in production
      ○ Quick QA cycle = quick production deployment
      ○ Measure, monitor, validate
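Because product quality depends on both code and data, the "measure, monitor, validate" step on slide 13 has to run against the datasets a job actually produces. A minimal sketch of that idea follows, assuming a dataset of records with a `user_id` field; the field name, the metrics, and the thresholds are all hypothetical choices for illustration, not values from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityReport:
    """Measurements taken on one output dataset (hypothetical metrics)."""
    row_count: int
    null_user_ratio: float


def measure(records: list[dict]) -> QualityReport:
    """Measure: compute simple quality metrics on produced data."""
    total = len(records)
    missing = sum(1 for record in records if record.get("user_id") is None)
    # An empty dataset is reported as fully missing rather than dividing by zero.
    return QualityReport(total, missing / total if total else 1.0)


def validate(report: QualityReport, min_rows: int = 1,
             max_null_user_ratio: float = 0.01) -> list[str]:
    """Validate: compare metrics against thresholds and list every violation."""
    problems = []
    if report.row_count < min_rows:
        problems.append(f"too few rows: {report.row_count}")
    if report.null_user_ratio > max_null_user_ratio:
        problems.append(f"too many missing user ids: {report.null_user_ratio:.1%}")
    return problems


report = measure([{"user_id": 1}, {"user_id": None}])
print(report, validate(report))
```

The report can also be emitted to a monitoring system so that drifts in the metrics are visible over time, not only hard threshold violations.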
  • 14. Eliminating delivery friction
    ● In theory simple - scrutinise everything
      ○ Positive engineering: writing code, tests, docs, refactor, improve
      ○ All else is negative
    ● You are limited by your assumptions
      ○ State of practice far from state of art
    "But the test suite takes 3 hours."
    "We have this checklist."
    "Security must approve."
    "X must be released before Y."
    "That is another team's job."
    "We don't have access."
    "We must test in staging first."
    "We haven't performance tested yet."
  • 15. So get rid of the waste. Resources: No tradeoff between speed and quality!
  • 16. Code inventory
    ● Code not yet fully utilised
    ● Code on its way to production
      ○ In a notebook
      ○ Waiting for approval
      ○ Waiting for release
      ○ Internally released, waiting for dependants to upgrade
    ● Tests not fully used
      ○ Cover code (shared component), but not yet executed
  • 17. Data inventory
    ● Data collected, but not yet fully processed
      ○ Traditional lazy joins & SQL processing at runtime
    ● Eliminate with eager processing = pipeline
      ○ Process, join, denormalise
    ● Fatal problems → offline crash
      ○ "Andon" cord - stop and fix before significant harm is done
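The eager processing and "Andon" cord ideas on slide 17 can be sketched as a batch step that joins and denormalises ahead of time, and crashes offline instead of publishing a badly broken dataset. This is an illustrative sketch only: the record fields (`order_id`, `user_id`, `country`), the exception name, and the 5% tolerance are assumptions, not details from the talk.

```python
class PipelineHalted(RuntimeError):
    """Raised to stop the batch job before a bad dataset reaches consumers."""


def denormalise_orders(orders: list[dict], users: list[dict],
                       max_unmatched_ratio: float = 0.05) -> list[dict]:
    """Eagerly join orders with users; halt if too many orders lack a user."""
    users_by_id = {user["user_id"]: user for user in users}
    joined, unmatched = [], 0
    for order in orders:
        user = users_by_id.get(order["user_id"])
        if user is None:
            unmatched += 1
            continue
        # Denormalise: copy the user attribute onto the order record.
        joined.append({**order, "country": user["country"]})
    if orders and unmatched / len(orders) > max_unmatched_ratio:
        # Pull the cord: fail offline so no output is published.
        raise PipelineHalted(f"{unmatched} of {len(orders)} orders have no matching user")
    return joined


users = [{"user_id": 1, "country": "SE"}, {"user_id": 2, "country": "DE"}]
orders = [{"order_id": 10, "user_id": 1}, {"order_id": 11, "user_id": 2}]
print(denormalise_orders(orders, users))
```

Since the job runs offline, a raised exception only delays the dataset; the previous output stays in place while the problem is fixed.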
  • 18. Operational waste
    ● Friction in operational manoeuvres
      ○ Fear of mistakes
    ● Cost of incidents
      ○ Time to recovery
      ○ Impact of incident
      ○ Frequency of incidents
  • 19. Separating offline and online
    [Diagram labels: Raw, Orders, Fraud model, Fraud service, Replication / Backup, Standard procedures, Lightweight procedures, Careful handover]
    ● QA driven by internal efficiency
    ● Continuous deployment
    ● New pipeline < 1 day
    ● Upgrade < 1 hour
    ● Bug recovery < 1 hour
  • 20. Cost of a software error
    Online
    ● User impact
    ● Data corruption
    ● Cascading corruption
    ● Unbounded recovery
  • 21. Cost of a software error
    Online
    ● User impact
    ● Data corruption
    ● Cascading corruption
    ● Unbounded recovery
    Nearline
    ● Data corruption
    ● Downstream impact
    ● Bounded recovery
    [Diagram: jobs and streams]
  • 22. Cost of a software error
    Online
    ● User impact
    ● Data corruption
    ● Cascading corruption
    ● Unbounded recovery
    Nearline
    ● Data corruption
    ● Downstream impact
    ● Bounded recovery
    Offline
    ● Temporary data corruption
    ● Downstream impact
    ● Easy recovery
    [Diagram: jobs and streams]
  • 23. Data processing tradeoff
    [Diagram labels: Online, Nearline, Offline; Data speed; Innovation speed; jobs and streams]
  • 24. Product waste
    ● Work not driven by use case
    ● Unrealised data potential due to friction
      ○ Unawareness of data
      ○ Difficulty to use data
    ● Hidden quality problems
    ● Collaboration and communication overhead
    Data democratisation - making data accessible and usable
  • 25. Copyright 2020 by DataKitchen, Inc. All Rights Reserved.
    Waste: Your Team’s Time Not Well Spent
    [Chart: percentage of time the team currently spends per week. Categories: Errors & Operational Tasks; New Features & Data For Customers; Improvements & Debt]
    Challenges:
    • Complex roles
    • Complex organizations
    • Complex toolchains
    • Complex data
    • Complex collaboration
  • 26. Waste: Data Analytics is like the US Auto Industry in the 1970s
    [Diagram: current state of a data analytics team, with high production errors and a dev-to-prod deployment latency of weeks or months]
    Challenges:
    • Slow to add new features, rapidly address consumer requests, changing data sets
    • Lack of trust by data consumers
    • Slow model deployment, slow to move to cloud
    • Team morale
  • 27. Waste: Conway’s Law and Data Pipelines
    Data Analytics Follows Conway's Law
    The structure of how teams are organized to do Data Science, Data Engineering, Analytics, and Production is reflected in their data pipelines.
  • 28. Waste: A cornucopia of collaboration complexity
    [Diagram: Development (D) and Production (P) data analytic teams arranged as Centralized Dev, Centralized Dev & Prod, Decentralized Dev, and Decentralized Dev & Prod; roles shown: DE, DS, BI]
    • How do we create together without conflicts? (Data Engineer & Data Scientist)
    • How do we deploy safely and rapidly? (Data Team and Production Team)
    • How to balance centralized control vs self service freedom? (Home Office Data Team and Line of Business Analysts)
    • How to reuse/incorporate what another team deployed? (Multiple Data & Production Teams in Many Orgs)
  • 29. Why? Data Teams Are Suffering
    Data teams are caught between three competing forces:
    • Unaware Data Providers – unaware that they send crappy, late, and error prone data sets
    • Demanding Data Consumers – demand trusted, original insight at the speed of Amazon delivery
    • Critical Supporting Teams – need flawless ongoing production and collaboration with other teams/people
    Make for:
    • A beaten down, distraught, disempowered work environment
    • Teams that cannot create and innovate
    • Lack of trust all around
  • 30. DataOps – Solution To That Suffering
    DataOps – The technical practices, cultural norms, and architecture that enable:
    • Rapid cycles of experimentation and innovation to deliver new insights to our customers
    • Low error rates
    • Collaboration across complex sets of people, technology, and environments
    • Clear measurement and monitoring of results
    “Organizations that adopt a DevOps- and DataOps-based approach are more successful in implementing end-to-end, reliable, robust, scalable and repeatable solutions.” Sumit Pal, Gartner, November 2018 (Source: Gartner)
    [Diagram labels: People, Process, Organization; Technical Environment]
  • 31. DataOps Benefit: Lower Cost, More Insight
    [Chart: percentage of time the team spends per week, before and after DataOps. Categories: New Features & Data For Customers; Errors & Operational Tasks; Improvements & Debt (Process Improvements & Tech Debt Reduction)]
  • 32. DataOps Benefit: Faster, Better & Happier
    [Diagram: data analytics team before and after DataOps. Production errors: high before, low after. Dev-to-prod deployment latency: weeks or months before, hours and minutes after]
  • 33. DevOps vs DataOps (and all those *Opses)
    Lean, Learning Organization, and W. Edwards Deming Principles: Focus on Low Errors, Cycle Time, Collaboration, and Measurement
    [Table residue. Team types: Industrial Manufacturing Teams; Data Science, Engineering and Analytics Teams; IT and Software Teams. Levels: Business Management Concept; Organization / Organizational Management Method; Team Management; Technical Environment and Process. Team management entries: Agile, Kanban, Scrum, DA, etc.; Six Sigma, Total Quality Management. Technical environment and process entries: DevOps, AIOps, DevSecOps, DataOps, ModelOps, MLOps, …, GitOps]
  • 37. What You Do Is Much Less Important Than How You Do It
    “We realized that the true problem, the true difficulty, and where the greatest potential is – is building the machine that makes the machine. It’s building the factory.” – Elon Musk
    94% of causes were common cause. We often attribute problems to a specific case, and look for a person to blame, rather than focusing on the underlying process – Dr Deming