The Five Dysfunctions of a Data Engineering Team

Jesse Anderson
Jesse AndersonManaging Director at Big Data Institute
The	Five	Dysfunctions	of	a
Data	Engineering	Team
1	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Chapter	1
The	Five	Dysfunctions	of	a
Data	Engineering	Team
2	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Why	Worry
The	Dysfunctions
What	to	Do?
The	Five	Dysfunctions	of	a	Data	Engineering	Team
3	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
85%	of	Big	Data	projects
fail	to	get	into	production
Source:	http://tiny.bdi.io/gartnerfail
Failure
4	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
I'd	train	at	companies	and	see	failures	at
their	mid-point
It	took	a	while	to	see	the	patterns
It	took	more	time	to	figure	out	the	most
common	patterns
Big	data	only	amplifies	existing	problems
If	you	barely	get	by	with	small	data,	you'll	have	big	problems	with
Big	Data
You	can	be	successful	by	avoiding	these
problems
Why?
5	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Why	Worry
The	Dysfunctions
What	to	Do?
The	Five	Dysfunctions	of	a	Data	Engineering	Team
6	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
DBA	Definition	-	Someone	whose	only
programming	language	is	SQL
This	includes	data	warehouse,	SQL	Developers,	etc
Big	Data	is	not	an	extension	or	the	logical
extension	of	data	warehousing
It's	much	much	more	complex
http://tiny.bdi.io/complex
It's	not	just	a	skills	gap;	it's	an	ability	gap
http://tiny.bdi.io/abilitygap
All	DBAs
7	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Big	Data	is	complex
http://tiny.bdi.io/complex
Beginners	need	to	be	give	the	time	and
resources	to	learn
It	takes	at	least	6	months	for	a	beginner	to
become	proficient
As	you	look	at	successful	case	study	talks,
they	leave	out
Expert	resources	provided
Starting	proficiency	of	the	team
Total	time	used
Set	Up	For	failure
8	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Schema	problems	don't	manifest
immediately
Takes	6-12	months	to	see
You	can't	lay	down	PBs	of	data	and
change	it
Data	pipelines	need	to	change	data
formats
Which	role	typically	has	this	skill?
DBAs	(I	didn't	say	no	DBAs-I	said	not	just	DBAs)
No	One	Understands	Schema
9	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
A	project	veteran	is	someone	who	has	put
a	Big	Data	or	distributed	system	in
production
Beginners	to	distributed	systems	and	Big
Data	are	the	sources	of	the	worst
abominations
Average	time	lost	is	1-2	man	months
Very	different	to	whiteboard	and	erase
than	code	and	rewrite
Only	a	project	veteran	can	critique	a	whiteboarded	architecture
Remember	it's	programming	and	operations
No	Veterans
10	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
You	can't	go	from	0	to	Big	Data	all	at	once
You	really	can't	go	from	0	to	the	holy	grail
Your	team	needs	the	time	to	go	from
beginners	to	intermediate	to	advanced
You	need	to	build	momentum	first
Projects	without	momentum	get	canceled
Too	Ambitious
11	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Why	Worry
The	Dysfunctions
What	to	Do?
The	Five	Dysfunctions	of	a	Data	Engineering	Team
12	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Take	an	honest	evaluation	of	the	team
Skills
Abilities
Use	case
Resources
Does	the	team	have	a	skills	gap?
Does	the	team	have	an	ability	gap?
http://tiny.bdi.io/abilitygap
Does	this	Sound	Like	Your	Team?
13	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Data	engineering	teams
need	to	be	multidisciplinary
http://tiny.bdi.io/detbook
Data	Engineering	Teams
14	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Some	teams	say	they	don't	need	help
Technical	people	think	it's	not	needed	(small	data	mentality)
Admission	of	failure
Very	important	to	take	an	honest	look	at
the	team
Training
Consulting
Very	important	to	get	a	company	with	a	good	track	record
Mentoring
On	going	help	for	the	technical	and	business	teams
Getting	Help
15	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
100
80
60
40
20
0
First ReleaseTeam Creation Project Start Second Release Nth Release
Phase In Project
PercentofBlame
Management Development Operations
When	Things	Fail
16	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Early
Never	too	late	to	fix,	but
fixing	will	be	much	more
costly
When	Should	You	Fix?
17	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
Current:	Instructor,	Thought	Leader,	Monkey	Tamer
Previously:
Curriculum	Developer	and	Instructor	@	Cloudera
Senior	Software	Engineer	@	Intuit
Covered,	Conferences	and	Published	In:
GigaOM,	ArsTecnica,	Pragmatic	Programmers,	Strata,	OSCON,
Wall	Street	Journal,	CNN,	BBC,	NPR
See	Me	On:
http://www.jesse-anderson.com
@jessetanderson
http://tiny.bdi.io/linkedin
http://tiny.bdi.io/youtube
About	Me
18	/	18Copyright	©	2016	Smoking	Hand	LLC.	All	rights	Reserved.	Version:	ef81f3f
1 of 18

Recommended

Big Data and Analytics in the COVID-19 Era by
Big Data and Analytics in the COVID-19 EraBig Data and Analytics in the COVID-19 Era
Big Data and Analytics in the COVID-19 EraJesse Anderson
107 views21 slides
Fishbone and 5 Why webinar 02 11-21 by
Fishbone and 5 Why webinar 02 11-21Fishbone and 5 Why webinar 02 11-21
Fishbone and 5 Why webinar 02 11-21Darren Dolcemascolo
260 views26 slides
A/B testing AI - Global Artificial Intelligence Conference 2019 by
A/B testing AI - Global Artificial Intelligence Conference 2019A/B testing AI - Global Artificial Intelligence Conference 2019
A/B testing AI - Global Artificial Intelligence Conference 2019Pavel Dmitriev
302 views26 slides
Graham Mossman " The Big Data Internet of ... What?!" by
Graham Mossman " The Big Data Internet of ... What?!"Graham Mossman " The Big Data Internet of ... What?!"
Graham Mossman " The Big Data Internet of ... What?!"Dataconomy Media
2.1K views21 slides
How to Create 80% of a Big Data Pilot Project by
How to Create 80% of a Big Data Pilot ProjectHow to Create 80% of a Big Data Pilot Project
How to Create 80% of a Big Data Pilot ProjectGreg Makowski
828 views23 slides
Mika Karaila - Ai chatbots industry cases by
Mika Karaila - Ai chatbots industry casesMika Karaila - Ai chatbots industry cases
Mika Karaila - Ai chatbots industry casesEficode
181 views28 slides

More Related Content

Similar to The Five Dysfunctions of a Data Engineering Team

Preparing the next generation for the cognitive era by
Preparing the next generation for the cognitive era Preparing the next generation for the cognitive era
Preparing the next generation for the cognitive era Steven Miller
677 views40 slides
Dit yvol1iss5 by
Dit yvol1iss5Dit yvol1iss5
Dit yvol1iss5Rick Lemieux
129 views4 slides
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo... by
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...Dataconomy Media
558 views28 slides
Quarterly ideenwerkstatt 11_2013_eng by
Quarterly ideenwerkstatt 11_2013_engQuarterly ideenwerkstatt 11_2013_eng
Quarterly ideenwerkstatt 11_2013_engICV_eV
2.2K views4 slides
Business Process Automation: Are You Ready for It? by
Business Process Automation: Are You Ready for It?Business Process Automation: Are You Ready for It?
Business Process Automation: Are You Ready for It?CompTIA
2.9K views16 slides
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski by
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski
[DSC Europe 22] Avoid mistakes building AI products - Karol PrzystalskiDataScienceConferenc1
7 views69 slides

Similar to The Five Dysfunctions of a Data Engineering Team(20)

Preparing the next generation for the cognitive era by Steven Miller
Preparing the next generation for the cognitive era Preparing the next generation for the cognitive era
Preparing the next generation for the cognitive era
Steven Miller677 views
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo... by Dataconomy Media
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Dataconomy Media558 views
Quarterly ideenwerkstatt 11_2013_eng by ICV_eV
Quarterly ideenwerkstatt 11_2013_engQuarterly ideenwerkstatt 11_2013_eng
Quarterly ideenwerkstatt 11_2013_eng
ICV_eV2.2K views
Business Process Automation: Are You Ready for It? by CompTIA
Business Process Automation: Are You Ready for It?Business Process Automation: Are You Ready for It?
Business Process Automation: Are You Ready for It?
CompTIA2.9K views
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski by DataScienceConferenc1
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski
[DSC Europe 22] Avoid mistakes building AI products - Karol Przystalski
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum... by Amazon Web Services Korea
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Three Monitoring Mistakes and How to Avoid Them by New Relic
Three Monitoring Mistakes and How to Avoid ThemThree Monitoring Mistakes and How to Avoid Them
Three Monitoring Mistakes and How to Avoid Them
New Relic454 views
How to (almost certainly) fail: Building vs. buying your API infrastructure by Apigee | Google Cloud
How to (almost certainly) fail: Building vs. buying your API infrastructureHow to (almost certainly) fail: Building vs. buying your API infrastructure
How to (almost certainly) fail: Building vs. buying your API infrastructure
Performance is a feature! - Developer South Coast - part 1 by Matt Warren
Performance is a feature! - Developer South Coast - part 1Performance is a feature! - Developer South Coast - part 1
Performance is a feature! - Developer South Coast - part 1
Matt Warren825 views
3 Keys for Digital Transformation in Manufacturing by Plex Systems
3 Keys for Digital Transformation in Manufacturing3 Keys for Digital Transformation in Manufacturing
3 Keys for Digital Transformation in Manufacturing
Plex Systems2.2K views
Petrophysics and Big Data by Elephant Scale training and consultin by elephantscale
Petrophysics and Big Data by Elephant Scale training and consultinPetrophysics and Big Data by Elephant Scale training and consultin
Petrophysics and Big Data by Elephant Scale training and consultin
elephantscale760 views
The standish group chaos report by Mizno Kruge
The standish group chaos report The standish group chaos report
The standish group chaos report
Mizno Kruge1.3K views
Is your IT aligned to your Business needs? Getting it to Think Like the Busin... by HCL Technologies
Is your IT aligned to your Business needs? Getting it to Think Like the Busin...Is your IT aligned to your Business needs? Getting it to Think Like the Busin...
Is your IT aligned to your Business needs? Getting it to Think Like the Busin...
HCL Technologies500 views
Bailing Out Your Business with Open Source by Matt Asay
Bailing Out Your Business with Open SourceBailing Out Your Business with Open Source
Bailing Out Your Business with Open Source
Matt Asay635 views
Tackling the ticking time bomb – Data Migration and the hidden risks by Harley Capewell
Tackling the ticking time bomb – Data Migration and the hidden risksTackling the ticking time bomb – Data Migration and the hidden risks
Tackling the ticking time bomb – Data Migration and the hidden risks
Harley Capewell906 views
Top Three Mistakes People Make with Monitoring by New Relic
Top Three Mistakes People Make with MonitoringTop Three Mistakes People Make with Monitoring
Top Three Mistakes People Make with Monitoring
New Relic164 views
Working Together As Data Teams V1 by Jesse Anderson
Working Together As Data Teams V1Working Together As Data Teams V1
Working Together As Data Teams V1
Jesse Anderson214 views
Introduction To Predictive Modelling by Spotle.ai
Introduction To Predictive ModellingIntroduction To Predictive Modelling
Introduction To Predictive Modelling
Spotle.ai299 views

More from Jesse Anderson

Managing Real-Time Data Teams by
Managing Real-Time Data TeamsManaging Real-Time Data Teams
Managing Real-Time Data TeamsJesse Anderson
213 views20 slides
Pulsar for Kafka People by
Pulsar for Kafka PeoplePulsar for Kafka People
Pulsar for Kafka PeopleJesse Anderson
403 views25 slides
What Does an Exec Need to About Architecture and Why by
What Does an Exec Need to About Architecture and WhyWhat Does an Exec Need to About Architecture and Why
What Does an Exec Need to About Architecture and WhyJesse Anderson
138 views16 slides
HBaseCon 2014-Just the Basics by
HBaseCon 2014-Just the BasicsHBaseCon 2014-Just the Basics
HBaseCon 2014-Just the BasicsJesse Anderson
562 views23 slides
Million Monkeys User Group by
Million Monkeys User GroupMillion Monkeys User Group
Million Monkeys User GroupJesse Anderson
408 views18 slides
Strata 2012 Million Monkeys by
Strata 2012 Million MonkeysStrata 2012 Million Monkeys
Strata 2012 Million MonkeysJesse Anderson
338 views9 slides

More from Jesse Anderson(11)

Managing Real-Time Data Teams by Jesse Anderson
Managing Real-Time Data TeamsManaging Real-Time Data Teams
Managing Real-Time Data Teams
Jesse Anderson213 views
What Does an Exec Need to About Architecture and Why by Jesse Anderson
What Does an Exec Need to About Architecture and WhyWhat Does an Exec Need to About Architecture and Why
What Does an Exec Need to About Architecture and Why
Jesse Anderson138 views
HBaseCon 2014-Just the Basics by Jesse Anderson
HBaseCon 2014-Just the BasicsHBaseCon 2014-Just the Basics
HBaseCon 2014-Just the Basics
Jesse Anderson562 views
EC2 Performance, Spot Instance ROI and EMR Scalability by Jesse Anderson
EC2 Performance, Spot Instance ROI and EMR ScalabilityEC2 Performance, Spot Instance ROI and EMR Scalability
EC2 Performance, Spot Instance ROI and EMR Scalability
Jesse Anderson1.6K views
Introduction to Regular Expressions by Jesse Anderson
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular Expressions
Jesse Anderson3K views

Recently uploaded

Uni Systems for Power Platform.pptx by
Uni Systems for Power Platform.pptxUni Systems for Power Platform.pptx
Uni Systems for Power Platform.pptxUni Systems S.M.S.A.
61 views21 slides
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue by
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueShapeBlue
222 views23 slides
NTGapps NTG LowCode Platform by
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform Mustafa Kuğu
365 views30 slides
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITShapeBlue
166 views8 slides
Cencora Executive Symposium by
Cencora Executive SymposiumCencora Executive Symposium
Cencora Executive Symposiummarketingcommunicati21
139 views14 slides
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... by
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...ShapeBlue
138 views18 slides

Recently uploaded(20)

What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue by ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
ShapeBlue222 views
NTGapps NTG LowCode Platform by Mustafa Kuğu
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform
Mustafa Kuğu365 views
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by ShapeBlue
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
ShapeBlue166 views
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... by ShapeBlue
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
ShapeBlue138 views
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue by ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
ShapeBlue176 views
State of the Union - Rohit Yadav - Apache CloudStack by ShapeBlue
State of the Union - Rohit Yadav - Apache CloudStackState of the Union - Rohit Yadav - Apache CloudStack
State of the Union - Rohit Yadav - Apache CloudStack
ShapeBlue253 views
The Power of Heat Decarbonisation Plans in the Built Environment by IES VE
The Power of Heat Decarbonisation Plans in the Built EnvironmentThe Power of Heat Decarbonisation Plans in the Built Environment
The Power of Heat Decarbonisation Plans in the Built Environment
IES VE69 views
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue by ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlueVNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
ShapeBlue163 views
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ... by ShapeBlue
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
ShapeBlue123 views
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas... by Bernd Ruecker
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
Bernd Ruecker50 views
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... by ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue158 views
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava... by ShapeBlue
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
ShapeBlue101 views
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue by ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueCloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
ShapeBlue93 views
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates by ShapeBlue
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates
ShapeBlue210 views
The Role of Patterns in the Era of Large Language Models by Yunyao Li
The Role of Patterns in the Era of Large Language ModelsThe Role of Patterns in the Era of Large Language Models
The Role of Patterns in the Era of Large Language Models
Yunyao Li80 views

The Five Dysfunctions of a Data Engineering Team