ALEX DEAN
What makes an
effective data
team?
@alexcrdean
Introducing me, Alex Dean
• CEO and co-founder at Snowplow Analytics[1], the company
behind Snowplow, the open source event data pipeline[2]
• Our mission at Snowplow is to help companies make better
decisions
• At different stages of my working life I have been a data
engineer and a business analyst, but never a data scientist!
• Weekend writer of Event Streams in Action (Manning)[3]
[1] https://snowplowanalytics.com
[2] https://github.com/snowplow/snowplow
[3] https://www.manning.com/books/event-streams-in-action
@datasciencefest
@SnowplowData
Snowplow is a real-time event data pipeline designed for
the data team
• My co-founder Yali and I created Snowplow so that
companies could own their customer event data without a
huge data engineering effort
• When we started Snowplow, we thought we would spend
6 months building a data pipeline and then get back to
data analytics...
• ... 7 years later, we are still building event pipelines :-)
• Customer base of 150 worldwide and large open-source
community of enterprises and high-growth startups
• Snowplow is designed from the ground up for data teams
(data scientists, data engineers, business analysts)
DATA TEAM
CDO
Data lifecycle
Framing: how we think about the software landscape*
Systems of
intelligence
Systems of
record
* “Four types of system” taxonomy comes from Satya Nadella, Microsoft CEO (source)
which is in turn an unbundling of Jerry Chen’s “three types of system” taxonomy (source)
• Control end-user interactions – e.g. ad tech,
support desks, marketing automation
• AI and machine learning platforms
• Web analytics, mobile analytics, product analytics
• Event data pipelines
• Cloud data integration providers
• IoT platforms
• CRM, HR/HCM, ERP/Financials
• Powers a critical business function
Systems of
engagement
Systems of
observation
Framing: how we think about data maturity
A data team’s
hierarchy of
needs
Random forest
Let’s apply Maslow’s Hierarchy of Needs to the data team
Data scientists are doing industry-leading work
Leadership believes it is running a “data company”
Company is structured for data success
Data is high quality
Data is available
Data is
available
“Data is available” sounds obvious – but if it’s not true, then
you will not be doing much data science
Data collection is the foundation
of the data value chain
• Before you can drive value from data,
you need an accurate, comprehensive
data set to work with, otherwise:
• Collecting data is hard:
• Multiple sources of data e.g. web,
mobile, email, social
• Sources are often evolving /
breaking their data “contracts”
There are two “types” of data
companies need to collect
1. “Real-time event data”
• Describes what is happening as it is
happening
• Includes web site, mobile data, call
center, email, in-store etc.
2. Slowly-evolving data
• Found in operational databases or
behind APIs
• Includes product catalogues, CRM,
content databases
Data collection is a solved
problem in 2019
• A wide variety of commercial and open-
source systems of observation exist to
capture real-time event data and/or
slowly-evolving data
• And remember that you don’t have to
capture “all the things”: there is plenty of
duplicated signal across a company. You
just need the original “signal”
Solved problem 1: collecting your real-time event data
from websites, mobile apps, ESPs etc
Tag managers / Customer Data Platforms (CDPs)
• Integrate one SDK in
web / mobile app ->
send data to many
destinations e.g.
marketing providers
• Data warehouse one of
many categories of
destination
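The "one SDK, many destinations" pattern above can be sketched in a few lines of Python. This is only an illustration of the fan-out idea, not how any particular tag manager or CDP is implemented; the `Tracker` class, its method names and the destination names are all hypothetical.

```python
from typing import Callable

Event = dict
Destination = Callable[[Event], None]


class Tracker:
    """Hypothetical single SDK entry point that fans each event out to every destination."""

    def __init__(self) -> None:
        self._destinations: dict[str, Destination] = {}

    def register(self, name: str, destination: Destination) -> None:
        """Add a destination, e.g. a marketing provider or a data warehouse loader."""
        self._destinations[name] = destination

    def track(self, event: Event) -> dict[str, bool]:
        """Send the event to all destinations and return a per-destination delivery status."""
        results: dict[str, bool] = {}
        for name, send in self._destinations.items():
            try:
                send(event)
                results[name] = True
            except Exception:
                # One failing destination should not block delivery to the others.
                results[name] = False
        return results
```

In this sketch the data warehouse is just one more registered destination, which mirrors the point that it is one of many categories of destination.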
Real-time event data pipelines
• Available open-source
or running in your
cloud as a managed
service
• Focus on data quality,
data richness and
flexibility
• Events are available in
real-time
Solved problem 2: collecting your slowly-evolving data
from operational databases, SaaS platforms etc
Next gen ETL/ELT-as-a-service
• SaaS ETL providers aka cloud data
integrators aka iPaaS
• Specialize in warehousing data
from third party APIs and
databases
• Amortize the cost of maintaining
hundreds of connectors to
unreliable source systems across
large customer bases
A warning about your data engineers building this themselves
• Sometimes data engineers like to break out the tools and build
data pipelines from scratch:
• “The pre-built solution is too expensive, I can build this in a week”
• “The pre-built solution doesn’t understand all the specifics of our
business, which is not like any other business on the planet”
• “If we use a pre-built solution, we will be locked into that vendor”
• Try to dissuade your data engineers from wasting their time on
this – you need to keep your data engineers available for the
much harder problems coming in the next section :-)
Data is high
quality
Data quality is a really tough problem throughout the data lifecycle –
you will be leaning on your data engineers here
Store
Socialize
Activate
Create / collect
Expire / delete
Data lifecycle
By data quality we are not just talking about data completeness and
cleanliness – it’s much broader
Making sure the data is
complete and correct
Making sure the data is
semantically understood
Making sure the data is
regulatory compliant
• Identify, report and recover
data which doesn’t comply with
its schema
• Report on anomalies in the
data
• Deliver and maintain a single
source of truth for the data
• Create a unified semantic
layer explaining the data,
making clear what data and
derivations are available
• Clarify what assumptions
have been made in
processing the data –
tracking data lineage
• Ensure usage of data is
consistent with basis of
collection and changing data
subject preferences
• Make it easy to demonstrate
compliance
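As a rough illustration of "identify, report and recover" for data that fails its schema, here is a minimal Python sketch. It assumes a deliberately simple schema format (field name mapped to a required Python type); `PAGE_VIEW_SCHEMA`, `validate` and `split_events` are hypothetical names, and a real pipeline would more likely rely on a dedicated schema technology.

```python
from typing import Any

# Hypothetical schema for illustration: field name -> required Python type.
PAGE_VIEW_SCHEMA: dict[str, type] = {
    "event_id": str,
    "user_id": str,
    "page_url": str,
    "timestamp_ms": int,
}


def validate(event: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of schema violations; an empty list means the event is valid."""
    errors = []
    for field, expected in schema.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            actual = type(event[field]).__name__
            errors.append(
                f"wrong type for {field}: expected {expected.__name__}, got {actual}"
            )
    return errors


def split_events(events: list[dict[str, Any]], schema: dict[str, type]):
    """Separate valid events from failing ones, keeping each failure with its reasons."""
    good, bad = [], []
    for event in events:
        errors = validate(event, schema)
        if errors:
            # Keep the original payload so the event can be fixed and replayed later.
            bad.append({"event": event, "errors": errors})
        else:
            good.append(event)
    return good, bad
```

Keeping the failed events alongside their error messages, rather than dropping them, is what makes both the reporting and the recovery steps possible.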
Some of the problems your data team will be grappling with:
Build a common language between the data team
and the rest of the organisation
Company is
structured for
data success
When I speak to Heads of Data and CDOs, this is the biggest
problem they are grappling with
DATA TEAM
Operationalizing the work
• Moving out of the “lab environment”
and getting results in actual
operational systems
• Dealing with differences in data
sources, data processing
• Handling conflicts with existing
operational rulesets
“Selling” the work to other teams
• Convincing the control-freak CEO that
the algorithm is working
• Convincing a team that the outcomes of
data science will make their lives easier –
it’s not just an integration chore
• Helping other teams understand that
their work will change as automated
decisioning comes in
Handling dependencies on other
teams
• Moving out of the MacGyver mode,
having to rely on other teams to get
things done (e.g. event tracking
instrumentation)
• Fitting into those teams’
agile/scrum/etc processes
Learning to let go
• Build, Operate, Transfer
• Understanding that business users
have insights and enhancements
that they can make to your work
Data insights and science
Working to discover insights, build and
test models and then make sure that
that work has impact in the business
CDO
You don’t need one culture for successful data science – you need
an evolving culture with different practices
• I like the analogy of starting with a data “MacGyver” and
then migrating to a data “A-Team”
• This is similar to Simon Wardley’s analogy of needing
Pioneers, Settlers and Town Planners[1] at various stages
of your product development process
• You will need different structures at different times - stay
flexible and keep adapting to the needs of your company
[1] https://blog.gardeviance.org/2015/03/on-pioneers-settlers-town-planners-and.html
If the interface between your data team(s) and the rest
of the company starts breaking down, change it!
Leadership
believes it is
running a
“data
company”
Advances in data technology are transforming the ways companies
do business
The world is changing…
• Digital platforms make it
possible to collect much more
data than ever before about
how companies engage with
individual users, and provide
users with a personalised
experience
• AI and other advances in
analytic technology mean
more insight can be driven
from that data
• Real-time data processing
means the data can be
“activated” in real-time
...creating opportunities
• Data enables companies to
compete: those companies
that use data as a strategic
asset, to best understand their
users, are able to drive
sustained competitive
advantage
...and challenges
• Executing on these
opportunities is hard: requires
strategic and operational
(process, culture, people,
technology) aspects
• Data-enabled competitors are
a threat – every industry has
these challengers now
• Data poses a significant
liability as well as opportunity
(e.g. GDPR)
Companies have to adapt
• Rise of the Chief Data Officer
(CDO): identify and execute on
opportunities to use data to
drive strategic value across
the business
• Rise of the data team under
the CDO. Responsible for:
• Systematically growing a
company’s data set and
capability to use that data
to drive value
• Empowering other teams
with data
• Managing data liability /
compliance
How do you tell if leadership is truly betting on data, or is
just playing Big Data Bingo?
Committing to change
• Is data science quarantined in an innovation lab, or is it in a disruptively central position?
• How many hops/gatekeepers from the Head of Data to the CEO and Board?
Publicising concrete wins
• Is the company talking to the market in generalities about “becoming a data company”, or is it
calling out specific victories which were driven by the data team?
Investing
• The hardest to fake! Is the company investing in growing team and technology?
• Is the company investing in training and upskilling the existing team members?
Engaging with the ethics / CSR
• Is the leadership team seriously grappling with the ethical dimension of your data work?
• The implosion of Google’s AI Ethics Board is a cautionary tale – perils of PR-driven approach
Data scientists
are doing
industry-
leading work
If all the lower “floors” are in place, hopefully you are now
empowered to do industry-leading work
What does an industry-leading environment look like? Some suggestions:
Publishing, writing and
training
• Publishing in data science
journals
• Technical blog posts,
tutorials and technical
reports
• Educating the wider
company on what the
data team is doing and
how they can help
Beating the competition
• Gaining significant
competitive advantage in
your market
• Winning industry awards
for innovative or
breakthrough use of data
• Tension with publishing –
suddenly your employer
sees your team’s work as
“secret sauce”!
Investing and scaling
• Investing in growing the
team
• Investing in the best tools
and processes to support
the team
• Investing in you – making
sure your personal
development keeps you at
the company and
maintains your edge
Conclusion
Thank you! Questions?
• I always love talking to data scientists and the
rest of the data team – you can reach me on:
alex@snowplowanalytics.com
@alexcrdean
• And huge thanks to Data Science Festival:
#DataScienceFest
@datasciencefest
@SnowplowData
snowplowanalytics.com
© 2018 Snowplow Analytics Ltd.

Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 

What makes an effective data team?

  • 1. ALEX DEAN What makes an effective data team? @alexcrdean
  • 3. Introducing me, Alex Dean • CEO and co-founder at Snowplow Analytics[1], the company behind Snowplow, the open source event data pipeline[2] • Our mission at Snowplow is to help companies make better decisions • I have been at different stages of my working life a data engineer and a business analyst, but never a data scientist! • Weekend writer of Event Streams in Action (Manning)[3] [1] https://snowplowanalytics.com [2] https://github.com/snowplow/snowplow [3] https://www.manning.com/books/event-streams-in-action @datasciencefest @SnowplowData
  • 4. Snowplow is a real-time event data pipeline designed for the data team • My co-founder Yali and I created Snowplow so that companies could own their customer event data without a huge data engineering effort • When we started Snowplow, we thought we would spend 6 months building a data pipeline and then get back to data analytics... • ... 7 years later, we are still building event pipelines :-) • Customer base of 150 worldwide and large open-source community of enterprises and high-growth startups • Snowplow is designed from the ground up for data teams (data scientists, data engineers, business analysts)
  • 5. Framing: how we think about the software landscape*
      • Systems of engagement: control end-user interactions, e.g. ad tech, support desks, marketing automation
      • Systems of intelligence: AI and machine learning platforms
      • Systems of observation: web analytics, mobile analytics, product analytics; event data pipelines; cloud data integration providers; IoT platforms
      • Systems of record: CRM, HR/HCM, ERP/Financials; each powers a critical business function
      * “Four types of system” taxonomy comes from Satya Nadella, Microsoft CEO (source), which is in turn an unbundling of Jerry Chen’s “three types of system” taxonomy (source)
  • 6. Framing: how we think about data maturity
  • 7. A data team’s hierarchy of needs Random forest
  • 8. Let’s apply Maslow’s Hierarchy of Needs to the data team (levels listed from the top of the pyramid down):
      • Data scientists are doing industry-leading work
      • Leadership believes it is running a “data company”
      • Company is structured for data success
      • Data is high quality
      • Data is available
  • 10. “Data is available” sounds obvious, but if it’s not true, then you will not be doing much data science
      • Data collection is the foundation of the data value chain
          • Before you can drive value from data, you need an accurate, comprehensive data set to work with, otherwise:
          • Collecting data is hard:
              • Multiple sources of data e.g. web, mobile, email, social
              • Sources are often evolving / breaking their data “contracts”
      • There are two “types” of data companies need to collect
          • 1. “Real-time event data”: describes what is happening as it is happening; includes web site, mobile data, call center, email, in-store etc.
          • 2. Slowly-evolving data: found in operational databases or behind APIs; includes product catalogues, CRM, content databases
      • Data collection is a solved problem in 2019
          • A wide variety of commercial and open-source systems of observation exist to capture real-time event data and/or slowly-evolving data
          • And remember that you don’t have to capture “all the things”: there is plenty of duplicated signal across a company. You just need the original “signal”
  • 11. Solved problem 1: collecting your real-time event data from websites, mobile apps, ESPs etc
      • Tag managers / Customer Data Platforms (CDPs)
          • Integrate one SDK in web / mobile app -> send data to many destinations e.g. marketing providers
          • Data warehouse is one of many categories of destination
      • Real-time event data pipelines
          • Available open-source or running in your cloud as a managed service
          • Focus on data quality, data richness and flexibility
          • Events are available in real-time
  • 12. Solved problem 2: collecting your slowly-evolving data from operational databases, SaaS platforms etc Next gen ETL/ELT-as-a-service • SaaS ETL providers aka cloud data integrators aka iPaaS • Specialize in warehousing data from third party APIs and databases • Amortize the cost of maintaining hundreds of connectors to unreliable source systems across large customer bases
  • 13. A warning about your data engineers building this themselves • Sometimes data engineers like to break out the tools and build data pipelines from scratch: • “The pre-built solution is too expensive, I can build this in a week” • “The pre-built solution doesn’t understand all the specifics of our business, which is not like any other business on the planet” • “If we use a pre-built solution, we will be locked into that vendor” • Try to dissuade your data engineers from wasting their time on this: you need to keep your data engineers available for the much harder problems coming in the next section :-)
  • 15. Data quality is a really tough problem throughout the data lifecycle; you will be leaning on your data engineers here. Data lifecycle: Create / collect -> Store -> Socialize -> Activate -> Expire / delete
  • 16. By data quality we are not just talking about data completeness and cleanliness; it’s much broader. Some of the problems your data team will be grappling with:
      • Making sure the data is complete and correct
          • Identify, report and recover data which doesn’t conform to schema
          • Report on anomalies in the data
      • Making sure the data is semantically understood
          • Deliver and maintain a single source of truth for the data
          • Create a unified semantic layer explaining the data, making clear what data and derivations are available
          • Clarify what assumptions have been made in processing the data, by tracking data lineage
      • Making sure the data is compliant with regulation
          • Ensure usage of data is consistent with basis of collection and changing data subject preferences
          • Make it easy to demonstrate compliance
      • Build a common language between the data team and the rest of the organisation
  • 18. When I speak to Heads of Data and CDOs, this is the biggest problem they are grappling with
      • Data insights and science: working to discover insights, build and test models, and then make sure that work has impact in the business
      • Operationalizing the work
          • Moving out of the “lab environment” and getting results in actual operational systems
          • Dealing with differences in data sources, data processing
          • Handling conflicts with existing operational rulesets
      • “Selling” the work to other teams
          • Convincing the control-freak CEO that the algorithm is working
          • Convincing a team that the outcomes of data science will make their lives easier; it’s not just an integration chore
          • Helping other teams understand that their work will change as automated decisioning comes in
      • Handling dependencies on other teams
          • Moving out of the MacGyver mode, having to rely on other teams to get things done (e.g. event tracking instrumentation)
          • Fitting into those teams’ agile/scrum/etc processes
      • Learning to let go
          • Build, Operate, Transfer
          • Understanding that business users have insights and enhancements that they can make to your work
  • 19. You don’t need one culture for successful data science – you need an evolving culture with different practices • I like the analogy of starting with a data “MacGyver” and then migrating to a data “A-Team” • This is similar to Simon Wardley’s analogy of needing Pioneers, Settlers and Town Planners[1] at various stages of your product development process • You will need different structures at different times - stay flexible and keep adapting to the needs of your company [1] https://blog.gardeviance.org/2015/03/on-pioneers-settlers-town-planners-and.html If the interface between your data team(s) and the rest of the company starts breaking down, change it!
  • 20. Leadership believes it is running a “data company”
  • 21. Advances in data technology are transforming the ways companies do business
      • The world is changing…
          • Digital platforms make it possible to collect much more data than ever before about how companies engage with individual users, and provide users with a personalised experience
          • AI and other advances in analytic technology mean more insight can be driven from that data
          • Real-time data processing means the data can be “activated” in real-time
      • ...creating opportunities
          • Data enables companies to compete: companies that use data as a strategic asset to best understand their users are able to drive sustained competitive advantage
      • ...and challenges
          • Executing on these opportunities is hard: it requires strategic and operational (process, culture, people, technology) aspects
          • Data-enabled competitors are a threat: every industry has these challengers now
          • Data poses a significant liability as well as opportunity (e.g. GDPR)
      • Companies have to adapt
          • Rise of the Chief Data Officer (CDO): identify and execute on opportunities to use data to drive strategic value across the business
          • Rise of the data team under the CDO. Responsible for:
              • Systematically growing a company’s data set and capability to use that data to drive value
              • Empowering other teams with data
              • Managing data liability / compliance
  • 22. How do you tell if leadership is truly betting on data, or is just playing Big Data Bingo?
      • Investing
          • The hardest to fake! Is the company investing in growing team and technology?
          • Is the company investing in training and upskilling the existing team members?
      • Committing to change
          • Is data science quarantined in an innovation lab, or is it in a disruptively central position?
          • How many hops/gatekeepers from the Head of Data to the CEO and Board?
      • Publicising concrete wins
          • Is the company talking to the market in generalities about “becoming a data company”, or is it calling out specific victories which were driven by the data team?
      • Engaging with the ethics / CSR
          • Is the leadership team seriously grappling with the ethical dimension of your data work?
          • The implosion of Google’s AI Ethics Board is a cautionary tale about the perils of a PR-driven approach
  • 24. If all the lower “floors” are in place, hopefully you are now empowered to do industry-leading work. What does an industry-leading environment look like? Some suggestions:
      • Publishing, writing and training
          • Publishing in data science journals
          • Technical blog posts, tutorials and technical reports
          • Educating the wider company on what the data team is doing and how they can help
      • Beating the competition
          • Gaining significant competitive advantage in your market
          • Winning industry awards for innovative or breakthrough use of data
          • Tension with publishing: suddenly your employer sees your team’s work as “secret sauce”!
      • Investing and scaling
          • Investing in growing the team
          • Investing in the best tools and processes to support the team
          • Investing in you: making sure your personal development keeps you at the company and maintains your edge
  • 26. Thank you! Questions? • I always love talking to data scientists and the rest of the data team. You can reach me on: alex@snowplowanalytics.com @alexcrdean • And huge thanks to Data Science Festival: #DataScienceFest @datasciencefest @SnowplowData
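
As an illustration of the first data quality task on slide 16 (identify, report and recover data which doesn't comply to schema), here is a minimal sketch in plain Python. It is not taken from the talk and is not how Snowplow implements validation; the schema, the field names and the helper functions `validate` and `split_events` are all hypothetical. The idea is simply that failing events are set aside together with the reason they failed, rather than silently dropped, so they can be reported on and reprocessed later.

```python
from typing import Any

# Hypothetical schema for a page-view event: field name -> required Python type.
PAGE_VIEW_SCHEMA: dict[str, type] = {
    "event_id": str,
    "user_id": str,
    "page_url": str,
    "timestamp_ms": int,
}


def validate(event: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    for field, expected in schema.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            actual = type(event[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


def split_events(events: list[dict[str, Any]], schema: dict[str, type]):
    """Separate conforming events from failing ones, keeping the failure reasons."""
    good, bad = [], []
    for event in events:
        problems = validate(event, schema)
        if problems:
            # Keep the original payload so the event can be fixed and replayed.
            bad.append({"event": event, "problems": problems})
        else:
            good.append(event)
    return good, bad


if __name__ == "__main__":
    sample = [
        {"event_id": "e1", "user_id": "u1", "page_url": "/home", "timestamp_ms": 1},
        {"event_id": "e2", "user_id": "u1", "page_url": "/home", "timestamp_ms": "1"},
    ]
    good, bad = split_events(sample, PAGE_VIEW_SCHEMA)
    print(f"{len(good)} good, {len(bad)} bad")
    for row in bad:
        print(row["problems"])
```

Counting the failure reasons in the `bad` list over time would also give a crude starting point for the "report on anomalies" task on the same slide.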