Data + Data Scientists ≠ Money
Dr. David Hoyle
My background
PRODUCING DATA SCIENCE
• 20 yrs in academia
• dunnhumby
• dunnhumby
APPLYING DATA SCIENCE
• Lloyds Banking Group
• AutoTrader UK
• InfinityWorks
The challenges in applying Data Science are very different
Data Science Doesn’t Work
How did we get here?
Your Company Inc.
Cultural and organizational challenges are always harder than technical challenges
• Which parts of the business?
• How should we organize?
• How should we work?
• How should we communicate?
• What support do we need?
• What data should we use?
• SVM vs Logistic Regression?
Where do companies need Data Science?
Data touchpoints
[Diagram labels: Bad guys, Account Manager, Marketing, Developers, OEMs, Private seller, Finance, Car Dealer, Consumer, External Data, Internal Data, money (£, €, $), £ Trade, £ Retail]
‘Why?’ is a powerful Data Science tool
‘We need a neural network’
Why?
‘We want to predict if users will click this link’
Why?
‘Not clicking indicates low user engagement’
How will you consume the outputs?
‘We can alter the content in session if engagement is low’
Can you respond to the neural network output fast enough?
‘Hmmm… No.’
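The question of whether the output can be acted on fast enough can be checked before any modelling starts. The following is a minimal sketch, not something from the talk: `predict`, `inputs`, and `budget_ms` are hypothetical stand-ins for a candidate model, a sample of requests, and the time available to alter content within a session.

```python
import statistics
import time


def latency_percentile_ms(predict, inputs, percentile=95):
    """Time predict() on each input and return a latency percentile in ms.

    `predict` is a hypothetical stand-in for whatever model is proposed.
    At least two inputs are required.
    """
    timings = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
    return statistics.quantiles(timings, n=100)[percentile - 1]


def fits_budget(predict, inputs, budget_ms, percentile=95):
    """True if the chosen latency percentile is within the assumed budget."""
    return latency_percentile_ms(predict, inputs, percentile) <= budget_ms
```

If `fits_budget` returns False for a realistic budget, the ‘Hmmm… No.’ answer arrives before the model is built rather than after.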
Build cross-functional teams
Data Scientist ≠ Data Engineer
[Diagram: Data Science, Data Engineering, Product]
Get close to the business
• Analytics Team: Data Scientists + Data Analysts
• Analysts: Product Area A, Product Area B, Product Area C, Business Area 1, Business Area 2
Does Agile always work for Data Science?
All parts of Data Science have outputs
INFORMATION PRESENTATION
• Easier to communicate outputs
• Easier to communicate progress
ALGORITHM DEVELOPMENT
• Harder to communicate outputs
• Harder to communicate progress
Always communicate what the outputs will be
The Data Science innovation lifecycle is longer than you think
1. Understand business problem
2. Map to appropriate abstraction
3. Mathematical statement of abstraction
4. Identify type of mathematical model required
5. Identify & explore potential data sources
6. Build, validate, & test model, e.g. CRISP-DM
7. Productionize model
8. Deploy production model artefacts
9. Consume model outputs
10. Monitor production model
11. Re-build production model
12. Improve model
Phases
• Understand & conceptualize the problem
• Understand resources available & build model
• Incorporate model into business process
• Monitor & improve
[Diagram labels: steps are marked as Data Science or Data Engineering]
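The ‘Monitor production model’ and ‘Re-build production model’ steps can be illustrated with a small sketch. This is an assumption-laden example rather than the speaker's method: the `AccuracyMonitor` class, its `baseline`, `tolerance`, and `window` parameters, and the choice of accuracy as the metric are all hypothetical.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent `window` labelled predictions.

    Hypothetical helper: flags a re-build when rolling accuracy falls more
    than `tolerance` below the `baseline` measured at validation time.
    """

    def __init__(self, baseline, tolerance=0.05, window=500):
        self.baseline = baseline
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # oldest outcomes drop off

    def record(self, predicted, actual):
        """Store whether one production prediction matched the true outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Rolling accuracy, or None if nothing has been recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_rebuild(self):
        """True once rolling accuracy is below baseline minus tolerance."""
        acc = self.accuracy()
        return acc is not None and acc < self.baseline - self.tolerance
```

Even a check this simple needs true outcomes to flow back from the business process, which is part of why the lifecycle extends well past deployment.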
Data & Compute should be close together
[Diagram labels: Operational; Operational + Data Warehouse; SQL, BI]
Not all data is valuable
Either
• Your data is valuable to you – e.g. helps improve business processes
Or
• Your data is valuable to someone else – e.g. gives a market-wide view
To make Data Science pay you need to
1. Work on the projects with direct P&L impact
2. …..by asking the right business questions up-front
3. …..using teams that have the right technical skills
4. …..and understand the business challenges
5. …..using Agile methodologies where appropriate
6. …..always communicating what you are doing and why
7. …..with the right tools and on the right data