Reality Check: What’s the Value of
Data Science for Organizations?
Ganes Kesari
Colloquium, Apr 2020
2
DATA SCIENCE:
WHAT’S THE VALUE?
REALITY CHECK:
HOW TO THRIVE?
IT’S A RECESSION.
WHY DATA NOW?
3
INTRODUCTION
Ganes Kesari
Co-founder & Head of Analytics
“Simplify Data Science for all”
100+ Clients
Insights as Stories
@kesaritweets Help start, apply and adopt Data Science
4
DATA SCIENCE:
WHAT’S THE VALUE?
IT’S A RECESSION.
WHY DATA NOW?
REALITY CHECK: HOW
TO THRIVE?
5
Senior Data Scientist
Principal AI Storyteller
Chief Data Wizard
FEELING LUCKY? HERE’S A DATA SCIENCE TITLE GENERATOR!
Vanity keywords: Chief, Principal, Senior, Junior, Associate, Deputy, Assistant
Areas: Data, Statistical, ML, AI
Activities: Scientist, Engineer, Analyst, Designer, Developer, Storyteller, Ninja, Chef, Wrangler, Evangelist, Rock Star, Wizard, Alchemist
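The title generator on this slide can be sketched in a few lines of Python. This is only an illustration: the word lists are copied from the slide, but which words belong to which column is inferred from the example titles ("Senior Data Scientist" and so on), and the function name `generate_title` is made up for this sketch.

```python
import random

# Word lists copied from the slide. The column assignment is an assumption
# inferred from the example titles such as "Senior Data Scientist".
VANITY_KEYWORDS = ["Chief", "Principal", "Senior", "Junior",
                   "Associate", "Deputy", "Assistant"]
AREAS = ["Data", "Statistical", "ML", "AI"]
ACTIVITIES = ["Scientist", "Engineer", "Analyst", "Designer", "Developer",
              "Storyteller", "Ninja", "Chef", "Wrangler", "Evangelist",
              "Rock Star", "Wizard", "Alchemist"]


def generate_title(rng=random):
    """Pick one word from each column and join them into a job title."""
    return " ".join([
        rng.choice(VANITY_KEYWORDS),
        rng.choice(AREAS),
        rng.choice(ACTIVITIES),
    ])


if __name__ == "__main__":
    # Print a few random titles; the output differs on every run.
    for _ in range(3):
        print(generate_title())
```

With 7 vanity keywords, 4 areas and 13 activities, the lists above give 364 possible titles.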
6
BUZZWORDS AND BUSTED BUDGETS
7
THE JOURNEY FROM DATA TO DECISIONS
Maturity Phases:
Data Engineering: Data Collection, Data Storage, Data Transformation
Data Science: Reporting, Insights, Consumption
Data as ‘Culture’: Decisions
Source: Article – When and how to build out your data science team
8
THE JOURNEY FROM DATA TO DECISIONS
Maturity Phases:
Data Engineering: Data Collection, Data Storage, Data Transformation
Data Science: Reporting, Insights, Consumption
Data as ‘Culture’: Decisions
Source: Article – When and how to build out your data science team
9
REPORTING: DESCRIPTIVE SUMMARIES
2019      Boston          Chicago         Detroit         New York
Month     Price  Sales    Price  Sales    Price  Sales    Price  Sales
Jan        10.0   8.04     10.0   9.14     10.0   7.46      8.0   6.58
Feb         8.0   6.95      8.0   8.14      8.0   6.77      8.0   5.76
Mar        13.0   7.58     13.0   8.74     13.0  12.74      8.0   7.71
Apr         9.0   8.81      9.0   8.77      9.0   7.11      8.0   8.84
May        11.0   8.33     11.0   9.26     11.0   7.81      8.0   8.47
Jun        14.0   9.96     14.0   8.10     14.0   8.84      8.0   7.04
Jul         6.0   7.24      6.0   6.13      6.0   6.08      8.0   5.25
Aug         4.0   4.26      4.0   3.10      4.0   5.39     19.0  12.50
Sep        12.0  10.84     12.0   9.13     12.0   8.15      8.0   5.56
Oct         7.0   4.82      7.0   7.26      7.0   6.42      8.0   7.91
Nov         5.0   5.68      5.0   4.74      5.0   5.73      8.0   6.89
Average     9.0   7.50      9.0   7.50      9.0   7.50      9.0   7.50
Variance   10.0   3.75     10.0   3.75     10.0   3.75     10.0   3.75
Revenue numbers from four Cities
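The summary rows can be checked with a short script. The sketch below recomputes the Average and Variance rows from the monthly values in the table. It assumes the Variance row is the population variance (dividing by the number of months), which is an inference from the numbers and not something the slide states. The names `CITIES` and `summarize` are made up for this sketch.

```python
from statistics import mean, pvariance

# Monthly (Price, Sales) values per city, Jan through Nov, copied from the table.
SHARED_PRICES = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
CITIES = {
    "Boston": (SHARED_PRICES,
               [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "Chicago": (SHARED_PRICES,
                [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "Detroit": (SHARED_PRICES,
                [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "New York": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
                 [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(prices, sales):
    """Return the four descriptive statistics shown in the table's summary rows."""
    return {
        "avg_price": mean(prices),
        "avg_sales": mean(sales),
        # Assumption: the slide's Variance row is the population variance.
        "var_price": pvariance(prices),
        "var_sales": pvariance(sales),
    }


if __name__ == "__main__":
    for city, (prices, sales) in CITIES.items():
        stats = summarize(prices, sales)
        print(city, {name: round(value, 2) for name, value in stats.items()})
```

All four cities produce nearly identical averages and variances even though their monthly values differ, for example New York's single high-price month in August. A descriptive summary alone would not reveal this.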
10
INSIGHT: PREDICTING TELCO CUSTOMER CHURN
[Figure: decision tree that splits on Tenure (months: 0-12, 12-36, 36+), Data Usage > 1.5 GB (Y/N) and Bill > $65 (Y/N); leaves are 0 = Low Risk and 1 = High Risk]
• Simple Decision-tree model offered ~30% reduction in churn
• Advanced black-box models offered ~50%, but with low explainability
Source: Gramener
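A shallow decision tree is explainable because it reduces to a handful of readable rules. The sketch below uses the three split variables and thresholds named on the slide. The tree diagram itself did not survive extraction, so the mapping from branches to leaves, and the handling of the boundary values, are hypothetical and only for illustration. This is not Gramener's model.

```python
def churn_risk(tenure_months, data_usage_gb, monthly_bill):
    """Return 1 (High Risk) or 0 (Low Risk) from three simple rules.

    The split variables and thresholds come from the slide. Which branch
    ends in which leaf is a HYPOTHETICAL assignment for illustration.
    """
    if tenure_months >= 36:
        # Assumed: long-tenure customers are low risk.
        return 0
    if tenure_months < 12:
        # Assumed: new customers are judged on data usage.
        return 0 if data_usage_gb > 1.5 else 1
    # Assumed: customers with 12 to 36 months of tenure are judged on bill size.
    return 1 if monthly_bill > 65 else 0


if __name__ == "__main__":
    # Hypothetical customers, not real data.
    print(churn_risk(tenure_months=48, data_usage_gb=0.5, monthly_bill=90))
    print(churn_risk(tenure_months=6, data_usage_gb=0.5, monthly_bill=40))
```

Every prediction can be traced to at most two comparisons. That traceability is the explainability a black-box model gives up in exchange for the larger reduction in churn.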
11
INSIGHT: IDENTIFYING QUALITY OF LIFE FROM SATELLITE IMAGES
Source: https://qol.gramener.com/
12
CONSUMPTION: WHEN ARE PEOPLE BORN IN THE US?
Source: https://gramener.com/posters/Birthdays.pdf
..so, conceptions
might happen here
Very high births..
Love the Valentine’s?
Too busy holidaying?
Avoid April
Fool’s Day?
Unlucky 13th?
More births
Fewer births
13
CONSUMPTION: WHAT’S THE BIRTH PATTERN IN INDIA?
Source: https://gramener.com/posters/Birthdays.pdf
More births
Fewer births
Most births in
the first half
A striking birth pattern seen on the 5th, 10th,
15th, 20th and 25th of each month…
Very low births
Aug onwards
Why? Birthdates are ‘changed’ to
aid early school admissions
.. this is a typical
indication of fraud!
14
CONSUMPTION: DECODING SHAKESPEARE’S SONNETS
Source: https://gramener.com/shakespeare/
15
INSIGHT + CONSUMPTION: DATA STORIES FROM THE WORLD BANK
Source: World bank storytelling, by Gramener
16
DATA CULTURE: WHEN DATA DRIVES THE ENTIRE ORGANIZATION
Source: Netflix.com; Slides from InfoQ – ML Infra at Netflix
17
DATA SCIENCE:
WHAT’S THE VALUE?
IT’S A RECESSION.
WHY DATA NOW?
REALITY CHECK: HOW
TO THRIVE?
18
1. Most Data Science projects solve the wrong Problem..
Tip #1: Master the application of knowledge
19
PROFESSIONALS STRUGGLE WITH APPLYING THEIR DATA SKILLS
Scenario                                   Approaches to Apply
Just 1 week’s data (single data point!)    Use heuristics and business judgement
Data for the past 3 weeks                  Simpler techniques - moving averages, extrapolations..
6 months’ data, with moderate patterns     Statistics and simple time series forecasting techniques
2 years’ data with useful signals          Advanced techniques with time series and causal approaches
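The idea of matching the technique to the amount of data can be sketched as follows. The cut-offs in `choose_approach` (1, 26 and 104 weeks) are assumptions that roughly translate the slide's scenarios into weeks, and the weekly sales figures are made-up illustrative numbers. The moving average is the "simpler technique" the slide names for a few weeks of data.

```python
def choose_approach(weeks_of_history):
    """Map the amount of available history to the slide's suggested approach.

    The cut-offs are assumptions: 26 weeks for roughly 6 months and
    104 weeks for roughly 2 years.
    """
    if weeks_of_history <= 1:
        return "heuristics and business judgement"
    if weeks_of_history < 26:
        return "moving averages, extrapolations"
    if weeks_of_history < 104:
        return "statistics and simple time series forecasting"
    return "advanced time series and causal approaches"


def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if not history:
        raise ValueError("need at least one observation")
    recent = history[-window:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    weekly_sales = [120, 130, 125]  # hypothetical values for three weeks
    print(choose_approach(len(weekly_sales)))
    print(moving_average_forecast(weekly_sales))
```

With only three points, anything more elaborate than an average or a straight-line extrapolation would mostly be fitting noise.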
20
GUIDELINES TO APPLY DATA SCIENCE TO BUSINESS PROBLEMS
• What’s the business problem you’re solving?
• Who is your audience and what do they need?
• What data do you have and what approaches
are relevant?
• What insights are important and are they
actionable?
• Take feedback and iterate
21
AI IS COMING FOR THE DATA SCIENCE JOBS
AI and automation will
do away with most of
the grunt work in the
data science workflow
today.
Applied knowledge will
keep you relevant for
much longer.
22
2. Data Analytics needs a lot more than Data & Analytics..
Tip #2: Learn non-core skills
23
DATA SCIENCE SOLUTION: LET’S TAKE THIS EXAMPLE..
Source: World bank storytelling, by Gramener
24
..AND BREAK IT DOWN INTO THE BUILDING BLOCKS
Domain
• Business workflow
• Influencing factors
Design
• User journey
• Visuals & aesthetics
Analytics
• Impact analytics
• Clustering techniques
Development
• Frontend/backend coding
• Data transformation
Project Management
• Piecing it all together
• Change management
25
HERE ARE THE 5 ROLES & SKILLS CRITICAL FOR DATA SCIENCE
Data Translator (Domain)
• Domain expertise
• Business analysis
• Solutioning
ML Engineer (Development)
• Software engineering
• Front/back-end coding
• Data pipelining
Information Designer (Design)
• Information design
• User centered design
• Interface/visual design (parts)
Data Scientist (Analytics)
• Stats & ML
• Interpret insights
• Scripting skills
Data Science Manager (Project Management)
• Project management
• Business analysis/solutioning
• Team handling
Comic characters from Gramener Comicgen library
26
3. Data cleaning takes up a majority of time on projects..
Tip #3: Sharpen ability to handle data
27
“In data science, 80% of the time is spent preparing data,
and the other 20% on complaining about preparing the data!”
- Kirk Borne
28
BE PREPARED FOR DATA TO BE UNSUITABLE FOR ANALYSIS
Source: Kaggle Survey
Gathering and cleaning data is a critical pre-requisite for ‘meaningful’
insights
29
LEARN DATA HANDLING AND BUDGET TIME FOR IT IN YOUR WORK
Data Cleaning & Preparation
• Data deduplication
• Data standardization
• Data normalization
• Quality check
• Exploratory analysis
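A minimal sketch of the first few steps named on this slide, in plain Python. The records are hypothetical, the function names are made up for this sketch, and "normalization" is interpreted here as min-max scaling, which is one common meaning of the term and an assumption about what the slide intends.

```python
def standardize(record):
    """Standardization: trim and lower-case text, parse numbers from strings."""
    return {
        "city": record["city"].strip().lower(),
        "sales": float(record["sales"]),
    }


def deduplicate(records):
    """Deduplication: keep the first occurrence of each (city, sales) pair."""
    seen = set()
    unique = []
    for record in records:
        key = (record["city"], record["sales"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def normalize(values):
    """Normalization (assumed to mean min-max scaling): map values to [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


def quality_check(records):
    """Quality check: fail loudly on empty city names or negative sales."""
    for record in records:
        if not record["city"] or record["sales"] < 0:
            raise ValueError(f"bad record: {record}")
    return records


if __name__ == "__main__":
    # Hypothetical raw input with inconsistent formatting and one duplicate.
    raw = [
        {"city": " Boston ", "sales": "8.04"},
        {"city": "boston", "sales": "8.04"},
        {"city": "Chicago", "sales": "9.14"},
        {"city": "Detroit", "sales": "7.46"},
    ]
    clean = quality_check(deduplicate([standardize(r) for r in raw]))
    print(clean)
    print(normalize([r["sales"] for r in clean]))
```

Standardization runs before deduplication here because " Boston " and "boston" only become recognizable as the same record once the text has been cleaned.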
30
4. Technology goes obsolete faster in Data Science..
Tip #4: Learn new tools quickly
31
WHAT DOES THE DATA TOOLS LANDSCAPE LOOK LIKE?
The tool does not matter. A person’s skill with the tool does.
Pick up the ability to learn new tools rapidly
Source: https://mattturck.com/data2019/
32
EXAMPLE: WHAT ARE YOUR TOOL OPTIONS TO VISUALIZE DATA?
[Chart: data visualization tools arranged from Plug-n-play to Code-based; axis labels: Flexibility, Complexity]
Tools shown: Google Data Studio, Excel, Google Sheets, Tableau, Raw, Vismio, Datawrapper, Timeline JS, Polestar, Vega, Vega-lite, d3, matplotlib, C3, Highcharts, NVD3, Gramex, ggplot, bokeh, Plotly
Choose tools based on flexibility, your background and tool availability
33
Tip #1: Master the application of knowledge
Tip #2: Learn non-core skills
Tip #3: Sharpen ability to handle data
Tip #4: Learn new tools quickly
34
DATA SCIENCE:
WHAT’S THE VALUE?
IT’S A RECESSION.
WHY DATA NOW?
REALITY CHECK: HOW
TO THRIVE?
35
COVID-19 HAS DISRUPTED THE GLOBAL ECONOMY..
Source: McKinsey – COVID-19 Briefing materials
36
..THE US LOST ALL JOBS GAINED SINCE THE GREAT RECESSION
Source: Tax Policy Center
Over 26M jobs lost… …in just 5 weeks
Source: CNBC, Dept of Labor, Bureau of Labor Statistics
37
WHAT DOES THE RECESSION MEAN FOR JOBS IN DATA SCIENCE?
Source: McKinsey report – Lives and Livelihoods
Data jobs and specialized professions
are relatively less impacted
Industries with the lowest wages and
lowest educational attainment are hit
the hardest
38
HERE’S WHY DATA IS KEY FOR COVID-19 AND THE RECESSION
A. Public Health
• Tracking the COVID-19 patient lifecycle
• Identifying vulnerability and contact-tracing
• Predicting infection rates and spread
B. Enterprises
• Market demand & Cash flows
• Remote workforce & collaboration
• Supply chain & Logistics
C. Community
• Mapping the effectiveness of shutdown
• Understand behavioral shifts
• Address people’s concerns during COVID-19
Source: Kinsa Health weather map
Source: Gramener – Supply Chain flow
Source: Gramener – NYC 311 analysis
39
HOW DO YOU STAY RELEVANT AND GROW IN YOUR CAREER PATH?
• Do your own data projects
• Read/Write on data science
• Maintain a public portfolio
• Compete, learn & re-apply
Source: Article – How to demonstrate your passion for Data
40
Thank You!
@kesaritweets
/gkesari
gramener.com
Please help me improve the session by
answering the feedback survey that will
be sent to your email :)

What's the Value of Data Science for Organizations: Tips for Invincibility in your Data Science Career
