Setting Expectations on
Data Science Projects:
What am I doing and why am I doing it?
March 21, 2019
“Hey Kathleen,
Can you get me X data?
It should be really simple.
I just need the data.”
The game changer…
“I need the data that shows that client experience increased because of the new Preferred Call Center.”
Historical Approach:
“Here’s your data.”
“Great! Let’s spend another $7MM to double capacity.”
Better Approach:
“Benefit was driven by your talent selection, not the $7MM.”
“Let’s expand with the right people.”
1. Define the question
2. Do the work (and draw a conclusion)
3. Answer the question
“Hey Kathleen,
Can you get me X data?
It should be really simple.
I just need the data.”
Hear the question differently, and trust yourself
“Hey Kathleen,
I need your help with
something.”
Part 1: Define the question
Don’t say “yes,” but definitely don’t say “no”
“Help me understand
what you are trying to
achieve so I can better
meet your needs.”
Part 1: Define the question
Ask open-ended questions, clarify as you go, and trust your
business partner
“Ok here’s what’s really going on… But don’t say
anything because it’s confidential…”
Who else is working on this?
What other theories are there?
What’s different about this?
What did you learn last time?
What level of precision?
If we do X, couldn’t it impact Y?
Is it possible that…?
What is your hypothesis?
What is your concern?
What changed?
What did you expect?
Who asked the question?
What prompted it?
What do you think?
Part 3: Answer the question
Work backward and focus on the story
I ask two questions…
1. What’s my conclusion?
2. How do I know?
Part 3: Answer the question
Talk it out, and use conversational language for your first draft
Part 3: Answer the question
Give yourself plenty of time, and plan for multiple iterations
“I have made this letter longer than
usual, only because I have not had the
time to make it shorter.”
-- Blaise Pascal
Part 3: Answer the question
Iteration #1
Objective – Understand difference in revenue based on RDC use
Methodology/Period – Test vs. pseudo-control (Q2 to Q3)
Note – This is a historical “statement of fact”. It does not imply causality, other confounding factors could also contribute
Test/Pseudo-Control Groups – Both test (N=27k) and pseudo-control groups (N=40k) are households that were open, mobile active, and depositing checks during both periods. Clients in the test group started using RDC in the post period; clients in the pseudo-control group never utilized RDC. Test and pseudo-controls are balanced by demographics based on pre-period data
Result –
• Revenue – Households that started utilizing RDC have an increase in revenue. The increase in revenue could be driven by the increasing check deposits for the households who started utilizing RDC
• RDC Usage – Households that deposit checks through RDC as their main channel deposit fewer checks overall. No correlation between change in revenue and RDC usage is observed.
Application – Of n retail households open, mobile active, as well as depositing checks during Q3, m of them were using RDC at the time. The “value” of households who utilized remote deposit during Q3 would roughly translate to $x +/- $y
Slide callouts: Define the question · Do the work · Draw a conclusion · Answer the question
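The slides do not say how the "$x +/- $y" figure was calculated. As a rough sketch only, one common way to compare a test group with a pseudo-control group is to take each household's revenue change (Q3 minus Q2), compare the group means, and attach a normal-approximation margin. The function name, the 95% level, and all numbers below are illustrative assumptions, not the presenter's method or data.

```python
import math
import random
import statistics


def mean_change_gap(test_changes, control_changes, z=1.96):
    """Return (gap, margin) for two lists of per-household revenue changes.

    gap    = mean change in the test group minus mean change in the control group
    margin = z times the standard error of that difference (normal approximation)
    """
    gap = statistics.mean(test_changes) - statistics.mean(control_changes)
    # Standard error of a difference between two independent sample means.
    se = math.sqrt(
        statistics.variance(test_changes) / len(test_changes)
        + statistics.variance(control_changes) / len(control_changes)
    )
    return gap, z * se


# Synthetic per-household revenue changes (Q3 minus Q2), in dollars.
# Group sizes and distributions are invented for illustration only.
rng = random.Random(0)
test = [rng.gauss(12.0, 30.0) for _ in range(2_700)]
control = [rng.gauss(5.0, 30.0) for _ in range(4_000)]

gap, margin = mean_change_gap(test, control)
print(f"Revenue change gap: ${gap:.2f} +/- ${margin:.2f}")
```

As the Note on the slide says, a gap like this is a historical statement of fact. It does not by itself show that RDC caused the difference.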
Final Iteration
Households who started using RDC generated $x more revenue during Q3 compared to households that didn’t.
This is not necessarily a causal relationship – households that started using RDC also…
• Deposited more checks overall
• Had a higher level of engagement in general
RDC appears to have a slight negative impact on retention
• Lower CSAT observed in households that used RDC suggests a poor RDC experience may have contributed to client attrition
Households that utilized RDC as their main channel for check deposit had the highest satisfaction score in RDC
• These households, however, also deposit fewer checks on average
Answer the question
Slide callouts (shown three times):
1. What’s my conclusion?
2. How do I know?
In summary…
1. Define the question
   1. Hear the question differently, and trust yourself
   2. Don’t say “yes,” but definitely don’t say “no”
   3. Ask open-ended questions, clarify as you go, and trust your business partner
3. Answer the question
   1. Work backward and focus on the story
   2. Talk it out, and use conversational language for your first draft
   3. Give yourself plenty of time, and plan for multiple iterations

2019 WIA - Setting Expectations on Data Science Projects

  • 1. Setting Expectations on Data Science Projects: What am I doing and why am I doing it? March 21, 2019
  • 2. “Hey Kathleen, can you get me X data? It should be really simple. I just need the data.”
  • 3. The game changer… “I need the data that shows that client experience increased because of the new Preferred Call Center.” Historical Approach vs. Better Approach
  • 4. The game changer… “I need the data that shows that client experience increased because of the new Preferred Call Center.” Historical Approach: “Here’s your data.” “Great! Let’s spend another $7MM to double capacity.”
  • 5. The game changer… “I need the data that shows that client experience increased because of the new Preferred Call Center.” Historical Approach: “Here’s your data.” “Great! Let’s spend another $7MM to double capacity.” Better Approach: “Benefit was driven by your talent selection, not the $7MM.” “Let’s expand with the right people.”
  • 6. “Hey Kathleen, can you get me X data? It should be really simple. I just need the data.” Three steps: 1. Define the question 2. Do the work (and draw a conclusion) 3. Answer the question
  • 7. Part 1: Define the question. Hear the question differently, and trust yourself. “Hey Kathleen, I need your help with something.”
  • 8. Part 1: Define the question. Don’t say “yes,” but definitely don’t say “no.” “Help me understand what you are trying to achieve so I can better meet your needs.”
  • 9. Part 1: Define the question. Ask open-ended questions, clarify as you go, and trust your business partner. “Ok here’s what’s really going on… But don’t say anything because it’s confidential…” Who else is working on this? What other theories are there? What’s different about this? What did you learn last time? What level of precision? If we do X, couldn’t it impact Y? Is it possible that…? What is your hypothesis? What is your concern? What changed? What did you expect? Who asked the question? What prompted it? What do you think?
  • 10. Part 3: Answer the question. Work backward and focus on the story. I ask two questions… 1. What’s my conclusion? 2. How do I know?
  • 11. Part 3: Answer the question. Talk it out, and use conversational language for your first draft.
  • 12. Part 3: Answer the question. Give yourself plenty of time, and plan for multiple iterations. “I have made this letter longer than usual, only because I have not had the time to make it shorter.” (Blaise Pascal)
  • 13–16. Iteration #1 (built up across four slides, adding the labels Define the question, Do the work, Draw a conclusion, and Answer the question one at a time)
      • Objective – Understand difference in revenue based on RDC use
      • Methodology/Period – Test vs. pseudo-control (Q2 to Q3)
      • Note – This is a historical “statement of fact”. It does not imply causality, other confounding factors could also contribute
      • Test/Pseudo-Control Groups – Both test (N=27k) and pseudo-control groups (N=40k) are households that were open, mobile active, and depositing checks during both periods. Clients in the test group started using RDC in the post period; clients in the pseudo-control group never utilized RDC. Test and pseudo-controls are balanced by demographics based on pre-period data
      • Result, Revenue – Households that started utilizing RDC have an increase in revenue. The increase in revenue could be driven by the increasing check deposits for the households who started utilizing RDC
      • Result, RDC Usage – Households that deposit checks through RDC as their main channel deposit fewer checks overall. No correlation between change in revenue and RDC usage is observed.
      • Application – Of n retail households open, mobile active, as well as depositing checks during Q3, m of them were using RDC at the time. The “value” of households who utilized remote deposit during Q3 would roughly translate to $x +/- $y
  • 17–20. Final Iteration (Answer the question; across the build, each finding is annotated with “1. What’s my conclusion? 2. How do I know?”)
      • Households who started using RDC generated $x more revenue during Q3 compared to households that didn’t. This is not necessarily a causal relationship: households that started using RDC also deposited more checks overall and had a higher level of engagement in general.
      • RDC appears to have a slight negative impact on retention. Lower CSAT observed in households that used RDC suggests a poor RDC experience may have contributed to client attrition.
      • Households that utilized RDC as their main channel for check deposit had the highest satisfaction score in RDC. These households, however, also deposit fewer checks on average.
  • 21. In summary…
      • 1. Define the question: (1) Hear the question differently, and trust yourself. (2) Don’t say “yes,” but definitely don’t say “no.” (3) Ask open-ended questions, clarify as you go, and trust your business partner.
      • 3. Answer the question: (1) Work backward and focus on the story. (2) Talk it out, and use conversational language for your first draft. (3) Give yourself plenty of time, and plan for multiple iterations.
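
The test vs. pseudo-control comparison in Iteration #1 (slides 13–16) could be sketched roughly as follows. This is a minimal illustration, not the analysis behind the slides: the `Household` record, its field names, and the `revenue_lift` function are hypothetical, the sample figures are made up, and the 1.96 normal-approximation margin is an assumption about how a “$x +/- $y” range might be produced. As the slides note, a difference like this is a statement of fact about the two groups and does not by itself imply causality.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, variance


@dataclass
class Household:
    """Hypothetical record for one household (field names are assumptions)."""
    started_rdc: bool    # True = test group, False = pseudo-control group
    revenue_pre: float   # revenue in the pre period (e.g. Q2)
    revenue_post: float  # revenue in the post period (e.g. Q3)


def revenue_lift(households):
    """Return (lift, margin).

    lift   = mean revenue change in the test group minus the mean revenue
             change in the pseudo-control group.
    margin = rough 95% margin of error from a normal approximation
             (an assumption made for this sketch).
    """
    test = [h.revenue_post - h.revenue_pre for h in households if h.started_rdc]
    ctrl = [h.revenue_post - h.revenue_pre for h in households if not h.started_rdc]
    if len(test) < 2 or len(ctrl) < 2:
        raise ValueError("need at least two households in each group")

    lift = mean(test) - mean(ctrl)
    # Standard error of a difference between two independent sample means.
    std_err = sqrt(variance(test) / len(test) + variance(ctrl) / len(ctrl))
    return lift, 1.96 * std_err


if __name__ == "__main__":
    # Made-up toy data, only to show the call pattern.
    sample = [
        Household(True, 100.0, 110.0),
        Household(True, 100.0, 120.0),
        Household(True, 100.0, 130.0),
        Household(False, 100.0, 100.0),
        Household(False, 100.0, 105.0),
        Household(False, 100.0, 110.0),
    ]
    lift, margin = revenue_lift(sample)
    print(f"Revenue lift: ${lift:.2f} +/- ${margin:.2f}")
```

Subtracting the pseudo-control group's change removes movement that both groups share between the two periods, which is why the comparison uses changes rather than post-period revenue alone.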