Common Data Driven Mistakes
Promotable Presentation
Jeanette Shutay, PhD
Senior Director, Advanced Analytics
February 5, 2020
HAVI | Confidential & Proprietary | 2/13/2020 | 2
Current Professional Contributions
• Advanced Analytics Center of Excellence Lead at HAVI
• Adjunct Professor for NCU School of Technology
Academic Preparation
• BA in Psychology
• MA in Developmental Psychology
• PhD in Research Methodology
• Student in AIP program
• Student to start in GIS program
Personal Interests
• Family & pets
• Volunteering
• I love animals!!
• Jogging & yoga
• Sports
My son Brett (16)
My son Brendon (12)
Data Solutions Lifecycle
Key Concepts
• Stakeholders
• Value proposition
• Data quality & characteristics
• Interaction effects / complex
relationships
• Getting to causality
• Constraints
• Scaling
Defining the problem
- Have you correctly and thoroughly defined the problem?
• Engage domain experts early and maintain continual engagement
- Estimate and consider the value proposition associated with solving the problem
- Identify the key performance indicators and any drivers of interest
- Operationally define all variables and indicators to be measured or observed
• Start with a priori hypotheses based on subject matter expertise and
industry/academic literature
- Brainstorm with key stakeholders & involve people with diverse backgrounds & views
- Generate hypotheses to test prior to specifying data requirements
- Review and align on all assumptions
• Document business requirements
Specifying the data requirements
- Do your data meet all requirements?
• Granularity & Cadence
- Do you need daily-level data, location-level data, etc.?
- When are decisions made? Every day, every week?
• Representativeness & Fidelity
- How generalizable are the cases you are studying to the problem as a whole?
• If doing a POC or small pilot, do you have a representative set of cases?
• Are you working with cases that have a high probability of treatment fidelity?
- Example: If testing a new customer experience program, will the stores in the POC
implement the program as intended? A low-fidelity pilot can do more harm than not
testing at all.
Data preparation
- Normalizing Variables
• In many cases, you will need to standardize your variables before analysis
- Using z-scores is a good way to put variables on a common scale and avoid scale-related mistakes in modeling
- Identifying and Managing Anomalies or Outliers
• Sometimes anomalies are what you are interested in
• When anomalies or outliers are problematic, consider dropping those cases
or imputing, but watch out for errors in this approach
- Example: In time series data with high seasonality, regular seasonal peaks can look like outliers even though they are valid values
- Model assumptions
• Ensure that the characteristics of your data, and the problem you are trying
to solve, align with the model you are implementing
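As a minimal sketch of the data-preparation points above, the following standard-library Python standardizes a variable with z-scores, then shows how a recurring seasonal peak gets flagged as an outlier when scored against the whole series but not when like months are compared. The sales figures, the `z_scores` helper, and the cutoff of 2 are illustrative assumptions, not values from the presentation.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # constant series: no spread to scale
    return [(x - mu) / sigma for x in values]

# Hypothetical monthly sales for two years (January..December, twice).
# December is a recurring seasonal peak, not a data error.
sales = [100, 102, 98, 101, 99, 103, 100, 97, 102, 101, 100, 180,
         101, 99, 100, 102, 98, 100, 103, 99, 101, 100, 102, 183]

CUTOFF = 2.0  # assumed |z| threshold for flagging a value

# Naive check: score every month against the whole series.
naive_flags = [i for i, z in enumerate(z_scores(sales)) if abs(z) > CUTOFF]

# Seasonal check: score the year-over-year change for each calendar month.
yoy_change = [sales[i + 12] - sales[i] for i in range(12)]
seasonal_flags = [i for i, z in enumerate(z_scores(yoy_change)) if abs(z) > CUTOFF]

print("Flagged by naive z-score (series index):", naive_flags)
print("Flagged after comparing like months (month index):", seasonal_flags)
```

In this made-up series the naive check flags both Decembers, while the like-for-like comparison flags nothing. Dropping or imputing the December values would have removed real signal.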
Data exploration
- Go beyond univariate exploratory data analysis (EDA)
• Explore interaction effects
[Figure: two chart panels comparing green and yellow feeders]
Panel 1 conclusion: There is no difference between green and yellow feeders.
Panel 2 conclusion: There is a difference between green and yellow feeders.
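One way two such opposite conclusions can arise from the same data is an interaction effect: the groups look identical when pooled, but differ once a second factor is taken into account. The sketch below illustrates this with invented numbers. The `location` factor, the visit counts, and the `group_means` helper are hypothetical and do not come from the slide's charts.

```python
from statistics import mean

# Hypothetical observations: (feeder_color, location, bird_visits_per_day)
observations = [
    ("green", "sun", 12), ("green", "sun", 14),
    ("green", "shade", 4), ("green", "shade", 6),
    ("yellow", "sun", 5), ("yellow", "sun", 3),
    ("yellow", "shade", 13), ("yellow", "shade", 15),
]

def group_means(rows, key):
    """Average the visit count within each group defined by key(row)."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row[2])
    return {k: mean(v) for k, v in groups.items()}

# Univariate view: compare colors only.
by_color = group_means(observations, key=lambda r: r[0])

# Interaction view: compare colors within each location.
by_cell = group_means(observations, key=lambda r: (r[0], r[1]))

# Interaction contrast: does the color gap change with location?
gap_in_sun = by_cell[("green", "sun")] - by_cell[("yellow", "sun")]
gap_in_shade = by_cell[("green", "shade")] - by_cell[("yellow", "shade")]
interaction = gap_in_sun - gap_in_shade

print("Mean visits by color:", by_color)
print("Mean visits by color and location:", by_cell)
print("Interaction contrast:", interaction)
```

Here the pooled means are equal for both colors, yet the color gap reverses sign between locations, so a color-only analysis would hide a real effect.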
Causality & spurious relationships
- Causality - three conditions must exist:
• X and Y must be correlated
• X must precede Y in time
• All other rival causes must be ruled out (e.g., internal validity)
- Beware of Spurious Relationships & Rival Causes
• Example 1: You launch a promotion in March. You believe the success of your promotion
(increased sales) is due to your marketing campaign, but it is a result of a third-party cause
(increased consumer buying power due to tax refunds)
• Example 2: You launch a new crime watch campaign in December and you
see a significant decrease in crime month-over-month. The true cause is seasonality.
• Solution: Design your campaign to minimize potential rival causes. This is where including
the SME is critical.
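The seasonality trap in Example 2 can be sketched in a few lines: compare the month-over-month change in the campaign year with the same change in a year without the campaign. The incident counts below are invented for illustration, and a single prior year is a crude baseline; a real analysis would likely use several years or a comparison area.

```python
# Hypothetical monthly incident counts (not from the presentation).
# The campaign is assumed to launch in December 2019.
incidents = {
    ("2018", "Nov"): 200, ("2018", "Dec"): 160,
    ("2019", "Nov"): 198, ("2019", "Dec"): 157,
}

def pct_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100.0

# Naive reading: the drop in the campaign year.
mom_campaign_year = pct_change(incidents[("2019", "Nov")], incidents[("2019", "Dec")])

# Rival cause: the same drop happened a year earlier, with no campaign.
mom_prior_year = pct_change(incidents[("2018", "Nov")], incidents[("2018", "Dec")])

# What remains after subtracting the usual seasonal drop.
adjusted_effect = mom_campaign_year - mom_prior_year

print(f"Nov to Dec change, campaign year: {mom_campaign_year:.1f}%")
print(f"Nov to Dec change, prior year:    {mom_prior_year:.1f}%")
print(f"Change beyond the seasonal drop:  {adjusted_effect:.1f} points")
```

With these numbers the campaign year shows a drop of roughly 21%, but almost all of it also appears in the prior year, leaving less than one percentage point that could be attributed to the campaign.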
- Look for Suppressor Effects
• Example: You launch an employee training
program. You compare participants' performance at
the end of the program to the general
employee population. You find that those in
the training program had lower performance
ratings than the general population.
• Problem: You didn’t consider pre-existing
differences. You find out that those who were
selected for the program were low
performers.
• Solution: Use deltas (change from baseline to
post) and/or include control variables in your
model (prior performance, demographics, etc.).
[Chart: Employee Performance Rating (5-point scale), baseline vs. final performance, for training participants and the general population. Data labels: 1.7, 3.2, 3.5, 3.6. Slide title: Suppressor Effects]
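The "use deltas" solution can be sketched as follows. The chart's data labels (1.7, 3.2, 3.5, 3.6) are reused, but their mapping to groups is an assumption: the sketch reads them as training participants moving from 1.7 to 3.2 and the general population from 3.5 to 3.6, which fits the story in the text but cannot be confirmed without the original chart.

```python
# Assumed mapping of the chart's data labels to groups (see note above).
ratings = {
    "training participants": {"baseline": 1.7, "final": 3.2},
    "general population": {"baseline": 3.5, "final": 3.6},
}

trained = ratings["training participants"]
general = ratings["general population"]

# Misleading comparison: final ratings only.
final_gap = round(trained["final"] - general["final"], 2)

# Fairer comparison: change from baseline within each group.
deltas = {
    group: round(scores["final"] - scores["baseline"], 2)
    for group, scores in ratings.items()
}

# Difference in deltas: improvement beyond what the general population showed.
delta_gap = round(deltas["training participants"] - deltas["general population"], 2)

print("Final rating gap (trained minus general):", final_gap)
print("Change from baseline by group:", deltas)
print("Difference in deltas:", delta_gap)
```

Under this reading, the final-only comparison makes the program look harmful, while the change from baseline shows participants improving far more than everyone else. A difference in deltas is still not proof of causality; control variables or a comparison group of similar low performers would strengthen it.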
Time to value & diminishing returns
- Progress vs. Perfection
• Time to value is an important factor to consider. It is better to provide
something for the business to work with and continually improve than to wait
until you reach perfection before sharing with the business
- Data & Analytics ROI
• Know when improving the model and/or adding more external data no longer
yields a sufficient return on investment
- Cost-to-benefit analysis
- Assess forecastability
Other important considerations
- Are there specific constraints that might impact your approach?
• Example 1: Can’t recommend an alcoholic beverage, even if the customer is likely to buy one
• Example 2: Must use interpretable models
• Example 3: Must include non-significant promotions for simulation purposes
- Do you need to scale your solution?
• If you need to scale your solution, try to prototype within the same ecosystem (e.g., Azure,
Python, Spark) in which you plan to scale.
- Results often fail to replicate across different software or platforms
- Avoid using data for modeling that is not available at decision time
• Example: weather data or other data for which you have history but no future values
- Avoid data leakage when building models
• Don’t commingle model training data with model validation data
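The last two points on this slide (decision-time availability and data leakage) can be sketched together. The example keeps only features assumed to be known at decision time, splits the data in time order so that validation rows are never mixed into training, and fits the scaling statistics on the training rows alone. All names, the feature list, and the generated rows are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical feature catalogue: is the value known when the decision is made?
FEATURES = {"day_of_week": True, "promo_flag": True, "actual_temperature": False}
usable_features = [name for name, known in FEATURES.items() if known]

# Hypothetical daily records, oldest first: (day_index, feature_value)
rows = [(day, 10 + day % 7) for day in range(1, 29)]

def time_ordered_split(records, train_fraction=0.75):
    """Earlier rows train the model; later rows validate it."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]

def fit_scaler(values):
    """Learn scaling statistics from the given values only."""
    return mean(values), pstdev(values)

def apply_scaler(values, mu, sigma):
    return [(v - mu) / sigma for v in values]

train, valid = time_ordered_split(rows)

# Fit on training rows only, then reuse those statistics for validation.
mu, sigma = fit_scaler([value for _, value in train])
train_scaled = apply_scaler([value for _, value in train], mu, sigma)
valid_scaled = apply_scaler([value for _, value in valid], mu, sigma)

print("Features usable at decision time:", usable_features)
print("Training rows:", len(train), "Validation rows:", len(valid))
```

Fitting the scaler on all rows before splitting would let information from the validation period shape the training inputs, which is one common form of leakage.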
QUESTIONS?

Editor's Notes

  1. As I go through the examples, I will speak to how these aforementioned mistakes can have economic implications.
  2. We will unpack these key concepts throughout the presentation.