You don’t have to be a Data
Scientist to do Data Science
@carmenmardiros (not a data scientist)
“Sexiest job of the 21st
century”
Why do I, a mere analyst, care?
The appeal of Data Science (for me as an analyst)
Increase
confidence
My own and others’ in my analyses as the complexity
of data and business ecosystem increases.
Become more
productive
Speed up the analysis cycle from exploration to
hypothesis to experimentation.
Add value in
new ways
As the business and technology landscape changes.
Operationalise analysis outcomes as data products.
“It’s just not for me...”
“I don’t have a degree in statistics or programming.”
Predictive Analytics Summit 2013
No confidence to attend the sessions.
Worried I would not understand the content.
Worried I’d be spotted as a fraud.
Predictive Analytics Summit 2016 (3m into my data science foray)
Understood much of the content and terminology.
Had already thought of the questions others asked.
I knew more than I thought I did.
Myth: Doing data science requires a PhD/going back to school.
Truth: Doing data science requires enthusiasm and confidence in ourselves.
Myth: Can’t do data science until you can write an algorithm.
Truth: Can and should do data science once we’ve conceptually understood how and why the algorithm works.
Myth: Bottom-up is the only way.
Truth: Top-down works. Provide value, learn as you go.
Adapt. Grow. Stay relevant.
Digital Analytics is changing fast
Increasingly
scientific
approaches
Essential as we move towards prescriptive analytics
at speed.
Become familiar
with data
science toolkit
We will be key to bridging the gap between PhDs,
machines and management.
May even use it ourselves for our day-to-day work.
Future-proof
ourselves
MS Office for Machine Learning: coming soon to a
cloud near you.
3 Transformative
Data Science techniques
#1 Resampling
The Bootstrap
Number of observations: 100
Sample is representative (to the best of
our knowledge).
Observed mean: 17.54 months
Draw 100 random resamples of the data, each the same
size as the original and drawn with replacement.
Calculate the mean of each one:
[17.61, 16.21, 17.13, 14.08, 19.58 … ] # 100 means
Plot all the means, their 2.5th and 97.5th
percentiles and the original observed mean.
Bootstrap is extremely versatile:
● Fewer assumptions than parametric
methods.
● Can be used on any statistic.
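The bootstrap steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code behind the slides: the 100 observations are synthetic (the real data is not available), and the names `bootstrap_means` and `durations` are made up for the example.

```python
import random
import statistics

# Synthetic stand-in for the 100 observed durations in months (assumption).
data_rng = random.Random(42)
durations = [data_rng.expovariate(1 / 17.5) for _ in range(100)]
observed_mean = statistics.mean(durations)

def bootstrap_means(sample, n_resamples=100, seed=0):
    """Return the mean of each bootstrap resample of `sample`."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Draw with replacement, same size as the original sample.
        resample = rng.choices(sample, k=len(sample))
        means.append(statistics.mean(resample))
    return means

means = bootstrap_means(durations)

# n=40 gives cut points every 2.5%, so the first and last
# are the 2.5th and 97.5th percentiles.
cut_points = statistics.quantiles(means, n=40)
low, high = cut_points[0], cut_points[-1]

print(f"observed mean: {observed_mean:.2f}")
print(f"95% bootstrap interval: {low:.2f} to {high:.2f}")
```

Swapping `statistics.mean` for `statistics.median` (or any other statistic) inside the loop is all it takes to get an interval for that statistic instead. In practice more resamples, say 1,000 or more, give a steadier interval.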
Simulations & Sensitivity Analysis
Simple simulation:
Given the existing distribution of order values and a
given range of possible conversion rates, how
much £££ would we make if we doubled the
traffic to our website?
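One possible way to run such a simulation is sketched below. All inputs are hypothetical: the order values, the current traffic figure and the 1% to 3% conversion-rate range are invented for illustration, and drawing the conversion rate uniformly from that range is an assumption rather than something the slides specify.

```python
import random
import statistics

# Hypothetical inputs: past order values in £ and current traffic.
order_values = [12.0, 25.5, 40.0, 18.0, 95.0, 33.0, 27.5, 60.0]
current_sessions = 2000
doubled_sessions = 2 * current_sessions

def simulate_revenue(order_values, sessions, cr_low, cr_high, n_sims=300, seed=0):
    """Simulate total revenue for `sessions` visits, once per scenario."""
    rng = random.Random(seed)
    revenues = []
    for _ in range(n_sims):
        # Pick one plausible conversion rate for this scenario.
        conversion_rate = rng.uniform(cr_low, cr_high)
        # Each session converts (or not) independently.
        orders = sum(1 for _ in range(sessions) if rng.random() < conversion_rate)
        # Each order's value is drawn from the observed order values.
        revenues.append(sum(rng.choices(order_values, k=orders)))
    return revenues

revenues = simulate_revenue(order_values, doubled_sessions, cr_low=0.01, cr_high=0.03)
cut_points = statistics.quantiles(revenues, n=40)  # cut points every 2.5%
low, high = cut_points[0], cut_points[-1]

print(f"median revenue: £{statistics.median(revenues):,.0f}")
print(f"95% of scenarios fall between £{low:,.0f} and £{high:,.0f}")
```

The output is a range of outcomes rather than a single number, which is the point: the answer to "how much would we make?" comes with its uncertainty attached.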
Sensitivity analysis
(or how to open up black boxes):
Given a predictive model, randomly generate
new data points for each input based on
observed distributions, create predictions using
the model and interpret distribution of
outcome scenarios.
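A rough sketch of the idea follows. The `model` function is a hypothetical stand-in for whatever fitted predictive model you have, and the observed input values are invented. Inputs are sampled independently of each other, which is a simplifying assumption; real inputs are often correlated.

```python
import random
import statistics

# Hypothetical observed values for each model input.
observed = {
    "sessions": [1800, 2100, 1950, 2400, 2250],
    "conversion_rate": [0.018, 0.022, 0.020, 0.025, 0.015],
    "avg_order_value": [38.0, 42.5, 40.0, 45.0, 36.5],
}

def model(sessions, conversion_rate, avg_order_value):
    """Stand-in for a fitted predictive model (assumption)."""
    return sessions * conversion_rate * avg_order_value

def simulate_scenarios(model, observed, n_scenarios=1000, seed=0):
    """Predict outcomes for random inputs drawn from the observed values."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_scenarios):
        inputs = {name: rng.choice(values) for name, values in observed.items()}
        predictions.append(model(**inputs))
    return predictions

def one_at_a_time_spread(model, observed):
    """Vary one input over its observed values, holding the rest at their median."""
    baseline = {name: statistics.median(values) for name, values in observed.items()}
    spread = {}
    for name, values in observed.items():
        preds = [model(**{**baseline, name: value}) for value in values]
        spread[name] = max(preds) - min(preds)
    return spread

scenarios = simulate_scenarios(model, observed)
spread = one_at_a_time_spread(model, observed)

print(f"median predicted outcome: {statistics.median(scenarios):.0f}")
for name, value in sorted(spread.items(), key=lambda item: -item[1]):
    print(f"{name}: prediction moves by up to {value:.0f}")
```

The first function gives the distribution of outcome scenarios; the second shows which input the prediction is most sensitive to, which is one simple way to peek inside a black box.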
Cross Validation
Iteration  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5
1          Train   Train   Train   Train   Test
2          Train   Train   Train   Test    Train
3          Train   Train   Test    Train   Train
4          Train   Test    Train   Train   Train
5          Test    Train   Train   Train   Train
Assesses how well a predictive model generalises to unseen data.
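The fold rotation can be written out by hand to see what is going on. The sketch below is illustrative only: the data is synthetic, the model is a plain least-squares line, and the helper names (`k_fold_splits`, `fit_line`) are invented for the example. Libraries such as scikit-learn provide ready-made versions of this.

```python
import random
import statistics

def k_fold_splits(n_rows, k=5, seed=0):
    """Yield (train_idx, test_idx); each row lands in the test fold exactly once."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for n, fold in enumerate(folds) if n != i for j in fold]
        yield train_idx, test_idx

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope

# Synthetic data (assumption): a noisy straight line.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(50)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

fold_errors = []
for train_idx, test_idx in k_fold_splits(len(xs), k=5):
    # Fit on the four train folds only.
    intercept, slope = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    # Score on the held-out test fold, which the model has never seen.
    errors = [abs(ys[i] - (intercept + slope * xs[i])) for i in test_idx]
    fold_errors.append(statistics.mean(errors))

cv_mae = statistics.mean(fold_errors)
print(f"mean absolute error per fold: {[round(e, 2) for e in fold_errors]}")
print(f"cross-validated mean absolute error: {cv_mae:.2f}")
```

The spread of the per-fold errors is as informative as their average: a model whose error swings widely between folds is telling you it depends heavily on which rows it happened to see.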
Resampling
Protects you
from unsound
inference
Acknowledges and mitigates effects of variance and
noise in the data.
You already do this when you use confidence
intervals. Quantify uncertainty more often.
Paints possible
future scenarios
Leverages randomness and probability to give you
glimpses into possible future outcomes.
Embrace randomness. It's your ally on the road to prescriptive
analytics.
#2 Faceted visualisation
Segmented view, side-by-side
Outstanding tools for exploratory data analysis: Seaborn in Python and ggplot in R
#3 Feature Engineering
What?!
#3 Calculated Metrics or
Content Groupings?
Back on familiar territory.
Feature Engineering Examples
Unique content
views per user
by content type
# politics content views, # business content views
# short/long-form content views
Distribution of
content seen
per user
% politics content views in total content viewed
adjusted for uncertainty of small samples
Result: fat user-level table of attributes and
behaviour for analysis and modelling.
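As a rough illustration, here is how such a user-level table might be built from raw view events. The event rows and column names are hypothetical. The slides do not say how the small-sample adjustment was done; the sketch assumes one common option, shrinking each user's share towards the overall share, with `prior_weight` as a made-up tuning parameter.

```python
from collections import defaultdict

# Hypothetical view events: (user_id, content_id, content_type).
views = [
    ("u1", "a1", "politics"),
    ("u1", "a1", "politics"),  # repeat view, counted once
    ("u1", "a2", "business"),
    ("u2", "a3", "politics"),
    ("u3", "a2", "business"),
    ("u3", "a4", "business"),
    ("u3", "a5", "politics"),
    ("u3", "a6", "business"),
]

def build_user_table(views, prior_weight=5):
    """Return one row of engineered features per user."""
    unique = defaultdict(set)
    for user, content_id, content_type in views:
        unique[user].add((content_id, content_type))

    total_views = sum(len(items) for items in unique.values())
    total_politics = sum(1 for items in unique.values() for _, t in items if t == "politics")
    overall_share = total_politics / total_views

    table = {}
    for user, items in unique.items():
        politics = sum(1 for _, t in items if t == "politics")
        business = sum(1 for _, t in items if t == "business")
        n = len(items)
        table[user] = {
            "politics_views": politics,
            "business_views": business,
            "politics_share_raw": politics / n,
            # Shrink towards the overall share: users with few views
            # move the most, users with many views barely move.
            "politics_share_adjusted": (politics + prior_weight * overall_share)
            / (n + prior_weight),
        }
    return table

user_table = build_user_table(views)
for user, row in user_table.items():
    print(user, row)
```

User `u2` has seen a single politics article, so the raw share is 100%. The adjusted share is much closer to the overall figure, which is a more honest reading of one data point.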
Feature Engineering Examples
Infer trading
calendar
activities
from data
(for time series
analysis)
# new marketing campaigns (first date with sessions)
# new brands launched (first date with pageviews)
# voucher codes at peak redemption rate (date with
most redemptions)
# AB tests started (date with first events tracked)
# VIPs active on each date, etc
Result: fat date-level table of leading KPIs and
activities (model the ecosystem).
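The first item on that list, inferring campaign launches from the first date with sessions, could look something like the sketch below. The session rows and campaign names are hypothetical, and dates are assumed to be ISO-formatted strings so that string comparison matches date order.

```python
from collections import Counter

# Hypothetical session rows: (date, campaign).
sessions = [
    ("2016-03-01", "spring_sale"),
    ("2016-03-01", "newsletter"),
    ("2016-03-02", "spring_sale"),
    ("2016-03-03", "retargeting"),
    ("2016-03-03", "newsletter"),
]

def new_campaigns_per_date(sessions):
    """Count campaigns whose first-ever session falls on each date."""
    first_seen = {}
    for date, campaign in sessions:
        # Keep the earliest date on which each campaign had a session.
        if campaign not in first_seen or date < first_seen[campaign]:
            first_seen[campaign] = date
    launches = Counter(first_seen.values())
    all_dates = sorted({date for date, _ in sessions})
    # Include dates with no launches so the table has one row per date.
    return {date: launches.get(date, 0) for date in all_dates}

date_table = new_campaigns_per_date(sessions)
print(date_table)
```

The same "first date seen" pattern covers brand launches (first date with pageviews) and AB test starts (first date with events tracked); each one becomes another column in the date-level table.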
Feature Engineering
New ways of
capturing
underlying
phenomena
Seasoned data scientists: Feature engineering often
yields higher rewards than pushing the latest
algorithms.
You probably already do this, most likely in Excel.
It’s painful and limiting.
Your analytical creativity needs better tools.
SQL: The single most valuable tool in our toolkit.
We become self-sufficient analysts.
Resources
Inspired?
Learn Python
https://try.jupyter.org/ -- start learning Python for data science right now (no setup!).
https://learncodethehardway.org/python/
Learn Machine Learning
http://machinelearningmastery.com/
Understand how algorithms work using spreadsheets.
Top-down approach. No programming required.
Learn SQL
https://learncodethehardway.org/sql/

More Related Content

What's hot

Heroconf London 2018_Automating Search Query Processing
Heroconf London 2018_Automating Search Query ProcessingHeroconf London 2018_Automating Search Query Processing
Heroconf London 2018_Automating Search Query Processingnorisk
 
Data Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov ChainsData Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov ChainsChristopher Gutknecht
 
Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...
Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...
Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...norisk
 
Google Tag Manager for beginners
Google Tag Manager for beginnersGoogle Tag Manager for beginners
Google Tag Manager for beginnersL3analytics
 
Analytics Tools to improve Customer Insight
Analytics Tools to improve Customer InsightAnalytics Tools to improve Customer Insight
Analytics Tools to improve Customer InsightPhil Pearce
 
Google tag manager fundamentals question and answer (june 23 and july 24, 2015)
Google tag manager fundamentals question and answer (june 23 and july 24, 2015)Google tag manager fundamentals question and answer (june 23 and july 24, 2015)
Google tag manager fundamentals question and answer (june 23 and july 24, 2015)Mahendra Patel
 
Clicktale Vendor Privacy Audit (August 2013)
Clicktale Vendor Privacy Audit (August 2013)Clicktale Vendor Privacy Audit (August 2013)
Clicktale Vendor Privacy Audit (August 2013)Phil Pearce
 
Martijn Scheijbeler @ All Things DATA 2016
Martijn Scheijbeler @ All Things DATA 2016Martijn Scheijbeler @ All Things DATA 2016
Martijn Scheijbeler @ All Things DATA 2016Shuki Mann
 
Google Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & ImplementationGoogle Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & ImplementationSearch Commander, Inc.
 
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...Christopher Gutknecht
 
What to Expect from the Google Analytics Exam 2014
What to Expect from the Google Analytics Exam 2014What to Expect from the Google Analytics Exam 2014
What to Expect from the Google Analytics Exam 2014IndigoVerge
 
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)Christopher Gutknecht
 
An Introduction To Google Analytics
An Introduction To Google AnalyticsAn Introduction To Google Analytics
An Introduction To Google AnalyticsGlobal Media Insight
 
Top 10 Google Analytics tips to save you money!
Top 10 Google Analytics tips to save you money!Top 10 Google Analytics tips to save you money!
Top 10 Google Analytics tips to save you money!Phil Pearce
 
Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...
Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...
Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...Simplilearn
 
Google Analytics with an Intro to Google Tag Manager for Austin WordPress Meetup
Google Analytics with an Intro to Google Tag Manager for Austin WordPress MeetupGoogle Analytics with an Intro to Google Tag Manager for Austin WordPress Meetup
Google Analytics with an Intro to Google Tag Manager for Austin WordPress MeetupRich Plakas
 
BrightonSEO_How to create harmony between SEOs & Developers
BrightonSEO_How to create harmony between SEOs & DevelopersBrightonSEO_How to create harmony between SEOs & Developers
BrightonSEO_How to create harmony between SEOs & DevelopersSara Moccand-Sayegh
 

What's hot (20)

Heroconf London 2018_Automating Search Query Processing
Heroconf London 2018_Automating Search Query ProcessingHeroconf London 2018_Automating Search Query Processing
Heroconf London 2018_Automating Search Query Processing
 
Data Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov ChainsData Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov Chains
 
Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...
Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...
Machine Learning in PPC: How to get started today | Chris Gutknecht | Friends...
 
Google Tag Manager for beginners
Google Tag Manager for beginnersGoogle Tag Manager for beginners
Google Tag Manager for beginners
 
Analytics Tools to improve Customer Insight
Analytics Tools to improve Customer InsightAnalytics Tools to improve Customer Insight
Analytics Tools to improve Customer Insight
 
Google tag manager fundamentals question and answer (june 23 and july 24, 2015)
Google tag manager fundamentals question and answer (june 23 and july 24, 2015)Google tag manager fundamentals question and answer (june 23 and july 24, 2015)
Google tag manager fundamentals question and answer (june 23 and july 24, 2015)
 
Clicktale Vendor Privacy Audit (August 2013)
Clicktale Vendor Privacy Audit (August 2013)Clicktale Vendor Privacy Audit (August 2013)
Clicktale Vendor Privacy Audit (August 2013)
 
Martijn Scheijbeler @ All Things DATA 2016
Martijn Scheijbeler @ All Things DATA 2016Martijn Scheijbeler @ All Things DATA 2016
Martijn Scheijbeler @ All Things DATA 2016
 
Google Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & ImplementationGoogle Tag Manager - Introduction & Implementation
Google Tag Manager - Introduction & Implementation
 
Google Tag Manager - Measure Twice, Cut Once
Google Tag Manager - Measure Twice, Cut OnceGoogle Tag Manager - Measure Twice, Cut Once
Google Tag Manager - Measure Twice, Cut Once
 
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
 
What to Expect from the Google Analytics Exam 2014
What to Expect from the Google Analytics Exam 2014What to Expect from the Google Analytics Exam 2014
What to Expect from the Google Analytics Exam 2014
 
Seo Tips: Google News
Seo Tips: Google NewsSeo Tips: Google News
Seo Tips: Google News
 
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
 
An Introduction To Google Analytics
An Introduction To Google AnalyticsAn Introduction To Google Analytics
An Introduction To Google Analytics
 
PPT - Google Data Studio
PPT - Google Data StudioPPT - Google Data Studio
PPT - Google Data Studio
 
Top 10 Google Analytics tips to save you money!
Top 10 Google Analytics tips to save you money!Top 10 Google Analytics tips to save you money!
Top 10 Google Analytics tips to save you money!
 
Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...
Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...
Google Tag Manager | Google Tag Manager Tutorial 2019 | Google Tag Manager Se...
 
Google Analytics with an Intro to Google Tag Manager for Austin WordPress Meetup
Google Analytics with an Intro to Google Tag Manager for Austin WordPress MeetupGoogle Analytics with an Intro to Google Tag Manager for Austin WordPress Meetup
Google Analytics with an Intro to Google Tag Manager for Austin WordPress Meetup
 
BrightonSEO_How to create harmony between SEOs & Developers
BrightonSEO_How to create harmony between SEOs & DevelopersBrightonSEO_How to create harmony between SEOs & Developers
BrightonSEO_How to create harmony between SEOs & Developers
 

Viewers also liked

Chi squared test for digital analytics
Chi squared test for digital analyticsChi squared test for digital analytics
Chi squared test for digital analyticsPawel Kapuscinski
 
How to Analyse and Monitor the Health of Your Customer Base
How to Analyse and Monitor the Health of Your Customer BaseHow to Analyse and Monitor the Health of Your Customer Base
How to Analyse and Monitor the Health of Your Customer BaseCarmen Mardiros
 
Morphing GA into an Affiliate Analytics Monster
Morphing GA into an Affiliate Analytics MonsterMorphing GA into an Affiliate Analytics Monster
Morphing GA into an Affiliate Analytics MonsterPhil Pearce
 
Using Lifecycle Scores for Marketing Optimisation
Using Lifecycle Scores for Marketing OptimisationUsing Lifecycle Scores for Marketing Optimisation
Using Lifecycle Scores for Marketing OptimisationCarmen Mardiros
 
How to Sharpen Your Investigative Analysis with PowerPivot
How to Sharpen Your Investigative Analysis with PowerPivotHow to Sharpen Your Investigative Analysis with PowerPivot
How to Sharpen Your Investigative Analysis with PowerPivotCarmen Mardiros
 
Visitor Intent: Smart clues for understanding customer journeys
Visitor Intent: Smart clues for understanding customer journeysVisitor Intent: Smart clues for understanding customer journeys
Visitor Intent: Smart clues for understanding customer journeysCarmen Mardiros
 
Contribution Modelling using Conversion Path Coverage
Contribution Modelling using Conversion Path CoverageContribution Modelling using Conversion Path Coverage
Contribution Modelling using Conversion Path CoverageCarmen Mardiros
 
4 clicks 2 Measurement - Analytics Automation @ SuperWeek
4 clicks 2 Measurement - Analytics Automation @ SuperWeek4 clicks 2 Measurement - Analytics Automation @ SuperWeek
4 clicks 2 Measurement - Analytics Automation @ SuperWeekPhil Pearce
 
Google Data Studio - First impressions @ Measurecamp
Google Data Studio - First impressions @ MeasurecampGoogle Data Studio - First impressions @ Measurecamp
Google Data Studio - First impressions @ MeasurecampPhil Pearce
 
動的最適化の今までとこれから
動的最適化の今までとこれから動的最適化の今までとこれから
動的最適化の今までとこれからKazuki Baba
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature EngineeringHJ van Veen
 
MeasureCamp 7 Bigger Faster Data by Andrew Hood and Cameron Gray from Lynchpin
MeasureCamp 7   Bigger Faster Data by Andrew Hood and Cameron Gray from LynchpinMeasureCamp 7   Bigger Faster Data by Andrew Hood and Cameron Gray from Lynchpin
MeasureCamp 7 Bigger Faster Data by Andrew Hood and Cameron Gray from LynchpinLynchpin Analytics Consultancy
 
Machine Learning in action
Machine Learning in actionMachine Learning in action
Machine Learning in actionMichal Brys
 
Find signal in noise.
Find signal in noise.Find signal in noise.
Find signal in noise.Michal Brys
 
Proactive Measures for Good Site Health - Brighton SEO 2014
Proactive Measures for Good Site Health - Brighton SEO 2014Proactive Measures for Good Site Health - Brighton SEO 2014
Proactive Measures for Good Site Health - Brighton SEO 2014Thomas Whittam
 
MeasureCamp London - Using enhanced ecommerce for non-ecommerce websites
MeasureCamp London - Using enhanced ecommerce for non-ecommerce websitesMeasureCamp London - Using enhanced ecommerce for non-ecommerce websites
MeasureCamp London - Using enhanced ecommerce for non-ecommerce websitesYard
 
Achtung panzer
Achtung panzerAchtung panzer
Achtung panzerOdal Rune
 
A study of digital data about yourself - By Phil Pearce
A study of digital data about yourself - By Phil PearceA study of digital data about yourself - By Phil Pearce
A study of digital data about yourself - By Phil PearcePhil Pearce
 
TOP UNIVERSITIES IN US FOR MS IN DATA SCIENCE
TOP UNIVERSITIES IN US FOR MS IN DATA SCIENCETOP UNIVERSITIES IN US FOR MS IN DATA SCIENCE
TOP UNIVERSITIES IN US FOR MS IN DATA SCIENCESKILL-LYNC SUPPORT
 

Viewers also liked (20)

Chi squared test for digital analytics
Chi squared test for digital analyticsChi squared test for digital analytics
Chi squared test for digital analytics
 
How to Analyse and Monitor the Health of Your Customer Base
How to Analyse and Monitor the Health of Your Customer BaseHow to Analyse and Monitor the Health of Your Customer Base
How to Analyse and Monitor the Health of Your Customer Base
 
The Lego Data Layer
The Lego Data LayerThe Lego Data Layer
The Lego Data Layer
 
Morphing GA into an Affiliate Analytics Monster
Morphing GA into an Affiliate Analytics MonsterMorphing GA into an Affiliate Analytics Monster
Morphing GA into an Affiliate Analytics Monster
 
Using Lifecycle Scores for Marketing Optimisation
Using Lifecycle Scores for Marketing OptimisationUsing Lifecycle Scores for Marketing Optimisation
Using Lifecycle Scores for Marketing Optimisation
 
How to Sharpen Your Investigative Analysis with PowerPivot
How to Sharpen Your Investigative Analysis with PowerPivotHow to Sharpen Your Investigative Analysis with PowerPivot
How to Sharpen Your Investigative Analysis with PowerPivot
 
Visitor Intent: Smart clues for understanding customer journeys
Visitor Intent: Smart clues for understanding customer journeysVisitor Intent: Smart clues for understanding customer journeys
Visitor Intent: Smart clues for understanding customer journeys
 
Contribution Modelling using Conversion Path Coverage
Contribution Modelling using Conversion Path CoverageContribution Modelling using Conversion Path Coverage
Contribution Modelling using Conversion Path Coverage
 
4 clicks 2 Measurement - Analytics Automation @ SuperWeek
4 clicks 2 Measurement - Analytics Automation @ SuperWeek4 clicks 2 Measurement - Analytics Automation @ SuperWeek
4 clicks 2 Measurement - Analytics Automation @ SuperWeek
 
Google Data Studio - First impressions @ Measurecamp
Google Data Studio - First impressions @ MeasurecampGoogle Data Studio - First impressions @ Measurecamp
Google Data Studio - First impressions @ Measurecamp
 
動的最適化の今までとこれから
動的最適化の今までとこれから動的最適化の今までとこれから
動的最適化の今までとこれから
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
MeasureCamp 7 Bigger Faster Data by Andrew Hood and Cameron Gray from Lynchpin
MeasureCamp 7   Bigger Faster Data by Andrew Hood and Cameron Gray from LynchpinMeasureCamp 7   Bigger Faster Data by Andrew Hood and Cameron Gray from Lynchpin
MeasureCamp 7 Bigger Faster Data by Andrew Hood and Cameron Gray from Lynchpin
 
Machine Learning in action
Machine Learning in actionMachine Learning in action
Machine Learning in action
 
Find signal in noise.
Find signal in noise.Find signal in noise.
Find signal in noise.
 
Proactive Measures for Good Site Health - Brighton SEO 2014
Proactive Measures for Good Site Health - Brighton SEO 2014Proactive Measures for Good Site Health - Brighton SEO 2014
Proactive Measures for Good Site Health - Brighton SEO 2014
 
MeasureCamp London - Using enhanced ecommerce for non-ecommerce websites
MeasureCamp London - Using enhanced ecommerce for non-ecommerce websitesMeasureCamp London - Using enhanced ecommerce for non-ecommerce websites
MeasureCamp London - Using enhanced ecommerce for non-ecommerce websites
 
Achtung panzer
Achtung panzerAchtung panzer
Achtung panzer
 
A study of digital data about yourself - By Phil Pearce
A study of digital data about yourself - By Phil PearceA study of digital data about yourself - By Phil Pearce
A study of digital data about yourself - By Phil Pearce
 
TOP UNIVERSITIES IN US FOR MS IN DATA SCIENCE
TOP UNIVERSITIES IN US FOR MS IN DATA SCIENCETOP UNIVERSITIES IN US FOR MS IN DATA SCIENCE
TOP UNIVERSITIES IN US FOR MS IN DATA SCIENCE
 

Similar to You Don't Have to Be a Data Scientist to Do Data Science

Data Analysis - Making Big Data Work
Data Analysis - Making Big Data WorkData Analysis - Making Big Data Work
Data Analysis - Making Big Data WorkDavid Chiu
 
Machine learning at b.e.s.t. summer university
Machine learning  at b.e.s.t. summer universityMachine learning  at b.e.s.t. summer university
Machine learning at b.e.s.t. summer universityLászló Kovács
 
Kp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptxKp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptxCloudBusiness2
 
Practical Machine Learning
Practical Machine LearningPractical Machine Learning
Practical Machine LearningLynn Langit
 
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX ResearchersMixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX ResearchersUXPA International
 
UXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big DataUXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big DataZachary Sam Zaiss
 
Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...
Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...
Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...Rohit Dubey
 
Data Analytics Introduction.pptx
Data Analytics Introduction.pptxData Analytics Introduction.pptx
Data Analytics Introduction.pptxamitparashar42
 
Data Analytics Introduction.pptx
Data Analytics Introduction.pptxData Analytics Introduction.pptx
Data Analytics Introduction.pptxamitparashar42
 
How to Run Discrete Choice Conjoint Analysis
How to Run Discrete Choice Conjoint AnalysisHow to Run Discrete Choice Conjoint Analysis
How to Run Discrete Choice Conjoint AnalysisQuestionPro
 
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...Intel® Software
 
Data Analytics & Visualization (Introduction)
Data Analytics & Visualization (Introduction)Data Analytics & Visualization (Introduction)
Data Analytics & Visualization (Introduction)Dolapo Amusat
 
Survey analytics conjointanalysis_1
Survey analytics conjointanalysis_1Survey analytics conjointanalysis_1
Survey analytics conjointanalysis_1QuestionPro
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data ScienceAjay Ohri
 
Lobsters, Wine and Market Research
Lobsters, Wine and Market ResearchLobsters, Wine and Market Research
Lobsters, Wine and Market ResearchTed Clark
 
Guide for a Data Scientist
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data ScientistRohit Dubey
 
SHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxSHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxShahbazKhan77289
 

Similar to You Don't Have to Be a Data Scientist to Do Data Science (20)

Data Analysis - Making Big Data Work
Data Analysis - Making Big Data WorkData Analysis - Making Big Data Work
Data Analysis - Making Big Data Work
 
Machine learning at b.e.s.t. summer university
Machine learning  at b.e.s.t. summer universityMachine learning  at b.e.s.t. summer university
Machine learning at b.e.s.t. summer university
 
Kp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptxKp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptx
 
Data science guide
Data science guideData science guide
Data science guide
 
Practical Machine Learning
Practical Machine LearningPractical Machine Learning
Practical Machine Learning
 
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX ResearchersMixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
 
UXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big DataUXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big Data
 
Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...
Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...
Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...
 
Data Analytics Introduction.pptx
Data Analytics Introduction.pptxData Analytics Introduction.pptx
Data Analytics Introduction.pptx
 
Data Analytics Introduction.pptx
Data Analytics Introduction.pptxData Analytics Introduction.pptx
Data Analytics Introduction.pptx
 
Predictive modeling
Predictive modelingPredictive modeling
Predictive modeling
 
How to Run Discrete Choice Conjoint Analysis
How to Run Discrete Choice Conjoint AnalysisHow to Run Discrete Choice Conjoint Analysis
How to Run Discrete Choice Conjoint Analysis
 
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
 
Data Analytics & Visualization (Introduction)
Data Analytics & Visualization (Introduction)Data Analytics & Visualization (Introduction)
Data Analytics & Visualization (Introduction)
 
Experimenting with Data!
Experimenting with Data!Experimenting with Data!
Experimenting with Data!
 
Survey analytics conjointanalysis_1
Survey analytics conjointanalysis_1Survey analytics conjointanalysis_1
Survey analytics conjointanalysis_1
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data Science
 
Lobsters, Wine and Market Research
Lobsters, Wine and Market ResearchLobsters, Wine and Market Research
Lobsters, Wine and Market Research
 
Guide for a Data Scientist
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data Scientist
 
SHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxSHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docx
 

Recently uploaded

Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...

You Don't Have to Be a Data Scientist to Do Data Science

  • 1. You don’t have to be a Data Scientist to do Data Science @carmenmardiros (not a data scientist)
  • 2. “Sexiest job of the 21st century” Why do I, a mere analyst, care?
  • 3. The appeal of Data Science (for me as an analyst). Increase confidence: my own and others’ in my analyses, as the complexity of the data and the business ecosystem increases. Become more productive: speed up the analysis cycle from exploration to hypothesis to experimentation. Add value in new ways: as the business and technology landscape changes, operationalise analysis outcomes as data products.
  • 4. “It’s just not for me...” “I don’t have a degree in statistics or programming.”
  • 5. Predictive Analytics Summit 2013: No confidence to attend the sessions. Worried I would not understand the content. Worried I’d be spotted as a fraud. Predictive Analytics Summit 2016 (3 months into my data science foray): Understood much of the content and terminology. Had already thought of the questions others asked. I knew more than I thought I did.
  • 6. Myth: Doing data science requires a PhD/going back to school. Truth: Doing data science requires enthusiasm and confidence in ourselves. Myth: Can’t do data science until you can write an algorithm. Truth: Can and should do data science once we’ve conceptually understood how and why the algorithm works. Myth: Bottom-up is the only way. Truth: Top-down works. Provide value, learn as you go.
  • 7. Adapt. Grow. Stay relevant.
  • 8. Digital Analytics is changing fast. Increasingly scientific approaches: essential as we move towards prescriptive analytics at speed. Become familiar with the data science toolkit: we will be key to bridging the gap between PhDs, machines and management, and may even use it ourselves in our day-to-day work. Future-proof ourselves: MS Office for Machine Learning, coming soon to a cloud near you.
  • 11. The Bootstrap. Number of observations: 100. The sample is representative (to the best of our knowledge). Observed mean: 17.54 months.
  • 12. The Bootstrap. Draw 100 random samples with replacement, each the same size as the original sample. Calculate the mean of each one: [17.61, 16.21, 17.13, 14.08, 19.58, …] (100 values). Plot all the means, their 2.5th and 97.5th percentiles, and the original observed mean. The bootstrap is extremely versatile: ● Fewer assumptions than parametric methods. ● Can be used on almost any statistic.
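
The procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the author's code: the 100 observations are randomly generated stand-ins (the slides do not include the raw data), and `bootstrap_means` and `percentile` are hypothetical helper names.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=100, seed=0):
    """Resample `data` with replacement and return the mean of each resample."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample has the same size as the original sample.
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile(sorted_values, q):
    """Percentile (q between 0 and 100) with linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


# Assumption: made-up durations in months standing in for the real sample.
data_rng = random.Random(42)
sample = [data_rng.gauss(17.5, 6.0) for _ in range(100)]

observed_mean = statistics.fmean(sample)
means = sorted(bootstrap_means(sample, n_resamples=100, seed=1))
lower = percentile(means, 2.5)
upper = percentile(means, 97.5)

print(f"observed mean: {observed_mean:.2f}")
print(f"bootstrap 95% interval: {lower:.2f} to {upper:.2f}")
```

With only 100 resamples the interval endpoints are fairly noisy; in practice a few thousand resamples is a common choice, at little extra cost.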
  • 13. Simulations & Sensitivity Analysis. Simple simulation: given the existing distribution of order values and a range of possible conversion rates, how much £££ would we make if we doubled the traffic to our website? Sensitivity analysis (or how to open up black boxes): given a predictive model, randomly generate new data points for each input based on observed distributions, create predictions using the model, and interpret the distribution of outcome scenarios.
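
A rough sketch of the simple simulation described above, using only the Python standard library. All numbers are hypothetical: the order values, the current traffic level and the 1% to 3% conversion-rate range are invented for illustration, and the sketch assumes that doubling traffic leaves the conversion rate and order-value distribution unchanged.

```python
import random
import statistics


def simulate_revenue(order_values, sessions, cr_low, cr_high, n_runs=500, seed=0):
    """Simulate total revenue for `sessions` visits, once per run."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        # Pick one plausible conversion rate for this scenario.
        conversion_rate = rng.uniform(cr_low, cr_high)
        n_orders = round(sessions * conversion_rate)
        # Draw each order's value from the observed order values.
        totals.append(sum(rng.choices(order_values, k=n_orders)))
    return totals


# Assumption: a small made-up sample of historical order values (in pounds).
order_values = [12.5, 20.0, 20.0, 35.0, 49.99, 80.0, 120.0]
current_sessions = 5_000

totals = simulate_revenue(
    order_values,
    sessions=2 * current_sessions,  # the "doubled traffic" scenario
    cr_low=0.01,
    cr_high=0.03,
)

cuts = statistics.quantiles(totals, n=20)  # cut points at 5%, 10%, ..., 95%
print(f"median revenue: {statistics.median(totals):,.0f}")
print(f"5th to 95th percentile: {cuts[0]:,.0f} to {cuts[-1]:,.0f}")
```

The output is a distribution of outcomes rather than a single number, which is the point of the exercise: the spread shows how much the answer depends on the uncertain conversion rate.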
  • 14. Cross Validation. Assesses how well a predictive model generalises to unseen data.
        Iteration   Fold 1   Fold 2   Fold 3   Fold 4   Fold 5
        1           Train    Train    Train    Train    Test
        2           Train    Train    Train    Test     Train
        3           Train    Train    Test     Train    Train
        4           Train    Test     Train    Train    Train
        5           Test     Train    Train    Train    Train
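
The five-fold scheme in the table can be sketched as follows. This is an illustrative standard-library version, not a reference implementation: the data is synthetic, the "model" is a deliberately trivial one-parameter line through the origin, and `k_fold_splits` and `fit_slope` are hypothetical names. Libraries such as scikit-learn provide ready-made equivalents.

```python
import random
import statistics


def k_fold_splits(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_fold in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test_fold))
    return splits


def fit_slope(x_train, y_train):
    """Least-squares slope of a line through the origin."""
    numerator = sum(x * y for x, y in zip(x_train, y_train))
    return numerator / sum(x * x for x in x_train)


# Assumption: synthetic data where y is roughly 2 * x plus noise.
data_rng = random.Random(7)
xs = [float(i) for i in range(50)]
ys = [2.0 * x + data_rng.gauss(0, 1.0) for x in xs]

fold_errors = []
for train_idx, test_idx in k_fold_splits(len(xs), k=5, seed=1):
    slope = fit_slope([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    # Score the model only on the fold it has not seen.
    errors = [abs(ys[i] - slope * xs[i]) for i in test_idx]
    fold_errors.append(statistics.fmean(errors))

print("mean absolute error per fold:", [round(e, 2) for e in fold_errors])
print("average across folds:", round(statistics.fmean(fold_errors), 2))
```

Each observation is used for testing exactly once, so the average error across folds is an estimate of performance on data the model was not trained on.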
  • 15. Resampling. Protects you from unsound inference: acknowledges and mitigates the effects of variance and noise in the data. You already do this when you use confidence intervals. Quantify uncertainty more often. Paints possible future scenarios: leverages randomness and probability to give you glimpses into possible future outcomes. Embrace randomness. It's your ally on the way to prescriptive analytics.
  • 17-19. Segmented view, side-by-side. Outstanding tools for exploratory data analysis: Seaborn in Python and ggplot in R.
  • 21. #3 Feature Engineering (or is it #3 Calculated Metrics or Content Groupings?). Back on familiar territory.
  • 22. Feature Engineering Examples. Unique content views per user by content type: # politics content views, # business content views, # short/long-form content views. Distribution of content seen per user: % politics content views in total content viewed, adjusted for the uncertainty of small samples. Result: a fat user-level table of attributes and behaviour for analysis and modelling.
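
A small sketch of how such a user-level table might be built from a raw pageview log, in plain Python. The event rows are invented, and the small-sample adjustment shown here (shrinking each user's politics share towards the overall share, with a hypothetical `prior_weight`) is only one possible approach; the slides do not say which adjustment was used.

```python
from collections import defaultdict

# Assumption: a made-up pageview log of (user_id, content_id, content_type).
events = [
    ("u1", "a1", "politics"),
    ("u1", "a1", "politics"),  # repeat view of the same article
    ("u1", "a2", "business"),
    ("u1", "a3", "politics"),
    ("u2", "a2", "business"),
    ("u3", "a1", "politics"),
]


def build_user_features(events, prior_weight=5.0):
    """Build one row of features per user from raw pageview events."""
    # Collect the unique pieces of content each user has seen.
    seen = defaultdict(set)
    for user, content_id, content_type in events:
        seen[user].add((content_id, content_type))

    # Overall politics share, used to steady the shares of low-activity users.
    total_unique = sum(len(items) for items in seen.values())
    total_politics = sum(
        1 for items in seen.values() for _, ctype in items if ctype == "politics"
    )
    global_share = total_politics / total_unique if total_unique else 0.0

    table = {}
    for user, items in seen.items():
        counts = defaultdict(int)
        for _, ctype in items:
            counts[ctype] += 1
        n = len(items)
        politics = counts["politics"]
        table[user] = {
            "politics_views": politics,
            "business_views": counts["business"],
            "total_views": n,
            "politics_share_raw": politics / n,
            # Shrinks towards the global share when a user has few views.
            "politics_share_adj": (politics + prior_weight * global_share)
            / (n + prior_weight),
        }
    return table


user_table = build_user_features(events)
for user, row in sorted(user_table.items()):
    print(user, row)
```

A user with a single politics view has a raw share of 100%, which overstates what we know about them; the adjusted share stays closer to the overall average until more views accumulate.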
  • 23. Feature Engineering Examples. Infer trading calendar activities from data (for time series analysis): # new marketing campaigns (first date with sessions); # new brands launched (first date with pageviews); # voucher codes at peak redeem-rate (date with highest redeems); # AB tests started (date with first events tracked); # VIPs active on each date, etc. Result: a fat date-level table of leading KPIs and activities (model the ecosystem).
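
As an illustration of the first item, the number of new marketing campaigns per date can be inferred by finding the first date on which each campaign recorded a session. The sketch below uses an invented session log and a hypothetical function name; the same idea is often expressed in SQL as a `MIN(date)` grouped by campaign.

```python
from collections import Counter
from datetime import date

# Assumption: a made-up session log of (session_date, campaign_name).
sessions = [
    (date(2016, 3, 2), "spring_sale"),
    (date(2016, 3, 1), "spring_sale"),
    (date(2016, 3, 1), "newsletter_12"),
    (date(2016, 3, 3), "spring_sale"),
    (date(2016, 3, 3), "vip_preview"),
]


def new_campaigns_per_date(sessions):
    """Count campaigns by the first date on which they had any session."""
    first_seen = {}
    for session_date, campaign in sessions:
        if campaign not in first_seen or session_date < first_seen[campaign]:
            first_seen[campaign] = session_date
    return Counter(first_seen.values())


launches = new_campaigns_per_date(sessions)
for day in sorted(launches):
    print(day.isoformat(), launches[day])
```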
  • 24. Feature Engineering. New ways of capturing underlying phenomena. Seasoned data scientists: feature engineering often yields higher rewards than pushing the latest algorithms. You likely already do this, probably in Excel. It’s painful and limiting. Your analytical creativity needs better tools. SQL: the single most valuable tool in our toolkit. We become self-sufficient analysts.
  • 26. Inspired? Learn Python: https://try.jupyter.org/ (start learning Python for data science right now, no setup!) and https://learncodethehardway.org/python/ Learn Machine Learning: http://machinelearningmastery.com/ (understand how algorithms work using spreadsheets; top-down approach; no programming required). Learn SQL: https://learncodethehardway.org/sql/