Introduction to Data Science
Sean Byrnes
http://seanbyrnes.com
@sbyrnes
Who Am I? 
f 
ATTENDED 
FOUNDED 
CURRENTLY 
from Yahoo!
Introduction to Data Science 
• What is Data Science? 
• Example 1: Basic Math 
• Example 2: Regression Modeling 
• Example 3: Recommender Systems 
• Getting started in data science
What is Data Science? 
Software Engineering 
+ 
Statistical Analysis
What is Data Science? 
1. Question 
2. Data Gathering 
3. Exploration 
4. Modeling 
5. Answer 
6. Production
Example 1: Basic Math 
What is my customer churn rate? 
def. Churn rate: The percentage of subscribers to a 
service that discontinue their subscription to that service 
in a given time period. (aka attrition rate)
Example 1: Basic Math
Churn(month) = (# customers lost) / (# customers at start)
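The churn definition above can be sketched in a few lines of Python. The function name and the sample figures are assumptions made for illustration; they are not taken from the slides.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return churn for one period as a percentage of the starting customer base."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return 100.0 * customers_lost / customers_at_start


# Hypothetical month: 2,000 customers at the start, 75 of them cancelled.
print(f"{churn_rate(2000, 75):.2f}%")  # prints 3.75%
```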
Example 1: Basic Math 
Month Churn 
Dec '13 3.75% 
Nov '13 1.87% 
Oct '13 3.82% 
Sep '13 2.76% 
Aug '13 2.43% 
Jul '13 2.04% 
Jun '13 1.60%
Example 1: Basic Math
For all customers acquired in a given month:
Retention(C_month) = Active(C_month) / Total(C_month)
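A minimal sketch of the retention ratio for a single cohort follows. It assumes each customer is summarized by the number of whole months they stayed active after being acquired; that representation, the function name, and the sample cohort are all hypothetical.

```python
def retention_curve(months_active, horizon):
    """Return Active / Total for each month offset 0..horizon.

    months_active[i] is the number of whole months customer i stayed
    active after being acquired (an assumed data layout).
    """
    total = len(months_active)
    if total == 0:
        raise ValueError("cohort is empty")
    return [
        sum(1 for m in months_active if m >= offset) / total
        for offset in range(horizon + 1)
    ]


# Hypothetical cohort of 10 customers acquired in the same month.
cohort = [0, 0, 0, 0, 0, 0, 1, 1, 3, 6]
curve = retention_curve(cohort, 3)
print([f"{value:.0%}" for value in curve])
```

Running the same function for every acquisition month gives one row per cohort, which is the shape of a cohort retention table.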
Example 1: Basic Math
Retention by acquisition month (columns: months since acquisition)
Cohort    0     1       2       3      4      5      6
Dec '13   100%  12.82%  8.04%   6.34%  4.91%  3.95%  3.14%
Nov '13   100%  15.66%  9.97%   6.96%  5.46%  3.88%  2.77%
Oct '13   100%  16%     10.86%  8.62%  6.22%  5.06%  3.98%
Sep '13   100%  13.28%  9.52%   7.28%  5.28%  4.48%  4%
Aug '13   100%  12.96%  9.18%   6.55%  4.73%  3.86%  3.13%
Jul '13   100%  15.84%  10.85%  8.27%  6.67%  5.60%  4.63%
Jun '13   100%  16.08%  11.36%  8.36%  7.07%  6%     5.25%
Example 2: Regression Modeling 
How many users will we have next month?
Example 2: Regression Modeling
[Chart: y-axis 0 to 160,000; x-axis monthly from 1/1/13 to 12/1/13]
Example 2: Regression Modeling 
For data set X(n), find f(n) such that 
f(ni) ~ X(ni)
Example 2: Regression Modeling 
Assume X(ni) = [x1, x2, …, xk]
f(n) = c1x1 + c2x2 + c3x3 + … + ckxk
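As a rough illustration of fitting such a model, the sketch below handles the simplest case: a single feature (the month index) plus an intercept, fitted by ordinary least squares. The intercept term, the function name, and the user counts are assumptions for the example; the slides do not specify them.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical monthly user counts for months 1..6.
months = [1, 2, 3, 4, 5, 6]
users = [10_000, 14_000, 18_000, 22_000, 26_000, 30_000]

intercept, slope = fit_line(months, users)
forecast = intercept + slope * 7  # extrapolate to the next month
print(f"forecast for month 7: {forecast:.0f}")
```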
Example 2: Regression Modeling
[Chart: Linear Model fit; y-axis 0 to 160,000; x-axis monthly from 1/1/13 to 12/1/13]
Example 2: Regression Modeling
Assume X(ni) = [x1, x2, …, xk]
f(n) = c1x1 + c2x2 + c3x3 + … + ckxk
Or, maybe
f(n) = c1x1 + c2x1² + c3x2 + c4x2² + … + cmxk²
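A polynomial model is still linear in its coefficients, so it can be fitted with the same least-squares machinery. The sketch below assumes a single input variable and solves the normal equations with Gauss-Jordan elimination; the function names and the sample data are hypothetical, and a numerical library would normally be used instead.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ..., c_degree] for y ~ sum(c_j * x**j)."""
    size = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(size)] for r in range(size)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(size)]
    # Augmented matrix, reduced by Gauss-Jordan elimination with partial pivoting.
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("normal equations are singular; lower the degree")
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** j for j, c in enumerate(coeffs))


# Hypothetical data generated from y = 1 + 2x + 3x^2.
xs = [0, 1, 2, 3, 4, 5]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, 2)
print([round(c, 6) for c in coeffs])
```

Raising the degree always fits the historical points at least as closely, but the extra flexibility can make forecasts beyond the data less reliable.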
Example 2: Regression Modeling
[Chart: 2nd Degree Polynomial Model fit; y-axis 0 to 160,000; x-axis monthly from 1/1/13 to 12/1/13]
Example 2: Regression Modeling
[Chart: 4th Degree Polynomial Model fit; y-axis 0 to 160,000; x-axis monthly from 1/1/13 to 12/1/13]
Example 2: Regression Modeling 
https://github.com/sbyrnes/Lyric
Example 3: Recommender Systems 
What other products might this 
customer buy?
Example 3: Recommender Systems 
Product 1 Product 2 Product 3 … Product N 
Customer 1 3.5 4.0 3.0 
Customer 2 2.0 3.5 
Customer 3 3.0 2.5 
… 
Customer N 4.5 4.5
Example 3: Recommender Systems 
Given customer preference matrix M, find matrices P and Q such that
P x Q ~ M
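One common way to find such a P and Q is stochastic gradient descent over the known ratings only; the slides do not say which method is used, so treat the sketch below as an assumption. The rank, learning rate, regularization, and the placement of ratings in columns are all hypothetical choices for illustration.

```python
import random


def factorize(ratings, n_customers, n_products, rank=2,
              steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit P (customers x rank) and Q (products x rank) to the known ratings.

    ratings maps (customer_index, product_index) -> rating; missing
    pairs are simply absent and do not contribute to the error.
    """
    rng = random.Random(seed)
    p = [[rng.random() for _ in range(rank)] for _ in range(n_customers)]
    q = [[rng.random() for _ in range(rank)] for _ in range(n_products)]
    for _ in range(steps):
        for (c, j), rating in ratings.items():
            err = rating - sum(p[c][f] * q[j][f] for f in range(rank))
            for f in range(rank):
                pc, qj = p[c][f], q[j][f]
                p[c][f] += lr * (err * qj - reg * pc)
                q[j][f] += lr * (err * pc - reg * qj)
    return p, q


def predict(p, q, customer, product):
    """Predicted rating: dot product of the customer and product factors."""
    return sum(a * b for a, b in zip(p[customer], q[product]))


def squared_error(p, q, ratings):
    """Sum of squared errors over the known ratings."""
    return sum((r - predict(p, q, c, j)) ** 2 for (c, j), r in ratings.items())


# Hypothetical sparse ratings: 3 customers x 4 products, 7 known entries.
ratings = {
    (0, 0): 3.5, (0, 1): 4.0, (0, 3): 3.0,
    (1, 0): 2.0, (1, 2): 3.5,
    (2, 1): 3.0, (2, 2): 2.5,
}

p0, q0 = factorize(ratings, 3, 4, steps=0)  # untrained starting point
p, q = factorize(ratings, 3, 4)
print("error before:", round(squared_error(p0, q0, ratings), 3))
print("error after: ", round(squared_error(p, q, ratings), 3))
print("predicted rating, customer 0 / product 2:", round(predict(p, q, 0, 2), 2))
```

The last line fills in a cell that had no rating, which is the value a recommender would use to decide whether to suggest that product.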
Example 3: Recommender Systems 
Product 1 Product 2 Product 3 … Product N 
Customer 1 3.5 4.0 2.5 3.0 
Customer 2 2.0 1.5 3.5 3.0 
Customer 3 1.5 3.0 2.5 4.0 
… 
Customer N 4.5 3.5 4.0 4.5
Example 3: Recommender Systems 
Given customer preferences c[p1, p2, …, pn]
and overall rating average r_overall:
c_bias = mean(c[p1], c[p2], …, c[pn]) - r_overall
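The bias formula translates directly into code. In this sketch the overall average is taken over every known rating, which is an assumption about how r_overall is computed; the customer names and ratings are hypothetical.

```python
from statistics import mean


def customer_bias(customer_ratings, overall_average):
    """How far a customer's average rating sits above or below the overall average."""
    return mean(customer_ratings) - overall_average


# Hypothetical known ratings per customer.
all_ratings = {
    "customer_1": [3.5, 4.0, 3.0],
    "customer_2": [2.0, 3.5],
    "customer_3": [3.0, 2.5],
}

overall = mean(r for rs in all_ratings.values() for r in rs)
biases = {name: customer_bias(rs, overall) for name, rs in all_ratings.items()}
for name, bias in biases.items():
    print(f"{name}: {bias:+.2f}")
```

A positive bias marks a customer who rates generously, a negative one a customer who rates harshly; subtracting it puts ratings from different customers on a comparable scale.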
Example 3: Recommender Systems 
https://github.com/sbyrnes/likely.js
Getting Started in Data Science 
• Programming 
• Statistics 
• Machine learning 
• Toolkit 
– R 
– Hadoop 
– D3
Sean Byrnes 
seanbyrnes.com 
@sbyrnes 
github.com/sbyrnes

