Predicting the NBA MVP with Data Science
bit.ly/nba-la
CrossCamp.us Events
About us
We train developers and data
scientists through 1-on-1
mentorship and career prep
About me
• Alex Nussbacher
• Lead Data Science Instructor at Thinkful
• Data scientist at Uber, focus on consumption
economics and economics of choice 🤔
What’s your background?
• I have a software background
• I have a math or stats background
• None of the above
Data Science Process
• Frame the question.
• Collect the raw data.
• Process the data.
• Explore the data.
• Communicate results.
Frame the question
• Who will win the MVP in the NBA this
season?
Collect the Data
• What kind of data do we need?
• Individual stats
• Team stats and success
• Past winners and voting records
• All data from basketball-reference.com
Process the data
• How’s the data “dirty” and how can we fix it?
• User input, redundancies, missing data…
• Formatting: adapt the data to meet certain
specifications.
• Cleaning: detecting and correcting corrupt
or inaccurate records.
Explore the data
• What are the meaningful patterns in the
data?
• How meaningful is each data point for our
predictions?
Goals
• Introduction to a data scientist's tools and
methods:
• Jupyter notebooks, numpy, pandas,
sklearn…
• Overview of basic machine learning concepts:
• Data formatting and cleaning, Decision
trees, Overfitting, Random Forests…
Jupyter Notebooks
• One of a data scientist’s everyday tools.
• Find the links in our classroom tool.
• Contains cells with code.
NumPy
• The fundamental package for scientific
computing with Python.
• Provides powerful multi-dimensional array
objects.
• Many methods for fast operations on arrays.
Pandas
• Fundamental high-level building block for
doing practical, real world data analysis in
Python.
• Built on top of NumPy.
• Offers data structures and operations for
manipulating numerical tables and time
series.
Scikit-learn
• Python module for machine learning.
• Provides a large menu of tools for
classification, regression, clustering,
preprocessing, and model evaluation.
Initial imports and loading data with Pandas
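The code shown on this slide did not survive in the transcript. A minimal sketch of what the imports and loading step might look like, assuming the data arrives as a CSV; the column names (`player`, `season`, `pts_per_game`, `vote_share`) and the inline rows are hypothetical stand-ins, not real basketball-reference.com data:

```python
import io

import numpy as np   # imported up front; used later for array work
import pandas as pd

# Hypothetical stand-in for the CSV file. In practice a file path would be
# passed to pd.read_csv instead of an in-memory string.
csv_text = """player,season,pts_per_game,vote_share
Player A,2015,30.1,0.95
Player B,2015,25.3,0.40
Player C,2015,12.8,0.00
"""

train = pd.read_csv(io.StringIO(csv_text))

print(train.shape)   # (rows, columns)
print(train.dtypes)  # column types inferred by pandas
```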
Understanding your data
• .head(n) method: Returns first n rows.
• .value_counts() method: Returns the count
of each unique value in a column (Series).
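Both methods can be tried on a tiny table. This is a sketch with made-up rows; the `position` column is an assumption about what the real data contains:

```python
import pandas as pd

# Made-up rows for illustration only; not real NBA data.
train = pd.DataFrame({
    "player": ["Player A", "Player B", "Player C", "Player D"],
    "position": ["PG", "SF", "PG", "C"],
    "vote_share": [0.95, 0.40, 0.00, 0.00],
})

first_two = train.head(2)                            # first 2 rows
position_counts = train["position"].value_counts()   # count per unique value

print(first_two)
print(position_counts)
```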
Training Set
• We loaded in our data as a training set.
• This is because we’re going to use this data
to build, or train, our model
• It consists of every year for which we have
data on NBA MVP voting, from the 1955-56
season onward
Formatting your Data
• We need to put our data in the easiest-to-use
format
• No blanks allowed
• Numeric strings (like a win-loss record) need
to have the numbers extracted and typed as
integers
• Factors, or categories, need to be changed to
dummies, which report a 0 or 1 to show if that
value is present
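The three formatting rules above can be sketched with pandas. The column names, the "W-L" record format, and the choice to fill blanks with 0 are assumptions made for illustration; the workshop notebook may have handled them differently:

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
raw = pd.DataFrame({
    "player": ["Player A", "Player B", "Player C"],
    "record": ["61-21", "47-35", None],   # win-loss record stored as text
    "position": ["PG", "C", "SF"],
    "pts_per_game": [30.1, None, 12.8],
})

clean = raw.copy()

# Numeric strings: pull both numbers out of "W-L" and type them as integers.
wins_losses = clean["record"].str.extract(r"(\d+)-(\d+)")
clean["wins"] = pd.to_numeric(wins_losses[0]).fillna(0).astype(int)
clean["losses"] = pd.to_numeric(wins_losses[1]).fillna(0).astype(int)
clean = clean.drop(columns="record")

# No blanks allowed: fill missing numeric values (0 is one simple choice).
clean["pts_per_game"] = clean["pts_per_game"].fillna(0)

# Categories: one 0/1 dummy column per position.
clean = pd.get_dummies(clean, columns=["position"], dtype=int)

print(clean)
```

Filling with 0 keeps the example short; a median or a dropped row may be a better choice depending on why the value is missing.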
Decision Trees
• A decision tree breaks a dataset down into
smaller and smaller subsets.
• The final result is a model with a tree
structure that has:
• Decision nodes: ask a question and have
two or more branches.
• Leaf nodes: represent a classification or
decision.
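To see decision nodes and leaf nodes in practice, a small tree can be fit and printed as text. This is a sketch on made-up numbers, using scikit-learn's `DecisionTreeRegressor` and `export_text`; the single `pts_per_game` feature is an assumption for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Made-up numbers for illustration only.
X = np.array([[30.0], [28.0], [15.0], [12.0], [8.0], [25.0]])  # points per game
y = np.array([0.9, 0.7, 0.0, 0.0, 0.0, 0.3])                    # vote share

tree = DecisionTreeRegressor(max_depth=2, random_state=0)
tree.fit(X, y)

# Lines containing "<=" or ">" are decision nodes (questions about a feature);
# lines containing "value" are leaf nodes (the prediction for that subset).
text = export_text(tree, feature_names=["pts_per_game"])
print(text)
```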
Classification vs Regression
• Classification — Predict categories.
• Identifying group membership.
• Regression — Predict values.
• Involves estimating or predicting a
response.
Regression
• Regression — Predict values.
• Involves estimating or predicting a
response.
• This is what we’ll be doing: predicting
vote share.
Creating your first Decision Tree
You will use the scikit-learn and numpy libraries
to build your first decision tree. We will need the
following to build it:
• Response (y): A one-dimensional array or
series containing the target from the train
data.
• Inputs (X): A two-dimensional pandas
DataFrame containing the features/predictors
from the train data.
Creating your first Decision Tree
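The code from this slide is missing from the transcript. A minimal sketch of the step, assuming a regression tree (since the target is vote share) and hypothetical feature columns `pts_per_game` and `team_wins`; the rows are made up:

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Made-up training rows for illustration only.
train = pd.DataFrame({
    "pts_per_game": [30.1, 25.3, 12.8, 27.0, 9.5, 18.2],
    "team_wins":    [61,   47,   35,   55,   20,  41],
    "vote_share":   [0.95, 0.40, 0.00, 0.60, 0.00, 0.05],
})

features = ["pts_per_game", "team_wins"]
X = train[features]        # inputs: two-dimensional DataFrame
y = train["vote_share"]    # response: one-dimensional Series

tree = DecisionTreeRegressor(random_state=0)
tree.fit(X, y)

predictions = tree.predict(X)
print(predictions)
```

With no depth limit and no repeated input rows, the tree reproduces the training targets exactly, which is an early hint of the overfitting problem discussed later.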
Importances and Score
• .feature_importances_ attribute: tells us
how important the features are for the final
result.
• .score() method: returns the mean accuracy
for a classifier, or the R² score for a
regressor.
Importances and Score
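The output from this slide is missing from the transcript. As a sketch of how the two would be read off a fitted tree, again on made-up rows with hypothetical column names:

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Made-up training rows for illustration only.
train = pd.DataFrame({
    "pts_per_game": [30.1, 25.3, 12.8, 27.0, 9.5, 18.2],
    "team_wins":    [61,   47,   35,   55,   20,  41],
    "vote_share":   [0.95, 0.40, 0.00, 0.60, 0.00, 0.05],
})
features = ["pts_per_game", "team_wins"]
X, y = train[features], train["vote_share"]

tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# One importance value per feature; the values sum to 1.
importances = pd.Series(tree.feature_importances_, index=features)

# For a regressor this is R^2 on the data passed in, not accuracy.
r2 = tree.score(X, y)

print(importances.sort_values(ascending=False))
print(r2)
```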
That looks good…
But that’s actually not clear.
CLASS IMBALANCE
• We have what is called a class imbalance
problem.
• The outcome of not being MVP is much
more common than being the MVP.
• So our model is ‘accurate’ if it just tells
everyone they’re not going to be MVP.
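The effect is easy to reproduce with a baseline that always predicts the most common class. This sketch uses scikit-learn's `DummyClassifier` on made-up labels (1 MVP among 100 player seasons); the 1-in-100 ratio is an assumption chosen for illustration:

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up labels: 1 MVP among 100 player seasons.
y = np.zeros(100, dtype=int)
y[0] = 1
X = np.zeros((100, 1))   # features are ignored by this baseline

# Always predicts the most frequent class, here "not MVP".
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X, y)

accuracy = baseline.score(X, y)                 # fraction of correct labels
mvps_found = int(baseline.predict(X).sum())     # players predicted as MVP

print(accuracy)
print(mvps_found)
```

The baseline is right 99 times out of 100 while never naming an MVP, which is why a high score alone says little here.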
Looking closer
Looking at our results
• We seem to be doing a decent job of
identifying players who are great players
• But the ordering isn’t perfect
• And we have a lot of people who are scored
as equivalent
• Also note this seems to be a year with a lot of
great performers
Let’s improve it!
• We have options for improving the model
• Firstly, we can look at our feature list and
select a smaller but more effective list of
features
• We could also choose a better type of
model…
Modify the feature list
• We put a lot of features into our model
• Trimming it down to a smaller list could
improve the efficiency of our trees and
possibly improve accuracy as well
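One way to trim the list is to keep only the features the first tree found most important and refit. This is a sketch on synthetic data, not the method from the workshop notebook; the column names, the two pure-noise columns, and the cutoff of two features are all assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
n = 200
train = pd.DataFrame({
    "pts_per_game": rng.uniform(5, 32, n),
    "team_wins": rng.integers(15, 68, n),
    "noise_a": rng.normal(size=n),   # carries no information about the target
    "noise_b": rng.normal(size=n),   # carries no information about the target
})
# Synthetic target that depends only on the first two columns.
train["vote_share"] = (train["pts_per_game"] / 32) * (train["team_wins"] / 67)

all_features = ["pts_per_game", "team_wins", "noise_a", "noise_b"]
full_tree = DecisionTreeRegressor(max_depth=4, random_state=0)
full_tree.fit(train[all_features], train["vote_share"])

# Rank features by importance and keep the top two.
importances = pd.Series(full_tree.feature_importances_, index=all_features)
top_features = importances.sort_values(ascending=False).head(2).index.tolist()

# Refit on the trimmed feature list.
small_tree = DecisionTreeRegressor(max_depth=4, random_state=0)
small_tree.fit(train[top_features], train["vote_share"])

print(importances.sort_values(ascending=False))
print(top_features)
```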
Overfitting
• The resulting model is too tied to the training set.
• It doesn’t generalize to new data, which is the
point of prediction.
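Overfitting shows up as a gap between the score on the training data and the score on data the model has not seen. A sketch on synthetic data (a sine curve plus noise, chosen only for illustration), comparing an unrestricted tree with a depth-limited one:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration only: a signal plus random noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.5, size=300)

# Hold out 30% of the rows to stand in for "new data".
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

deep_tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
shallow_tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, y_train)

deep_train = deep_tree.score(X_train, y_train)
deep_test = deep_tree.score(X_test, y_test)

print("deep    train/test R^2:", deep_train, deep_test)
print("shallow train/test R^2:",
      shallow_tree.score(X_train, y_train),
      shallow_tree.score(X_test, y_test))
```

The unrestricted tree memorizes the training rows, noise included, so its training score is close to perfect while its held-out score is lower.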
Random Forest Classifier
• Random Forest Classifiers use many
Decision Trees to build a classifier.
• We introduce a bit of randomness.
• Each tree uses a random subset of the data
to give a different answer (a vote). The final
classification is the most common answer
amongst the trees (for regression, the trees’
predictions are averaged).
Random Forest Classifier
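The code from this slide is missing from the transcript. Because the target here is vote share, this sketch assumes the regression variant, scikit-learn's `RandomForestRegressor`, rather than a classifier. The training data is synthetic and the candidate players are hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic training data for illustration only.
rng = np.random.default_rng(0)
n = 200
train = pd.DataFrame({
    "pts_per_game": rng.uniform(5, 32, n),
    "team_wins": rng.integers(15, 68, n),
})
train["vote_share"] = (train["pts_per_game"] / 32) * (train["team_wins"] / 67)

features = ["pts_per_game", "team_wins"]
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(train[features], train["vote_share"])

# Hypothetical candidates for the season being predicted.
candidates = pd.DataFrame({
    "player": ["Player A", "Player B", "Player C"],
    "pts_per_game": [31.0, 24.0, 12.0],
    "team_wins": [60, 50, 30],
})
candidates["predicted_share"] = forest.predict(candidates[features])

# Each tree gives its own answer; the forest's prediction is their average.
per_tree = np.array([
    t.predict(candidates[features].to_numpy()) for t in forest.estimators_
])

ranking = candidates.sort_values("predicted_share", ascending=False)
print(ranking[["player", "predicted_share"]])
```

Sorting by predicted share gives the model's MVP ranking, with the top row as its pick.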
Results
And the MVP goes to…
Russell Westbrook!
What’s going on?
• Our model is giving good weight to major
statistical categories and position, but not
enough to team record…
• How could you continue to improve it?
Trim our variable list…
2016
STEPH!
2008
Kobe!
1996
MJ!
The End
More about Thinkful
• Anyone who’s committed can learn to code
• 1-on-1 mentorship is the best way to learn
• Flexibility! Learn anywhere, anytime, & at your
own pace
Our Program
You’ll learn concepts, practice with drills, and build
capstone projects — all guided by a personal mentor
Our Mentors
Mentors have, on average, 10+ years of experience
Data Science Syllabus
• Managing data with SQL and Python
• Modeling with both supervised and unsupervised
models
• Data visualization and communicating with data
• Technical interviews + career services
Special Introductory Offer
• Prep course for 50% off —
$250 instead of $500
• Covers math, stats,
Python, and data science
toolkit
• Option to continue into full
program
• Talk to me (or email
noel@thinkful.com) if
you’re interested

 
Portugal Vs Czechia- Ronaldo feels 'proud' of new UEFA Euro 2024 record.docx
Portugal Vs Czechia- Ronaldo feels 'proud' of new UEFA Euro 2024 record.docxPortugal Vs Czechia- Ronaldo feels 'proud' of new UEFA Euro 2024 record.docx
Portugal Vs Czechia- Ronaldo feels 'proud' of new UEFA Euro 2024 record.docx
 

Predicting the NBA MVP

  • 1. Predicting the NBA MVP with Data Science bit.ly/nba-la CrossCamp.us Events
  • 2. About us We train developers and data scientists through 1-on-1 mentorship and career prep
  • 3. About me • Alex Nussbacher • Lead Data Science Instructor at Thinkful • Data scientist at Uber, focus on consumption economics and economics of choice 🤔
  • 4. What’s your background? • I have a software background • I have a math or stats background • None of the above
  • 5. Data Science Process • Frame the question. • Collect the raw data. • Process the data. • Explore the data. • Communicate results.
  • 6. Frame the question • Who will win the MVP in the NBA this season?
  • 7. Collect the Data • What kind of data do we need? • Individual stats • Team stats and success • Past winners and voting records • All data from basketball-reference.com
  • 8. Process the data • How’s the data “dirty” and how can we fix it? • User input, redundancies, missing data… • Formatting: adapt the data to meet certain specifications. • Cleaning: detecting and correcting corrupt or inaccurate records.
  • 9. Explore the data • What are the meaningful patterns in the data? • How meaningful is each data point for our predictions?
  • 10. Goals • Introduction to a data scientist's tools and methods: • Jupyter notebooks, numpy, pandas, sklearn… • Overview of basic machine learning concepts: • Data formatting and cleaning, Decision trees, Overfitting, Random Forests…
  • 11. Jupyter Notebooks • One of a data scientist’s everyday tools. • Find the links in our classroom tool. • Contains cells with code.
  • 12. NumPy • The fundamental package for scientific computing with Python. • Provides powerful multi-dimensional array objects. • Many methods for fast operations on arrays.
  • 13. Pandas • Fundamental high-level building block for doing practical, real world data analysis in Python. • Built on top of NumPy. • Offers data structures and operations for manipulating numerical tables and time series.
  • 14. Scikit-learn • Python module for machine learning. • Provides tools for classification, regression, clustering, preprocessing, and model evaluation.
  • 15. Initial imports and loading data with Pandas
  • 16. Understanding your data • .head(n) method: Returns the first n rows. • .value_counts() method: Returns the count of each unique value (typically called on a single column).
  • 17. Training Set • We loaded in our data as a training set. • This is because we’re going to use this data to build, or train, our model. • It consists of every year for which we have data on NBA MVP voting, from the 1955-56 season onward.
  • 19. Formatting your Data • We need to put our data in the easiest-to-use format. • No blanks allowed. • Numeric strings (like a win-loss record) need to have the numbers extracted and typed as integers. • Factors, or categories, need to be changed to dummies, which report a 0 or 1 to show if that value is present.
  • 20. Decision Trees • It breaks down a dataset into smaller and smaller subsets. • The final result is a model with a tree structure that has: • Decision nodes: ask a question and have two or more branches. • Leaf nodes: represent a classification or decision.
  • 22. Classification vs Regression • Classification — Predict categories. • Identifying group membership. • Regression — Predict values. • Involves estimating or predicting a response.
  • 25. Regression • Regression — Predict values. • Involves estimating or predicting a response. • This is what we’ll be doing: predicting vote share.
  • 26. Creating your first Decision Tree You will use the scikit-learn and numpy libraries to build your first decision tree. We will need the following to build a decision tree: • Response (y): A one-dimensional array or series containing the target from the train data. • Inputs (X): A multidimensional pandas data frame containing the features/predictors from the train data.
  • 27. Creating your first Decision Tree
  • 28. Importances and Score • .feature_importances_ attribute: tells us how important each feature is for the final result. • .score() method: returns the mean accuracy for a classifier, or R² for a regressor, on the data passed to it.
  • 30. That looks good… But that’s actually not clear.
  • 31. CLASS IMBALANCE • We have what is called a class imbalance problem. • The outcome of not being MVP is much, much more common than being the MVP. • So our model is ‘accurate’ if it just tells everyone they’re not going to be MVP.
  • 33. Looking at our results • We seem to be doing a decent job of identifying players who are great players. • But the ordering isn’t perfect. • And we have a lot of people who are scored as equivalent. • Also note this seems to be a year with a lot of great performers.
  • 34. Let’s improve it! • We have options for improving the model • Firstly, we can look at our feature list and select a smaller but more effective list of features • We could also choose a better type of model…
  • 36. Modify the feature list • We put a lot of features into our model • Trimming it down to a smaller list could improve the efficiency of our trees and possibly improve accuracy as well
  • 37. Overfitting • Resulting model too tied to the training set. • It doesn’t generalize to new data, which is the point of prediction.
  • 38. Random Forest Classifier • Random Forest Classifiers use many Decision Trees to build a classifier. • We introduce a bit of randomness. • Each Tree uses a subset of the data to give a different answer (a vote). The final classification is the most common amongst the Trees.
  • 41. And the MVP goes to…
  • 44. What’s going on? • Our model is giving good weight to major statistical categories and position, but not enough to team record… • How could you continue to improve it?
  • 45. Trim our variable list…
  • 50. More about Thinkful • Anyone who’s committed can learn to code • 1-on-1 mentorship is the best way to learn • Flexibility! Learn anywhere, anytime, & at your own pace
  • 51. Our Program You’ll learn concepts, practice with drills, and build capstone projects — all guided by a personal mentor
  • 52. Our Mentors Mentors have, on average, 10+ years of experience
  • 53. Data Science Syllabus • Managing data with SQL and Python • Modeling with both supervised and unsupervised models • Data visualization and communicating with data • Technical interviews + career services
  • 54. Special Introductory Offer • Prep course for 50% off — $250 instead of $500 • Covers math, stats, Python, and data science toolkit • Option to continue into full program • Talk to me (or email noel@thinkful.com) if you’re interested
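
As a minimal sketch of the formatting steps on slide 19, the snippet below cleans a made-up three-row table. The column names (`record`, `position`, `points`), the toy values, and the choice of a median fill are assumptions for illustration, not the workshop's actual data or notebook code.

```python
import pandas as pd

# Hypothetical toy rows; not real basketball-reference.com data.
raw = pd.DataFrame({
    "player": ["A", "B", "C"],
    "record": ["60-22", "48-34", "35-47"],
    "position": ["G", "F", "C"],
    "points": [27.1, None, 19.4],
})

def format_mvp_data(df):
    out = df.copy()
    # Extract the numbers from "wins-losses" strings and type them as integers.
    parts = out["record"].str.extract(r"(\d+)-(\d+)")
    out["wins"] = parts[0].astype(int)
    out["losses"] = parts[1].astype(int)
    out = out.drop(columns=["record"])
    # No blanks allowed: fill the missing value with the column median (one option of many).
    out["points"] = out["points"].fillna(out["points"].median())
    # Change the categorical column into 0/1 dummy columns.
    return pd.get_dummies(out, columns=["position"], dtype=int)

clean = format_mvp_data(raw)
print(clean)
```

Whether to fill, drop, or flag missing values depends on the data; the median fill here is only a placeholder choice.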

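To make slides 26 to 38 concrete, here is a rough sketch that fits a single decision tree and a random forest with scikit-learn. It runs on synthetic data, not the workshop's NBA table: the three features and the "vote share" style target are invented, and regressors are used because slide 25 frames the task as predicting a value. For scikit-learn regressors, `.score()` returns R² rather than accuracy.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in: the target depends mostly on feature 0, a little on feature 1.
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# An unpruned tree can memorize the training set (overfitting).
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
# A forest averages many trees, each grown on a bootstrap sample of the rows.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_train_r2 = tree.score(X_train, y_train)
tree_test_r2 = tree.score(X_test, y_test)
forest_test_r2 = forest.score(X_test, y_test)
importances = tree.feature_importances_

print("tree   train R^2:", round(tree_train_r2, 3))
print("tree   test  R^2:", round(tree_test_r2, 3))
print("forest test  R^2:", round(forest_test_r2, 3))
print("tree feature importances:", importances.round(3))
```

The gap between the tree's training and test scores is the overfitting described on slide 37; the forest typically narrows it, though by how much depends on the data.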
Editor's Notes

  1. 80-20 rule: 80% of a typical data science project is sourcing, cleaning, and preparing the data, while the remaining 20% is actual data analysis. Data preparation is a surprisingly time-consuming task. What we’re seeing now is an increased number of data analysts who work on cleaning data to free up data scientists’ time.
  2. Let's start by loading the training and test sets into your Python environment. You will use the training set to build your model, and the test set to validate it. The data is stored as CSV files. You can load it with the read_csv() function from the pandas library.
  3. Before starting with the actual analysis, it's important to understand the structure of your data.
  4. The decision tree algorithm starts with all the data at the root node and scans all the variables for the best one to split on. Once a variable is chosen, you do the split, go down one level (or one node), and repeat.
  5. A famous example is the Iris data set. Each flower has four features: sepal length and width, and petal length and width.
  6. If we plot it out across two dimensions, we can see that the setosa is in red, versicolor in green and virginica in blue. Imagine each of these dots represents a training point, something I’ve told a computer about. Then I show that computer the gray dot and ask what it is. What should the computer predict? Imagine this same concept taking place in three dimensions. Or more! The more data we have, the better we can teach the computer how to do various things.
  7. In January 2016, Thinkful became the first online bootcamp to publish a jobs report. And now we’re the first one to use a 3rd-party auditor to ensure our data is accurate and our methods are applied as advertised. We’ve seen 92% of our graduates land jobs as developers within 4 months of graduation. Our students generally move into full-time, salaried positions as developers or engineers. They work at startups and also larger, more established companies in several industries. We published the report because we want to give students the tools they need to make an informed decision about the programming school they attend. Education requires trust, and transparency builds it. Until now, students choosing a bootcamp have had to take a leap of faith that schools are honest, their numbers up to date, and the results accurate. That's not sustainable and we hope it stops. Feel free to take a look on our website if you’d like to see all the data and the audit report. 1. Job placement stats. Audited stats. We are the only bootcamp that publishes monthly job stats. One of the only bootcamps in the nation that has these stats verified by a third party. 2. 32% of flexible bootcamps. Whenever a student withdraws. Overall the most common reason is a change in schedule or financial ability. We try to address the first one. Over 60% are full-time. Outside of our control. Full-time is 85% grad rate. In Atlanta, we’ve yet to have someone drop out.
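
Note 4 above describes how a tree scans every variable for the best split. The sketch below shows one simple way to do that for a regression target, scoring each candidate threshold by the weighted variance of the two resulting groups. The function name `best_split` and the toy arrays are hypothetical, and library implementations such as scikit-learn's are considerably more elaborate.

```python
import numpy as np

def best_split(X, y):
    """Return (feature_index, threshold, weighted_variance) for the best single split."""
    n_samples, n_features = X.shape
    best = (None, None, np.var(y))  # baseline: variance with no split at all
    for j in range(n_features):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2:
            left = y[X[:, j] <= t]
            right = y[X[:, j] > t]
            score = (len(left) * np.var(left) + len(right) * np.var(right)) / n_samples
            if score < best[2]:
                best = (j, t, score)
    return best

# Toy data: feature 0 separates the targets cleanly, feature 1 does not.
X = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

feature, threshold, impurity = best_split(X, y)
print(feature, threshold, impurity)
```

A full tree would apply the same search again to the rows on each side of the chosen threshold, stopping when a node is pure or too small to split.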