Can IMDB Predict TV Seasons?
Xuan Qi
OBJECTIVE
• Predict the number of TV seasons using information scraped from the IMDB website
6/11/2018Project2
WORKFLOW
Scrape Data -> Data Cleaning -> Data Exploration -> Build Model
DATA SCRAPING
Scrape
Features:
o Popularity
o Title
o Year
o Licensing
o Rating
o Runtime
o Genre
o Votes
o URL link
Scrape
Predict:
Season
Episodes
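The scraping step can be sketched as follows. This is only a minimal illustration using Python's standard `html.parser`: the `SAMPLE_HTML` snippet, its class names, and the `FieldParser` helper are hypothetical and do not reflect IMDB's actual page markup or the scraper used in this project.

```python
from html.parser import HTMLParser

# Hypothetical markup for one series; NOT the real IMDB page structure.
SAMPLE_HTML = """
<div class="series">
  <span class="title">Example Show</span>
  <span class="year">2012</span>
  <span class="rating">8.1</span>
  <span class="votes">12,345</span>
</div>
"""


class FieldParser(HTMLParser):
    """Collect the text of <span> tags whose class names a known field."""

    FIELDS = {"title", "year", "rating", "votes"}

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current = None  # field whose text is being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in self.FIELDS:
            self._current = cls

    def handle_data(self, data):
        if self._current and data.strip():
            self.record[self._current] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


parser = FieldParser()
parser.feed(SAMPLE_HTML)
record = parser.record

# Convert numeric fields from text; votes use thousands separators here.
record["rating"] = float(record["rating"])
record["votes"] = int(record["votes"].replace(",", ""))
print(record)
```

In a real run the HTML would come from the fetched page for each URL link, and fields that are missing on a page would simply be absent from `record`.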
DATA SCRAPING AND CLEANING
• Of 20,000 TV series, only 1,521 have complete information
• 7 features: rating, popularity, runtime, licensing, genre, votes, average
number of episodes per season
• 1 target variable: number of seasons
• Create dummy variables for the categorical features:
• 8 licensing categories (TV-G, TV-Y7, etc.)
• 24 genre categories (Family, Romance, etc.)
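The two cleaning steps above (keeping only complete records and turning categorical fields into dummy columns) might look roughly like this. The sample rows and the helper names `drop_incomplete`, `one_hot`, and `_as_list` are hypothetical; the slides do not show the actual cleaning code.

```python
# Hypothetical scraped rows; None marks a field missing from the page.
rows = [
    {"title": "Show A", "rating": 8.1, "licensing": "TV-14", "genre": ["Drama", "Crime"]},
    {"title": "Show B", "rating": None, "licensing": "TV-G", "genre": ["Family"]},
    {"title": "Show C", "rating": 6.4, "licensing": "TV-G", "genre": ["Family", "Romance"]},
]


def _as_list(value):
    """Treat a single category and a list of categories uniformly."""
    return value if isinstance(value, list) else [value]


def drop_incomplete(rows):
    """Keep only rows in which every field has a value."""
    return [r for r in rows if all(v is not None and v != [] for v in r.values())]


def one_hot(rows, column):
    """Replace `column` with one 0/1 column per category seen in the data."""
    categories = sorted({c for r in rows for c in _as_list(r[column])})
    encoded = []
    for r in rows:
        present = set(_as_list(r[column]))
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}_{c}"] = 1 if c in present else 0
        encoded.append(new_row)
    return encoded


complete = drop_incomplete(rows)
encoded = one_hot(one_hot(complete, "licensing"), "genre")
for row in encoded:
    print(row)
```

A series can carry several genres at once, which is why the genre field is modeled as a list here; that is an assumption about the data layout.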
DATA EXPLORATION
• Correlation matrix shown as a heat map
• 38 by 38
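The numbers behind such a heat map are pairwise correlations between columns. A minimal sketch, assuming Pearson correlation (the slides do not name the coefficient) and using three hypothetical columns in place of the full 38:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return column names and the square matrix of pairwise correlations."""
    names = list(columns)
    matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical values, for illustration only.
columns = {
    "rating": [8.1, 6.4, 7.2, 9.0],
    "votes": [1200, 300, 560, 4100],
    "seasons": [5, 1, 2, 8],
}
names, matrix = correlation_matrix(columns)
for name, row in zip(names, matrix):
    print(name, [round(v, 2) for v in row])
```

A column with no variation (for example a dummy column that is always 0) has zero spread and would need to be dropped before calling `pearson`.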
DATA EXPLORATION
Log transform
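A log transform compresses a long right tail so that a few very large values do not dominate a linear fit. The slide does not say which variables were transformed or which form of logarithm was used; the sketch below assumes a count-like variable such as votes and uses log(1 + x) so that zeros stay defined.

```python
import math


def log_transform(values):
    """Apply log(1 + x) to each value; an assumed form, not confirmed by the slides."""
    return [math.log1p(v) for v in values]


votes = [0, 9, 99, 9999]  # hypothetical, heavily skewed counts
log_votes = log_transform(votes)
print([round(v, 3) for v in log_votes])
```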
Ordinary Linear Model (OLM)
Training data (90%) | Testing data (10%)
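A 90/10 split of the 1,521 complete records can be sketched like this. The function name `train_test_split`, the shuffling, and the fixed seed are assumptions made for illustration; the slides only give the proportions.

```python
import random


def train_test_split(rows, test_fraction=0.1, seed=0):
    """Shuffle a copy of the rows and hold out `test_fraction` of them for testing."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(1521))  # stand-ins for the 1,521 complete records
train, test = train_test_split(rows)
print(len(train), len(test))
```

The test rows are set aside before any model fitting or parameter tuning, so the testing score measures performance on data the model has never seen.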
Model selection
Model                     Testing Score   Training Score
Ordinary linear model     0.261           0.303
2nd-order polynomial      -0.287          0.638          (Over-training!)
Ridge regression          0.264           0.302±0.078
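The slides do not say which score is reported. Assuming it is the coefficient of determination R² (a common default for regression models), the sketch below shows how it is computed and why it can be negative on test data, as in the polynomial row: a model that predicts worse than simply guessing the mean gets a score below zero. All numbers here are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1, 2, 3, 4]
close_predictions = [1.1, 1.9, 3.2, 3.8]   # tracks the data well
wild_predictions = [4, 1, 5, 0]            # worse than predicting the mean
mean_predictions = [2.5, 2.5, 2.5, 2.5]    # always predicts the mean

print(r2_score(y_true, close_predictions))
print(r2_score(y_true, wild_predictions))
print(r2_score(y_true, mean_predictions))
```

A high training score paired with a negative testing score is the usual signature of over-training: the extra polynomial terms fit noise in the training set that does not carry over to new data.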
Ridge Regression
Selecting tuning parameter
Training data (90%) | Testing data (10%)
Training data: 10-fold cross-validation, one fold held out for validation
Best tuning parameter: 5.3
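Choosing the ridge tuning parameter by 10-fold cross-validation can be sketched as below. To stay short, this uses a single feature and plain Python; the project's model uses all features, and the data, the candidate grid `alphas`, and the function names are hypothetical. The value 5.3 appears in the grid only because the slide reports it; nothing here reproduces that result.

```python
def fit_ridge_1d(x, y, alpha):
    """Ridge fit with one feature and an unpenalized intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    return slope, mean_y - slope * mean_x


def mean_squared_error(x, y, slope, intercept):
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y)) / len(x)


def cv_error(x, y, alpha, k=10):
    """Average validation error over k folds for one value of alpha."""
    n = len(x)
    errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th point forms one fold
        x_train = [x[i] for i in range(n) if i not in held_out]
        y_train = [y[i] for i in range(n) if i not in held_out]
        x_val = [x[i] for i in sorted(held_out)]
        y_val = [y[i] for i in sorted(held_out)]
        slope, intercept = fit_ridge_1d(x_train, y_train, alpha)
        errors.append(mean_squared_error(x_val, y_val, slope, intercept))
    return sum(errors) / k


def select_alpha(x, y, alphas, k=10):
    """Return the candidate alpha with the lowest cross-validated error."""
    return min(alphas, key=lambda alpha: cv_error(x, y, alpha, k))


# Hypothetical training data: a noisy line.
x = [float(i) for i in range(50)]
y = [0.5 * xi + ((i * 7) % 5 - 2) * 0.3 for i, xi in enumerate(x)]
alphas = [0.01, 0.1, 1.0, 5.3, 10.0, 100.0]
best_alpha = select_alpha(x, y, alphas)
print(best_alpha)
```

Only the training portion is passed to `select_alpha`; the held-out testing data is used once, after the parameter has been fixed.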
Ridge Regression
model performance
Training data (90%) | Testing data (10%)
Training data: 10-fold cross-validation, one fold held out for validation
Training Score: 0.255±0.073
Testing Score: 0.270
MODEL
Rating Votes Popularity Licensing Runtime Genre Avg Episode
0.20 0.05 -0.12 -0.05 0.13
TV-14   TV-G   TV-MA   TV-PG    TV-R    TV-Y   TV-     UNRATED
-0.20   0.26   -0.25   -0.005   -0.05   0.10   0.005   0.13
Not Suitable
For Children
FUTURE WORK
• Cross-reference rankings of actors and directors
• Delete “outliers”
• Cross-reference TV budgets if possible
• Time series analysis
THANK YOU
Questions?


Editor's Notes

  1. It has run through 10 seasons since 2017. Some of the TV series have more than one season, such as …
  2. 2012
  3. The TV Parental Guidelines are a television content rating system in the United States. Sexual content, graphic violence.