Product Analysis using YouTube comments
Gaurav Sawant
Advisor: Amir H. Gandomi
Methodology & Results
1. Latent Dirichlet Allocation (LDA) performed on all the
comments to assign topics after removing stopwords
2. Test sets labeled using the decided topics
3. LDA performed again with test set to check performance
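One way to read step 3 is as a comparison between each test comment's most probable topic and its manual label. The sketch below assumes the per-comment topic probabilities have already been produced by a fitted LDA model; the function names and the example numbers are hypothetical and are not taken from the poster.

```python
from typing import Sequence


def dominant_topic(doc_topic_probs: Sequence[float]) -> int:
    """Return the index of the highest-probability topic for one comment."""
    return max(range(len(doc_topic_probs)), key=lambda k: doc_topic_probs[k])


def topic_accuracy(doc_topic_matrix: Sequence[Sequence[float]],
                   manual_labels: Sequence[int]) -> float:
    """Share of test comments whose dominant topic matches the manual label."""
    if len(doc_topic_matrix) != len(manual_labels):
        raise ValueError("need exactly one manual label per comment")
    if not manual_labels:
        raise ValueError("test set is empty")
    hits = sum(
        dominant_topic(row) == label
        for row, label in zip(doc_topic_matrix, manual_labels)
    )
    return hits / len(manual_labels)


# Hypothetical topic distributions for four labeled test comments (k = 2).
example_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
example_labels = [0, 1, 1, 1]
accuracy = topic_accuracy(example_probs, example_labels)
```

LDA topic indices are arbitrary, so the manual labels have to use the same numbering as the fitted model for this comparison to be meaningful.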
Honda Accord topics: Topic 0: Competitors/Engine; Topic 1: Looks/Manual
[Figures: Toyota Camry topics; Nissan Altima topics]
Perplexity is the measure used to decide k (the number of topics) for topic modelling
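As a rough illustration of how perplexity can guide the choice of k: perplexity is the exponential of the negative average log-likelihood per token, so the k with the lowest held-out perplexity is preferred. The sketch below assumes the held-out log-likelihood for each candidate k has already been computed by whatever LDA implementation is in use; the function names and the numbers are hypothetical.

```python
import math
from typing import Dict


def perplexity(total_log_likelihood: float, n_tokens: int) -> float:
    """exp(-(log-likelihood) / (number of tokens)); lower is better."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return math.exp(-total_log_likelihood / n_tokens)


def pick_k(heldout_log_likelihood_by_k: Dict[int, float], n_tokens: int) -> int:
    """Return the candidate k with the lowest held-out perplexity."""
    return min(
        heldout_log_likelihood_by_k,
        key=lambda k: perplexity(heldout_log_likelihood_by_k[k], n_tokens),
    )


# Hypothetical held-out log-likelihoods for k = 2, 3, 4 on 1000 tokens.
example_scores = {2: -7000.0, 3: -6800.0, 4: -6900.0}
best_k = pick_k(example_scores, n_tokens=1000)
```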
LDA training/test results:
[Figures: Honda Accord; Toyota Camry; Nissan Altima]
Sentiment Analysis using VADER
[Figures: overall sentiment for Accord, Camry and Altima]
[Figures: sentiment by topic for Accord, Camry and Altima]
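The overall and by-topic views can be sketched as a simple aggregation. VADER returns a compound score in [-1, 1] for each text; the sketch below assumes those scores and the topic assignments are already available, and uses the conventional ±0.05 cut-offs for positive and negative. The function names and the example values are hypothetical, and the poster does not state which thresholds were used.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def label_sentiment(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score to positive / negative / neutral."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def mean_compound_by_topic(
    scored_comments: Iterable[Tuple[int, float]]
) -> Dict[int, float]:
    """Average the compound score over the comments assigned to each topic."""
    by_topic = defaultdict(list)
    for topic, compound in scored_comments:
        by_topic[topic].append(compound)
    return {topic: mean(scores) for topic, scores in by_topic.items()}


# Hypothetical (topic, compound score) pairs for four comments.
example_scored = [(0, 0.5), (0, -0.1), (1, 0.0), (1, 0.04)]
sentiment_by_topic = mean_compound_by_topic(example_scored)
```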
Introduction: Problem
• Product reviews are among the most-watched kinds of YouTube video
• Is it possible for companies to get insights about customer
sentiment towards products by using YouTube comments?
• Can mining textual reviews provide any useful insights?
Business Value:
• Get direct access to customer views on an open platform
• YouTube comments are longer than tweets and may pack more
useful information
Conclusion
• Mining textual reviews can provide customer insights
• Feedback about specific features can be obtained. Example:
CVT transmission in Altima
• Stopwords, min_df, max_df help produce better separation of
topics while using LDA
• Better separation & a greater number of topics are desired &
could help produce better results
• Labelling the test data is a time-consuming job and requires
some domain knowledge
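To illustrate why min_df and max_df help topic separation: terms that occur in almost every comment (such as the car's name) or in only one comment carry little information about topics, so dropping them leaves a vocabulary that discriminates better. The sketch below is a simplified standard-library version of that idea, with min_df as an absolute document count and max_df as a proportion of documents. It is not the vectorizer used for the poster, and the example comments are made up.

```python
from collections import Counter
from typing import List, Set


def prune_vocabulary(tokenized_docs: List[List[str]],
                     min_df: int = 2,
                     max_df: float = 0.9) -> Set[str]:
    """Keep terms found in at least min_df documents and in at most
    a max_df share of all documents."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {
        term
        for term, df in doc_freq.items()
        if df >= min_df and df / n_docs <= max_df
    }


# Hypothetical tokenized comments.
example_docs = [
    ["car", "cvt", "transmission"],
    ["car", "cvt", "noise"],
    ["car", "engine", "turbo"],
    ["car", "engine", "looks"],
]
vocab = prune_vocabulary(example_docs, min_df=2, max_df=0.75)
```

Here "car" is dropped because it appears in every comment, and the one-off terms are dropped because they appear in only one.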
Statistical Learning & Analytics
Spring, 2019
Data & Data Processing
Comments on reviews of 3 competing cars from the popular YouTube channel
Redline Reviews, extracted using YouTube’s Data API
• Only original (top-level) comments kept; replies deleted
• Only the comment text column kept; other columns removed
• Honda Accord: published on 10/1/2017, 2155 comments
• Toyota Camry: published on 6/21/2017, 1603 comments
• Nissan Altima: published on 10/12/2018, 1444 comments
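The filtering described above can be sketched against the JSON shape returned by the YouTube Data API's commentThreads.list endpoint, where each thread holds one top-level comment plus optional replies. The sketch assumes the response pages have already been downloaded and parsed into dictionaries; the function name and the sample page are hypothetical, and reading the `textDisplay` field is an assumption about which text field was kept.

```python
from typing import Any, Dict, Iterable, List


def top_level_comment_texts(response_pages: Iterable[Dict[str, Any]]) -> List[str]:
    """Collect only the text of top-level comments, ignoring replies."""
    texts = []
    for page in response_pages:
        for thread in page.get("items", []):
            top = thread["snippet"]["topLevelComment"]["snippet"]
            texts.append(top["textDisplay"])
            # thread.get("replies") is deliberately skipped: replies are dropped.
    return texts


# Hypothetical, heavily trimmed response page with one thread and one reply.
example_page = {
    "items": [
        {
            "snippet": {
                "topLevelComment": {"snippet": {"textDisplay": "Love the new Accord"}}
            },
            "replies": {
                "comments": [{"snippet": {"textDisplay": "Same here"}}]
            },
        }
    ]
}
comments = top_level_comment_texts([example_page])
```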
Other details: All 3 cars are in the same class, are all new
models & are reviewed by the same reviewer
• Removed punctuation marks, converted all text to lower case
and removed stopwords when necessary
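A minimal sketch of this cleaning step, using only the standard library. The stopword set here is a tiny hypothetical stand-in for a real list, and the function name is illustrative. The `remove_stopwords` flag reflects the "when necessary" in the step above.

```python
import string
from typing import List

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = frozenset({"the", "a", "is", "and", "of", "to", "in", "it", "this"})

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(comment: str, remove_stopwords: bool = True) -> List[str]:
    """Lower-case a comment, strip punctuation and optionally drop stopwords."""
    cleaned = comment.lower().translate(_PUNCT_TABLE)
    tokens = cleaned.split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


tokens = preprocess("The CVT in this Altima is LOUD, and it drones!")
```

Deleting punctuation this way also removes apostrophes, so "don't" becomes "dont"; non-ASCII punctuation such as curly quotes is left untouched.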
Some EDA
[Figures: Honda Accord (no stopwords); Toyota Camry (no stopwords); Nissan Altima (no stopwords)]

Car brand analysis using NLP on YouTube comments using Python
