SlideShare a Scribd company logo
1 of 11
Predicting Restaurants Rating and Popularity
based on Yelp Dataset
Presented by : Alin Babu(67)|Nandu O(66)|Liju Thomas(36)
Abstract
Restaurants rating on Yelp becomes an important indicator of their
future. In this project, we focus on predicting ratings and popularity
change of restaurants. With data from Yelp, we use several machine
learning methods including logistic regression and Naive Bayes, to make
relevant predictions. While logistic regression seems to perform better
than the others, predictions from all the methods are far from perfect.
This implies the potential improvement of more data and more suited
methodology.
Project Goals
 To predict ratings of restaurants on Yelp and popularity change based
on restaurant features.
 Project can shed lights on what customers value the most about a
restaurant.
Input
 Uses Yelp dataset
 The input to the algorithm is related feature variables about a
restaurant, such as :
– Restaurant name
– Comfortability
– Date
– Review
 Star Rating
 Comments
Algorithms and Methods
 Logistic Regression
 Naive Bayes
 Multinomial Naive Bayes
Dataset
 The data comes from Yelp Dataset Challenge .
 It includes review data, including text, time and rating.
 From the raw dataset, we select 20000 samples for testing.
 Due to different cultures across cities, we only focus on restaurants in
a particular city and surrounding areas in this project.
Data Pre-processing
• There is Restaurant name, date, comfortability, star rating, comments, review id
 Selecting two valid features manually.
 Star rating
 Comments
• Finding missing values of an attribute using isna().
• Root word extracting from comment rating using the methods,
 Removing punctuations
 Removing stop words
 Stemming
Performance Evaluation
Performance Evaluation
Performance Evaluation
Prediciting restaurant and popularity based on Yelp Dataset - 2

More Related Content

Similar to Prediciting restaurant and popularity based on Yelp Dataset - 2

Text Data Mining and Predictive Modeling of Online Reviews
Text Data Mining and Predictive Modeling of Online ReviewsText Data Mining and Predictive Modeling of Online Reviews
Text Data Mining and Predictive Modeling of Online Reviews
Mark Chesney
 
GEOG 465 Final Project Presentation
GEOG 465 Final Project PresentationGEOG 465 Final Project Presentation
GEOG 465 Final Project Presentation
Jinyang Luo
 
Measuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimMeasuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kim
Jin Young Kim
 

Similar to Prediciting restaurant and popularity based on Yelp Dataset - 2 (20)

BI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyBI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and Strategy
 
Deriving measurable content drivers for effective Content Marketing
Deriving measurable content drivers for effective Content MarketingDeriving measurable content drivers for effective Content Marketing
Deriving measurable content drivers for effective Content Marketing
 
Text Data Mining and Predictive Modeling of Online Reviews
Text Data Mining and Predictive Modeling of Online ReviewsText Data Mining and Predictive Modeling of Online Reviews
Text Data Mining and Predictive Modeling of Online Reviews
 
Context Matters - Shifting from Reporting to Analysis
Context Matters - Shifting from Reporting to AnalysisContext Matters - Shifting from Reporting to Analysis
Context Matters - Shifting from Reporting to Analysis
 
Neil Shapiro Resume Digital Marketing Google Analytics Expert
Neil Shapiro Resume Digital Marketing Google Analytics ExpertNeil Shapiro Resume Digital Marketing Google Analytics Expert
Neil Shapiro Resume Digital Marketing Google Analytics Expert
 
Restaurant Review Sentiment Analysis
Restaurant Review Sentiment AnalysisRestaurant Review Sentiment Analysis
Restaurant Review Sentiment Analysis
 
Study harvard reviews reputation and revenue- the case of yelp
Study   harvard reviews reputation and revenue- the case of yelpStudy   harvard reviews reputation and revenue- the case of yelp
Study harvard reviews reputation and revenue- the case of yelp
 
How to boost your ASO with data analytics?
How to boost your ASO with data analytics?How to boost your ASO with data analytics?
How to boost your ASO with data analytics?
 
Restaurants of Seoul - "likes" prediction report
Restaurants of Seoul - "likes" prediction reportRestaurants of Seoul - "likes" prediction report
Restaurants of Seoul - "likes" prediction report
 
Opticon 2015-Scaling Your Testing Program for Maximum Impact
Opticon 2015-Scaling Your Testing Program for Maximum ImpactOpticon 2015-Scaling Your Testing Program for Maximum Impact
Opticon 2015-Scaling Your Testing Program for Maximum Impact
 
Hypothesis presentation (2)
Hypothesis presentation (2)Hypothesis presentation (2)
Hypothesis presentation (2)
 
Sentiment analysis of Restaurant reviews ppt
Sentiment analysis of Restaurant reviews pptSentiment analysis of Restaurant reviews ppt
Sentiment analysis of Restaurant reviews ppt
 
What is analytics
What is analyticsWhat is analytics
What is analytics
 
Final.Version
Final.VersionFinal.Version
Final.Version
 
GEOG 465 Final Project Presentation
GEOG 465 Final Project PresentationGEOG 465 Final Project Presentation
GEOG 465 Final Project Presentation
 
Making Restaurant Inspections Smarter
Making Restaurant Inspections SmarterMaking Restaurant Inspections Smarter
Making Restaurant Inspections Smarter
 
Measuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimMeasuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kim
 
Empowering Businesses using Yelp Reviews Mining
Empowering Businesses using Yelp Reviews MiningEmpowering Businesses using Yelp Reviews Mining
Empowering Businesses using Yelp Reviews Mining
 
Maxymizely - On-page Conversion Rate Optiimization via A/B testing and Machin...
Maxymizely - On-page Conversion Rate Optiimization via A/B testing and Machin...Maxymizely - On-page Conversion Rate Optiimization via A/B testing and Machin...
Maxymizely - On-page Conversion Rate Optiimization via A/B testing and Machin...
 
AfterPM | Find what's open after hours
AfterPM | Find what's open after hoursAfterPM | Find what's open after hours
AfterPM | Find what's open after hours
 

More from ALIN BABU

More from ALIN BABU (7)

SECRY - Secure file storage on cloud using hybrid cryptography
SECRY - Secure file storage on cloud using hybrid cryptographySECRY - Secure file storage on cloud using hybrid cryptography
SECRY - Secure file storage on cloud using hybrid cryptography
 
Project final report
Project final reportProject final report
Project final report
 
Secry poster
Secry posterSecry poster
Secry poster
 
Secure Cloud Storage
Secure Cloud StorageSecure Cloud Storage
Secure Cloud Storage
 
Secure cloud storage
Secure cloud storageSecure cloud storage
Secure cloud storage
 
iPhone
iPhoneiPhone
iPhone
 
Report
ReportReport
Report
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Recently uploaded (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

Prediciting restaurant and popularity based on Yelp Dataset - 2

  • 1. Predicting Restaurants Rating and Popularity based on Yelp Dataset Presented by : Alin Babu(67)|Nandu O(66)|Liju Thomas(36)
  • 2. Abstract Restaurants rating on Yelp becomes an important indicator of their future. In this project, we focus on predicting ratings and popularity change of restaurants. With data from Yelp, we use several machine learning methods including logistic regression and Naive Bayes, to make relevant predictions. While logistic regression seems to perform better than the others, predictions from all the methods are far from perfect. This implies the potential improvement of more data and more suited methodology.
  • 3. Project Goals  To predict ratings of restaurants on Yelp and popularity change based on restaurant features.  Project can shed lights on what customers value the most about a restaurant.
  • 4. Input  Uses Yelp dataset  The input to the algorithm is related feature variables about a restaurant, such as : – Restaurant name – Comfortability – Date – Review  Star Rating  Comments
  • 5. Algorithms and Methods  Logistic Regression  Naive Bayes  Multinomial Naive Bayes
  • 6. Dataset  The data comes from Yelp Dataset Challenge .  It includes review data, including text, time and rating.  From the raw dataset, we select 20000 samples for testing.  Due to different cultures across cities, we only focus on restaurants in a particular city and surrounding areas in this project.
  • 7. Data Pre-processing • There is Restaurant name, date, comfortability, star rating, comments, review id  Selecting two valid features manually.  Star rating  Comments • Finding missing values of an attribute using isna(). • Root word extracting from comment rating using the methods,  Removing punctuations  Removing stop words  Stemming