Predicting restaurant rating and popularity based on Yelp Dataset - 2
1. Predicting Restaurant Ratings and Popularity
Based on the Yelp Dataset
Presented by: Alin Babu (67) | Nandu O (66) | Liju Thomas (36)
2. Abstract
A restaurant's rating on Yelp has become an important indicator of its
future. In this project, we focus on predicting the ratings and popularity
changes of restaurants. Using data from Yelp, we apply several machine
learning methods, including logistic regression and Naive Bayes, to make
the relevant predictions. While logistic regression seems to perform better
than the others, predictions from all the methods are far from perfect.
This suggests that more data and better-suited methodology could improve
the results.
3. Project Goals
To predict the ratings and popularity changes of restaurants on Yelp
based on restaurant features.
The project can shed light on what customers value most about a
restaurant.
4. Input
Uses the Yelp dataset.
The input to the algorithm is a set of feature variables describing a
restaurant, such as:
– Restaurant name
– Comfortability
– Date
– Review
– Star rating
– Comments
6. Dataset
The data comes from the Yelp Dataset Challenge.
It consists of review data, including text, time, and rating.
From the raw dataset, we select 20,000 samples for testing.
Because dining cultures differ across cities, this project focuses only on
restaurants in one particular city and its surrounding areas.
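The selection step described on this slide could look roughly like the sketch below. It is only an illustration: the record fields (`city`, `stars`, `text`), the city labels, and the helper name `select_samples` are assumptions, not the dataset's actual schema or the presenters' code.

```python
import random

def select_samples(reviews, cities, n=20000, seed=0):
    """Keep reviews from the chosen cities, then draw at most n at random."""
    local = [r for r in reviews if r["city"] in cities]
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(local, min(n, len(local)))

# Toy records; field names and city labels are placeholders (assumptions).
reviews = [
    {"city": "City A", "stars": 4, "text": "Good food"},
    {"city": "Suburb of A", "stars": 2, "text": "Slow service"},
    {"city": "City B", "stars": 5, "text": "Loved it"},
    {"city": "City A", "stars": 3, "text": "Average"},
]
sample = select_samples(reviews, cities={"City A", "Suburb of A"}, n=2)
```

Capping the draw at `min(n, len(local))` keeps the call safe when the chosen area has fewer reviews than the requested sample size.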
7. Data Pre-processing
• The attributes are restaurant name, date, comfortability, star rating, comments, and review ID.
• Two valid features are selected manually:
– Star rating
– Comments
• Missing values of an attribute are found using isna().
• Root words are extracted from the comment text by:
– Removing punctuation
– Removing stop words
– Stemming
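The pre-processing steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the stop-word list is a tiny made-up sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer (such as Porter), and `is_missing` is a standard-library stand-in for the isna() check mentioned on the slide. None of these names come from the original project.

```python
import string

# Tiny illustrative stop-word list (assumption); a real run would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "were", "very", "but", "of", "to", "it"}

def is_missing(value):
    """Stand-in for an isna() check: treats None and blank strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())

def remove_punctuation(text):
    """Drop every ASCII punctuation character."""
    return text.translate(str.maketrans("", "", string.punctuation))

def naive_stem(word):
    """Crude suffix stripping; NOT a real stemming algorithm, just a placeholder."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(comment):
    """Lowercase, remove punctuation, drop stop words, then stem each token."""
    tokens = remove_punctuation(comment.lower()).split()
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

# Toy rows with the two selected features: star rating and comment.
rows = [
    {"stars": 5, "comment": "The waiters were friendly, and the noodles tasted amazing!"},
    {"stars": None, "comment": "Closed."},
    {"stars": 2, "comment": None},
]
# Skip rows where either selected feature is missing.
clean = [
    (r["stars"], preprocess(r["comment"]))
    for r in rows
    if not is_missing(r["stars"]) and not is_missing(r["comment"])
]
print(clean)  # [(5, ['waiter', 'friendly', 'noodl', 'tast', 'amaz'])]
```

The stems are not always dictionary words ("noodl", "tast"); that is typical of stemming, which only aims to map related word forms to a shared token.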