Seattle’s Top 30 Restaurants
based on Yelp and Instagram Data
Janessa Cordeiro | Haneen Al Hassani | Kevin Ho | Jinyang Luo (Regina) | Kanin Sangcharoenvanakul
GEOG 465: GIS Database & Programming
Research Question:
“Is there a relationship between Yelp’s
restaurant ratings and social media
reactions, measured by the number of posts
and “likes” of photographs on
Instagram?”
Method
◇ Collect Yelp Data
◇ Collect Instagram Data
◇ Cleanse & Geocode Data
◇ Determine statistical interpretation of data
◇ Create an interactive web map application
Collecting Yelp Data
◇ Use import.io to scrape the 30 highest-rated restaurants
and the 30 restaurants with the most reviews
◇ Scraped data: restaurant names, street addresses,
ratings, number of reviews, URLs
◇ Select the top 30 restaurants by the product of the two
variables we selected (rating and number of reviews)
◇ Use a Python script together with an
OpenStreetMap server to geocode our primary
data
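The selection step above can be sketched as follows. This is a minimal illustration, not the project's script: the records, the field names, and the choice to deduplicate by name (on the assumption that the two scraped lists may overlap) are all hypothetical.

```python
# Hypothetical records; names and numbers are illustrative, not project data.
restaurants = [
    {"name": "Restaurant A", "rating": 4.5, "reviews": 1200},
    {"name": "Restaurant B", "rating": 5.0, "reviews": 300},
    {"name": "Restaurant C", "rating": 4.0, "reviews": 2000},
    {"name": "Restaurant A", "rating": 4.5, "reviews": 1200},  # appears on both lists
]


def top_n_by_score(records, n=30):
    """Deduplicate by name, score each record as rating * reviews, return the top n."""
    unique = {r["name"]: r for r in records}
    scored = [dict(r, score=r["rating"] * r["reviews"]) for r in unique.values()]
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:n]


top = top_n_by_score(restaurants, n=2)
for r in top:
    print(r["name"], r["score"])
```

Multiplying the two variables rewards restaurants that are both well rated and heavily reviewed, so a five-star restaurant with few reviews ranks below a slightly lower-rated one with many.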
Collecting Instagram Data
◇ Collect Instagram data using the Application Programming
Interface (API)
◇ Obtained data: Instagram photo URLs, number of likes,
date posted, latitudes and longitudes
◇ Clean up the data, count the number of posts and the number
of likes for each restaurant, and multiply them
The API terms of agreement only allow us to obtain data for at most
33 Instagram posts per restaurant.
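The counting step can be sketched as below. It assumes that "number of likes" means the total likes across a restaurant's collected posts; the post records and field names are hypothetical, and no Instagram API call is shown.

```python
from collections import defaultdict

# Hypothetical post records, one per Instagram photo; field names are assumed.
posts = [
    {"restaurant": "Restaurant A", "likes": 10},
    {"restaurant": "Restaurant A", "likes": 25},
    {"restaurant": "Restaurant B", "likes": 40},
]

MAX_POSTS = 33  # per-restaurant cap stated on the slide


def instagram_scores(posts, max_posts=MAX_POSTS):
    """Return {restaurant: post_count * total_likes}, using at most max_posts posts each."""
    counts = defaultdict(int)
    likes = defaultdict(int)
    for p in posts:
        name = p["restaurant"]
        if counts[name] >= max_posts:
            continue  # ignore anything beyond the cap
        counts[name] += 1
        likes[name] += p["likes"]
    return {name: counts[name] * likes[name] for name in counts}


scores = instagram_scores(posts)
print(scores)
```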
Geocoding with Python &
OpenStreetMap Server
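The slide does not say which OpenStreetMap service was used. The sketch below assumes the public Nominatim search endpoint and Python's standard library only; the function names and the `user_agent` value are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Nominatim search service.
NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def build_query(address):
    """Build a search URL asking for a single JSON result."""
    params = {"q": address, "format": "json", "limit": 1}
    return NOMINATIM_URL + "?" + urllib.parse.urlencode(params)


def parse_first_hit(body):
    """Return (lat, lon) as floats from a Nominatim JSON response, or None if empty."""
    hits = json.loads(body)
    if not hits:
        return None
    return float(hits[0]["lat"]), float(hits[0]["lon"])


def geocode(address, user_agent="geog465-demo"):
    """Network call. The public server expects an identifying User-Agent
    and no more than about one request per second."""
    req = urllib.request.Request(build_query(address),
                                 headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_first_hit(resp.read().decode("utf-8"))
```

When geocoding a list of addresses, a `time.sleep(1)` between calls to `geocode` keeps the request rate within the public server's limit.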
Data Visualization
◇ CartoDB
◇ CSS
◇ SPSS Statistics
Yelp Interactive Map
Instagram Interactive Map
Statistical Correlation Analysis

Spearman's rho: Yelp vs. Instagram
  Correlation Coefficient   .490**
  Sig. (2-tailed)           .006
  N                         30
**. Correlation is significant at the 0.01 level (2-tailed).
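The figures above are SPSS output. As an illustration of what Spearman's rho measures, here is a dependency-free sketch that ranks both variables (ties share the mean rank) and takes the Pearson correlation of the ranks. The sample scores are made up and do not reproduce the .490 result.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for five restaurants (not project data).
yelp_scores = [1, 2, 3, 4, 5]
instagram_scores_demo = [2, 1, 4, 3, 5]
rho = spearman_rho(yelp_scores, instagram_scores_demo)
print(round(rho, 3))
```

Because only ranks enter the calculation, the statistic is insensitive to the very different scales of the Yelp and Instagram scores.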
Conclusion
◇ Spearman’s Rank Correlation Coefficient (rho) = 0.49
■ Statistically significant, moderate positive
correlation
■ Significant at the 0.01 level (99% confidence level)
◇ Possible Hypotheses:
■ demographic and behavioral differences between
users on Yelp and Instagram
Limitations
◇ Instagram API (sampling issues)
◇ Social media data changes rapidly
■ Yelp contains long-term data covering months or
years
■ Instagram yields at most 33 posts
per restaurant, captured in real time
Bibliography
Data Sources:
1. Yelp (2015)
2. Instagram API (2015)
Base Maps:
1. CartoDB
2. OpenStreetMap
Q&A
