1. Seattle’s Top 30 Restaurants
based on Yelp and Instagram Data
Janessa Cordeiro | Haneen Al Hassani | Kevin Ho | Jinyang Luo (Regina) | Kanin Sangcharoenvanakul
GEOG 465: GIS Database & Programming
2. Research Question:
“Is there a relationship between Yelp’s
restaurant ratings and social media
reactions, as measured by the number of posts
and “likes” of photographs on
Instagram?”
3. Method
◇ Collect Yelp Data
◇ Collect Instagram Data
◇ Cleanse & Geocode Data
◇ Analyze the data statistically
◇ Create an interactive web map application
4. Collecting Yelp Data
◇ Use import.io to scrape the 30 highest-rated
restaurants and the 30 restaurants with the most reviews
◇ Scraped data: restaurant names, street addresses,
ratings, number of reviews, URLs
◇ Select the top 30 restaurants by the product
of the two variables (rating × number of reviews)
◇ Use a combination of a Python script and an
OpenStreetMap server to geocode our primary
data
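The selection step above can be sketched as follows. This is a minimal illustration, not the project's actual script: the sample rows, the field names (`name`, `rating`, `reviews`), and the de-duplication of restaurants that appear in both scraped lists are all assumptions.

```python
# Hypothetical sample rows; field names are assumptions for illustration only.
restaurants = [
    {"name": "A", "rating": 4.5, "reviews": 1200},
    {"name": "B", "rating": 5.0, "reviews": 300},
    {"name": "C", "rating": 4.0, "reviews": 2000},
    {"name": "A", "rating": 4.5, "reviews": 1200},  # found in both scraped lists
]


def top_by_score(rows, n=30):
    """Score each restaurant as rating * reviews and return the top n."""
    # Assumption: a restaurant in both lists should be counted only once.
    unique = {row["name"]: row for row in rows}
    scored = [
        dict(row, score=row["rating"] * row["reviews"])
        for row in unique.values()
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:n]


top = top_by_score(restaurants, n=2)
for row in top:
    print(row["name"], row["score"])
```

Multiplying the two variables rewards restaurants that are both highly rated and heavily reviewed, so a 5-star restaurant with very few reviews does not outrank a slightly lower-rated one with many.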
5. Collecting Instagram Data
◇ Collect Instagram data using the Application Programming
Interface (API)
◇ Obtained data: Instagram photo URLs, number of likes,
date posted, latitudes and longitudes
◇ Clean the data, count the number of posts and the number
of likes for each restaurant, and multiply them
The API terms of agreement only allow us to obtain data for at most
33 Instagram posts per restaurant.
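A rough sketch of the counting step on this slide, assuming the posts have already been downloaded and matched to a restaurant. The record layout (`restaurant`, `likes`) and the function name are hypothetical; the 33-post cap mirrors the API limit mentioned above.

```python
from collections import defaultdict

# Hypothetical post records; the real API response has more fields.
posts = [
    {"restaurant": "A", "likes": 10},
    {"restaurant": "A", "likes": 5},
    {"restaurant": "B", "likes": 40},
]


def instagram_scores(posts, max_posts=33):
    """Return {restaurant: number_of_posts * total_likes}."""
    counts = defaultdict(int)
    likes = defaultdict(int)
    for post in posts:
        name = post["restaurant"]
        # Ignore anything beyond the per-restaurant cap.
        if counts[name] >= max_posts:
            continue
        counts[name] += 1
        likes[name] += post["likes"]
    return {name: counts[name] * likes[name] for name in counts}


scores = instagram_scores(posts)
print(scores)
```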
11. Conclusion
◇ Spearman’s Rank Correlation Coefficient (rho) = 0.49
■ Statistically significant, moderately positive
correlation
■ Significant at the 99% confidence level
◇ Possible Hypotheses:
■ demographic and behavioral differences between
users on Yelp and Instagram
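For reference, Spearman's rho is the Pearson correlation computed on the ranks of the two variables. The sketch below uses only the Python standard library and made-up numbers. It is not the project's own analysis and does not reproduce the 0.49 result. The function names are illustrative.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up scores for five restaurants (Yelp score, Instagram score).
yelp = [1, 2, 3, 4, 5]
instagram = [2, 1, 4, 3, 5]
rho = spearman_rho(yelp, instagram)
print(round(rho, 2))
```

Because the statistic depends only on rank order, it suits scores like these, where the Yelp and Instagram values sit on very different scales.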
12. Limitations
◇ Instagram API (sampling issues)
◇ Social media data changes rapidly
■ Yelp contains long-term data spanning months or
years
■ Instagram returns at most 33 posts
per restaurant, in real time