Starbucks Customer Ratings Analysis in Manhattan
1. Customer Opinions Analysis for Starbucks on Yelp
Web Analytics, Fall 2014
Professor Yilu Zhou
ISGB 7978
Team: Yixi Zhang, Xiaoshan Jin, Yi Chun Chien, Yi Ting Kao
2. Agenda
1. Problem Statement
2. Project Design
3. Stage 1 - Analytic Pre-define
4. Stage 2 - Unstructured Data Analysis
• Correlation Analysis (Overall rating)
• How Rating Differs from Location (Overall rating)
• Feature Selection (Low rating)
• Python Feature Counts Algorithm (Low rating)
• Definition of Top Bad Performance Areas (Low rating)
• Analytics – Manhattan Visualization (Low rating)
5. Analytics Summary & Recommendation
3. Problem Statement
There are 212 Starbucks stores in Manhattan. The average rating on
Yelp is 2.8 stars. Some reviews give a low rating of 1 to 2 stars.
Project Goal:
Find out the factors causing Starbucks stores’ bad performance to
ensure the highest level of customer satisfaction.
5. Stage 1 - Analytic Pre-define
• Platform & Tool Selection: Python, Content Analyzer and JMP
• Data collection:
• Use Python to crawl reviews of 176 Starbucks stores on Yelp
• Variables: Store location, user location, user comment, user rating
• Reviews Distribution:
- Total review number: 3052
- Average Rating: 2.8
- 74% of customers from NY; 26% from other places
• Pre-defined Complaint Categories:
• Product, Service, Waiting-time & Environment
[Charts: Review # by User Rating; Review # by User Location]
6. Stage 2 - Correlation Analysis (Overall rating)
• Target variable: User Rating Group: High (4 or 5 stars) vs. Low (1 or 2 stars)
• Independent Variable: Store Area, User Location, Comment Length
• Use a Goodness-of-Fit Test to check the correlation between the target and each independent variable
- Result: Comment Length and Store Location are both significantly correlated with User Rating Group
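As a rough illustration of how an association between store area and rating group can be tested, the sketch below computes a Pearson chi-square statistic for a two-way table in plain Python. This is a stand-in for the Goodness-of-Fit test the team ran in JMP, not a reproduction of it: the area names and counts are hypothetical, and the 5% critical value applies only to one degree of freedom.

```python
# Hypothetical counts, NOT the project's data: rows are store areas,
# columns are rating groups (Low = 1-2 stars, High = 4-5 stars).
observed = {
    "Area A": {"Low": 30, "High": 10},
    "Area B": {"Low": 20, "High": 40},
}

def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a two-way table."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())
    stat = 0.0
    for r in rows:
        for c in cols:
            # Expected count if area and rating group were independent.
            expected = row_totals[r] * col_totals[c] / grand_total
            stat += (table[r][c] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return stat, dof

stat, dof = chi_square_statistic(observed)

# Chi-square critical value at the 5% level for 1 degree of freedom.
CRITICAL_5PCT_DF1 = 3.841
significant = dof == 1 and stat > CRITICAL_5PCT_DF1
print(f"chi-square = {stat:.2f}, dof = {dof}, significant = {significant}")
```

With more than two areas the degrees of freedom grow, so the critical value would need to be looked up for that case. Comment Length is a continuous variable and would need a different test from the one sketched here.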
7. Stage 2 - How Rating Differs from Location (Overall rating)
[Charts: Review #; Rating]
Why is Midtown East rated better than Midtown West when both areas have
similar numbers of reviews and similar user locations?
• Top 3 Bad Areas:
• Lower East Side
• Greenwich Village and SOHO
• Chelsea and Clinton
• Top 3 Good Areas:
• Central Park and Murray Hill
• Lower Manhattan
• Inwood and Washington Heights
[Chart labels: Low Rating > 62%; High Rating > 52%]
8. Stage 2 - Feature Selection (Low rating)
Assumptions:
1) All Low-rating comments express only negative opinions about Starbucks;
2) For each zip code, the index of a feature is defined as (feature count) / (number of bad comments),
so that features can be compared at the zip code level.
Content Analyzer output cleansing: stop-word removal and word stemming.
Finalized Feature list:
Product – (coffee, drink, drinks, cup, latte, tea, iced, milk, food, wrong)
Waiting time – (time, line, minutes, long, wait, slow, waiting, busy)
Environment – (bathroom, small, clean, seating)
Service – (people, service, staff, barista, baristas, rude, cashier, manager, friendly, attitude)
9. Stage 2 - Python Feature Counts Algorithm (Low rating)
Calculation rule:
A review is labeled “1” for a category if any word from that category's feature list occurs in it; otherwise “0”.
• Assess every user review by Product, Service, Waiting time, and Environment features;
• Group all of the feature counts by store location (zip code).
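The calculation rule and the per-zip-code index can be sketched as follows. This is a minimal sketch, not the team's actual script: the function names, the sample reviews, and the zip codes are made up for illustration, and matching is done on exact lowercase words rather than on stemmed Content Analyzer output. The feature lists are the ones from the Feature Selection slide.

```python
import re
from collections import defaultdict

# Feature lists as given on the Feature Selection slide.
FEATURES = {
    "Product": {"coffee", "drink", "drinks", "cup", "latte", "tea",
                "iced", "milk", "food", "wrong"},
    "Waiting time": {"time", "line", "minutes", "long", "wait", "slow",
                     "waiting", "busy"},
    "Environment": {"bathroom", "small", "clean", "seating"},
    "Service": {"people", "service", "staff", "barista", "baristas", "rude",
                "cashier", "manager", "friendly", "attitude"},
}

def label_review(text):
    """Label a review 1 per category if any feature word occurs in it, else 0."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {cat: int(bool(words & feats)) for cat, feats in FEATURES.items()}

def index_by_zip(reviews):
    """reviews: iterable of (zip_code, text) pairs for Low-rating reviews.

    Returns, per zip code, the index (%) of each category:
    100 * (reviews flagged for the category) / (bad reviews in that zip code).
    """
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for zip_code, text in reviews:
        totals[zip_code] += 1
        for cat, flag in label_review(text).items():
            counts[zip_code][cat] += flag
    return {
        z: {cat: 100.0 * counts[z][cat] / totals[z] for cat in FEATURES}
        for z in totals
    }

# Made-up reviews and zip codes, for illustration only.
sample = [
    ("10001", "The line was long and the barista was rude."),
    ("10001", "My latte was wrong."),
    ("10002", "No seating and a dirty bathroom."),
]
result = index_by_zip(sample)
print(result)
```

Because each review contributes at most 1 per category, the index is the share of bad reviews in a zip code that mention that category, which is consistent with the index ranges topping out at 100.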
10. Stage 2 - Definition of Top Bad Performance Areas (Low rating)
Definition Rules (%)
Complaint category    Index Range    Index Median    Top Bad Performance Index Point
Environment           10.71-60       35.36           35
Product               46.43-100      73.21           85
Service               43.75-100      71.88           85
Waiting time          44.44-100      72.22           65
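One way to apply these rules is to flag an area for a complaint category when its index reaches that category's index point. The sketch below assumes exactly that rule (index greater than or equal to the index point), which the slide does not state explicitly. The thresholds come from the table above; the area names and index values are hypothetical.

```python
# Top Bad Performance Index Points (%) from the Definition Rules table.
THRESHOLDS = {"Environment": 35, "Product": 85, "Service": 85, "Waiting time": 65}

def top_bad_categories(index_row, thresholds=THRESHOLDS):
    """Return categories whose index meets or exceeds the threshold.

    The ">=" comparison is an assumed reading of the slide, not a stated rule.
    """
    return [cat for cat, cut in thresholds.items()
            if index_row.get(cat, 0.0) >= cut]

# Hypothetical index values for two made-up areas.
areas = {
    "Area A": {"Environment": 40.0, "Product": 70.0,
               "Service": 90.0, "Waiting time": 60.0},
    "Area B": {"Environment": 20.0, "Product": 85.0,
               "Service": 50.0, "Waiting time": 72.0},
}
flagged = {name: top_bad_categories(row) for name, row in areas.items()}
print(flagged)
```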
11. Analytics Summary
Manhattan Top Bad Performance Areas
Environment Complaint: Upper West Side; Chelsea and Clinton; Greenwich Village and Soho
Product Complaint: Lower East Side; Central Park and Murray Hill; Upper East Side; Inwood and Washington Heights; Central Harlem
Service Complaint: Lower Manhattan; Upper East Side; Inwood and Washington Heights
Waiting Complaint: Central Park and Murray Hill; Chelsea and Clinton; Lower Manhattan; Inwood and Washington Heights
13. Analytics – Manhattan Visualization (Low rating)
[Maps: Waiting Time Complaint; Service Complaint]
Midtown East has a lower “Service Complaints” rate than Midtown West.
14. Recommendations
To the Manager of the Manhattan area:
1. The common concerns of customers across all Manhattan
areas are long waiting times and bad service.
• Hire more cashiers and baristas based on each store’s
situation (financially efficient).
• Train current employees to provide more professional,
flexible, and efficient high-quality service.
• Establish a reward and penalty system for employees
(attitude, efficiency).
2. Give priority to areas with a high number of reviews but
relatively low ratings, e.g. Downtown and Midtown West.
15. Recommendations
3. Each zip code area should work on its customers' top three
concerns, regardless of its overall rating,
e.g. Inwood and Washington Heights.