Curbing Deceptive Yelp Behaviors

Turning the Tide: Curbing Deceptive
Yelp Behaviors
Mahmudur Rahman
Bogdan Carbunar, Jaime Ballesteros, George Burri, Duen Horng Chau
SIAM SDM 2014

Problem Statement
Detect deceptive behaviors in review centered
online social networks
2

Post review, check-in
Review centered social networks
3

Challenges
• Up to 25% of Yelp reviews are claimed to be fraudulent [3]
• SEO companies offer review campaigns for business owners to
manipulate venues’ ratings
• An extra half-star rating on Yelp causes a restaurant to sell out
19% more often [1], and a one-star increase leads to a 5–9%
increase in revenue [2]
[1] Michael Anderson and Jeremy Magruder. Learning from the crowd: Regression discontinuity estimates
of the effects of an online review database. Economic Journal, 122(563):957–989, 2012.
[2] Michael Luca. Reviews, Reputation, and Revenue: The Case of Yelp.com. Available at
hbswk.hbs.edu/item/6833.html.
[3] Yelp admits a quarter of submitted reviews could be fake. BBC, www.bbc.co.uk/news/technology-
24299742
4

Background
• Yelp’s Review System
• Fraudulent Reviews
• Deceptive Venues
– Consumer Alert

Marco: Identify Deceptive Behaviors
• Automatic detection of fraudulent reviews, deceptive
venues and impactful review campaigns
• Leveraged the wealth of spatial, temporal and social
information provided by Yelp
• Increased the cost and complexity of attacks by
imposing a tradeoff on fraudsters
6

Adversary Model
• Model as a corrupt SEO
– Needs to adjust the rating of V according to a finite budget
– Controls a set of unique (IP address, Yelp Sybil account) pairs
– Has access to a market of review writers, with valid yelp
accounts
Malory
7

Collected Yelp Dataset
• Our data consists of:
• (i) 90 deceptive and 100 legitimate venues
• (ii) 200 fraudulent and 202 genuine reviews, and
• (iii) 7,435 venues and their 270,121 reviews from 195,417
reviewers from San Francisco, New York City and Miami
• Collection process took 3 weeks

YCrawl
• Fetches raw HTML pages of Yelp venue and user
accounts
• Uses a pool of servers, IP proxies & DeathByCaptcha
• Uses breadth-first crawling strategy and stratified
sampling
9

The Data
• Ground truth deceptive venues
– “Consumer Alert” feature
• Gold standard legitimate venues
– Well known consistent quality
• Gold standard fraudulent reviews
– Spelp sites, forums used
• Gold standard genuine reviews
– Remove plagiarized text and reviewer photos
– Discard short, generic reviews
– Give preference to active users

Marco System Overview
11
1. Friend & review count
2. Venue “expertise”
3. Venue activities
4. …
FRI Module
RSD Module
Venue timeline
ARD Module
Review ratings
Venue Classifier
Deceptive &
legitimate venues Features
7,435 venues
195,417 users
270,121 reviews
Train
Train Label
Fraudulent & genuine reviews
Review Classifier
Users Venues
Friend
relations Spatial
vicinity
Reviews/Time

Review Spike Detection (RSD) Module
• Identify venues that receive higher number of positive
(negative) reviews than normal
• Uses the measures of dispersion of Box-and-Whisker
plots to detect outliers
• Outputs two features -
• the number of spikes detected for a venue
• the normalized amplitude of the highest spike
12

Timelines of positive reviews of 3 deceptive venues

Review Spike Detection (RSD) Module (Cont.)
• Assumption: A review campaign is successful if it
increases the rating of the target venue by at least half
a star
• Claim #1: The minimum number of reviews the
adversary needs to post in order to (fraudulently)
increase the rating of a venue by half a star is n/7
14

15
3. Venue activities
4. …
FRI Module
RSD Module
Venue timeline
ARD Module
Review ratings
Venue Classifier
Deceptive &
7,435 venues
195,417 users
270,121 reviews
Train
Train Label
Review Classifier
Users Venues
Friend
relations Spatial
vicinity
Reviews/Time

Aggregate Rating Disparity (ARD) Module
• Measures the divergence of reviews involved in a
campaign from the average rating of the venue
• Calculates the aggregated divergence of that venue
• Outputs one feature - the ARD score
16

Evolution in time of the average rating of the venue “Azure Nail & Waxing
Studio” of Chicago, IL, compared against the ratings assigned by its reviews.

18
3. Venue activities
4. …
FRI Module
RSD Module
Venue timeline Review ratings
Venue Classifier
Deceptive &
7,435 venues
195,417 users
270,121 reviews
Train
Train Label
Review Classifier
Users Venues
Friend
relations Spatial
vicinity
Reviews/Time
ARD Module

Fraudulent Review Impact (FRI) Module
• Challenges:
• Venues that receive few genuine reviews are particularly
vulnerable to review campaigns
• Long term review campaigns can re-define the “normal”
review posting behavior, flatten spikes and escape detection
by the RSD module
• Our Solution: detect such behaviors through fraudulent
reviews that significantly impact the aggregate rating of venues
• Uses features extracted from each review, its reviewer
and the relation (user expertise) between the reviewer
and the target venue
19

Fraudulent Review Impact (FRI) Module
• Uses following features:
• The number of friends
• The number of reviews written
• The expertise of U around V
• The number of check-ins of U at V
• The number of photos of U at V
• The feedback count of R
• Age of U’s account when R was posted
• Contributes two features
• The FRI score
• The percentage of reviews classified as fraudulent
20

21
3. Venue activities
4. …
FRI Module
RSD Module
Venue timeline Review ratings
Venue Classifier
Deceptive &
7,435 venues
195,417 users
270,121 reviews
Train
Train Label
Review Classifier
Users Venues
Friend
relations Spatial
vicinity
Reviews/Time
ARD Module

Venue Classification Features
• Uses following features:
• The number of review spikes for V
• The amplitude of the highest spike
• Aggregate rating disparity
• The fraudulent review impact of V
• Count of reviews classified fraudulent
• The rating of V
• The number of reviews of V
• The number of reviews with check-ins
• The number of reviews with photos
• The age of V
22

Review Classification
• The overall accuracy of RF, Bagging and DT is 94%, 93.5% and 93%
respectively.
• Top 2 most important features are: 1) number of reviews written by
U and 2) The expertise of U around V
23

Venue Classification
RF and DT are tied for best accuracy, of 95.8%.
24

Comparison with state-of-the-art
• Compared Marco with the three deceptive venue
detection strategies of Feng et al. [1], avg∆, distΦ
and peak↑
Strategy Accuracy (%)
Marco/RF 95.8
avg∆ 66.3
distΦ 72.1
peak↑ 58.9

Marco’s Overhead
Per-module overhead Zoom-in of FRI module overhead

Experimental Results on Live Data
City Car Shop Mover Spa
Miami, FL 1000 (6) 348 (8) 1000 (21)
San Frans., CA 612 (59) 475 (45) 1000 (42)
NYC, NY 1000 (8) 1000 (27) 1000 (28)
Detected deceptive venues by Marco out of collected venues in Yelp
27
Marco flags almost 10% of its car repair and moving companies
as suspicious

Our Contributions
28
• Introduce a lower bound on the number of reviews
required to launch a review campaign
• Presented Marco to flag suspicious reviews and
venues
• Contributed a novel dataset of reviews and venues
• Demonstrated that Marco is effective and fast

Future works
29
• Deceptive opinion spam detection (feature-based
sentiment analysis)
• Identify plagiarized reviews and outlier reviews

Curbing Deceptive Yelp Behaviors

Recommended

Recommended

More Related Content

Similar to Curbing Deceptive Yelp Behaviors

Similar to Curbing Deceptive Yelp Behaviors (20)

Recently uploaded

Recently uploaded (20)

Curbing Deceptive Yelp Behaviors

Editor's Notes