Robust Recommendation
Robin Burke
DePaul University
The problem
 Collaborative environments promise us this...
 But how do we know we aren’t getting this...?
Example: Spore
But on Amazon.com
Hmm.
In other words
 Collaborative applications are vulnerable
a user can bias their output by biasing the input
 Because these are public utilities
open access
pseudonymous users
large numbers of sybils (fake copies) can be constructed
Research question
 Is collaborative recommendation doomed?
 That is,
users must come to trust the output of collaborative systems
they will not do so if the systems can be easily biased by attackers
 So,
can we protect collaborative recommender systems from (the most severe forms of) attack?
Denial of insight attack
 Term coined by Whit Andrews, Gartner Research
 Interesting category of vulnerability
 Not denial of service
the application still runs
 But
denial or corruption of the insights it is supposed to provide
Collaborative Recommendation
Identify peers
Generate recommendation
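A minimal sketch of this two-step pipeline, assuming the classic user-based formulation (Pearson correlation to find peers, a similarity-weighted average to predict); the function names are illustrative, and k = 1 matches the "best match" example on the next slides:

```python
import numpy as np

def pearson(a, u):
    """Pearson correlation over the items both users have rated (NaN = unrated)."""
    both = ~np.isnan(a) & ~np.isnan(u)
    if both.sum() < 2:
        return 0.0
    da = a[both] - a[both].mean()
    du = u[both] - u[both].mean()
    denom = np.sqrt((da ** 2).sum() * (du ** 2).sum())
    return float(da @ du / denom) if denom else 0.0

def predict(ratings, active, target, k=1):
    """Step 1: identify peers by correlation; step 2: combine their target ratings."""
    peers = [(pearson(ratings[active], row), row[target])
             for i, row in enumerate(ratings)
             if i != active and not np.isnan(row[target])]
    top = sorted(peers, reverse=True)[:k]          # k best-correlated peers
    den = sum(abs(sim) for sim, _ in top)
    return sum(sim * r for sim, r in top) / den if den else float("nan")
```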
What is an attack?
 Can we distinguish a single profile injected by an attacker from an oddball user?
 Short answer: no
What is an attack?
 An attack is
a set of user profiles added to the system
crafted to obtain excessive influence over the recommendations given to others
 In particular
to make the purchase of a particular product more likely (push attack)
or less likely (nuke attack)
 There are other kinds
but this is the place to concentrate – profit motive
Example Collaborative System
[Table: ratings of Alice and Users 1–7 on six items, with each user’s Pearson correlation to Alice in the last column (ranging from −1.00 to 0.90). User 3 (r = 0.90) is Alice’s best match, so the best match’s rating supplies the prediction for Alice’s unrated Item 6.]
A Successful Push Attack
[Table: the same ratings matrix with three injected attack profiles, each giving the target Item 6 the maximum rating of 5. Attack 3 (r = 0.93) displaces User 3 as Alice’s best match, so the prediction for Item 6 is pushed toward 5.]
Definitions
 An attack is a set of user profiles A and an item t
 such that |A| > 1
 t is the “target” of the attack
 Object of the attack
 let ρt be the rate at which t is recommended to users
 Goal of the attacker
○ either ρ′t >> ρt (push attack)
○ or ρ′t << ρt (nuke attack)
○ ∆ρ = “Hit rate increase”
○ (usually ρt ≈ 0)
 Or alternatively
 let rt be the average rating that the system gives to item t
 Goal of the attacker
○ r′t >> rt (push attack)
○ r′t << rt (nuke attack)
○ ∆r = “Prediction shift”
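A sketch of the two metrics, assuming hypothetical callables `recommends(u, t)` and `pred_before` / `pred_after(u, t)` that expose the system’s output before and after the injection:

```python
def hit_rate(users, t, recommends):
    """rho_t: fraction of users to whom item t is recommended."""
    return sum(recommends(u, t) for u in users) / len(users)

def prediction_shift(users, t, pred_before, pred_after):
    """Delta r: average change in the rating the system predicts for t."""
    return sum(pred_after(u, t) - pred_before(u, t) for u in users) / len(users)

# Delta rho ("hit rate increase") is then:
#   hit_rate(users, t, recommends_after) - hit_rate(users, t, recommends_before)
```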
Approach
 Assume the attacker is interested in maximum impact
for any given attack size k = |A|
want the largest ∆ρ or ∆r possible
 Assume the attacker knows the algorithm
no “security through obscurity”
 What is the most effective attack an informed attacker could make?
reverse engineer the algorithm
create profiles that will “move” the algorithm as much as possible
But
 What if the attacker deviates from the “optimal attack”?
 If the attack deviates a lot
it will have to be larger to achieve the same impact
 Really large attacks can be detected
and defeated relatively easily
more like denial of service
“Box out” the attacker
[Figure: attack impact vs. attack scale. An efficient attack reaches high impact at small scale; an inefficient attack must grow much larger for the same impact, and large-scale attacks of either kind fall into the detectable region.]
Reverse Engineering
 Attacker’s ideal
every real user has enough neighboring attack profiles
that the prediction for the target item is influenced in the right direction
 Assume
the attacker does not have access to the profile database P
the attacker wants to minimize |A|
 Idea
approximate the “average user”
ensure similarity to this average
Basic attacks
 Lam & Riedl, 2004
 Random attack
pick items at random
give them random ratings
give the target item the maximum rating
not very effective
 Average attack
pick items at random
give them ratings equal to each item’s average rating
give the target item the maximum rating
pretty effective
○ but possibly hard to mount
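A sketch of both profile builders, assuming a 1–5 rating scale and an `item_mean` table the attacker has somehow estimated (which is exactly what makes the average attack hard to mount):

```python
import random

SCALE = [1, 2, 3, 4, 5]          # assumed 1-5 rating scale
R_MAX, R_MIN = 5, 1

def random_attack_profile(all_items, target, n_filler):
    filler = random.sample(all_items, n_filler)          # items at random...
    profile = {i: random.choice(SCALE) for i in filler}  # ...with random ratings
    profile[target] = R_MAX                              # maximum rating for the target
    return profile

def average_attack_profile(all_items, target, n_filler, item_mean):
    filler = random.sample(all_items, n_filler)
    profile = {i: item_mean[i] for i in filler}          # rate each at its average
    profile[target] = R_MAX
    return profile
```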
Bandwagon attack
 Build profiles using popular items with lots of raters
frequently-rated items are usually highly-rated items
gets at the “average user” without knowing the data
 Special items are highly popular items
“best sellers” / “blockbuster movies”
can be determined outside of the system
 Almost as effective as the average attack
requires little system-specific knowledge
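A sketch reusing the helpers above; `popular_items` is the list of blockbusters gathered outside the system:

```python
def bandwagon_attack_profile(popular_items, all_items, target, n_filler):
    profile = {i: R_MAX for i in popular_items}          # "blockbusters" rated high
    filler = random.sample(all_items, n_filler)
    profile.update({i: random.choice(SCALE) for i in filler})  # random filler, as before
    profile[target] = R_MAX
    return profile
```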
Typical Results
Item-based recommendation
 Item-based collaborative recommendation
uses collaborative data
but compares items rather than users
 Can be more efficient
and also more robust against the average / bandwagon attacks
an “algorithmic response”
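For contrast, a sketch of the item-based prediction step, reusing the `pearson` helper on item columns; production systems typically use adjusted cosine similarity instead, so this only shows the shape of the computation:

```python
def predict_item_based(ratings, active, target, k=3):
    """Same correlation machinery, applied to item columns instead of user rows."""
    cols = ratings.T                                     # items as vectors of ratings
    sims = [(pearson(cols[target], cols[j]), ratings[active, j])
            for j in range(cols.shape[0])
            if j != target and not np.isnan(ratings[active, j])]
    top = sorted(sims, reverse=True)[:k]                 # k most similar rated items
    den = sum(abs(s) for s, _ in top)
    return sum(s * r for s, r in top) / den if den else float("nan")
```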
Results (basic attacks)
Targeted Attacks
 Not all users are equally “valuable” targets
 The attacker may not want to give recommendations to the “average” user
but rather to a specific subset of users
Segment attack
 Idea
differentially attack users with a preference for certain classes of items
people who have rated the popular items in particular categories
 Can be determined outside of the system
the attacker would know his market
○ “Horror films”, “Children’s fantasy novels”, etc.
Segment attack
 Identify items closely related to the target item
select the most salient (likely to be rated) examples
○ a “Top Ten of X” list
let IS be these items
rate the IS items at Rmax
 These items define the user segment
V = users who have high ratings for the IS items
evaluate ∆ρ over V rather than over all users U
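A sketch of the profile construction; rating the filler items at Rmin follows the usual formulation of this attack and is an assumption here, not something stated on the slide:

```python
def segment_attack_profile(segment_items, all_items, target, n_filler):
    profile = {i: R_MAX for i in segment_items}          # the I_S items at Rmax
    filler = random.sample(all_items, n_filler)
    profile.update({i: R_MIN for i in filler})           # filler at Rmin (assumed)
    profile[target] = R_MAX
    return profile
```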
Results (segment attack)
Nuke attacks
 Interesting result
asymmetry between push and nuke
especially with respect to ∆ρ
it is easy to make something rarely recommended
 Some attacks don’t work
Reverse Bandwagon
 Some very simple attacks work well
Love / Hate Attack
○ love everything, hate the target item
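The corresponding profile builder is almost trivial (again reusing the helpers above):

```python
def love_hate_profile(all_items, target, n_filler):
    profile = {i: R_MAX for i in random.sample(all_items, n_filler)}  # love everything
    profile[target] = R_MIN                                           # hate the target
    return profile
```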
Nuke attack results
Findings
 Possible to craft an effective attack regardless of algorithm
 Possible to craft an effective attack even in the absence of system-specific knowledge
 Relatively small attacks are effective
1% of the profile database for some attacks
smaller if the item is rated sparsely
What to do?
 We can try to keep attackers from creating lots of profiles
pragmatic solution
but what about the sparsity trade-off?
 We can build better algorithms
if we can achieve lower ∆ρ without lower accuracy
algorithmic solution
 We can try to weed out the attack profiles from the database
reactive solution
Other solutions
 Hybrid solution
 use other knowledge sources in addition to collaborative ones
○ helps quite a bit
 Trust solution
 accept recommendations only from people you know
○ do we need collaborative recommendation for this?
 transitivity
○ vs. gullibility?
 recommendation ≠ reputation
 Market solution
 provide incentives for honest disclosure
 problem
○ usually the reward / profit is outside the system’s control
○ can’t build it into a market mechanism
Detection and response
 Goal
classify users into attackers / genuine users
but remember the definition
○ an attacker is a profile that is part of a large group A
 Then ignore A when making predictions
Unsupervised Classification
 Clustering is the basic idea
reduced-dimensional space
attacks cluster together
 Mehta, 2007
PCA compression
identify highly similar users
○ in the lower-dimensional space
works well for the average attack
○ at higher attack sizes
○ > 90% precision and recall
○ computationally expensive
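Mehta’s detector selects suspects via their loadings on the leading principal components; the sketch below exploits the same observation (near-duplicate attack profiles form a tight cluster in the reduced space) with a simpler nearest-neighbor distance score, so treat it as illustrative rather than the published method:

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_suspects(ratings, n_components=3, n_suspects=50):
    """Flag profiles that sit unusually close together in PCA space."""
    # mean-center each profile and zero-fill unrated cells
    X = np.nan_to_num(ratings - np.nanmean(ratings, axis=1, keepdims=True))
    Z = PCA(n_components=n_components).fit_transform(X)
    # score each profile by its mean distance to its 5 nearest neighbors;
    # the pairwise distance matrix is what makes this computationally expensive
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    score = np.sort(d, axis=1)[:, :5].mean(axis=1)
    return np.argsort(score)[:n_suspects]      # smallest distances = most suspicious
```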
Supervised Classification
 Identify characteristic features likely to discriminate between genuine users and attackers
examples
○ profile variance
○ target focus
a total of 25 derived attributes
 Learn a classifier over labeled examples of attacks and genuine data
best results with SVM
 Detection is low-cost
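A sketch of the pipeline; the two attributes computed here are illustrative stand-ins, not the paper’s exact definitions of the 25 features:

```python
import numpy as np
from sklearn.svm import SVC

def derive_features(profile):
    """Two illustrative attributes (the real system derives ~25 of them)."""
    r = np.array(list(profile.values()), dtype=float)
    return [r.var(),                        # profile variance: fillers are often flat
            np.mean((r == 1) | (r == 5))]   # share of extreme ratings

def train_detector(profiles, labels):
    """labels: 1 = attack profile, 0 = genuine user."""
    X = [derive_features(p) for p in profiles]
    return SVC(kernel="rbf").fit(X, labels)  # SVM gave the best results
```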
Methodology
 Divide the ratings database into test data and training data
UT and UR
 Add attacks to UR
UR + AR = UR′
 Train the classifier on UR′
 Test performance against
UT + AT = UT′
where AT uses a different set of target items
Stratified Training
 We want to train against multiple attack types and sizes
AR = A1 + A2 + … + An
AR must be large to include all combinations
but if AR is too big relative to UR
then the derived features are biased
○ attack profiles become “normal”
 Let F(U, u) be the features derived from a profile u in the context of a database U
instead of calculating F(UR′, AR)
calculate F(UR + A1, A1), F(UR + A2, A2), etc.
then combine the resulting features with the training data
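A sketch of the batched feature computation; `F`, `U_R`, and the attack batches are the objects named on the slide, represented here as a callable and plain Python lists:

```python
def stratified_features(U_R, attack_batches, F):
    """Compute F(U_R + A_i, A_i) one batch at a time, so no batch of attack
    profiles is ever large enough to look 'normal' in its own context."""
    X, y = [], []
    for A_i in attack_batches:                  # one batch per attack type & size
        context = U_R + A_i
        X += [F(context, p) for p in A_i]
        y += [1] * len(A_i)                     # 1 = attack
    X += [F(U_R, p) for p in U_R]               # genuine profiles, clean context
    y += [0] * len(U_R)
    return X, y
```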
SVM Results
[Figure: detection results in two panels, nuke attack and push attack. Attacks are essentially neutralized up to a 12% attack size, for both push and nuke; other attack types show similar results.]
Obfuscated Attacks
 What about the middle part of the figure?
how big is the hole?
 Small amounts of deviation from known attack types
esp. using Rmax = 4 instead of 5
do not impact attack effectiveness much
○ about 10–20%
but do reduce the effectiveness of detection
○ about 20%
 System trained only on known types
future work: additional training with a wider range of attacks
[Figure: the impact vs. scale diagram repeated; the “middle part” in question is the region between the efficient and inefficient attack curves.]
Where are we?
 Attacks work well against all standard collaborative recommendation algorithms
 What to do
Use e-commerce common sense
○ protect accounts, if applicable
○ monitor the system, check up on customer complaints
Hide your ratings distribution
Use additional knowledge sources if you can
○ hybrid recommendation
Use model-based recommendation if computationally feasible
Use attack detection
Current Work
 Other recommender-like systems
Esp. tagging systems
Does tag spam look like profile injection?
How to characterize / defend against it?
 Self-protection / dynamics
Evolution of rating data
Interaction with
○ user / item quarantining
○ attack detection
Tagging systems
 Del.icio.us / flickr.com
 allow users to tag items with arbitrary text labels
 Multi-dimensional labels
 more complex than ratings
 More complex output
 tag -> resources
 resource -> resources
 etc.
 Can we model denial of insight attacks against tagging systems?
 don’t want to look just at a single output modality
 use a PageRank-like metric to evaluate the relative centrality of items
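One way such a metric might be realized; the graph construction here is an assumption, with `networkx` supplying PageRank:

```python
import networkx as nx

def resource_centrality(taggings):
    """taggings: iterable of (user, tag, resource) triples."""
    G = nx.Graph()
    for user, tag, resource in taggings:
        G.add_edge(("user", user), ("res", resource))   # who tagged what
        G.add_edge(("tag", tag), ("res", resource))     # labeled with what
    scores = nx.pagerank(G)                             # one centrality measure...
    return {node[1]: s for node, s in scores.items()    # ...spanning all modalities
            if node[0] == "res"}
```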
Example results
Self-protection
[Architecture diagram connecting: ratings database, attack classifier, detection, rater diversity, user quarantine (new users), and item quarantine (new items).]
Open issues
 Real-time detection
different from static / matrix-based results?
 Handling cold-start items / users
 Handling large-scale, low-impact attacks
Larger question
 Machine learning techniques are widespread
Recommender systems
Social networks
Data mining
Adaptive sensors
…
 Systems learning from open, public input
How do these systems function in an adversarial environment?
Will similar approaches work for these algorithms?
Questions