Robin Burke
DePaul University
The problem
 Collaborative
environments promise us
this...
 But how do we know we
aren’t getting this...?
Example: Spore
But on Amazon.com
Hmm.
In other words
 Collaborative applications are vulnerable
a user can bias their output
by biasing the input
 Because these are public utilities
open access
pseudonymous users
large numbers of sybils (fake copies) can be
constructed
Research question
 Is collaborative recommendation doomed?
 That is,
Users must come to trust the output of
collaborative systems
They will not do so if the systems can be easily
biased by attackers
 So,
Can we protect collaborative recommender
systems from (the most severe forms of) attack?
Denial of insight attack
 Term coined by Whit Andrews, Gartner
Research
 Interesting category of vulnerability
 Not denial of service
the application still runs
 But
denial or corruption of the insights it is
supposed to provide
Collaborative Recommendation
Identify peers
Generate recommendation
What is an attack?
 Can we distinguish a single profile
injected by an attacker from an oddball
user?
 Short answer: no
What is an attack?
 An attack is
a set of user profiles added to the system
crafted to obtain excessive influence over the
recommendations given to others
 In particular
to make the purchase of a particular product
more likely (push attack)
or less likely (nuke attack)
 There are other kinds
but this is the place to concentrate – profit
motive
Example Collaborative System
Item 1  Item 2  Item 3  Item 4  Item 5  Item 6  Correlation with Alice
Alice 5 2 3 3 ?
User 1 2 4 4 1 -1.00
User 2 2 1 3 1 2 0.33
User 3 4 2 3 2 1 0.90
User 4 3 3 2 3 1 0.19
User 5 3 2 2 2 -1.00
User 6 5 3 1 3 2 0.65
User 7 5 1 5 1 -1.00
[Slide callouts: "Best match", "Prediction"]
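The two steps (identify peers, generate recommendation) can be sketched roughly as follows. This is a minimal illustration, assuming Pearson correlation over co-rated items and a mean-centered prediction from the k best positively correlated peers. The profiles, item names, and function names are hypothetical and are not the values from the slide.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict(profiles, user, item, k=1):
    """Predict a rating from the k most similar peers who rated the item."""
    target = profiles[user]
    peers = []
    for name, profile in profiles.items():
        if name != user and item in profile:
            sim = pearson(target, profile)
            if sim > 0:  # assumption: ignore negatively correlated peers
                peers.append((sim, name))
    peers = sorted(peers, reverse=True)[:k]
    if not peers:
        return None
    user_mean = sum(target.values()) / len(target)
    num = 0.0
    for sim, name in peers:
        peer = profiles[name]
        peer_mean = sum(peer.values()) / len(peer)
        num += sim * (peer[item] - peer_mean)
    den = sum(sim for sim, _ in peers)
    return user_mean + num / den

# Hypothetical data: u1 agrees with alice, u2 disagrees.
profiles = {
    "alice": {"i1": 5, "i2": 2, "i3": 3, "i4": 3},
    "u1": {"i1": 4, "i2": 1, "i3": 2, "i4": 2, "i6": 1},
    "u2": {"i1": 1, "i2": 4, "i3": 3, "i4": 3, "i6": 5},
}
prediction = predict(profiles, "alice", "i6", k=1)
```

With k = 1 the prediction depends entirely on the single best match, which is why one well-crafted attack profile can displace a genuine peer.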
A Successful Push Attack
Item 1  Item 2  Item 3  Item 4  Item 5  Item 6  Correlation with Alice
Alice 5 2 3 3 ?
User 1 2 4 4 1 -1.00
User 2 2 1 3 1 2 0.33
User 3 4 2 3 2 1 0.90
User 4 3 3 2 3 1 0.19
User 5 3 2 2 2 -1.00
User 6 5 3 1 3 2 0.65
User 7 5 1 5 1 -1.00
Attack 1 2 3 2 5 -1.00
Attack 2 3 2 3 2 5 0.76
Attack 3 3 2 2 2 5 0.93
[Slide callouts: "Prediction", "Best Match"]
Definitions
 An attack is a set of user profiles A and an item t
 such that |A| > 1
 t is the “target” of the attack
 Object of the attack
 let ρ_t be the rate at which t is recommended to users, and ρ'_t the same rate after the attack
 Goal of the attacker
○ either ρ'_t >> ρ_t (push attack)
○ or ρ'_t << ρ_t (nuke attack)
○ ∆ρ = ρ'_t - ρ_t = “hit rate increase”
○ (usually ρ_t ≈ 0)
 Or alternatively
 let r_t be the average rating that the system gives to item t, and r'_t the same average after the attack
 Goal of the attacker
○ r'_t >> r_t (push attack)
○ r'_t << r_t (nuke attack)
○ ∆r = r'_t - r_t = “prediction shift”
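Both measures can be computed from before/after snapshots of the system's output. A minimal sketch, assuming predictions are stored per user as a dict and recommendations as per-user top-N lists. The function names and sample values are hypothetical.

```python
def prediction_shift(before, after):
    """Average change in the predicted rating of the target item (delta r)."""
    users = before.keys() & after.keys()
    if not users:
        return 0.0
    return sum(after[u] - before[u] for u in users) / len(users)

def hit_rate(top_n_lists, target):
    """Fraction of users whose top-N list contains the target item (rho)."""
    if not top_n_lists:
        return 0.0
    hits = sum(1 for items in top_n_lists.values() if target in items)
    return hits / len(top_n_lists)

# Hypothetical predicted ratings for target item "t", before and after an attack.
pred_before = {"u1": 2.0, "u2": 3.0}
pred_after = {"u1": 4.0, "u2": 4.5}

# Hypothetical top-2 recommendation lists, before and after.
lists_before = {"u1": ["a", "b"], "u2": ["b", "c"]}
lists_after = {"u1": ["t", "a"], "u2": ["b", "c"]}

delta_r = prediction_shift(pred_before, pred_after)
delta_rho = hit_rate(lists_after, "t") - hit_rate(lists_before, "t")
```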
Approach
 Assume attacker is interested in maximum
impact
for any given attack size k = |A|
want the largest ∆ρ or ∆r possible
 Assume the attacker knows the algorithm
no “security through obscurity”
 What is the most effective attack an
informed attacker could make?
reverse engineer the algorithm
create profiles that will “move” the algorithm as
much as possible
But
 What if the attacker deviates from the
“optimal attack”?
 If the attack deviates a lot
it will have to be larger to achieve the same
impact
 Really large attacks can be detected
and defeated relatively easily
more like denial of service
“Box out” the attacker
[Figure: impact plotted against scale; labeled regions: efficient attack, inefficient attack, detectable (two regions)]
Reverse Engineering
 Attacker’s ideal
every real user has enough
neighboring attack profiles
that the prediction for the target
item is influenced in the right direction
 Assume
attacker does not have access to profile database P
attacker wants to minimize |A|
 Idea
approximate “average user”
ensure similarity to this average
Basic attacks
 Lam & Riedl, 2004
 Random attack
pick items at random
give them random ratings
give the target item the maximum rating
not very effective
 Average attack
pick items at random
give them ratings = the average rating of these items
give the target item the maximum rating
pretty effective
○ but possibly hard to mount
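The two profile recipes can be sketched as below. This is only an illustration, assuming a 1 to 5 rating scale, uniformly random ratings for the random attack, and item averages rounded to whole ratings for the average attack. The function names and the `filler_size` parameter are hypothetical.

```python
import random

def random_attack_profile(items, target, filler_size, r_min=1, r_max=5, rng=None):
    """Random items with random ratings; the target gets the maximum rating."""
    rng = rng or random.Random()
    candidates = [i for i in items if i != target]
    fillers = rng.sample(candidates, filler_size)
    profile = {i: rng.randint(r_min, r_max) for i in fillers}
    profile[target] = r_max
    return profile

def average_attack_profile(item_means, target, filler_size, r_max=5, rng=None):
    """Random items rated at their average; the target gets the maximum rating.

    Needs item_means, which is the system-specific knowledge that can make
    this attack hard to mount.
    """
    rng = rng or random.Random()
    candidates = [i for i in item_means if i != target]
    fillers = rng.sample(candidates, filler_size)
    profile = {i: round(item_means[i]) for i in fillers}
    profile[target] = r_max
    return profile

# Hypothetical catalogue and item averages.
random_profile = random_attack_profile(
    ["a", "b", "c", "d", "t"], "t", filler_size=3, rng=random.Random(0))
average_profile = average_attack_profile(
    {"a": 3.6, "b": 2.2, "t": 1.0}, "t", filler_size=2, rng=random.Random(0))
```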
Bandwagon attack
 Build profiles using popular items with lots of
raters
frequently-rated items are usually highly-rated items
getting at the “average user” without knowing the
data
 Special items are highly popular items
“best sellers” / “blockbuster movies”
can be determined outside of the system
 Almost as effective as Average Attack
little system-specific knowledge
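A bandwagon profile might be built as in the sketch below. The slide does not say which ratings the popular items receive; giving them the maximum rating and filling the rest of the profile with random ratings is an assumption here, as are the function and parameter names.

```python
import random

def bandwagon_profile(popular_items, other_items, target, filler_size,
                      r_min=1, r_max=5, rng=None):
    """Popular items and the target at r_max, plus randomly rated fillers."""
    rng = rng or random.Random()
    # Assumption: best sellers are rated at the maximum to resemble many users.
    profile = {i: r_max for i in popular_items if i != target}
    pool = [i for i in other_items if i != target and i not in profile]
    for item in rng.sample(pool, filler_size):
        profile[item] = rng.randint(r_min, r_max)
    profile[target] = r_max
    return profile

# Hypothetical item names; the popular list would come from outside the system.
profile = bandwagon_profile(["hit1", "hit2"], ["a", "b", "c"], "t",
                            filler_size=2, rng=random.Random(1))
```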
Typical Results
Item-based recommendation
 Item-based collaborative
recommendation
uses collaborative data
but compares items rather than users
 Can be more efficient
but also more robust against the average /
bandwagon attacks
“algorithmic response”
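For comparison with the user-based scheme, an item-based prediction can be sketched as follows. This assumes plain cosine similarity between item rating vectors and a similarity-weighted average of the user's own ratings, which is one common variant and not necessarily the one evaluated in the talk. All names and data are hypothetical.

```python
from math import sqrt

def item_vector(profiles, item):
    """Ratings of one item, keyed by the users who rated it."""
    return {u: p[item] for u, p in profiles.items() if item in p}

def cosine(x, y):
    """Cosine similarity over the users who rated both items."""
    common = x.keys() & y.keys()
    if not common:
        return 0.0
    dot = sum(x[u] * y[u] for u in common)
    norm_x = sqrt(sum(x[u] ** 2 for u in common))
    norm_y = sqrt(sum(y[u] ** 2 for u in common))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)

def predict_item_based(profiles, user, item):
    """Weight the user's own ratings by each rated item's similarity to `item`."""
    target_vec = item_vector(profiles, item)
    num = den = 0.0
    for other, rating in profiles[user].items():
        if other == item:
            continue
        sim = cosine(target_vec, item_vector(profiles, other))
        num += sim * rating
        den += abs(sim)
    return num / den if den else None

# Hypothetical ratings: users b and c rate x, y, z identically.
profiles = {
    "a": {"x": 4, "y": 4},
    "b": {"x": 2, "y": 2, "z": 2},
    "c": {"x": 5, "y": 5, "z": 5},
}
prediction = predict_item_based(profiles, "a", "z")
```

The comparison runs over items rather than users, so an injected profile only shifts a prediction to the extent that it changes item-to-item similarities.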
Results (basic attacks)
Targeted Attacks
 Not all users are equally “valuable”
targets
 Attacker may not want to give
recommendations to the “average” user
but rather to a specific subset of users
Segment attack
 Idea
differentially attack users with a preference
for certain classes of items
people who have rated the popular items in
particular categories
 Can be determined outside of the
system
the attacker would know his market
○ “Horror films”, “Children’s fantasy novels”, etc.
Segment attack
 Identify items closely related to target
item
select most salient (likely to be rated)
examples
○ “Top Ten of X” list
Let IS be these items
fS = Rmax
 These items define the user segment
V = users who have high ratings for IS items
evaluate ∆ρ(v) on V, rather than U
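Restricting the evaluation to the segment can be sketched as below. "High ratings" is interpreted here as every segment item rated at or above a threshold of 4, which is an assumption, as are the function names and sample data.

```python
def segment_users(profiles, segment_items, threshold=4):
    """V: users who rated every segment item at or above the threshold."""
    return {
        user for user, profile in profiles.items()
        if all(profile.get(item, 0) >= threshold for item in segment_items)
    }

def hit_rate_on(users, top_n_lists, target):
    """Hit rate for the target item, measured only over the given users."""
    if not users:
        return 0.0
    hits = sum(1 for u in users if target in top_n_lists[u])
    return hits / len(users)

# Hypothetical profiles; h1 and h2 stand in for the segment items.
profiles = {
    "u1": {"h1": 5, "h2": 4},
    "u2": {"h1": 5, "h2": 2},
    "u3": {"h1": 4, "h2": 5, "x": 1},
}
top_n = {"u1": ["t", "a"], "u2": ["a", "b"], "u3": ["a", "b"]}

segment = segment_users(profiles, ["h1", "h2"])
rate_in_segment = hit_rate_on(segment, top_n, "t")
rate_overall = hit_rate_on(set(profiles), top_n, "t")
```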
Results (segment attack)
Nuke attacks
 Interesting result
asymmetry between push and nuke
especially with respect to ∆ρ
it is easy to make something rarely
recommended
 Some attacks don’t work
Reverse Bandwagon
 Some very simple attacks work well
Love / Hate Attack
○ love everything, hate the target item
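The love/hate recipe is simple enough to state in a few lines. A minimal sketch, assuming a 1 to 5 scale and randomly chosen filler items. The function name and `filler_size` are hypothetical.

```python
import random

def love_hate_profile(items, target, filler_size, r_min=1, r_max=5, rng=None):
    """Nuke profile: maximum rating on fillers, minimum rating on the target."""
    rng = rng or random.Random()
    candidates = [i for i in items if i != target]
    fillers = rng.sample(candidates, filler_size)
    profile = {i: r_max for i in fillers}  # love everything
    profile[target] = r_min                # hate the target item
    return profile

profile = love_hate_profile(["a", "b", "c", "t"], "t", filler_size=2,
                            rng=random.Random(0))
```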
Nuke attack results
Findings
 Possible to craft an effective attack
regardless of algorithm
 Possible to craft an effective attack even
in the absence of system-specific
knowledge
 Relatively small attacks effective
1% for some attacks
smaller if item is rated sparsely
What to do?
 We can try to keep attackers from creating
lots of profiles
pragmatic solution
but the sparsity trade-off?
 We can build better algorithms
if we can achieve lower ∆ρ
without lower accuracy
algorithmic solution
 We can try to weed out the attack profiles
from the database
reactive solution
Other solutions
 Hybrid solution
 use other knowledge sources in addition to collaborative ones
○ helps quite a bit
 Trust solution
 accept recommendations only from people you know
○ do we need collaborative recommendation for this?
 transitivity
○ vs. gullibility?
 recommendation ≠ reputation
 Market solution
 provide incentives for honest disclosure
 problem
○ usually the reward / profit is outside the system’s control
○ can’t build it into a market mechanism
Detection and response
 Goal
classify users into attackers / genuine users
but remember definition
○ An attacker is a profile that is part of a large
group A
 Then ignore A when making predictions
Unsupervised Classification
 Clustering is the basic idea
Reduced dimensional space
Attacks cluster together
 Mehta, 2007
PCA compression
Identify users highly similar
○ In lower-dimensional space
Works well for average attack
○ At higher attack sizes
○ > 90% precision and recall
○ Computationally expensive
Supervised Classification
 Identify characteristic features likely to
discriminate between users and attackers
Example
○ profile variance
○ target focus
Total of 25 derived attributes
 Learn a classifier over labeled examples of
attacks and genuine data
Best results with SVM
 Detection is low-cost
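Feature derivation might look like the sketch below. Only profile variance is taken from the slide; the second feature (mean absolute deviation from item averages) is an illustrative stand-in and not one of the 25 attributes. The resulting feature vectors would then be passed to a classifier such as an SVM, which is not shown.

```python
from statistics import mean, pvariance

def item_means(profiles):
    """Average rating of each item across all profiles in the database."""
    ratings = {}
    for profile in profiles.values():
        for item, rating in profile.items():
            ratings.setdefault(item, []).append(rating)
    return {item: mean(values) for item, values in ratings.items()}

def profile_features(profiles, user):
    """Derive features for one profile in the context of the whole database."""
    profile = profiles[user]
    means = item_means(profiles)
    deviations = [abs(r - means[i]) for i, r in profile.items()]
    return {
        "profile_variance": pvariance(list(profile.values())),
        # Hypothetical extra feature, used only for illustration.
        "mean_abs_deviation": mean(deviations),
    }

# Hypothetical two-user database.
profiles = {"u1": {"a": 1, "b": 5}, "u2": {"a": 3, "b": 3}}
features_u1 = profile_features(profiles, "u1")
features_u2 = profile_features(profiles, "u2")
```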
Methodology
 Divide ratings database into test data
and training data
UT and UR
 Add attacks to UR
UR + AR = UR’
 Train the classifier on UR’
 Test performance against
UT + AT = UT’
where AT uses a different set of target items
Stratified Training
 We want to train against multiple attack types
and sizes
AR = A1 + A2 + … + An
AR must be large to include all combinations
But if AR is too big relative to UR
Then derived features are biased
○ Attack profiles become “normal”
 Let F(U,u) be the features derived from a
profile u in the context of a database U
instead of calculating F(UR’, AR)
calculate F(UR+A1,A1), F(UR+A2,A2), etc.
Then combine resulting features with the training
data
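The stratified derivation can be sketched as follows. `feature_fn(U, u)` plays the role of F(U, u) and is a hypothetical callable. Deriving genuine-user features in the context of UR alone is an assumption, since the slide only says the features are combined with the training data.

```python
def stratified_features(genuine, attack_batches, feature_fn):
    """Build labeled rows, deriving each attack batch's features separately.

    Each batch Ai is evaluated against UR + Ai only, so no single context is
    swamped by attack profiles. Labels: 1 = attack, 0 = genuine.
    """
    rows = []
    for batch in attack_batches:
        context = {**genuine, **batch}  # UR + Ai, assuming no shared user ids
        rows.extend((feature_fn(context, user), 1) for user in batch)
    rows.extend((feature_fn(genuine, user), 0) for user in genuine)
    return rows

# Toy feature: the size of the context database, to make the contexts visible.
def context_size(database, user):
    return len(database)

genuine = {"u1": {"a": 3}, "u2": {"a": 4}}
batches = [{"x1": {"a": 5}}, {"y1": {"a": 5}, "y2": {"a": 5}}]
rows = stratified_features(genuine, batches, context_size)
```

In the toy output each attack row carries the size of its own context (3 or 4) rather than the size of the full combined database (5), which is the point of stratifying.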
SVM Results
[Figure panels: Nuke Attack, Push Attack]
Attacks essentially neutralized up to 12%, both push and nuke.
Other attack types show similar results.
Obfuscated Attacks
 What about the middle part
of the figure?
How big is the hole?
 Small amounts of deviation from known attack
types
esp. using Rmax = 4 instead of 5
do not impact attack effectiveness much
○ About 10-20%
But do reduce effectiveness of detection
○ About 20%
 System trained only on known types
future work: additional training with wider range of
attacks
[Figure: impact plotted against scale; labeled regions: efficient attack, inefficient attack, detectable (two regions)]
Where are we?
 Attacks work well against all standard
collaborative recommendation algorithms
 What to do
Use e-commerce common sense
○ Protect accounts, if applicable
○ Monitor the system, check up on customer complaints
Hide your ratings distribution
Use additional knowledge sources if you can
○ hybrid recommendation
Use model-based recommendation if
computationally feasible
Use attack detection
Current Work
 Other recommender-like systems
Esp. tagging systems
Does tag spam look like profile injection?
How to characterize / defend against it?
 Self-protection / dynamics
Evolution of rating data
Interaction with
○ user / item quarantining
○ attack detection
Tagging systems
 Del.icio.us / flickr.com
 allow users to tag items with arbitrary text labels
 Multi-dimensional labels
 more complex than ratings
 More complex output
 Tag -> resources
 Resource -> resources
 etc.
 Can we model denial of insight attacks against
tagging systems?
 don’t want to look just at a single output modality
 use a PageRank-like metric to evaluate relative centrality
of items
Example results
Self-protection
[Diagram components: Ratings Database, Attack Classifier, User Quarantine (New Users), Item Quarantine (New Items), Rater Diversity Detection]
Open issues
 Real-time detection
different from static / matrix-based results?
 Handling cold-start items / users
 Handling large-scale, low impact attacks
Larger question
 Machine learning techniques widespread
Recommender systems
Social networks
Data mining
Adaptive sensors
…
 Systems learning from open, public input
How do these systems function in an adversarial
environment?
Will similar approaches work for these algorithms?
Questions

Robust recommendation

  • 2. The problem
      • Collaborative environments promise us this...
      • But how do we know we aren’t getting this...?
  • 6. In other words
      • Collaborative applications are vulnerable
          ○ a user can bias their output by biasing the input
      • Because these are public utilities
          ○ open access
          ○ pseudonymous users
          ○ large numbers of sybils (fake copies) can be constructed
  • 7. Research question
      • Is collaborative recommendation doomed?
      • That is,
          ○ Users must come to trust the output of collaborative systems
          ○ They will not do so if the systems can be easily biased by attackers
      • So,
          ○ Can we protect collaborative recommender systems from (the most severe forms of) attack?
  • 8. Denial of insight attack
      • Term coined by Whit Andrews, Gartner Research
      • Interesting category of vulnerability
      • Not denial of service
          ○ the application still runs
      • But
          ○ denial or corruption of the insights it is supposed to provide
  • 10. What is an attack?
      • Can we distinguish a single profile injected by an attacker from an oddball user?
      • Short answer: no
  • 11. What is an attack?
      • An attack is
          ○ a set of user profiles added to the system
          ○ crafted to obtain excessive influence over the recommendations given to others
      • In particular
          ○ to make the purchase of a particular product more likely (push attack)
          ○ or less likely (nuke attack)
      • There are other kinds
          ○ but this is the place to concentrate – profit motive
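  • 12. Example Collaborative System
      • Columns: Item1, Item 2, Item 3, Item 4, Item 5, Item 6, Correlation with Alice (unrated cells omitted)
      • Alice: ratings 5 2 3 3 ?
      • User 1: ratings 2 4 4 1; correlation -1.00
      • User 2: ratings 2 1 3 1 2; correlation 0.33
      • User 3: ratings 4 2 3 2 1; correlation 0.90
      • User 4: ratings 3 3 2 3 1; correlation 0.19
      • User 5: ratings 3 2 2 2; correlation -1.00
      • User 6: ratings 5 3 1 3 2; correlation 0.65
      • User 7: ratings 5 1 5 1; correlation -1.00
      • Callouts: Best match, Prediction
  • 13. A Successful Push Attack
      • Columns: Item1, Item 2, Item 3, Item 4, Item 5, Item 6, Correlation with Alice (unrated cells omitted)
      • Alice: ratings 5 2 3 3 ?
      • User 1: ratings 2 4 4 1; correlation -1.00
      • User 2: ratings 2 1 3 1 2; correlation 0.33
      • User 3: ratings 4 2 3 2 1; correlation 0.90
      • User 4: ratings 3 3 2 3 1; correlation 0.19
      • User 5: ratings 3 2 2 2; correlation -1.00
      • User 6: ratings 5 3 1 3 2; correlation 0.65
      • User 7: ratings 5 1 5 1; correlation -1.00
      • Attack 1: ratings 2 3 2 5; correlation -1.00
      • Attack 2: ratings 3 2 3 2 5; correlation 0.76
      • Attack 3: ratings 3 2 2 2 5; correlation 0.93
      • Callouts: Prediction, Best Match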
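
The two tables above show a best-match prediction before and after attack profiles are injected. The mechanism can be sketched roughly as follows, assuming Pearson correlation over co-rated items and a similarity-weighted average of the top k positive matches. The ratings, the item names, and the helpers `pearson` and `predict` are hypothetical and are not taken from the slides.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation computed over the items both profiles have rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    norm_a = sqrt(sum((a[i] - mean_a) ** 2 for i in common))
    norm_b = sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return cov / (norm_a * norm_b)


def predict(user, others, item, k=1):
    """Similarity-weighted average of `item` ratings from the k best positive matches."""
    scored = [(pearson(user, p), p[item]) for p in others if item in p]
    best = sorted((s for s in scored if s[0] > 0), reverse=True)[:k]
    if not best:
        return None
    return sum(sim * rating for sim, rating in best) / sum(sim for sim, _ in best)


# Hypothetical data, not the ratings from the slide.
alice = {"i1": 5, "i2": 2, "i3": 3, "i4": 3}
genuine = [
    {"i1": 4, "i2": 2, "i3": 3, "i4": 2, "i6": 1},  # similar taste, dislikes i6
    {"i1": 1, "i2": 5, "i3": 3, "i4": 4, "i6": 5},  # opposite taste
]
# An injected profile that correlates strongly with Alice and pushes item i6.
attack = [{"i1": 5, "i2": 2, "i3": 3, "i4": 3, "i6": 5}]

before = predict(alice, genuine, "i6")
after = predict(alice, genuine + attack, "i6")
print(before, after)
```

With k = 1, the injected profile displaces the genuine best match, so the prediction for the target item follows the attacker's rating instead of the genuine neighbor's.
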
  • 14. Definitions
      • An attack is a set of user profiles A and an item t
          ○ such that |A| > 1
          ○ t is the “target” of the attack
      • Object of the attack
          ○ let ρ_t be the rate at which t is recommended to users
          ○ Goal of the attacker: either ρ'_t >> ρ_t (push attack) or ρ'_t << ρ_t (nuke attack)
          ○ ∆ρ = “Hit rate increase” (usually ρ_t ≈ 0)
      • Or alternatively
          ○ let r_t be the average rating that the system gives to item t
          ○ Goal of the attacker: r'_t >> r_t (push attack) or r'_t << r_t (nuke attack)
          ○ ∆r = “Prediction shift”
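
The two measures defined above can be computed directly from a system's output before and after an attack. This is a minimal sketch, assuming predictions are stored per user as numbers and recommendations as per-user lists; the function names and sample values are hypothetical.

```python
def prediction_shift(before, after):
    """Delta r: mean change in the predicted rating of the target item."""
    users = before.keys() & after.keys()
    return sum(after[u] - before[u] for u in users) / len(users)


def hit_rate(recommendations, target):
    """rho_t: fraction of users whose recommendation list contains the target."""
    hits = sum(target in recs for recs in recommendations.values())
    return hits / len(recommendations)


def hit_rate_increase(recs_before, recs_after, target):
    """Delta rho: hit rate after the attack minus hit rate before it."""
    return hit_rate(recs_after, target) - hit_rate(recs_before, target)


# Hypothetical predictions for target item "t", per user.
pred_before = {"u1": 2.0, "u2": 3.0, "u3": 2.5}
pred_after = {"u1": 4.0, "u2": 3.5, "u3": 4.5}

# Hypothetical top-2 recommendation lists, per user.
recs_before = {"u1": ["a", "b"], "u2": ["b", "c"], "u3": ["a", "c"]}
recs_after = {"u1": ["t", "a"], "u2": ["b", "c"], "u3": ["t", "c"]}

shift = prediction_shift(pred_before, pred_after)
increase = hit_rate_increase(recs_before, recs_after, "t")
print(shift, increase)
```

A push attack aims for large positive values of both quantities; a nuke attack aims for negative ones.
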
  • 15. Approach
      • Assume attacker is interested in maximum impact
          ○ for any given attack size k = |A|
          ○ want the largest ∆ρ or ∆r possible
      • Assume the attacker knows the algorithm
          ○ no “security through obscurity”
      • What is the most effective attack an informed attacker could make?
          ○ reverse engineer the algorithm
          ○ create profiles that will “move” the algorithm as much as possible
  • 16. But
      • What if the attacker deviates from the “optimal attack”?
      • If the attack deviates a lot
          ○ it will have to be larger to achieve the same impact
      • Really large attacks can be detected and defeated relatively easily
          ○ more like denial of service
  • 17. “Box out” the attacker
      • [Figure: axes “Scale” and “Impact”; regions labeled “Efficient attack”, “Inefficient attack”, and “Detectable” (two regions)]
  • 18. Reverse Engineering
      • Attacker’s ideal
          ○ every real user has enough neighboring attack profiles that the prediction for the target item is influenced in the right direction
      • Assume attacker does not have access to profile database P
          ○ attacker wants to minimize |A|
      • Idea
          ○ approximate “average user”
          ○ ensure similarity to this average
  • 19. Basic attacks
      • Lam & Riedl, 2004
      • Random attack
          ○ pick items at random
          ○ give them random ratings
          ○ give the target item the maximum rating
          ○ not very effective
      • Average attack
          ○ pick items at random
          ○ give them ratings = the average rating of these items
          ○ give the target item the maximum rating
          ○ pretty effective, but possibly hard to mount
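
The two profile recipes above can be sketched as profile generators. This assumes a 1 to 5 rating scale, uniformly random filler ratings for the random attack, and item averages rounded to whole ratings for the average attack; none of those details are stated on the slide, and the function names are hypothetical.

```python
import random


def random_attack_profile(items, target, n_filler, r_min=1, r_max=5, rng=random):
    """Random filler items with random ratings; target gets the maximum."""
    fillers = rng.sample([i for i in items if i != target], n_filler)
    profile = {i: rng.randint(r_min, r_max) for i in fillers}
    profile[target] = r_max
    return profile


def average_attack_profile(item_means, target, n_filler, r_max=5, rng=random):
    """Random filler items rated at their (rounded) average; target gets the maximum."""
    fillers = rng.sample([i for i in item_means if i != target], n_filler)
    profile = {i: round(item_means[i]) for i in fillers}
    profile[target] = r_max
    return profile


# Hypothetical catalogue and per-item average ratings.
item_means = {"a": 3.8, "b": 2.1, "c": 4.4, "d": 3.2, "e": 1.7, "t": 2.0}
rng = random.Random(0)
random_profile = random_attack_profile(list(item_means), "t", 3, rng=rng)
average_profile = average_attack_profile(item_means, "t", 3, rng=rng)
print(random_profile)
print(average_profile)
```

The average attack needs `item_means`, which is the system-specific knowledge that makes it harder to mount than the random attack.
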
  • 20. Bandwagon attack
      • Build profiles using popular items with lots of raters
          ○ frequently-rated items are usually highly-rated items
          ○ getting at the “average user” without knowing the data
      • Special items are highly popular items
          ○ “best sellers” / “blockbuster movies”
          ○ can be determined outside of the system
      • Almost as effective as Average Attack
          ○ little system-specific knowledge
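
A bandwagon profile can be sketched in the same style. The slide does not say which ratings the popular items or the remaining filler items receive; the sketch below assumes the popular items get the maximum rating (since frequently-rated items are usually highly rated) and the fillers get random ratings. The names and the catalogue are hypothetical.

```python
import random


def bandwagon_attack_profile(items, popular_items, target, n_filler,
                             r_min=1, r_max=5, rng=random):
    """Popular items rated high (assumption), random fillers, target at maximum."""
    profile = {i: r_max for i in popular_items if i != target}
    candidates = [i for i in items if i != target and i not in profile]
    for item in rng.sample(candidates, n_filler):
        profile[item] = rng.randint(r_min, r_max)
    profile[target] = r_max
    return profile


# Hypothetical catalogue; the "popular" list comes from outside the system,
# for example a public best-seller chart.
catalogue = ["a", "b", "c", "d", "e", "f", "t"]
best_sellers = ["a", "b"]
bandwagon_profile = bandwagon_attack_profile(
    catalogue, best_sellers, "t", n_filler=2, rng=random.Random(1))
print(bandwagon_profile)
```

Unlike the average attack, nothing here requires access to the system's rating statistics.
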
  • 22. Item-based recommendation
      • Item-based collaborative recommendation
          ○ uses collaborative data
          ○ but compares items rather than users
      • Can be more efficient
          ○ but also more robust against the average / bandwagon attacks
          ○ “algorithmic response”
  • 24. Targeted Attacks
      • Not all users are equally “valuable” targets
      • Attacker may not want to give recommendations to the “average” user
          ○ but rather to a specific subset of users
  • 25. Segment attack
      • Idea
          ○ differentially attack users with a preference for certain classes of items
          ○ people who have rated the popular items in particular categories
      • Can be determined outside of the system
          ○ the attacker would know his market
          ○ “Horror films”, “Children’s fantasy novels”, etc.
  • 26. Segment attack
      • Identify items closely related to target item
          ○ select most salient (likely to be rated) examples: “Top Ten of X” list
          ○ Let I_S be these items
          ○ f_S = R_max
      • These items define the user segment
          ○ V = users who have high ratings for I_S items
          ○ evaluate ∆ρ(v) on V, rather than U
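
The segment construction above can be sketched as two small helpers: one builds the attack profile with the segment items I_S rated at R_max, the other selects the segment V on which the impact is evaluated. The slide does not specify the filler ratings or what counts as a "high" rating; the sketch assumes minimum-rated fillers and a threshold of 4 on every segment item. All names and data are hypothetical.

```python
def segment_attack_profile(segment_items, filler_items, target, r_min=1, r_max=5):
    """Segment items and target at R_max; fillers at R_min (assumption)."""
    profile = {item: r_min for item in filler_items}
    profile.update({item: r_max for item in segment_items})
    profile[target] = r_max
    return profile


def segment_users(profiles, segment_items, threshold=4):
    """V: users who rated every segment item at or above the threshold (assumption)."""
    return {
        user for user, ratings in profiles.items()
        if all(ratings.get(item, 0) >= threshold for item in segment_items)
    }


# Hypothetical "horror films" segment and user database.
horror = ["h1", "h2"]
users = {
    "u1": {"h1": 5, "h2": 4, "x": 2},
    "u2": {"h1": 2, "h2": 5},
    "u3": {"x": 5},
}
seg_profile = segment_attack_profile(horror, ["x", "y"], "t")
segment = segment_users(users, horror)
print(seg_profile, segment)
```

Hit rate or prediction shift would then be computed over `segment` only, rather than over all users.
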
  • 28. Nuke attacks
      • Interesting result
          ○ asymmetry between push and nuke
          ○ especially with respect to ∆ρ
          ○ it is easy to make something rarely recommended
      • Some attacks don’t work
          ○ Reverse Bandwagon
      • Some very simple attacks work well
          ○ Love / Hate Attack: love everything, hate the target item
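
The Love / Hate attack is simple enough to state in a few lines. This sketch assumes a 1 to 5 scale and randomly chosen filler items; the function name and catalogue are hypothetical.

```python
import random


def love_hate_profile(items, target, n_filler, r_min=1, r_max=5, rng=random):
    """Maximum rating for every filler item, minimum rating for the target."""
    fillers = rng.sample([i for i in items if i != target], n_filler)
    profile = {item: r_max for item in fillers}
    profile[target] = r_min
    return profile


nuke_profile = love_hate_profile(["a", "b", "c", "d", "t"], "t", 3,
                                 rng=random.Random(2))
print(nuke_profile)
```
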
  • 30. Findings
      • Possible to craft an effective attack regardless of algorithm
      • Possible to craft an effective attack even in the absence of system-specific knowledge
      • Relatively small attacks effective
          ○ 1% for some attacks
          ○ smaller if item is rated sparsely
  • 31. What to do?
      • We can try to keep attackers from creating lots of profiles
          ○ pragmatic solution
          ○ but the sparsity trade-off?
      • We can build better algorithms
          ○ if we can achieve lower ∆ρ without lower accuracy
          ○ algorithmic solution
      • We can try to weed out the attack profiles from the database
          ○ reactive solution
  • 32. Other solutions
      • Hybrid solution
          ○ use other knowledge sources in addition to collaborative ones (helps quite a bit)
      • Trust solution
          ○ accept recommendations only from people you know (do we need collaborative recommendation for this?)
          ○ transitivity (vs. gullibility?)
          ○ recommendation ≠ reputation
      • Market solution
          ○ provide incentives for honest disclosure
          ○ problem: usually the reward / profit is outside the system’s control, so it can’t be built into a market mechanism
  • 33. Detection and response
      • Goal
          ○ classify users into attackers / genuine users
          ○ but remember definition: an attacker is a profile that is part of a large group A
      • Then ignore A when making predictions
  • 34. Unsupervised Classification
      • Clustering is the basic idea
          ○ Reduced dimensional space
          ○ Attacks cluster together
      • Mehta, 2007
          ○ PCA compression
          ○ Identify users highly similar in lower-dimensional space
          ○ Works well for average attack at higher attack sizes: > 90% precision and recall
          ○ Computationally expensive
  • 35. Supervised Classification
      • Identify characteristic features likely to discriminate between users and attackers
          ○ Example: profile variance, target focus
          ○ Total of 25 derived attributes
      • Learn a classifier over labeled examples of attacks and genuine data
          ○ Best results with SVM
      • Detection is low-cost
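
To make the idea of derived profile features concrete, here is a small sketch of two features computed per profile. Profile variance is named on the slide, but its exact definition there is not given; the population variance used here is an assumption. The second feature (mean absolute deviation from item averages) is purely illustrative and is not claimed to be one of the 25 attributes. All names and data are hypothetical.

```python
from statistics import pvariance


def profile_variance(profile):
    """Population variance of the ratings in one profile."""
    return pvariance(list(profile.values()))


def mean_item_deviation(profile, item_means):
    """Average distance between a profile's ratings and the item averages."""
    total = sum(abs(rating - item_means[item]) for item, rating in profile.items())
    return total / len(profile)


def feature_vector(profile, item_means):
    """Features that a classifier (for example an SVM) could be trained on."""
    return [profile_variance(profile), mean_item_deviation(profile, item_means)]


# Hypothetical item averages, one genuine profile, one average-attack profile.
item_means = {"a": 3.8, "b": 2.1, "c": 4.4, "d": 3.1, "e": 2.9, "t": 2.0}
genuine_profile = {"a": 5, "b": 1, "c": 4}
attack_profile = {"a": 4, "b": 2, "c": 4, "d": 3, "e": 3, "t": 5}

genuine_features = feature_vector(genuine_profile, item_means)
attack_features = feature_vector(attack_profile, item_means)
print(genuine_features, attack_features)
```

In this toy data the attack profile sits closer to the item averages than the genuine one, which is the kind of regularity a learned classifier might pick up.
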
  • 36. Methodology
      • Divide ratings database into test data and training data
          ○ U_T and U_R
      • Add attacks to U_R
          ○ U_R + A_R = U_R'
      • Train the classifier on U_R'
      • Test performance against U_T + A_T = U_T'
          ○ where A_T uses a different set of target items
  • 37. Stratified Training
      • We want to train against multiple attack types and sizes
          ○ A_R = A_1 + A_2 + … + A_n
          ○ A_R must be large to include all combinations
          ○ But if A_R is too big relative to U_R, then derived features are biased: attack profiles become “normal”
      • Let F(U,u) be the features derived from a profile u in the context of a database U
          ○ instead of calculating F(U_R', A_R)
          ○ calculate F(U_R+A_1, A_1), F(U_R+A_2, A_2), etc.
          ○ Then combine resulting features with the training data
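
The stratified scheme above can be sketched as follows: each attack batch A_i is added to the genuine training data on its own, features for that batch are derived in that context, and the labeled rows are pooled afterwards. The single feature used for F(U, u) is a stand-in, and deriving genuine-user features against U_R alone is an assumption; the slide does not specify either. All names are hypothetical.

```python
def item_means(database):
    """Average rating per item across a list of profiles."""
    totals, counts = {}, {}
    for profile in database:
        for item, rating in profile.items():
            totals[item] = totals.get(item, 0) + rating
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}


def features(database, profile):
    """F(U, u): one illustrative feature, deviation from item means in U."""
    means = item_means(database)
    deviation = sum(abs(r - means[i]) for i, r in profile.items()) / len(profile)
    return [deviation]


def stratified_training_set(genuine, attack_batches):
    """Label genuine rows 0 and attack rows 1, deriving each batch separately."""
    rows = [(features(genuine, u), 0) for u in genuine]
    for batch in attack_batches:
        context = genuine + batch  # U_R + A_i, never U_R + all batches at once
        rows += [(features(context, a), 1) for a in batch]
    return rows


# Hypothetical genuine profiles and two attack batches.
genuine = [{"a": 4, "b": 2}, {"a": 5, "b": 1}]
batches = [[{"a": 1, "b": 5}], [{"a": 3, "b": 3}, {"a": 3, "b": 3}]]
rows = stratified_training_set(genuine, batches)
print(rows)
```

Because every batch is evaluated against the genuine data only, a large combined attack set cannot drag the item statistics toward the attack profiles.
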
  • 38. SVM Results
      • [Figure panels: Nuke Attack, Push Attack]
      • Attacks essentially neutralized up to 12%, both push and nuke
      • Other attack types: similar results
  • 39. Obfuscated Attacks
      • What about the middle part of the figure?
          ○ How big is the hole?
      • Small amounts of deviation from known attack types
          ○ esp. using R_max = 4 instead of 5
          ○ do not impact attack effectiveness much (about 10-20%)
          ○ but do reduce effectiveness of detection (about 20%)
      • System trained only on known types
          ○ future work: additional training with wider range of attacks
      • [Figure: axes “Scale” and “Impact”; regions labeled “Efficient attack”, “Inefficient attack”, and “Detectable” (two regions)]
  • 40. Where are we?
      • Attacks work well against all standard collaborative recommendation algorithms
      • What to do
          ○ Use e-commerce common sense: protect accounts, if applicable; monitor the system and check up on customer complaints
          ○ Hide your ratings distribution
          ○ Use additional knowledge sources if you can (hybrid recommendation)
          ○ Use model-based recommendation if computationally feasible
          ○ Use attack detection
  • 41. Current Work
      • Other recommender-like systems
          ○ Esp. tagging systems
          ○ Does tag spam look like profile injection?
          ○ How to characterize / defend against it?
      • Self-protection / dynamics
          ○ Evolution of rating data
          ○ Interaction with user / item quarantining and attack detection
  • 42. Tagging systems
      • Del.icio.us / flickr.com
          ○ allow users to tag items with arbitrary text labels
      • Multi-dimensional labels
          ○ more complex than ratings
      • More complex output
          ○ Tag -> resources
          ○ Resource -> resources
          ○ etc.
      • Can we model denial of insight attacks against tagging systems?
          ○ don’t want to look just at a single output modality
          ○ use a PageRank-like metric to evaluate relative centrality of items
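
The slide does not describe the centrality metric in detail. As a rough illustration of what "PageRank-like" could mean here, the sketch below runs a standard power-iteration PageRank over a small directed graph of resources. The graph, the damping factor of 0.85, and the way tag data would be turned into edges are all assumptions.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; `graph` maps each node to its outgoing links."""
    nodes = list(graph)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            links = graph[node]
            if links:
                share = damping * rank[node] / len(links)
                for neighbor in links:
                    new_rank[neighbor] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / len(nodes)
        rank = new_rank
    return rank


# Hypothetical resource graph; an edge could mean "shares a tag with".
graph = {
    "r1": ["r2", "target"],
    "r2": ["r1", "target"],
    "r3": ["target"],
    "target": ["r1"],
}
ranks = pagerank(graph)
print(ranks)
```

An attack on a tagging system could then be measured by how much injected tags change a target's rank relative to other resources, rather than by a single output such as one tag's result list.
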
  • 45. Open issues
      • Real-time detection
          ○ different from static / matrix-based results?
      • Handling cold-start items / users
      • Handling large-scale, low impact attacks
  • 46. Larger question
      • Machine learning techniques widespread
          ○ Recommender systems
          ○ Social networks
          ○ Data mining
          ○ Adaptive sensors
          ○ …
      • Systems learning from open, public input
          ○ How do these systems function in an adversarial environment?
          ○ Will similar approaches work for these algorithms?