2. Table of contents
1. Introduction of Recommender Systems
2. Adversarial Attacks
3. Defense against adversarial attacks
4. Conclusion
4. 1.1 Introduction
Recommender systems:
• Provide users with suggestions based on their preferences.
• Recommend items that fit users' tastes and needs.
• Have been widely used in online systems, e.g., YouTube,
Netflix, Amazon, ..., to customize recommendations.
5. 1.1 Introduction
The basic principle:
• “The basic principle of recommendations
is that significant dependencies exist
between user and item-centric activity.”
• Charu C. Aggarwal,
Recommender Systems, Springer,
DOI: 10.1007/978-3-319-29659-3
6. 1.2 Utility matrix
Utility matrix (user-item matrix):
• Each entry records a user's rating of an item.
• Plays an important role in recommender systems.
Image source: Machine Learning cơ bản
7. 1.2 Utility matrix
Building the utility matrix (user-item matrix):
• Ask users to review products, e.g., by sending reminder emails, ...
• Infer ratings from the user's behavior or preferences:
• Purchase History
• Browsing History
• Search Queries
• Social Interactions
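The construction above can be sketched in a few lines; the interaction triples and matrix sizes below are illustrative assumptions, not data from the slides:

```python
import numpy as np

# Hypothetical interaction log: (user_id, item_id, rating) triples,
# collected from explicit reviews or inferred from behavior.
ratings = [
    (0, 0, 5.0), (0, 2, 3.0),
    (1, 0, 4.0), (1, 1, 1.0),
    (2, 1, 2.0), (2, 2, 4.0),
]

n_users, n_items = 3, 3

# Utility (user-item) matrix; NaN marks unobserved entries,
# which is what makes the matrix sparse in practice.
Y = np.full((n_users, n_items), np.nan)
for u, i, r in ratings:
    Y[u, i] = r
```

In real systems most entries stay missing, and the recommendation task is precisely to fill them in.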
9. 1.3.1 Content-Based Filtering
• Recommending items to users by analyzing the attributes or features of
items and comparing them to a user's profile or historical preferences.
Image source: https://bit.ly/3TS19t5
10. 1.3.1 Content-Based Filtering
• User models:
• Regression: when ratings take a continuous range of values.
• Classification: when ratings are discrete labels, e.g., like/dislike.
• Training data: pairs (X, y) = (item features, rating).
Source: Machine learning cơ bản
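A minimal sketch of the regression case, assuming hand-made item features and ratings for a single user; ordinary least squares stands in for whichever regressor a real model would use:

```python
import numpy as np

# Hypothetical item features (bias column + two genre indicators).
X = np.array([
    [1.0, 1.0, 0.0],   # item 0: action
    [1.0, 0.0, 1.0],   # item 1: romance
    [1.0, 1.0, 1.0],   # item 2: both
])
# Ratings this user gave to the three items above.
y = np.array([5.0, 1.0, 3.0])

# Regression user model: learn w such that X @ w ~ y.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict the rating for a new, unseen action movie.
x_new = np.array([1.0, 1.0, 0.0])
pred = x_new @ w
```

Each user gets their own model w, trained only on the items that user has rated.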
11. 1.3.2 Collaborative filtering
• Exploiting the similarity among users to recommend items.
• The model allows users to discover new items (known by
similar users).
13. 1.3.2.1 Neighborhood-based Collaborative Filtering
Two steps: computing the similarity between users (or items), then
computing the predicted rating from the most similar neighbors.
Sources: https://doi.org/10.3390/info13010042; Machine learning cơ bản
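A minimal user-user sketch of these two steps, assuming a toy utility matrix with 0 for unrated entries and cosine similarity as the similarity measure:

```python
import numpy as np

# Toy utility matrix (rows: users, cols: items); 0 marks "unrated".
Y = np.array([
    [5.0, 3.0, 0.0],
    [4.0, 2.0, 1.0],
    [1.0, 0.0, 5.0],
])

def cosine_sim(a, b):
    """Cosine similarity between two rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def predict(u, i, Y):
    """Similarity-weighted average of other users' ratings for item i."""
    num, den = 0.0, 0.0
    for v in range(Y.shape[0]):
        if v == u or Y[v, i] == 0:   # skip self and non-raters
            continue
        s = cosine_sim(Y[u], Y[v])
        num += s * Y[v, i]
        den += abs(s)
    return num / den if den else 0.0

pred = predict(0, 2, Y)  # estimate user 0's rating for item 2
```

Production systems typically restrict the sum to the k nearest neighbors and mean-center ratings first; both refinements are omitted here for brevity.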
13. 1.3.2.1.1 User-User Collaborative filtering
• Recommending items to a user based on the preferences and
behaviors of users with similar profiles.
15. 1.3.2.1.1 User-User Collaborative filtering
• Pros:
• Personalized recommendations.
• Effective with limited user data.
• Adapts to changing user preferences.
• Introduces serendipity in suggestions.
• Cons:
• Challenges with sparse data and scalability.
• Dependence on high-quality data.
• Struggles with new item recommendations (cold start).
• Privacy concerns in user preference comparison.
• Limited adaptability to dynamic user preferences.
16. 1.3.2.1.2 Item-Item Collaborative filtering
• Recommending items to a user by finding items that
are similar to those the user has interacted with.
18. 1.3.2.1.2 Item-Item Collaborative filtering
• Pros:
• Scalability: Efficient for large systems with many items.
• Stability: Recommendations remain consistent over time.
• Data Handling: Performs well with sparse data.
• New Users: Effectively handles new user scenarios.
• Cons:
• New Items Challenge (cold start): Difficulty in recommending new
items lacking sufficient interaction history.
• Limited Serendipity: Recommendations may lack surprising or
unexpected choices.
• Attribute Dependency: Quality depends on accurate item attributes.
• Computational Load: Resource-intensive calculations for item
similarities.
19. 1.3.2.2 Matrix Factorization Collaborative Filtering
• Recall Content-Based Filtering: ratings are approximated as Y ≈ XW,
where M and N are the number of items and the number of users,
respectively; the item-profile matrix X is pre-built, and the mission is
to find W.
• Building item profiles plays a very important role and has a direct
impact on the performance of the model.
20. 1.3.2.2 Matrix Factorization Collaborative Filtering
• Main idea behind Matrix Factorization for recommender systems:
• Latent features exist that describe the relationship between items
and users, so the utility matrix is approximated as Y ≈ XW, and the
mission is now to find both X and W.
Source: Machine learning cơ bản
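As a rough sketch of this idea (the toy matrix, latent dimension, and hyperparameters are assumptions, not from the slides), both factors can be learned by gradient descent on the observed entries only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy utility matrix; 0 marks missing entries.
Y = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])
mask = Y > 0          # which ratings are observed
K = 2                 # number of latent features

# Factorize Y ~ X @ W, learning both factors (unlike content-based
# filtering, where X is pre-built).
X = 0.1 * rng.standard_normal((Y.shape[0], K))
W = 0.1 * rng.standard_normal((K, Y.shape[1]))
lr, lam = 0.01, 0.01  # learning rate and L2 regularization
for _ in range(5000):
    E = mask * (X @ W - Y)            # error on observed ratings only
    X -= lr * (E @ W.T + lam * X)
    W -= lr * (X.T @ E + lam * W)

# Root-mean-square error over the observed entries.
rmse = np.sqrt(((mask * (X @ W - Y)) ** 2).sum() / mask.sum())
```

The missing entries of X @ W then serve as predicted ratings; real systems add user/item bias terms and use SGD over sampled entries instead of full-matrix updates.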
21. 1.3.2.2 Matrix Factorization Collaborative Filtering
• Pros:
• Accuracy Boost: Provides more accurate recommendations.
• Sparsity Management: Handles sparse data effectively by filling in missing
values through latent factors.
• Personalization: Offers tailored recommendations.
• Flexibility: Adaptable to diverse recommendation scenarios.
• Cons:
• Data Dependency: Performance relies on data quality.
• Complex Computations: Training can be computationally intensive.
• Limited Interpretability: Latent factors may lack clear interpretation.
• Overfitting Risk: Potential overfitting with limited data.
22. 1.3.3 Hybrid Recommender Systems
• Hybrid Recommender Systems:
• Combining multiple recommendation techniques, often
blending collaborative filtering and content-based filtering, to
improve recommendation quality and address the limitations of
individual methods.
23. Table of contents
1. Introduction of Recommender Systems
2. Adversarial Attacks
3. Defense against adversarial attacks
4. Conclusion
24. 2. Adversarial Attacks – Introduction
• An adversarial attack is among the most common causes of malfunction in a
machine learning model; it might entail presenting a model with inaccurate or
misrepresentative data during training, or introducing maliciously designed
data to deceive an already trained model.
• These attacks can be seen as a very acute form of anomalies in the dataset,
directed maliciously from the start to affect a machine learning model. Most
machine learning techniques are designed for specific problem sets under the
assumption that training and test data are generated from the same statistical
distribution. This assumption can be deliberately exploited to disrupt an
MLOps pipeline: attackers use it to manipulate a model's performance,
affecting the product and its reputation.
25. 2. Adversarial Attacks – Introduction
• Guarding against adversarial attacks requires augmentations and additions to
the model pipeline, especially when the model holds a vital role and the window
for error is very narrow. For example, an adversarial attack could involve feeding
a model false or misleading data during training, or adding maliciously prepared
data to trick an already trained model.
• In the well-known panda example, the "perturbation" takes into account how the
feature extractor in the model will filter the image, and changes the values of
specific pixels so that the image is completely misclassified.
• These attacks can also destroy pipelines in the automotive industry, where
something like putting wrong stickers on the street can throw off an autonomous
vehicle and confuse its decision-making module, with horrible outcomes.
26. 2. Adversarial Attacks – Introduction
We can classify attacks against an ML model along three main
dimensions:
• Attack timing - which part of the model pipeline is affected
• Attack goal - the attacker's expected output
• Attacker's knowledge about the model
27. 2.1 Attack timing
An adversary can attack an ML model at two main stages of the learning
pipeline - during training or in production. These two categories of
attacks are respectively known as:
• Training-time attacks (a.k.a. causative or poisoning attacks)
• Inference-time attacks (a.k.a. exploratory or evasion attacks)
28. 2.1 Attack timing - Data Poisoning
Poisoning is the contamination of the training dataset.
• Since datasets shape learning algorithms, poisoning can effectively
reprogram them.
• Serious concerns have been raised, particularly about user-generated
training data, such as for content recommendation or natural
language models, given the prevalence of fake accounts.
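A minimal sketch of how poisoning can skew user-generated training data (the rating matrix and fake profiles are illustrative assumptions): injecting a few fake accounts shifts a target item's observed average rating before any model is even trained.

```python
import numpy as np

# Genuine user-item ratings (rows: users, cols: items); 0 = unrated.
Y = np.array([
    [5.0, 1.0, 0.0],
    [4.0, 2.0, 1.0],
    [5.0, 0.0, 2.0],
])

target_item = 1

def mean_observed(Y, i):
    """Average rating of item i over users who actually rated it."""
    col = Y[:, i]
    return col[col > 0].mean()

before = mean_observed(Y, target_item)

# Poisoning: inject fake profiles that give the target item the top
# rating, with plausible filler ratings to evade simple detection.
fake = np.array([
    [4.0, 5.0, 1.0],
    [5.0, 5.0, 2.0],
])
Y_poisoned = np.vstack([Y, fake])

after = mean_observed(Y_poisoned, target_item)
```

Any recommender trained on the poisoned matrix inherits the inflated signal, which is why fake-account detection matters for user-generated data.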
29. 2.1 Attack timing - Evasion
Evasion attacks exploit the flaws of an already trained model.
• Manipulating or misleading a recommendation system by adding
fake reviews is one such example.
• Another example is the effect of damaged road signs on
autonomous driving.
30. 2.2 Attack goal
Two categories:
• Targeted attacks - attacks where the
attacker tries to misguide the
model to a particular class other
than the true class.
• Non-targeted attacks - attacks
where the attacker tries to
misguide the model to predict any
of the incorrect classes.
31. 2.3 Attacker's knowledge
Attacks can be further divided into two broad categories:
• Black-box attacks - assume the attacker can query the model but has
no access to the model's architecture, parameters, training dataset,
etc. (i.e., the attacker only knows the model's inputs and has an
oracle to query for output labels or confidence scores).
• White-box attacks - assume the attacker knows everything about the
model's architecture, parameters, training dataset, etc.
32. 2.3 Attacker's knowledge - Black box
CopyAttack
• A reinforcement learning based black-box attack method that
harnesses real users from a source domain by copying their profiles
into the target domain with the goal of promoting a subset of items.
• CopyAttack is constructed to both efficiently and effectively learn
policy gradient networks that first select, and then further refine/craft,
user profiles from the source domain to ultimately copy into the target
domain. CopyAttack's goal is to maximize the hit ratio of the targeted
items in the Top-k recommendation list of the users in the target
domain.
33. 2.3 Attacker's knowledge - White box
Fast Gradient Sign Method
• Google researchers Ian J. Goodfellow, Jonathon Shlens, and Christian
Szegedy developed one of the first attacks for producing adversarial
examples, referred to as the fast gradient sign method (FGSM).
• It adds an imperceptible amount of noise to an image to cause a model
to misclassify it. The noise is computed by multiplying the sign of the
gradient of the cost function with respect to the input image by a small
constant epsilon. In its most basic form, FGSM adds noise (not random
noise) whose direction matches the gradient of the cost function with
respect to the data.
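FGSM can be demonstrated on any differentiable model; the sketch below uses a hand-built logistic regression classifier (the weights, input, and epsilon are illustrative assumptions, not values from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A tiny "trained" linear classifier p(y=1|x) = sigmoid(w @ x + b).
w = np.array([2.0, -1.0, 0.5])
b = 0.0

x = np.array([1.0, 0.5, 1.0])  # clean input, classified as class 1
y = 1.0                        # true label

# Gradient of the cross-entropy loss w.r.t. the INPUT x:
# for logistic regression, dL/dx = (p - y) * w.
p = sigmoid(w @ x + b)
grad_x = (p - y) * w

# FGSM: x_adv = x + eps * sign(grad_x), stepping in the direction
# that increases the loss the most per coordinate.
eps = 1.0
x_adv = x + eps * np.sign(grad_x)

p_adv = sigmoid(w @ x_adv + b)  # confidence collapses below 0.5
```

For image models the same one-line update is applied pixel-wise with a much smaller epsilon, which is why the perturbation stays invisible to humans.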
34. 2.3 Attacker's knowledge - White box
Carlini and Wagner
The Carlini and Wagner (C&W) attacks are a family of adversarial attacks on
neural networks that are based on solving an optimization problem that
minimizes the perturbation and the distance to the decision boundary of
the neural network. The C&W attacks can target specific classes, generate
imperceptible perturbations, and bypass defensive mechanisms, such as
gradient masking or distillation. The C&W attacks are considered to be one
of the strongest adversarial attacks on neural networks, but they are also
very complex and expensive.
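In its common L2 form, the C&W optimization can be summarized as follows (a sketch of the formulation from Carlini and Wagner's paper, with target class t, logits Z, and confidence margin κ):

```latex
\min_{\delta}\ \lVert \delta \rVert_2^2 + c \cdot f(x+\delta)
\quad \text{s.t. } x+\delta \in [0,1]^n,
\qquad
f(x') = \max\!\Big(\max_{i \neq t} Z(x')_i - Z(x')_t,\ -\kappa\Big)
```

The first term keeps the perturbation imperceptible, while f(x') becomes negative only once the target class t wins by margin κ; the constant c trades off the two goals.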
35. Table of contents
1. Introduction of Recommender Systems
2. Adversarial Attacks
3. Defense against adversarial attacks
4. Conclusion
36. 3. Defense against adversarial attacks
3.1. Proactive countermeasures:
• Robust Optimization: optimize the model under the assumption that
every sample in the training data D can be a source of adversarial
behavior.
• Defensive Distillation: train a distilled model using the labels
predicted by the initial model. The explicit relative information about
classes prevents the model from fitting too tightly to the data and
contributes to better generalization around training points.
3.2. Reactive countermeasures:
• Adversarial Detection
• Input Reconstruction
• Network Verification
37. 3.1.1. Adversarial Training
Idea:
Including adversarial samples in the training of a model makes it more
robust.
The adversarially-trained model optimizes its objective over worst-case
perturbations of the training inputs.
Adversarial training also provides better generalization performance.
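This objective is commonly written as the following min-max problem (a standard formulation with a perturbation ball of radius ε; the slide's exact equation was not recoverable):

```latex
\min_{\theta}\ \mathbb{E}_{(x,y)\sim \mathcal{D}}
\left[ \max_{\lVert \delta \rVert \le \epsilon}
\mathcal{L}\big(\theta,\ x+\delta,\ y\big) \right]
```

The inner maximization finds the worst-case perturbation for each sample (e.g., via FGSM), and the outer minimization trains the model parameters θ against it.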
39. 3.1.1. Adversarial Training
Example: Adversarial Personalized Ranking (APR)
MF-BPR: maximize the distance between positively and negatively rated
items.
The training dataset D is composed of triples (u, i, j): a user u, an item
i rated positively by u, and an item j rated negatively. The BPR objective
maximizes the probability that u prefers i over j.
The predicted ratings ŷui and ŷuj are estimated by Matrix Factorization.
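The BPR objective, reconstructed here in its usual form since the slide's equation was an image, maximizes

```latex
L_{\mathrm{BPR}} = \sum_{(u,i,j)\in D}
\ln \sigma\big(\hat{y}_{ui} - \hat{y}_{uj}\big)
\;-\; \lambda \lVert \Theta \rVert^2
```

where σ is the sigmoid function and Θ are the model parameters; with Matrix Factorization, ŷui is the dot product of the latent vectors of user u and item i. APR then adds adversarial perturbations to Θ during training.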
41. 3.1.2. Defensive Distillation
Defensive Distillation in a DNN Classifier
- First train an initial network F on data X with a softmax temperature
of T.
- Use the probability vector F(X), which carries additional knowledge
about classes compared to a hard class label, predicted by network F
to train a distilled network Fd at temperature T on the same data X.
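The role of the temperature T can be sketched as follows (the logits and T values are illustrative assumptions); the high-temperature probability vector is the "soft" target used to train the distilled network Fd:

```python
import numpy as np

def softmax_T(logits, T):
    """Softmax with temperature T; larger T gives softer probabilities."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([8.0, 2.0, 1.0])   # raw class scores from network F

hard = softmax_T(logits, T=1.0)      # near one-hot: little class info
soft = softmax_T(logits, T=20.0)     # soft targets: relative class info
```

At T=1 nearly all mass sits on the top class, while at high T the distribution still ranks the classes but exposes their relative similarity, which is the extra knowledge the distilled network learns from.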
42. 3.1. Proactive countermeasures
Defensive Distillation with knowledge in RecSys:
1. Regular review-based recommendation models: Wi and Wj are the
reviews of user i and item j in the training set, which are respectively
concatenated before being input into the CNN networks to make the
final rating prediction.
2. Distillation model: user review modeling is decomposed into a
teacher model, which exists only in the training phase and is not
leveraged when making predictions. wij is the review from user i to
item j.
43. 3.1. Proactive countermeasures
Defensive Distillation with knowledge in RecSys:
The left user-item prediction network and the right review prediction
network are connected in an adversarial manner, and the shared and
private features in the top layer of each network are encouraged to be
orthogonal.
44. 3.2. Reactive countermeasures
• Adversarial Detection: train a supervised classifier to detect
adversarial examples.
• Input Reconstruction: reconstruct the input data to remove the
adversarial noise.
• Network Verification: train a model to verify the prediction.
45. Table of contents
1. Introduction of Recommender Systems
2. Adversarial Attacks
3. Defense against adversarial attacks
4. Conclusion
46. 4. Conclusion
We have reviewed:
- Classes of Recommender Systems: CBF, CF & Hybrid
- Adversarial Attacks on Recommender Systems: FGSM, C&W, ...
- Defense methods against Adversarial Attacks: Adversarial Training,
Distillation & Detection