Nicolas Papernot
Pennsylvania State University & Google Brain
Lecture for Prof. Trent Jaeger's CSE 543 Computer Security class
November 2017 - Penn State

Thank you to my collaborators
- Martín Abadi (Google Brain)
- Alexey Kurakin (Google Brain)
- Xi Wu (Google)
- Somesh Jha (U of Wisconsin)
Machine Learning
A machine learning classifier maps an input x (e.g., an image of a handwritten digit) to a probability for each class, p(0|x,θ), p(1|x,θ), p(2|x,θ), ..., p(7|x,θ), p(8|x,θ), p(9|x,θ), for example 0.01, 0.84, 0.02, 0.01, 0.03, 0.01.
The model's parameters θ are learned by minimizing a cost/loss function (~model error) on training data.
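To make the two slides above concrete, here is a minimal, self-contained sketch (illustrative only, not the model from the lecture) of a linear softmax classifier that outputs class probabilities p(y|x,θ) and of the cross-entropy cost/loss used to train it:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                    # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict_proba(x, W, b):
    """Class probabilities p(y|x, theta) for a linear model with theta = (W, b)."""
    return softmax(W @ x + b)

def cross_entropy_loss(x, y, W, b):
    """Cost/loss (~model error) on a single labeled example (x, y)."""
    p = predict_proba(x, W, b)
    return -np.log(p[y] + 1e-12)

# Toy usage: a 784-pixel digit image classified into 10 classes.
rng = np.random.default_rng(0)
W, b = rng.normal(scale=0.01, size=(10, 784)), np.zeros(10)
x, y = rng.random(784), 1
print(predict_proba(x, W, b))          # ten probabilities that sum to 1
print(cross_entropy_loss(x, y, W, b))  # smaller when p(y|x, theta) is larger
```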
Outline of this lecture
1. Security in machine learning
2. Privacy in machine learning
Part I
Security in machine learning
Attack Models
- Bad: even if an attacker needs to know the details of the machine learning model to mount an attack (a white-box attacker).
- Worse: if an attacker who knows very little (e.g., only gets to ask a few queries) can mount an attack (a black-box attacker).
Adversarial examples (white-box attacks)
Jacobian-based Saliency Map Approach (JSMA)
Papernot et al. The Limitations of Deep Learning in Adversarial Settings
Jacobian-Based Iterative Approach: source-target misclassification
Papernot et al. The Limitations of Deep Learning in Adversarial Settings
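As a rough sketch of the JSMA idea behind the two slides above (a simplified rendition, not the exact algorithm from the cited paper), assume hypothetical helpers jacobian(x), returning the matrix of partial derivatives ∂F_j/∂x_i of the model's class scores, and predict(x), returning class probabilities. The attack scores each input feature with a saliency map for a chosen target class and iteratively increases the most salient feature until the target class is predicted:

```python
import numpy as np

def saliency_map(jac, target):
    """JSMA-style saliency: large when increasing x_i raises the target class
    score while lowering the summed scores of all other classes."""
    d_target = jac[target]                      # dF_target / dx_i
    d_others = jac.sum(axis=0) - d_target       # sum over j != target of dF_j / dx_i
    return np.where((d_target > 0) & (d_others < 0),
                    d_target * np.abs(d_others), 0.0)

def jsma_like_attack(predict, jacobian, x, target, eps=1.0, max_iters=100):
    """Perturb the most salient feature until `target` is predicted.
    predict(x) -> class probabilities, jacobian(x) -> (n_classes, n_features)."""
    x_adv = x.copy()
    for _ in range(max_iters):
        if np.argmax(predict(x_adv)) == target:
            break                               # source-target misclassification achieved
        s = saliency_map(jacobian(x_adv), target)
        if s.max() <= 0:
            break                               # no helpful feature left to perturb
        i = int(np.argmax(s))
        x_adv[i] = np.clip(x_adv[i] + eps, 0.0, 1.0)
    return x_adv
```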
Evading a Neural Network Malware Classifier
A malware sample X classified with P[X=Benign] = 0.10 is perturbed into X* with P[X*=Benign] = 0.90, so the classifier is evaded.
Grosse et al. Adversarial Perturbations Against Deep Neural Networks for Malware Classification
Supervised vs. reinforcement learning

Supervised learning:
- Model inputs: an observation (e.g., traffic sign, music, email)
- Model outputs: a class (e.g., stop/yield, jazz/classical, spam/legitimate)
- Training "goal" (i.e., cost/loss): minimize class prediction error over pairs of (inputs, outputs)

Reinforcement learning:
- Model inputs: environment & reward function
- Model outputs: an action
- Training "goal": maximize reward by exploring the environment and taking actions
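To make the contrast concrete, here is a toy sketch (illustrative only, not taken from the lecture): a supervised step nudges parameters to reduce prediction error on a labeled pair, while a reinforcement-learning rollout only observes rewards from an assumed environment and never sees "correct" actions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Supervised learning: minimize prediction error over labeled (input, output) pairs.
def supervised_step(w, x, y, lr=0.1):
    """One gradient step on squared error for a linear model y_hat = w @ x."""
    y_hat = w @ x
    grad = 2.0 * (y_hat - y) * x
    return w - lr * grad

# Reinforcement learning: maximize reward by taking actions in an environment.
def rl_episode(policy, env_step, state, horizon=10):
    """Roll out a policy; the learner observes rewards, never 'correct' actions."""
    total_reward = 0.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = env_step(state, action)
        total_reward += reward
    return total_reward

# Toy usage with made-up data and a made-up environment.
w = supervised_step(np.zeros(3), x=np.array([1.0, 2.0, 3.0]), y=1.0)
policy = lambda s: int(s > 0)                                    # action from observation
env_step = lambda s, a: (s + rng.normal(), float(a == (s > 0)))  # reward if action matched sign
print(w, rl_episode(policy, env_step, state=0.0))
```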
Example: Adversarial attacks on neural network policies
Huang et al. Adversarial Attacks on Neural Network Policies
Adversarial examples (black-box attacks)
Threat model of a black-box attack
Our approach to black-box attacks
- Alleviate the lack of knowledge about the model
- Alleviate the lack of training data
Adversarial example transferability
Adversarial examples crafted against one model (ML A) often also mislead other, independently trained models. This property comes in several variants:
● Intra-technique transferability:
  ○ Cross-model transferability
  ○ Cross-training-set transferability
● Cross-technique transferability
Szegedy et al. Intriguing properties of neural networks
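A hedged sketch of how transferability is typically measured (illustrative, not the experimental code behind the cited results): craft adversarial examples against a local model A, here with a fast-gradient-sign perturbation, and count how many are also misclassified by a different victim model B. The helpers grad_loss_a and predict_b are assumed interfaces to the two models.

```python
import numpy as np

def fgsm(grad_loss_a, x, eps=0.1):
    """Fast gradient sign perturbation computed against the local model A.
    grad_loss_a(x) is assumed to return dLoss_A/dx for the true label of x."""
    return np.clip(x + eps * np.sign(grad_loss_a(x)), 0.0, 1.0)

def transfer_rate(predict_b, grad_loss_a, inputs, labels, eps=0.1):
    """Fraction of adversarial examples crafted on A that also fool victim B."""
    fooled = 0
    for x, y in zip(inputs, labels):
        x_adv = fgsm(grad_loss_a, x, eps)
        if np.argmax(predict_b(x_adv)) != y:
            fooled += 1
    return fooled / len(inputs)
```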
Cross-technique transferability
Papernot et al. Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples
Attacking remotely hosted black-box models
(Figure: the adversary interacts with the remote ML system only through the queries it sends and the labels it gets back.)
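The practical black-box attack referenced on the next slide trains a local substitute model from the remote system's label responses and then transfers adversarial examples crafted against that substitute. The sketch below is a simplified, hedged rendition of that loop; remote_label, train, and craft_adversarial are hypothetical helpers, and the random jitter is a crude stand-in for the Jacobian-based dataset augmentation used in the paper.

```python
import numpy as np

def blackbox_attack(remote_label, train, craft_adversarial,
                    seed_inputs, rounds=3, step=0.1, rng=None):
    """Train a substitute for a remote black-box model, then attack it locally.

    remote_label(x)         -> label returned by the remote ML system (one query)
    train(X, y)             -> local substitute model fit on (X, y)
    craft_adversarial(m, x) -> adversarial example for x against substitute m
    """
    rng = rng or np.random.default_rng(0)
    X = np.array(seed_inputs, dtype=float)
    for _ in range(rounds):
        y = np.array([remote_label(x) for x in X])       # query the remote oracle
        substitute = train(X, y)                         # fit the local substitute
        # Crude augmentation: jitter existing points to explore the oracle's
        # decision boundaries (the paper uses Jacobian-based augmentation).
        X = np.concatenate([X, X + step * rng.choice([-1.0, 1.0], size=X.shape)])
    # Examples crafted on the substitute often transfer to the remote model.
    return [craft_adversarial(substitute, x) for x in seed_inputs]
```

What matters in practice is the number of queries sent to the remote system, which is exactly what the next slide reports.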
Results on real-world remote systems
[PMG16a] Papernot et al. Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples

For each remotely hosted model: the ML technique it uses, the number of queries made to it, and the share of adversarial examples it misclassifies (after querying):
- Deep Learning: 6,400 queries, 84.24% misclassified
- Logistic Regression: 800 queries, 96.19% misclassified
- Unknown technique: 2,000 queries, 97.72% misclassified
Benchmarking progress in the adversarial ML community
Growing community
1.3K+ stars, 340+ forks, 40+ contributors
Adversarial examples are a tangible instance of hypothetical AI safety problems
Image source: http://www.nerdist.com/wp-content/uploads/2013/07/Space-Odyssey-4.jpg
Part II
Privacy in machine learning
Types of adversaries and our threat model
In our work, the threat model assumes:
- The adversary can make a potentially unbounded number of queries
- The adversary has access to model internals
(Figure: model querying by a black-box adversary against the ML system.)
A definition of privacy
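The body of this slide did not survive extraction; the definition used in this part of the lecture is presumably differential privacy, which the later analysis relies on. As a reminder (the standard definition, not transcribed from the slide):

```latex
% A randomized mechanism M is (\epsilon, \delta)-differentially private if,
% for all pairs of datasets d, d' differing in a single record and all
% sets of outcomes S,
\[
  \Pr[M(d) \in S] \;\le\; e^{\epsilon} \, \Pr[M(d') \in S] + \delta .
\]
```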
Our design goals
The PATE approach
Teacher ensemble
(Figure: the sensitive data is split into disjoint partitions 1, 2, 3, ..., n, and a separate teacher model is trained on each partition.)

Aggregation
Intuitive privacy analysis

Noisy aggregation
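PATE's noisy aggregation answers a query by adding noise to the teachers' per-class vote counts and returning the label with the largest noisy count. The sketch below is a simplified rendition assuming Laplace noise and a hypothetical list of trained teachers exposing a .predict(x) method. Intuitively, when most teachers agree the noise rarely flips the answer, and the outcome reveals little about any single partition of the sensitive data.

```python
import numpy as np

def noisy_aggregate(teachers, x, num_classes, gamma=0.05, rng=None):
    """Noisy-max aggregation of teacher votes (PATE-style sketch).

    Laplace noise of scale 1/gamma is added to the vote counts before the
    argmax, so no single teacher (and hence no single sensitive partition)
    can decisively change the returned label."""
    rng = rng or np.random.default_rng()
    votes = np.zeros(num_classes)
    for teacher in teachers:
        votes[teacher.predict(x)] += 1     # assumed .predict(x) -> class index
    noisy_votes = votes + rng.laplace(scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_votes))
```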
Teacher ensemble
(Figure: the teachers trained on partitions 1, 2, 3, ..., n feed their predictions into the aggregated teacher.)
Student training
(Figure: a student model is trained on public data labeled by the aggregated teacher.)
Why train an additional “student” model?
1. Each label released by the aggregated teacher spends privacy budget, and the teachers' parameters depend directly on the sensitive data, so the teachers cannot be released or queried indefinitely.
2. The student only ever sees public data and a bounded number of (noisy) aggregated labels, so the resulting model can be deployed and inspected without further privacy loss.
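A hedged sketch of the student-training step, reusing the noisy_aggregate helper sketched above; the fixed query budget and the hypothetical train helper are illustrative assumptions. In the experiments below the student is actually trained semi-supervised (with GANs for the image datasets); the sketch only shows the supervised core.

```python
def train_student(teachers, public_inputs, train, num_classes,
                  query_budget=100, gamma=0.05):
    """Label a bounded number of public inputs with noisy teacher votes,
    then fit the student on those (input, label) pairs only."""
    labeled = []
    for x in public_inputs[:query_budget]:   # privacy loss grows with each query
        y = noisy_aggregate(teachers, x, num_classes, gamma=gamma)
        labeled.append((x, y))
    X, y = zip(*labeled)
    return train(list(X), list(y))           # the student never touches sensitive data
```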
Deployment
(Figure: only the student model is deployed to answer end-user queries.)
Differential privacy analysis
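The slide's content is not in the extracted text; the differential privacy analysis of PATE tracks the privacy cost of the aggregated teacher's answers with the moments accountant of Abadi et al. (2016). As a reminder of that tool (standard definitions, not transcribed from the slide):

```latex
% Privacy loss of mechanism M at outcome o, for neighboring datasets d, d':
\[
  c(o; M, d, d') \;=\; \log \frac{\Pr[M(d) = o]}{\Pr[M(d') = o]} ,
\]
% the moments accountant bounds its moment generating function,
\[
  \alpha_M(\lambda) \;=\; \max_{d, d'} \, \log \, \mathbb{E}_{o \sim M(d)}
    \left[ e^{\lambda \, c(o;\, M,\, d,\, d')} \right],
\]
% moments add up over composed queries, and M is (\epsilon, \delta)-DP with
\[
  \delta \;=\; \min_{\lambda} \, \exp\!\big( \alpha_M(\lambda) - \lambda \epsilon \big).
\]
```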
Experimental results
Experimental setup

Dataset        Teacher model                  Student model
MNIST          Convolutional Neural Network   Generative Adversarial Networks
SVHN           Convolutional Neural Network   Generative Adversarial Networks
UCI Adult      Random Forest                  Random Forest
UCI Diabetes   Random Forest                  Random Forest

/models/tree/master/differential_privacy/multiple_teachers
Aggregated teacher accuracy
Trade-off between student accuracy and privacy
UCI Diabetes: ε = 1.44, δ = 10⁻⁵
Non-private baseline accuracy: 93.81%
Student accuracy: 93.94%
Synergy between privacy and generalization
www.papernot.fr
@NicolasPapernot
Gradient masking
