Membership Inference Attacks Against Machine Learning Models
Reza Shokri, Marco Stronati, Congzheng Song and Vitaly Shmatikov
2017 IEEE Symposium on Security and Privacy
1. Background
Supervised Learning
GOAL: perform a classification task, i.e. learn f: X -> Y, the relationship between X and Y.
[Figure: training data and test data are drawn as samples from the data population (X, Y); the target model is trained on the training data and then tested, yielding prediction results.]
Problem? Overfitting.
Machine Learning as a Service
[Figure: the data owner uploads training data to a cloud platform (e.g. Google Prediction API, Amazon ML) to create/update a target model; clients pay for and receive black-box access, querying the model with their own test data.]
Privacy Breach in ML as a Service
[Figure: the same MLaaS setting as above; an adversary with black-box access sends a query x and receives the result y.]
Question: does (x, y) belong to the training data?
Example: X = a patient's symptoms, Y = whether the patient has cancer or not.
(X, Y) ∈ training data reveals that the patient has cancer.
Membership Inference Attack
[Figure: the target model is trained on training data drawn as samples from the data population (X, Y); the adversary queries it with x, receives the result y, and asks: does (x, y) belong to the training data?]
2. Problem & Solution
Problem Statement
[Figure: the target model is trained on training data (X, Y) drawn from the data population; the adversary queries it with a record x = [x1, ..., xN] and receives the prediction vector / confidence values y = [p1, ..., pk]; the pair [x, y] is fed to the attack model, which outputs a = in/out.]
Black-box or query access ONLY: the adversary knows nothing about the model and its training procedure except the format of X and Y and their possible values.
Little or no knowledge about the data population ("little" means, e.g., knowing the prior of the marginal distribution of the features).
The attack model attempts to learn how differently the target model behaves (y) for x in the training data vs. x not in the training data. (A minimal formalization of this interface follows.)
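A minimal formalization of the black-box interface in Python (illustrative type aliases, not from the paper): the adversary's entire view of the target model is a function from a record to a prediction vector, and the attack can only use the record and that output.

    from typing import Callable, Sequence

    # The target model, as seen by the adversary: a black box mapping a
    # record x = [x_1, ..., x_N] to a prediction vector y = [p_1, ..., p_k].
    TargetModel = Callable[[Sequence[float]], Sequence[float]]

    # The attack is a function of the record and the target's output only;
    # it sees no parameters, gradients, or training data.
    MembershipAttack = Callable[[Sequence[float], Sequence[float]], bool]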
Shadow Training
[Figure: multiple shadow models, each trained on its own shadow training data (X, Y) and evaluated on its own shadow test data (X, Y), imitate the behavior of the target model.]
The attack model's training data ([X, Y], A) is generated from the shadow models: records from a shadow model's training data yield ([X, Y], A = in), and records from its test data yield ([X, Y], A = out). The attack model is then trained on this data. (A code sketch follows.)
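A sketch of how the attack model's training set could be assembled from the shadow models, assuming each shadow model exposes a scikit-learn style predict_proba; for brevity the attack features here are just the shadow prediction vectors, while the class information carried by ([X, Y], A) is used later to split the data across per-class attack models.

    import numpy as np

    def build_attack_training_data(shadow_models, shadow_splits):
        # shadow_splits[i] = (X_train_i, X_test_i) for shadow model i.
        # Prediction vectors for records the shadow model was trained on get
        # the membership label "in" (1); held-out records get "out" (0).
        features, membership = [], []
        for model, (X_train, X_test) in zip(shadow_models, shadow_splits):
            for X, member in ((X_train, 1), (X_test, 0)):
                for y_vec in model.predict_proba(X):
                    features.append(y_vec)
                    membership.append(member)
        return np.array(features), np.array(membership)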
Generate Shadow Train & Test Data
Three options, depending on the adversary's knowledge about the data population:
1. Model-based synthesis
   Knowledge: nothing
   Method: generate data using the target model itself
2. Statistics-based synthesis
   Knowledge: the prior of the distribution of X's features
   Method: sample data from that distribution
3. Noisy real data (see the sketch after this list)
   Knowledge: a noisy version of the training data
   Method: flip the binary values of 10-20% of the features in X
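A minimal sketch of the noisy-real-data option, assuming binary feature vectors (as in the Purchase and Location datasets); the flip fraction is a parameter in the 10-20% range.

    import numpy as np

    def add_noise(x, flip_fraction=0.1, rng=None):
        # Produce a "noisy version" of a real record by flipping the binary
        # values of a randomly chosen subset of its features.
        rng = rng or np.random.default_rng()
        x_noisy = np.array(x, dtype=int)
        n_flip = max(1, int(flip_fraction * len(x_noisy)))
        idx = rng.choice(len(x_noisy), size=n_flip, replace=False)
        x_noisy[idx] = 1 - x_noisy[idx]
        return x_noisy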
Model-based Synthesis
For each class k, repeat until enough shadow data has been collected:
    x <- randomly generated record
    loop:
        y <- f_target(x)
        if y_k > confidence threshold: accept x as a synthetic record and stop
        x <- perturb x by flipping a small, fixed number of randomly chosen features
Intuition: records X that the target model classifies into class k with high confidence (y_k) should be similar to its training data. (A runnable sketch follows.)
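A runnable Python sketch of this search, assuming binary features and query-only access to f_target; the threshold and flip count are illustrative, and the paper's full procedure is more elaborate than this simplified loop.

    import numpy as np

    def synthesize_record(f_target, k, n_features, conf_threshold=0.8,
                          n_flip=5, max_iters=1000, rng=None):
        # Random search for a record that the target model assigns to class k
        # with high confidence; such records should resemble the target
        # model's training data for that class.
        rng = rng or np.random.default_rng()
        x = rng.integers(0, 2, size=n_features)      # random binary record
        for _ in range(max_iters):
            y = f_target(x)                          # black-box query only
            if y[k] > conf_threshold:
                return x                             # accept as shadow data
            idx = rng.choice(n_features, size=n_flip, replace=False)
            x = x.copy()
            x[idx] = 1 - x[idx]                      # flip a few random features
        return None                                  # search failed for this record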
Training Attack Model
[Figure: as in shadow training, the attack model's training data ([X, Y], A) is generated from the shadow models' training and test data.]
The attack training data is partitioned by class: Training Data 1 contains the records with Max_index(Y) = 1, ..., Training Data K contains those with Max_index(Y) = K. A separate attack model is trained on each partition (Attack Model 1, ..., Attack Model K).
During testing, [x, y] is applied to the Max_index(y)-th attack model. (A sketch of this per-class setup follows.)
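A sketch of the per-class attack models, reusing the (prediction vector, membership) pairs built from the shadow models above; the paper trains neural-network attack models, a logistic regression merely keeps the sketch short.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def train_attack_models(pred_vectors, membership, n_classes):
        # Partition the attack training data by Max_index(Y), the class the
        # shadow model predicted, and train one in/out classifier per class.
        predicted_class = np.argmax(pred_vectors, axis=1)
        models = {}
        for k in range(n_classes):
            mask = predicted_class == k
            if mask.any():
                models[k] = LogisticRegression(max_iter=1000).fit(
                    pred_vectors[mask], membership[mask])
        return models

    def infer_membership(x, f_target, attack_models):
        # At test time, [x, y] is routed to the Max_index(y)-th attack model.
        y = np.asarray(f_target(x)).reshape(1, -1)
        k = int(np.argmax(y))
        return "in" if attack_models[k].predict(y)[0] == 1 else "out"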
3. Evaluation
Data
Datasets (features X, number of classes K = |Y|, target model, number of shadow models):
- CIFAR: 32*32 images; K = 10, 100; target model: neural network (NN); 100 shadow models
- Purchase: 600 features; K = 2, 10, 20, 50 & 100; target models: Google Prediction API, Amazon ML, NN; 20 shadow models
- Location: 446 binary features; K = 30; target model: Google Prediction API; 60 shadow models
- Hospital Stay: 6170 binary features; K = 100; target model: Google Prediction API; 10 shadow models
- MNIST: 32*32 images; K = 10; target models: Google Prediction API / Amazon ML; 50 shadow models
- UCI Adult: 14 features; K = 2; target models: Google Prediction API / Amazon ML; 20 shadow models
Notes:
- Purchase: a range of class sizes is explored (binary to 100 classes)
- CIFAR: trained only using a NN
- Google Prediction API: no control over the training
- Amazon ML: the number of epochs & the amount of regularization can be changed; two configurations are used: (10, 1e-6) & (100, 1e-4)
Evaluation Setup
• Metrics (a short code sketch follows this slide)
1. Precision – fraction of records inferred by the attack model as members of the training dataset that are indeed members
2. Recall – fraction of members that are correctly identified as members by the attack model
• Test set: 50% members & 50% non-members of the target model's training data, so the baseline precision is 0.5
• Attack models are trained using a similar architecture (NN / Google Prediction API)
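A minimal sketch of the two metrics over such a balanced test set (names are illustrative).

    import numpy as np

    def precision_recall(predicted_member, true_member):
        # Boolean arrays over the attack's test set (half members, half non-members).
        predicted_member = np.asarray(predicted_member, dtype=bool)
        true_member = np.asarray(true_member, dtype=bool)
        tp = np.sum(predicted_member & true_member)
        precision = tp / max(1, np.sum(predicted_member))   # correct among records called "in"
        recall = tp / max(1, np.sum(true_member))            # members actually found
        return float(precision), float(recall)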
[R1] Number of Classes & Training Set Size
[Figure: attack precision vs. training set size, shown for K = 10 and K = 100.]
[O1] As the target model's training set size increases, the attack model's precision drops.
[O2] As K increases, information leakage increases (precision approaches 1).
[R2] Overfitting & Model Types
[Figure: attack precision for different target model types; overfitting per target model: 0.34, 0.5, 0.5, 0.16.]
[O3] The Google Prediction API leaks more compared to Amazon ML or the NN.
[O4] A less overfitted model leaks less, e.g. the NN model.
[O5] Overfitting is not the ONLY reason for information leakage; the model structure & architecture also contribute, e.g. Google Prediction API vs Amazon ML.
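For reference, the overfitting numbers above and the "Overfit/Performance Gap" values in the six-dataset summary below appear to quantify the target model's generalization gap: overfitting ≈ accuracy on training data − accuracy on test data, so a value of 0 means the model performs equally well on data it has and has not seen.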
[R3] Performance with Noisy Data
[O6] As the noise in the attack model's training data increases, its performance drops.
[O7] Even at a noise level of 20%, the attack model outperforms the baseline (0.5).
* This indicates that the proposed attack is robust even when the adversary's assumption about the target model's training data is not accurate.
[R4] Performance with Synthetic Data
[O8] There is a significant performance drop when marginal-based (statistics-based) synthetic data is used.
[O9] The attack using model-based synthetic data performs close to the attack using real data, except for classes with few training examples in the target model's training data.
* Membership inference is thus possible with only black-box access to the target model and without any knowledge about the data population.
Fewer than a 0.1 fraction of the classes perform poorly; these classes contribute under 0.6% of the training data.
Performance in Six Datasets
[Figure: attack precision per dataset; overfit/performance gap per target model: 0, 0.06, 0.33, 0.01, 0.13, 0.22, 0.31, 0.34, 0.15.]
Attack performance is low when:
1. [O2] K is low
2. [O4] overfitting is low
Attack performance is high when:
1. [O4] overfitting is high
In some cases attack performance is low even with high overfitting:
[O5] overfitting is not the ONLY reason for information leakage.
[R5] Why is the attack successful?
• Prediction uncertainty vs. accuracy of the target model
  PU = normalized entropy of the prediction vector (defined in the sketch below)
[Figure: distribution (fraction of test samples) of prediction uncertainty for members vs. non-members, for K = 10 and K = 100.]
[O10] When PU is low, the accuracy of the target model is high.
[O11] As K increases, PU increases & the accuracy of the target model drops.
[O12] As K increases, the PU distribution differs significantly between members and non-members.
* The attack exploits this behavioral difference of the target model on members vs. non-members.
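A sketch of the prediction-uncertainty metric, assuming PU is the entropy of the prediction vector divided by its maximum value log(k), so that PU lies in [0, 1].

    import numpy as np

    def prediction_uncertainty(y, eps=1e-12):
        # Normalized entropy of a prediction vector y = [p_1, ..., p_k]:
        # -(1/log k) * sum_i p_i * log(p_i), which is ~0 for a one-hot
        # (fully confident) prediction and 1 for a uniform prediction.
        y = np.clip(np.asarray(y, dtype=float), eps, 1.0)
        return float(-np.sum(y * np.log(y)) / np.log(len(y)))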
Mitigation Strategies
1. Restrict the prediction vector to the top k classes; in the most restrictive scenario, return only the label of the most likely class
2. Round the probabilities to d floating-point digits
3. Increase the entropy of the prediction vector
   - E.g. apply a temperature t to the softmax layer of the NN
4. Regularization – penalize large model parameters (e.g. L2 regularization)
(Sketches of the output-side strategies follow.)
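Minimal sketches of the output-side mitigations (1-3), assuming the defender can post-process the logits or the prediction vector before returning it; regularization (4) is a training-time change and is not shown. All names are illustrative.

    import numpy as np

    def restrict_top_k(y, k=3):
        # Return only the top-k probabilities; everything else is zeroed out.
        y = np.asarray(y, dtype=float)
        out = np.zeros_like(y)
        top = np.argsort(y)[-k:]
        out[top] = y[top]
        return out

    def coarsen(y, d=3):
        # Round the probabilities to d floating-point digits.
        return np.round(np.asarray(y, dtype=float), d)

    def softmax_with_temperature(logits, t=2.0):
        # A higher temperature t flattens the distribution, increasing the
        # entropy of the prediction vector that the client sees.
        z = np.asarray(logits, dtype=float) / t
        z = z - z.max()                      # numerical stability
        e = np.exp(z)
        return e / e.sum()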
[R6] Effect of Mitigation Strategies
[O13] Attack precision drops significantly when ONLY the top-1 label is returned or with high regularization.
[O14] Accuracy drops significantly when t = 20 or with high regularization.
[O15] The target model's performance can even increase with an appropriate amount of regularization.
[O16] High regularization may significantly reduce the target model's performance.
[O17] Even with the mitigation strategies, the attack model outperforms the baseline (0.5).
* The attack is robust.
Conclusion
• The success of the membership inference attack depends on
1. The generalizability of the target model
2. The diversity of the training data
• Overfitting is one of the important reasons for information leakage, but not the ONLY reason; the model type & training procedure also determine the amount of information leakage
• A membership inference attack can succeed with only black-box access to the target model, even if the adversary has no knowledge about the data population
Thank You
