Membership Inference Attacks Against Machine Learning Models
Reza Shokri, Marco Stronati, Congzheng Song and Vitaly Shmatikov
2017 IEEE Symposium on Security and Privacy
1. Background
Supervised Learning
GOAL: perform a classification task, i.e. learn f: X -> Y, the relationship between X and Y.
[Figure: training data and test data are drawn as samples from the data population (X, Y); the target model is trained on the training data and then tested, yielding prediction results.]
Problem? Overfitting.
Machine Learning as a Service
[Figure: the data owner uploads training data to a cloud platform (e.g. Google Prediction API, Amazon ML) to create/update a target model; clients pay for and receive black-box access, querying the model with their own test data.]
Privacy Breach in ML as a Service
[Figure: the same MLaaS setting as above; an adversary with black-box access sends a query x and receives the result y.]
Question: does (x, y) belong to the training data?
Example: X = a patient's symptoms, Y = whether the patient has cancer or not.
(X, Y) ∈ training data reveals that the patient has cancer.
Membership Inference Attack
[Figure: the target model is trained on training data drawn as samples from the data population (X, Y); the adversary queries it with x, receives the result y, and asks: does (x, y) belong to the training data?]
2. Problem & Solution
Problem Statement
[Figure: the target model is trained on training data (X, Y) drawn from the data population; the adversary queries it with a record x = [x1, ..., xN] and receives the prediction vector / confidence values y = [p1, ..., pk]; the pair [x, y] is fed to the attack model, which outputs a = in/out.]
Black-box or query access ONLY: the adversary knows nothing about the model and its training procedure except the format of X and Y and their possible values.
Little or no knowledge about the data population ("little" means, e.g., knowing the prior of the marginal distribution of the features).
The attack model attempts to learn how differently the target model behaves (y) for x in the training data vs. x not in the training data. (A minimal formalization of this interface follows.)
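A minimal formalization of the black-box interface in Python (illustrative type aliases, not from the paper): the adversary's entire view of the target model is a function from a record to a prediction vector, and the attack can only use the record and that output.

    from typing import Callable, Sequence

    # The target model, as seen by the adversary: a black box mapping a
    # record x = [x_1, ..., x_N] to a prediction vector y = [p_1, ..., p_k].
    TargetModel = Callable[[Sequence[float]], Sequence[float]]

    # The attack is a function of the record and the target's output only;
    # it sees no parameters, gradients, or training data.
    MembershipAttack = Callable[[Sequence[float], Sequence[float]], bool]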
Shadow Training
[Figure: multiple shadow models, each trained on its own shadow training data (X, Y) and evaluated on its own shadow test data (X, Y), imitate the behavior of the target model.]
The attack model's training data ([X, Y], A) is generated from the shadow models: records from a shadow model's training data yield ([X, Y], A = in), and records from its test data yield ([X, Y], A = out). The attack model is then trained on this data. (A code sketch follows.)
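A sketch of how the attack model's training set could be assembled from the shadow models, assuming each shadow model exposes a scikit-learn style predict_proba; for brevity the attack features here are just the shadow prediction vectors, while the class information carried by ([X, Y], A) is used later to split the data across per-class attack models.

    import numpy as np

    def build_attack_training_data(shadow_models, shadow_splits):
        # shadow_splits[i] = (X_train_i, X_test_i) for shadow model i.
        # Prediction vectors for records the shadow model was trained on get
        # the membership label "in" (1); held-out records get "out" (0).
        features, membership = [], []
        for model, (X_train, X_test) in zip(shadow_models, shadow_splits):
            for X, member in ((X_train, 1), (X_test, 0)):
                for y_vec in model.predict_proba(X):
                    features.append(y_vec)
                    membership.append(member)
        return np.array(features), np.array(membership)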
Generate Shadow Train & Test Data
Three options, depending on the adversary's knowledge about the data population:
1. Model-based synthesis
   Knowledge: nothing
   Method: generate data using the target model itself
2. Statistics-based synthesis
   Knowledge: the prior of the distribution of X's features
   Method: sample data from that distribution
3. Noisy real data (see the sketch after this list)
   Knowledge: a noisy version of the training data
   Method: flip the binary values of 10-20% of the features in X
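A minimal sketch of the noisy-real-data option, assuming binary feature vectors (as in the Purchase and Location datasets); the flip fraction is a parameter in the 10-20% range.

    import numpy as np

    def add_noise(x, flip_fraction=0.1, rng=None):
        # Produce a "noisy version" of a real record by flipping the binary
        # values of a randomly chosen subset of its features.
        rng = rng or np.random.default_rng()
        x_noisy = np.array(x, dtype=int)
        n_flip = max(1, int(flip_fraction * len(x_noisy)))
        idx = rng.choice(len(x_noisy), size=n_flip, replace=False)
        x_noisy[idx] = 1 - x_noisy[idx]
        return x_noisy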
Model-based Synthesis
For each class k, repeat until enough shadow data has been collected:
    x <- randomly generated record
    loop:
        y <- f_target(x)
        if y_k > confidence threshold: accept x as a synthetic record and stop
        x <- perturb x by flipping a small, fixed number of randomly chosen features
Intuition: records X that the target model classifies into class k with high confidence (y_k) should be similar to its training data. (A runnable sketch follows.)
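A runnable Python sketch of this search, assuming binary features and query-only access to f_target; the threshold and flip count are illustrative, and the paper's full procedure is more elaborate than this simplified loop.

    import numpy as np

    def synthesize_record(f_target, k, n_features, conf_threshold=0.8,
                          n_flip=5, max_iters=1000, rng=None):
        # Random search for a record that the target model assigns to class k
        # with high confidence; such records should resemble the target
        # model's training data for that class.
        rng = rng or np.random.default_rng()
        x = rng.integers(0, 2, size=n_features)      # random binary record
        for _ in range(max_iters):
            y = f_target(x)                          # black-box query only
            if y[k] > conf_threshold:
                return x                             # accept as shadow data
            idx = rng.choice(n_features, size=n_flip, replace=False)
            x = x.copy()
            x[idx] = 1 - x[idx]                      # flip a few random features
        return None                                  # search failed for this record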
Training Attack Model
[Figure: as in shadow training, the attack model's training data ([X, Y], A) is generated from the shadow models' training and test data.]
The attack training data is partitioned by class: Training Data 1 contains the records with Max_index(Y) = 1, ..., Training Data K contains those with Max_index(Y) = K. A separate attack model is trained on each partition (Attack Model 1, ..., Attack Model K).
During testing, [x, y] is applied to the Max_index(y)-th attack model. (A sketch of this per-class setup follows.)
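A sketch of the per-class attack models, reusing the (prediction vector, membership) pairs built from the shadow models above; the paper trains neural-network attack models, a logistic regression merely keeps the sketch short.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def train_attack_models(pred_vectors, membership, n_classes):
        # Partition the attack training data by Max_index(Y), the class the
        # shadow model predicted, and train one in/out classifier per class.
        predicted_class = np.argmax(pred_vectors, axis=1)
        models = {}
        for k in range(n_classes):
            mask = predicted_class == k
            if mask.any():
                models[k] = LogisticRegression(max_iter=1000).fit(
                    pred_vectors[mask], membership[mask])
        return models

    def infer_membership(x, f_target, attack_models):
        # At test time, [x, y] is routed to the Max_index(y)-th attack model.
        y = np.asarray(f_target(x)).reshape(1, -1)
        k = int(np.argmax(y))
        return "in" if attack_models[k].predict(y)[0] == 1 else "out"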
3. Evaluation
Data
Datasets (features X, number of classes K = |Y|, target model, number of shadow models):
- CIFAR: 32*32 images; K = 10, 100; target model: neural network (NN); 100 shadow models
- Purchase: 600 features; K = 2, 10, 20, 50 & 100; target models: Google Prediction API, Amazon ML, NN; 20 shadow models
- Location: 446 binary features; K = 30; target model: Google Prediction API; 60 shadow models
- Hospital Stay: 6170 binary features; K = 100; target model: Google Prediction API; 10 shadow models
- MNIST: 32*32 images; K = 10; target models: Google Prediction API / Amazon ML; 50 shadow models
- UCI Adult: 14 features; K = 2; target models: Google Prediction API / Amazon ML; 20 shadow models
Notes:
- Purchase: a range of class sizes is explored (binary to 100 classes)
- CIFAR: trained only using a NN
- Google Prediction API: no control over the training
- Amazon ML: the number of epochs & the amount of regularization can be changed; two configurations are used: (10, 1e-6) & (100, 1e-4)
Evaluation Setup
• Metrics (a short code sketch follows this slide)
1. Precision – fraction of records inferred by the attack model as members of the training dataset that are indeed members
2. Recall – fraction of members that are correctly identified as members by the attack model
• Test set: 50% members & 50% non-members of the target model's training data, so the baseline precision is 0.5
• Attack models are trained using a similar architecture (NN / Google Prediction API)
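A minimal sketch of the two metrics over such a balanced test set (names are illustrative).

    import numpy as np

    def precision_recall(predicted_member, true_member):
        # Boolean arrays over the attack's test set (half members, half non-members).
        predicted_member = np.asarray(predicted_member, dtype=bool)
        true_member = np.asarray(true_member, dtype=bool)
        tp = np.sum(predicted_member & true_member)
        precision = tp / max(1, np.sum(predicted_member))   # correct among records called "in"
        recall = tp / max(1, np.sum(true_member))            # members actually found
        return float(precision), float(recall)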
[R1] Number of Classes & Training Set Size
[Figure: attack precision vs. training set size, shown for K = 10 and K = 100.]
[O1] As the target model's training set size increases, the attack model's precision drops.
[O2] As K increases, information leakage increases (precision approaches 1).
[R2] Overfitting & Model Types
[Figure: attack precision for different target model types; overfitting per target model: 0.34, 0.5, 0.5, 0.16.]
[O3] The Google Prediction API leaks more compared to Amazon ML or the NN.
[O4] A less overfitted model leaks less, e.g. the NN model.
[O5] Overfitting is not the ONLY reason for information leakage; the model structure & architecture also contribute, e.g. Google Prediction API vs Amazon ML.
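For reference, the overfitting numbers above and the "Overfit/Performance Gap" values in the six-dataset summary below appear to quantify the target model's generalization gap: overfitting ≈ accuracy on training data − accuracy on test data, so a value of 0 means the model performs equally well on data it has and has not seen.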
[R3] Performance with Noisy Data
[O6] As the noise in the attack model's training data increases, its performance drops.
[O7] Even at a noise level of 20%, the attack model outperforms the baseline (0.5).
* This indicates that the proposed attack is robust even when the adversary's assumption about the target model's training data is not accurate.
[R4] Performance with Synthetic Data
[O8] There is a significant performance drop when marginal-based (statistics-based) synthetic data is used.
[O9] The attack using model-based synthetic data performs close to the attack using real data, except for classes with few training examples in the target model's training data.
* Membership inference is thus possible with only black-box access to the target model and without any knowledge about the data population.
Fewer than a 0.1 fraction of the classes perform poorly; these classes contribute under 0.6% of the training data.
Performance in Six Datasets
[Figure: attack precision per dataset; overfit/performance gap per target model: 0, 0.06, 0.33, 0.01, 0.13, 0.22, 0.31, 0.34, 0.15.]
Attack performance is low when:
1. [O2] K is low
2. [O4] overfitting is low
Attack performance is high when:
1. [O4] overfitting is high
In some cases attack performance is low even with high overfitting:
[O5] overfitting is not the ONLY reason for information leakage.
[R5] Why is the attack successful?
• Prediction uncertainty vs. accuracy of the target model
  PU = normalized entropy of the prediction vector (defined in the sketch below)
[Figure: distribution (fraction of test samples) of prediction uncertainty for members vs. non-members, for K = 10 and K = 100.]
[O10] When PU is low, the accuracy of the target model is high.
[O11] As K increases, PU increases & the accuracy of the target model drops.
[O12] As K increases, the PU distribution differs significantly between members and non-members.
* The attack exploits this behavioral difference of the target model on members vs. non-members.
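A sketch of the prediction-uncertainty metric, assuming PU is the entropy of the prediction vector divided by its maximum value log(k), so that PU lies in [0, 1].

    import numpy as np

    def prediction_uncertainty(y, eps=1e-12):
        # Normalized entropy of a prediction vector y = [p_1, ..., p_k]:
        # -(1/log k) * sum_i p_i * log(p_i), which is ~0 for a one-hot
        # (fully confident) prediction and 1 for a uniform prediction.
        y = np.clip(np.asarray(y, dtype=float), eps, 1.0)
        return float(-np.sum(y * np.log(y)) / np.log(len(y)))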
Mitigation Strategies
1. Restrict the prediction vector to the top k classes; in the most restrictive scenario, return only the label of the most likely class
2. Round the probabilities to d floating-point digits
3. Increase the entropy of the prediction vector
   - E.g. apply a temperature t to the softmax layer of the NN
4. Regularization – penalize large model parameters (e.g. L2 regularization)
(Sketches of the output-side strategies follow.)
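Minimal sketches of the output-side mitigations (1-3), assuming the defender can post-process the logits or the prediction vector before returning it; regularization (4) is a training-time change and is not shown. All names are illustrative.

    import numpy as np

    def restrict_top_k(y, k=3):
        # Return only the top-k probabilities; everything else is zeroed out.
        y = np.asarray(y, dtype=float)
        out = np.zeros_like(y)
        top = np.argsort(y)[-k:]
        out[top] = y[top]
        return out

    def coarsen(y, d=3):
        # Round the probabilities to d floating-point digits.
        return np.round(np.asarray(y, dtype=float), d)

    def softmax_with_temperature(logits, t=2.0):
        # A higher temperature t flattens the distribution, increasing the
        # entropy of the prediction vector that the client sees.
        z = np.asarray(logits, dtype=float) / t
        z = z - z.max()                      # numerical stability
        e = np.exp(z)
        return e / e.sum()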
[R6] Effect of Mitigation Strategies
[O13] Attack precision drops significantly when ONLY the top-1 label is returned or with high regularization.
[O14] Accuracy drops significantly when t = 20 or with high regularization.
[O15] The target model's performance can even increase with an appropriate amount of regularization.
[O16] High regularization may significantly reduce the target model's performance.
[O17] Even with the mitigation strategies, the attack model outperforms the baseline (0.5).
* The attack is robust.
Conclusion
• The success of the membership inference attack depends on
1. The generalizability of the target model
2. The diversity of the training data
• Overfitting is one of the important reasons for information leakage, but not the ONLY reason; the model type & training procedure also determine the amount of information leakage
• A membership inference attack can succeed with only black-box access to the target model, even if the adversary has no knowledge about the data population
Thank You
