Machine learning models, especially deep neural networks, have been shown to reveal membership information about inputs in the training data. Such membership inference attacks are a serious privacy concern; for example, patients providing medical records to build a model that detects HIV would not want their identity to be leaked. Further, we show that the attack accuracy is amplified when the model is used to predict samples that come from a different distribution than the training set, which is often the case in real-world applications. Therefore, we propose the use of causal learning approaches, where a model learns the causal relationship between the input features and the outcome. An ideal causal model is known to be invariant to the training distribution and hence generalizes well both to shifts within the same distribution and across different distributions. First, we prove that models learned using causal structure provide stronger differential privacy guarantees than associational models under reasonable assumptions. Next, we show that causal models trained on sufficiently large samples are robust to membership inference attacks across different distributions of datasets, and that those trained on smaller sample sizes always have lower attack accuracy than corresponding associational models. Finally, we confirm our theoretical claims with experimental evaluation on 4 moderately complex Bayesian network datasets and a colored MNIST image dataset. Associational models exhibit up to 80% attack accuracy under different test distributions and sample sizes, whereas causal models exhibit attack accuracy close to a random guess. Our results confirm the value of the generalizability of causal models in reducing susceptibility to privacy attacks. Paper available at https://arxiv.org/abs/1909.12732
1. Alleviating Privacy Attacks via
Causal Learning
Shruti Tople, Amit Sharma, Aditya V. Nori
Microsoft Research
https://arxiv.org/abs/1909.12732
https://github.com/microsoft/robustdg
2. Motivation: ML models leak information
about data points in the training set
[Figure: health records of HIV/AIDS patients are used to train a neural network offered as ML-as-a-service; an adversary queries the model to decide whether a record is a member of the train dataset or a non-member.]
Membership Inference Attacks [SP'17] [CSF'18] [NDSS'19] [SP'19]
3. The likely reason is overfitting
• Neural networks and other associational models overfit to the training dataset.
• A membership inference adversary exploits differences in prediction scores between training and test data [CSF'18].
[Figure: the model outputs a 95% score on a training point but only 85% on a test point, illustrating overfitting to the dataset.]
4. The likely reason is overfitting
• Neural networks and other associational models overfit to the training dataset.
• Membership inference attacks exploit differences in prediction scores between training and test data [CSF'18].
• Privacy risk can increase when the model is deployed on different distributions.
• E.g., a hospital in one region shares the model with other regions.
[Figure: 95% score on a training point, 85% on an in-distribution test point, and only 75% on an out-of-distribution point, illustrating overfitting to both the dataset and the distribution.]
Poor generalization across distributions exacerbates membership inference risk.
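The score-gap intuition above can be sketched as a minimal threshold attack. This is a hypothetical illustration, not the attack used in the paper's experiments: the confidence distributions and the 0.90 threshold are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical confidence scores: an overfit model is more confident
# on training points (mean 0.95) than on unseen points (mean 0.85).
train_conf = np.clip(rng.normal(0.95, 0.03, 1000), 0, 1)
test_conf = np.clip(rng.normal(0.85, 0.05, 1000), 0, 1)

def mi_attack(conf, threshold=0.90):
    """Guess 'member' whenever the prediction score exceeds the threshold."""
    return conf >= threshold

tpr = mi_attack(train_conf).mean()   # members correctly flagged
fpr = mi_attack(test_conf).mean()    # non-members wrongly flagged
advantage = tpr - fpr                # membership advantage [CSF'18]
accuracy = 0.5 * (tpr + (1 - fpr))   # balanced attack accuracy

print(f"advantage={advantage:.2f}, attack accuracy={accuracy:.2f}")
```

The larger the train/test score gap, the better this simple threshold separates members from non-members.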
6. Can causal ML models help?
Contributions
1. Causal models provide stronger (differential) privacy guarantees than associational models.
• Due to their better generalizability on new distributions.
2. And hence are more robust to membership inference attacks.
• As the training dataset size n → ∞, the membership inference attack's accuracy drops to a random guess.
3. We empirically demonstrate privacy benefits of causal models across 5 datasets.
• Associational models exhibit up to 80% attack accuracy whereas causal models exhibit attack accuracy close to 50%.
[Diagram: Causal Learning + Privacy]
8. Background: Causal Learning
Use a structural causal model (SCM) that defines which conditional probabilities are invariant across different distributions [Pearl'09].
Causal Predictive Model: a prediction model based only on the parents of the outcome Y.
What if the SCM is not known? Learn an invariant feature representation across distributions [ABGD'19, MTS'20].
For ML models, causal learning can be useful for fairness [KLRS'17], explainability [DSZ'16, MTS'19], and privacy [this work].
[Figure: example SCM in which Blood Pressure and Heart Rate are the parents of Disease Severity Y, and are themselves caused by Weight and Age.]
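A causal predictive model of this kind can be sketched on toy data: fit P(Y | parents of Y) by maximum likelihood and predict from the parents alone. The SCM below is a hypothetical binary version of the slide's example, not one of the paper's datasets.

```python
import numpy as np
from collections import Counter, defaultdict

rng = np.random.default_rng(1)
n = 5000

# Toy SCM mirroring the slide: Weight and Age cause Blood Pressure and
# Heart Rate, which are the parents of Disease Severity Y.
weight = rng.integers(0, 2, n)
age = rng.integers(0, 2, n)
bp = (weight | age) ^ (rng.random(n) < 0.1)   # noisy mechanisms
hr = (weight & age) ^ (rng.random(n) < 0.1)
y = (bp.astype(int) + hr.astype(int) >= 1).astype(int)

# Causal predictive model: MLE of P(Y | parents of Y), i.e. only (bp, hr).
counts = defaultdict(Counter)
for b, h, label in zip(bp, hr, y):
    counts[(b, h)][label] += 1

def predict(b, h):
    # Most likely label given the parent configuration.
    return counts[(b, h)].most_common(1)[0][0]

preds = np.array([predict(b, h) for b, h in zip(bp, hr)])
print("train accuracy:", (preds == y).mean())
```

Because the predictor ignores Weight and Age, it depends only on P(Y | parents), the quantity that stays invariant under distribution shift.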
10. Why is a model based on causal parents invariant across data distributions?
[Figure: an SCM with outcome Y, its parents X_PA, its children X_CH, and independent features, shown before and after an intervention on features other than Y's parents.]
P(Y | X_PA) is invariant across different distributions, unless there is a change in the true data-generating process for Y.
11. Result 1: Worst-case out-of-distribution error of a causal model is lower than that of an associational model.
12. For any model h, and any P* such that P*(Y | X_PA) = P(Y | X_PA):

In-Distribution Error: IDE_P(h, f) = L_P(h, f) − L_{S∼P}(h, f)
Expected loss on the same distribution as the train data.

Out-of-Distribution Error: ODE_{P,P*}(h, f) = L_{P*}(h, f) − L_{S∼P}(h, f)
Expected loss on a distribution P* different from the train data.

13. Proof Idea. Simple case: assume y = f(x) is deterministic.

Causal model: ODE_{P,P*}(h_c, f) ≤ IDE_P(h_c, f) + disc(P, P*)
where disc(P, P*) is the discrepancy between the P and P* distributions.

14. Associational model: ODE_{P,P*}(h_a, f) ≤ IDE_P(h_a, f) + disc(P, P*) + L_{P*}(h_{OPT,P}, f)
The extra term arises because the loss-minimizing associational model h_{OPT,P} on P is not optimal on P*.

⇒ max_{P*} ODE-bound_{P,P*}(h_c, f) ≤ max_{P*} ODE-bound_{P,P*}(h_a, f)

Result 1: Worst-case out-of-distribution error of a causal model is lower than that of an associational model.
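The in-distribution versus out-of-distribution error gap can be illustrated numerically. The toy setup below is hypothetical (not from the paper): y is caused only by x_pa, while x_sp is a spurious feature whose correlation with y changes between distributions.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample(n, spurious_corr):
    """x_pa causes y; x_sp merely correlates with y (strength varies by domain)."""
    x_pa = rng.integers(0, 2, n)
    y = x_pa ^ (rng.random(n) < 0.1)            # y depends only on its parent
    x_sp = y ^ (rng.random(n) > spurious_corr)  # spurious feature
    return x_pa.astype(int), x_sp.astype(int), y.astype(int)

def loss(pred, y):
    return np.mean(pred != y)

# Train distribution P: spurious feature is almost perfectly correlated.
x_pa, x_sp, y = sample(20000, 0.95)
causal_train = loss(x_pa, y)   # h_c predicts from the causal parent
assoc_train = loss(x_sp, y)    # h_a predicts from the spurious feature

# Shifted distribution P*: correlation drops, but P*(Y | X_PA) is unchanged.
x_pa2, x_sp2, y2 = sample(20000, 0.6)
ode_causal = loss(x_pa2, y2) - causal_train
ode_assoc = loss(x_sp2, y2) - assoc_train
print(f"ODE causal={ode_causal:.2f}, ODE associational={ode_assoc:.2f}")
```

The causal predictor's error barely moves under the shift, while the spurious-feature predictor's out-of-distribution error grows, matching the bound above.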
15. And better generalization results in lower sensitivity for a causal model
Sensitivity: if a single data point (x, y) ∼ P* is added to the train dataset S to create S′, how much does the learnt loss-minimizing model h_S change?
Since the optimal causal model is the same across all P*, adding any (x, y) ∼ P* has less impact on a trained causal model.
Sensitivity of a causal model ≤ sensitivity of an associational model.
16. Main Result: A causal model has stronger Differential Privacy guarantees
Let M be a mechanism that returns an ML model trained over dataset S: M(S) = h.
Differential Privacy [DR'14]: a learning mechanism M satisfies ε-differential privacy if, for any two datasets S, S′ that differ in one data point,
Pr(M(S) ∈ H) / Pr(M(S′) ∈ H) ≤ e^ε.
(Smaller ε values provide better privacy guarantees.)
Since lower sensitivity ⇒ lower ε,
Theorem: When equivalent Laplace noise is added and models are trained on the same dataset, the causal mechanism M_C provides ε_C-DP and the associational mechanism M_A provides ε_A-DP guarantees such that:
ε_C ≤ ε_A
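The sensitivity-to-ε link can be sketched with the standard Laplace mechanism. The sensitivity values below are hypothetical placeholders, not quantities computed in the paper; the point is only that with the same noise scale, lower sensitivity yields a smaller (stronger) ε.

```python
import numpy as np

rng = np.random.default_rng(3)

# Laplace mechanism: releasing f(S) + Lap(sensitivity / epsilon) gives
# epsilon-DP. Equivalently, with a FIXED noise scale b, the achieved
# guarantee is epsilon = sensitivity / b, so the lower-sensitivity
# learner gets the smaller epsilon, as in the theorem above.
b = 0.5            # same Laplace noise scale for both mechanisms
sens_causal = 0.1  # hypothetical sensitivity of the causal learner
sens_assoc = 0.4   # hypothetical sensitivity of the associational learner

eps_causal = sens_causal / b
eps_assoc = sens_assoc / b

noisy_release = 0.7 + rng.laplace(scale=b)  # one private model statistic
print(f"eps_C={eps_causal:.2f} <= eps_A={eps_assoc:.2f}")
```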
17. Therefore, causal models are more robust to membership inference (MI) attacks
Advantage of an MI adversary: (true positive rate − false positive rate) in detecting whether x is from the training dataset or not.
[From Yeom et al., CSF'18] The membership advantage of an adversary is bounded by e^ε − 1.
Since the optimal causal models are the same for P and P*:
as n → ∞, the membership advantage of a causal model → 0.
Theorem: When trained on the same dataset of size n, the membership advantage of a causal model is lower than the membership advantage of an associational model.
19. Goal: Compare MI attack accuracy between causal and associational models
[BN] When the true causal structure is known
• Datasets generated from Bayesian networks: Child, Sachs, Water, Alarm
• Causal model: MLE estimation based on Y's parents
• Associational model: neural network with 3 linear layers
• P*: noise added to conditional probabilities (uniform or additive)
[MNIST] When the true causal structure is unknown
• Colored MNIST dataset (digits are correlated with color)
• Causal model: Invariant Risk Minimization, which exploits that P(Y | X_PA) is the same across distributions [ABGD'19]
• Associational model: Empirical Risk Minimization using the same NN architecture
• P*: different correlations between color and digit than the train dataset
Attacker model: predict whether an input belongs to the train dataset or not
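The colored MNIST shift can be sketched by how the color channel is assigned. The correlation values below (0.9 at train, 0.1 at test) are illustrative, not necessarily those used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(4)

def colorize(labels, corr):
    """Assign each digit a binary color that agrees with its label's
    parity with probability `corr`. A model that relies on color will
    fail when the correlation is changed at test time."""
    agree = rng.random(len(labels)) < corr
    return np.where(agree, labels % 2, 1 - labels % 2)

labels = rng.integers(0, 10, 10000)
train_color = colorize(labels, 0.9)  # color highly predictive at train time
test_color = colorize(labels, 0.1)   # correlation reversed at test time

parity = labels % 2
print("train color/parity agreement:", (train_color == parity).mean())
print("test color/parity agreement:", (test_color == parity).mean())
```

An ERM model happily latches onto color; an IRM-style model is pushed toward the digit shape, the feature whose relationship with the label is invariant.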
20. [BN] With uniform noise, MI attack accuracy for a causal model is near a random guess
[Plot: attack accuracy reaches ~80% for the associational model but stays near 50% for the causal model.]
For associational models, the attacker can guess membership in the training set with 80% accuracy.
21. [BN-Child] With uniform noise, MI attack accuracy for a causal model is near a random guess
[Plot: ~80% attack accuracy for the associational model vs. ~50% for the causal model.]
For associational models, the attacker can guess membership in the training set with 80% accuracy.
Privacy without loss in utility: causal and DNN models achieve the same prediction accuracy.
22. [BN-Child] MI attack accuracy increases with the amount of noise for associational models, but stays constant at 50% for causal models
23. [BN] Consistent results across all four datasets
High attack accuracy for associational models when P* (Test2) has uniform noise.
Same classification accuracy for causal and associational models.
24. [MNIST] MI attack accuracy is lower for the invariant risk minimizer than for the associational model
The IRM model, motivated by causal reasoning, has 53% attack accuracy, close to random. The associational model also fails to generalize: 16% accuracy on the test set.

Model                      | Train Accuracy (%) | Test Accuracy (%) | Attack Accuracy (%)
Causal Model (IRM)         | 70                 | 69                | 53
Associational Model (ERM)  | 87                 | 16                | 66
25. Conclusion
• Established a theoretical connection between causality and differential privacy.
• Demonstrated the benefits of causal ML models for alleviating privacy attacks, both theoretically and empirically.
• Code available at https://github.com/microsoft/robustdg
Future work: investigate robustness of causal models against other kinds of adversarial attacks.
[Diagram: Causal Learning + Privacy]
Thank you!
Amit Sharma
Microsoft Research
26. References
• [ABGD'19] Martin Arjovsky, Léon Bottou, Ishaan Gulrajani, and David Lopez-Paz. Invariant risk minimization. arXiv preprint arXiv:1907.02893, 2019.
• [CSF'18] Yeom, S., Giacomelli, I., Fredrikson, M., and Jha, S. Privacy risk in machine learning: Analyzing the connection to overfitting. CSF 2018.
• [DR'14] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4):211–407, 2014.
• [DSZ'16] Anupam Datta, Shayak Sen, and Yair Zick. Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. IEEE Symposium on Security and Privacy (SP), pp. 598–617, 2016.
• [KLRS'17] Matt J. Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. Advances in Neural Information Processing Systems, pp. 4066–4076, 2017.
• [MTS'19] Divyat Mahajan, Chenhao Tan, and Amit Sharma. Preserving causal constraints in counterfactual explanations for machine learning classifiers. arXiv preprint arXiv:1912.03277, 2019.
• [MTS'20] Divyat Mahajan, Shruti Tople, and Amit Sharma. Domain generalization using causal matching. arXiv preprint arXiv:2006.07500, 2020.
• [NDSS'19] Salem, A., Zhang, Y., Humbert, M., Fritz, M., and Backes, M. ML-Leaks: Model and data independent membership inference attacks and defenses on machine learning models. NDSS 2019.
• [SP'17] Shokri, R., Stronati, M., Song, C., and Shmatikov, V. Membership inference attacks against machine learning models. IEEE Symposium on Security and Privacy (SP), 2017.
• [SP'19] Nasr, M., Shokri, R., and Houmansadr, A. Comprehensive privacy analysis of deep learning: Stand-alone and federated learning under passive and active white-box inference attacks. IEEE Symposium on Security and Privacy (SP), 2019.