Machine learning
security
PAWEŁ ZAWISTOWSKI
AI and machine learning
help to create new tools
[1] Image: https://pixabay.com/pl/sztuczna-inteligencja-ai-robota-2228610/
Some of them make us
rethink what is “real”
lyrebird.ai
“Lyrebird allows you to create a digital voice that sounds like you with only one minute of audio.” [1]
[1] Quote & image: https://lyrebird.ai/
Learning lip sync from audio
[1] Suwajanakorn, Supasorn, Steven M. Seitz, and Ira Kemelmacher-Shlizerman. "Synthesizing obama: learning lip sync from audio." ACM Transactions on Graphics (TOG) 36.4 (2017): 95.
[2] Image: https://youtu.be/9Yq67CjDqvw
FakeApp
“A desktop app for creating photorealistic faceswap videos made with deep learning” [1]
[1] http://www.fakeapp.org/
[2] Image: Nicolas Cage fake movie compilation: https://youtu.be/BU9YAHigNx8
ML through the security lens
[1] Image: https://pixabay.com/pl/streszczenie-geometryczny-%C5%9Bwiata-1278059/
CIA triad – in machine learning context
Confidentiality – extracting model parameters and training data
Integrity – inducing particular outputs/behaviors of a trained model
Availability – making the model unstable/unusable
Targeting confidentiality
Sharing datasets is tricky
[1] Image: https://www.theguardian.com/world/2018/jan/28/fitness-tracking-app-gives-away-location-of-secret-us-army-bases
A. Narayanan and V. Shmatikov. “Robust de-anonymization of large sparse datasets (how to break anonymity
of the Netflix prize dataset)”. IEEE Symposium on Security and Privacy. 2008.
A possible remedy: differential privacy
• A promise made to a data subject:
“You will not be affected, adversely or otherwise, by allowing your data to be
used in any study or analysis, no matter what other studies, data sets, or
information sources, are available.” [1]
• Adding randomness helps in protecting individual privacy.
[1] Dwork, C., & Roth, A. (2013). The Algorithmic Foundations of Differential Privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4), 211–407.
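To make the “adding randomness” bullet concrete, here is a minimal sketch (mine, not from the talk) of the Laplace mechanism for a counting query: a count has sensitivity 1, so Laplace noise with scale 1/ε gives an ε-differentially-private answer. Function and variable names are illustrative.

```python
import numpy as np

def private_count(data, predicate, epsilon):
    """Return a differentially private count of records matching `predicate`.

    A counting query has sensitivity 1 (one person changes the count by at
    most 1), so Laplace noise with scale 1/epsilon yields epsilon-DP.
    """
    true_count = sum(1 for record in data if predicate(record))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Example: how many users are older than 40, with a privacy budget of 0.5.
users = [{"age": int(a)} for a in np.random.randint(18, 80, size=1000)]
print(private_count(users, lambda u: u["age"] > 40, epsilon=0.5))
```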
Demonstration – a quick survey
Raise your hand if you’ve been involved in some illegal activities.
Demonstration – a quick survey, take 2
Toss a fair coin:
◦ if it’s heads – toss it again and answer “yes” if the second toss is heads (and “no” otherwise),
◦ if it’s tails – answer truthfully.
Statistically, ~25% of the “yes” answers come from the randomness alone; the knowledge is hidden in how far the observed rate exceeds that baseline (see the simulation sketch below).
Raise your hand if you’ve been involved in some illegal activities.
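A small simulation of this randomized-response protocol (my sketch, not part of the slides): with a fair coin, P(yes) = 1/4 + p/2, so the true rate p can be recovered from the observed share of “yes” answers while any individual answer stays deniable.

```python
import random

def randomized_response(truth: bool) -> bool:
    """One respondent follows the coin protocol from the slide."""
    if random.random() < 0.5:          # first toss: heads
        return random.random() < 0.5   # second toss decides the answer
    return truth                       # tails: answer truthfully

def estimate_true_rate(answers):
    """Invert P(yes) = 0.25 + 0.5 * p to recover p from the noisy answers."""
    observed = sum(answers) / len(answers)
    return (observed - 0.25) / 0.5

# Simulate 10,000 respondents, of whom 10% really were involved.
population = [random.random() < 0.10 for _ in range(10_000)]
answers = [randomized_response(t) for t in population]
print(f"observed yes-rate: {sum(answers) / len(answers):.3f}, "
      f"estimated true rate: {estimate_true_rate(answers):.3f}")
```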
Targeting integrity
Rapid progress in image recognition
[1] Left image MNIST: https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png
[2] Right image CIFAR: https://www.cs.toronto.edu/~kriz/cifar.html
[3] Wan, Li, et al. "Regularization of neural networks using dropconnect." International Conference on Machine Learning. 2013.
[4] Graham, Benjamin. "Fractional max-pooling." arXiv preprint arXiv:1412.6071 (2014)
MNIST: 99.79% [3]
CIFAR-10: 96.53% [4]
“5 days after Microsoft announced it had beat the human benchmark of 5.1% errors with a 4.94% error
grabbing neural network, Google announced it had one-upped Microsoft by 0.04%” [1]
[1] https://www.eetimes.com/document.asp?doc_id=1325712
“Human level” results
In the meantime, this happens
street sign → birdhouse
Adversarial examples
[1] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and Harnessing Adversarial Examples”, 2014.
“[…] inputs formed by applying small but intentionally worst-case perturbations […] (which) results in
the model outputting an incorrect answer with high confidence” [1]
Goodfellow et al.
How do these work?
▪ Given a classifier f(x), we need to find a (minimal) perturbation r for which f(x + r) ≠ f(x).
▪ Finding r can be realized as an optimization task (a code sketch follows below).
[1] Black box https://cdn.pixabay.com/photo/2014/04/03/10/22/black-box-310220_960_720.png
[2] White box https://cdn.pixabay.com/photo/2013/07/12/13/55/box-147574_960_720.png
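A minimal sketch of one way to do this, the fast gradient sign method from the cited Goodfellow et al. paper [1], assuming a differentiable PyTorch classifier with inputs in [0, 1] (the helper name is mine):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, label, epsilon=0.03):
    """One-step perturbation r = epsilon * sign(grad_x loss(f(x), label))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    r = epsilon * x.grad.sign()
    return (x + r).clamp(0, 1).detach()  # adversarial input x + r, kept in valid pixel range
```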
How do these work? (continued)
[Diagram: training a model – the training data supplies inputs and labels, the model maps inputs to outputs, and the loss function compares outputs with labels to produce parameter corrections.]
[Diagram: generating adversarial examples – the already trained model is kept fixed; a perturbation is added to the inputs, an adversarial loss is computed on the outputs, and it yields perturbation corrections instead of parameter updates.]
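The second diagram is essentially the training loop with the roles swapped: the trained model is frozen and the perturbation itself is what gets optimized against the adversarial loss. A sketch under that reading (PyTorch assumed, helper name mine):

```python
import torch
import torch.nn.functional as F

def optimize_perturbation(model, x, label, epsilon=0.03, steps=40, lr=0.01):
    """Iteratively update a perturbation r (not the model) so that f(x + r) changes."""
    model.eval()                                   # the trained model stays fixed
    r = torch.zeros_like(x, requires_grad=True)    # the "parameters" being trained
    optimizer = torch.optim.Adam([r], lr=lr)
    for _ in range(steps):
        outputs = model((x + r).clamp(0, 1))
        loss = -F.cross_entropy(outputs, label)    # adversarial loss: push away from the true label
        optimizer.zero_grad()
        loss.backward()                            # gradients act as perturbation corrections
        optimizer.step()
        with torch.no_grad():
            r.clamp_(-epsilon, epsilon)            # keep the perturbation small
    return (x + r).clamp(0, 1).detach()
```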
One step further: adversarial patch
[1] Brown, T. B., Mané, D., Roy, A., Abadi, M., & Gilmer, J. (2017). “Adversarial Patch”
toaster
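A rough sketch of the idea, not code from the paper: optimize the patch pixels so that, pasted at random positions into images (assumed 0–1 normalized and larger than the patch), the classifier is pulled toward a chosen target class such as “toaster”. Helper names are mine.

```python
import torch
import torch.nn.functional as F

def paste_randomly(images, patch):
    """Paste the patch at a random location in each image (no rotation/scaling)."""
    out = images.clone()
    ph, pw = patch.shape[1], patch.shape[2]
    for i in range(images.shape[0]):
        y = torch.randint(0, images.shape[2] - ph + 1, (1,)).item()
        x = torch.randint(0, images.shape[3] - pw + 1, (1,)).item()
        out[i, :, y:y + ph, x:x + pw] = patch
    return out

def train_patch(model, batches, target_class, size=50, steps=500, lr=0.05):
    """Optimize a universal patch toward `target_class` over many random placements."""
    patch = torch.rand(3, size, size, requires_grad=True)
    optimizer = torch.optim.Adam([patch], lr=lr)
    model.eval()
    for _, (images, _) in zip(range(steps), batches):
        patched = paste_randomly(images, patch.clamp(0, 1))
        target = torch.full((images.shape[0],), target_class, dtype=torch.long)
        loss = F.cross_entropy(model(patched), target)  # pull outputs toward "toaster"
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return patch.detach().clamp(0, 1)
```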
Two steps further: adversarial object
[1] Athalye, A., Engstrom, L., Ilyas, A., & Kwok, K. (2017). Synthesizing Robust Adversarial Examples.
[2] Images: http://www.labsix.org/physical-objects-that-fool-neural-nets/
[Diagram: pipeline – an adversarial attack against the trained model produces an adversarial 3D model, which is then 3D printed into a physical object.]
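The robustness across viewpoints comes from optimizing the perturbation in expectation over random transformations (rotation, scale, lighting), as described in Athalye et al. [1]. A compact, hypothetical loss sketch (PyTorch assumed; `transforms` is a list of image-to-image callables):

```python
import random
import torch
import torch.nn.functional as F

def eot_loss(model, x_adv, target_class, transforms, n_samples=10):
    """Average the targeted adversarial loss over randomly sampled transformations."""
    target = torch.full((x_adv.shape[0],), target_class, dtype=torch.long)
    losses = [F.cross_entropy(model(random.choice(transforms)(x_adv)), target)
              for _ in range(n_samples)]
    return torch.stack(losses).mean()
```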
Papernot et al.: machine learning pipeline security
Papernot et al.: “SoK: Towards the Science of Security and Privacy in Machine Learning”
Defense methods – first attempts
• Gradient masking.
• Defensive distillation.
[1] Image: http://cdn.emgn.com/wp-content/uploads/2016/01/society-will-fail-emgn-16.jpg
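Neither of these early defenses turned out to be robust, but as a sketch of defensive distillation as I read it (hypothetical helpers, PyTorch assumed): a teacher network is trained at a high softmax temperature, and a student is trained on the teacher’s softened outputs, smoothing the gradients an attacker relies on.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=20.0):
    """Cross-entropy of the student against the teacher's softened labels."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=1)
    log_probs = F.log_softmax(student_logits / temperature, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

def train_student(student, teacher, loader, epochs=10, lr=1e-3, temperature=20.0):
    """Train the student on the teacher's soft labels (true labels are not needed)."""
    teacher.eval()
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for images, _ in loader:
            with torch.no_grad():
                teacher_logits = teacher(images)
            loss = distillation_loss(student(images), teacher_logits, temperature)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```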
Extending the training data set
[Diagram: loop – train the model on the training data, perform an attack on it to obtain adversarial examples, extend the dataset with them, and repeat.]
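A sketch of this loop with hypothetical helpers (`train_fn` fits the model; `attack_fn` could be the FGSM sketch shown earlier):

```python
import torch

def adversarial_training(model, train_fn, attack_fn, images, labels, rounds=3):
    """Iteratively extend the dataset with adversarial examples and retrain."""
    for _ in range(rounds):
        model = train_fn(model, images, labels)        # train the model
        adv_images = attack_fn(model, images, labels)  # perform the attack
        images = torch.cat([images, adv_images])       # extend the dataset...
        labels = torch.cat([labels, labels])           # ...keeping the true labels
    return model
```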
Detecting adversarial inputs
[Diagram: inputs first pass through an attack detector; only inputs it clears reach the online model, which produces the outputs.]
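One way to read this diagram, as a minimal sketch assuming a separately trained binary detector that returns an attack-probability score:

```python
def guarded_predict(detector, model, x, threshold=0.5):
    """Pass x to the online model only if the attack detector clears it."""
    score = float(detector(x))              # assumed: probability that x is adversarial
    if score > threshold:
        raise ValueError("input rejected: likely adversarial")
    return model(x)
```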
Adding some noise
[Diagram: random noise is added to the inputs before they reach the online model, which then produces the outputs.]
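And the noise idea as a short sketch (the noise scale is an assumption): randomizing the input at inference time can wash out small, precisely tuned perturbations, at some cost in clean accuracy.

```python
import torch

def predict_with_noise(model, x, sigma=0.1):
    """Add Gaussian noise to the input before querying the online model."""
    return model((x + sigma * torch.randn_like(x)).clamp(0, 1))
```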
Conclusions
[1] http://maxpixel.freegreatpicture.com/
“In the history of science and technology, the
engineering artifacts have almost always
preceded the theoretical understanding […] if you
are not happy with our understanding of the
methods you use everyday, fix it” [2]
Yann LeCun
[1] http://maxpixel.freegreatpicture.com/
[2] Comment on Ali Rahimi's "Test of Time" award talk at NIPS
Thank you for your
attention!
ON A SIDE NOTE – WE’RE HIRING! ☺
[1] http://maxpixel.freegreatpicture.com/
