"AI Reliability Against Adversarial Inputs," a Presentation from Intel


Gokcen Cilingir, AI Software Architect, and Li Chen, Data Scientist and Research Scientist, both at Intel, present the "AI Reliability Against Adversarial Inputs" tutorial at the May 2019 Embedded Vision Summit.

As artificial intelligence solutions become ubiquitous, the security and reliability of AI algorithms are becoming an important consideration and a key differentiator for both solution providers and end users. AI solutions, especially those based on deep learning, are vulnerable to adversarial inputs, which can cause inconsistent and faulty system responses. Since adversarial inputs are intentionally designed to cause an AI solution to make mistakes, they are a form of security threat.

Although security-critical functions like login based on face, voice or fingerprint are the most obvious solutions requiring robustness against adversarial threats, many other AI solutions will also benefit from robustness against adversarial inputs, as this enables improved reliability and therefore enhanced user experience and trust. In this presentation, Cilingir and Chen explore selected adversarial machine learning techniques and principles from the point of view of enhancing the reliability of AI-based solutions.


  1. AI Reliability Against Adversarial Inputs
     Gokcen Cilingir, Li Chen, Intel, May 2019. © 2019 Intel
  2. Motivation
     Adversarial examples are already seen in our daily life. Designing AI solutions that are robust against adversarial inputs is important for:
       • [Security] Critical system asset management and defense against malware
       • [Reliability] Ensuring consistent and reliable system behavior
     [Figure labels: AI-based malware detector, "PASS"; automated speech recognition, adversarial input, "A B C", "X Y Z", faulty prediction]
  3. How big can the damage be?
  4. Adversarial attack examples: from malware detection to other domains. Face recognition. Image source: [1]
  5. Adversarial attack examples: autonomous driving. Image source: [2]
  6. Adversarial attack examples: automated speech recognition. Image source: [3]
  7. How does it work?
  8. Adversarial ML concepts
       • Evasion vs. data poisoning attacks
       • Threat models
       • Substitute model creation and transferability
     [Figure: a taxonomy of adversaries against machine learning models at test time (with evasion as the goal). Image source: [6]]
  9. Adversarial example creation
       • The very tool that makes ML powerful is being used to break it: optimization.
       • An adversarial example x* is found by perturbing an originally correctly classified input x by (approximately) solving a constrained optimization problem, where t is the target class. [Equation shown on the slide; not captured in this transcript]
     [Figure: an image predicted as "School bus", the added noise magnified by 10x, and the perturbed image predicted as "Ostrich". Image source: [4]. Text adapted from [5]]
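The constrained optimization described on slide 9 can be made concrete on a toy model. The sketch below is not from the presentation. It assumes a hypothetical linear binary classifier (weights `w`, bias `b`) and measures perturbation size with the L-infinity norm. Under those assumptions the smallest perturbation that moves an input into the target class t has a closed form. Deep networks generally have no such closed form, which is why the problem is solved only approximately in practice. All names (`score`, `predict`, `minimal_linf_perturbation`, `margin`) are illustrative.

```python
# Toy illustration (assumed setup, not the presenters' code):
# a linear classifier that predicts class 1 when w.x + b > 0, else class 0.

def score(w, b, x):
    """Raw decision score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


def minimal_linf_perturbation(w, b, x, target, margin=1e-3):
    """Smallest L-infinity perturbation z with predict(x + z) == target.

    Moving every feature by eps in the direction sign(w) changes the score
    by eps * ||w||_1, so eps is chosen to land `margin` past the boundary.
    """
    if predict(w, b, x) == target:
        return [0.0] * len(x)  # already classified as the target class
    l1_norm = sum(abs(wi) for wi in w)
    if l1_norm == 0:
        raise ValueError("a constant model cannot be pushed to another class")
    direction = 1 if target == 1 else -1
    eps = (margin - direction * score(w, b, x)) / l1_norm
    return [direction * eps * sign(wi) for wi in w]


# Usage with made-up numbers: x starts in class 0, target class t = 1.
w, b = [2.0, -1.0, 0.5], -0.5
x = [0.2, 0.9, 0.4]
z = minimal_linf_perturbation(w, b, x, target=1)
x_adv = [xi + zi for xi, zi in zip(x, z)]
print("clean prediction:", predict(w, b, x))
print("adversarial prediction:", predict(w, b, x_adv))
print("largest per-feature change:", max(abs(zi) for zi in z))
```

The `margin` parameter only keeps the perturbed input strictly on the target side of the boundary. A real attack would also have to respect domain constraints, such as valid pixel ranges.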
  10. How has machine learning been exploited?
       • Take binary classification as an example. Through training, one can generally learn only an approximation of the true decision boundary.
       • Adversaries exploit the model error, that is, the gap between the approximate and the expected decision boundaries.
     [Figure illustrating the gap between the two boundaries. Image source and text adapted from [5]]
  11. Defense and mitigation
  12. High-level flow for AI-based solutions
     [Diagram labels]
       • CLOUD: Training dataset, AI model definition and training, AI model
       • CLIENT (AI-based application): Input, AI inference, Prediction & confidence, Action determination, Action
  13. Defense and mitigation against adversarial attacks
     [Diagram labels: the flow from slide 12 with the following additions]
       • CLOUD: Adversary-aware AI model definition and training, producing a strengthened AI model. Examples: adversarial training, defensive distillation, logit pairing, architectural modifications like BNNs
       • CLIENT: Pre-processing for perturbation removal. Examples: JPEG compression, SHIELD, MP3 audio compression
       • CLIENT: Input validation / adversary detection. Examples: distributional detection, normalization detection, PCA-based detection, secondary classification
       • CLIENT: Mitigation policy
       • Robustness metric
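One way to read the client-side additions on slide 13 is as a wrapper around inference. The sketch below is an assumption-laden illustration, not the presenters' design. The model is a hypothetical two-feature logistic classifier, pre-processing is plain quantization (a crude stand-in for the compression-based methods named on the slide), detection flags inputs whose prediction changes sharply after pre-processing (the idea behind feature squeezing, which the slide does not list), and the mitigation policy is a simple accept / defer / reject rule. The thresholds and all function names are made up.

```python
import math


def squeeze(x, levels=4):
    """Pre-processing: quantize features assumed to lie in [0, 1]."""
    return [round(v * (levels - 1)) / (levels - 1) for v in x]


def prob_class1(w, b, x):
    """Hypothetical model: logistic probability of class 1."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-s))


def guarded_inference(w, b, x, detect_threshold=0.3, min_confidence=0.7):
    """Return (decision, label, confidence) for one input."""
    p_raw = prob_class1(w, b, x)
    p_clean = prob_class1(w, b, squeeze(x))

    # Input validation: a large shift after squeezing is treated as suspicious.
    if abs(p_raw - p_clean) > detect_threshold:
        return ("reject", None, p_clean)

    label = 1 if p_clean >= 0.5 else 0
    confidence = max(p_clean, 1.0 - p_clean)

    # Mitigation policy: low-confidence predictions are deferred,
    # for example to a fallback check or a human.
    if confidence < min_confidence:
        return ("defer", label, confidence)
    return ("accept", label, confidence)


# Usage with made-up weights and inputs.
w, b = [6.0, -6.0], 0.0
for x in ([1.0, 0.0], [0.4, 0.3], [0.51, 0.49]):
    print(x, guarded_inference(w, b, x))
```

Keeping detection and the mitigation policy separate from the model means either can be tuned or replaced without retraining, which matches the separation of boxes in the diagram.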
  14. Defenses and their limitations
       • Current status: the race between attackers and defenders continues. All known defense techniques come with limitations.
       • Adversarial training uses generated adversarial examples as part of training.
       • Architectures like Bayesian NNs provide better uncertainty modeling.
       • Detection and pre-processing methods are generally domain specific.
       • SHIELD compresses away small pixel manipulations in images; MP3 audio compression applies the same idea to audio data.
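Adversarial training, as described on slide 14, can be sketched in a few lines for a toy model. The example below is an illustration under stated assumptions and not the presenters' method: the model is a small logistic regression, adversarial examples are generated with the fast gradient sign method (a common choice that the slides do not name), and each epoch trains on the clean points plus their freshly generated adversarial copies. The dataset, `eps`, learning rate and epoch count are made up.

```python
import math


def predict_proba(w, b, x):
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-s))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """Shift each feature by eps in the direction that increases the loss.

    For logistic loss the gradient with respect to x is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def train(X, Y, eps=0.0, lr=0.5, epochs=200):
    """Gradient descent on logistic loss; eps > 0 adds adversarial copies."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        batch = list(zip(X, Y))
        if eps > 0:
            # Adversarial examples are regenerated against the current model.
            batch += [(fgsm(w, b, x, y, eps), y) for x, y in zip(X, Y)]
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in batch:
            err = predict_proba(w, b, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        n = len(batch)
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, Y):
    hits = sum((predict_proba(w, b, x) >= 0.5) == (y == 1) for x, y in zip(X, Y))
    return hits / len(X)


# Usage on a tiny made-up dataset.
X = [[1.0, 1.0], [1.5, 0.5], [0.5, 1.5], [-1.0, -1.0], [-1.5, -0.5], [-0.5, -1.5]]
Y = [1, 1, 1, 0, 0, 0]
w, b = train(X, Y, eps=0.3)
X_adv = [fgsm(w, b, x, y, 0.3) for x, y in zip(X, Y)]
print("clean accuracy:", accuracy(w, b, X, Y))
print("accuracy under FGSM:", accuracy(w, b, X_adv, Y))
```

On data this well separated, ordinary training would also hold up against the same perturbation. The sketch shows the mechanics of the training loop rather than a measured robustness gain.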
  15. Tools for attack simulations and defenses
  16. MLsploit: A Cloud-Based Framework for Adversarial Machine Learning Research
       • Research module for adversarial machine learning
       • Interactive interface and experimentation
       • Comparison of attacks and defenses
       • Easy integration
  17. [No text captured for this slide]
  18. Toolkits
       • Adversarial Robustness Toolbox (ART): A Python library for adversarial attacks and defenses (evasion, poisoning) for NNs. It also provides robustness metrics.
       • Cleverhans: An adversarial example library for constructing attacks, building defenses, and benchmarking both.
       • ALFASVMLib: An open-source MATLAB library that implements a set of heuristic attacks against Support Vector Machines (SVMs).
       • Foolbox: A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, Keras, MXNet, and other frameworks.
       • MLsploit: A web platform that demonstrates Machine Learning as a Service for security research. It provides a portal to demo adversarial ML and countermeasures.
  19. Conclusion
       • ML can be vulnerable to adversarial examples.
       • Designing AI solutions that are robust against adversarial inputs is important for the security and reliability of critical applications.
       • Several free and open-source toolkits exist to assess and strengthen AI models.
  20. Resources
     [1] Sharif, Mahmood, et al. "Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition." Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016.
     [2] Metzen, Jan Hendrik, et al. "Universal adversarial perturbations against semantic image segmentation." 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017.
     [3] Carlini, Nicholas, and David Wagner. "Audio adversarial examples: Targeted attacks on speech-to-text." 2018 IEEE Security and Privacy Workshops (SPW). IEEE, 2018.
     [4] Szegedy, Christian, et al. "Intriguing properties of neural networks." arXiv preprint arXiv:1312.6199 (2013).
     [5] Goodfellow, Ian, Patrick McDaniel, and Nicolas Papernot. "Making machine learning robust against adversarial inputs." Communications of the ACM 61.7 (2018): 56-66.
     [6] Papernot, Nicolas, et al. "The limitations of deep learning in adversarial settings." 2016 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2016.
     [7] Das, Nilaksh, et al. "Shield: Fast, practical defense and vaccination for deep learning using jpeg compression." Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2018.
     [8] Das, Nilaksh, et al. "ADAGIO: Interactive Experimentation with Adversarial Attack and Defense for Audio." Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Cham, 2018.
  21. Glossary
       • Adversarial example: An input that has been intentionally optimized to cause misclassification.
       • Threat model: In the context of adversarial ML, the explicit definition of the adversary's capabilities.
       • White-box threat model: An adversary with access to (at minimum) the model architecture and the parameter values.
       • Black-box threat model: An adversary with access to either just the samples or the oracle (the system output is visible).
       • Adversarial sample transferability: The property that adversarial samples produced using a specific model can affect another model, even if the two have different architectures and/or training data.
       • Evasion attack: The adversary tries to evade the system by adjusting malicious samples during the testing phase.
       • Data poisoning attack: The adversary tries to poison the training data by injecting carefully designed samples to compromise the learning process.
  22. Legal Disclaimers
       • Intel provides these materials as-is, with no express or implied warranties.
       • Intel products may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
       • Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at
       • No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
     Copyright © 2019 Intel Corporation. All rights reserved. Intel, the Intel logo and others are trademarks of Intel Corporation in the U.S. and/or other countries.