
Adversarial Attacks on A.I. Systems — NextCon, Jan 2019


Machine Learning is itself just another tool, susceptible to adversarial attacks. These can have huge implications, especially in a world with self-driving cars and other automation. In this talk, we will look at recent developments in the world of adversarial attacks on the A.I. systems, and how far we have come in mitigating these attacks.


  1. 1. Adversarial Attacks on A.I. Systems 1 Anant Jain Co-founder, commonlounge.com (Compose Labs) https://commonlounge.com
 https://index.anantja.in NEXTCon Online AI Tech Talk Series Friday, Jan 18
  2. 2. 2 Are adversarial examples simply a fun toy problem for researchers? Or an example of a deeper and more chronic frailty in our models? Motivation
  3. 3. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 3 Outline
  4. 4. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 4 Outline
  5. 5. What exactly is “learnt” in Machine Learning? 5 Introduction
  6. 6. 6 Source: http://www.cleverhans.io/security/privacy/ml/2016/12/16/breaking-things-is-easy.html
  7. 7. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 7 Feed-forward neural network
  8. 8. What exactly is “learnt” in Machine Learning? Discussion Feed-forward neural network 1. Neural Network 2. Weights 8
  9. 9. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 2. Weights 3. Cost Function 9 Feed-forward neural network
  10. 10. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 2. Weights 3. Cost Function 4. Gradient Descent 10 Feed-forward neural network
  11. 11. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 2. Weights 3. Cost Function 4. Gradient Descent 5. Back Propagation 11 Feed-forward neural network
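To make the five pieces above concrete, here is a minimal training sketch in PyTorch (not from the slides; the architecture, data shapes, and learning rate are illustrative placeholders): a feed-forward network whose weights are fitted by gradient descent on a cost function, with back-propagation supplying the gradients.

```python
import torch
import torch.nn as nn

# 1. Feed-forward neural network; 2. the weights live inside the Linear layers.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

# 3. Cost function.
loss_fn = nn.CrossEntropyLoss()

# 4. Gradient descent over the model's weights.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Dummy batch standing in for real data (e.g. flattened MNIST images).
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)   # forward pass through the cost function
    loss.backward()               # 5. back-propagation: d(loss)/d(weights)
    optimizer.step()              # gradient descent update of the weights
```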
  12. 12. 12 Source: http://www.cleverhans.io/security/privacy/ml/2016/12/16/breaking-things-is-easy.html
  13. 13. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 13 Outline
  14. 14. CIA Model of Security 14 Discussion
  15. 15. CIA Model of Security 15 Discussion • Confidentiality • Must not leak the training data used to train it • E.g. sensitive medical data
  16. 16. CIA Model of Security 16 Discussion • Confidentiality • Integrity: • It should not be possible to alter the model’s predictions • during training, by poisoning the training data set • during inference, by showing the system adversarial examples
  17. 17. CIA Model of Security 17 Discussion • Confidentiality • Integrity • Availability • Force a machine learning system into failsafe mode • Example: force an autonomous car to pull over
  18. 18. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 18 Outline
  19. 19. Threat Models 19 Discussion
  20. 20. Threat Models 20 Discussion • White box: • Adversary has knowledge of the machine learning model architecture and its parameters
  21. 21. Threat Models 21 Discussion • White box: • Adversary has knowledge of the machine learning model architecture and its parameters • Black box • Adversary only capable of interacting with the model by observing its predictions on chosen inputs • More realistic
  22. 22. Threat Models 22 Discussion
  23. 23. Threat Models 23 Discussion • Non-targeted attack • Force the model to misclassify the adversarial image
  24. 24. Threat Models 24 Discussion • Non-targeted attack • Force the model to misclassify the adversarial image • Targeted attack • Get the model to classify the input as a specific target class, which is different from the true class
  25. 25. 25 Discussion What are Adversarial Attacks?
  26. 26. 26
  27. 27. 27 Common Attacks
  28. 28. 28 Common Attacks 1. Fast Gradient Sign Method (FGSM)
  29. 29. 29 Common Attacks 1. Fast Gradient Sign Method (FGSM)
  30. 30. 30 Common Attacks 2. Targeted Fast Gradient Sign Method (T-FGSM)
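A minimal PyTorch sketch of both one-shot attacks, assuming image inputs scaled to [0, 1]; `model`, `loss_fn`, and `eps` are placeholders rather than code from the talk. FGSM steps up the loss gradient for the true label; the targeted variant steps down the loss gradient for a chosen target label.

```python
import torch

def fgsm(model, loss_fn, x, y_true, eps):
    """Non-targeted FGSM: x_adv = x + eps * sign(grad_x loss(model(x), y_true))."""
    x = x.clone().detach().requires_grad_(True)
    loss_fn(model(x), y_true).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def targeted_fgsm(model, loss_fn, x, y_target, eps):
    """T-FGSM: step *down* the gradient of the loss taken w.r.t. the target label."""
    x = x.clone().detach().requires_grad_(True)
    loss_fn(model(x), y_target).backward()
    return (x - eps * x.grad.sign()).clamp(0, 1).detach()
```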
  31. 31. 31 Common Attacks 3. Iterative Fast Gradient Sign Method (I-FGSM) Both one-shot methods (FGSM and T-FGSM) have lower success rates than the iterative method (I-FGSM) in white-box attacks; however, in black-box attacks the basic single-shot methods turn out to be more effective. The most likely explanation is that the iterative methods tend to overfit to a particular model.
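A sketch of the iterative variant under the same assumptions (inputs in [0, 1]; `alpha` is the per-step size and `eps` the overall L-infinity budget):

```python
import torch

def i_fgsm(model, loss_fn, x, y_true, eps, alpha, steps):
    """Iterative FGSM: repeat small FGSM steps, projecting back into the
    eps-ball around the original image after every step."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss_fn(model(x_adv), y_true).backward()
        x_adv = x_adv + alpha * x_adv.grad.sign()
        # project back into the eps-ball and the valid pixel range
        x_adv = torch.max(torch.min(x_adv, x_orig + eps), x_orig - eps)
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv
```

The extra steps make the attack stronger against the model it was computed on, which is consistent with the overfitting explanation above.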
  32. 32. 32 Boosting Adversarial attacks with Momentum (MI-FGSM) Winning attack at NIPS 2017
  33. 33. 33 Boosting Adversarial attacks with Momentum (MI-FGSM) Winning attack at NIPS 2017
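A sketch of the momentum update described in the MI-FGSM paper, again with the same placeholder names and written for a single image: the gradient is L1-normalized before being folded into the momentum buffer, and the per-step size is eps / steps.

```python
import torch

def mi_fgsm(model, loss_fn, x, y_true, eps, steps, mu=1.0):
    """MI-FGSM: accumulate L1-normalized gradients in a momentum term g,
    then step in the direction of sign(g)."""
    alpha = eps / steps
    x_adv = x.clone().detach()
    g = torch.zeros_like(x_adv)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss_fn(model(x_adv), y_true).backward()
        grad = x_adv.grad
        g = mu * g + grad / grad.abs().sum()    # decay factor mu, L1 normalization
        x_adv = (x_adv + alpha * g.sign()).clamp(0, 1).detach()
    return x_adv
```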
  34. 34. 34 More attacks
  35. 35. 35 • Deep Fool: • Iteratively “linearizes” the loss function at an input point (taking the tangent to the loss function at that point), and applies the minimal perturbation necessary. More attacks
  36. 36. 36 • Deep Fool: • Iteratively “linearizes” the loss function at an input point (taking the tangent to the loss function at that point), and applies the minimal perturbation necessary. • Carlini’s Attack: • Optimizes for having the minimal distance from the original example, under the constraint of having the example be misclassified by the original model • Costly but very effective More attacks
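For intuition, here is a heavily simplified Deep Fool sketch for a binary classifier `f` that returns a single logit whose sign is the predicted class (the published algorithm handles the multi-class case); `overshoot` and `max_iter` are illustrative defaults, not values from the talk.

```python
import torch

def deepfool_binary(f, x, max_iter=50, overshoot=0.02):
    """At each step, linearize f at the current point and take the minimal
    L2 step onto the linearized decision boundary: r = -f(x) * g / ||g||^2."""
    x_adv = x.clone().detach()
    orig_sign = torch.sign(f(x_adv)).item()
    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        out = f(x_adv).squeeze()
        if torch.sign(out).item() != orig_sign:
            break                               # crossed the boundary: done
        g, = torch.autograd.grad(out, x_adv)
        r = -out.item() * g / (g.norm() ** 2)
        x_adv = (x_adv + (1 + overshoot) * r).detach()
    return x_adv
```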
  37. 37. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 37 Outline
  38. 38. 38 https://arxiv.org/pdf/1312.6199.pdf
  39. 39. 39 Take a correctly classified image (left image in both columns), and add a tiny distortion (middle) to fool the ConvNet with the resulting image (right).
  40. 40. 40 https://arxiv.org/pdf/1412.6572.pdf
  41. 41. 41 https://arxiv.org/pdf/1412.6572.pdf
  42. 42. 42 https://arxiv.org/pdf/1607.02533.pdf
  43. 43. 43 Source: http://www.cleverhans.io/security/privacy/ml/2016/12/16/breaking-things-is-easy.html Adversarial examples can be printed out on normal paper and photographed with a standard-resolution smartphone and still cause a classifier to, in this case, label a “washer” as a “safe”.
  44. 44. 44 https://arxiv.org/pdf/1712.09665.pdf
  45. 45. Demo 45
  46. 46. Demo 46
  47. 47. Download “Demitasse” 47 bit.ly/image-recog Download VGG-CNN-F (Binary Compression) model data (106 MB)
  48. 48. What are the implications of these attacks? 48 Discussion
  49. 49. What are the implications of these attacks? 49 Discussion •Self Driving Cars: A patch may make a car think that a Stop Sign is a Yield Sign
  50. 50. What are the implications of these attacks? 50 Discussion •Self Driving Cars: A patch may make a car think that a Stop Sign is a Yield Sign •Alexa: Voice-based Personal Assistants: Transmit sounds that sound like noise, but give specific commands (video)
  51. 51. What are the implications of these attacks? 51 Discussion •Self Driving Cars: A patch may make a car think that a Stop Sign is a Yield Sign •Alexa: Voice-based Personal Assistants: Transmit sounds that sound like noise, but give specific commands (video) •Ebay: Sell livestock and other banned items.
  52. 52. 52 Three remarkable properties of Adversarial examples
  53. 53. 53 Three remarkable properties of Adversarial examples • Small perturbation • The amount of noise added is imperceptible
  54. 54. 54 Three remarkable properties of Adversarial examples • Small perturbation • The amount of noise added is imperceptible • High Confidence • It was easy to attain high confidence in the incorrect classification
  55. 55. 55 Three remarkable properties of Adversarial examples • Small perturbation • The amount of noise added is imperceptible • High Confidence • It was easy to attain high confidence in the incorrect classification • Transferability • The attack didn’t depend on the specific ConvNet used for the task.
  56. 56. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 56 Outline
  57. 57. How do you defend A.I. systems from these attacks? 57 Discussion
  58. 58. How do you defend A.I. systems from these attacks? 58 Discussion • Adversarial training  • Generate a lot of adversarial examples and explicitly train the model not to be fooled by each of them • Improves the generalization of a model when presented with adversarial examples at test time.
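A minimal sketch of what that looks like inside a training loop, reusing the hypothetical `fgsm` helper, `model`, `loss_fn`, and `optimizer` names from the earlier sketches; `train_loader` and the 50/50 clean/adversarial weighting are assumptions, not the speaker's recipe.

```python
# Adversarial training sketch: at every step, craft adversarial versions of
# the current batch and penalize the model for misclassifying them too.
for x, y in train_loader:
    x_adv = fgsm(model, loss_fn, x, y, eps=0.03)   # on-the-fly adversarial batch
    optimizer.zero_grad()
    loss = 0.5 * loss_fn(model(x), y) + 0.5 * loss_fn(model(x_adv), y)
    loss.backward()
    optimizer.step()
```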
  59. 59. How do you defend A.I. systems from these attacks? 59 Discussion • Defensive distillation smooths the model’s decision surface in adversarial directions exploited by the adversary. • Train the model to output probabilities of different classes, rather than hard decisions about which class to output. • Creates a model whose surface is smoothed in the directions an adversary will typically try to exploit.
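A sketch of the distillation step, assuming a `teacher` network already trained with a softmax at temperature `T` and a `student` of the same architecture; the temperature value and helper names are placeholders.

```python
import torch
import torch.nn.functional as F

T = 20.0  # distillation temperature (a hyperparameter; higher = softer labels)

def soft_targets(teacher, x):
    """Soft labels: the teacher's softmax taken at high temperature T."""
    with torch.no_grad():
        return F.softmax(teacher(x) / T, dim=1)

def distillation_loss(student_logits, soft_labels):
    """Cross-entropy of the student's temperature-softened predictions
    against the teacher's soft labels."""
    log_p = F.log_softmax(student_logits / T, dim=1)
    return -(soft_labels * log_p).sum(dim=1).mean()
```

Training the student on these soft probabilities, rather than hard one-hot labels, is what smooths the decision surface in the directions a gradient-based attacker would otherwise exploit.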
  60. 60. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 60 Outline
  61. 61. 61 Are adversarial examples simply a fun toy problem for researchers? Or an example of a deeper and more chronic frailty in our models? Motivation
  62. 62. 62 Model linearity
  63. 63. 63 Model linearity • Linear models’ behavior outside of the region where training data is concentrated is quite pathological.
  64. 64. 64 Model linearity In the example above, if we move in a direction perpendicular to the decision boundary, we can, with a relatively small-magnitude vector, push the input to a region where the model is very confident in the wrong class.
  65. 65. 65 Model linearity • Linear models’ behavior outside of the region where training data is concentrated is quite pathological.
 • In a high-dimensional space, each individual pixel might change by only a very small amount, but those small differences add up to a dramatic difference in the weights * inputs dot product.
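The accumulation effect is easy to see numerically. A small NumPy sketch (illustrative, not from the talk): a per-pixel change of size eps in the direction sign(w) shifts the dot product w * x by eps times the L1 norm of w, which grows with the input dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.01                           # an imperceptibly small per-pixel change

for n in (10, 1_000, 100_000):       # input dimension (number of "pixels")
    w = rng.normal(size=n)           # weights of a linear model
    perturbation = eps * np.sign(w)  # tiny per pixel, but aligned with w
    shift = w @ perturbation         # equals eps * sum(|w|)
    print(n, round(shift, 2))        # the shift grows roughly linearly with n
```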
  66. 66. 66 Model linearity Within the space of possible nonlinear activation functions, modern deep nets have actually settled on one that is very close to linear: the Rectified Linear Unit (ReLU).
  67. 67. 67 Model linearity
  68. 68. 68 From Ian Goodfellow’s key paper on the topic:
 “Using a network that has been designed to be sufficiently linear–whether it is a ReLU or maxout network, an LSTM, or a sigmoid network that has been carefully configured not to saturate too much– we are able to fit most problems we care about, at least on the training set. The existence of adversarial examples suggests that being able to explain the training data or even being able to correctly label the test data does not imply that our models truly understand the tasks we have asked them to perform. Instead, their linear responses are overly confident at points that do not occur in the data distribution, and these confident predictions are often highly incorrect. …One may also conclude that the model families we use are intrinsically flawed. Ease of optimization has come at the cost of models that are easily misled.”
  69. 69. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 69 Outline
  70. 70. • We would like our models to be able to “fail gracefully” when used in production 70 What’s next?
  71. 71. • We would like our models to be able to “fail gracefully” when used in production • We would want to push our models to exhibit appropriately low confidence when they’re operating out of distribution 71 What’s next?
  72. 72. • We would like our models to be able to “fail gracefully” when used in production • We would want to push our models to exhibit appropriately low confidence when they’re operating out of distribution • Real problem here: models exhibiting unpredictable and overly confident performance outside of the training distribution. Adversarial examples are actually just an imperfect proxy to this problem. 72 What’s next?
  73. 73. Machine Learning is itself just another tool, susceptible to adversarial attacks. These can have huge implications, especially in a world with self-driving cars and other automation. 73 Summary
  74. 74. Thanks for attending the talk! 74 Anant Jain Co-founder, commonlounge.com (Compose Labs) https://commonlounge.com/pathfinder
 https://index.anantja.in Commonlounge.com is an online-learning platform similar to Coursera/Udacity, except our courses are in the form of lists of text-based tutorials, quizzes and step-by-step projects instead of videos.
 
 Check out our Deep Learning Course!
  75. 75. Bonus Privacy issues in ML (and how the two can be unexpected allies) 75
  76. 76. Privacy issues in ML (and how the two can be unexpected allies) 76 Bonus • Lack of fairness and transparency when learning algorithms process the training data.
  77. 77. Privacy issues in ML (and how the two can be unexpected allies) 77 Bonus • Lack of fairness and transparency when learning algorithms process the training data. • Training data leakage: How do you make sure that ML Systems do not memorize sensitive information about the training set, such as the specific medical histories of individual patients? Differential Privacy
  78. 78. PATE (Private Aggregator of Teacher Ensembles) 78
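The core of PATE is a noisy vote among teacher models, each trained on a disjoint partition of the sensitive data. A minimal sketch of that aggregation step (the number of teachers and the noise scale below are placeholders):

```python
import numpy as np

def pate_label(teacher_votes, num_classes, noise_scale):
    """Noisy-max aggregation: count each teacher's vote for one query, add
    Laplace noise to the counts, and return the winning class. The noise is
    what gives the aggregate label its differential-privacy guarantee."""
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += np.random.laplace(loc=0.0, scale=noise_scale, size=num_classes)
    return int(np.argmax(counts))

# e.g. 250 teachers voting among 10 classes on a single unlabeled example
votes = np.random.randint(0, 10, size=250)        # placeholder votes
print(pate_label(votes, num_classes=10, noise_scale=20.0))
```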
  79. 79. Generative Adversarial Networks (GANs) 79 Bonus
  80. 80. Generative Adversarial Networks (GANs) 80
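A minimal GAN training sketch in PyTorch (architectures, sizes, and learning rates are illustrative): a generator G learns to map noise to samples the discriminator D cannot tell apart from real data, while D is trained on the opposite objective.

```python
import torch
import torch.nn as nn

# Generator maps noise z to a fake sample; discriminator scores real vs. fake.
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, 784)          # placeholder for a batch of real data

for step in range(1000):
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    fake = G(torch.randn(32, 64)).detach()
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: make D label freshly generated fakes as real.
    g_loss = bce(D(G(torch.randn(32, 64))), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```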
  81. 81. Applications of GANs 81 Bonus
  82. 82. Applications of GANs 82 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0)
  83. 83. Applications of GANs 83 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) •3D objects: Shape Estimation (from 2D images), Shape Manipulation
  84. 84. Applications of GANs 84 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) •3D objects: Shape Estimation (from 2D images), Shape Manipulation •Medical (Insilico Medicine): Drug discovery, Molecule development
  85. 85. Applications of GANs 85 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) •3D objects: Shape Estimation (from 2D images), Shape Manipulation •Medical (Insilico Medicine): Drug discovery, Molecule development •Games / Simulation: Generating realistic environments (buildings, graphics, etc), includes inferring physical laws, and relation of objects to one another
  86. 86. Applications of GANs 86 Bonus • Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) • 3D objects: Shape Estimation (from 2D images), Shape Manipulation • Medical (Insilico Medicine): Drug discovery, Molecule development • Games / Simulation: Generating realistic environments (buildings, graphics, etc), includes inferring physical laws, and relation of objects to one another • Robotics: Augmenting real-world training with virtual training
