
Securing Neural Networks


We thoroughly enjoyed sharing some early strategies for performing security analysis on neural networks (deep learning/machine learning models) at Shopify.

The field is still young, and much more progress is needed before enterprise-grade scanners can be built.

Our discussion was recorded, and your comments and opinions will help drive the field forward. To the best of our knowledge, this talk is a first of its kind on YouTube.

Published in: Technology

  1. 1. [TensorFuzz] Debugging Neural Networks with Coverage-Guided Fuzzing Authors: Augustus Odena, Ian Goodfellow Presenter: Tahseen Shabab Facilitators: Susan Shu, Serena McDonnell Date: 26th August, 2019 Cybersecurity AI
  2. 2. Speakers ● Tahseen Shabab, Presenter (CEO, Bibu Labs) ● Susan Shu, Facilitator (Data Scientist, Bell) ● Serena McDonnell, Facilitator (Senior Data Scientist, Delphia)
  3. 3. We Are Growing! ● Prof. Hassan Khan, Chief Scientist, Bibu Labs ● Prof. Kate Larson, Advisor - AI, Bibu Labs ● Prof. Larry Smith, Advisor - Strategy, Bibu Labs
  4. 4. Feb, 2019 $1.4 B Acquisition
  5. 5. July, 2019
  6. 6. Cylance Hack: Enable Dynamic Debugging ● Cylance antivirus verbose logging ● Score: -1000 (most malicious) to +1000 (most benign) ● Dynamic debugging enabled
  7. 7. Cylance Hack: Reverse Engineer Model ● Pipeline: 7000-feature vector → neural network → post-processing (“Added Filter”: white/black list)
  8. 8. Cylance Hack: Exploit Model Bias ● Researchers found bias in the model ○ A small set of features has a significant effect on the outcome ● The “Added Filter” uses clusters with specific names to whitelist files, one being a famous game ● Researchers added strings from the game’s executable to a real malicious file ● Game over!
  9. 9. Have We Seen This Before?
  10. 10. Lowd & Meek (2005) and Wittel & Wu (2004) ● Attacks against statistical spam filters ○ Add “good” words ○ Words the filter considers indicative of non-spam, added to spam ● Append words which appear often in ham emails and rarely in spam to a spam email ● Spam filter fooled!
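The good-word attack is easy to demonstrate on a toy linear filter. The word weights below are invented for illustration, not taken from either paper, but the mechanics match the slide: appending ham-indicative words drags the spam score below the threshold.

```python
# Toy linear spam filter (illustrative weights): positive score = spam.
SPAM_WEIGHTS = {"viagra": 3.0, "winner": 2.0, "free": 1.5}
HAM_WEIGHTS = {"meeting": -2.0, "report": -1.5, "thanks": -1.0}

def spam_score(words):
    """Sum of per-word weights; a score above 0 is flagged as spam."""
    table = {**SPAM_WEIGHTS, **HAM_WEIGHTS}
    return sum(table.get(w, 0.0) for w in words)

spam = ["free", "viagra", "winner"]
assert spam_score(spam) > 0                     # flagged as spam

# Good-word attack: append words common in ham, rare in spam.
attacked = spam + ["meeting", "meeting", "report", "thanks", "thanks"]
assert spam_score(attacked) <= 0                # filter fooled
```

Real filters learn thousands of weights, but the same additive structure is what makes the attack work.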
  11. 11. Why Are These Hard To Spot?
  12. 12. Source of Blind Spots ● Traditional software ○ Devs directly specify the logic of the system ● ML system ○ The NN learns rules automatically ○ Developers can indirectly modify decision logic by manipulating ■ Training data ■ Feature selection ■ Model architecture ○ The NN’s underlying rules are mostly unknown to the developers!
  13. 13. Adversarial Attacks
  14. 14. Adaptive Nature of Hackers ● Hackers take the path of least resistance ● If a patch is deployed, hackers simply move on to the next easiest vulnerability [Diagram: Vulnerability 1 → Vulnerability 2 → Vulnerability 3]
  15. 15. Data Distribution Actively Manipulated
  16. 16. Data Poisoning ● Hackers strategically insert attack data ● The model retrains periodically ● The decision boundary is altered
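A minimal sketch of the poisoning loop described above, using a hypothetical 1-D nearest-centroid "model" (not anything from the talk): attacker-inserted training points shift a class centroid, and with it the decision boundary.

```python
# Nearest-centroid toy: classify a point by whichever class centroid is closer.
def centroid(xs):
    return sum(xs) / len(xs)

def classify(x, benign, malicious):
    near_mal = abs(x - centroid(malicious)) < abs(x - centroid(benign))
    return "malicious" if near_mal else "benign"

benign = [0.0, 1.0, 2.0]        # centroid 1.0
malicious = [8.0, 9.0, 10.0]    # centroid 9.0

assert classify(6.0, benign, malicious) == "malicious"   # boundary at 5.0

# Poisoning: attacker sneaks malicious-looking samples into the benign
# training pool; periodic retraining drags the benign centroid toward them.
poisoned_benign = benign + [7.0, 7.0, 7.0]               # centroid now 4.0
assert classify(6.0, poisoned_benign, malicious) == "benign"
```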
  17. 17. Attack: Induce Specific Output ● Add noise ● Classifier misclassifies the object ● The model learns differently than humans (“Explaining and Harnessing Adversarial Examples”, Ian Goodfellow)
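The fast gradient sign method from the cited paper can be sketched on a linear model, where the gradient of the score with respect to the input is simply the weight vector (the paper applies the same idea to deep nets via backprop). The weights and epsilon below are made up for illustration.

```python
# FGSM sketch on a linear classifier: logit = w·x, predict positive if > 0.
# The gradient of the logit w.r.t. the input x is just w, so perturbing by
# eps * sign(w) moves the logit as fast as possible per unit of max-norm.
def sign(v):
    return [1.0 if vi > 0 else -1.0 if vi < 0 else 0.0 for vi in v]

def logit(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

w = [0.5, -0.25, 0.1]
x = [1.0, 1.0, 1.0]              # logit = 0.35 -> classified positive
assert logit(w, x) > 0

eps = 0.5
# Step *against* the current class: x_adv = x - eps * sign(w)
x_adv = [xi - eps * si for xi, si in zip(x, sign(w))]
assert logit(w, x_adv) < 0       # tiny per-feature noise flips the prediction
assert max(abs(a - b) for a, b in zip(x, x_adv)) <= eps
```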
  18. 18. Attack: Expose Model Attributes ● Submit queries, observe responses ● Training data ● Architecture ● Optimization procedures (“Towards Reverse-Engineering Black-Box Neural Networks”, Seong Joon Oh)
  19. 19. Taxonomy of Attacks Against ML Systems ● Influence: Causative (influences training and test data) vs. Exploratory (influences test data only) ● Security Violation: Confidentiality (goal is to uncover training data), Integrity (goal is false negatives), Availability (goal is false positives) ● Specificity: Targeted (influence predictions of particular test instances) vs. Indiscriminate (influence predictions of all test instances) (Adversarial Machine Learning, Joseph, Nelson, Rubinstein and Tygar, 2019)
  20. 20. Exploratory Attacks Against a Trained Classifier ● The attacker doesn’t have access to the training data ● Most known detection techniques are susceptible to blind spots ● How difficult is it for an adversary to discover the blind spots that are most advantageous to them?
  21. 21. How Can We Find These Blind Spots?
  22. 22. DeepXplore: White-Box Testing ● Check erroneous corner cases ● Input: unlabeled test inputs ● Objective: generate test data to ○ Activate a large number of neurons ○ Force DNNs to behave differently ● Joint optimization problem: maximize ○ Differential behaviour ○ Neuron coverage
  23. 23. DeepXplore: Example ● Perform gradient-guided local search ○ Starting point: a seed input ○ Find new inputs that maximize the desired goal ● Similar to backpropagation, but: ○ Inputs: variable ○ Weights: constant
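A numerical toy of "backprop to the input": the fixed 1-D function below stands in for DeepXplore's joint objective, and gradient ascent updates the input rather than the weights. Everything here (the function, step size, seed input) is illustrative.

```python
# Gradient-guided input search: weights stay fixed, the input is optimized.
def f(x):
    # Fixed "network" output we want to maximize (peak at x = 3).
    return -(x - 3.0) ** 2

def grad(f, x, h=1e-5):
    """Central-difference gradient of f at x (stand-in for backprop)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.0                            # seed input
for _ in range(200):
    x += 0.1 * grad(f, x)          # move the input, not the weights

assert abs(x - 3.0) < 1e-3         # search converges to the maximizer
```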
  24. 24. Bayesian NN: Modelling Uncertainty ● Bayesian neural network ● Adding dropout before every weight layer is an approximation of a Gaussian process ○ In both training and test ● Dropout during test ○ Different outputs for the same input ■ [4,5,1,2,3,6] ○ Equivalent to MC sampling ○ High variance = high uncertainty
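The MC-sampling idea can be sketched with a toy layer whose dropout stays on at test time; the weights and dropout rate are invented. Repeated forward passes on the same input give different outputs, and their variance serves as an uncertainty estimate.

```python
import random

def noisy_forward(x, drop_p=0.5, rng=None):
    """One toy 'layer' of fixed weights with dropout left ON at test time."""
    rng = rng or random
    weights = [0.2, 0.4, 0.6, 0.8]
    kept = [w for w in weights if rng.random() > drop_p]  # random subnetwork
    return sum(w * x for w in kept)

rng = random.Random(0)
samples = [noisy_forward(1.0, rng=rng) for _ in range(200)]  # MC sampling
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)

# Same input, different outputs; high variance would signal high uncertainty.
assert var > 0
```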
  25. 25. TensorFuzz
  26. 26. TensorFuzz ● Open Source Tool ● Discovers errors which occur only for rare inputs (Blind Spots) ● Key Techniques: ○ Coverage Guided Fuzzing ○ Property Based Testing ○ Approximate Nearest Neighbor
  27. 27. TensorFuzz ● Open Source Tool ● Discovers errors which occur only for rare inputs (Blind Spots) ● Key Techniques: ○ Coverage Guided Fuzzing ○ Property Based Testing ○ Approximate Nearest Neighbor
  28. 28. Coverage-Guided Fuzzing (AFL) ● Instrument the program for coverage ○ Add instructions to the code allowing the fuzzer to detect code paths ● Feed random inputs into the program ● Continue to mutate inputs that exercised new parts of the program ○ Genetic algorithm ● Identify bugs
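The AFL-style loop described above fits in a few lines. The "instrumented program" here is a toy function that reports which branches it executed (real fuzzers get this from compile-time instrumentation), and the mutators are crude stand-ins for AFL's bit-level strategies.

```python
import random

def target(data):
    """Instrumented toy program: returns the set of branch IDs it executed."""
    branches = {"start"}
    if len(data) > 3:
        branches.add("long")
        if data.startswith("ab"):
            branches.add("ab-prefix")
            if data.endswith("z"):
                branches.add("bug")        # the rare failure we want to reach
    return branches

def mutate(s, rng):
    """Crude mutations in the AFL spirit: append, truncate, or rewrite."""
    op = rng.randrange(3)
    if op == 0:   # append 1-3 random characters
        return s + "".join(rng.choice("abz") for _ in range(rng.randint(1, 3)))
    if op == 1:   # drop the last character
        return s[:-1]
    return "".join(rng.choice("abz") for _ in s) or "a"   # rewrite in place

rng = random.Random(1)
corpus, seen = ["a"], set()
for _ in range(20000):
    candidate = mutate(rng.choice(corpus), rng)
    cov = target(candidate)
    if cov - seen:               # new coverage: keep input for more mutation
        seen |= cov
        corpus.append(candidate)

assert "bug" in seen             # the guided search reached the rare branch
```

Pure random strings rarely satisfy all three nested conditions at once; retaining inputs that unlocked intermediate branches is what makes the deep branch reachable.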
  29. 29. AFL: Branch (Edge) Coverage ● Aids the discovery of subtle fault conditions in the underlying code ● Security vulnerabilities are often associated with unexpected or incorrect state transitions (AFL documentation)
  30. 30. AFL: Hit Count ● Identifies potentially interesting control-flow changes ○ Ex. a block of code being executed twice when it was normally hit only once (AFL documentation)
  31. 31. AFL: Mutation Strategy ● Sequential bit flips with varying lengths and stepovers ● Sequential addition and subtraction of small integers ● Sequential insertion of known interesting integers (0, 1, INT_MAX, etc.)
  32. 32. TensorFuzz ● Open Source Tool ● Discovers errors which occur only for rare inputs (Blind Spots) ● Key Techniques: ○ Coverage Guided Fuzzing ○ Property Based Testing ○ Approximate Nearest Neighbor
  33. 33. Property-Based Testing ● Verifies that a function or program abides by a property ● Properties check for useful characteristics that must be seen in the output
  34. 34. Advantages ● Covers the scope of all possible inputs ○ Does not restrict the generated inputs ● Shrinks the input in case of failure ○ On failure, the framework tries to reduce the input to a smaller one ● Reproducible and replayable ○ Each run of a property test produces a seed so the test can be re-run on the same data
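A hand-rolled miniature of the three advantages above (real frameworks like Hypothesis or QuickCheck are far more sophisticated): unrestricted generation, shrinking on failure, and a fixed seed for reproducibility. The buggy_abs function is a planted bug for the property to find.

```python
import random

def buggy_abs(x):
    """Absolute value with a planted bug for -10 <= x <= -1 (returns 0)."""
    return x if x > 0 else -x if x < -10 else 0

def prop(x):
    """Property: buggy_abs must agree with the built-in abs."""
    return buggy_abs(x) == abs(x)

def shrink(prop, x):
    """Walk a failing integer toward 0 while it keeps failing the property."""
    step = 1 if x < 0 else -1
    while x != 0 and not prop(x + step):
        x += step
    return x

rng = random.Random(0)                     # fixed seed: run is reproducible
counterexample = None
for _ in range(5000):
    x = rng.randint(-1000, 1000)           # unrestricted input generation
    if not prop(x):
        counterexample = shrink(prop, x)   # reduce to a minimal failure
        break

assert counterexample == -1                # smallest input exposing the bug
```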
  35. 35. TensorFuzz ● Open Source Tool ● Discovers errors which occur only for rare inputs (Blind Spots) ● Key Techniques: ○ Coverage Guided Fuzzing ○ Property Based Testing ○ Approximate Nearest Neighbor
  36. 36. Approximate Nearest Neighbour ● Nearest Neighbour ○ Given points p1, p2, ..., pn and a query point q, find the closest point to q among p1, ..., pn ● Approximate Nearest Neighbour ○ The condition is relaxed ○ Find pi so that ■ d(q, pi) <= c · min_j d(q, pj)
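The relaxed condition can be checked directly against a brute-force search (the points, query, and factor c below are arbitrary): any point within a factor c of the true nearest distance is an acceptable answer.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
q = (1.0, 0.0)

exact = min(points, key=lambda p: dist(q, p))   # true nearest neighbour
best = dist(q, exact)

def is_c_approx(p, c):
    """p is a valid c-approximate answer if d(q, p) <= c * min_j d(q, pj)."""
    return dist(q, p) <= c * best

assert exact == (0.0, 0.0) and best == 1.0
assert is_c_approx((0.0, 0.0), 1.0)      # the exact answer always qualifies
assert is_c_approx((3.0, 4.0), 5.0)      # close enough for c = 5
assert not is_c_approx((6.0, 8.0), 5.0)  # too far even for c = 5
```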
  37. 37. TensorFuzz ● Open Source Tool ● Discovers errors which occur only for rare inputs (Blind Spots) ● Key Techniques: ○ Coverage Guided Fuzzing ○ Property Based Testing ○ Approximate Nearest Neighbor
  38. 38. Sadly, CGF Tools Don’t Work For Neural Networks
  39. 39. Traditional Software Workflow ● Coverage metrics ○ Lines of code executed ○ Which branches have been taken
  40. 40. Neural Network Workflow ● The software implementation may contain many branching statements ○ Based on the architecture ○ Mostly independent of the input ● Different inputs will often execute ○ the same lines of code ○ the same branches ● But will produce interesting variations in behaviour
  41. 41. How Does TensorFuzz Work?
  42. 42. Let's Dive In! Dio, Holy Diver
  43. 43. TensorFuzz 1. We interact with a TensorFlow graph instead of an instrumented computer program
  44. 44. TensorFuzz 2. Valid neural network inputs are fed instead of a big array of bytes. Ex. if inputs are sequences of characters, only allow characters from the vocabulary extracted from the training set
  45. 45. TensorFuzz 3. The Input Chooser intelligently chooses elements from the input corpus. The following heuristic is used: the probability of choosing corpus element ck at time t decays with its age, where tk is the time when ck was added to the corpus. Intuition: recently sampled inputs are more likely to yield useful new coverage when mutated, but the advantage decays over time.
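The exact formula did not survive the slide export, so the sketch below uses one plausible exponential-decay heuristic with the stated shape (recently added elements are sampled more often, with the advantage fading over time); the decay rate is an assumption, not the paper's.

```python
import math, random

def choose(corpus, add_times, t, rng):
    """Sample corpus[i] with weight exp(-(t - t_k)): newer = more likely."""
    weights = [math.exp(-(t - tk)) for tk in add_times]
    r = rng.random() * sum(weights)
    for item, w in zip(corpus, weights):
        r -= w
        if r <= 0:
            return item
    return corpus[-1]          # guard against floating-point leftovers

rng = random.Random(0)
corpus, add_times = ["old", "new"], [0.0, 5.0]
picks = [choose(corpus, add_times, t=5.0, rng=rng) for _ in range(1000)]

# The freshly added element dominates sampling while it is recent.
assert picks.count("new") > picks.count("old")
```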
  46. 46. TensorFuzz 4. The Mutator modifies inputs in a controlled manner. For text input, mutation follows this policy: uniformly at random, perform one of the following operations: delete, add, or substitute a random character at a random location
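The stated text policy, sketched directly (the vocabulary and RNG handling are implementation choices of this sketch, not the paper's code): pick one of three edits uniformly at random and apply it at a random position.

```python
import random, string

def mutate_text(s, vocab=string.ascii_lowercase, rng=None):
    """Apply one of delete/add/substitute at a random location in s."""
    rng = rng or random
    op = rng.choice(["delete", "add", "substitute"])
    # "add" may insert at the end, so it has one extra valid position.
    i = rng.randrange(len(s) + 1) if op == "add" else rng.randrange(max(len(s), 1))
    if op == "delete" and s:
        return s[:i] + s[i + 1:]
    if op == "add":
        return s[:i] + rng.choice(vocab) + s[i:]
    if op == "substitute" and s:
        return s[:i] + rng.choice(vocab) + s[i + 1:]
    return s   # delete/substitute on an empty string is a no-op

rng = random.Random(0)
out = mutate_text("hello", rng=rng)
assert abs(len(out) - 5) <= 1                       # exactly one edit applied
assert all(c in string.ascii_lowercase for c in out)  # stays in vocabulary
```

Restricting mutations to the vocabulary mirrors point 2 above: only valid neural network inputs are generated.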
  47. 47. Diving Deeper 5. Mutated inputs are fed to the neural network. The following are extracted from the NN: - a set of coverage arrays - enables computation of coverage - a set of metadata arrays - fed as input to the objective function
  48. 48. TensorFuzz 5.a Objective Function - the desired outcome, e.g. an error or a crash. The output metadata arrays are fed into the objective function, and inputs that cause the system to reach the objective function's goal are flagged
  49. 49. TensorFuzz 5.b Coverage Analyzer - the core part of the tool: reads arrays from the TensorFlow runtime, turns them into Python objects representing coverage, and checks whether that coverage is new
  50. 50. Desired Properties of Coverage Analyzer ● Check if the neural network is in a new state ○ Enables detection of misbehaviour ● The check has to be fast ● Should work with many different computation graphs ○ Remove manual intervention as much as possible ● Exercising all of the coverage should be hard ○ Or else we won’t cover many of the possible behaviours
  51. 51. Use Fast Approximate Nearest Neighbour ● Determines if two sets of NN activations are meaningfully different from each other ● Provides a coverage metric that produces useful results for neural networks ○ Even if the underlying software implementation of the neural network does not make use of many data-dependent branches
  52. 52. Intuition: Coverage Analyzer [Diagram: activations of the current input are compared against activations of old inputs; new coverage is reached if the delta (distance) is sufficiently large]
  53. 53. Coverage Analyzer: Details ● On a new activation vector: a. Use the approximate nearest neighbours algorithm b. Look up the nearest neighbour c. Compute the Euclidean distance between the current vector and its nearest neighbour d. Add the input to the corpus if the distance is greater than L
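Steps a-d above, sketched with a brute-force nearest-neighbour lookup standing in for the approximate one (the threshold L is a tunable knob; the value here is arbitrary):

```python
import math

L = 1.0   # distance threshold (illustrative value, not from the talk)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_new_coverage(activations, corpus, threshold=L):
    """New coverage = activations farther than L from everything seen so far."""
    if not corpus:
        return True
    nearest = min(euclidean(activations, c) for c in corpus)
    return nearest > threshold

corpus = []
for act in [(0.0, 0.0), (0.1, 0.1), (3.0, 3.0)]:
    if is_new_coverage(act, corpus):
        corpus.append(act)       # new behaviour: keep the input for mutation

assert corpus == [(0.0, 0.0), (3.0, 3.0)]   # (0.1, 0.1) was too close
```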
  54. 54. Coverage Analyzer: Details ● Note: often, good results are achieved by looking only at the logits, or at the layer before the logits
  55. 55. TensorFuzz 6. The mutated input is: - added to the corpus if new coverage is achieved - added to the list of test cases if the objective function is satisfied
  56. 56. Break
  57. 57. Experiments
  58. 58. Experiment: Finding NaNs ● NaNs consistently cause trouble for researchers and practitioners, but they are hard to track down ● A bad loss function is “fault injected” into a neural network ● TensorFuzz could find NaNs substantially faster than a baseline random search
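The baseline the paper compares against, plain random search for NaN-producing inputs, looks roughly like this on a toy "loss" with an injected fault (math.log stands in for a numerically unstable loss; Python raises an error where TensorFlow would silently emit NaN):

```python
import math, random

def bad_loss(x):
    """Fault-injected toy loss: undefined (NaN territory) for x <= 0."""
    return math.log(x)

def find_nan(rng, trials=1000):
    """Baseline random search: sample inputs until the loss blows up."""
    for _ in range(trials):
        x = rng.uniform(-1.0, 1.0)
        try:
            y = bad_loss(x)
        except ValueError:         # math.log raises where TF would emit NaN
            return x
        if math.isnan(y):
            return x
    return None

rng = random.Random(0)
x = find_nan(rng)
assert x is not None and x <= 0    # a NaN-triggering input was found
```

TensorFuzz's contribution is that coverage guidance finds such inputs far faster than this blind sampling.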
  59. 59. Experiment: Finding NaNs ● Left: coverage over time for 10 different random restarts ● Right: an example of a random image that causes the neural network to produce a NaN
  60. 60. Experiment: Quantization Errors ● We often want to quantize neural networks ● How do we test for accuracy? ● We can look for differences on test sets, but often few show up ● Instead, we can fuzz for inputs that surface differences
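A toy version of the quantization check: a one-weight "model" rounded to coarse precision disagrees with the full-precision one only in a narrow band near the decision boundary, which is why few disagreements show up on an ordinary test set. A dense sweep stands in for fuzzing here; all numbers are invented.

```python
# Full-precision model: weight 0.30, threshold 1.0.
def full_precision(x):
    return 1 if 0.30 * x > 1.0 else 0

# Quantized model: the weight is rounded to steps of 0.25 (0.30 -> 0.25).
def quantized(x):
    w = round(0.30 * 4) / 4
    return 1 if w * x > 1.0 else 0

# Sweep inputs densely and collect every disagreement between the two models.
# They differ only where 1/0.30 < x <= 1/0.25, i.e. roughly (3.33, 4.0].
disagreements = [x / 100 for x in range(0, 1000)
                 if full_precision(x / 100) != quantized(x / 100)]

assert disagreements                               # such inputs do exist
assert all(3.33 < x <= 4.0 for x in disagreements)  # but only near the boundary
```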
  61. 61. Experiment: Quantization Errors ● Left: coverage over time for 10 different random restarts. Note that 3 runs fail ● Right: an example of an image correctly classified by the original neural network but incorrectly classified by the quantized network
  62. 62. Discussion
  63. 63. Discussion Points ● How do we embed security testing into the ML Solution development lifecycle? ● Can explainable inference help to detect blind spots? ● Can we use multiple classifiers in parallel to reduce the implications of an attack on a specific model?