Machine Duping
Pwning Deep Learning Systems
CLARENCE CHIO
MLHACKER
@cchio
adapted from Drew Conway
[Venn diagram with labels: HACKING SKILLS · MATH & STATS · SECURITY · RESEARCHER · HAX0R · DATA SCIENTIST · DOING IT]
[Image montage: ImageNet · Google Inc. · Google DeepMind · UMass facial recognition · Nvidia DRIVE · Invincea Malware Detection System]
Deep Learning
why would someone choose to use it?
• (Semi-)automated feature/representation learning
• One infrastructure for multiple problems (sort of)
• Hierarchical learning: divide the task across multiple layers
• Efficient, easily distributed & parallelized
• Definitely not one-size-fits-all
[Figure: 4-5-3 Neural Net Architecture. Input layer → hidden layer → output layer, with weights 𝑊, activation functions 𝒇, and bias units. The output-layer logits [17.0, 28.5, 4.50] pass through the softmax function to give the prediction [0.34, 0.57, 0.09]; np.argmax of the prediction gives predicted class 1, while the correct class is 0.]
Training Deep Neural Networks
Step 1 of 2: Feed Forward
1. Each unit receives the outputs of the neurons in the previous layer (+ a bias signal)
2. Compute the weighted sum of those inputs
3. Pass the weighted sum through a nonlinear activation function
4. For a DNN classifier, send the final layer's output through the softmax function (see the NumPy sketch below)
[Figure: feed-forward pass. input → 𝑊1, bias unit b1 → activation function 𝒇 → 𝑊2, bias unit b2 → activation function → logits [17.0, 28.5, 4.50] → softmax function → prediction [0.34, 0.57, 0.09]]
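To make the feed-forward arithmetic concrete, here is a minimal NumPy sketch of the 4-5-3 network above. Only the layer sizes and the softmax / np.argmax step come from the slides; the random weights, zero biases, and tanh activation are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    # subtract the max for numerical stability
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def feed_forward(x, W1, b1, W2, b2, activation=np.tanh):
    h = activation(x @ W1 + b1)   # hidden layer: weighted sum + bias, then nonlinearity
    logits = h @ W2 + b2          # output layer logits
    return softmax(logits)        # class probabilities

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)   # input (4) -> hidden (5)
W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)   # hidden (5) -> output (3)
x = rng.normal(size=4)                          # one 4-dimensional input

probs = feed_forward(x, W1, b1, W2, b2)
print(probs, "predicted class:", np.argmax(probs))
```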
Training Deep Neural Networks
Step 2 of 2: Backpropagation
1. If the model made a wrong prediction, calculate the error: here the correct class is 0, but the model predicted class 1 with 57% confidence, so the error is 0.57
2. Assign blame: trace backwards to find the units that contributed to the wrong prediction (and how much they contributed to the total error), via partial differentiation of the error w.r.t. each unit’s activation value
3. Penalize those units by decreasing their weights and biases by an amount proportional to their error contribution
4. Do the above efficiently with an optimization algorithm, e.g. Stochastic Gradient Descent (see the sketch below)
[Figure: backpropagation. The total error 0.57 is propagated backwards from the logit 28.5 through 𝑊2, b2 and 𝑊1, b1 and their activation functions 𝒇.]
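A hedged sketch of one backpropagation + SGD update for the same toy network, assuming a cross-entropy loss and tanh hidden units (neither is fixed by the slides). The gradient of the loss w.r.t. the logits is probs minus the one-hot correct class, which is the "blame" that gets traced backwards. Parameter shapes match the feed-forward sketch above.

```python
import numpy as np

def one_hot(k, n):
    v = np.zeros(n); v[k] = 1.0
    return v

def sgd_step(x, correct_class, W1, b1, W2, b2, lr=0.1):
    # feed forward (same as the previous sketch)
    h = np.tanh(x @ W1 + b1)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max())
    probs = e / e.sum()

    # error at the output: gradient of cross-entropy loss w.r.t. the logits
    d_logits = probs - one_hot(correct_class, probs.size)

    # assign blame: partial derivatives w.r.t. each weight and bias
    dW2, db2 = np.outer(h, d_logits), d_logits
    d_h = (W2 @ d_logits) * (1.0 - h ** 2)        # tanh'(z) = 1 - tanh(z)^2
    dW1, db1 = np.outer(x, d_h), d_h

    # penalize contributing units proportionally to their error contribution
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return probs
```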
DEMO
Beyond Multi Layer Perceptrons
Source: iOS Developer Library
vImage Programming Guide
Convolutional Neural Network
Layered Learning
Lee, 2009, “Convolutional deep belief networks for scalable
unsupervised learning of hierarchical representations."
Beyond Multi Layer Perceptrons
Source: LeNet 5, LeCun et al.
Convolutional Neural Network
Beyond Multi Layer Perceptrons
Recurrent Neural Network
[Figure: a neural network with input, output, and feedback loop(s)]
• Just a DNN with a feedback loop
• The previous time step feeds all intermediate and final values into the next time step
• Introduces the concept of “memory” to neural networks (see the sketch below)
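A minimal vanilla-RNN state update in NumPy: the hidden state from the previous time step is fed back in at the next step, which is the "memory" the slide refers to. The dimensions and the tanh nonlinearity are my assumptions, not something the deck specifies.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # the hidden state from the previous time step feeds back into the current step
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(1)
W_xh, W_hh, b_h = rng.normal(size=(8, 16)), rng.normal(size=(16, 16)), np.zeros(16)
h = np.zeros(16)                      # the network's "memory"
for x_t in rng.normal(size=(5, 8)):   # 5 time steps of 8-dimensional input
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```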
Beyond Multi Layer Perceptrons
Recurrent Neural Network
[Figure: the same network unrolled, with time steps on one axis and network depth on the other, processing the character sequence “Y O L O !”]
Beyond Multi Layer Perceptrons
Disney, Finding Dory
Long Short-Term Memory (LSTM) RNN
• To make good predictions, we sometimes need more context
• We need long-term memory capabilities without extending the network’s recursion indefinitely, which does not scale (see the LSTM cell sketch below)
Colah’s Blog, “Understanding LSTM Networks”
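For reference, a compact sketch of the standard LSTM cell update (the textbook formulation, not code from the talk); the forget gate is what lets the cell keep long-term context without unbounded recursion. Shapes are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # standard LSTM cell: W has shape (input_dim + hidden_dim, 4 * hidden_dim)
    z = np.concatenate([x_t, h_prev]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input, forget, output gates
    c = f * c_prev + i * np.tanh(g)                # long-term cell state ("memory")
    h = o * np.tanh(c)                             # short-term hidden state
    return h, c
```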
Deng et al. “Deep Learning: Methods and Applications”
HOW TO PWN?
Attack Taxonomy
Causative (manipulative training samples) vs. Exploratory (manipulative test samples; also applicable to online-learning models that continuously learn from real-time test data)
• Causative + Targeted: training samples that move the classifier decision boundary in an intentional direction
• Causative + Indiscriminate: training samples that increase FP/FN → renders the classifier unusable
• Exploratory + Targeted: adversarial input crafted to cause an intentional misclassification
• Exploratory + Indiscriminate: n/a
Why can we do this?
Statistical learning models don’t learn concepts the same way that we do.
BLIND SPOTS:
Adversarial Deep Learning
Intuitions
1. Run input x through the classifier model (or a substitute/approximate model)
2. Based on the model’s prediction, derive a perturbation tensor that maximizes the chance of misclassification, by one of:
   1. Traversing the manifold to find blind spots in input space; or
   2. Linear perturbation in the direction of the neural network’s cost function gradient; or
   3. Selecting only input dimensions with high saliency* to perturb, via the model’s Jacobian matrix
3. Scale the perturbation tensor by some magnitude, resulting in the effective perturbation (δx) to x (see the sketch below)
   1. Larger perturbation == higher probability of misclassification
   2. Smaller perturbation == less likely to be detected by a human
* saliency: amount of influence a selected dimension has on the entire model’s output
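Step 3 of the recipe above in code: scale a perturbation direction by some magnitude and check whether it flips the model's prediction. `predict` is a stand-in for the classifier (or a substitute model), and the L2 normalization is my choice, not something the slides prescribe.

```python
import numpy as np

def apply_perturbation(x, direction, magnitude, predict):
    # effective perturbation delta_x: the chosen direction scaled to a given magnitude
    delta_x = magnitude * direction / (np.linalg.norm(direction) + 1e-12)
    x_adv = x + delta_x
    # larger magnitude -> more likely to flip the label, but easier for a human to spot
    return x_adv, predict(x_adv) != predict(x)
```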
Adversarial Deep Learning
Intuitions
Szegedy, 2013: Traverse the manifold to find blind spots in the input space
• Adversarial samples == pockets in the manifold
• Difficult to find efficiently by brute force (high-dimensional input)
• To optimize this search, take the gradient of the target output class w.r.t. the input (see the sketch below)
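A crude stand-in for the idea on this slide: iteratively follow the gradient of the target class's score with respect to the input to walk toward a blind spot. Szegedy et al. actually solve a box-constrained L-BFGS problem; the callable `grad_target_wrt_x` is assumed to be supplied by your framework.

```python
import numpy as np

def targeted_search(x, grad_target_wrt_x, steps=100, step_size=0.01):
    # gradient ascent on the target class's score, taken w.r.t. the input (not the weights)
    x_adv = np.array(x, dtype=float, copy=True)
    for _ in range(steps):
        x_adv += step_size * grad_target_wrt_x(x_adv)
    return x_adv
```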
Adversarial Deep Learning
Intuitions
Goodfellow, 2015: Linear adversarial perturbation
• Developed a linear view of adversarial examples
• Just take the gradient of the cost function w.r.t. the sample (x) and its original predicted class (y)
• Easily obtained by backpropagation (see the FGSM sketch below)
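Goodfellow et al.'s fast gradient sign method is essentially a one-liner once the framework hands you the gradient of the cost w.r.t. the input for the original label (obtainable by backpropagation); `grad_loss_wrt_x` here is that assumed callable.

```python
import numpy as np

def fgsm(x, grad_loss_wrt_x, eps=0.07):
    # perturb every input dimension by +/- eps in the direction that increases the loss
    return x + eps * np.sign(grad_loss_wrt_x(x))
```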
Adversarial Deep Learning
Intuitions
Papernot, 2015: Saliency map + Jacobian matrix perturbation
• More involved derivation of why the Jacobian of the learned neural network function is used
• The Jacobian is taken with respect to the input features rather than the network parameters
• Forward propagation is used instead of backpropagation
• To reduce the probability of human detection, perturb only the dimensions that have the greatest impact on the output (the salient dimensions); see the sketch below
Papernot et al., 2015
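A heavily simplified sketch of the saliency idea (not Papernot et al.'s full JSMA algorithm): estimate how much each input dimension influences the target-class score, then perturb only the most salient dimensions. The finite-difference Jacobian here stands in for the forward-propagated analytic one, and `predict_proba` is a stand-in for the model.

```python
import numpy as np

def saliency_perturb(x, predict_proba, target_class, eps=0.1, n_dims=2, h=1e-4):
    # finite-difference estimate of d p(target_class) / d x_i for every input feature
    grads = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        d = np.zeros_like(x, dtype=float); d[i] = h
        grads[i] = (predict_proba(x + d)[target_class]
                    - predict_proba(x - d)[target_class]) / (2 * h)
    # only perturb the dimensions with the greatest positive influence on the target class
    salient = np.argsort(-grads)[:n_dims]
    x_adv = np.array(x, dtype=float, copy=True)
    x_adv[salient] += eps
    return x_adv
```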
Threat Model: Adversarial Knowledge
[Scale of increasing attacker knowledge: black box → training data → architecture → model hyperparameters, variables, training tools]
Deep Neural Network Attacks
[Chart: attack complexity (DIFFICULT → EASY) plotted against adversary knowledge (architecture + training tools + hyperparameters; architecture; training data; oracle; labeled test samples), for the attack goals: confidence reduction, untargeted misclassification, targeted misclassification, source/target misclassification. References: Murphy, 2012; Szegedy, 2014; Nguyen, 2014; Goodfellow, 2016; Papernot, 2016; Xu, 2016]
What can you do with limited knowledge?
• Quite a lot.
• Make good guesses: Infer the methodology from the task
• Image classification: ConvNet
• Speech recognition: LSTM-RNN
• Amazon ML, ML-as-a-service etc.: Shallow feed-forward network
• What if you can’t guess?
STILL CAN PWN?
Black box attack methodology
1. Transferability
Adversarial samples that fool model A have a
good chance of fooling a previously unseen
model B
[Figure: decision boundaries learned by different classifiers: SVM (scikit-learn), Decision Tree, Linear Classifier (Logistic Regression), Spectral Clustering (scikit-learn), Feed Forward Neural Network. Image credit: Matt's Webcorner, Stanford]
Black box attack methodology
1. Transferability
Papernot et al. “Transferability in Machine Learning:
from Phenomena to Black-Box Attacks using Adversarial Samples”
Black box attack methodology
2. Substitute model
Train a new model by treating the target model’s outputs as training labels; then generate adversarial samples with the substitute model (see the sketch below).
[Figure: input data is sent to the target black box model; the black box predictions become training labels used to backpropagate through and train the substitute model]
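A sketch of that loop under stated assumptions: `query_black_box` is a hypothetical wrapper around the target's prediction API, `seed_inputs` is a 2-D array of samples you control, and the scikit-learn MLP is an arbitrary choice of local architecture.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_substitute(query_black_box, seed_inputs):
    # label our own inputs with the target model's predictions...
    labels = np.array([query_black_box(x) for x in seed_inputs])
    # ...and fit a local model we can inspect and attack directly
    substitute = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    substitute.fit(np.asarray(seed_inputs), labels)
    return substitute  # craft adversarial samples against this copy, then transfer them
```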
Why is this possible?
• Transferability?
• Still an open research problem
• Manifold learning problem
• Blind spots
• Model vs. Reality dimensionality mismatch
• IN GENERAL:
• Is the model not learning anything
at all?
What this means for us
• Deep learning algorithms (Machine Learning in general) are susceptible to
manipulative attacks
• Use with caution in critical deployments
• Don’t make false assumptions about what/how the model learns
• Evaluate a model’s adversarial resilience - not just accuracy/precision/recall
• Spend effort to make models more robust to tampering
Defending the machines
• Distillation
   • Train the model twice: the first DNN’s softened output probabilities become the training labels for the second DNN
• Train the model with adversarial samples (see the sketches below)
   • i.e. ironing out imperfect knowledge learnt in the model
• Other miscellaneous tweaks
   • Special regularization/loss-function methods (simulating adversarial content during training)
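Two of these defenses in sketch form: temperature-softened labels for distillation, and augmenting each training batch with FGSM-perturbed copies for adversarial training. `grad_loss_wrt_x` is again an assumed framework-supplied gradient function, and the temperature value is illustrative.

```python
import numpy as np

def soften(logits, T=20.0):
    # distillation: the first network's logits, softened at temperature T,
    # become the training targets for the second network
    z = logits / T
    e = np.exp(z - np.max(z))
    return e / e.sum()

def adversarial_augment(x_batch, y_batch, grad_loss_wrt_x, eps=0.1):
    # adversarial training: pair every clean sample with an FGSM-perturbed copy
    x_adv = x_batch + eps * np.sign(grad_loss_wrt_x(x_batch, y_batch))
    return np.concatenate([x_batch, x_adv]), np.concatenate([y_batch, y_batch])
```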
DEEP-PWNING: “metasploit for machine learning”
WHY DEEP-PWNING?
• lol why not
• “Penetration testing” of statistical/machine learning systems
• Train models with adversarial samples for increased robustness
PLEASE PLAY WITH IT &
CONTRIBUTE!
Deep Learning and Privacy
• Deep learning also sees challenges in other areas relating to security &
privacy
• Adversary can reconstruct training samples from a trained black box
DNN model (Fredrikson, 2015)
• Can we precisely control the learning objective of a DNN model?
• Can we train a DNN model without the training agent having complete
access to all training data? (Shokri, 2015)
WHY IS THIS IMPORTANT?
WHY DEEP-PWNING?
• MORE CRITICAL SYSTEMS RELY ON MACHINE LEARNING → MORE IMPORTANCE
ON ENSURING THEIR ROBUSTNESS
• WE NEED PEOPLE WITH BOTH SECURITY AND STATISTICAL SKILL SETS TO
DEVELOP ROBUST SYSTEMS AND EVALUATE NEW INFRASTRUCTURE
LEARN IT OR BECOME IRRELEVANT
@cchio
MLHACKER