4. How Machine Learning Works
REF: Confidence-Guided-Open-World [Li et al]
● A neural network is trained on labelled data, learning to classify images and receiving feedback on its predictions
● It is then tested against images it
hasn’t seen before
● If the neural network is trained on the training data for too long it will start to overfit, and will not necessarily give accurate results on the test data (a minimal sketch of this gap follows below)
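A minimal sketch of this train/test gap (the dataset, network size and training settings below are illustrative assumptions, not taken from the talk):

# Train a deliberately over-sized network on a small labelled image set,
# then compare accuracy on the training data with accuracy on unseen test data.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)                       # small labelled image dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=200, random_state=0)                 # tiny training set so the gap is visible

model = MLPClassifier(hidden_layer_sizes=(512, 512), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

print("train accuracy:", model.score(X_train, y_train))   # usually close to 1.0
print("test accuracy: ", model.score(X_test, y_test))     # lower: the network has overfit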
5. What is an Adversarial Image Attack?
● Purposefully causing an ML model to produce mispredictions when identifying data
● There are 3 main fields:
○ Image recognition
○ Natural language
○ Auditory processing
● The field is relatively new, having only been studied since 2013
REF: Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors [Wu et al]
https://arxiv.org/abs/1312.6199
6. What does an Adversarial Image Attack look like?
● This image of a panda has been identified; the model gives confidence scores for the categories it has detected, which represent the probability that the respective category is present in the image
● It has identified the giant panda and has drawn a bounding box around it in the image
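A sketch of what an object detector's output looks like in code (the model here is an assumed off-the-shelf detector, not the one used for the panda image; it simply shows the boxes, labels and confidence scores described above):

# Run a pre-trained detector and print each detection above a confidence threshold.
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
detector = fasterrcnn_resnet50_fpn(weights=weights).eval()

image = torch.rand(3, 480, 640)                  # placeholder image tensor in [0, 1]
with torch.no_grad():
    out = detector([image])[0]                   # one dict of results per input image

for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
    if score > 0.5:                              # the score is the model's confidence
        print(weights.meta["categories"][label], round(score.item(), 2), box.tolist())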
7. What does an Adversarial Image Attack look like?
REF: https://openai.com/research/attacking-machine-learning-with-adversarial-examples
There are some slight differences between these two pictures, e.g. the right one is fuzzier. This is because the right image has had an adversarial attack performed on it, while the left image is the original.
8. What does an Adversarial Image Attack look like?
REF: https://openai.com/research/attacking-machine-learning-with-adversarial-examples
The image is fuzzier because a layer of carefully crafted noise has been added to it. This causes the panda to be misclassified as a gibbon.
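The panda example in the linked OpenAI post is made with the fast gradient sign method (FGSM). A minimal sketch of that style of attack, assuming a recent torchvision and an already preprocessed input image:

# FGSM: nudge every pixel a small step in the direction that increases the loss,
# so the change is barely visible but can flip the prediction.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

def fgsm(image, true_label, epsilon=0.007):
    # image: 1x3xHxW tensor, already preprocessed for the model
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), torch.tensor([true_label]))
    loss.backward()
    # epsilon controls how visible the added noise (the "fuzz") is
    return (image + epsilon * image.grad.sign()).detach()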
9. Panda vs Gibbon: A Brief Overview
This shows a simplified model of how the network classifies pandas and gibbons. When the adversarial attack is performed, the image of the panda moves across the line (the decision boundary), making it appear to the neural network as a gibbon.
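A toy two-dimensional version of that picture, with made-up numbers purely for illustration:

# A linear "panda vs gibbon" classifier and a perturbation that pushes a
# panda point across the decision boundary.
import numpy as np

w, b = np.array([1.0, -1.0]), 0.0        # decision boundary: w.x + b = 0
panda = np.array([0.2, 0.5])             # starts on the panda side (score < 0)

def label(x):
    return "gibbon" if w @ x + b > 0 else "panda"

perturbation = 0.4 * np.sign(w)          # small step towards the gibbon side
print(label(panda))                      # panda
print(label(panda + perturbation))       # gibbon: the point has crossed the line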
10. Adversarial Patch - Tom Brown
REF: https://www.youtube.com/watch?v=i1sp4X57TL4
Adversarial attacks can be very powerful: altering a single pixel, or adding a printed patch, can be enough to change a prediction. The video shows how a banana can be misclassified as a toaster in real time just by placing the patch next to it.
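A sketch of how a pre-computed patch might be pasted into an image (shapes and position are assumptions; generating the patch itself is the hard part and is not shown here):

# Overwrite a small region of the image with the adversarial patch.
import torch

def apply_patch(image, patch, x=10, y=10):
    # image: 3xHxW tensor, patch: 3xhxw tensor with h <= H and w <= W
    patched = image.clone()
    _, h, w = patch.shape
    patched[:, y:y + h, x:x + w] = patch   # the patch simply replaces those pixels
    return patched

image = torch.rand(3, 224, 224)            # placeholder image
patch = torch.rand(3, 50, 50)              # placeholder patch
patched = apply_patch(image, patch)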
11. Real World Example: Stop Sign
REF: Robust Physical-World Attacks on Deep Learning Models [Eykholt et al]
Researchers in Michigan placed small pieces of tape on this stop sign, which caused the model to misclassify it as a 45 mph speed limit sign.
When this technology is used in the real world, a misclassification like this could have drastic consequences.
12. Adversarial Examples
REF: Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors [Wu et al]
The effectiveness of an adversarial attack varies with a number of factors: a small rotation or a slight change in illumination can cause the attack to stop working.
The man is wearing an adversarial patch printed on a jumper, which stops him being recognised by the image recognition network. However, in a slightly different environment he is recognised and identified.
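A sketch of how one might test a single factor, rotation, against an adversarial image (the model and labels are assumed; this is not the authors' evaluation code):

# Does the adversarial image still fool the classifier after a small rotation?
from torchvision.transforms.functional import rotate

def still_fools(model, adv_image, true_label, degrees=5.0):
    # adv_image: 1x3xHxW tensor that currently fools the model
    rotated = rotate(adv_image, degrees)
    prediction = model(rotated).argmax(dim=1).item()
    return prediction != true_label      # True if the attack survives the rotation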
13. Nightshade: a Defensive use
Nightshade was created as a way for artists to protect their work from being used as training data for image generation models. It was created by Ben Zhao at the University of Chicago.
14. Nightshade: a Defensive use
Nightshade works by adding imperceptible perturbations to an artist's images before they are published.
If such images are then used to train a model without permission, they poison the network, causing the generated images to become distorted. As more poisoned images are used, the output becomes unrecognisable.
Nightshade can be found here:
https://nightshade.cs.uchicago.edu/userguide.html
16. Further Reading
Interesting Papers
● Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey (2018)
● A Survey of Black-Box Adversarial Attacks on Computer Vision Models (2020)
● Advances in Adversarial Attacks and Defenses in Computer Vision: A Survey (2021)
● Hacking AI: Security & Privacy of Machine Learning Models (2021)