"Deep Neural Networks Are Easily Fooled" is a paper that highlights the vulnerability of deep neural networks (DNNs) to adversarial examples. The authors explain that adversarial examples are input samples that are intentionally designed to cause the network to misclassify them. These examples are created by adding small perturbations to the input data, which are usually imperceptible to humans but can have a significant impact on the network's output.
The authors conducted several experiments to demonstrate the vulnerability of DNNs to adversarial examples. In one experiment, they showed that adding small perturbations to an image of a panda can cause the network to misclassify it as a gibbon with high confidence. Another experiment involved creating a digital image that the network classified as a school bus with high confidence, but which to humans appeared to be a random pattern of pixels.
The authors also examined what these fooling images look like and how general they are. Directly encoded images often resemble static, while the CPPN-encoded images contain regular, repeated motifs that capture low- and mid-level discriminative features of a class rather than its global structure. The fooling images are also not tied to one particular trained network: images evolved against one DNN often fool a second DNN trained on the same data, suggesting the vulnerability stems from general properties of discriminatively trained models rather than from a single architecture or training run.
The paper's findings have important implications for the security and reliability of DNNs, especially in applications such as autonomous driving, facial recognition, and fraud detection. The authors suggest that future research should focus on developing more robust and resilient models that are less susceptible to fooling images and adversarial attacks. Additionally, they argue that understanding the fundamental properties of DNNs that make them vulnerable to these attacks is crucial for designing better models and enhancing their security.
Overall, "Deep Neural Networks Are Easily Fooled" is a significant paper that has contributed to the advancement of the field of machine learning. It has inspired research into developing more robust and secure models, and it has highlighted the need to explore the fundamental properties of DNNs to better understand their vulnerabilities.
Deep Neural Networks Are Easily Fooled.pptx
1. Deep Neural Networks Are Easily Fooled:
High Confidence Predictions for Unrecognizable Images
Nguyen et al., CVPR15
Reporter: Yanhua Si
2. Human vs. Computer Object Recognition
Given the near-human ability of DNNs to classify objects, what differences remain between human and computer vision?
3. Experimental setup
1. LeNet model trained on the MNIST dataset – "MNIST DNN"
2. AlexNet DNN trained on the ImageNet dataset – "ImageNet DNN" (larger dataset, bigger network)
5. Direct Encoding
Grayscale values for MNIST, HSV values for ImageNet. Each pixel value is initialized with uniform random noise in the [0, 255] range, and the values are then mutated independently.
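To make the direct encoding concrete, here is a minimal NumPy sketch, following the mutation schedule described in the editor's notes below (per-value mutation rate of 0.1, halved every 1000 generations, mutation strength 15). The paper uses the polynomial mutation operator; a clipped Gaussian step stands in for it here, so treat this as illustrative rather than a faithful reimplementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_image(shape=(28, 28)):
    """Directly encoded genome: one grayscale value per pixel, uniform in [0, 255]."""
    return rng.uniform(0, 255, size=shape)

def mutate(image, generation, strength=15.0):
    """Mutate each pixel value independently.

    The per-value mutation rate starts at 0.1 and halves every 1000
    generations, as in the editor's notes. The paper's polynomial
    mutation operator is approximated here by a clipped Gaussian step.
    """
    rate = 0.1 * 0.5 ** (generation // 1000)
    mask = rng.random(image.shape) < rate          # which values get mutated
    step = rng.normal(0.0, strength, size=image.shape)
    return np.clip(image + mask * step, 0, 255)
```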
7. Indirect encoding – CPPN (Compositional Pattern-Producing Network) Encoding
This encoding is more likely to produce "regular" images, e.g. containing symmetry and repetition.
A CPPN is similar to an ANN, but uses a wider range of nonlinear activation functions.
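As a rough illustration of why CPPN-encoded images tend to look regular, here is a small, hand-written CPPN in NumPy: each pixel's (x, y) coordinates and its distance from the center are passed through a mix of sine, Gaussian, and linear units. In the paper the CPPNs are evolved (topology, activation functions, and weights all change), so this fixed-topology network is only a sketch.

```python
import numpy as np

def render_cppn(size=64, seed=0):
    """Render one image from a tiny, randomly weighted CPPN.

    Each pixel's (x, y) coordinates plus its distance from the center are
    fed through a hidden layer mixing sine, Gaussian, and linear units;
    periodic activations give repetition, the radial input gives symmetry.
    """
    rng = np.random.default_rng(seed)
    xs, ys = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    r = np.sqrt(xs ** 2 + ys ** 2)                  # radial input -> symmetry
    inputs = np.stack([xs, ys, r], axis=-1)         # shape (size, size, 3)

    w1 = rng.normal(0, 2.0, size=(3, 6))            # input -> hidden weights
    w2 = rng.normal(0, 1.0, size=(6, 1))            # hidden -> output weights

    h = inputs @ w1
    h[..., 0:2] = np.sin(h[..., 0:2])               # sine units
    h[..., 2:4] = np.exp(-h[..., 2:4] ** 2)         # Gaussian units
    # h[..., 4:6] stays linear.
    out = np.tanh(h @ w2)[..., 0]
    return ((out + 1) / 2 * 255).astype(np.uint8)   # map to [0, 255] grayscale
```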
10. Fooling images via Gradient Ascent
Follow the direction of the gradient of the posterior probability for a specific class with respect to the input image.
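A minimal sketch of this gradient-ascent procedure in PyTorch, assuming `model` is any image classifier returning class logits (the paper used Caffe models; the step size, iteration count, and pixel range below are illustrative assumptions, and any dataset-specific input normalization is omitted):

```python
import torch
import torch.nn.functional as F

def fooling_image(model, target_class, shape=(1, 3, 227, 227), steps=200, lr=1.0):
    """Gradient ascent on the input pixels to maximize one class's posterior.

    `model` is assumed to map an image batch to unnormalized class logits.
    Pixels are kept in [0, 1] for simplicity.
    """
    model.eval()
    x = torch.rand(shape, requires_grad=True)       # random starting image
    for _ in range(steps):
        prob = F.softmax(model(x), dim=1)[0, target_class]
        model.zero_grad()
        if x.grad is not None:
            x.grad.zero_()
        prob.backward()                             # gradient w.r.t. the pixels
        with torch.no_grad():
            x += lr * x.grad                        # ascent step
            x.clamp_(0.0, 1.0)                      # keep pixels in a valid range
    return x.detach()
```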
11. Results – MNIST
In fewer than 50 generations, each run of evolution produces unrecognizable images that the MNIST DNN labels as digits with ≥ 99.99% confidence.
12. Results – ImageNet Dataset
Directly encoded EA: even after 20,000 generations, evolution failed to produce high-confidence images for many categories, but it did produce images with ≥ 99% confidence for 45 categories; the median confidence is 21.59%.
CPPN encoding: many unrecognizable images with DNN confidence scores ≥ 99.99%. After 5,000 generations, the median confidence is 88.11%, similar to that for natural images.
13. CPPN Encoding on ImageNet
Evolution produces images that contain the most discriminative features of a class.
14. CPPN Encoding on ImageNet
Many ImageNet categories are phylogenetically related, which leads evolution to produce similar images for closely related categories.
Different runs of evolution produce different image types.
15. Repetition Ablation Study
To test whether repetition improves the confidence scores, some of the repeated elements were ablated.
In many images, ablation led to only a small drop in confidence.
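A rough sketch of such an ablation check, assuming a PyTorch classifier `model` that returns logits: blank out one repeated element and compare the target-class confidence before and after. The region coordinates and `fill` value are hypothetical; in the paper the repeated elements were ablated manually.

```python
import torch
import torch.nn.functional as F

def confidence(model, image, target_class):
    """Softmax confidence of `model` for `target_class` on one (C, H, W) image."""
    with torch.no_grad():
        return F.softmax(model(image.unsqueeze(0)), dim=1)[0, target_class].item()

def ablate_region(image, top, left, height, width, fill=0.0):
    """Return a copy of `image` with one rectangular repeated element blanked out."""
    ablated = image.clone()
    ablated[:, top:top + height, left:left + width] = fill
    return ablated

# Usage sketch (model, image, class index k, and coordinates are placeholders):
# drop = confidence(model, image, k) - confidence(model, ablate_region(image, 10, 10, 32, 32), k)
```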
17. Adding “Fooling Images” class
It is easier to learn to tell CPPN images apart from natural images than it is to tell CPPN images apart from MNIST digits.
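A hedged sketch of the retraining setup described in the editor's notes: widen the classifier's final layer with one extra output for the "fooling images" class, then fine-tune on the original training set plus evolved fooling images. The attribute name `classifier.fc` is a hypothetical placeholder for whatever the network's last linear layer is called.

```python
import torch
import torch.nn as nn

def add_fooling_class(classifier, n_classes):
    """Widen the final layer so class index `n_classes` means "fooling image".

    Assumes the last layer is exposed as `classifier.fc` (hypothetical name);
    the original class weights are copied over and one new output row is added.
    The model is then fine-tuned on real data plus evolved fooling images
    labeled with the new class.
    """
    old = classifier.fc
    new = nn.Linear(old.in_features, n_classes + 1)
    with torch.no_grad():
        new.weight[:n_classes] = old.weight
        new.bias[:n_classes] = old.bias
    classifier.fc = new
    return classifier
```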
Editor's Notes
Deep neural networks learn hierarchical layers of features from sensory input in order to perform pattern recognition. Recently, these deep architectures have demonstrated impressive, sometimes human-competitive, results on many pattern recognition tasks, especially visual classification problems. Given the near-human ability of DNNs to classify visual objects, questions arise as to what differences remain between computer and human vision.
A major, previously known difference between DNN and human vision is that changing a correctly classified image by an amount too small for human eyes to notice can cause a DNN to label the image as something else entirely.
This paper shows another difference between human and computer vision: some images that are unrecognizable to humans are nonetheless recognized by DNNs with very high confidence.
To test whether DNNs might give false positives for unrecognizable images, we need a DNN trained to near state-of-the-art performance; we choose the well-known "AlexNet" architecture. To test that our results hold for other DNN architectures and datasets, we also conduct experiments with the Caffe-provided LeNet model [18] trained on the MNIST dataset.
Mutation proceeds first by determining which numbers are mutated, via a rate that starts at 0.1 (each number has a 10% chance of being chosen for mutation) and drops by half every 1000 generations. The numbers chosen are then altered via the polynomial mutation operator [8] with a fixed mutation strength of 15.
In the indirect encoding, each image is generated by a CPPN; perturbations are made by changing the network's topology, activation functions, and weights.
The idea is that CPPNs evolve images that are similarly recognized by humans and DNNs.
Typical activation functions: sine, linear, Gaussian.
Calculating the gradient of the posterior probability for a specific class — here, a softmax output unit of the DNN — with respect to the input image using backprop, and then following the gradient to increase a chosen unit’s activation.
By 200 generations, median confidence is 99.99%.
A smaller dataset could lead to overfitting; in some cases, once the target class is revealed, one can relate the image's features to that class.
Dogs and cats are overrepresented in ImageNet: more images per class means less overfitting and less fooling.
Another explanation is that evolution finds it hard to produce images that score highly for, say, one specific dog category, because confidence gets spread across the many closely related breed categories.
The diversity is interesting because the second paper showed that a class label can be changed by imperceptible changes to the image.
Evolution produced high-scoring images for all of the classes; some classes are closely related, so the images produced for them are similar.
A class can have more than one discriminative feature, so there is more than one way to fool the net.
Extra copies of a discriminative feature make DNNs more confident. Thus DNNs tend to focus on low- and mid-level features rather than the global structure.
Which means that mostly they do.
An extra class is added to the DNN: the class of fooling images.
The retrained model also learns features specific to CPPN images.