A proposal for using defensive layers to combat adversarial attacks, along with preliminary research.
We propose two distinct defensive layers and analyze their effects on model accuracy and on vulnerability to attack.
2. Agenda
● Background
○ Adversarial attacks
○ Defenses against adversarial attacks
● Intra-model Approach
○ Denoising layers
○ Experimental setup
● Experimental Results
● Conclusions and Future Work
3. Background - Attacks
A deliberate perturbation of the input image that
changes the classification output.
http://jlin.xyz/advis/
There are many different formulations, but typical
white-box attacks use the model itself (e.g., its
gradients) to generate the attacks.
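As a concrete instance of a white-box attack, the Fast Gradient Sign Method (FGSM) perturbs the input along the sign of the loss gradient with respect to that input. A minimal sketch on a toy logistic-regression "model" (the function and parameter names here are illustrative, not from the slides):

```python
import numpy as np

def fgsm_attack(x, y, w, b, eps):
    """FGSM on a toy logistic-regression model sigmoid(w.x + b).

    White-box: the attacker uses the model's own gradient w.r.t.
    the input x to craft the perturbation.
    """
    # forward pass
    z = np.dot(w, x) + b
    p = 1.0 / (1.0 + np.exp(-z))
    # gradient of the binary cross-entropy loss w.r.t. the input
    grad_x = (p - y) * w
    # step in the direction that increases the loss
    return x + eps * np.sign(grad_x)
```

With a large enough `eps`, the perturbed input crosses the decision boundary even though it stays close to the original in the max-norm sense.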
4. Background - Defense
Focused on roughly 3 categories:
1. Augment training set with adversarial examples (attack-specific)
2. Modify network architecture
3. Detect adversarial example and modify it
“Image Super-Resolution as a Defense Against Adversarial Attacks”
(enhanced deep super-resolution + wavelet denoising)
“ShieldNets: Defending Against Adversarial Attacks Using Probabilistic Adversarial Robustness”
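A minimal sketch of a category-3-style defense: modify the input before it ever reaches the classifier. The cited works use super-resolution and wavelet denoising; the toy median filter below only illustrates the pre-classifier placement, and all names are assumptions:

```python
import numpy as np

def median_filter_defense(img, k=3):
    """Pre-classifier defense: median-smooth the image before
    classification, washing out small adversarial perturbations.
    A toy 2-D median filter with edge padding; real defenses are
    far more sophisticated."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out
```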
5. Our Proposal
Use intra-model defense layers placed at different points inside the classifier network
as a defense against adversarial attacks.
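A rough sketch of the idea on a simple feed-forward stack: the denoising step is applied to intermediate activations rather than to the raw input. The moving-average denoiser and the layer structure are illustrative stand-ins, since the slides do not specify the exact operations:

```python
import numpy as np

def denoise(activation, k=3):
    """Moving-average smoothing of a 1-D activation vector; a
    stand-in for whatever denoising operation the defense layer
    actually applies to feature maps."""
    pad = k // 2
    padded = np.pad(activation, pad, mode="edge")
    return np.array([padded[i:i + k].mean()
                     for i in range(len(activation))])

def classify(x, layers, denoise_after=None):
    """Run x through a list of (weight, bias) ReLU layers,
    optionally inserting the denoiser after layer index
    `denoise_after` (the intra-model defense placement)."""
    h = x
    for i, (w, b) in enumerate(layers):
        h = np.maximum(0.0, w @ h + b)
        if i == denoise_after:
            h = denoise(h)
    return h.argmax()
```

Varying `denoise_after` corresponds to trying the defense layer at different depths of the classifier, which is the experimental knob the proposal turns.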
11. Conclusions
● Empirical results suggest that intra-model denoising layers can improve test
accuracy on adversarial datasets compared to pre-classifier defenses, though
more comprehensive testing with higher baseline accuracies is needed to
substantiate these results
● In most cases, including a single intermediary denoising layer is as good as or
better than adding multiple layers at once
12. Future Work
● Use different networks besides VGG19
● Use models that have a higher baseline accuracy
● Try using more types of adversarial attacks
● Work at increasing no-attack accuracy
○ Potentially fine-tune the latter portion of the network once the layer is in place
● Attempt to convert other current defense mechanisms into intermediate
defenses
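The "train the latter end of the network" idea could be sketched as freezing the weights before the inserted defense layer and retraining only the head on top of the (now denoised) features. A toy numpy version under those assumptions; all names are hypothetical:

```python
import numpy as np

def finetune_head(x, y, w1, w2, lr=0.1, steps=50):
    """Freeze w1 (the layers before the inserted defense layer)
    and retrain only w2, the latter end of the network, with
    plain gradient descent on squared error."""
    h = np.maximum(0.0, w1 @ x)          # frozen feature extractor
    for _ in range(steps):
        pred = w2 @ h
        # dL/dw2 for L = 0.5 * ||pred - y||^2
        grad_w2 = np.outer(pred - y, h)
        w2 = w2 - lr * grad_w2
    return w2
```

In practice this would be minibatch training of the post-layer portion of e.g. VGG19, but the frozen/trainable split is the same.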