1. Adversarial Machine Learning
Junfei Wang
Supervisor: Pirathayini Srikantha
Graduated from UWO in April 2020
Now a PhD student at York University
2. Outline
1. Introduction
2. Connections Between Standard ML and Adversarial ML
3. Attack Algorithm
4. Defense Mechanism
3. Introduction
Use cases:
Self-driving car: a physical change to a traffic sign may cause misclassification.
ASR system: https://adversarial-attacks.net/
…
A small change to the input causes a huge difference in the output.
It is not really noise.
4. Introduction
In 2014, this phenomenon was first reported in [1].
Definition: legitimate inputs altered by adding small, often imperceptible, perturbations to force a learned classifier to misclassify the resulting adversarial inputs, while remaining correctly classified by a human observer.
The perturbation can be physical, as in the traffic-sign example.
[1] C. Szegedy, et al. "Intriguing properties of neural networks." Proceedings of the International Conference on Learning Representations, 2014.
5. Recap of Machine Learning Training Process (1)
Given inputs and labels, we keep updating the weights of the model to fit them.
6. Recap of Machine Learning Training Process (1)
Given the trained model, we instead change the input so that it travels across the decision boundary.
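A minimal PyTorch sketch contrasting the two recaps (the toy linear model, input, and learning rate are all hypothetical): training takes a gradient-descent step on the weights, while the attack freezes the weights and takes a gradient-ascent step on the input.

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(4, 2)   # hypothetical toy classifier
x = torch.randn(1, 4)           # one input
y = torch.tensor([1])           # its true label
lr = 0.1

# Training: move the WEIGHTS downhill on the loss.
loss = F.cross_entropy(model(x), y)
loss.backward()
with torch.no_grad():
    for p in model.parameters():
        p -= lr * p.grad        # gradient descent on w

# Attack: freeze the weights and move the INPUT uphill on the loss.
model.zero_grad()
x_adv = x.clone().requires_grad_(True)
loss = F.cross_entropy(model(x_adv), y)
loss.backward()
x_adv = (x_adv + lr * x_adv.grad).detach()  # gradient ascent on x
```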
7. Recap of Machine Learning Process (2)
[Figure: loss as a function of the weights w; training descends this curve]
8. Recap of Machine Learning Process (2)
[Figure: loss as a function of the input X; an attacker can ascend this curve]
9. White-box Adversarial Attack (1)
Perspective 1: Given the model and the original label, we keep updating the input so as to change the output label.
Perspective 2: Instead of traveling downhill along the loss curve, we do gradient ascent to increase the loss.
Perspective 3: Any input can be perturbed to fool the target model, but the perturbation may not be stealthy enough.
So a successful adversarial attack should also evade detection (sketch below).
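A sketch of Perspectives 1 and 2, assuming a differentiable PyTorch `model` and a single (x, y) pair as in the previous snippet: repeat gradient-ascent steps on the input until the predicted label flips.

```python
import torch
import torch.nn.functional as F

def ascend_until_misclassified(model, x, y, step=0.05, max_iters=100):
    x_adv = x.clone().requires_grad_(True)
    for _ in range(max_iters):
        loss = F.cross_entropy(model(x_adv), y)
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            x_adv += step * x_adv.grad           # travel uphill on the loss curve
        x_adv.grad.zero_()
        if model(x_adv).argmax(dim=1).item() != y.item():  # crossed the boundary
            break
    return x_adv.detach()
```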
10. Detector and Stealthiness
Detector:
a. Image & audio attacks: a human observer
b. Fraudulent transactions (time series): an anomaly-detection mechanism
c. The detector can itself serve as a defense mechanism against adversarial attacks
How do we make an attack stealthy?
Since it is impossible to build a model of the detector, we restrict the norm of the perturbation (sketch after this list):
a. L0 norm: the number of dimensions that may be perturbed
b. L2 norm: the Euclidean distance
c. L∞ norm: the maximum change across all dimensions
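A small illustrative sketch (all values hypothetical) of how the three norms measure a perturbation delta = x_adv − x:

```python
import numpy as np

x = np.array([0.2, 0.5, 0.9, 0.1])
x_adv = np.array([0.2, 0.53, 0.9, 0.08])
delta = x_adv - x

l0 = np.count_nonzero(delta)   # how many dimensions were touched
l2 = np.linalg.norm(delta)     # Euclidean distance
linf = np.max(np.abs(delta))   # largest change in any single dimension
print(l0, l2, linf)            # 2, ~0.036, 0.03
```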
12. White-box Attack Algorithm (1)
1. Projected Gradient Descent (PGD) [2]: iterative gradient steps on the loss under an L∞ constraint (sketch below).
[2] Madry, Aleksander, et al. "Towards deep learning models resistant to adversarial attacks." Proceedings of the International Conference on Learning Representations, 2018.
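A minimal PGD sketch under these assumptions: a differentiable PyTorch `model`, inputs scaled to [0, 1], and the usual iterate of a signed gradient-ascent step followed by projection back into the ε-ball around the original input. The step sizes are illustrative defaults, not values from [2].

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, iters=10):
    x0 = x.detach()
    x_adv = x0.clone()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()             # signed ascent step
            x_adv = torch.clamp(x_adv, x0 - eps, x0 + eps)  # project into the L-inf ball
            x_adv = torch.clamp(x_adv, 0.0, 1.0)            # keep a valid image
    return x_adv
```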
13. White-box Attack Algorithm (2)
2. Fast Gradient Sign Method (FGSM) [3]: relies on the first-order derivative, x_adv = x + ε · sgn(∇x J(θ, x, y)); the sgn function keeps the step from vanishing where the gradient is very small.
http://jlin.xyz/advis/
[3] Goodfellow, Ian J., et al. "Explaining and harnessing adversarial examples." Proceedings of the International Conference on Learning Representations, 2015.
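A one-step FGSM sketch under the same assumed `model` and inputs as the PGD sketch:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.1):
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).detach()  # single signed step of size eps
```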
14. White-box Attack Algorithm (3)
3. Jacobian Saliency Map Attack (JSMA) [4]:
• JSMA: iteratively modify the most sensitive pixel (dimension)
• Jacobian saliency map: the sensitivity function
• Example: a 70 km/h speed-limit sign misclassified as a 30 km/h speed-limit sign (sketch below)
[4] Papernot, Nicolas, et al. "The limitations of deep learning in adversarial settings." Proceedings of the IEEE European Symposium on Security and Privacy, 2016.
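A simplified single-step sketch of the JSMA idea (batch size 1, flattened features; the pixel-pairing and search-domain details of the full algorithm in [4] are omitted): build the Jacobian of the logits with respect to the input, score each dimension by how much it helps the target class while hurting the others, and bump only the most salient one.

```python
import torch

def jsma_step(model, x, target, theta=0.1):
    x = x.clone().requires_grad_(True)
    logits = model(x)[0]                          # assumes a batch of one
    jac = torch.stack([
        torch.autograd.grad(logits[c], x, retain_graph=True)[0][0]
        for c in range(logits.shape[0])
    ])                                            # (num_classes, num_features)
    d_target = jac[target]                        # sensitivity toward the target class
    d_others = jac.sum(dim=0) - d_target          # sensitivity toward all other classes
    saliency = torch.where((d_target > 0) & (d_others < 0),
                           d_target * d_others.abs(),
                           torch.zeros_like(d_target))
    i = saliency.argmax()                         # most sensitive dimension
    x_new = x.detach().clone()
    x_new.view(-1)[i] += theta                    # perturb only that pixel
    return x_new
```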
15. White-box Attack Algorithm (4)
4. AdvGAN [5]: train a generator whose output perturbation G(x) makes x + G(x) fool the target model, while a discriminator keeps the result realistic (sketch below).
[5] Xiao, Chaowei, et al. "Generating adversarial examples with adversarial networks." arXiv preprint arXiv:1801.02610, 2018.
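A sketch of the generator objective described in [5], assuming modules `G` (generator), `D` (discriminator), and target classifier `f` already exist; the untargeted cross-entropy term and the loss weights are illustrative simplifications of the paper's losses.

```python
import torch
import torch.nn.functional as F

def advgan_generator_loss(G, D, f, x, y, c=0.3, alpha=1.0, beta=1.0):
    perturbation = G(x)
    x_adv = x + perturbation
    d_out = D(x_adv)
    loss_gan = F.binary_cross_entropy(d_out, torch.ones_like(d_out))  # look realistic
    loss_adv = -F.cross_entropy(f(x_adv), y)      # untargeted: push f away from y
    loss_hinge = torch.clamp(perturbation.flatten(1).norm(dim=1) - c, min=0).mean()
    return loss_adv + alpha * loss_gan + beta * loss_hinge
```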
17. Black-box Attack Strategy
Train a substitute model [6]: a local representative model built by strategically querying the targeted model.
Transferability: adversarial examples generated against model A can also fool model B.
Jacobian-based dataset augmentation: on the substitute F, identify the directions in which the model's output varies (sketch below).
[6] Papernot, Nicolas, et al. "Practical black-box attacks against machine learning." Proceedings of the 2017 ACM Asia Conference on Computer and Communications Security, 2017.
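A sketch of Jacobian-based dataset augmentation, assuming a differentiable `substitute` model and labels already collected from the target oracle: each new query point steps along the sign of the gradient of the substitute's output for the oracle's label.

```python
import torch

def jacobian_augmentation(substitute, xs, oracle_labels, lmbda=0.1):
    new_points = []
    for x, y in zip(xs, oracle_labels):
        x = x.clone().requires_grad_(True)
        score = substitute(x.unsqueeze(0))[0, y]   # output for the oracle's label
        grad = torch.autograd.grad(score, x)[0]
        new_points.append((x + lmbda * grad.sign()).detach())  # step where output varies
    return new_points
```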
18. Defense Mechanism
Attacks are unavoidable: a somewhat pessimistic view.
Robustness raises the price that attackers have to pay.
Defense mechanisms:
a. Detection-based defense
b. Adversarial training: augment the training set to reduce the sensitivity of the model (see the sketch after this list)
c. Input-data sanitization: denoise the input, mapping it back onto the learned manifold
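A sketch of adversarial training (defense b), reusing the hypothetical `pgd_attack` from the PGD slide; `model`, `loader`, and `opt` are assumed to exist.

```python
import torch.nn.functional as F

def adversarial_training_epoch(model, loader, opt):
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)          # craft attacks against the current model
        opt.zero_grad()
        loss = 0.5 * (F.cross_entropy(model(x), y) +
                      F.cross_entropy(model(x_adv), y))  # train on clean + adversarial
        loss.backward()
        opt.step()
```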
19. MagNet: AE-based Defense
Detector + Reformer (sketch below)
[7] Meng, Dongyu, and Hao Chen. "MagNet: a two-pronged defense against adversarial examples." Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017.
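A sketch of MagNet's two prongs, assuming an autoencoder `ae` already trained on clean data: the detector rejects inputs whose reconstruction error is large (far from the learned manifold), and the reformer hands the classifier the reconstruction instead of the raw input. The threshold is illustrative.

```python
import torch

def magnet_predict(classifier, ae, x, threshold=0.01):
    recon = ae(x)
    error = ((recon - x) ** 2).mean()        # distance from the learned manifold
    if error > threshold:
        return None                          # detector: reject as adversarial
    return classifier(recon).argmax(dim=1)   # reformer: classify the reconstruction
```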
20. Defense-GAN
Training stage: standard GAN training.
Inference stage: project the input onto the generator's range and classify the projection (sketch below).
[8] Samangouei, Pouya, Maya Kabkab, and Rama Chellappa. "Defense-GAN: protecting classifiers against adversarial attacks using generative models." arXiv preprint arXiv:1805.06605, 2018.
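A sketch of Defense-GAN's inference stage, assuming a trained generator `G` whose output shape matches the input: search the latent space for z* minimizing ||G(z) − x||², then classify G(z*) instead of x, discarding off-manifold adversarial detail.

```python
import torch

def defense_gan_project(G, x, latent_dim=100, steps=200, lr=0.05):
    z = torch.randn(x.shape[0], latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((G(z) - x) ** 2).mean()   # distance to the GAN's manifold
        loss.backward()
        opt.step()
    return G(z).detach()                  # classify this projection instead of x
```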