This talk describes a study showing that integrating foveation into modern convolutional neural networks improves their robustness to adversarial attacks and common image corruptions. These slides are from a talk given by Muhammad Ahmed Shah at RIKEN AIP, Tokyo, Japan, as part of the TrustML Young Scientist Seminar.
2. Adversarial Attacks on ML Models
[Figure: a clean image x, a small perturbation δ scaled to magnitude ε, and the adversarial image x + δ, which the model misclassifies with high confidence]
Goodfellow, I. J., Shlens, J., and Szegedy, C. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
10. Adversarial Vulnerability of DNNs
• The adversary searches for points within (an approximation of) the
human’s perceptual boundary for which the ML classifier responds
differently than we do
[Figure: a point labeled "cat" inside the human's perceptual boundary, with the ML classifier's decision boundary cutting through the region]
11. Adversarial Vulnerability of DNNs
• If the classifier accurately models the perceptual boundary, the
adversary would have to find a point outside the boundary to change
the classifier’s output.
[Figure: with an accurate boundary, every point inside the human's perceptual region is still classified "cat"]
12. Adversarial Vulnerability of DNNs
• If the classifier accurately models the perceptual boundary, the adversary would have to find a point outside the boundary to change the classifier's output.
[Figure: the adversary's point now lies outside the boundary, where humans would also say "Not a Cat"]
14. Adversarial Attack Methods
• Projected Gradient Descent (PGD) [Madry+2018]
1. δ ← Π_ε(U(−1, 1))
2. for k = 1 → K:
3.   δ′ ← Π_ε(δ + ∇_δ ℒ(f(x + δ), y))
4.   δ ← Π_x(x + δ′) − x
Π_ε is a projection onto an ℓ_p-norm ball of radius ε; usually the ℓ∞ or ℓ2 norm is used. Π_x projects onto the valid input domain of x, usually [−1, 1] for images. A runnable sketch of this loop is given below.
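Below is a minimal PyTorch sketch of this loop. The hyperparameters (ε, the step size α, and K) are illustrative, and we add the step size and gradient sign used by standard ℓ∞ PGD, which the slide's update rule leaves implicit.

```python
import torch

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, K=10):
    """Sketch of l_inf PGD. Pi_eps is a clamp onto the eps-ball and
    Pi_x a clamp onto the valid image range, here [-1, 1]."""
    loss_fn = torch.nn.CrossEntropyLoss()
    # 1. Initialize delta uniformly inside the eps-ball.
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(K):  # 2. for k = 1..K
        delta.requires_grad_(True)
        loss = loss_fn(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            # 3. Ascend the loss, then project back onto the eps-ball (Pi_eps).
            delta = torch.clamp(delta + alpha * grad.sign(), -eps, eps)
            # 4. Project x + delta onto the input domain (Pi_x), recover delta.
            delta = torch.clamp(x + delta, -1, 1) - x
    return (x + delta).detach()
```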
19. SoTA Adversarial Defenses Train the Model to be Robust
• Adversarial Training:
• Basic Algorithm [Madry+2017]
1. for (x, y) ∈ D:
2.   x_adv ← argmax_{x′ ∈ 𝒳, ‖x − x′‖ ≤ ε} ℒ(f_θ(x′), y)
3.   θ ← θ − η ∇_θ ℒ(f_θ(x_adv), y)
• Works well in most practical scenarios (hence "empirical"), but offers no formal guarantee.
• Overfits to the attack type and attack parameters used during training.
A sketch of this training loop is given below.
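A sketch of this loop, reusing the pgd_linf attack sketched earlier; the optimizer and ε are illustrative choices:

```python
import torch

def adversarial_training_epoch(model, loader, optimizer, eps=8/255):
    """One epoch of PGD adversarial training, in the spirit of [Madry+2017]."""
    loss_fn = torch.nn.CrossEntropyLoss()
    for x, y in loader:
        # Inner maximization: approximate the worst-case x' in the eps-ball.
        x_adv = pgd_linf(model, x, y, eps=eps)
        # Outer minimization: a gradient step on the loss at x_adv.
        optimizer.zero_grad()
        loss_fn(model(x_adv), y).backward()
        optimizer.step()
```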
20. Most Defenses Overfit to the Training Configuration
• The robustness is learned, not intrinsic, so different types of attacks can break the defense.
• Small-norm perturbations are only one type of perturbation that humans are invariant to.
• Ideal models should be invariant to all of them!
[Figure: a clean image alongside an ℓ∞ PGD attack, an adversarial patch, real-world adversarial attacks, and common corruptions]
Humans are invariant to all of the above without any specialized training.
21. Hypothesis: The Robustness of Human Vision is Emergent, not Learned
• Perhaps the robustness of human vision is due to mechanisms and
constraints that DNNs do not have.
| Humans | DNNs |
|---|---|
| Neural activations are stochastic | Neural activations are deterministic |
| Highly recurrent | Usually exclusively feed-forward |
| Independent synaptic weights | Tied weights (in convolutional NNs) |
| Constrained receptive fields (usually Gabors) | Arbitrary receptive fields |
| Foveated vision (>90% of the visual field is low-res) | 100% of the visual field is high-res |
| … | … |
23. Hypothesis: Viewing the world at different levels of fidelity makes human vision robust
• Human vision is high-resolution only at the point of fixation and blurry everywhere else (this is called foveation).
• Due to this constraint, humans rely on low-frequency features of the image, such as shape (Geirhos+, 2018).
• Since DNNs are not so constrained, they tend to rely on high-frequency features, such as textures (Geirhos+, 2018).
• Adversarial attacks exploit DNNs by adding high-frequency perturbations to the image (Wang+, 2020).
• We simulate foveation as a preprocessing step in CNNs and evaluate its impact on robustness.
24. What is Foveation?
• There are two types of photoreceptors in the retina:
  • Cones are sensitive to color.
  • Rods are sensitive only to illumination.
25. What is Foveation?
• Cones are densely packed in a small region in the center of the retina called the fovea, but sparse everywhere else.
• Thus, vision has maximum fidelity (acuity) at the fovea and deteriorates in the periphery.
• The perceived image in the periphery is low-resolution and appears desaturated (Hansen+, 2009; Stewart+, 2020).
Image: Cmglee, own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=29924570
Wilkinson, M. O., Anderson, R. S., Bradley, A., and Thibos, L. N. Resolution acuity across the visual field for mesopic and scotopic illumination. Journal of Vision, 20(10):7, 2020. doi: 10.1167/jov.20.10.7.
27. R-Blur: Overview
[Figure: R-Blur pipeline — noise δ ~ 𝒩(0, σ) is added to the fixated image, which is then split and re-weighted by the color and grey acuity maps α_c and α_r]
1. Select fixation point
2. Add Gaussian Noise
3. Split into color and grey channels and
apply adaptive blurring
4. Combine the color and grey channels
28. Selecting the Fixation Point
• Training: random fixation.
• Evaluation: five fixation points at the four corners and the center; the logits are averaged (see the sketch below).
  • Not necessarily optimal.
  • We are currently developing methods for dynamically selecting the fixation point based on the image.
[Figure: the five fixated views are each passed through the DNN and the logits are averaged: y = (1/5) Σ logits]
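A sketch of this five-fixation evaluation; r_blur is the preprocessing sketched over the next few slides, and the corner/center coordinates are our assumption:

```python
import torch

def predict_five_fixations(model, r_blur, x):
    """Average logits over fixations at the four corners and the center."""
    _, H, W = x.shape
    fixations = [(0, 0), (0, W - 1), (H - 1, 0), (H - 1, W - 1),
                 (H // 2, W // 2)]
    logits = [model(r_blur(x, fy, fx).unsqueeze(0)) for fy, fx in fixations]
    return torch.stack(logits).mean(dim=0)  # y = (1/5) * sum of the logits
```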
29. Computing Eccentricity
• Eccentricity ≡ distance from the fixation point.
• Opticians measure eccentricity radially, i.e. as Euclidean distance.
  • This would require extracting circular regions to blur, which is inefficient.
• We use a different distance metric:
  e_{p_x,p_y} = max(|p_x − f_x|, |p_y − f_y|) / W
• Regions with the same eccentricity are squares, so they can be extracted by slicing the image tensor (see the sketch below).
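A sketch of this computation (the function name and tensor layout are ours):

```python
import torch

def eccentricity_map(H, W, fy, fx):
    """Max-norm (chessboard) distance of every pixel from the fixation
    point (fy, fx), normalized by the image width W."""
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    return torch.maximum((xs - fx).abs(), (ys - fy).abs()).float() / W
```

Because the metric is an ℓ∞ distance, iso-eccentricity contours are square rings, which can be sliced directly out of the image tensor instead of masking circles.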
30. Estimating Visual Acuity
• The acuity of color vision decreases exponentially with eccentricity.
• The acuity of grey vision is generally much lower, and is minimal at the fixation point.
• We approximate color and grey acuity as:
  𝒟_C(e; σ_C) = max(Λ(e; 0, σ_C), ς(e; 0, 2.5σ_C))
  𝒟_R(e; σ_R, m) = m(1 − 𝒟_C(e; σ_R))
• Λ and ς are the PDFs of the Laplace and Cauchy distributions.
• We set σ_C = 0.12, σ_R = 0.09, and m = 0.12 (see the sketch below).
Acuity data: Wilkinson et al., Journal of Vision, 20(10):7, 2020.
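A sketch of these acuity curves. The talk calls Λ and ς PDFs; here both are rescaled to peak at 1 so that the acuities lie in [0, 1], which is our assumption:

```python
import torch

def color_acuity(e, sigma_c=0.12):
    """D_C(e): pointwise max of a Laplace-shaped and a Cauchy-shaped curve,
    both centered at 0 and rescaled to peak at 1 (normalization assumed)."""
    laplace = torch.exp(-e.abs() / sigma_c)
    s = 2.5 * sigma_c
    cauchy = 1.0 / (1.0 + (e / s) ** 2)
    return torch.maximum(laplace, cauchy)

def grey_acuity(e, sigma_r=0.09, m=0.12):
    """D_R(e) = m * (1 - D_C(e; sigma_r)): minimal at the fixation point."""
    return m * (1.0 - color_acuity(e, sigma_r))
```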
31. Quantizing Visual Acuity
• The visual acuity at a pixel determines the std. dev. of the Gaussian blur applied to it.
• #kernels = #unique acuity values = #unique eccentricity values = W.
• To improve efficiency, we quantize the estimated visual acuity values.
32. Applying Blur
• We compute the std. dev. of the Gaussian kernel at each pixel as β𝒟(e_{p_x,p_y}), where 𝒟(e_{p_x,p_y}) is the estimated acuity and β = 0.05 (see the sketch below).
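A sketch combining the quantization above with the blur step. The number of levels, the kernel size, and the level placement are illustrative; we follow the slide's σ = β𝒟(e) formula as written, though any monotone mapping from acuity to blur strength plugs in at the same place:

```python
import torch
import torchvision.transforms.functional as TF

def quantized_blur(img, acuity, beta=0.05, n_levels=5):
    """Snap each pixel's acuity to one of n_levels values, then apply one
    Gaussian kernel per level (sigma = beta * D, per the talk)."""
    levels = torch.linspace(float(acuity.min()), float(acuity.max()), n_levels)
    idx = (acuity.unsqueeze(-1) - levels).abs().argmin(dim=-1)
    out = torch.zeros_like(img)
    for i, d in enumerate(levels):
        sigma = max(float(beta * d), 1e-3)  # gaussian_blur requires sigma > 0
        blurred = TF.gaussian_blur(img, kernel_size=9, sigma=sigma)
        out += blurred * (idx == i).to(img.dtype)  # keep this level's pixels
    return out
```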
33. Desaturation via Combination
• The blurred grey and color images are combined pixelwise.
• The pixel weights are the color and grey visual acuity values (an end-to-end sketch follows below).
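An end-to-end sketch tying the previous pieces together; the grey conversion, the noise level, and the normalization of the weights are our assumptions:

```python
import torch

def r_blur(img, fy, fx, sigma_noise=0.125):
    """Sketch of R-Blur: noise -> color/grey split -> adaptive blur -> combine."""
    C, H, W = img.shape
    img = img + sigma_noise * torch.randn_like(img)      # 2. Gaussian noise
    ecc = eccentricity_map(H, W, fy, fx)                 # max-norm eccentricity
    a_c, a_r = color_acuity(ecc), grey_acuity(ecc)       # per-pixel acuities
    grey = img.mean(dim=0, keepdim=True).expand_as(img)  # 3. grey channel
    color_b = quantized_blur(img, a_c)                   #    adaptive blurring
    grey_b = quantized_blur(grey, a_r)
    # 4. Pixelwise combination weighted by the (normalized) acuity maps.
    w = a_c / (a_c + a_r)
    return w * color_b + (1.0 - w) * grey_b
```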
34. Evaluation: Models and Baselines
[Figure: evaluated models, all based on ResNet18 — a plain ResNet baseline, R-Blur, R-Warp, and adversarial training — each evaluated under PGD attack]
35. Evaluation: Datasets
| Dataset | #Train | #Val | #Test |
|---|---|---|---|
| Ecoset | 1.4 M | 28 K | 28 K* |
| Ecoset-10 | 50 K | 1 K | 1 K |
| Ecoset-100 | 480 K | 5 K | 5 K* |

*We use 1,000 testing images.
36. Result #1: R-Blur Improves Empirical Robustness
• Measured accuracy under Auto-PGD (Croce & Hein, 2020).
• R-Blur is much better than the ResNet baseline and R-Warp.
• R-Blur is not as good as adversarial training, but that is expected: AT is trained on adversarially perturbed data.
[Figure: accuracy under Auto-PGD on Ecoset and Ecoset-100]
37. Computing Certifiable Robustness
• Accuracy under Auto-PGD is not a formal guarantee.
• Certifiably Correct @ r ≡ 𝔼_{δ: ‖δ‖₂ ≤ r}[1(f(x + δ) = y*)] ≥ 0.999
  • The model f classifies x certifiably correctly at radius r if it correctly classifies x 99.9% of the time under random perturbations of size at most r.
• Certified Accuracy @ r ≡ 𝔼_{x,y∼D}[CC(x, y; f)]
  • The certified accuracy at radius r is the fraction of test data on which the prediction of f is certifiably correct.
38. Computing Certifiable Robustness
• We used randomized smoothing (Cohen+, 2019) to compute certified accuracy at different perturbation sizes (radii):
1. Perturb the input with 10⁵ noise samples from 𝒩(0, σ).
2. Obtain the model's prediction for each sample.
3. Compute the binomial p-value to determine the maximum r at which the model classifies the image certifiably correctly (a sketch follows below).
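A simplified sketch of this procedure: we certify the true label y directly rather than first estimating the top class as in the full Cohen+ (2019) algorithm, and the batching is illustrative:

```python
import torch
from scipy.stats import binomtest, norm

def certify_radius(model, x, y, sigma, n=10**5, alpha=0.001, batch=1000):
    """Certified l2 radius for input x via randomized smoothing."""
    correct = 0
    with torch.no_grad():
        for _ in range(n // batch):
            # 1. Perturb the input with Gaussian noise samples.
            noisy = x.unsqueeze(0) + sigma * torch.randn(batch, *x.shape)
            # 2. Count how often the model still predicts y.
            correct += (model(noisy).argmax(dim=1) == y).sum().item()
    # 3. One-sided (1 - alpha) lower confidence bound on P[f(x + noise) = y].
    p_low = binomtest(correct, n, alternative="greater").proportion_ci(1 - alpha).low
    if p_low <= 0.5:
        return None                     # abstain: no certificate
    return sigma * norm.ppf(p_low)      # maximum certified radius r
```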
39. Result #2: R-Blur is Certifiably Robust
• G-Noise is a model trained with Gaussian noise only (no blurring).
• The robustness of R-Blur is certifiable.
• R-Blur achieves better robustness than G-Noise against larger perturbations.
[Figure: certified accuracy vs. radius on Ecoset-100 and Ecoset]
40. Result #3: R-Blur is Robust to Common Corruptions
• We want models to be robust to a variety of perturbations, not just those with small ℓ_p norms.
• Images were perturbed by 19 non-adversarial corruptions at 5 severity levels.
• R-Blur achieves higher accuracy than all other models when images are severely corrupted.
[Figure: accuracy under common corruptions on Ecoset-100 and Ecoset]
41. Result #4: All Components of R-Blur Contribute to Robustness
• Removing any single component of R-Blur leads to a reduction in accuracy under adversarial attack.
43. Result #6: Accuracy of R-Blur Depends on the Fixation Point
• A small (but significant) fraction of images is highly sensitive to the fixation point.
• We use an oracle to obtain the model's prediction at 49 fixation points and pick the one at which it is correct (if such a point exists).
44. Result #6: Accuracy of R-Blur Depends on the Fixation Point
• Under oracle fixation-point selection, the accuracy of R-Blur increases to within 2% of the ResNet baseline.
• Methods for predicting the fixation point from the image are part of ongoing work.
[Figure: clean accuracy with 5-fixation averaging vs. oracle fixation selection]
45. Conclusion
• R-Blur significantly improves the robustness of CNNs without being trained on perturbed data.
• The robustness of R-Blur generalizes better than AT to different perturbation types.
• R-Blur shows the promise of biologically motivated approaches to model design, especially as they relate to robustness.
• With predefined fixation points, the clean accuracy of R-Blur is lacking.
• Appropriate fixation-point selection can mitigate the loss in accuracy considerably.
46. References
Ramezani, F., Kheradpisheh, S. R., Thorpe, S. J., and Ghodrati, M. Object categorization in visual periphery is modulated by delayed foveal noise. Journal of Vision, 19(9):1–1, 2019.
Stewart, E. E. M., Valsecchi, M., and Schütz, A. C. A review of interactions between peripheral and foveal vision. Journal of Vision, 20(12):2–2, 2020. ISSN 1534-7362. doi: 10.1167/jov.20.12.2.
Hansen, T., Pracejus, L., and Gegenfurtner, K. R. Color perception in the intermediate periphery of the visual field. Journal of Vision, 9(4):26–26, 2009.
Wang, H., et al. High-frequency component helps explain the generalization of convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.
Dapello, J., Marques, T., Schrimpf, M., Geiger, F., Cox, D., and DiCarlo, J. J. Simulating a primary visual cortex at the front of CNNs improves robustness to image perturbations. Advances in Neural Information Processing Systems, 33:13073–13087, 2020.
Mehrer, J., Spoerer, C. J., Jones, E. C., Kriegeskorte, N., and Kietzmann, T. C. An ecologically motivated image dataset for deep learning yields better models of human vision. Proceedings of the National Academy of Sciences, 118(8):e2011417118, 2021.
Krizhevsky, A., Nair, V., and Hinton, G. CIFAR-10 (Canadian Institute for Advanced Research). URL http://www.cs.toronto.edu/~kriz/cifar.html.
Croce, F. and Hein, M. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In International Conference on Machine Learning, pp. 2206–2216. PMLR, 2020.
Cohen, J., Rosenfeld, E., and Kolter, Z. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, pp. 1310–1320. PMLR, 2019.
Geirhos, R., et al. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In International Conference on Learning Representations, 2018.
Editor's Notes
I’ll start by providing some background on adversarial attacks against ML models
Consider the image on the left, which clearly shows a cute panda. If we pass this image to a deep learning model trained on ImageNet, it also correctly identifies the panda.
Now consider the image on the right. This image is obtained by superimposing the patch of seemingly random noise, delta, shown in the middle, onto the original image. The noise is scaled to a very small magnitude, epsilon, so that the modified image looks identical to the original. However, our ImageNet model, which correctly classified the original image, classifies the modified image as a gibbon with 99% confidence.
I’ll begin by describing the problem of adversarial vulnerability in deep neural networks
We have two classifiers, the human and the ML classifier, and an adversary.
Under the threat model that we are considering, the goal of the adversary is to modify an input sample in such a way that the ML classifier responds to it differently than a human would.
The objective of the ML classifier is to replicate the human’s decision function for the given task.
Consider the case of visual perception.
When the human sees this image, they will recognize it as a cat. When making this decision the human brings a vast amount of experiential and, even, scientific knowledge to bear.
In addition to this specific image, the human will also consider several perceptual variants of the same image as cats, as long as they exhibit the features that the human knows to be characteristic of cats.
The ML classifier, on the other hand, is provided only very sparse information by the human.
For example, the human will tell the classifier that these three images contain cats, without telling it what it actually means to be a cat, and that this image is that of a dog.
Now suppose that these images are arranged as shown in the human’s perceptual space. The image of the dog represents the entire perceptual region that contains dogs.
The classifier’s job is to discriminate between cats and dogs.
With the information the classifier is provided it can learn a boundary, such as this one, that intersects the perceptual region containing cats.
In fact, it can learn many such boundaries, all of which would be equally optimal given the information provided to the model.
However, some of these boundaries make the model vulnerable to adversarial attacks.
To craft its attack, the adversary searches for points within the human’s perceptual boundary for which the ML classifier responds differently than the human.
Since in most cases the human’s perceptual boundary is unknown (even to us humans), the adversary approximates it using a metric ball of a small radius around the data point.
In this example, this region here contains points that the human would consider to be cats, but that the classifier would classify as dogs.
If the classifier were somehow able to accurately model the perceptual boundary of the human, it would be robust to adversarial attacks, at least under the threat model we are considering.
In this case, the adversary cannot find a point within the human's perceptual region for a cat that the classifier will classify as a dog, and vice versa. Any point the adversary picks that changes the decision of the classifier will also change the decision of the human.
One of the simplest techniques is the fast gradient sign method (FGSM). FGSM computes the perturbation by taking the gradient of the loss function w.r.t. the input, x, and scaling its sign by epsilon.
A more powerful technique uses projected gradient descent to iteratively find a perturbation that maximizes the loss while remaining imperceptibly small. Like FGSM, this technique computes the perturbation using the gradient of the loss function w.r.t. x. [START CLICKING FOR ANIMATION]