2. Features of Grad-CAM
● Grad-CAM (Gradient-weighted Class Activation Mapping; Selvaraju et al., 2016)
○ One of the best-known methods in XAI (the reason is described in a later slide)
○ Extends CAM (Zhou et al., 2015) and generalizes it to any kind of CNN architecture
● The Goals of XAI (Explainable Artificial Intelligence)
○ Identify failure modes (AI << Human)
○ Predict with more confidence (AI ≒ Human)
○ AI teaches humans (AI >> Human)
3. Contents
- Papers referenced by Grad-CAM
- Inside the Grad-CAM model
- Results and Discussion
- Implementation with PyTorch and Google Colaboratory
4. NIN (Network In Network; Lin et al., 2014)
- An influential paper because of two key ideas
○ Introduces the 1x1 convolution, which can reduce computation cost
(later applied in InceptionNet and the ResNet bottleneck block)
○ Introduces GAP (Global Average Pooling)
→ In recent implementations, adaptive average pooling is used instead
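The cost saving from the 1x1 convolution can be made concrete by counting weights. The sketch below is only an illustration: the helper `conv_weights` and the channel sizes 256 and 64 are assumptions chosen for the example, not values taken from the slides. It compares one 3x3 convolution on a wide tensor with a bottleneck that reduces the channels with a 1x1 convolution first.

```python
def conv_weights(c_in, c_out, k):
    """Number of weights in a k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


# Hypothetical channel sizes, chosen only for illustration.
wide, narrow = 256, 64

# One 3x3 convolution applied directly on the wide tensor.
direct = conv_weights(wide, wide, 3)

# Bottleneck: 1x1 reduce -> 3x3 on the narrow tensor -> 1x1 expand.
bottleneck = (
    conv_weights(wide, narrow, 1)
    + conv_weights(narrow, narrow, 3)
    + conv_weights(narrow, wide, 1)
)

print(direct, bottleneck, round(direct / bottleneck, 2))
```

At a fixed spatial resolution the number of multiply-adds scales with the same counts, so the ratio printed here also approximates the saving in computation.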
● GAP
Acts as a structural regularizer
○ More native to the convolution structure: enforces correspondence between feature maps and categories
○ No parameters to optimize
○ Robust to spatial translations of the input
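Two of the GAP properties listed above, no added parameters and robustness to spatial translation, can be checked on a toy example. This is a minimal sketch in plain Python; the function name `global_average_pool` and the 3x3 maps are assumptions made for illustration.

```python
def global_average_pool(feature_maps):
    """Reduce K feature maps (each H x W, nested lists) to K averages."""
    pooled = []
    for fmap in feature_maps:
        values = [v for row in fmap for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# One 3x3 feature map with a single activation, and the same map shifted.
original = [[0.0, 9.0, 0.0],
            [0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0]]
shifted = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0],
           [0.0, 0.0, 9.0]]

pooled_original = global_average_pool([original])
pooled_shifted = global_average_pool([shifted])
print(pooled_original, pooled_shifted)
```

The function has no learnable weights, and both maps pool to the same value. The equality only holds while the activation stays inside the map, so this shows robustness rather than exact invariance for real images.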
5. Object Detectors Emerge in Deep Scene CNNs (Zhou et al., 2015)
- Training a CNN for scene recognition → object detectors emerge inside the model
- No supervised dataset for object classification or detection is used
- In previous research: object classification → object localization
- Places Database (Zhou et al., 2014)
6. CAM (Class Activation Mapping; Zhou et al., 2015)
[Figure: CAM architecture: final conv layer (K feature maps) → GAP (K elements) → FC (C classes); the CAM is generated from this pipeline]
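8. Equations and Concept of CAM
- GAP (sum over i, j): $F^k = \frac{1}{Z} \sum_{i,j} A^k_{ij}$
- Class score (weighted sum over k): $Y^c = \sum_k w^c_k F^k$
- CAM (weighted sum over k, without GAP): $L^c_{\mathrm{CAM}}(i, j) = \sum_k w^c_k A^k_{ij}$
- Each process is independent: the sum over i, j and the weighted sum over k can be applied in either order
- $A^k$ is the k-th feature map, $w^c_k$ is the FC weight from map k to class c, and Z is the size of the feature map (Z = 49, e.g. a 7x7 map)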
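The relation between the class score and the CAM can be checked numerically. The sketch below uses plain Python with toy values; the function names, the 2x2 feature maps, and the weights are assumptions for illustration. It computes the score (average over i, j, then weighted sum over k) and the CAM (weighted sum over k at each location), and shows that averaging the CAM over i, j recovers the score.

```python
def class_score(feature_maps, weights_c):
    """Class score: average over i, j (GAP), then a weighted sum over k."""
    z = len(feature_maps[0]) * len(feature_maps[0][0])
    pooled = [sum(sum(row) for row in fmap) / z for fmap in feature_maps]
    return sum(w * f for w, f in zip(weights_c, pooled))


def class_activation_map(feature_maps, weights_c):
    """CAM: weighted sum over k at each location (i, j), without pooling."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [sum(w * fmap[i][j] for w, fmap in zip(weights_c, feature_maps))
         for j in range(width)]
        for i in range(height)
    ]


# Toy values: K = 2 feature maps of size 2 x 2, FC weights for one class c.
A = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 1.0], [0.0, 1.0]]]
w_c = [0.5, -1.0]

cam = class_activation_map(A, w_c)
score = class_score(A, w_c)
cam_mean = sum(sum(row) for row in cam) / 4  # Z = 4 in this toy example

print(cam, score, cam_mean)
```

Because the two sums commute, the CAM can be read as the class score before spatial averaging, which is why it can be produced after inference without any extra training.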
9. Usage of CAM (After Inference)
[Figure: inference averages the feature maps over i, j and then takes a weighted sum over k; CAM generation takes the weighted sum over k of the feature maps. Image source: Zhou et al., 2015]
10. Guided Back-Propagation (Springenberg et al., 2015)
- Deconvolutional Network (Zeiler et al., 2011)
The reverse process of max pooling
- Guided Backprop
Combines the DeconvNet backward rule with ordinary ReLU backpropagation
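The difference between the three backward rules at a ReLU can be sketched in a few lines. The function `relu_backward` and the input values are assumptions for illustration. Ordinary backpropagation gates the gradient on the sign of the forward input, the DeconvNet rule gates on the sign of the incoming gradient, and Guided Backprop applies both gates.

```python
def relu_backward(x, grad_out, rule):
    """Backward pass through ReLU for forward input x and incoming gradient grad_out."""
    passed = []
    for xi, gi in zip(x, grad_out):
        if rule == "backprop":      # gate on the forward input
            keep = xi > 0
        elif rule == "deconvnet":   # gate on the incoming gradient
            keep = gi > 0
        elif rule == "guided":      # gate on both
            keep = xi > 0 and gi > 0
        else:
            raise ValueError(f"unknown rule: {rule}")
        passed.append(gi if keep else 0.0)
    return passed


x = [1.0, -1.0, 2.0, -3.0]          # inputs to the ReLU in the forward pass
grad_out = [-2.0, 3.0, 4.0, -1.0]   # gradient arriving from the layer above

results = {rule: relu_backward(x, grad_out, rule)
           for rule in ("backprop", "deconvnet", "guided")}
for rule, grad_in in results.items():
    print(rule, grad_in)
```

Only the third element survives under the guided rule, since it is the only position where both the forward input and the incoming gradient are positive.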
12. Grad-CAM (Selvaraju et al., 2016)
- CAM is limited to architectures with GAP → Grad-CAM generalizes to any architecture
- Combines CAM (coarse) with Guided Backprop (fine-grained)
- Applies ReLU to the map (only positive values are needed)
- No architectural change or re-training is required
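- Weights (sum over i and j): $\alpha^c_k = \frac{1}{Z} \sum_{i,j} \frac{\partial Y^c}{\partial A^k_{ij}}$
- Grad-CAM (weighted sum over k): $L^c_{\mathrm{Grad\text{-}CAM}} = \mathrm{ReLU}\left(\sum_k \alpha^c_k A^k\right)$
- $Y^c$ is the score of class c, $A^k$ is the k-th feature map, and Z is the size of the feature map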
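The Grad-CAM computation can be sketched without a deep learning framework. This is a minimal illustration under stated assumptions: gradients are estimated with central differences as a stand-in for autograd (in practice a framework such as PyTorch would supply them), and the names `numerical_gradients`, `grad_cam`, `flatten_fc_score`, and the toy weights `W_c` are hypothetical. The head is deliberately a flatten + linear layer with no GAP, the case that CAM cannot handle.

```python
import copy


def numerical_gradients(feature_maps, score_fn, eps=1e-4):
    """Central-difference estimate of d(score)/dA[k][i][j] (stand-in for autograd)."""
    grads = copy.deepcopy(feature_maps)
    for k, fmap in enumerate(feature_maps):
        for i, row in enumerate(fmap):
            for j, value in enumerate(row):
                plus = copy.deepcopy(feature_maps)
                minus = copy.deepcopy(feature_maps)
                plus[k][i][j] = value + eps
                minus[k][i][j] = value - eps
                grads[k][i][j] = (score_fn(plus) - score_fn(minus)) / (2 * eps)
    return grads


def grad_cam(feature_maps, score_fn):
    """ReLU of the alpha-weighted sum of the feature maps."""
    grads = numerical_gradients(feature_maps, score_fn)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    z = height * width
    # alpha_k: gradient of the class score averaged over i and j.
    alphas = [sum(sum(row) for row in g) / z for g in grads]
    return [
        [max(0.0, sum(a * fmap[i][j] for a, fmap in zip(alphas, feature_maps)))
         for j in range(width)]
        for i in range(height)
    ]


# Toy values: K = 2 feature maps of size 2 x 2.
A = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 1.0], [0.0, 1.0]]]
# Hypothetical head without GAP: one weight per feature-map element.
W_c = [[[0.2, 0.4], [0.6, 0.8]],
       [[-1.0, -1.0], [-1.0, -1.0]]]


def flatten_fc_score(maps):
    """Class score of a flatten + linear head."""
    return sum(W_c[k][i][j] * maps[k][i][j]
               for k in range(2) for i in range(2) for j in range(2))


heatmap = grad_cam(A, flatten_fc_score)
print(heatmap)
```

For this linear head the gradient equals `W_c`, so the weights come out near 0.5 and -1.0 and the heatmap is close to ReLU(0.5 * A[0] - A[1]). No layer was replaced and nothing was re-trained; only the gradient of the class score with respect to the feature maps was needed.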
13. Result 1 of Grad-CAM
- Microsoft COCO dataset
- Samples from the validation dataset
- A mistake involving ice cream
14. Result 2 of Grad-CAM
- Mistakes made by VGG on ImageNet
- Checking whether the model is biased or not