Advisor: Henry Horng-Shing Lu
Students: Jane Hsing-Chuan Hsieh
Date: 2021-08-10
Concern: Model Transparency & Interpretability
• Despite unprecedented breakthroughs in a variety of computer vision tasks, CNNs cannot be decomposed into individually intuitive components, which makes them hard to interpret.
Purpose:
1. Visualize CNNs
• Visualize CNN predictions by highlighting ‘important’ pixels, i.e., pixels whose change in intensity has the most impact on the prediction score.
2. Help users build trust in AI
• We must build ‘transparent’ models that can explain why they predict what they predict.
What makes a good visual explanation?
1. Class-discriminative: localizes the category in the image
• Class Activation Mapping (CAM)
• Gradient-weighted Class Activation Mapping (Grad-CAM)
2. High-resolution: captures fine-grained detail (pixel-space gradient visualizations), but is not class-discriminative
• Guided Backpropagation
• Deconvolution
3. Both
• Guided Grad-CAM
1. Brief Introduction to CNN Visualization Tools
• CAM
• Grad-CAM
• Guided Backpropagation
• Guided Grad-CAM
The convolutional layers of CNNs actually behave as object detectors (i.e., they localize objects), even though no supervision on the location of the object is provided.
In other words, convolutional layers naturally retain spatial information.
E.g., for action classification, a CNN is able to localize the discriminative regions as the objects that the humans are interacting with, rather than the humans themselves.
However, this ability (retaining spatial information / acting as object detectors) is lost in the fully-connected layers.
• The higher a convolutional layer is, the higher the level of semantics it extracts.
• So we expect the last convolutional layer to offer the best compromise between high-level semantics and detailed spatial information.
For a particular category (𝑐), a Class Activation Map
(CAM) indicates the discriminative image regions used by
the CNN to identify that category
Characteristics
Replace the fully-connected layers with a global average pooling (GAP) layer, which
1. minimizes the number of parameters while maintaining high performance
2. acts as a structural regularizer, preventing overfitting during training
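CNN Architecture
1. For each feature map $f_k(x, y)$, $k = 1, \dots, n$, at the last convolutional layer, GAP outputs the spatial average of the feature map (written here as a sum, with the constant factor $1/(HW)$ omitted):
$$F_k = \sum_{x,y} f_k(x, y)$$
2. For a given class $c$, the input to the output layer is
$$S_c = \sum_k w_k^c F_k$$
where $w_k^c$ is the importance of $F_k$ for class $c$.
3. The output score for class $c$ (e.g., softmax) is
$$P_c = \frac{\exp(S_c)}{\sum_{c'} \exp(S_{c'})}$$
[Figure: CNN with a GAP layer, from the feature maps $f_k(x, y)$ to the class score $P_c$]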
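The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the original authors' code: the function name `gap_forward` and the toy arrays are assumptions made for the example, and pooling is written as a plain sum to match the formula for $F_k$.

```python
import numpy as np

def gap_forward(feature_maps, weights):
    """CAM-style head: pooling -> linear class scores -> softmax.

    feature_maps: array of shape (n, H, W), the maps f_k(x, y)
    weights:      array of shape (C, n), row c holds the weights w_k^c
    """
    F = feature_maps.sum(axis=(1, 2))   # F_k = sum_{x,y} f_k(x, y)
    S = weights @ F                     # S_c = sum_k w_k^c F_k
    exp_S = np.exp(S - S.max())         # subtract the max for numerical stability
    P = exp_S / exp_S.sum()             # P_c = exp(S_c) / sum_c' exp(S_c')
    return F, S, P

# Hypothetical toy input: n = 2 feature maps of size 2x2 and C = 2 classes.
f = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 2.0], [0.0, 0.0]]])
w = np.array([[1.0, 0.0],
              [0.0, 1.0]])
F, S, P = gap_forward(f, w)
```

Both pooled values are 2 in this toy case, so the two classes receive equal scores and equal probabilities.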
CAM Procedure
• The weights $(w_1^c, w_2^c, \dots, w_n^c)$ of the output layer indicate the importance of the image regions ($F_k$) to a specific class $c$.
→ Compute the CAM:
$$M_c(x, y) = \sum_k w_k^c f_k(x, y)$$
Note: if the shape (H, W) of the CAM $M_c$ differs from that of the input image, up-sampling is needed to match the shapes.
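As a rough sketch of the CAM formula, the weighted sum over feature maps is a single tensor contraction. The function name `class_activation_map` and the toy values are assumptions for illustration. The last line also computes the class score $S_c$, which (with pooling written as a sum) equals the CAM summed over all positions.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """M_c(x, y) = sum_k w_k^c f_k(x, y).

    feature_maps:  (n, H, W) maps from the last convolutional layer
    class_weights: (n,) output-layer weights w_k^c for a single class c
    """
    # Contract the channel axis k; the result has shape (H, W).
    return np.tensordot(class_weights, feature_maps, axes=1)

# Hypothetical toy input: n = 2 feature maps of size 2x2.
f = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 2.0], [0.0, 0.0]]])
w_c = np.array([1.0, 0.5])

M_c = class_activation_map(f, w_c)
S_c = w_c @ f.sum(axis=(1, 2))   # class score from the pooled features
```

Up-sampling to the input resolution is left out here; any standard image resizing routine could be applied to `M_c` afterwards.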
CAM trades off model complexity and performance (by using global average pooling (GAP)) for more transparency.
Limitation
• To apply CAM, a CNN-based network must have a GAP layer directly before the output layer.
• I.e., a network without this structure needs architectural changes and hence re-training.
Gradient-weighted Class Activation Mapping (Grad-CAM) generalizes CAM to a wide variety of CNN-based architectures
• i.e., without requiring architectural changes or re-training
Characteristics
• Without a GAP layer, we need another way to define the weights $w_k^c$.
→ Grad-CAM uses the gradients of any target concept $c$ (e.g., ‘dog’ in a classification network) flowing into the final convolutional layer, and derives summary statistics from them to represent the weights (importance).
Source: Selvaraju, Ramprasaath R., et al. "Grad-cam: Visual explanations from deep
networks via gradient-based localization." Proceedings of the IEEE international conference
on computer vision. 2017.
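Procedure
• For a given class $c$, compute the gradient of its score $y^c$ (before the softmax) w.r.t. the feature map activations $A^k \in \mathbb{R}^{u \times v}$, $k = 1, \dots, n$, of a convolutional layer, i.e.
$$\frac{\partial y^c}{\partial A^k} \in \mathbb{R}^{u \times v}$$
Each entry $\partial y^c / \partial A_{ij}^k$ is the influence of $A^k(x, y)$ at position $(i, j)$ on $y^c$.
• Define the importance weight of feature map $k$ via GAP of these gradients:
$$\alpha_k^c = \frac{1}{Z} \sum_{i=1}^{u} \sum_{j=1}^{v} \frac{\partial y^c}{\partial A_{ij}^k}$$
where $Z = u \cdot v$ is the number of spatial positions.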
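The two steps above can be tried on a toy example. The sketch below is only illustrative: it estimates the gradients by finite differences so that it needs nothing beyond NumPy, whereas in practice they would come from a framework's automatic differentiation. The helper names and the toy head (sum-pooling followed by one linear unit with weights `w_c`) are assumptions.

```python
import numpy as np

def numerical_gradient(score_fn, A, eps=1e-5):
    """Central finite-difference estimate of d(score)/dA, same shape as A."""
    grad = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        step = np.zeros_like(A)
        step[idx] = eps
        grad[idx] = (score_fn(A + step) - score_fn(A - step)) / (2 * eps)
    return grad

def gradcam_weights(grads):
    """alpha_k^c = (1/Z) * sum_i sum_j d y^c / d A^k_ij, with Z = u * v."""
    return grads.mean(axis=(1, 2))

# Hypothetical toy head (an assumption): the CAM architecture itself,
# y^c = sum_k w_k^c * sum_ij A^k_ij.
w_c = np.array([0.5, -1.0, 2.0])

def score(A):
    return float(w_c @ A.sum(axis=(1, 2)))

rng = np.random.default_rng(0)
A = rng.random((3, 4, 4))   # n = 3 feature maps of size 4x4
alpha = gradcam_weights(numerical_gradient(score, A))
```

For this CAM-style head the recovered `alpha` matches `w_c` up to numerical error, which is the sense in which Grad-CAM generalizes CAM.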
Procedure (continued)
→ Compute Grad-CAM:
$$L_{\text{Grad-CAM}}^c(x, y) = \mathrm{ReLU}\left(\sum_k \alpha_k^c A^k(x, y)\right), \qquad L_{\text{Grad-CAM}}^c \in \mathbb{R}^{u \times v}$$
• ReLU is applied because we are only interested in the features (neurons) that have a positive influence on the class of interest
• i.e., pixels whose intensity should be increased in order to increase $y^c$
Note: if the shape (u, v) of $L_{\text{Grad-CAM}}^c$ differs from that of the input image, up-sampling is needed to match the shapes.
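A minimal NumPy sketch of the Grad-CAM map follows, assuming the gradients $\partial y^c / \partial A^k$ have already been computed (for example by a deep learning framework). The function name `grad_cam` and the toy arrays are illustrative choices, not part of the original slides.

```python
import numpy as np

def grad_cam(activations, grads):
    """L^c = ReLU(sum_k alpha_k^c A^k), alpha_k^c being the spatial mean of grads.

    activations: (n, u, v) feature maps A^k
    grads:       (n, u, v) gradients d y^c / d A^k (assumed precomputed)
    """
    alpha = grads.mean(axis=(1, 2))                       # importance weights, shape (n,)
    weighted = np.tensordot(alpha, activations, axes=1)   # weighted sum, shape (u, v)
    return np.maximum(weighted, 0.0)                      # ReLU keeps positive influence only

# Hypothetical toy input: two 2x2 feature maps with constant gradients,
# so alpha = [1, -1] and the weighted sum is A[0] - A[1].
A = np.array([[[1.0, 2.0], [3.0, 4.0]],
              [[4.0, 3.0], [2.0, 1.0]]])
dy_dA = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
heatmap = grad_cam(A, dy_dA)
```

The top row of the weighted sum is negative and is clipped to zero by the ReLU, so only the bottom row survives in `heatmap`.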
Grad-CAM generates visual explanations for a wide
variety of CNN-based networks without requiring
architectural changes or re-training.
Grad-CAM can help identify biases in a dataset
• Models trained on biased datasets may not generalize to real-world scenarios, or worse, may perpetuate biases and stereotypes (w.r.t. gender, race, age, etc.)
• E.g., for a “doctor” vs. “nurse” binary classification task:
• The biased model had learned to look at the person’s face and hairstyle to distinguish nurses from doctors, thus learning a gender stereotype.
• The unbiased model made the right prediction by looking at the white coat and the stethoscope.
Limitation
• The localization map (heatmap) generated by Grad-CAM (and also by CAM) is coarse (low-resolution)
→ it does not show clearly enough why the network predicts a particular instance (e.g., “tiger cat”)
→ Guided Backpropagation is another approach, which provides a high-resolution map
• i.e., fine-grained detail, or pixel-space gradient visualizations
Guided Backpropagation visualizes the gradients of the network’s prediction (i.e., an output neuron) w.r.t. the input image
• This determines which pixels need to be changed the least to affect the prediction the most (i.e., those with higher absolute gradients)
Negative gradients are suppressed when backpropagating through ReLU layers
• because we are only interested in the pixels that increase the activation of the output neuron, rather than those that suppress it
Guided Backpropagation is high-resolution
• since it derives gradients directly w.r.t. the input image instead of w.r.t. the last convolutional layer (as Grad-CAM does)
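The suppression of negative gradients can be sketched as a modified backward rule for a single ReLU. This is an illustrative comparison under stated assumptions: the function names and the toy vectors are made up for the example, and a full implementation would apply the rule at every ReLU in the network.

```python
import numpy as np

def relu_backward(x, grad_out):
    """Plain backpropagation: pass gradients where the forward input was positive."""
    return np.where(x > 0, grad_out, 0.0)

def guided_relu_backward(x, grad_out):
    """Guided backpropagation: additionally drop negative incoming gradients."""
    return np.where((x > 0) & (grad_out > 0), grad_out, 0.0)

# Hypothetical toy values.
x = np.array([1.0, -1.0, 2.0, 3.0])          # forward inputs to the ReLU
grad_out = np.array([0.5, 0.7, -0.2, 1.5])   # gradients arriving from the layer above

plain = relu_backward(x, grad_out)
guided = guided_relu_backward(x, grad_out)
```

The third entry shows the difference: plain backpropagation lets the negative gradient through because the forward input was positive, while the guided rule zeroes it.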
Limitation
• Not class-discriminative
→ Guided Grad-CAM combines Guided Backpropagation and Grad-CAM, and thus becomes class-discriminative
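Characteristics
• Guided Grad-CAM is both high-resolution and class-discriminative
Procedure
• Fuse Guided Backpropagation with Grad-CAM via element-wise multiplication (after up-sampling the Grad-CAM map to the input resolution) to create Guided Grad-CAM visualizations:
Guided Backpropagation × Grad-CAM = Guided Grad-CAM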
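The fusion step can be sketched as an up-sampling followed by an element-wise product. The sketch below assumes a single-channel guided-backpropagation map whose height and width are the same integer multiple of the Grad-CAM map's, and it uses nearest-neighbour up-sampling to stay dependency-free (the Grad-CAM paper up-samples with bilinear interpolation). The function names and toy arrays are illustrative.

```python
import numpy as np

def upsample_nearest(heatmap, factor):
    """Nearest-neighbour up-sampling of a (u, v) map by an integer factor."""
    return np.repeat(np.repeat(heatmap, factor, axis=0), factor, axis=1)

def guided_grad_cam(guided_backprop, grad_cam_map):
    """Element-wise product of an (H, W) guided-backprop map and the up-sampled Grad-CAM map.

    Assumes H and W are the same integer multiple of u and v.
    """
    factor = guided_backprop.shape[0] // grad_cam_map.shape[0]
    return guided_backprop * upsample_nearest(grad_cam_map, factor)

# Hypothetical toy input: a 4x4 guided-backprop map and a 2x2 Grad-CAM map.
gbp = np.arange(16, dtype=float).reshape(4, 4)
cam = np.array([[0.0, 1.0],
                [0.5, 0.0]])
fused = guided_grad_cam(gbp, cam)
```

Regions where the Grad-CAM map is zero are blanked out in `fused`, which is how the coarse class-specific map gates the fine-grained gradient detail.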
Guided Grad-CAM also helps untrained users successfully discern a ‘stronger’ network from a ‘weaker’ one, even when both make identical predictions.
[Figure: Guided Grad-CAM visualizations for two models (A vs. B) with the same prediction accuracy, a stronger network and a weaker network]
Note: the method is called ‘guided’ backpropagation because it adds an additional guidance signal from the higher layers to the usual backpropagation.
This prevents the backward flow of negative gradients, which correspond to the neurons that decrease the activation of the higher-layer unit we aim to visualize.

Introduction to Grad-CAM (short version)

  • 1. Advisor: Henry Horng-Shing Lu Students: Jane Hsing-Chuan Hsieh Date: 2021-08-10
  • 2. Concern: model transparency and interpretability. Despite unprecedented breakthroughs by CNNs in a variety of computer vision tasks, their lack of decomposability into individually intuitive components makes them hard to interpret. Purpose: (1) Visualizing CNNs: explain CNN predictions by highlighting ‘important’ pixels, i.e., pixels whose change in intensity has the most impact on the prediction score. (2) Helping users build trust in AI: we must build ‘transparent’ models that can explain why they predict what they predict.
  • 3. What makes a good visual explanation? (1) Class-discriminative: localizes the category in the image, e.g., Class Activation Mapping (CAM) and Gradient-weighted Class Activation Mapping (Grad-CAM). (2) High-resolution: captures fine-grained detail (pixel-space gradient visualizations), e.g., Guided Backpropagation and Deconvolution; these methods are not class-discriminative. (3) Both: Guided Grad-CAM.
  • 4. 1. Brief introduction to CNN visualization tools: CAM; Grad-CAM; Guided Backpropagation; Guided Grad-CAM
  • 5. CAM; Grad-CAM; Guided Backpropagation; Guided Grad-CAM
  • 6. Convolutional layers of CNNs actually behave as object detectors (i.e., they localize objects), even though no supervision on the location of the object is provided. In other words, convolutional layers naturally retain spatial information. E.g., for action classification, a CNN is able to localize the discriminative regions as the objects that the humans are interacting with, rather than the humans themselves.
  • 7. However, this ability (spatial information / object detection) is lost in fully-connected layers. The higher a convolutional layer is, the higher the level of semantics it extracts, so we expect the last convolutional layer to offer the best compromise between high-level semantics and detailed spatial information.
  • 8. For a particular category ($c$), a Class Activation Map (CAM) indicates the discriminative image regions used by the CNN to identify that category. Characteristics: the fully-connected layers are replaced with a global average pooling (GAP) layer, which (1) minimizes the number of parameters while maintaining high performance and (2) acts as a structural regularizer, preventing overfitting during training.
  • 9. CNN architecture: (1) For each feature map $f_k(x, y)$, $k = 1, \dots, n$, at the last convolutional layer, GAP outputs its spatial average, $F_k = \sum_{x,y} f_k(x, y)$ (written here without the constant normalization factor). (2) For a given class $c$, the input to the output layer is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ is the importance of $F_k$ for class $c$. (3) The output score for class $c$ is $P_c = \exp(S_c) / \sum_{c'} \exp(S_{c'})$ (e.g., softmax).
  • 10. CAM procedure: the weights $(w_1^c, w_2^c, \dots, w_n^c)$ of the output layer indicate the importance of the pooled features ($F_k$) to a specific class ($c$). Compute the CAM as $M_c(x, y) = \sum_k w_k^c f_k(x, y)$. Note: if the shape $(H, W)$ of the CAM ($M_c$) differs from that of the input image, up-sampling is needed to match the shapes.
  • 11. CAM trades off model complexity and performance (by using global average pooling (GAP)) for more transparency. Limitation: to apply CAM, a CNN-based network must have a GAP layer directly before the output layer, i.e., architectural changes and hence re-training are needed.
  • 12. Gradient-weighted Class Activation Mapping (Grad-CAM) generalizes CAM to a wide variety of CNN-based architectures, i.e., without requiring architectural changes or re-training. Characteristics: without a GAP layer, we need another way to define the weights $w_k^c$. Grad-CAM uses the gradients of any target concept ($c$) (e.g., ‘dog’ in a classification network) flowing into the final convolutional layer and derives summary statistics from them to represent the weights (importance). Source: Selvaraju, Ramprasaath R., et al. "Grad-cam: Visual explanations from deep networks via gradient-based localization." Proceedings of the IEEE international conference on computer vision. 2017.
  • 13. Procedure: for a given class $c$, compute the gradient of its score $y^c$ (before the softmax) w.r.t. the feature map activations $A^k \in \mathbb{R}^{u \times v}$, $k = 1, \dots, n$, of a convolutional layer, i.e., $\partial y^c / \partial A^k \in \mathbb{R}^{u \times v}$, which measures the influence of $A^k(x, y)$ on $y^c$. Define the importance weight of feature map $k$ via GAP of these gradients: $\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}}$, where $i, j$ index the spatial positions $(x, y)$ and $Z = u \cdot v$ is the number of positions.
  • 14. Procedure (continued): compute Grad-CAM as $L^c_{\text{Grad-CAM}}(x, y) = \mathrm{ReLU}\left(\sum_k \alpha_k^c A^k(x, y)\right) \in \mathbb{R}^{u \times v}$. ReLU is applied because we are only interested in the features (neurons) that have a positive influence on the class of interest, i.e., pixels whose intensity should be increased in order to increase $y^c$. Note: if the shape $(u, v)$ of $L^c_{\text{Grad-CAM}}$ differs from that of the input image, up-sampling is needed to match the shapes.
  • 15. Grad-CAM generates visual explanations for a wide variety of CNN-based networks without requiring architectural changes or re-training.
  • 16. Grad-CAM can help identify biases in a dataset. Models trained on biased datasets may not generalize to real-world scenarios or, worse, may perpetuate biases and stereotypes (w.r.t. gender, race, age, etc.). E.g., in a “doctor” vs. “nurse” binary classification task, the biased model had learned to look at the person’s face and hairstyle to distinguish nurses from doctors, thus learning a gender stereotype, whereas the unbiased model made the right prediction by looking at the white coat and the stethoscope.
  • 17. Limitation: the localization map (heatmap) generated by Grad-CAM (and by CAM) is coarse (low-resolution), so it does not make clear why the network predicts a particular instance (e.g., “tiger cat”). Guided Backpropagation is another approach that provides a high-resolution map, i.e., fine-grained detail, or pixel-space gradient visualizations.
  • 18. Guided Backpropagation visualizes the gradients of the network’s prediction (i.e., an output neuron) w.r.t. the input image. This determines which pixels need to be changed the least to affect the prediction the most (i.e., those with higher absolute gradients). Negative gradients are suppressed at each ReLU when backpropagating, because we are only interested in the pixels that increase the activation of the output neuron, rather than those that suppress it.
  • 19. Guided Backpropagation is high-resolution, since it derives gradients directly w.r.t. the input image instead of w.r.t. the last convolutional layer (as Grad-CAM does). Limitation: it is not class-discriminative. Guided Grad-CAM combines Guided Backpropagation and Grad-CAM, and thus becomes class-discriminative.
  • 20. Characteristics: Guided Grad-CAM is both high-resolution and class-discriminative. Procedure: fuse Guided Backpropagation with Grad-CAM by element-wise multiplication (Guided Backpropagation × Grad-CAM = Guided Grad-CAM), with the Grad-CAM map first up-sampled to the input image resolution.
  • 21. Guided Grad-CAM also helps untrained users successfully discern a ‘stronger’ network from a ‘weaker’ one, even when both make identical predictions. (Figure: two models, A vs. B, with the same prediction accuracy; one is the stronger network, the other the weaker network.)
  • 24. The method is called guided backpropagation because it adds an additional guidance signal from the higher layers to the usual backpropagation. This prevents the backward flow of negative gradients, which correspond to the neurons that decrease the activation of the higher-layer unit we aim to visualize.
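
The Grad-CAM procedure (slides 13 and 14), the guided ReLU rule (slides 18 and 24) and the fusion step (slide 20) can be sketched in NumPy as follows. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the activations and gradients are assumed to have been extracted from a network already and are passed in as plain arrays of shape (n, u, v), and nearest-neighbour up-sampling by an integer factor is used as a simplification of the up-sampling step.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM map from feature maps A^k and gradients dy^c/dA^k.

    Both inputs are assumed to have shape (n, u, v): n feature maps of size u x v.
    """
    # alpha_k^c: global average pooling of the gradients over all positions.
    alphas = gradients.mean(axis=(1, 2))                    # shape (n,)
    # Weighted sum over k of alpha_k^c * A^k, followed by ReLU.
    weighted = np.tensordot(alphas, activations, axes=1)    # shape (u, v)
    return np.maximum(weighted, 0.0)


def upsample_nearest(cam: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour up-sampling by an integer factor (a simplification)."""
    return np.kron(cam, np.ones((factor, factor)))


def guided_relu_backward(upstream_grad: np.ndarray,
                         forward_input: np.ndarray) -> np.ndarray:
    """Guided backpropagation rule at one ReLU.

    The gradient passes only where the forward input was positive (ordinary
    ReLU backward pass) AND the incoming gradient is positive (the extra
    'guidance' that suppresses negative gradients).
    """
    return upstream_grad * (forward_input > 0) * (upstream_grad > 0)


def guided_grad_cam(guided_backprop: np.ndarray, cam: np.ndarray,
                    factor: int) -> np.ndarray:
    """Element-wise product of a guided-backprop map and the up-sampled CAM."""
    return guided_backprop * upsample_nearest(cam, factor)


# Toy example: n = 2 feature maps of size 2 x 2 (made-up numbers).
A = np.array([[[1.0, 2.0], [3.0, 4.0]],
              [[1.0, 0.0], [0.0, 1.0]]])
dY_dA = np.array([[[1.0, 1.0], [1.0, 1.0]],
                  [[-4.0, 0.0], [0.0, -4.0]]])
cam = grad_cam(A, dY_dA)
```

In the toy example the second feature map receives a negative weight, so positions where it dominates are zeroed by the ReLU, which matches the "positive influence only" argument on slide 14.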

Editor's Notes

  1. Just before the final output layer (softmax in the case of categorization), we perform global average pooling on the convolutional feature maps and use those as features for a fully-connected layer that produces the desired output (categorical or otherwise). Given this simple connectivity structure, we can identify the importance of the image regions by projecting back the weights of the output layer onto the convolutional feature maps, a technique we call class activation mapping. Feature map: $f_k$ is the map of the presence of this visual pattern. CAM: $M_c(x, y)$ directly indicates the importance of the activation at spatial grid $(x, y)$ leading to the classification of an image to class $c$.
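
The "projecting back" step described in the note can be sketched in NumPy as below. This is a minimal sketch under stated assumptions: the function names are hypothetical, the feature maps and output-layer weights are made-up toy values, and $F_k$ is taken as the unnormalized spatial sum, as in the formula on slide 9. Under that assumption the class score equals the sum of the CAM over all positions, which is why $M_c(x, y)$ can be read as the per-location contribution to $S_c$.

```python
import numpy as np


def class_activation_map(feature_maps: np.ndarray,
                         class_weights: np.ndarray) -> np.ndarray:
    """M_c(x, y) = sum_k w_k^c f_k(x, y).

    feature_maps: shape (n, H, W), the last-layer feature maps f_k.
    class_weights: shape (n,), the output-layer weights w_k^c for one class c.
    """
    return np.tensordot(class_weights, feature_maps, axes=1)   # shape (H, W)


def class_score(feature_maps: np.ndarray, class_weights: np.ndarray) -> float:
    """S_c = sum_k w_k^c F_k, with F_k the (unnormalized) spatial sum of f_k."""
    pooled = feature_maps.sum(axis=(1, 2))                     # F_k, shape (n,)
    return float(class_weights @ pooled)


# Toy values (assumed for illustration): n = 2 feature maps of size 2 x 2.
f = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 2.0], [2.0, 0.0]]])
w_c = np.array([0.5, -1.0])

M_c = class_activation_map(f, w_c)
S_c = class_score(f, w_c)
```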