- The document proposes a novel method to compute relevancy for Transformer networks by propagating relevance through attention maps in a layer-wise manner, rather than averaging across layers or relying on simplistic assumptions.
- It introduces a non-parametric relevance propagation method based on Taylor decomposition to backpropagate relevance through attention, skip connections, and other operations in each layer separately.
- Experiments show that the proposed layer-wise relevance propagation method outperforms existing methods at identifying relevant regions in images and is more robust to input perturbations.
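To illustrate the general idea of layer-wise relevance propagation, here is a minimal sketch of the classic epsilon-stabilized LRP rule for a single linear layer, which distributes output relevance back to the inputs in proportion to their Taylor-style contributions. This is a generic textbook rule, not the paper's exact propagation scheme for attention and skip connections; the function name `lrp_linear` and all values are illustrative.

```python
import numpy as np

def lrp_linear(x, w, relevance_out, eps=1e-9):
    """Redistribute relevance through y = x @ w with the epsilon-rule.

    Each input x_i receives relevance proportional to its contribution
    z_ij = x_i * w_ij to each output y_j (a first-order Taylor view).
    The eps term stabilizes division when y_j is close to zero.
    """
    z = x[:, None] * w                          # contributions z_ij, shape (in, out)
    y = z.sum(axis=0)                           # pre-activations y_j
    s = relevance_out / (y + eps * np.sign(y))  # stabilized per-output ratio
    return z @ s                                # relevance assigned to each x_i
```

A key property of such rules is (approximate) conservation: the total relevance flowing into a layer equals the total flowing out, so relevance is redistributed rather than created or destroyed as it moves toward the input.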