Squeeze Excitation Networks, The simple idea that won the final ImageNet Challenge.

SENet
Squeeze-and-Excitation Networks

Main Idea
 Current methods give equal weighting to all features in a
Convolutional Neural Network.
 Not all features are equally important in an image or feature
representation.
 Use channel-wise attention in Convolutional Neural Networks

What is Attention? : Explanation
 Attention is a mechanism originally proposed for
Natural Language Processing (NLP) with Recurrent Neural
Networks (RNNs).
 It gives weightings to outputs so that the next layer can “pay
attention” only to more important features.
 The network itself learns how to weight the attention values.

Attention in CNNs
 This is not the first time that attention has been proposed for CNNs.
 Other models also apply spatial attention, weighting locations within the
feature volume in addition to the features (channels) themselves.
 SENet only uses channel-wise attention for computational efficiency.
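One way to picture the difference between channel-wise and spatial attention is by the shape of the weights each one applies. The sketch below is only an illustration: the tensor sizes are arbitrary, and the constant 0.5 weights are placeholders standing in for values a network would learn.

```python
import numpy as np

# Assumed example feature map: C channels, each of size H x W.
C, H, W = 64, 56, 56
x = np.ones((C, H, W))

# Channel-wise attention: one weight per channel, shared by all locations.
channel_weights = np.full((C, 1, 1), 0.5)
# Spatial attention: one weight per location, shared by all channels.
spatial_weights = np.full((1, H, W), 0.5)

# Broadcasting applies each set of weights across the remaining axes.
y_channel = x * channel_weights
y_spatial = x * spatial_weights

# Number of attention values needed per feature map in each case.
print(channel_weights.size, spatial_weights.size)
```

In this example channel-wise attention needs C values per feature map, while spatial attention needs H x W.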

Squeeze-and-Excitation Blocks
 The fundamental unit of SENet.
 Very lightweight but very
effective on a wide variety of
tasks.
 Applicable to any convolutional
network.

SE Block Structure
 The SE Block consists of the
following.
1. Global Average Pooling Layer
2. FC layer with C/r neurons
3. ReLU activation
4. FC layer with C neurons
5. Sigmoid activation
6. Scaling (channel-wise multiply)
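The six steps above can be sketched as a single forward pass. This is a minimal NumPy illustration for one feature map, not the authors' implementation: the function name `se_block`, the random weights, and the sizes C = 32 and r = 16 are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w1, b1, w2, b2):
    """Apply an SE block to a feature map x of shape (C, H, W)."""
    # 1. Global average pooling: one scalar per channel, shape (C,).
    s = x.mean(axis=(1, 2))
    # 2-3. FC layer with C/r neurons followed by ReLU, shape (C // r,).
    h = np.maximum(w1 @ s + b1, 0.0)
    # 4-5. FC layer with C neurons followed by sigmoid, shape (C,).
    gates = sigmoid(w2 @ h + b2)
    # 6. Scaling: multiply every channel by its own gate value.
    return x * gates[:, None, None], gates

# Hypothetical sizes and randomly initialised weights for illustration.
rng = np.random.default_rng(0)
C, H, W, r = 32, 8, 8, 16
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)

y, gates = se_block(x, w1, b1, w2, b2)
print(y.shape, gates.shape)
```

The output has the same shape as the input, which is why the block can be dropped in after an existing convolutional layer.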

Explanation: Squeeze
 Recall that the first FC layer has C/r neurons, where ‘C’ is the number of channels.
 The reduction ratio ‘r’ is a hyperparameter that condenses the channel information
into a smaller representation, which also reduces computation time.
 The authors found that r=16 was a good balance between performance and
computational efficiency.
 Global Average Pooling was used for simplicity. The authors acknowledge that
more sophisticated channel feature extraction methods may work better.
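The effect of ‘r’ on the size of the two FC layers can be checked with a little arithmetic. The sketch below assumes that biases are ignored and that C is divisible by r; the function name `se_fc_weights` and the choice C = 256 are hypothetical.

```python
def se_fc_weights(channels, r=16):
    """Weights in the two FC layers of one SE block (biases ignored)."""
    hidden = channels // r
    # First FC: C -> C/r, second FC: C/r -> C.
    return channels * hidden + hidden * channels

C = 256
for r in (4, 8, 16, 32):
    print(f"r={r:2d}: {se_fc_weights(C, r)} weights")

# For comparison, a single FC layer mapping C -> C without a bottleneck.
print("no bottleneck:", C * C)
```

Under these assumptions, doubling r halves the number of FC weights in the block.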

Explanation: Excite
 The second FC layer has ‘C’ neurons, one for each channel.
 These have sigmoid activations to determine how much attention
each channel (and hence each feature) should receive.
 In the ‘Scale’ layer, each sigmoid output is multiplied with the
corresponding channel of the original layer.
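A small worked example may help make the sigmoid gating concrete. The three pre-sigmoid values below are made up for illustration, and the feature map is filled with ones so that the effect of each gate is easy to read off.

```python
import numpy as np

# Hypothetical outputs of the second FC layer for C = 3 channels.
logits = np.array([-4.0, 0.0, 4.0])
# Sigmoid maps each value independently into (0, 1).
gates = 1.0 / (1.0 + np.exp(-logits))  # roughly [0.018, 0.5, 0.982]

# Toy feature map of shape (C, H, W) filled with ones.
x = np.ones((3, 2, 2))
# 'Scale' step: broadcast each gate over its channel's H x W values.
scaled = x * gates[:, None, None]

print(np.round(gates, 3))
print(np.round(scaled[:, 0, 0], 3))
```

Here the first channel is almost suppressed, the second is halved, and the third passes through nearly unchanged.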

Comparison with original models

Explanation
 The benefits of SE Blocks persist even in very deep architectures,
where the gains from added depth saturate or training becomes impractical.
 SE Blocks are effective for a wide variety of CNN architectures, not
only residual or modular architectures.
 The SE Block also increases model depth significantly, but at very little
computational cost.

Comparisons on other tasks and environments

Effects of attention at different depths
Near the end, at SE_5_2, the activations
show almost binary behavior, while
SE_5_3 shows only a scaling behavior that
the classifier itself could apply.
This suggests that the attention values
at the very last block are not very useful,
even though it is the most computationally
intensive part due to the large number
of channels at the end.
Indeed, the authors found that
there is almost no performance drop if
the SE Block is removed from the last
residual block of ResNet-50.
The activations for features behave differently at different depths.
The activations for different classes are similar at lower layers.
Activations are highly class-dependent at higher layers.
We can conclude that lower-level features are general while higher-level features
are class-specific.
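The remark about the large number of channels at the end can be illustrated by counting SE weights per stage. This is a rough sketch under stated assumptions: it uses the standard ResNet-50 layout (four residual stages with 256, 512, 1024 and 2048 output channels and 3, 4, 6 and 3 blocks), r = 16, and counts FC weights only, ignoring biases. It measures parameters, which is only a proxy for the cost of the FC layers.

```python
# (output channels, number of residual blocks) per ResNet-50 stage.
STAGES = [(256, 3), (512, 4), (1024, 6), (2048, 3)]
R = 16  # reduction ratio

def se_weights(channels, r):
    """FC weights in one SE block: C -> C/r and C/r -> C, no biases."""
    return 2 * channels * (channels // r)

per_stage = [n * se_weights(c, R) for c, n in STAGES]
total = sum(per_stage)

for (c, n), p in zip(STAGES, per_stage):
    print(f"C={c:4d} blocks={n}  SE weights={p:8d}  share={p / total:.1%}")
print("total SE weights:", total)
```

Under these assumptions, the last stage holds well over half of all SE weights.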

Conclusion
 The Squeeze-and-Excitation Network is a general framework which can be
applied to any existing convolutional network.
 It improves performance by applying attention to channel features,
allowing neural networks to focus on only the features that are most
relevant, instead of being swamped by irrelevant ones.
 It is effective at a variety of tasks, including image classification, scene
classification, and object detection.

More from Joonhyung Lee

More from Joonhyung Lee (10)

Recently uploaded

Recently uploaded (20)

Squeeze Excitation Networks, The simple idea that won the final ImageNet Challnege.