Squeeze and Excitation Networks
REVIEWER. JUN YEONG PARK
Introduction
▪ Goal: improve the quality of a network's representations by
explicitly modelling the interdependencies between the
channels of its convolutional features.
▪ The network learns to use global information to selectively
emphasize informative features and suppress less useful ones.
Related Works
▪ VGGNets
▪ Inception
▪ ResNet
▪ DenseNet
Squeeze and Excitation Block (SE block)
▪ $F_{tr}$ : maps $X \in \mathbb{R}^{H' \times W' \times C'}$ to $U \in \mathbb{R}^{H \times W \times C}$
▪ $F_{sq}$ : squeezes global spatial information into a channel
descriptor
▪ $F_{ex}$ : fully captures channel-wise dependencies
▪ $F_{scale}$ : channel-wise multiplication
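The final step, the channel-wise multiplication, can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the channel-first nested-list layout, the function name `scale`, and the gate values are assumptions made for the example.

```python
def scale(u, s):
    """Multiply every value in channel c of `u` by the scalar gate s[c].

    `u` is a list of C channels, each an H x W nested list
    (channel-first layout, an assumption of this sketch).
    """
    return [[[s_c * value for value in row] for row in channel]
            for channel, s_c in zip(u, s)]


# Toy feature maps: C = 2 channels, H = W = 2.
u = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
s = [0.5, 0.25]  # hypothetical gate values, one per channel
x_tilde = scale(u, s)
```

Each channel keeps its spatial pattern; only its overall magnitude changes, which is how a channel is emphasized or suppressed.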
Squeeze
▪ Squeeze global spatial information into a channel descriptor
▪ Uses Global Average Pooling (GAP)
▪ $z_c = F_{sq}(\mathbf{u}_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)$
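The squeeze step is a per-channel mean over all spatial positions. A minimal sketch in plain Python, assuming a channel-first nested-list layout (the function name `squeeze` and the toy values are illustrative only):

```python
def squeeze(u):
    """Global average pooling: reduce each H x W channel to one scalar.

    `u` is a list of C channels, each an H x W nested list
    (channel-first layout, an assumption of this sketch).
    """
    z = []
    for channel in u:
        h = len(channel)
        w = len(channel[0])
        total = sum(sum(row) for row in channel)
        z.append(total / (h * w))
    return z


# Toy feature maps: C = 2 channels, H = W = 2.
u = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
z = squeeze(u)
print(z)  # [2.5, 2.0]
```

The output has one entry per channel, so the descriptor's length is C regardless of the spatial size H x W.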
Excitation
▪ Fully capture channel-wise dependencies.
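▪ $F_{ex}(\mathbf{z}, \mathbf{W}) = \sigma(g(\mathbf{z}, \mathbf{W})) = \sigma(\mathbf{W}_2 \, \delta(\mathbf{W}_1 \mathbf{z}))$, where $\delta$ is ReLU and $\sigma$ is the sigmoid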
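▪ Why sigmoid?
- it is a non-linear function
- it gates each channel independently, so multiple channels can be
emphasized at once (the gates are not mutually exclusive, unlike a
one-hot or softmax-style activation)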
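The excitation step can be sketched as two small matrix-vector products with a ReLU in between and a sigmoid at the end. This is a hedged illustration, not the paper's code: the weights below are hypothetical toy values, biases are omitted, and the shapes assume W1 is (C/r) x C and W2 is C x (C/r) for a reduction ratio r.

```python
import math


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def excite(z, w1, w2):
    """s = sigmoid(W2 @ relu(W1 @ z)), one gate per channel."""
    return sigmoid(matvec(w2, relu(matvec(w1, z))))


# Hypothetical toy setup: C = 4 channels, reduction ratio r = 2.
z = [2.5, 2.0, 0.5, 1.0]
w1 = [[0.5, 0.5, 0.0, 0.0],
      [0.0, 0.0, -1.0, -1.0]]
w2 = [[2.0, 0.0],
      [2.0, 0.0],
      [0.0, 1.0],
      [-1.0, 0.0]]
s = excite(z, w1, w2)
```

With these toy weights the first two gates both come out close to 1 while the last is close to 0, so the gates do not compete for a fixed budget the way softmax outputs would.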
Instantiations
▪ In networks without skip connections (e.g. VGGNet), the SE block is
inserted after the non-linearity following each convolution.
▪ The SE block can also be integrated into more complex
architectures (e.g. ResNet, Inception).
▪ Inception
▪ ResNet
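For the ResNet case, the SE gates are applied to the output of the residual (non-identity) branch before it is summed with the identity branch. A minimal sketch under stated assumptions: the gates are taken as already computed, the layout is channel-first nested lists, the final ReLU of the residual unit is omitted, and the function name `se_residual` and all values are hypothetical.

```python
def se_residual(x, branch, gates):
    """Return x + gates[c] * branch[c], element-wise within each channel.

    `x` is the identity input, `branch` is the residual-branch output,
    and `gates` holds one SE gate per channel.
    """
    out = []
    for x_c, b_c, s_c in zip(x, branch, gates):
        out.append([[xv + s_c * bv for xv, bv in zip(x_row, b_row)]
                    for x_row, b_row in zip(x_c, b_c)])
    return out


# Toy tensors: C = 2 channels, H = 1, W = 2.
x = [[[1.0, 1.0]], [[2.0, 2.0]]]
branch = [[[4.0, 8.0]], [[4.0, 8.0]]]  # hypothetical residual-branch output
gates = [0.5, 0.25]                    # hypothetical SE gate values
y = se_residual(x, branch, gates)
print(y)  # [[[3.0, 5.0]], [[3.0, 4.0]]]
```

The identity path is left untouched; only the residual branch is rescaled.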
Computational Complexity
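▪ Additional parameters $= \frac{2}{r} \sum_{s=1}^{S} N_s \cdot C_s^2$, where $r$ is the reduction ratio, $S$ is the number of stages, $C_s$ is the number of output channels in stage $s$, and $N_s$ is the number of repeated blocks in stage $s$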
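As a worked example of the parameter formula, the sketch below evaluates it for a ResNet-50-style layout with r = 16. The stage configuration (3, 4, 6, 3 blocks with 256, 512, 1024, 2048 output channels) is the commonly used ResNet-50 layout and is stated here as an assumption; bias terms are not counted, and the function name is illustrative.

```python
def se_extra_params(stages, r):
    """Extra weights added by SE blocks: (2 / r) * sum(N_s * C_s ** 2).

    `stages` is a list of (N_s, C_s) pairs; `r` is the reduction ratio.
    Each block adds two fully connected layers of C * (C / r) weights.
    """
    return 2 * sum(n * c * c for n, c in stages) // r


# Assumed ResNet-50-style stages: (blocks per stage, output channels).
resnet50_stages = [(3, 256), (4, 512), (6, 1024), (3, 2048)]
extra = se_extra_params(resnet50_stages, r=16)
print(extra)  # 2514944
```

That is roughly 2.5 million additional weights, and most of them come from the last stage, where the channel count is largest.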
Experiments - ImageNet
Experiments - Places365
Experiments - COCO
The role of excitation
