• The conventional network architecture, which is built by stacking
convolutional, normalization, and nonlinearity layers, is at best sub-
optimal, because their normalization layers tend to “wash away”
information in input semantic masks.
1. Replace the segmentation mask m with the image class label and
making the modulation parameters spatially-invariant——
Conditional Batch Normalization Layer
2. Replace the segmentation mask with another image, making the
modulation parameters spatially invariant and setting N = 1 ——
Why does SPADE work better?
• It can better preserve semantic information against common