Lab Literature Presentation 10/13: SalGAN

SalGAN: Visual Saliency Prediction with
Generative Adversarial Networks
arXiv:1701.01081v2 [cs.CV] 9 Jan 2017
Junting Pan, Elisa Sayrol and Xavier Giro-i-Nieto
Image Processing Group, Universitat Politecnica de Catalunya (UPC), Barcelona

abstract
- using BCE as the loss (instead of the commonly used MSE)
- adding an adversarial loss (treating the saliency predictor as the generator in a GAN)
- using downsampled predicted saliency maps

outline
- motivation
- architecture
- training generator/discriminator
- results
  - the impact of BCE
  - the impact of downsampling
  - adversarial gain
  - comparison with SOTA
  - qualitative results
- conclusion

motivation
- The diversity of metrics has also resulted in a diversity of loss functions
  - MIT300: 8 metrics
  - SALICON: 4 metrics (LSUN challenge) + Information Gain
- SalGAN benefits a wide range of metrics, without needing a tailored loss function.

generator
- Encoder
  - VGG16 (without the final pooling and FC layers)
  - pretrained on the ImageNet object classification task
  - only the last 2 layers are fine-tuned during saliency training (because of limited computational resources)
- Decoder
  - same structure as the encoder, in reverse order
  - pooling -> upsampling
  - output layer: 1x1 conv + pixel-wise sigmoid (not softmax)
  - weight init: random
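
The output layer above (1x1 conv + pixel-wise sigmoid) can be sketched in plain Python. This is only an illustration under assumptions: `output_layer`, the list-of-channel-vectors layout, and the weights are hypothetical and are not taken from the paper's code. It shows why sigmoid differs from softmax here: each pixel gets its own independent probability, so the map does not have to sum to 1.

```python
import math

def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def output_layer(features, weights, bias=0.0):
    """Hypothetical 1x1 conv + pixel-wise sigmoid.

    features: one channel vector per pixel (flattened image).
    weights:  one weight per channel, shared by every pixel (this is what
              makes it a 1x1 convolution).
    """
    return [
        sigmoid(sum(w * c for w, c in zip(weights, pixel)) + bias)
        for pixel in features
    ]

# Two pixels with two channels each (made-up values).
saliency = output_layer([[0.0, 0.0], [10.0, 10.0]], weights=[1.0, 1.0])
```

Unlike a softmax over all pixels, several pixels can be close to 1 at the same time.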

discriminator
- output: a probability indicating whether the given saliency map is generated or ground truth

training Generator
by keeping the discriminator weights constant

training Generator
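- loss = α × content loss + adversarial loss
- D: the probability of fooling the Discriminator
  ⇒ the more the generator fools the discriminator, the smaller the loss
- content loss: including it makes training more stable and convergence faster
- hyperparameter α: used 0.05
- ※ for the first 15 epochs, training uses the content loss only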
content loss
mean squared error (baseline)
binary cross entropy (our approach)
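
To make the two content losses concrete, here is a minimal sketch in plain Python, assuming saliency maps are flattened into lists of values in [0, 1]. The function names and example values are illustrative, not from the paper.

```python
import math

def mse_loss(pred, target):
    """Mean squared error over a flattened saliency map (baseline)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross entropy, treating each pixel as its own
    binary classification problem."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)

# Made-up 4-pixel example.
target = [0.0, 1.0, 1.0, 0.0]
good = [0.1, 0.9, 0.8, 0.2]
bad = [0.9, 0.1, 0.2, 0.8]
```

Both losses rank `good` above `bad`, but BCE penalizes confident wrong pixels much more steeply than MSE does.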

training Discriminator
using generated and ground truth samples
adversarial loss: the sign is flipped relative to the generator's loss, so the less the discriminator is fooled, the lower the loss.
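
A minimal sketch of a discriminator loss of this kind, assuming the standard GAN form: binary cross entropy with target 1 for ground-truth maps and target 0 for generated maps. The function name and arguments are hypothetical.

```python
import math

def discriminator_loss(d_prob_real, d_prob_fake, eps=1e-7):
    """Hypothetical discriminator loss for one real/fake pair.

    d_prob_real: D's probability that a ground-truth map is ground truth.
    d_prob_fake: D's probability that a generated map is ground truth.
    """
    real = min(max(d_prob_real, eps), 1.0 - eps)  # clamp to avoid log(0)
    fake = min(max(d_prob_fake, eps), 1.0 - eps)
    # Low when real maps score near 1 and generated maps score near 0,
    # i.e. when the discriminator is not fooled.
    return -(math.log(real) + math.log(1.0 - fake))
```

An undecided discriminator (0.5 for both inputs) gets a loss of 2·ln 2, and the loss falls as it separates real from generated maps.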

non-adversarial training
- changing from MSE to BCE brings an improvement in all metrics
- treating saliency prediction as multiple binary classification problems is more appropriate (than regression with MSE)

non-adversarial training
- Computing the content loss over downsampled saliency maps reduces the computational resources required and actually improves performance.
- the ¼ downsampled versions are used in the later experiments
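
One way to picture the downsampling step is the sketch below. It assumes average pooling and that "¼" means a factor of 4 along each side; the slide states neither. `downsample` is a hypothetical helper, and the content loss would then be computed on its output for both the prediction and the ground truth.

```python
def downsample(saliency, factor=4):
    """Average-pool a 2D map (list of rows) by `factor` per dimension.

    Rows and columns that do not fill a complete block are dropped.
    """
    h, w = len(saliency), len(saliency[0])
    out = []
    for i in range(0, h - h % factor, factor):
        row = []
        for j in range(0, w - w % factor, factor):
            block = [saliency[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

# An 8x8 map becomes 2x2, so the loss covers 16x fewer pixels.
small = downsample([[1.0] * 8 for _ in range(8)], factor=4)
```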

adversarial gain
- after 100 and 120 epochs, the combined GAN/BCE loss shows substantial improvements over BCE for five of six metrics
- SalGAN may fail to improve NSS because GAN training tends to produce a smoother, more spread-out estimate of saliency, which may increase the false positive rate. (NSS checks whether the model picks up spurious false positives.)

NSS Normalized Scanpath Saliency
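- NSS is very sensitive to false positives.
  - it gives low scores to saliency models that respond to irrelevant regions
- in image retrieval applications (feature selection using saliency), having many false negatives is the worse problem
  - NSS is not well suited to this
  - reason: a false negative means an important feature is being discarded
  - a model with some redundancy, like SalGAN, is a better fit
- NSS is differentiable, so it could be optimised directly when it is important for a particular application.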
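
The sensitivity to false positives can be seen in a small sketch of NSS, assuming the usual definition: normalize the saliency map to zero mean and unit standard deviation, then average the normalized values at the fixated pixels. The flattened-list layout, the population standard deviation, and the example maps are assumptions for illustration.

```python
import math

def nss(saliency, fixations):
    """Normalized Scanpath Saliency for a flattened saliency map.

    saliency:  list of saliency values, one per pixel.
    fixations: indices of the pixels that were actually fixated.
    """
    n = len(saliency)
    mean = sum(saliency) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in saliency) / n)
    if std == 0.0:
        return 0.0  # a constant map carries no information
    return sum((saliency[i] - mean) / std for i in fixations) / len(fixations)

# One fixation at pixel 3. The second map also responds to the
# neighbouring, non-fixated pixels (false positives).
sharp = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
spread = [0.0, 0.0, 0.8, 1.0, 0.8, 0.0, 0.0, 0.0]
```

Both maps peak at the fixated pixel, yet `nss(sharp, [3])` is higher than `nss(spread, [3])`, because the extra responses raise the mean and the spread of the map.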

comparison with SOTA
SalGAN improves or equals the performance of all other models in at least one metric.

qualitative results
1. a successful case: other models fail to detect saliency.
2. a failure case: fails to detect the white ball, like other models
3. limitation of the datasets
   a. ground truth: the sign → reading the text (takes more time)
   b. Existing metrics tend to be agnostic to the order in which areas are attended.

qualitative results
- BCE alone
  - locally consistent with the ground truth
  - less smooth
  - complex level sets
  - over-fitting?
- GAN
  - smoother
  - simpler level sets

conclusion
- BCE-based content loss is more effective (than MSE) for
  - initializing the generator
  - serving as a regularization term that stabilizes adversarial training
- Adversarial loss improved all metrics except NSS, compared with further training on BCE alone.
- Computing the loss on downsampled saliency maps gives improvements and reduces the computational cost.
- for more performance
  - VGG → ResNet
  - more careful tuning (particularly the tradeoff between BCE and GAN loss (α))
  - ensemble learning (needs more computation, even at prediction time)
  - is dark knowledge effective?
