Lab paper-reading presentation, 10/13: SalGAN
1. SalGAN: Visual Saliency Prediction with
Generative Adversarial Networks
arXiv:1701.01081v2 [cs.CV] 9 Jan 2017
Junting Pan, Elisa Sayrol and Xavier Giro-i-Nieto
Image Processing Group, Universitat Politecnica de Catalunya (UPC), Barcelona
2. abstract
- using BCE as loss (instead of often used MSE)
- adding an adversarial loss (treating the saliency predictor as the generator in a GAN)
- computing the loss over downsampled predicted saliency maps
3. outline
- motivation
- architecture
- training generator/discriminator
- results
- the impact of BCE
- the impact of downsampling
- adversarial gain
- comparison with SOTA
- qualitative results
- conclusion
4. motivation
- The diversity of metrics has also resulted in a diversity of loss functions
  - MIT300: 8 metrics
  - SALICON: 4 metrics (LSUN challenge) + Information Gain
- SalGAN benefits a wide range of metrics, without needing to specify a tailored loss function.
9. training Generator
- D: the probability of fooling the Discriminator
  - the better the generator fools the Discriminator, the smaller the loss
- the generator loss combines a content loss (BCE) with the adversarial loss
  - including the content loss makes training more stable and convergence faster
- hyperparameter α: 0.05 was used
- Note: the first 15 epochs are trained with the content loss only
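The combined generator objective described above can be sketched roughly as below. This is a minimal PyTorch sketch under my own assumptions (function and argument names are hypothetical, and I am assuming α weights the content term as the slide's "used 0.05" note suggests), not the authors' code:

```python
import torch
import torch.nn.functional as F

def generator_loss(pred_map, gt_map, d_out_on_fake, alpha=0.05):
    """Sketch of a SalGAN-style combined generator loss.

    pred_map:      predicted saliency map, values in (0, 1)
    gt_map:        ground-truth saliency map, values in [0, 1]
    d_out_on_fake: Discriminator's probability output for the predicted map
    alpha:         weight trading off content vs. adversarial loss (slide: 0.05)
    """
    # Content loss: per-pixel binary cross-entropy against the ground truth
    content = F.binary_cross_entropy(pred_map, gt_map)
    # Adversarial loss: the better the generator fools D
    # (d_out_on_fake -> 1), the smaller this term becomes
    adversarial = -torch.log(d_out_on_fake + 1e-8).mean()
    return alpha * content + adversarial
```

Per the slide, one would minimize only the content term for the first 15 epochs, then switch to this combined loss.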
13. non-adversarial training
- changing from MSE to BCE brings an improvement in all metrics
- treating saliency prediction as multiple binary classifications (one per pixel) is more appropriate
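The per-pixel view above can be illustrated with a toy comparison. A small sketch (my own illustrative values, not from the paper) contrasting BCE, which scores each pixel as a binary classification, with MSE, which treats the map as a regression target:

```python
import torch
import torch.nn.functional as F

# Each pixel is treated as an independent binary classification:
# BCE averages the per-pixel cross-entropy over the whole map.
pred = torch.tensor([0.9, 0.2, 0.1, 0.8])  # predicted saliency per pixel
gt   = torch.tensor([1.0, 0.0, 0.0, 1.0])  # ground-truth saliency per pixel

bce = F.binary_cross_entropy(pred, gt)  # classification-style loss
mse = F.mse_loss(pred, gt)              # regression-style loss
```

BCE penalizes confident wrong pixels much more sharply than MSE does, which matches the classification interpretation.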
14. non-adversarial training
- Computing the content loss over downsampled saliency maps reduces computational cost and actually improves performance.
- ¼-scale downsampled versions are used from here on
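The downsampled content loss could be sketched as follows; this is my own minimal PyTorch version (function name and interpolation choices are assumptions, not the paper's implementation):

```python
import torch
import torch.nn.functional as F

def downsampled_bce(pred_map, gt_map, scale=0.25):
    """BCE content loss computed over 1/4-scale saliency maps (sketch).

    pred_map, gt_map: 4-D tensors (batch, channel, H, W) with values in [0, 1]
    """
    # Shrink both maps before the loss; bilinear interpolation keeps
    # values inside [0, 1] since it is a convex combination of pixels.
    pred_s = F.interpolate(pred_map, scale_factor=scale,
                           mode="bilinear", align_corners=False)
    gt_s = F.interpolate(gt_map, scale_factor=scale,
                         mode="bilinear", align_corners=False)
    # Clamp to avoid log(0) at exactly 0 or 1
    return F.binary_cross_entropy(pred_s.clamp(1e-6, 1 - 1e-6),
                                  gt_s.clamp(0.0, 1.0))
```

The loss is then computed over 16x fewer pixels, which is where the cost saving comes from.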
16. adversarial gain
- after 100 and 120 epochs, the combined GAN/BCE loss shows substantial improvements over BCE alone for five of six metrics
- The reason SalGAN fails to improve NSS may be that GAN training tends to produce a smoother and more spread-out estimate of saliency, which may increase the false positive rate. (NSS checks whether the model picks up spurious false positives)
17. NSS: Normalized Scanpath Saliency
- NSS is very sensitive to false positives.
  - it scores low any saliency model that responds to irrelevant regions
- for image retrieval applications (feature selection using saliency), many false negatives are worse
  - so NSS is not a suitable metric there
  - reason: a false negative means an important feature has been discarded
  - a model with some redundancy, like SalGAN, is better suited
- NSS is differentiable, so it could be optimised directly when important for a particular application.
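For reference, NSS is the mean of the standardized saliency values at the human fixation locations, which is why a model lighting up non-fixated regions drags the score down. A small NumPy sketch (my own helper, names are assumptions):

```python
import numpy as np

def nss(sal_map, fixations):
    """Normalized Scanpath Saliency (sketch).

    sal_map:   2-D predicted saliency map
    fixations: boolean mask of the same shape, True at fixated pixels
    """
    # Standardize the map to zero mean and unit standard deviation,
    # then average the standardized values at fixation locations.
    s = (sal_map - sal_map.mean()) / (sal_map.std() + 1e-8)
    return float(s[fixations].mean())
```

A map that concentrates mass exactly on the fixations gets a high positive score; spreading mass over non-fixated pixels raises the map's mean and variance, lowering the standardized value at each fixation.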
19. qualitative results
1. a successful case: the other models fail to detect the salient region.
2. a failure case: SalGAN fails to detect the white ball, like the other models
3. limitations of the datasets
  a. ground truth: the sign draws fixations because reading its text takes more time
  b. Existing metrics tend to be agnostic to the order in which areas are attended.
20. qualitative results
- BCE alone
- locally consistent with the
ground truth
- less smooth
- complex level sets
- over-fitting?
- GAN
- smoother
- simpler level sets
22. conclusion
- BCE-based content loss is more effective (than MSE) for
  - initializing the generator
  - serving as a regularization term that stabilizes adversarial training
- Adversarial loss improved all metrics except NSS, when compared to further training on BCE alone.
- Computing the loss over downsampled saliency maps improves results and reduces computational cost.
- for better performance
  - VGG → ResNet
  - more careful tuning (particularly the trade-off between BCE and GAN loss, α)
  - ensemble learning (needs more computation, even at prediction time)
  - might "dark knowledge" (distillation) be effective?