SalGAN: Visual Saliency Prediction with
Generative Adversarial Networks
arXiv:1701.01081v2 [cs.CV] 9 Jan 2017
Junting Pan, Elisa Sayrol and Xavier Giro-i-Nieto Image Processing Group
Universitat Politecnica de Catalunya (UPC) Barcelona,
abstract
- using BCE as the loss (instead of the commonly used MSE)
- adding an adversarial loss (treating the saliency predictor as the generator
of a GAN)
- computing the loss on a downsampled predicted saliency map
outline
- motivation
- architecture
- training generator/discriminator
- results
- the impact of BCE
- the impact of downsampling
- adversarial gain
- comparison with SOTA
- qualitative results
- conclusion
motivation
- The diversity of metrics has also resulted in a diversity of loss functions
  - MIT300: 8 metrics
  - SALICON: 4 metrics (LSUN challenge) + Information Gain
- SalGAN benefits a wide range of metrics, without needing to specify a tailored
loss function.
architecture
generator
- Encoder
  - VGG16 (without the final pooling and FC layers)
  - pretrained on the ImageNet object classification task
  - only the last 2 layers are fine-tuned during saliency training
    (because of limited computational resources)
- Decoder
  - same structure as the encoder, in reverse order
  - pooling layers replaced by upsampling layers
  - output layer: 1x1 conv + pixel-wise sigmoid (not softmax)
  - weight init: random
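The encoder/decoder symmetry and the pixel-wise sigmoid output can be sketched in plain Python. This is a minimal illustration, not the authors' code: the 192x256 input size and all function names are hypothetical, and the count of four poolings assumes VGG16's five max-pooling stages minus the final one that the slide says is removed.

```python
import math

def encoder_decoder_sizes(height, width, num_pools=4):
    """Trace spatial sizes through an encoder with `num_pools` 2x2 poolings
    and a mirrored decoder with the same number of 2x upsamplings."""
    sizes = [(height, width)]
    h, w = height, width
    for _ in range(num_pools):      # encoder: each pooling halves the map
        h, w = h // 2, w // 2
        sizes.append((h, w))
    for _ in range(num_pools):      # decoder: each upsampling doubles it
        h, w = h * 2, w * 2
        sizes.append((h, w))
    return sizes

def pixelwise_sigmoid(logits):
    """Squash every pixel independently into (0, 1)."""
    return [[1.0 / (1.0 + math.exp(-x)) for x in row] for row in logits]

def spatial_softmax(logits):
    """Normalise over all pixels so that the map sums to 1 (shown for contrast)."""
    flat = [x for row in logits for x in row]
    m = max(flat)
    z = sum(math.exp(x - m) for x in flat)
    return [[math.exp(x - m) / z for x in row] for row in logits]

# Hypothetical input size; the bottleneck is sizes[4], the output is sizes[-1].
sizes = encoder_decoder_sizes(192, 256)

# Three equally strong pixels: sigmoid lets all of them be highly salient,
# softmax forces them to share a total mass of 1.
logits = [[2.0, 2.0], [-2.0, 2.0]]
sig = pixelwise_sigmoid(logits)
soft = spatial_softmax(logits)
```

With a sigmoid, each pixel is scored on its own, which matches the later view of saliency prediction as many independent binary classifications.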
discriminator
- output: the probability that the
given saliency map is ground truth
rather than generated
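training Generator
- trained while keeping the discriminator weights constant
- loss = α · content loss + adversarial loss
- D: the probability of fooling the Discriminator
  ⇒ the more the generator fools the discriminator, the smaller the loss
- including the content loss makes training more stable and convergence faster
- hyperparameter α: used 0.05
- Note: the first 15 epochs are trained with the content loss only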
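A minimal sketch of the generator objective described on this slide, assuming saliency maps are flattened into lists of values in [0, 1] and that `d_fake` is the discriminator's probability that the generated map is real. The function names, the `warmup_epochs` argument, and the choice to return the unweighted content loss during warm-up are assumptions for illustration; α = 0.05 and the 15 content-only epochs come from the slide.

```python
import math

EPS = 1e-7  # keeps log() finite at 0 and 1

def bce(pred, target):
    """Mean binary cross entropy over two equally long lists of pixel values."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)

def generator_loss(pred, target, d_fake, epoch, alpha=0.05, warmup_epochs=15):
    """Content loss only during warm-up, then alpha * content + adversarial."""
    content = bce(pred, target)
    if epoch < warmup_epochs:
        return content
    # Adversarial term: small when the discriminator believes the map is real.
    adversarial = -math.log(min(max(d_fake, EPS), 1.0))
    return alpha * content + adversarial

pred = [0.8, 0.2, 0.6, 0.1]
target = [1.0, 0.0, 1.0, 0.0]
warmup = generator_loss(pred, target, d_fake=0.5, epoch=0)
fooled = generator_loss(pred, target, d_fake=0.9, epoch=20)
caught = generator_loss(pred, target, d_fake=0.1, epoch=20)
```

Here `fooled` is smaller than `caught`, which is the behaviour the slide describes: the better the generator fools the discriminator, the lower its loss.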
content loss
mean squared error (baseline)
binary cross entropy (our approach)
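The two content losses can be compared with a small sketch. The function names and the flattened-list representation of a saliency map are assumptions for illustration; both losses are averaged over pixels.

```python
import math

def mse(pred, target):
    """Mean squared error over pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce(pred, target, eps=1e-7):
    """Mean binary cross entropy over pixels; targets may be soft values in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)

# One salient pixel (target 1.0) predicted with decreasing confidence.
target = [1.0]
mild_miss = [0.5]
bad_miss = [0.01]

mse_ratio = mse(bad_miss, target) / mse(mild_miss, target)
bce_ratio = bce(bad_miss, target) / bce(mild_miss, target)
```

MSE is bounded by 1 per pixel, while BCE grows without bound as a prediction approaches the wrong extreme, so `bce_ratio` comes out larger than `mse_ratio` for the same pair of mistakes.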
training Discriminator
- trained on both generated and ground truth samples
- adversarial loss: the sign is flipped, so the less the discriminator is
fooled, the lower its loss
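A minimal sketch of a discriminator loss with this behaviour, assuming the standard binary cross entropy form with target 1 for ground truth maps and 0 for generated maps. `d_real` and `d_fake` are hypothetical names for the discriminator's output probabilities on a ground truth map and on a generated map.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-7):
    """BCE with target 1 for the ground truth map and 0 for the generated map."""
    d_real = min(max(d_real, eps), 1.0 - eps)
    d_fake = min(max(d_fake, eps), 1.0 - eps)
    return -math.log(d_real) - math.log(1.0 - d_fake)

not_fooled = discriminator_loss(d_real=0.9, d_fake=0.1)  # confident and correct
undecided = discriminator_loss(d_real=0.5, d_fake=0.5)   # cannot tell them apart
fooled = discriminator_loss(d_real=0.1, d_fake=0.9)      # confident and wrong
```

The loss is lowest when the discriminator is not fooled and highest when it is, the mirror image of the generator's adversarial term.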
training
Dataset: SALICON
non-adversarial training
- changing from MSE to BCE
brings an improvement in all
metrics
- treating saliency prediction
as multiple (per-pixel) binary
classification problems is more
appropriate
non-adversarial training
- Computing the content loss over
downsampled saliency
maps reduces the
computational cost
and actually improves
performance.
- the ¼-downsampled
version is used in the later experiments
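A sketch of downsampling a saliency map before computing the content loss. It assumes that ¼ means a factor of 4 along each side and uses average pooling, which is an illustrative choice and not necessarily the operator used in the paper; the function name is hypothetical.

```python
def downsample(sal_map, factor=4):
    """Average-pool a 2D list by `factor` along each side (edge remainders dropped)."""
    h, w = len(sal_map), len(sal_map[0])
    out = []
    for i in range(0, h - h % factor, factor):
        row = []
        for j in range(0, w - w % factor, factor):
            block = [sal_map[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

# 8x8 map whose top-left 4x4 quadrant is salient.
full = [[1.0 if r < 4 and c < 4 else 0.0 for c in range(8)] for r in range(8)]
small = downsample(full, factor=4)

pixels_full = len(full) * len(full[0])
pixels_small = len(small) * len(small[0])
```

Under this assumption the loss is evaluated on 16 times fewer pixels, which is where the reduction in computational cost comes from.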
adversarial gain
- after 100 and 120 epochs, the combined
GAN/BCE loss shows substantial
improvements over BCE for five of six
metrics
- SalGAN may fail to improve
NSS because GAN training tends to
produce a smoother and more spread-out
estimate of saliency, which may increase
the false positive rate. (NSS checks whether the model
picks up spurious false positives.)
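NSS (Normalized Scanpath Saliency)
- NSS is very sensitive to false positives.
  - it gives low scores to saliency models that respond to irrelevant regions
- in image retrieval applications (feature selection using saliency), a large
number of false negatives is the worse problem
  - NSS is not well suited to this
  - reason: a false negative means that an important feature is being excluded
  - a model with some redundancy, like SalGAN, is better suited
- NSS is differentiable, so it could be optimized directly when important for a
particular application.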
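The sensitivity of NSS to false positives can be illustrated with a small sketch based on its usual definition: standardize the predicted map to zero mean and unit variance, then average the standardized values at the human fixation locations. The function name and the toy 4x4 maps are hypothetical.

```python
import math

def nss(sal_map, fixations):
    """Mean z-scored saliency at the fixated (row, col) positions."""
    flat = [v for row in sal_map for v in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat))
    if std == 0.0:
        return 0.0  # a constant map carries no information
    return sum((sal_map[r][c] - mean) / std for r, c in fixations) / len(fixations)

fixations = [(0, 0)]  # a single human fixation in the top-left corner

# Focused prediction: only the fixated pixel is marked salient.
focused = [[1.0 if (r, c) == (0, 0) else 0.0 for c in range(4)] for r in range(4)]
# Spread-out prediction: the fixated pixel plus three neighbouring false positives.
spread = [[1.0 if r < 2 and c < 2 else 0.0 for c in range(4)] for r in range(4)]

score_focused = nss(focused, fixations)
score_spread = nss(spread, fixations)
```

Both maps hit the fixation with the same raw value, but the spread-out map scores lower because the extra salient pixels raise the mean and variance used for standardization.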
comparison with SOTA
SalGAN improves or equals
the performance of all other
models in at least one metric.
qualitative results
1. a successful case: other models fail to detect
saliency.
2. a failure case: SalGAN fails to detect the white ball, like
other models
3. limitation of the datasets
   a. ground truth: the sign → reading the text (takes more
   time)
   b. Existing metrics tend to be agnostic to the order in
   which areas are attended.
qualitative results
- BCE alone
  - locally consistent with the ground truth
  - less smooth
  - complex level sets
  - over-fitting?
- GAN
  - smoother
  - simpler level sets
conclusion
- BCE-based content loss is more effective (than MSE)
  - for initializing the generator
  - as a regularization term for stabilizing adversarial training
- Adversarial loss improved all metrics except NSS, when compared to
further training on BCE alone.
- Computing the loss on downsampled saliency maps improves performance and
reduces the computational cost.
- for more performance
  - VGG → ResNet
  - more careful tuning (particularly of the tradeoff between BCE and GAN loss (α))
  - ensemble learning (at a higher computational cost, even at prediction time)
  - is dark knowledge effective?

Lab paper-reading presentation 10/13: SalGAN
