GAN Evaluation
2020. 03. 26 (Thu)
이동헌
Contents
• Inception Score (IS)
• Fréchet Inception Distance (FID)
• Precision & Recall
Heusel, Martin, et al. "GANs trained by a two time-scale update
rule converge to a local Nash equilibrium." Advances in Neural
Information Processing Systems. 2017.
Salimans, Tim, et al. "Improved techniques for training
GANs." Advances in Neural Information Processing
Systems. 2016.
Lucic, Mario, et al. "Are GANs created equal? A large-scale
study." Advances in Neural Information Processing Systems. 2018.
Inception Score (IS)
§ Semantic Predictor : Inception-v3
• p(y|x): the class distribution predicted for a generated image x; y is one of the 1,000 ImageNet classes.
• p(y): the marginal class distribution over all generated images.
§ A good generator produces samples that are
1. semantically diverse (Diversity); mode collapse lowers diversity
2. distinct images, each with a clearly recognizable object (Quality)
① Entropy of p(y)
; the higher (closer to uniform), the better
→ as many classes as possible are generated
② Entropy of p(y|x)
; the lower, the better
→ each image x should contain a distinctly recognizable object
Source: UC Berkeley 2020 -- Spring -- Deep Unsupervised Learning, L5 & L6 Implicit Model / GAN
https://medium.com/@jonathan_hui/gan-how-to-measure-gan-performance-64b988c47732
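The two entropy criteria above are usually combined into one number, IS = exp(E_x[KL(p(y|x) || p(y))]), as defined in Salimans et al. (2016). The following is a minimal sketch of that formula, assuming the class probabilities have already been produced by a classifier such as Inception-v3; the function name `inception_score` and the toy 4-class inputs are illustrative only.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, C) array; row i is p(y|x_i) for generated image x_i."""
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all generated images.
    p_y = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x) || p(y)) for every image; eps avoids log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

# Toy stand-ins for classifier outputs (4 classes, 4 images).
sharp_and_diverse = np.eye(4)                      # confident, all classes covered
collapsed = np.tile([1.0, 0.0, 0.0, 0.0], (4, 1))  # confident, but a single class
blurry = np.full((4, 4), 0.25)                     # no recognizable object

print(inception_score(sharp_and_diverse))  # close to 4, the number of classes
print(inception_score(collapsed))          # close to 1
print(inception_score(blurry))             # close to 1
```

The score is high only when p(y|x) has low entropy and p(y) has high entropy at the same time, which is why both the collapsed and the blurry cases fall to the minimum value of 1.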
Fréchet Inception Distance (FID)
• Fréchet distance: a measure of similarity between curves
http://www.kr.tuwien.ac.at/staff/eiter/et-archive/cdtr9464.pdf
• A man is walking a dog on a leash: the man can move on one curve, the dog on the other;
both may vary their speed, but backtracking is not allowed.
• What is the length of the shortest leash that is sufficient for traversing both curves?
• The Fréchet distance takes into account both ① the location and ② the ordering of the
points along the curves.
• Therefore it is often better than the well-known Hausdorff distance.
• Formally: the infimum, over all monotone reparameterizations of the two curves, of the
maximum distance between the paired points.
• e.g. for two polygonal curves, the computation time is bounded in terms of p and q,
the numbers of segments on the two curves.
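The leash picture can be sketched in code with the discrete Fréchet distance from the Eiter and Mannila report linked above, which only lets the man and the dog stand on the curve vertices. This is a sketch of that discrete variant, not of the continuous distance; the function name and the sample curves are illustrative.

```python
import math

def discrete_frechet(P, Q):
    """Discrete Frechet distance between point sequences P and Q."""
    n, m = len(P), len(Q)
    # ca[i][j]: shortest leash needed to reach the pair (P[i], Q[j])
    # when neither walker is allowed to step backwards.
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])  # requires Python 3.8+
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

line = [(0, 0), (1, 0), (2, 0)]
shifted = [(0, 1), (1, 1), (2, 1)]
backwards = list(reversed(line))

print(discrete_frechet(line, shifted))    # 1.0
print(discrete_frechet(line, backwards))  # 2.0
```

The second call shows why ordering matters: `line` and `backwards` contain exactly the same points, so their Hausdorff distance is 0, yet the walkers start at opposite ends and need a leash of length 2. The table has len(P) x len(Q) entries, each filled in constant time.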
Fréchet Inception Distance (FID)
• The distance between real data and fake data in feature space
• Extract features of the real and fake data with the Inception-v3 network, then
compute the distance from the mean and covariance of the two feature sets
Code : http://research.sualab.com/introduction/practice/2019/05/08/generative-adversarial-network.html
• Lower FID values mean better image quality and diversity.
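For two Gaussians fitted to the features, the distance used by Heusel et al. (2017) is ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^(1/2)). Below is a minimal sketch of that formula, assuming the features have already been extracted; random vectors stand in for Inception-v3 features, and `frechet_distance` is an illustrative name rather than the code linked above.

```python
import numpy as np

def frechet_distance(feat_real, feat_fake):
    """feat_*: (N, D) arrays of features, one row per image."""
    mu_r, mu_f = feat_real.mean(axis=0), feat_fake.mean(axis=0)
    cov_r = np.cov(feat_real, rowvar=False)
    cov_f = np.cov(feat_fake, rowvar=False)
    # Tr((C_r C_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of C_r @ C_f, which are real and non-negative for
    # covariance matrices; clip tiny negative values from round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_f) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))   # stand-in for real-image features
same = real.copy()                 # identical statistics
shifted = real + 3.0               # same covariance, mean moved by 3 in each of 8 dims

print(frechet_distance(real, same))     # approximately 0
print(frechet_distance(real, shifted))  # approximately 8 * 3**2 = 72
```

Because only the mean and covariance enter the formula, the estimate needs enough samples for the covariance to be reliable.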
Precision, Recall and F1 Score
• High Precision : The generated images look similar to the real images on
average. (Quality)
• High Recall : The generator can generate any sample found in the training
dataset. (Diversity)
• F1 Score : The harmonic mean of precision and recall.
https://medium.com/@jonathan_hui/gan-how-to-measure-gan-performance-64b988c47732
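As a small worked example of the harmonic mean, the sketch below computes F1 from a precision and a recall value; the numbers are made up for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

balanced = f1_score(0.5, 0.5)    # quality and diversity equally good
lopsided = f1_score(0.9, 0.1)    # sharp images, but very little diversity

print(balanced, lopsided)
```

The lopsided case scores well below the arithmetic mean of 0.5, so a generator cannot reach a high F1 through quality alone.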
SOTA
• Improved Precision & Recall
Kynkäänniemi, Tuomas, et al. "Improved precision and recall
metric for assessing generative models." Advances in Neural
Information Processing Systems. 2019.
• Density and Coverage (D&C)
Naeem, Muhammad Ferjad, et al. "Reliable Fidelity and Diversity
Metrics for Generative Models." ICML, 2020. (CLOVA AI)
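The improved precision and recall of Kynkäänniemi et al. can be sketched with k-nearest-neighbor balls: precision is the fraction of fake samples that fall inside the ball of at least one real sample, and recall is the same test with the roles swapped. The code below is a simplified brute-force sketch under that reading, with 2-D toy points standing in for image features; all names and the choice k=3 are illustrative.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def knn_radii(feats, k):
    """Distance from each point to its k-th nearest neighbor (k < len(feats))."""
    d = np.sort(pairwise_dist(feats, feats), axis=1)
    return d[:, k]  # column 0 is the point itself at distance 0

def precision_recall(real, fake, k=3):
    d_rf = pairwise_dist(real, fake)      # shape (N_real, N_fake)
    r_real = knn_radii(real, k)
    r_fake = knn_radii(fake, k)
    # Precision: fake sample j lies in the k-NN ball of some real sample i.
    precision = np.mean((d_rf <= r_real[:, None]).any(axis=0))
    # Recall: real sample i lies in the k-NN ball of some fake sample j.
    recall = np.mean((d_rf <= r_fake[None, :]).any(axis=1))
    return float(precision), float(recall)

# Real data: a 10 x 10 grid. Collapsed generator: 10 points packed at one corner.
real = np.array([[i, j] for i in range(10) for j in range(10)], dtype=float)
collapsed = real[0] + 0.001 * real[:10]

print(precision_recall(real, real))       # (1.0, 1.0)
print(precision_recall(real, collapsed))  # precision 1.0, recall 0.01
```

The collapsed generator keeps perfect precision because every fake point sits next to a real one, while recall drops to 1 real point out of 100, which is the quality versus diversity split that a single score such as FID cannot show.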
Thank you
