Progressive Growing of GANs for Improved Quality, Stability, and Variation Review

PGGAN
Paper Review
모두의 연구소 (Modulabs) GAN찮아, 3rd cohort
김태엽
Karras, Tero
Aila, Timo
Laine, Samuli
Lehtinen, Jaakko

Contents
1. Introduction
1-1. Overview
1-2. Introduction
2. Contributions
2-1. Progressive Growing of GANs
2-2. Increasing Variation using Minibatch Standard Deviation
2-3. Normalization in Generator and Discriminator
2-4. Equalized Learning Rate
2-5. Pixelwise Feature Vector Normalization in Generator
3. Networks for CelebA-HQ
3-1. Networks Structure
3-2. Training Configuration
4. New Metric for Assessing Results
4-1. Backgrounds
4-2. The New Metric
5. Experiments
5-1. Importance of Individual Contributions
5-2. Convergence and Training Speed
5-3. High-resolution Image Generation using CelebA-HQ
5-4. LSUN results
5-5. CIFAR10 Inception Scores
6. References
6-1. References

1. Introduction
1. Overview
• What the paper proposes
1. Progressive Growing of GANs: a new training methodology for GANs
Grows both the generator and the discriminator progressively
2. Minibatch Standard Deviation: improves variation in generated images
Discourages the generator from producing overly homogeneous results
3. Normalization: discourages unhealthy competition between the two networks
3-1. Equalized Learning Rate: makes all layers learn at the same pace
3-2. Pixelwise Feature Vector Normalization: keeps signal magnitudes from escalating
4. A new metric for evaluating GAN results, in terms of both image quality and variation

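1. Introduction
1. Introduction
• The three most prominent generative models
• Each has clear strengths and weaknesses
Autoregressive
• Produces sharp images
• Slow to evaluate
• Has no latent space
• Models the conditional distribution over pixels
VAE
• Easy to train
• Produces blurry images because of restrictions in the model
• Much recent work is improving on this
GAN
• Produces sharp images
• Limited to fairly low resolutions, with somewhat limited variation
• Training is unstable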

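1. Introduction
1. Introduction
Problem
• High-resolution images are hard to generate, because at high resolution generated images are easier to tell apart from real ones
• Memory constraints force smaller minibatches, which makes training less stable
• Generated images lack variation
Approach
• Grow the generator and the discriminator progressively
• Use a few tweaks that prevent mode collapse and increase variation
• Use a metric that evaluates both variation and image quality
Benefits
• Faster training and better stability at high resolutions
• Can generate high-resolution images at 1024 x 1024
• Reaches an Inception score of 8.80 on unsupervised CIFAR10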

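1. Progressive Growing of GANs
2. Contributions
• Start at low resolution and progressively add layers to both networks to raise the resolution
• This lets the networks first discover the large-scale structure of the image distribution (for example, the overall shape of a face)
• As the resolution increases, attention shifts to increasingly fine-scale detail
• The networks never have to learn all scales simultaneously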

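2. Contributions
• When a layer is added, it is faded in smoothly through a transition phase
• This avoids sudden shocks to the already trained layers of the previous stage
• During the transition, alpha increases linearly from 0 to 1
• toRGB projects feature vectors to RGB colors, and fromRGB converts RGB colors back to feature vectors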
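The fade-in described above can be sketched in NumPy. This is a minimal illustration, not the authors' code: the names `upsample_2x` and `fade_in` are hypothetical, images are assumed to be `(N, H, W, C)` arrays, and nearest-neighbour upsampling is assumed for the old-resolution output.

```python
import numpy as np

def upsample_2x(images):
    """Nearest-neighbour 2x upsampling of an (N, H, W, C) batch."""
    return images.repeat(2, axis=1).repeat(2, axis=2)

def fade_in(old_rgb, new_rgb, alpha):
    """Blend the upsampled old-resolution RGB output with the new layer's RGB output.

    alpha = 0 uses only the old path, alpha = 1 uses only the new layer.
    """
    return (1.0 - alpha) * upsample_2x(old_rgb) + alpha * new_rgb

# Toy data: the old path outputs zeros at 4x4, the new layer outputs ones at 8x8.
old_rgb = np.zeros((1, 4, 4, 3))
new_rgb = np.ones((1, 8, 8, 3))
for alpha in (0.0, 0.5, 1.0):
    out = fade_in(old_rgb, new_rgb, alpha)
    print(alpha, out.shape, float(out.mean()))
```

With the toy inputs, the mean of the blended output equals alpha, which makes the linear ramp easy to see.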

2. Contributions
• Figure: Generator → Discriminator

2. Contributions
• When training the discriminator, real images are downscaled to match the current resolution of the network
• During a resolution transition, the real images are linearly interpolated between the two resolutions
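One way to picture the treatment of real images during a transition is the sketch below. The function names are hypothetical, and the 2x2 box-filter downscale with nearest-neighbour upscale is an assumption made for brevity; the slides only state that the two resolutions are interpolated.

```python
import numpy as np

def downscale_2x(images):
    """Average each 2x2 block of an (N, H, W, C) batch (box filter)."""
    n, h, w, c = images.shape
    return images.reshape(n, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))

def blend_real_images(images, alpha):
    """Interpolate between the lower-resolution version (alpha = 0)
    and the full-resolution images (alpha = 1)."""
    low = downscale_2x(images)
    low_up = low.repeat(2, axis=1).repeat(2, axis=2)  # back to the full grid
    return (1.0 - alpha) * low_up + alpha * images

reals = np.random.default_rng(0).uniform(-1.0, 1.0, size=(2, 8, 8, 3))
half_way = blend_real_images(reals, alpha=0.5)
print(half_way.shape)
```

Using the same alpha for the real images and for the networks keeps the discriminator's inputs at a matching level of detail on both sides.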

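2. Contributions
Effects
Benefit 1. Stable training
• At low resolution there is less class information and there are fewer modes, so image generation is much more stable.
• Compared with mapping a latent vector directly to 1024 x 1024, the networks are asked a much simpler question at each step as the resolution increases.
Benefit 2. Shorter training time
• Training time drops: results of comparable quality are reached 2-6 times faster.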

2. Minibatch Standard Deviation
2. Contributions
• GANs tend to capture only a subset of the variation found in the training data
→ Mode Collapse
• Minibatch Discrimination was proposed as a solution
• A minibatch layer is added toward the end of the discriminator to compute feature statistics across the images in a minibatch

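2. Contributions
• Minibatch Standard Deviation is a simplified version of Minibatch Discrimination
→ It needs no additional learnable parameters and no new hyperparameters
• First, compute the standard deviation of each feature at each spatial location over the whole minibatch
• Then average these values over all features and spatial locations to obtain a single value, and concatenate it to the features as one extra feature map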
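The two steps above can be sketched in NumPy as follows. This is a minimal sketch assuming `(N, H, W, C)` feature maps; the function name and the small `eps` added for numerical stability are assumptions, not taken from the slides.

```python
import numpy as np

def minibatch_stddev(features, eps=1e-8):
    """Append one constant feature map holding the average minibatch stddev.

    features: (N, H, W, C) activations of the discriminator.
    """
    # Step 1: stddev of every feature at every spatial location, over the minibatch.
    std = np.sqrt(features.var(axis=0) + eps)          # shape (H, W, C)
    # Step 2: average over all features and locations to get a single value.
    mean_std = std.mean()
    # Replicate the value into one extra feature map and concatenate it.
    n, h, w, _ = features.shape
    extra = np.full((n, h, w, 1), mean_std, dtype=features.dtype)
    return np.concatenate([features, extra], axis=-1)

rng = np.random.default_rng(0)
varied = rng.standard_normal((4, 4, 4, 8))
collapsed = np.repeat(varied[:1], 4, axis=0)            # four identical samples
print(minibatch_stddev(varied)[0, 0, 0, -1])
print(minibatch_stddev(collapsed)[0, 0, 0, -1])
```

A batch of identical samples yields a value near zero in the extra map, which gives the discriminator a direct cue that the generator's outputs lack variation.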

2. Contributions
• The layer can be inserted anywhere in the discriminator, but placing it toward the end gave the best results
• Other statistics were tried, but they did not improve variation any further

3. Normalization in Generator and Discriminator
2. Contributions
• Unhealthy competition between the two networks makes GANs prone to an escalation of signal magnitudes
• Most earlier work uses Batch Normalization to suppress this
• Batch Normalization was originally introduced to eliminate covariate shift
• What GANs actually need is to constrain signal magnitudes and competition

4. Equalized Learning Rate (Runtime weight scaling)
2. Contributions
• Weights are initialized simply from a Gaussian N(0, 1)
• Instead of a careful initialization, the weights are scaled explicitly at runtime
• c is the per-layer normalization constant from He's initializer
• Adaptive stochastic gradient descent methods such as RMSProp and Adam are scale-invariant
• They update each parameter independently of its scale, so if parameters have different dynamic ranges, the ones with a larger range take longer to adjust

4. Equalized Learning Rate (Runtime weight scaling)
2. Contributions
• The authors' approach gives all weights the same dynamic range → the learning speed no longer depends on the size of the layer
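A minimal sketch of runtime weight scaling for a dense layer is shown below. The class name is hypothetical, and the sketch assumes that He's constant takes the form sqrt(2 / fan_in) and is applied as a multiplier in the forward pass; the formula itself is not reproduced in the slides.

```python
import numpy as np

class EqualizedDense:
    """Dense layer whose weights are stored as N(0, 1) and scaled at runtime."""

    def __init__(self, fan_in, fan_out, rng):
        # Stored parameters share the same dynamic range in every layer.
        self.weight = rng.standard_normal((fan_in, fan_out))
        self.bias = np.zeros(fan_out)
        # Per-layer constant (assumed form of He's constant).
        self.scale = np.sqrt(2.0 / fan_in)

    def __call__(self, x):
        # The scaling happens in the forward pass, not at initialization.
        return x @ (self.weight * self.scale) + self.bias

rng = np.random.default_rng(0)
layer = EqualizedDense(512, 256, rng)
x = rng.standard_normal((1000, 512))
y = layer(x)
print(layer.weight.std(), y.std())
```

Because the optimizer only ever sees the unscaled N(0, 1) parameters, an Adam-style update of a given size has the same relative effect in a small layer as in a large one.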

5. Pixelwise Feature Vector Normalization
2. Contributions
• Competition between the generator and the discriminator must not be allowed to drive the magnitudes out of control
• In the generator, the feature vector at each pixel is normalized to unit length after every conv layer
• This is implemented as a variant of LRN (Local Response Normalization)
• $a_{x,y}$: original feature vector, $b_{x,y}$: normalized feature vector, $n$: number of feature maps
• In most cases this does not change the results much, but it effectively prevents the escalation of signal magnitudes
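The normalization can be sketched as b = a / sqrt(mean_j(a_j^2) + eps), taken over the n feature maps at each pixel. The code below is a minimal NumPy sketch assuming `(N, H, W, C)` activations; the function name and the value `eps = 1e-8` are assumptions here, since the slide's formula did not survive extraction.

```python
import numpy as np

def pixel_norm(features, eps=1e-8):
    """Normalize the feature vector at every pixel across the channel axis.

    features: (N, H, W, C). After normalization the mean of the squared
    features over C is approximately 1 at each pixel.
    """
    mean_sq = np.mean(features ** 2, axis=-1, keepdims=True)
    return features / np.sqrt(mean_sq + eps)

rng = np.random.default_rng(0)
activations = 50.0 * rng.standard_normal((2, 4, 4, 16))   # deliberately large
normalized = pixel_norm(activations)
print(np.mean(activations ** 2, axis=-1).mean())
print(np.mean(normalized ** 2, axis=-1).mean())
```

Note that the operation has no learnable parameters and no dependence on the rest of the minibatch, unlike batch normalization.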

2. Contributions
• Figure: Local Response Normalization

2. Contributions
• Figure: how it is used in the actual code

1. Networks Structure
3. Networks for CelebA-HQ
• Network architecture of PGGAN for the CelebA-HQ dataset

2. Training Configuration
• Training starts at 4x4 resolution and continues until the discriminator has been shown 800K real images
• After that, two phases alternate:
fade in the next 3-layer block during 800K images, then stabilize the networks during the next 800K images
• Both real and fake images are represented in the range [-1, 1]
• Adam optimizer (α = 0.001, β1 = 0, β2 = 0.99, ε = 10^-8)
• The minibatch size is 16 from 4x4 up to 128x128, then it is reduced to 14 at 256x256, 6 at 512x512, and 3 at 1024x1024 to avoid running out of memory
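The schedule above can be written down as a small lookup. This is a sketch under stated assumptions: the names `PHASE_IMAGES`, `MINIBATCH_SIZE`, and `schedule` are hypothetical, and alpha is assumed to ramp linearly with the number of real images shown during a fade-in phase.

```python
PHASE_IMAGES = 800_000  # real images shown per fade-in or stabilization phase

# Minibatch size per resolution, as listed on the slide.
MINIBATCH_SIZE = {4: 16, 8: 16, 16: 16, 32: 16, 64: 16, 128: 16,
                  256: 14, 512: 6, 1024: 3}

def schedule(images_shown, max_resolution=1024):
    """Return (resolution, alpha) after `images_shown` real images."""
    if images_shown < PHASE_IMAGES:
        return 4, 1.0                                   # initial 4x4 training
    stage, within = divmod(images_shown - PHASE_IMAGES, 2 * PHASE_IMAGES)
    num_stages = (max_resolution.bit_length() - 1) - 2  # 8 doublings from 4 to 1024
    if stage >= num_stages:
        return max_resolution, 1.0                      # fully grown
    resolution = 2 ** (stage + 3)                       # 8, 16, 32, ...
    alpha = min(within / PHASE_IMAGES, 1.0)             # fade in, then hold at 1
    return resolution, alpha

for shown in (0, 800_000, 1_200_000, 1_600_000, 2_400_000):
    res, alpha = schedule(shown)
    print(shown, res, alpha, MINIBATCH_SIZE[res])
```

Passing `max_resolution=32` gives the shortened progression used when the target images are only 32x32.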

2. Training Configuration
• The WGAN-GP loss is used, but unlike Gulrajani et al., the generator and the discriminator are optimized alternately on every minibatch → n_critic = 1
• An additional term is added to the discriminator loss to keep the discriminator output from drifting too far from zero
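The extra term can be sketched as eps_drift * E[D(x)^2] over real samples, added to the usual WGAN-GP discriminator loss. The snippet below is a hedged illustration: the function name is hypothetical, the gradient penalty is assumed to be computed elsewhere and passed in already weighted, and `eps_drift = 0.001` is an assumed default because the slide's formula did not survive extraction.

```python
import numpy as np

def discriminator_loss(real_scores, fake_scores, gradient_penalty, eps_drift=0.001):
    """WGAN-GP discriminator loss with an added drift term.

    real_scores, fake_scores: discriminator outputs for real and generated images.
    gradient_penalty: already weighted penalty term, computed elsewhere.
    """
    wasserstein = np.mean(fake_scores) - np.mean(real_scores)
    # Penalize large outputs on real images so the scores stay near zero.
    drift = eps_drift * np.mean(np.square(real_scores))
    return wasserstein + gradient_penalty + drift

real_scores = np.array([2.0, -2.0])
fake_scores = np.array([0.0, 0.0])
print(discriminator_loss(real_scores, fake_scores, gradient_penalty=0.0))
```

The Wasserstein term only depends on the difference between real and fake scores, so without the drift term the absolute level of the outputs is unconstrained.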

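1. Backgrounds
4. New Metric for Assessing Results
• Comparing the results of different GANs means inspecting a large number of images by hand → an automated metric is needed
• Metrics such as MS-SSIM can detect large-scale mode collapse
• However, they do not react to smaller effects such as a loss of variation in colors or textures
• Existing methods also do not assess image quality in terms of similarity to the images in the training set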

1. Backgrounds
• If the generator produces good samples, their local image structure should resemble that of the training set at all scales
• The authors' proposal: consider the multi-scale statistical similarity between the distributions of image patches drawn from the Laplacian pyramid representations of generated and target images

2. The New Metric
• Laplacian pyramid representation: similar to a Gaussian pyramid, but each pyramid level stores the difference between the image and its Gaussian-blurred version.
• A single Laplacian pyramid level corresponds to a specific spatial frequency band
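To make the representation concrete, here is a simplified sketch of building and inverting such a pyramid. It is an illustration only: a 2x2 box filter and nearest-neighbour upscaling stand in for the Gaussian blur, and all function names are hypothetical.

```python
import numpy as np

def downscale_2x(img):
    """Average each 2x2 block of an (H, W, C) image (stand-in for Gaussian blur)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upscale_2x(img):
    """Nearest-neighbour 2x upscaling of an (H, W, C) image."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, num_levels):
    """Return [detail_0, ..., detail_{k-2}, low_pass] from fine to coarse."""
    levels, current = [], img
    for _ in range(num_levels - 1):
        smaller = downscale_2x(current)
        levels.append(current - upscale_2x(smaller))   # one frequency band
        current = smaller
    levels.append(current)                              # coarsest level
    return levels

def reconstruct(levels):
    """Invert the pyramid by adding the detail bands back, coarse to fine."""
    current = levels[-1]
    for detail in reversed(levels[:-1]):
        current = upscale_2x(current) + detail
    return current

image = np.random.default_rng(0).uniform(-1.0, 1.0, size=(16, 16, 3))
pyramid = laplacian_pyramid(image, num_levels=3)
print([level.shape for level in pyramid])
```

Each detail level holds only what the next coarser level cannot represent, which is why comparing patches level by level separates coarse structure from fine texture.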

2. The New Metric
• Randomly sample 16384 (2^14) images and extract 128 descriptors (features) from each level of the Laplacian pyramid → 2^21 (about 2.1M) descriptors per level
• Each descriptor is a 7x7 pixel neighborhood with 3 color channels.
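The counts above can be checked directly, and the patch sampling can be sketched as follows. The function name and the uniform random choice of patch positions are assumptions made for illustration; descriptors are flattened with the color channel as the last axis.

```python
import numpy as np

def extract_descriptors(images, per_image=128, size=7, seed=0):
    """Sample random size x size patches from each image.

    images: (N, H, W, C) array, one pyramid level for N images.
    Returns an array of shape (N * per_image, size * size * C).
    """
    rng = np.random.default_rng(seed)
    n, h, w, c = images.shape
    out = np.empty((n * per_image, size * size * c), dtype=images.dtype)
    for i in range(n):
        ys = rng.integers(0, h - size + 1, size=per_image)
        xs = rng.integers(0, w - size + 1, size=per_image)
        for k, (y, x) in enumerate(zip(ys, xs)):
            out[i * per_image + k] = images[i, y:y + size, x:x + size].ravel()
    return out

# 2^14 images x 128 descriptors = 2^21 descriptors, each with 7 * 7 * 3 = 147 values.
print(16384 * 128 == 2 ** 21, 7 * 7 * 3)

level = np.random.default_rng(1).standard_normal((4, 16, 16, 3))
descriptors = extract_descriptors(level, per_image=8)
print(descriptors.shape)
```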

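2. The New Metric
• The descriptor sets $\{x_i^l\}$ and $\{y_i^l\}$ are first normalized by the mean and standard deviation of each color channel, and then the sliced Wasserstein distance (SWD) between them is computed
• A small SWD → the distributions of the image patches are similar → at that resolution, the training images and the generated images are similar in both appearance and variation
• The distance between patch sets extracted at the lowest resolution, 16x16, indicates similarity in large-scale image structure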
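A sliced Wasserstein distance between two descriptor sets can be approximated as in the sketch below. The function names, the number of random projections, and the requirement that both sets have the same size are assumptions of this sketch, not details taken from the slides.

```python
import numpy as np

def normalize_descriptors(desc, channels=3):
    """Standardize each color channel; desc is (N, pixels * channels), channel last."""
    d = desc.reshape(desc.shape[0], -1, channels)
    mean = d.mean(axis=(0, 1), keepdims=True)
    std = d.std(axis=(0, 1), keepdims=True)
    return ((d - mean) / (std + 1e-8)).reshape(desc.shape[0], -1)

def sliced_wasserstein(a, b, n_projections=64, seed=0):
    """Approximate SWD between two equally sized descriptor sets of shape (N, D)."""
    rng = np.random.default_rng(seed)
    # Random unit directions, one per column.
    dirs = rng.standard_normal((a.shape[1], n_projections))
    dirs /= np.linalg.norm(dirs, axis=0, keepdims=True)
    # Project to 1-D, sort, and compare the sorted values pairwise.
    proj_a = np.sort(a @ dirs, axis=0)
    proj_b = np.sort(b @ dirs, axis=0)
    return float(np.mean(np.abs(proj_a - proj_b)))

rng = np.random.default_rng(1)
real = rng.standard_normal((256, 147))
print(sliced_wasserstein(real, real + 0.1))
print(sliced_wasserstein(real, real + 1.0))
```

Sorting the 1-D projections is what makes each slice cheap: in one dimension the optimal transport plan simply matches the sorted samples.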

1. Importance of Individual Contributions
5. Experiments
Gulrajani et al. (baseline) configuration:
• α = 0.0001, β2 = 0.9, n_critic = 5, ε_drift = 0
• Minibatch size = 64
• Progressive resolution, minibatch stddev, and runtime weight scaling are not used
• The He initializer is used
• Generator activations are changed to ReLU, and the final activation is changed to tanh
• Batch norm is used instead of pixelwise norm; the discriminator uses layer normalization
• The latent vector has 128 components sampled from a normal distribution

2. Convergence and Training Speed
5. Experiments

3. High-Resolution Image Generation Using CelebA-HQ Dataset
5. Experiments
• Existing datasets are low-resolution, so the authors built a high-quality dataset of 30K images at 1024x1024 resolution
• The network was trained for 4 days on 8 Tesla V100 GPUs

3. High-Resolution Image Generation Using CelebA-HQ Dataset
5. Experiments
• Figure: nearest neighbors in the training set for the generated images

5. CIFAR10 Inception Scores
5. Experiments
• Achieves a CIFAR10 Inception score of 8.80 in the unsupervised setting
• CIFAR10 consists of 32x32 RGB images in 10 categories
• The network and training setup are the same as for CelebA, with the progression limited to 32x32

2. References
6. Conclusion and References
• https://users.aalto.fi/~laines9/publications/karras2018iclr_poster.pdf
• Laplacian Pyramid, http://sepwww.stanford.edu/data/media/public/sep/morgan/texturematch/paper_html/node3.html
• Laplacian Pyramid, https://en.wikipedia.org/wiki/Pyramid_(image_processing)
• He initializer, http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization
• He initializer, https://arxiv.org/pdf/1502.01852v1.pdf
• [Summary] PGGAN, https://curt-park.github.io/2018-05-09/pggan/
• 2018 ICLR oral, Wenjing Wang, http://www.icst.pku.edu.cn/F/course/icb/Seminar/WenjingWang_180506/WenjingWang_180506.pdf
• https://github.com/tkarras/progressive_growing_of_gans
• Local Response Normalization (LRN), yeephycho, http://yeephycho.github.io/2016/08/03/Normalizations-in-neural-networks/
• Karras et al., Progressive Growing of GANs for Improved Quality, Stability, and Variation, http://arxiv.org/abs/1710.10196
• Gulrajani et al., Improved Training of Wasserstein GANs, http://arxiv.org/abs/1704.00028
