GAN, with Mathematics
Yonsei University, Image and Video Pattern Recognition Lab
Hyeongmin Lee
GAN??
Common misconceptions about GAN
GAN??
MAGIC!!
An algorithm that generates images
An algorithm that augments your data
GAN??
???
Discovering the Distribution!!
Discriminative Model: Sample → Which Label??
Generative Model: Sample → New Sample
GAN is a Generative Model.
Kullback – Leibler Divergence
How similar are two distributions?
https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained
VS
How much information is lost?
Information content: $I_i = -\log p(x_i)$
Entropy: $H = E[I_i] = -\sum_{i=1}^{N} p(x_i) \log p(x_i)$
$$D_{KL}(p \,\|\, q) = E[\log p(x) - \log q(x)] = \sum_{i=1}^{N} p(x_i) \log \frac{p(x_i)}{q(x_i)}$$
The expected amount of information lost!!
$D_{KL} = 0.338 \qquad D_{KL} = 0.477$
But!! KL-Divergence is an asymmetric function!!
So we use the Jensen–Shannon Divergence instead.
$$JSD(p \,\|\, q) = \frac{1}{2} KL\left(p \,\middle\|\, \frac{p+q}{2}\right) + \frac{1}{2} KL\left(q \,\middle\|\, \frac{p+q}{2}\right)$$
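The three quantities above are easy to check numerically. Below is a minimal sketch (not from the slides) that computes entropy, both directions of KL, and JSD for two small discrete distributions; the example distributions themselves are hypothetical, not the ones from the linked blog post.

```python
# Minimal numerical sketch: entropy, KL divergence, and JSD for discrete p, q.
import numpy as np

def entropy(p):
    """H(p) = -sum_i p_i * log p_i"""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p))

def kl(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i) -- asymmetric in p and q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

def jsd(p, q):
    """JSD(p || q) = 0.5*KL(p || m) + 0.5*KL(q || m), with m = (p + q)/2 -- symmetric."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = np.array([0.36, 0.48, 0.16])   # hypothetical example distribution
q = np.array([1/3, 1/3, 1/3])      # uniform distribution over 3 outcomes

print(entropy(p))                  # Shannon entropy of p
print(kl(p, q), kl(q, p))          # the two directions generally differ
print(jsd(p, q), jsd(q, p))        # identical values: JSD is symmetric
```

Running it shows that kl(p, q) and kl(q, p) generally differ while jsd(p, q) equals jsd(q, p), which is exactly the asymmetry issue raised above.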
Go back to GAN…
$z \sim p_z(z) \;\xrightarrow{\;G\;}\; x \sim p_g(x), \qquad x \sim p_{data}(x)$ (real data)
Fix G first! Let's optimize D!
$$V(G, D) = E_{x \sim p_{data}(x)}[\log D(x)] + E_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
$$= E_{x \sim p_{data}(x)}[\log D(x)] + E_{x \sim p_g(x)}[\log(1 - D(x))] \qquad (\because G(z) \sim p_g(x))$$
$$= \int_x p_{data}(x) \log D(x) + p_g(x) \log(1 - D(x)) \, dx$$
Differentiate with respect to D!!
$$D_G^*(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$
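The jump from the integral to $D_G^*$ is a pointwise maximization, as in the original GAN paper's proof; a short worked version of that step, with $a$ and $b$ introduced here as shorthand:

```latex
% For a fixed x, let a = p_data(x), b = p_g(x), and y = D(x) in (0, 1).
% The integrand is f(y) = a*log(y) + b*log(1 - y).
\[
f'(y) = \frac{a}{y} - \frac{b}{1 - y} = 0
\quad\Longrightarrow\quad
y^{*} = \frac{a}{a + b}
      = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)} = D_G^{*}(x).
\]
```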
Now, assuming D is optimal, let's optimize G!!
$$C(G) = V(G, D^*)$$
$$= E_{x \sim p_{data}(x)}[\log D^*(x)] + E_{x \sim p_g(x)}[\log(1 - D^*(x))]$$
$$= E_{x \sim p_{data}(x)}\left[\log \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}\right] + E_{x \sim p_g(x)}\left[\log \frac{p_g(x)}{p_{data}(x) + p_g(x)}\right]$$
$$= -\log 4 + E_{x \sim p_{data}(x)}\left[\log p_{data}(x) - \log \frac{p_{data}(x) + p_g(x)}{2}\right] + E_{x \sim p_g(x)}\left[\log p_g(x) - \log \frac{p_{data}(x) + p_g(x)}{2}\right]$$
$$= -\log 4 + KL\left(p_{data} \,\middle\|\, \frac{p_{data} + p_g}{2}\right) + KL\left(p_g \,\middle\|\, \frac{p_{data} + p_g}{2}\right)$$
$$= -\log 4 + 2 \times JSD(p_{data} \,\|\, p_g)$$
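One consequence worth spelling out (a standard fact about JSD, not stated on the slide): since $JSD \ge 0$ with equality exactly when the two distributions coincide,

```latex
\[
C(G) = -\log 4 + 2\,JSD(p_{data} \,\|\, p_g) \;\ge\; -\log 4,
\qquad
C(G) = -\log 4 \;\iff\; p_g = p_{data},
\]
```

so the optimal generator exactly reproduces the data distribution.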
The theoretical GAN training scenario…
Optimize D → Optimize G → Optimize D → …
https://www.youtube.com/watch?v=RlAgB0Ooxaw&list=PLlMkM4tgfjnJhhd4wn5aj8fVTYJwIpWkS&index=49
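A minimal sketch of that alternating loop in PyTorch (chosen because the code links below include a PyTorch collection). The toy 1-D Gaussian data, the tiny MLPs, the batch size, and the learning rates are all assumptions for illustration; the generator update uses the common non-saturating variant (maximize $\log D(G(z))$) rather than the $\log(1 - D(G(z)))$ form written above.

```python
# Minimal sketch of the alternating "Optimize D -> Optimize G -> ..." loop
# for the value function V(G, D) above. Toy data and hyperparameters are
# assumptions, not the configuration of any referenced paper.
import torch
import torch.nn as nn

latent_dim = 8
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 3.0      # x ~ p_data(x): toy 1-D Gaussian
    z = torch.randn(64, latent_dim)            # z ~ p_z(z)
    fake = G(z)                                # x ~ p_g(x)

    # --- Optimize D: maximize log D(x) + log(1 - D(G(z))) ---
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # --- Optimize G: non-saturating variant, maximize log D(G(z)) ---
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()
```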
However…
Real data is not spread widely across the space
Measure: "the size of a set"
Ex: On the real line, the Measure of (0,1) = 1
In $R^2$, the Measure of (0,1)×(0,1) = 1
But in $R^2$, the Measure of {(x, y) : 0 < x < 1, y = 0} = 0
In short, a region of lower dimension than the whole space has Measure 0.
Real data is not spread widely across the space
Both the Real and Fake data regions have Measure zero
→ So the intersection of the two regions also has Measure zero
On $\mathrm{Supp}(P_r)$: $D(x) = 1$, $\nabla D(x) = 0$. On $\mathrm{Supp}(P_g)$: $D(x) = 0$, $\nabla D(x) = 0$.
So a perfect Discriminator that separates Real from Fake always exists.
And on both supports, $\nabla D(x) = 0$.
Vanishing Gradient
If D's gradient is 0, then G's gradient, which flows through D by the chain rule, is also 0.
→ Training becomes impossible.
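A one-line worked version of that statement, writing $\theta$ for the generator parameters (notation assumed here):

```latex
\[
\nabla_{\theta} \log\bigl(1 - D(G_{\theta}(z))\bigr)
= -\frac{1}{1 - D(G_{\theta}(z))}\,
  \nabla_{x} D(x)\big|_{x = G_{\theta}(z)}\,
  \frac{\partial G_{\theta}(z)}{\partial \theta}
= 0
\quad\text{wherever}\quad \nabla_{x} D(x) = 0.
\]
```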
Solution 1.
Add noise to both the Real and Fake data, so that $\mathrm{Supp}(P_r)$ and $\mathrm{Supp}(P_g)$ spread out and overlap.
But since noise is added, output quality naturally degrades!!
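A minimal sketch of Solution 1 in code ("instance noise" style); the noise scale and where it is applied are assumptions for illustration:

```python
# Solution 1 sketch: add Gaussian noise to both real and fake samples before
# they reach D, so the two (widened) supports overlap. The scale sigma and its
# schedule are hypothetical; in practice sigma is often annealed toward 0.
import torch

sigma = 0.1  # hypothetical noise scale

def noisy(x, sigma):
    """Return x plus Gaussian noise -- applied to real AND fake batches before D."""
    return x + sigma * torch.randn_like(x)

# inside the training loop sketched earlier:
# d_loss = bce(D(noisy(real, sigma)), ones) + bce(D(noisy(fake.detach(), sigma)), zeros)
```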
Solution 2.
Use a different distance instead of JSD.
Infimum = Greatest Lower Bound
$\inf(0,1) = 0$, $\inf[0,1] = 0$: the infimum need not be attained inside the set.
→ Impossible to implement programmatically as written → Wasserstein GAN
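A minimal WGAN-style sketch under the same toy setup as the earlier loop. The critic estimates the Wasserstein distance through its dual form (a supremum over 1-Lipschitz functions) instead of the intractable infimum over joint distributions; weight clipping crudely enforces the Lipschitz constraint, as in the WGAN paper. Network sizes, clip range, and learning rate are assumptions.

```python
# Solution 2 sketch: Wasserstein GAN with weight clipping on a toy 1-D problem.
import torch
import torch.nn as nn

latent_dim = 8
critic = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # no sigmoid
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))
opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)

for step in range(1000):
    # --- Critic: maximize E[f(x_real)] - E[f(x_fake)] ---
    for _ in range(5):                              # several critic steps per G step
        real = torch.randn(64, 1) * 0.5 + 3.0
        fake = G(torch.randn(64, latent_dim)).detach()
        loss_c = -(critic(real).mean() - critic(fake).mean())
        opt_c.zero_grad()
        loss_c.backward()
        opt_c.step()
        for p in critic.parameters():               # weight clipping crudely enforces
            p.data.clamp_(-0.01, 0.01)              # the 1-Lipschitz constraint

    # --- Generator: maximize E[f(G(z))] ---
    fake = G(torch.randn(64, latent_dim))
    loss_g = -critic(fake).mean()
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```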
Solution 3.
Progressive GAN (jokingly, "a monster born of capitalism")
Research on how to train GANs is still very much ongoing…
GAN/VAE comparison and code
Tensorflow: https://github.com/hwalsuklee/tensorflow-generative-model-collections
PyTorch: https://github.com/znxlwm/pytorch-generative-model-collections
GAN Loss Function family (training stabilization)
• GAN (NIPS 2014)
• Least Square GAN (LSGAN, ICCV 2017)
• Wasserstein GAN (WGAN, ICML 2017)
• Improved Training of Wasserstein GANs (WGAN_GP, NIPS 2017)
• Improving the Improved Training of Wasserstein GANs (CT-GAN, ICLR 2018)
cGAN / Application family (Conditional Generation)
• Conditional GAN (cGAN, 2014)
• Image to Image Translation with Conditional Adversarial Networks (Pix2Pix, CVPR 2017)
• Learning to Discover Cross-Domain Relations with GAN (DiscoGAN, ICML 2017)
• Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (CycleGAN, ICCV 2017)
→ Uses the LSGAN loss
• Toward Multimodal Image-to-Image Translation (BiCycleGAN, NIPS 2017)
→ Uses the LSGAN loss / adds diversity to the image-to-image translation
• High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs (Pix2PixHD, 2017)
→ Uses the LSGAN loss / generates high-resolution images
• StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation (CVPR 2018)
→ Uses the WGAN_GP loss / learns translation across multiple domains simultaneously
• Progressive Growing of GANs for Improved Quality, Stability and Variation (ProGAN, ICLR 2018)
→ Uses the WGAN_GP loss / maximizes generation quality through experimental methods
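To make the "which loss does each paper use" annotations concrete, here is a sketch of the three discriminator objectives that recur in the lists above (vanilla GAN, LSGAN, WGAN-GP), written for a discriminator/critic that outputs raw scores; the target values and the penalty weight follow common usage and are assumptions here, not exact reproductions of any one paper.

```python
# Sketch of the three discriminator-side objectives referenced above.
import torch
import torch.nn.functional as F

def vanilla_gan_d_loss(d_real, d_fake):
    # Original GAN: binary cross-entropy on sigmoid(D); d_* are raw logits.
    return F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) + \
           F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))

def lsgan_d_loss(d_real, d_fake):
    # LSGAN: least-squares targets (1 for real, 0 for fake) instead of log loss.
    return ((d_real - 1) ** 2).mean() + (d_fake ** 2).mean()

def wgan_gp_d_loss(D, x_real, x_fake, lam=10.0):
    # WGAN-GP: Wasserstein critic loss plus a gradient penalty on interpolated points.
    eps = torch.rand(x_real.size(0), 1)
    x_hat = (eps * x_real + (1 - eps) * x_fake).requires_grad_(True)
    grad = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)[0]
    gp = ((grad.norm(2, dim=1) - 1) ** 2).mean()
    return D(x_fake).mean() - D(x_real).mean() + lam * gp
```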
Conclusion?
If we understand GAN only as a convenient, magical tool that produces whatever images we want, there is a limit to how deep our research can go.
Many areas in our field also demand a substantial mathematical background, and being good at the math makes your papers look that much cooler.