그림 그리는 AI (AI That Draws Pictures)
NAVER Clova, Hwalsuk Lee (이활석)
Recognition
Technology that finds the information present in an image
Face recognition, iris recognition, license-plate recognition, fingerprint recognition
Generation
Technology that generates an image containing specific information
Style transfer (style image)
Computer Vision Technology (INTRODUCTION)
Image Recognition
• ImageNet competition
  • 1,000 object classes
• Goal: classify the objects present in an image
• Human error rate ≈ 5%
• From 2015, models started to beat humans (the competition ended in 2017)
Big data + deep learning > humans
1 / 6
RECOGNITION
Competition Status (IMAGENET)
Why do humans make 5% errors? Label errors and very fine-grained classes
A network can learn the features of a fixed set of objects and tell them apart!
Deep neural network
A model that distinguishes dogs from cats
RECOGNITION
What has AI become good at since 2015? (IMAGENET) 2 / 6
RECOGNITION
Image Tag Generation (APPLICATION)
Finding tags relevant to an image
https://www.clarifai.com/demo
3 / 6
RECOGNITION
Object Classification (APPLICATION)
• Classification by rectangular region (R-CNN, '15.04)
• Classification at the pixel level (DeepLab, '17.03)
4 / 6
RECOGNITION
Body Part Classification (APPLICATION)
• OpenPose, '17.04
Slow-motion video of pitcher Oh Seung-hwan (오승환)
https://github.com/CMU-Perceptual-Computing-Lab/openpose
5 / 6
Face + body + hands
Image Generation
Deep neural network
A model that can generate dogs and cats
It understands dogs and cats more thoroughly than a classification model does
Generative	Model
“What I cannot create, I do not understand.” - Richard Feynman
1 / 13
GENERATION
What have AI researchers become interested in since 2015? (INTRODUCTION)
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, '16.12
Try to spot the photos generated by AI!
(1) This flower has overlapping pink pointed petals surrounding a ring of short yellow filaments
(2) This flower has upturned petals which are thin and orange with rounded edges
(3) A flower with small pink petals and a massive central orange and black stamen cluster
(1) (2) (3)
2 / 13
GENERATION
Quiz: StackGAN (INTRODUCTION)
3 / 13
GENERATION
Generative Models (INTRODUCTION)
Supervised vs. Unsupervised
• Discriminative model: maps data x (an image) to class probabilities (e.g., 0.1, 0.3, 0, …, 0.4 over cat, dog, fox, snake); supervised learning.
• Generative model: maps a latent vector z (also called noise, code, or feature; e.g., 0.1, 0.7, -2, …, 1.4) to data x (an image); unsupervised learning.
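To make the contrast concrete, here is a minimal NumPy sketch of the two directions described above. It is not from the slides; the toy dimensions and the single linear layers standing in for real networks are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discriminative model: image x -> probabilities over {cat, dog, fox, snake}
D_in, n_classes = 784, 4                        # e.g. a flattened 28x28 image
W_disc = rng.normal(size=(n_classes, D_in)) * 0.01
x = rng.random(D_in)                            # a dummy input image
logits = W_disc @ x
probs = np.exp(logits) / np.exp(logits).sum()   # softmax: p(y|x)
print("p(y|x):", probs)                         # sums to 1 over the four classes

# Generative model: latent vector z (noise / code / feature) -> image x
z_dim = 5
W_gen = rng.normal(size=(D_in, z_dim)) * 0.1
z = rng.normal(size=z_dim)                      # z ~ p(z)
x_generated = 1 / (1 + np.exp(-(W_gen @ z)))    # pixel intensities in (0, 1)
print("generated image shape:", x_generated.shape)
```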
4 / 13
GENERATION
Generative Models (INTRODUCTION)
Density	Estimation
https://github.com/mingyuliutw/cvpr2017_gan_tutorial/blob/master/gan_tutorial.pdf
Training examples → (maximum likelihood) → estimated density p(x) → sampling
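As a deliberately simple illustration of this pipeline, the sketch below fits a 1-D Gaussian p(x) to training examples by maximum likelihood and then samples from the fitted density. The Gaussian model family is my assumption for the sake of the example, not something the slide specifies.

```python
import numpy as np

rng = np.random.default_rng(1)

# Training examples drawn from an unknown data distribution
x_train = rng.normal(loc=2.0, scale=0.5, size=1000)

# Maximum likelihood for a Gaussian model p(x) = N(mu, sigma^2):
# the MLE is the sample mean and the (biased) sample standard deviation.
mu_hat = x_train.mean()
sigma_hat = x_train.std()
print(f"fitted p(x): N(mu={mu_hat:.3f}, sigma={sigma_hat:.3f})")

# Sampling from the estimated density p(x)
new_samples = rng.normal(loc=mu_hat, scale=sigma_hat, size=5)
print("samples from p(x):", np.round(new_samples, 3))
```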
5 / 13
GENERATION
Generative Models (INTRODUCTION)
Taxonomy
http://www.iangoodfellow.com/slides/2016-12-04-NIPS.pdf
• VAE: assume a model for the distribution, then adjust the parameters that define that model so that it matches the target distribution.
• GAN: directly push the distribution of the generated samples toward the target distribution.
VAE: Variational AutoEncoder
Variational: variational learning/inference is used to obtain p(x)
AutoEncoder
1 / 33
GENERATION
Meaning of the Name (VAE)
Auto-Encoding Variational Bayes , ‘14.03
Latent variable z → Generator g_θ(z) → target data x
• The generator outputs the parameters that define a probability distribution over x (e.g., mean and standard deviation in the Gaussian case).
• Two parameter settings θ1 and θ2 can be compared by the likelihood they assign to the data: p(x | g_θ1(z)) vs. p(x | g_θ2(z)).
• Sampling: z ~ p(z); modeling: p(x | g_θ(z))
• ∫ p(x | g_θ(z)) p(z) dz = p(x)
• Variational learning: ∫ p(x | g_θ(z)) p(z) dz = E_{p(z)}[p_θ(x|z)]
Variational: variational learning/inference is used to obtain p(x)
AutoEncoder
2 / 33
GENERATION
Meaning of the Name (VAE)
Auto-Encoding Variational Bayes , ‘14.03
z ~ p(z) (sampling) → Generator g_θ(z)
• AutoEncoder: a network whose output is trained to reproduce its input
• p(z|x): so that z values which generate x well are sampled, x is provided as a hint
• p(z | q_φ(x)): assume a model for p(z|x) and estimate the parameters that define it with a network
Encoder q_φ(·): x → q_φ(x) → sample z
Decoder g_θ(·): z → g_θ(z) → x
3 / 33
GENERATION
Variational Inference (VAE)
Variational Lower Bound (Evidence Lower BOund, ELBO)
log p(x) = ELBO(φ) + KL(q_φ(z|x) ‖ p(z|x))
Since log p(x) does not depend on φ, raising ELBO(φ) (e.g., from φ1 to φ2) lowers the KL term.
Optimization Problem 1, on φ: Variational Inference
argmin_φ KL(q_φ(z|x) ‖ p(z|x)) = argmax_φ ELBO(φ)
log p(x) ≥ 𝔼_{q_φ(z|x)}[log p_θ(x|z)] − KL(q_φ(z|x) ‖ p(z)) = ELBO
Optimization Problem 2, on θ: Variational Learning
argmax_θ 𝔼_{q_φ(z|x)}[log p_θ(x|z)] = argmax_θ ELBO(θ)
Final Optimization Problem: argmax_{θ,φ} ELBO(θ, φ)
• KL: Kullback-Leibler divergence, a measure of how different two probability distributions are
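To ground the KL term, here is a small NumPy check of my own (not from the slides): for a diagonal Gaussian q(z|x) = N(μ, σ²) and the prior p(z) = N(0, I), the closed form KL(q‖p) = ½ Σ (μ² + σ² − log σ² − 1) agrees with a Monte Carlo estimate of E_q[log q(z) − log p(z)].

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])          # mean of q(z|x)
sigma = np.array([0.8, 1.2])        # std of q(z|x); prior p(z) = N(0, I)

# Closed-form KL(q || p) for a diagonal Gaussian vs. a standard normal prior
kl_closed = 0.5 * np.sum(mu**2 + sigma**2 - np.log(sigma**2) - 1.0)

# Monte Carlo estimate: E_q[log q(z) - log p(z)]
z = mu + sigma * rng.normal(size=(200_000, 2))   # z ~ q(z|x)
log_q = -0.5 * (((z - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma**2)).sum(axis=1)
log_p = -0.5 * (z**2 + np.log(2 * np.pi)).sum(axis=1)
kl_mc = np.mean(log_q - log_p)

print(f"closed form: {kl_closed:.4f}, Monte Carlo: {kl_mc:.4f}")  # should match closely
```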
Per-sample loss (a numerical sketch follows after the list of VAE properties below):
L_i(φ, θ, x_i) = −𝔼_{q_φ(z|x_i)}[log p_θ(x_i|z)] + KL(q_φ(z|x_i) ‖ p(z))
• Reconstruction error: the cross-entropy between the input and the output
• Regularization: how far q_φ(z|x_i) is from the prior distribution p(z)
[ Properties of the VAE ]
1. The decoder becomes able to generate at least the training data.
→ Generated data resembles the training data.
2. The encoder becomes able to represent at least the training data well as latent vectors.
→ Widely used for abstracting (featurizing) data.
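A minimal sketch of the per-sample loss above in NumPy, assuming a Bernoulli decoder (so the reconstruction term is a binary cross-entropy) and a single Monte Carlo sample of z; the pixel values and encoder outputs below are placeholders, not real network outputs.

```python
import numpy as np

def vae_loss(x, x_recon, mu, log_var):
    """Per-sample VAE loss: reconstruction error + KL regularizer.

    x        : target pixels in [0, 1]           (shape: D)
    x_recon  : decoder output probabilities      (shape: D)
    mu       : encoder mean of q(z|x)            (shape: |z|)
    log_var  : encoder log-variance of q(z|x)    (shape: |z|)
    """
    eps = 1e-7
    # -E_q[log p(x|z)] with a Bernoulli decoder = binary cross-entropy
    recon = -np.sum(x * np.log(x_recon + eps) + (1 - x) * np.log(1 - x_recon + eps))
    # KL(q(z|x) || N(0, I)) in closed form
    kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var))
    return recon + kl

# Placeholder values standing in for real encoder/decoder outputs
rng = np.random.default_rng(0)
x = (rng.random(784) > 0.5).astype(float)       # a fake binarized 28x28 image
x_recon = np.clip(rng.random(784), 0.01, 0.99)
mu, log_var = rng.normal(size=2), rng.normal(size=2)
print("loss:", vae_loss(x, x_recon, mu, log_var))
```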
4 / 33
GENERATION
Encoder & Decoder (VAE)
x_i → Encoder q_φ(z|x_i) → μ_i, σ_i
ε_i ~ N(0, I)
z_i = μ_i + σ_i ⊙ ε_i
z_i → Decoder p_θ(x|z) → y_i = x_i
Decoder output distribution: Bernoulli or Gaussian
Cost function: the per-sample loss above
Architecture: 28 × 28 input and output (D = 784), MLP with 2 hidden layers (500, 500)
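The architecture described above (784-dimensional input, MLP encoder/decoder with two hidden layers of 500 units, Gaussian q(z|x) sampled via z = μ + σ·ε with ε ~ N(0, I), Bernoulli output) could be sketched in PyTorch roughly as follows. This is my paraphrase of the slide, not code from the linked repository.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=500, z_dim=2):
        super().__init__()
        # Encoder q_phi(z|x): outputs mu_i and log sigma_i^2
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.log_var = nn.Linear(h_dim, z_dim)
        # Decoder p_theta(x|z): Bernoulli parameter per pixel
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, log_var = self.mu(h), self.log_var(h)
        eps = torch.randn_like(mu)                  # eps ~ N(0, I)
        z = mu + torch.exp(0.5 * log_var) * eps     # reparameterization trick
        return self.dec(z), mu, log_var

def loss_fn(x, x_recon, mu, log_var):
    recon = nn.functional.binary_cross_entropy(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl

model = VAE()
x = torch.rand(16, 784)                  # placeholder batch of 28x28 images
x_recon, mu, log_var = model(x)
print(loss_fn(x, x_recon, mu, log_var))
```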
5 / 33
GENERATION
Result: MNIST (VAE)
Encoder = posterior / inference network: q_φ(z|x), outputs μ_i, σ_i
Sampling z ~ q_φ(z|x) gives a point in the latent space
Decoder = generator / generation network: g_θ(x|z)
Reproduce: input image vs. reconstructions with |z| = 2, |z| = 5, |z| = 20
https://github.com/hwalsuklee/tensorflow-mnist-VAE
6 / 33
GENERATION
Result: MNIST (VAE)
Denoising
Input image + zero-masking noise with 50% prob.
+ salt & pepper noise with 50% prob.
Restored image
https://github.com/hwalsuklee/tensorflow-mnist-VAE
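The two corruption types above can be reproduced with a few lines of NumPy; the 50% figures follow the slide, while the array shape and seeding are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((28, 28))                       # a stand-in MNIST image in [0, 1]

# Zero-masking noise: each pixel is set to 0 with 50% probability
zero_masked = x * (rng.random(x.shape) >= 0.5)

# Salt & pepper noise: each pixel is forced to 0 or 1 with 50% probability
corrupt = rng.random(x.shape) < 0.5
salt = rng.random(x.shape) < 0.5
salt_pepper = np.where(corrupt, salt.astype(float), x)

print(zero_masked.mean(), salt_pepper.mean())
```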
7 / 33
GENERATION
Result: MNIST (VAE)
Learned Manifold
The better the training, the more the z's that generate the same digit should cluster together in the 2-D latent space, while the z's that generate different digits should lie far apart.
https://github.com/hwalsuklee/tensorflow-mnist-VAE
8 / 33
GENERATION
Result: MNIST (VAE)
Encoding / Feature Extraction, Decoding / Generation
9 / 33
GENERATION
DEMO: MNIST / Gray Face (VAE)
http://www.dpkingma.com/sgvb_mnist_demo/demo.html
|z|=12
24
24
Handwritten Digits Generation
http://vdumoulin.github.io/morphing_faces/online_demo.html
|z|=29
64
64
Gray Face Generation
https://magenta.tensorflow.org/sketch-rnn-demo
https://magenta.tensorflow.org/assets/sketch_rnn_demo/multi_vae.html
10 / 33
GENERATION
DEMO: Sketch RNN (VAE)
Introduction (CVAE)
Conditional	VAE
11 / 33
GENERATION
input
CVAE,	epoch	1
VAE,	epoch	1
CVAE,	epoch	20
VAE,	epoch	20
Reproduce |z|	=	2
MNIST Results (CVAE)
https://github.com/hwalsuklee/tensorflow-mnist-CVAE
12 / 33
GENERATION
CVAE,	epoch	1
VAE,	epoch	1
CVAE,	epoch	20
VAE,	epoch	20
input
Denoising |z|	=	2
MNIST Results (CVAE)
https://github.com/hwalsuklee/tensorflow-mnist-CVAE
13 / 33
GENERATION
Handwriting	styles	obtained	by	fixing	the	class	label	and	varying	z |z|	=	2
y=[1,0,0,0,0,0,0,0,0,0] y=[0,1,0,0,0,0,0,0,0,0] y=[0,0,1,0,0,0,0,0,0,0] y=[0,0,0,1,0,0,0,0,0,0] y=[0,0,0,0,1,0,0,0,0,0]
y=[0,0,0,0,0,0,1,0,0,0]y=[0,0,0,0,0,1,0,0,0,0] y=[0,0,0,0,0,0,0,0,1,0]y=[0,0,0,0,0,0,0,1,0,0] y=[0,0,0,0,0,0,0,0,0,1]
MNIST Results (CVAE)
https://github.com/hwalsuklee/tensorflow-mnist-CVAE
14 / 33
GENERATION
Z-sampling
In each row, z is held fixed and only the label information is changed to generate the image (the style is preserved while only the digit changes).
MNIST Results (CVAE)
Analogies	:	Result	in	paper
Semi-Supervised Learning with Deep Generative Models : https://arxiv.org/abs/1406.5298
15 / 33
GENERATION
Rows: z_1, z_2, z_3, z_4; columns: c_0, c_1, …, c_9
Handwriting style for a given z must be preserved for all labels
Analogies, |z| = 2
MNIST Results (CVAE)
https://github.com/hwalsuklee/tensorflow-mnist-CVAE
c_0, c_1, …, c_9
Real handwritten image
A real handwritten '3' is fed into the CVAE together with its label; the resulting latent vector is then used as the fixed decoder input while only the label information is varied (a sketch of this procedure follows below).
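A sketch of how such an analogy row could be produced, assuming an already-trained conditional encoder/decoder pair with the interfaces shown; the function names and signatures are hypothetical, chosen only to make the idea concrete. Encode a real '3' together with its one-hot label, then decode the resulting fixed z with every label c_0 … c_9.

```python
import numpy as np

def one_hot(label, n_classes=10):
    y = np.zeros(n_classes)
    y[label] = 1.0
    return y

def cvae_analogy(x_real, label, encode, decode):
    """encode(x, y) -> z and decode(z, y) -> image are assumed to come from
    a trained CVAE; only the label y is varied while z stays fixed."""
    z = encode(x_real, one_hot(label))                   # latent code of the real '3'
    return [decode(z, one_hot(c)) for c in range(10)]    # same style, digits 0..9

# Dummy stand-ins so the sketch runs end to end (replace with a real model):
rng = np.random.default_rng(0)
encode = lambda x, y: rng.normal(size=2)                 # |z| = 2 as in the slide
decode = lambda z, y: np.clip(np.outer(z, y).ravel(), 0, 1)
row = cvae_analogy(rng.random(784), label=3, encode=encode, decode=decode)
print(len(row), row[0].shape)
```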
16 / 33
GENERATION
GAN Generative Adversarial Networks
17 / 33
GENERATION
Meaning of the Name (GAN)
Generative Adversarial Networks , ‘14.06
Generative: a generative model
Adversarial: the generator and the discriminator act adversarially against each other
Networks: both the generator and the discriminator are modeled as deep neural networks
z → Generator G → G(z); a real sample x or G(z) → Discriminator D → a score between 0 and 1
“The	coolest	idea	in	ML	in	the	last	twenty	years” – Yann	LeCun
Generator and Discriminator
G-A-N:
• Generative: for the purpose of generating data
• Adversarial: trained adversarially
• Networks: two networks
The discriminator sees real photos (real data) and fake photos and decides real / fake.
The discriminator tries to judge the generator's output as fake; the generator tries to make the discriminator judge it as real.
Analogy: a liar (the generator) versus a lie detector (the discriminator), pitted against each other.
18 / 33
GENERATION
Example: Generating Flower Photos (GAN)
Noise (latent variable): the input that makes the generator produce a different sample each time
Generator → generated sample
Discriminator: real data or generator output? → Yes / No ("is this real data?")
Ground truth: the face DB
Training alternately improves the discriminator's performance and the generator's performance.
19 / 33
GENERATION
Example: Generating Face Photos (GAN)
Noise (latent variable) z ~ p_z(z) → Generator G → G(z)
Data sample x ~ p_data(x)
Discriminator D → Yes / No: the discriminator wants D(x) = 1 and D(G(z)) = 0, while the generator wants D(G(z)) = 1
Value function of the GAN:
V(D, G) = 𝔼_{x~p_data(x)}[log D(x)] + 𝔼_{z~p_z(z)}[log(1 − D(G(z)))]
Goal: D*, G* = min_G max_D V(D, G) (a minimax problem!)
20 / 33
GENERATION
Problem (GAN)
Notation
D step (G is fixed):
max_D V(D, G) = max_D 𝔼_{x~p_data}[log D(x)] + 𝔼_{z~p_z}[log(1 − D(G(z)))]
→ maximize the probability assigned to real samples, minimize the probability assigned to fakes.
G step (D is fixed):
min_G V(D, G) = min_G 𝔼_{z~p_z}[log(1 − D(G(z)))]
= max_G 𝔼_{z~p_z}[log D(G(z))]
→ maximize the probability of fakes being judged real.
Alternating Optimization
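A minimal PyTorch sketch of this alternating scheme, using my own toy setup on 1-D data rather than anything from the slides: the D step ascends E[log D(x)] + E[log(1 − D(G(z)))], and the G step uses the non-saturating form max_G E[log D(G(z))], both written with binary cross-entropy.

```python
import torch
import torch.nn as nn

# Toy 1-D GAN: real data ~ N(2, 0.5), generator maps z ~ N(0, 1) to a scalar
G = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.LeakyReLU(0.2), nn.Linear(16, 1))  # logits
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
real_label, fake_label = torch.ones(64, 1), torch.zeros(64, 1)

for step in range(2000):
    x_real = 2.0 + 0.5 * torch.randn(64, 1)
    z = torch.randn(64, 1)

    # --- D step (G fixed): maximize log D(x) + log(1 - D(G(z))) ---
    d_loss = bce(D(x_real), real_label) + bce(D(G(z).detach()), fake_label)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- G step (D fixed): non-saturating loss, maximize log D(G(z)) ---
    g_loss = bce(D(G(z)), real_label)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print("final fake mean:", G(torch.randn(1000, 1)).mean().item())  # drifts toward 2
```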
21 / 33
GENERATION
Problem (GAN)
Optimization
Value function of the GAN:
V(D, G) = 𝔼_{x~p_data(x)}[log D(x)] + 𝔼_{z~p_z(z)}[log(1 − D(G(z)))]
Goal: D*, G* = min_G max_D V(D, G)
At the optimum:
D*(x) = p_data(x) / (p_data(x) + p_g(x)), and p_g*(x) = p_data(x)
Optimization for D(x), then optimization for G(z); at equilibrium D(x) = 0.5.
G(z) maps z to x.
1-D Gaussian approximation example (curves: D(x), p_data, p_g(x))
22 / 33
GENERATION
Problem (GAN)
Optimal	Solution
http://cs.stanford.edu/people/karpathy/gan/
Uniform Distribution
https://github.com/hwalsuklee/tensorflow-GAN-1d-gaussian-ex
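The optimal-discriminator formula above can be checked numerically on a 1-D Gaussian example; the specific means and standard deviations below are arbitrary illustrative choices, unrelated to the linked demos.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-4, 8, 7)
p_data = gaussian_pdf(x, mu=2.0, sigma=0.5)     # real data density
p_g = gaussian_pdf(x, mu=0.0, sigma=1.0)        # generator's density

d_star = p_data / (p_data + p_g)                # optimal discriminator D*(x)
print(np.round(d_star, 3))

# At equilibrium p_g = p_data, so D*(x) = 0.5 everywhere:
print(np.unique(np.round(p_data / (p_data + p_data), 3)))   # -> [0.5]
```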
23 / 33
GENERATION
VAE vs. GAN (GAN)
Comparison (model / optimization / image quality / generalization):
• VAE
  - Optimization: stochastic gradient descent; converges to a local minimum; easier
  - Image quality: smooth, but blurry
  - Generalization: tends to remember input images
• GAN
  - Optimization: alternating stochastic gradient descent; converges to saddle points; harder (mode collapse, unstable convergence)
  - Image quality: sharp, but with artifacts
  - Generalization: generates new, unseen images
http://aliensunmin.github.io/project/accv16tutorial/media/generative.pdf
VAE: x → Encoder → z → Decoder → x
GAN: z → Generator G → G(z) → Discriminator D → score between 0 and 1
Comparison between VAE and GAN
VAE: a maximum-likelihood approach; GAN: an adversarial approach
http://videolectures.net/site/normal_dl/tag=1129740/deeplearning2017_courville_generative_models_01.pdf
24 / 33
GENERATION
VAE vs. GAN (GAN)
None of the papers introduced from here on deviates much from DCGAN.
DCGAN proposes a Deep Convolutional GAN architecture that trains stably in most situations (a sketch following these guidelines appears after the paper reference below):
• Pooling layers: not used; strided convolutions are used instead.
• Batch normalization: the generator uses it except in the output layer; the discriminator uses it except in the input layer.
• Fully connected hidden layers: not used.
• Activation functions: the generator uses ReLU for all layers except the output, which uses Tanh; the discriminator uses LeakyReLU for all layers.
25 / 33
GENERATION
Key Contribution (DCGAN)
Unsupervised Representation learning with Deep Convolutional Generative Adversarial Networks (DCGAN), ‘15.11
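Following the guidelines above, a generator obeying the DCGAN recipe might look like the sketch below. The layer sizes for a 64×64 output are my assumption based on the paper's common configuration, not something taken from the slide: fractionally-strided convolutions instead of pooling, batch norm on every layer except the output, ReLU everywhere except a Tanh output, and no fully connected hidden layers.

```python
import torch
import torch.nn as nn

class DCGANGenerator(nn.Module):
    """z (100-dim) -> 3 x 64 x 64 image, following the DCGAN guidelines."""
    def __init__(self, z_dim=100, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            # Strided transposed convolutions replace pooling / upsampling
            nn.ConvTranspose2d(z_dim, ch * 8, 4, 1, 0), nn.BatchNorm2d(ch * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 8, ch * 4, 4, 2, 1), nn.BatchNorm2d(ch * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1), nn.BatchNorm2d(ch * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.BatchNorm2d(ch), nn.ReLU(True),
            # Output layer: no batch norm, Tanh activation
            nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, z):                 # z: (N, z_dim, 1, 1)
        return self.net(z)

G = DCGANGenerator()
fake = G(torch.randn(8, 100, 1, 1))
print(fake.shape)                         # torch.Size([8, 3, 64, 64])
```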
26 / 33
GENERATION
Network Architecture (DCGAN)
Generator
Discriminator
27 / 33
GENERATION
Face Generation Results (DCGAN)
http://carpedm20.github.io/faces/
Celebrity face DB
LEVEL 2
LEVEL 1
Real
Real
28 / 33
GENERATION
Face Generation Results: Interpolation (DCGAN)
Results for new z values obtained by interpolating between z1 and z2
z1: the average of z values that produce men wearing sunglasses
z2: the average of z values that produce men without sunglasses
z3: the average of z values that produce women without sunglasses
z1, z2, z3
z1 − z2 + z3
→ women wearing sunglasses
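Both the interpolation and the vector-arithmetic experiments operate purely in latent space. The sketch below shows the bookkeeping, assuming a trained generator G(z) (here replaced by a dummy function) and sets of z vectors that were observed to produce each attribute; all of these stand-ins are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
G = lambda z: np.tanh(z)                  # stand-in for a trained DCGAN generator
z_dim = 100

# Interpolation between two latent vectors z1, z2
z1, z2 = rng.normal(size=z_dim), rng.normal(size=z_dim)
interpolated = [G((1 - t) * z1 + t * z2) for t in np.linspace(0, 1, 8)]

# Vector arithmetic: average the z's behind each attribute, then combine
z_men_glasses = rng.normal(size=(10, z_dim)).mean(axis=0)       # z1 in the slide
z_men_no_glasses = rng.normal(size=(10, z_dim)).mean(axis=0)    # z2
z_women_no_glasses = rng.normal(size=(10, z_dim)).mean(axis=0)  # z3
z_new = z_men_glasses - z_men_no_glasses + z_women_no_glasses   # z1 - z2 + z3
woman_with_glasses = G(z_new)
print(len(interpolated), woman_with_glasses.shape)
```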
29 / 33
GENERATION
Face Generation Results: Vector Arithmetic (DCGAN)
30 / 33
GENERATION
Character Generation Results (DCGAN)
http://mattya.github.io/chainer-DCGAN
Anime character DB
31 / 33
GENERATION
A List of Some Papers (GAN Zoo)
https://deephunt.in/the-gan-zoo-79597dc8c347
GAN — Generative Adversarial Networks
3D-GAN — Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
acGAN — Face Aging With Conditional Generative Adversarial Networks
AC-GAN — Conditional Image Synthesis With Auxiliary Classifier GANs
AdaGAN — AdaGAN: Boosting Generative Models
AEGAN — Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets
AffGAN — Amortised MAP Inference for Image Super-resolution
AL-CGAN — Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts
ALI — Adversarially Learned Inference
AMGAN — Generative Adversarial Nets with Labeled Data by Activation Maximization
AnoGAN — Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery
ArtGAN — ArtGAN: Artwork Synthesis with Conditional Categorial GANs
b-GAN — b-GAN: Unified Framework of Generative Adversarial Networks
Bayesian GAN — Deep and Hierarchical Implicit Models
BEGAN — BEGAN: Boundary Equilibrium Generative Adversarial Networks
BiGAN — Adversarial Feature Learning
BS-GAN — Boundary-Seeking Generative Adversarial Networks
CGAN — Conditional Generative Adversarial Nets
CCGAN — Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks
CatGAN — Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
CoGAN — Coupled Generative Adversarial Networks
Context-RNN-GAN — Contextual RNN-GANs for Abstract Reasoning Diagram Generation
C-RNN-GAN — C-RNN-GAN: Continuous recurrent neural networks with adversarial training
CS-GAN — Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets
CVAE-GAN — CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training
CycleGAN — Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
DTN — Unsupervised Cross-Domain Image Generation
DCGAN — Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
DiscoGAN — Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
DR-GAN — Disentangled Representation Learning GAN for Pose-Invariant Face Recognition
DualGAN — DualGAN: Unsupervised Dual Learning for Image-to-Image Translation
EBGAN — Energy-based Generative Adversarial Network
f-GAN — f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
GAWWN — Learning What and Where to Draw
GoGAN — Gang of GANs: Generative Adversarial Networks with Maximum Margin Ranking
GP-GAN — GP-GAN: Towards Realistic High-Resolution Image Blending
IAN — Neural Photo Editing with Introspective Adversarial Networks
iGAN — Generative Visual Manipulation on the Natural Image Manifold
IcGAN — Invertible Conditional GANs for image editing
ID-CGAN- Image De-raining Using a Conditional Generative Adversarial Network
Improved GAN — Improved Techniques for Training GANs
InfoGAN — InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
LAGAN — Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis
LAPGAN — Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
LR-GAN — LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation
LSGAN — Least Squares Generative Adversarial Networks
LS-GAN — Loss-Sensitive Generative Adversarial Networks on Lipschitz Densities
MGAN — Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
MAGAN — MAGAN: Margin Adaptation for Generative Adversarial Networks
MAD-GAN — Multi-Agent Diverse Generative Adversarial Networks
MalGAN — Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN
MaliGAN — Maximum-Likelihood Augmented Discrete Generative Adversarial Networks
MARTA-GAN — Deep Unsupervised Representation Learning for Remote Sensing Images
McGAN — McGan: Mean and Covariance Feature Matching GAN
MDGAN — Mode Regularized Generative Adversarial Networks
MedGAN — Generating Multi-label Discrete Electronic Health Records using Generative Adversarial Networks
MIX+GAN — Generalization and Equilibrium in Generative Adversarial Nets (GANs)
MPM-GAN — Message Passing Multi-Agent GANs
MV-BiGAN — Multi-view Generative Adversarial Networks
pix2pix — Image-to-Image Translation with Conditional Adversarial Networks
PPGN — Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
PrGAN — 3D Shape Induction from 2D Views of Multiple Objects
RenderGAN — RenderGAN: Generating Realistic Labeled Data
RTT-GAN — Recurrent Topic-Transition GAN for Visual Paragraph Generation
SGAN — Stacked Generative Adversarial Networks
SGAN — Texture Synthesis with Spatial Generative Adversarial Networks
SAD-GAN — SAD-GAN: Synthetic Autonomous Driving using Generative Adversarial Networks
SalGAN — SalGAN: Visual Saliency Prediction with Generative Adversarial Networks
SEGAN — SEGAN: Speech Enhancement Generative Adversarial Network
SeGAN — SeGAN: Segmenting and Generating the Invisible
SeqGAN — SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
SimGAN — Learning from Simulated and Unsupervised Images through Adversarial Training
SketchGAN — Adversarial Training For Sketch Retrieval
SL-GAN — Semi-Latent GAN: Learning to generate and modify facial images from attributes
Softmax-GAN — Softmax GAN
SRGAN — Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
S²GAN — Generative Image Modeling using Style and Structure Adversarial Networks
SSL-GAN — Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks
StackGAN — StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
TGAN — Temporal Generative Adversarial Nets
TAC-GAN — TAC-GAN — Text Conditioned Auxiliary Classifier Generative Adversarial Network
TP-GAN — Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View
Synthesis
Triple-GAN — Triple Generative Adversarial Nets
Unrolled GAN — Unrolled Generative Adversarial Networks
VGAN — Generating Videos with Scene Dynamics
VGAN — Generative Adversarial Networks as Variational Training of Energy Based Models
VAE-GAN — Autoencoding beyond pixels using a learned similarity metric
VariGAN — Multi-View Image Generation from a Single-View
ViGAN — Image Generation and Editing with Variational Info Generative AdversarialNetworks
WGAN — Wasserstein GAN
WGAN-GP — Improved Training of Wasserstein GANs
WaterGAN — WaterGAN: Unsupervised Generative Network to Enable Real-time Color Correction of Monocular Underwater Images
32 / 33
GENERATION
Explosive Growth (GAN Zoo)
https://deephunt.in/the-gan-zoo-79597dc8c347
Explosive growth — All the named GAN variants cumulatively since 2014. Credit: Bruno Gavranović
33 / 33
GENERATION
Collections of Generative Models (GAN Zoo)
https://github.com/soumith/talks/tree/master/2017-ICCV_Venice
https://github.com/hwalsuklee/tensorflow-generative-model-collections
ICCV 2017 GAN Tutorial
Applications
1 / 20
APPLICATIONS
Image Translation with Hints (CONDITIONAL GENERATION)
Generator: style transfer
Generator: sketch → photo
Generator: automatic colorization
2 / 20
APPLICATIONS
Style Transfer (CONDITIONAL GENERATION)
Fast Neural Style, '16.04
Style
3 / 20
APPLICATIONS
Style Transfer (CONDITIONAL GENERATION)
Artistic style transfer for videos, '16.10
Video
4 / 20
APPLICATIONS
Sketch → Photo (CONDITIONAL GENERATION)
pix2pix, '16.11 (https://affinelayer.com/pixsrv/index.html)
Building-facade photo DB, cat photo DB
Shoe photo DB, handbag photo DB
5 / 20
APPLICATIONS
Sketch → Colorization (CONDITIONAL GENERATION)
Automatic colorization, '17.01 (https://paintschainer.preferred.tech/)
6 / 20
APPLICATIONS
Sketch → Colorization (CONDITIONAL GENERATION)
Automatic colorization, '17.01 (https://paintschainer.preferred.tech/)
7 / 20
APPLICATIONS
Sketch → Colorization (CONDITIONAL GENERATION)
Automatic colorization, '17.01 (https://paintschainer.preferred.tech/)
Original
Hint
Working time < 1 minute
Ground truth
Colorized according to the color hints
8 / 20
APPLICATIONS
Real-Time User Input (CONDITIONAL GENERATION)
Colorized according to the color hints
Generative Visual Manipulation on the Natural Image Manifold, '16.09
① grass ② select ③ mountain ④ select ⑤ sky
9 / 20
APPLICATIONS
Real-Time User Input (CONDITIONAL GENERATION)
Colorized according to the color hints
Generative Visual Manipulation on the Natural Image Manifold, '16.09
Video
https://youtu.be/5jfViPdYLic
10 / 20
APPLICATIONS
FONT GENERATION
https://kaonashi-tyc.github.io/2017/04/06/zi2zi.html , ‘17.04.06
Random font generation; Chinese-character font → Hangul font
11 / 20
APPLICATIONS
FONT GENERATION
http://fontto.twiiks.co/demo/
Handwritten	Character	to	Font
12 / 20
Multimodal	Feature	Learner
StackGAN : Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks , ‘16.12
APPLICATIONS
StackGAN (GAN+VAE)
13 / 20
Learning a Probabilistic latent Space of Object Shapes via 3D Generative-Adversarial Modeling (3D-GAN), ‘16.10
Multimodal	Feature	Learner
APPLICATIONS
3D-GAN (GAN+VAE)
14 / 20
Denoising
SEGAN: Speech Enhancement Generative Adversarial Network ‘17. 03. 28
Nothing is safe.
There will be no repeat of that
performance, that I can guarantee.
before after
before after
APPLICATIONS
SEGAN (GAN+VAE)
15 / 20
Age	Progression/Regression	by	Conditional	Adversarial	Autoencoder
https://zzutk.github.io/Face-Aging-CAAE/
APPLICATIONS
Papers in CVPR 2017 (GAN+VAE)
16 / 20
Age	Progression/Regression	by	Conditional	Adversarial	Autoencoder
https://zzutk.github.io/Face-Aging-CAAE/
APPLICATIONS
Papers in CVPR 2017 (GAN+VAE)
17 / 20
Hallucinating	Very	Low-Resolution	Unaligned	and	Noisy	Face	Images	by	Transformative	Discriminative	Autoencoders
http://www.porikli.com/mysite/pdfs/porikli%202017%20-%20Hallucinating%20very%20low-
resolution%20unaligned%20and%20noisy%20face%20images%20by%20transformative%20discriminative%20autoencoders.pdf
APPLICATIONS
Papers in CVPR 2017 (GAN+VAE)
16x16 à 128x128
18 / 20
Hallucinating	Very	Low-Resolution	Unaligned	and	Noisy	Face	Images	by	Transformative	Discriminative	Autoencoders
http://www.porikli.com/mysite/pdfs/porikli%202017%20-%20Hallucinating%20very%20low-
resolution%20unaligned%20and%20noisy%20face%20images%20by%20transformative%20discriminative%20autoencoders.pdf
APPLICATIONS
Papers in CVPR 2017 (GAN+VAE)
TUN	loss
DL	loss
TE	loss
19 / 20
A	Generative	Model	of	People	in	Clothing
https://arxiv.org/abs/1705.04098
APPLICATIONS
Papers in ICCV 2017 (GAN+VAE)
20 / 20
The	Conditional	Analogy	GAN:	Swapping	Fashion	Articles	on	People	Images
http://openaccess.thecvf.com/content_ICCV_2017_workshops/w32/html/Jetchev_The_Conditional_Analogy_ICCV_2017_paper.html
APPLICATIONS
Papers in ICCV 2017 (Conditional GAN)
Closing
1 / 6
CLOSING
Try to spot the real photos! (FACE GENERATION)
DCGAN ‘15.11.19 64 x 64 pixels
BEGAN ‘17.03.31 128 x 128 pixels
4× improvement in resolution
https://arxiv.org/abs/1703.10717
2 / 6
CLOSING
DCGAN ‘15.11.19 64 x 64 pixels
Create Anime Characters with A.I. ‘17.08.14 128 x 128 pixels
4× improvement in resolution
http://make.girls.moe/#/
Result Comparison (CHARACTER GENERATION)
3 / 6
CLOSING
Deep Feature Interpolation (CVPR 2017) 1000 x 1000 pixels
https://github.com/paulu/deepfeatinterp
Face Images (CONDITIONAL GENERATION)
4 / 6
CLOSING
Try to spot the real photos! (FACE GENERATION)
BEGAN ‘17.03.31 128 x 128 pixels
PGGAN ‘17.10.27
https://arxiv.org/abs/1710.10196
5 / 6
CLOSING
Try to spot the real photos! (FACE GENERATION)
PGGAN ‘17.10.27 1024 x 1024 pixels
16 × 16 (= 256×) improvement in resolution
HARDWARE
6 / 6
CLOSING
What is behind the rapid progress of computer vision technology?
BIG
DATA
HARDWARE
Shared papers, shared code, shared development platforms
A culture of sharing
End of Document
List of Web Demos
• Classification: https://www.clarifai.com/demo
• Segmentation: http://www.robots.ox.ac.uk/~szheng/crfasrnndemo
• VAE MNIST: http://www.dpkingma.com/sgvb_mnist_demo/demo.html
• VAE gray face: http://vdumoulin.github.io/morphing_faces/online_demo.html
• VAE sketch-rnn: https://magenta.tensorflow.org/assets/sketch_rnn_demo/multi_vae.html
• 1-D GAN: http://cs.stanford.edu/people/karpathy/gan/
• DCGAN Asian face: https://carpedm20.github.io/faces/
• DCGAN character generation: http://mattya.github.io/chainer-DCGAN
• Style transfer: http://demos.algorithmia.com/deep-style/
• pix2pix: https://affinelayer.com/pixsrv/index.html
• Colorization: https://paintschainer.preferred.tech/
• Font generation: http://fontto.twiiks.co/demo/
• Conditional character generation: http://make.girls.moe/#/