Photo-realistic Single Image Super-resolution using a
Generative Adversarial Network* (SRGAN)
ISL Lab Seminar
Hansol Kang
* Ledig, Christian, et al. "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Contents
Introduction
Review
SRGAN
Concept
Networks
Perceptual loss
Results
Experiment
Source Code
Set5, Set14, Custom
Summary
I. Introduction
Introduction
Review
Introduction
• Review - Concept of GAN
"Build a generator that fools the discriminator well."
1) Vanilla GAN: G generates an image; D classifies real or fake.
2) DCGAN: discovered the manipulability of the latent space.
3) InfoGAN: adds mutual information between the latent code z and the generated image.
4) LSGAN: uses MSE loss instead of BCE loss (pulls fake samples toward the decision boundary).
Introduction
• Review - Applications
Style Transfer, Inpainting, Super Resolution
II. SRGAN
SRGAN
Concept, Networks, Perceptual loss, Results
SRGAN
• Concept
GAN: "Make fake data that looks real." (G tries to fool D)
SRGAN: => "Make SR data that looks like the HR data." (G super-resolves; D compares SR against HR)
SRGAN
• Networks
Adversarial min-max objective (in SRGAN the LR input image plays the role of the latent code z):

$\min_{\theta_G}\max_{\theta_D}\; \mathbb{E}_{I^{HR}\sim p_{\mathrm{train}}(I^{HR})}\big[\log D_{\theta_D}(I^{HR})\big] + \mathbb{E}_{I^{LR}\sim p_G(I^{LR})}\big[\log\big(1 - D_{\theta_D}(G_{\theta_G}(I^{LR}))\big)\big]$
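A minimal PyTorch sketch of how this min-max objective is typically optimized by alternating D and G updates (netG, netD, optG, optD, lr_img, hr_img are placeholder names, not from the slides; the code assumes D ends in a sigmoid, as in the Discriminator shown later):

```python
import torch
import torch.nn.functional as F

def gan_step(netG, netD, optG, optD, lr_img, hr_img):
    # Discriminator update: maximize log D(I_HR) + log(1 - D(G(I_LR)))
    optD.zero_grad()
    sr_img = netG(lr_img).detach()                 # block gradients into G
    d_real = netD(hr_img)
    d_fake = netD(sr_img)
    d_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    d_loss.backward()
    optD.step()

    # Generator update: minimize -log D(G(I_LR)) (non-saturating form of the objective)
    optG.zero_grad()
    d_fake = netD(netG(lr_img))
    g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    g_loss.backward()
    optG.step()
    return d_loss.item(), g_loss.item()
```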
SRGAN
• Networks
• PReLU vs. Leaky ReLU
• ResNet
• ESPCN
SRGAN
• Networks
PReLU vs. Leaky ReLU
Leaky ReLU: fixed slope
PReLU: learnable slope

$\mathrm{LeakyReLU}(x) = \max(0, x) + \mathit{negative\_slope} \cdot \min(0, x)$
$\mathrm{PReLU}(x) = \max(0, x) + a \cdot \min(0, x)$
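A quick PyTorch illustration of the difference (the 0.2 and 0.25 slope values below are just examples):

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.0])

leaky = nn.LeakyReLU(negative_slope=0.2)        # fixed slope for x < 0
prelu = nn.PReLU(num_parameters=1, init=0.25)   # slope a is a learnable parameter

print(leaky(x))                  # tensor([-0.4000, -0.1000,  0.0000,  1.0000])
print(prelu(x))                  # uses a = 0.25 until training updates it
print(list(prelu.parameters()))  # the slope appears as a trainable weight
```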
SRGAN
• Networks
ResNet
* ResNet seminar (Jae Won An)
Deep CNN models across vision tasks: Classification (ResNet), Detection (R-CNN, Fast R-CNN, Faster R-CNN), Detection (SPPNet), Segmentation (Mask R-CNN), Super resolution (SRCNN), Enhancement (DCP)
SRGAN
• Networks
ResNet
Residual block with a skip connection => behaves like an ensemble of shallower networks
* ResNet seminar (Jae Won An)
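The Block_B residual block used by the Generator code later in this deck is not defined on the slides; a minimal sketch following the paper's residual block layout (conv, BN, PReLU, conv, BN, plus an identity skip) might look like this (class and attribute names are assumptions chosen to match the later code):

```python
import torch.nn as nn

class Block_B(nn.Module):
    """SRGAN-style residual block: two 3x3 convs with BN and PReLU, plus an identity skip."""
    def __init__(self, channels):
        super(Block_B, self).__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.prelu = nn.PReLU()
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = self.bn2(self.conv2(self.prelu(self.bn1(self.conv1(x)))))
        return x + out   # skip connection
```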
SRGAN
• Networks
ESPCN (Efficient Sub-Pixel Convolutional Neural Network)
SRCNN, VDSR: increase resolution first (bicubic upsampling of the LR input), then extract features
ESPCN: extract features from the LR input first, then increase resolution (pixel shuffle)
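The ESPCN upsampling module called in the Generator code later is also not shown on the slides; a minimal sub-pixel convolution block consistent with how it is called there (ESPCN(64, 256), ESPCN(256, 256)) could be sketched as below. The exact layer layout is an assumption:

```python
import torch.nn as nn

class ESPCN(nn.Module):
    """Sub-pixel upsampling: conv to out_channels * r^2 feature maps, then PixelShuffle."""
    def __init__(self, in_channels, out_channels, upscale=2):
        super(ESPCN, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels * upscale ** 2,
                              kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(upscale)  # (B, r^2*C, H, W) -> (B, C, r*H, r*W)
        self.prelu = nn.PReLU()

    def forward(self, x):
        return self.prelu(self.shuffle(self.conv(x)))
```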
SRGAN
• Networks
ESPCN (Efficient Sub-Pixel Convolutional Neural Network)
Not a rigorous explanation, but conceptually: the sub-pixel convolution produces r² feature channels per LR pixel and rearranges them into an r×r block of HR pixels (pixel shuffle).
SRGAN
2019-05-24
16
• Networks
Just Ordinary CNN structure
Stride Conv layer(no pooling layer)
* DCGAN seminar (Hansol Kang) – https://isl-homepage.github.io/seminar/
SRGAN
• Perceptual loss
Perceptual Loss = Content Loss + Adversarial Loss

$l^{SR} = l^{SR}_{X} + 10^{-3}\, l^{SR}_{Gen}$

Content loss, option 1, pixel-wise MSE:

$l^{SR}_{MSE} = \frac{1}{r^{2}WH} \sum_{x=1}^{rW} \sum_{y=1}^{rH} \big( I^{HR}_{x,y} - G_{\theta_G}(I^{LR})_{x,y} \big)^{2}$

The loss most SOTA methods use => high PSNR, BUT perceptually bad
(PSNR and SSIM are not good metrics of perceptual quality.)
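As a sanity check, the pixel-wise MSE content loss above is essentially nn.MSELoss applied to the SR output and the HR target (a minimal sketch; the tensor shapes are illustrative):

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()               # mean over all elements, matching the 1/(r^2 W H) normalization
sr = torch.rand(1, 3, 96, 96)    # G(I^LR), e.g. a 24x24 LR patch upscaled by r = 4
hr = torch.rand(1, 3, 96, 96)    # I^HR
l_mse = mse(sr, hr)              # l^SR_MSE
```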
SRGAN
• Perceptual loss
Perceptual Loss = Content Loss + Adversarial Loss

$l^{SR} = l^{SR}_{X} + 10^{-3}\, l^{SR}_{Gen}$

Content loss, option 2, VGG loss:
$\phi_{i,j}$ : the feature map obtained by the j-th convolution (after activation) before the i-th maxpooling layer within the VGG19 network

$l^{SR}_{VGG/i.j} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \big( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y} \big)^{2}$

Compare the features rather than the pixels => make the detailed information match (perceptually good).
(Figure: the input I^SR passes through the VGG convolution layers, producing feature maps F1, F2, ..., FC on which the loss is computed.)
* Basics of DCNN seminar (Hansol Kang) – https://isl-homepage.github.io/seminar/
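A sketch of how $\phi_{i,j}$ can be taken from torchvision's VGG19 feature extractor. The slicing indices below are my reading of torchvision's vgg19.features (index 35 is the ReLU after conv5_4, index 8 the ReLU after conv2_2) and should be double-checked against the actual model printout:

```python
import torch.nn as nn
from torchvision.models import vgg19

vgg = vgg19(pretrained=True).features.eval()
phi_54 = nn.Sequential(*list(vgg)[:36])   # up to the activation after conv5_4 (before the 5th maxpool) -> VGG54
phi_22 = nn.Sequential(*list(vgg)[:9])    # up to the activation after conv2_2 (before the 2nd maxpool) -> VGG22

for p in phi_54.parameters():
    p.requires_grad = False               # VGG acts as a fixed feature extractor

mse = nn.MSELoss()

def vgg_loss(sr, hr, phi=phi_54):
    return mse(phi(sr), phi(hr))          # l^SR_VGG/i.j
```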
SRGAN
• Perceptual loss
Perceptual Loss = Content Loss + Adversarial Loss

$l^{SR} = l^{SR}_{X} + 10^{-3}\, l^{SR}_{Gen}$

Adversarial loss (generator term):

$l^{SR}_{Gen} = \sum_{n=1}^{N} -\log D_{\theta_D}\big(G_{\theta_G}(I^{LR})\big)$

Minimizing $\log\big(1 - D_{\theta_D}(G_{\theta_G}(I^{LR}))\big)$ (the original min-max formulation) is replaced by maximizing $\log D_{\theta_D}\big(G_{\theta_G}(I^{LR})\big)$, i.e. minimizing $-\log D_{\theta_D}\big(G_{\theta_G}(I^{LR})\big)$, which gives better gradients early in training.
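A minimal sketch of the generator's adversarial term in the $-\log D(G(I^{LR}))$ form (note that the GeneratorLoss code later in this deck uses a slightly different mean(1 − D(G(I^LR))) variant):

```python
import torch

def adversarial_loss(d_on_sr):
    """d_on_sr: discriminator outputs D(G(I^LR)) in (0, 1) for a batch of SR images."""
    eps = 1e-8                                    # avoid log(0)
    return torch.mean(-torch.log(d_on_sr + eps))  # sum_n -log D(G(I^LR)), averaged over the batch
```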
SRGAN
• Perceptual loss
* Medium blog, “Introduction to deep super resolution”
(https://medium.com/@hirotoschwert/introduction-to-deep-super-resolution-c052d84ce8cf)
SRGAN
• Perceptual loss
* Medium blog, “Introduction to deep super resolution”
(https://medium.com/@hirotoschwert/introduction-to-deep-super-resolution-c052d84ce8cf)
Adv. loss
SRGAN
• Perceptual loss
* Medium blog, “Introduction to deep super resolution”
(https://medium.com/@hirotoschwert/introduction-to-deep-super-resolution-c052d84ce8cf)
Content loss 1, Content loss 2
SRGAN
• Results
• Datasets: Set5, Set14, BSD100
• Scale factor: 4
• MOS (Mean Opinion Score) testing: 26 raters (1: bad quality, 5: excellent quality)
12 versions were compared:
GT, NN, Bicubic, SRCNN, SelfExSR, DRCN, ESPCN, SRResNet-MSE, SRResNet-VGG22, SRGAN-MSE, SRGAN-VGG22, SRGAN-VGG54
Each rater scored 1128 images (12 versions of 19 images + 9 versions of 100 images; 12*19 + 9*100 = 228 + 900 = 1128)
The raters were calibrated on the NN (score 1) and HR (score 5) versions of 20 images from the BSD300 training set
VGG22: low-level features / VGG54: high-level features
SRGAN
• Results
: From the MOS standpoint, the adversarial loss yields a significant improvement.
: From the MOS standpoint, high-level (VGG54) features give more meaningful results than low-level ones.
We could not determine a significantly best loss function
SRGAN
• Results
SRGAN
• Results
"If you want PSNR or SSIM, use SRResNet. It will just look worse perceptually ;)"
III. Experiment
Experiment
Source Code, Set5, Set14, Custom
Experiment
• Source Code
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=9, stride=1, padding=4)
        self.prelu1 = nn.PReLU()  # F.prelu requires an explicit weight tensor, so a module is used instead
        # TODO: consider stacking the residual blocks with a for loop / nn.Sequential.
        self.block1 = Block_B(64)
        self.block2 = Block_B(64)
        self.block3 = Block_B(64)
        self.block4 = Block_B(64)
        self.block5 = Block_B(64)
        self.block6 = Block_B(64)
        self.block7 = Block_B(64)
        self.block8 = Block_B(64)
        self.block9 = Block_B(64)
        self.block10 = Block_B(64)
        self.block11 = Block_B(64)
        self.block12 = Block_B(64)
        self.block13 = Block_B(64)
        self.block14 = Block_B(64)
        self.block15 = Block_B(64)
        self.block16 = Block_B(64)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(64)
        self.espcn1 = ESPCN(64, 256)
        self.espcn2 = ESPCN(256, 256)
        # final output layer
        self.conv3 = nn.Conv2d(256, 3, kernel_size=9, stride=1, padding=4)

    def forward(self, x):
        x = self.prelu1(self.conv1(x))
        temp = self.block1(x)
        temp = self.block2(temp)
        temp = self.block3(temp)
        temp = self.block4(temp)
        temp = self.block5(temp)
        temp = self.block6(temp)
        temp = self.block7(temp)
        temp = self.block8(temp)
        temp = self.block9(temp)
        temp = self.block10(temp)
        temp = self.block11(temp)
        temp = self.block12(temp)
        temp = self.block13(temp)
        temp = self.block14(temp)
        temp = self.block15(temp)
        temp = self.block16(temp)
        x = x + self.bn1(self.conv2(temp))  # long skip connection around the 16 residual blocks
        x = self.espcn1(x)                  # sub-pixel (pixel-shuffle) upsampling stages
        x = self.espcn2(x)
        x = self.conv3(x)
        return x
Experiment
• Source Code
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1)
        self.bn1 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(128)
        self.conv4 = nn.Conv2d(128, 128, kernel_size=3, stride=2, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        self.conv5 = nn.Conv2d(128, 256, kernel_size=3, padding=1)
        self.bn4 = nn.BatchNorm2d(256)
        self.conv6 = nn.Conv2d(256, 256, kernel_size=3, stride=2, padding=1)
        self.bn5 = nn.BatchNorm2d(256)
        self.conv7 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
        self.bn6 = nn.BatchNorm2d(512)
        self.conv8 = nn.Conv2d(512, 512, kernel_size=3, stride=2, padding=1)
        self.bn7 = nn.BatchNorm2d(512)
        # TODO: verify the flatten step
        self.flatten = nn.AdaptiveAvgPool2d(1)             # global average pooling to 1x1
        self.conv9 = nn.Conv2d(512, 1024, kernel_size=1)   # 1x1 convs act as fully connected layers
        self.conv10 = nn.Conv2d(1024, 1, kernel_size=1)

    def forward(self, x):
        x = F.leaky_relu(self.conv1(x), 0.2)
        x = F.leaky_relu(self.bn1(self.conv2(x)), 0.2)
        x = F.leaky_relu(self.bn2(self.conv3(x)), 0.2)
        x = F.leaky_relu(self.bn3(self.conv4(x)), 0.2)
        x = F.leaky_relu(self.bn4(self.conv5(x)), 0.2)
        x = F.leaky_relu(self.bn5(self.conv6(x)), 0.2)
        x = F.leaky_relu(self.bn6(self.conv7(x)), 0.2)
        x = F.leaky_relu(self.bn7(self.conv8(x)), 0.2)
        x = F.leaky_relu(self.conv9(self.flatten(x)))
        x = torch.sigmoid(self.conv10(x))   # F.sigmoid is deprecated; torch.sigmoid is equivalent
        return x
Experiment
• Source Code
class GeneratorLoss(nn.Module):
    def __init__(self):
        super(GeneratorLoss, self).__init__()
        vgg = vgg16(pretrained=True)  # from torchvision.models import vgg16
        # VGG16 features up to layer index 30 used as a fixed perceptual feature extractor
        loss_network = nn.Sequential(*list(vgg.features)[:31]).eval()
        for param in loss_network.parameters():
            param.requires_grad = False
        self.loss_network = loss_network
        self.mse_loss = nn.MSELoss()

    def forward(self, out_labels, out_images, target_images):
        # Adversarial Loss
        adversarial_loss = torch.mean(1 - out_labels)
        # Perception Loss
        perception_loss = self.mse_loss(self.loss_network(out_images), self.loss_network(target_images))
        # Image Loss
        image_loss = self.mse_loss(out_images, target_images)
        # Combine the terms: the 1e-3 adversarial weight follows the slide's perceptual loss;
        # the perception-loss weight is an implementation choice and may differ per repository.
        return image_loss + 1e-3 * adversarial_loss + 6e-3 * perception_loss
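A hedged sketch of how this GeneratorLoss might be called inside a training loop; netG, netD, optimizer_g, lr_img and hr_img are placeholder names, not from the slides:

```python
# hypothetical training-loop fragment
criterion_g = GeneratorLoss()

sr_img = netG(lr_img)                       # super-resolved images
d_out = netD(sr_img)                        # discriminator scores on the SR images
g_loss = criterion_g(d_out, sr_img, hr_img)

optimizer_g.zero_grad()
g_loss.backward()
optimizer_g.step()
```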
loss_network = nn.Sequential(*list(vgg.features)[:31]).eval()
Roles of the asterisk (*) in Python:
1. Positional args
2. Keyword args
3. Unpacking (used here to unpack the layer list into nn.Sequential)
# (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (1): ReLU(inplace)
# (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (3): ReLU(inplace)
# (4): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
# (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# ...
# (30): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
Python Intermediate Seminar planned
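A tiny example of the three asterisk roles listed above (my own illustration, not from the slides):

```python
def f(*args, **kwargs):          # 1. collect positional args, 2. collect keyword args
    print(args, kwargs)

layers = [1, 2, 3]
f(*layers, mode="train")         # 3. unpacking: equivalent to f(1, 2, 3, mode="train")
# nn.Sequential(*list(vgg.features)[:31]) unpacks the layer list in the same way.
```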
Experiment
• Dataset (Set5, Set14, Custom)
…
Experiment
• Result#1 – Set5
* Results were produced with a pre-trained network (ep: 100, r: 4) from https://github.com/leftthomas/SRGAN
(a) Bicubic  (b) SRCNN  (c) Kim  (d) SRGAN  (e) HR
Experiment
• Result#2 – Set5
* Results were produced with a pre-trained network (ep: 100, r: 4) from https://github.com/leftthomas/SRGAN
(a) Bicubic  (b) SRCNN  (c) Kim  (d) SRGAN  (e) HR
Experiment
• Result#3 – Set5
* Results were produced with a pre-trained network (ep: 100, r: 4) from https://github.com/leftthomas/SRGAN
(a) Bicubic  (b) SRCNN  (c) Kim  (d) SRGAN  (e) HR
Experiment
• Result#4 – Set14
* Results were produced with a pre-trained network (ep: 100, r: 4) from https://github.com/leftthomas/SRGAN
(a) Bicubic  (b) SRCNN  (c) Kim  (d) SRGAN  (e) HR
Experiment
• Result#5 – Set14
* Results were produced with a pre-trained network (ep: 100, r: 4) from https://github.com/leftthomas/SRGAN
(a) Bicubic  (b) SRCNN  (c) Kim  (d) SRGAN  (e) HR
Experiment
• Result#6 – Custom data
* Results were produced with a pre-trained network (ep: 100, r: 4) from https://github.com/leftthomas/SRGAN
(240x180 -> 960x720), (236x125 -> 944x500), (137x137 -> 548x548), (480x320 -> 1920x1280)
Experiment
• Result#7 – Custom data (Video)
* Results were produced with a pre-trained network (ep: 100, r: 4) from https://github.com/leftthomas/SRGAN
IV. Summary
Summary
Summary, Future Work
Summary
• By applying a ResNet-based generator and a GAN framework to SR, SRGAN maintains the objective metrics (PSNR, SSIM) at a reasonable level while improving the subjective metric (MOS).
• Proposes a new perceptual loss that combines a content loss with an adversarial loss.
Future work
GAN research: Vanilla GAN, DCGAN, InfoGAN, LSGAN, SRGAN (done); Style Transfer, cGAN, WGAN, BEGAN, BigGAN, CycleGAN, StyleGAN (to do)
Development tools & language: PyTorch, Modern C++ (C++14), C++ Coding Standard, Python (Intermediate), Python executable & UI
Tips (Document & Programming)
Mathematical theory: Linear algebra, Probability & Information theory
Other research: Level Processor, Ice Propagation