Photo-realistic Single Image Super-resolution using a Generative Adversarial Network (SRGAN)
The document discusses the methodology and results of using Generative Adversarial Networks (GANs) for photo-realistic single image super-resolution (SRGAN). It covers the architecture, perceptual loss functions, and experimental results using various datasets, demonstrating the effectiveness of adversarial loss in improving image quality. Additionally, it includes source code examples for the generator and discriminator components of the SRGAN framework.
Introduction to the seminar on SRGAN by Hansol Kang, focusing on photo-realistic single image super-resolution.
Detailed review of GAN concepts, including various types like Vanilla GAN, DCGAN, and their applications in style transfer, inpainting, and super-resolution.
Overview of SRGAN, emphasizing its goal to create realistic super-resolution images.
Discussion of different neural network structures used in SRGAN, including ResNet and ESPCN architectures.
Explanation of perceptual loss in SRGAN. Details on content loss and adversarial loss, emphasizing their importance in generating high-quality images.
Results overview using various datasets (Set5, Set14, BSD100). Discussion on Mean Opinion Score (MOS) evaluations and implications of adversarial loss.
Detailed breakdown of experiments conducted, including source code, dataset usage, and examples from Set5 and Set14 results.
Summary of the findings highlighting the enhancement of MOS metrics. Plans for further research in GANs and associated technologies.
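The ESPCN-style sub-pixel (pixel-shuffle) upsampling mentioned for the generator architecture can be sketched in NumPy; the function name and the channel-first array layout here are illustrative assumptions, not the paper's code:

```python
import numpy as np

def pixel_shuffle(x, r):
    # ESPCN-style sub-pixel rearrangement (hypothetical NumPy sketch):
    # (C*r*r, H, W) -> (C, H*r, W*r). Each group of r*r channels becomes
    # the r x r sub-pixel grid of one output pixel.
    c_rr, h, w = x.shape
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# A 4-channel 1x1 input with r=2 becomes a single 2x2 map.
out = pixel_shuffle(np.arange(4.0).reshape(4, 1, 1), 2)
```

This matches the channel-to-space mapping used by sub-pixel convolution layers: convolutions stay at low resolution and only the final rearrangement produces the high-resolution grid.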
Photo-realistic Single Image Super-resolution using a Generative Adversarial Network* (SRGAN)
ISL Lab Seminar
Hansol Kang
* Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
Introduction (2019-05-24)
• Review - Concept of GAN
"Build a generator that fools the discriminator well."
1) Vanilla GAN: G maps noise to an image; D classifies images as real or fake.
2) DCGAN: convolutional G and D; discovered the manipulability of the latent space.
3) InfoGAN: adds a mutual-information term between the latent code Z and G's output.
4) LSGAN: replaces the BCE loss with an MSE loss, pulling generated samples toward the decision boundary.
Introduction
• Review - Applications: Style Transfer, Inpainting, Super-Resolution
SRGAN
• Perceptual loss

Perceptual Loss = Content Loss + Adversarial Loss:

$$l^{SR} = l_X^{SR} + 10^{-3}\, l_{Gen}^{SR}$$

MSE content loss, the method most widely used in SOTA approaches:

$$l_{MSE}^{SR} = \frac{1}{r^2 WH} \sum_{x=1}^{rW} \sum_{y=1}^{rH} \left( I_{x,y}^{HR} - G_{\theta_G}(I^{LR})_{x,y} \right)^2$$

=> High PSNR, BUT perceptually bad (PSNR and SSIM are not good evaluation metrics).
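As a minimal NumPy sketch of how the two loss terms combine (function names and array shapes are illustrative assumptions, not the paper's code):

```python
import numpy as np

def mse_content_loss(hr, sr):
    # Pixel-wise MSE between the HR ground truth and the SR output
    # (l_MSE^SR): average squared difference over all pixels.
    return np.mean((hr - sr) ** 2)

def adversarial_loss(d_sr):
    # l_Gen^SR = sum over the batch of -log D(G(I_LR)); d_sr holds the
    # discriminator's probabilities that each SR image is a natural image.
    return np.sum(-np.log(d_sr))

def perceptual_loss(hr, sr, d_sr):
    # l^SR = content loss + 1e-3 * adversarial loss
    return mse_content_loss(hr, sr) + 1e-3 * adversarial_loss(d_sr)
```

When the discriminator is fully fooled (D(G(I_LR)) = 1) the adversarial term vanishes and only the content loss remains.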
SRGAN
• Perceptual loss

VGG content loss:

$$l_{VGG/i.j}^{SR} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y} \right)^2$$

$\phi_{i,j}$: the feature map obtained by the j-th convolution (after activation) before the i-th maxpooling layer within the VGG19 network.

Compare the features with each other => so that the detailed information matches (perceptually good).

[Diagram: the input image passes through the VGG19 convolution layers to produce the feature map against which $I^{SR}$ is compared.]

*Basic of DCNN seminar (Hansol Kang) – https://isl-homepage.github.io/seminar/
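The VGG loss above reduces to an MSE in feature space once the feature maps are extracted; a minimal NumPy sketch (the VGG19 extractor itself is assumed and the maps are passed in precomputed):

```python
import numpy as np

def vgg_feature_loss(phi_hr, phi_sr):
    # l_VGG/i.j^SR: squared Euclidean distance between the VGG19 feature
    # maps phi_{i,j}(I_HR) and phi_{i,j}(G(I_LR)), normalized by the
    # feature map's width and height. phi_* are (H, W, C) activations.
    h, w = phi_hr.shape[:2]
    return np.sum((phi_hr - phi_sr) ** 2) / (w * h)
```

Unlike the pixel-wise MSE, two images can differ pixel by pixel yet score low here as long as their high-level features agree, which is exactly the "perceptually good" property the slide describes.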
SRGAN
• Perceptual loss

Adversarial loss:

$$l_{Gen}^{SR} = \sum_{n=1}^{N} -\log D_{\theta_D}\left(G_{\theta_G}(I^{LR})\right)$$

Instead of minimizing $\log\left(1 - D_{\theta_D}(G_{\theta_G}(I^{LR}))\right)$ (a minimization problem), SRGAN maximizes $\log D_{\theta_D}(G_{\theta_G}(I^{LR}))$ (a maximization problem), i.e. it minimizes $-\log D_{\theta_D}(G_{\theta_G}(I^{LR}))$, which gives better gradient behavior early in training.
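The gradient argument behind this choice can be checked numerically; a small illustrative sketch (the value of d is a made-up example):

```python
# Why minimize -log D(G(I_LR)) rather than log(1 - D(G(I_LR))):
# early in training D rejects fakes easily, so d = D(G(I_LR)) is near 0.
d = 0.01

# |d/dd log(1 - d)| = 1 / (1 - d): stays near 1 when d is small
# (weak learning signal for the generator).
grad_original = 1.0 / (1.0 - d)

# |d/dd (-log d)| = 1 / d: grows like 1/d as d -> 0
# (strong learning signal exactly when the generator is losing).
grad_srgan = 1.0 / d
```

At d = 0.01 the reformulated objective supplies a gradient roughly 100 times larger, which is why it trains better from a cold start.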
SRGAN
• Perceptual loss
[Figures from the Medium blog "Introduction to deep super resolution" (https://medium.com/@hirotoschwert/introduction-to-deep-super-resolution-c052d84ce8cf), annotated with the adversarial loss and the two content-loss terms.]
SRGAN
• Results
• Datasets: Set5, Set14, BSD100 • Scale factor: 4
• MOS (Mean Opinion Score) testing: 26 raters (1: bad quality, 5: excellent quality)
• 12 versions: GT, NN, Bicubic, SRCNN, SelfExSR, DRCN, ESPCN, SRResNet-MSE, SRResNet-VGG22, SRGAN-MSE, SRGAN-VGG22, SRGAN-VGG54
• Each rater scored 1128 images (12 versions of 19 images + 9 versions of 100 images: 12*19 + 9*100 = 228 + 900 = 1128).
• The raters were calibrated on the NN (score 1) and HR (score 5) versions of 20 images from the BSD300 training set.
• VGG22 compares low-level features; VGG54 compares high-level features.
SRGAN
• Results
: In terms of MOS, the adversarial loss yields significantly better results.
: In terms of MOS, high-level features yield more meaningful results.
"We could not determine a significantly best loss function."
Experiment
• Result #1 - Set5
* The experimental results use a pretrained network (epochs: 100, upscale factor: 4) - https://github.com/leftthomas/SRGAN
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #2 - Set5
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #3 - Set5
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #4 - Set14
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #5 - Set14
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #6 - Custom data
[Figure: 4x upscaled examples - 240x180→960x720, 236x125→944x500, 137x137→548x548, 480x320→1920x1280]
Experiment
• Result #7 - Custom data (Video)
Summary
• Applying the ResNet and GAN architectures to SR improves the subjective metric (MOS) while maintaining a reasonable level of the objective metrics (PSNR, SSIM).
• Proposed a new perceptual loss that fuses a content loss with an adversarial loss.
Future work
• GAN Research - done: Vanilla GAN, DCGAN, InfoGAN, LSGAN, SRGAN; to do: Style Transfer, cGAN, WGAN, BEGAN, BigGAN, CycleGAN, StyleGAN
• Development tools & Language: PyTorch, C++ Coding Standard, Modern C++ (C++14), Python (Intermediate), Python executable & UI, Tips (Document & Programming)
• Mathematical Theory: Linear algebra, Probability & Information theory
• Other research: Level Processor, Ice Propagation