Photo-realistic Single Image Super-resolution using a Generative Adversarial Network (SRGAN)
The document discusses the methodology and results of using Generative Adversarial Networks (GANs) for photo-realistic single image super-resolution (SRGAN). It covers the architecture, perceptual loss functions, and experimental results using various datasets, demonstrating the effectiveness of adversarial loss in improving image quality. Additionally, it includes source code examples for the generator and discriminator components of the SRGAN framework.
Introduction to the seminar on SRGAN by Hansol Kang, focusing on photo-realistic single image super-resolution.
Detailed review of GAN concepts, including various types like Vanilla GAN, DCGAN, and their applications in style transfer, inpainting, and super-resolution.
Overview of SRGAN, emphasizing its goal to create realistic super-resolution images.
Discussion of different neural network structures used in SRGAN, including ResNet and ESPCN architectures.
Explanation of perceptual loss in SRGAN. Details on content loss and adversarial loss, emphasizing their importance in generating high-quality images.
Results overview using various datasets (Set5, Set14, BSD100). Discussion on Mean Opinion Score (MOS) evaluations and implications of adversarial loss.
Detailed breakdown of experiments conducted, including source code, dataset usage, and examples from Set5 and Set14 results.
Summary of the findings highlighting the enhancement of MOS metrics. Plans for further research in GANs and associated technologies.
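The ESPCN-style sub-pixel (pixel-shuffle) upsampling mentioned for the generator architecture can be sketched in NumPy; the function name and the channel-first array layout here are illustrative assumptions, not the paper's code:

```python
import numpy as np

def pixel_shuffle(x, r):
    # ESPCN-style sub-pixel rearrangement (hypothetical NumPy sketch):
    # (C*r*r, H, W) -> (C, H*r, W*r). Each group of r*r channels becomes
    # the r x r sub-pixel grid of one output pixel.
    c_rr, h, w = x.shape
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# A 4-channel 1x1 input with r=2 becomes a single 2x2 map.
out = pixel_shuffle(np.arange(4.0).reshape(4, 1, 1), 2)
```

This matches the channel-to-space mapping used by sub-pixel convolution layers: convolutions stay at low resolution and only the final rearrangement produces the high-resolution grid.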
Photo-realistic Single Image Super-resolution using a Generative Adversarial Network* (SRGAN)
ISL Lab Seminar
Hansol Kang
* Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
Introduction (2019-05-24)
• Review - Concept of GAN
"Build a generator that fools the discriminator well."
1) Vanilla GAN: G maps noise to an image; D classifies images as real or fake.
2) DCGAN: convolutional G and D; discovered the manipulability of the latent space.
3) InfoGAN: adds a mutual-information term between the latent code Z and G's output.
4) LSGAN: replaces the BCE loss with an MSE loss, pulling generated samples toward the decision boundary.
Introduction
• Review - Applications: Style Transfer, Inpainting, Super-Resolution
SRGAN
• Perceptual loss

Perceptual Loss = Content Loss + Adversarial Loss:

$$l^{SR} = l_X^{SR} + 10^{-3}\, l_{Gen}^{SR}$$

MSE content loss, the method most widely used in SOTA approaches:

$$l_{MSE}^{SR} = \frac{1}{r^2 WH} \sum_{x=1}^{rW} \sum_{y=1}^{rH} \left( I_{x,y}^{HR} - G_{\theta_G}(I^{LR})_{x,y} \right)^2$$

=> High PSNR, BUT perceptually bad (PSNR and SSIM are not good evaluation metrics).
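As a minimal NumPy sketch of how the two loss terms combine (function names and array shapes are illustrative assumptions, not the paper's code):

```python
import numpy as np

def mse_content_loss(hr, sr):
    # Pixel-wise MSE between the HR ground truth and the SR output
    # (l_MSE^SR): average squared difference over all pixels.
    return np.mean((hr - sr) ** 2)

def adversarial_loss(d_sr):
    # l_Gen^SR = sum over the batch of -log D(G(I_LR)); d_sr holds the
    # discriminator's probabilities that each SR image is a natural image.
    return np.sum(-np.log(d_sr))

def perceptual_loss(hr, sr, d_sr):
    # l^SR = content loss + 1e-3 * adversarial loss
    return mse_content_loss(hr, sr) + 1e-3 * adversarial_loss(d_sr)
```

When the discriminator is fully fooled (D(G(I_LR)) = 1) the adversarial term vanishes and only the content loss remains.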
SRGAN
• Perceptual loss

VGG content loss:

$$l_{VGG/i.j}^{SR} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y} \right)^2$$

$\phi_{i,j}$: the feature map obtained by the j-th convolution (after activation) before the i-th maxpooling layer within the VGG19 network.

Compare the features with each other => so that the detailed information matches (perceptually good).

[Diagram: the input image passes through the VGG19 convolution layers to produce the feature map against which $I^{SR}$ is compared.]

*Basic of DCNN seminar (Hansol Kang) – https://isl-homepage.github.io/seminar/
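The VGG loss above reduces to an MSE in feature space once the feature maps are extracted; a minimal NumPy sketch (the VGG19 extractor itself is assumed and the maps are passed in precomputed):

```python
import numpy as np

def vgg_feature_loss(phi_hr, phi_sr):
    # l_VGG/i.j^SR: squared Euclidean distance between the VGG19 feature
    # maps phi_{i,j}(I_HR) and phi_{i,j}(G(I_LR)), normalized by the
    # feature map's width and height. phi_* are (H, W, C) activations.
    h, w = phi_hr.shape[:2]
    return np.sum((phi_hr - phi_sr) ** 2) / (w * h)
```

Unlike the pixel-wise MSE, two images can differ pixel by pixel yet score low here as long as their high-level features agree, which is exactly the "perceptually good" property the slide describes.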
SRGAN
• Perceptual loss

Adversarial loss:

$$l_{Gen}^{SR} = \sum_{n=1}^{N} -\log D_{\theta_D}\left(G_{\theta_G}(I^{LR})\right)$$

Instead of minimizing $\log\left(1 - D_{\theta_D}(G_{\theta_G}(I^{LR}))\right)$ (a minimization problem), SRGAN maximizes $\log D_{\theta_D}(G_{\theta_G}(I^{LR}))$ (a maximization problem), i.e. it minimizes $-\log D_{\theta_D}(G_{\theta_G}(I^{LR}))$, which gives better gradient behavior early in training.
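The gradient argument behind this choice can be checked numerically; a small illustrative sketch (the value of d is a made-up example):

```python
# Why minimize -log D(G(I_LR)) rather than log(1 - D(G(I_LR))):
# early in training D rejects fakes easily, so d = D(G(I_LR)) is near 0.
d = 0.01

# |d/dd log(1 - d)| = 1 / (1 - d): stays near 1 when d is small
# (weak learning signal for the generator).
grad_original = 1.0 / (1.0 - d)

# |d/dd (-log d)| = 1 / d: grows like 1/d as d -> 0
# (strong learning signal exactly when the generator is losing).
grad_srgan = 1.0 / d
```

At d = 0.01 the reformulated objective supplies a gradient roughly 100 times larger, which is why it trains better from a cold start.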
SRGAN
• Perceptual loss
[Figures from the Medium blog "Introduction to deep super resolution" (https://medium.com/@hirotoschwert/introduction-to-deep-super-resolution-c052d84ce8cf), annotated with the adversarial loss and the two content-loss terms.]
SRGAN
• Results
• Datasets: Set5, Set14, BSD100 • Scale factor: 4
• MOS (Mean Opinion Score) testing: 26 raters (1: bad quality, 5: excellent quality)
• 12 versions: GT, NN, Bicubic, SRCNN, SelfExSR, DRCN, ESPCN, SRResNet-MSE, SRResNet-VGG22, SRGAN-MSE, SRGAN-VGG22, SRGAN-VGG54
• Each rater scored 1128 images (12 versions of 19 images + 9 versions of 100 images: 12*19 + 9*100 = 228 + 900 = 1128).
• The raters were calibrated on the NN (score 1) and HR (score 5) versions of 20 images from the BSD300 training set.
• VGG22 compares low-level features; VGG54 compares high-level features.
SRGAN
• Results
: In terms of MOS, the adversarial loss yields significantly better results.
: In terms of MOS, high-level features yield more meaningful results.
"We could not determine a significantly best loss function."
Experiment
• Result #1 - Set5
* The experimental results use a pretrained network (epochs: 100, upscale factor: 4) - https://github.com/leftthomas/SRGAN
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #2 - Set5
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #3 - Set5
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #4 - Set14
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #5 - Set14
[Figure panels - a: Bi-cubic, b: SRCNN, c: Kim, d: SRGAN, e: HR]
Experiment
• Result #6 - Custom data
[Figure: 4x upscaled examples - 240x180→960x720, 236x125→944x500, 137x137→548x548, 480x320→1920x1280]
Experiment
• Result #7 - Custom data (Video)
Summary
• Applying the ResNet and GAN architectures to SR improves the subjective metric (MOS) while maintaining a reasonable level of the objective metrics (PSNR, SSIM).
• Proposed a new perceptual loss that fuses a content loss with an adversarial loss.
Future work
• GAN Research - done: Vanilla GAN, DCGAN, InfoGAN, LSGAN, SRGAN; to do: Style Transfer, cGAN, WGAN, BEGAN, BigGAN, CycleGAN, StyleGAN
• Development tools & Language: PyTorch, C++ Coding Standard, Modern C++ (C++14), Python (Intermediate), Python executable & UI, Tips (Document & Programming)
• Mathematical Theory: Linear algebra, Probability & Information theory
• Other research: Level Processor, Ice Propagation