Copyright	©	DeNA	Co.,Ltd.	All	Rights	Reserved.
Progress in Super-Resolution with Deep Learning

About me
Hiroto Honda
@hirotomusiker
■ Manufacturer research lab → joined DeNA in January 2017
■ Visiting researcher at ETH Zurich CVL (2013-2014)
■ CVPR NTIRE Workshop Program Committee
■ AI R&D engineer at DeNA
■ Current work: Object Detection
(OSS: https://github.com/DeNA/Chainer_Mask_R-CNN )
■ Previous work: Low-Level Vision, Computational, Sensor LSI

Contents
■ Super-resolution is easy to try
■ Early SISR networks
⁃ SRCNN, ESPCN, VDSR
⁃ Upsampling methods: deconv or pixelshuffle
■ Baseline method: SRResNet
⁃ SRResNet, SRGAN, and EDSR
■ Super-resolution and perception
⁃ How the loss function affects the restored result
⁃ Perception-Distortion Tradeoff
■ Summary

What is super-resolution?
■ Low-resolution image
■ High-resolution image
[Figure: restoration of the high-resolution image from the low-resolution image]

Super-resolution is easy to try!
[Figure: original (HR) → resize → LR, then train]
No annotation is required: it is a form of self-supervised learning

Progress in super-resolution
https://github.com/jbhuang0604/SelfExSR
[Chart: PSNR* gain over bicubic in dB on the Set5 dataset, x4, plotted by year (2014 to 2017)]
Method      PSNR gain over bicubic [dB]
bicubic      0.0
A+          +1.86
SRCNN       +2.06
ESPCN       +2.48
VDSR        +2.93
SRResNet    +3.63
EDSR        +4.20
PSNR data from: 5)
Super-resolution accuracy has improved year after year
* PSNR = 10 log10(255^2 / MSE) when the max value is 255
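
The PSNR definition in the footnote can be written as a few lines of Python. This is a minimal sketch in plain Python for clarity; the function name `psnr` and the flat-list image representation are assumptions made for illustration, and real pipelines would normally use an array library.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    # Mean squared error over all pixels
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    # PSNR = 10 log10(max^2 / MSE), as in the footnote
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of about +3 dB corresponds to roughly halving the MSE.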

Training a super-resolution network
■ Crop a patch from the ground-truth image: HR
■ Downsample the patch: LR = g(HR), e.g. bicubic down-sampling
■ Assemble batches: {LR}, {HR}
■ Train the network f; the loss function is MSE(HR, f(LR))
■ ...that's it!
[Figure: LR = g(HR) → f → f(LR), compared with HR by MSE]
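
The data preparation steps above can be sketched in plain Python. Everything here is illustrative: the helper names are hypothetical, `downsample` uses simple box averaging as a stand-in for bicubic down-sampling, and `f` is an untrained nearest-neighbour placeholder where a real network would go.

```python
import random

def crop_patch(image, top, left, size):
    """Crop a size x size HR patch from a 2-D list image."""
    return [row[left:left + size] for row in image[top:top + size]]

def downsample(patch, r):
    """g(HR): r x r box averaging (a stand-in for bicubic down-sampling)."""
    n = len(patch) // r
    return [[sum(patch[y * r + i][x * r + j] for i in range(r) for j in range(r)) / (r * r)
             for x in range(n)] for y in range(n)]

def mse(a, b):
    """Mean squared error between two 2-D lists of the same shape."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def make_pair(image, size, r, rng):
    """Return one (LR, HR) training pair from a random crop."""
    top = rng.randrange(len(image) - size + 1)
    left = rng.randrange(len(image[0]) - size + 1)
    hr = crop_patch(image, top, left, size)
    return downsample(hr, r), hr

def f(lr, r=2):
    """Placeholder for the network f: nearest-neighbour upsampling, not a trained model."""
    return [[row[x // r] for x in range(len(row) * r)] for row in lr for _ in range(r)]

# Toy 16x16 "ground-truth" image; no annotation is needed, only the image itself.
image = [[(x + y) % 256 for x in range(16)] for y in range(16)]
rng = random.Random(0)
batch = [make_pair(image, size=8, r=2, rng=rng) for _ in range(4)]

# The training objective: MSE(HR, f(LR)) averaged over the batch.
loss = sum(mse(hr, f(lr)) for lr, hr in batch) / len(batch)
```

In actual training, `f` would be a CNN and `loss` would be minimized by gradient descent over many batches.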
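
Non-deep methods: dictionary-based algorithms
Baseline: A+ (2014)
http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-ACCV-2014.pdf
[Figure: LR patch = 0 x (dictionary patch) + 0.8 x (dictionary patch) + 0.05 x (dictionary patch) + ...; the coefficients are optimized. HR patch = 0 x (dictionary patch) + 0.8 x (dictionary patch) + 0.05 x (dictionary patch) + ..., using the learned dictionary]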
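
The idea in the figure, reusing the coefficients fitted on the LR dictionary to combine the paired HR dictionary patches, can be sketched as below. This is only a toy illustration and not the A+ algorithm itself: the dictionaries are made-up values, the coefficients (0, 0.8, 0.05) are taken from the figure rather than optimized, and `combine` is a hypothetical helper.

```python
def combine(coeffs, atoms):
    """Weighted sum of dictionary patches (each patch is a flat list of pixels)."""
    length = len(atoms[0])
    return [sum(c * atom[i] for c, atom in zip(coeffs, atoms)) for i in range(length)]

# Hypothetical paired dictionaries: entry k of lr_atoms corresponds to entry k of hr_atoms.
lr_atoms = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
hr_atoms = [[1.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5], [0.0, 0.0, 1.0, 1.0]]

# Coefficients as shown in the figure; in practice they are found by optimization
# so that combine(coeffs, lr_atoms) approximates the observed LR patch.
coeffs = [0.0, 0.8, 0.05]

lr_patch = combine(coeffs, lr_atoms)   # approximation of the LR input patch
hr_patch = combine(coeffs, hr_atoms)   # restored HR patch from the same coefficients
```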

■ Early SISR networks
⁃ SRCNN, ESPCN, VDSR
⁃ Upsampling methods: deconv or pixelshuffle

The first deep super-resolution network: SRCNN
[Figure from: 1); the input is upsampled with bicubic x2 before entering the network]
Kernel size: 9-1-5, 9-3-5, or 9-5-5
Very simple, with low computational cost
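
To give a feel for how small SRCNN is, the sketch below counts the convolution weights for the three kernel-size settings. The filter counts n1 = 64 and n2 = 32 and the single input channel are assumptions taken from the SRCNN paper (reference 1), not from this slide; biases are ignored.

```python
def srcnn_weights(f1, f2, f3, n1=64, n2=32, channels=1):
    """Number of conv weights in a 3-layer SRCNN with kernel sizes f1-f2-f3."""
    layer1 = f1 * f1 * channels * n1   # patch extraction
    layer2 = f2 * f2 * n1 * n2         # non-linear mapping
    layer3 = f3 * f3 * n2 * channels   # reconstruction
    return layer1 + layer2 + layer3

for kernels in [(9, 1, 5), (9, 3, 5), (9, 5, 5)]:
    print(kernels, srcnn_weights(*kernels))
```

The middle layer dominates the count as its kernel grows, which is why the 9-5-5 setting is several times heavier than 9-1-5.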

VDSR: a deeper SRCNN
[Figure from: 3)]
3x3 convolutions, 64 ch, depth D = 5 to 20

Efficient sub-pixel CNN (ESPCN)
Unlike SRCNN, it applies the convolutions to the LR image, which makes it efficient
Kernel size: 5-3-3
[Figure from: 2)]

Differences between SRCNN / VDSR and ESPCN
■ Post-upsampling is more efficient, but it cannot upsample by non-integer factors such as 1.6x
[Figure: SRCNN, VDSR: input → bicubic x2 → network → output. ESPCN: input → network → pixel shuffle x2 → output (tensor axes: ch, h, w)]
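
The efficiency gap between pre-upsampling (SRCNN, VDSR) and post-upsampling (ESPCN) comes from the spatial size of the feature maps the convolutions run over. The sketch below compares multiplication counts for a hypothetical 3-layer network; the layer shapes and the 100x100 input are made up for illustration and do not correspond to any of the published models.

```python
def conv_mults(kernel, c_in, c_out, height, width):
    """Multiplications for one 'same'-padded conv layer over a height x width map."""
    return kernel * kernel * c_in * c_out * height * width

def network_mults(layers, height, width):
    """Total multiplications for a list of (kernel, c_in, c_out) layers."""
    return sum(conv_mults(k, ci, co, height, width) for k, ci, co in layers)

h, w, r = 100, 100, 2  # LR input size and upscaling factor (illustrative)

# Pre-upsampling: the image is enlarged first, so every layer runs at (h*r) x (w*r).
pre_layers = [(3, 1, 64), (3, 64, 64), (3, 64, 1)]
pre = network_mults(pre_layers, h * r, w * r)

# Post-upsampling: every layer runs at h x w; the last layer emits r*r channels
# that pixel shuffle then rearranges into the enlarged image.
post_layers = [(3, 1, 64), (3, 64, 64), (3, 64, r * r)]
post = network_mults(post_layers, h, w)

print(pre, post, pre / post)
```

For an upscaling factor r the pre-upsampling variant pays roughly r x r times more, which is the saving that post-upsampling buys at the price of integer-only factors.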

Upscaling with a CNN: Deconvolution or PixelShuffle?
■ Deconvolution
https://distill.pub/2016/deconv-checkerboard/
The number of input pixels contributing to each output position is not uniform, so a checkerboard pattern appears
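
The uneven contribution can be checked with a 1-D toy model of a transposed convolution (deconvolution): each input pixel "paints" a kernel-sized window into the output, and the windows overlap unevenly when the kernel size is not divisible by the stride. The function name `overlap_counts` is an assumption for illustration.

```python
def overlap_counts(input_len, kernel, stride):
    """How many input pixels contribute to each output position of a 1-D transposed conv."""
    out_len = (input_len - 1) * stride + kernel
    counts = [0] * out_len
    for i in range(input_len):
        for j in range(kernel):
            counts[i * stride + j] += 1
    return counts

print(overlap_counts(4, kernel=3, stride=2))  # alternating counts in the interior
print(overlap_counts(4, kernel=4, stride=2))  # uniform counts in the interior
```

With kernel 3 and stride 2 the interior alternates between one and two contributors, which is the 1-D analogue of the checkerboard pattern; choosing a kernel size divisible by the stride evens out the overlap.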

Upscaling with a CNN: Deconvolution or PixelShuffle?
■ What about resize followed by convolution?
The checkerboard pattern disappears
Because resizing (a low-pass operation) may discard information, another option is to fill in the pixels with nearest neighbor
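
The nearest-neighbor variant of the resize step can be sketched as follows; an ordinary convolution would then be applied to the enlarged map. The helper name `nearest_upsample` and the nested-list image format are assumptions for illustration.

```python
def nearest_upsample(image, r):
    """Enlarge a 2-D list by an integer factor r, copying each pixel into an r x r block."""
    return [[row[x // r] for x in range(len(row) * r)]
            for row in image
            for _ in range(r)]

small = [[1, 2],
         [3, 4]]
large = nearest_upsample(small, 2)
```

Every output position receives exactly one input pixel, so the resize itself introduces no uneven overlap and no low-pass filtering.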

Upscaling with a CNN: Deconvolution or PixelShuffle?
■ Sub-pixel convolution (a.k.a. PixelShuffle)
The channel information at each position is tiled spatially
e.g. 9 channels -> 3x3 sub-pixels
It is not free of checkerboard noise
[Figure from: 2)]
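
The channel-to-space tiling can be written out explicitly. This is a plain-Python sketch on nested lists; the channel ordering used here (channel index c*r*r + i*r + j goes to sub-pixel offset (i, j)) is one common convention and is an assumption, since the slide does not specify it.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r]."""
    c_out = len(x) // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # vertical sub-pixel offset
            for j in range(r):      # horizontal sub-pixel offset
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for col in range(w):
                        out[c][y * r + i][col * r + j] = src[y][col]
    return out

# 9 channels at a single position become one 3x3 block of sub-pixels.
nine_channels = [[[k]] for k in range(9)]
tiled = pixel_shuffle(nine_channels, 3)
```

The operation only moves values around; all learnable parameters sit in the convolution that produces the C*r*r channels.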

■ Baseline method: SRResNet
⁃ SRResNet, SRGAN, and EDSR

SRResNet and SRGAN (Twitter, CVPR'17)
[Figure from: 4); generator with 16 residual blocks, 64 ch, a skip connection, and two pixel shuffle x2 stages (tensor axes: ch, h, w); losses: MSE loss, perceptual loss from a trained VGG, and discriminator loss]
・Three kinds of loss functions
・When only MSE is used, the network is called SRResNet

SRResNet* and SRGAN: network details
・resblocks and a skip connection
・pixel shuffle upsampling
[Figure from: 4)]

Enhanced Deep Super Resolution (EDSR): specialized for even higher accuracy
Seoul National University
[Figure from: 5); 32 residual blocks, 256 ch, a skip connection, x2 upsampling stages, l1 loss]

PSNR and visual quality
[Figure from: 5)]
In the 20 dB range, a difference of 1 dB clearly changes how the image looks

■ Super-resolution and perception
⁃ How the loss function affects the restored result
⁃ Perception-Distortion Tradeoff

Subjective evaluation and PSNR
Method      PSNR
Original    -
SRResNet    25.53 dB
SRGAN       21.15 dB
bicubic     21.59 dB
from: 4)

SRResNet and SRGAN: the loss function makes this much difference
MSE loss ● ●
Perceptual loss using VGG ●
Discriminator loss ● ●
Highest PSNR (figure label)
from: 4)

Three types of loss functions
① l1/l2 loss
② perceptual loss
③ GAN loss
[Figure: each loss compares the generated image with the ground truth; labels: VGG, multi-scale feature matching, discriminator, real / fake; axis running from Low Distortion to Good Perception]
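
The three loss types can be sketched on flat lists of pixel values. This is a toy illustration under stated assumptions: `toy_features` is a hypothetical stand-in for a trained VGG feature extractor, and `generator_gan_loss` uses the common -log D(G(LR)) generator form, taking the discriminator's output probability as a plain number.

```python
import math

def l1_loss(sr, hr):
    """Type 1: mean absolute pixel difference."""
    return sum(abs(a - b) for a, b in zip(sr, hr)) / len(hr)

def l2_loss(sr, hr):
    """Type 1: mean squared pixel difference (MSE)."""
    return sum((a - b) ** 2 for a, b in zip(sr, hr)) / len(hr)

def perceptual_loss(features, sr, hr):
    """Type 2: MSE measured in a feature space instead of pixel space."""
    return l2_loss(features(sr), features(hr))

def generator_gan_loss(d_of_sr):
    """Type 3: -log D(G(LR)); small when the discriminator rates the output as real."""
    return -math.log(d_of_sr)

def toy_features(x):
    """Hypothetical stand-in for VGG features: differences between neighbouring pixels."""
    return [b - a for a, b in zip(x, x[1:])]

sr = [1.0, 2.0, 3.0]   # generated image
hr = [1.0, 2.0, 5.0]   # ground truth
print(l1_loss(sr, hr), l2_loss(sr, hr), perceptual_loss(toy_features, sr, hr))
print(generator_gan_loss(0.5))
```

The pixel losses pull the output toward low distortion, while the feature-space and adversarial terms reward outputs that look realistic even if individual pixels deviate.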

Perception-Distortion Tradeoff
No method can achieve low distortion and good perceptual quality at the same time → understanding the tradeoff is important
[Figure from: 8)]

What is the goal of super-resolution?
Accurate: faithful restoration
Plausible: natural-looking restoration
Which one to choose depends on the application!
Source: 4)

■ Summary

Progress on SISR: accuracy and speed
[Chart: PSNR gain over bicubic in dB on the Set5 dataset, x4, plotted by year (2014 to 2017)]
Method      PSNR gain over bicubic [dB]
bicubic      0.0
A+          +1.86
SRCNN       +2.06
ESPCN       +2.48
VDSR        +2.93
SRResNet    +3.63
EDSR        +4.20
PSNR data from: 5)
Mega-multiplications per input pixel for x2 restoration (values shown in the chart): 0.44, 0.04, 0.74, 1.33, 40.7
The computational cost changes greatly with
・the size of the image passing through the CNN
・the number of channels in the intermediate layers
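
The effect of the channel count can be made concrete by counting the convolution weights in a stack of residual blocks, which equals the number of multiplications per feature-map position. The sketch assumes two 3x3 convolutions per block and ignores biases and all layers outside the residual body; the 64-channel configuration is a hypothetical comparison point, while 32 blocks with 256 channels matches the EDSR figures quoted earlier.

```python
def residual_body_weights(num_blocks, channels, kernel=3):
    """Conv weights in a stack of residual blocks with two k x k convs each (biases ignored)."""
    per_conv = kernel * kernel * channels * channels
    return num_blocks * 2 * per_conv

narrow = residual_body_weights(32, 64)    # hypothetical 64-channel body
wide = residual_body_weights(32, 256)     # 32 blocks, 256 ch

print(narrow, wide, wide // narrow)
```

Cost grows with the square of the channel count, so widening from 64 to 256 channels multiplies the work by 16 at the same depth and image size.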

Detailed benchmark from the NTIRE 2017 super-resolution competition
[Figure from: 9); methods shown: EDSR, SRResNet, VDSR, ESPCN, SRCNN, A+]

Summary
■ Deep methods are now mainstream in super-resolution: highly accurate, but computationally expensive
■ Chained resblocks + skip connections and pixel shuffle upsampling are important
■ SRResNet-based methods are the baseline
■ 'Accurate' or 'Plausible': it depends on the application.

Appendix: Residual Dense Network for Super-Resolution
A DenseNet-based SRResNet
[Figure from: 6)]

Appendix: Deep Back-Projection Networks For Super-Resolution
(best PSNR in NTIRE '18 x8 bicubic downsampling track)
[Figure from: 7)]

Datasets
■ DIV2K dataset (train, val)
https://data.vision.ee.ethz.ch/cvl/DIV2K/
■ Set5 dataset (test)
http://people.rennes.inria.fr/Aline.Roumy/results/SR_BMVC12.html
■ B100 dataset (test)
https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
■ Urban100 dataset (test)
https://sites.google.com/site/jbhuang0604/publications/struct_sr

Competitions
■ NTIRE2017:
New Trends in Image Restoration and Enhancement workshop and challenge on image super-resolution in conjunction with CVPR 2017
http://www.vision.ee.ethz.ch/ntire17/
report: http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
■ NTIRE2018:
New Trends in Image Restoration and Enhancement workshop and challenge on super-resolution, dehazing, and spectral reconstruction in conjunction with CVPR 2018
http://www.vision.ee.ethz.ch/ntire18/
report: http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w13/Timofte_NTIRE_2018_Challenge_CVPR_2018_paper.pdf
■ PIRM2018:
Workshop and Challenge on Perceptual Image Restoration and Manipulation in conjunction with ECCV 2018
https://www.pirm2018.org/

References
1) Dong et al., Image Super-Resolution Using Deep Convolutional Networks, https://arxiv.org/abs/1501.00092
2) Shi et al., Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, https://arxiv.org/abs/1609.05158
3) Kim et al., Accurate Image Super-Resolution Using Very Deep Convolutional Networks, https://arxiv.org/pdf/1511.04587
4) Ledig et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, https://arxiv.org/abs/1609.04802
5) Lim et al., Enhanced Deep Residual Networks for Single Image Super-Resolution, https://arxiv.org/abs/1707.02921
6) Zhang et al., Residual Dense Network for Image Super-Resolution, https://arxiv.org/abs/1802.08797
7) Haris et al., Deep Back-Projection Networks For Super-Resolution, https://arxiv.org/pdf/1803.02735.pdf
8) Blau et al., The Perception-Distortion Tradeoff, https://arxiv.org/abs/1711.06077
9) Timofte et al., NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results, http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf