[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution

1
Taegyun Jeon
Feb, 2018
Satrec Initiative
Deep Learning for
Single Image
Super-Resolution

2
Taegyun Jeon (전태균)
Satrec Initiative
 Team leader and Senior Researcher
 PhD in Machine Learning and Computer Vision
Google Developers Experts
 Machine Learning (2017-18)
TensorFlow KR Facebook User group
 Co-administrator
 PR12 (online paper reading group): http://bit.ly/TFPR12
OSGeo KR
 Newbie

3
Contents
1. Single Image Super-Resolution (SISR)
2. Deep Learning for SISR
3. SISR for Satellite Imagery
4. Conclusions

4
Part 1.
Single Image Super-Resolution
Deep Learning for Single Image Super-Resolution

5
Single Image Super-Resolution?
Example

6
Definitions on Super-Resolution (SR)
Super-Resolution
: Enhancing the resolution of an imaging system.
 Digital camera
 Satellite system
 Microscopy system
Resolution
 Optical Resolution: “the capability of an optical system to distinguish, find, or record details”
 Sensor Resolution: “the smallest change a sensor can detect in the quantity that it is measuring”
 Image Resolution: “a measure of the amount of detail in an image”
 Pixel Resolution: “number of total pixels”
 Spatial Resolution: “the ability of any image-forming device to distinguish small details of an object”

7
Challenges on Single Image SR (SISR)
Converting images from HR to LR is straightforward
 High resolution → (downscaling) → Low resolution
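As a minimal sketch of why the HR-to-LR direction is easy, the snippet below downscales a 2-D image by averaging non-overlapping blocks. The function name `downscale` and the block-averaging choice are assumptions for illustration; the slide does not specify a downscaling kernel (bicubic is the common benchmark choice).

```python
import numpy as np

def downscale(hr: np.ndarray, scale: int) -> np.ndarray:
    """Average each scale x scale block of a 2-D image into one LR pixel."""
    h, w = hr.shape
    h, w = h - h % scale, w - w % scale  # crop so the size divides evenly
    blocks = hr[:h, :w].reshape(h // scale, scale, w // scale, scale)
    return blocks.mean(axis=(1, 3))

hr = np.arange(16, dtype=np.float64).reshape(4, 4)
lr = downscale(hr, 2)
print(lr.shape)  # (2, 2)
```

The reverse direction has no such closed form, which is what the following slides address.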

8
Challenges on SISR
 Interpolation on aligned coordinates: Low resolution → High resolution + Low quality
 Reconstruction of high-freq. content: Low resolution → High resolution + Good quality ≈ Ground truth
BIG! SHARP!

9
Purpose of SISR
Recovering missing high-frequency components in the image
BIG!
SHARP!

10
Problems on SISR
High-frequency information is lost
 e.g., edges and texture
Severely ill-posed inverse problem
 Many possible solutions
 Fill the spatial gaps with given low-resolution information
Algorithms exploit contextual information
 Use low-resolution patches and filters
 To predict high-resolution pixels
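The "many possible solutions" point can be made concrete with a small sketch: two clearly different HR images that collapse to the same LR image. The block-average `downscale` helper is a hypothetical stand-in for the real degradation, chosen only to keep the example short.

```python
import numpy as np

def downscale(hr: np.ndarray, scale: int) -> np.ndarray:
    """Block-average downscaling (assumed degradation for this demo)."""
    h, w = hr.shape
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

# A flat gray image and a 0/1 checkerboard: very different high-frequency content.
flat = np.full((4, 4), 0.5)
checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(np.float64)

# Both map to the same LR image, so the LR image alone cannot tell them apart.
same_lr = np.allclose(downscale(flat, 2), downscale(checker, 2))
print(same_lr)
```

This is why SISR methods need a prior or learned context to choose among the candidates.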

11
Evaluation for SISR
Image Quality Metrics for General Image and Satellite Image
 Swath, Non-uniformity, Targeting accuracy, Geo-location accuracy
Term   Meaning
PSNR   Peak signal-to-noise ratio (measures the quality of reconstruction w.r.t. the original image)
SSIM   Structural similarity
MTF    Modulation transfer function
SNR    Signal-to-noise ratio
GSD    Ground sample distance (the distance between pixel centers measured on the ground)
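A minimal PSNR sketch, using the standard definition 10·log10(peak² / MSE). The function name `psnr` and the default `peak=255.0` (8-bit images) are assumptions; satellite imagery often has a higher bit depth, so the peak value would need adjusting.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and an estimate."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((8, 8))
print(psnr(ref, ref + 1.0))  # every pixel off by 1 gray level
```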

12
Dataset for Benchmark
Set5
Set14

13
BSDS100 (Berkeley segmentation dataset)
Urban100

14
Manga109
Historical Photos

15
SpaceNet (DigitalGlobe, CosmiQ Works, NVIDIA)
 WorldView-2 (~0.5m GSD)
 Rio de Janeiro, Brazil (up to 7,000 images): one image covers about 200 m × 200 m

16
Competitions (NTIRE2017@CVPR Workshop)
NTIRE challenge on example-based single image super-resolution
 Track 1: bicubic downscaling x2, x3, x4 competition
 Track 2: unknown downscaling x2, x3, x4 competition
DIV2K dataset / NTIRE 2017 SR Challenge report / DIV2K dataset and study

17
 Track 1: bicubic downscaling x2, x3, x4 competition
 Track 2: unknown downscaling x2, x3, x4 competition
DIV2K dataset / NTIRE 2017 SR Challenge report / DIV2K dataset and study
1st Place Award (SNU)
Bee Lim, Sanghyun Son, Seungjun Nah, Heewon Kim, Kyoung Mu Lee
3rd Place Award (KAIST)
Woong Bae, Jaejun Yoo, Yoseob Han, Jong Chul Ye

18
Competitions (NTIRE2018@CVPR Workshop)
 Track 1: Classic Bicubic - x8
 Track 2: Realistic mild - x4
 Track 3: Realistic difficult - x4
 Track 4: Realistic wild - x4
Realistic conditions
 Degradation operators emulating the image acquisition process from a digital camera
(such as blur kernel, decimation, downscaling strategy)
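A rough sketch of such a degradation operator: blur with a kernel, decimate, then optionally add noise. The Gaussian kernel, its size and sigma, the noise model, and the names `gaussian_kernel` and `degrade` are all assumptions for illustration; the challenge's actual operators are not given in the slide.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.5):
    """1-D Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    return k / k.sum()

def degrade(hr, scale=4, sigma=1.5, noise_std=0.0, seed=0):
    """Blur (separable Gaussian), decimate by `scale`, add Gaussian noise."""
    k = gaussian_kernel(size=7, sigma=sigma)
    pad = len(k) // 2
    padded = np.pad(hr, pad, mode="reflect")
    # Separable blur: filter along rows, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    lr = blurred[::scale, ::scale]  # decimation: keep every `scale`-th pixel
    rng = np.random.default_rng(seed)
    return lr + rng.normal(0.0, noise_std, lr.shape)

lr = degrade(np.full((16, 16), 0.7), scale=4)
print(lr.shape)  # (4, 4)
```

In the "unknown downscaling" setting, the model has to cope without being told these parameters.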

19
Part 2.
Deep Learning for
Single Image Super-Resolution

20
History of SISR (Traditional Models)
Timeline: Traditional Models → Deep Models
* Early learning-based: Freeman (2000), Freeman (2002), Chang (2004)
* Sparsity-based: Yang (2010), Zeyde (2010), Dong (2011), Peleg (2014)
* Self-exemplar-based: Glasner (2009), Huang (2015)
* Locally linear regression-based: Gu (2012), Timofte (2013), Yang (2013), Timofte (2014), Schulter (2015), Salvador (2015), Pérez-Pellitero (2016), Timofte (2016)

21
Traditional Models for SISR
Early-Learning based models
 W. Freeman, E. Pasztor, O. Carmichael, Learning low-level vision, IJCV, 2000.
 W. Freeman, T. Jones, E. Pasztor, Example-based super-resolution, IEEE CGA, 2002.
 H. Chang, D. Yeung, Y. Xiong, Super-resolution through neighbor embedding, CVPR, 2004.
Sparsity-based models
 J. Yang, J. Wright, T. Huang, Y. Ma, Image super-resolution via sparse representation, IEEE TIP 2010.
 R. Zeyde, M. Elad, M. Protter, On single image scale-up using sparse-representations, ICCS, 2010.
 W. Dong, L. Zhang, G. Shi, X. Wu, Image Deblurring and Super-resolution by Adaptive Sparse Domain
Selection and Adaptive Regularization, IEEE TIP, 2011.
 T. Peleg, M. Elad, A statistical prediction model based on sparse representations for single image super-resolution,
IEEE TIP, 2014.
Self-exemplar based models
 D. Glasner, S. Bagon and M. Irani, Super-Resolution from a Single Image, ICCV, 2009.
 J. Huang, A. Singh, and N. Ahuja, Single Image Super-Resolution from Transformed Self-Exemplars, CVPR, 2015.

22
Traditional Models for SISR
Locally linear regression based models
 S. Gu, N. Sang and F. Ma
Fast Image Super Resolution via Local Regression, ICPR, 2012.
 R. Timofte, V. De Smet, and L. Van Gool
Anchored neighborhood regression for fast example-based super-resolution, ICCV, 2013.
 C. Yang and M. Yang
Fast direct super-resolution by simple functions, ICCV, 2013.
 R. Timofte, V. De Smet, and L. Van Gool
A+: Adjusted anchored neighborhood regression for fast super-resolution, ACCV, 2014.
 S. Schulter, C. Leistner, and H. Bischof
Fast and accurate image upscaling with super-resolution forests, CVPR, 2015.
 J. Salvador and E. Pérez-Pellitero
Naive Bayes Super-Resolution Forest, ICCV, 2015.
 E. Pérez-Pellitero, J. Salvador, J. Ruiz-Hidalgo, and B. Rosenhahn
PSyCo: Manifold Span Reduction for Super Resolution, CVPR, 2016.
 R. Timofte, R. Rothe, and L. Van Gool
Seven Ways to Improve Example-Based Single Image Super Resolution, CVPR, 2016.

23
Deep Models for SISR
 C Dong, CC Loy, K He, X Tang
Learning a deep convolutional network for image super-resolution, ECCV, 2014.
SRCNN: First approach using a CNN to solve single image super-resolution [CNN, MSE]
 C Dong, CC Loy, K He, X Tang
Image Super-Resolution Using Deep Convolutional Networks, TPAMI, 2016.
SRCNN-ex: Uses more training data and achieves better SR performance [CNN, MSE]
 Z Wang, D Liu, J Yang, W Han, T Huang
Deep networks for image super-resolution with sparse prior, ICCV, 2015.
SCN and CSCN: Combine the conventional sparse coding model with a CNN [CNN, MSE, SC]
 JSJ Ren, L Xu, Q Yan, W Sun
Shepard Convolutional Neural Networks, NIPS, 2015.
ShCNN: Adds a Shepard layer for Translation Variant Interpolation (TVI) [CNN, MSE, Shepard]
 J Bruna, P Sprechmann, Y LeCun
Super-Resolution with Deep Convolutional Sufficient Statistics, ICLR, 2016.
Energy model based on scattering, VGG-19, and Gibbs sampling [CNN, MSE]
 J Kim, J Kwon Lee, K Mu Lee
Accurate Image Super-Resolution Using Very Deep Convolutional Networks, CVPR, 2016.
VDSR: Simple VGG-like model with global skip-connection for residual learning [CNN, MSE, Residual]
Citations (Feb 2018): 656, 623, 144, 32, 46, 322

24
 J Kim, J Kwon Lee, K Mu Lee
Deeply-Recursive Convolutional Network for Image Super-Resolution, CVPR, 2016.
DRCN: Adds recursive connections for weight sharing and model compression [CNN, MSE, Recursive]
 W Shi, J Caballero, F Huszár, J Totz, A P. Aitken, R Bishop, D Rueckert, Z Wang
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, CVPR, 2016.
ESPCN: Upscaling with a sub-pixel convolution layer (by fractional stride in LR space) [CNN, MSE, Subpixel Conv]
 C Dong, CC Loy, X Tang
Accelerating the Super-Resolution Convolutional Neural Network, ECCV, 2016.
FSRCNN: Removes bicubic interpolation and adds deconvolution for various scales [CNN, MSE, Deconv]
 J Johnson, A Alahi, L Fei-Fei
Perceptual Losses for Real-Time Style Transfer and Super-Resolution, ECCV, 2016.
Proposed perceptual loss functions for the SR task and adopted a style transfer network [CNN, Perceptual, StyleTransfer]
 J Mairal
End-to-End Kernel Learning with Supervised Convolutional Kernel Networks, NIPS, 2016.
SCKN: Trains an SCKN using a reproducing kernel Hilbert space (RKHS) [CNN, MSE, RKHS]
 C K Sønderby, J Caballero, L Theis, W Shi, F Huszár
Amortised MAP Inference for Image Super-resolution, ICLR, 2017.
AffGAN: MAP inference under an image prior, applied to GAN training [GAN, Perceptual, MAP]
Citations (Feb 2018): 138, 177, 106, 389, 20, 59
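The rearrangement step behind ESPCN's sub-pixel convolution can be sketched in a few lines: a feature map with C·r² channels at LR size is reshuffled into C channels at r times the spatial size. This shows only the periodic shuffling, not the convolution that produces the channels. The function name `pixel_shuffle` and the channel-first layout are assumptions.

```python
import numpy as np

def pixel_shuffle(features: np.ndarray, r: int) -> np.ndarray:
    """Rearrange (C*r*r, H, W) features into a (C, H*r, W*r) image."""
    c_r2, h, w = features.shape
    c = c_r2 // (r * r)
    x = features.reshape(c, r, r, h, w)   # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)        # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)     # interleave i, j into the spatial axes

# Four 1x1 feature maps become one 2x2 output patch.
features = np.arange(4, dtype=np.float64).reshape(4, 1, 1)
out = pixel_shuffle(features, 2)
print(out.shape)  # (1, 2, 2)
```

Because all convolutions run at LR size and upscaling happens only in this final step, the approach is cheaper than upsampling the input first.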

25
 W Lai, J Huang, N Ahuja, and M Yang
Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution, CVPR, 2017.
LapSRN: Progressively reconstructs the sub-band residuals using a Laplacian pyramid [CNN, Charbonnier, Laplacian pyramid]
 K Zhang, W Zuo, S Gu, and L Zhang
Learning Deep CNN Denoiser Prior for Image Restoration, CVPR, 2017.
IRCNN: Adds half quadratic splitting to use a CNN denoiser prior [CNN, MSE, Dilated Conv]
 Y Tai, J Yang, and X Liu
Image Super-Resolution via Deep Recursive Residual Network, CVPR, 2017.
DRRN: Recursive connections with residual blocks for weight sharing and a compact model [CNN, MSE, Recursive, Residual]
 B Lim, S Son, H Kim, S Nah, and K Lee
Enhanced Deep Residual Networks for Single Image Super-Resolution, CVPRW, 2017.
EDSR, MDSR: Optimized residual block and single-/multi-scale SR models [CNN, L1, Residual]
 C Ledig, L Theis, F Huszar, J Caballero, A Aitken, A Tejani, J Totz, Z Wang, and W Shi
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, CVPR, 2017.
SRGAN: Proposed GAN-based model, evaluated by a mean-opinion-score (MOS) test [GAN, Perceptual, Residual, MOS]
Citations (Feb 2018): 34, 22, 23, 24, 373

26
 M Sajjadi, B Schölkopf, and M Hirsch
EnhanceNet: Single Image Super-Resolution through Automated Texture Synthesis, ICCV, 2017.
ENet: Adversarial training, perceptual losses, and a proposed texture transfer loss [GAN, Perceptual, Texture transfer]
 Y Tai, J Yang, X Liu, and C Xu
MemNet: A Persistent Memory Network for Image Restoration, ICCV, 2017.
MemNet: Deep persistent memory network [CNN, MSE, Memory block]
 T Tong, G Li, X Liu, and Q Gao
Image Super-Resolution Using Dense Skip Connections, ICCV, 2017.
SRDenseNet: Dense skip connections and DenseNet blocks [CNN, MSE, DenseNet]
 J Yamanaka, S Kuwashima, and T Kurita
Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network, ICONIP, 2017.
DCSCN: Bicubic upsampling, residual block, reconstruction with NIN [CNN, MSE, NIN]
Citations (Feb 2018): 23, 5, 3, 1

27
Architecture (CNN based)
Convolutional Neural Networks (CNN)
 HR image resolution
 LR image resolution + upsampling
Figure from C Dong, CC Loy, X Tang
Accelerating the Super-Resolution Convolutional Neural Network, ECCV, 2016

28
Backbone network (VGGNet)
 Stacked convolution layers
 residual connection / recursive connection
Figure from W Lai, J Huang, N Ahuja, and M Yang
Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution, CVPR, 2017.

29
Backbone network (ResNet)
Figure from B Lim, S Son, H Kim, S Nah, and K Lee.
Enhanced Deep Residual Networks for Single Image Super-Resolution. CVPRW, 2017

30
Backbone network (DenseNet)
Figure from F Zhou, X Li, and Z Li.
High-Frequency Details Enhancing DenseNet for Super-Resolution. Neurocomputing, 2018

31
Architecture (GAN based)
Backbone network (Generative Adversarial Network: GAN)
Figure from C Ledig, L Theis, F Huszar, J Caballero, A Aitken, A Tejani, J Totz, Z Wang, and W Shi
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, CVPR, 2017.

32
Evaluation metric vs Human Perception
PSNR and SSIM
?

33
Evaluation metric vs Human Perception
PSNR and SSIM

34
MSE-based vs GAN-based
Figure from M Sajjadi, B Schölkopf, and M Hirsch
EnhanceNet: Single Image Super-Resolution through Automated Texture Synthesis, ICCV, 2017.

35
Losses
L2 loss (MSE)
 $x$: input image (LR image)
 $y$: ground truth (HR image)
 $\hat{y}$: output image (predicted HR image)
 $\mathrm{MSE}(y, \hat{y}) = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, summed over the $N$ pixels
L1 loss
 Smooth L1 loss
 Abs L1 loss
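The three pixel-wise losses named above can be sketched as follows. The smooth L1 form here is the common Huber-style definition with a threshold `beta`; that choice and the function names are assumptions, since the slide does not give a formula.

```python
import numpy as np

def l2_loss(y, y_hat):
    """Mean squared error (MSE)."""
    return np.mean((y - y_hat) ** 2)

def l1_loss(y, y_hat):
    """Mean absolute error (the 'Abs L1' loss)."""
    return np.mean(np.abs(y - y_hat))

def smooth_l1_loss(y, y_hat, beta=1.0):
    """Quadratic for errors below beta, linear above (Huber-style, assumed form)."""
    d = np.abs(y - y_hat)
    return np.mean(np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta))

y = np.zeros(4)
y_hat = np.array([0.0, 0.5, 1.0, 3.0])
print(l2_loss(y, y_hat), l1_loss(y, y_hat), smooth_l1_loss(y, y_hat))
```

The L2 loss weights the single large error (3.0) far more heavily than the L1 variants do, which is one reason L1-type losses are often preferred for sharper outputs.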

36
Losses
Slide from M Sajjadi, B Schölkopf, and M Hirsch, EnhanceNet: Single Image Super-Resolution through Automated Texture Synthesis, ICCV, 2017.

37
Losses
Charbonnier loss
 Charbonnier penalty function: $\rho(x) = \sqrt{x^2 + \varepsilon^2}$
 $N$: the number of training samples in each batch
 $L$: the number of levels in the Laplacian pyramid
 $\varepsilon = 10^{-3}$ (empirical setting)
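A minimal sketch of the Charbonnier penalty, ρ(x) = sqrt(x² + ε²), applied to a single pyramid level and averaged over pixels. The function name and the averaging are assumptions; a LapSRN-style loss would additionally sum this over the L pyramid levels and the N samples in the batch.

```python
import numpy as np

def charbonnier_loss(y, y_hat, eps=1e-3):
    """Mean Charbonnier penalty: a smooth, differentiable variant of L1."""
    return np.mean(np.sqrt((y - y_hat) ** 2 + eps ** 2))

y = np.zeros((4, 4))
print(charbonnier_loss(y, y))        # equals eps when the prediction is exact
print(charbonnier_loss(y, y + 3.0))  # close to the L1 error for large residuals
```

For residuals much larger than ε the penalty behaves like |x|, while near zero it stays smooth, which avoids the non-differentiable kink of the plain L1 loss.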

38
Data augmentation
Flip
Rotation (90°, 180°, 270°)
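Combining a flip with the four rotations yields eight geometric variants of each training patch. The sketch below generates them with NumPy; the function name `augment` is an assumption. When training on LR/HR pairs, the same transform has to be applied to both images of a pair.

```python
import numpy as np

def augment(img: np.ndarray) -> list:
    """Return the 8 flip/rotation variants of a 2-D image."""
    variants = []
    for base in (img, np.fliplr(img)):   # original and horizontally flipped
        for k in range(4):               # rotate by 0, 90, 180, 270 degrees
            variants.append(np.rot90(base, k))
    return variants

patch = np.arange(4).reshape(2, 2)
variants = augment(patch)
print(len(variants))  # 8
```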

39
Geometric Self-Ensemble
 Transform the input: FLIP × ROTATE (0°, 90°, 180°, 270°)
 Inference on each transformed input
 Inverse transform on each output
 Average
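The self-ensemble procedure can be sketched as below: run the model on all eight flip/rotation variants, undo each transform on the output, and average. Here `model` is any callable mapping a 2-D LR array to a 2-D SR array; the nearest-neighbour upscaler used in the usage lines is a hypothetical placeholder, not a real SR network.

```python
import numpy as np

def self_ensemble(model, lr: np.ndarray) -> np.ndarray:
    """Average model outputs over 8 geometric transforms of the input."""
    outputs = []
    for flip in (False, True):
        x = np.fliplr(lr) if flip else lr
        for k in range(4):
            y = model(np.rot90(x, k))   # inference on the transformed input
            y = np.rot90(y, -k)         # undo the rotation
            if flip:
                y = np.fliplr(y)        # undo the flip
            outputs.append(y)
    return np.mean(outputs, axis=0)

# Placeholder "model": x2 nearest-neighbour upscaling.
model = lambda x: np.kron(x, np.ones((2, 2)))
lr = np.arange(6, dtype=np.float64).reshape(2, 3)
sr = self_ensemble(model, lr)
print(sr.shape)  # (4, 6)
```

With a trained network the eight outputs differ slightly, and averaging them typically gives a small gain without any extra training.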

40
Part 3.
Single Image Super-Resolution for
Satellite Imagery

41
SISR for Satellite Imagery
Low Resolution
 L. Liebel, M. Körner, Single-image super resolution for multispectral remote sensing data using convolutional neural
networks, ISPRS, 2016
 N Brodu, Super-resolving multiresolution images with band-independent geometry of multispectral pixels, IEEE TGRS,
2017
 X Liu, Q Liu, and Y Wang, Remote Sensing Image Fusion Based on Two-stream Fusion Network, arXiv, 2017
 C Lanaras, J Bioucas-Dias, E Baltsavias, K Schindler. Super-Resolution of Multispectral Multiresolution Images from a
Single Sensor. CVPRW, 2017
Very-High Resolution
 M Bosch, C Gifford, P Rodriguez, Super-Resolution for Overhead Imagery Using DenseNets and Adversarial Learning,
arXiv, 2017.
 S Mei, X Yuan, J Ji, Y Zhang, S Wan and Q Du, Hyperspectral Image Spatial Super-Resolution via 3D Full Convolutional
Neural Network, Remote Sensing, 2017
 A Rangnekar, N Mokashi, E Ientilucci, C Kanan, M Hoffman, Aerial Spectral Super-Resolution using Conditional
Adversarial Networks, arXiv, 2017

42
SRCNN for Sentinel-2 (10m)
Figure from L. Liebel, M. Körner, Single-image super resolution for multispectral remote sensing data using convolutional neural networks, ISPRS, 2016

43
DenseNet + GAN
Dataset
 SpaceNet (captured by WorldView-3): AoI-2 (Las Vegas), ~0.3m GSD
 Scale factor: x8
Figure from M Bosch, C Gifford, P Rodriguez, Super-Resolution for Overhead Imagery Using DenseNets and Adversarial Learning, arXiv, 2017

44
DenseNet + GAN
Architecture

45
DenseNet + GAN
Results

47
3D FCN (Fully Convolutional Neural Network)
Figure from S Mei, X Yuan, J Ji, Y Zhang, S Wan and Q Du, Hyperspectral Image Spatial Super-Resolution via 3D Full Convolutional Neural Network,
Remote Sensing, 2017

48
AeroGAN
 31 hyperspectral bands
 cGAN based Model
Figure from A Rangnekar, N Mokashi, E Ientilucci, C Kanan, M Hoffman, Aerial Spectral Super-Resolution using Conditional Adversarial Networks, arXiv, 2017

49
Part 4.
Conclusions

50
Limitations
Information preservation
 Structural info
 Spectral info
 Texture info
Scale Factors with stable results
Evaluation metrics
 PSNR and SSIM

51
Summary
Architecture
 CNN and GAN
Losses
 L1 loss, L2 loss, Perceptual loss, Charbonnier loss
Backbone networks
 VGG, ResNet, and DenseNet
SR for Satellite Imagery
 We still have a long way to go.
