This document discusses single image super-resolution (SISR) techniques. SISR aims to estimate a high-resolution image from a single low-resolution input. Early methods like bicubic interpolation were simple but yielded oversmoothed results. SRCNN introduced a three-layer convolutional network to learn a nonlinear mapping from low to high resolution. DRCN and SRResNet used deeper networks with skip connections. SRGAN introduced an adversarial loss to generate perceptually realistic textures: it trains a generator network against a discriminator to produce outputs indistinguishable from real images. Benchmark datasets and mean opinion scores are used to evaluate the state-of-the-art methods.
2. What is Single Image Super-Resolution
SISR
Estimating a high-resolution image from its low-resolution counterpart.
What distinguishes SISR is that it is more flexible and has a wider range of
applications than multiple-image super-resolution, since it needs only one
picture as input, though this also makes it more challenging.
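As a minimal sketch of this setting, assuming a simple average-pooling degradation model (the `downsample` helper is illustrative, not part of any method discussed here):

```python
import numpy as np

def downsample(x, s):
    """Average-pool an image by an integer factor s -- one common LR model."""
    h, w = x.shape
    return x[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

hr = np.random.rand(32, 32)   # the unknown high-resolution image
lr = downsample(hr, 2)        # the ONLY input a SISR method gets to see
print(hr.shape, lr.shape)     # (32, 32) (16, 16)
```

A SISR method must invert this mapping from `lr` alone, which is ill-posed: many different HR images produce the same LR observation.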
3. Applications
Improves standard video resolution.
Improves surveillance video resolution.
Medical imaging.
Improves satellite image resolution.
4. Bicubic
A fixed interpolation filter (no learning involved).
Very fast.
Oversimplifies the picture.
Yields solutions with overly smooth textures.
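For illustration, SciPy's cubic spline `zoom` (with `order=3`) can serve as a stand-in for true bicubic convolution:

```python
import numpy as np
from scipy.ndimage import zoom

lr = np.random.rand(16, 16)
sr = zoom(lr, 2, order=3)   # cubic spline upscaling, close in spirit to bicubic
print(sr.shape)             # (32, 32)
```

Being a fixed filter, it cannot hallucinate high-frequency detail, hence the overly smooth textures noted above.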
5. SRCNN
Three layers.
Fully feed-forward, thus pretty fast.
The input is first upscaled to the desired size using bicubic interpolation.
The first convolutional layer of the SRCNN extracts a set of feature maps. The
second layer maps these feature maps nonlinearly to high-resolution patch
representations. The last layer combines the predictions within a spatial
neighborhood to produce the final high-resolution image.
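The three-layer pipeline can be sketched in plain NumPy/SciPy. The filter sizes follow the 9-1-5 configuration commonly used for SRCNN; the random weights are placeholders (in the real method they are learned from data):

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)

def conv(x, w):
    """x: (C, H, W), w: (F, C, k, k) -> (F, H, W), zero-padded 'same' output."""
    return np.stack([
        sum(correlate2d(x[c], w[f, c], mode="same") for c in range(x.shape[0]))
        for f in range(w.shape[0])
    ])

relu = lambda t: np.maximum(t, 0)

x = rng.standard_normal((1, 17, 17))  # bicubic-upscaled input (random stand-in)
y = relu(conv(x, rng.standard_normal((64, 1, 9, 9)) * 0.01))   # 1: feature extraction
y = relu(conv(y, rng.standard_normal((32, 64, 1, 1)) * 0.01))  # 2: nonlinear mapping
y = conv(y, rng.standard_normal((1, 32, 5, 5)) * 0.01)         # 3: reconstruction
print(y.shape)  # (1, 17, 17)
```

Note that the output has the same spatial size as the input: all upscaling happens in the bicubic pre-processing step, and the network only restores detail.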
7. DRCN
The network takes an interpolated input image (to the desired size).
Consists of three sub-networks: embedding, inference and reconstruction
networks.
Embedding net is used to represent the given image as feature maps ready for
inference.
The inference net solves the task itself, and the reconstruction net maps the
resulting feature maps back to an image.
8. DRCN Cont.
For the intermediate outputs, a loss l1(θ) averages the reconstruction error
over all recursions.
For the final output, a loss l2(θ) measures the error of the weighted
combination of the intermediate predictions.
9. DRCN Cont.
The final loss function L(θ) combines both terms. Training is regularized by
weight decay (an L2 penalty weighted by β).
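Written out following the formulation in the DRCN paper (the symbols here are reconstructed, not quoted from the slides: y is the ground truth, ŷ_d the d-th recursion's prediction, D the number of recursions, N the number of samples):

```latex
l_1(\theta) = \frac{1}{2DN}\sum_{d=1}^{D}\sum_{i=1}^{N}
  \left\| y^{(i)} - \hat{y}_d^{(i)} \right\|^2
  \qquad \text{(intermediate outputs)}

l_2(\theta) = \frac{1}{2N}\sum_{i=1}^{N}
  \Big\| y^{(i)} - \sum_{d=1}^{D} w_d\,\hat{y}_d^{(i)} \Big\|^2
  \qquad \text{(weighted final output)}

L(\theta) = \alpha\, l_1(\theta) + (1-\alpha)\, l_2(\theta) + \beta \|\theta\|^2
```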
10. SRResNet
A ResNet with 16 residual blocks.
Optimized for MSE.
Has skip-connections.
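A shape-level sketch of such residual blocks with identity skip-connections (channel mixing stands in for the real 3x3 convolutions, and batch-norm is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
prelu = lambda t, a=0.2: np.where(t > 0, t, a * t)
mix = lambda x, w: np.tensordot(w, x, axes=1)    # stand-in for a 3x3 conv

def residual_block(x, w1, w2):
    """conv -> activation -> conv, then add the identity skip-connection."""
    return x + mix(prelu(mix(x, w1)), w2)

x = rng.standard_normal((8, 16, 16))
weights = [(rng.standard_normal((8, 8)) * 0.1,
            rng.standard_normal((8, 8)) * 0.1) for _ in range(16)]
for w1, w2 in weights:                           # 16 blocks, as in SRResNet
    x = residual_block(x, w1, w2)
print(x.shape)  # (8, 16, 16)
```

The skip-connections let gradients flow directly to early layers, which is what makes a network this deep trainable under a plain MSE objective.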
11. SRGAN
Generator.
Deep network with B residual blocks.
Two trained sub-pixel convolutional layers with small 3x3 kernels.
Batch-normalization layers.
The activation function is Parametric ReLU (PReLU).
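The core rearrangement inside a sub-pixel convolutional layer (often called pixel shuffle) can be written directly in NumPy; the preceding conv produces C·r² channels, which are then folded into an r-times-larger image:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Sub-pixel rearrangement: (C*r*r, H, W) -> (C, H*r, W*r)."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    return (x.reshape(c, r, r, h, w)
             .transpose(0, 3, 1, 4, 2)   # interleave the r*r sub-grids spatially
             .reshape(c, h * r, w * r))

feat = np.random.rand(4, 8, 8)    # 1 output channel * 2^2 from a preceding conv
up = pixel_shuffle(feat, 2)
print(up.shape)  # (1, 16, 16)
```

Unlike bicubic pre-upscaling, this lets the generator work at low resolution throughout and upscale only at the end, which is much cheaper.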
16. SRGAN Cont.
Introduces Mean Opinion Score (MOS).
26 raters were asked to assign an integer score from 1 (bad quality) to 5
(excellent quality) to the super-resolved images.
Each rater rated 1128 images from the given datasets.
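With hypothetical ratings matching this protocol (real MOS values come from human raters, not random numbers), the score is a simple average over raters:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical ratings: 26 raters x 1128 images, integer scores in 1..5.
scores = rng.integers(1, 6, size=(26, 1128))
mos = scores.mean(axis=0)    # average over raters -> one MOS per image
print(mos.shape)             # (1128,)
```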
17. Datasets
The models were tested on three of the best-known benchmark datasets for SISR:
Set5, Set14 and BSD100.