An Android App for Image Super Resolution using Deep Learning

An Android Application for Image Super Resolution Through Deep Learning
Scuola Politecnica e delle Scienze di Base
Master's Degree Program in Computer Engineering
Master's thesis
Supervisor
Prof. Carlo Sansone
Co-supervisors
Ing. Gabriele Piantadosi, Ph.D.
Ing. Stefano Marrone
Candidate
Giuseppe Caliendo
Student ID M63/639
Academic Year 2017/18

Context
Improvement of image quality through Super Resolution
Image Super Resolution (SR) with a single image input
Deep Learning to improve Super Resolution
Super Resolution in Mobile Environments
Contribution
Comparison of different state-of-the-art approaches to SR
Implementation of the analyzed models in TensorFlow
Design and development of an Android application for image super
resolution based on TensorFlow and TensorFlow Mobile

Super Resolution
Super Resolution aims to recover a high-resolution (HR) version from a
low-resolution (LR) input.
SR approaches can be categorized as:
SISO: Single Input Single Output
MISO: Multiple Input Single Output
MIMO: Multiple Input Multiple Output
Super resolution is used wherever high-frequency details are
desired (e.g., medical imaging, video surveillance, media, photography, …)
Single-image approaches: one LR image in, one HR image out
Multi-frame approaches: merge several LR images to produce a single HR one

Single Image Super Resolution (SISR)
An LR image is an acquired or processed image degraded by a variety of factors:
Imperfections of the acquisition devices
Instability of the observed scene
Presence of noise (usually white)
An LR image Y can be mathematically modeled as:
$$Y = D H X + n$$
Where:
Y : LR image
X : Original image (HR image)
D : Downsampling process
H : Blurring process
n : Additive white Gaussian noise (AWGN)
SISR computes an estimate $\hat{X} \approx X$ from $Y$. This is an ill-posed problem, because many different HR images are consistent with the same LR image.
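The degradation model above can be sketched in a few lines of NumPy. This is only an illustration: the box blur standing in for H, the pixel-skipping downsampler standing in for D, and the names `box_blur`, `downsample`, and `degrade` are all assumptions, not the operators used in the thesis.

```python
import numpy as np

def box_blur(x, k=3):
    """H: blur with a k x k box kernel (edge-padded). Assumed stand-in for the real blur."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros(x.shape, dtype=np.float64)
    for di in range(k):
        for dj in range(k):
            out += xp[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out / (k * k)

def downsample(x, scale=2):
    """D: keep every `scale`-th pixel along both axes."""
    return x[::scale, ::scale]

def degrade(x, scale=2, noise_std=0.01, seed=0):
    """Y = D(H(X)) + n for a 2-D grayscale image X."""
    rng = np.random.default_rng(seed)
    y = downsample(box_blur(x), scale)
    return y + rng.normal(0.0, noise_std, size=y.shape)

hr = np.random.default_rng(1).random((8, 8))  # toy HR image
lr = degrade(hr, scale=2)                     # simulated LR observation
```

Because blurring and downsampling discard information, many different `hr` arrays produce nearly the same `lr`, which is what makes the inverse problem ill-posed.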

Example-Based SISR
Tries to estimate the HR version of an LR image by learning from «examples»
Examples: associations between sub-parts of LR images and HR images in a dataset
Generates optimized dictionaries through trainable machine learning models
Output images are composed of the HR versions of the recognized “concepts”
Depends on the size and variety of the dataset (may require a large dataset)
Deep Learning
✓ Easy to Train
✓ Predetermined features
Classic ML Models:
× Handcrafted feature extraction
× Task Dependence
× Predetermined Examples
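The idea of example-based SISR can be illustrated with the simplest possible "dictionary": a list of paired LR/HR patches searched by nearest neighbour. This is a toy sketch under stated assumptions; real methods such as ANR learn compact dictionaries and regressors instead of storing raw pairs, and the names `build_dictionary` and `lookup_hr` are hypothetical.

```python
import numpy as np

def build_dictionary(lr_patches, hr_patches):
    """Store paired examples: flattened LR patches as keys, HR patches as values."""
    lr = np.asarray(lr_patches, dtype=np.float64)
    hr = np.asarray(hr_patches, dtype=np.float64)
    return lr.reshape(len(lr), -1), hr

def lookup_hr(lr_patch, dictionary):
    """Return the HR patch paired with the LR example closest to `lr_patch`."""
    lr_keys, hr_values = dictionary
    query = np.asarray(lr_patch, dtype=np.float64).reshape(1, -1)
    distances = np.sum((lr_keys - query) ** 2, axis=1)  # squared Euclidean distance
    return hr_values[int(np.argmin(distances))]

# Two toy examples: a dark patch and a bright patch (2x2 LR, 4x4 HR).
lr_examples = [np.zeros((2, 2)), np.ones((2, 2))]
hr_examples = [np.zeros((4, 4)), np.ones((4, 4))]
dictionary = build_dictionary(lr_examples, hr_examples)

# A mostly bright LR query is matched to the bright HR example.
hr_patch = lookup_hr(np.full((2, 2), 0.9), dictionary)
```

The sketch also shows why the result depends on the dataset: a query that resembles none of the stored examples still gets the closest one, however poor the match.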

Early Deep Model for SISR: SRCNN [1]
Goals:
Automatic feature extraction
Optimization of the patch extraction (example generation) and
reconstruction phases
Steps:
1. Preprocessing:
▪ Upsampling of the input to the desired output size
through interpolation
▪ Conversion of the input color space from RGB to YCbCr (optional)
▪ Extraction of the luminance component (optional)
2. Patch extraction and representation
3. Generation of the non-linear mapping: correspondence
between upscaled LR patches and HR patches
4. Reconstruction: average of the HR versions of the
overlapping patches
Note on YCbCr space:
▪ More efficient training
▪ Luminance contains most of the details
[Figure: an RGB image split into its Y, Cb, and Cr channels]
[1]: Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Image super-resolution using deep convolutional networks.
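The optional color-space step of the preprocessing can be sketched as follows. The slides do not say which YCbCr variant was used, so the full-range ITU-R BT.601 (JFIF) coefficients below are an assumption, and `rgb_to_ycbcr` and `luminance` are hypothetical helper names.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (..., 3) RGB array with values in [0, 255] to YCbCr.

    Assumes full-range BT.601 (JFIF) coefficients; other variants differ slightly.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def luminance(rgb):
    """Extract only the Y channel, the one typically fed to the network."""
    return rgb_to_ycbcr(rgb)[..., 0]

image = np.random.default_rng(0).integers(0, 256, size=(4, 4, 3))
y_channel = luminance(image)  # shape (4, 4)
```

A common design, assumed here rather than stated in the slides, is to super-resolve only `y_channel` with the network and upscale Cb and Cr by plain interpolation.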

SISR with Deep Residual Learning
SRCNN problems:
As the depth increases, performance tends to deteriorate
(information degradation due to vanishing gradients)
Receptive field limited to a single convolution
Training converges too slowly
The network only works for a single scale (fixed input size)
Deep Residual Learning:
Models are forced to learn residual
mapping functions
Training is more efficient
Input information is continuously
propagated through
the network
Implemented Models:
VDSR [2], DRRN [3]
[2]: Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. Accurate image super-resolution using very deep convolutional networks.
[3]: Ying Tai, Jian Yang, and Xiaoming Liu. Image super-resolution via deep recursive residual network.
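The core of residual learning for SISR can be shown without any network: the model is trained to predict only the detail missing from the interpolated input, and a skip connection adds that detail back. The sketch below illustrates just this global skip idea; the layer structure of VDSR and DRRN is omitted, and the function names are hypothetical.

```python
import numpy as np

def residual_target(hr, upscaled_lr):
    """What a residual model is trained to predict: the missing high-frequency detail."""
    return hr - upscaled_lr

def reconstruct(upscaled_lr, predicted_residual):
    """Global skip connection: add the predicted detail back onto the input."""
    return upscaled_lr + predicted_residual

hr = np.array([[10.0, 20.0], [30.0, 40.0]])           # toy ground-truth patch
upscaled_lr = np.array([[12.0, 18.0], [29.0, 41.0]])  # toy interpolated input

residual = residual_target(hr, upscaled_lr)
restored = reconstruct(upscaled_lr, residual)  # a perfect prediction recovers hr
```

Since the interpolated input already resembles the HR image, the residual is small and mostly zero, which is one reason why training tends to converge faster than when the full image is learned directly.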

From Training to Deployment
Competitors:
Bicubic interpolation
Anchored Neighborhood Regression (ANR) [4]
SRCNN (pretrained)
Datasets
Dataset     #Images                     Patches/image   Total patches   Batch size   #Batches
Train S.    800 => 4800 (after aug.)    8               38400           64           600
Valid. S.   100                         16              1600            400          4
Test S.     214 (14+100+100)            --              --              --           --
Goals:
Produce trained versions of DRRN and VDSR
Compare them with the competitors
Port the deep models to Android
[4]: Timofte, R., De Smet, V., & Van Gool, L. (2013). Anchored neighborhood regression for fast example-based super-resolution.
Training Set (TS) and Validation Set (VS):
DIV2K dataset
Test Set: Set14 (14 images), BSD100 (100
images), and Urban100 (100 images) datasets
Data augmentation was applied (from 800 to 4800 training images)
From each TS and VS image, 64x64 patches
were extracted to make the
training phase more efficient
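The patch extraction and the batch counts in the dataset table can be checked with a short sketch. How the thesis chose the patches of each image (regular grid or random crops) is not stated, so the non-overlapping grid below is an assumption, and `extract_patches` is a hypothetical helper.

```python
import numpy as np

def extract_patches(image, size=64, stride=64):
    """Cut an image into size x size patches on a regular grid (assumed strategy)."""
    h, w = image.shape[:2]
    patches = [
        image[i:i + size, j:j + size]
        for i in range(0, h - size + 1, stride)
        for j in range(0, w - size + 1, stride)
    ]
    return np.stack(patches)

# A 128x256 toy image yields 2 x 4 = 8 patches of 64x64.
patches = extract_patches(np.zeros((128, 256)))

# Arithmetic behind the training row of the table.
train_images = 4800   # after augmentation
patches_per_image = 8
batch_size = 64
total_patches = train_images * patches_per_image  # 38400
num_batches = total_patches // batch_size         # 600
```

The same arithmetic gives the validation row: 100 images with 16 patches each produce 1600 patches, or 4 batches of 400.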

Experimental Setup
DL Architecture:
Cloud service: Google Colab
GPU hardware: NVIDIA Tesla K80 with 2496 CUDA cores
and 12 GB of RAM
cuDNN: CUDA library of primitives for deep neural
networks
Jupyter Notebook: Python environment
DL framework: TensorFlow 1.9
Deployment Architecture:
Android: TargetSDK 28, minSDK 24
(Android 7.0 – 9.0)
NDK 14 (to compile
TensorFlow on Android)
TensorFlow Mobile 1.9 compiled for
ARMv7-A application processors
Testing on a Xiaomi Redmi Note 4X
Training Parameters:
Loss function: MSE
Optimizer: Adam
Initial learning rate: 0.05
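To make the training parameters concrete, here is a minimal NumPy sketch of one Adam update minimizing an MSE loss with the initial learning rate of 0.05. It is not the thesis training code, which used TensorFlow 1.9; the toy one-weight regression problem and the default values of `beta1`, `beta2`, and `eps` are assumptions.

```python
import numpy as np

def mse_loss(pred, target):
    """Mean squared error between prediction and target."""
    return float(np.mean((pred - target) ** 2))

def adam_step(w, grad, m, v, t, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new weight and the updated moment estimates."""
    m = beta1 * m + (1.0 - beta1) * grad       # first moment (mean of gradients)
    v = beta2 * v + (1.0 - beta2) * grad ** 2  # second moment (uncentered variance)
    m_hat = m / (1.0 - beta1 ** t)             # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy problem: fit y = 3x with a single weight.
x = np.linspace(-1.0, 1.0, 32)
y = 3.0 * x
w, m, v = 0.0, 0.0, 0.0
losses = []
for t in range(1, 301):
    pred = w * x
    losses.append(mse_loss(pred, y))
    grad = float(np.mean(2.0 * (pred - y) * x))  # d(MSE)/dw
    w, m, v = adam_step(w, grad, m, v, t)
```

In a real run the gradient comes from backpropagation through the network and `w` is the full set of weights, but the update rule is the same.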

Evaluation metrics
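$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left[ X(i,j) - Y(i,j) \right]^2, \qquad \mathrm{PSNR} = 20 \cdot \log_{10}\left( \frac{\mathrm{MAX}_X}{\sqrt{\mathrm{MSE}}} \right)$$
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$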
PSNR:
Ratio between the maximum power of a signal and the power of the noise
The noise is represented by the MSE
SSIM:
Quantifies the degradation of image quality
Based on the visible structures in the image
Computed over various windows of the image
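The PSNR and SSIM definitions can be sketched in NumPy as follows. For brevity the SSIM here treats the whole image as a single window, whereas the metric is normally averaged over many local windows; the constants `k1 = 0.01` and `k2 = 0.03` are the conventional defaults, not values taken from the slides.

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two images of the same shape."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.mean((x - y) ** 2))

def psnr(x, y, max_val=255.0):
    """PSNR in dB: 20 * log10(MAX / sqrt(MSE)); infinite for identical images."""
    err = mse(x, y)
    if err == 0.0:
        return float("inf")
    return float(20.0 * np.log10(max_val / np.sqrt(err)))

def ssim_global(x, y, max_val=255.0, k1=0.01, k2=0.03):
    """SSIM over the whole image as one window (simplification of the windowed metric)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1, c2 = (k1 * max_val) ** 2, (k2 * max_val) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

reference = np.random.default_rng(0).integers(0, 256, size=(16, 16)).astype(np.float64)
shifted = reference + 1.0  # every pixel off by one, so MSE = 1
score_psnr = psnr(reference, shifted)
score_ssim = ssim_global(reference, reference)
```

Higher is better for both metrics: PSNR is unbounded above, while SSIM reaches 1 only for identical images.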
Qualitative:
Based on a Google survey made of collections of 8 images generated by the chosen
SISR models
Proposed to different users via Facebook (90 submissions collected)
The occurrences of each choice were counted

Experimental Results
Comparison between model outputs
[Figure: Zebra from the BSD100 dataset. a) Ground truth b) Bicubic c) ANR
d) SRCNN e) DRRN f) VDSR]
[Table: Quantitative comparison of the models]
VDSR behavior with faces never seen during training
[Figure: a) Original image b) Output image. Dataset images: SISR on the test set. Real image: not from any dataset]

SuRE! is a multi-threaded Android application for super-resolution on mobile devices. It works with images
acquired from the device camera as well as images from the device storage. It implements SISR using the VDSR and DRRN models.
[Table: Elapsed time (in seconds) of the models on the Xiaomi Redmi Note 4X]
[Screenshots: app home; original image; without SISR; with SISR]
Find me on
Google Play:
SuRE!

Conclusions
SISR can be successfully performed on Android, in terms of both quality and performance
In terms of PSNR, the VDSR model is the best on all the test sets used, while, in terms of SSIM,
the SRCNN model is better on the Set14 dataset
The results of the survey reflect those of the PSNR (except for SRCNN)
DRRN requires a lower computational effort (resulting in a lower execution time on the tested
smartphone)
Future work
Improve the model to achieve better results through an attention-based approach
and by using larger datasets
Explore whether pre- or post-processing (e.g., sharpening) can improve the results
Improve the Android app user experience
Provide real-time super resolution on the live camera stream
