ESPRESOMEDIA Myeonggyu Lee (이명규)
Landscape Summary of Super Resolution Task, presented by Myeonggyu Lee, Solution Business Division
2020/07/06
Landscape Summary of
Super Resolution Task
INDEX
01 Introduction
02 Simple Review
03 Featured Papers
Introduction
Part 01
1. Introduction to the SR Task
2. Taxonomy
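1-1 Introduction to the SR Task
Problem Definition
• The problem of restoring a low-resolution (LR) image or video to high resolution (HR)
• Divided into SISR (Single Image SR) and MISR (Multiple Image SR)
$y_{LR} = (x \otimes k)\downarrow_s + n$
where $x$ is the ground-truth (GT) HR image, $k$ is the blur kernel, $\downarrow_s$ is downsampling by a factor $s$, and $n$ is noise.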
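The degradation model above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `degrade`, the 3x3 box blur kernel, plain decimation as the downsampling step, and Gaussian noise are all choices made here, not details given in the slides.

```python
import numpy as np

def degrade(hr, kernel, scale, noise_sigma=0.0, seed=0):
    """Simulate y_LR = (x conv k) downsampled by s, plus noise n (2-D grayscale, odd-sized kernel)."""
    kh, kw = kernel.shape
    # Pad with edge values so the blurred image keeps the HR size.
    padded = np.pad(hr, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    blurred = np.zeros_like(hr, dtype=np.float64)
    # Blur: accumulate shifted copies weighted by the flipped kernel (true convolution).
    flipped = kernel[::-1, ::-1]
    for i in range(kh):
        for j in range(kw):
            blurred += flipped[i, j] * padded[i:i + hr.shape[0], j:j + hr.shape[1]]
    # Downsample: keep every `scale`-th pixel in both directions (assumed decimation).
    lr = blurred[::scale, ::scale]
    # Additive Gaussian noise n (assumed noise model).
    rng = np.random.default_rng(seed)
    return lr + rng.normal(0.0, noise_sigma, size=lr.shape)

hr = np.arange(64, dtype=np.float64).reshape(8, 8)
box = np.full((3, 3), 1.0 / 9.0)   # hypothetical 3x3 box blur kernel
lr = degrade(hr, box, scale=2)     # 8x8 HR image -> 4x4 LR image
```

An SR model is then trained to invert this mapping, i.e. to recover `hr` from `lr`.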
• SR Algorithms
  • Interpolation-based methods (bicubic, bilinear, nearest neighbor, etc.)
    => These merely "upscale" the image
  • Reconstruction-based methods
  • Deep-learning-based methods
Source: https://bskyvision.com/531
Applications
▲ SR for Satellite Imagery ▲ SR for Medical Imaging
▲ SR for Astronomical Studies ▲ SR for Microscopy Image Processing
Source: Super Resolution Applications in Modern Digital Image Processing (IJCA 2016)
1-2 Taxonomy: Overview
Source: A Deep Journey into Super-resolution: A Survey
SISR
• Linear Networks
  • Early Upsampling Designs: SRCNN, VDSR, DnCNN, IRCNN
  • Late Upsampling Designs: FSRCNN, ESPCN
• Residual Networks
  • Single-Stage Networks: EDSR, CARN
  • Multi-Stage Networks: FormResNet, BTSRN, REDNet
• Recursive Networks: DRCN, DRRN, MemNet
• Progressive Reconstruction Designs: SCN, LapSRN
• Densely Connected Networks: SR-DenseNet, RDN, D-DBPN
• Multi-Branch Designs: CNF, CMSC, IDN
• Attention Based Networks: SelNet, RCAN, SRRAM, DRLN
• Multiple Degradation Handling Networks: ZSSR, SRMD
• GAN Models: SRGAN, EnhanceNet, SRFeat, ESRGAN
Linear Networks
• Divided into Early and Late Upsampling: "How should the image be enlarged?"
① Early Upsampling: the image is first enlarged with an interpolation kernel and then fed to the model
➢ Nearest-neighbor, bilinear, bicubic interpolation ('SRCNN')
② Late Upsampling: the upsampling operation itself is learned inside the model, using a layer such as a transposed convolution
➢ Transposed convolution layer ('FSRCNN')
➢ Sub-pixel layer ('ESPCN')
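As an illustration of late upsampling, the rearrangement performed by a sub-pixel layer can be sketched in NumPy. This is a minimal sketch of the "pixel shuffle" step only; the function name and the tiny input are assumptions made here, and the convolution that produces the `C*r*r` channels is omitted.

```python
import numpy as np

def pixel_shuffle(features, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r)."""
    c_rr, h, w = features.shape
    c = c_rr // (r * r)
    x = features.reshape(c, r, r, h, w)   # split channels into (c, dy, dx)
    x = x.transpose(0, 3, 1, 4, 2)        # reorder to (c, h, dy, w, dx)
    return x.reshape(c, h * r, w * r)     # interleave dy and dx into the spatial axes

lr_features = np.arange(4 * 2 * 2).reshape(4, 2, 2)  # 4 channels = 1 output channel * 2 * 2
hr = pixel_shuffle(lr_features, r=2)                 # shape (1, 4, 4)
```

Each group of `r*r` low-resolution channels supplies the `r x r` block of output pixels at the corresponding location, so all convolutions before this step run at LR size.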
Residual Networks
• Divided into Single-Stage and Multi-Stage designs
• The network learns the residual between the bicubic-upsampled LR image and the HR image
• Global vs. local residual learning: whether a shortcut connection links the network input to its output (global), or links layers at different depths (local)
• Because less information has to be generated, deep networks can still be trained stably
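The idea of learning only the residual can be sketched as follows. This is a toy illustration: nearest-neighbor upsampling stands in for bicubic interpolation, and the function names are assumptions made here, not part of any of the cited models.

```python
import numpy as np

def upsample_nearest(lr, scale):
    """Stand-in for bicubic interpolation: repeat every pixel scale x scale times."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

def residual_target(hr, lr, scale):
    """What a residual network is trained to predict: HR minus the upsampled LR."""
    return hr - upsample_nearest(lr, scale)

def reconstruct(lr, predicted_residual, scale):
    """Global residual learning: output = upsampled input + predicted residual."""
    return upsample_nearest(lr, scale) + predicted_residual

hr = np.array([[1.0, 2.0], [3.0, 5.0]])
lr = np.array([[2.5]])
res = residual_target(hr, lr, scale=2)
sr = reconstruct(lr, res, scale=2)   # a perfect residual prediction recovers hr
```

The residual typically has small values centered around zero, which is why it is easier to learn than the full HR image.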
Attention Based Networks
• "How can important information be emphasized?"
• Compressing all information into a fixed-size vector leads to problems such as vanishing gradients (source: https://wikidocs.net/22893)
• Simply stacking residual blocks leaves the CNN with a relatively small receptive field (RF), so the local and global information carried by the features is treated equally
• Receptive field: its size can be computed backwards from the output size, one layer at a time:
  $Input\_size = (output\_size - 1) \times K\_stride + K\_size$
  • Input_size: size of the input region that the output nodes sense
  • K_stride: step size with which the convolution kernel moves
  • K_size: size of the convolution kernel between input and output
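Applying the formula above layer by layer, from the last layer back to the first, gives the receptive field of a stack of convolutions. A minimal sketch, where the function name and the example layer lists are assumptions made for illustration:

```python
def receptive_field(layers, output_size=1):
    """Walk backwards through (kernel_size, stride) pairs, last layer first."""
    size = output_size
    for k_size, k_stride in reversed(layers):
        # Input_size = (output_size - 1) * K_stride + K_size
        size = (size - 1) * k_stride + k_size
    return size

three_convs = [(3, 1), (3, 1), (3, 1)]      # three 3x3 convolutions, stride 1
rf_plain = receptive_field(three_convs)     # 1 -> 3 -> 5 -> 7
rf_strided = receptive_field([(3, 2), (3, 1)])
```

With stride 1 the receptive field grows by only `K_size - 1` per layer, which illustrates why plainly stacked blocks see a relatively small region.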
• Difference between a receptive field and a filter
  • Filter (= kernel, weights): the matrix that defines the feature to be detected
  • RF: the region of the image that the filter actually covers when it detects a feature while sliding over the image
Source: https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/
Convolutional Block Attention Module (CBAM)
U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
GAN Based Models
• Focus more on texture detail (generating images that look plausible to humans)
• Other approaches concentrate on minimizing MSE and therefore lose high-frequency detail
• Hence a perceptual loss composed of a VGG-based content loss and an adversarial loss is used
Source: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
1-2 Taxonomy: Evaluation
Image Quality Assessment
• Quantitative evaluation
  • PSNR (Peak Signal-to-Noise Ratio)
  • SSIM (Structural SIMilarity index)
  • VMAF (Video Multi-method Assessment Fusion, Netflix)
• Qualitative evaluation
  • MOS (Mean Opinion Score)
Image Quality Assessment - PSNR
Original
JPEG Compression (low): MSE: 18.41, PSNR: 35.47
JPEG Compression (middle): MSE: 9.87, PSNR: 38.18
Reference: "The story of dB (decibels) and the log scale"
$MSE = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[I(i,j) - \hat{I}(i,j)\right]^2$
$PSNR = 10\log_{10}\left(\frac{MAX_I^2}{MSE}\right) = 20\log_{10}\left(\frac{MAX_I}{\sqrt{MSE}}\right) = 20\log_{10}(MAX_I) - 10\log_{10}(MSE)$,
where $MAX_I$ = maximum possible pixel value of the image (e.g. 255 for 8-bit images)
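The two formulas translate directly into code. The sketch below uses plain Python lists; the function names are assumptions made here, and `max_i=255` assumes 8-bit images. Feeding in the MSE values quoted on the slide reproduces the quoted PSNR values to within rounding.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    m, n = len(reference), len(reference[0])
    total = sum((reference[i][j] - test[i][j]) ** 2
                for i in range(m) for j in range(n))
    return total / (m * n)

def psnr_from_mse(mse_value, max_i=255.0):
    """PSNR in dB: 20*log10(MAX_I) - 10*log10(MSE)."""
    return 20 * math.log10(max_i) - 10 * math.log10(mse_value)

psnr_low = psnr_from_mse(18.41)   # slide: JPEG compression (low)
psnr_mid = psnr_from_mse(9.87)    # slide: JPEG compression (middle)
```

Because of the logarithm, halving the MSE raises PSNR by about 3 dB regardless of the starting point.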
Image Quality Assessment - SSIM
➢ skimage.measure.compare_ssim (in newer scikit-image releases: skimage.metrics.structural_similarity)
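To make the metric less of a black box, here is a simplified sketch of the SSIM formula evaluated once over the whole image. This is an assumption-laden simplification: library implementations compute the same expression over local sliding windows and average the result, so the numbers will differ. The function name is hypothetical; the constants use the commonly cited defaults K1 = 0.01 and K2 = 0.03.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """SSIM computed once over the whole image (no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2   # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

img = np.arange(16, dtype=np.float64).reshape(4, 4) * 16
same = global_ssim(img, img)            # identical images
inverted = global_ssim(img, 240 - img)  # same mean, inverted structure
```

Identical images score 1, while an image with the same brightness but inverted structure scores below zero, which PSNR alone would not distinguish from other errors of equal magnitude.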
Taxonomy: Architectures
Source: Deep Learning for Image Super-resolution: A Survey
Taxonomy: Performances
Taxonomy: Datasets
Dataset categories: Single Image Dataset, Video Dataset
Dataset                 Usage                Link    Notes
Vimeo-90k               Train / Validation   Link    90k HQ videos
Vid4                    Test                 Link    -
Xiph HD                 -                    Link    Old videos
Ultra-Video Group HD    -                    Link    -
Simple Review
Part 02
1. SISR(Single Image Super Resolution) Summary
2. VSR(Video Super Resolution) Summary: TBA
SISR (Single Image Super Resolution) Summary
Part 2-1
1. SRCNN (ECCV 2014)
2. FSRCNN (ECCV 2016)
3. VDSR (CVPR 2016)
4. SRResNet (CVPR 2017)
5. SRGAN (CVPR 2017)
Paper History
[Figure: paper history with the featured papers highlighted, plus SRResNet (CVPR'17)]
2-1-1 SRCNN (ECCV 2014)
• The first paper to apply deep learning to SISR
→ 3-layer CNN, MSE loss
• Far better performance than traditional interpolation methods
Source: Learning a Deep Convolutional Network for Image Super-Resolution (ECCV 2014)
2-1-2 FSRCNN (ECCV 2016)
• No preprocessing of the input; upsampling is performed by a transposed convolution layer
Source: Accelerating the Super-Resolution Convolutional Neural Network (ECCV 2016)
2-1-3 VDSR (CVPR 2016)
• Addresses the weakness of SRCNN, which was too shallow, with stable and fast training
→ VGGNet-based deep residual learning + MSE
"Adjustable gradient clipping for maximal boost in speed while suppressing exploding gradients"
• Instead of learning a plain LR→HR mapping, it learns the residual between the bicubic-upsampled LR image and the HR image
=> Later led to the proposals of DRCN (Deeply-Recursive CNN), SRResNet, and DRRN
Source: Accurate Image Super-Resolution Using Very Deep Convolutional Networks (CVPR 2016)
2-1-4 SRResNet (CVPR 2017)
• Applies the ResNet architecture directly to the SR task, allowing deep yet stable training (proposed together with SRGAN, described next)
• Uses Batch Normalization
• SRDenseNet (ICCV'17) and Residual Dense Network (CVPR'18) follow a similar idea
• Applies the sub-pixel layer of ESPCN to ResNet
=> Later led to the proposal of EDSR
Source: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (CVPR 2017)
2-1-5 SRGAN (CVPR 2017)
• An attempt to use a GAN to generate images that look plausible to humans
• Because an MSE-based content loss produces blurry images, a perceptual loss is proposed to improve texture sharpness
→ Perceptual loss = VGG content loss + GAN (adversarial) loss
  The MSE loss is replaced by the VGG loss used in style transfer
$l^{SR} = l^{SR}_{VGG} + l^{SR}_{Gen}$
[Figure: GAN diagram. Z-Vector → G → Fake; Real and Fake → D → "Fake" / "Real"]
Source: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (CVPR 2017)
• Difference between MSE-based and VGG-based content loss
$MSE\ Loss = \frac{1}{n}\sum_{i=1}^{n}(y_i - t_i)^2$
$pixel\text{-}wise\ MSE\ Loss = \frac{1}{r^2WH}\sum_{x=1}^{rW}\sum_{y=1}^{rH}\left(I^{HR}_{x,y} - G_{\theta_G}(I^{LR})_{x,y}\right)^2$
where $\theta_G = \{W_{1:L}; b_{1:L}\}$, $r$ = downsampling factor, $L$ = number of layers
$VGG\ Loss = \frac{1}{W_{i,j}H_{i,j}}\sum_{x=1}^{W_{i,j}}\sum_{y=1}^{H_{i,j}}\left(\phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y}\right)^2$
where $\phi_{i,j}$ = feature map obtained by the $j$-th convolution (before the $i$-th max-pooling layer),
$W_{i,j}$, $H_{i,j}$ = dimensions of the respective feature maps within the VGG network.
"The loss is calculated at the feature level"
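The contrast between the two losses can be illustrated with a toy example. In this sketch a hypothetical gradient-based extractor `toy_features` stands in for the VGG feature map; it is not VGG and is used only so the example runs without pretrained weights.

```python
import numpy as np

def pixel_mse(hr, sr):
    """Pixel-wise MSE: mean squared difference over all pixels."""
    return float(np.mean((hr - sr) ** 2))

def toy_features(img):
    """Hypothetical stand-in for phi_{i,j}: horizontal and vertical gradients."""
    return np.stack([np.diff(img, axis=1)[:-1, :], np.diff(img, axis=0)[:, :-1]])

def feature_loss(hr, sr, phi=toy_features):
    """Same squared-error form, but measured between feature maps phi(.)."""
    return float(np.mean((phi(hr) - phi(sr)) ** 2))

hr = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
shifted = hr + 0.5                      # constant brightness offset, texture preserved
blurred = np.full_like(hr, hr.mean())   # flat image, texture destroyed
```

On this checkerboard, `pixel_mse` rates the flat `blurred` image slightly better than the `shifted` one, while `feature_loss` is zero for `shifted` and large for `blurred`. That is the sense in which a feature-level loss favors texture over pixel-exact averages.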
Featured Papers
Part 03
1. Paper 1 (“EDSR”)
(Enhanced Deep Residual Networks for
Single Image Super-Resolution)
2. Paper 2 (“SAN”, SOTA)
(Second-order Attention Network for
Single Image Super-Resolution)
Paper 1:
Enhanced Deep Residual Networks for
Single Image Super-Resolution(“EDSR”)
Part 3-1
1. Introduction
2. Architecture Overview
3. Experiment & Conclusion
3-1-1 Introduction
Limitations of Previous Works
• Using ResNet as-is (SRResNet) performs well on SISR, but has the following issues
  • Networks such as ResNet were designed for high-level tasks like classification
  → SR is a low-level task
  • The BN used in ResNet reduces the flexibility of the network
  → Training takes a long time
▲ Batch Normalization
Source: Enhanced Deep Residual Networks for Single Image Super-Resolution (CVPR 2017)
Contributions
• 40% lower memory usage during training
→ Proposes a new residual block with the BN layers removed
→ Makes it possible to train deeper networks
• Proposes a single-scale model (EDSR) and a multi-scale model (MDSR)
→ Each scale (x2, x3, x4) is trained separately (EDSR), or several scales are trained at once (MDSR)
3-1-2 Architecture Overview
Model Overview
▲ EDSR ▲ MDSR
• Proposes a new residual block structure without BN
• The final feature map of each residual block is multiplied by a constant of 0.1
→ Promotes stable training
• For x3 and x4 scaling, training uses transfer learning from the x2 model
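A BN-free residual block with constant scaling can be sketched as follows. This is a simplified illustration, not the paper's implementation: the hypothetical `conv1x1` helper (channel mixing only) stands in for the real convolution layers, and the random weights exist only to make the example runnable.

```python
import numpy as np

def conv1x1(x, weight):
    """Hypothetical stand-in for a conv layer: a 1x1 convolution (channel mixing)."""
    # x: (C_in, H, W), weight: (C_out, C_in)
    return np.einsum("oc,chw->ohw", weight, x)

def residual_block(x, w1, w2, res_scale=0.1):
    """Conv -> ReLU -> Conv with no batch normalization, scaled, then added to the input."""
    out = conv1x1(x, w1)
    out = np.maximum(out, 0.0)      # ReLU
    out = conv1x1(out, w2)
    return x + res_scale * out      # constant scaling before the skip addition

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))
w1 = rng.normal(size=(4, 4))
w2 = rng.normal(size=(4, 4))
y = residual_block(x, w1, w2)
```

Scaling the residual branch keeps each block close to an identity mapping, which is one way to read the "stable training" remark above.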
3-1-3 Experiment & Conclusion
Evaluation results
Conclusion & Limitations
• Conclusion
  • Proposes a method that cuts training memory by up to 40% compared with previous methods
  • Proposes SR models that work at a single scale (EDSR) and at multiple scales (MDSR)
• Limitations
  • The receptive field of the CNN is relatively small (i.e., it focuses only on local patches)
  → Wider regions of the image cannot be taken into account
  • The local and global information carried by the features is treated equally
  → Later led to proposals such as dilated convolution and spatial or channel-wise attention
Paper 2:
Second-order Attention Network for
Single Image Super-Resolution(“SAN”)
Part 3-2
1. Introduction
2. Architecture Overview
3. Experiment & Conclusion
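3-2-1 Introduction
Limitations of Previous Works
• Existing models focus only on designing deeper or wider architectures
→ They therefore do not explore the relationships between layers, which limits the representational power of the whole network
• Most do not use all the information in the LR image, which has led to lower performance
• Training is relatively slow compared with this paper
Source: Second-order Attention Network for Single Image Super-Resolution (CVPR 2019)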
Contributions
• Learns feature interdependencies between layers using second-order statistics
• Proposes the LSRAG (local-source residual attention group) structure to make active use of LR image information
→ Rich low-frequency information
3-2-2 Architecture Overview
Model Overview
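Detailed view of Model: RL-NL Module (Region-Level Non-Local)
• Purpose: split the image into smaller regions so that the whole image can still be covered
→ The image is divided into four regions and a non-local module is applied to each (also advantageous at large resolutions)
• Applied before and after the SSRG module; gathers wide-range information at a high level
→ A global-level non-local operation becomes computationally expensive when the input is large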
Detailed view of Model: LSRAG Module (Local-Source Residual Attention Group)
• Purpose: to preserve feature interdependencies well
• Composed of simplified residual blocks
Detailed view of Model: SOCA Module (Second-Order Channel Attention)
• Purpose: improve the discriminative representation ability of the model through covariance normalization
→ Attention makes the network put more weight on features that carry more important information
• Existing attention algorithms use only first-order statistics obtained through GAP
→ Because they use nothing beyond first-order statistics (= the average), the discriminative representation ability of the model is limited
→ Channel attention is therefore performed after covariance normalization
• GAP (Global Average Pooling): reduces dimensionality by simply taking the average over the nodes of each feature map
e.g.) $\frac{1+9+6+4+5+4+7+8+5+1+2+9+6+7+6+0}{16} = 5$
→ Replaced with GCP (Global Covariance Pooling)
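The GAP example above can be checked with a few lines of plain Python. The 4x4 layout of the sixteen values is an assumption made here (the slide only lists the numbers), and the function name is hypothetical.

```python
values = [1, 9, 6, 4, 5, 4, 7, 8, 5, 1, 2, 9, 6, 7, 6, 0]
feature_map = [values[i:i + 4] for i in range(0, 16, 4)]  # assumed 4x4 layout

def global_average_pool(fmap):
    """Collapse one feature map to a single number: the mean of all its nodes."""
    flat = [v for row in fmap for v in row]
    return sum(flat) / len(flat)

pooled = global_average_pool(feature_map)
```

Any rearrangement of the same sixteen values gives the same result, which shows how much spatial and statistical detail a first-order statistic discards.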
• Covariance Normalization:
1. Reshape the $H \times W \times C$ feature map $F = [f_1, \ldots, f_C]$ into a feature matrix $X$.
   ($X$ has $s = WH$ features of $C$ dimensions)
2. Compute the sample covariance matrix $\Sigma$:
   $\Sigma = X \bar{I} X^T$, where $\bar{I} = \frac{1}{s}\left(I - \frac{1}{s}\mathbf{1}\right)$ ($I$ = $s \times s$ identity matrix, $\mathbf{1}$ = $s \times s$ matrix of all ones)
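Steps 1 and 2 can be sketched in NumPy as below. The function name and the random test tensor are assumptions made for illustration; the matrices follow the definitions given above.

```python
import numpy as np

def covariance_matrix(feature_map):
    """Sigma = X I_bar X^T for an H x W x C feature map."""
    h, w, c = feature_map.shape
    s = h * w
    x = feature_map.reshape(s, c).T                  # X: C x s (s features of C dims)
    i_bar = (np.eye(s) - np.ones((s, s)) / s) / s    # (1/s) * (I - (1/s) * 1)
    return x @ i_bar @ x.T                           # C x C covariance

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 5, 3))
sigma = covariance_matrix(f)
```

Multiplying by `i_bar` subtracts the per-channel mean and divides by `s`, so the result matches the biased sample covariance, e.g. `np.cov(x, bias=True)`.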
3. $\Sigma$ is symmetric positive semi-definite, so it has the following eigenvalue decomposition (EIG):
   $\Sigma = U \Lambda U^T$, where $U$ = orthogonal matrix, $\Lambda$ = diagonal matrix of eigenvalues.
4. Covariance normalization can therefore be converted to a power of the eigenvalues:
   $\hat{Y} = \Sigma^{\alpha} = U \Lambda^{\alpha} U^T$
- $\alpha$ is a positive real number; when $\alpha = 1$, no normalization is performed ($\alpha = 1/2$ was found to work well).
- When $\alpha < 1$, eigenvalues larger than 1.0 are shrunk non-linearly and those smaller than 1.0 are stretched.
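Steps 3 and 4 amount to a matrix power computed through the eigendecomposition. A minimal sketch using `np.linalg.eigh`; the function name and the 2x2 example are assumptions made here, and this direct decomposition is only one way to compute the power.

```python
import numpy as np

def covariance_normalize(sigma, alpha=0.5):
    """Y_hat = U diag(eigenvalues ** alpha) U^T for a symmetric PSD matrix."""
    eigvals, u = np.linalg.eigh(sigma)
    eigvals = np.clip(eigvals, 0.0, None)   # guard tiny negative values from round-off
    return (u * eigvals ** alpha) @ u.T     # scales column j of U by eigvals[j] ** alpha

sigma = np.array([[4.0, 0.0], [0.0, 0.25]])
y_hat = covariance_normalize(sigma)         # alpha = 1/2: the matrix square root
```

With $\alpha = 1/2$ the eigenvalue 4 shrinks to 2 and the eigenvalue 0.25 grows to 0.5, matching the note above about values on either side of 1.0.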
• The normalized covariance matrix $\hat{Y}$ characterizes the correlations between channel-wise features
• Pooling is performed at the channel level using the covariance-normalized $\hat{Y}$:
  let $\hat{Y} = [y_1, \ldots, y_C]$; the channel-wise statistic is $z_c = H_{GCP}(y_c) = \frac{1}{C}\sum_{i=1}^{C} y_c(i)$
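The channel-wise pooling step reduces each column of the normalized covariance matrix to one scalar. A small sketch, where the function name and the 3x3 example matrix are made up for illustration:

```python
import numpy as np

def global_covariance_pool(y_hat):
    """z_c = (1/C) * sum_i y_c(i), where y_c is the c-th column of Y_hat."""
    return y_hat.mean(axis=0)

y_hat = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 4.0]])
z = global_covariance_pool(y_hat)   # one statistic per channel
```

Unlike GAP, each entry of `z` depends on how a channel co-varies with every other channel, not only on its own mean.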
3-2-3 Experiment & Conclusion
Evaluation results
Conclusion & Limitations
• Proposes several modules that raise the PSNR performance of SISR
  • The SSRG module makes full use of low-frequency information
  • The RL-NL module exploits long-distance spatial contextual information
  • The SOCA module performs global covariance pooling and learns the dependencies between layers
• Second-order channel attention focuses learning on discriminative representations
• A low parameter count relative to the size of the network
Thank you for Listening.
Email : brstar96@espresomedia.com
Mobile : +82-10-8234-3179

(Paper Review)3D shape reconstruction from sketches via multi view convolutio...MYEONGGYU LEE
 
(Paper Review)Image to image translation with conditional adversarial network...
(Paper Review)Image to image translation with conditional adversarial network...(Paper Review)Image to image translation with conditional adversarial network...
(Paper Review)Image to image translation with conditional adversarial network...MYEONGGYU LEE
 
(Book summary) Ensemble method 2018summerml_study
(Book summary) Ensemble method 2018summerml_study(Book summary) Ensemble method 2018summerml_study
(Book summary) Ensemble method 2018summerml_studyMYEONGGYU LEE
 
(Paper Review)Towards foveated rendering for gaze tracked virtual reality
(Paper Review)Towards foveated rendering for gaze tracked virtual reality(Paper Review)Towards foveated rendering for gaze tracked virtual reality
(Paper Review)Towards foveated rendering for gaze tracked virtual realityMYEONGGYU LEE
 
(Paper Review)Geometrically correct projection-based texture mapping onto a d...
(Paper Review)Geometrically correct projection-based texture mapping onto a d...(Paper Review)Geometrically correct projection-based texture mapping onto a d...
(Paper Review)Geometrically correct projection-based texture mapping onto a d...MYEONGGYU LEE
 
(Papers Review)CNN for sentence classification
(Papers Review)CNN for sentence classification(Papers Review)CNN for sentence classification
(Papers Review)CNN for sentence classificationMYEONGGYU LEE
 
(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...
(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...
(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...MYEONGGYU LEE
 

More from MYEONGGYU LEE (17)

(Paper Review) Abnormal Event Detection in Videos using Generative Adversaria...
(Paper Review) Abnormal Event Detection in Videos using Generative Adversaria...(Paper Review) Abnormal Event Detection in Videos using Generative Adversaria...
(Paper Review) Abnormal Event Detection in Videos using Generative Adversaria...
 
Survey of HDR & Tone Mapping Task
Survey of HDR & Tone Mapping TaskSurvey of HDR & Tone Mapping Task
Survey of HDR & Tone Mapping Task
 
Simple Review of Single Image Super Resolution Task
Simple Review of Single Image Super Resolution TaskSimple Review of Single Image Super Resolution Task
Simple Review of Single Image Super Resolution Task
 
ICCV 2019 Review
ICCV 2019 ReviewICCV 2019 Review
ICCV 2019 Review
 
(Paper Review)Few-Shot Adversarial Learning of Realistic Neural Talking Head ...
(Paper Review)Few-Shot Adversarial Learning of Realistic Neural Talking Head ...(Paper Review)Few-Shot Adversarial Learning of Realistic Neural Talking Head ...
(Paper Review)Few-Shot Adversarial Learning of Realistic Neural Talking Head ...
 
(Paper Review)U-GAT-IT: unsupervised generative attentional networks with ada...
(Paper Review)U-GAT-IT: unsupervised generative attentional networks with ada...(Paper Review)U-GAT-IT: unsupervised generative attentional networks with ada...
(Paper Review)U-GAT-IT: unsupervised generative attentional networks with ada...
 
(Book Summary) Classification and ensemble(book review)
(Book Summary) Classification and ensemble(book review)(Book Summary) Classification and ensemble(book review)
(Book Summary) Classification and ensemble(book review)
 
(Paper Review) Reconstruction of Monte Carlo Image Sequences using a Recurren...
(Paper Review) Reconstruction of Monte Carlo Image Sequences using a Recurren...(Paper Review) Reconstruction of Monte Carlo Image Sequences using a Recurren...
(Paper Review) Reconstruction of Monte Carlo Image Sequences using a Recurren...
 
(Paper Review)A versatile learning based 3D temporal tracker - scalable, robu...
(Paper Review)A versatile learning based 3D temporal tracker - scalable, robu...(Paper Review)A versatile learning based 3D temporal tracker - scalable, robu...
(Paper Review)A versatile learning based 3D temporal tracker - scalable, robu...
 
(Paper Review)Neural 3D mesh renderer
(Paper Review)Neural 3D mesh renderer(Paper Review)Neural 3D mesh renderer
(Paper Review)Neural 3D mesh renderer
 
(Paper Review)3D shape reconstruction from sketches via multi view convolutio...
(Paper Review)3D shape reconstruction from sketches via multi view convolutio...(Paper Review)3D shape reconstruction from sketches via multi view convolutio...
(Paper Review)3D shape reconstruction from sketches via multi view convolutio...
 
(Paper Review)Image to image translation with conditional adversarial network...
(Paper Review)Image to image translation with conditional adversarial network...(Paper Review)Image to image translation with conditional adversarial network...
(Paper Review)Image to image translation with conditional adversarial network...
 
(Book summary) Ensemble method 2018summerml_study
(Book summary) Ensemble method 2018summerml_study(Book summary) Ensemble method 2018summerml_study
(Book summary) Ensemble method 2018summerml_study
 
(Paper Review)Towards foveated rendering for gaze tracked virtual reality
(Paper Review)Towards foveated rendering for gaze tracked virtual reality(Paper Review)Towards foveated rendering for gaze tracked virtual reality
(Paper Review)Towards foveated rendering for gaze tracked virtual reality
 
(Paper Review)Geometrically correct projection-based texture mapping onto a d...
(Paper Review)Geometrically correct projection-based texture mapping onto a d...(Paper Review)Geometrically correct projection-based texture mapping onto a d...
(Paper Review)Geometrically correct projection-based texture mapping onto a d...
 
(Papers Review)CNN for sentence classification
(Papers Review)CNN for sentence classification(Papers Review)CNN for sentence classification
(Papers Review)CNN for sentence classification
 
(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...
(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...
(Paper Review)Kernel predicting-convolutional-networks-for-denoising-monte-ca...
 

Recently uploaded

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Survey of Super Resolution Task (SISR Only)

  • 1. ESPRESOMEDIA 이명규 Landscape Summary of Super Resolution Task (1/52). Presented by 이명규 (Myeonggyu Lee), Solutions Business Division (솔루션사업부), 2020/07/06.
  • 2. Index: 01 Introduction, 02 Simple Review, 03 Featured Papers
  • 3. Part 01: Introduction. 1. Introduction to the SR Task 2. Taxonomy
  • 4. 1-1 Introduction to the SR Task: Problem Definition
      • SR is the problem of restoring an LR (low-resolution) image or video to HR (high-resolution)
      • It is divided into SISR (Single Image SR) and MISR (Multiple Image SR)
      • Degradation model: $y_{LR} = (x \otimes k)\downarrow_s + n$, where $x$ is the GT (ground truth) HR image, $k$ is the blur kernel, $\downarrow_s$ is downsampling by a factor $s$, and $n$ is noise
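The degradation model on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `degrade`, the 3x3 box blur, the replicate padding at the border, and the Gaussian noise are all choices made here, not details given in the deck.

```python
import random

def degrade(hr, kernel, scale, noise_std=0.0, seed=0):
    """Simulate y_LR = (x blurred by k), downsampled by s, plus noise n."""
    rng = random.Random(seed)
    h, w = len(hr), len(hr[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2

    def pixel(i, j):
        # Replicate padding (assumption): clamp coordinates to the border.
        return hr[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    # Blur: sliding weighted sum with the kernel. For a symmetric kernel
    # this correlation is identical to a convolution.
    blurred = [[sum(kernel[a][b] * pixel(i + a - ph, j + b - pw)
                    for a in range(kh) for b in range(kw))
                for j in range(w)] for i in range(h)]
    # Downsample: keep every `scale`-th pixel in both directions.
    lr = [row[::scale] for row in blurred[::scale]]
    # Noise: additive zero-mean Gaussian (assumed noise model).
    return [[v + rng.gauss(0.0, noise_std) for v in row] for row in lr]

box = [[1 / 9] * 3 for _ in range(3)]  # 3x3 box blur (assumed kernel)
hr = [[float(10 * i + j) for j in range(4)] for i in range(4)]
lr = degrade(hr, box, scale=2)         # 4x4 HR image -> 2x2 LR image
```

SR then amounts to inverting this pipeline: estimating `hr` from `lr` when the kernel and the noise are usually unknown.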
  • 5. 1-1 Introduction to the SR Task: Problem Definition (SR algorithms)
      • Interpolation-based methods (bicubic, bilinear, nearest neighbor, etc.) => these just "upscale" the image
      • Reconstruction-based methods
      • Deep-learning-based methods
      • Source: https://bskyvision.com/531
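To make the "just upscaling" remark concrete, here is a minimal sketch of nearest-neighbor interpolation, the simplest of the interpolation-based methods listed. The helper name `upscale_nearest` is hypothetical.

```python
def upscale_nearest(img, scale):
    """Nearest-neighbor upscaling of a 2-D list by an integer factor."""
    return [[img[i // scale][j // scale]
             for j in range(len(img[0]) * scale)]
            for i in range(len(img) * scale)]

lr = [[1, 2],
      [3, 4]]
hr = upscale_nearest(lr, 2)
# hr == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Every output pixel is a copy of an input pixel, so the image gets larger but no new detail is created. Bilinear and bicubic interpolation blend neighboring pixels instead of copying them, with the same limitation.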
  • 6. 1-1 Introduction to the SR Task: Applications. Figure panels: SR for Satellite Image, SR for Medical Imaging, SR for Astronomical Studies, SR for Microscopy Image Processing. Source: Super Resolution Applications in Modern Digital Image Processing (IJCA 2016)
  • 7. 1-2 Taxonomy: Overview (A Deep Journey into Super-resolution: A Survey). SISR model families:
      • Linear Networks
          → Early Upsampling Designs: SRCNN, VDSR, DnCNN, IrCNN
          → Late Upsampling Designs: FSRCNN, ESPCN
      • Residual Networks
          → Single-Stage Networks: EDSR, CARN
          → Multi-Stage Networks: FormResNet, BTSRN, REDNet
      • Recursive Networks: DRCN, DRRN, MemNet
      • Progressive Reconstruction Designs: SCN, LapSRN
      • Densely Connected Networks: SR-DenseNet, RDN, D-DBPN
      • Multi-Branch Designs: CNF, CMSC, IDN
      • Attention Based Networks: SelNet, RCAN, SRRAM, DRLN
      • Multiple Degradation Handling Networks: ZSSR, SRMD
      • GAN Models: SRGAN, EnhanceNet, SRFeat, ESRGAN
  • 8. 1-2 Taxonomy: Overview. Linear Networks (A Deep Journey into Super-resolution: A Survey)
      • Divided into early and late upsampling: "How should the image be enlarged?"
      • ① Early upsampling: the image is preprocessed with an interpolation-based kernel and then fed to the model
          ➢ Nearest-neighbor, bilinear, bicubic interpolation ('SRCNN')
      • ② Late upsampling: the upsampling operation itself is modeled inside the network, for example with a transposed convolution
          ➢ Transposed convolution layer ('FSRCNN')
          ➢ Sub-pixel layer ('ESPCN')
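The sub-pixel layer mentioned for late upsampling rearranges channels into space: a tensor with C·r² channels of size H×W becomes C channels of size rH×rW. The sketch below shows only that rearrangement step on nested lists. The function name and the channel ordering are assumptions (the ordering follows the convention commonly used by deep learning frameworks); in a real network the input channels come from a learned convolution.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r]."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    assert c_in % (r * r) == 0, "channel count must be divisible by r^2"
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(h * r):
            for j in range(w * r):
                # The sub-pixel offset (i % r, j % r) selects the source channel.
                src = c * r * r + (i % r) * r + (j % r)
                out[c][i][j] = x[src][i // r][j // r]
    return out

# Four 1x2 channels become one 2x4 image for r = 2.
x = [[[0, 1]], [[10, 11]], [[20, 21]], [[30, 31]]]
y = pixel_shuffle(x, 2)
# y == [[[0, 10, 1, 11], [20, 30, 21, 31]]]
```

Because all convolutions run at LR resolution and only this final rearrangement produces HR pixels, late upsampling is cheaper than feeding a pre-interpolated image through the whole network.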
  • 9. 1-2 Taxonomy: Overview. Residual Networks (A Deep Journey into Super-resolution: A Survey)
      • Divided into single-stage and multi-stage networks
      • The network learns the residual between the bicubic-upsampled LR image and the HR image
      • Global vs. local residual learning: whether a shortcut connection links the network input to its output (global) or links layers at different depths (local)
      • Because less information has to be generated, training can be deep yet stable
  • 10. 1-2 Taxonomy: Overview. Attention Based Networks (https://wikidocs.net/22893)
      • "How can important information be emphasized?"
      • Compressing all information into a fixed-size vector leads to the vanishing gradient problem
      • When ResBlocks are simply stacked, the RF (receptive field) of the CNN is relatively small, so the local and global information carried by the features is treated equally
      • Receptive field: its size can be derived backward from the output size
          → $Input\_size$: size of the input region sensed by an output node
          → $K\_stride$: moving step size of the convolution kernel
          → $K\_size$: size of the convolution kernel between input and output
          → $Input\_size = (output\_size - 1) \times K\_stride + K\_size$
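Applying the receptive-field formula layer by layer, from the output back to the input, gives the receptive field of a whole stack of convolutions. A minimal sketch; the function name and the example layer configurations are illustrative assumptions.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride), ordered from first to last layer."""
    size = 1  # start from a single output node
    for k_size, k_stride in reversed(layers):
        # Input_size = (output_size - 1) * K_stride + K_size
        size = (size - 1) * k_stride + k_size
    return size

rf_three_3x3 = receptive_field([(3, 1)] * 3)              # 1 -> 3 -> 5 -> 7
rf_9_1_5 = receptive_field([(9, 1), (1, 1), (5, 1)])      # 1 -> 5 -> 5 -> 13
rf_strided = receptive_field([(3, 2), (3, 2)])            # 1 -> 3 -> 7
```

With stride 1, each extra 3x3 layer adds only two pixels to the receptive field, which is why plain stacking yields a relatively small RF.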
  • 11. 1-2 Taxonomy: Overview. Attention Based Networks (https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/)
      • Difference between the receptive field and the filter
      • Filter (= kernel, weights): the matrix that defines the feature to be detected
      • RF: the actual feature detected as the filter sweeps over the image
  • 12. 1-2 Taxonomy: Overview. Attention Based Networks: Convolutional Block Attention Module (CBAM)
  • 13. 1-2 Taxonomy: Overview. Attention Based Networks: U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
  • 14. 1-2 Taxonomy: Overview. GAN Based Models (Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network)
      • Focus more on texture detail (generate images that look plausible to a human)
      • Other work concentrates on minimizing MSE and therefore loses high-frequency details
      • Hence a perceptual loss is used, composed of a VGG-based content loss and an adversarial loss
  • 15. 1-2 Taxonomy: Evaluation. Image Quality Assessment
      • Quantitative evaluation
          → PSNR (Peak Signal-to-Noise Ratio)
          → SSIM (Structural SIMilarity): structural similarity index
          → VMAF (Video Multi-method Assessment Fusion, Netflix)
      • Qualitative evaluation
          → MOS (Mean Opinion Score)
  • 16. 1-2 Taxonomy: Evaluation. Image Quality Assessment - PSNR
      • Figure panels: Original; JPEG Compression (low): MSE 18.41, PSNR 35.47; JPEG Compression (middle): MSE 9.87, PSNR 38.18
      • $MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} [I(i,j) - \hat{I}(i,j)]^2$
      • $PSNR = 10 \log_{10}\left(\frac{MAX_I^2}{MSE}\right) = 20 \log_{10}\left(\frac{MAX_I}{\sqrt{MSE}}\right) = 20 \log_{10}(MAX_I) - 10 \log_{10}(MSE)$, where $MAX_I$ is the maximum possible pixel value of the image
      • Reference: "db '데시벨'과 '로그 스케일' 이야기" (The story of dB 'decibels' and the 'log scale')
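The MSE and PSNR definitions translate directly into code. The sketch below assumes 8-bit images, so `max_i = 255` is an assumption, and the function names are chosen here for illustration.

```python
import math

def mse(ref, test):
    """Mean squared error between two equally sized 2-D lists."""
    m, n = len(ref), len(ref[0])
    return sum((ref[i][j] - test[i][j]) ** 2
               for i in range(m) for j in range(n)) / (m * n)

def psnr(ref, test, max_i=255.0):
    """PSNR = 10 * log10(MAX_I^2 / MSE), in dB."""
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_i ** 2 / err)

def psnr_from_mse(err, max_i=255.0):
    """Equivalent form: 20 * log10(MAX_I) - 10 * log10(MSE)."""
    return 20.0 * math.log10(max_i) - 10.0 * math.log10(err)

low = psnr_from_mse(18.41)
middle = psnr_from_mse(9.87)
```

With `max_i = 255`, `low` and `middle` come out at roughly 35.48 dB and 38.19 dB, in line with the 35.47 and 38.18 shown on the slide. The small gap is presumably because the slide's PSNR values were computed from unrounded MSE.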
  • 17. 1-2 Taxonomy: Evaluation. Image Quality Assessment - SSIM
  • 18. 1-2 Taxonomy: Evaluation. Image Quality Assessment - SSIM
      ➢ skimage.measure.compare_ssim (in recent scikit-image releases this function is available as skimage.metrics.structural_similarity)
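For intuition about what the library call measures, here is a simplified SSIM computed once over global image statistics. This is a sketch under assumptions, not the library's algorithm: library implementations evaluate the index over local sliding windows and average the result, while this version uses a single global window, population variances, and the commonly used constants $C_1 = (0.01 L)^2$ and $C_2 = (0.03 L)^2$ with $L = 255$.

```python
def global_ssim(x, y, max_val=255.0):
    """SSIM from global statistics of two equally sized 2-D lists."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / n
    var_y = sum((v - mu_y) ** 2 for v in ys) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    # Luminance/contrast/structure terms combined in the usual closed form.
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

img = [[10.0, 200.0], [60.0, 120.0]]
flat = [[97.5, 97.5], [97.5, 97.5]]   # same mean, all structure removed
same = global_ssim(img, img)
degraded = global_ssim(img, flat)
```

Identical images score 1, and the flattened image scores far lower even though its mean brightness is unchanged, which is the structural sensitivity that PSNR lacks.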
  • 19. 1-2 Taxonomy: Architectures (Deep Learning for Image Super-resolution: A Survey)
  • 20. 1-2 Taxonomy: Performances (Deep Learning for Image Super-resolution: A Survey)
  • 21. 1-2 Taxonomy: Datasets (Deep Learning for Image Super-resolution: A Survey). Tables: Single Image Dataset; Video Dataset:
      Dataset              | Usage              | Link | Notes
      Vimeo-90k            | Train / Validation | Link | 90k HQ videos
      Vid4                 | Test               | Link | -
      Xiph HD              | -                  | Link | Old videos
      Ultra-Video Group HD | -                  | Link | -
  • 22. Part 02: Simple Review. 1. SISR (Single Image Super Resolution) Summary 2. VSR (Video Super Resolution) Summary: TBA
  • 23. Part 2-1: SISR Summary (Single Image Super Resolution). 1. SRCNN (ECCV 2014) 2. FSRCNN (ECCV 2016) 3. VDSR (CVPR 2016) 4. SRResNet (CVPR 2017) 5. SRGAN (CVPR 2017)
  • 24. Paper History (timeline figure; highlighted entries: Featured, Featured, +SRResNet (CVPR'17))
  • 25. 2-1-1 SRCNN (ECCV 2014). Image Super-Resolution Using Deep Convolutional Networks (ECCV 2014)
      • The first paper to apply DL to SISR → 3-layer CNN, MSE loss
      • Far better performance than traditional interpolation methods
  • 26. 2-1-2 FSRCNN (ECCV 2016). Accelerating the Super-Resolution Convolutional Neural Network (ECCV 2016)
      • No preprocessing; upsampling is done by a transposed convolution layer
  • 27. 2-1-3 VDSR (CVPR 2016). Accurate Image Super-Resolution Using Very Deep Convolutional Networks (CVPR 2016)
      • Addresses SRCNN's weakness of being too shallow; training is stable yet fast → VGGNet-based deep residual learning + MSE. "Adjustable gradient clipping for maximal boost in speed while suppressing exploding gradients"
      • Rather than learning a plain LR→HR mapping, it learns the residual between the bicubic-upsampled LR image and the HR image
      => Later led to the proposals of DRCN (Deeply-recursive CNN), SRResNet, and DRRN
  • 28. 2-1-4 SRResNet (CVPR 2017). Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (CVPR 2017)
      • Applies the ResNet structure directly to the SR task for deep yet stable training (proposed together with SRGAN, described later)
      • Uses batch normalization
      • SRDenseNet (ICCV'17) and Residual DenseNet (CVPR'18) follow a similar idea
      • Applies ESPCN's sub-pixel layer to ResNet
      => Later led to the proposal of EDSR
  • 29. 2-1-5 SRGAN (CVPR 2017). Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (CVPR 2017)
      • An attempt to use a GAN to generate images that look plausible to a human
      • An MSE-based content loss produces blurry images, so a perceptual loss is proposed to improve the sharpness of image textures
          → Perceptual loss = VGG content loss + GAN (adversarial) loss: $l^{SR} = l^{SR}_{VGG} + l^{SR}_{Gen}$
          → The MSE loss is replaced by the VGG loss used in style transfer
      • GAN diagram: Z-Vector → G → Fake; Fake and Real → D → "Fake" / "Real"
  • 30. 2-1-5 SRGAN (CVPR 2017). Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (CVPR 2017)
      • Difference between the MSE-based and the VGG-based content loss
      • $MSE\ Loss = \frac{1}{n} \sum_{i=1}^{n} (y_i - t_i)^2$
      • $pixel\text{-}wise\ MSE\ Loss = \frac{1}{r^2 W H} \sum_{x=1}^{rW} \sum_{y=1}^{rH} \left( I^{HR}_{x,y} - G_{\theta_G}(I^{LR})_{x,y} \right)^2$, where $\theta_G = \{W_{1:L}; b_{1:L}\}$ are the weights and biases of the $L$ layers and $r$ is the downsampling factor
      • $VGG\ Loss = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y} \right)^2$, where $\phi_{i,j}$ is the feature map obtained by the $j$-th convolution before the $i$-th max-pooling layer, and $W_{i,j}$, $H_{i,j}$ are the dimensions of the respective feature maps within the VGG network
      • "Loss calculation at the feature level"
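The difference between the two content losses can be illustrated with a toy example. In the sketch below, `toy_features` is a hypothetical stand-in for $\phi_{i,j}$: it takes horizontal differences rather than running a VGG network, purely so the example stays self-contained. Only the structure of the computation (MSE on feature maps instead of on pixels) mirrors the slide.

```python
def mse_loss(a, b):
    """Mean squared error between two equally sized 2-D lists."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2
               for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n

def toy_features(img):
    """Stand-in for phi_{i,j}: horizontal differences (NOT a VGG layer)."""
    return [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]

def feature_loss(hr, sr, phi=toy_features):
    """MSE computed on feature maps phi(hr) and phi(sr) instead of pixels."""
    return mse_loss(phi(hr), phi(sr))

hr = [[0.0, 1.0, 0.0, 1.0]]
brighter = [[0.5, 1.5, 0.5, 1.5]]   # same texture, shifted intensity
blurry = [[0.5, 0.5, 0.5, 0.5]]     # texture averaged away

pixel_brighter, pixel_blurry = mse_loss(hr, brighter), mse_loss(hr, blurry)
feat_brighter, feat_blurry = feature_loss(hr, brighter), feature_loss(hr, blurry)
```

Both candidates have the same pixel-wise MSE (0.25), so that loss cannot tell them apart. The feature-level loss is 0 for the candidate that keeps the texture and 1 for the blurry one, which is the motivation given for moving the loss to the feature level.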
  • 31. 2-1-5 SRGAN. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (CVPR 2017)
  • 32. Part 03: Featured Papers. 1. Paper 1 ("EDSR") (Enhanced Deep Residual Networks for Single Image Super-Resolution) 2. Paper 2 ("SAN", SOTA) (Second-order Attention Network for Single Image Super-Resolution)
  • 33. Part 3-1: Paper 1: Enhanced Deep Residual Networks for Single Image Super-Resolution ("EDSR"). 1. Introduction 2. Architecture Overview 3. Experiment & Conclusion
  • 34. 3-1-1 Introduction: Limitations of Previous Works. Enhanced Deep Residual Networks for Single Image Super-Resolution (CVPR 2017)
      • Using ResNet as-is (SRResNet) performs well on SISR, but the following issues remain
      • Architectures such as ResNet were designed for high-level tasks like classification → SR is a low-level task
      • The BN used in ResNet reduces the flexibility of the network → training takes a long time
      • Figure: Batch Normalization
  • 35. 3-1-1 Introduction: Contributions. Enhanced Deep Residual Networks for Single Image Super-Resolution (CVPR 2017)
      • 40% less memory used during training → proposes a new residual block with the BN layers removed → deeper networks can be trained
      • Proposes a single-scale model (EDSR) and a multi-scale model (MDSR) → each scale (x2, x3, x4) is trained separately (EDSR), or several scales are trained at once (MDSR)
  • 36. 3-1-2 Architecture Overview: Model Overview. Figures: EDSR, MDSR. Enhanced Deep Residual Networks for Single Image Super-Resolution (CVPR 2017)
  • 37. 3-1-2 Architecture Overview: Model Overview. Enhanced Deep Residual Networks for Single Image Super-Resolution (CVPR 2017)
      • Proposes a new residual block structure without BN
      • The final feature of each residual block is multiplied by a constant 0.1 → promotes stable training
      • Training for x3 and x4 scaling uses transfer learning from the x2 model
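The residual scaling described here can be sketched on plain lists. `toy_body` is a hypothetical stand-in for the block's convolutional branch (in the actual model this would be conv → ReLU → conv with learned weights); only the `x + 0.1 * body(x)` structure reflects the slide.

```python
def res_block(x, body, scale=0.1):
    """Residual block sketch without batch norm: x + scale * body(x)."""
    return [xi + scale * ri for xi, ri in zip(x, body(x))]

def toy_body(x):
    """Hypothetical stand-in for the conv branch (here: ReLU of 2x)."""
    return [max(0.0, 2.0 * v) for v in x]

x = [1.0, -2.0, 3.0]
y = res_block(x, toy_body)
# y is approximately [1.2, -2.0, 3.6]
```

Scaling the branch by 0.1 keeps each block's update small relative to the identity path, which is one way to read the slide's remark about stable training when many blocks are stacked.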
  • 38. 3-1-3 Experiment & Conclusion: Evaluation results. Enhanced Deep Residual Networks for Single Image Super-Resolution (CVPR 2017)
  • 39. 3-1-3 Experiment & Conclusion: Conclusion & Limitations. Enhanced Deep Residual Networks for Single Image Super-Resolution (CVPR 2017)
      • Conclusion
          → Proposes a way to cut training-time memory by up to 40% compared with earlier methods
          → Proposes SR models that work at a single scale (EDSR) and at multiple scales (MDSR)
      • Limitations
          → The receptive field of the CNN is relatively small (it focuses only on local patches), so wider regions of the image are not taken into account
          → The local and global information carried by the features is treated equally
          → This later led to proposals such as dilated convolution and spatial or channel-wise attention
  • 40. Part 3-2: Paper 2: Second-order Attention Network for Single Image Super-Resolution ("SAN"). 1. Introduction 2. Architecture Overview 3. Experiment & Conclusion
  • 41. 3-2-1 Introduction: Limitations of Previous Works. Second-order Attention Network for Single Image Super-Resolution (CVPR 2019)
      • Existing models focus only on designing deeper or wider architectures → relationships between layers are not explored, which lowers the representational power of the whole network
      • Most do not use all the information in the LR image and have therefore shown lower performance
      • Training is relatively slow compared with this paper
  • 42. ESPRESOMEDIA 이명규 Landscape Summary of Super Resolution Task (42/52) ↳ Contributions • Second order statistics를 활용해 레이어 간 feature 상호 의존성 학습 • LSRAG(local-source residual attention groups) 구조를 제안해 LR 이미지 정보를 적극 활용 → 풍부한 low-frequency 정보 Introduction 3-2-1
  • 43. ESPRESOMEDIA 이명규 Landscape Summary of Super Resolution Task (43/52) ↳ Architecture Overview 3-2-2 Model Overview
  • 44. ESPRESOMEDIA 이명규 Landscape Summary of Super Resolution Task (44/52) ↳ Detailed view of Model: RL-NL Module • 목적 : 영상을 잘게 쪼개어 영상 전체를 보기 위함 → 영상을 4등분 후 각 영역에 대해 non-local module 적용 (큰 해상도에서도 유리) • SSRG모듈 전후로 수행되며, high-level에서 넓은 범위의 정보들을 모으는 역할 → Global level non-local 연산은 인풋 사이즈가 클 경우 연산량 증가 (Region-Level Non-Local) Architecture Overview 3-2-2
  • 45. 3-2-2 Architecture Overview: Detailed view of Model: LSRAG Module (Local Source Residual Attention Group)
    • Purpose: preserve feature inter-dependencies well → divide the image into four regions and apply a non-local module to each region (also advantageous at high resolution)
    • Composed of simplified residual blocks
  • 46. 3-2-2 Architecture Overview: Detailed view of Model: SOCA Module (Second-Order Channel Attention)
    • Purpose: improve the model's discriminative representation ability through covariance normalization → attention lets the network give more weight to features that carry more important information
    • Existing attention algorithms use only first-order statistics obtained through GAP → because they do not use information beyond first-order statistics (= the average), the model's discriminative representation ability suffers → channel attention is therefore performed after covariance normalization
  • 47. 3-2-2 Architecture Overview: Detailed view of Model: SOCA Module
    • GAP (Global Average Pooling): reduces dimensionality by simply taking the average over the nodes of each feature map
      e.g.) $\frac{1+9+6+4+5+4+7+8+5+1+2+9+6+7+6+0}{16} = 5$
    • SAN replaces GAP with GCP (Global Covariance Pooling)
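The GAP arithmetic on this slide can be checked with a few lines of NumPy. Arranging the sixteen values as a 4×4 map in row-major order is an assumption made for illustration; the slide only lists the values.

```python
import numpy as np

# The sixteen node values from the slide, arranged as an assumed 4x4 feature map.
feature_map = np.array([[1, 9, 6, 4],
                        [5, 4, 7, 8],
                        [5, 1, 2, 9],
                        [6, 7, 6, 0]], dtype=float)

# Global average pooling collapses the whole map into one scalar per channel.
gap = feature_map.mean()
```

Here `gap` is the single first-order statistic that a GAP-based attention module would see for this channel; everything about how the values vary together is discarded.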
  • 48. 3-2-2 Architecture Overview: Detailed view of Model: SOCA Module
    • Covariance Normalization:
      1. Reshape the $H \times W \times C$ feature map $\mathbf{F} = [\mathbf{f}_1, \dots, \mathbf{f}_C]$ into a feature matrix $\mathbf{X}$. ($\mathbf{X}$ has $s = WH$ features of dimension $C$.)
      2. Compute the sample covariance matrix $\boldsymbol{\Sigma} = \mathbf{X}\bar{\mathbf{I}}\mathbf{X}^{T}$, where $\bar{\mathbf{I}} = \frac{1}{s}\left(\mathbf{I} - \frac{1}{s}\mathbf{1}\right)$. ($\mathbf{I}$ is the $s \times s$ identity matrix and $\mathbf{1}$ is the $s \times s$ matrix of all ones.)
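Steps 1 and 2 can be sketched in NumPy as follows. The toy sizes and random feature map are assumptions for illustration; `X` is laid out as a $C \times s$ matrix so that $\mathbf{X}\bar{\mathbf{I}}\mathbf{X}^{T}$ comes out $C \times C$. The result is compared against `np.cov(..., bias=True)`, which computes the same biased sample covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 4, 5, 3                        # assumed toy sizes
F = rng.standard_normal((H, W, C))       # toy H x W x C feature map

# Step 1: reshape to a C x s matrix whose s = H*W columns are C-dim features.
s = H * W
X = F.reshape(s, C).T

# Step 2: I_bar = (1/s) * (I - (1/s) * ones) centres the data and divides by s.
I_bar = (np.eye(s) - np.ones((s, s)) / s) / s
Sigma = X @ I_bar @ X.T                  # C x C sample covariance

# Cross-check against NumPy's biased covariance (rows are variables).
reference = np.cov(X, bias=True)
```

Writing the centring as a matrix product is convenient on paper; in practice subtracting the per-channel mean before the product gives the same `Sigma` without building an $s \times s$ matrix.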
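  • 49. 3-2-2 Architecture Overview: Detailed view of Model: SOCA Module
      3. $\boldsymbol{\Sigma}$ is symmetric positive semi-definite, so it has the eigenvalue decomposition (EIG) $\boldsymbol{\Sigma} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{T}$, where $\mathbf{U}$ is an orthogonal matrix and $\boldsymbol{\Lambda}$ is a diagonal matrix of eigenvalues.
      4. Covariance normalization can therefore be converted to a power of the eigenvalues: $\hat{\mathbf{Y}} = \boldsymbol{\Sigma}^{\alpha} = \mathbf{U}\boldsymbol{\Lambda}^{\alpha}\mathbf{U}^{T}$
        - $\alpha$ is a positive real number; when $\alpha = 1$ no normalization is performed. ($\alpha = \frac{1}{2}$ was found to work well.)
        - When $\alpha < 1$, eigenvalues larger than 1.0 are shrunk non-linearly and eigenvalues smaller than 1.0 are stretched.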
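Steps 3 and 4 can be sketched with a direct eigendecomposition in NumPy. The function name `matrix_power_normalize` and the diagonal toy matrix are assumptions chosen so the effect on each eigenvalue is easy to read off; clipping tiny negative eigenvalues is a numerical safeguard added here, not something stated on the slide.

```python
import numpy as np

def matrix_power_normalize(sigma, alpha=0.5):
    """Return sigma**alpha = U diag(lambda**alpha) U^T for a symmetric PSD matrix."""
    eigvals, U = np.linalg.eigh(sigma)        # eigh is meant for symmetric input
    eigvals = np.clip(eigvals, 0.0, None)     # guard against round-off negatives
    return (U * eigvals**alpha) @ U.T         # scales column j of U by lambda_j**alpha

# Toy covariance with one eigenvalue above 1.0 and one below.
sigma = np.diag([4.0, 0.25])
Y_hat = matrix_power_normalize(sigma, alpha=0.5)
unchanged = matrix_power_normalize(sigma, alpha=1.0)
```

With $\alpha = \frac{1}{2}$ the eigenvalue 4.0 is pulled down to 2.0 and 0.25 is pushed up to 0.5, which is the shrink-and-stretch behaviour described in the second note; with $\alpha = 1$ the matrix comes back unchanged.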
  • 50. 3-2-2 Architecture Overview: Detailed view of Model: SOCA Module
    • The normalized covariance matrix $\hat{\mathbf{Y}}$ characterizes the correlations among channel-wise features
    • Pooling is performed at the channel level using the covariance-normalized $\hat{\mathbf{Y}}$
      Let $\hat{\mathbf{Y}} = [\mathbf{y}_1, \dots, \mathbf{y}_C]$. The channel-wise statistics $\mathbf{z}$ have entries $z_c = H_{GCP}(\mathbf{y}_c) = \frac{1}{C}\sum_{i=1}^{C} \mathbf{y}_c(i)$
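The pooling formula above amounts to averaging each column of $\hat{\mathbf{Y}}$. A minimal NumPy sketch follows; the function name and the 3×3 matrix are made-up illustrations, not values from the paper.

```python
import numpy as np

def global_covariance_pooling(y_hat):
    """z_c = (1/C) * sum_i y_c(i), where y_c is the c-th column of y_hat."""
    return y_hat.mean(axis=0)

# Made-up symmetric matrix standing in for a normalized covariance (C = 3).
Y_hat = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 4.0]])
z = global_covariance_pooling(Y_hat)   # one statistic per channel
```

Each entry of `z` summarises how one channel co-varies with all the others, so the resulting vector carries second-order information that a plain per-channel average cannot.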
  • 51. 3-2-3 Experiment & Conclusion: Evaluation results
  • 52. 3-2-3 Experiment & Conclusion: Conclusion & Limitations
    • Proposes several modules that raise PSNR performance on SISR
      • The SSRG module makes full use of low-frequency information
      • The RL-NL module exploits long-distance spatial contextual information
      • The SOCA module performs global covariance pooling and learns the dependencies between layers
    • Second-order channel attention focuses the network on learning discriminative representations
    • Low parameter count relative to the scale of the network
  • 53. Thank you for listening.
    Email: brstar96@espresomedia.com
    Mobile: +82-10-8234-3179