Data-driven AI
Security HCI (DASH) Lab
Real-World Super-Resolution via
Kernel Estimation and Noise Injection
Minha Kim
kimminha@g.skku.edu
Sungkyunkwan University
August 26, 2021
Xiaozhong Ji et al., Nanjing University, Tencent Youtu Lab
Super Resolution
Background
- Most super-resolution (SR) methods use a fixed bicubic downsampling operation to construct training data pairs.
- In other words, models trained on bicubically downsampled images are robust only to bicubic-degraded inputs.
- On real images, the results are blurry with lots of noise: there is a domain gap between training and test data.
Proposal
To solve this problem, the authors propose RealSR, a realistic degradation framework for super-resolution that combines kernel estimation with noise injection to preserve the attributes of the original image domain.
Introduction
Problems
- Most methods adopt simple bicubic down-sampling from high-quality images to construct Low-Resolution (LR) / High-Resolution (HR) pairs for training, which loses frequency-related details.
- SR models trained on data generated by the bicubic kernel only work well on clean HR data, because the model never sees blurry or noisy data during training.
Contribution
- Various blur kernels.
- Real noise distributions.
Proposal
To solve this problem, the authors propose RealSR, a realistic degradation framework for super-resolution that combines kernel estimation with noise injection to preserve the attributes of the original image domain.
RealSR Architecture
RealSR – First Stage: Degrading the Images
The LR image is obtained by blurring the HR image with a kernel, downsampling, and adding noise:
I_LR = (I_HR ⊛ k) ↓s + n
- k : blur kernel
- n : noise
- Note – all three are unknown in advance.
- k and n are collected using the authors' kernel estimation and noise injection methods.
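The degradation model above can be sketched in NumPy (a minimal illustration, not the authors' code; the averaging kernel, scale, and zero noise below are toy placeholders):

```python
import numpy as np

def degrade(hr, kernel, scale, noise):
    """RealSR-style degradation: I_LR = (I_HR conv k) downsampled by s, plus n."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(hr, ((ph, ph), (pw, pw)), mode="edge")
    blurred = np.zeros_like(hr)
    for i in range(kh):                      # plain 2D convolution with the blur kernel
        for j in range(kw):
            blurred += kernel[i, j] * padded[i:i + hr.shape[0], j:j + hr.shape[1]]
    lr = blurred[::scale, ::scale]           # nearest subsampling stands in for the downarrow
    return lr + noise                        # inject the noise patch

# Toy example: 8x8 HR ramp, 3x3 averaging kernel, scale 2, zero noise.
hr = np.arange(64, dtype=float).reshape(8, 8)
k = np.full((3, 3), 1.0 / 9.0)               # normalized blur kernel (sums to 1)
n = np.zeros((4, 4))
lr = degrade(hr, k, 2, n)
print(lr.shape)                              # (4, 4)
```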
RealSR – First Stage: Estimating the Kernel and Noise
1. Kernel Estimation – how to collect k
KernelGAN: https://arxiv.org/pdf/1909.06581.pdf
The kernel k is estimated KernelGAN-style: the LR image downsampled with the estimated kernel is compared against the LR image downsampled with the ideal (bicubic) kernel, and a discriminator enforces that the two patch distributions match.
- k : blur kernel
- n : noise
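Two of the kernel constraints used in this stage (sum-to-one, and a boundary penalty whose mask grows with distance from the kernel center, per the speaker notes) can be sketched as follows; the exponential mask shape is an assumption, not copied from the paper:

```python
import numpy as np

def kernel_regularizers(k):
    """Sketch of two kernel constraints: |1 - sum(k)| and a boundary
    penalty sum(m * |k|), with mask m growing with distance from center."""
    sum_to_one = abs(k.sum() - 1.0)
    h, w = k.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
    m = np.exp(dist) - 1.0           # assumed mask: zero at center, grows outward
    boundary = np.sum(m * np.abs(k))
    return sum_to_one, boundary

# A centered delta kernel satisfies both constraints exactly.
delta = np.zeros((5, 5))
delta[2, 2] = 1.0
s, b = kernel_regularizers(delta)
print(round(s, 6), round(b, 6))      # 0.0 0.0
```

A kernel with mass spread toward the borders keeps the sum-to-one term at zero but pays a large boundary penalty, which is what pushes the estimate toward a compact, centered kernel.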
RealSR – First Stage: Estimating the Kernel and Noise
2. Noise Estimation – how to collect n
Noise patches are cropped directly from the source LR images: a patch n_i is kept only if its variance is below a threshold, σ(n_i) < v, where σ(·) is the function that calculates variance and v is the maximum allowed variance.
n_i is a cropped noise patch from the noise pool {n_1, n_2, ..., n_l}.
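The variance-filtering rule σ(n_i) < v can be sketched like this (an illustration under assumed parameters; patch size, threshold, and the zero-mean normalization are choices of this sketch):

```python
import numpy as np

def collect_noise_patches(image, patch_size, v):
    """Keep patches whose variance is below threshold v, i.e. sigma(n_i) < v:
    flat regions whose remaining content is (mostly) sensor noise."""
    pool = []
    h, w = image.shape
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patch = image[y:y + patch_size, x:x + patch_size]
            if patch.var() < v:                     # sigma(n_i) < v
                pool.append(patch - patch.mean())   # store as zero-mean noise (sketch choice)
    return pool

rng = np.random.default_rng(0)
flat = rng.normal(0.0, 0.01, (16, 16))              # flat, noisy region: low variance
busy = np.tile(np.arange(16.0), (16, 1))            # textured region: high variance
img = np.hstack([flat, busy])
pool = collect_noise_patches(img, 16, v=0.01)
print(len(pool))                                    # 1 -- only the flat half qualifies
```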
RealSR – First Stage: Degrading the Images
How to apply the collected k and n
The HR image is blurred with a kernel sampled from the kernel pool and downsampled; then a cropped noise patch n_i from the noise pool {n_1, n_2, ..., n_l} is injected to produce the training LR image.
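The injection step can be sketched as an online random crop from the collected pool (a minimal sketch; sampling strategy and shapes are assumptions):

```python
import numpy as np

def inject_noise(lr, noise_pool, rng):
    """Online noise injection: pick a random patch from the pool,
    crop it at a random location, and add it to the degraded LR image."""
    n = noise_pool[rng.integers(len(noise_pool))]
    h, w = lr.shape
    y = rng.integers(n.shape[0] - h + 1)
    x = rng.integers(n.shape[1] - w + 1)
    return lr + n[y:y + h, x:x + w]

rng = np.random.default_rng(0)
pool = [rng.normal(0.0, 0.01, (32, 32)) for _ in range(4)]  # stand-in noise pool
lr = np.zeros((16, 16))
noisy = inject_noise(lr, pool, rng)
print(noisy.shape)   # (16, 16)
```

Because the crop is re-sampled every iteration, the same LR image sees different noise across epochs, which is what the notes describe as making the noise more diverse and regularizing the SR model.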
Experiments
Evaluation Metrics
- PSNR
- SSIM
- LPIPS
Datasets
- DF2K: Gaussian noise is artificially added to simulate sensor noise.
- DPED: taken with an iPhone 3 camera; more challenging, containing noise, blur, dark lighting, and other low-quality problems.
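For reference, PSNR (the first metric above) reduces to a one-liner over the mean squared error; this is a generic sketch of the metric, not the challenge's evaluation script:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE). Higher = closer to the reference."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.full((8, 8), 100.0)
noisy = ref + 10.0                   # constant error of 10 -> MSE = 100
print(round(psnr(ref, noisy), 2))    # 28.13
```

Note that unlike PSNR and SSIM, LPIPS is learned (AlexNet features), so lower LPIPS is better while higher PSNR/SSIM is better.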
Experiments
Baselines
- EDSR (Lim et al., CVPRW 2017)
- ESRGAN (Wang et al., ECCV 2018)
- ZSSR (Shocher et al., CVPR 2018)
- K-ZSSR: a combination of KernelGAN and ZSSR
Results on DF2K
Train set: 3,350 images
Validation set: 100 images
Results on DPED
Train set: 5,614 images
Validation set: 100 images
Results for the NTIRE 2020 Challenge on
Real-World Image Super-Resolution: Track 1 & 2
Contributions
- They acquired LR images that share a common domain with real images.
- Using this domain-consistent data, they train a real-image super-resolution GAN with a patch discriminator, which produces HR results with better perceptual quality.
- RealSR won both tracks of the NTIRE 2020 Real-World Super-Resolution Challenge and outperforms the SOTA methods.
Thank you ! 

[CVPRW 2020] Real-World Super-Resolution via Kernel Estimation and Noise Injection


Editor's Notes

  • #2 Hi, I'm glad to be here with you today. What I'm going to talk about ~~~ This paper was presented at CVPRW 2020, and in the NTIRE 2020 challenge on real-world super-resolution, held in conjunction with CVPR 2020, their method won on two tracks, significantly outperforming the other competitors by large margins.
  • #3 Before introducing their paper, I will briefly explain what super-resolution is. The Super-Resolution (SR) task is to increase the resolution of low-quality images. In general, an SR model trained on data generated by the bicubic kernel only works well on clean HR data, because the model has never seen blurry or noisy data during training. Moreover, in the test phase, the input image down-sampled by the bicubic kernel is fed to the designed network.
  • #4 Therefore, they argue that these methods always fail in real-world image super-resolution, since most of them adopt simple bicubic downsampling from high-quality images for training, which can lose track of frequency-related details. To solve this problem, they focused on designing a novel degradation framework for real-world images by estimating various blur kernels as well as real noise distributions. Based on this framework, they show how to acquire LR images sharing a common domain with real-world images. Then they propose a real-world super-resolution model aimed at better perception. Extensive experiments on synthetic noise data and real-world images demonstrate that their method outperforms the state-of-the-art methods, resulting in lower noise and better visual quality. Their key novelties can be divided into various blur kernels and real noise distributions.
  • #5 The red boxes mark the key methods proposed in this paper: the degradation pool and the degradation step. The degradation pool generates the kernels and noise used for degradation, and the degradation step creates low-resolution images suitable for training a super-resolution model (their final goal is the data, not a new model). The existing problem is that most super-resolution methods use very ideal kernels, such as bicubic downsampling, and the test data are likewise ideal; applied to real data, this actually produces more blurring and noise. I will explain the method behind the degradation pool in more detail.
  • #6 They assume the LR image is obtained by the degradation method above. The referenced KernelGAN trains solely on the low-resolution test image at test time and learns its internal distribution of patches. Its generator is trained to produce a downscaled version of the LR test image such that the discriminator cannot distinguish between the patch distribution of the downscaled image and the patch distribution of the original LR image. The key to the method is k and n, which are collected using their kernel estimation and noise injection methods. On the next page, I will explain how to obtain k and n.
  • #7 They adopt a similar kernel estimation method and set appropriate parameters based on real images. The generator of KernelGAN is a linear model without any activation layers, so it can accommodate all kernels. 1. The first term is an L1 loss between the LR image downsampled with kernel k and the image downsampled with the ideal kernel, i.e. bicubic downsampling. 2. The second term constrains k to sum to 1. 3. The third term penalizes non-zero values close to the boundaries; m is a constant mask of weights growing exponentially with distance from the center of k. 4. The last term, via the discriminator D(·), ensures consistency with the source domain.
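  The four terms described in note #7 can be written roughly as follows (a reconstruction from the description above; the λ weights and exact notation are assumptions, not copied from the paper):

```latex
\mathcal{L}(k) =
  \big\| (I^{\mathrm{LR}} * k)\!\downarrow_s \;-\; I^{\mathrm{LR}}\!\downarrow_s^{\mathrm{bic}} \big\|_1
  + \lambda_1 \Big| 1 - \sum_{i,j} k_{i,j} \Big|
  + \lambda_2 \sum_{i,j} m_{i,j}\,\lvert k_{i,j}\rvert
  + \lambda_3\, \mathcal{L}_{\mathrm{adv}}(D)
```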
  • #8 In order to make the degraded image have a noise distribution similar to the source image, they directly collect noise patches from the source real-image set X. Here σ(·) is the function that calculates variance. They adopt an online noise injection method in which the content and the noise are combined during the training phase. This makes the noise more diverse and regularizes the SR model to distinguish content from noise.
  • #9 After degrading with the blur kernels and injecting noise, they finally obtain the LR image using their RealSR method.
  • #10 They use two datasets, DF2K and DPED. The DF2K images have Gaussian noise artificially added to simulate sensor noise. The DPED dataset was taken with an iPhone 3 camera and is more challenging, containing noise, blur, dark lighting, and other low-quality problems. They adopt three metrics: PSNR, SSIM, and LPIPS. PSNR and SSIM are commonly used evaluation metrics for image restoration; these two pay more attention to the fidelity of the image than to visual quality. In contrast, LPIPS pays more attention to whether the visual features of the images are similar. It uses a pretrained AlexNet to extract image features and then calculates the distance between the two feature sets. Therefore, the smaller the LPIPS, the closer the generated image is to the ground truth.
  • #11 They compared with four state-of-the-art models, which are ~~~~~
  • #12 First, they compared against the four baselines. Again, the smaller the LPIPS, the closer the generated image is to the ground truth; as the table shows, they outperformed all models. The figure is a visual comparison among EDSR, ZSSR, and RealSR on a real-world low-resolution image.
  • #13 Since there is no corresponding ground truth, they provide only a visual comparison. The images suffer from degradation problems such as blur and noise. On the DPED dataset, just like the problems encountered when training SR on real images, there is no ground truth, so they show only visual comparison results.
  • #14 They also won the NTIRE 2020 Challenge on two tracks of Real-World Super-Resolution and outperform the SOTA methods. Note that the Mean Opinion Score (MOS) metric is measured via a human study.
  • #15 To summarize their contributions: 1. They proposed a novel degradation framework, RealSR, based on kernel estimation and noise injection; by using different combinations of degradations (such as blur and noise), they acquire LR images that share a common domain with real images. 2. With this domain-consistent data, they train a real-image super-resolution GAN with a patch discriminator, which can produce HR results with better perception. 3. RealSR is also the winner of two tracks of the NTIRE 2020 Real-World Super-Resolution Challenge and outperforms the SOTA methods.