Image classification is a well-established problem in computer vision. Most state-of-the-art models rely on Convolutional Neural Networks to achieve near-human performance in that task. However, CNNs have been shown to be susceptible to image manipulation, which undermines the trustworthiness of perception systems. This property is critical, especially in unmanned systems, autonomous vehicles, and scenarios where light cannot be controlled. We investigate the robustness of several Deep-Learning-based image recognition models and how their accuracy is affected by several distinct image distortions. The distortions include ill-exposure, low-range image sensors, and common noise types. Furthermore, we also propose and evaluate an image pipeline designed to minimize image distortion before image classification is performed. Results show that most CNN models are marginally affected by mild mis-exposure...
2. ABOUT
• Badly exposed and noisy images
• Image Recognition / Classification
• Assessing the impacts of image distortions on the object recognition task
• Image restoration and enhancement as a pre-processing strategy
• How much can it impact the results?
3. INTRODUCTION
How can the image be affected at capture time?
• Typical challenges
• High contrast (light source, contrasting scene)
• Low contrast
• Low light (weakly illuminated scene)
• Lens aperture
⬆ Larger aperture equals more light, shorter exposure time
⬇ Larger aperture equals lower depth of field
• Exposure time vs. Blur
⬆ Higher exposure time equals more light getting to the sensor
⬆ Higher time equals more blur artifacts (inadequate for scenes with moving subjects)
• Gain vs. Granularity
⬆ More gain equals more contrast
⬆ More gain equals more noise (both signal and noise are amplified)
• Quantization, Sampling and Clipping
8. OUR APPROACH
To use a Convolutional Neural Network for image enhancement and noise suppression
• Modular approach to maximize reuse
• Avoid model adjustment
• Work in the sRGB color-space
• Flexible to accept different image resolutions
Scene Picture → Enhance → De-noise → Predict → “Dog”
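The pipeline above can be sketched as a chain of independent, swappable callables. This is a minimal illustration with hypothetical stand-in stages; in the actual pipeline, the enhance and de-noise stages are ReExposeNet and DnCNN-3.

```python
import numpy as np

def run_pipeline(image, enhance, denoise, predict):
    """Modular pipeline: enhance -> de-noise -> predict, each stage swappable."""
    return predict(denoise(enhance(image)))

# Hypothetical stand-in stages, for illustration only.
def identity_enhance(img):
    return img  # a real system would plug in an exposure-correction CNN here

def identity_denoise(img):
    return img  # a real system would plug in a denoising CNN here

def toy_predict(img):
    return "dog" if img.mean() > 0.25 else "unknown"

img = np.full((224, 224, 3), 0.5, dtype=np.float32)  # dummy sRGB-like input
label = run_pipeline(img, identity_enhance, identity_denoise, toy_predict)
print(label)  # → dog
```

Because each stage only needs to accept and return an image array, stages can be replaced or skipped without adjusting the classifier, which is the point of the modular design.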
10. IMAGE RESTORATION MODELS
• ReExposeNet for mis-exposed images
• Based on U-Net and Context Aggregation Network
• One size fits all
• Small model in terms of parameters when compared to others with the same purpose
• Adjusted using supervised learning on synthetic and real datasets
• DnCNN-3 for de-noising
• Very deep feed-forward CNN
• Relies on Residual Learning
• Can tackle several image denoising tasks, as well as JPEG deblocking and super-resolution
• Adequate for real-time applications
25. CONCLUSION
Comprehensive experiments on the robustness of state-of-the-art CNN-based image recognition
• Several common image distortions
• Ill exposure
• Signal-dependent noise
• Signal-independent noise
• Used a set of classifiers which had outstanding accuracy in the ILSVRC Competition
• We offer a succinct representation of each classifier's performance
• Existing CNNs are little affected by slight mis-exposure or saturated pixel values
• Poisson noise and AWGN also have limited effect on the accuracy
• Models appear vulnerable to severe mis-exposure and signal-independent noise
• What next?
• Do segmentation, mapping, and localization systems follow the same robustness pattern?
Hi,
My name is Cristiano Steffens. I’ll be presenting “A Pipelined Approach to Deal with Image Distortion in Computer Vision”. This work was done with my colleague Lucas Messias under the supervision of Professor Drews and Professor Botelho.
Here, we extend a previous work, published last year, in which we showed the impacts of image distortion on several image-based applications. For those who are unfamiliar with the image recognition/classification task, please consider this a classic instance of the garbage-in, garbage-out issue. Our models are only as good as our input data. If you have faulty data, you are likely to get unreliable results.
Although images are multidimensional data, in which the content can often be inferred by the inherent relationship among distinct image parts, we show that the same issues still apply.
Image classification is a well-established problem in computer vision. Most state-of-the-art models rely on Convolutional Neural Networks to achieve near-human performance in that task. However, CNNs have been shown to be susceptible to image manipulation, which undermines the trustworthiness of perception systems. This property is critical, especially in unmanned systems, autonomous vehicles, and scenarios where light cannot be controlled. We investigate the robustness of several Deep-Learning-based image recognition models and how their accuracy is affected by several distinct image distortions. The distortions include ill-exposure, low-range image sensors, and common noise types. Furthermore, we also propose and evaluate an image pipeline designed to minimize image distortion before image classification is performed. Results show that most CNN models are marginally affected by mild mis-exposure and Shot noise. On the one hand, the proposed pipeline can provide significant gains on mis-exposed images. On the other hand, harsh mis-exposure, signal-dependent noise, and impulse noise have a high impact on all evaluated models.
(considering most image data is transmitted or stored in formats that can easily be converted to sRGB)
[worthy of attention or notice; remarkable]
Image classification models are built to predict the classes of objects present in an image. The remainder of this paper explores CNN-based classification models, which have been adjusted for the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Convolutional networks have recently enjoyed great success in this task. Among these, we highlight the following popular models: \textit{i.} VGG, by Simonyan \etal \cite{simonyan2014very}, which obtained both first and second place in the ILSVRC-2014; \textit{ii.} ResNet, by \cite{he2016deep}, which obtained first place in the ILSVRC-2015; \textit{iii.} Inception-v3, by \cite{szegedy2016rethinking}, which introduces factorized convolutions and aggressive regularization; \textit{iv.} Inception-ResNet-v2, by \cite{szegedy2017inception}, which combines residual connections with the Inception architecture; \textit{v.} MobileNetV1, by \cite{howard2017mobilenets}, which includes depthwise separable convolutions between the regular convolution layers; \textit{vi.} DenseNet, by \cite{huang2017densely}, where each layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all subsequent layers; \textit{vii.} NASNet, by \cite{zoph2018learning}, which automates network design using information acquired on a small dataset.
In order to restore mis-exposed images, we introduced the ReExposeNet \cite{steffens2019icip} image restoration model into the image recognition pipeline. This model is designed to estimate the radiance of an improperly exposed image, a task that requires restoration and enhancement of non-clipped pixels to maximize visibility and color accuracy, as well as reconstruction strategies for regions where the signal has been clipped. ReExposeNet is a fast and small CNN exposure correction model, capable of synthesizing substantial clipped parts in high-resolution images. It combines aspects of both U-Nets and Context Aggregation Networks (CANs). ReExposeNet relies on supervised training considering a custom content-based objective function to maximize restoration and reconstruction in clipped areas. It has been adjusted considering both synthetic and real mis-exposed images in three different datasets. ReExposeNet is released as a one-size-fits-all solution, which can be consistently applied on a wide range of image mis-exposure levels. For the present work, we used the model as released by its authors, without further fine-tuning.
To restore images damaged by noise, we used the DnCNN-3 model \cite{zhang2017beyond}. DnCNN-3 is a very deep feed-forward denoising convolutional neural network. It relies on residual learning and batch normalization to speed up the training process as well as boost the denoising performance. Zhang \etal claim to provide a single DnCNN model to tackle several general image denoising tasks, such as blind Gaussian denoising, single image super-resolution, and JPEG image deblocking. The authors show that the DnCNN model can not only exhibit high effectiveness in several general image denoising tasks but also be efficiently implemented by benefiting from GPU computing, which makes it adequate for real-time applications.
Gamma Power Transformation is a nonlinear operation used to encode and decode luminance values in imaging systems. It is used to adjust and compensate the response of some luminance levels in the input image. We use the Gamma Power Transformation to mimic the conditions observed in under-exposed and over-exposed images as $\hat{I} = I^\gamma$. The power transformation is followed by min-max normalization in order to adjust pixel values to a valid representation range. This transformation results in data loss in dark regions, when $\gamma > 1$, or in bright, washed-out regions, when $\gamma < 1$. For simulation purposes, we used $\gamma = [\frac{1}{4}; \frac{1}{6}; \frac{1}{8}; 4; 6; 8]$.
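A minimal NumPy sketch of this distortion ($\hat{I} = I^\gamma$ followed by min-max normalization), assuming pixel values already scaled to $[0, 1]$:

```python
import numpy as np

def gamma_distort(img, gamma):
    """Mis-exposure simulation: power transform, then min-max normalization."""
    out = np.power(img.astype(np.float64), gamma)
    lo, hi = out.min(), out.max()
    return (out - lo) / (hi - lo) if hi > lo else np.zeros_like(out)

img = np.linspace(0.0, 1.0, 256).reshape(16, 16)  # synthetic gradient image
under = gamma_distort(img, 4.0)    # gamma > 1: detail crushed in dark regions
over = gamma_distort(img, 1 / 4)   # gamma < 1: bright, washed-out regions
```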
R stands for restored, which means we are using our pipelined approach.
For gamma 4, we notice a significant improvement in accuracy with the pipeline.
We also notice that NASNet Large (purple pentagon), Inception ResNet v2 (green ×), and Inception v3 (orange triangle) seem to be more robust to mild mis-exposure than the other models considered.
Small models such as MobileNet v2 seem to be less resilient to image distortion.
Gaussian Noise is randomly added to the input image. The random noise follows a normal distribution.
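As a sketch, additive white Gaussian noise can be applied as below; the `sigma` value is illustrative, not one of the levels used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma):
    """AWGN: add zero-mean normal noise, clip back to the valid [0, 1] range."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

img = np.full((64, 64), 0.5)          # flat gray test image
noisy = add_gaussian_noise(img, sigma=0.1)
```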
Shot Noise, also known as Photon or Poisson Noise, is a data-dependent noise model. A Poisson noise model may be more appropriate than a Gaussian one for low-light conditions, where the noise is due to low photon counts.
Talbot claims that image sensor noise is dominated by Poisson statistics even at high illumination levels, this being a typical effect in images captured by robots.
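A common way to simulate signal-dependent shot noise is to draw each pixel from a Poisson distribution scaled by a photon-count parameter; the `peak` parameter here is a hypothetical choice for illustration, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_shot_noise(img, peak=30.0):
    """Poisson (shot) noise: variance grows with the signal.
    Lower `peak` means fewer photons, hence a noisier image."""
    return np.clip(rng.poisson(img * peak) / peak, 0.0, 1.0)

img = np.full((64, 64), 0.5)
noisy = add_shot_noise(img, peak=30.0)
```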
Salt and Pepper Noise (S\&P) is an impulse noise, added to an image by setting white pixels (pixel value equals 255 in an 8-bit-per-channel color space) and black pixels (pixel value equals 0). In real applications, Salt \& Pepper noise is often associated with dead pixels in the camera's sensor array.
Details on the probability are provided in the paper.
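A minimal sketch of impulse noise on a [0, 1] image; the `amount` used here is illustrative, since the actual corruption probabilities are those given in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_salt_pepper(img, amount=0.05):
    """Impulse noise: force a fraction `amount` of pixels to pure black/white."""
    noisy = img.copy()
    mask = rng.random(img.shape)
    noisy[mask < amount / 2] = 0.0        # pepper (dead/black pixels)
    noisy[mask > 1 - amount / 2] = 1.0    # salt (saturated/white pixels)
    return noisy

img = np.full((100, 100), 0.5)
noisy = add_salt_pepper(img, amount=0.05)
```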
Speckle noise originates from the coherent processing of back-scattered signals from multiple distributed points. It follows a uniform distribution.
Speckle noise in real applications is often related to environmental conditions that affect the imaging sensor during image acquisition. It is also common in medical images, as well as active Radar images.
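Following the uniform-distribution description above, multiplicative speckle can be sketched as below; the `spread` value is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_speckle_noise(img, spread=0.2):
    """Multiplicative speckle: I + I*u with u ~ Uniform(-spread, spread),
    so the perturbation scales with the signal itself."""
    u = rng.uniform(-spread, spread, img.shape)
    return np.clip(img + img * u, 0.0, 1.0)

img = np.full((64, 64), 0.5)
noisy = add_speckle_noise(img)  # values stay within 0.5 * (1 +/- 0.2)
```

Because the noise is multiplicative, darker regions are perturbed less than bright ones, which distinguishes speckle from the additive models above.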
NASNet Large performs best.
Left to right: VGG, ResNet, Inception v3, Inception Resnet v2, DenseNet, NasNet Large, NasNet Mobile, MobileNet v2
That’s it for today,
If you have any questions regarding the experiments, or if you’d like to share your thoughts on this matter, please get in touch.
Thank you for your attention.