Can Exposure, Noise and Compression affect Image Recognition? An Assessment o... — Cristiano Rafael Steffens
Convolutional Neural Networks represent the current state of the art in image recognition, as well as in many other computer vision tasks.
Nevertheless, these architectures have been shown to be vulnerable to image manipulations, which may undermine the reliability and safety of CNN-based models in autonomous and robotic applications. We present a rigorous evaluation of the robustness of several high-level image recognition models and investigate their performance under distinct image distortions. We propose a testing framework which emulates ill-exposure conditions, low-range image sensors, lossy compression, as well as commonly observed noise types. On one side, results measured in terms of accuracy, precision, and F1-score indicate that most CNN models are marginally affected by mild mis-exposure, heavy compression, and Poisson noise. Severe mis-exposure, impulse noise, or signal-dependent noise, on the other side, cause a substantial drop in accuracy and precision. A careful evaluation of some typical image distortions, commonly observed in computer vision and machine vision pipelines, provides insights and directions for further developments in the field. Please refer to our GitHub repo for code and data.
• Signal with amplitude outside the range accepted by the sensor
• Input: damaged image; output: restored image
• Post-processing of images damaged at the moment of acquisition
• sRGB color space
• Restoration with aesthetic purposes
• What we expect: color correction; texture; edges / lines / image gradient; structures
• Modeling based on convolutional neural networks
2. ABOUT
• Badly exposed and noisy images
• Image Recognition / Classification
• Assessing the impacts of image distortions on the object recognition task
• Image restoration and enhancement as a pre-processing strategy
• How much can it impact the results?
3. INTRODUCTION
How can the image be affected at capture time?
• Typical challenges
• High contrast (light source, contrasting scene)
• Low contrast
• Low light (weakly illuminated scene)
• Lens aperture
⬆ Larger aperture equals more light, shorter exposure time
⬇ Larger aperture equals lower depth of field
• Exposure time vs. Blur
⬆ Higher exposure time equals more light getting to the sensor
⬆ Higher exposure time equals more blur artifacts (inadequate for scenes with moving subjects)
• Gain vs. Granularity
⬆ More gain equals more contrast
⬆ More gain results in more noise (both signal and noise are amplified)
• Quantization, Sampling and Clipping
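The gain, clipping, and quantization trade-offs listed above can be illustrated with a toy sensor model. This is a hedged sketch in NumPy, not the emulation framework used in the paper; the `capture` function and its parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def capture(radiance, gain=1.0, noise_sigma=0.02):
    """Toy sensor model: amplify, add noise, clip, and quantize to 8 bits."""
    signal = radiance * gain                                    # gain amplifies the signal...
    noisy = signal + rng.normal(0.0, noise_sigma * gain, radiance.shape)  # ...and the noise
    clipped = np.clip(noisy, 0.0, 1.0)                          # out-of-range amplitudes saturate
    return np.round(clipped * 255).astype(np.uint8)             # quantization to 8-bit values

scene = rng.uniform(0.0, 1.0, (4, 4))   # synthetic scene radiance in [0, 1]
under = capture(scene, gain=0.25)       # under-exposed: detail crushed into the low codes
over = capture(scene, gain=4.0)         # over-exposed: many pixels clip at 255
```

Once a pixel clips at 0 or 255 the original radiance is unrecoverable, which is why severe mis-exposure is harder to undo than noise.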
8. OUR APPROACH
Use a Convolutional Neural Network for image enhancement and noise suppression
• Modular approach to maximize reuse
• Avoid model adjustment
• Work in the sRGB color-space
• Flexible to accept different image resolutions
Pipeline: Scene → Picture → Enhance → De-noise → Predict → “Dog”
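The modular flow above can be sketched as a chain of interchangeable stages. The stage functions below are hypothetical stand-ins for ReExposeNet, DnCNN-3, and a pretrained classifier, not the authors' released code:

```python
# Each stage is a plain function, so any stage can be swapped or skipped
# without retraining the classifier (the "avoid model adjustment" goal).

def enhance(image):
    """Exposure-correction stage (stand-in for ReExposeNet)."""
    return image

def denoise(image):
    """Noise-suppression stage (stand-in for DnCNN-3)."""
    return image

def classify(image):
    """Any off-the-shelf CNN classifier; returns a class label."""
    return "dog"

def pipeline(image, stages=(enhance, denoise)):
    # Restoration stages run before recognition.
    for stage in stages:
        image = stage(image)
    return classify(image)
```

Because the restoration stages operate in the sRGB color space, they can be reused in front of any classifier without modifying it.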
10. IMAGE RESTORATION MODELS
• ReExposeNet for mis-exposed images
• Based on U-Net and the Context Aggregation Network
• One size fits all
• Small model in terms of parameters when compared to others with the same purpose
• Adjusted using supervised learning on synthetic and real datasets
• DnCNN-3 for de-noising
• Very deep feed-forward CNN
• Relies on residual learning
• Can tackle several image denoising tasks, as well as JPEG deblocking and super-resolution
• Adequate for real-time applications
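The residual-learning idea behind DnCNN can be sketched as follows: the network is trained to predict the noise component, and the clean image is recovered by subtracting that prediction from the noisy input. In this illustrative sketch an oracle lambda stands in for the trained CNN; it is not the released model:

```python
import numpy as np

def residual_denoise(noisy, noise_predictor):
    """Residual formulation: estimate the noise v, then recover x = y - v."""
    return noisy - noise_predictor(noisy)

rng = np.random.default_rng(1)
clean = rng.uniform(0.0, 1.0, (8, 8))       # ground-truth image
noise = rng.normal(0.0, 0.05, clean.shape)  # additive noise
noisy = clean + noise

# A perfect predictor returns exactly the noise; a trained DnCNN only
# approximates it, so real restorations are close to, not equal to, clean.
restored = residual_denoise(noisy, lambda y: noise)
assert np.allclose(restored, clean)
```

Learning the residual is easier than learning the clean image directly, since the noise map is closer to zero-mean and the identity part of the mapping does not have to be learned.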
25. CONCLUSION
Comprehensive experiments on the robustness of state-of-the-art CNN-based image recognition
• Several common image distortions
• Ill exposure
• Signal-dependent noise
• Signal-independent noise
• Used a set of classifiers which achieved outstanding accuracy in the ILSVRC competition
• We offer a succinct representation of the performance of the classifiers
• Existing CNNs are little affected by slight mis-exposure or saturated pixel values
• Poisson noise and AWGN also have limited effect on the accuracy
• Models appear vulnerable to severe mis-exposure and signal-independent noise
• What next?
• Do segmentation, mapping, and localization systems follow the same robustness pattern?
Hi,
My name is Cristiano Steffens. I’ll be presenting “A Pipelined Approach to Deal with Image Distortion in Computer Vision”. This work was done with my colleague Lucas Messias under the supervision of Professor Drews and Professor Botelho.
Here, we extend a previous work, published last year, in which we showed the impacts of image distortion in several image-based applications. For those who are unfamiliar with the image recognition/classification task, please consider this a classic instance of the garbage-in, garbage-out issue: our models are only as good as our input data. If you have faulty data, you are likely to get unreliable results.
Although images are multidimensional data, in which the content can often be inferred from the inherent relationship among distinct image parts, we show that the same issues still apply.
Image classification is a well-established problem in computer vision. Most state-of-the-art models rely on Convolutional Neural Networks to achieve near-human performance in that task. However, CNNs have been shown to be susceptible to image manipulation, which undermines the trustability of perception systems. This property is critical, especially in unmanned systems, autonomous vehicles, and scenarios where light cannot be controlled. We investigate the robustness of several Deep-Learning-based image recognition models and how their accuracy is affected by several distinct image distortions. The distortions include ill-exposure, low-range image sensors, and common noise types. Furthermore, we also propose and evaluate an image pipeline designed to minimize image distortion before the image classification is performed. Results show that most CNN models are marginally affected by mild mis-exposure and shot noise. On the one hand, the proposed pipeline can provide significant gain on mis-exposed images. On the other hand, harsh mis-exposure, signal-dependent noise, and impulse noise incur a high impact on all evaluated models.
(considering most image data is transmitted or stored in formats that can easily be converted to sRGB)
Image classification models are built to predict the classes of objects present in an image. The remainder of this paper explores CNN-based classification models, which have been adjusted for the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Convolutional networks have recently enjoyed great success in this task. Among these, we highlight the following popular models: \textit{i.} VGG, by Simonyan \etal \cite{simonyan2014very}, which obtained both first and second place in the ILSVRC-2014; \textit{ii.} ResNet, by \cite{he2016deep}, which obtained first place in the ILSVRC-2015; \textit{iii.} Inception-v3, by \cite{szegedy2016rethinking}, which introduces factorized convolutions and aggressive regularization; \textit{iv.} Inception-ResNet-v2, by \cite{szegedy2017inception}, which combines residual connections with the Inception architecture; \textit{v.} MobileNetV1, by \cite{howard2017mobilenets}, which includes depthwise separable convolutions between the regular convolution layers; \textit{vi.} DenseNet, by \cite{huang2017densely}, where each layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all subsequent layers; \textit{vii.} NASNet, by \cite{zoph2018learning}, which automates network design using information acquired on a small dataset.
In order to restore miss-exposed images, we introduced the ReExposeNet \cite{steffens2019icip} image restoration model into the image recognition pipeline. This model is designed to estimate the radiance of an improperly exposed image, a task that requires restoration and enhancement of non-clipped pixels to maximize visibility and color accuracy, as well as reconstruction strategies for regions where the signal has been clipped. ReExposeNet is a fast and small CNN exposure correction model, capable of synthesizing substantial clipped parts in high-resolution images. It combines aspects of both U-Nets and Context Aggregation Networks (CANs). ReExposeNet relies on supervised training with a custom content-based objective function to maximize restoration and reconstruction in clipped areas. It has been adjusted on both synthetic and real miss-exposed images from three different datasets. ReExposeNet is released as a one-size-fits-all solution, which can be consistently applied to a wide range of image miss-exposure levels. For the present work, we used the model as released by its authors, without further fine-tuning.
To restore images damaged by noise, we used the DnCNN-3 model \cite{zhang2017beyond}. DnCNN-3 is a very deep feed-forward denoising convolutional neural network. It relies on residual learning and batch normalization to speed up the training process as well as boost the denoising performance. Zhang \etal claim to provide a single DnCNN model to tackle several general image denoising tasks, such as blind Gaussian denoising, single-image super-resolution, and JPEG image deblocking. The authors show that the DnCNN model not only exhibits high effectiveness in several general image denoising tasks, but can also be efficiently implemented by benefiting from GPU computing, which makes it adequate for real-time applications.
Gamma Power Transformation is a nonlinear operation used to encode and decode luminance values in imaging systems. It is used to adjust and compensate for the response of some luminance levels in the input image. We use Gamma Power Transformation to mimic the conditions observed in under-exposed and over-exposed images as $\hat{I} = I^\gamma$. The power transformation is followed by min-max normalization in order to adjust pixel values to a valid representation range. This transformation results in a loss of data in dark regions, when $\gamma > 1$, or in bright and washed-out regions, when $\gamma < 1$. For simulation purposes, we used $\gamma = [\frac{1}{4}; \frac{1}{6}; \frac{1}{8}; 4; 6; 8]$.
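As a concrete illustration, the gamma distortion above can be reproduced in a few lines of NumPy. This is a minimal sketch, not the released experiment code; the function name and the float-image-in-[0, 1] convention are our own assumptions:

```python
import numpy as np

def gamma_distort(img, gamma):
    """Simulate miss-exposure via the power transformation I_hat = I ** gamma.

    gamma > 1 darkens the image (under-exposure, detail lost in shadows);
    gamma < 1 brightens it (over-exposure, washed-out highlights).
    Expects a float image with values in [0, 1].
    """
    out = np.power(img.astype(np.float64), gamma)
    # min-max normalization back to a valid [0, 1] representation range
    return (out - out.min()) / (out.max() - out.min() + 1e-12)

# gamma values used in our simulations
gammas = [1 / 4, 1 / 6, 1 / 8, 4, 6, 8]
```

Applying `gamma_distort(img, 4)` to a mid-gray pixel pushes it toward black, which is exactly the clipped-shadow behavior the restoration pipeline later has to undo.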
R stands for restored, which means we are using our pipelined approach.
For gamma 4, we notice a substantial improvement in accuracy with the pipeline.
We also notice that NASNet Large (purple pentagon), Inception ResNet v2 (green x), and Inception v3 (orange triangle) seem to be more robust to mild miss-exposure than the other models considered.
Small models such as MobileNet v2 seem to be less resilient to image distortion.
Gaussian Noise is randomly added to the input image. The random noise follows a normal distribution.
Shot Noise, also known as Photon or Poisson Noise, is a data-dependent noise model. A Poisson model of noise may be more appropriate than a Gaussian model in low-light conditions, where the noise is due to low photon counts.
Talbot claims that image sensor noise is dominated by Poisson statistics even at high illumination levels, a typical effect in images captured by robots.
Salt and Pepper Noise (S\&P) is an impulse noise, added to an image by setting pixels to white (pixel value equal to 255 in an 8-bit-per-channel color space) or black (pixel value equal to 0). In real applications, Salt \& Pepper noise is often associated with dead pixels in the camera's sensor array.
Details on the probability are provided in the paper.
Speckle noise originates from the coherent processing of back-scattered signals from multiple distributed points. It follows a uniform distribution.
Speckle noise in real applications is often related to environmental conditions that affect the imaging sensor during image acquisition. It is also common in medical images, as well as in active Radar images.
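The four noise models discussed above can be sketched with NumPy as follows. This is an illustrative sketch under our own assumptions (float images in [0, 1]; the default parameter values are placeholders, not the exact settings from our experiments):

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_noise(img, sigma=0.05):
    # additive, signal-independent noise drawn from a normal distribution
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def shot_noise(img, scale=60.0):
    # Poisson (photon) noise: variance grows with the signal level,
    # so it is data-dependent; `scale` plays the role of photon count
    return np.clip(rng.poisson(img * scale) / scale, 0.0, 1.0)

def salt_and_pepper(img, p=0.02):
    # impulse noise: a fraction p of pixels forced to pure black or white
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < p / 2] = 0.0        # pepper (dead pixel)
    out[mask > 1 - p / 2] = 1.0    # salt (stuck pixel)
    return out

def speckle(img, sigma=0.1):
    # multiplicative, signal-dependent noise with a uniform distribution
    n = rng.uniform(-sigma, sigma, img.shape)
    return np.clip(img + img * n, 0.0, 1.0)
```

Note the structural difference: Gaussian noise is added regardless of pixel value, while shot and speckle noise scale with the signal, which is why denoisers tuned for one model can struggle on the other.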
NASNet Large is best
Left to right: VGG, ResNet, Inception v3, Inception Resnet v2, DenseNet, NasNet Large, NasNet Mobile, MobileNet v2
That's it for today.
If you have any questions regarding the experiments, or if you'd like to share your thoughts on this matter, please get in touch.
Thank you for your attention.