1) The document evaluates how state-of-the-art convolutional neural networks (CNNs) perform on image recognition tasks when images are exposed to different types of noise, distortions and compression.
2) It finds that while CNN models are robust to mild exposure issues and noise, performance decreases significantly under moderate to severe exposure problems and salt and pepper noise.
3) Larger CNN models like NASNet Large perform best, while smaller mobile models are most affected by distortions. The study aims to improve CNN robustness and build image processing pipelines to handle faulty data.
A pipelined approach to deal with image distortion in computer vision - BRACI... (Cristiano Rafael Steffens)
Image classification is a well-established problem in computer vision. Most state-of-the-art models rely on Convolutional Neural Networks to achieve near-human performance in that task. However, CNNs have been shown to be susceptible to image manipulation, which undermines the trustworthiness of perception systems. This property is critical, especially in unmanned systems, autonomous vehicles, and scenarios where light cannot be controlled. We investigate the robustness of several Deep Learning-based image recognition models and how their accuracy is affected by several distinct image distortions. The distortions include ill-exposure, low-range image sensors, and common noise types. Furthermore, we also propose and evaluate an image pipeline designed to minimize image distortion before the image classification is performed. Results show that most CNN models are marginally affected by mild mis-exposure...
Improving Hardware Efficiency for DNN Applications (Chester Chen)
Speaker: Dr. Hai (Helen) Li is the Clare Boothe Luce Associate Professor of Electrical and Computer Engineering and Co-director of the Duke Center for Evolutionary Intelligence at Duke University
In this talk, I will introduce a few recent research spotlights from the Duke Center for Evolutionary Intelligence. The talk will start with the structured sparsity learning (SSL) method, which attempts to learn a compact structure from a bigger DNN to reduce computation cost. It generates a regularized structure with high execution efficiency. Our experiments on CPU, GPU, and FPGA platforms show on average a 3~5 times speedup of the convolutional layer computation of AlexNet. Then, the implementation and acceleration of DNN applications on mobile computing systems will be introduced. MoDNN is a local distributed system which partitions DNN models onto several mobile devices to accelerate computations. ApesNet is an efficient pixel-wise segmentation network, which understands road scenes in real time and has achieved promising accuracy. Our prospects on the adoption of emerging technology will also be given at the end of this talk, offering the audience an alternative way of thinking about the future evolution and revolution of modern computing systems.
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2017-embedded-vision-summit-brailovskiy
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Ilya Brailovskiy, Principal Engineer at Amazon Lab126, presents the "How Image Sensor and Video Compression Parameters Impact Vision Algorithms" tutorial at the May 2017 Embedded Vision Summit.
Recent advances in deep learning algorithms have brought automated object detection and recognition to human accuracy levels on various test datasets. But algorithms that work well on an engineer’s PC often fail when deployed as part of a complete embedded system. In this talk, Brailovskiy examines some of the key embedded vision system elements that can degrade the performance of vision algorithms.
For example, in many systems video is compressed, transmitted, and then decompressed before being presented to vision algorithms. Not surprisingly, video encoding parameters, such as bit rate, can have a significant impact on vision algorithm accuracy. Similarly, image sensor parameters can have a profound effect on the nature of the images captured, and therefore on the performance of vision algorithms. He explores how image sensor and video compression parameters impact vision algorithm performance, and discusses methods for selecting the best parameters to aid vision algorithm accuracy.
Steer and/or sink the supertanker by Andrew Rendell (Valtech UK)
Andrew Rendell (Principal Consultant at Valtech) presents "Steer and/or sink the supertanker".
This presentation was given at the SPA conference on 14 June 2011.
A case study into the pros and cons of over three years' experience of continuous source code analysis, followed by an interactive session using the real tool on real source code.
Andrew discusses what continuous inspection is and why directing software development can feel like trying to steer a super tanker.
Recent studies on the robustness of Convolutional Neural Networks (CNNs) show that CNNs are highly vulnerable to adversarial attacks. Meanwhile, smaller CNN models with no significant accuracy loss are being introduced for mobile devices. However, only the accuracy on standard datasets is reported along with such research. The wide deployment of smaller models on millions of mobile devices stresses the importance of their robustness. In this research, we study how robust such models are with respect to state-of-the-art compression techniques such as quantization.
06 13sept 8313 9997-2-ed an adaptive (edit lafi) (IAES IJEECS)
A robust Adaptive Reconstruction Error Minimization Convolutional Neural Network (ARemCNN) architecture is introduced to provide high reconstruction quality from low resolution using a parallel configuration. Our proposed model can easily train on bulky datasets such as YUV21 and Videoset4. Our experimental results show that our model outperforms many existing techniques in terms of PSNR, SSIM and reconstruction quality. The average PSNR is 39.81 for upscale-2, 35.56 for upscale-3, and 33.77 for upscale-4 on the Videoset4 dataset, which is very high in contrast to other existing techniques. Similarly, the average PSNR is 38.71 for upscale-2, 34.58 for upscale-3, and 33.047 for upscale-4 on the YUV21 dataset.
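For reference, the PSNR figures quoted above follow the standard definition for 8-bit images; the snippet below is a minimal sketch of that definition, not code from the ARemCNN paper.

```python
import numpy as np

def psnr(reference, reconstructed, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two 8-bit images, in dB."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```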
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2021/10/a-highly-data-efficient-deep-learning-approach-a-presentation-from-samsung/
Patrick Bangert, Vice President of AI at Samsung, presents the “Highly Data-Efficient Deep Learning Approach” tutorial at the May 2021 Embedded Vision Summit.
Many applications, such as medical imaging, lack the large amounts of data required for training popular CNNs to achieve sufficient accuracy. Often, these same applications suffer from an imbalanced class distribution problem that negatively impacts model accuracy. In this talk, Bangert proposes a highly data-efficient methodology that can achieve the same level of accuracy using significantly fewer labeled images and is insensitive to class imbalance.
The approach is based on a training pipeline with two components: a CNN trained in an unsupervised setting for image feature representation generation, and a multiclass Gaussian process classifier, trained in active learning cycles, using the image representations with labels. Bangert demonstrates his company’s approach with a COVID-19 chest X-ray classifier solution where data is scarce and highly imbalanced. He shows that the approach is insensitive to class imbalance and achieves comparable accuracy to prior approaches while using only a fraction of the training data.
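The abstract describes, but does not publish, a two-stage pipeline: an unsupervised CNN that produces image feature vectors, and a multiclass Gaussian process classifier trained in active-learning cycles. A hedged scikit-learn sketch of that second stage might look as follows; the feature matrix, label array, query budget, and least-confidence acquisition rule are all assumptions made for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

# Placeholder inputs: X_pool holds (N, D) feature vectors produced by a CNN
# trained without labels; y_pool holds labels revealed only when queried.
def active_learning_loop(X_pool, y_pool, n_cycles=5, batch=20, seed=0):
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X_pool), size=batch, replace=False))  # seed set
    for _ in range(n_cycles):
        gpc = GaussianProcessClassifier().fit(X_pool[labeled], y_pool[labeled])
        proba = gpc.predict_proba(X_pool)
        uncertainty = 1.0 - proba.max(axis=1)   # least-confident sampling
        uncertainty[labeled] = -np.inf          # never re-query labeled points
        labeled += list(np.argsort(uncertainty)[-batch:])
    return gpc, labeled
```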
SafeguardAI and Surprise Based Learning -- Protect your AI solutions from Uni... (NAVER Engineering)
Speaker: Bongkyun Ryu (CEO, EpiSys Science, Inc.)
Date: March 2018
In this talk, I will introduce Surprise Based Learning (SBL), a novel machine learning algorithm founded on the concepts of Complementary Discrimination Learning first introduced by Dr. Wei-Min Shen and Nobel laureate professor Herbert Simon. In contrast to most competitive learning algorithms which focus on structured learning (e.g., Bayesian Networks, Hidden Markov Models) or parameter learning (e.g., Neural Networks, Deep Learning) SBL offers the best of both worlds, meaning that it is capable of learning both the structure (i.e., number of states) and the parameters (i.e., input/output correlation) of a system.
EpiSys Science (EpiSci) is adapting SBL for several application domains. One of them is SafeguardAI, which identifies when what the DL model observes is unfamiliar, and what is unfamiliar about it, and communicates to a human supervisor: 'I'm not sure what to do.' The key insight is to embed a set of well-positioned intelligent agents inside the neural nets during the DL training process. These agents continuously live inside the trained DL model at runtime and report out-of-distribution inputs as "surprises" or unusual behaviors. DL solutions can use these intelligent agents to safeguard their decision-making process.
I will present several experimental results that demonstrate the effectiveness of SafeguardAI, and discuss other areas of application.
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGES (PNandaSai)
Digital image processing is a vast field with many applications, including criminal face detection, fingerprint authentication systems, medical imaging, and object recognition. Brain tumor detection plays an important role in the medical field: it is the detection of the tumor-affected part of the brain along with its shape, size, and boundary, which makes it useful in clinical practice.
Segmentation and the subsequent quantitative assessment of lesions in medical images provide valuable information for neuropathological analysis and are important for planning treatment strategies, monitoring disease progression, and predicting patient outcome. For a better understanding of the pathophysiology of diseases, quantitative imaging can reveal clues about disease characteristics and their effects on particular anatomical structures.
29 SEPTEMBER 2021 – Aula Magna – Corso Duca degli Abruzzi, 24 – Politecnico di Torino
Research, technology transfer, and support for companies on the fundamental topics of big data, artificial intelligence, robotics, and the digital revolution
Virtual Retinal Display: their falling cost and rising performance (Jeffrey Funk)
These slides use concepts from my (Jeff Funk) course entitled analyzing hi-tech opportunities to analyze the increasing economic feasibility of virtual retinal displays. These displays focus light on a person’s retina using LEDs, digital micro-mirrors and lenses, which are all encased in a head-set about the size of glasses. They enable high resolution 3D video images with a large field of view that are far superior to existing displays. Rapid improvements in LEDs and digital micro-mirrors (one type of MEMS) are enabling these displays to experience rapid reductions in cost and improvements in performance.
Signal with amplitude outside the range accepted by the sensor
Damaged image in, restored image out
Post-processing of images damaged at the moment of acquisition
sRGB Color Space
Restoration with aesthetic purposes
What we expect: color correction; texture; edges / lines / image gradient; structures
Modeling based on convolutional neural networks
More Related Content
Similar to Can Exposure, Noise and Compression affect Image Recognition? An Assessment of the Impacts on State-of-the-art ConvNets
Vision-Based System for Welding Groove Measurements for Robotic Welding Appli... (Cristiano Rafael Steffens)
BQ Leonardo, CR Steffens, SC Silva Fil., JL Mór, V Hüttner, EA Leivas, VS Rosa and SSC Botelho
Center of Computer Science, Federal University of Rio Grande, Brazil
Welding Groove Mapping: Image Acquisition and Processing on Shiny Surfaces - ... (Cristiano Rafael Steffens)
We propose a Vision-Based Measurement (VBM) system and evaluate how different algorithms impact the results. The proposed system joins hardware and software to image the welding plates using a single CMOS camera, run computer vision algorithms and control the welding equipment. A complete prototype, using a commercial linear welding robot is presented.
Authors: Cristiano R. Steffens, Bruno Q. Leonardo, Sidnei Carlos S. Filho, Valquiria Hüttner, Vagner S. Rosa, Silvia Silva C. Botelho
11th International Conference on Computer Vision Theory and Applications - VISAPP 2016
A quick introduction to OpenCV presenting only the essentials. This presentation goes straight to the point, with examples to start programming right away. All algorithms were tested with version 2.4.10 of the library. Code comments are in Brazilian Portuguese.
A flame detection system that uses only spatial data to detect fire with hand-held cameras. It reviews the works of Phillips (2002), Chen (2004), Celik (2007/2008/2009), Borges (2010) and Chenebert (2011), and uses Random Forests (Breiman, 2001) for region extraction and classification.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt... (Sérgio Sacani)
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes on Io’s surface have been monitored from both spacecraft and ground-based telescopes. Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images show that a plume deposit from a powerful eruption at Pillan Patera has covered part of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive optics at visible wavelengths.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ... (Sérgio Sacani)
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters spanning 0.4−0.9 µm) and novel JWST images with 14 filters spanning 0.8−5 µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data at > 2.3 µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and 30.3-31.0 AB mag (5σ, r = 0.1" circular aperture) in individual filters. We measure photometric redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts z = 11.5−15. These objects show compact half-light radii of R_1/2 ∼ 50−200 pc, stellar masses of M⋆ ∼ 10^7−10^8 M⊙, and star-formation rates of SFR ∼ 0.1−1 M⊙ yr^−1. Our search finds no candidates at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to infer the properties of the evolving luminosity function without binning in redshift or luminosity that marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results, and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5 from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical models for evolution of the dark matter halo mass function.
Multi-source connectivity as the driver of solar wind variability in the heli... (Sérgio Sacani)
The ambient solar wind that fills the heliosphere originates from multiple sources in the solar corona and is highly structured. It is often described as high-speed, relatively homogeneous, plasma streams from coronal holes and slow-speed, highly variable, streams whose source regions are under debate. A key goal of ESA/NASA’s Solar Orbiter mission is to identify solar wind sources and understand what drives the complexity seen in the heliosphere. By combining magnetic field modelling and spectroscopic techniques with high-resolution observations and measurements, we show that the solar wind variability detected in situ by Solar Orbiter in March 2022 is driven by spatio-temporal changes in the magnetic connectivity to multiple sources in the solar atmosphere. The magnetic field footpoints connected to the spacecraft moved from the boundaries of a coronal hole to one active region (12961) and then across to another region (12957). This is reflected in the in situ measurements, which show the transition from fast to highly Alfvénic then to slow solar wind that is disrupted by the arrival of a coronal mass ejection. Our results describe solar wind variability at 0.5 au but are applicable to near-Earth observatories.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a... (Ana Luísa Pinho)
Functional Magnetic Resonance Imaging (fMRI) provides a means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich in features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
2. OUTLINE
Evaluating image recognition models beyond validation sets
• Perception / Vision is an important component of modern autonomous systems
• CNNs hold the state-of-the-art in image recognition
• Growing interest in reliability / robustness
• Comprehensive assessment
• Clear methodology
• State-of-the-art models
• Several types of distortion
• Further directions
• How can we build better models?
• Can we prevent systems from operating on faulty data?
• Can we build better pipelines?
3. MOTIVATION
Under-exposure conditions
• Weakly illuminated scenes
• Time constraints (i.e. the robot depends on the image acquisition/processing to make a decision)
• Scenes with high dynamic ranges
• Small aperture (hardware construction)
• Low quality/cost sensors
[Example images: properly exposed, low range, gamma 2, gamma 4, gamma 8]
5. MOTIVATION
Over-Exposure
• Scene with high dynamic range
• Ill-adjusted optics/gain
• Time constraints
• Reflective surfaces
• Low dynamic range sensors
[Example images: properly exposed, low range, gamma 1/2, gamma 1/4, gamma 1/8; a simulation sketch follows below]
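The slides show rendered examples rather than code. As a rough illustration only, the gamma-based under- and over-exposure and the low-range sensor effect listed above can be simulated along the following lines; the function names and the [64, 191] range are assumptions, not the authors' exact settings.

```python
import numpy as np

def gamma_misexposure(img, gamma):
    """Simulate ill-exposure on an 8-bit RGB image.
    gamma > 1 darkens the image (under-exposure), gamma < 1 brightens it (over-exposure)."""
    x = img.astype(np.float32) / 255.0                      # normalize to [0, 1]
    return np.clip(255.0 * np.power(x, gamma), 0, 255).astype(np.uint8)

def low_range(img, lo=64, hi=191):
    """Simulate a low dynamic-range sensor by squeezing all intensities into [lo, hi]."""
    x = img.astype(np.float32) / 255.0
    return (lo + x * (hi - lo)).astype(np.uint8)

# The distortion levels shown on the slides:
# under = [gamma_misexposure(img, g) for g in (2, 4, 8)]
# over  = [gamma_misexposure(img, 1.0 / g) for g in (2, 4, 8)]
```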
6. MOTIVATION
Lossy Compression, Poisson, Gaussian, Salt & Pepper and Speckle Noise
• Bandwidth limitation
• Storage limitation
• Sensor quality
• Dead pixels (always off or always on)
• Wear and tear
• Dust, damage on lens and sensors, noise
[Example images: over-compression, Poisson noise, Gaussian noise, salt & pepper, speckle noise; a simulation sketch follows below]
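Again as an illustration only (not the authors' published code), the noise and over-compression conditions above can be approximated with NumPy and Pillow roughly as follows; the noise strengths and the JPEG quality value are placeholder choices.

```python
import io
import numpy as np
from PIL import Image

rng = np.random.default_rng(0)

def gaussian_noise(img, sigma=25):
    noisy = img.astype(np.float32) + rng.normal(0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def salt_and_pepper(img, amount=0.05):
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < amount / 2] = 0          # pepper: pixels stuck off
    out[mask > 1 - amount / 2] = 255    # salt: pixels stuck on
    return out

def speckle_noise(img, sigma=0.2):
    x = img.astype(np.float32)
    return np.clip(x + x * rng.normal(0, sigma, img.shape), 0, 255).astype(np.uint8)

def poisson_noise(img):
    return np.clip(rng.poisson(img.astype(np.float32)), 0, 255).astype(np.uint8)

def over_compress(img, quality=5):
    """Round-trip the image through a very low-quality JPEG encode."""
    buf = io.BytesIO()
    Image.fromarray(img).save(buf, format="JPEG", quality=quality)
    return np.array(Image.open(buf))
```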
8. PROCEDURE
A procedure that can be reproduced and used for any vision task
• We use pre-trained image recognition models
• No fine-tuning
• Exact same preprocessing as in the original implementation
• Official ImageNet validation set
• 1000 classes
• 50 images per class
• Inference on:
• Original set (to avoid hardware-related, interpolation, and other biases)
• 8 levels of mis-exposure
• Over-compressed images
• 4 types of typical noise
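A hedged sketch of the inference step in this procedure, using a pre-trained model from tf.keras.applications as an example; the `distort()` helper, the validation-set iterator, and `record()` are placeholders for the parts the slides do not spell out.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_resnet_v2 import (
    InceptionResNetV2, decode_predictions, preprocess_input)

model = InceptionResNetV2(weights="imagenet")   # pre-trained, no fine-tuning

def classify(img_uint8):
    """Run one 8-bit RGB image through the model's original preprocessing."""
    x = tf.image.resize(img_uint8, (299, 299)).numpy()    # model's native input size
    x = preprocess_input(x[np.newaxis, ...])
    preds = model.predict(x, verbose=0)
    return decode_predictions(preds, top=1)[0][0]          # (wnid, label, score)

# Placeholder evaluation loop over the official ImageNet validation set
# (1000 classes x 50 images) and the distortion levels described above:
# for img, true_wnid in imagenet_validation_set:
#     for name, distorted in distort(img):   # mis-exposure, noise, compression
#         wnid, label, score = classify(distorted)
#         record(true_wnid, wnid, name)
```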
11. RESULTS - INCEPTION-RESNET-V2
Overall good performance. Robust towards mild mis-exposure, compression, Gaussian and Poisson noise
FNs are limited to 50 due to the validation dataset properties; there is no upper bound for FPs.
Statistics are per class: a median of 10 means that 50% of the classes in the dataset presented 10 or fewer false negatives.
What is more important? Would you rather overrun a person due to a FN or stop in the middle of the road due to a FP?
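For readers who want to reproduce these per-class statistics, here is an illustrative computation (not taken from the paper); `y_true` and `y_pred` are hypothetical arrays of class indices for the 50,000 validation images.

```python
import numpy as np

def per_class_fn_fp(y_true, y_pred, num_classes=1000):
    """Count false negatives and false positives per class from top-1 predictions."""
    fn = np.zeros(num_classes, dtype=int)
    fp = np.zeros(num_classes, dtype=int)
    for t, p in zip(y_true, y_pred):
        if t != p:
            fn[t] += 1   # the true class missed this sample (capped at 50 per class)
            fp[p] += 1   # the predicted class gained a spurious hit (unbounded)
    return fn, fp

# fn, fp = per_class_fn_fp(y_true, y_pred)
# print("median FN per class:", np.median(fn))  # e.g. 10 -> half the classes
#                                               # have 10 or fewer false negatives
```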
12. RESULTS - MOBILENETV1
No robustness to S&P and Speckle Noise. Highly affected by moderate mis-exposure.
13. RESULTS - NASNET LARGE
Best accuracy, precision, and F1-Score among all models considered in this study
14. RESULTS - NASNET MOBILE
Significantly affected by severe mis-exposure conditions, S&P, and Speckle noise
16. RESULTS – XCEPTION
Robust towards moderate mis-exposure, over-compression, Gaussian and Poisson noise
17. CONCLUSION
New is always better! Larger is better!
• Relevant
• Autonomous systems
• Robotics
• Applications that rely on visual perception
• Comprehensive experiment
• Broad set of classifiers
• Based on standard ILSVRC validation set
• Poor exposure
• Heavy compression
• Signal-independent noise
• Signal-dependent noise
• Reproducible procedure
• Objective evaluation
• No human bias
18. CONCLUSION
New is always better! Larger is better!
• Most models are:
• Little affected by mild mis-exposure
• Robust towards Poisson and Gaussian noise
• Critically affected by moderate to severe mis-exposure
• Critically affected by S&P and Speckle noise
• CNNs are evolving
• Modern architectures, such as NASNet, Inception-ResNet-v2 and Xception, are more robust
• VGG is among the least robust
• Large models are better
• Not you, VGG!
• NASNet Large performs significantly better than its Mobile version (while both share the same building blocks)
• Mobile models are most affected
19. ONGOING AND FUTURE WORK
We have a real issue! How can we solve it?
• Could the models’ accuracy be improved by adding these common distortions at training time?
😞 Preliminary results show only a small improvement
• Can we build image processing pipelines which protect the application from failing due to faulty data?
😃 Absolutely! Preliminary results are promising 👉
• Can we prevent ill-exposure in mobile/outdoor robotics?
⏳ Future Work
• Can we improve classification models by putting more emphasis on image classes that are more prone to error?
⏳ Future Work
[Example images: damaged input, restored output, original]
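The slides only hint at what the restoration pipeline does (exposure correction, denoising, recovering edges and structure). The following is an illustrative pre-classification stage assembled from standard OpenCV blocks, not the authors' published pipeline; the gamma heuristic and CLAHE parameters are assumptions.

```python
import cv2
import numpy as np

def restore(img_bgr):
    """Illustrative restoration stage applied before handing an image to the classifier."""
    # 1) Suppress impulse-like noise (salt & pepper) with a median filter.
    img = cv2.medianBlur(img_bgr, 3)

    # 2) Rough exposure correction: pick a gamma that moves the mean
    #    intensity toward mid-gray.
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    mean = max(float(gray.mean()), 1e-3)
    gamma = np.log(0.5) / np.log(mean)
    x = (img.astype(np.float32) / 255.0) ** gamma

    # 3) Local contrast boost on the luminance channel (CLAHE).
    lab = cv2.cvtColor((x * 255).astype(np.uint8), cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    return cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)

# restored = restore(damaged)   # then feed `restored` to the classification model
```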