Copyright © 2017 1
How Image Sensor and Video
Compression Parameters Impact
Vision Algorithms
Ilya Brailovskiy, PhD, Principal CV Engineer @ Amazon Lab126
May 2017
Disclaimer
The views expressed in the following slides do not represent the view of
Amazon.com. They are solely the opinions of the presenter, Ilya Brailovskiy,
based on trends and projections provided by the sources cited. No
representations are made regarding the completeness, timeliness,
suitability or validity of any information presented.
All images and representations are sourced in open Public Domain, they
are used for the express purpose of research and education not promoting
any business, products, services, use or individuals.
What is this about?
• Algorithms reaching human-level accuracy
• Quick check against “real-life” footage
• What can we learn from this?
• Image sizes: some theory behind object detection
• Adding video compression
• Bitrate impact
• Resolution impact
• Conclusions
Human and Face Detection Task
• Feature-based algorithms
  • Histogram of oriented gradients (HOG)
  • Viola-Jones family
  • ACF methods
  • …
• Deep Learning algorithms
  • Fast(er) R-CNN
  • YOLO
  • SSD
  • …
Human vs algorithm object detection accuracy
ImageNet Classification top-5 error (%)
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep Residual Learning for Image Recognition.” CVPR 2016
Russakovsky et al., IJCV, 2015
Let's see!
YOLO (Darknet) against
awkward 1080p @ 30 fps
footage: recall error > 35%
(output resolution reduced to 360p)
“You Only Look Once: Unified, Real-Time Object Detection” Joseph Redmon,
Santosh Divvala, Ross Girshick, and Ali Farhadi, CVPR 2016
What could be fixed?
• Improve algorithm(s)
• Design a better algorithm (for example, by increasing modalities)
• Use better/more accurate training for your models
• Provide instant, frame-level feedback from the CV algorithm to the camera
algorithm
• Similar to face/human autofocus, embed your “road sign” detection
in autofocus
• (Auto) Focus is just one example
• Auto Exposure and Auto White Balance
• Low-light behaviors (noise, colorization, etc.)
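As a rough illustration of frame-level feedback from the CV algorithm to the camera, the sketch below derives an exposure gain from the brightness inside a detected region instead of the whole frame. The function name, the target luminance of 118, and the gain limits are illustrative assumptions, not values from the talk; a real camera pipeline would expose its own controls.

```python
def roi_exposure_gain(frame, box, target_luma=118.0, min_gain=0.25, max_gain=4.0):
    """Suggest an exposure gain from the mean luminance inside a detection box.

    frame: 2D list of 8-bit luminance values (rows of pixels).
    box:   (x0, y0, x1, y1) from a detector, end-exclusive (hypothetical format).
    All defaults are illustrative assumptions.
    """
    x0, y0, x1, y1 = box
    pixels = [value for row in frame[y0:y1] for value in row[x0:x1]]
    if not pixels:
        return 1.0  # no detection area: leave exposure unchanged
    mean_luma = sum(pixels) / len(pixels)
    if mean_luma <= 0:
        return max_gain  # completely dark region: push exposure to the limit
    # Gain that would bring the region to the target, clamped to a safe range.
    return min(max_gain, max(min_gain, target_luma / mean_luma))


if __name__ == "__main__":
    dark_frame = [[59] * 8 for _ in range(8)]
    print(roi_exposure_gain(dark_frame, (2, 2, 6, 6)))
```

The same pattern could drive autofocus or white balance: measure a statistic inside the detected region and feed the correction back before the next frame.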
Camera options: sensors and lens
Camera resolutions: 1 Mp, 2 Mp, 4 Mp, etc.
It translates to Horizontal (H) pixels: 1280, 1920, 3840, etc.
Typical Field of View (FOV) options, in degrees:

Field of View   “SmartPhone” FOV   Wide FOV   Ultra-Wide FOV
Horizontal      75                 110        130
Vertical        47                 78         101
Diagonal        83                 117        136
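The horizontal, vertical, and diagonal figures are tied together by lens geometry. The sketch below assumes an ideal rectilinear (distortion-free) lens, which the slides do not state, and computes the diagonal FOV from the other two; the function name is hypothetical.

```python
import math


def diagonal_fov_deg(horizontal_fov_deg, vertical_fov_deg):
    """Diagonal FOV of an ideal rectilinear lens (an assumption, see above).

    For such a lens the half-angle tangents add in quadrature:
    tan(D/2)^2 = tan(H/2)^2 + tan(V/2)^2
    """
    tan_h = math.tan(math.radians(horizontal_fov_deg) / 2.0)
    tan_v = math.tan(math.radians(vertical_fov_deg) / 2.0)
    return math.degrees(2.0 * math.atan(math.hypot(tan_h, tan_v)))


if __name__ == "__main__":
    for h, v in [(75, 47), (110, 78), (130, 101)]:
        print(h, v, round(diagonal_fov_deg(h, v), 1))
```

Under this assumption the relation reproduces the diagonal values listed for the three lens options to within about a degree.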
Pixels per meter for different FOVs/lenses
$$2 \cdot \text{ObjectDistance} \cdot \tan\left(\frac{\text{FOV}}{2}\right) = \text{CameraWidthPixels} \cdot \frac{\text{ObjectWidth}}{\text{ObjectWidthPixels}}$$
Objects must exceed a certain
width and height in pixels to be
detected (the minimum is
~5-10x smaller for deep
learning than for
feature-based methods)
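The relation on this slide can be rearranged to give the number of pixels an object covers. The sketch below does that under the same pinhole-style geometry; the function and parameter names are illustrative, and the example numbers (a 0.5 m wide object at 5 m) are assumptions, not data from the talk.

```python
import math


def object_width_pixels(camera_width_px, hfov_deg, distance_m, object_width_m):
    """Pixels covered by an object, from the slide's relation:

    2 * distance * tan(FOV / 2) = camera_width_px * object_width / object_width_px
    """
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    return camera_width_px * object_width_m / scene_width_m


def pixels_per_meter(camera_width_px, hfov_deg, distance_m):
    """Horizontal pixel density at a given distance from the camera."""
    return object_width_pixels(camera_width_px, hfov_deg, distance_m, 1.0)


if __name__ == "__main__":
    # Hypothetical scene: 0.5 m wide object, 5 m away, 1920-pixel-wide sensor.
    for fov in (75, 110, 130):
        print(fov, round(object_width_pixels(1920, fov, 5.0, 0.5), 1))
```

Comparing the result against a detector's minimum object size shows how far away an object can be for a given lens and resolution.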
Let’s add compression
• JPEG
• JPEG2000
• H.264
• SVC
• VP8/VP9
• HEVC
• What’s next? S-HEVC? AV1?
Example of bitrate dependency
“Video quality for face detection, recognition, and tracking”, P Korshunov, WT Ooi ACM
Transactions on Multimedia Computing, Communications, and Applications
Performance drops when the
bitrate falls below a threshold:
~500 Kbps in this Viola-Jones
example
(similar results for other
algorithms, whether feature-based
or DNN)
[Figure: x-axis is bitrate; threshold marked at 500 Kbps]
Add scaling
[Diagram]
Original video → compression at bitrate B → compressed video
Original video → downsample → reduced-res video → compression at bitrate B → reduced-res compressed video → upsample
Downscaled video performance
Performance also drops below a
bitrate threshold at lower
resolutions, but that threshold
is lower: ~250 Kbps in this example.
So as long as the object size stays
above the threshold from the distance
curve, it is safe to downscale
(this can be made part of the
encoding decisions, as in
SVC/VP9/AV1)
[Figures: x-axis is bitrate in both plots; thresholds marked at 500 Kbps and 250 Kbps]
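The downscaling rule can be written as a small decision function. This is a sketch only: the 500 and 250 Kbps defaults are the thresholds from this one example, while the downscale factor of 2, the function name, and the return labels are assumptions made for illustration.

```python
def choose_encoding(bitrate_kbps, object_width_px, min_detectable_px,
                    downscale_factor=2.0,
                    full_res_threshold_kbps=500.0,
                    reduced_res_threshold_kbps=250.0):
    """Pick an encoding mode for a detection workload.

    Returns "full" when the bitrate supports full resolution, "reduced" when
    downscaling keeps the object detectable and the lower threshold is met,
    and "degraded" when detection performance is expected to drop.
    """
    if bitrate_kbps >= full_res_threshold_kbps:
        return "full"
    # Object size after downscaling must stay above the detector's minimum.
    object_after_downscale_px = object_width_px / downscale_factor
    if (bitrate_kbps >= reduced_res_threshold_kbps
            and object_after_downscale_px >= min_detectable_px):
        return "reduced"
    return "degraded"


if __name__ == "__main__":
    print(choose_encoding(800, 96, 24))
    print(choose_encoding(300, 96, 24))
    print(choose_encoding(300, 40, 24))
```

In practice the thresholds would have to be measured for the specific detector, content, and codec instead of being reused from this example.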
What is human understanding, really?
• ITU-T Recommendation P.912: “Subjective Video Quality Assessment Methods
for Recognition Tasks”
• Experiments to validate the analysis
• Test: 12 sequences, 3 of which are obvious
• Some screening was required to remove subjects who were:
• Not paying attention
• Not understanding the task
• According to the study, recognition depends on light levels
• Sunlight: recognition is 38 times better than in the dark
• Objects further apart: recognition drops
• Still objects are recognized 2.7 times better than moving ones
Summary
• Improve your models
• Train your models for your camera
• Use DNN/CNN if it’s affordable
• Build feedback loop from your CV to your camera
• Decide on FOV and resolution based on your object sizes
• Examine reduced resolution if you need to handle lower bitrates: in
many cases you might be able to get your performance back
• And be very patient with your humans and your embedded CV
algorithms ☺
Q & A
The end!
References
• Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep Residual Learning for Image Recognition.” CVPR 2016
• Russakovsky et al. “ImageNet Large Scale Visual Recognition Challenge.” IJCV, 2015
• P. Viola and M. J. Jones. “Robust Real-Time Face Detection.” International Journal of Computer Vision, Vol. 57, 2004, p. 137
• P. Dollar, R. Appel, S. Belongie, and P. Perona. “Fast feature pyramids for object detection.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 36, Issue 8, 2014, pp. 1532–1545
• P. Dollar, C. Wojek, B. Schiele, and P. Perona. “Pedestrian detection: An evaluation of the state of the art.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, Issue 4, 2012, pp. 743–761
• Shaoqing Ren, Kaiming He, Ross B. Girshick, and Jian Sun. “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.” NIPS 2015, pp. 91–99
• Joseph Redmon and Ali Farhadi. “YOLO9000: Better, Faster, Stronger.” Preprint arXiv:1612.08242, 2016
• Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. “You Only Look Once: Unified, Real-Time Object Detection.” CVPR 2016
• Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. “SSD: Single Shot MultiBox Detector.” ECCV 2016, preprint arXiv:1512.02325
• Junichi Nakamura (Editor). Image Sensors and Signal Processing for Digital Still Cameras (Optical Science and Engineering). Taylor & Francis Group, ISBN-10: 0849335450
• P. Korshunov and W. T. Ooi. “Video quality for face detection, recognition, and tracking.” ACM Transactions on Multimedia Computing, Communications, and Applications
• ITU-T Recommendation P.912: “Subjective video quality assessment methods for recognition tasks,” 03/2016. https://www.itu.int/rec/T-REC-P.912

Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 

Recently uploaded (20)

Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 

"How Image Sensor and Video Compression Parameters Impact Vision Algorithms," a Presentation from Amazon Lab126

  • 1. Copyright © 2017 1 How Image Sensor and Video Compression Parameters Impact Vision Algorithms Ilya Brailovskiy, PhD, Principal CV Engineer @ Amazon Lab126 May 2017
  • 2. Disclaimer: The views expressed in the following slides do not represent the view of Amazon.com. They are solely the opinions of the presenter, Ilya Brailovskiy, based on trends and projections provided by the sources cited. No representations are made regarding the completeness, timeliness, suitability or validity of any information presented. All images and representations are sourced in open Public Domain, they are used for the express purpose of research and education not promoting any business, products, services, use or individuals.
  • 3. What is this about?
      • Algorithms reaching human-level accuracy
      • Quick check against “real-life” footage
      • What can we learn from this?
      • Image sizes: some theory behind object detection
      • Adding video compression
      • Bitrate impact
      • Resolution impact
      • Conclusions
  • 4. Human and Face Detection Task
      • Feature-based algorithms
          • Histogram of oriented gradients (HOG)
          • Viola-Jones family
          • ACF methods
          • …
      • Deep Learning algorithms
          • Fast(er) R-CNN
          • YOLO
          • SSD
          • …
  • 5. Human vs algorithm object detection accuracy
      Figure: ImageNet classification top-5 error (%)
      Sources: Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun, “Deep Residual Learning for Image Recognition,” CVPR 2016; Russakovsky et al., IJCV, 2015
  • 6. Let’s see!
      YOLO Darknet against awkward 1080p @ 30 fps footage: recall error > 35% (output resolution reduced to 360p)
      Source: “You Only Look Once: Unified, Real-Time Object Detection,” Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi, CVPR 2016
  • 7. What could be fixed?
      • Improve algorithm(s)
      • Design a better algorithm (for example, by increasing modalities)
      • Use better/more accurate training for your models
      • Provide instant/frame-level feedback from the CV algorithm to the camera algorithm
      • Similar to face/human autofocus, embed your “road sign” detection with autofocus
      • (Auto) Focus is just one example
      • Auto Exposure and Auto White Balance
      • Low light behaviors (noise, colorization, etc.)
  • 8. Camera options: sensors and lens
      Camera resolutions: 1 Mp, 2 Mp, 4 Mp, etc. It translates to Horizontal (H) pixels: 1280, 1920, 3840, etc.
      Typical Field of View (FOV) options, in degrees:

                        “SmartPhone” FOV   Wide FOV   Ultra-Wide FOV
          Horizontal           75             110          130
          Vertical             47              78          101
          Diagonal             83             117          136
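
The horizontal, vertical, and diagonal FOV figures are tied together by lens geometry. As a minimal sketch, assuming an ideal rectilinear (distortion-free) lens, which the slide does not state, the diagonal FOV D follows from the horizontal H and vertical V via tan²(D/2) = tan²(H/2) + tan²(V/2). The function name `diagonal_fov_deg` and the `FOV_TABLE` layout are illustrative only.

```python
import math


def diagonal_fov_deg(h_fov_deg: float, v_fov_deg: float) -> float:
    """Diagonal FOV in degrees, ASSUMING an ideal rectilinear lens."""
    half_h = math.tan(math.radians(h_fov_deg) / 2)
    half_v = math.tan(math.radians(v_fov_deg) / 2)
    # The half-diagonal on the image plane is the hypotenuse of the two half-extents.
    return 2 * math.degrees(math.atan(math.hypot(half_h, half_v)))


# (horizontal, vertical, diagonal) in degrees, as listed on the slide.
FOV_TABLE = {
    "SmartPhone": (75, 47, 83),
    "Wide": (110, 78, 117),
    "Ultra-Wide": (130, 101, 136),
}

for name, (h, v, d) in FOV_TABLE.items():
    computed = diagonal_fov_deg(h, v)
    print(f"{name}: computed diagonal {computed:.1f} deg, slide lists {d} deg")
```

Under that assumption the computed diagonals land within about half a degree of the values in the table, so the three rows are mutually consistent. Real wide-angle lenses often deviate from the rectilinear model, so this is a sanity check rather than a calibration.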
  • 9. Pixel Per meter for different FOV/lenses
      $$2 \cdot \text{Object Distance} \cdot \tan\left(\frac{\text{FOV}}{2}\right) = \frac{\text{Camera Width Pixels} \cdot \text{Object Width}}{\text{Object Width Pixels}}$$
      Objects should be above certain width and height in pixels to be detected (~5-10X smaller for deep learning compared to feature based)
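
The pixels-per-meter relation on this slide can be rearranged to answer two practical questions: how many pixels wide an object appears at a given distance, and how far away it can be before it drops below a detector's minimum size. The sketch below assumes a rectilinear lens with the object centered and facing the camera; the function names, the 0.5 m object width, and the 40-pixel minimum are hypothetical values chosen for illustration, not figures from the talk.

```python
import math


def scene_width_m(distance_m: float, h_fov_deg: float) -> float:
    """Width of the scene covered by the sensor at a given distance."""
    return 2 * distance_m * math.tan(math.radians(h_fov_deg) / 2)


def object_width_pixels(camera_width_px: int, object_width_m: float,
                        distance_m: float, h_fov_deg: float) -> float:
    """Object width in pixels: the slide's relation solved for Object Width Pixels."""
    return camera_width_px * object_width_m / scene_width_m(distance_m, h_fov_deg)


def max_distance_m(camera_width_px: int, object_width_m: float,
                   min_object_px: float, h_fov_deg: float) -> float:
    """Largest distance at which the object still spans min_object_px pixels."""
    half_tan = math.tan(math.radians(h_fov_deg) / 2)
    return camera_width_px * object_width_m / (2 * min_object_px * half_tan)


if __name__ == "__main__":
    CAMERA_WIDTH_PX = 1920   # 2 Mp sensor, horizontal pixels
    OBJECT_WIDTH_M = 0.5     # hypothetical object width
    MIN_OBJECT_PX = 40       # hypothetical detector minimum, not from the talk
    for fov in (75, 110, 130):  # horizontal FOV options from the previous slide
        px = object_width_pixels(CAMERA_WIDTH_PX, OBJECT_WIDTH_M, 10.0, fov)
        reach = max_distance_m(CAMERA_WIDTH_PX, OBJECT_WIDTH_M, MIN_OBJECT_PX, fov)
        print(f"FOV {fov} deg: {px:.1f} px wide at 10 m, "
              f"reaches {MIN_OBJECT_PX} px out to {reach:.1f} m")
```

With these assumed numbers, a 0.5 m object at 10 m spans about 63 pixels on a 1920-pixel-wide sensor with a 75 degree lens, and fewer with the wider lenses, which is the trade-off between coverage and detection range.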
  • 10. Let’s add compression
      • JPEG
      • JPEG2000
      • H.264
      • SVC
      • VP8/VP9
      • HEVC
      • What’s next? S-HEVC? AV1?
  • 11. Example of bitrate dependency
      Source: P. Korshunov and W. T. Ooi, “Video quality for face detection, recognition, and tracking,” ACM Transactions on Multimedia Computing, Communications, and Applications
      Performance drops when the bitrate falls below a threshold: ~500 Kbps in this Viola-Jones example (similar results for other algorithms, whether feature-based or DNN)
      [Plot: bitrate on the x-axis, threshold marked at 500 Kbps]
  • 12. Add scaling
      [Diagram: Original video → compression at bitrate B → Compressed video; Original video → downsample → Reduced Res video → compression at bitrate B → Reduced Res compressed video → upsample]
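
The two paths in this diagram can be reproduced with an off-the-shelf encoder. The sketch below only builds `ffmpeg` command lines and prints them; it does not run anything. The choice of `ffmpeg` with `libx264`, the file names, the 500k bitrate, and the 960/1920 pixel widths are all assumptions for illustration, since the slide does not say which tools or settings were used.

```python
import shlex
from typing import List, Optional


def encode_cmd(src: str, dst: str, bitrate: str = "500k",
               width: Optional[int] = None) -> List[str]:
    """Encode src at a target video bitrate, optionally downsampling first."""
    cmd = ["ffmpeg", "-y", "-i", src]
    if width is not None:
        # scale=W:-2 keeps the aspect ratio and rounds the height to an even number.
        cmd += ["-vf", f"scale={width}:-2"]
    # -an drops audio so the whole bit budget goes to video.
    cmd += ["-c:v", "libx264", "-b:v", bitrate, "-an", dst]
    return cmd


def upsample_to_frames_cmd(src: str, frame_pattern: str, width: int) -> List[str]:
    """Decode src and upsample to PNG frames, avoiding a second lossy encode."""
    return ["ffmpeg", "-y", "-i", src, "-vf", f"scale={width}:-2", frame_pattern]


if __name__ == "__main__":
    # Path 1: compress the original at bitrate B.
    print(shlex.join(encode_cmd("original.mp4", "compressed.mp4")))
    # Path 2: downsample, compress at the same bitrate B, then upsample for the detector.
    print(shlex.join(encode_cmd("original.mp4", "reduced.mp4", width=960)))
    print(shlex.join(upsample_to_frames_cmd("reduced.mp4", "up_%05d.png", 1920)))
```

Writing the upsampled result as PNG frames is a design choice of this sketch: it keeps the comparison between the two paths limited to one lossy encode each.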
  • 13. Downscaled video performance
      There’s a threshold on performance drop for lower resolutions as well, but the threshold is lower: ~250 Kbps in this example.
      So as long as the object size is above the distance curve threshold, it’s safe to downscale (this can be made part of the encoding decisions, as in SVC/VP9/AV1).
      [Plots: bitrate on the x-axis; thresholds marked at 500 Kbps and 250 Kbps]
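
One rough way to see why a downscaled stream can tolerate a lower bitrate is to compare the bit budget per pixel. The sketch below is an illustration under assumed numbers: only the 500 Kbps and 250 Kbps thresholds come from the slide, while the 1080p frame size, the 30 fps frame rate, and the 2x downscale factor are assumptions. Bits per pixel is a crude proxy and ignores codec efficiency and scene content.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average number of encoded bits available per pixel per frame."""
    return bitrate_bps / (width * height * fps)


# Thresholds from the slides; frame sizes and frame rate are ASSUMED for illustration.
full_res = bits_per_pixel(500_000, 1920, 1080, 30)
half_res = bits_per_pixel(250_000, 960, 540, 30)

print(f"full resolution @ 500 Kbps: {full_res:.4f} bits/pixel")
print(f"half resolution @ 250 Kbps: {half_res:.4f} bits/pixel")
print(f"ratio: {half_res / full_res:.1f}x")
```

Halving each dimension cuts the pixel count by four, so even at half the bitrate the downscaled stream has twice the bits per pixel in this example. Whether the objects remain large enough to detect after downscaling is a separate check against the minimum object size.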
  • 14. What is human understanding, really?
      • ITU-T Recommendation P.912: “Subjective Video Quality Assessment Methods for Recognition Tasks”
      • Experiments to validate the analysis
      • Test: 12 sequences, 3 of the 12 sequences are obvious
      • Some screening was required to remove subjects:
      • Not paying attention
      • Not understanding the task
      • Based on the study, recognition depends on light levels
      • Sunlight: recognition is 38 times better than in the dark
      • Recognition drops for objects further apart
      • Still objects are recognized 2.7 times better than moving ones
  • 15. Summary
      • Improve your models
      • Train your models for your camera
      • Use DNN/CNN if it’s affordable
      • Build feedback loop from your CV to your camera
      • Decide on FOV and resolution based on your object sizes
      • Examine reduced resolution if you need to handle lower bitrates: in many cases you might be able to get your performance back
      • And be very patient with your humans and your embedded CV algorithms ☺
  • 16. Q & A. The end!
  • 17. References
      • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep Residual Learning for Image Recognition.” CVPR 2016.
      • Russakovsky et al. “ImageNet Large Scale Visual Recognition Challenge.” IJCV, 2015.
      • P. Viola and M. J. Jones. International Journal of Computer Vision, 2004, 57: 137.
      • P. Dollár, R. Appel, S. Belongie, and P. Perona. “Fast feature pyramids for object detection.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 36, Issue 8, 2014, pp. 1532–1545.
      • P. Dollár, C. Wojek, B. Schiele, and P. Perona. “Pedestrian detection: An evaluation of the state of the art.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, Issue 4, 2012, pp. 743–761.
      • Shaoqing Ren, Kaiming He, Ross B. Girshick, and Jian Sun. “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.” NIPS 2015, pp. 91–99.
      • Joseph Redmon and Ali Farhadi. “YOLO9000: Better, Faster, Stronger.” Preprint arXiv:1612.08242, 2016.
      • Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. “You Only Look Once: Unified, Real-Time Object Detection.” CVPR 2016.
      • Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. “SSD: Single Shot MultiBox Detector.” ECCV 2016; preprint arXiv:1512.02325.
      • Junichi Nakamura (Editor). Image Sensors and Signal Processing for Digital Still Cameras (Optical Science and Engineering). Taylor & Francis Group. ISBN-10: 0849335450.
      • P. Korshunov and W. T. Ooi. “Video quality for face detection, recognition, and tracking.” ACM Transactions on Multimedia Computing, Communications, and Applications.
      • ITU-T Recommendation P.912: “Subjective video quality assessment methods for recognition tasks,” 03/2016. https://www.itu.int/rec/T-REC-P.912