SlideShare a Scribd company logo
1 of 24
Download to read offline
© 2020 StreamLogic
Can You See What I See? The
Power of Deep Learning
Scott Thibault, Ph.D.
StreamLogic
© 2020 StreamLogic
Presentation Agenda
• Image Classification
• Answering questions about an image as a whole
• Object Detection
• Locating objects within an image
• Embeddings
• Measuring semantic similarity
• Conclusions
2
Three major vision tasks solved using deep learning:
© 2020 StreamLogic
Image Classification
3
© 2020 StreamLogic
Image Classification
Answers the question for the whole image:
1. Binary classification – Is this image X
(yes/no)?
2. Multi-class classification – Is this image X,
Y, or Z?
3. Multi-label classification – What labels X,
Y, and Z describe this image?
4
Mask (0.86)
No Mask (0.06)
© 2020 StreamLogic
Multi-Class Classification
5
Age classification output
Class Probability
0-2 0.000011
4-6 0.000008
8-13 0.000187
15-20 0.001444
25-32 0.313656
38-43 0.683580
48-53 0.001012
60+ 0.000101
© 2020 StreamLogic
Multi-Label Classification
6
Multi-label classification output
Label Probability
Smooth 0.00021
Rough 0.01332
Furry 0.87412
White 0.00340
Black 0.91021
Yellow 0.00013
Green 0.00002
… …
© 2020 StreamLogic
Applications of Image Classification
• Medical diagnosis
• Marketing profiles
• Smart camera event capture
• Quality control
• Policy verification
7
© 2020 StreamLogic
Pre-Trained Image Classification Models
• ImageNet (ILSVRC) – 1000 classes
• COCO – 80 classes
• Pascal VOC – 20 classes
• Stanford Cars
• Places365
• Demographics (gender/age/race)
• Emotions (facial expression)
• Activities, e.g. sports
• Plants
8
Places365
Public datasets with pre-trained models:
Type of environment: outdoor
Scene categories: beach (0.301), coast (0.214), beach_house (0.109), lagoon (0.108)
Scene attributes: natural light, open area, far-away horizon, sunny, natural, warm, boating,
dirt, clouds
© 2020 StreamLogic
Training Data for Image Classification
• Binary classification
• For each image: one binary value (yes=1, no=0)
• Multi-class classification (“One-hot” encoding)
• For each image: binary vector of length N with exactly one 1
• Multi-label classification
• For each image: binary vector of length N with any number of 1s
9
© 2020 StreamLogic
Image Classification Accuracy
10
• ImageNet Benchmark
• > 1M training images
• 1000 classes
• Top-N Accuracy – correct class is in N
highest class probabilities
• Model size affects accuracy
• Model size affects speed
How do we compare CNN architectures?
S. Bianco, R. Cadene, L. Celona and P. Napoletano, "Benchmark
Analysis of Representative Deep Neural Network Architectures," in
IEEE Access, vol. 6, pp. 64270-64277, 2018
© 2020 StreamLogic
Object Detection
© 2020 StreamLogic
Object Detection
• Object detection combines
classification with localization for
multiple instances.
• Outputs 0-N bounding boxes and
class scores for each box.
12
© 2020 StreamLogic
Applications of Object Detection
• Surveillance
• Counting (cars, people, …)
• + identification = recognition
(faces, license plates, …)
• + tracking = behavior
analysis (hot spots,
boundary crossing, package
exchange, …)
• Autonomous vehicles
13
© 2020 StreamLogic
Pre-Trained Object Detection Models
• COCO – 80 classes
• Pascal VOC – 20
• Pedestrians, faces, hands
• Cars, Bikes, Trains, …
• Street signs
• License plates
• Text
14
Public datasets with pre-trained models:
© 2020 StreamLogic
Training Data for Object Detection
• Labor intensive
• Labeling tool required
• Outline bounding box with mouse
• Label boxes with class
• Export and translate to suitable
format
15
© 2020 StreamLogic
Object Detector Design Choices
• Backbone (feature extractor)
• Meta-architecture (SSD, Yolo,
Faster R-CNN)
• Number of proposals
• Resolution
• Training datasets
16
Huang, Jonathan, et al. "Speed/accuracy trade-offs for modern convolutional
object detectors." Proceedings of the IEEE conference on computer vision and
pattern recognition. 2017.
© 2020 StreamLogic
Embeddings
© 2020 StreamLogic
Embeddings
An embedding is a translation of a high-
dimensional input to a low-dimensional feature
vector that:
• captures semantics of the input, and
• preserves semantic similarity.
18
© 2020 StreamLogic
MNIST Digits Example
• 70,000 hand-written digits
• 28x28 grayscale images
• Each image is a point in ℝ784
19
MNIST handwritten digit dataset
© 2020 StreamLogic
MNIST 1-D Embedding
20
Mean(image): ℝ784 → ℝ1
© 2020 StreamLogic
MNIST 2-D Embedding
21
Mean(image),Curvature(image): ℝ784 → ℝ2
© 2020 StreamLogic
Facial Image Recognition
• Deep Neural Network
• Face image → ℝ128
• Triplet loss function:
• Anchor image
• Positive image
• Negative image
FaceNet Embedding:
© 2020 StreamLogic
Conclusions
• Image Classification
• 3 types: binary, multi-class, and multi-label
• Answers questions about the whole image
• Objection Detection
• Identifies multiple objects in an image and their locations
• Major uses include counting, identification and object tracking
• Embeddings
• Represents semantics of the input as a vector of real numbers
• Used in facial image recognition technology
23
© 2020 StreamLogic
Resources
Papers
S. Bianco et al. 2018, Benchmark Analysis of
Representative Deep Neural Network Architectures
https://arxiv.org/abs/1810.00736
J. Huang et al. 2017, Speed/accuracy trade-offs for
modern convolutional object detectors
https://arxiv.org/abs/1611.10012
F. Schroff et al. 2015, FaceNet: A Unified Embedding for
Face Recognition and Clustering
https://arxiv.org/abs/1503.03832
Pre-trained models
Tensorflow
https://tfhub.dev/
PyTorch
https://pytorch.org/hub/
Caffe
https://github.com/BVLC/caffe/wiki/Model-Zoo
Contact Information
thibault@streamlogic.io
https://streamlogic.io
24

More Related Content

What's hot

Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Oswald Campesato
 
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...Edge AI and Vision Alliance
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Deep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical MethodologyDeep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical MethodologyJason Tsai
 
“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...
“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...
“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...Edge AI and Vision Alliance
 
Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018
Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018
Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)Alexander Korbonits
 
MLIP - Chapter 5 - Detection, Segmentation, Captioning
MLIP - Chapter 5 - Detection, Segmentation, CaptioningMLIP - Chapter 5 - Detection, Segmentation, Captioning
MLIP - Chapter 5 - Detection, Segmentation, CaptioningCharles Deledalle
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Seonho Park
 
#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language Processing#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language ProcessingBerlin Language Technology
 
Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry: Statisti...
Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry:Statisti...Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry:Statisti...
Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry: Statisti...Takayuki Nishio
 
Lecture3 xing fei-fei
Lecture3 xing fei-feiLecture3 xing fei-fei
Lecture3 xing fei-feiTianlu Wang
 
Promises of Deep Learning
Promises of Deep LearningPromises of Deep Learning
Promises of Deep LearningDavid Khosid
 
Unsupervised Feature Learning
Unsupervised Feature LearningUnsupervised Feature Learning
Unsupervised Feature LearningAmgad Muhammad
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningMarc Bolaños Solà
 

What's hot (20)

Backpropagation for Neural Networks
Backpropagation for Neural NetworksBackpropagation for Neural Networks
Backpropagation for Neural Networks
 
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
 
Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)
 
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
 
Deeplearning in finance
Deeplearning in financeDeeplearning in finance
Deeplearning in finance
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Deep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical MethodologyDeep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical Methodology
 
“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...
“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...
“From Inference to Action: AI Beyond Pattern Recognition,” a Keynote Presenta...
 
Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018
Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018
Reinforcement Learning (Reloaded) - Xavier Giró-i-Nieto - UPC Barcelona 2018
 
Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)
 
MLIP - Chapter 5 - Detection, Segmentation, Captioning
MLIP - Chapter 5 - Detection, Segmentation, CaptioningMLIP - Chapter 5 - Detection, Segmentation, Captioning
MLIP - Chapter 5 - Detection, Segmentation, Captioning
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
 
#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language Processing#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language Processing
 
Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry: Statisti...
Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry:Statisti...Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry:Statisti...
Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry: Statisti...
 
Lecture3 xing fei-fei
Lecture3 xing fei-feiLecture3 xing fei-fei
Lecture3 xing fei-fei
 
Promises of Deep Learning
Promises of Deep LearningPromises of Deep Learning
Promises of Deep Learning
 
Unsupervised Feature Learning
Unsupervised Feature LearningUnsupervised Feature Learning
Unsupervised Feature Learning
 
Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018
Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018
Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal Learning
 

Similar to Deep Learning Vision Tasks Classify Detect Embed Images

Introduction talk to Computer Vision
Introduction talk to Computer Vision Introduction talk to Computer Vision
Introduction talk to Computer Vision Chen Sagiv
 
Week2- Deep Learning Intuition.pptx
Week2- Deep Learning Intuition.pptxWeek2- Deep Learning Intuition.pptx
Week2- Deep Learning Intuition.pptxfahmi324663
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detectionAmar Jindal
 
AI for PM.pptx
AI for PM.pptxAI for PM.pptx
AI for PM.pptxNatan Katz
 
Week3- Face Identification with K-Nears Neighbour .pptx
Week3- Face Identification with K-Nears Neighbour .pptxWeek3- Face Identification with K-Nears Neighbour .pptx
Week3- Face Identification with K-Nears Neighbour .pptxfahmi324663
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Universitat Politècnica de Catalunya
 
[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...
[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...
[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...Shunta Saito
 
OReilly AI Transfer Learning
OReilly AI Transfer LearningOReilly AI Transfer Learning
OReilly AI Transfer LearningDanielle Dean
 
IRJET - Face Recognition based Attendance System
IRJET -  	  Face Recognition based Attendance SystemIRJET -  	  Face Recognition based Attendance System
IRJET - Face Recognition based Attendance SystemIRJET Journal
 
TechnicalBackgroundOverview
TechnicalBackgroundOverviewTechnicalBackgroundOverview
TechnicalBackgroundOverviewMotaz El-Saban
 
How to use transfer learning to bootstrap image classification and question a...
How to use transfer learning to bootstrap image classification and question a...How to use transfer learning to bootstrap image classification and question a...
How to use transfer learning to bootstrap image classification and question a...Wee Hyong Tok
 
ANISH_and_DR.DANIEL_VRINCEANU_Presentation
ANISH_and_DR.DANIEL_VRINCEANU_PresentationANISH_and_DR.DANIEL_VRINCEANU_Presentation
ANISH_and_DR.DANIEL_VRINCEANU_PresentationAnish Patel
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in VisionSangmin Woo
 
Weave-D - 2nd Progress Evaluation Presentation
Weave-D - 2nd Progress Evaluation PresentationWeave-D - 2nd Progress Evaluation Presentation
Weave-D - 2nd Progress Evaluation Presentationlasinducharith
 
How to test an AI application
How to test an AI applicationHow to test an AI application
How to test an AI applicationKari Kakkonen
 
Detection of a user-defined object in an image using feature extraction- Trai...
Detection of a user-defined object in an image using feature extraction- Trai...Detection of a user-defined object in an image using feature extraction- Trai...
Detection of a user-defined object in an image using feature extraction- Trai...IRJET Journal
 
Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...
Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...
Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...Data Con LA
 
Image_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.pptImage_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.pptLOUISSEVERINOROMANO
 

Similar to Deep Learning Vision Tasks Classify Detect Embed Images (20)

Introduction talk to Computer Vision
Introduction talk to Computer Vision Introduction talk to Computer Vision
Introduction talk to Computer Vision
 
Week2- Deep Learning Intuition.pptx
Week2- Deep Learning Intuition.pptxWeek2- Deep Learning Intuition.pptx
Week2- Deep Learning Intuition.pptx
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
AI for PM.pptx
AI for PM.pptxAI for PM.pptx
AI for PM.pptx
 
Week3- Face Identification with K-Nears Neighbour .pptx
Week3- Face Identification with K-Nears Neighbour .pptxWeek3- Face Identification with K-Nears Neighbour .pptx
Week3- Face Identification with K-Nears Neighbour .pptx
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
 
[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...
[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...
[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and Cha...
 
OReilly AI Transfer Learning
OReilly AI Transfer LearningOReilly AI Transfer Learning
OReilly AI Transfer Learning
 
IRJET - Face Recognition based Attendance System
IRJET -  	  Face Recognition based Attendance SystemIRJET -  	  Face Recognition based Attendance System
IRJET - Face Recognition based Attendance System
 
TechnicalBackgroundOverview
TechnicalBackgroundOverviewTechnicalBackgroundOverview
TechnicalBackgroundOverview
 
How to use transfer learning to bootstrap image classification and question a...
How to use transfer learning to bootstrap image classification and question a...How to use transfer learning to bootstrap image classification and question a...
How to use transfer learning to bootstrap image classification and question a...
 
pydataPointCloud.pptx
pydataPointCloud.pptxpydataPointCloud.pptx
pydataPointCloud.pptx
 
ANISH_and_DR.DANIEL_VRINCEANU_Presentation
ANISH_and_DR.DANIEL_VRINCEANU_PresentationANISH_and_DR.DANIEL_VRINCEANU_Presentation
ANISH_and_DR.DANIEL_VRINCEANU_Presentation
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in Vision
 
Weave-D - 2nd Progress Evaluation Presentation
Weave-D - 2nd Progress Evaluation PresentationWeave-D - 2nd Progress Evaluation Presentation
Weave-D - 2nd Progress Evaluation Presentation
 
How to test an AI application
How to test an AI applicationHow to test an AI application
How to test an AI application
 
Detection of a user-defined object in an image using feature extraction- Trai...
Detection of a user-defined object in an image using feature extraction- Trai...Detection of a user-defined object in an image using feature extraction- Trai...
Detection of a user-defined object in an image using feature extraction- Trai...
 
Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...
Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...
Data Con LA 2019 - State of the Art of Innovation in Computer Vision by Chris...
 
Image_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.pptImage_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.ppt
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Recently uploaded (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 

Deep Learning Vision Tasks Classify Detect Embed Images

  • 1. © 2020 StreamLogic Can You See What I See? The Power of Deep Learning Scott Thibault, Ph.D. StreamLogic
  • 2. © 2020 StreamLogic Presentation Agenda • Image Classification • Answering questions about an image as a whole • Object Detection • Locating objects within an image • Embeddings • Measuring semantic similarity • Conclusions 2 Three major vision tasks solved using deep learning:
  • 3. © 2020 StreamLogic Image Classification 3
  • 4. © 2020 StreamLogic Image Classification Answers the question for the whole image: 1. Binary classification – Is this image X (yes/no)? 2. Multi-class classification – Is this image X, Y, or Z? 3. Multi-label classification – What labels X, Y, and Z describe this image? 4 Mask (0.86) No Mask (0.06)
  • 5. © 2020 StreamLogic Multi-Class Classification 5 Age classification output Class Probability 0-2 0.000011 4-6 0.000008 8-13 0.000187 15-20 0.001444 25-32 0.313656 38-43 0.683580 48-53 0.001012 60+ 0.000101
  • 6. © 2020 StreamLogic Multi-Label Classification 6 Multi-label classification output Label Probability Smooth 0.00021 Rough 0.01332 Furry 0.87412 White 0.00340 Black 0.91021 Yellow 0.00013 Green 0.00002 … …
  • 7. © 2020 StreamLogic Applications of Image Classification • Medical diagnosis • Marketing profiles • Smart camera event capture • Quality control • Policy verification 7
  • 8. © 2020 StreamLogic Pre-Trained Image Classification Models • ImageNet (ILSVRC) – 1000 classes • COCO – 80 classes • Pascal VOC – 20 classes • Stanford Cars • Places365 • Demographics (gender/age/race) • Emotions (facial expression) • Activities, e.g. sports • Plants 8 Places365 Public datasets with pre-trained models: Type of environment: outdoor Scene categories: beach (0.301), coast (0.214), beach_house (0.109), lagoon (0.108) Scene attributes: natural light, open area, far-away horizon, sunny, natural, warm, boating, dirt, clouds
  • 9. © 2020 StreamLogic Training Data for Image Classification • Binary classification • For each image: one binary value (yes=1, no=0) • Multi-class classification (“One-hot” encoding) • For each image: binary vector of length N with exactly one 1 • Multi-label classification • For each image: binary vector of length N with any number of 1s 9
  • 10. © 2020 StreamLogic Image Classification Accuracy 10 • ImageNet Benchmark • > 1M training images • 1000 classes • Top-N Accuracy – correct class is in N highest class probabilities • Model size affects accuracy • Model size affects speed How do we compare CNN architectures? S. Bianco, R. Cadene, L. Celona and P. Napoletano, "Benchmark Analysis of Representative Deep Neural Network Architectures," in IEEE Access, vol. 6, pp. 64270-64277, 2018
  • 12. © 2020 StreamLogic Object Detection • Object detection combines classification with localization for multiple instances. • Outputs 0-N bounding boxes and class scores for each box. 12
  • 13. © 2020 StreamLogic Applications of Object Detection • Surveillance • Counting (cars, people, …) • + identification = recognition (faces, license plates, …) • + tracking = behavior analysis (hot spots, boundary crossing, package exchange, …) • Autonomous vehicles 13
  • 14. © 2020 StreamLogic Pre-Trained Object Detection Models • COCO – 80 classes • Pascal VOC – 20 • Pedestrians, faces, hands • Cars, Bikes, Trains, … • Street signs • License plates • Text 14 Public datasets with pre-trained models:
  • 15. © 2020 StreamLogic Training Data for Object Detection • Labor intensive • Labeling tool required • Outline bounding box with mouse • Label boxes with class • Export and translate to suitable format 15
  • 16. © 2020 StreamLogic Object Detector Design Choices • Backbone (feature extractor) • Meta-architecture (SSD, Yolo, Faster R-CNN) • Number of proposals • Resolution • Training datasets 16 Huang, Jonathan, et al. "Speed/accuracy trade-offs for modern convolutional object detectors." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
  • 18. © 2020 StreamLogic Embeddings An embedding is a translation of a high- dimensional input to a low-dimensional feature vector that: • captures semantics of the input, and • preserves semantic similarity. 18
  • 19. © 2020 StreamLogic MNIST Digits Example • 70,000 hand-written digits • 28x28 grayscale images • Each image is a point in ℝ784 19 MNIST handwritten digit dataset
  • 20. © 2020 StreamLogic MNIST 1-D Embedding 20 Mean(image): ℝ784 → ℝ1
  • 21. © 2020 StreamLogic MNIST 2-D Embedding 21 Mean(image),Curvature(image): ℝ784 → ℝ2
  • 22. © 2020 StreamLogic Facial Image Recognition • Deep Neural Network • Face image → ℝ128 • Triplet loss function: • Anchor image • Positive image • Negative image FaceNet Embedding:
  • 23. © 2020 StreamLogic Conclusions • Image Classification • 3 types: binary, multi-class, and multi-label • Answers questions about the whole image • Objection Detection • Identifies multiple objects in an image and their locations • Major uses include counting, identification and object tracking • Embeddings • Represents semantics of the input as a vector of real numbers • Used in facial image recognition technology 23
  • 24. © 2020 StreamLogic Resources Papers S. Bianco et al. 2018, Benchmark Analysis of Representative Deep Neural Network Architectures https://arxiv.org/abs/1810.00736 J. Huang et al. 2017, Speed/accuracy trade-offs for modern convolutional object detectors https://arxiv.org/abs/1611.10012 F. Schroff et al. 2015, FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/abs/1503.03832 Pre-trained models Tensorflow https://tfhub.dev/ PyTorch https://pytorch.org/hub/ Caffe https://github.com/BVLC/caffe/wiki/Model-Zoo Contact Information thibault@streamlogic.io https://streamlogic.io 24