SlideShare a Scribd company logo
1 of 18
Download to read offline
Efficient Many-Function
Video ML at the Edge
Chris Rowen
Will Reed
Collaboration AI
Cisco Systems
• Many ML tasks
• Limited capacity:
• Compute
• Memory
• Development time
The Problem
• Many tasks, 1 model
• Share common paths
• Architecture
• Data
• Testing
• Deployment
Our Solution
© 2023 Cisco Systems 4
Before: 2 Tasks, 2 Models
Segmentation
Gestures
© 2023 Cisco Systems
After: 2 Tasks, 1 Model
Backbone
Gestures
Segmentation
5
Model Size Comparison: 2 functions
6
2.4M 2
300k
50k
5.15M
2.4M 1
300k
50k
2.75M
Before:
After:
# Encoder Parameters # Models
# Head
Parameters
TOTAL
© 2023 Cisco Systems
© 2023 Cisco Systems 7
Architecture
Encoder
Features
Decoder
Head 1
Head 2
Head 3
Head 4
Head N
• Configurable input size
• Swappable encoders
• 200 K, 420 K and 2.4 M param options
• Configurable training
• Train encoder, features and decoder jointly
• Freeze encoder and train heads
• Train encoder, lower learning rate and train heads
video frames
• Data efficient architecture
• Tasks can learn from each other’s data through generalization in the encoder
• Similar to how models benefit from pre-training
• Implicit regularization
• Multiple tasks discourage the model from overfitting on any one task
• Benefit from diversity of data in related tasks
• Only need to label data for the task you care about
• Quick and cheap to add a new task
Data Strategy
8
© 2023 Cisco Systems
Training
9
Function Labels
Inputs
Back prop
• Skip losses for unlabeled functions
Encoder
Features
Decoder
Head 1
Head 3
Head 4
Head N
Loss1
Loss2
Loss3
Loss4
Weighted
sum loss
© 2023 Cisco Systems
© 2023 Cisco Systems
Adding a New Task:
Face Landmarks Detection
Click to edit Master text styles
• Second level
• Third level
• Fourth level
• Fifth level
10
• Perform face and keypoints detection in
one pass
• Based on YOLO v2 (transitioning to v3
soon)
• Remove classification loss
• Add landmark localization loss
• Computation is bounded
SSLD: Single Shot Landmark Detection
face center
face
box
face classification (presence)
classification
confidence
landmark
locations
11
© 2023 Cisco Systems
12
2.4M 4
300k (segment)
50k (gesture)
50k (landmarks)
30k (face loc)
11.2M
2.4M 1 2.8M
300k (segment)
50K (gesture)
50K (landmark+face loc)
Model Size Comparison: 4 functions
segmentation + gesture + face location+ face landmarks
Separate:
Unified:
# Encoder Parameters # Models # Head Parameters TOTAL
© 2023 Cisco Systems
Multi-Function Model Performance
13
0.00000
0.00002
0.00004
0.00006
0.00008
0.00010
0.80
0.82
0.84
0.86
0.88
0.90
0.92
0.94
0.96
0.98
1.00
0 0.1 0.2 0.3 0.4 0.5 0.6
Error
(Keypoints)
Accuracy
(segmentation,
gesture,
face
loc)
Compute GMACs per frame - tiny, small, medium models
Ladon Model Performance (4 tasks)
Segmentation IoU Gesture Fscore Face box IoU Face keypoints mean-squared error
tiny
model
small
model
medium
model
© 2023 Cisco Systems
Examples
14
Optimized Portable Edge
Implementation
15
• Cross-platform
• C++: Supports Windows, Mac, Linux, iOS,
Android
• Javascript: Recent versions of major browsers
• Common ML framework
• ONNX, CoreML, OpenVINO
• CPU and GPU mode
• Consistent results
• Less conversion hell
• High and low level API’s
• Get high level predictions like a fully segmented
and blurred output, 2 and 3D filter effects
• Also low level access to segmentation masks,
landmark coordinates, etc…
Overlays &
Backgrounds
Avatars
Be Right
Back
Encoder
Features
Decoder
Foreground
Segmentation
Gestures
Face
Landmarks
Gesture
Actions
Face Location
& Size
Relighting
LUT
Studio
Lighting
© 2023 Cisco Systems
• Two powerful methods
• Teachers:
• Build big teacher model from limited hand-
labeled data
• use it to bootstrap big data set
• Train range of small models
• Synthetic Scenes:
• Diverse
• 3D environments
• faces
• poses
• lighting
• Programmatically animate 3D avatars
Training Label & Data Synthesis
16
Labeled data
Teacher model
1.Initial training
Unlabeled data
2.Rough labeling
Human selection/correction
Semi-automatic
labeled data
3.Training bootstrap
Production
models
4.Product training
© 2023 Cisco Systems
1. Richer applications  explosion of video ML needs  compute crisis​?
2. Multi-headed vision model is a form of ”foundation model” like GPT-4  robustness through task
diversity
3. Smart algorithmic labeling can replace much hand labeling
4. Estimating algorithm FLOPS is an imperfect predictor of implementation throughput – not every layer is
a convolution
5. Implementing N functions together complicates loss function design and training
6. Raising system functionality may span many platforms  portability
7. Emergence of edge CPU neural accelerators may open door to more aggressive video ML workloads,
but uneven time-lines​ for availability​
8. Conventional wisdom says diverse training tasks together often doesn’t work. Conventional wisdom is
often wrong.
Lessons Learned
17
© 2023 Cisco Systems
• Semantic Segmentation: https://www.v7labs.com/blog/semantic-segmentation-guide
• Adaptive Re-lighting: https://arxiv.org/pdf/2009.14468.pdf
• Original YOLO paper: https://arxiv.org/abs/1506.02640
• Knowledge distillation: https://www.v7labs.com/blog/knowledge-distillation-guide
• Multi-task learning: https://towardsdatascience.com/multi-task-learning-in-machine-learning-
20a37c796c9c
• Dataset Distillation: https://ai.googleblog.com/2021/12/training-machine-learning-models-
more.html
• Intro to self-supervised learning: https://ai.facebook.com/blog/self-supervised-learning-the-dark-
matter-of-intelligence/
Resources
18
© 2023 Cisco Systems

More Related Content

Similar to “Efficient Many-function Video ML at the Edge,” a Presentation from Cisco Systems

Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime PlatformAlexey Kharlamov
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on DatabricksDataScienceConferenc1
 
Automated product categorization
Automated product categorizationAutomated product categorization
Automated product categorizationAndreas Loupasakis
 
Automated product categorization
Automated product categorization   Automated product categorization
Automated product categorization Warply
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?Ivo Andreev
 
Training and deploying ML models with Google Cloud Platform
Training and deploying ML models with Google Cloud PlatformTraining and deploying ML models with Google Cloud Platform
Training and deploying ML models with Google Cloud PlatformSotrender
 
Andrii Sliusar "Module Architecture of React-Redux Applications"
Andrii Sliusar "Module Architecture of React-Redux Applications"Andrii Sliusar "Module Architecture of React-Redux Applications"
Andrii Sliusar "Module Architecture of React-Redux Applications"LogeekNightUkraine
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated MLMark Tabladillo
 
System Integration Demo- Revit and Excel
System Integration Demo- Revit and ExcelSystem Integration Demo- Revit and Excel
System Integration Demo- Revit and ExcelUfuoma Okeme
 
Forefront Identity Manager
Forefront Identity ManagerForefront Identity Manager
Forefront Identity ManagerMASIT MACEDONIA
 
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...Sotrender
 
Data Con LA 2018 - Towards Data Science Engineering Principles by Joerg Schad
Data Con LA 2018 - Towards Data Science Engineering Principles by Joerg SchadData Con LA 2018 - Towards Data Science Engineering Principles by Joerg Schad
Data Con LA 2018 - Towards Data Science Engineering Principles by Joerg SchadData Con LA
 
Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)
Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)
Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)Robb Boyd
 
Predicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksPredicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksSarah Dutkiewicz
 

Similar to “Efficient Many-function Video ML at the Edge,” a Presentation from Cisco Systems (20)

Developing Digital Twins
Developing Digital TwinsDeveloping Digital Twins
Developing Digital Twins
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime Platform
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
 
Automated product categorization
Automated product categorizationAutomated product categorization
Automated product categorization
 
Automated product categorization
Automated product categorization   Automated product categorization
Automated product categorization
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?
 
Training and deploying ML models with Google Cloud Platform
Training and deploying ML models with Google Cloud PlatformTraining and deploying ML models with Google Cloud Platform
Training and deploying ML models with Google Cloud Platform
 
Andrii Sliusar "Module Architecture of React-Redux Applications"
Andrii Sliusar "Module Architecture of React-Redux Applications"Andrii Sliusar "Module Architecture of React-Redux Applications"
Andrii Sliusar "Module Architecture of React-Redux Applications"
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 
System Integration Demo- Revit and Excel
System Integration Demo- Revit and ExcelSystem Integration Demo- Revit and Excel
System Integration Demo- Revit and Excel
 
CoreML
CoreMLCoreML
CoreML
 
Forefront Identity Manager
Forefront Identity ManagerForefront Identity Manager
Forefront Identity Manager
 
DE PPT.pptx
DE PPT.pptxDE PPT.pptx
DE PPT.pptx
 
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
 
Aditya java nov_2015
Aditya java nov_2015Aditya java nov_2015
Aditya java nov_2015
 
Data Con LA 2018 - Towards Data Science Engineering Principles by Joerg Schad
Data Con LA 2018 - Towards Data Science Engineering Principles by Joerg SchadData Con LA 2018 - Towards Data Science Engineering Principles by Joerg Schad
Data Con LA 2018 - Towards Data Science Engineering Principles by Joerg Schad
 
Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)
Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)
Simulating Networks Using Cisco Modeling Labs (TechWiseTV Workshop)
 
Sangeetha_G
Sangeetha_GSangeetha_G
Sangeetha_G
 
Predicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksPredicting Flights with Azure Databricks
Predicting Flights with Azure Databricks
 
Resume
ResumeResume
Resume
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 

Recently uploaded (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 

“Efficient Many-function Video ML at the Edge,” a Presentation from Cisco Systems

  • 1. Efficient Many-Function Video ML at the Edge Chris Rowen Will Reed Collaboration AI Cisco Systems
  • 2. • Many ML tasks • Limited capacity: • Compute • Memory • Development time The Problem
  • 3. • Many tasks, 1 model • Share common paths • Architecture • Data • Testing • Deployment Our Solution
  • 4. © 2023 Cisco Systems 4 Before: 2 Tasks, 2 Models Segmentation Gestures
  • 5. © 2023 Cisco Systems After: 2 Tasks, 1 Model Backbone Gestures Segmentation 5
  • 6. Model Size Comparison: 2 functions 6 2.4M 2 300k 50k 5.15M 2.4M 1 300k 50k 2.75M Before: After: # Encoder Parameters # Models # Head Parameters TOTAL © 2023 Cisco Systems
  • 7. © 2023 Cisco Systems 7 Architecture Encoder Features Decoder Head 1 Head 2 Head 3 Head 4 Head N • Configurable input size • Swappable encoders • 200 K, 420 K and 2.4 M param options • Configurable training • Train encoder, features and decoder jointly • Freeze encoder and train heads • Train encoder, lower learning rate and train heads video frames
  • 8. • Data efficient architecture • Tasks can learn from each other’s data through generalization in the encoder • Similar to how models benefit from pre-training • Implicit regularization • Multiple tasks discourage the model from overfitting on any one task • Benefit from diversity of data in related tasks • Only need to label data for the task you care about • Quick and cheap to add a new task Data Strategy 8 © 2023 Cisco Systems
  • 9. Training 9 Function Labels Inputs Back prop • Skip losses for unlabeled functions Encoder Features Decoder Head 1 Head 3 Head 4 Head N Loss1 Loss2 Loss3 Loss4 Weighted sum loss © 2023 Cisco Systems
  • 10. © 2023 Cisco Systems Adding a New Task: Face Landmarks Detection Click to edit Master text styles • Second level • Third level • Fourth level • Fifth level 10
  • 11. • Perform face and keypoints detection in one pass • Based on YOLO v2 (transitioning to v3 soon) • Remove classification loss • Add landmark localization loss • Computation is bounded SSLD: Single Shot Landmark Detection face center face box face classification (presence) classification confidence landmark locations 11 © 2023 Cisco Systems
  • 12. 12 2.4M 4 300k (segment) 50k (gesture) 50k (landmarks) 30k (face loc) 11.2M 2.4M 1 2.8M 300k (segment) 50K (gesture) 50K (landmark+face loc) Model Size Comparison: 4 functions segmentation + gesture + face location+ face landmarks Separate: Unified: # Encoder Parameters # Models # Head Parameters TOTAL © 2023 Cisco Systems
  • 13. Multi-Function Model Performance 13 0.00000 0.00002 0.00004 0.00006 0.00008 0.00010 0.80 0.82 0.84 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00 0 0.1 0.2 0.3 0.4 0.5 0.6 Error (Keypoints) Accuracy (segmentation, gesture, face loc) Compute GMACs per frame - tiny, small, medium models Ladon Model Performance (4 tasks) Segmentation IoU Gesture Fscore Face box IoU Face keypoints mean-squared error tiny model small model medium model © 2023 Cisco Systems
  • 15. Optimized Portable Edge Implementation 15 • Cross-platform • C++: Supports Windows, Mac, Linux, iOS, Android • Javascript: Recent versions of major browsers • Common ML framework • ONNX, CoreML, OpenVINO • CPU and GPU mode • Consistent results • Less conversion hell • High and low level API’s • Get high level predictions like a fully segmented and blurred output, 2 and 3D filter effects • Also low level access to segmentation masks, landmark coordinates, etc… Overlays & Backgrounds Avatars Be Right Back Encoder Features Decoder Foreground Segmentation Gestures Face Landmarks Gesture Actions Face Location & Size Relighting LUT Studio Lighting © 2023 Cisco Systems
  • 16. • Two powerful methods • Teachers: • Build big teacher model from limited hand- labeled data • use it to bootstrap big data set • Train range of small models • Synthetic Scenes: • Diverse • 3D environments • faces • poses • lighting • Programmatically animate 3D avatars Training Label & Data Synthesis 16 Labeled data Teacher model 1.Initial training Unlabeled data 2.Rough labeling Human selection/correction Semi-automatic labeled data 3.Training bootstrap Production models 4.Product training © 2023 Cisco Systems
  • 17. 1. Richer applications  explosion of video ML needs  compute crisis​? 2. Multi-headed vision model is a form of ”foundation model” like GPT-4  robustness through task diversity 3. Smart algorithmic labeling can replace much hand labeling 4. Estimating algorithm FLOPS is an imperfect predictor of implementation throughput – not every layer is a convolution 5. Implementing N functions together complicates loss function design and training 6. Raising system functionality may span many platforms  portability 7. Emergence of edge CPU neural accelerators may open door to more aggressive video ML workloads, but uneven time-lines​ for availability​ 8. Conventional wisdom says diverse training tasks together often doesn’t work. Conventional wisdom is often wrong. Lessons Learned 17 © 2023 Cisco Systems
  • 18. • Semantic Segmentation: https://www.v7labs.com/blog/semantic-segmentation-guide • Adaptive Re-lighting: https://arxiv.org/pdf/2009.14468.pdf • Original YOLO paper: https://arxiv.org/abs/1506.02640 • Knowledge distillation: https://www.v7labs.com/blog/knowledge-distillation-guide • Multi-task learning: https://towardsdatascience.com/multi-task-learning-in-machine-learning- 20a37c796c9c • Dataset Distillation: https://ai.googleblog.com/2021/12/training-machine-learning-models- more.html • Intro to self-supervised learning: https://ai.facebook.com/blog/self-supervised-learning-the-dark- matter-of-intelligence/ Resources 18 © 2023 Cisco Systems