SlideShare a Scribd company logo
1 of 21
Download to read offline
Hate it or love it, your SW stack
defines application performance
and reach
Felix Baum, Director, Product
Management at Qualcomm
Technologies Inc.
1
Personas and scenarios
2
Lisa Ben
Expert developer
Seasoned ML warrior
Need to squeeze all the
performance offered by
hardware
Novice developer
Not very sophisticated at ML
Scalability is more important
than performance
Building an Application - Misconceptions
3
ML in the cloud and edge device is similar Nothing could be further from reality
Runtime frameworks differ in cadence,
range and performance
All runtime frameworks offer the same
performance and flexibility
Quantization is hard and offers little benefit
Tools are available to offer users > 6 times
in performance and power improvements
FP32
App
ML
Algorithm
4
Building an Application
Adding ML to your Application
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
5
? ? ? ? ? ? ? ? ?
Adding ML to your Application
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
6
What if it is not accurate enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
Hardware Acceleration
FP32
? ? ? ? ?
Why not
INT8?
7
Hardware Acceleration
FP32
? ? ? ? ?
What if it is not accurate enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
8
32-bit floating point
3452.3194
01010101 01010101 01010101 01010101
0101
01010101
01010101 01010101
Models trained at
high precision
16-bit Integer
3452
Inference at
lower precision
8-bit Integer
255
4-bit Integer
15
64X
up to
16X
up to
4X
up to
Increase in
performance per watt
from savings in
memory and compute
Increase in
performance per
watt from savings in
memory and
compute
Increase in
performance per watt
from savings in
memory and compute
Adding more data types
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
9
Adding more data types – need tools to make it seamless
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
Quantization,
Pruning,
Optimization
INT8
INT16
FP16
FP32
10
Hardware Acceleration
? ? ? ? ?
What if it is not flexible enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
Where is the
choice?
11
INT8
INT16
FP16
FP32
Quantization,
Pruning,
Optimization
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
What if it is not flexible enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
12
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
• Cadence – some runtime frameworks have a yearly cadence
of release while others have monthly
• Range – some frameworks offer wider range of supported
operators
• Performance – even if runtime claims support for an
operator, that does not always mean that it is accelerated
Quantization,
Pruning,
Optimization
Selecting a runtime framework that fits
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
13
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Hardware Acceleration
? ? ? ? ?
What if it is not flexible enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
14
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Why so slow?
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Hardware Acceleration
What if it is not fast enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
15
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Why so slow?
HTP
Audio
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Hardware Acceleration
What if it is not fast enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
16
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
• Not all accelerators support all data types
• Operator parity is not guaranteed
• Not all accelerators deliver the same
performance
HTP
Audio
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Selecting a runtime framework that fits
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
Hexagon Processor SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Low Level Neural Network Library
Halide,
TVM
17
HTP
Audio
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Selecting a runtime framework that fits
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
Hexagon Processor SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Low Level Neural Network Library
HTP
Audio
Halide,
TVM
18
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
In order to achieve your application full capacity, you need a software stack
that is tailored to specifically to what you are looking to accomplish
Different models require specific tools that only customizable stacks will offer
Not all applications are built the same way, your software stack
will determine how well your application will perform
Key Takeaways
19
Qualcomm and Hexagon are trademarks or registered trademarks of Qualcomm Incorporated
Thank You
20
Resource Slide
• Qualcomm AI page:
https://www.qualcomm.com/invention/artificial-intelligence
• Qualcomm AI research:
https://www.qualcomm.com/invention/artificial-
intelligence/ai-
research?cmpid=fofyus193556&gclid=CjwKCAjw19z6BRAYEiw
Amo64LfQjU8vqH8TxqKTM2PZQp8JibXrjev85wLfKFknJnS_b4
94yZ7e_WhoCPQkQAvD_BwE
• Qualcomm Platform Solution Ecosystem:
https://www.qualcomm.com/support/qan/platform-
solutions-ecosystem
• GitHub AI Model Efficiency Toolkit (AIMET):
https://github.com/quic/aimet
• Qualcomm Mobile AI page:
https://www.qualcomm.com/products/smartphones/mobile-ai
• Qualcomm Mobile AI blog:
https://www.qualcomm.com/news/onq/2020/12/02/exploring-ai-
capabilities-qualcomm-snapdragon-888-mobile-platform
• Qualcomm Cloud AI 100 blog:
https://www.qualcomm.com/news/onq/2021/03/15/qualcomm-cloud-ai-
100-amd-epyc-7003-series-processor-and-gigabyte-server

More Related Content

What's hot

Software testing training in Chandigarh
Software testing training in ChandigarhSoftware testing training in Chandigarh
Software testing training in ChandigarhWebliquidinfotech
 
Resume_Archana_Rao
Resume_Archana_RaoResume_Archana_Rao
Resume_Archana_Raoarchana rao
 
Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...Intellipaat
 
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPanditSoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPanditSreekrishna Pandit
 

What's hot (6)

Software testing training in Chandigarh
Software testing training in ChandigarhSoftware testing training in Chandigarh
Software testing training in Chandigarh
 
Resume_Archana_Rao
Resume_Archana_RaoResume_Archana_Rao
Resume_Archana_Rao
 
Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...
 
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPanditSoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
 
JOB OF THE FUTURE: RPA Developer
JOB OF THE FUTURE:RPA DeveloperJOB OF THE FUTURE:RPA Developer
JOB OF THE FUTURE: RPA Developer
 
Introduction to ART (Android Runtime)
Introduction to ART (Android Runtime)Introduction to ART (Android Runtime)
Introduction to ART (Android Runtime)
 

Similar to “Hate It Or Love It, Your Neural Network Software Stack Defines Application Performance and Reach,” a Presentation from Qualcomm

Accelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersAccelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersSri Ambati
 
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...Edge AI and Vision Alliance
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...tdc-globalcode
 
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Intel® Software
 
Python* Scalability in Production Environments
Python* Scalability in Production EnvironmentsPython* Scalability in Production Environments
Python* Scalability in Production EnvironmentsIntel® Software
 
Getting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® GraphicsGetting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® GraphicsIntel® Software
 
Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018Naresh Sankapelly
 
Monitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using PrometheusMonitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using PrometheusDatabricks
 
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYWhy Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYEnterprise Management Associates
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataDESMOND YUEN
 
Labview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRLLabview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRLMohammad Sabouri
 
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...Linaro
 
Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture Intel® Software
 
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Intel® Software
 
(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel ArchitecturesJoel Falcou
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkDatabricks
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016Matthew Broberg
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductTyrone Systems
 

Similar to “Hate It Or Love It, Your Neural Network Software Stack Defines Application Performance and Reach,” a Presentation from Qualcomm (20)

Accelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersAccelerating AI Adoption with Partners
Accelerating AI Adoption with Partners
 
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
 
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
 
Python* Scalability in Production Environments
Python* Scalability in Production EnvironmentsPython* Scalability in Production Environments
Python* Scalability in Production Environments
 
Intel python 2017
Intel python 2017Intel python 2017
Intel python 2017
 
Getting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® GraphicsGetting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® Graphics
 
Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018
 
Monitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using PrometheusMonitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using Prometheus
 
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYWhy Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big Data
 
Labview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRLLabview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRL
 
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
 
Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture
 
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
 
(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures
 
Build 2019 Recap
Build 2019 RecapBuild 2019 Recap
Build 2019 Recap
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel Product
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Recently uploaded (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

“Hate It Or Love It, Your Neural Network Software Stack Defines Application Performance and Reach,” a Presentation from Qualcomm

  • 1. Hate it or love it, your SW stack defines application performance and reach Felix Baum, Director, Product Management at Qualcomm Technologies Inc. 1
  • 2. Personas and scenarios 2 Lisa Ben Expert developer Seasoned ML warrior Need to squeeze all the performance offered by hardware Novice developer Not very sophisticated at ML Scalability is more important than performance
  • 3. Building an Application - Misconceptions 3 ML in the cloud and edge device is similar Nothing could be further from reality Runtime frameworks differ in cadence, range and performance All runtime frameworks offer the same performance and flexibility Quantization is hard and offers little benefit Tools are available to offer users > 6 times in performance and power improvements
  • 5. Adding ML to your Application FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? 5 ? ? ? ? ? ? ? ? ?
  • 6. Adding ML to your Application FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? 6
  • 7. What if it is not accurate enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? Hardware Acceleration FP32 ? ? ? ? ? Why not INT8? 7
  • 8. Hardware Acceleration FP32 ? ? ? ? ? What if it is not accurate enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? 8 32-bit floating point 3452.3194 01010101 01010101 01010101 01010101 0101 01010101 01010101 01010101 Models trained at high precision 16-bit Integer 3452 Inference at lower precision 8-bit Integer 255 4-bit Integer 15 64X up to 16X up to 4X up to Increase in performance per watt from savings in memory and compute Increase in performance per watt from savings in memory and compute Increase in performance per watt from savings in memory and compute
  • 9. Adding more data types FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? 9
  • 10. Adding more data types – need tools to make it seamless FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? Quantization, Pruning, Optimization INT8 INT16 FP16 FP32 10
  • 11. Hardware Acceleration ? ? ? ? ? What if it is not flexible enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? Where is the choice? 11 INT8 INT16 FP16 FP32 Quantization, Pruning, Optimization INT8 INT16 FP16 FP32
  • 12. Hardware Acceleration ? ? ? ? ? What if it is not flexible enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? 12 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 • Cadence – some runtime frameworks have a yearly cadence of release while others have monthly • Range – some frameworks offer wider range of supported operators • Performance – even if runtime claims support for an operator, that does not always mean that it is accelerated Quantization, Pruning, Optimization
  • 13. Selecting a runtime framework that fits FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT 13 Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 14. Hardware Acceleration ? ? ? ? ? What if it is not flexible enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 14 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Why so slow? Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 15. Hardware Acceleration What if it is not fast enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 15 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Why so slow? HTP Audio Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 16. Hardware Acceleration What if it is not fast enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 16 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT • Not all accelerators support all data types • Operator parity is not guaranteed • Not all accelerators deliver the same performance HTP Audio Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 17. Selecting a runtime framework that fits FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? Hexagon Processor SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Low Level Neural Network Library Halide, TVM 17 HTP Audio Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 18. Selecting a runtime framework that fits FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration Hexagon Processor SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Low Level Neural Network Library HTP Audio Halide, TVM 18 Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 19. In order to achieve your application full capacity, you need a software stack that is tailored to specifically to what you are looking to accomplish Different models require specific tools that only customizable stacks will offer Not all applications are built the same way, your software stack will determine how well your application will perform Key Takeaways 19 Qualcomm and Hexagon are trademarks or registered trademarks of Qualcomm Incorporated
  • 21. Resource Slide • Qualcomm AI page: https://www.qualcomm.com/invention/artificial-intelligence • Qualcomm AI research: https://www.qualcomm.com/invention/artificial- intelligence/ai- research?cmpid=fofyus193556&gclid=CjwKCAjw19z6BRAYEiw Amo64LfQjU8vqH8TxqKTM2PZQp8JibXrjev85wLfKFknJnS_b4 94yZ7e_WhoCPQkQAvD_BwE • Qualcomm Platform Solution Ecosystem: https://www.qualcomm.com/support/qan/platform- solutions-ecosystem • GitHub AI Model Efficiency Toolkit (AIMET): https://github.com/quic/aimet • Qualcomm Mobile AI page: https://www.qualcomm.com/products/smartphones/mobile-ai • Qualcomm Mobile AI blog: https://www.qualcomm.com/news/onq/2020/12/02/exploring-ai- capabilities-qualcomm-snapdragon-888-mobile-platform • Qualcomm Cloud AI 100 blog: https://www.qualcomm.com/news/onq/2021/03/15/qualcomm-cloud-ai- 100-amd-epyc-7003-series-processor-and-gigabyte-server