SlideShare a Scribd company logo
Hate it or love it, your SW stack
defines application performance
and reach
Felix Baum, Director, Product
Management at Qualcomm
Technologies Inc.
1
Personas and scenarios
2
Lisa Ben
Expert developer
Seasoned ML warrior
Need to squeeze all the
performance offered by
hardware
Novice developer
Not very sophisticated at ML
Scalability is more important
than performance
Building an Application - Misconceptions
3
ML in the cloud and edge device is similar Nothing could be further from reality
Runtime frameworks differ in cadence,
range and performance
All runtime frameworks offer the same
performance and flexibility
Quantization is hard and offers little benefit
Tools are available to offer users > 6 times
in performance and power improvements
FP32
App
ML
Algorithm
4
Building an Application
Adding ML to your Application
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
5
? ? ? ? ? ? ? ? ?
Adding ML to your Application
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
6
What if it is not accurate enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
Hardware Acceleration
FP32
? ? ? ? ?
Why not
INT8?
7
Hardware Acceleration
FP32
? ? ? ? ?
What if it is not accurate enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
FP32
? ? ? ? ? ? ? ? ?
8
32-bit floating point
3452.3194
01010101 01010101 01010101 01010101
0101
01010101
01010101 01010101
Models trained at
high precision
16-bit Integer
3452
Inference at
lower precision
8-bit Integer
255
4-bit Integer
15
64X
up to
16X
up to
4X
up to
Increase in
performance per watt
from savings in
memory and compute
Increase in
performance per
watt from savings in
memory and
compute
Increase in
performance per watt
from savings in
memory and compute
Adding more data types
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
9
Adding more data types – need tools to make it seamless
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
Quantization,
Pruning,
Optimization
INT8
INT16
FP16
FP32
10
Hardware Acceleration
? ? ? ? ?
What if it is not flexible enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
Where is the
choice?
11
INT8
INT16
FP16
FP32
Quantization,
Pruning,
Optimization
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
What if it is not flexible enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
? ? ? ? ? ? ? ? ?
12
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
• Cadence – some runtime frameworks have a yearly cadence
of release while others have monthly
• Range – some frameworks offer wider range of supported
operators
• Performance – even if runtime claims support for an
operator, that does not always mean that it is accelerated
Quantization,
Pruning,
Optimization
Selecting a runtime framework that fits
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
13
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Hardware Acceleration
? ? ? ? ?
What if it is not flexible enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
14
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Why so slow?
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Hardware Acceleration
What if it is not fast enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
15
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Why so slow?
HTP
Audio
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Hardware Acceleration
What if it is not fast enough?
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
16
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Qualcomm Neural Processing SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
• Not all accelerators support all data types
• Operator parity is not guaranteed
• Not all accelerators deliver the same
performance
HTP
Audio
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Selecting a runtime framework that fits
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
? ? ? ? ?
Hexagon Processor SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Low Level Neural Network Library
Halide,
TVM
17
HTP
Audio
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
Selecting a runtime framework that fits
FP32
App
ML
Algorithm
Cognitive Toolkit
Training Frameworks
Runtime Frameworks
FP32
INT8
INT16
FP16
FP32
INT8
INT16
FP16
FP32
Hardware Acceleration
Hexagon Processor SDK
PyTorch Mobile
NNAPI
TF-Lite
ONNX RT
Low Level Neural Network Library
HTP
Audio
Halide,
TVM
18
Quantization,
Pruning,
Optimization
Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries.
.
In order to achieve your application full capacity, you need a software stack
that is tailored to specifically to what you are looking to accomplish
Different models require specific tools that only customizable stacks will offer
Not all applications are built the same way, your software stack
will determine how well your application will perform
Key Takeaways
19
Qualcomm and Hexagon are trademarks or registered trademarks of Qualcomm Incorporated
Thank You
20
Resource Slide
• Qualcomm AI page:
https://www.qualcomm.com/invention/artificial-intelligence
• Qualcomm AI research:
https://www.qualcomm.com/invention/artificial-
intelligence/ai-
research?cmpid=fofyus193556&gclid=CjwKCAjw19z6BRAYEiw
Amo64LfQjU8vqH8TxqKTM2PZQp8JibXrjev85wLfKFknJnS_b4
94yZ7e_WhoCPQkQAvD_BwE
• Qualcomm Platform Solution Ecosystem:
https://www.qualcomm.com/support/qan/platform-
solutions-ecosystem
• GitHub AI Model Efficiency Toolkit (AIMET):
https://github.com/quic/aimet
• Qualcomm Mobile AI page:
https://www.qualcomm.com/products/smartphones/mobile-ai
• Qualcomm Mobile AI blog:
https://www.qualcomm.com/news/onq/2020/12/02/exploring-ai-
capabilities-qualcomm-snapdragon-888-mobile-platform
• Qualcomm Cloud AI 100 blog:
https://www.qualcomm.com/news/onq/2021/03/15/qualcomm-cloud-ai-
100-amd-epyc-7003-series-processor-and-gigabyte-server

More Related Content

What's hot

Software testing training in Chandigarh
Software testing training in ChandigarhSoftware testing training in Chandigarh
Software testing training in Chandigarh
Webliquidinfotech
 
Resume_Archana_Rao
Resume_Archana_RaoResume_Archana_Rao
Resume_Archana_Rao
archana rao
 
Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...
Intellipaat
 
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPanditSoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
Sreekrishna Pandit
 
JOB OF THE FUTURE: RPA Developer
JOB OF THE FUTURE:RPA DeveloperJOB OF THE FUTURE:RPA Developer
JOB OF THE FUTURE: RPA Developer
Abhinav Sabharwal- Business Analyst Mumbai
 
Introduction to ART (Android Runtime)
Introduction to ART (Android Runtime)Introduction to ART (Android Runtime)
Introduction to ART (Android Runtime)
Iordanis (Jordan) Giannakakis
 

What's hot (6)

Software testing training in Chandigarh
Software testing training in ChandigarhSoftware testing training in Chandigarh
Software testing training in Chandigarh
 
Resume_Archana_Rao
Resume_Archana_RaoResume_Archana_Rao
Resume_Archana_Rao
 
Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...Functional testing vs non functional testing | Difference Between Functional ...
Functional testing vs non functional testing | Difference Between Functional ...
 
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPanditSoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
SoftwareQualityControlProfessional_12Yrs_SreekrishnaHPandit
 
JOB OF THE FUTURE: RPA Developer
JOB OF THE FUTURE:RPA DeveloperJOB OF THE FUTURE:RPA Developer
JOB OF THE FUTURE: RPA Developer
 
Introduction to ART (Android Runtime)
Introduction to ART (Android Runtime)Introduction to ART (Android Runtime)
Introduction to ART (Android Runtime)
 

Similar to “Hate It Or Love It, Your Neural Network Software Stack Defines Application Performance and Reach,” a Presentation from Qualcomm

Accelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersAccelerating AI Adoption with Partners
Accelerating AI Adoption with Partners
Sri Ambati
 
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
Edge AI and Vision Alliance
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
tdc-globalcode
 
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Intel® Software
 
Intel python 2017
Intel python 2017Intel python 2017
Intel python 2017
DESMOND YUEN
 
Python* Scalability in Production Environments
Python* Scalability in Production EnvironmentsPython* Scalability in Production Environments
Python* Scalability in Production Environments
Intel® Software
 
Getting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® GraphicsGetting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® Graphics
Intel® Software
 
Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018
Naresh Sankapelly
 
Monitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using PrometheusMonitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using Prometheus
Databricks
 
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYWhy Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Enterprise Management Associates
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big Data
DESMOND YUEN
 
Labview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRLLabview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRL
Mohammad Sabouri
 
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
Linaro
 
Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture
Intel® Software
 
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Intel® Software
 
(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures
Joel Falcou
 
Build 2019 Recap
Build 2019 RecapBuild 2019 Recap
Build 2019 Recap
Eran Stiller
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
Databricks
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
Matthew Broberg
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel Product
Tyrone Systems
 

Similar to “Hate It Or Love It, Your Neural Network Software Stack Defines Application Performance and Reach,” a Presentation from Qualcomm (20)

Accelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersAccelerating AI Adoption with Partners
Accelerating AI Adoption with Partners
 
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
 
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
 
Intel python 2017
Intel python 2017Intel python 2017
Intel python 2017
 
Python* Scalability in Production Environments
Python* Scalability in Production EnvironmentsPython* Scalability in Production Environments
Python* Scalability in Production Environments
 
Getting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® GraphicsGetting Space Pirate Trainer* to Perform on Intel® Graphics
Getting Space Pirate Trainer* to Perform on Intel® Graphics
 
Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018Machine Learning Platform @Flipkart - Slash N Conference 2018
Machine Learning Platform @Flipkart - Slash N Conference 2018
 
Monitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using PrometheusMonitoring of GPU Usage with Tensorflow Models Using Prometheus
Monitoring of GPU Usage with Tensorflow Models Using Prometheus
 
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYWhy Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big Data
 
Labview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRLLabview1_ Computer Applications in Control_ACRRL
Labview1_ Computer Applications in Control_ACRRL
 
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
 
Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture Accelerate Machine Learning Software on Intel Architecture
Accelerate Machine Learning Software on Intel Architecture
 
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
 
(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures
 
Build 2019 Recap
Build 2019 RecapBuild 2019 Recap
Build 2019 Recap
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel Product
 

More from Edge AI and Vision Alliance

“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...
“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...
“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...
Edge AI and Vision Alliance
 
"Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr...
"Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr..."Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr...
"Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr...
Edge AI and Vision Alliance
 
“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...
“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...
“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...
Edge AI and Vision Alliance
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...
“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...
“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...
Edge AI and Vision Alliance
 
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
Edge AI and Vision Alliance
 
“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...
“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...
“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...
Edge AI and Vision Alliance
 
“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...
“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...
“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...
Edge AI and Vision Alliance
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
"OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a...
"OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a..."OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a...
"OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a...
Edge AI and Vision Alliance
 
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
Edge AI and Vision Alliance
 
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
Edge AI and Vision Alliance
 
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
Edge AI and Vision Alliance
 
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...
“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...
“Squeezing the Last Milliwatt and Cubic Millimeter from Smart Cameras Using t...
 
"Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr...
"Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr..."Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr...
"Maximize Your AI Compatibility with Flexible Pre- and Post-processing," a Pr...
 
“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...
“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...
“Addressing Tomorrow’s Sensor Fusion and Processing Needs with Cadence’s Newe...
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...
“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...
“Silicon Slip-ups: The Ten Most Common Errors Processor Suppliers Make (Numbe...
 
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
 
“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...
“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...
“How Arm’s Machine Learning Solution Enables Vision Transformers at the Edge,...
 
“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...
“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...
“Nx EVOS: A New Enterprise Operating System for Video and Visual AI,” a Prese...
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
"OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a...
"OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a..."OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a...
"OpenCV for High-performance, Low-power Vision Applications on Snapdragon," a...
 
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
 
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
 
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
 
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 

Recently uploaded

Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
Ivo Velitchkov
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
c5vrf27qcz
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
saastr
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
operationspcvita
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
Fwdays
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
Fwdays
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
Ajin Abraham
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 

Recently uploaded (20)

Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
Artificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic WarfareArtificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic Warfare
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 

“Hate It Or Love It, Your Neural Network Software Stack Defines Application Performance and Reach,” a Presentation from Qualcomm

  • 1. Hate it or love it, your SW stack defines application performance and reach Felix Baum, Director, Product Management at Qualcomm Technologies Inc. 1
  • 2. Personas and scenarios 2 Lisa Ben Expert developer Seasoned ML warrior Need to squeeze all the performance offered by hardware Novice developer Not very sophisticated at ML Scalability is more important than performance
  • 3. Building an Application - Misconceptions 3 ML in the cloud and edge device is similar Nothing could be further from reality Runtime frameworks differ in cadence, range and performance All runtime frameworks offer the same performance and flexibility Quantization is hard and offers little benefit Tools are available to offer users > 6 times in performance and power improvements
  • 5. Adding ML to your Application FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? 5 ? ? ? ? ? ? ? ? ?
  • 6. Adding ML to your Application FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? 6
  • 7. What if it is not accurate enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? Hardware Acceleration FP32 ? ? ? ? ? Why not INT8? 7
  • 8. Hardware Acceleration FP32 ? ? ? ? ? What if it is not accurate enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 FP32 ? ? ? ? ? ? ? ? ? 8 32-bit floating point 3452.3194 01010101 01010101 01010101 01010101 0101 01010101 01010101 01010101 Models trained at high precision 16-bit Integer 3452 Inference at lower precision 8-bit Integer 255 4-bit Integer 15 64X up to 16X up to 4X up to Increase in performance per watt from savings in memory and compute Increase in performance per watt from savings in memory and compute Increase in performance per watt from savings in memory and compute
  • 9. Adding more data types FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? 9
  • 10. Adding more data types – need tools to make it seamless FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? Quantization, Pruning, Optimization INT8 INT16 FP16 FP32 10
  • 11. Hardware Acceleration ? ? ? ? ? What if it is not flexible enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? Where is the choice? 11 INT8 INT16 FP16 FP32 Quantization, Pruning, Optimization INT8 INT16 FP16 FP32
  • 12. Hardware Acceleration ? ? ? ? ? What if it is not flexible enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 ? ? ? ? ? ? ? ? ? 12 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 • Cadence – some runtime frameworks have a yearly cadence of release while others have monthly • Range – some frameworks offer wider range of supported operators • Performance – even if runtime claims support for an operator, that does not always mean that it is accelerated Quantization, Pruning, Optimization
  • 13. Selecting a runtime framework that fits FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT 13 Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 14. Hardware Acceleration ? ? ? ? ? What if it is not flexible enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 14 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Why so slow? Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 15. Hardware Acceleration What if it is not fast enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 15 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Why so slow? HTP Audio Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 16. Hardware Acceleration What if it is not fast enough? FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 16 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Qualcomm Neural Processing SDK PyTorch Mobile NNAPI TF-Lite ONNX RT • Not all accelerators support all data types • Operator parity is not guaranteed • Not all accelerators deliver the same performance HTP Audio Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 17. Selecting a runtime framework that fits FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration ? ? ? ? ? Hexagon Processor SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Low Level Neural Network Library Halide, TVM 17 HTP Audio Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 18. Selecting a runtime framework that fits FP32 App ML Algorithm Cognitive Toolkit Training Frameworks Runtime Frameworks FP32 INT8 INT16 FP16 FP32 INT8 INT16 FP16 FP32 Hardware Acceleration Hexagon Processor SDK PyTorch Mobile NNAPI TF-Lite ONNX RT Low Level Neural Network Library HTP Audio Halide, TVM 18 Quantization, Pruning, Optimization Qualcomm Neural Processing SDK is a product of Qualcomm Technologies, Inc and/or its subsidiaries. .
  • 19. In order to achieve your application full capacity, you need a software stack that is tailored to specifically to what you are looking to accomplish Different models require specific tools that only customizable stacks will offer Not all applications are built the same way, your software stack will determine how well your application will perform Key Takeaways 19 Qualcomm and Hexagon are trademarks or registered trademarks of Qualcomm Incorporated
  • 21. Resource Slide • Qualcomm AI page: https://www.qualcomm.com/invention/artificial-intelligence • Qualcomm AI research: https://www.qualcomm.com/invention/artificial- intelligence/ai- research?cmpid=fofyus193556&gclid=CjwKCAjw19z6BRAYEiw Amo64LfQjU8vqH8TxqKTM2PZQp8JibXrjev85wLfKFknJnS_b4 94yZ7e_WhoCPQkQAvD_BwE • Qualcomm Platform Solution Ecosystem: https://www.qualcomm.com/support/qan/platform- solutions-ecosystem • GitHub AI Model Efficiency Toolkit (AIMET): https://github.com/quic/aimet • Qualcomm Mobile AI page: https://www.qualcomm.com/products/smartphones/mobile-ai • Qualcomm Mobile AI blog: https://www.qualcomm.com/news/onq/2020/12/02/exploring-ai- capabilities-qualcomm-snapdragon-888-mobile-platform • Qualcomm Cloud AI 100 blog: https://www.qualcomm.com/news/onq/2021/03/15/qualcomm-cloud-ai- 100-amd-epyc-7003-series-processor-and-gigabyte-server