For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2022/06/a-practical-guide-to-getting-the-dnn-accuracy-you-need-and-the-performance-you-deserve-a-presentation-from-qualcomm/
Felix Baum, Director of Product Management at Qualcomm, presents the “Practical Guide to Getting the DNN Accuracy You Need and the Performance You Deserve” tutorial at the May 2022 Embedded Vision Summit.
Every day, developers struggle to take DNN workloads that were originally developed on workstations and migrate them to run on edge devices. Whether the application is in mobile, compute, IoT, XR or automotive, most AI developers start their algorithm development in the cloud or on a workstation and later migrate to on-device as an afterthought. Qualcomm is helping these developers on multiple fronts—democratizing AI at the edge by supporting frameworks and data types that developers are most familiar with, and at the same time building a set of tools to assist sophisticated developers who are taking extra steps to extract the best performance and power efficiency.
In this session, Baum presents the workflow and steps for effectively migrating DNN workloads to the edge. He discusses quantization issues, explores how the accuracy of models affects performance and power, and outlines the Qualcomm tools that help developers successfully launch new use cases on mobile and other edge devices.
“A Practical Guide to Getting the DNN Accuracy You Need and the Performance You Deserve,” a Presentation from Qualcomm
1. A Practical Guide to Getting the DNN Accuracy You Need and the Performance You Deserve
Felix Baum
Director, Product Management
Qualcomm Technologies, Inc.
Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
2. Qualcomm Technologies AI software stack
Supporting every AI software layer from applications to the metal
Layers shown: Applications, Frameworks, Models, Tools + Compilers, SDKs, Runtime
Models: ResNet, DeepLab, MobileNet, SSD, MobileBERT, VDSR
Tools + Compilers: AIMET, TVM
SDKs and runtime: Qualcomm® Neural Processing SDK, Qualcomm® AI Engine Direct, Android Neural Networks API (NNAPI)
Qualcomm Neural Processing SDK and Qualcomm AI Engine Direct are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
2022 Qualcomm Technologies, Inc.
3. AI software workflow today
Stages: Environment, Training
Roles: Data Scientist, ML Training Engineer
Steps: Choose environment, configuration, model and framework; Train and fine-tune the model
Decision: Does the model meet performance metrics?
Legend: Customer-chosen software; Customer ML workflow; Target usage; SNPE & Qualcomm AI Engine Direct workflow
4. AI software workflow today (continued: Compilation stage added)
Stage added: Compilation
Steps added: Model compilation/runner; Add custom layers and/or fix errors
Decision added: Does the model compile?
Tools added: Qualcomm AI Engine Direct & Qualcomm Neural Processing SDK converter and quantizer; Custom ops: LLVM (C/C++), TVM (Python); Formats supported
5. AI software workflow today (continued: Accuracy analysis stage added)
Stage added: Accuracy analysis
Steps added: Accuracy evaluation; Debug and fix errors
Decision added: Is the model’s accuracy acceptable?
Tool added: AI Model Efficiency Toolkit
6. AI software workflow today (continued: Optimizations stage added)
Stage added: Optimizations
Role added: ML Inference Engineer
Steps added: Profile model performance; Use advanced optimization techniques
Decisions added: Is the model’s output & latency acceptable?; Did that work?
Tools added: Qualcomm® AI Engine Direct & Qualcomm® Neural Processing SDK profilers; Hexagon Profiler & Trace Analyzer
7. AI software workflow today
Stages: Environment, Training, Compilation, Accuracy analysis, Optimizations, Integration, Deployment
Roles: Data Scientist, ML Training Engineer, ML Inference Engineer, App Developer, DevOps Engineer
Steps: Choose environment, configuration, model and framework; Train and fine-tune the model; Model compilation/runner; Add custom layers and/or fix errors; Accuracy evaluation; Debug and fix errors; Profile model performance; Use advanced optimization techniques; Integrate model into app or pipeline; Deploy app
Decisions: Does the model meet performance metrics?; Does the model compile?; Is the model’s accuracy acceptable?; Is the model’s output & latency acceptable?; Did that work?
Tools: Qualcomm® AI Engine Direct & Qualcomm® Neural Processing SDK converter and quantizer; Custom ops: LLVM (C/C++), TVM (Python); Formats supported; AI Model Efficiency Toolkit; Qualcomm® AI Engine Direct & Qualcomm® Neural Processing SDK profilers; Hexagon Profiler & Trace Analyzer
Legend: Customer-chosen software; Customer ML workflow; Target usage; SNPE & Qualcomm AI Engine Direct workflow
8. AI software workflow today (tool callout added: Quantization Tuner)
Quantization Tuner
Automates quantization across all quantization options and determines the options that give the model its best accuracy
Ranks accuracy using different verifiers across the full matrix of quantization options
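The idea behind the Quantization Tuner callout (try every combination of quantization options, score each with a verifier, rank the results) can be illustrated with a small sketch. This is not the Qualcomm tool or its API: the option grid (bit width, symmetric vs. asymmetric range), the two verifiers, the function names and the sample weights are all assumptions chosen for illustration.

```python
from itertools import product


def quantize(values, bits, symmetric):
    """Fake-quantize floats: snap them to a uniform grid, then map back to float."""
    levels = 2 ** bits - 1
    if symmetric:
        bound = max(abs(v) for v in values)
        lo, hi = -bound, bound
    else:
        lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0  # guard against constant tensors
    return [lo + round((v - lo) / scale) * scale for v in values]


def mse(reference, candidate):
    """Verifier 1 (hypothetical): mean squared error against the float reference."""
    return sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)


def max_abs_error(reference, candidate):
    """Verifier 2 (hypothetical): worst-case absolute error."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


def rank_options(values, bit_widths=(8, 16), schemes=(True, False), verifier=mse):
    """Evaluate every (bits, symmetric) combination and sort by verifier error."""
    results = []
    for bits, symmetric in product(bit_widths, schemes):
        error = verifier(values, quantize(values, bits, symmetric))
        results.append({"bits": bits, "symmetric": symmetric, "error": error})
    return sorted(results, key=lambda r: r["error"])


# Made-up weights, skewed toward positive values.
weights = [-0.10, 0.00, 0.05, 0.20, 0.90]
ranking = rank_options(weights)
for row in ranking:
    print(row)
```

On this toy tensor the 16-bit options rank ahead of the 8-bit ones. Passing `verifier=max_abs_error` re-ranks the same option matrix under a different metric, which is the sense in which different verifiers can produce different rankings.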
9. AI software workflow today (tool callout added: Performance Analyzer)
Performance Analyzer
A new QNN HTP performance profile exposes bottlenecks in network execution by showing an expanded analysis of how much each op contributes to the execution timeline
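The kind of analysis the Performance Analyzer callout describes, attributing execution time to individual ops to find bottlenecks, can be sketched in a few lines. The record format (op type, duration in microseconds), the op names and the timings below are hypothetical and do not reflect the actual QNN HTP profile format.

```python
from collections import defaultdict


def op_contributions(profile):
    """Sum time per op type and return (op_type, total_us, share) rows, largest first."""
    totals = defaultdict(float)
    for op_type, duration_us in profile:
        totals[op_type] += duration_us
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []  # nothing was measured
    rows = [(op, t, t / grand_total) for op, t in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical profiler records: (op type, duration in microseconds).
profile = [
    ("Conv2d", 400.0),
    ("Relu", 50.0),
    ("Conv2d", 350.0),
    ("Softmax", 200.0),
]

report = op_contributions(profile)
for op, total_us, share in report:
    print(f"{op:<8} {total_us:8.1f} us  {share:6.1%}")
```

Sorting by total time puts the dominant op type first, which is usually where optimization effort pays off most.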
10. AI software workflow today (tool callout added: Hexagon Instrumentation Profiler)
Hexagon Instrumentation Profiler
Provides insights into the system by collecting cycle counts, PMU counters and other metrics
11. AI software workflow today (tool callout added: Hexagon VS Code)
Hexagon VS Code
Improved IDE for debugging, profiling, and trace analysis
12. AI software workflow today (tool callout added: QNN HTP Simulator)
QNN HTP Simulator
QEMU-based simulation environment for bit-accurate validation of execution accuracy
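To make "bit-accurate validation" concrete: it means comparing integer outputs for exact equality rather than within a tolerance. The sketch below is not how the QNN HTP Simulator works; it only illustrates why an exact check catches differences that a tolerance check hides. The two rescale functions, the multiplier and shift values, and the sample accumulators are assumptions.

```python
def rescale_round(acc, multiplier, shift):
    """Integer-only rescale that rounds to nearest before shifting."""
    return (acc * multiplier + (1 << (shift - 1))) >> shift


def rescale_truncate(acc, multiplier, shift):
    """Integer-only rescale that drops the fractional bits (rounds toward -inf)."""
    return (acc * multiplier) >> shift


def first_mismatch(expected, actual):
    """Return the index of the first differing element, or None if bit-identical."""
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    for i, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return i
    return None


# Hypothetical accumulator values; both paths scale by 3/4.
accumulators = [1000, 1001, -7, 12345]
reference = [rescale_round(a, 3, 2) for a in accumulators]
candidate = [rescale_truncate(a, 3, 2) for a in accumulators]

worst = max(abs(r - c) for r, c in zip(reference, candidate))
print("first mismatch at index:", first_mismatch(reference, candidate))
print("largest difference:", worst)
```

Here the two paths never differ by more than one step, so a tolerance of 1 would pass them as equivalent, while the exact comparison reports the first element where the rounding behavior diverges.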
13. Take away
Not all applications are built the same way; your software stack determines how well your application performs.
To reach your application’s full capacity, you need a software stack tailored specifically to what you are trying to accomplish.
Different models require specific tools that only customizable stacks offer.
14. Resources
2022 Embedded Vision Summit
“Powering the Intelligent Connected Edge and the Future of On-Device AI”, Ziad Asghar, May 18, 9:30 - 10:00 AM PT
“Seamless Deployment of Multimedia and Machine Learning Applications at the Edge”, Megha Daga, May 17, 2:40 - 3:10 PM PT
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategies”, Ahmed Sadek, May 17, 2:05 - 2:35 PM PT
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon”, Judd Heape, May 18, 10:50 - 11:20 AM PT
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offerings”, Vinesh Sukumar, May 18, 12:00 - 12:30 PM PT
Qualcomm AI page: https://www.qualcomm.com/invention/artificial-intelligence
Qualcomm AI Research: https://www.qualcomm.com/invention/artificial-intelligence/ai-research?cmpid=fofyus193556&gclid=CjwKCAjw19z6BRAYEiwAmo64LfQjU8vqH8TxqKTM2PZQp8JibXrjev85wLfKFknJnS_b494yZ7e_WhoCPQkQAvD_BwE
Qualcomm Platform Solution Ecosystem: https://www.qualcomm.com/support/qan/platform-solutions-ecosystem
GitHub AI Model Efficiency Toolkit (AIMET): https://github.com/quic/aimet
Qualcomm Mobile AI page: https://www.qualcomm.com/products/smartphones/mobile-ai
Qualcomm Mobile AI blog: https://www.qualcomm.com/news/onq/2020/12/02/exploring-ai-capabilities-qualcomm-snapdragon-888-mobile-platform
Felix Baum, Director, Product Management
fbaum@qti.qualcomm.com