“A Practical Guide to Getting the DNN Accuracy You Need and the Performance You Deserve,” a Presentation from Qualcomm

A Practical Guide to Getting the DNN Accuracy You Need and the Performance You Deserve
Felix Baum
Director, Product Management
Qualcomm Technologies, Inc.
Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.

Qualcomm Technologies AI software stack
Supporting every AI software layer from applications to the metal
Layers: Applications, Frameworks, Models, SDKs, Runtime, Tools + Compilers
Models: ResNet, DeepLab, MobileNet, SSD, MobileBERT, VDSR
SDKs and runtime: Qualcomm® Neural Processing SDK, Qualcomm® AI Engine Direct, Android Neural Networks API (NNAPI)
Tools + Compilers: AIMET, TVM
Qualcomm Neural Processing SDK and Qualcomm AI Engine Direct are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
2022 Qualcomm Technologies, Inc.

AI software workflow today
Stages: Environment, Training
Roles: Data Scientist, ML Training Engineer
Environment: Choose environment, configuration, model and framework
Training: Train, finetune model
Decision: Does the model meet performance metrics?
Legend: Customer chosen sw.; Customer ML Workflow; Target Usage; SNPE & Qualcomm AI Engine Direct Workflow

AI software workflow, stage added: Compilation
Step: Model Compilation/Runner
Decision: Does the model compile? If not: Add custom layers and/or fix errors
Tools: Qualcomm AI Engine Direct & Qualcomm Neural Processing SDK Converter and Quantizer; Formats supported
Custom Ops: LLVM (C/C++), TVM (Python)

AI software workflow, stage added: Accuracy analysis
Step: Accuracy Evaluation
Decision: Is the model's accuracy acceptable? If not: Debug and fix errors
Tool: AI Model Efficiency Toolkit

AI software workflow, stage added: Optimizations
Role added: ML Inference Engineer
Step: Profile model perf
Decision: Is the model's output & latency acceptable? If not: Use advanced optimization techniques, then check: Did that work?
Tools: Qualcomm® AI Engine Direct & Qualcomm® Neural Processing SDK Profilers; Hexagon Profiler & Trace Analyzer

AI software workflow, stages added: Integration, Deployment
Roles added: App Developer, DevOps Engineer
Steps: Integrate model into App or pipeline; Deploy App
Full stage sequence: Environment, Training, Compilation, Accuracy analysis, Optimizations, Integration, Deployment
Roles: Data Scientist, ML Training Engineer, ML Inference Engineer, App Developer, DevOps Engineer

AI software workflow, tool highlighted: Quantization Tuner
Automates quantization across all quantization options and determines the options that give the best accuracy for the model.
Ranks accuracy using different verifiers across the full option matrix.
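The Quantization Tuner callout describes sweeping a matrix of quantization options and ranking the results with verifiers. The following is a minimal sketch of that idea only, not the Qualcomm tool or its interface: the function names (`quantize`, `sqnr_db`, `max_abs_error`, `rank_options`), the option matrix (bit width and range scheme), and the two verifiers are all assumptions chosen for illustration, and the input is random data standing in for a weight tensor.

```python
import itertools
import math
import random


def quantize(values, bits, symmetric):
    """Uniform quantize-then-dequantize of a list of floats (illustrative only)."""
    lo, hi = min(values), max(values)
    if symmetric:
        # Symmetric scheme: use a range centered on zero.
        bound = max(abs(lo), abs(hi))
        lo, hi = -bound, bound
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels or 1.0  # avoid a zero step for constant input
    return [round((v - lo) / scale) * scale + lo for v in values]


def sqnr_db(reference, approx):
    """Verifier 1: signal-to-quantization-noise ratio in dB (higher is better)."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)


def max_abs_error(reference, approx):
    """Verifier 2: worst-case absolute error (lower is better)."""
    return max(abs(r - a) for r, a in zip(reference, approx))


def rank_options(values, bit_widths=(4, 8, 16), schemes=(False, True)):
    """Try every (bit width, scheme) combination and rank by SQNR, best first."""
    results = []
    for bits, symmetric in itertools.product(bit_widths, schemes):
        approx = quantize(values, bits, symmetric)
        results.append({
            "bits": bits,
            "symmetric": symmetric,
            "sqnr_db": sqnr_db(values, approx),
            "max_abs_error": max_abs_error(values, approx),
        })
    return sorted(results, key=lambda r: r["sqnr_db"], reverse=True)


if __name__ == "__main__":
    random.seed(0)
    weights = [random.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in tensor
    for row in rank_options(weights):
        print(row)
```

A real tuner would score each option on end-to-end model accuracy rather than on a single tensor; the sketch only shows the sweep-and-rank structure.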

AI software workflow, tool highlighted: Performance Analyzer
A new QNN HTP perf profile exposes bottlenecks in network execution by showing an expanded analysis of the contribution of ops to execution timelines.
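The Performance Analyzer callout is about seeing how much each op contributes to the execution timeline. As a rough sketch of that kind of analysis, assuming a trace is available as `(op_type, duration_us)` pairs: the function `op_contributions`, the trace format, and the timing values below are hypothetical and are not the output format of any Qualcomm profiler.

```python
from collections import defaultdict


def op_contributions(events):
    """Aggregate per-op time and return (op_type, total_us, percent), largest first.

    `events` is an iterable of (op_type, duration_us) pairs (assumed format).
    """
    totals = defaultdict(float)
    for op_type, duration_us in events:
        totals[op_type] += duration_us
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(op, t, 100.0 * t / grand_total) for op, t in ranked]


if __name__ == "__main__":
    # Hypothetical timings, made up for illustration only.
    trace = [
        ("Conv2d", 420.0),
        ("Relu", 35.0),
        ("Conv2d", 380.0),
        ("Softmax", 65.0),
        ("Reshape", 100.0),
    ]
    for op, total_us, share in op_contributions(trace):
        print(f"{op:<10}{total_us:>8.1f} us{share:>7.1f}%")
```

Sorting by aggregate share puts the likely bottleneck at the top, which is usually the first thing to look at before applying optimization techniques.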

AI software workflow, tool highlighted: Hexagon Instrumentation Profiler
Provides insights into the system by collecting cycle counts, PMU counters & other metrics.

AI software workflow, tool highlighted: Hexagon VS Code
Improved IDE for debugging, profiling, and trace analysis.

AI software workflow, tool highlighted: QNN HTP Simulator
QEMU-based simulation environment for bit-accurate validation of execution accuracy.
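Bit-accurate validation, as mentioned for the QNN HTP Simulator, means two runs must produce identical output bits rather than merely close values. A minimal sketch of such a check follows, assuming both outputs are available as raw byte buffers; the function name `first_mismatch` and the sample buffers are hypothetical and do not reflect the simulator's actual interface.

```python
import struct


def first_mismatch(reference: bytes, candidate: bytes):
    """Return None if the buffers are bit-identical, else the first differing byte offset.

    If one buffer is a strict prefix of the other, the offset returned is the
    length of the shorter buffer.
    """
    for offset, (ref_byte, cand_byte) in enumerate(zip(reference, candidate)):
        if ref_byte != cand_byte:
            return offset
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return None


if __name__ == "__main__":
    # Hypothetical int8 outputs from a simulated run and an on-device run.
    simulated = struct.pack("<4b", 12, -7, 0, 55)
    on_device = struct.pack("<4b", 12, -7, 1, 55)
    offset = first_mismatch(simulated, on_device)
    if offset is None:
        print("outputs are bit-identical")
    else:
        print(f"first mismatch at byte offset {offset}")
```

An exact comparison like this is stricter than a tolerance-based check, which is the point of bit-accurate validation: any difference at all is reported.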

Take away
• Not all applications are built the same way; your software stack determines how well your application performs.
• To reach your application's full capacity, you need a software stack tailored specifically to what you are looking to accomplish.
• Different models require specific tools that only customizable stacks offer.

Resources
2022 Embedded Vision Summit
“Powering the Intelligent Connected Edge and the Future of On-Device AI”
Ziad Asghar, May 18, 9:30 - 10:00 AM PT
“Seamless Deployment of Multimedia and Machine Learning Applications at the Edge”
Megha Daga, May 17, 2:40 - 3:10 PM PT
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategies”
Ahmed Sadek, May 17, 2:05 - 2:35 PM PT
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon”
Judd Heape, May 18, 10:50 - 11:20 AM PT
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offerings”
Vinesh Sukumar, May 18, 12:00 - 12:30 PM PT
Qualcomm AI page:
https://www.qualcomm.com/invention/artificial-intelligence
Qualcomm AI Research:
https://www.qualcomm.com/invention/artificial-intelligence/ai-research?cmpid=fofyus193556&gclid=CjwKCAjw19z6BRAYEiwAmo64LfQjU8vqH8TxqKTM2PZQp8JibXrjev85wLfKFknJnS_b494yZ7e_WhoCPQkQAvD_BwE
Qualcomm Platform Solution Ecosystem:
https://www.qualcomm.com/support/qan/platform-solutions-ecosystem
GitHub AI Model Efficiency Toolkit (AIMET):
https://github.com/quic/aimet
Qualcomm Mobile AI page:
https://www.qualcomm.com/products/smartphones/mobile-ai
Qualcomm Mobile AI blog:
https://www.qualcomm.com/news/onq/2020/12/02/exploring-ai-capabilities-qualcomm-snapdragon-888-mobile-platform
Felix Baum, Director, Product Management
fbaum@qti.qualcomm.com

Thank you
