SlideShare a Scribd company logo
1 of 23
Download to read offline
© 2020 Texas Instruments
Making Edge AI Inference
Programming Easier and Flexible
Manisha Agrawal
Product Marketing Engineer
September 2020
© 2020 Texas Instruments
Agenda
2
Challenges with
edge inference
deployment
Open source
inference
framework
advancement
Easier and
flexible
programming on
TI Jacinto™ 7
processors
Programming
experience with
Jacinto 7
processors
compared to
desktop
© 2020 Texas Instruments
Deploying Deep Learning Model for Edge Inference (1/2)
3
Edge Inference
Training framework
Trained model Deploy
© 2020 Texas Instruments
Deploying Deep Learning Model for Edge Inference(2/2)
4
Training
framework
Trained model
Proprietary
inference
framework
Optimizer
Embedded
hardware
Learn
Custom tools
Runtime engine
Vendor A
Vendor B
Vendor C
Vendor D
Vendor E
API | Interpreter | Scheduler Kernel library
© 2020 Texas Instruments
Edge Inference Programming Challenges
5
Trained model
Inference
framework
Embedded
hardware
Challenges
• Tools not easy
to use
• Unsupported
operators
• Accuracy goals not
met
• Performance
not met
• Power goals
not met
Software
Hardware
Optimizer / Compiler
Runtime engine
API | Interpreter | Scheduler Kernel library
Training
framework
© 2020 Texas Instruments
Open Source Inference Framework Advancement
6
Trained model
Inference
framework
Embedded
hardware
Open source inference
framework
Compiler / Optimizer
Runtime engine
API | Interpreter | Scheduler Kernel library
General purpose Custom
ArmNN
Training
framework
© 2020 Texas Instruments
Promising Open Source Framework Meeting Majority
Customers Need
7
RunTime engineTraining framework
Optimizer / Compiler /
Graph format
TensorFLow Lite format
ONNX format
TVM / Amazon SageMaker Neo
compiler
TFLite RunTime
ONNX-RunTime
Neo-AI-DLR
© 2020 Texas Instruments
Open Source TFLite RunTime
8
OpenSource RunTime API
TFLite Kernel library
General purpose hardware
Model artifacts
TFLite RunTime TFLIte RunTime
• Supporting all inference operators on CPU /
GPU
© 2020 Texas Instruments
Open Source ONNX RunTime
9
OpenSource RunTime API
ONNX Kernel library
General purpose hardware
Model artifacts
ONNX RunTime ONNX RunTime
• Supporting all inference operators on CPU /
GPU
© 2020 Texas Instruments
Open Source Inference Framework with Hooks for
Specialized Hardware
10
OpenSource RunTime API
Backend
Kernel library / Generated code
General purpose hardware
Model artifacts
RunTime
Kernel library
Specialized
hardware
Compiler & RunTime
• Supporting all inference operators on CPU /
GPU
• Backend for specialized hardware
Embedded SoC
© 2020 Texas Instruments
TI’s First JacintoTM 7 SoC for Edge Inference
Accelerating key functions lowers
power
• DSP for computer vision
• Vision processing
• Video, graphics
• Deep learning
Industry’s most efficient DL
architecture
• Enables passively-cooling
designs
• 90% utilization of deep
learning accelerator due to
smart memory system
Automotive Quality-ready
process technology
• Power reduction is achieved
through smart architecture,
not process
https://www.ti.com/product/TDA4VM
High speed interconnect 16nm FF
ASIL D / SIL 3
C7x DSP
32k/48K L1
512KB L2
ASIL B / SIL2
Powerisolation
Security acceleration
Crypto: AES, 3DES, SHA, PKA, RNG
Encode
Decode
Video acceleration Encode
Decode
Ethernet switch
Up to 8 Ports
ETHERNE
T
Audio acceleration
HD ATLASRC
Vision acceleration
Vision ISP
w/ LDC
Dense
Optical
Flow
Stereo
Disparity
Estimation
Arm
Cortex A7x
48k/32K each
Arm
Cortex A7x
48k/32K each
1M shared L2
Rogue 8XE GPU
Display sub-system
1x DP/eDP, 1x DSI
Safety MCU
DMSC
Device Management &
Security Controller
Hardware Diagnostics
CRC
RTI
ESM
DCC
BIST
Arm
Cortex R5F
32K/32K L1
64KB RAM
Arm
Cortex R5F
32K/32K L1
64KB RAM
LockStep
1MB SRAM
2x ADC
3x I2C* 2x UART*
3x SPI* 2x I3C*
UDMA GPIO
2x
RGMII/RMII
Capture sub-system
2x CSI2 x 4 DL
GPIO
IPC
IOMMU UDMA
SMMU Debug Timers
WDT
System services
Hardware
Diagnostics
Connectivity Network
4x PCIe 14x
6 Pin 4096FS
Serial
1x I3C1x SDIO
8x McSPI10x
UART
12x
McASP
7x I2C
MMA32k/32K
L1
288KB L2
C66xDSP
MMA
-+ * =
32k/32K L1
288KB L2
C66xDSP
MMA-+ * =
8 MB L3 RAM w/ECC and Coherency
011100
100010
001111
32b LPDDR4/4x-3732 +ECC
Arm
Cortex R5F
16K/16K L1
64KB RAM
Arm
Cortex R5F
16K/16K L1
64KB RAM
LockStep
Arm
Cortex R5F
16K/16K L1
64KB RAM
Arm
Cortex R5F
16K/16K L1
64KB RAM
LockStep
0.5 MB SRAM
Storage
1x UFS
2.x 1x SD 3.0
1x eMMC5.x 2x USB
© 2020 Texas Instruments
TI Deep Learning (TIDL) Inference Framework
12
NRE-free, royalty-free tools enable high-performance, fixed-point inference on TI processors
Optimizer
TIDL Optimizer
• Post training quantization for 8-bit and
16-bit
• Range calibration
• Hardware independent optimizations
• Hardware specific optimizations
TI SoC
RunTime
TIDL RunTime
• Runtime API
• Deep learning library
TI SoC
© 2020 Texas Instruments
TIDL Accelerate All Operators You Rely On
13
Refer to latest Processor SDK user guide document for complete list of accelerated operators and tested models:
https://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/latest/exports/docs/tidl_j7_01_02_00_09/ti_dl/docs/user_guide_html/md_tidl_layers_info.html
TIDL features
• Accelerates all operators commonly used by CNN vision
models on our deep learning accelerator cores
• Out-of-box support for 35+ pre-trained CNN models
• List continues to grow
Popular operators supported include
• Convolution
• Pooling
• Element wise
• Inner-product
• Soft-max
• Bias add
• Concatenate
• Scale
• Batch
normalization
• Re-size
• Arg-max
• Slice
• Crop
• Flatten
• Shuffle channel
• Detection
output
• Deconvolution/
Transpose
convolution
© 2020 Texas Instruments
Texas Instruments adopting open source framework with
TI Deep Learning (TIDL) Integration
14
OpenSource RunTime API
TIDL RunTime
RunTime Open source adoption
• Integrating TIDL RunTime in open source
RunTime engine
Kernel library / Generated code
Cortex-A72
TIDL library
Deep learning
accelerator
Model artifacts
Neo-AI-DLR
Jacinto 7 processor
© 2020 Texas Instruments
Texas Instruments adopting open source framework with
TI Deep Learning (TIDL) Integration
15
OpenSource Compiler tools
High level optimizations | Graph partition
TIDL Backend
TVM Compiler Open source adoption: TVM Compiler
• Integrating TI Optimizer in open source TVM
Compiler tool
Model artifacts
Code Generation
General purpose
hardware
TIDL Optimizer
tools
TI DL hardware
Low level
optimizations
© 2020 Texas Instruments
TIDL RunTime
Now Accelerate All Your Models With Open Source API
16
• Inference latency in <5 ms for all popular
classification models
• Minimal accuracy loss with Post Training
Quantization and Quantization Aware Training
tools from TI
TensorFLow Lite
models
ONNX models
TVM/Amazon SageMaker
Neo compiler
TFLite RunTime
ONNX-RunTime
Neo-AI-DLR
User Application
Python / C / C++
API | interpreter | scheduler
TFLite/ONNX-RT/Neo-AI-DLR
Jacinto 7 processor
Deep learning
accelerator
-+ * =
C7x DSP with MMA*
ARM
Cortex A72
ARM
Cortex A72
IPC
*MMA: Matrix Multiplication Accelerator (Tensor Processing Unit)
Linux OS
CPU
Run any model from open source inference
frameworks on TI processors!
© 2020 Texas Instruments
Seamless Migration From PC Validation to Inference
Deployment
17
Your DL programming experience on Jacinto 7 processors
is the same as desktop computer programming.
This includes -
• Open source LINUX callable APIs in Python / C / C++
between PC and target board
• Jupyter Notebook examples
Download directly from ti.com
http://www.ti.com/tool/PROCESSOR-SDK-DRA8X-TDA4X
User Application
Python / C / C++
API | interpreter | scheduler
TFLite/ONNX-RT/Neo-AI-DLR
Jacinto 7 processor
ARM
Cortex A72
ARM
Cortex A72
Linux OS
CPU
© 2020 Texas Instruments
Programming Example: Amazon SageMaker Neo &
Neo-AI-DLR
18
Model compilation RunTime
Output
Run inference
Create RunTime
SageMaker Neo console
Provide model
information
Select TDA4VM as target
device
Compile the model
TDA4VM
Provide output location
© 2020 Texas Instruments
Programming Example: TensorFlow Lite
Create RunTime
Run inference
Get output
Init time: Hook TIDL backend
RunTime
© 2020 Texas Instruments
Deep Learning System Performance on Jacinto-7
TDA4VM: Example demo
20
5 simultaneous deep learning algorithms on 3x 1MP camera each @ ~16 fps
• Parking spot detection
• Vehicle detection
• Semantic segmentation
• Motion segmentation
• Depth estimation
Camera 1 Camera 2 Camera 3 Camera 1 Camera 2 Camera 3
Resource loading
A72: 6% C7x+MMA: 94%
DDR BW: 26%
Vehicle detection
Parking spot
detection
Semantic
segmentation
Dense Optical Flow
Depth Estimation
Motion
Segmentation
Inference resolution: 3x768x384
© 2020 Texas Instruments
Key Takeaways: Deep Learning Edge Inference at TI
21
• Easier to use
• Opensource Linux callable RunTime APIs supported to program SoC
• Embedded development environment same as a desktop computer environment
• More flexible
• Supports TFLite, ONNX-RT or TVM/Neo-AI-DLR
• Supports compilation at the edge or in the cloud with Amazon SageMaker Neo
• Provide wide model coverage
• All TFLite, ONNX, TVM and SageMaker Neo models supported on TI SoCs
• High compute performance, high throughput, low latency and low power consumption
• Accelerates the model on TI’s deep learning accelerator C7x+MMA to provide the best combination of system
power, deep learning performance and latency at the edge
Jacinto 7 processor silicon is available today!
© 2020 Texas Instruments
Explore Deep Learning With TI Today
22
Full development
Turn-key designs
Software
development kits
Support
TDA4 EVM
http://www.ti.com/tool/TDA4VMXEVM
Automotive version of TDA4V Mid
http://www.ti.com/tool/D3-3P-TDAX-DK
TI Processor SDK – Seamlessly reuse and migrate Linux,
Linux-RT and TI-RTOS software across TI processors
http://www.ti.com/tool/PROCESSOR-SDK-DRA8X-TDA4X
https://e2e.ti.com
Open source RunTime engines with TIDL acceleration is packaged as part of TI’s NRE-free, royalty-free Processor
SDK!
© 2020 Texas Instruments
Thank You!
23

More Related Content

What's hot

Embedded Operating System - Linux
Embedded Operating System - LinuxEmbedded Operating System - Linux
Embedded Operating System - Linux
Emertxe Information Technologies Pvt Ltd
 
U boot porting guide for SoC
U boot porting guide for SoCU boot porting guide for SoC
U boot porting guide for SoC
Macpaul Lin
 

What's hot (20)

Embedded Operating System - Linux
Embedded Operating System - LinuxEmbedded Operating System - Linux
Embedded Operating System - Linux
 
U-Boot - An universal bootloader
U-Boot - An universal bootloader U-Boot - An universal bootloader
U-Boot - An universal bootloader
 
U boot porting guide for SoC
U boot porting guide for SoCU boot porting guide for SoC
U boot porting guide for SoC
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017
 
Linux device drivers
Linux device drivers Linux device drivers
Linux device drivers
 
Linux Internals - Part II
Linux Internals - Part IILinux Internals - Part II
Linux Internals - Part II
 
PCI express
PCI expressPCI express
PCI express
 
Embedded Android : System Development - Part II (Linux device drivers)
Embedded Android : System Development - Part II (Linux device drivers)Embedded Android : System Development - Part II (Linux device drivers)
Embedded Android : System Development - Part II (Linux device drivers)
 
Las16 200 - firmware summit - ras what is it- why do we need it
Las16 200 - firmware summit - ras what is it- why do we need itLas16 200 - firmware summit - ras what is it- why do we need it
Las16 200 - firmware summit - ras what is it- why do we need it
 
OpenWrt From Top to Bottom
OpenWrt From Top to BottomOpenWrt From Top to Bottom
OpenWrt From Top to Bottom
 
Embedded Linux Kernel - Build your custom kernel
Embedded Linux Kernel - Build your custom kernelEmbedded Linux Kernel - Build your custom kernel
Embedded Linux Kernel - Build your custom kernel
 
RISC-V software state of the union
RISC-V software state of the unionRISC-V software state of the union
RISC-V software state of the union
 
Secure container: Kata container and gVisor
Secure container: Kata container and gVisorSecure container: Kata container and gVisor
Secure container: Kata container and gVisor
 
Introduction to Docker
Introduction to DockerIntroduction to Docker
Introduction to Docker
 
Docker Architecture (v1.3)
Docker Architecture (v1.3)Docker Architecture (v1.3)
Docker Architecture (v1.3)
 
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
 
Study on Android Emulator
Study on Android EmulatorStudy on Android Emulator
Study on Android Emulator
 
Linux Internals - Part I
Linux Internals - Part ILinux Internals - Part I
Linux Internals - Part I
 
Containers technologies
Containers technologiesContainers technologies
Containers technologies
 
Diagnostic in Adaptive AUTOSAR
Diagnostic in Adaptive AUTOSARDiagnostic in Adaptive AUTOSAR
Diagnostic in Adaptive AUTOSAR
 

Similar to “Making Edge AI Inference Programming Easier and Flexible,” a Presentation from Texas Instruments

“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
Edge AI and Vision Alliance
 
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...
Edge AI and Vision Alliance
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
Linaro
 

Similar to “Making Edge AI Inference Programming Easier and Flexible,” a Presentation from Texas Instruments (20)

Harnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceHarnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligence
 
Fixed-point Multi-Core DSP Platform
Fixed-point Multi-Core DSP PlatformFixed-point Multi-Core DSP Platform
Fixed-point Multi-Core DSP Platform
 
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
 
Innovation with ai at scale on the edge vt sept 2019 v0
Innovation with ai at scale  on the edge vt sept 2019 v0Innovation with ai at scale  on the edge vt sept 2019 v0
Innovation with ai at scale on the edge vt sept 2019 v0
 
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoWebinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
 
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
 
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based MultiprocessingArm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
 
DATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe ConferenceDATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe Conference
 
HiPEAC Computing Systems Week 2022_Mario Porrmann presentation
HiPEAC Computing Systems Week 2022_Mario Porrmann presentationHiPEAC Computing Systems Week 2022_Mario Porrmann presentation
HiPEAC Computing Systems Week 2022_Mario Porrmann presentation
 
Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to Cloud
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Achieving AI @scale on Mobile Devices
Achieving AI @scale on Mobile DevicesAchieving AI @scale on Mobile Devices
Achieving AI @scale on Mobile Devices
 
Introduction to the new MediaTek LinkIt™ Development Platform for RTOS
Introduction to the new MediaTek LinkIt™ Development Platform for RTOSIntroduction to the new MediaTek LinkIt™ Development Platform for RTOS
Introduction to the new MediaTek LinkIt™ Development Platform for RTOS
 
Hai Tao at AI Frontiers: Deep Learning For Embedded Vision System
Hai Tao at AI Frontiers: Deep Learning For Embedded Vision SystemHai Tao at AI Frontiers: Deep Learning For Embedded Vision System
Hai Tao at AI Frontiers: Deep Learning For Embedded Vision System
 
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...
“The Future of AI is Here Today: Deep Dive into Qualcomm’s On-Device AI Offer...
 
Linxu conj2016 96boards
Linxu conj2016 96boardsLinxu conj2016 96boards
Linxu conj2016 96boards
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
 
High Performance DSP with Xilinx All Programmable Devices (Design Conference ...
High Performance DSP with Xilinx All Programmable Devices (Design Conference ...High Performance DSP with Xilinx All Programmable Devices (Design Conference ...
High Performance DSP with Xilinx All Programmable Devices (Design Conference ...
 
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
Edge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Recently uploaded (20)

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern Enterprise
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Choreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringChoreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software Engineering
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 

“Making Edge AI Inference Programming Easier and Flexible,” a Presentation from Texas Instruments

  • 1. © 2020 Texas Instruments Making Edge AI Inference Programming Easier and Flexible Manisha Agrawal Product Marketing Engineer September 2020
  • 2. © 2020 Texas Instruments Agenda 2 Challenges with edge inference deployment Open source inference framework advancement Easier and flexible programming on TI Jacinto™ 7 processors Programming experience with Jacinto 7 processors compared to desktop
  • 3. © 2020 Texas Instruments Deploying Deep Learning Model for Edge Inference (1/2) 3 Edge Inference Training framework Trained model Deploy
  • 4. © 2020 Texas Instruments Deploying Deep Learning Model for Edge Inference(2/2) 4 Training framework Trained model Proprietary inference framework Optimizer Embedded hardware Learn Custom tools Runtime engine Vendor A Vendor B Vendor C Vendor D Vendor E API | Interpreter | Scheduler Kernel library
  • 5. © 2020 Texas Instruments Edge Inference Programming Challenges 5 Trained model Inference framework Embedded hardware Challenges • Tools not easy to use • Unsupported operators • Accuracy goals not met • Performance not met • Power goals not met Software Hardware Optimizer / Compiler Runtime engine API | Interpreter | Scheduler Kernel library Training framework
  • 6. © 2020 Texas Instruments Open Source Inference Framework Advancement 6 Trained model Inference framework Embedded hardware Open source inference framework Compiler / Optimizer Runtime engine API | Interpreter | Scheduler Kernel library General purpose Custom ArmNN Training framework
  • 7. © 2020 Texas Instruments Promising Open Source Framework Meeting Majority Customers Need 7 RunTime engineTraining framework Optimizer / Compiler / Graph format TensorFLow Lite format ONNX format TVM / Amazon SageMaker Neo compiler TFLite RunTime ONNX-RunTime Neo-AI-DLR
  • 8. © 2020 Texas Instruments Open Source TFLite RunTime 8 OpenSource RunTime API TFLite Kernel library General purpose hardware Model artifacts TFLite RunTime TFLIte RunTime • Supporting all inference operators on CPU / GPU
  • 9. © 2020 Texas Instruments Open Source ONNX RunTime 9 OpenSource RunTime API ONNX Kernel library General purpose hardware Model artifacts ONNX RunTime ONNX RunTime • Supporting all inference operators on CPU / GPU
  • 10. © 2020 Texas Instruments Open Source Inference Framework with Hooks for Specialized Hardware 10 OpenSource RunTime API Backend Kernel library / Generated code General purpose hardware Model artifacts RunTime Kernel library Specialized hardware Compiler & RunTime • Supporting all inference operators on CPU / GPU • Backend for specialized hardware Embedded SoC
  • 11. © 2020 Texas Instruments TI’s First JacintoTM 7 SoC for Edge Inference Accelerating key functions lowers power • DSP for computer vision • Vision processing • Video, graphics • Deep learning Industry’s most efficient DL architecture • Enables passively-cooling designs • 90% utilization of deep learning accelerator due to smart memory system Automotive Quality-ready process technology • Power reduction is achieved through smart architecture, not process https://www.ti.com/product/TDA4VM High speed interconnect 16nm FF ASIL D / SIL 3 C7x DSP 32k/48K L1 512KB L2 ASIL B / SIL2 Powerisolation Security acceleration Crypto: AES, 3DES, SHA, PKA, RNG Encode Decode Video acceleration Encode Decode Ethernet switch Up to 8 Ports ETHERNE T Audio acceleration HD ATLASRC Vision acceleration Vision ISP w/ LDC Dense Optical Flow Stereo Disparity Estimation Arm Cortex A7x 48k/32K each Arm Cortex A7x 48k/32K each 1M shared L2 Rogue 8XE GPU Display sub-system 1x DP/eDP, 1x DSI Safety MCU DMSC Device Management & Security Controller Hardware Diagnostics CRC RTI ESM DCC BIST Arm Cortex R5F 32K/32K L1 64KB RAM Arm Cortex R5F 32K/32K L1 64KB RAM LockStep 1MB SRAM 2x ADC 3x I2C* 2x UART* 3x SPI* 2x I3C* UDMA GPIO 2x RGMII/RMII Capture sub-system 2x CSI2 x 4 DL GPIO IPC IOMMU UDMA SMMU Debug Timers WDT System services Hardware Diagnostics Connectivity Network 4x PCIe 14x 6 Pin 4096FS Serial 1x I3C1x SDIO 8x McSPI10x UART 12x McASP 7x I2C MMA32k/32K L1 288KB L2 C66xDSP MMA -+ * = 32k/32K L1 288KB L2 C66xDSP MMA-+ * = 8 MB L3 RAM w/ECC and Coherency 011100 100010 001111 32b LPDDR4/4x-3732 +ECC Arm Cortex R5F 16K/16K L1 64KB RAM Arm Cortex R5F 16K/16K L1 64KB RAM LockStep Arm Cortex R5F 16K/16K L1 64KB RAM Arm Cortex R5F 16K/16K L1 64KB RAM LockStep 0.5 MB SRAM Storage 1x UFS 2.x 1x SD 3.0 1x eMMC5.x 2x USB
  • 12. © 2020 Texas Instruments TI Deep Learning (TIDL) Inference Framework 12 NRE-free, royalty-free tools enable high-performance, fixed-point inference on TI processors Optimizer TIDL Optimizer • Post training quantization for 8-bit and 16-bit • Range calibration • Hardware independent optimizations • Hardware specific optimizations TI SoC RunTime TIDL RunTime • Runtime API • Deep learning library TI SoC
  • 13. © 2020 Texas Instruments TIDL Accelerate All Operators You Rely On 13 Refer to latest Processor SDK user guide document for complete list of accelerated operators and tested models: https://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/latest/exports/docs/tidl_j7_01_02_00_09/ti_dl/docs/user_guide_html/md_tidl_layers_info.html TIDL features • Accelerates all operators commonly used by CNN vision models on our deep learning accelerator cores • Out-of-box support for 35+ pre-trained CNN models • List continues to grow Popular operators supported include • Convolution • Pooling • Element wise • Inner-product • Soft-max • Bias add • Concatenate • Scale • Batch normalization • Re-size • Arg-max • Slice • Crop • Flatten • Shuffle channel • Detection output • Deconvolution/ Transpose convolution
  • 14. © 2020 Texas Instruments Texas Instruments adopting open source framework with TI Deep Learning (TIDL) Integration 14 OpenSource RunTime API TIDL RunTime RunTime Open source adoption • Integrating TIDL RunTime in open source RunTime engine Kernel library / Generated code Cortex-A72 TIDL library Deep learning accelerator Model artifacts Neo-AI-DLR Jacinto 7 processor
  • 15. © 2020 Texas Instruments Texas Instruments adopting open source framework with TI Deep Learning (TIDL) Integration 15 OpenSource Compiler tools High level optimizations | Graph partition TIDL Backend TVM Compiler Open source adoption: TVM Compiler • Integrating TI Optimizer in open source TVM Compiler tool Model artifacts Code Generation General purpose hardware TIDL Optimizer tools TI DL hardware Low level optimizations
  • 16. © 2020 Texas Instruments TIDL RunTime Now Accelerate All Your Models With Open Source API 16 • Inference latency in <5 ms for all popular classification models • Minimal accuracy loss with Post Training Quantization and Quantization Aware Training tools from TI TensorFLow Lite models ONNX models TVM/Amazon SageMaker Neo compiler TFLite RunTime ONNX-RunTime Neo-AI-DLR User Application Python / C / C++ API | interpreter | scheduler TFLite/ONNX-RT/Neo-AI-DLR Jacinto 7 processor Deep learning accelerator -+ * = C7x DSP with MMA* ARM Cortex A72 ARM Cortex A72 IPC *MMA: Matrix Multiplication Accelerator (Tensor Processing Unit) Linux OS CPU Run any model from open source inference frameworks on TI processors!
  • 17. © 2020 Texas Instruments Seamless Migration From PC Validation to Inference Deployment 17 Your DL programming experience on Jacinto 7 processors is the same as desktop computer programming. This includes - • Open source LINUX callable APIs in Python / C / C++ between PC and target board • Jupyter Notebook examples Download directly from ti.com http://www.ti.com/tool/PROCESSOR-SDK-DRA8X-TDA4X User Application Python / C / C++ API | interpreter | scheduler TFLite/ONNX-RT/Neo-AI-DLR Jacinto 7 processor ARM Cortex A72 ARM Cortex A72 Linux OS CPU
  • 18. © 2020 Texas Instruments Programming Example: Amazon SageMaker Neo & Neo-AI-DLR 18 Model compilation RunTime Output Run inference Create RunTime SageMaker Neo console Provide model information Select TDA4VM as target device Compile the model TDA4VM Provide output location
  • 19. © 2020 Texas Instruments Programming Example: TensorFlow Lite Create RunTime Run inference Get output Init time: Hook TIDL backend RunTime
  • 20. © 2020 Texas Instruments Deep Learning System Performance on Jacinto-7 TDA4VM: Example demo 20 5 simultaneous deep learning algorithms on 3x 1MP camera each @ ~16 fps • Parking spot detection • Vehicle detection • Semantic segmentation • Motion segmentation • Depth estimation Camera 1 Camera 2 Camera 3 Camera 1 Camera 2 Camera 3 Resource loading A72: 6% C7x+MMA: 94% DDR BW: 26% Vehicle detection Parking spot detection Semantic segmentation Dense Optical Flow Depth Estimation Motion Segmentation Inference resolution: 3x768x384
  • 21. © 2020 Texas Instruments Key Takeaways: Deep Learning Edge Inference at TI 21 • Easier to use • Opensource Linux callable RunTime APIs supported to program SoC • Embedded development environment same as a desktop computer environment • More flexible • Supports TFLite, ONNX-RT or TVM/Neo-AI-DLR • Supports compilation at the edge or in the cloud with Amazon SageMaker Neo • Provide wide model coverage • All TFLite, ONNX, TVM and SageMaker Neo models supported on TI SoCs • High compute performance, high throughput, low latency and low power consumption • Accelerates the model on TI’s deep learning accelerator C7x+MMA to provide the best combination of system power, deep learning performance and latency at the edge Jacinto 7 processor silicon is available today!
  • 22. © 2020 Texas Instruments Explore Deep Learning With TI Today 22 Full development Turn-key designs Software development kits Support TDA4 EVM http://www.ti.com/tool/TDA4VMXEVM Automotive version of TDA4V Mid http://www.ti.com/tool/D3-3P-TDAX-DK TI Processor SDK – Seamlessly reuse and migrate Linux, Linux-RT and TI-RTOS software across TI processors http://www.ti.com/tool/PROCESSOR-SDK-DRA8X-TDA4X https://e2e.ti.com Open source RunTime engines with TIDL acceleration is packaged as part of TI’s NRE-free, royalty-free Processor SDK!
  • 23. © 2020 Texas Instruments Thank You! 23