SlideShare a Scribd company logo
Deep Learning on
Everyday Devices
Amir Alush, CTO & co founder, IMVC 2018
Deep Learning “Cycle”
untrained model
new data
(test)
training
data
trained model
car
car pedestrian
Yes No
car
TRAINING INFERENCE
Inference on Everyday devices
8-Core ARM Qualcomm 6-core
Snapdragon 820 TBD
ARM
x86
MIPS
Power
other
Embedded processors market share (Source: AMD 2016)
Inference on the Edge
Motivation
● Low Latency
● Keep Privacy
● Small/no Bandwidth
● Utilizing Existing HW
Challenges
● NN High complexity
● Low HW resources
● Complex porting
The Deep Learning Stack
HARDWARE
GPU, CPU, TPU, FPGA, DSP, ASIC
Primitives Libraries
BLAS*, NNPACK,CUDNN
Frameworks
TF, CAFFE, PyTorch, MXNet
Algorithms
NN Architectures, Meta-Architectures
Engines
TensorRT, Core ML, SNPE
It’s the algorithms!
Efficient algorithms:
1. More AI on any processor
2. Critical for everyday devices with low power processors
Speed-accuracy tradeoff
Running off-the-shelf DL algorithms on the edge
requires sacrificing accuracy. The tradeoffs:
1. Reduce input resolution → reduce accuracy
2. Reduce model size (backbone) → reduce accuracy
3. Reduce model bit precision → reduce accuracy?
4. Less accurate algorithm
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Application”, Howard et al. 2017
nverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation”, Sandler et al. 2018
mAP GPU msecs
faster-rcnn + resnet 101 + high resolution 35.6 140
ssd + mobilenetV1 + low resolution 18.8 50
Reduce network size (pruning)
● Should be structured, non structured is usually not HW supported
● Requires re-training
● How effective on already small models?
● Can hurt accuracy!
“Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks”, Mittal 2018
Reduce model bit precision (quantization)
● Reduce from 32 bits to 8/4/2/1 bits ?
● Activations & Weights are within a narrow range
● Networks are robust to small changes
● 8 bits:
○ Need to be supported in processors instructions
○ Reduces DRAM bandwidth (more in the SRAM)
○ Supported natively by various engines
● Below 8 bits:
○ Accuracy drops
○ Not supported in existing HW
How to deploy your models?
Training Frameworks Inference Engines
● Research flexibility:
○ Fast POC from idea to results
○ Easy to extended: new data loaders, layers
(loss, operations)
○ Easy debugging
● Good training speed (+ parallelization)
○ Fast research iterations
● Large active community:
○ Explore new research ideas fast
● Personal flavour
● Portability should not a factor
● Optimized for latency (forward pass)
● Should be efficient on a specific hardware
○ Takes advantage of specific hw
instructions
● Little / no overhead
● Efficient working memory
● Small code size
● Support all of your NN layers
Training frameworks vs Inference Engines
Build your own Inference Environment?
1. Use the engines (TensorRT, SNPE...):
○ Faster to production (if the conversion works out of the box)
○ Current generation has some limitations and minimal tech support
○ Not as mature as the training frameworks or DL primitives libraries
○ Not all NN capabilities are supported
2. Write your own inference engine:
○ Slower to production, but, with low risk
○ Use the DL primitives libraries (CuDNN, BLAS, NNPACK..) they are more matures than the
engines and they are optimized!
○ You’ll need to write your own logic on top!
○ You’re in control and can adjust to your needs
Brodmann17 R&D workflow
DeployData Research
Brodmann17 Use Case - IoT
● MWC 2018 recent cooperation with &
● Embedded World 2018 with
● CES 2018 with
Brodmann17 Use Case - Smart City
* 1 CPU core
Brodmann17 Use Case - Adas
Benchmarking on the KITTI dataset car detection
* 1 CPU core
Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite, Geiger et al, 2012
1. motivation and challenges in NN edge processing
2. Speed accuracy tradeoffs made today
3. Brodmann17 key design principles for keeping efficiency and accuracy
on the edge
4. key considerations when choosing your inference environment
5. Brodmann17 example use cases
Summary

More Related Content

What's hot

Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...AI Frontiers
 
Presentation
PresentationPresentation
Presentationbutest
 
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori..."Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...Edge AI and Vision Alliance
 
IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...
IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...
IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...Ryft
 
How Do I Understand Deep Learning Performance?
How Do I Understand Deep Learning Performance?How Do I Understand Deep Learning Performance?
How Do I Understand Deep Learning Performance?NVIDIA
 
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from SamsungEdge AI and Vision Alliance
 
Machine Learning with Scala
Machine Learning with ScalaMachine Learning with Scala
Machine Learning with ScalaSusan Eraly
 
Vertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part IIVertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part IIVertex Holdings
 
Affordable AI Connects To A Better Life
Affordable AI Connects To A Better LifeAffordable AI Connects To A Better Life
Affordable AI Connects To A Better LifeNVIDIA Taiwan
 
"Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"...
"Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"..."Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"...
"Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"...Edge AI and Vision Alliance
 
HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...
HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...
HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...LEGATO project
 
"How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M..."How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M...Edge AI and Vision Alliance
 
Google Devfest kolkata'19
Google Devfest kolkata'19Google Devfest kolkata'19
Google Devfest kolkata'19Akshay Bahadur
 
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co..."New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...Edge AI and Vision Alliance
 
Deep learning for smart manufacturing
Deep learning for smart manufacturingDeep learning for smart manufacturing
Deep learning for smart manufacturingSunil Kumar Pradhan
 
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...Edge AI and Vision Alliance
 
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart..."Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...Edge AI and Vision Alliance
 
Tiny intelligent computers and sensors - Open Hardware Event 2020
Tiny intelligent computers and sensors - Open Hardware Event 2020Tiny intelligent computers and sensors - Open Hardware Event 2020
Tiny intelligent computers and sensors - Open Hardware Event 2020Jan Jongboom
 

What's hot (20)

Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
 
Presentation
PresentationPresentation
Presentation
 
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori..."Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
 
IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...
IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...
IoT Slam Keynote: Harnessing the Flood of Data with Heterogeneous Computing a...
 
How Do I Understand Deep Learning Performance?
How Do I Understand Deep Learning Performance?How Do I Understand Deep Learning Performance?
How Do I Understand Deep Learning Performance?
 
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
 
Machine Learning with Scala
Machine Learning with ScalaMachine Learning with Scala
Machine Learning with Scala
 
Original
OriginalOriginal
Original
 
Vertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part IIVertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part II
 
Affordable AI Connects To A Better Life
Affordable AI Connects To A Better LifeAffordable AI Connects To A Better Life
Affordable AI Connects To A Better Life
 
"Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"...
"Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"..."Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"...
"Emerging Processor Architectures for Deep Learning: Options and Trade-offs,"...
 
HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...
HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...
HiPEAC 2020: Energy-aware Task Scheduling in LEGaTO: Low Energy Toolset for H...
 
"How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M..."How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M...
 
On-Device AI
On-Device AIOn-Device AI
On-Device AI
 
Google Devfest kolkata'19
Google Devfest kolkata'19Google Devfest kolkata'19
Google Devfest kolkata'19
 
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co..."New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
 
Deep learning for smart manufacturing
Deep learning for smart manufacturingDeep learning for smart manufacturing
Deep learning for smart manufacturing
 
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
 
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart..."Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
 
Tiny intelligent computers and sensors - Open Hardware Event 2020
Tiny intelligent computers and sensors - Open Hardware Event 2020Tiny intelligent computers and sensors - Open Hardware Event 2020
Tiny intelligent computers and sensors - Open Hardware Event 2020
 

Similar to Deep Learning on Everyday Devices

“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...Edge AI and Vision Alliance
 
Deploying Pretrained Model In Edge IoT Devices.pdf
Deploying Pretrained Model In Edge IoT Devices.pdfDeploying Pretrained Model In Edge IoT Devices.pdf
Deploying Pretrained Model In Edge IoT Devices.pdfObject Automation
 
Deep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devicesDeep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devicesNumenta
 
Technologies comparison: Genuino 101 vs uTensor
Technologies comparison: Genuino 101 vs uTensor Technologies comparison: Genuino 101 vs uTensor
Technologies comparison: Genuino 101 vs uTensor AndreaNapoletani
 
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...Codemotion
 
HiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentationHiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentationVEDLIoT Project
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformGetInData
 
How AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changesHow AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changesDanny Sabour
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Community
 
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...Edge AI and Vision Alliance
 
How I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "S
How I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "SHow I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "S
How I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "SBrandon Liu
 
Mauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscteMauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-isctembreternitz
 
Ibm pure data system for analytics n200x
Ibm pure data system for analytics n200xIbm pure data system for analytics n200x
Ibm pure data system for analytics n200xIBM Sverige
 
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...Edge AI and Vision Alliance
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed_Hat_Storage
 
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...StampedeCon
 
Chiplets in Data Centers
Chiplets in Data CentersChiplets in Data Centers
Chiplets in Data CentersODSA Workgroup
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataDESMOND YUEN
 
Infrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningInfrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningSergey Karayev
 

Similar to Deep Learning on Everyday Devices (20)

“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
 
Deploying Pretrained Model In Edge IoT Devices.pdf
Deploying Pretrained Model In Edge IoT Devices.pdfDeploying Pretrained Model In Edge IoT Devices.pdf
Deploying Pretrained Model In Edge IoT Devices.pdf
 
Deep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devicesDeep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devices
 
Technologies comparison: Genuino 101 vs uTensor
Technologies comparison: Genuino 101 vs uTensor Technologies comparison: Genuino 101 vs uTensor
Technologies comparison: Genuino 101 vs uTensor
 
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
 
HiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentationHiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentation
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
 
How AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changesHow AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changes
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
 
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...
“Autonomous Driving AI Workloads: Technology Trends and Optimization Strategi...
 
How I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "S
How I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "SHow I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "S
How I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "S
 
Mauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscteMauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscte
 
Ibm pure data system for analytics n200x
Ibm pure data system for analytics n200xIbm pure data system for analytics n200x
Ibm pure data system for analytics n200x
 
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
 
Chiplets in Data Centers
Chiplets in Data CentersChiplets in Data Centers
Chiplets in Data Centers
 
Summit workshop thompto
Summit workshop thomptoSummit workshop thompto
Summit workshop thompto
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big Data
 
Infrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningInfrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep Learning
 

More from Brodmann17

5 Practical Steps to a Successful Deep Learning Research
5 Practical Steps to a Successful  Deep Learning Research5 Practical Steps to a Successful  Deep Learning Research
5 Practical Steps to a Successful Deep Learning ResearchBrodmann17
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methodsBrodmann17
 
Fast methods for deep learning based object detection
Fast methods for deep learning based object detectionFast methods for deep learning based object detection
Fast methods for deep learning based object detectionBrodmann17
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basicsBrodmann17
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detectionBrodmann17
 
DLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningDLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningBrodmann17
 
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17
 

More from Brodmann17 (8)

5 Practical Steps to a Successful Deep Learning Research
5 Practical Steps to a Successful  Deep Learning Research5 Practical Steps to a Successful  Deep Learning Research
5 Practical Steps to a Successful Deep Learning Research
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methods
 
Fast methods for deep learning based object detection
Fast methods for deep learning based object detectionFast methods for deep learning based object detection
Fast methods for deep learning based object detection
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basics
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
DLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningDLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep Learning
 
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides
 
Geektime 2017
Geektime 2017Geektime 2017
Geektime 2017
 

Recently uploaded

Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratessachin783648
 
THYROID-PARATHYROID medical surgical nursing
THYROID-PARATHYROID medical surgical nursingTHYROID-PARATHYROID medical surgical nursing
THYROID-PARATHYROID medical surgical nursingJocelyn Atis
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionpablovgd
 
Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...
Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...
Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...Alba Morales
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationanitaento25
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxmuralinath2
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...NathanBaughman3
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayAADYARAJPANDEY1
 
Transport in plants G1.pptx Cambridge IGCSE
Transport in plants G1.pptx Cambridge IGCSETransport in plants G1.pptx Cambridge IGCSE
Transport in plants G1.pptx Cambridge IGCSEjordanparish425
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPirithiRaju
 
Topography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of BengalTopography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of BengalMd Hasan Tareq
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfDiyaBiswas10
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Sérgio Sacani
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...Health Advances
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPirithiRaju
 
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiologyBLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiologyNoelManyise1
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSELF-EXPLANATORY
 
Predicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdfPredicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdfbinhminhvu04
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsAreesha Ahmad
 
GLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptx
GLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptxGLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptx
GLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptxSultanMuhammadGhauri
 

Recently uploaded (20)

Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 
THYROID-PARATHYROID medical surgical nursing
THYROID-PARATHYROID medical surgical nursingTHYROID-PARATHYROID medical surgical nursing
THYROID-PARATHYROID medical surgical nursing
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...
Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...
Musical Meetups Knowledge Graph (MMKG): a collection of evidence for historic...
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
Transport in plants G1.pptx Cambridge IGCSE
Transport in plants G1.pptx Cambridge IGCSETransport in plants G1.pptx Cambridge IGCSE
Transport in plants G1.pptx Cambridge IGCSE
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
 
Topography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of BengalTopography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of Bengal
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
 
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiologyBLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 
Predicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdfPredicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdf
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
 
GLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptx
GLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptxGLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptx
GLOBAL AND LOCAL SCENARIO OF FOOD AND NUTRITION.pptx
 

Deep Learning on Everyday Devices

  • 1. Deep Learning on Everyday Devices Amir Alush, CTO & co founder, IMVC 2018
  • 2. Deep Learning “Cycle” untrained model new data (test) training data trained model car car pedestrian Yes No car TRAINING INFERENCE
  • 3. Inference on Everyday devices 8-Core ARM Qualcomm 6-core Snapdragon 820 TBD ARM x86 MIPS Power other Embedded processors market share (Source: AMD 2016)
  • 4. Inference on the Edge Motivation ● Low Latency ● Keep Privacy ● Small/no Bandwidth ● Utilizing Existing HW Challenges ● NN High complexity ● Low HW resources ● Complex porting
  • 5. The Deep Learning Stack HARDWARE GPU, CPU, TPU, FPGA, DSP, ASIC Primitives Libraries BLAS*, NNPACK,CUDNN Frameworks TF, CAFFE, PyTorch, MXNet Algorithms NN Architectures, Meta-Architectures Engines TensorRT, Core ML, SNPE
  • 6. It’s the algorithms! Efficient algorithms: 1. More AI on any processor 2. Critical for everyday devices with low power processors
  • 7. Speed-accuracy tradeoff Running off-the-shelf DL algorithms on the edge requires sacrificing accuracy. The tradeoffs: 1. Reduce input resolution → reduce accuracy 2. Reduce model size (backbone) → reduce accuracy 3. Reduce model bit precision → reduce accuracy? 4. Less accurate algorithm MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Application”, Howard et al. 2017 nverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation”, Sandler et al. 2018 mAP GPU msecs faster-rcnn + resnet 101 + high resolution 35.6 140 ssd + mobilenetV1 + low resolution 18.8 50
  • 8. Reduce network size (pruning) ● Should be structured, non structured is usually not HW supported ● Requires re-training ● How effective on already small models? ● Can hurt accuracy! “Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks”, Mittal 2018
  • 9. Reduce model bit precision (quantization) ● Reduce from 32 bits to 8/4/2/1 bits ? ● Activations & Weights are within a narrow range ● Networks are robust to small changes ● 8 bits: ○ Need to be supported in processors instructions ○ Reduces DRAM bandwidth (more in the SRAM) ○ Supported natively by various engines ● Below 8 bits: ○ Accuracy drops ○ Not supported in existing HW
  • 10. How to deploy your models? Training Frameworks Inference Engines ● Research flexibility: ○ Fast POC from idea to results ○ Easy to extended: new data loaders, layers (loss, operations) ○ Easy debugging ● Good training speed (+ parallelization) ○ Fast research iterations ● Large active community: ○ Explore new research ideas fast ● Personal flavour ● Portability should not a factor ● Optimized for latency (forward pass) ● Should be efficient on a specific hardware ○ Takes advantage of specific hw instructions ● Little / no overhead ● Efficient working memory ● Small code size ● Support all of your NN layers
  • 11. Training frameworks vs Inference Engines
  • 12. Build your own Inference Environment? 1. Use the engines (TensorRT, SNPE...): ○ Faster to production (if the conversion works out of the box) ○ Current generation has some limitations and minimal tech support ○ Not as mature as the training frameworks or DL primitives libraries ○ Not all NN capabilities are supported 2. Write your own inference engine: ○ Slower to production, but, with low risk ○ Use the DL primitives libraries (CuDNN, BLAS, NNPACK..) they are more matures than the engines and they are optimized! ○ You’ll need to write your own logic on top! ○ You’re in control and can adjust to your needs
  • 14. Brodmann17 Use Case - IoT ● MWC 2018 recent cooperation with & ● Embedded World 2018 with ● CES 2018 with
  • 15. Brodmann17 Use Case - Smart City * 1 CPU core
  • 16. Brodmann17 Use Case - Adas Benchmarking on the KITTI dataset car detection * 1 CPU core Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite, Geiger et al, 2012
  • 17. 1. motivation and challenges in NN edge processing 2. Speed accuracy tradeoffs made today 3. Brodmann17 key design principles for keeping efficiency and accuracy on the edge 4. key considerations when choosing your inference environment 5. Brodmann17 example use cases Summary