Copyright © 2014 Synopsys Inc. 1
Combining Flexibility and Low-power in Embedded Vision Subsystems:
An Application to Pedestrian Detection
Pierre Paulin, Director R&D
Bruno Lavigueur, Senior R&D Engineer
Santa Clara, 29 May 2014
Outline
• Pedestrian Detection algorithm overview
• Computation and bandwidth requirements
• Embedded Vision Reference Platform
• Programming Tools and Architecture
• Application Mapping to a Heterogeneous Multi-Core Platform
• From functional implementation in OpenCV to a fully optimized mapping to GPP and ASIP cores
• Final optimized mapping
• Power-Performance-Area analysis
• FPGA-based prototype
• Lessons learned, outlook
Synopsys: EDA Industry Leadership
• EDA tool and IP provider
• $1.96B in revenue (FY 2013)
• ~8700 employees (> 5600 R&D engineers)
• ~81 offices worldwide
• Products for Designing Embedded Vision Systems
• Embedded Cores (ARC HS, EM, 600, 700)
• Application Specific Processor (ASIP) design tools
• Semiconductor IP (DDR, DMA, AXI, HDMI, USB, A/D, …)
• Synthesis and verification for SoCs and FPGAs
• FPGA-based rapid prototyping system
Pedestrian Detection and HOG
• Pedestrian detection
• One of the most popular embedded vision (EV) applications
• Standard feature in luxury vehicles
• Moving to mid-size and compact vehicles in the next 5-10 years, also due to legislation efforts
• Implementation requirements
• Low cost
• Low power (small form factor and/or battery powered)
• Programmable (to allow for in-field SW upgrades)
• The most popular algorithm for pedestrian detection is Histogram of Oriented Gradients (HOG)
Histogram Of Oriented Gradients
Gradient Computation
Apply Sobel operators:
+1 +2 +1
0 0 0
−1 −2 −1
and
+1 0 −1
+2 0 −2
+1 0 −1
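As a minimal sketch of the gradient step, the two Sobel operators above can be applied to a pixel neighbourhood in plain Python. The function names, the unsigned 0-180 degree orientation convention, and the test patch are illustrative assumptions, not taken from the slides.

```python
import math

# Sobel kernels as listed on the slide.
SOBEL_ROWS = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]   # responds to change along y
SOBEL_COLS = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]   # responds to change along x


def apply_kernel(img, kernel, y, x):
    """Correlate a 3x3 kernel with the neighbourhood centred on (y, x)."""
    return sum(kernel[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def gradient(img, y, x):
    """Return (magnitude, unsigned orientation in degrees) at an interior pixel."""
    gy = apply_kernel(img, SOBEL_ROWS, y, x)
    gx = apply_kernel(img, SOBEL_COLS, y, x)
    magnitude = math.hypot(gx, gy)
    orientation = math.degrees(math.atan2(gy, gx)) % 180.0  # assumed convention
    return magnitude, orientation


# Hypothetical 3x3 patch with a vertical edge: dark on the left, bright on the right.
patch = [[0, 0, 10],
         [0, 0, 10],
         [0, 0, 10]]
mag, ang = gradient(patch, 1, 1)
```

The magnitude and orientation computed here are the per-pixel inputs to the histogram stage.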
[Figure: HOG pipeline: Grey scale conversion → Scale to multiple resolutions → Gradient computation → Histogram computation per block → Normalization of the histograms → SVM per window position → Non-max suppression]
Scale to Multiple Resolutions
Use a fixed 64x128-pixel detection window.
Apply this detection window to scaled frames.
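To give a feel for the workload this creates, the sketch below counts window positions over a pyramid of scaled frames. The 64x128 window is from the slide; the 8-pixel stride and the 1.2 scale step are assumptions chosen only for illustration.

```python
WIN_W, WIN_H = 64, 128      # fixed detection window (from the slide)


def window_positions(frame_w, frame_h, stride=8):
    """Number of window positions in one frame; the stride is an assumed value."""
    if frame_w < WIN_W or frame_h < WIN_H:
        return 0
    return ((frame_w - WIN_W) // stride + 1) * ((frame_h - WIN_H) // stride + 1)


def pyramid(frame_w, frame_h, scale_step=1.2):
    """Yield successively smaller frame sizes until the window no longer fits.

    The scale step is an assumption; the slides do not state one.
    """
    scale = 1.0
    while True:
        w, h = int(frame_w / scale), int(frame_h / scale)
        if w < WIN_W or h < WIN_H:
            break
        yield w, h
        scale *= scale_step


levels = list(pyramid(640, 480))
total = sum(window_positions(w, h) for w, h in levels)
print(len(levels), "pyramid levels,", total, "window positions per frame")
```

Every one of these positions is later classified by the SVM, which is why the window count drives the compute budget.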
Histogram Of Oriented Gradients
Histogram Computation
The image is divided into 8x8-pixel cells. For every block of 2x2 cells, apply Gaussian weights and compute 4 histograms of orientation of gradients.
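A minimal sketch of the histogram for a single cell follows. It assumes 9 unsigned orientation bins of 20 degrees with linear interpolation between neighbouring bins; the slides give neither the bin count nor the interpolation scheme, and the Gaussian block weighting is left out for brevity.

```python
N_BINS = 9                   # assumed; the slide does not give the bin count
BIN_WIDTH = 180.0 / N_BINS   # unsigned orientation, 0..180 degrees


def cell_histogram(magnitudes, orientations):
    """Accumulate the gradient magnitudes of one cell into orientation bins.

    Each vote is split linearly between the two nearest bin centres.
    """
    hist = [0.0] * N_BINS
    for mag, ang in zip(magnitudes, orientations):
        pos = (ang % 180.0) / BIN_WIDTH - 0.5   # position relative to bin centres
        lo = int(pos // 1)                      # floor; -1 wraps to the last bin
        frac = pos - lo
        hist[lo % N_BINS] += mag * (1.0 - frac)
        hist[(lo + 1) % N_BINS] += mag * frac
    return hist


# Two hypothetical gradient samples: one on a bin centre, one between two bins.
h = cell_histogram([2.0, 4.0], [10.0, 20.0])
```

A block of 2x2 cells would call this four times and concatenate the results into one block vector.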
Normalization of the Histograms
(1) L2 Normalization (2) clipping (saturation) (3) L2 Normalization
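The three steps can be sketched as below. The clipping threshold of 0.2 and the small epsilon are assumptions (0.2 is a commonly used value); the slide states only the sequence of operations.

```python
import math


def l2_normalize(v, eps=1e-6):
    """Divide a vector by its L2 norm; eps (assumed) guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in v) + eps * eps)
    return [x / norm for x in v]


def l2_hys(v, clip=0.2):
    """(1) L2 normalization, (2) clipping (saturation), (3) L2 normalization."""
    v = l2_normalize(v)
    v = [min(x, clip) for x in v]   # clip threshold is an assumed value
    return l2_normalize(v)


# Hypothetical block vector with one dominant bin.
out = l2_hys([10.0, 1.0, 1.0, 1.0])
```

Clipping limits the influence of a single very strong gradient before the second normalization.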
Support Vector Machine
Linear classification of histograms for every 64x128 window position.
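A linear SVM reduces to one dot product per window position. The sketch below also derives the descriptor length for a 64x128 window, assuming 9 bins per histogram and blocks that advance by one cell; both are assumptions, as are the function names.

```python
CELL = 8                 # cell size in pixels (from the slides)
WIN_W, WIN_H = 64, 128   # detection window (from the slides)
N_BINS = 9               # assumed bin count per histogram
BLOCK_STRIDE_CELLS = 1   # assumed: neighbouring blocks overlap by one cell


def descriptor_length():
    """Number of values describing one detection window."""
    cells_x, cells_y = WIN_W // CELL, WIN_H // CELL
    blocks_x = (cells_x - 2) // BLOCK_STRIDE_CELLS + 1
    blocks_y = (cells_y - 2) // BLOCK_STRIDE_CELLS + 1
    return blocks_x * blocks_y * 4 * N_BINS   # 4 histograms per 2x2-cell block


def svm_score(descriptor, weights, bias):
    """Linear SVM decision value: dot product of weights and descriptor, plus bias."""
    return sum(w * x for w, x in zip(weights, descriptor)) + bias


def is_pedestrian(descriptor, weights, bias, threshold=0.0):
    """Classify one window; the threshold is a tunable, assumed parameter."""
    return svm_score(descriptor, weights, bias) > threshold
```

Under these assumptions each window costs a few thousand multiply-accumulates, repeated for every window position at every scale.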
Non-Max Suppression
Cluster multi-scale dense scan of
detection windows and select unique
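The slide does not say which clustering method is used. As an illustration only, the sketch below uses greedy non-max suppression based on intersection over union (IoU), a common alternative; the 0.5 overlap threshold and the sample detections are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the best-scoring box of each group of overlapping detections.

    detections is a list of (score, box) pairs; boxes from different scales
    are assumed to be mapped back to original frame coordinates already.
    """
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, k[1]) < iou_threshold for k in kept):
            kept.append((score, box))
    return kept


# Hypothetical detections: two overlapping hits on one person, one separate hit.
dets = [(0.8, (104, 102, 64, 128)),
        (0.9, (100, 100, 64, 128)),
        (0.7, (300, 50, 64, 128))]
kept = non_max_suppression(dets)
```

This stage handles only the windows that passed the SVM, so its cost is small next to the dense scan.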
Embedded Vision
Reference Platform Overview
Embedded Vision Reference Platform
• Ported OpenCV library: Pedestrian Detection, etc.
• C API to ASIP-based vision accelerators
• Configurable ARC HS RISC processor
• ASIP-based accelerators
• HAPS® FPGA-based prototyping system
• Pre-verified flow and examples
Time-to-market and Flexibility vs. Power-Performance-Area Trade-offs
[Figure: processing options arranged along a parallelism axis, trading time-to-market and flexibility against power-performance-area efficiency (scale: 1X, 10X, 100X)]
• Subsystem controller: ARC HS, with MQX lightweight O/S
• Embedded vision accelerators: multiple ASIPs
• Levels of parallelism: sequential tasks, task-level parallelism, data-level parallelism
• Pre-processing: filtering; color conversion; image scaling; feature extraction and matching; segmentation
• High-level processing: control; multi-object tracking; post-processing; high-level command interpretation
Main architectural components
• ARC HS family of high-performance cores
• ARC HS 36 performance and power in a 28 nm HPM process (worst case):
• Scalable to 1.6 GHz
• 1.9 DMIPS/MHz
• 37 µW/MHz
• 53 Dhrystone GIPS/W
• Application-Specific Instruction-set Processors (ASIP)
• User-driven design of processors tailored to a specific application
• Ability to guide performance-power-area and flexibility trade-offs
• Automatic generation of implementation, C compiler and programming tools from instruction-set specification
• Connectivity components
• DMA, AXI, DDR, etc.
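Embedded Vision Flow and Architecture
[Figure: software stack and hardware architecture of the embedded vision subsystem]
• Software: HOG embedded application (C/C++), C API to accelerators, base drivers, MQX runtime
• ARC HS with DCCM; DMA, sync and I/O; AXI-4 local interconnect
• ASIP1, ASIP2 … ASIPn, each shown with a "D" block, linked by a dedicated streaming interconnect (FIFOs)
• L2 SRAM
• Prototyping target: HAPS-70 S12 (12M ASIC gate equivalent)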
HOG Mapping and Refinement Flow
[Figure: Camera → HOG Detection → DVI Output]
HOG Mapping and Refinement Flow
• Refinement from an OpenCV high-level functional description to a fully optimized multi-processor SoC combining a GP RISC with multiple ASIPs
• Main steps
• OpenCV functional reference
• Optimization and porting onto MQX RTOS
• Profiling of all major functions
• Identification of high compute kernels
• Development of ASIPs using Synopsys ASIP design and exploration tools
• Stepwise refinement
• From GPP only to GPP + multiple ASIPs
[Chart: % of processing per stage (Rescale, Grad, Hist, Norm, SVM, Other)]
ARC and ASIP Exploration Tool Flow
• ARC HS S/W optimization (MQX RTOS): C code; optimizing compiler; assembler, linker; instruction-set simulator; debugger, profiler
• ASIP HW/SW optimization: C code and processor description language; optimizing compiler; assembler, linker; instruction-set simulator; debugger, profiler; RTL generation; simulation, FPGA, RTL synthesis
• ARC-ASIP trade-off exploration
HOG Functional Validation on ARC HS (640 × 480 pixels)
[Figure: block diagram of configuration 1: ARC HS subsystem controller with DCCM; DMA, sync and I/O; AXI local interconnect; ASIPs on a dedicated streaming interconnect (FIFOs); L3 external DRAM. Pipeline stages: grey scale conversion, rescaling, gradient, histogram, normalization, SVM, non-max suppression]
• C fixed point profiling results: 2.25 G cycles per frame
Histogram Of Oriented Gradients Profiling (640 × 480 pixels, at 25 FPS)
Stage | ARC HS G cycles
Grey scale conversion | 0.1
Scale to multiple resolutions | 1.4
Gradient computation | 17.3
Histogram computation per block | 31.9
Normalization of the histograms | 1.2
SVM per window position | 15.7
Non-max suppression | 0.004
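The share of processing per stage can be derived from the cycle counts on this slide, as sketched below. Pairing the seven values with the seven stages in the order listed is an assumption, as is the 10% cut-off used to flag high-compute kernels.

```python
# G cycles per stage at 640x480, 25 FPS, in the order listed on the slide.
# The stage-to-value pairing is an assumption based on that ordering.
profile = {
    "Grey scale conversion": 0.1,
    "Scale to multiple resolutions": 1.4,
    "Gradient computation": 17.3,
    "Histogram computation per block": 31.9,
    "Normalization of the histograms": 1.2,
    "SVM per window position": 15.7,
    "Non-max suppression": 0.004,
}

total = sum(profile.values())
shares = {stage: 100.0 * cycles / total for stage, cycles in profile.items()}

# Stages above an assumed 10% cut-off are candidates for acceleration.
hot = [stage for stage, pct in sorted(shares.items(), key=lambda kv: -kv[1])
       if pct > 10.0]

for stage, pct in shares.items():
    print(f"{stage:34s} {pct:5.1f} %")
```

With this pairing, histogram computation, gradient computation and the SVM account for nearly all of the cycles.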
Histogram Of Oriented Gradients Profiling (640 × 480 pixels, at 25 FPS)
[Chart: % of processing per stage (Rescale, Grad, Hist, Norm, SVM, Other)]
Histogram Of Oriented Gradients Profiling (640 × 480 pixels, at 25 FPS)
[Chart: # ARC HS per stage (Rescale, Grad, Hist, Norm, SVM, Other)]
Single Core / Multicore? / Accelerate!
Task Assignment #2
[Figure: block diagram of configuration 2: ARC HS subsystem controller with DCCM; ASIP1, ASIP2 and ASIP4 on a dedicated streaming interconnect (FIFOs); DMA, sync and I/O; AXI local interconnect; L3 external DRAM. Pipeline stages: grey scale conversion, rescaling, gradient, histogram, normalization, SVM, non-max suppression. Clock labels: 1.6 GHz, 1.6 GHz, 400 MHz]
Task Assignment #3
[Figure: block diagram of configuration 3: ARC HS subsystem controller with DCCM; ASIP1, ASIP2, ASIP3 and ASIP4 on a dedicated streaming interconnect (FIFOs); DMA, sync and I/O; AXI local interconnect; L3 external DRAM. Pipeline stages: grey scale conversion, rescaling, gradient, histogram, normalization, SVM, non-max suppression. Clock labels: 1.6 GHz, 400 MHz]
Task Assignment #4
[Figure: block diagram of configuration 4: ARC HS subsystem controller with DCCM; ASIP1’, ASIP2, ASIP3 and ASIP4 on a dedicated streaming interconnect (FIFOs); DMA, sync and I/O; AXI local interconnect; L3 external DRAM. Pipeline stages: grey scale conversion, rescaling, gradient, histogram, normalization, SVM, non-max suppression. Clock labels: 400 MHz, 400 MHz]
Task Assignment #4 With On-Chip L2
[Figure: block diagram of configuration 4 with on-chip L2: ARC HS with DCCM; ASIP1’, ASIP2, ASIP3 and ASIP4 on a dedicated streaming interconnect (FIFOs); DMA, sync and I/O; AXI local interconnect; L2 SRAM; L3 external DRAM. Pipeline stages: grey scale conversion, rescaling, gradient, histogram, normalization, SVM, non-max suppression. Labels: storage of scaled images; 200 MB/s, 80 MB/s; 400 MHz, 400 MHz]
Power, Gate Count Comparisons (28 nm)
640 × 480 pixels, at 25 FPS
[Chart: gates (K) for Config #2, #3 and #4, split into ARC gates and ASIP gates]
[Chart: power (mW) for Config #2, #3 and #4, split into ARC power and ASIP power]
[Chart: effort (person-months) for Config #2, #3 and #4, split into ARC S/W and ASIP design and S/W]
Note: Gates and power for processors and local memory
HAPS FPGA-based demo platform
Final Results for Demonstrator Platform
• 1 ARC HS, 4 ASIPs, AXI interconnect, private SRAM, L2 SRAM
• Fixed point version of HOG derived from OpenCV
• 25 frames/second at 400 MHz (ARC and ASIPs)
• TSMC HPM process, 28 nm
• Gate count (at 400 MHz): 471K gates
• 303K gates for ASIPs, 168K gates for ARC HS 36
• Power consumption: 60 mW
• Prototype running on HAPS board (ASIPs)
• 4 frames/second at 70 MHz
Demonstration available at our booth
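A few derived figures follow directly from these results. The sketch below is plain arithmetic on the numbers above; the derived quantities (energy per frame, cycles available per frame) are not stated on the slide.

```python
POWER_W = 60e-3          # 60 mW power consumption
FPS = 25                 # frames/second at 400 MHz
CLOCK_HZ = 400e6
FPGA_CLOCK_HZ = 70e6     # HAPS prototype clock
FPGA_FPS = 4             # HAPS prototype frame rate

# Energy spent per processed frame, in millijoules.
energy_per_frame_mj = POWER_W / FPS * 1e3

# Clock cycles available per frame on each processor at the target clock.
cycles_per_frame = CLOCK_HZ / FPS

# The same ratio on the FPGA prototype, for comparison.
fpga_cycles_per_frame = FPGA_CLOCK_HZ / FPGA_FPS

# The gate count splits between ASIPs and ARC HS 36 as stated on the slide.
asip_gates_k, arc_gates_k = 303, 168
total_gates_k = asip_gates_k + arc_gates_k

print(f"{energy_per_frame_mj:.1f} mJ/frame, {cycles_per_frame / 1e6:.1f} M cycles/frame")
print(f"FPGA prototype: {fpga_cycles_per_frame / 1e6:.1f} M cycles/frame")
```

The cycles-per-frame ratio of the FPGA prototype is close to that of the 400 MHz target, which suggests that the prototype's lower frame rate mostly reflects its lower clock.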
Lessons Learned
[Figure: final mapping (configuration 4) along the parallelism axis, trading time-to-market and flexibility against power-performance-area efficiency (scale: 1X, 10X, 100X)]
• Subsystem controller (ARC HS): 1) Greyscale 2) Non-max suppr. 3) Display 4) Control, O/S
• Embedded vision accelerators (ASIP1’, ASIP2, ASIP3, ASIP4): 1’) Rescaling + Gradient 2) Histogram 3) Normalization 4) SVM
• Levels of parallelism: sequential tasks, task-level parallelism, data-level parallelism
Lessons Learned
[Figure: final mapping (configuration 4) along the parallelism axis, trading time-to-market and flexibility against area efficiency (scale: 1X to 60X~80X)]
• Embedded vision accelerators (ASIP1’, ASIP2, ASIP3, ASIP4) at 400 MHz: 303K gates, 58 mW (logic = 12 mW, SRAM = 46 mW)
• Subsystem controller (ARC HS), 20% utilization:
• 1.6 GHz: 473K gates, 37 µW/MHz
• 400 MHz: 168K gates, 18 µW/MHz
• Combined: 471K gates, 60 mW @ 28 nm
• Levels of parallelism: sequential tasks, task-level parallelism, data-level parallelism
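The combined power figure can be approximately reproduced from the per-component numbers, as sketched below. This rests on an assumption: that the 20% utilization label applies to the ARC HS and scales its power linearly. The slide does not spell this out.

```python
ASIP_LOGIC_MW = 12.0
ASIP_SRAM_MW = 46.0
ARC_UW_PER_MHZ = 18.0     # ARC HS at 400 MHz
CLOCK_MHZ = 400.0
ARC_UTILIZATION = 0.20    # assumption: the 20% label refers to the ARC HS

asip_power_mw = ASIP_LOGIC_MW + ASIP_SRAM_MW                 # 58 mW on the slide
arc_full_load_mw = ARC_UW_PER_MHZ * CLOCK_MHZ / 1000.0       # µW -> mW
arc_effective_mw = arc_full_load_mw * ARC_UTILIZATION
combined_mw = asip_power_mw + arc_effective_mw

print(f"ASIPs {asip_power_mw:.1f} mW + ARC HS {arc_effective_mw:.2f} mW "
      f"= {combined_mw:.2f} mW")
```

Under that assumption the total lands just under the 60 mW quoted for the combined subsystem, with the ASIP SRAM as the largest single contributor.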
Embedded Vision Platform Directions & Wish List
[Figure: platform options along the parallelism axis, trading time-to-market and flexibility against power-performance-area efficiency (scale: 1X, 10X, 100X)]
• Subsystem controller: ARC HS, with OpenCV and MQX O/S
• Close coupling: Vision Extn., SIMD (64 bit)
• Embedded vision accelerators: multiple ASIPs
• Accelerator C API
• Levels of parallelism: sequential tasks, task-level parallelism, data-level parallelism
• Pre-processing: filtering; color conversion; image scaling; feature extraction and matching; segmentation
• High-level processing: body part detection; multi-object tracking; post-processing; command interpretation
Conclusions
• Embedded vision applications combine complex algorithms and high data rates with a need for low power
• Need to trade off Flexibility vs. Power-Perf-Area
• Flexibility via High-performance ARC HS core
• Ability to trade off power vs. performance
• Scaling to multi-core, specialization and SIMD usage
• Highest PPA via ASIPs
• Performance gains and power efficiency due to tailored instruction sets and dedicated memory architecture
• While fully programmable, gains are application specific
Backup slides
Design flow for the Vision Sub System
[Figure: design flow for the reference subsystem. Components: ARC HS, DW AXI interconnect, DesignWare DMA, DesignWare DDR, ASIP. Inputs: ASIP ISA description, ARC settings, SubSys settings, user config. Tools: ARChitect, Processor Designer, Core Assembler, Core Consultant, coreKitTool, Core Builder, Synthesis + P&R tools, VCS, DVE, MDB, PDBG. Target: HAPS]
Synopsys’ ASIP Design Tool Flow
• Tools: processor description language; optimizing compiler; assembler, linker; instruction-set simulator; debugger, profiler; RTL generator; RTL simulation and FPGA; RTL synthesis
• Full-featured SDK with graphical debugger
• Compiler supports processor-specific data types and operators
• Advanced optimizations allow C programmers to easily tap into architectural efficiencies
• Fast retargeting to evaluate incremental processor architecture changes quickly
• High-level language to quickly capture ISA
• Tight control of architecture (RTL-level)
• Fast simulation technology
• Easy integration into SystemC virtual platforms
• Multicore and on-chip debugging
• Smooth integration with RTL implementation and verification flows
Architectural Optimization Space
ASIP architectural optimization space
• Parallelism
• Instruction-level parallelism: orthogonal instruction set (VLIW); encoded instruction set
• Data-level parallelism: vector processing (SIMD)
• Task-level parallelism: multi-core; multi-threading
• Specialization
• App.-specific data types: integer, fractional, floating-point, bits, complex, vector…
• App.-specific instructions
• App.-spec. data processing: any exotic operator; single or multi-cycle
• App.-spec. memory addressing: direct, indirect, post-modification, indexed, stack indirect…
• App.-spec. control processing: jumps, subroutines, interrupts, HW do-loops, residual control, predication…; relative or absolute, address range, delay slots…
• Connectivity & storage matching application’s data-flow: distributed regs, sub-ranges; multiple mem’s, sub-ranges
• Pipeline
Synopsys ASIP tools …
• Support a wide range of ASIP architectures
• Support RTL accelerator tricks for highest PPA efficiency
• Enable ASIP optimization through architectural exploration

More Related Content

What's hot

"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from Qualcomm"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from QualcommEdge AI and Vision Alliance
 
“Efficient Video Perception Through AI,” a Presentation from Qualcomm
“Efficient Video Perception Through AI,” a Presentation from Qualcomm“Efficient Video Perception Through AI,” a Presentation from Qualcomm
“Efficient Video Perception Through AI,” a Presentation from QualcommEdge AI and Vision Alliance
 
What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...
What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...
What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...Eduardo Pelegri-Llopart
 
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles NgSE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles NgAMD Developer Central
 
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ..."2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...Edge AI and Vision Alliance
 
GTC China 2016
GTC China 2016GTC China 2016
GTC China 2016NVIDIA
 
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr..."Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...Edge AI and Vision Alliance
 
“How 5G is Pushing Processing to the Edge,” a Presentation from Inseego
“How 5G is Pushing Processing to the Edge,” a Presentation from Inseego“How 5G is Pushing Processing to the Edge,” a Presentation from Inseego
“How 5G is Pushing Processing to the Edge,” a Presentation from InseegoEdge AI and Vision Alliance
 
GTC 2016 Opening Keynote
GTC 2016 Opening KeynoteGTC 2016 Opening Keynote
GTC 2016 Opening KeynoteNVIDIA
 
2016 06 nvidia-isc_supercomputing_car_v02
2016 06 nvidia-isc_supercomputing_car_v022016 06 nvidia-isc_supercomputing_car_v02
2016 06 nvidia-isc_supercomputing_car_v02Carlo Nardone
 
"Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling...
"Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling..."Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling...
"Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling...Edge AI and Vision Alliance
 
Opening Keynote at GTC 2015: Leaps in Visual Computing
Opening Keynote at GTC 2015: Leaps in Visual ComputingOpening Keynote at GTC 2015: Leaps in Visual Computing
Opening Keynote at GTC 2015: Leaps in Visual ComputingNVIDIA
 
Beyond the Hype Cycle: Barriers and Breakthroughs Toward XR Growth
Beyond the Hype Cycle: Barriers and Breakthroughs Toward XR GrowthBeyond the Hype Cycle: Barriers and Breakthroughs Toward XR Growth
Beyond the Hype Cycle: Barriers and Breakthroughs Toward XR GrowthIntel® Software
 
“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...
“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...
“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...Edge AI and Vision Alliance
 
Open Source 5G/Edge Automation via ONAP
Open Source 5G/Edge Automation via ONAPOpen Source 5G/Edge Automation via ONAP
Open Source 5G/Edge Automation via ONAPLiz Warner
 
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LGEdge AI and Vision Alliance
 
May 2021 Embedded Vision Summit Opening Remarks (May 28)
May 2021 Embedded Vision Summit Opening Remarks (May 28)May 2021 Embedded Vision Summit Opening Remarks (May 28)
May 2021 Embedded Vision Summit Opening Remarks (May 28)Edge AI and Vision Alliance
 
"Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati...
"Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati..."Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati...
"Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati...Edge AI and Vision Alliance
 

What's hot (20)

"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from Qualcomm"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from Qualcomm
 
“Efficient Video Perception Through AI,” a Presentation from Qualcomm
“Efficient Video Perception Through AI,” a Presentation from Qualcomm“Efficient Video Perception Through AI,” a Presentation from Qualcomm
“Efficient Video Perception Through AI,” a Presentation from Qualcomm
 
What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...
What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...
What is IoT and how Modulus and Pacific can Help - Featuring Node.js and Roll...
 
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles NgSE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
 
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ..."2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
 
GTC China 2016
GTC China 2016GTC China 2016
GTC China 2016
 
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr..."Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
 
“How 5G is Pushing Processing to the Edge,” a Presentation from Inseego
“How 5G is Pushing Processing to the Edge,” a Presentation from Inseego“How 5G is Pushing Processing to the Edge,” a Presentation from Inseego
“How 5G is Pushing Processing to the Edge,” a Presentation from Inseego
 
GTC 2016 Opening Keynote
GTC 2016 Opening KeynoteGTC 2016 Opening Keynote
GTC 2016 Opening Keynote
 
2016 06 nvidia-isc_supercomputing_car_v02
2016 06 nvidia-isc_supercomputing_car_v022016 06 nvidia-isc_supercomputing_car_v02
2016 06 nvidia-isc_supercomputing_car_v02
 
"Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling...
"Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling..."Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling...
"Pioneering Analog Compute for Edge AI to Overcome the End of Digital Scaling...
 
Opening Keynote at GTC 2015: Leaps in Visual Computing
Opening Keynote at GTC 2015: Leaps in Visual ComputingOpening Keynote at GTC 2015: Leaps in Visual Computing
Opening Keynote at GTC 2015: Leaps in Visual Computing
 
Beyond the Hype Cycle: Barriers and Breakthroughs Toward XR Growth
Beyond the Hype Cycle: Barriers and Breakthroughs Toward XR GrowthBeyond the Hype Cycle: Barriers and Breakthroughs Toward XR Growth
Beyond the Hype Cycle: Barriers and Breakthroughs Toward XR Growth
 
“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...
“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...
“Productizing Edge AI Across Applications and Verticals: Case Study and Insig...
 
Open Source 5G/Edge Automation via ONAP
Open Source 5G/Edge Automation via ONAPOpen Source 5G/Edge Automation via ONAP
Open Source 5G/Edge Automation via ONAP
 
SV Cloud Meetup
SV Cloud MeetupSV Cloud Meetup
SV Cloud Meetup
 
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
 
May 2021 Embedded Vision Summit Opening Remarks (May 28)
May 2021 Embedded Vision Summit Opening Remarks (May 28)May 2021 Embedded Vision Summit Opening Remarks (May 28)
May 2021 Embedded Vision Summit Opening Remarks (May 28)
 
IOT - Presentation to PEP @ Progress
IOT - Presentation to PEP @ ProgressIOT - Presentation to PEP @ Progress
IOT - Presentation to PEP @ Progress
 
"Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati...
"Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati..."Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati...
"Designing Deep Neural Network Algorithms for Embedded Devices," a Presentati...
 

Viewers also liked

Mentor Graphics Customer Presentation
Mentor Graphics Customer PresentationMentor Graphics Customer Presentation
Mentor Graphics Customer PresentationSplunk
 
Cadence, Create Scalable and Sustainable Sales Development Process
Cadence, Create Scalable and Sustainable Sales Development ProcessCadence, Create Scalable and Sustainable Sales Development Process
Cadence, Create Scalable and Sustainable Sales Development ProcessB2BCamp
 
How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...
How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...
How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...RightScale
 
Sales Stack Workshop: Hacking Outbound Messaging and Cadence
Sales Stack Workshop: Hacking Outbound Messaging and CadenceSales Stack Workshop: Hacking Outbound Messaging and Cadence
Sales Stack Workshop: Hacking Outbound Messaging and CadenceSkaled
 
MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)
MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)
MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)MIPI Alliance
 
MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...
MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...
MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...MIPI Alliance
 
MIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement Challenges
MIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement ChallengesMIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement Challenges
MIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement ChallengesMIPI Alliance
 
XPages is Workflow's new best friend
XPages is Workflow's new best friendXPages is Workflow's new best friend
XPages is Workflow's new best friendStephan H. Wissel
 
IBM Connect 2013: Messaging and Collaboration Roadmap
IBM Connect 2013: Messaging and Collaboration RoadmapIBM Connect 2013: Messaging and Collaboration Roadmap
IBM Connect 2013: Messaging and Collaboration RoadmapEd Brill
 
Small Business Marketing Toolkit
Small Business Marketing ToolkitSmall Business Marketing Toolkit
Small Business Marketing ToolkitCadence Marketing
 
Evolution Of Microprocessors
Evolution Of MicroprocessorsEvolution Of Microprocessors
Evolution Of Microprocessorsharinder
 
Intel® Xeon® Processor E5-2600 v4 Product Family EAMG
Intel® Xeon® Processor E5-2600 v4 Product Family EAMGIntel® Xeon® Processor E5-2600 v4 Product Family EAMG
Intel® Xeon® Processor E5-2600 v4 Product Family EAMGIntel IT Center
 

Viewers also liked (13)

Mentor Graphics Customer Presentation
Mentor Graphics Customer PresentationMentor Graphics Customer Presentation
Mentor Graphics Customer Presentation
 
Cadence, Create Scalable and Sustainable Sales Development Process
Cadence, Create Scalable and Sustainable Sales Development ProcessCadence, Create Scalable and Sustainable Sales Development Process
Cadence, Create Scalable and Sustainable Sales Development Process
 
How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...
How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...
How Mentor Graphics Uses Google Cloud for the Internet of Things - Mentor Gra...
 
Sales Stack Workshop: Hacking Outbound Messaging and Cadence
Sales Stack Workshop: Hacking Outbound Messaging and CadenceSales Stack Workshop: Hacking Outbound Messaging and Cadence
Sales Stack Workshop: Hacking Outbound Messaging and Cadence
 
MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)
MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)
MIPI DevCon 2016: Verification of Mobile SOC Design (UFS)
 
MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...
MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...
MIPI DevCon 2016: Comprehensive Verification of MIPI SoundWire Master-Slave S...
 
MIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement Challenges
MIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement ChallengesMIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement Challenges
MIPI DevCon 2016: MIPI D-PHY - Physical Layer Test & Measurement Challenges
 
XPages is Workflow's new best friend
XPages is Workflow's new best friendXPages is Workflow's new best friend
XPages is Workflow's new best friend
 
IBM Connect 2013: Messaging and Collaboration Roadmap
IBM Connect 2013: Messaging and Collaboration RoadmapIBM Connect 2013: Messaging and Collaboration Roadmap
IBM Connect 2013: Messaging and Collaboration Roadmap
 
Small Business Marketing Toolkit
Small Business Marketing ToolkitSmall Business Marketing Toolkit
Small Business Marketing Toolkit
 
DITA Quick Start
DITA Quick StartDITA Quick Start
DITA Quick Start
 
Evolution Of Microprocessors
Evolution Of MicroprocessorsEvolution Of Microprocessors
Evolution Of Microprocessors
 
Intel® Xeon® Processor E5-2600 v4 Product Family EAMG
Intel® Xeon® Processor E5-2600 v4 Product Family EAMGIntel® Xeon® Processor E5-2600 v4 Product Family EAMG
Intel® Xeon® Processor E5-2600 v4 Product Family EAMG
 

Similar to Combining Flexibility and Low-power in Embedded Vision Systems

"Multiple Uses of Pipelined Video Pre-Processor Hardware in Vision Applicatio...
"Multiple Uses of Pipelined Video Pre-Processor Hardware in Vision Applicatio..."Multiple Uses of Pipelined Video Pre-Processor Hardware in Vision Applicatio...
"Multiple Uses of Pipelined Video Pre-Processor Hardware in Vision Applicatio...Edge AI and Vision Alliance
 
Codasip application class RISC-V processor solutions
Codasip application class RISC-V processor solutionsCodasip application class RISC-V processor solutions
Combining Flexibility and Low-power in Embedded Vision Systems

  • 1. Copyright © 2014 Synopsys Inc. 1 Pierre Paulin, Director R&D Santa Clara, 29 May 2014 Combining Flexibility and Low-power in Embedded Vision Subsystems: An Application to Pedestrian Detection Bruno Lavigueur, Senior R&D Engineer
  • 2. Outline: • Pedestrian Detection algorithm overview • Computation and bandwidth requirements • Embedded Vision Reference Platform • Programming Tools and Architecture • Application Mapping to a Heterogeneous Multi-Core Platform • From Functional implementation in OpenCV to a fully optimized mapping to GPP and ASIP cores • Final optimized mapping • Power, Performance, Area analysis • FPGA-based prototype • Lessons learned, outlook
  • 3. Synopsys, EDA Industry Leadership: • EDA tool and IP provider • $1.96B in revenue (FY 2013) • ~8700 employees (> 5600 R&D engineers) • ~81 offices worldwide • Products for Designing Embedded Vision Systems • Embedded Cores (ARC HS, EM, 600, 700) • Application Specific Processor (ASIP) design tools • Semiconductor IP (DDR, DMA, AXI, HDMI, USB, A/D, …) • Synthesis and verification for SoCs and FPGAs • FPGA-based rapid prototyping system
  • 4. Pedestrian Detection and HOG: • Pedestrian detection • One of the most popular EV applications • Standard feature in luxury vehicles • Moving to mid-size and compact vehicles in the next 5-10 years, also due to legislation efforts • Implementation requirements • Low cost • Low power (small form factor, and/or battery powered) • Programmable (to allow for in-field SW upgrades) • Most popular algorithm for pedestrian detection is Histogram of Oriented Gradients (HOG)
  • 5. Histogram Of Oriented Gradients Gradient Computation Apply Sobel operators: +1 +2 +1 0 0 0 −1 −2 −1 and +1 0 −1 +2 0 −2 +1 0 −1 Grey scale conversion Scale to multiple resolutions Gradient computation Histogram computation per block Normalization of the histograms SVM per window position Non-max suppression Scale to Multiple Resolutions Use a fixed 64x128-pixel detection window. Apply this detection window to scaled frames.
  • 6. Histogram Of Oriented Gradients The image is divided into 8x8-pixel cells. For every block of 2x2 cells, apply Gaussian weights and compute 4 histograms of orientation of gradients. Histogram Computation Normalization of the Histograms (1) L2 Normalization (2) clipping (saturation) (3) L2 Normalization Support Vector Machine Linear classification of histograms for every 64x128 window position. Non-Max Suppression Cluster multi-scale dense scan of detection windows and select unique Grey scale conversion Scale to multiple resolutions Gradient computation Histogram computation per block Normalization of the histograms SVM per window position Non-max suppression
  • 7. Embedded Vision Reference Platform Overview
  • 8. Embedded Vision Reference Platform: Ported OpenCV library Pedestrian Detection, etc. C API to ASIP-based vision accelerators Configurable ARC HS RISC processor ASIP-based accelerators HAPS® FPGA-based prototyping system Pre-verified flow and examples
  • 9. Time-to-market and Flexibility vs. Power-Performance-Area Trade-offs Subsystem Controller HS Emb. Vision Accelerators ASIP ASIP ASIP 1X 100X PARALLELISM Pre-processing: - Filtering - Color conversion - Image scaling - Feature extraction and matching - Segmentation Power-Performance-Area Efficiency Time-to-market, Flexibility 10X MQX Lightweight O/S High-level processing: - Control - Multi-object tracking - Post-processing - High-level command interpretation Data Level Parallelism Task Level Parallelism Sequential Tasks
  • 10. Main architectural components: • ARC HS family of high-performance cores • ARC HS 36 Performance, power at 28 nm HPM process (worst case): • Scalable to 1.6 GHz • 1.9 DMIPS/MHz • 37 µW/MHz • Application-Specific Instruction-set Processors (ASIP) • User-driven design of processors tailored to a specific application • Ability to guide performance-power-area and flexibility trade-offs • Automatic generation of implementation, C compiler and programming tools from instruction-set specification • Connectivity components • DMA, AXI, DDR, etc. 53 Dhrystone GIPS/W
  • 11. Embedded Vision Flow and Architecture. [Diagram: software stack: HOG embedded application (C/C++); C API to accelerators; base drivers; MQX runtime. Hardware: HS with DCCM; DMA, Sync & I/O; AXI-4 local interconnect; L2 SRAM; ASIP1, ASIP2, ASIPn with D blocks; dedicated streaming interconnect (FIFOs). Platform: HAPS-70 S12, 12M ASIC gate equiv.]
  • 12. HOG Mapping and Refinement Flow: Camera → HOG Detection → DVI Output
  • 13. HOG Mapping and Refinement Flow • Refinement from an OpenCV high-level functional description to a fully optimized multi-processor SoC combining a GP RISC with multiple ASIPs • Main steps • OpenCV functional reference • Optimization and porting onto MQX RTOS • Profiling of all major functions • Identification of high-compute kernels • Development of ASIPs using Synopsys ASIP design and exploration tools • Stepwise refinement • From GPP only to GPP + multiple ASIPs
  • 14. ARC and ASIP Exploration Tool Flow. [Chart: % of processing per stage (Rescale, Grad, Hist, Norm, SVM, Other).] ARC HS S/W optimization: C code; Optimizing Compiler; Assembler, Linker; Instruction-Set Simulator; Debugger, Profiler. ASIP HW/SW optimization: C code; Processor Description Language; Optimizing Compiler; Assembler, Linker; Instruction-Set Simulator; Debugger, Profiler; RTL generation; Simulation, FPGA, RTL Synthesis. ARC-ASIP trade-off exploration; MQX RTOS
  • 15. HOG Functional Validation on ARC HS (640 × 480 pixels). [Diagram, configuration 1: pipeline stages Grey scale conversion, Rescaling, Gradient, Histogram, Normalization, SVM, Non-max suppression; platform blocks HS subsystem controller with DCCM; DMA, Sync & I/O; AXI local interconnect; ASIP1, ASIP2, ASIP4 with D blocks; dedicated streaming interconnect (FIFOs); L3 external DRAM.] • C fixed point profiling results: 2.25 G cycles per frame
  • 16. Histogram Of Oriented Gradients Profiling (640 × 480 pixels, at 25 FPS). ARC HS G cycles: 0.1, 1.4, 17.3, 31.9, 1.2, 15.7, 0.004. Pipeline stages: Grey scale conversion; Scale to multiple resolutions; Gradient computation; Histogram computation per block; Normalization of the histograms; SVM per window position; Non-max suppression
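To see where the processing time goes, the seven G-cycle values on this slide can be turned into percentage shares. The sketch below assumes that the values are listed in the same order as the pipeline stages; that pairing is an inference from the slide layout, not something the transcript states. Under that assumption, the histogram stage accounts for roughly 47% of the total.

```python
# Stage names and G-cycle values as listed on the slide.
# ASSUMPTION: the i-th value belongs to the i-th stage (listing order).
stages = [
    "Grey scale conversion",
    "Scale to multiple resolutions",
    "Gradient computation",
    "Histogram computation per block",
    "Normalization of the histograms",
    "SVM per window position",
    "Non-max suppression",
]
g_cycles = [0.1, 1.4, 17.3, 31.9, 1.2, 15.7, 0.004]

total = sum(g_cycles)
shares = {stage: 100.0 * c / total for stage, c in zip(stages, g_cycles)}

# Print the stages from most to least expensive.
for stage, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{stage:32s} {pct:6.2f} %")
```

Three stages (histogram, gradient, SVM) dominate under this pairing, which is the kind of result that motivates moving those kernels to dedicated accelerators first.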
  • 17. Histogram Of Oriented Gradients Profiling (640 × 480 pixels, at 25 FPS). Pipeline stages: Grey scale conversion; Scale to multiple resolutions; Gradient computation; Histogram computation per block; Normalization of the histograms; SVM per window position; Non-max suppression. [Chart: % of processing per stage (Rescale, Grad, Hist, Norm, SVM, Other).]
  • 18. Histogram Of Oriented Gradients Profiling (640 × 480 pixels, at 25 FPS). Pipeline stages: Grey scale conversion; Scale to multiple resolutions; Gradient computation; Histogram computation per block; Normalization of the histograms; SVM per window position; Non-max suppression. [Chart: # ARC HS per stage (Rescale, Grad, Hist, Norm, SVM, Other).] Single Core; Multicore?; Accelerate!
  • 19. Task Assignment #2. [Diagram, configuration 2: pipeline stages Grey scale conversion, Rescaling, Gradient, Histogram, Normalization, SVM, Non-max suppression; platform blocks HS subsystem controller with DCCM; DMA, Sync & I/O; AXI local interconnect; ASIP1, ASIP2, ASIP4 with D blocks; dedicated streaming interconnect (FIFOs); L3 external DRAM. Clock labels: 1.6 GHz, 1.6 GHz, 400 MHz.]
  • 20. Task Assignment #3. [Diagram, configuration 3: pipeline stages Grey scale conversion, Rescaling, Gradient, Histogram, Normalization, SVM, Non-max suppression; platform blocks HS subsystem controller with DCCM; DMA, Sync & I/O; AXI local interconnect; ASIP1, ASIP2, ASIP3, ASIP4 with D blocks; dedicated streaming interconnect (FIFOs); L3 external DRAM. Clock labels: 1.6 GHz, 400 MHz.]
  • 21. Task Assignment #4. [Diagram, configuration 4: pipeline stages Grey scale conversion, Rescaling, Gradient, Histogram, Normalization, SVM, Non-max suppression; platform blocks HS subsystem controller with DCCM; DMA, Sync & I/O; AXI local interconnect; ASIP1’, ASIP2, ASIP3, ASIP4 with D blocks; dedicated streaming interconnect (FIFOs); L3 external DRAM. Clock labels: 400 MHz, 400 MHz.]
  • 22. Task Assignment #4 With On-Chip L2. [Diagram, configuration 4: pipeline stages Grey scale conversion, Rescaling, Gradient, Histogram, Normalization, SVM, Non-max suppression; platform blocks HS with DCCM; DMA, Sync & I/O; AXI local interconnect; ASIP1’, ASIP2, ASIP3, ASIP4 with D blocks; dedicated streaming interconnect (FIFOs); L2 SRAM (storage of scaled images); L3 external DRAM. Bandwidth labels: 200 MB/s, 80 MB/s. Clock labels: 400 MHz, 400 MHz.]
  • 23. Power, Gate Count Comparisons (28 nm), 640 × 480 pixels, at 25 FPS. [Three charts comparing Config #2, Config #3 and Config #4: Gates (K), split into ASIP gates and ARC gates; Power (mW), split into ASIP power and ARC power; Effort (person-months), split into ASIP design and S/W and ARC S/W.] HAPS FPGA-based demo platform. Note: Gates and power for processors and local memory
  • 24. Final Results for Demonstrator Platform (Config #4) • 1 ARC HS, 4 ASIPs, AXI interconnect, private SRAM, L2 SRAM • Fixed point version of HOG derived from OpenCV • 25 frames/second at 400 MHz (ARC and ASIPs) • TSMC HPM process, 28 nm • Gate count (at 400 MHz): 471K gates • 303K gates for ASIPs, 168K gates for ARC HS 36 • Power consumption: 60 mW • Prototype running on HAPS board (ASIPs) • 4 frames/second at 70 MHz • Demonstration available at our booth
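The prototype figure can be sanity-checked against the ASIC figure with simple arithmetic. The sketch below assumes that frame rate scales linearly with clock frequency, which ignores memory and I/O effects and is not claimed on the slide; the variable names are illustrative. It also checks that the two gate counts add up to the stated total.

```python
# Figures taken from the slide
asic_fps = 25.0    # frames/second at the ASIC clock
asic_mhz = 400.0   # ARC and ASIP clock in the 28 nm estimate
fpga_mhz = 70.0    # clock of the HAPS prototype

# ASSUMPTION: throughput scales linearly with clock frequency
scaled_fps = asic_fps * fpga_mhz / asic_mhz

asip_kgates = 303
arc_kgates = 168
total_kgates = asip_kgates + arc_kgates

print(f"Expected prototype rate: {scaled_fps:.3f} frames/second")
print(f"Total gate count: {total_kgates}K gates")
```

Linear scaling gives a little over 4 frames/second, in line with the 4 frames/second reported for the prototype, and the gate counts sum to the 471K total.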
  • 25. Lessons Learned (Config #4). [Diagram: parallelism scale marked 1X, 10X, 100X, covering sequential tasks, task-level parallelism, and data-level parallelism; axes: Power-Performance-Area Efficiency vs. Time-to-market, Flexibility.] Subsystem Controller (HS): 1) Greyscale 2) Non-max suppr. 3) Display 4) Control, O/S. Emb. Vision Accelerators: ASIP1’) Rescaling + Gradient; ASIP2) Histogram; ASIP3) Normalization; ASIP4) SVM
  • 26. Lessons Learned (Config #4). [Diagram: parallelism scale marked 1X and 60X~80X, covering sequential tasks, task-level parallelism, and data-level parallelism; axes: Area Efficiency vs. Time-to-market, Flexibility.] Combined: 471K gates, 60 mW @ 28 nm. Emb. Vision Accelerators (ASIP1’, ASIP2, ASIP3, ASIP4) at 400 MHz: 303K gates, 58 mW (Logic = 12 mW, SRAM = 46 mW). Subsystem Controller (HS): 1.6 GHz: 473K gates, 37 µW/MHz; 400 MHz: 168K gates, 18 µW/MHz. Label: 20% utilization
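One way to reconcile the figures on this slide is the small power budget below. It is a reading of the numbers, not a breakdown given by the authors: in particular, applying the 20% utilization label to the ARC HS controller is an assumption, as are the variable names.

```python
# Figures taken from the slide
asip_logic_mw = 12.0     # ASIP logic power
asip_sram_mw = 46.0      # ASIP SRAM power
arc_uw_per_mhz = 18.0    # ARC HS at the 400 MHz design point
clock_mhz = 400.0

# ASSUMPTION: the "20% utilization" label applies to the ARC HS controller
arc_utilization = 0.20

asip_mw = asip_logic_mw + asip_sram_mw
arc_peak_mw = arc_uw_per_mhz * clock_mhz / 1000.0  # µW -> mW
arc_avg_mw = arc_peak_mw * arc_utilization
total_mw = asip_mw + arc_avg_mw

print(f"ASIPs: {asip_mw:.1f} mW")
print(f"ARC HS: {arc_peak_mw:.1f} mW peak, {arc_avg_mw:.2f} mW at assumed utilization")
print(f"Total: {total_mw:.2f} mW")
```

Under this reading the total comes to about 59.4 mW, close to the 60 mW combined figure, and nearly all of it is spent in the ASIPs and their SRAM.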
  • 27. Embedded Vision Platform Directions & Wish List. [Diagram: parallelism scale marked 1X, 10X, 100X, covering sequential tasks, task-level parallelism, and data-level parallelism; axes: Power-Performance-Area Efficiency vs. Time-to-market, Flexibility.] Subsystem Controller (HS): OpenCV; MQX O/S; Vision Extn.; SIMD (64 bit); high-level processing: Body part detection; Multi-object tracking; Post-processing; Command interpretation. Emb. Vision Accelerators (ASIP, ASIP, ASIP): Accelerator C API; Close coupling; pre-processing: Filtering; Color conversion; Image scaling; Feature extraction and matching; Segmentation
  • 28. Conclusions • Embedded vision applications combine complex algorithms and high data rates with a need for low power • Need to trade off Flexibility vs. Power-Perf-Area • Flexibility via high-performance ARC HS core • Ability to trade off power vs. performance • Scaling to multi-core, specialization and SIMD usage • Highest PPA via ASIPs • Performance gains and power efficiency due to tailored instruction sets and dedicated memory architecture • While fully programmable, gains are application specific
  • 29. Backup slides
  • 30. Design flow for the Vision Sub System. [Diagram labels: ARC HS; DW AXI interco; DesignWare DMA; DesignWare DDR; ARChitect; ASIP; Processor Designer; Core Assembler; ASIP description; ASIP ISA description; Ref Sub System; ASIP; Synthesis + P&R tools; Core Consultant; SubSys settings; ARC settings; coreKitTool; Core Builder; Core Builder; User config; VCS; DVE; MDB; PDBG; HAPS; Legend.]
  • 31. Synopsys’ ASIP Design Tool Flow. Tool chain: Processor Description Language; Optimizing Compiler; Assembler; Linker; Instruction-Set Simulator; Debugger; Profiler; RTL Generator; RTL Sim & FPGA; RTL Synthesis. Callouts: • Full-featured SDK with graphical debugger • Compiler supports processor-specific data types and operators • Advanced optimizations allow C programmers to easily tap into architectural efficiencies • Fast retargeting to evaluate incremental processor architecture changes quickly • High-level language to quickly capture ISA • Tight control of architecture (RTL-level) • Fast simulation technology • Easy integration into SystemC virtual platforms • Multicore and on-chip debugging • Smooth integration with RTL implementation and verification flows
  • 32. Architectural Optimization Space. [Diagram: ASIP architectural optimization space.] Parallelism: Instruction-level parallelism (Orthogonal instruction set (VLIW); Encoded instruction set); Data-level parallelism (Vector processing (SIMD)); Task-level parallelism (Multi-core; Multi-threading). Specialization: App.-specific data types; App.-specific instructions; Connectivity & storage matching application’s data-flow; Pipeline. Detail labels: App.-spec. data processing; App.-spec. memory addressing; App.-spec. control processing; Distributed regs, sub-ranges; Multiple mem’s, sub-ranges; Jumps, subroutines, interrupts, HW do-loops, residual control, predication…; Direct, indirect, post-modification, indexed, stack indirect…; Any exotic operator; Integer, fractional, floating-point, bits, complex, vector…; Single or multi-cycle; Relative or absolute, address range, delay slots…. Synopsys ASIP tools … • Support a wide range of ASIP architectures • Support RTL accelerator tricks for highest PPA efficiency • Enable ASIP optimization through architectural exploration