Programming Vision Pipelines on AMD’s AI Engines
Kristof Denolf (Principal Engineer)
Bader Md Alam (Director SW Engineering)
AMD
Versal is a Heterogeneous Chip Well Suited for Vision
2
© 2022 AMD
Let’s focus on the AI Engines
[Figure: Versal block diagram. Scalar Engines (Arm Cortex-A72 application processor, Arm Cortex-R5 real-time processor, platform management controller), Adaptable Engines and Intelligent Engines (AI Engines, DSP Engines) are linked by a programmable NoC, with native MIPI PHY, PCIe & CCIX (w/ DMA), DDR/LPDDR4, accelerator RAM and an optional host processor.]
[Figure: sensor pipeline. Radar, lidar and 4/8-Mpix multi-camera sensor I/O feed data conditioning (e.g., tiling), ISP / image conditioning pre-processing (vision), ToF sensor detection / point cloud pre-processing (radar / LiDAR), object classification, environment characterization and sensor fusion, followed by perception and behavioral SW processing (assessment, decision making), final decision making (functional safety), and vehicle control and HMI over Ethernet/CAN-FD.]
Agenda
• AI Engine Technology Introduction
• Compute Capabilities of the AI Engine
• Data Movement
• Vitis Vision:
  • Library Overview
  • Programming Vision Pipelines with Vitis
• Sneak Preview AIE-ML
• Conclusion
AI Engine Technology Introduction
Versal and AI Engine Terminology
[Figure: Versal ACAP terminology. The device combines Arm Cortex-A72 and Cortex-R5 processors, a PMC, adaptable hardware and the AI Engines. The AI Engine array is a grid of AI Engine tiles joined by an interconnect; each tile pairs an AI core (ISA-based vector processor with AI and 5G vector extensions) with 32 kB of memory and a data mover.]
128-400 1 GHz AI Engines (Versal Core)
Data type        GMACs per AI Engine   x128 (MACs/s)   x400 (MACs/s)
int8             128                   16 T            51 T
int16            32                    4 T             13 T
{float,int}32    8                     1 T             3 T
• AI Engine is a VLIW vector processor
• 32 kB memory, locks and data movers
• Directly connected to its neighbors
• Fully connected through AXI Stream interconnect
• MAC = 2 Ops
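The array-level figures in the table follow from multiplying the per-engine rate by the engine count and the clock. A minimal sketch of that arithmetic, assuming a 1 GHz clock as stated on the slide; the function names are illustrative and not part of any AMD tool:

```cpp
#include <cassert>
#include <cstdint>

// Peak MAC rate (MACs per second) of an array of AI Engines.
// macs_per_cycle: per-engine MACs per clock cycle (128 for int8, 32 for int16,
//                 8 for 32-bit types, per the table above).
// engines:        number of AI Engines in the array.
// clock_hz:       assumed clock frequency; defaults to the slide's 1 GHz.
inline std::uint64_t peak_macs_per_second(std::uint64_t macs_per_cycle,
                                          std::uint64_t engines,
                                          std::uint64_t clock_hz = 1000000000ULL) {
    assert(engines > 0 && clock_hz > 0);
    return macs_per_cycle * engines * clock_hz;
}

// Peak operations per second, using the slide's convention MAC = 2 Ops.
inline std::uint64_t peak_ops_per_second(std::uint64_t macs_per_cycle,
                                         std::uint64_t engines,
                                         std::uint64_t clock_hz = 1000000000ULL) {
    return 2 * peak_macs_per_second(macs_per_cycle, engines, clock_hz);
}
```

For int8 on 400 engines this gives 51.2 x 10^12 MACs/s, which the table rounds to 51 T.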
Multi-Precision Support Enables Different Pixel Depths
AI Data Types: MACs / cycle (per core)
Config        MACs / cycle
32x32 SPFP    8
32x32 int     8
32x16 int     16
16x16 int     32
16x8 int      64
8x8 int       128
Local Memory Access: # data accesses / cycle (per LD/ST unit)
Data type     Accesses / cycle
SPFP          8
32b           8
16b           16
8b            32
Each AI Engine has:
• 2 x 256b LD units
• 1 x 256b ST
Data reuse needed to match memory bandwidth with 100% MAC utilization:
Config    Data reuse    Coeff reuse
32x32     1x            1x
32x16     2x            1x
16x16     2x            2x
16x8      4x            2x
8x8       4x            4x
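The reuse factors above can be reproduced from the load width and the MAC rate. A sketch under one assumption that the slide does not state explicitly: one 256b LD unit streams data while the other streams coefficients, so each loaded element must be reused (MACs per cycle) / (elements per load) times. The helper names are hypothetical:

```cpp
#include <cassert>

// Elements delivered per cycle by one 256-bit load unit for a given element width.
inline int elements_per_load(int element_bits) {
    assert(element_bits > 0 && 256 % element_bits == 0);
    return 256 / element_bits;
}

// How many times each loaded element must be reused so that a single load
// stream can keep macs_per_cycle multipliers busy every cycle.
inline int reuse_factor(int macs_per_cycle, int element_bits) {
    const int loaded = elements_per_load(element_bits);
    assert(macs_per_cycle % loaded == 0);
    return macs_per_cycle / loaded;
}
```

For the 16x8 configuration (64 MACs per cycle), 16-bit data arrives 16 elements per cycle and 8-bit coefficients 32 per cycle, giving the 4x and 2x of the table.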
Measured results
Vectorization Example
• More compute with smaller data types
• Data reuse to enable maximum vector compute
AI Engine: SW Programmable Signal Processor
[Figure: AI Engine tile block diagram. The AI core (32-bit scalar RISC unit, fixed-point and floating-point 512b SIMD vector units, 1+ GHz) contains 16 KB program memory, an instruction fetch & decode unit, load & store address generation units, scalar and vector register files, accumulator, stream FIFO, stall handler, and control, debug & trace. Memory interfaces connect it to 32 KB data memories; MM2S and S2MM DMAs, an AXIM switch and AXIS north/south links move data.]
[Figure: tool flow. videoKernel.cpp is compiled by the AIE compiler and run in the AIE simulator, which reports results and cycles; the program is then optimized to leverage HW resources.]
AI Engine: SW Programmable Signal Processor with Zero Loop Overhead on Counters and Buffer Auto Increment
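int32 *inDataMemory;                      // input buffer in local data memory
int32 *outDataMemory;                     // output buffer in local data memory
aie::vector<int32,16> vectorOfData;
aie::vector<int32,16> vectorOfResults;
loop(expression) {                        // loop counters cost no extra cycles
  loop(expression) {
    vectorOfData = *inDataMemory++;       // vector load, pointer auto-increments
    vectorOfResults = processing on vectorOfData;
    *outDataMemory++ = vectorOfResults;   // vector store, pointer auto-increments
  }
}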
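The load, process, store pattern of the pseudocode can be tried on a host machine. This is a portable stand-in only: `Vec16` replaces `aie::vector<int32,16>`, `process` is a hypothetical operation (adding an offset), and nothing here models the zero-overhead loops or address generators of the hardware.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kLanes = 16;               // lanes of aie::vector<int32,16>
using Vec16 = std::array<std::int32_t, kLanes>;  // host-side stand-in, not the AIE type

// Hypothetical per-vector processing step: add a constant to every lane.
inline Vec16 process(const Vec16& v, std::int32_t offset) {
    Vec16 r{};
    for (std::size_t l = 0; l < kLanes; ++l) r[l] = v[l] + offset;
    return r;
}

// Walks the input 16 elements at a time, advancing both pointers after each
// vector, as the auto-incrementing pointers in the pseudocode do.
inline std::vector<std::int32_t> run_kernel(const std::vector<std::int32_t>& in,
                                            std::int32_t offset) {
    assert(in.size() % kLanes == 0);
    std::vector<std::int32_t> out(in.size());
    const std::int32_t* src = in.data();
    std::int32_t* dst = out.data();
    for (std::size_t i = 0; i < in.size() / kLanes; ++i) {
        Vec16 vectorOfData;
        std::copy(src, src + kLanes, vectorOfData.begin());  // vector load
        src += kLanes;                                       // pointer increment
        const Vec16 vectorOfResults = process(vectorOfData, offset);
        std::copy(vectorOfResults.begin(), vectorOfResults.end(), dst);  // vector store
        dst += kLanes;
    }
    return out;
}
```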
Filter2D – Basic Algorithm
32b data x 32b coefficients
Complexity
• O(N·k^2)
• N = Image Size, k = Kernel Size
int32 *img_in;   // input image, row stride imageW; border handling omitted
int32 *img_out;  // output image, imageH x imageW
for (int i = 0; i < imageH; i++) {
  for (int j = 0; j < imageW; j++) {
    int32_t accum = 0;
    for (int m = 0; m < kernelH; m++) {
      for (int n = 0; n < kernelW; n++) {
        accum += kernel_coeff[m*kernelW + n] * img_in[(m+i)*imageW + (j+n)];
      }
    }
    img_out[i*imageW + j] = accum;
  }
}
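A runnable host-side version of the basic algorithm may help when checking results. The sketch below makes one choice the slide leaves open: it computes only the "valid" region, so the output is (imageH - kernelH + 1) x (imageW - kernelW + 1) and no pixel outside the image is read. The function name is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar 2D filter over the valid region (no kernel flip, as in the slide).
// img is row-major with imageH rows and imageW columns.
inline std::vector<std::int32_t> filter2d_valid(const std::vector<std::int32_t>& img,
                                                int imageH, int imageW,
                                                const std::vector<std::int32_t>& coeff,
                                                int kernelH, int kernelW) {
    assert(kernelH > 0 && kernelW > 0 && imageH >= kernelH && imageW >= kernelW);
    assert(img.size() == static_cast<std::size_t>(imageH) * imageW);
    assert(coeff.size() == static_cast<std::size_t>(kernelH) * kernelW);
    const int outH = imageH - kernelH + 1;
    const int outW = imageW - kernelW + 1;
    std::vector<std::int32_t> out(static_cast<std::size_t>(outH) * outW);
    for (int i = 0; i < outH; ++i) {
        for (int j = 0; j < outW; ++j) {
            std::int32_t accum = 0;
            for (int m = 0; m < kernelH; ++m) {
                for (int n = 0; n < kernelW; ++n) {
                    accum += coeff[static_cast<std::size_t>(m) * kernelW + n] *
                             img[static_cast<std::size_t>(i + m) * imageW + (j + n)];
                }
            }
            out[static_cast<std::size_t>(i) * outW + j] = accum;
        }
    }
    return out;
}
```

The four nested loops make the N·k^2 cost visible: every output pixel performs kernelH x kernelW multiply-accumulates.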
kernelH
(m)
imageH
(i)
Filter2D – Unroll Inner Loops (Prepare for Vectorization)
32b data x 32b coefficients
int32 *img_in;   // input image, row stride imageW; border handling omitted
int32 *img_out;  // output image, imageH x imageW
for (int i = 0; i < imageH; i++) {
  for (int j = 0; j < imageW; j++) {
    int32_t accum;
    accum  = kernel_coeff[0]*img_in[(0+i)*imageW+(j+0)];
    accum += kernel_coeff[1]*img_in[(0+i)*imageW+(j+1)];
    accum += kernel_coeff[2]*img_in[(0+i)*imageW+(j+2)];
    accum += kernel_coeff[3]*img_in[(1+i)*imageW+(j+0)];
    accum += kernel_coeff[4]*img_in[(1+i)*imageW+(j+1)];
    accum += kernel_coeff[5]*img_in[(1+i)*imageW+(j+2)];
    accum += kernel_coeff[6]*img_in[(2+i)*imageW+(j+0)];
    accum += kernel_coeff[7]*img_in[(2+i)*imageW+(j+1)];
    accum += kernel_coeff[8]*img_in[(2+i)*imageW+(j+2)];
    img_out[i*imageW + j] = accum;
  }
}
kernelH
(m)
imageH
(i)
Unrolled
(for
3x3
kernel)
Filter2D – Vectorize by 8 in Horizontal Dimension
32b data x 32b coefficients
• Scalar Reference Solution (32b data and 32b coefficients)
int32 *img_in;
for (int i = 0; i < imageH; i++) {        // r1, r2, r3 = image rows i, i+1, i+2
  for (int j = 0; j < imageW; j += 8) {   // column ranges below are relative to j
    vector<int32_t,8> accum8;
    accum8  = kernel_coeff[0]*img_in[r1:0..7];
    accum8 += kernel_coeff[1]*img_in[r1:1..8];
    accum8 += kernel_coeff[2]*img_in[r1:2..9];
    accum8 += kernel_coeff[3]*img_in[r2:0..7];
    accum8 += kernel_coeff[4]*img_in[r2:1..8];
    accum8 += kernel_coeff[5]*img_in[r2:2..9];
    accum8 += kernel_coeff[6]*img_in[r3:0..7];
    accum8 += kernel_coeff[7]*img_in[r3:1..8];
    accum8 += kernel_coeff[8]*img_in[r3:2..9];
    img_out[i*imageW + j .. j+7] = accum8;
  }
}
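The vectorize-by-8 step can be checked on a host by emulating the 8 lanes with a small array and comparing against a scalar loop. A sketch with assumptions: valid-region output only, output width a multiple of 8, and `Acc8` / `mac8` as hypothetical stand-ins for a vector accumulator and a vector multiply-accumulate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kLanes = 8;
using Acc8 = std::array<std::int32_t, kLanes>;  // stand-in for an 8-lane accumulator

// acc[l] += c * row[start + l] for all 8 lanes: one coefficient, 8 neighboring pixels.
inline void mac8(Acc8& acc, std::int32_t c, const std::int32_t* row, int start) {
    for (int l = 0; l < kLanes; ++l) acc[l] += c * row[start + l];
}

// 3x3 filter producing 8 horizontally adjacent outputs per inner iteration.
inline std::vector<std::int32_t> filter3x3_vec8(const std::vector<std::int32_t>& img,
                                                int imageH, int imageW,
                                                const std::array<std::int32_t, 9>& k) {
    const int outH = imageH - 2, outW = imageW - 2;
    assert(imageH >= 3 && imageW >= 3 && outW % kLanes == 0);
    assert(img.size() == static_cast<std::size_t>(imageH) * imageW);
    std::vector<std::int32_t> out(static_cast<std::size_t>(outH) * outW);
    for (int i = 0; i < outH; ++i) {
        for (int j = 0; j < outW; j += kLanes) {
            Acc8 acc{};  // zero-initialized accumulator
            for (int m = 0; m < 3; ++m) {
                const std::int32_t* row = &img[static_cast<std::size_t>(i + m) * imageW];
                for (int n = 0; n < 3; ++n) mac8(acc, k[m * 3 + n], row, j + n);
            }
            for (int l = 0; l < kLanes; ++l)
                out[static_cast<std::size_t>(i) * outW + j + l] = acc[l];
        }
    }
    return out;
}

// Scalar reference over the same valid region, used to check the lane version.
inline std::vector<std::int32_t> filter3x3_scalar(const std::vector<std::int32_t>& img,
                                                  int imageH, int imageW,
                                                  const std::array<std::int32_t, 9>& k) {
    const int outH = imageH - 2, outW = imageW - 2;
    assert(imageH >= 3 && imageW >= 3);
    std::vector<std::int32_t> out(static_cast<std::size_t>(outH) * outW);
    for (int i = 0; i < outH; ++i)
        for (int j = 0; j < outW; ++j) {
            std::int32_t accum = 0;
            for (int m = 0; m < 3; ++m)
                for (int n = 0; n < 3; ++n)
                    accum += k[m * 3 + n] * img[static_cast<std::size_t>(i + m) * imageW + j + n];
            out[static_cast<std::size_t>(i) * outW + j] = accum;
        }
    return out;
}
```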
kernelH
(m)
imageH
(i)
Vectorizing with Factor 8 while Exploiting Vector Register Data Reuse through Select
New inner loop pseudo code:
acc  = mul(coeff, c_sel1, data_buf, d_sel1);
acc += mul(coeff, c_sel2, data_buf, d_sel2);
acc += mul(coeff, c_sel3, data_buf, d_sel3);
acc += mul(coeff, c_sel4, data_buf, d_sel4);
acc += mul(coeff, c_sel5, data_buf, d_sel5);
acc += mul(coeff, c_sel6, data_buf, d_sel6);
acc += mul(coeff, c_sel7, data_buf, d_sel7);
acc += mul(coeff, c_sel8, data_buf, d_sel8);
acc += mul(coeff, c_sel9, data_buf, d_sel9);
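The idea of reusing a vector register through a select can be emulated on a host. In this sketch each image row is loaded once into a 16-element buffer, and the three horizontal taps read lanes n..n+7 of that same buffer instead of reloading. `mul_sel` and its select semantics are assumptions made for illustration; the slide does not define what `c_sel` and `d_sel` encode on the hardware.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 8;   // outputs computed together
constexpr int kBuf = 16;    // pixels held in one register buffer
using Buf16 = std::array<std::int32_t, kBuf>;
using Acc8 = std::array<std::int32_t, kLanes>;

// Hypothetical stand-in for the slide's mul(coeff, c_sel, data_buf, d_sel):
// lane l of the result is coeff[c_sel] * data_buf[d_sel + l].
inline Acc8 mul_sel(const std::array<std::int32_t, 9>& coeff, int c_sel,
                    const Buf16& data_buf, int d_sel) {
    assert(c_sel >= 0 && c_sel < 9);
    assert(d_sel >= 0 && d_sel + kLanes <= kBuf);
    Acc8 r{};
    for (int l = 0; l < kLanes; ++l) r[l] = coeff[c_sel] * data_buf[d_sel + l];
    return r;
}

// Eight outputs of a 3x3 filter from three row buffers. Each buffer is loaded
// once and read by three taps, so 3 loads serve 9 multiply steps.
inline Acc8 filter3x3_8outputs(const std::array<Buf16, 3>& rows,
                               const std::array<std::int32_t, 9>& coeff) {
    Acc8 acc{};
    for (int m = 0; m < 3; ++m) {
        for (int n = 0; n < 3; ++n) {
            const Acc8 p = mul_sel(coeff, m * 3 + n, rows[m], n);
            for (int l = 0; l < kLanes; ++l) acc[l] += p[l];
        }
    }
    return acc;
}
```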
AI Engine (Array) is Built for Parallel Data Movement and Compute
[Figure: two neighboring AI Engine tiles. Each AI core (32-bit scalar RISC, 512-bit vector core, 1+ GHz) sits between 32 KB data memories with DMAs and connects through the interconnect to an AXIM switch and AXIS north/south links.]
• Data push system
• Control flow support → data flow style implementations
Zoom out to System Level
[Figure: the AI Engine tiles (AI core, memory, AXIM switch, AXIS north/south links) are joined by the interconnect into an array of memories and AI cores that reaches DDR memory through the NoC.]
Vision Processing Pipeline
4K ~ 8 MPixels
Vision Processing Graph Exploits Specialized Data Movement
Composing DMA
(Stitcher)
Decomposing DMA
(Tiler)
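What the Tiler and Stitcher do to an image can be sketched on a host: cut the frame into sub-images small enough for local memory, then write processed tiles back at their original positions. This is a simplified illustration and not the Vitis Vision data mover API; `Tile`, `tile_image` and `stitch_image` are hypothetical names, and the tile overlap that a filter needs at tile borders is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One sub-image plus the position it came from.
struct Tile {
    int x, y, w, h;
    std::vector<std::int32_t> pixels;  // row-major, w * h elements
};

// "Decomposing" step: cut a row-major H x W image into tiles of at most tileH x tileW.
inline std::vector<Tile> tile_image(const std::vector<std::int32_t>& img, int H, int W,
                                    int tileH, int tileW) {
    assert(tileH > 0 && tileW > 0);
    assert(img.size() == static_cast<std::size_t>(H) * W);
    std::vector<Tile> tiles;
    for (int y = 0; y < H; y += tileH) {
        for (int x = 0; x < W; x += tileW) {
            Tile t{x, y, std::min(tileW, W - x), std::min(tileH, H - y), {}};
            t.pixels.reserve(static_cast<std::size_t>(t.w) * t.h);
            for (int r = 0; r < t.h; ++r)
                for (int c = 0; c < t.w; ++c)
                    t.pixels.push_back(img[static_cast<std::size_t>(y + r) * W + (x + c)]);
            tiles.push_back(std::move(t));
        }
    }
    return tiles;
}

// "Composing" step: write every tile back to its position in an H x W image.
inline std::vector<std::int32_t> stitch_image(const std::vector<Tile>& tiles, int H, int W) {
    std::vector<std::int32_t> img(static_cast<std::size_t>(H) * W, 0);
    for (const Tile& t : tiles)
        for (int r = 0; r < t.h; ++r)
            for (int c = 0; c < t.w; ++c)
                img[static_cast<std::size_t>(t.y + r) * W + (t.x + c)] =
                    t.pixels[static_cast<std::size_t>(r) * t.w + c];
    return img;
}
```

Tiling followed by stitching returns the original image, which makes a convenient sanity check before a processing step is placed between the two.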
[Figure: the Tiler and Stitcher DMAs move sub-images between DDR memory (via the NoC) and local buffers next to the AI cores that run the vision processing pipeline.]
Vitis Vision: Library Overview, Programming a Vision Pipeline and Tools
What is in the AI Engine Vision Library?
DRAM
PS
Vision
kernel
#1
Data Mover
(Tiler)
AXI-MM AXI-S
Data Mover (Stitcher)
AXI-MM AXI-S
2 DataMover Options:
1) PL via PLIO
2) SW/ NoC via GMIO
Ease-of-Use – High level abstraction for data movement
Code to define DataMover
Host code to call datamover & run graph
Graph code for kernel
cv2.filter2D(img,-1,kernel,dst)
AI Engine vision
kernels
data-movement
Vitis Tool Overview
PL and AIE Integration ( v++ --link)
Generate Binary (v++ --package)
AIE Kernels, Graph
AIE Simulation
PL Kernels (HLS)
HLS Cosimulation
SIM
AIESim QEMU
Vitis HW Platform
Vitis SW Platform
Linux + rootfs
Run on Device
Profile
PL (HLS/RTL)
AI Engine Platform
Debug
PS APP
PL Kernels (HLS)
HW Emulation
AIE Kernels, Graph
Host
AI Engine vision
kernels
data-movement
Host code
Vitis Tool Overview Slide
cv2.filter2D(img,-1,kernel,dst)
host.cpp
adf: graph.{h,cpp}
includes xf_filter2d.cc
AI Engine vision
kernels
data-movement
Host code
Library of Optimized Vision Kernels – 1x AI Engine Core Performance
[Chart: FPS achieved per kernel, processing 4K resolution images. Plotted values: 219, 219, 123, 123, 194, 220, 123, 219, 154, 220, 87, 195, 195, 192; reference line at 60 fps.]
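The frame rates translate into a per-pixel cycle budget for one core. A back-of-the-envelope sketch, assuming a 3840 x 2160 frame (the deck only says 4K ~ 8 MPixels) and a 1 GHz clock; both numbers are assumptions, as are the function names.

```cpp
#include <cassert>

// Pixels that must be processed per second at a given resolution and frame rate.
inline double pixels_per_second(int width, int height, double fps) {
    assert(width > 0 && height > 0 && fps > 0.0);
    return static_cast<double>(width) * height * fps;
}

// Clock cycles available per pixel on a single core.
inline double cycles_per_pixel(double clock_hz, int width, int height, double fps) {
    assert(clock_hz > 0.0);
    return clock_hz / pixels_per_second(width, height, fps);
}
```

Under these assumptions, 60 fps leaves about 2 cycles per pixel, and a kernel reaching 219 fps spends roughly 0.55 cycles per pixel, which is only possible because many pixels are processed per vector instruction.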
Vitis Vision Library: AI Engine Portfolio
2D/3D Noise Reduction
Mono, RGB-IR Debayering
Bicubic Resize Tone Mappers
Background Matting HDRFusion Feature Extractors
Mask Generation Histogram Equalization Remap
IntersectionOfUnion Quantization and Dithering Warp
Box-Sort AWB Stereo GBM
NMS AEC Stereo LBM
Crop/Patch Gamma Correction OTSU Thresholding
Absolute Difference Channel Interleaving BlackLevelCorrection SeparableFilters
Accumulate Weighted LenseShadingCorrection
Accumulate Normalization filter2D
ConvertScaleAbs Resize (Bilinear) Gain Control Gaussian Blur
PixelWiseMul Thresholding Defective Pixel Correction Erode
ZeroFunction ColorConversion Debayering Laplacian
Basic
Functionality
DNN
(X of X+ML)
Image Sensor Processing (ISP)
Filters/
Others
Vitis Vision Lib:
AI Engine Portfolio
2021.1 / planned
Sneak Preview: AI Engine-ML
Intelligent Engines Optimized for Any Whole Vision AI Application
AIE: signal processing optimized. AIE-ML: AI inference optimized.
AI Engine Architecture
[Figure: rows of 1X compute tiles alongside UltraRAM and LUTs]
• Optimized for signal processing AND ML
• Flexibility for high performance DSP applications
• Native support for INT8, INT16, FP32
OPS / Tile        AIE     AIE-ML
INT4              256     1024
INT8              256     512
INT16             64      128
BFLOAT16          -       256
INT32             16      16
FP32              16      42*
KB / Tile         AIE     AIE-ML
Data Memory       32      64
Program Memory    16      16
*Via software emulation
AIE-ML Architecture
[Figure: rows of 2X compute tiles alongside LUTs and 512KB memory tiles]
• Optimized for ML Inference Applications
• Maximum AI/ML compute with reduced footprint
• Native support for INT4, INT8, INT16, bfloat16
• Fine grained sparsity HW optimization
• 2X INT8/16 OPs/Tile
• 4X INT4 OPs/Tile
• Reduced data movement
• Reduced AI PL footprint
• 2X AI Perf/W
Versal AIE-ML offers 2X AI Performance per Watt
Conclusions
• The AI Engines of the Versal device support vision workloads by design
• VLIW vector processor with zero loop overhead and auto buffer increment
• Vector compute in different bit depths covering essential vision operators
• Concurrent data movement and compute in AI Engine array through DMAs
• Composing/decomposing data movers tile images into sub-images that fit local memory
• Supports streaming dataflow pipelines in the AI Engine array
• Growing kernel library covering both typical vision kernels and ISP kernels, available in open source
• Vitis tools support easy programming of vision pipelines
Resources
Vitis (Vision) Resources
• Vitis Vision
https://www.xilinx.com/products/design-tools/vitis/vitis-libraries/vitis-vision.html
• Github docs
https://xilinx.github.io/Vitis_Libraries/vision/2021.2/index.html
• Github code
https://github.com/Xilinx/Vitis_Libraries/tree/master/vision
• Vitis
https://www.xilinx.com/products/design-tools/vitis/vitis-platform.html
AI Engine Resources
• AI Engines
https://www.xilinx.com/products/technology/ai-engine.html
• Versal Core Product Family
https://www.xilinx.com/products/silicon-devices/acap/versal-ai-core.html
• Versal AI Edge Product Family
https://www.xilinx.com/products/silicon-devices/acap/versal-ai-edge.html
2022 Embedded Vision Summit
Please visit our AMD booth
Thank You

“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 

“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD

  • 1. Programming Vision Pipelines on AMD’s AI Engines Kristof Denolf (Principal Engineer) Bader Md Alam (Director SW Engineering) AMD
  • 2. Versal is a Heterogeneous Chip Well Suited for Vision. Let’s focus on the AI Engines. [Block diagram labels: Scalar Engines (Arm Cortex-A72 Application Processor, Arm Cortex-R5 Real-Time Processor); Adaptable Engines; Intelligent Engines (AI Engines, DSP Engines); Programmable NoC; Platform Management Controller; Native MIPI PHY; PCIe & CCIX (w/ DMA); DDR / LPDDR4; Accelerator RAM; optional host CPU. Sensor inputs: Radar, Lidar, 4/8-Mpix Multi-Camera. Processing stages: Data Conditioning (e.g., Tiling); ISP / Image Conditioning Pre-Processing (Vision); ToF Sensor Detection / Point Cloud Pre-Processing (Radar / LiDAR); Environment Characterization; Object Classification; Sensor Fusion; Perception & Behavioral SW Processing (Assessment, Decision Making); Final Decision Making (Functional Safety). Outputs: Vehicle Control and HMI over Ethernet / CAN-FD.]
  • 3. Agenda: AI Engine Technology Introduction; Compute Capabilities of the AI Engine; Data Movement; Vitis Vision (Library Overview, Programming Vision Pipelines with Vitis); Sneak Preview AIE-ML; Conclusion
  • 4. AI Engine Technology Introduction
  • 5. Versal and AI Engine Terminology
      • AI Engine is a VLIW vector processor
      • 32 kB memory, locks and data movers
      • Directly connected to its neighbors
      • Fully connected through AXI Stream interconnect
      • MAC = 2 Ops
      Versal ACAP: 128-400 AI Engines at 1 GHz (Versal Core)
      Data type     | GMACs per AI Engine | x 128 (MACs/s) | x 400 (MACs/s)
      int8          | 128                 | 16 T           | 51 T
      int16         | 32                  | 4 T            | 13 T
      {float,int}32 | 8                   | 1 T            | 3 T
      [Diagram labels: Versal ACAP (Adaptable Hardware, AI Engines, Arm Cortex-R5, Arm Cortex-A72, PMC); AI Engine Array (AI Core + Memory tiles, Interconnect); AI Engine Tile (ISA-based Vector Processor, 32 kB Memory, AI Vector Extensions, 5G Vector Extensions, Data Mover)]
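The array-level figures in the slide 5 table follow from multiplying the per-engine rate by the engine count. A minimal C++ sketch of that arithmetic is given here; the function names are hypothetical, and it assumes every engine sustains its peak rate at the 1 GHz clock quoted on the slide.

```cpp
#include <cassert>

// Peak array throughput in tera-MACs per second.
// gmacs_per_engine: per-engine peak in giga-MACs/s (slide 5 table).
// engines: number of AI Engines in the array (128 to 400 on the slide).
// Assumption: all engines run at peak simultaneously.
inline double peak_tmacs(double gmacs_per_engine, int engines) {
    assert(gmacs_per_engine > 0.0 && engines > 0);
    return gmacs_per_engine * engines / 1000.0;
}

// The slide counts one MAC as two operations (multiply and add).
inline double peak_tops(double gmacs_per_engine, int engines) {
    return 2.0 * peak_tmacs(gmacs_per_engine, engines);
}
```

For example, `peak_tmacs(128, 128)` gives 16.384 and `peak_tmacs(128, 400)` gives 51.2, which the slide rounds to 16 T and 51 T for int8.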
  • 6. Multi-Precision Support Enables Different Pixel Depths
      • More compute with smaller datatypes
      • Data reuse to enable maximum vector compute
      MACs / cycle (per core) by AI data type: 32x32 SPFP: 8; 32x32 int: 8; 32x16 int: 16; 16x16 int: 32; 16x8 int: 64; 8x8 int: 128
      Local memory access, # data accesses / cycle (per LD/ST unit): SPFP: 8; 32b: 8; 16b: 16; 8b: 32
      Each AI Engine has 2 x 256b LD units and 1 x 256b ST unit.
      Data reuse needed to match memory bandwidth with 100% MAC utilization:
      Config | Data reuse | Coeff reuse
      32x32  | 1x         | 1x
      32x16  | 2x         | 1x
      16x16  | 2x         | 2x
      16x8   | 4x         | 2x
      8x8    | 4x         | 4x
      [Also on the slide: measured results, vectorization example]
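One way to read the reuse table on slide 6 is as the ratio between the MACs issued per cycle and the operands a single 256b load delivers per cycle. The C++ sketch below reproduces the table under that reading; it assumes one load unit feeds data and the other feeds coefficients, which is an interpretation rather than something the slide states, and the function names are hypothetical.

```cpp
#include <cassert>

// Width of one AI Engine load unit in bits (slide 6: 2 x 256b LD units).
constexpr int kLoadBits = 256;

// Operands one load unit delivers per cycle for a given element width.
inline int operands_per_load(int element_bits) {
    assert(element_bits > 0 && kLoadBits % element_bits == 0);
    return kLoadBits / element_bits;
}

// Number of MACs each loaded operand must feed so that one load unit
// can keep `macs_per_cycle` multipliers busy every cycle.
inline int reuse_factor(int macs_per_cycle, int element_bits) {
    const int per_load = operands_per_load(element_bits);
    assert(macs_per_cycle % per_load == 0);
    return macs_per_cycle / per_load;
}
```

For the 16x8 configuration (64 MACs per cycle), `reuse_factor(64, 16)` returns 4 for the 16-bit data and `reuse_factor(64, 8)` returns 2 for the 8-bit coefficients, matching the 4x / 2x row.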
  • 7. AI Engine: SW Programmable Signal Processor. Tool flow: videoKernel.cpp -> AIE Compiler -> AIE simulator -> Results & Cycles; optimize the program to leverage HW resources. [AI Engine Tile diagram labels: AI Core (32 bit scalar RISC, 512 bit vector core, 1+ GHz); Program Memory (16KB); Instruction Fetch & Decode Unit; Load & Store Address Generation Units; 32b Scalar RISC Unit; Fixed Point 512b SIMD Vector Unit; Floating Point 512b SIMD Vector Unit; Scalar Register Files; Vector Register Files; Accumulator; Stream FIFO; Stall Handler; Control, Debug & Trace; Data Memory (32KB) with MEM I/F; MM2S DMA; S2MM DMA; AXIM Switch; AXIS North; AXIS South]
  • 8. AI Engine: SW Programmable Signal Processor with Zero Loop Overhead on Counters and Buffer Auto Increment
      int32 *inDataMemory;
      int32 *outDataMemory;
      aie::vector<int32,16> vectorOfData;
      loop(expression) {
        loop(expression) {
          vectorOfData = *inDataMemory++;
          processing on vectorOfData;
          *outDataMemory++ = vectorOfResults;
        }
      }
      [AI Engine Tile diagram as on slide 7]
  • 9. Filter2D – Basic Algorithm (32b data x 32b coefficients)
      Complexity: O(N·k^2), where N = image size and k = kernel size
      int32 *img_in;
      // Border handling is omitted: reads past the last row/column must be covered by padding.
      for (int i = 0; i < imageH; i++) {
        for (int j = 0; j < imageW; j++) {
          int32_t accum = 0;
          for (int m = 0; m < kernelH; m++) {
            for (int n = 0; n < kernelW; n++) {
              accum += kernel_coeff[m*kernelW + n] * img_in[(m+i)*imageW + (j+n)];
            }
          }
          img_out[i*imageW + j] = accum;
        }
      }
      [Figure labels: kernelH (m), imageH (i)]
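The slide 9 loop nest can be turned into a small standalone reference. The sketch below is one possible version, not the library's implementation: the function name `filter2d_scalar` is hypothetical, and unlike the slide it computes only the valid region (no padding), so the output shrinks by `kernelH - 1` rows and `kernelW - 1` columns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar 2D filter over the valid region (no kernel flip, no border padding).
// Output size is (imageH - kernelH + 1) x (imageW - kernelW + 1), row-major.
std::vector<int32_t> filter2d_scalar(const std::vector<int32_t>& img,
                                     int imageH, int imageW,
                                     const std::vector<int32_t>& coeff,
                                     int kernelH, int kernelW) {
    assert(kernelH > 0 && kernelW > 0);
    assert(imageH >= kernelH && imageW >= kernelW);
    assert(img.size() == static_cast<std::size_t>(imageH) * imageW);
    assert(coeff.size() == static_cast<std::size_t>(kernelH) * kernelW);

    const int outH = imageH - kernelH + 1;
    const int outW = imageW - kernelW + 1;
    std::vector<int32_t> out(static_cast<std::size_t>(outH) * outW);

    for (int i = 0; i < outH; i++) {
        for (int j = 0; j < outW; j++) {
            int32_t accum = 0;
            // Same k^2 inner work per output pixel as on the slide.
            for (int m = 0; m < kernelH; m++) {
                for (int n = 0; n < kernelW; n++) {
                    accum += coeff[m * kernelW + n] *
                             img[(m + i) * imageW + (j + n)];
                }
            }
            out[i * outW + j] = accum;
        }
    }
    return out;
}
```

With a 4x4 image holding the values 0 to 15 and a 3x3 kernel whose only nonzero coefficient is a 1 in the center, the result is the 2x2 interior of the image: 5, 6, 9, 10.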
  • 10. Filter2D – Unroll Inner Loops (Prepare for Vectorization), 32b data x 32b coefficients
      int32 *img_in;
      for (int i = 0; i < imageH; i++) {
        for (int j = 0; j < imageW; j++) {
          int32_t accum = 0;
          // Unrolled for a 3x3 kernel
          accum  = kernel_coeff[0]*img_in[(0+i)*imageW+(j+0)];
          accum += kernel_coeff[1]*img_in[(0+i)*imageW+(j+1)];
          accum += kernel_coeff[2]*img_in[(0+i)*imageW+(j+2)];
          accum += kernel_coeff[3]*img_in[(1+i)*imageW+(j+0)];
          accum += kernel_coeff[4]*img_in[(1+i)*imageW+(j+1)];
          accum += kernel_coeff[5]*img_in[(1+i)*imageW+(j+2)];
          accum += kernel_coeff[6]*img_in[(2+i)*imageW+(j+0)];
          accum += kernel_coeff[7]*img_in[(2+i)*imageW+(j+1)];
          accum += kernel_coeff[8]*img_in[(2+i)*imageW+(j+2)];
          img_out[i*imageW + j] = accum;
        }
      }
      [Figure labels: kernelH (m), imageH (i)]
  • 11. Filter2D – Vectorize by 8 in Horizontal Dimension, 32b data x 32b coefficients
      • Scalar Reference Solution (32b data and 32b coefficients)
      Pseudocode; r1, r2, r3 denote image rows i, i+1, i+2, and a..b a column range relative to j:
      int32 *img_in;
      for (int i = 0; i < imageH; i++) {
        for (int j = 0; j < imageW; j += 8) {
          vector<int32_t,8> accum8 = 0;
          accum8  = kernel_coeff[0]*img_in[r1:0..7];
          accum8 += kernel_coeff[1]*img_in[r1:1..8];
          accum8 += kernel_coeff[2]*img_in[r1:2..9];
          accum8 += kernel_coeff[3]*img_in[r2:0..7];
          accum8 += kernel_coeff[4]*img_in[r2:1..8];
          accum8 += kernel_coeff[5]*img_in[r2:2..9];
          accum8 += kernel_coeff[6]*img_in[r3:0..7];
          accum8 += kernel_coeff[7]*img_in[r3:1..8];
          accum8 += kernel_coeff[8]*img_in[r3:2..9];
          img_out[i*imageW + j .. j+7] = accum8;
        }
      }
      [Figure labels: kernelH (m), imageH (i)]
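The vectorize-by-8 step on slide 11 can be tried out on any host with a portable emulation of the eight lanes. The sketch below is an illustration only: `Vec8`, `load8`, `mac8` and `filter3x3_vec8` are hypothetical names standing in for vector registers and intrinsics, not the AI Engine API, and the code covers the valid region of a 3x3 filter with an output width that is a multiple of 8.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kLanes = 8;
using Vec8 = std::array<int32_t, kLanes>;  // stand-in for an 8-lane vector register

// Load 8 consecutive pixels starting at p (stand-in for a vector load).
inline Vec8 load8(const int32_t* p) {
    Vec8 v{};
    for (int l = 0; l < kLanes; ++l) v[l] = p[l];
    return v;
}

// acc += c * v in every lane (stand-in for a vector multiply-accumulate).
inline void mac8(Vec8& acc, int32_t c, const Vec8& v) {
    for (int l = 0; l < kLanes; ++l) acc[l] += c * v[l];
}

// 3x3 filter over the valid region, producing 8 output pixels per iteration.
std::vector<int32_t> filter3x3_vec8(const std::vector<int32_t>& img,
                                    int imageH, int imageW,
                                    const std::array<int32_t, 9>& coeff) {
    assert(imageH >= 3 && imageW >= 3);
    assert(img.size() == static_cast<std::size_t>(imageH) * imageW);
    const int outH = imageH - 2;
    const int outW = imageW - 2;
    assert(outW % kLanes == 0);  // simplification: no scalar tail loop

    std::vector<int32_t> out(static_cast<std::size_t>(outH) * outW);
    for (int i = 0; i < outH; ++i) {
        const int32_t* r1 = &img[(i + 0) * imageW];
        const int32_t* r2 = &img[(i + 1) * imageW];
        const int32_t* r3 = &img[(i + 2) * imageW];
        for (int j = 0; j < outW; j += kLanes) {
            Vec8 acc{};  // all lanes start at zero
            mac8(acc, coeff[0], load8(r1 + j + 0));
            mac8(acc, coeff[1], load8(r1 + j + 1));
            mac8(acc, coeff[2], load8(r1 + j + 2));
            mac8(acc, coeff[3], load8(r2 + j + 0));
            mac8(acc, coeff[4], load8(r2 + j + 1));
            mac8(acc, coeff[5], load8(r2 + j + 2));
            mac8(acc, coeff[6], load8(r3 + j + 0));
            mac8(acc, coeff[7], load8(r3 + j + 1));
            mac8(acc, coeff[8], load8(r3 + j + 2));
            for (int l = 0; l < kLanes; ++l) out[i * outW + j + l] = acc[l];
        }
    }
    return out;
}
```

Each row contributes three overlapping 8-pixel windows (columns 0..7, 1..8, 2..9), so only 10 distinct pixels per row are touched per iteration. That overlap is the data reuse that slide 12 exploits through register selects instead of reloading.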
  • 12. Vectorizing with Factor 8 while Exploiting Vector Register Data Reuse through Select
      New inner loop pseudo code:
      acc  = mul(coeff, c_sel1, data_buf, d_sel1);
      acc += mul(coeff, c_sel2, data_buf, d_sel2);
      acc += mul(coeff, c_sel3, data_buf, d_sel3);
      acc += mul(coeff, c_sel4, data_buf, d_sel4);
      acc += mul(coeff, c_sel5, data_buf, d_sel5);
      acc += mul(coeff, c_sel6, data_buf, d_sel6);
      acc += mul(coeff, c_sel7, data_buf, d_sel7);
      acc += mul(coeff, c_sel8, data_buf, d_sel8);
      acc += mul(coeff, c_sel9, data_buf, d_sel9);
  • 13. AI Engine (Array) is Built for Parallel Data Movement and Compute
      • Data push system
      • Control flow support -> data flow style implementations
      [Diagram labels: two AI Engine tiles, each with Data Memory (32KB), DMAs, AXIM Switch, AXIS North, AXIS South; AI Core (32 bit scalar RISC, 512 bit vector core, 1+ GHz); Interconnect]
  • 14. Zoom out to System Level [Diagram labels: AI Core (32 bit scalar RISC, 512 bit vector core, 1+ GHz); Interconnect; AXIM Switch; AXIS North; AXIS South]
  • 15. Zoom out to System Level [Diagram labels: DDR Memory; NoC; Interconnect; AXIM Switch; AXIS North; AXIS South; array of AI Core + Memory tiles running the Vision Processing Pipeline; 4K ~ 8 MPixels]
  • 16. Vision Processing Graph Exploits Specialized Data Movement [Diagram labels: DDR Memory; NoC; Decomposing DMA (Tiler); Composing DMA (Stitcher); Local Buffers; Interconnect; AXIM Switch; AXIS North; AXIS South; array of AI Core + Memory tiles running the Vision Processing Pipeline]
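The tiler and stitcher on slide 16 split a frame into sub-images small enough for a local buffer and then reassemble the results. A host-side C++ sketch of that pattern is shown here; all names (`Image`, `tile_in`, `tile_out`, `run_tiled`) are hypothetical and plain memory copies stand in for the DMAs. It is limited to pixel-wise kernels, so tiles need no overlap; a neighborhood filter would additionally need a halo around each tile.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Image {
    int h;
    int w;
    std::vector<int32_t> px;  // row-major, h * w pixels
};

// "Tiler": copy the th x tw tile at (y0, x0) into a contiguous local buffer.
std::vector<int32_t> tile_in(const Image& img, int y0, int x0, int th, int tw) {
    std::vector<int32_t> buf(static_cast<std::size_t>(th) * tw);
    for (int y = 0; y < th; ++y)
        for (int x = 0; x < tw; ++x)
            buf[y * tw + x] = img.px[(y0 + y) * img.w + (x0 + x)];
    return buf;
}

// "Stitcher": write a processed local buffer back to its place in the frame.
void tile_out(Image& img, int y0, int x0, int th, int tw,
              const std::vector<int32_t>& buf) {
    for (int y = 0; y < th; ++y)
        for (int x = 0; x < tw; ++x)
            img.px[(y0 + y) * img.w + (x0 + x)] = buf[y * tw + x];
}

// Run a pixel-wise kernel tile by tile; edge tiles are clipped to the frame.
template <typename Kernel>
Image run_tiled(const Image& src, int tileH, int tileW, Kernel kernel) {
    assert(tileH > 0 && tileW > 0);
    assert(src.px.size() == static_cast<std::size_t>(src.h) * src.w);
    Image dst{src.h, src.w, std::vector<int32_t>(src.px.size())};
    for (int y0 = 0; y0 < src.h; y0 += tileH) {
        for (int x0 = 0; x0 < src.w; x0 += tileW) {
            const int th = std::min(tileH, src.h - y0);
            const int tw = std::min(tileW, src.w - x0);
            std::vector<int32_t> local = tile_in(src, y0, x0, th, tw);
            for (int32_t& p : local) p = kernel(p);  // compute on the local buffer
            tile_out(dst, y0, x0, th, tw, local);
        }
    }
    return dst;
}
```

Calling `run_tiled(src, 2, 3, [](int32_t p) { return 2 * p + 1; })` on a 5x7 frame yields the same result as applying the kernel to the whole frame at once, including the clipped tiles along the right and bottom edges.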
  • 17. Vitis Vision: Library Overview, Programming a Vision Pipeline and Tools
  • 18. What is in the AI Engine Vision Library? [Diagram labels: DRAM, PS] Example call: cv2.filter2D(img,-1,kernel,dst)
  • 19. What is in the AI Engine Vision Library? (build continues) Adds: Host code.
  • 20. What is in the AI Engine Vision Library? (build continues) Adds data movement: Ease-of-Use – high level abstraction for data movement; code to define DataMover. Data Mover (Tiler) and Data Mover (Stitcher) bridge AXI-MM and AXI-S. 2 DataMover options: 1) PL via PLIO 2) SW/NoC via GMIO.
  • 21. What is in the AI Engine Vision Library? (build continues) Adds AI Engine vision kernels: graph code for kernel (Vision kernel #1).
  • 22. What is in the AI Engine Vision Library? (build complete) Host code to call datamover & run graph; code to define DataMover; graph code for kernel. [Diagram labels: DRAM, PS, Data Mover (Tiler), Vision kernel #1, Data Mover (Stitcher), AXI-MM, AXI-S]
  • 23. Vitis Tool Overview [Flow diagram labels: AIE Kernels, Graph -> AIE Simulation; PL Kernels (HLS) -> HLS Cosimulation; PL and AIE Integration (v++ --link); HW Emulation (SIM, AIESim, QEMU); Generate Binary (v++ --package); Run on Device; Profile; Debug. Inputs: Vitis HW Platform; Vitis SW Platform (Linux + rootfs); PS APP (Host). Targets: PL (HLS/RTL), AI Engine, Platform. Library parts: AI Engine vision kernels, data-movement, Host code]
  • 24. Vitis Tool Overview (same flow diagram as slide 23), mapped to the filter2D example: cv2.filter2D(img,-1,kernel,dst) becomes host.cpp (Host code) plus the adf graph in graph.{h,cpp}, which includes xf_filter2d.cc (AI Engine vision kernels), plus data-movement.
  • 25. Library of Optimized Vision Kernels – 1x AI Engine Core Performance. FPS achieved processing 4K resolution images, with 60 fps marked on the chart. Bar values in slide order (kernel labels not preserved): 219, 219, 123, 123, 194, 220, 123, 219, 154, 220, 87, 195, 195, 192.
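To put the frame rates on slide 25 in perspective, they can be converted into a per-pixel cycle budget for a single core. The C++ sketch below shows that conversion; the function names are hypothetical, and it assumes 4K means 3840 x 2160 pixels and that the core runs at 1 GHz (the slides say "4K ~ 8 MPixels" and "1+ GHz").

```cpp
#include <cassert>
#include <cstdint>

// Pixels that must be produced per second at a given resolution and frame rate.
inline std::int64_t pixels_per_second(int width, int height, int fps) {
    assert(width > 0 && height > 0 && fps > 0);
    return static_cast<std::int64_t>(width) * height * fps;
}

// Average clock cycles available per pixel on one core at the given clock.
inline double cycles_per_pixel(double clock_hz, int width, int height, int fps) {
    assert(clock_hz > 0.0);
    return clock_hz / static_cast<double>(pixels_per_second(width, height, fps));
}
```

Under these assumptions, 60 fps at 4K leaves about 2 cycles per pixel, and the 219 fps reported for some kernels corresponds to roughly 0.55 cycles per pixel, which is only reachable by processing several pixels per cycle in the vector unit.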
  • 26. Vitis Vision Library: AI Engine Portfolio (legend: 2021.1 / planned)
      Categories: Basic Functionality; DNN (X of X+ML); Image Sensor Processing (ISP); Filters / Others
      Functions listed: 2D/3D Noise Reduction; Mono, RGB-IR Debayering; Bicubic Resize; Tone Mappers; Background Matting; HDRFusion; Feature Extractors; Mask Generation; Histogram Equalization; Remap; IntersectionOfUnion; Quantization and Dithering; Warp; Box-Sort; AWB; Stereo GBM; NMS; AEC; Stereo LBM; Crop/Patch; Gamma Correction; OTSU Thresholding; Absolute Difference; Channel Interleaving; BlackLevelCorrection; SeparableFilters; Accumulate Weighted; LenseShadingCorrection; Accumulate; Normalization; filter2D; ConvertScaleAbs; Resize (Bilinear); Gain Control; Gaussian Blur; PixelWiseMul; Thresholding; Defective Pixel Correction; Erode; ZeroFunction; ColorConversion; Debayering; Laplacian
  • 27. Sneak Preview: AI Engine-ML
  • 28. Intelligent Engines Optimized for Any Whole Vision AI Application
      AIE (signal processing optimized), AI Engine Architecture:
      • Optimized for signal processing AND ML
      • Flexibility for high performance DSP applications
      • Native support for INT8, INT16, FP32
      AIE-ML (AI inference optimized), AIE-ML Architecture:
      • Optimized for ML Inference Applications
      • Maximum AI/ML compute with reduced footprint
      • Native support for INT4, INT8, INT16, bfloat16
      • Fine grained sparsity HW optimization
      [Architecture diagrams: AIE array of 1X compute tiles with UltraRAM and LUTs; AIE-ML array of 2X compute tiles with LUTs and 512KB mem tiles]
      [Comparison chart, AIE vs AIE-ML, over INT4, INT8, INT16, BFLOAT16, INT32, FP32; values as extracted: OPS / Tile 1024 512 128 256 256 256 64 16 16; KB / Tile 64, Data Memory, Program Memory, 16 16 32 16 42* (*via software emulation)]
  • 29. Intelligent Engines Optimized for Any Whole Vision AI Application (same content as slide 28, with added callouts)
      • 2X INT8/16 OPs/Tile
      • 4X INT4 OPs/Tile
      • Reduced data movement
      • Reduced AI PL footprint
      • 2X AI Perf/W: Versal AIE-ML offers 2X AI performance per watt
  • 30. Conclusions
      • The AI Engines of the Versal device support vision workloads by design
        • VLIW vector processor with zero loop overhead and auto buffer increment
        • Vector compute in different bit depths covering essential vision operators
      • Concurrent data movement and compute in the AI Engine array through DMAs
        • Composing/decomposing data movers tile images into sub-images that fit local memory
        • Supports streaming dataflow pipelines in the AI Engine array
      • Growing kernel library covering both typical vision kernels and ISP kernels, available in open source
      • Vitis tools support easy programming of vision pipelines
  • 31. Resources
      Vitis (Vision) Resources
      • Vitis Vision: https://www.xilinx.com/products/design-tools/vitis/vitis-libraries/vitis-vision.html
      • GitHub docs: https://xilinx.github.io/Vitis_Libraries/vision/2021.2/index.html
      • GitHub code: https://github.com/Xilinx/Vitis_Libraries/tree/master/vision
      • Vitis: https://www.xilinx.com/products/design-tools/vitis/vitis-platform.html
      AI Engine Resources
      • AI Engines: https://www.xilinx.com/products/technology/ai-engine.html
      • Versal Core Product Family: https://www.xilinx.com/products/silicon-devices/acap/versal-ai-core.html
      • Versal AI Edge Product Family: https://www.xilinx.com/products/silicon-devices/acap/versal-ai-edge.html
      2022 Embedded Vision Summit: please visit our AMD booth