Rene Griessl
Bielefeld University
VEDLIoT – Accelerators for
Heterogeneous Computing in AIoT
2
Big Picture
3
VEDLIoT Hardware Platform
▪ Heterogeneous, modular, scalable microserver system
▪ Supporting the full spectrum of IoT, from embedded devices through the edge to the cloud
▪ Different technology concepts for improving
  ▪ Performance
  ▪ Cost-effectiveness
  ▪ Maintainability
  ▪ Reliability
  ▪ Energy efficiency
  ▪ Safety
[Figure: VEDLIoT Cognitive IoT Platform, with modules labelled x86, GPU, ML-ASIC, ARM v8, GPU SoC, FPGA SoC, RISC-V, FPGA]
4
RECS Architecture – RECS|BOX
▪ RECS Server Backplane (up to 15 carriers)
▪ Carrier types
  ▪ Carrier (PCIe Expansion), e.g. GPU accelerator
  ▪ Carrier (High Performance): up to 3 high-performance microservers (COM Express: x86, ARM v8, FPGA SoC)
  ▪ Carrier (Low Power): up to 16 low-power microservers (Apalis/Jetson: GPU SoC, FPGA SoC, ARM SoC)
▪ Networks
  ▪ High-Speed Low-Latency Network (PCIe, High-Speed Serial)
  ▪ Compute Network (up to 40 GbE)
  ▪ Management Network (KVM, Monitoring, …)
▪ External connectors: HDMI/USB, iPass+ HD, QSFP+, RJ45
5
t.RECS
t.RECS Edge Server
▪ Optimized platform for local / edge applications
▪ Provides interfaces for
  ▪ Video
  ▪ Camera
  ▪ Peripheral input (USB)
▪ Combines FPGA and GPU acceleration
▪ Compact dimensions: 1 RU, E-ATX form factor (2 RU / 3 RU for special cases)
RECS Architecture – t.RECS
▪ Microserver #1 (COM-HPC Client)
▪ Microserver #2 (COM-HPC Server)
▪ Microserver #3 (COM-HPC Client)
▪ Switched PCIe (Host to Host)
▪ PCIe expansion
▪ Ethernet (up to 10 GbE)
▪ Management Network (KVM, Monitoring, …)
▪ External interfaces
▪ I/O (Camera, Display, Radar/Lidar, Audio)
6
u.RECS
u.RECS AIoT Server
▪ Supports ML acceleration
  ▪ FPGA
  ▪ ASIC
▪ Communication interfaces
  ▪ Wired (CAN, Ethernet, CSI)
  ▪ Wireless (WLAN, LoRa, 5G)
▪ Sensors
  ▪ Camera
  ▪ Environment (Temp./Hum.)
  ▪ Housekeeping
▪ Embedded device (~ 20 x 20 x 6 cm)
RECS Architecture – u.RECS
▪ Microserver #1 (SMARC 2.1)
▪ Microserver #2 (Jetson NX)
▪ ML Acc. (M.2)
▪ PCIe
▪ Ethernet (1 GbE & SPE)
▪ Management & Monitoring
▪ I/O (Camera, WiFi, LoRa, 4G/5G)
▪ Front panel: 2x HDMI, RJ45/SPE, 4x USB 3.1
7
Microserver Overview
Platforms: u.RECS, t.RECS, RECS|BOX
▪ Xilinx Kria K26
▪ NVIDIA Jetson Orin NX
▪ Hailo-8
▪ SMARC 2.1: x86/ARM CPUs, FPGAs
▪ Raspberry Pi Compute Module 4
▪ COM-HPC Client: x86
▪ COM-HPC: NVIDIA AGX Adapter
▪ COM-HPC Server: FPGA
▪ COM Express: ARM v8 Server SoC Hi1616
▪ COM Express: Xilinx Zynq 7045
▪ COM Express: AMD Ryzen V1807B
▪ Jetson TX2: NVIDIA Tegra X2
▪ COM Express: Intel Stratix 10
▪ COM Express: Intel Core i7 8th Gen
▪ NVIDIA Jetson Orin NX
▪ COM Express: AMD EPYC 3451
8
Reconfigurable DL accelerators
▪ VEDLIoT accelerators support a large variety of reconfigurable architectures
▪ From small embedded FPGAs to large ACAPs
▪ Large design space for FPGA-based accelerators
▪ Dynamic hardware reconfiguration
▪ Adapt to changing requirements at run-time
▪ Change characteristics of DL-accelerator
▪ Trade-off between power and performance, power and accuracy, etc.
▪ Inference and training on FPGA
▪ Supports quantization from int8 to float32
▪ DL and Deep Reinforcement Learning
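The quantization range named on this slide (int8 to float32) can be illustrated with a minimal sketch of symmetric per-tensor int8 quantization. This is a generic textbook scheme, shown under the assumption that a single scale factor is used per tensor; it is not taken from the VEDLIoT accelerators, and the names `quantize_int8` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of float values to int8 codes."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Map the largest magnitude to 127; use scale 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # illustrative values only
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("codes:", codes)
    print("scale:", scale)
    print("worst-case error:", worst)
```

The rounding error per value is bounded by half the scale, which is one way to reason about the power/accuracy trade-off the slide mentions: fewer bits mean cheaper arithmetic but a coarser grid.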
9
Peak Performance of DL Accelerators
▪ Peak performance values of specialized accelerators, provided by the vendors (precisions varying from INT8 to FP32)
▪ Average efficiency at 1000 GOPS/W
[Figure: log-log scatter plot of performance [GOPS] (1 to 10,000,000) over power [Watt] (0.01 to 1000) for ASIC, GPU, and FPGA accelerators, grouped into Ultra Low Power, Low Power, and High Performance classes]
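The efficiency figure quoted on this slide is the ratio of peak performance to power. A minimal sketch of that arithmetic follows; the `Accelerator` class, the device names, and all numbers are hypothetical placeholders chosen for illustration, not values read from the plot.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    peak_gops: float   # vendor-stated peak performance in GOPS
    power_watt: float  # power in watts


def efficiency_gops_per_watt(acc: Accelerator) -> float:
    """Energy efficiency as peak performance divided by power."""
    if acc.power_watt <= 0:
        raise ValueError("power must be positive")
    return acc.peak_gops / acc.power_watt


def relative_to_reference(acc: Accelerator, reference: float = 1000.0) -> float:
    """Efficiency as a fraction of a reference line (here 1000 GOPS/W)."""
    return efficiency_gops_per_watt(acc) / reference


# Placeholder devices: assumptions for illustration, not measurements.
EXAMPLES = [
    Accelerator("hypothetical-ultra-low-power", 50.0, 0.5),
    Accelerator("hypothetical-low-power", 4000.0, 5.0),
    Accelerator("hypothetical-high-performance", 200000.0, 250.0),
]

if __name__ == "__main__":
    for acc in EXAMPLES:
        eff = efficiency_gops_per_watt(acc)
        rel = relative_to_reference(acc)
        print(f"{acc.name}: {eff:.0f} GOPS/W ({rel:.2f} x reference)")
```

On a log-log plot, devices with equal GOPS/W fall on a straight diagonal line, which is why a single efficiency value can be drawn as a reference line across all power classes.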
10
Yolo v4 accelerator performance (1)
YoloV4
• 15 devices
• 59 measurements
11
Yolo v4 accelerator performance (2)
▪ Performance of Yolo v4 has been evaluated on different hardware platforms
▪ Performance measurements for other networks (ResNet, EfficientNet) are available as well
• ASICs (Hailo-8, Versal AI cores) achieve the highest energy efficiency (INT8 only)
• Embedded GPUs (Orin, Xavier) show good efficiency in all precisions
• GPGPUs (GTX1660, V100, A100) are optimized for performance
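One way to use such measurements is to pick the most energy-efficient device and precision that still meets an application's frame-rate and power constraints. The sketch below assumes each measurement records frames per second and average power; the `Measurement` class, the `best_config` function, the device names, and all numbers are hypothetical and are not the benchmark results from the talk.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Measurement:
    device: str
    precision: str  # e.g. "INT8", "FP16", "FP32"
    fps: float      # frames (inferences) per second
    watt: float     # average power during inference

    @property
    def fps_per_watt(self) -> float:
        return self.fps / self.watt


def best_config(
    measurements: Iterable[Measurement],
    min_fps: float,
    max_watt: float,
    precisions: Optional[Set[str]] = None,
) -> Optional[Measurement]:
    """Return the most efficient measurement that satisfies all constraints."""
    candidates = [
        m for m in measurements
        if m.fps >= min_fps
        and m.watt <= max_watt
        and (precisions is None or m.precision in precisions)
    ]
    return max(candidates, key=lambda m: m.fps_per_watt, default=None)


# Placeholder data: assumptions for illustration, not measured values.
DATA = [
    Measurement("asic-a", "INT8", 80.0, 4.0),
    Measurement("embedded-gpu-b", "INT8", 60.0, 15.0),
    Measurement("embedded-gpu-b", "FP16", 40.0, 15.0),
    Measurement("gpgpu-c", "INT8", 600.0, 250.0),
    Measurement("gpgpu-c", "FP32", 200.0, 250.0),
]

if __name__ == "__main__":
    print(best_config(DATA, min_fps=30, max_watt=20))
    print(best_config(DATA, min_fps=30, max_watt=20, precisions={"FP16", "FP32"}))
    print(best_config(DATA, min_fps=100, max_watt=300))
```

Restricting `precisions` models the case where a model cannot be quantized to INT8, which changes the best choice even under the same power budget.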
12
Summary
▪ Efficient Heterogeneous Computing: The VEDLIoT hardware platform RECS combines diverse compute architectures, boosting communication and energy efficiency.
▪ Accelerator Integration: The VEDLIoT project harnesses RECS for accelerator benchmarking and hardware/software integration.
▪ Seamless Edge to Cloud Integration: RECS offers a unified approach across the computing spectrum, enhancing interoperability.
13
14
Big Picture
15
DL accelerator co-design
"FiBHA: Fixed Budget Hybrid CNN Accelerator", Fareed Qararyah, Muhammad Waqar Azhar, Pedro Trancoso, IEEE 34th International Symposium on Computer Architecture and High-Performance Computing (SBAC-PAD 2022), Bordeaux, France, November 2–5, 2022
Monolithic design
● One engine computes
all the core layers
● E.g. TPU
SEML
● One engine computes all
layers of the same type
● PW engine, DW engine
SESL
● One engine per layer
● E.g. FINN
FiBHA
● SESL + SEML
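The four organisations listed on this slide differ in how layers are assigned to compute engines. The sketch below shows that mapping for a toy network. It assumes that PW and DW denote layer types, and that the hybrid scheme uses one engine per layer for the first `sesl_count` layers and one engine per layer type for the rest; that split rule, the function names, and the layer list are illustrative assumptions, not the FiBHA algorithm itself.

```python
from typing import Dict, List, Tuple

Layer = Tuple[str, str]  # (layer name, layer type), e.g. ("l0", "PW")


def monolithic(layers: List[Layer]) -> Dict[str, str]:
    """One engine computes all layers."""
    return {name: "engine" for name, _ in layers}


def seml(layers: List[Layer]) -> Dict[str, str]:
    """One engine per layer type, shared by all layers of that type."""
    return {name: f"{ltype}-engine" for name, ltype in layers}


def sesl(layers: List[Layer]) -> Dict[str, str]:
    """One dedicated engine per layer."""
    return {name: f"engine-{i}" for i, (name, _) in enumerate(layers)}


def hybrid(layers: List[Layer], sesl_count: int) -> Dict[str, str]:
    """Dedicated engines for the first layers, shared engines for the rest."""
    mapping = sesl(layers[:sesl_count])
    mapping.update(seml(layers[sesl_count:]))
    return mapping


# Toy network: an assumption for illustration only.
NETWORK: List[Layer] = [("l0", "PW"), ("l1", "DW"), ("l2", "PW"), ("l3", "DW")]

if __name__ == "__main__":
    for label, mapping in [
        ("monolithic", monolithic(NETWORK)),
        ("SEML", seml(NETWORK)),
        ("SESL", sesl(NETWORK)),
        ("hybrid", hybrid(NETWORK, sesl_count=2)),
    ]:
        print(label, mapping)
```

The number of distinct engines in each mapping hints at the resource cost: one for the monolithic design, one per layer type for SEML, and one per layer for SESL, with the hybrid in between.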
16
VEDLIoT's Deep Learning Toolchain
Enabling the rapid convergence of fast-paced innovation in hardware and software
Frameworks & Exchange Formats
Optimization Engine
Compilers & Runtime APIs
Heterogeneous Hardware Platforms
17
Simulation platform for ML accelerators
▪ RISC-V SoCs and Custom Function Units
▪ Improve test and verification
▪ Co-simulate Verilog blocks
▪ Used in Google’s CFU Playground
▪ Continuous integration based on GitLab and Google Cloud Platform
Safety and Robustness
Robustness verification on DL models
▪ Tuning hyperparameters
More in the hands-on session
18
Security
▪ Common environment for running distributed applications
▪ WebAssembly runtime + Trusted Execution Environment
▪ Security for edge (and cloud) devices
▪ Advances on attestation
▪ Better support for edge devices
▪ Distributed (Byzantine fault-tolerant) attestation and configuration service
▪ Secure IoT Gateway
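The term "Byzantine fault-tolerant" on this slide can be made concrete with a small quorum sketch. It assumes the classic bound that n replicas tolerate at most f arbitrary faults when n >= 3f + 1, and that a verdict is accepted once a quorum of matching votes is reached. This is a simplified illustration of the general principle, not the protocol of the VEDLIoT attestation service; the function names and vote values are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Smallest vote count such that any two quorums share an honest node."""
    return (n + max_faulty(n)) // 2 + 1


def quorum_decision(votes: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the verdict backed by a quorum of votes, or None if there is none."""
    if not votes:
        return None
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= quorum_size(len(votes)) else None


if __name__ == "__main__":
    # Four verifiers tolerate one faulty verifier; three matching votes decide.
    print(quorum_decision(["trusted", "trusted", "trusted", "untrusted"]))
    # A two-two split reaches no quorum, so no verdict is returned.
    print(quorum_decision(["trusted", "trusted", "untrusted", "untrusted"]))
```

With n = 3f + 1 replicas the quorum size works out to 2f + 1, so a single faulty verifier out of four cannot force or block a verdict on its own.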
19
A compositional architecture framework for AIoT
Knowledge creation (e.g. definition of safety goals).
Concept design (e.g. introduction of redundancy to fulfil safety goals).
Final design (e.g. assigning functions to independent processors to guarantee redundancy).
Monitoring concept definition (e.g. monitoring fulfilment of safety goals at run-time).
Solution Space
Problem Space

VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT

  • 1. Rene Griessl Bielefeld University VEDLIoT – Accelerators for Heterogenous Computing in AIoT
  • 3. VEDLIoT Hardware Platform: Heterogeneous, modular, scalable microserver system. Supports the full spectrum of IoT, from embedded over the edge towards the cloud. Different technology concepts for improving performance, cost-effectiveness, maintainability, reliability, energy efficiency, and safety. Diagram labels (VEDLIoT Cognitive IoT Platform): x86, GPU, ML-ASIC, ARM v8, GPU SoC, FPGA SoC, RISC-V, FPGA.
  • 4. RECS Architecture – RECS|BOX: RECS Server Backplane (up to 15 carriers). Carriers shown: Carrier (PCIe Expansion), Carrier (High Performance), Carrier (Low Power), with a GPU accelerator as an example. High-Performance Carrier (up to 3 microservers); Low-Power Carrier (up to 16 microservers). Low-Power Microserver (Apalis/Jetson): GPU SoC, FPGA SoC, ARM SoC. High-Performance Microserver (COM Express): x86, ARM v8, FPGA SoC. Networks: High-Speed Low-Latency Network (PCIe, High-Speed Serial); Compute Network (up to 40 GbE); Management Network (KVM, Monitoring, …). External connectors: HDMI/USB, iPass+ HD, QSFP+, RJ45.
  • 5. t.RECS Edge Server: Optimized platform for local / edge applications. Provides interfaces for video, camera, and peripheral input (USB). Combines FPGA and GPU acceleration. Compact dimensions: 1 RU, E-ATX form factor (2 RU / 3 RU for special cases). RECS Architecture – t.RECS: Microserver #1 (COM-HPC Client); Microserver #2 (COM-HPC Server); Microserver #3 (COM-HPC Client); Switched PCIe (Host to Host); external interfaces; PCIe expansion; Ethernet (up to 10 GbE); Management Network (KVM, Monitoring, …); I/O (Camera, Display, Radar/Lidar, Audio).
  • 6. u.RECS AIoT Server: Supports ML acceleration (FPGA, ASIC). Communication interfaces: wired (CAN, Ethernet, CSI) and wireless (WLAN, LoRa, 5G). Sensors: camera, environment (temp./hum.), housekeeping. Embedded device (~ 20x20x6 cm). RECS Architecture – u.RECS: Microserver #1 (SMARC 2.1); Microserver #2 (Jetson NX); ML Acc. (M.2); PCIe; Ethernet (1 GbE & SPE); Management & Monitoring; I/O (Camera, WiFi, LoRa, 4G/5G). Front panel: 2x HDMI, RJ45/SPE, 4x USB 3.1.
  • 7. Microserver Overview (platforms: u.RECS, t.RECS, RECS|Box): Xilinx Kria K26; NVIDIA Jetson Orin NX; Hailo-8; SMARC 2.1 x86/ARM CPUs, FPGAs; Raspberry Pi Compute Module 4; COM-HPC Client x86; COM-HPC NVIDIA AGX Adapter; COM-HPC Server FPGA; COM Express ARM v8 Server SoC Hi1616; COM Express Xilinx Zynq 7045; COM Express AMD Ryzen V1807B; Jetson TX2 NVIDIA Tegra X2; COM Express Intel Stratix 10; COM Express Intel Core i7 8th Gen; NVIDIA Jetson Orin NX; COM Express AMD EPYC 3451.
  • 8. Reconfigurable DL accelerators: VEDLIoT accelerators support a large variety of reconfigurable architectures; from small embedded FPGAs to large ACAPs; large design space for FPGA-based accelerators; dynamic hardware reconfiguration; adapt to changing requirements at run-time; change characteristics of the DL accelerator; trade-off between power and performance, power and accuracy, etc.; inference and training on FPGA; supports quantization from int8 to float32; DL and Deep Reinforcement Learning.
  • 9. Peak Performance of DL Accelerators: Peak performance values of specialized accelerators, provided by the vendors (precisions varying from INT8 to FP32). [Chart: Performance [GOPS] (1 to 10,000,000) against Power [Watt] (0.01 to 1000); device classes: ASIC, GPU, FPGA; regions: Ultra Low Power, Low Power, High Performance; reference line: average efficiency at 1000 GOPS/W.]
  • 10. Yolo v4 accelerator performance (1): YoloV4, 15 devices, 59 measurements.
  • 11. Yolo v4 accelerator performance (2): The performance of Yolo v4 on different hardware platforms has been evaluated. Performance measurements for other networks (ResNet, EfficientNet) are available as well. ASICs (Hailo-8, Versal AI cores) achieve the highest energy efficiency (INT8 only). Embedded GPUs (Orin, Xavier) show good efficiency in all precisions. GPGPUs (GTX1660, V100, A100) are optimized for performance.
  • 12. Summary: Efficient Heterogeneous Computing: The VEDLIoT hardware platform RECS combines diverse compute architectures, boosting communication and energy efficiency. Accelerator Integration: The VEDLIoT project harnesses RECS for accelerator benchmarking and hardware/software integration. Seamless Edge to Cloud Integration: RECS offers a unified approach across the computing spectrum, enhancing interoperability.
  • 15. DL accelerator co-design: "FiBHA: Fixed Budget Hybrid CNN Accelerator", Fareed Qararyah, Muhammad Waqar Azhar, Pedro Trancoso, IEEE 34th International Symposium on Computer Architecture and High-Performance Computing (SBAC-PAD 2022), Bordeaux, France, November 2–5, 2022. Monolithic design: one engine computes all the core layers (e.g. TPU). SEML: one engine computes all layers of the same type (PW engine, DW engine). SESL: one engine per layer (e.g. FINN). FiBHA: SESL + SEML.
  • 16. VEDLIoT's Deep Learning Toolchain: Enabling the rapid convergence of fast-paced innovation in hardware and software. Layers: Frameworks & Exchange Formats; Optimization Engine; Compilers & Runtime APIs; Heterogeneous Hardware Platforms.
  • 17. Safety and Robustness: Simulation platform for ML accelerators: RISC-V SoCs and Custom Function Units; improve test and verification; co-simulate Verilog blocks; used in Google's CFU Playground; continuous integration based on GitLab and Google Cloud Platform. Robustness verification on DL models: tuning hyperparameters. More in the hands-on session.
  • 18. Security: Common environment for running distributed applications; WebAssembly runtime + Trusted Execution Environment; security for edge (and cloud) devices; advances on attestation; better support for edge devices; distributed (Byzantine fault-tolerant) attestation and configuration service; Secure IoT Gateway.
  • 19. A compositional architecture framework for AIoT: Knowledge creation (e.g. definition of safety goals). Concept design (e.g. introduction of redundancy to fulfil safety goals). Final design (e.g. assigning functions to independent processors to guarantee redundancy). Monitoring concept definition (e.g. monitoring fulfilment of safety goals at run-time). Diagram labels: Solution Space, Problem Space.
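
Slides 9 and 11 compare accelerators by energy efficiency, with slide 9 drawing a reference line at 1000 GOPS/W. The arithmetic behind that comparison can be sketched as below. This is only an illustration: the `Accelerator` class, the helper name, and the two device entries are hypothetical placeholders, not values measured in VEDLIoT or quoted from any vendor.

```python
from dataclasses import dataclass

# Reference line named on slide 9 (average efficiency at 1000 GOPS/W).
REFERENCE_GOPS_PER_WATT = 1000.0


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical record for one accelerator data point."""
    name: str
    peak_gops: float   # performance in GOPS
    power_watt: float  # power draw in watts

    @property
    def gops_per_watt(self) -> float:
        # Energy efficiency is performance divided by power.
        if self.power_watt <= 0:
            raise ValueError("power must be positive")
        return self.peak_gops / self.power_watt


def above_reference(acc: Accelerator,
                    reference: float = REFERENCE_GOPS_PER_WATT) -> bool:
    """True if the device sits on or above the reference efficiency line."""
    return acc.gops_per_watt >= reference


# Placeholder numbers for illustration only (NOT measurements).
devices = [
    Accelerator("device_a", peak_gops=4000.0, power_watt=2.0),
    Accelerator("device_b", peak_gops=100000.0, power_watt=250.0),
]

# Rank from most to least energy efficient.
ranked = sorted(devices, key=lambda a: a.gops_per_watt, reverse=True)
for acc in ranked:
    print(acc.name, acc.gops_per_watt, above_reference(acc))
```

With these placeholder inputs, the low-power device ranks first on efficiency even though the second one has far higher raw throughput, which is the same distinction slide 11 draws between efficiency-oriented ASICs and performance-oriented GPGPUs.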

Editor's Notes

  1. Focus on the Xilinx DPU (Deep Learning Processing Unit), then on the different DPU configurations: DPU for UltraScale (Zynq, Virtex, Kintex) in fabric; DPU for HBM Alveo cards; DPU in hardware for Versal
  2. Different clusters are forming
  3. Extensive Benchmarks on 15 devices generating 59 measurements
  4. Self-measured efficiency data using the RECS system. Values are lower because they reflect real-world performance, not peak performance
  5. SEML: Single Engine Multi Layer. SESL: Single Engine Single Layer. Matching the hardware to the accelerator. The net distributes compute, a single layer in a single engine. Consider memory: what does it need to access, and does it need a cache?
  6. Model optimization through pruning or precision changes, to tailor the model to the accelerator in use and, for example, reduce context switching
  7. Emulation of the robustness models in Renode
  8. Using the WebAssembly runtime for an encapsulated deployment on different systems. Integration of remote attestation in WebAssembly. Available on ARM, Jetson, and everything that supports OP-TEE
  9. Requirements engineering framework (RAF) for ML/AI systems and solutions, giving guidance on how to design a system so that it is secure, efficient, and works well. Challenging, as ML cannot be proven formally. Important to make sure ML/AI solutions are safe to use, for example in autonomous driving. AIoT: Artificial Intelligence of Things
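
Slide 8 mentions quantization from int8 to float32, and note 6 mentions precision changes as a way to fit a model to an accelerator. A minimal sketch of what such a precision change can look like follows, assuming symmetric per-tensor int8 quantization. The slides do not say which scheme VEDLIoT uses, so the scheme, the function names, and the sample weights are all assumptions made for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes.

    Assumed scheme: the largest magnitude maps to 127 and zero maps to 0.
    Expects a non-empty sequence of floats.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid a zero scale when every value is 0.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to show the round trip.
weights = [0.5, -1.27, 0.0, 0.333]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Worst-case rounding error is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The error bound of half a step is what makes the accuracy side of the power and accuracy trade-off on slide 8 quantifiable: a coarser scale saves bits but widens that bound.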