VEDLIoT – A heterogeneous hardware platform for next-gen AIoT applications, Jens Hagemeyer, EU-IoT Training Session on “Machine Learning at the Edge and the FarEdge”, IoT Week (online event), August 2021
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
IoT Week 2021_Jens Hagemeyer presentation
1. Jens Hagemeyer, Carola Haumann
Training Session on Machine Learning at the
Edge and the FarEdge
30. August 2021
VEDLIoT – A heterogeneous hardware
platform for next-gen AIoT applications
Teaching the IoT to learn
2. 2
Platform
Hardware: Scalable, heterogeneous, distributed
Accelerators: Efficiency boost by FPGA and ASIC technology
Toolchain: Optimizing Deep Learning for IoT
Use cases
Industrial IoT
Automotive
Smart Home
Open call
At project mid-term
Early use and evaluation of VEDLIoT technology
Very Efficient Deep Learning for IoT –
VEDLIoT
Call: H2020-ICT2020-1
Topic: ICT-56-2020 Next Generation Internet of Things
Duration: 1. November 2020 – 31. Oktober 2023
Coordinator: Bielefeld University (Germany)
Overall budget: 7 996 646.25 €
Consortium: 12 partners from 4 EU countries (Germany,
Poland, Portugal and Sweden) and one associated
country (Switzerland).
More info:
https://www.vedliot.eu/
https://twitter.com/VEDLIoT
https://www.linkedin.com/company/vedliot/
3. 3
Bielefeld University (UNIBI) - Coordinator
Christmann (CHR)
University of Osnabrück (UOS)
Siemens (SIEMENS)
University of Neuchâtel (UNINE)
University of Lisbon (FC.ID)
Chalmers (CHALMERS)
University of Gothenburg (UGOT)
RISE (RISE)
EmbeDL (EMBEDL)
Veoneer (VEONEER)
Antmicro (ANT)
VEDLIOT partners
5. 5
VEDLIoT Hardware Platform
Heterogeneous, modular, scalable microserver system
Supporting the full spectrum of IoT from embedded over the edge towards the cloud
Different technology concepts for improving
x86
GPU
ML-ASIC
ARM v8
GPU
SoC
FPGA
SoC
RISC-V
FPGA
VEDLIOT Cognitive
IoT Platform
Performance
Cost-effectiveness
Maintainability
Reliability
Energy-Efficiency
Safety
6. 6
RECS Architecture (RECS|BOX)
RECS Server Backplane (up to 15 Carriers)
Carrier (PCIe Expansion)
Carrier (High Performance)
e.g. GPU-Accelerator
Carrier (Low Power)
#3
#2
Microserver
(High Performance)
#1
Microserver
(Low Power)
#16
#3
#2
Microserver
(Low Power)
#1
High-Speed Low-Latency Network (PCIe, High-Speed Serial)
Compute Network (up to 40 GbE)
Management Network (KVM, Monitoring, …)
HDMI/USB
iPass+ HD
QSFP+
RJ45
Ext. Connectors
GPU SoC
FPGA SoC ARM Soc
Low-Power Microserver (Apalis/Jetson)
x86 ARM v8
High-Performance Microserver (COM Express)
FPGA SoC
High-Performance
Carrier
(up to 3 microservers)
Low-Power Carrier
(up to 16 microservers)
7. 7
RECS Architecture (t.RECS)
• Optimized platform for
local / edge applications
• Provide interfaces for
– Video
– Camera
– Peripheral input (USB)
• Combine FPGA and GPU acceleration
• Compact dimensions
1RU, E-ATX form factor
(2 RU/ 3 RU for special cases)
t.RECS Edge Server
Microserver
(COM-HPC Client)
Microserver
(COM-HPC Client)
Microserver
(COM-HPC Server)
Switched PCIe (Host to Host)
External
interfaces
PCIe
expansion
Ethernet (up to 10 GbE)
Management Network (KVM, Monitoring, …)
I/O (Camera, Display, Radar/Lidar, Audio)
8. 8
Embedded Device
Supports ML acceleration
FPGA
ASIC
Communication interfaces
Wired (CAN, Ethernet, CSI)
Wireless (WLAN, LoRa, 5G)
Sensors
Camera
Environment (Temp./Hum.)
Housekeeping
RECS Architecture (µ.RECS)
u.RECS Edge Server
ML
Accelerator
M.2
SMARC 2.1 JETSON NX
PCIe
Mux
Management
System
GbE Switch
Peripherals and I/O
Sensors
(CSI, ADC, I2S,…)
RJ45 SPE
LoRa
WiFi
/BLE
USB3.x 4G/5G
10. 10
Flexible and adaptable Accelerators for Deep
Learning
DL
Model
DL Model
CPU, GPU-
SoC,
ML-SoC
FPGA-SoC
End of Moore’s law & dark silicon
=> Domain Specific Architectures (DSA)
Efficient, flexible, scalable accelerators for
the compute continuum
Algotecture
Optimized DL algorithms
Optimized toolchain
Optimized computer architecture
Heterogeneous DL
Accelerator
Algotecture/
Co-Designed DL
Accelerator
Compiler
Co-Design
11. 11
VEDLIoT‘s Deep Learning Toolchain
Enabling the rapid convergence of the fast pace
innovation on the hardware and software
Frameworks &
Exchange Formats
Optimization
Engine
Compilers &
Runtime APIs
Heterogeneous
Hardware
Platforms
12. 12
Increase safety, health and well being of residents – acceleration of AI
methods for demand-oriented user-home interaction
Smart Mirror as central user interface
Own mirror image can be seen normally
Intuitive control over gesture and voice
Shows personalized information
Data Privacy as the highest priority
Edge computation of many neural networks
Use case: Smart Home / Assisted Living
13. 13
Face recognition
mobilenetSSD trained on WIDERFACE dataset
Object detection
YoloV3, Efficient-Net, yoloV4-tiny
Gesture detection
YoloV4-tiny with 3 Yolo layers (usually: 2 layers)
Speech recognition
Mozilla DeepSpeech
AI Art: Style-Gan trained on works of arts
Collect usage data in situation memory
Use case: Smart Mirror – Neural Networks
14. 14
Use case: Industrial IoT – drive condition
classification
Control applications need DL-based condition classification
On the edge device for low power consumption
Suggestions for control and maintenance
DL methods on all communication layers
DL in a distributed architecture
Dynamically configured systems
Sensored testbench with 2 motors
Acceleration, Magnetic field, Temperature,
IR-Cam (temperature), Current-Sensors, Torque
On / Off detection without
motor current or voltage
Cooling fault detection
Bearing fault detection
15. 15
Use case: Industrial IoT – Arc detectioc
AI based pattern recognition for different local sensor data
current, magnetic field, vibration, temperature, low resolution infrared picture
Safety critical nature
response time should be <10ms
AI based or AI supported decision made by the sensor node itself or by a local part of the sensor
network
16. 16
Focus on collision detection/avoidance scenario
Improve performance/cost ratio – AI processing hardware
distributed over the entire chain
Use case: Automotive
18. 18
Thank you for your
attention.
Contact
Jens Hagemeyer, Carola Haumann
Bielefeld University, Germany
chaumann@cor-lab.uni-bielefeld.de
jhagemey@cit-ec.uni-bielefeld.de
19. 19
End-to-end attestation, support for execution and
communication
Support for
End-to-end trust (attestation)
Secrets distribution (keys)
Machine learning (federated, streaming, …)
Requirements
Support for edge applications
Decentralised, dynamic, loosely organised
Rely on
Trusted execution environments
BFT and peer-to-peer algorithms
Blockchain concepts
VEDLIoT
app
Attestation
Pool
20. 20
Simulation platform for IoT
Open source framework for software/hardware co-development with CI-driven testing
capabilities, as well as metrics for measuring efficiency of ML workloads
Enables development and continuous testing of VEDLIoT’s security features and its robustness
Renode is available to all project members and future users of VEDLIoT and will include a
simulated model of the RISC-V-based FPGA SoC platform developed as part of the VEDLIoT
project
21. 21
Safety and robustness
Define monitors for timeliness and data quality assessment
Develop safety argumentation for adaptive solutions,
exploiting architectural hybridization
Hybrid System architecture:
• Some system components implemented
with assured reliability (as needed)
• Remaining components subject to
uncertainties and prone to failures
Safety requirements:
• Defined at design time for each operational
mode
Monitoring data:
• Collected in run-time, to allow verifying if
safety requirements are being satisfied
and trigger reconfiguration if not
Timeliness (e.g., deadline
miss detection)
Sensor data quality (e.g.,
outlier, noise detection)
Accuracy of DL systems
(e.g., anomaly detection)