Jens Hagemeyer, Carola Haumann
Industrial Pioneers Days OWL –
Machine Learning
20. April 2021
VEDLIoT - Very Efficient Deep
Learning in IoT
Teaching the IoT to learn
2
 Platform
 Hardware: Scalable, heterogeneous, distributed
 Accelerators: Efficiency boost by FPGA and ASIC technology
 Toolchain: Optimizing Deep Learning for IoT
 Use cases
 Industrial IoT
 Automotive
 Smart Home
 Open call
 At project mid-term
 Early use and evaluation of VEDLIoT technology
Very Efficient Deep Learning IoT – VEDLIoT
 Call: H2020-ICT2020-1
 Topic: ICT-56-2020 Next Generation Internet of Things
 Duration: 1. November 2020 – 31. Oktober 2023
 Coordinator: Bielefeld University (Germany)
 Overall budget: 7 996 646.25 €
 Consortium: 12 partners from 4 EU countries (Germany,
Poland, Portugal and Sweden) and one associated
country (Switzerland).
More info:
 https://www.vedliot.eu/
 https://twitter.com/VEDLIoT
 https://www.linkedin.com/company/vedliot/
3
 Bielefeld University (UNIBI) - Coordinator
 Christmann (CHR)
 University of Osnabrück (UOS)
 Siemens (SIEMENS)
 University of Neuchâtel (UNINE)
 University of Lisbon (FC.ID)
 Chalmers (CHALMERS)
 University of Gothenburg (UGOT)
 RISE (RISE)
 EmbeDL (EMBEDL)
 Veoneer (VEONEER)
 Antmicro (ANT)
VEDLIOT partners
4
Cognitive IoT Platforms
Security,
Privacy
and
Trust,
Robustness
and
Safety
Hardware Accelerators
Requirements
Use Cases
Industrial IoT Smart Home
Automotive Open Call
Toolchain
VEDLIoT structure
Reconfigurable,
heterogeneous,
scalable
10x performance
improvement
10x efficiency
improvement
5
VEDLIoT Hardware Platform
 Heterogeneous, modular, scalable microserver system
 Different technology concepts for improving: computing power density, cost-effectiveness,
maintainability, and reliability for the full spectrum of IoT, from embedded devices over the edge
towards the cloud
x86
GPU
ML-ASIC
ARM v8
GPU
SoC
FPGA
SoC
RISC-V
FPGA
VEDLIOT Cognitive
IoT Platform
6
RECS Architecture (RECS|BOX, t.RECS)
RECS Server Backplane (up to 15 Carriers)
Carrier (PCIe Expansion)
Carrier (High Performance)
e.g. GPU-Accelerator
Carrier (Low Power)
#3
#2
Microserver
(High Performance)
#1
Microserver
(Low Power)
#16
#3
#2
Microserver
(Low Power)
#1
High-Speed Low-Latency Network (PCIe, High-Speed Serial)
Compute Network (up to 40 GbE)
Management Network (KVM, Monitoring, …)
HDMI/USB
iPass+ HD
QSFP+
RJ45
Ext. Connectors
GPU SoC
FPGA SoC ARM Soc
Low-Power Microserver (Apalis/Jetson)
x86 ARM v8
High-Performance Microserver (COM Express)
FPGA SoC
High-Performance
Carrier
(up to 3 microservers)
Low-Power Carrier
(up to 16 microservers)
7
 Embedded Device
 Supports ML acceleration
 FPGA
 ASIC
 Communication interfaces
 Wired (CAN, Ethernet, CSI)
 Wireless (WLAN, LoRa, 5G)
 Sensors
 Camera
 Environment (Temp./Hum.)
 Housekeeping
RECS Architecture (µ.RECS)
Processing Unit ML Accelerator
Wireless
Communication
Sensors
Wired Communication
Module (COM HPC; NVIDIA Jetson; …)
ML Accelerator
Carrier
Low-Power
Processing Unit
8
 End of Moore’s law & dark silicon – Domain Specific Architectures (DSA)
 Efficient, flexible, scalable accelerators for the compute continuum
 Algotecture – DL algorithm + computer architecture co-design
DL Accelerators
9
 Enabling the rapid convergence of the fast pace innovation on the hardware and
software
Deep Learning Toolchain
10
Simulation platform for IoT
 Open source framework for software/hardware co-development with CI-driven testing
capabilities, as well as metrics for measuring efficiency of ML workloads
 Enables development and continuous testing of VEDLIoT’s security features and its robustness
 Renode is available to all project members and future users of VEDLIoT and will include a
simulated model of the RISC-V-based FPGA SoC platform developed as part of the VEDLIoT
project
11
End-to-end attestation, support for execution and communication
 Support for
 End-to-end trust (attestation)
 Secrets distribution (keys)
 Machine learning (federated, streaming, …)
 Requirements
 Support for edge applications
 Decentralised, dynamic, loosely organised
 Rely on
 Trusted execution environments
 BFT and peer-to-peer algorithms
 Blockchain concepts
VEDLIoT
app
Attestation
Pool
12
Safety and robustness
 Define monitors for timeliness and data quality assessment
 Develop safety argumentation for adaptive solutions,
exploiting architectural hybridization
Hybrid System architecture:
• Some system components implemented
with assured reliability (as needed)
• Remaining components subject to
uncertainties and prone to failures
Safety requirements:
• Defined at design time for each operational
mode
Monitoring data:
• Collected in run-time, to allow verifying if
safety requirements are being satisfied
and trigger reconfiguration if not
Timeliness (e.g., deadline
miss detection)
Sensor data quality (e.g.,
outlier, noise detection)
Accuracy of DL systems
(e.g., anomaly detection)
13
 Increase safety, health and well being of residents – acceleration of AI methods for
demand-oriented user-home interaction
Use case: Smart Home / Assisted Living
14
 Smart Mirror as central user interface
 Own mirror image can be seen normally
 Intuitive control over gesture and voice
 Shows personalized information
 Public transportation schedule
 Cafeteria offering
 Football scorings/leader board
 Weather
 News
 …
 Data Privacy as the highest priority
-> Edge computation of many neural networks
Smart Mirror – Central Demonstrator for Smart Home
15
 Face recognition
 mobilenetSSD trained on WIDERFACE dataset
 Object detection
 YoloV3, Efficient-Net, yoloV4-tiny
 Gesture detection
 YoloV4-tiny with 3 Yolo layers (usually: 2 layers)
 Speech recognition
 Mozilla DeepSpeech
 AI Art: Style-Gan trained on works of arts
 Collect usage data in situation memory
Smart Mirror – Neural Networks
16
Industrial IoT – drive condition classification
 Control applications need DL-based condition classification
 On the edge device for low power consumption
 Suggestions for control and maintenance
 DL methods on all communication layers
 DL in a distributed architecture
 Dynamically configured systems
 Sensored testbench with 2 motors
 Acceleration, Magnetic field, Temperature,
IR-Cam (temperature), Current-Sensors, Torque
 On / Off detection without
motor current or voltage
 Cooling fault detection
 Bearing fault detection
17
Industrial IoT – Arc detection for DC distributions
 AI based pattern recognition for different local sensor data
 current, magnetic field, vibration, temperature, low resolution infrared picture
 Safety critical nature
 response time should be <10ms
 AI based or AI supported decision made by the sensor node itself or by a local part of the sensor
network
18
 Improve performance/cost ratio – AI processing hardware distributed over the entire
chain
Use case: Automotive
19
The fun has just started!
Follow our work!
https://twitter.com/VEDLIoT
https://www.linkedin.com/company/vedliot/
https://vedliot.eu
Be part of it
Open call at project mid-term
Allow early use and evaluation of VEDLIoT
technology
20
Thank you for your
attention.
Contact
Jens Hagemeyer, Carola Haumann
Bielefeld University, Germany
chaumann@cor-lab.uni-bielefeld.de
jhagemey@cit-ec.uni-bielefeld.de

Industrial Pioneers Days - Machine Learning

  • 1.
    Jens Hagemeyer, CarolaHaumann Industrial Pioneers Days OWL – Machine Learning 20. April 2021 VEDLIoT - Very Efficient Deep Learning in IoT Teaching the IoT to learn
  • 2.
    2  Platform  Hardware:Scalable, heterogeneous, distributed  Accelerators: Efficiency boost by FPGA and ASIC technology  Toolchain: Optimizing Deep Learning for IoT  Use cases  Industrial IoT  Automotive  Smart Home  Open call  At project mid-term  Early use and evaluation of VEDLIoT technology Very Efficient Deep Learning IoT – VEDLIoT  Call: H2020-ICT2020-1  Topic: ICT-56-2020 Next Generation Internet of Things  Duration: 1. November 2020 – 31. Oktober 2023  Coordinator: Bielefeld University (Germany)  Overall budget: 7 996 646.25 €  Consortium: 12 partners from 4 EU countries (Germany, Poland, Portugal and Sweden) and one associated country (Switzerland). More info:  https://www.vedliot.eu/  https://twitter.com/VEDLIoT  https://www.linkedin.com/company/vedliot/
  • 3.
    3  Bielefeld University(UNIBI) - Coordinator  Christmann (CHR)  University of Osnabrück (UOS)  Siemens (SIEMENS)  University of Neuchâtel (UNINE)  University of Lisbon (FC.ID)  Chalmers (CHALMERS)  University of Gothenburg (UGOT)  RISE (RISE)  EmbeDL (EMBEDL)  Veoneer (VEONEER)  Antmicro (ANT) VEDLIOT partners
  • 4.
    4 Cognitive IoT Platforms Security, Privacy and Trust, Robustness and Safety HardwareAccelerators Requirements Use Cases Industrial IoT Smart Home Automotive Open Call Toolchain VEDLIoT structure Reconfigurable, heterogeneous, scalable 10x performance improvement 10x efficiency improvement
  • 5.
    5 VEDLIoT Hardware Platform Heterogeneous, modular, scalable microserver system  Different technology concepts for improving: computing power density, cost-effectiveness, maintainability, and reliability for the full spectrum of IoT, from embedded devices over the edge towards the cloud x86 GPU ML-ASIC ARM v8 GPU SoC FPGA SoC RISC-V FPGA VEDLIOT Cognitive IoT Platform
  • 6.
    6 RECS Architecture (RECS|BOX,t.RECS) RECS Server Backplane (up to 15 Carriers) Carrier (PCIe Expansion) Carrier (High Performance) e.g. GPU-Accelerator Carrier (Low Power) #3 #2 Microserver (High Performance) #1 Microserver (Low Power) #16 #3 #2 Microserver (Low Power) #1 High-Speed Low-Latency Network (PCIe, High-Speed Serial) Compute Network (up to 40 GbE) Management Network (KVM, Monitoring, …) HDMI/USB iPass+ HD QSFP+ RJ45 Ext. Connectors GPU SoC FPGA SoC ARM Soc Low-Power Microserver (Apalis/Jetson) x86 ARM v8 High-Performance Microserver (COM Express) FPGA SoC High-Performance Carrier (up to 3 microservers) Low-Power Carrier (up to 16 microservers)
  • 7.
    7  Embedded Device Supports ML acceleration  FPGA  ASIC  Communication interfaces  Wired (CAN, Ethernet, CSI)  Wireless (WLAN, LoRa, 5G)  Sensors  Camera  Environment (Temp./Hum.)  Housekeeping RECS Architecture (µ.RECS) Processing Unit ML Accelerator Wireless Communication Sensors Wired Communication Module (COM HPC; NVIDIA Jetson; …) ML Accelerator Carrier Low-Power Processing Unit
  • 8.
    8  End ofMoore’s law & dark silicon – Domain Specific Architectures (DSA)  Efficient, flexible, scalable accelerators for the compute continuum  Algotecture – DL algorithm + computer architecture co-design DL Accelerators
  • 9.
    9  Enabling therapid convergence of the fast pace innovation on the hardware and software Deep Learning Toolchain
  • 10.
    10 Simulation platform forIoT  Open source framework for software/hardware co-development with CI-driven testing capabilities, as well as metrics for measuring efficiency of ML workloads  Enables development and continuous testing of VEDLIoT’s security features and its robustness  Renode is available to all project members and future users of VEDLIoT and will include a simulated model of the RISC-V-based FPGA SoC platform developed as part of the VEDLIoT project
  • 11.
    11 End-to-end attestation, supportfor execution and communication  Support for  End-to-end trust (attestation)  Secrets distribution (keys)  Machine learning (federated, streaming, …)  Requirements  Support for edge applications  Decentralised, dynamic, loosely organised  Rely on  Trusted execution environments  BFT and peer-to-peer algorithms  Blockchain concepts VEDLIoT app Attestation Pool
  • 12.
    12 Safety and robustness Define monitors for timeliness and data quality assessment  Develop safety argumentation for adaptive solutions, exploiting architectural hybridization Hybrid System architecture: • Some system components implemented with assured reliability (as needed) • Remaining components subject to uncertainties and prone to failures Safety requirements: • Defined at design time for each operational mode Monitoring data: • Collected in run-time, to allow verifying if safety requirements are being satisfied and trigger reconfiguration if not Timeliness (e.g., deadline miss detection) Sensor data quality (e.g., outlier, noise detection) Accuracy of DL systems (e.g., anomaly detection)
  • 13.
    13  Increase safety,health and well being of residents – acceleration of AI methods for demand-oriented user-home interaction Use case: Smart Home / Assisted Living
  • 14.
    14  Smart Mirroras central user interface  Own mirror image can be seen normally  Intuitive control over gesture and voice  Shows personalized information  Public transportation schedule  Cafeteria offering  Football scorings/leader board  Weather  News  …  Data Privacy as the highest priority -> Edge computation of many neural networks Smart Mirror – Central Demonstrator for Smart Home
  • 15.
    15  Face recognition mobilenetSSD trained on WIDERFACE dataset  Object detection  YoloV3, Efficient-Net, yoloV4-tiny  Gesture detection  YoloV4-tiny with 3 Yolo layers (usually: 2 layers)  Speech recognition  Mozilla DeepSpeech  AI Art: Style-Gan trained on works of arts  Collect usage data in situation memory Smart Mirror – Neural Networks
  • 16.
    16 Industrial IoT –drive condition classification  Control applications need DL-based condition classification  On the edge device for low power consumption  Suggestions for control and maintenance  DL methods on all communication layers  DL in a distributed architecture  Dynamically configured systems  Sensored testbench with 2 motors  Acceleration, Magnetic field, Temperature, IR-Cam (temperature), Current-Sensors, Torque  On / Off detection without motor current or voltage  Cooling fault detection  Bearing fault detection
  • 17.
    17 Industrial IoT –Arc detection for DC distributions  AI based pattern recognition for different local sensor data  current, magnetic field, vibration, temperature, low resolution infrared picture  Safety critical nature  response time should be <10ms  AI based or AI supported decision made by the sensor node itself or by a local part of the sensor network
  • 18.
    18  Improve performance/costratio – AI processing hardware distributed over the entire chain Use case: Automotive
  • 19.
    19 The fun hasjust started! Follow our work! https://twitter.com/VEDLIoT https://www.linkedin.com/company/vedliot/ https://vedliot.eu Be part of it Open call at project mid-term Allow early use and evaluation of VEDLIoT technology
  • 20.
    20 Thank you foryour attention. Contact Jens Hagemeyer, Carola Haumann Bielefeld University, Germany chaumann@cor-lab.uni-bielefeld.de jhagemey@cit-ec.uni-bielefeld.de