AI auf Edge Geräten
Dominik Helleberg Building IoT / 03.04.2019
AI or Machine Learning?
3
If it is written in Python, it's probably
Machine Learning.
If it is written in PowerPoint, it's
probably AI.
4 https://twitter.com/matvelloso/status/1065778379612282885?lang=de
Scenario
5
Why?
6
› Latency
› Autonomy
› Connectivity
› Security
› Economic (Costs: Hardware / Power)
› Privacy
Privacy
7 http://cnrpark.it/
Privacy
8
{
"timestamp":"1552035532715",
"slots":{
"id:214" : 0,
"id:215" : 1,
"id:286" : 1
}
}
Examples
9
Google Clips
https://ai.googleblog.com/2017/10/portrait-mode-on-pixel-2-and-pixel-2-xl.html
Portrait Mode
Examples
10
ARCore
https://ai.googleblog.com/2019/03/real-time-ar-self-expression-with.html
Alexa
HikVision
Face Detection
Examples
11
https://www.starship.xyz/
Robots
https://cloud.google.com/blog/products/ai-machine-learning/monitoring-home-
appliances-from-power-readings-with-ml
The Challenge
12
https://arstechnica.com/information-technology/2015/12/facebooks-open-sourcing-of-ai-hardware-is-the-start-of-the-deep-learning-revolution
https://cloud.google.com/tpu/
The Challenge
13
The Choices - Hardware
14
Software
15
CNN
LSTM
RNN
…
Optimization
Tools / Manual
Optimized
Network
(Reduced Accuarcy)
Software - Quantization
16
float
int
› Accelerated computations (maybe)
› Reduced size
https://sahnimanas.github.io/2018/06/24/quantization-in-tf-lite.html
› Reduced Accuracy
Software - Quantization
17
Quantization-Aware-training / Post-Training-Quantization
0,62
0,63
0,64
0,65
0,66
0,67
0,68
0,69
0,7
0,71
0,72
Top 1 Acc
Float Q-aware Training Post Training Q
Software - Pruning
18
https://www.oreilly.com/ideas/compressing-and-regularizing-deep-neural-networks
Software - Pruning
19
https://www.oreilly.com/ideas/compressing-and-regularizing-deep-neural-networks
0,56
0,58
0,6
0,62
0,64
0,66
0,68
0,7
0,72
Top 1 Acc
Channel Pruning
0% 50% 60% 70% 80%
The choices - Software
21
Converter
/ Optimizer
ML / DL
Framework
Runtime
› Consider the tools when selecting Hardware!
Vendor specific
limited features
limited network-architectures
The „project“
22
Build a smart device which can
› Recognize objects it sees („a person“)
› React in < 1 second to it
The „project“
23
Recognize objects it sees -> Image Classification
Choose network architecture:
› LeNet
› VGG
› GoogleNet
› Inception
› ResNet
› SqueezeNet
› NasNet
› AlexNet
› MobileNet
› …
› Try to stick to existing nets!
› MobileNetV1
The choices - Hardware
24
CPU
SoC
GPU
ASIC
MCU
FPGA
› React in < 1 second to it
The choices - Hardware
25
› How to compare?
› GOPS / TOPS
› GFLOPS / TFLOPS
› Inference / Sec
› Inference / Watt
= =
The choices - Hardware
26
Evaluation Hardware
GPU
FPGA
MCU+ASIC
ASIC
CPU
SoCs
Software
27
CNN
MobileNetV1
.pb
Jetson TX2
Raspberry Pi
Intel i7
.tflite
tfliteC
Qualcomm
SD 605
int8
float32
Open
Vino
.bin
.xml
Myriad X
float16
DNNC
.elf
Xilinx Zynq
UltraScale+
int8
Kendryte
modelC
.kmodel
Kendryte
K210
int8
Edge TPU
.tflite
The „project“
28
0
100
200
300
400
500
600
700
800
MobileNetV1 in ms
Performance
RasPi Jetson TX2 Myriad-X Zynq ZU3EG Edge TPU K210 MCU
The „project“
29
0
10
20
30
40
50
60
70
80
90
100
MobileNetV1 in ms
Performance
RasPi Jetson TX2 Myriad-X Zynq ZU3EG Edge TPU K210 MCU
The „project“
30
Performance is not (always) the key issue
› Cost
› Accuracy
› Power / Heat
› Flexibility
› Tooling
› Availibility
› …
Conclusion
31
› Validate your explicit use case
› Official Performance numbers -> rough guideline
› Your Model-Architecture might dictate your Hardware
› Always validate accuracy
› More „tools“ -> more effort / time
› Build a CI-Pipeline + Versionize everything
› Technology is in flux
› You‘re on the “bleeding“ egde
“Bleeding“ Edge Technology
32
I tensorflow/contrib/lite/toco/import_tensorflow.cc:937]
Converting unsupported operation: Pack
I tensorflow/contrib/lite/toco/import_tensorflow.cc:937]
Converting unsupported operation: Where
Usb_WritePipe: System err 995 W WinUsb_WritePipe failed with
error:=995 i[35mE: [xLink] [ 0] dispatcherEventSend:908
nUsb_ReadPipeWrite failed event -2 :[0m [31mF: [xLink] [ 0]
dispatcherResponseServe:346 Sno request for this response:
USB_READ_REL_RESP 1 9121 [0m y#### (i == MAX_EVENTS)
USB_READ_REL_RESP 1 9121
“Bleeding“ Edge Technology
33
The „project“
34
DEMO
Vielen Dank
Dominik Helleberg
Schanzenstraße 6-20
51063 Köln
dominik.helleberg@inovex.de

AI auf Edge-Geraeten