2 31/01/2024
©SIRRIS • CONFIDENTIAL •
Table of Contents
➢ Introduction: Why embedded machine learning?
➢ Three main ingredients
➢ Training our model
➢ How to run inference on a Raspberry Pi Pico?
➢ Conclusion
Why Embedded Machine Learning?
Why Machine Learning?
➢ Very good at finding patterns
➢ Less human input needed
➢ Broadly applicable
➢ More processing power
➢ Lots and lots of data
➢ Explainability?
Can be a great tool!
Not the only tool!
I.e., let's avoid doing it for the fancy factor when traditional computer vision techniques are more suitable!
Need machine learning? Machine learning???
Today
➢ Machine learning, and in particular deep learning with convolutional networks, is a good tool for image classification
➢ Let's look into such a classification task… Embedded!
Why Machine Learning on the Edge?
Low cost
Data stays local
Less space usage
Independent of Internet connection
Low energy consumption
Low latency
✓ Autonomous
✓ Reliable
✓ No bandwidth limitations
✓ Data privacy
✓ Security
✓ Control
✓ Mobile application
✓ No cloud computations
✓ Vehicles, …
✓ Viability of business case
Embedded Machine Learning
• Only the inference will be embedded
• i.e., no on-device training in this presentation
Three Ingredients
A Hardware
A Framework
A Model
Choosing a Framework…
…BEFORE THE MICROCONTROLLER
TensorFlow
➢ Similar accuracy (better with CNN?)
➢ Better for deployment
➢ Harder without Keras, easier with Keras
➢ TF Lite/TFLμ for embedded systems
PyTorch
➢ Similar accuracy (better with RNN?)
➢ Better for GPU support?
➢ Easier and more Pythonic
➢ PyTorch Live/Mobile for ML on smartphones
A Link: good or bad news?
➢ Sometimes, you can get the best of both worlds…
A SOTA PyTorch model → ONNX → TF Lite deployment
ONNX
• Open Neural Network Exchange
• Allows exchange between frameworks
• Helps hardware providers with AI optimisation
Let's say you must use TensorFlow Lite for Microcontrollers (TFLu)… That does not mean you can skip the choice between TensorFlow and PyTorch!
Embedded Frameworks
➢ A dedicated embedded framework provides…
• Optimisation/compression
• An on-device inference « engine »
• Getting rid of as much as possible
➢ It is not mandatory
➢ i.e., one could directly use Python on an SBC like a Raspberry Pi
Embedded Frameworks
TFLu vs. STM32 CUBE AI
TFLu | CUBE AI
Wider availability (not only for STM boards) | Better performance
CLI | CLI/GUI
Interpreter-based | Generated C++ code
Open-source | Tools for testing
More help available on the internet | More information on your model
Different Networks
MobileNetV2
Convolutional layer: 4*4*27*2 = 864 operations
Depthwise separable convolution: 4*4*9*3 + 4*4*3*2 = 528 operations
➢ Depthwise separable convolution
➢ Thinner model possible (α)
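The two operation counts above can be reproduced with a short sketch. It assumes the slide's figure shows a 4 x 4 output map, a 3 x 3 kernel, 3 input channels and 2 output channels (so that 27 = 3*3*3); the function names are illustrative only.

```cpp
#include <cstdint>

// Multiply-accumulate count of a standard convolution:
// every output pixel of every output channel reads k*k*c_in inputs.
constexpr std::uint64_t standard_conv_ops(std::uint64_t out_h, std::uint64_t out_w,
                                          std::uint64_t k, std::uint64_t c_in,
                                          std::uint64_t c_out) {
    return out_h * out_w * (k * k * c_in) * c_out;
}

// Depthwise separable convolution = depthwise step (one k*k filter per
// input channel) followed by a 1x1 pointwise step that mixes channels.
constexpr std::uint64_t depthwise_separable_ops(std::uint64_t out_h, std::uint64_t out_w,
                                                std::uint64_t k, std::uint64_t c_in,
                                                std::uint64_t c_out) {
    return out_h * out_w * (k * k) * c_in      // depthwise
         + out_h * out_w * c_in * c_out;       // pointwise
}
```

With the assumed shapes, `standard_conv_ops(4, 4, 3, 3, 2)` gives 864 and `depthwise_separable_ops(4, 4, 3, 3, 2)` gives 528, matching the slide.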
Different Networks
ShuffleNet
Normal convolution: 4*4*54*3 = 2592 operations
Group convolution: 4*4*18*3 = 864 operations
➢ Group convolution
➢ Channel shuffle
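The saving from group convolution can be checked the same way. This sketch assumes a 4 x 4 output map, a 3 x 3 kernel, 6 input channels, 3 output channels and 3 groups (so that 54 = 3*3*6 and 18 = 3*3*2); these shapes are inferred from the numbers on the slide, not stated there.

```cpp
#include <cstdint>

// Multiply-accumulate count of a grouped convolution: each output channel
// only sees the c_in / groups input channels of its own group.
// groups == 1 is a normal convolution. c_in is assumed divisible by groups.
constexpr std::uint64_t grouped_conv_ops(std::uint64_t out_h, std::uint64_t out_w,
                                         std::uint64_t k, std::uint64_t c_in,
                                         std::uint64_t c_out, std::uint64_t groups) {
    return out_h * out_w * (k * k * (c_in / groups)) * c_out;
}
```

Under these assumptions `grouped_conv_ops(4, 4, 3, 6, 3, 1)` is 2592 and `grouped_conv_ops(4, 4, 3, 6, 3, 3)` is 864, i.e. the cost is divided by the number of groups.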
Different Networks
COMPARISON
NAS = Neural Architecture Search
Different Hardware
❑ The hardware can be microcontrollers not dedicated to AI
➢ Raspberry Pi Pico (RP2040/Arm Cortex-M0+), STM32 boards, …
❑ But it can be helped with a co-processor…
Why NPUs?
CPU
✓ Always required
✓ Fast & versatile
✓ Improving every day
✓ Models are smaller
✓ Can be a good stand-alone choice for inference speed in some cases (sequential aspects of recurrent neural networks, small deep networks?)
GPU (or DSP (Digital Signal Processor), …)
✓ Better for parallelization
✓ Convolutional networks
✓ Can be much faster, but never alone
NPU (Neural Processing Unit)
TPU (Tensor Processing Unit) from Google
VPU (Vision Processing Unit) from Intel
Tensor Cores (Nvidia)
FPGA (reconfigurable aspects!)
…
Our Setup
Raspberry Pi 4
(Note: the RPI5 has a PCIe port, and the Coral PCIe accelerator costs 20 €)
TensorFlow Lite model & tflite_runtime
(smaller Python package with only the TF Lite interpreter)
MobileNetV2 (224, 224)
Full integer quantized model
Coral USB accelerator (60 €)
USB 3.0 required (otherwise, little gain due to low data transfer)
PICAM
Google Coral Edge TPU
Three Ingredients: lots of options…
A hardware: …
A framework: …
A model: …
Many things exist…
1. Let's avoid name-dropping!
2. Many options… Let's pick!
3. Feeling lost… But having many available options is also a good thing!
Three Ingredients: the path for today!
A Hardware: Raspberry Pi Pico
A Framework: TensorFlow Lite for Microcontrollers
A Model: MobileNetV2
Training our Model
Training
dataset + pre-trained weights (MobileNetV2) → TF model → TF Lite model → Reduction/Optimization → Convert to .h file
Inference
TFLu + TFLu for PICO + Makefile + PICO SDK + C++ code → TFLu interpreter → "Plane!"
Database
SOURCES
➢ Pascal VOC
  ➢ Image classification & object detection
  ➢ 11,540 images, 20 classes
➢ COCO
  ➢ Object detection
  ➢ 200,000 images, 80 classes
Database
CUSTOM DATASET
➢ 8 classes: airplane, boat, bus, car, motorbike, none, person, train
➢ 800 images per class (600 training + 200 validation)
➢ Image size 224 x 224
➢ Bounding box
  ➢ Random size (minimum 25% of image)
  ➢ Random location
Transfer Learning
BUILD NEW APPLICATIONS
➢ Pretrained network
  ➢ Feature extraction
  ➢ Reuse for a different task
➢ Benefits
  ➢ Less data needed
  ➢ Lower training time
  ➢ Better generalization
➢ Fine-tuning
Note: we only retrain the last dense layer for our classification task
Training
HYPERPARAMETERS & OVERFITTING
➢ Data augmentation
  ➢ Prevent overfitting
  ➢ Increase dataset size
  ➢ Improve accuracy
  ➢ Images: flip, crop, rotate, zoom, stretch, contrast, brightness…
➢ Batch size
  ➢ Smaller = less overfitting BUT slower training
➢ Learning rate
  ➢ Larger batches = higher learning rate
  ➢ Decrease over time
Database
INFLUENCE OF DATA AMOUNT
➢ MobileNetV2 (96 x 96) → remove random training images (same amount per class)
➢ 600 training images per class
  ➢ Accuracy: 0.868
➢ 150 training images per class
  ➢ Accuracy: 0.825
➢ 75 training images per class
  ➢ Accuracy: 0.812
➢ 37 training images per class
  ➢ Accuracy: 0.794
✓ Transfer learning is great!
Training
EXAMPLE IN TENSORFLOW
➢ Pretrained network: MobileNetV2
  ➢ 2,257,984 parameters
  ➢ Pretrained on ImageNet (1,300,000 images)
➢ Classifier: fully connected layer
  ➢ 10,248 parameters
➢ Batch size = 20
➢ Learning rate = 0.0002
➢ Training time ≈ 2 minutes/epoch
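The classifier size quoted above follows from the usual parameter count of a fully connected layer (weights plus biases). The sketch below assumes the classifier is fed the 1280-value pooled feature vector of MobileNetV2 and outputs the 8 classes of the custom dataset; the function name is illustrative.

```cpp
#include <cstdint>

// Trainable parameters of a dense layer: one weight per (input, output)
// pair plus one bias per output.
constexpr std::uint64_t dense_layer_params(std::uint64_t in_features,
                                           std::uint64_t out_classes) {
    return in_features * out_classes + out_classes;
}
```

`dense_layer_params(1280, 8)` gives 10,248. Applying the same formula to a 2048-value feature vector gives 16,392, which is the classifier size the deck reports for ResNet50.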
Comparison
RESNET VS. MOBILENET (224 X 224)
ResNet50
➢ Parameters
  ➢ Base model: 23,587,712
  ➢ Classifier: 16,392
➢ Training time: 9 epochs, 6 minutes/epoch
➢ Size
  ➢ Normal: 94 MB
  ➢ TFLite: 92 MB
➢ Performance
  ➢ Accuracy: 0.979
MobileNetV2
➢ Parameters
  ➢ Base model: 2,257,984
  ➢ Classifier: 10,248
➢ Training time: 18 epochs, 2 minutes/epoch
➢ Size
  ➢ Normal: 12 MB
  ➢ TFLite: 9 MB
➢ Performance
  ➢ Accuracy: 0.970
MobileNetV2
COMPARISON
Input resolution | Scaling factor (α) | Size (TFLite) | Accuracy | F1 score | Inference time (PC)
224 x 224 | 1 | 8,698 kB | 0.970 | 0.969 | 11.05 ms
96 x 96 | 1 | 8,698 kB | 0.931 | 0.931 | 2.22 ms
96 x 96 | 0.35 | 1,597 kB | 0.868 | 0.869 | 0.67 ms
48 x 48 | 1 | 8,698 kB | 0.732 | 0.732 | 1.21 ms
48 x 48 | 0.35 | 1,597 kB | 0.630 | 0.633 | 0.28 ms
(The accuracy drop of the 48 x 48 models is partly due to pretrained weights not being available)
[Figure: Comparison, ResNet vs. MobileNetV2 (224 x 224)]
[Figure: MobileNetV2, 224 x 224 (α = 1) vs. 96 x 96 (α = 0.35)]
TFLite
COMPRESSED FLATBUFFER FORMAT
➢ Benefits
  ➢ Reduced size
  ➢ Faster inference
  ➢ Includes optimization possibilities
➢ Works out of the box for most models
  ➢ Not all TensorFlow operations are supported
Quantization
➢ Changing the datatype
  ➢ Like moving from RGB888 to RGB565 or YCbCr422 in computer vision
➢ Can be float16, dynamic range, …
  ➢ Here, we use 8-bit integers: [min, max] is mapped to [-128, 127]
➢ Post-Training Quantization (PTQ) vs. Quantization-Aware Training (QAT)
➢ Model is smaller & faster…
At the cost of… Accuracy drop?
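The mapping of a float range [min, max] onto the int8 range [-128, 127] can be sketched as below. This is a minimal illustration of affine quantization, not the converter's actual implementation; the `QuantParams` type and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: affine int8 quantization parameters.
struct QuantParams {
    float scale;
    std::int32_t zero_point;
};

// Choose scale and zero point so that [min_val, max_val] maps to [-128, 127].
inline QuantParams choose_quant_params(float min_val, float max_val) {
    // Widen the range so that 0.0f is always exactly representable.
    min_val = std::min(min_val, 0.0f);
    max_val = std::max(max_val, 0.0f);
    float scale = (max_val - min_val) / 255.0f;
    if (scale == 0.0f) scale = 1.0f;  // degenerate all-zero range
    const long zp = std::lround(-128.0f - min_val / scale);
    return {scale, static_cast<std::int32_t>(zp)};
}

// Float to int8: divide by the scale, shift by the zero point, saturate.
inline std::int8_t quantize(float x, const QuantParams& p) {
    const long q = std::lround(x / p.scale) + p.zero_point;
    return static_cast<std::int8_t>(std::max(-128L, std::min(127L, q)));
}

// Int8 back to float; the difference to the original value is the
// quantization error that can cause the accuracy drop.
inline float dequantize(std::int8_t q, const QuantParams& p) {
    return p.scale * static_cast<float>(q - p.zero_point);
}
```

For a range of [0, 255] this yields a scale of 1 and a zero point of -128, so 0 maps to -128 and 255 maps to 127; values outside the range saturate.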
Quantization
RESULTS
Model: MobileNetV2 (96 x 96, α = 0.35)
Quantization | Size (TFLite) | Accuracy | F1 score | Inference time (RPI 4)
None | 1,590 kB | 0.869 | 0.830 | 4.17 ms
float16 | 825 kB | 0.870 | 0.831 | 4.12 ms
Dynamic range | 538 kB | 0.869 | 0.834 | 4.66 ms
Full integer | 611 kB | 0.845 | 0.805 | 3.46 ms
Full integer (quantization aware) | 611 kB | 0.852 | 0.810 | 3.46 ms
Pruning
PRINCIPLE
➢ Weight pruning
  ➢ Gradually zero out weights
  ➢ Based on magnitude, activation, gradient…
  ➢ Intermediate training for recalibration
➢ Structured pruning
  ➢ Remove neurons/filters
  ➢ α factor in the MobileNetV2 architecture
➢ Reduced size due to efficient compression
➢ Improved inference time (skip zero computations)
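Magnitude-based weight pruning, the first variant listed above, can be sketched in a few lines. This is a one-shot illustration only: it leaves out the gradual schedule and the intermediate retraining, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Zero out the `sparsity` fraction of weights with the smallest magnitude.
// Returns the number of weights set to zero (ties at the threshold are
// pruned too, so the count can slightly exceed the target).
inline std::size_t prune_by_magnitude(std::vector<float>& weights, float sparsity) {
    const std::size_t n_prune = static_cast<std::size_t>(sparsity * weights.size());
    if (n_prune == 0) return 0;

    // Find the magnitude of the n_prune-th smallest weight.
    std::vector<float> mags(weights.size());
    std::transform(weights.begin(), weights.end(), mags.begin(),
                   [](float w) { return std::fabs(w); });
    std::nth_element(mags.begin(), mags.begin() + (n_prune - 1), mags.end());
    const float threshold = mags[n_prune - 1];

    std::size_t pruned = 0;
    for (float& w : weights) {
        if (std::fabs(w) <= threshold) {
            w = 0.0f;
            ++pruned;
        }
    }
    return pruned;
}
```

The zeroed weights do not shrink the array by themselves; the size gain comes from compressing the long runs of zeros, which is why the results report a zipped size.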
Pruning
RESULTS
Model: MobileNetV2 (96 x 96, α = 0.35)
Pruning | Size (TFLite) | Size (zip) | Accuracy | F1 score | Inference time (RPI 4)
None | 1,590 kB | 1,463 kB | 0.869 | 0.830 | 4.17 ms
Dense layer (80%) | 1,564 kB | 1,439 kB | 0.865 | 0.820 | 3.95 ms
Dense layer (90%) | 1,558 kB | 1,434 kB | 0.855 | 0.823 | 3.97 ms
Dense layer (80%) + 1/3 Conv layers (50%) | 1,191 kB | 989 kB | 0.769 | 0.748 | 3.99 ms
Less accuracy drop when pruning the later stages of the model
Pruning and Quantization
COMBINED
Model: MobileNetV2 (96 x 96, α = 0.35)
Pruning / quantization | Size (TFLite) | Size (zip) | Accuracy | F1 score | Inference time (RPI 4)
None | 1,590 kB | 1,463 kB | 0.869 | 0.830 | 4.17 ms
Full integer | 611 kB | (not given) | 0.845 | 0.805 | 3.46 ms
Dense layer (80%) + 1/3 Conv layers (50%) | 1,191 kB | 989 kB | 0.769 | 0.748 | 3.99 ms
Dense layer (80%) + 1/3 Conv layers (50%) + Full integer | 611 kB | 365 kB | 0.734 | 0.709 | 3.62 ms
Weight Clustering
PRINCIPLE
➢ Cluster the weights of a layer into N clusters
➢ The cluster centroid value is shared by all weights in the cluster
➢ Additional fine-tuning possible
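The sharing step described above can be sketched as follows. The sketch assumes the N centroids are already known (for instance from a k-means pass, which is not shown) and only illustrates how each weight is replaced by its nearest centroid; the function name is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Replace every weight by the value of its nearest centroid and return,
// for each weight, the index of the centroid it was assigned to.
inline std::vector<std::size_t> share_centroids(std::vector<float>& weights,
                                                const std::vector<float>& centroids) {
    std::vector<std::size_t> assignment;
    if (centroids.empty()) return assignment;  // nothing to share
    assignment.reserve(weights.size());
    for (float& w : weights) {
        std::size_t best = 0;
        for (std::size_t c = 1; c < centroids.size(); ++c) {
            if (std::fabs(w - centroids[c]) < std::fabs(w - centroids[best])) {
                best = c;
            }
        }
        assignment.push_back(best);
        w = centroids[best];
    }
    return assignment;
}
```

After this step a layer only contains N distinct values, so with 16 clusters each weight could in principle be stored as a 4-bit index, and the repeated values compress well.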
Weight Clustering
RESULTS
Model: MobileNetV2 (96 x 96, α = 0.35)
Pruning / clustering | Size (TFLite) | Size (zip) | Accuracy | F1 score
None | 1,590 kB | 1,463 kB | 0.869 | 0.830
Dense layer (90% pruning) | 1,558 kB | 1,434 kB | 0.855 | 0.823
Dense layer (16 clusters) | 1,671 kB | 1,442 kB | 0.872 | 0.832
Note: the strip_clustering_wrapper function in TensorFlow was not working
Overview
INFLUENCE OF DIFFERENT PARAMETERS
Parameter | Accuracy | Size | Inference | Remark
Decreased input resolution | - - | = | ++ | Enough information necessary in pixels
Decreased model size (α) | - - - | ++ | +++ | NAS can improve the accuracy loss
Full integer quantization | - | ++ | + | Often required for inference on MCU
Pruning | - (-) | + | = | Less impact on later layers
Weight clustering | - | + | = |
OK, I lied.
I SAID WE HAD THE WAY AND THEN I COME WITH OTHER COMPARISON TABLES…
However…
❑ We do not need to redo everything from scratch
❑ Many tools, tutorials, etc. are available
❑ A bunch of weights does not mean anything to us humans (hence all the work done on explainable AI), but we do not need to understand them…
➢ Accepting abstraction + using available tools = simpler than it may look?
• We don't need to create new model architectures: MobileNetV2!
• We don't need to implement them: TensorFlow!
• We don't even train most of that: transfer learning!
Some results of inference time
Memory & accuracy are known from the model BUT inference time depends on the hardware!
Model | RPI4 | Coral (USB 2.0) | Coral | Coral+
(96, 96) | ~3.4 ms | NA | ~1.72 ms | NA
(224, 224) | ~100 ms | ~12.6 ms | ~5 ms | ~3.3 ms
RPI4 → Coral: 20 x faster at (224, 224), 2 x faster at (96, 96)
Spoiler alert!
STM32H747I + MBV2 (res = 96, α = 0.35): 56 ms
PICO + MBV2 (res = 48, α = 0.35): 250 ms
How to run inference on a Raspberry Pi Pico?
Content
OVERVIEW
➢ Raspberry Pi Pico
1. Initial setup
  ➢ CMake file
  ➢ Pico-sdk library
2. Blinking a LED
3. Run inference
  ➢ TFLite-micro library
  ➢ Include model
  ➢ Execute the code
Ubuntu (WSL)
Resources
WHERE TO BEGIN
➢ Datasheet
➢ "Getting Started"
➢ GitHub's README
We are helped!
Initial Setup
FOLDER STRUCTURE
➢ Libraries needed:
  ➢ pico-sdk (software development kit)
➢ Install the toolchain
  ➢ $ sudo apt install cmake gcc-arm-none-eabi libnewlib-arm-none-eabi build-essential
Pico-sdk Library
INSTALLATION PROCEDURE
➢ Clone the repository and update the submodules
  ➢ $ git clone https://github.com/raspberrypi/pico-sdk.git --branch master
  ➢ $ cd pico-sdk
  ➢ $ git submodule update --init
➢ Copy pico_sdk_import.cmake from lib/pico-sdk/external to the main folder
➢ Update the PICO_SDK_PATH variable
  ➢ $ export PICO_SDK_PATH=<main_folder>/lib/pico-sdk
Blinking a LED
MAIN.CPP
#include <stdio.h>
#include "pico/stdlib.h"
#include "hardware/gpio.h"
#include "pico/binary_info.h"

/* As per the Raspberry Pi Pico pinout documentation. */
#define LED_PIN 28

/* Program entry point. */
int main() {
    /* Initialisation of the standard library for input/output. */
    stdio_init_all();

    /* Initialisation of the LED pin as an output pin, with a LOW initial value. */
    gpio_init(LED_PIN);
    gpio_set_dir(LED_PIN, GPIO_OUT);
    gpio_put(LED_PIN, 0);

    /* Forever loop. */
    while (true) {
        /* Blinking the LED. */
        gpio_put(LED_PIN, 1);
        sleep_ms(1000);
        gpio_put(LED_PIN, 0);
        sleep_ms(1000);
    }

    /* Unreachable code. */
    return 0;
}
Blinking a LED
CMAKE FILE
cmake_minimum_required(VERSION 3.12)
include(pico_sdk_import.cmake)
project(picoDemo C CXX ASM)
set(CMAKE_C_STANDARD 11)
set(CMAKE_CXX_STANDARD 17)
pico_sdk_init()
add_compile_options(-Wall -Wno-format -Wno-unused-function -Wno-maybe-uninitialized)
add_executable(picoDemo src/main.cpp)
# pico-sdk library
target_link_libraries(${PROJECT_NAME} pico_stdlib)
# Use the USB connection for communication
pico_enable_stdio_usb(picoDemo 1)
pico_enable_stdio_uart(picoDemo 0)
# uf2 file format to easily flash to the Pico
pico_add_extra_outputs(picoDemo)
Blinking a LED
BUILDING THE PROJECT
➢ Navigate to the build folder
  ➢ $ cmake ..
  ➢ $ make
➢ Copy the created .uf2 file to the Pico to flash the program
Running Inference
Pico-tflmicro
CLONE REPOSITORY
➢ Navigate to the lib folder
  ➢ $ git clone https://github.com/raspberrypi/pico-tflmicro.git
➢ This is a repository that:
  ➢ Includes the TensorFlow library (https://github.com/tensorflow/tflite-micro)
  ➢ BUT is already configured for the Pico
➢ Later we will show how to configure the TensorFlow library directly
[Figure: folder structure]
Pico-tflmicro
INCLUDE IN PROJECT
➢ In main.cpp add
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/micro/tflite_bridge/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
➢ In CMakeLists.txt add
target_link_libraries(${PROJECT_NAME} pico_stdlib pico-tflmicro)
add_subdirectory("lib/pico-tflmicro" EXCLUDE_FROM_ALL)
Run Inference
INCLUDE TEST IMAGE
Test image
Actual 48 x 48 image used for inference
Included as image.h (an array with the data); eventually it should come from the camera
Run Inference
INCLUDE MODEL
➢ A .tflite model can be converted to a .h file with a command
  ➢ $ xxd -i model_name.tflite > new_model_name.h
➢ Open the generated header and declare the array as const unsigned char!
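As an illustration, the header produced by `xxd -i` and edited as described could look like the sketch below. The array name and its bytes are placeholders (a real header holds the whole .tflite file), the `alignas(16)` is an optional addition, and `looks_like_tflite` is a hypothetical sanity check that relies on the "TFL3" file identifier that TFLite FlatBuffers carry at byte offset 4.

```cpp
#include <cstddef>
#include <cstring>

// Placeholder bytes only: a real array holds the full .tflite file.
// `const` keeps the model in flash instead of copying it to RAM.
alignas(16) const unsigned char example_model_tflite[] = {
    0x1c, 0x00, 0x00, 0x00, 'T', 'F', 'L', '3', 0x00, 0x00, 0x00, 0x00
};
const unsigned int example_model_tflite_len = sizeof(example_model_tflite);

// Hypothetical check: a TFLite FlatBuffer stores the 4-byte identifier
// "TFL3" right after its 4-byte root offset.
inline bool looks_like_tflite(const unsigned char* data, std::size_t len) {
    return len >= 8 && std::memcmp(data + 4, "TFL3", 4) == 0;
}
```

Such a check can catch a truncated or wrongly converted header before the array is handed to the interpreter.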
Run Inference
MAIN.CPP: NECESSARY INCLUDES
#include <stdio.h>
#include <stdlib.h>
#include "pico/stdlib.h"
#include "hardware/gpio.h"
#include "pico/binary_info.h"

/* Specific includes for TensorFlow Lite for Microcontrollers. */
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/micro/tflite_bridge/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"

/* The image that will be tested. */
#include "image.h"

/* The trained model, converted for TFLu, stored in a C header file. */
#include "../models/model.h"

/* As per the Raspberry Pi Pico pinout documentation. */
#define LED_PIN 28

/* Resolver sized for the 9 operations this model needs (instead of registering every available operation). */
using AllOpsResolver_t = tflite::MicroMutableOpResolver<9>;
Run Inference
MAIN.CPP: FUNCTIONS TO PREPROCESS THE IMAGE (NEEDED FOR CORRECT INPUT TO THE MODEL)
/* Rescaling to bring the data to the scale the model expects: x * scale + offset.
   With scale = 1/127.5 and offset = -1, [0, 255] is mapped to [-1, 1]. */
float rescaling(float x, float scale, float offset) {
    return (x * scale) + offset;
}

/* Quantization procedure, i.e. moving from a number represented with floats to a number
   represented with int8 (rounded to nearest, saturated to [-128, 127]). */
int8_t quantize(float x, float scale, float zero_point) {
    float q = (x / scale) + zero_point;
    if (q < -128.0f) q = -128.0f;
    if (q > 127.0f) q = 127.0f;
    return (int8_t)(q >= 0.0f ? q + 0.5f : q - 0.5f);
}
Run Inference
MAIN.CPP: INITIALIZE LED, LABELS AND IMAGE SIZE
/* Program entry point. */
int main() {
    /* Initialisation of the standard library for input/output. */
    stdio_init_all();

    /* Initialisation of the LED pin as an output pin, with a LOW initial value. */
    gpio_init(LED_PIN);
    gpio_set_dir(LED_PIN, GPIO_OUT);
    gpio_put(LED_PIN, 0);

    /* Image dimensions (48, 48) on 3 channels (RGB). */
    int Npix = 48;
    int Nchan = 3;
    int Nlabels = 8;

    /* The 8 possible labels for the classifier as strings for the serial output. */
    const char *label[] = {"aeroplane", "boat", "bus", "car", "motorbike", "none", "person", "train"};
Run Inference
MAIN.CPP: INITIALIZE TFLITE-MICRO OBJECTS
    /* Initialisation of the TFLu interpreter. */
    static const tflite::Model* tflu_model = nullptr;
    static tflite::MicroInterpreter* tflu_interpreter = nullptr;
    static TfLiteTensor* tflu_i_tensor = nullptr;
    static TfLiteTensor* tflu_o_tensor = nullptr;

    /* The ops resolver and error reporter. */
    static AllOpsResolver_t op_resolver;
    static tflite::MicroErrorReporter micro_error_reporter;
    tflite::ErrorReporter* error_reporter = &micro_error_reporter;
Run Inference
MAIN.CPP: ADD THE OPERATIONS INCLUDED IN YOUR MODEL
    op_resolver.AddConv2D(tflite::Register_CONV_2D_INT8());
    op_resolver.AddDepthwiseConv2D(tflite::Register_DEPTHWISE_CONV_2D_INT8());
    op_resolver.AddPad();
    op_resolver.AddAdd(tflite::Register_ADD_INT8());
    op_resolver.AddRelu6();
    op_resolver.AddMean();
    op_resolver.AddSoftmax(tflite::Register_SOFTMAX_INT8());
    op_resolver.AddFullyConnected(tflite::Register_FULLY_CONNECTED_INT8());
    op_resolver.AddDequantize();
Run Inference
MAIN.CPP: MORE INITIALIZATION + INCLUDING THE MODEL
    /* Allocation of the tensor arena, on the heap. */
    constexpr int tensor_arena_size = 144000;
    uint8_t *tensor_arena = (uint8_t *)malloc(tensor_arena_size);
    if (tensor_arena == nullptr) {
        printf("Could not allocate the tensor arena.\r\n");
        return 1;
    }

    /* Initialising the scaling values. */
    float scaling_scale = 1.0f / 127.5f;
    float scaling_offset = -1.0f;

    /* Retrieving the model from the header file. */
    tflu_model = ::tflite::GetModel(mobilenet48_int_input_tflite);

    /* Creating the interpreter and allocating tensors. */
    static tflite::MicroInterpreter static_interpreter(tflu_model, op_resolver, tensor_arena, tensor_arena_size);
    tflu_interpreter = &static_interpreter;
    TfLiteStatus allocate_status = tflu_interpreter->AllocateTensors();
    if (allocate_status != kTfLiteOk) {
        printf("Issue when allocating the tensors.\r\n");
    }
Run Inference
MAIN.CPP: LINK INPUT/OUTPUT AND GET QUANTIZATION PARAMETERS
    /* Linking the interpreter to the input/output tensors. */
    tflu_i_tensor = tflu_interpreter->input(0);
    tflu_o_tensor = tflu_interpreter->output(0);

    /* Retrieving the quantization parameters from the model. */
    const auto* i_quantization = reinterpret_cast<TfLiteAffineQuantization*>(tflu_i_tensor->quantization.params);
    float tfluQuant_scale = i_quantization->scale->data[0];
    int32_t tfluQuant_zeropoint = i_quantization->zero_point->data[0];

    /* Indices initialization. */
    int idx = 0;
    float value = 0;
    float value_scaled = 0;
    float value_quant = 0;
    int idx_tf = 0;
Run Inference
MAIN.CPP: DO RESCALING AND QUANTIZATION AND GIVE IT AS INPUT TO THE MODEL
    /* Forever loop. */
    while (true) {
        /* Blinking the LED. */
        gpio_put(LED_PIN, 1);
        sleep_ms(1000);
        gpio_put(LED_PIN, 0);
        sleep_ms(1000);

        /* Preparing the input. */
        for (int i(0); i < Npix; i++) {
            for (int j(0); j < Npix; j++) {
                for (int k(0); k < Nchan; k++) {
                    /* Compute the 1D index. */
                    idx = k*Npix*Npix + j*Npix + i;
                    value = test_image[idx];
                    /* Rescale, then quantize the result. */
                    value_scaled = rescaling(value, scaling_scale, scaling_offset);
                    value_quant = quantize(value_scaled, tfluQuant_scale, tfluQuant_zeropoint);
                    /* Put the result in the input tensor. */
                    tflu_i_tensor->data.int8[idx] = value_quant;
                }
            }
        }
Run Inference
MAIN.CPP: RUN INFERENCE AND PRINT RESULT
        /* Call the interpreter to infer a label. */
        TfLiteStatus invoke_status = tflu_interpreter->Invoke();
        if (invoke_status != kTfLiteOk) {
            printf("Issue when running the inference.\r\n");
        }

        /* Print the probabilities of each label over the serial connection. */
        printf("Result: [%f; %f; %f; %f; %f; %f; %f; %f].\n", tflu_o_tensor->data.f[0], tflu_o_tensor->data.f[1], tflu_o_tensor->data.f[2],
               tflu_o_tensor->data.f[3], tflu_o_tensor->data.f[4], tflu_o_tensor->data.f[5],
               tflu_o_tensor->data.f[6], tflu_o_tensor->data.f[7]);

        /* Retrieve the result with maximum likelihood (indices 0 to Nlabels - 1). */
        size_t ix_max = 0;
        float pb_max = 0;
        for (size_t ix = 0; ix < (size_t)Nlabels; ix++) {
            if (tflu_o_tensor->data.f[ix] > pb_max) {
                ix_max = ix;
                pb_max = tflu_o_tensor->data.f[ix];
            }
        }

        /* Print the most likely label over the serial connection. */
        printf("Result of inference: %s with proba %f.\n", label[ix_max], pb_max);
    }

    /* Unreachable code. */
    return 0;
}
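The search for the most likely label can also be written as a small stand-alone helper. This is a sketch using the standard library rather than the code from the slides; the helper name is hypothetical and it works on any array of scores.

```cpp
#include <algorithm>
#include <cstddef>

// Index of the largest score in scores[0 .. count-1].
// Returns 0 for an empty array; the first maximum wins on ties.
inline std::size_t argmax(const float* scores, std::size_t count) {
    if (count == 0) return 0;
    return static_cast<std::size_t>(std::max_element(scores, scores + count) - scores);
}
```

In the loop above it could be called as `argmax(tflu_o_tensor->data.f, Nlabels)`, which visits exactly the indices 0 to Nlabels - 1.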
Result
TFLite-micro
RESULT
➢ Result:
  ➢ aeroplane: 0.488
  ➢ boat: 0.446
  ➢ bus: 0.012
  ➢ car: 0.023
  ➢ motorbike: 0.000
  ➢ none: 0.004
  ➢ person: 0.004
  ➢ train: 0.012
TFLite-micro
ONLY USING THE TENSORFLOW GITHUB
Three GitHub repositories
TensorFlow → tflite-micro (just a subset) → pico-tflmicro (library already prepared for the Pico)
1. pico-tflmicro: used in the demo until now
2. tflite-micro: alternative route, shown in the next slides
TFLite-Micro
ONLY USING THE TENSORFLOW GITHUB
➢ Navigate to the lib folder
  ➢ $ git clone https://github.com/tensorflow/tflite-micro.git
  ➢ $ cd tflite-micro
  ➢ $ make -f tensorflow/lite/micro/tools/make/Makefile TARGET=cortex_m_generic \
        TARGET_ARCH=cortex-m0plus OPTIMIZED_KERNEL_DIR=cmsis_nn microlite
➢ TARGET and TARGET_ARCH are specific to the hardware (the Pico in this case)
TFLite-Micro
CMAKELISTS.TXT
add_compile_definitions(TF_LITE_STATIC_MEMORY=1)
# tflite-micro library
target_link_libraries(${PROJECT_NAME} pico_stdlib
    $ENV{TFLITE_MICRO_PATH}/gen/cortex_m_generic_cortex-m0plus_default/lib/libtensorflow-microlite.a)
# Includes needed for the code in main.cpp
target_include_directories(${PROJECT_NAME} PRIVATE
    $ENV{TFLITE_MICRO_PATH}/
    $ENV{TFLITE_MICRO_PATH}/tensorflow/lite/micro/tools/make/downloads/flatbuffers/include/
    $ENV{TFLITE_MICRO_PATH}/tensorflow/lite/micro/tools/make/downloads/gemmlowp/)
TFLite-Micro
MAIN.CPP
➢ Most of the code in main.cpp is the same
➢ The following slides show what changes
TFLite-Micro
MAIN.CPP
➢ Main.cpp:
#include "tensorflow/lite/micro/tflite_bridge/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/micro/system_setup.h"
#include "tensorflow/lite/micro/micro_log.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
TFLite-Micro
MAIN.CPP
➢ Main.cpp:
const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
constexpr int kTensorArenaSize = 160000;
alignas(16) static uint8_t tensor_arena[kTensorArenaSize];
tflite::InitializeTarget();  /* added */
static tflite::MicroMutableOpResolver<9> op_resolver;  /* replaces AllOpsResolver */
TFLite-Micro
MAIN.CPP
➢ Main.cpp:
op_resolver.AddConv2D(tflite::Register_CONV_2D_INT8());
op_resolver.AddDepthwiseConv2D(tflite::Register_DEPTHWISE_CONV_2D_INT8());
op_resolver.AddPad();
op_resolver.AddAdd(tflite::Register_ADD_INT8());
op_resolver.AddRelu6();
op_resolver.AddMean();
op_resolver.AddSoftmax(tflite::Register_SOFTMAX_INT8());
op_resolver.AddFullyConnected(tflite::Register_FULLY_CONNECTED_INT8());
op_resolver.AddDequantize();
Conclusion
Next steps
We want to run the inference live, with images coming from a ~5 € camera
➢ OV7670 sensor
How to get images while keeping the corresponding cost to a minimum?
Counting on the CPU (SW)?
• Efficient memory usage
• Fusing operations
• …
Not counting on the CPU (HW)?
• PIO
• DMA (direct memory access)
PIO
PROGRAMMABLE INPUT/OUTPUT
➢ A PIO instance contains 4 state machines
➢ A state machine is like a tiny processor that can execute a limited set of instructions
➢ The CPU loads the corresponding instructions and enables/disables the state machines
➢ It's a way to delegate some workload away from the CPU, e.g. communication with the camera
➢ We wanted to show a quick example where the Raspberry Pi Pico blinks a LED without using the CPU but… time is ticking!
Edge AI is Multidisciplinary
Hardware, data science, software
A. Wanting a first inference from scratch…
Good news. Insisting on the most comfortable one(s).
B. A need is defined, hence technical specifications must be met (usually what must be achieved?)
Bad news. All 3 needed.
C. Seeking to be state of the art…
Bad news. Mastering all 3 needed.
Edge AI issues
Multidisciplinary expertise required (hardware, data science, software)… and…
➢ Fast-evolving field (new models, new frameworks, new HW…)
➢ Many constraints to respect for the solution:
  ➢ Technical (inference time, memory usage, …)
  ➢ Human (appropriate to user expertise)
  ➢ Business (at an OK cost)
Conclusion
➢ Beware of name-dropping
➢ First a need, then embedded machine learning is a possibility (and technical specifications are the finish line)
➢ Embedded machine learning is inherently multidisciplinary
➢ Very complex from scratch… But we are helped!
Don't hesitate to contact us!
➢ Questions about this presentation?
  ➢ Miguel Lejeune (miguel.lejeune@sirris.be, +32 490 01 41 44)
  ➢ Vincent Lucas (vincent.lucas@sirris.be, +32 493 31 15 92)
➢ Questions about other technologies/Sirris offerings?
➢ Questions about funding?
  ➢ Bas Rottier (bas.rottier@sirris.be, +32 491 86 91 70)
Thanks!
LINKEDIN.COM/COMPANY/SIRRIS
@SIRRIS_BE
FACEBOOK.COM/SIRRIS.BE
SIRRIS.BE

Presentation - webinar embedded machine learning

  • 2. Table of Contents ➢ Introduction: Why embedded machine learning? ➢ Three main ingredients ➢ Training our model ➢ How to run inference on a Raspberry Pi PICO? ➢ Conclusion
  • 4. Why Machine Learning? ➢ Very good at finding patterns ➢ Less human input needed ➢ Broadly applicable ➢ More processing power ➢ Lots and lots of data ➢ Explicability? Can be a great tool! Not the only tool! I.e., let’s avoid doing it for the fancy factor when traditional computer vision techniques are more suitable! Need machine learning? Machine learning???
  • 5. Today ➢ Machine learning, and in particular deep learning with convolutional networks, is a good tool for image classification ➢ Let’s look into such a classification task… Embedded!
  • 6. Why Machine Learning on EDGE? Low cost; Data stays local; Less space usage; Independent of Internet connection; Low energy consumption; Low latency ✓ Autonomous ✓ Reliable ✓ No bandwidth limitations ✓ Data privacy ✓ Security ✓ Control ✓ Mobile application ✓ No cloud computations ✓ Vehicles, … ✓ Viability of business case
  • 7. Embedded Machine Learning • Only the inference will be embedded • i.e. no on-device training in this presentation
  • 9. Three Ingredients: Hardware, A Framework, A Model
  • 10. Choosing a Framework… …BEFORE THE MICROCONTROLLER. TensorFlow: ➢ Similar accuracy (better with CNN?) ➢ Better for deployment ➢ Harder without Keras, easier with Keras ➢ TF Lite/TFLμ for embedded systems. PyTorch: ➢ Similar accuracy (better with RNN?) ➢ Better for GPU support? ➢ Easier and more pythonic ➢ PyTorch Live/Mobile for ML on smartphones
  • 12. A Link: good or bad news? ➢ Sometimes, you can get the best of both worlds… A SOTA PyTorch model, linked through ONNX to a TF Lite deployment. ONNX • Open Neural Network Exchange • Allows exchange between frameworks • Helps hardware providers with AI optimisation. Let’s say you must use TensorFlow Lite for Microcontrollers (TFLu)… That does not mean you can skip the choice TensorFlow vs PyTorch!
  • 13. Embedded Frameworks ➢ A dedicated embedded framework provides… optimisation/compression; an on-device inference « engine »; getting rid of as much as possible ➢ It is not mandatory ➢ i.e., one could directly use Python on an SBC like a Raspberry Pi
  • 14. Embedded Frameworks: TFLu vs. STM32 CUBE AI
        TFLu                                          | CUBE AI
        Wider availability (not only for STM boards)  | Better performance
        CLI                                           | CLI/GUI
        Interpreter-based                             | Generated C++ code
        Open-source                                   | Tools for testing
        More help available on the internet           | More information on your model
  • 15. Different Networks: MobileNetV2 ➢ Depthwise separable convolution ➢ Thinner model possible (α). Convolutional layer: 4*4*27*2 = 864 operations. Depthwise separable convolution: 4*4*9*3 + 4*4*3*2 = 528 operations
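The operation counts on this slide can be reproduced with a few lines of C++. This is only a sketch: it assumes the slide's figure shows a 4 x 4 output map, a 3 x 3 kernel, 3 input channels and 2 filters (so that 27 = 3*3*3), and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Multiplications of a standard convolution: each output pixel of each
// filter needs kernel * kernel * in_channels products.
std::int64_t standard_conv_ops(int out_h, int out_w, int kernel,
                               int in_channels, int out_channels) {
    assert(out_h > 0 && out_w > 0 && kernel > 0);
    assert(in_channels > 0 && out_channels > 0);
    return static_cast<std::int64_t>(out_h) * out_w * kernel * kernel *
           in_channels * out_channels;
}

// Depthwise separable convolution = depthwise step (one kernel per input
// channel) followed by a pointwise 1 x 1 convolution that mixes channels.
std::int64_t depthwise_separable_ops(int out_h, int out_w, int kernel,
                                     int in_channels, int out_channels) {
    assert(out_h > 0 && out_w > 0 && kernel > 0);
    assert(in_channels > 0 && out_channels > 0);
    const std::int64_t pixels = static_cast<std::int64_t>(out_h) * out_w;
    const std::int64_t depthwise = pixels * kernel * kernel * in_channels;
    const std::int64_t pointwise = pixels * in_channels * out_channels;
    return depthwise + pointwise;
}
```

Under those assumptions, `standard_conv_ops(4, 4, 3, 3, 2)` gives the slide's 864 and `depthwise_separable_ops(4, 4, 3, 3, 2)` gives 528. The saving grows with the number of filters, because the expensive spatial kernel is applied only once per input channel.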
  • 16. Different Networks: ShuffleNet. Group convolution with channel shuffle. Normal convolution: 4*4*54*3 = 2592 operations. Group convolution: 4*4*18*3 = 864 operations
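The group-convolution numbers follow the same pattern. The sketch below assumes a 4 x 4 output map, a 3 x 3 kernel, 6 input channels and 3 filters split into 3 groups (54 = 3*3*6 and 18 = 3*3*2); these dimensions are inferred from the arithmetic, not stated on the slide, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Multiplications of a (grouped) convolution. With `groups` > 1 every filter
// only sees in_channels / groups input channels, which divides the cost.
// groups == 1 is an ordinary convolution.
std::int64_t group_conv_ops(int out_h, int out_w, int kernel,
                            int in_channels, int out_channels, int groups) {
    assert(out_h > 0 && out_w > 0 && kernel > 0 && groups > 0);
    assert(in_channels % groups == 0 && out_channels % groups == 0);
    const int channels_per_filter = in_channels / groups;
    return static_cast<std::int64_t>(out_h) * out_w * kernel * kernel *
           channels_per_filter * out_channels;
}
```

With these values, `group_conv_ops(4, 4, 3, 6, 3, 1)` returns 2592 and `group_conv_ops(4, 4, 3, 6, 3, 3)` returns 864. Because each group only sees its own channels, ShuffleNet adds a channel shuffle so that information still mixes between groups.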
  • 17. Different Networks: COMPARISON (NAS = Neural Architecture Search)
  • 18. Different Hardware ❑ The hardware can be microcontrollers not dedicated to AI ➢ Raspberry Pi PICO (RP2040/Arm Cortex-M0+), STM32 boards, … ❑ But it can be helped with a co-processor…
  • 19. Why NPUs? CPU | GPU, or DSP (Digital Signal Processor), … | NPU (Neural Processing Unit) ✓ Always required ✓ Fast & versatile ✓ Improving every day ✓ Models are smaller ✓ Can be a good stand-alone choice for inference speed in some cases (sequential aspects of Recurrent Neural Networks, small deep networks?) ✓ Better for parallelization ✓ Convolutional networks ✓ Can be much faster, but never alone. Examples: TPU (Tensor Processing Unit) from Google, VPU (Vision Processing Unit) from Intel, Tensor Cores (Nvidia), FPGA (reconfigurable aspects!), …
  • 20. Our Setup: Raspberry Pi 4 (note: the RPI5 has a PCIe port, and the Coral PCIe is 20 €). TensorFlow Lite model & TFL_runtime (smaller Python package with only the TF Lite interpreter). MobileNetV2 (224, 224), full integer quantized model. Coral USB accelerator (60 €): USB 3.0 required (otherwise, little gain due to low data transfer). PICAM
  • 21. Google Coral Edge TPU
  • 22. Three Ingredients: lots of options… A hardware … A framework … A model …
  • 23. Many things exist… Let’s avoid name-dropping! Many options… Let’s pick! Feeling lost… But many available options is also a good thing!
  • 24. Three Ingredients: the path for today! Hardware: Raspberry Pi PICO. Framework: TensorFlow Lite for μControllers. Model: MobileNetV2
  • 26. Training: TF model (dataset, pre-trained MobileNetV2 weights) → TF Lite model (reduction/optimization) → convert to .h file. Inference: TFLu interpreter (TFLu for PICO, PICO SDK, C++ code, Make file) → “Plane!”
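The "convert to .h file" step embeds the serialized model as a byte array that is compiled into the firmware. A minimal host-side sketch of that idea is shown below; the function name `to_c_header`, the array name, the 12-bytes-per-line layout and the `alignas(16)` qualifier are all illustrative assumptions, not the tool used in the webinar (a utility such as `xxd -i` produces similar output).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Render raw model bytes as C/C++ source text that can be saved as a header.
std::string to_c_header(const std::vector<std::uint8_t>& bytes,
                        const std::string& name) {
    assert(!name.empty());
    std::ostringstream out;
    out << "alignas(16) const unsigned char " << name << "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Start a new indented line every 12 bytes, otherwise add a space.
        out << (i % 12 == 0 ? "\n    " : " ");
        out << "0x" << std::hex << std::setfill('0') << std::setw(2)
            << static_cast<int>(bytes[i]);
        if (i + 1 < bytes.size()) out << ",";
    }
    out << "\n};\n";
    out << "const unsigned int " << name << "_len = " << std::dec
        << bytes.size() << ";\n";
    return out.str();
}
```

In practice the byte vector would be filled by reading the .tflite file, and the returned text would be written to a header that the Pico project includes.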
  • 28. Database: SOURCES ➢ Pascal VOC: image classification & object detection; 11,540 images, 20 classes ➢ COCO: object detection; 200,000 images, 80 classes
  • 29. Database: CUSTOM DATASET ➢ 8 classes: airplane, boat, bus, car, motorbike, none, person, train ➢ 800 images per class (600 training + 200 validation) ➢ Image size 224 x 224 ➢ Bounding box: random size (minimum 25% of image), random location
  • 30. Transfer Learning: BUILD NEW APPLICATIONS ➢ Pretrained network: feature extraction; reuse for a different task ➢ Benefits: less data needed; lower training time; better generalization ➢ Fine-tuning. Note: we only re-train a last dense layer for our classification task
  • 31. Training: HYPERPARAMETERS & OVERFITTING ➢ Data augmentation: prevent overfitting; increase dataset size; improve accuracy; images: flip, crop, rotate, zoom, stretch, contrast, brightness… ➢ Batch size: smaller = less overfitting BUT slower training ➢ Learning rate: larger batches = higher learning rate; decrease over time
  • 32. Database: INFLUENCE OF DATA AMOUNT ➢ MobileNetV2 (96 x 96) → remove random training images (same amount per class) ➢ 600 training images per class: accuracy 0.868 ➢ 150 training images per class: accuracy 0.825 ➢ 75 training images per class: accuracy 0.812 ➢ 37 training images per class: accuracy 0.794 ✓ Transfer Learning is Great!
  • 33. Training: EXAMPLE IN TENSORFLOW ➢ Pretrained network: MobileNetV2 (2,257,984 parameters; pretrained on ImageNet, 1,300,000 images) ➢ Classifier: fully connected layer (10,248 parameters) ➢ Batch size = 20 ➢ Learning rate = 0.0002 ➢ Training time ≈ 2 minutes/epoch
  • 34. Comparison: RESNET VS. MOBILENET (224 X 224)
                                 | ResNet50                | MobileNetV2
        Parameters (base model)  | 23,587,712              | 2,257,984
        Parameters (classifier)  | 16,392                  | 10,248
        Training time            | 9 epochs, 6 min/epoch   | 18 epochs, 2 min/epoch
        Size (normal)            | 94 Mb                   | 12 Mb
        Size (TFLite)            | 92 kb                   | 9 kb
        Accuracy                 | 0.979                   | 0.970
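The classifier sizes in this comparison are consistent with a single dense layer on top of the backbone's pooled feature vector. The sketch below assumes 1280 features for MobileNetV2 and 2048 for ResNet50 (the usual widths of these backbones) and the 8 classes of the custom dataset; `dense_layer_params` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Trainable parameters of a fully connected layer:
// one weight per (input, output) pair plus one bias per output.
std::int64_t dense_layer_params(int inputs, int outputs) {
    assert(inputs > 0 && outputs > 0);
    return static_cast<std::int64_t>(inputs) * outputs + outputs;
}
```

Here `dense_layer_params(1280, 8)` yields 10,248 and `dense_layer_params(2048, 8)` yields 16,392, matching the two classifier rows. In both cases only a tiny fraction of the network is trained, which is why transfer learning needs little data and time.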
  • 35. MobileNetV2: COMPARISON
        Input resolution | Scaling factor | Size (TFLite) | Accuracy | F1 score | Inference time (PC)
        224 x 224        | 1              | 8,698 kb      | 0.970    | 0.969    | 11.05 ms
        96 x 96          | 1              | 8,698 kb      | 0.931    | 0.931    | 2.22 ms
        96 x 96          | 0.35           | 1,597 kb      | 0.868    | 0.869    | 0.67 ms
        48 x 48          | 1              | 8,698 kb      | 0.732    | 0.732    | 1.21 ms
        48 x 48          | 0.35           | 1,597 kb      | 0.630    | 0.633    | 0.28 ms
        (Accuracy drop in the 48 x 48 models is partly due to pretrained weights not being available)
  • 36. Comparison: RESNET VS. MOBILENET (224 X 224). ResNet | MobileNetV2
  • 37. MobileNetV2: 224 VS. 96. 224 x 224 (α = 1) | 96 x 96 (α = 0.35)
  • 38. TFLite: COMPRESSED FLATBUFFER FORMAT ➢ Benefits: reduced size; faster inference; includes optimization possibilities ➢ Works out-of-the-box for most models ➢ Not all TensorFlow operations are supported
  • 39. Quantization ➢ Changing the datatype ➢ Like moving from RGB888 to RGB565 or YCbCr422 in computer vision ➢ Can be float16, dynamic range, … ➢ Here, we use 8-bit integers: the range [min, max] is mapped onto [-128, 127] ➢ Post-Training Quantization (PTQ) vs Quantization-Aware Training (QAT) ➢ Model is smaller & faster… at the cost of… accuracy drop?
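The mapping from a floating-point range onto the int8 range [-128, 127] can be sketched as an affine transform with a scale and a zero point. This is a simplified illustration of the general idea, not the converter's actual implementation; the struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// real_value is approximated by scale * (quantized_value - zero_point)
struct QuantParams {
    float scale;
    int zero_point;
};

// Derive scale and zero point from the observed [min, max] of a tensor.
QuantParams choose_quant_params(float min_val, float max_val) {
    assert(min_val < max_val);
    // Widen the range to contain 0 so that 0.0 is exactly representable.
    min_val = std::min(min_val, 0.0f);
    max_val = std::max(max_val, 0.0f);
    const float scale = (max_val - min_val) / 255.0f;
    const int zero_point =
        static_cast<int>(std::lround(-128.0f - min_val / scale));
    return {scale, zero_point};
}

// Float -> int8, saturating at the ends of the int8 range.
std::int8_t quantize(float x, QuantParams p) {
    const long q = std::lround(x / p.scale) + p.zero_point;
    return static_cast<std::int8_t>(std::max(-128L, std::min(127L, q)));
}

// int8 -> float; the difference to the original value is the rounding error.
float dequantize(std::int8_t q, QuantParams p) {
    return p.scale * static_cast<float>(q - p.zero_point);
}
```

For example, a tensor spanning [-64, 63.5] gets a scale of 0.5 and a zero point of 0, so every value is stored with a resolution of 0.5. That rounding is the source of the accuracy drop mentioned on the slide.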
  • 40. Quantization: RESULTS. Model: MobileNetV2 (96 x 96, α = 0.35)
        Quantization                      | Size (TFLite) | Accuracy | F1 score | Inference time (RPI 4)
        None                              | 1,590 kb      | 0.869    | 0.830    | 4.17 ms
        float16                           | 825 kb        | 0.870    | 0.831    | 4.12 ms
        Dynamic range                     | 538 kb        | 0.869    | 0.834    | 4.66 ms
        Full integer                      | 611 kb        | 0.845    | 0.805    | 3.46 ms
        Full integer (quantization aware) | 611 kb        | 0.852    | 0.810    | 3.46 ms
  • 41. Pruning: PRINCIPLE ➢ Weight pruning: gradually zero out weights; based on magnitude, activation, gradient…; intermediate training for recalibration ➢ Structured pruning: remove neurons/filters; α factor in the MobileNetV2 architecture ➢ Reduced size due to efficient compression ➢ Improved inference time (skip zero computations)
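A single magnitude-based pruning step can be sketched as follows. This is a simplified, one-shot illustration under stated assumptions: real pruning raises the sparsity gradually and retrains between steps, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Zero out the `sparsity` fraction of weights with the smallest magnitude.
// Returns how many weights were zeroed; weights tied with the threshold are
// all pruned, so the count can slightly exceed sparsity * size.
std::size_t prune_by_magnitude(std::vector<float>& weights, double sparsity) {
    assert(sparsity >= 0.0 && sparsity <= 1.0);
    const std::size_t to_prune =
        static_cast<std::size_t>(sparsity * static_cast<double>(weights.size()));
    if (to_prune == 0) return 0;

    // Find the magnitude of the to_prune-th smallest weight.
    std::vector<float> magnitudes(weights.size());
    std::transform(weights.begin(), weights.end(), magnitudes.begin(),
                   [](float w) { return std::fabs(w); });
    std::nth_element(magnitudes.begin(), magnitudes.begin() + (to_prune - 1),
                     magnitudes.end());
    const float threshold = magnitudes[to_prune - 1];

    std::size_t pruned = 0;
    for (float& w : weights) {
        if (std::fabs(w) <= threshold) {
            w = 0.0f;
            ++pruned;
        }
    }
    return pruned;
}
```

The zeroed weights still occupy space in the dense tensor, which is why the benefit mainly shows up after compression (the zip sizes reported in the results).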
  • 42. Pruning: RESULTS. Model: MobileNetV2 (96 x 96, α = 0.35)
        Pruning                                   | Size (TFLite) | Size (zip) | Accuracy | F1 score | Inference time (RPI 4)
        None                                      | 1,590 kb      | 1,463 kb   | 0.869    | 0.830    | 4.17 ms
        Dense layer (80%)                         | 1,564 kb      | 1,439 kb   | 0.865    | 0.820    | 3.95 ms
        Dense layer (90%)                         | 1,558 kb      | 1,434 kb   | 0.855    | 0.823    | 3.97 ms
        Dense layer (80%) + 1/3 Conv layers (50%) | 1,191 kb      | 989 kb     | 0.769    | 0.748    | 3.99 ms
        Less accuracy drop when pruning the later stages of the model
  • 43. Pruning and Quantization: COMBINED. Model: MobileNetV2 (96 x 96, α = 0.35)
        Pruning / quantization                                   | Size (TFLite) | Size (zip) | Accuracy | F1 score | Inference time (RPI 4)
        None                                                     | 1,590 kb      | 1,463 kb   | 0.869    | 0.830    | 4.17 ms
        Full integer                                             | 611 kb        |            | 0.845    | 0.805    | 3.46 ms
        Dense layer (80%) + 50 Conv layers (50%)                 | 1,191 kb      | 989 kb     | 0.769    | 0.748    | 3.99 ms
        Dense layer (80%) + 1/3 Conv layers (50%) + Full integer | 611 kb        | 365 kb     | 0.734    | 0.709    | 3.62 ms
  • 44. Weight Clustering: PRINCIPLE ➢ Cluster the weights of a layer into N clusters ➢ The cluster centroid value is shared by all weights in the cluster ➢ Additional fine-tuning possible
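The clustering principle can be illustrated with a small 1-D k-means: weights are grouped, and each weight is then replaced by the centroid of its group. The sketch assumes centroids initialised linearly between the smallest and largest weight and a fixed number of iterations; it is not the TensorFlow implementation, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cluster weights into n_clusters groups and overwrite each weight with its
// cluster centroid. Returns the centroids (the only distinct values left).
std::vector<float> cluster_weights(std::vector<float>& weights,
                                   std::size_t n_clusters,
                                   int iterations = 10) {
    assert(!weights.empty() && n_clusters >= 1 && iterations >= 0);
    const auto mm = std::minmax_element(weights.begin(), weights.end());
    const float lo = *mm.first;
    const float hi = *mm.second;

    // Assumption: linear initialisation between the min and max weight.
    std::vector<float> centroids(n_clusters);
    for (std::size_t k = 0; k < n_clusters; ++k) {
        centroids[k] = (n_clusters == 1)
            ? 0.5f * (lo + hi)
            : lo + (hi - lo) * static_cast<float>(k) /
                   static_cast<float>(n_clusters - 1);
    }

    // Index of the centroid closest to w (lowest index wins ties).
    auto nearest = [&centroids](float w) {
        std::size_t best = 0;
        for (std::size_t k = 1; k < centroids.size(); ++k) {
            if (std::fabs(w - centroids[k]) < std::fabs(w - centroids[best])) {
                best = k;
            }
        }
        return best;
    };

    for (int it = 0; it < iterations; ++it) {
        std::vector<double> sums(n_clusters, 0.0);
        std::vector<std::size_t> counts(n_clusters, 0);
        for (float w : weights) {
            const std::size_t k = nearest(w);
            sums[k] += w;
            ++counts[k];
        }
        // Move every non-empty cluster's centroid to the mean of its members.
        for (std::size_t k = 0; k < n_clusters; ++k) {
            if (counts[k] > 0) {
                centroids[k] = static_cast<float>(sums[k] / counts[k]);
            }
        }
    }

    // Share the centroid value with all weights of the cluster.
    for (float& w : weights) w = centroids[nearest(w)];
    return centroids;
}
```

After clustering, a layer holds at most N distinct values, so the weight tensor compresses well even though its uncompressed size is unchanged.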
  • 45. Weight Clustering: RESULTS. Model: MobileNetV2 (96 x 96, α = 0.35)
        Pruning / clustering      | Size (TFLite) | Size (zip) | Accuracy | F1 score
        None                      | 1,590 kb      | 1,463 kb   | 0.869    | 0.830
        Dense layer (90%)         | 1,558 kb      | 1,434 kb   | 0.855    | 0.823
        Dense layer (16 clusters) | 1,671 kb      | 1,442 kb   | 0.872    | 0.832
        Strip_clustering_wrapper function in TensorFlow not working
  • 46. Overview: INFLUENCE OF DIFFERENT PARAMETERS
        Parameter                  | Accuracy | Size | Inference | Remark
        Decreased input resolution | - -      | =    | ++        | Enough information necessary in pixels
        Decreased model size (α)   | - - -    | ++   | +++       | NAS can limit the accuracy loss
        Full integer quantization  | -        | ++   | +         | Often required for inference on MCU
        Pruning                    | - (-)    | +    | =         | Less impact on later layers
        Weight clustering          | -        | +    | =         |
  • 47. However… OK, I lied. I SAID WE HAD THE WAY AND THEN I COME WITH OTHER COMPARISON TABLES… ❑ We do not need to redo everything from scratch ❑ Many tools, tutorials, etc. are available ❑ A bunch of weights does not mean anything to us humans (hence all the work done on explainable AI), but we do not need to understand them… ➢ Accepting abstraction + using available tools = simpler than it may look? • We don’t need to create new model architectures: MobileNetV2! • We don’t need to implement them: TensorFlow! • We don’t even train most of that: transfer learning!
  • 48. Memory & accuracy are known from the model BUT inference time depends on the hardware! Some results of inference time:
        Model      | RPI4     | Coral (USB 2.0) | Coral     | Coral+
        (96, 96)   | ~3.4 ms  | NA              | ~1.72 ms  | NA
        (224, 224) | ~100 ms  | ~12.6 ms        | ~5 ms     | ~3.3 ms
        Coral vs. RPI4: 2 x faster at (96, 96), 20 x faster at (224, 224)
        Spoiler alert! STM32H747I + MBV2 (res = 96, α = 0.35): 56 ms. PICO + MBV2 (res = 48, α = 0.35): 250 ms
  • 49. How to run inference on a Raspberry Pi PICO?
  • 50. Training: TF model (dataset, pre-trained MobileNetV2 weights) → TF Lite model (reduction/optimization) → convert to .h file. Inference: TFLμ interpreter (TFLu for PICO, PICO SDK, C++ code, Make file) → “Plane!”
  • 52. Content: OVERVIEW ➢ Raspberry Pi Pico (environment: Ubuntu on WSL) 1. Initial setup ➢ CMake file ➢ Pico-sdk library 2. Blinking a LED 3. Run inference ➢ TFLite-micro library ➢ Include model ➢ Execute the code
  • 53. Resources: WHERE TO BEGIN ➢ Datasheet ➢ “Getting Started” ➢ GitHub’s README. We are helped!
  • 54. Initial Setup: FOLDER STRUCTURE ➢ Libraries needed: pico-sdk (software development kit) ➢ Install the toolchain:
        $ sudo apt install cmake gcc-arm-none-eabi libnewlib-arm-none-eabi build-essential
  • 55. Pico-sdk Library: INSTALLATION PROCEDURE ➢ Clone the repository and update the submodules:
        $ git clone https://github.com/raspberrypi/pico-sdk.git --branch master
        $ cd pico-sdk
        $ git submodule update --init
      ➢ Copy pico_sdk_import.cmake from lib/pico-sdk/external to the main folder
      ➢ Update the PICO_SDK_PATH variable:
        $ export PICO_SDK_PATH='<main_folder>/lib/pico-sdk'
  • 56. Pico-sdk Library: PATH VARIABLE
  • 58. Blinking a LED: MAIN.CPP
    #include <stdio.h>
    #include "pico/stdlib.h"
    #include "hardware/gpio.h"
    #include "pico/binary_info.h"

    /* As per Raspberry Pi Pico pinout documentation. */
    #define LED_PIN 28

    /* Program entry point. */
    int main() {
        /* Initialisation of the standard lib for input/output. */
        stdio_init_all();

        /* Initialisation of the LED pin as an output pin, with LOW initial value. */
        gpio_init(LED_PIN);
        gpio_set_dir(LED_PIN, GPIO_OUT);
        gpio_put(LED_PIN, 0);

        /* Forever loop. */
        while (true) {
            /* Blinking the LED. */
            gpio_put(LED_PIN, 1);
            sleep_ms(1000);
            gpio_put(LED_PIN, 0);
            sleep_ms(1000);
        }

        /* Unreachable code. */
        return 0;
    }
  • 59. Blinking a LED: CMAKE FILE
    cmake_minimum_required(VERSION 3.12)
    include(pico_sdk_import.cmake)
    project(picoDemo C CXX ASM)
    set(CMAKE_C_STANDARD 11)
    set(CMAKE_CXX_STANDARD 17)
    pico_sdk_init()
    add_compile_options(-Wall -Wno-format -Wno-unused-function -Wno-maybe-uninitialized)
    add_executable(picoDemo src/main.cpp)
    # pico-sdk library
    target_link_libraries(${PROJECT_NAME} pico_stdlib)
    # Use the USB connection for communication
    pico_enable_stdio_usb(picoDemo 1)
    pico_enable_stdio_uart(picoDemo 0)
    # uf2 file format, to easily flash to the Pico
    pico_add_extra_outputs(picoDemo)
  • 60. Blinking a LED: BUILDING THE PROJECT
    ➢ Navigate to the build folder
        ➢ $ cmake ..
        ➢ $ make
    ➢ Copy the created .uf2 file to the PICO to flash the program
  • 62. Pico-tflmicro: CLONE REPOSITORY
    ➢ Navigate to the lib folder
        ➢ $ git clone https://github.com/raspberrypi/pico-tflmicro.git
    ➢ This is a repository that:
        ➢ Includes the TensorFlow library (https://github.com/tensorflow/tflite-micro)
        ➢ BUT is already configured for the Pico
    ➢ Later we will show how to configure the TensorFlow library directly
  • 63. Folder structure
  • 64. Pico-tflmicro: INCLUDE IN PROJECT
    ➢ In main.cpp add
        #include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
        #include "tensorflow/lite/micro/tflite_bridge/micro_error_reporter.h"
        #include "tensorflow/lite/micro/micro_interpreter.h"
        #include "tensorflow/lite/schema/schema_generated.h"
    ➢ In CMakeLists.txt add
        target_link_libraries(${PROJECT_NAME} pico_stdlib pico-tflmicro)
        add_subdirectory("lib/pico-tflmicro" EXCLUDE_FROM_ALL)
  • 65. Run Inference: INCLUDE TEST IMAGE
    ➢ Test image: the actual 48x48 image used for inference
    ➢ Included as image.h (an array with the data); eventually the image should come from a camera
  • 66. Run Inference: INCLUDE MODEL
    ➢ A .tflite model can be converted to a .h file with a command
        ➢ $ xxd -i model_name.tflite > new_model_name.h
    ➢ Open the generated header and declare the array as const unsigned char!
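As a rough sketch of what the edited header ends up containing: `xxd -i` derives the array name from the file name (`model_name.tflite` becomes `model_name_tflite`, plus a `model_name_tflite_len` variable). The array below is a hypothetical 16-byte stand-in, not a real model, and `looks_like_tflite` is an illustrative helper that is not part of any library. It only checks the "TFL3" file identifier that TFLite flatbuffers carry at byte offset 4, which can catch a truncated or wrongly converted file before flashing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the array generated by `xxd -i`; a real model is
// many kilobytes long. Only bytes 4..7 (the identifier "TFL3") matter here.
// `const` keeps the data in flash instead of copying it to RAM.
alignas(16) const unsigned char example_model_tflite[] = {
    0x1c, 0x00, 0x00, 0x00, 'T',  'F',  'L',  '3',
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
const unsigned int example_model_tflite_len = sizeof(example_model_tflite);

// Illustrative helper (an assumption, not a TFLite-micro API): returns true
// when the buffer is long enough and carries the TFLite file identifier.
bool looks_like_tflite(const unsigned char* data, std::size_t len) {
    assert(data != nullptr);
    return len >= 8 && std::memcmp(data + 4, "TFL3", 4) == 0;
}
```

Such a check could run once at start-up, before the array is handed to `tflite::GetModel`.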
  • 67. Run Inference: MAIN.CPP – NECESSARY INCLUDES
    #include <stdio.h>
    #include "pico/stdlib.h"
    #include "hardware/gpio.h"
    #include "pico/binary_info.h"

    /* Specific includes for TensorFlow Lite for microcontrollers. */
    #include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
    #include "tensorflow/lite/micro/tflite_bridge/micro_error_reporter.h"
    #include "tensorflow/lite/micro/micro_interpreter.h"
    #include "tensorflow/lite/schema/schema_generated.h"

    /* The image that will be tested. */
    #include "image.h"

    /* The trained model, converted for TFLμ, within a C header file. */
    #include "../models/model.h"

    /* As per Raspberry Pi Pico pinout documentation. */
    #define LED_PIN 28

    /* Resolver with room for the 9 operations the model uses. The alias name is
       kept from the former AllOpsResolver, which registered all 128 operations. */
    using AllOpsResolver_t = tflite::MicroMutableOpResolver<9>;
  • 68. Run Inference: MAIN.CPP – FUNCTIONS TO PREPROCESS THE IMAGE (NEEDED FOR CORRECT INPUT TO THE MODEL)
    /* Rescaling to perform operations on data of similar scale.
       The offset is added, so scale = 1/127.5 and offset = -1 map [0, 255] to [-1, 1]. */
    float rescaling(float x, float scale, float offset) {
        return (x * scale) + offset;
    }

    /* Quantization procedure, i.e. moving from a number represented with floats
       to a number represented with int8. */
    int8_t quantize(float x, float scale, float zero_point) {
        return static_cast<int8_t>((x / scale) + zero_point);
    }
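The quantization step follows the affine mapping q = x / scale + zero_point, with x ≈ scale * (q - zero_point) in the other direction. The sketch below is a more defensive variant, not the code from the talk: it rounds to nearest instead of truncating and saturates to the int8 range, so an out-of-range input cannot wrap around. The names `quantize_saturating` and `dequantize` are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Affine quantization: q = round(x / scale) + zero_point, saturated to int8.
// Illustrative variant of the slide's quantize() (an assumption, not the
// talk's code): adds rounding and clamping.
std::int8_t quantize_saturating(float x, float scale, std::int32_t zero_point) {
    assert(scale > 0.0f);
    const long q = std::lround(x / scale) + static_cast<long>(zero_point);
    const long clamped = std::max<long>(-128, std::min<long>(127, q));
    return static_cast<std::int8_t>(clamped);
}

// Inverse mapping: x is approximately scale * (q - zero_point).
float dequantize(std::int8_t q, float scale, std::int32_t zero_point) {
    return scale * static_cast<float>(static_cast<std::int32_t>(q) - zero_point);
}
```

With scale = 0.5 and zero_point = 3, the value 1.0 quantizes to 5 and dequantizes back to 1.0, while very large inputs saturate at 127.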
  • 69. Run Inference: MAIN.CPP – INITIALIZE LED, LABELS AND IMAGE SIZE
    /* Program entry point. */
    int main() {
        /* Initialisation of the standard lib for input/output. */
        stdio_init_all();

        /* Initialisation of the LED pin, as an output pin, with LOW initial value. */
        gpio_init(LED_PIN);
        gpio_set_dir(LED_PIN, GPIO_OUT);
        gpio_put(LED_PIN, 0);

        /* Image dimensions (48,48) on 3 channels (RGB). */
        int Npix = 48;
        int Nchan = 3;
        int Nlabels = 8;

        /* The 8 possible labels for the classifier as strings for the serial output. */
        const char *label[] = {"aeroplane", "boat", "bus", "car", "motorbike", "none", "person", "train"};
  • 70. Run Inference: MAIN.CPP – INITIALIZE TFLITE-MICRO OBJECTS
        /* Initialisation of the TFLμ interpreter. */
        static const tflite::Model* tflu_model = nullptr;
        static tflite::MicroInterpreter* tflu_interpreter = nullptr;
        static TfLiteTensor* tflu_i_tensor = nullptr;
        static TfLiteTensor* tflu_o_tensor = nullptr;

        /* The ops resolver and error reporter. */
        static AllOpsResolver_t op_resolver;
        static tflite::MicroErrorReporter micro_error_reporter;
        tflite::ErrorReporter* error_reporter = &micro_error_reporter;
  • 71. Run Inference: MAIN.CPP – ADD THE OPERATIONS INCLUDED IN YOUR MODEL
        op_resolver.AddConv2D(tflite::Register_CONV_2D_INT8());
        op_resolver.AddDepthwiseConv2D(tflite::Register_DEPTHWISE_CONV_2D_INT8());
        op_resolver.AddPad();
        op_resolver.AddAdd(tflite::Register_ADD_INT8());
        op_resolver.AddRelu6();
        op_resolver.AddMean();
        op_resolver.AddSoftmax(tflite::Register_SOFTMAX_INT8());
        op_resolver.AddFullyConnected(tflite::Register_FULLY_CONNECTED_INT8());
        op_resolver.AddDequantize();
  • 72. Run Inference: MAIN.CPP – MORE INITIALIZATION + INCLUDING THE MODEL
        /* Allocation of the tensor arena, in the HEAP. */
        constexpr int tensor_arena_size = 144000;
        uint8_t *tensor_arena = nullptr;
        tensor_arena = (uint8_t *)malloc(tensor_arena_size);

        /* Initialising the scaling values. */
        float scaling_scale = 1.0f/127.5f;
        float scaling_offset = -1.0f;

        /* Retrieving the model from the header file. */
        tflu_model = ::tflite::GetModel(mobilenet48_int_input_tflite);

        /* Creating the interpreter and allocating tensors. */
        static tflite::MicroInterpreter static_interpreter(tflu_model, op_resolver, tensor_arena, tensor_arena_size);
        tflu_interpreter = &static_interpreter;
        TfLiteStatus allocate_status = tflu_interpreter->AllocateTensors();
        if (allocate_status != kTfLiteOk) {
            printf("Issue when allocating the tensors.\r\n");
        }
  • 73. Run Inference: MAIN.CPP – LINK INPUT/OUTPUT AND GET QUANTIZATION PARAMETERS
        /* Linking the interpreter to the input/output tensors. */
        tflu_i_tensor = tflu_interpreter->input(0);
        tflu_o_tensor = tflu_interpreter->output(0);

        /* Retrieving the quantization parameters from the model. */
        const auto* i_quantization = reinterpret_cast<TfLiteAffineQuantization*>(tflu_i_tensor->quantization.params);
        float tfluQuant_scale = i_quantization->scale->data[0];
        int32_t tfluQuant_zeropoint = i_quantization->zero_point->data[0];

        /* Indices and working values initialization. */
        int idx = 0;
        float value = 0;
        float value_scaled = 0;
        float value_quant = 0;
        int idx_tf = 0;
  • 74. Run Inference: MAIN.CPP – DO RESCALING AND QUANTIZATION AND GIVE IT AS INPUT TO THE MODEL
        /* Forever loop. */
        while (true) {
            /* Blinking the LED. */
            gpio_put(LED_PIN, 1);
            sleep_ms(1000);
            gpio_put(LED_PIN, 0);
            sleep_ms(1000);

            /* Preparing the input. */
            for (int i(0); i<Npix; i++) {
                for (int j(0); j<Npix; j++) {
                    for (int k(0); k<Nchan; k++) {
                        /* Compute the 1D index. */
                        idx = k*Npix*Npix + j*Npix + i;
                        value = test_image[idx];

                        /* Rescale, then quantize the result. */
                        value_scaled = rescaling(value, scaling_scale, scaling_offset);
                        value_quant = quantize(value_scaled, tfluQuant_scale, tfluQuant_zeropoint);

                        /* Put the result in the input tensor. */
                        tflu_i_tensor->data.int8[idx] = value_quant;
                    }
                }
            }
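The loop above reads and writes the same 1D index, so it copies the image element by element and relies on image.h already storing the pixels in the order the model expects. The transcript does not say which order image.h uses. TFLite image models normally take interleaved (height, width, channel) input, so if a source image were stored plane by plane, a reordering step along these lines would be needed first. This is a sketch under that assumption; all three function names are illustrative, and treating `j` as the row and `i` as the column is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar (channel-major) index, the same expression as in the slide's loop:
// k selects the channel plane, j the row, i the column (assumed meaning).
std::size_t planar_index(std::size_t i, std::size_t j, std::size_t k, std::size_t n) {
    return k * n * n + j * n + i;
}

// Interleaved (row, column, channel) index, the usual TFLite image layout.
std::size_t interleaved_index(std::size_t row, std::size_t col, std::size_t ch,
                              std::size_t n, std::size_t channels) {
    return (row * n + col) * channels + ch;
}

// Reorders a planar n x n image into interleaved order.
std::vector<std::uint8_t> planar_to_interleaved(const std::vector<std::uint8_t>& src,
                                                std::size_t n, std::size_t channels) {
    assert(src.size() == n * n * channels);
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t row = 0; row < n; ++row) {
        for (std::size_t col = 0; col < n; ++col) {
            for (std::size_t ch = 0; ch < channels; ++ch) {
                dst[interleaved_index(row, col, ch, n, channels)] =
                    src[planar_index(col, row, ch, n)];
            }
        }
    }
    return dst;
}
```

If image.h is already interleaved, no reordering is needed and a single flat loop over all `Npix * Npix * Nchan` values does the same job as the triple loop.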
  • 75. Run Inference: MAIN.CPP – RUN INFERENCE AND PRINT RESULT
            /* Call the interpreter to infer a label. */
            TfLiteStatus invoke_status = tflu_interpreter->Invoke();
            if (invoke_status != kTfLiteOk) {
                printf("Issue when invoking the interpreter.\r\n");
            }

            /* Print the probabilities for each label over the serial communication. */
            printf("Result: [%f; %f; %f; %f; %f; %f; %f; %f].\n",
                   tflu_o_tensor->data.f[0], tflu_o_tensor->data.f[1],
                   tflu_o_tensor->data.f[2], tflu_o_tensor->data.f[3],
                   tflu_o_tensor->data.f[4], tflu_o_tensor->data.f[5],
                   tflu_o_tensor->data.f[6], tflu_o_tensor->data.f[7]);

            /* Retrieve the result with maximum likelihood.
               The bound is strict: valid indices are 0 to Nlabels-1. */
            size_t ix_max = 0;
            float pb_max = 0;
            for (size_t ix = 0; ix < (size_t)Nlabels; ix++) {
                if (tflu_o_tensor->data.f[ix] > pb_max) {
                    ix_max = ix;
                    pb_max = tflu_o_tensor->data.f[ix];
                }
            }

            /* Print the most likely label over the serial communication. */
            printf("Result of inference: %s with proba %f.\n", label[ix_max], pb_max);
        }

        /* Unreachable code. */
        return 0;
    }
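The label selection is an argmax over the output probabilities. Pulled out as a standalone function it can be checked on a PC without the board. This is a minimal sketch; `argmax` is an illustrative name and not part of TFLite-micro.

```cpp
#include <cassert>
#include <cstddef>

// Returns the index of the largest score among the first n entries.
// On ties, the first (lowest) index wins.
std::size_t argmax(const float* scores, std::size_t n) {
    assert(scores != nullptr && n > 0);
    std::size_t best = 0;
    // The loop stays strictly below n, so it never reads past the array.
    for (std::size_t ix = 1; ix < n; ++ix) {
        if (scores[ix] > scores[best]) {
            best = ix;
        }
    }
    return best;
}
```

In the firmware it would be called as `argmax(tflu_o_tensor->data.f, Nlabels)`, which is valid here because the model ends with a Dequantize operation and therefore produces float outputs.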
  • 77. TFLite-micro: RESULT
    ➢ Result:
        ➢ aeroplane: 0.488
        ➢ boat: 0.446
        ➢ bus: 0.012
        ➢ car: 0.023
        ➢ motorbike: 0.000
        ➢ none: 0.004
        ➢ person: 0.004
        ➢ train: 0.012
  • 79. Three GitHub repositories
    ➢ TensorFlow
    ➢ tflite-micro: just a subset
    ➢ pico-tflmicro: library already prepared for the Pico
    Route 1: pico-tflmicro, used in the demo until now
    Route 2: tflite-micro, the alternative route, shown in the next slides
  • 80. TFLite-Micro: ONLY USING THE TENSORFLOW GITHUB
    ➢ Navigate to the lib folder
        ➢ $ git clone https://github.com/tensorflow/tflite-micro.git
        ➢ $ cd tflite-micro
        ➢ $ make -f tensorflow/lite/micro/tools/make/Makefile TARGET=cortex_m_generic TARGET_ARCH=cortex-m0plus OPTIMIZED_KERNEL_DIR=cmsis_nn microlite
    ➢ The TARGET and TARGET_ARCH are specific to the hardware (PICO in this case)
  • 81. TFLite-Micro: CMAKELISTS.TXT
    add_compile_definitions(TF_LITE_STATIC_MEMORY=1)
    # tflite-micro library
    target_link_libraries(${PROJECT_NAME} pico_stdlib $ENV{TFLITE_MICRO_PATH}/gen/cortex_m_generic_cortex-m0plus_default/lib/libtensorflow-microlite.a)
    # Includes needed for the code in main.cpp
    # (target_include_directories is the command that accepts a target name and PRIVATE)
    target_include_directories(${PROJECT_NAME} PRIVATE $ENV{TFLITE_MICRO_PATH}/)
    target_include_directories(${PROJECT_NAME} PRIVATE $ENV{TFLITE_MICRO_PATH}/tensorflow/lite/micro/tools/make/downloads/flatbuffers/include/)
    target_include_directories(${PROJECT_NAME} PRIVATE $ENV{TFLITE_MICRO_PATH}/tensorflow/lite/micro/tools/make/downloads/gemmlowp/)
  • 82. TFLite-Micro: MAIN.CPP
    ➢ Most of the code in main.cpp is the same
    ➢ The following slides show the things that change
  • 83. TFLite-Micro: MAIN.CPP
    ➢ Main.cpp:
        #include "tensorflow/lite/micro/tflite_bridge/micro_error_reporter.h"
        #include "tensorflow/lite/micro/micro_interpreter.h"
        #include "tensorflow/lite/schema/schema_generated.h"
        #include "tensorflow/lite/micro/system_setup.h"
        #include "tensorflow/lite/micro/micro_log.h"
        #include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
  • 84. TFLite-Micro: MAIN.CPP
    ➢ Main.cpp:
        const tflite::Model* model = nullptr;
        tflite::MicroInterpreter* interpreter = nullptr;
        TfLiteTensor* input = nullptr;
        constexpr int kTensorArenaSize = 160000;
        alignas(16) static uint8_t tensor_arena[kTensorArenaSize];
        tflite::InitializeTarget();                            /* added */
        static tflite::MicroMutableOpResolver<9> op_resolver;  /* replaces AllOpsResolver */
  • 85. TFLite-Micro: MAIN.CPP
    ➢ Main.cpp:
        op_resolver.AddConv2D(tflite::Register_CONV_2D_INT8());
        op_resolver.AddDepthwiseConv2D(tflite::Register_DEPTHWISE_CONV_2D_INT8());
        op_resolver.AddPad();
        op_resolver.AddAdd(tflite::Register_ADD_INT8());
        op_resolver.AddRelu6();
        op_resolver.AddMean();
        op_resolver.AddSoftmax(tflite::Register_SOFTMAX_INT8());
        op_resolver.AddFullyConnected(tflite::Register_FULLY_CONNECTED_INT8());
        op_resolver.AddDequantize();
  • 87. Next steps
    We want to have the inference live, with images coming from a ~5€ camera
    ➢ OV7670 sensor
    Get images but keep the corresponding cost to a minimum?
    ➢ Counting on the CPU (SW)?
        • Efficient memory usage
        • Fusing operations
        • …
    ➢ Not counting on the CPU (HW)?
        • PIO
        • DMA (direct memory access)
  • 88. PIO: PROGRAMMABLE INPUT/OUTPUT
    ➢ A PIO instance contains 4 state machines
    ➢ A state machine is like a tiny processor that can execute a limited set of instructions
    ➢ The CPU loads the corresponding instructions and enables/disables the state machines
    ➢ It's a way to delegate some workload away from the CPU, e.g. communication with the camera
    ➢ We wanted to show a quick example where the Raspberry Pi PICO blinks a LED without using the CPU but… time is ticking!
  • 89. Edge AI is Multidisciplinary
    Three disciplines: Hardware, Data science, Software
    Three situations:
    ➢ A: Wanting a first inference from scratch…
    ➢ B: A need is defined, hence technical specifications must be met
    ➢ C: Seeking to be state of the art…
    Verdicts shown on the slide:
    ➢ Bad news. All 3 needed.
    ➢ Bad news. Mastering all 3 needed.
    ➢ Good news. Insisting on the most comfortable one(s).
    Usually what must be achieved?
  • 90. Edge AI issues
    Multidisciplinary expertise required (Hardware, Data science, Software)… and…
    ➢ Fast evolving field (new models, new frameworks, new HW…)
    ➢ Many constraints to respect for the solution:
        ➢ Technical (inference time, memory usage, …)
        ➢ Human (appropriate to user expertise)
        ➢ Business (at an OK cost)
  • 91. Conclusion
    ➢ Beware of name dropping
    ➢ First a need, then embedded machine learning is a possibility (and technical specifications are the finish line)
    ➢ Embedded machine learning is inherently multidisciplinary
    ➢ Very complex from scratch… But we are helped!
  • 92. Don’t hesitate to contact us!
    ➢ Questions about this presentation?
        ➢ Miguel Lejeune (miguel.lejeune@sirris.be, +32 490 01 41 44)
        ➢ Vincent Lucas (vincent.lucas@sirris.be, +32 493 31 15 92)
    ➢ Questions about other technologies/Sirris offerings?
    ➢ Questions about funding?
        ➢ Bas Rottier (bas.rottier@sirris.be, +32 491 86 91 70)
  • 93. Thanks!