Project: Micro-speech Recognition
Phase 2: Deploy to a Microcontroller
Command Recognizer: “Yes” / “No”
Command Recognizer
Recognize what people said.
Training: .wav data → FFT → FFT features → trained Command Recognizer model
Inference: get .wav data → FFT → FFT features → Command Recognizer model → “Yes”
https://bit.ly/2XBdE4q
Overall flow of this project
ADC → PCM (audio_provider)
PCM → FFT and pre-processing → audio spectrum (feature_provider, PopulateFeatureData)
Audio spectrum → copy into the input tensor → CNN model (Interpreter Invoke()) → softmax → output tensor (silence, unknown, yes, no)
Output tensor → RecognizeCommands::ProcessLatestResults → RespondToCommand
How to get audio features?
The audio features themselves are a two-dimensional array, made up of horizontal slices representing the frequencies at one point in time, stacked on top of each other to form a spectrogram showing how those frequencies changed over time.
Fourier transform on sound: frequencies in sound
The magnitude spectrum of the signal
A magnitude spectrogram is a visualization of the frequencies in sound over time, and can be useful as a feature for neural-network recognition of noise or speech.
Examine the spectrogram “audio images”
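To make the idea of a magnitude spectrum concrete, here is a minimal sketch that computes one with a naive O(N^2) discrete Fourier transform. This is a swapped-in textbook method for illustration only; the function names `MagnitudeSpectrum` and `PeakBin` are hypothetical and are not taken from the micro_speech sources, which use their own FFT-based feature generator.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive DFT (illustration only): returns |X[k]| for k = 0 .. N/2.
std::vector<double> MagnitudeSpectrum(const std::vector<double>& x) {
  assert(!x.empty());
  const std::size_t n = x.size();
  const double kTwoPi = 6.283185307179586;
  std::vector<double> magnitude(n / 2 + 1);
  for (std::size_t k = 0; k <= n / 2; ++k) {
    double re = 0.0;
    double im = 0.0;
    for (std::size_t t = 0; t < n; ++t) {
      const double angle =
          kTwoPi * static_cast<double>(k * t) / static_cast<double>(n);
      re += x[t] * std::cos(angle);
      im -= x[t] * std::sin(angle);
    }
    magnitude[k] = std::sqrt(re * re + im * im);
  }
  return magnitude;
}

// Index of the strongest frequency bin in a magnitude spectrum.
std::size_t PeakBin(const std::vector<double>& magnitude) {
  assert(!magnitude.empty());
  std::size_t best = 0;
  for (std::size_t k = 1; k < magnitude.size(); ++k) {
    if (magnitude[k] > magnitude[best]) best = k;
  }
  return best;
}
```

Stacking one such spectrum per time window, row by row, gives the two-dimensional spectrogram described above.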
The audio spectrum represents the audio features
You can see how the 30-ms sample window is moved forward by 20 ms each time until it has covered the full one-second sample.
Each FFT row represents a 30-ms sample of audio split into 40 frequency buckets.
Feature buffer (1 second): 49 rows x 40 columns. We combine the results of running the FFT on 49 consecutive 30-ms slices of audio, and this is passed into the model.
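Number of slices: $\mathrm{int}\left(\dfrac{\text{length} - \text{window\_size}}{\text{stride}}\right) + 1$
With length = 1000 ms, window_size = 30 ms and stride = 20 ms: int(970 / 20) + 1 = 49
Audio covered by the 49 slices: 30 + 48 * 20 = 990 ms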
Each slice is produced by running an FFT across a 30-ms section of the audio sample data.
Audio Recognition Model (CNN Model)
1 second of audio = a 40x49-pixel image
Image → CNN model → silence / unknown / yes / no
Our Model (CNN model)
Input: shape (1,49,40,1), type int8; the 1-second audio spectrogram (49x40)
Input size: (1x49x40) x 1 byte = 1960 bytes
Output: shape (1,4), type int8 (-128~127)
Output indices: 0 = silence, 1 = unknown, 2 = yes, 3 = no
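The tensor shapes above translate into a few constants and a small argmax over the four int8 scores. This is a sketch under assumptions: `kCategoryCount`, `kCategoryLabels` and `TopCategory` are names introduced here for illustration, and the label order follows the silence, unknown, yes, no order used on the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int kFeatureSliceCount = 49;  // rows: time slices
constexpr int kFeatureSliceSize = 40;   // columns: frequency buckets
constexpr int kCategoryCount = 4;       // assumed name for the output width

// Assumed label order, matching the order shown on the slides.
constexpr const char* kCategoryLabels[kCategoryCount] = {"silence", "unknown",
                                                         "yes", "no"};

// Input tensor (1,49,40,1) of int8: one byte per element.
constexpr std::size_t kInputBytes =
    1 * kFeatureSliceCount * kFeatureSliceSize * 1 * sizeof(std::int8_t);
static_assert(kInputBytes == 1960, "input tensor should occupy 1960 bytes");

// Returns the index of the highest of the four int8 scores (-128..127).
int TopCategory(const std::int8_t* scores) {
  assert(scores != nullptr);
  int best = 0;
  for (int i = 1; i < kCategoryCount; ++i) {
    if (scores[i] > scores[best]) best = i;
  }
  return best;
}
```

A single top score is noisy on streaming audio, which is why the deck later averages results over a window before reporting a command.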
Project File Structure
tensorflow/lite/micro/examples/micro_speech
main_functions.cc → main program built on the TensorFlow Lite framework
recognize_commands.cc → post-processes the inference results
micro_features/model.cc → TFLite model
XXXX_test.cc → files whose names end in _test.cc are test programs that can run on the development host
arduino, sparkfun_edge, zephyr_riscv, ... → hardware-specific files; if TARGET=XXX is specified at build time, the files in that folder replace the original files
├── sparkfun_edge
|   ├── command_responder.cc
|   └── audio_provider.cc (GetAudioSamples())
├── micro_features (GenerateMicroFeatures())
Project Flow (program flow)
ADC → PCM: sparkfun_edge/audio_provider.cc, GetAudioSamples()
PCM → audio spectrum: GenerateMicroFeatures() performs the FFT and returns the audio frequency information.
Feature buffer (1-second window): kFeatureSliceCount = 49, kFeatureSliceSize = 40, kFeatureElementCount = 49 x 40
Audio spectrum → model input: feature_provider.cc, FeatureProvider::PopulateFeatureData
The Feature Provider
main_functions.cc, feature_provider.cc
The feature provider converts raw audio, obtained from the audio provider, into spectrograms that can be fed into our model. It is called during the main loop.
FeatureProvider::PopulateFeatureData(): fills the feature data with information from audio inputs, and returns how many feature slices were updated.
PopulateFeatureData() (feature_provider.cc)
Each call works on one second of audio (the 1-second window), but the FFT does not need to be recomputed for the whole window every time. It is computed only for the new audio slices, which saves computation and time.
For each new slice, it first requests audio for that slice from the audio provider using GetAudioSamples(), and then it calls GenerateMicroFeatures() to perform the FFT and return the audio frequency information.
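The incremental update described above can be sketched as a shift-and-append over the feature buffer. This is an illustration under assumptions, not the code from feature_provider.cc: `UpdateFeatureBuffer` and the `SliceFn` callback are hypothetical, with the callback standing in for the GetAudioSamples() plus GenerateMicroFeatures() pair.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr int kFeatureSliceCount = 49;
constexpr int kFeatureSliceSize = 40;
constexpr int kFeatureElementCount = kFeatureSliceCount * kFeatureSliceSize;

// Hypothetical stand-in for "fetch audio for this slice and run the FFT".
using SliceFn = void (*)(int slice_number, std::int8_t* out_slice);

// Keeps the slices that are still inside the 1-second window, recomputes only
// the new ones, and returns how many slices were updated.
int UpdateFeatureBuffer(std::int8_t* feature_data, int new_slices,
                        int first_new_slice_number, SliceFn compute_slice) {
  assert(feature_data != nullptr && compute_slice != nullptr);
  if (new_slices <= 0) return 0;
  if (new_slices > kFeatureSliceCount) new_slices = kFeatureSliceCount;
  const int kept = kFeatureSliceCount - new_slices;
  // Shift the slices that remain valid toward the start of the buffer.
  std::memmove(feature_data, feature_data + new_slices * kFeatureSliceSize,
               static_cast<std::size_t>(kept) * kFeatureSliceSize);
  // Compute features only for the newly arrived slices at the end.
  for (int i = 0; i < new_slices; ++i) {
    compute_slice(first_new_slice_number + i,
                  feature_data + (kept + i) * kFeatureSliceSize);
  }
  return new_slices;
}
```

With a 20 ms stride, a main loop that runs every few tens of milliseconds only pays for one or two FFTs per iteration instead of 49.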
feature_provider.cc, micro_features/micro_model_settings.h
Within the 1-second window, each new slice takes audio_samples (audio_samples_size: 512) through the FFT and stores the result in feature_data_.
The Audio Provider (sparkfun_edge/audio_provider.cc)
GetAudioSamples() is expected to return an array of 14-bit pulse-code modulated (PCM) audio data.
audio_samples (size: 512) is taken from the audio stream (time axis: 20 ms, 40 ms, 60 ms, 80 ms, 100 ms) and passed to the FFT.
Digital audio format: 14-bit PCM (pulse-code modulation)
kAudioSampleFrequency = 16 kHz → 16000 samples/second = 16 samples per 1 ms
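The sample-rate arithmetic can be written out as below. This is a small sketch with hypothetical helper names (`SamplesForDuration`, `NextPowerOfTwo`). Relating the 512-sample buffer to a 30 ms window (480 samples) rounded up to a power of two is an inference from the numbers in the deck, not something the deck states.

```cpp
#include <cassert>

constexpr int kAudioSampleFrequency = 16000;  // Hz, as given on the slide

// Number of PCM samples contained in duration_ms of audio.
int SamplesForDuration(int duration_ms) {
  assert(duration_ms >= 0);
  return duration_ms * (kAudioSampleFrequency / 1000);
}

// Smallest power of two that is greater than or equal to n.
int NextPowerOfTwo(int n) {
  assert(n > 0);
  int power = 1;
  while (power < n) power *= 2;
  return power;
}
```

One millisecond is 16 samples, a 30 ms slice is 480 samples, and the next power of two above 480 is 512.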
Generating the Sample Rate for the ADC (trigger frequency)
am_hal_ctimer_period_set(3, AM_HAL_CTIMER_TIMERA, 750, 0);
12 MHz / 750 = 16 kHz (sampling rate)
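The relation between the timer clock, the period value and the sampling rate is a single division. The sketch below only reproduces that arithmetic; `TriggerFrequencyHz` and `TimerPeriodTicks` are hypothetical helpers and do not touch the hardware timer.

```cpp
#include <cassert>

// Trigger frequency produced by a timer that fires once every period_ticks
// ticks of a clock running at timer_clock_hz.
int TriggerFrequencyHz(int timer_clock_hz, int period_ticks) {
  assert(period_ticks > 0);
  return timer_clock_hz / period_ticks;
}

// Period value needed to obtain sample_rate_hz from the same clock.
int TimerPeriodTicks(int timer_clock_hz, int sample_rate_hz) {
  assert(sample_rate_hz > 0);
  return timer_clock_hz / sample_rate_hz;
}
```

With a 12 MHz clock, a period of 750 gives 16000 triggers per second; going the other way, a 16 kHz target yields the period 750 that is passed to am_hal_ctimer_period_set().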
Hardware path (audio_provider.cc)
MIC0 and MIC1 (GPIO11/ADC2, GPIO29/ADC1) → 14-bit ADC → FIFO → DMA → 32K SRAM
Timer A3 (12 MHz clock) triggers the ADC periodically; the ADC is set up in repeat scan mode.
Each FIFO entry holds the slot number plus the sampled data.
Microphone inputs: GPIO29/ADC1, GPIO11/ADC2
The channel select bit field specifies which one of the analog multiplexer channels will be used for the conversions requested for an individual slot.
When each active slot obtains a sample from the ADC, it is added to the value in its accumulator.
All slots write their accumulated results to the FIFO.
sparkfun_edge/audio_provider.cc
Audio data is transferred by DMA (ui32TargetAddress) into the ADC sample buffers g_ui32ADCSampleBuffer0[kAdcSampleBufferSize] and g_ui32ADCSampleBuffer1[kAdcSampleBufferSize].
kAdcSampleBufferSize = 2 slots * 1024 samples per slot
ADC data (slot 1 + slot 2): each entry carries ui32Slot and ui32Sample.
When the ADC interrupt occurs, the samples are copied (size: kAdcSampleBufferSize) into g_audio_capture_buffer (16000 samples):
g_audio_capture_buffer[g_audio_capture_buffer_start] = temp.ui32Sample;
GetAudioSamples(int start_ms, int duration_ms) copies duration_ms of audio (30 ms of PCM audio data) into g_audio_output_buffer (512 samples).
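GetAudioSamples()
Copies the samples between start_ms and start_ms + duration_ms from g_audio_capture_buffer (16000 samples) into g_audio_output_buffer[kMaxAudioSampleSize].
kMaxAudioSampleSize = 512 (a power of two)
How the time stamp is computed: each time the ISR fires, the time stamp is incremented by 1; 16 ISRs mean that 16 * 1000 samples have been read in total, so roughly 1 ms has elapsed.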
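The copy from the circular capture buffer into the output buffer can be sketched as follows. This is a minimal illustration under assumptions: `CopyAudioWindow` and `kAudioCaptureBufferSize` are names introduced here, the buffers are plain int16_t arrays, and the real GetAudioSamples() has a different signature and also waits for the ADC data to arrive.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kAudioSampleFrequency = 16000;    // Hz
constexpr int kAudioCaptureBufferSize = 16000;  // assumed name; 1 s of audio
constexpr int kMaxAudioSampleSize = 512;

std::int16_t g_audio_capture_buffer[kAudioCaptureBufferSize];
std::int16_t g_audio_output_buffer[kMaxAudioSampleSize];

// Copies duration_ms of audio starting at start_ms out of the circular
// capture buffer. Returns the number of samples copied.
int CopyAudioWindow(int start_ms, int duration_ms) {
  const int samples_per_ms = kAudioSampleFrequency / 1000;
  const int count = duration_ms * samples_per_ms;
  assert(start_ms >= 0 && count >= 0 && count <= kMaxAudioSampleSize);
  const int start_offset = start_ms * samples_per_ms;
  for (int i = 0; i < count; ++i) {
    // The modulo wraps reads that run past the end of the capture buffer.
    const int capture_index = (start_offset + i) % kAudioCaptureBufferSize;
    g_audio_output_buffer[i] = g_audio_capture_buffer[capture_index];
  }
  return count;
}
```

Because the capture buffer holds exactly one second, a request that starts near the end of the buffer wraps around to its beginning.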
One problem: audio is live streaming
Only part of the word “yes” may be captured in our window.
CNN model (Interpreter Invoke()) → softmax → output tensor (silence, unknown, yes, no) → RecognizeCommands::ProcessLatestResults → RespondToCommand
RecognizeCommands (recognize_commands.cc)
The length of the averaging window (average_window_duration_ms)
The minimum average score that counts as a detection (detection_threshold)
The amount of time we’ll wait after hearing a command before recognizing a second one (suppression_ms)
The minimum number of inferences required in the window for a result to count (3)
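How the four parameters above interact can be sketched with a small smoothing class. This is a simplified illustration, not the code from recognize_commands.cc: the class name `CommandSmoother`, the `minimum_count` parameter name, the shift of int8 scores to a 0..255 scale, and the exact suppression rule are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

constexpr int kCategoryCount = 4;  // silence, unknown, yes, no

class CommandSmoother {
 public:
  CommandSmoother(int average_window_duration_ms, int detection_threshold,
                  int suppression_ms, int minimum_count)
      : window_ms_(average_window_duration_ms),
        threshold_(detection_threshold),
        suppression_ms_(suppression_ms),
        minimum_count_(minimum_count) {}

  // Feeds one inference result. Returns the detected category index, or -1
  // when nothing new should be reported.
  int Process(const std::int8_t* scores, int now_ms) {
    assert(scores != nullptr);
    Entry entry;
    entry.time_ms = now_ms;
    for (int c = 0; c < kCategoryCount; ++c) entry.scores[c] = scores[c];
    history_.push_back(entry);
    // Drop results that have fallen out of the averaging window.
    while (history_.front().time_ms < now_ms - window_ms_) history_.pop_front();
    const int count = static_cast<int>(history_.size());
    if (count < minimum_count_) return -1;  // too few inferences to trust

    int best = 0;
    int best_average = -1;
    for (int c = 0; c < kCategoryCount; ++c) {
      int sum = 0;
      for (const Entry& h : history_) sum += h.scores[c] + 128;  // 0..255
      const int average = sum / count;
      if (average > best_average) {
        best_average = average;
        best = c;
      }
    }
    if (best_average < threshold_) return -1;  // average score too low
    if (has_detection_ && now_ms - last_detection_ms_ < suppression_ms_) {
      return -1;  // still inside the suppression period
    }
    has_detection_ = true;
    last_detection_ms_ = now_ms;
    return best;
  }

 private:
  struct Entry {
    int time_ms;
    std::int8_t scores[kCategoryCount];
  };
  std::deque<Entry> history_;
  int window_ms_;
  int threshold_;
  int suppression_ms_;
  int minimum_count_;
  bool has_detection_ = false;
  int last_detection_ms_ = 0;
};
```

Averaging over the window is what handles the live-streaming problem: a word that is only partly inside one window still contributes to several consecutive inferences, and the command is reported once the average is high enough.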
Hands-on
Generate the flash image micro_speech_wire.bin
Write the flash image to the board
https://drive.google.com/drive/folders/1FhkMDQ5xZoQS8GLkPZJPoVvT3dD3pk3g
Study
tensorflow/lite/micro/examples/micro_speech
main_functions.cc
feature_provider.cc
recognize_commands.cc
/sparkfun_edge/command_responder.cc
Demo
After the SparkFun Edge is powered over USB, a blue light keeps blinking, which indicates that the board is waiting for voice input.
Open a terminal (baud rate: 115200 bps).
The terminal prints the demo output messages.

TinyML - 4 speech recognition
