1 November 2016
HANYANG UNIVERSITY
ARCHITECTURAL ACOUSTICS LAB
Office +82-2-2220-1795 | Fax +82-2-2220-4794
http://acoustics.hanyang.ac.kr
Muhammad Imran, Jin Yong Jeon
December 12, 2015
A Steered-Response Power (SRP) based Framework for
Sound Source Localization using Microphone Arrays in
Reverberant Rooms for Enhancement of Speech Intelligibility
42. Jahrestagung für Akustik (42nd Annual Conference on Acoustics), 14-17 March 2016
Contents
o Introduction
o Background and Motivation
o Sound Source Localization
• Methodology
◦ VAD: Voice Activity Detection
◦ SRP (Beamforming) filters
◦ PHAT-weighting
• Real-time Framework and Implementation
• Optimization and Clustering
o Results
o Conclusion
Introduction (Background and Motivation)
o Sound source localization and tracking using microphone arrays for
• Room acoustics measurements
• Teleconference systems
o Traditional methods
• Time difference of arrival (TDOA) estimation between microphone pairs using the cross-correlation function, ignoring
◦ Ambient noise
◦ Reflections from surrounding surfaces
◦ Reverberation in closed spaces
o Therefore, traditional methods
• Produce poor results in terms of precision, resolution and robustness
• Require additional post-processing to track multiple sources in real-time applications
• Have limited bandwidth
Sound Source Localization
o Methods for sound source localization using microphone arrays
• Time difference of arrival (TDOA) estimation
◦ Generalized cross-correlation (GCC)
◦ Weighting function
◦ Optimum detection in reverberant environments
◦ Improved signal-to-noise ratio (SNR)
• Steered response power (SRP)
◦ Weighting function as beamformer
◦ Source localization and tracking
◦ Robust in reverberant conditions
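To make the TDOA/GCC entry above concrete, the following is a minimal sketch of generalized cross-correlation with PHAT weighting for a single microphone pair. The function name `gcc_phat`, its parameters and the zero-padding choice are illustrative assumptions and are not taken from the slides.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, using GCC-PHAT."""
    n = len(x1) + len(x2)                 # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12        # PHAT weighting: keep the phase only
    cc = np.fft.irfft(cross, n=n)         # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

A positive return value means `x2` lags `x1`. With a known microphone spacing, the delay can then be converted to an arrival angle.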
Methodology (1/3)
o The method is based on steered-response power (SRP)
o The power at the beamformer output as a function of the look direction $c$ is

$$P_{BF}(c) = D(c)^H S D(c)$$

where $S$ is the cross-power matrix and $D(c)$ is the array directivity.
o Weighted steered-response power
o MVDR beamformer as weight

$$w = \frac{D^H(c) S^{-1}}{D^H(c) S^{-1} D(c)}$$

o After simplifications

$$P_{MVDR}(c) = \frac{1}{D(c)^H S^{-1} D(c)}$$
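As a rough illustration of the two power expressions on this slide, the sketch below evaluates $P_{BF}(c)$ and $P_{MVDR}(c)$ over an azimuth grid for a synthetic cross-power matrix. The far-field steering model, the array geometry (six microphones 5 cm from the origin), the 1 kHz test frequency and all function names are assumptions made for the example; the slides do not specify them.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed

def steering_vector(mic_xyz, azimuth_deg, freq_hz):
    """Far-field D(c) for a source in the horizontal plane (assumed model)."""
    az = np.deg2rad(azimuth_deg)
    u = np.array([np.cos(az), np.sin(az), 0.0])      # unit vector toward the source
    advance = mic_xyz @ u / SPEED_OF_SOUND           # arrival-time advance per mic, s
    return np.exp(2j * np.pi * freq_hz * advance)

def srp_power(D, S):
    """P_BF(c) = D(c)^H S D(c)."""
    return np.real(np.conj(D) @ S @ D)

def mvdr_power(D, S):
    """P_MVDR(c) = 1 / (D(c)^H S^-1 D(c))."""
    return 1.0 / np.real(np.conj(D) @ np.linalg.solve(S, D))

# Hypothetical 6-microphone orthogonal array.
a = 0.05
mics = np.array([[a, 0, 0], [-a, 0, 0], [0, a, 0],
                 [0, -a, 0], [0, 0, a], [0, 0, -a]])
freq = 1000.0

# Synthetic cross-power matrix: one source at 45 degrees plus white sensor noise.
D_true = steering_vector(mics, 45.0, freq)
S = np.outer(D_true, np.conj(D_true)) + 0.1 * np.eye(6)

grid = np.arange(-90, 91, 5)
p_srp = np.array([srp_power(steering_vector(mics, az, freq), S) for az in grid])
p_mvdr = np.array([mvdr_power(steering_vector(mics, az, freq), S) for az in grid])
best_srp = grid[np.argmax(p_srp)]
best_mvdr = grid[np.argmax(p_mvdr)]
```

Both maps peak at the simulated source direction. In practice $S$ would be estimated from the microphone spectra, and a small diagonal loading term is often added before solving to keep the inversion stable.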
Methodology (2/3)
o Combining the bins
• The per-bin powers $P_{MVDR}(c, k)$ from the different frequency bins are combined
• The approach used is PHAT weighting

$$P_{SSL}(c) = \frac{1}{K} \sum_{k=1}^{K} \frac{M}{X_k^H X_k} P_{MVDR}(c, k)$$

where $X_k$ is the input vector of length $M$ containing the input signals for frequency bin $k$ from all microphones, and $K$ is the number of frequency bins.
• Information about the noise variance can be used to improve the final beamformer

$$N_k = \frac{1}{M} \sum_{i} N_i(k)$$

• Therefore,

$$P_{SSL}(c) = \frac{1}{K} \sum_{k=1}^{K} \frac{M}{q X_k^H X_k + (1 - q) N_k} P_{MVDR}(c, k)$$
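The combination across bins can be sketched in a few lines of plain Python. The function name `combine_bins`, the data layout (one list entry per frequency bin) and the small guard against division by zero are assumptions for illustration; the treatment of `q` simply follows the formula, since the slides do not state how it is chosen.

```python
def combine_bins(p_mvdr, spectra, noise_var=None, q=1.0):
    """
    Combine per-bin MVDR powers for one look direction c.

    p_mvdr[k]    : P_MVDR(c, k)
    spectra[k]   : list of M complex microphone values X_k for bin k
    noise_var[k] : list of M noise variances N_i(k); if None, pure PHAT is used
    q            : weight of the signal-energy term against the noise term
    """
    K = len(p_mvdr)
    total = 0.0
    for k in range(K):
        x = spectra[k]
        M = len(x)
        energy = sum(abs(v) ** 2 for v in x)          # X_k^H X_k
        if noise_var is None:
            denom = energy
        else:
            n_k = sum(noise_var[k]) / M               # N_k = (1/M) sum_i N_i(k)
            denom = q * energy + (1.0 - q) * n_k
        total += M / max(denom, 1e-12) * p_mvdr[k]    # guard against empty bins
    return total / K

# Two bins, two microphones (toy numbers).
p = [2.0, 4.0]
X = [[1 + 0j, 1j], [2 + 0j, 0j]]
p_phat = combine_bins(p, X)
p_noise = combine_bins(p, X, noise_var=[[1.0, 3.0], [2.0, 2.0]], q=0.5)
```

Dividing by the bin energy makes each bin's contribution largely independent of its absolute level, so loud low-frequency bins do not dominate the sum.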
Methodology (3/3)
o Post-processing (simple clustering)
• The algorithm used is the so-called "bucket clustering"
o Step 1: Grouping the measurements
• Based on the single-frame azimuth $\varphi$, elevation $\theta$, and their standard deviations $\sigma_\varphi$ and $\sigma_\theta$, the microphone-array working volume is divided into $M$ sections (50% overlapping):

$$M = 4 \, \frac{\varphi_{max} - \varphi_{min}}{6\sigma_\varphi} \times \frac{\theta_{max} - \theta_{min}}{6\sigma_\theta}$$

o Step 2: Number of cluster candidates
• A threshold is applied, defined as the average confidence of sections with more than one measurement:

$$C_{Th} = \frac{1}{L} \sum_{i=1}^{M} \sum_{j=1}^{N_i} C_{ij}$$

where $C_{ij}$ is the confidence of the $j$th measurement in the $i$th section, $N_i$ is the number of measurements in the $i$th section, $M$ is the number of sections (not the number of microphones), and $L$ is the number of sections with more than one measurement.
o Step 3: Averaging the measurements in each cluster candidate
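The three clustering steps can be sketched as follows. This is one possible reading of the slide, not the authors' implementation: sections are taken to be 6σ wide with a 3σ step (50% overlap), the threshold sum is restricted to the sections holding more than one measurement, a section qualifies when its summed confidence reaches the threshold, and the average in step 3 is confidence-weighted. All function names are hypothetical.

```python
from collections import defaultdict

def section_count(phi_min, phi_max, theta_min, theta_max, sigma_phi, sigma_theta):
    """M = 4 * (phi range / 6 sigma_phi) * (theta range / 6 sigma_theta)."""
    return (4 * (phi_max - phi_min) / (6 * sigma_phi)
              * (theta_max - theta_min) / (6 * sigma_theta))

def bucket_cluster(measurements, phi_min, theta_min, sigma_phi, sigma_theta):
    """measurements: list of (phi, theta, confidence), angles in degrees."""
    step_phi, step_theta = 3 * sigma_phi, 3 * sigma_theta
    sections = defaultdict(list)                 # (row, col) -> measurement indices
    # Step 1: group measurements into overlapping sections.
    for idx, (phi, theta, _) in enumerate(measurements):
        ip = int((phi - phi_min) // step_phi)
        it = int((theta - theta_min) // step_theta)
        for a in (ip - 1, ip):                   # 50% overlap: two sections per axis
            for b in (it - 1, it):
                if a >= 0 and b >= 0:
                    sections[(a, b)].append(idx)

    def total_conf(members):
        return sum(measurements[i][2] for i in members)

    # Step 2: threshold from sections with more than one measurement.
    multi = [m for m in sections.values() if len(m) > 1]
    if not multi:
        return []
    threshold = sum(total_conf(m) for m in multi) / len(multi)

    # Step 3: confidence-weighted average inside each candidate section.
    clusters = {}
    for members in multi:
        w = total_conf(members)
        if w >= threshold:
            phi = sum(measurements[i][0] * measurements[i][2] for i in members) / w
            theta = sum(measurements[i][1] * measurements[i][2] for i in members) / w
            clusters[tuple(members)] = (phi, theta)   # identical member sets collapse
    return list(clusters.values())

# Five frames near (45, 0) degrees plus two low-confidence outliers (toy data).
frames = [(44, 0, 1.0), (45, 1, 1.0), (46, -1, 1.0), (45, 0, 1.0),
          (44.5, 0.5, 1.0), (-60, 10, 0.3), (10, -20, 0.2)]
clusters = bucket_cluster(frames, phi_min=-90, theta_min=-30,
                          sigma_phi=2.0, sigma_theta=2.0)
n_sections = section_count(-90, 90, -30, 30, 2.0, 2.0)
```

Because neighbouring sections overlap, the same group of measurements can fill more than one section; the sketch collapses sections with identical members but does not merge partially overlapping candidates.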
Real-time Framework
o Sound capture with a 3D microphone array (6-channel)
o Captured data pass through a framing and windowing block
o Short-time discrete Fourier transform (DFT)
o Voice activity detector (VAD)
• Used to detect active signal frames
• Based on energy and spectral shifts for each frame
o Localization block (source estimation)
• MVDR weights for each frequency bin
• PHAT weighting for combining all frequency bins
o Optimization (improving the localization estimates by averaging several measurements)
• Simple clustering
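The front end of this pipeline (framing, windowing, short-time DFT and a VAD decision) can be sketched for a single channel as below. The frame length, hop size, Hann window, 10 dB threshold and percentile-based noise floor are all assumptions chosen for the example, and the VAD shown uses frame energy only, leaving out the spectral-shift criterion mentioned on the slide.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames and apply a Hann window to each."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)

def energy_vad(frames, threshold_db=10.0):
    """Flag frames whose energy exceeds a crude noise floor by threshold_db."""
    energy = np.sum(frames ** 2, axis=1) + 1e-12
    floor = np.percentile(energy, 10)            # quietest frames approximate noise
    return 10.0 * np.log10(energy / floor) > threshold_db

# Synthetic test signal: 1 s of low-level noise with a tone burst in the middle.
fs = 16000
rng = np.random.default_rng(0)
x = 0.01 * rng.standard_normal(fs)
start, stop = 6400, 9600                         # burst from 0.4 s to 0.6 s
x[start:stop] += np.sin(2 * np.pi * 440 * np.arange(start, stop) / fs)

frames = frame_signal(x, frame_len=512, hop=256)
spectra = np.fft.rfft(frames, axis=1)            # short-time DFT, one row per frame
voiced = energy_vad(frames)
```

Only the rows of `spectra` where `voiced` is true would be passed on to the localization block.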
Evaluation
Measurement setup and procedure
o Microphone array:
• 6-channel orthogonal array
o Speech sources:
• Three speech sources placed at 0°, 45° and -45° azimuth
• Speech duration = 20 s
• 1.5 m from the array
• Pure speech mixed with pink noise
• SNR is 15 dB
[Figure: source layout, with Source 01 at 0°, Source 02 at -45° and Source 03 at +45°]
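For reference, mixing a speech signal with noise at a prescribed SNR amounts to scaling the noise so that the power ratio matches the target. The sketch below shows that scaling in plain Python; the function name `mix_at_snr` and the toy sequences are assumptions, and generating the pink noise itself is outside its scope.

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals snr_db, then add."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    mixed = [s + gain * n for s, n in zip(speech, noise)]
    return mixed, gain

# Toy sequences standing in for speech and noise recordings.
speech = [1.0, -1.0] * 50
noise = [0.5, 0.5, -0.5, -0.5] * 25
mixed, gain = mix_at_snr(speech, noise, snr_db=15.0)
```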
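Results (1/3)
o Localization results at 0° azimuth
• VAD voiced frames = 500

Results
Number of frames      496
STD                   2.03°
Localization error    0.027°

$$\text{Localization error} = \frac{1}{N} \sum_{i=1}^{N} \left( \varphi_i - \hat{\varphi}_i \right)$$

where $\varphi_i$ is the true and $\hat{\varphi}_i$ the estimated azimuth for frame $i$.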
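Results (2/3)
o Localization results at 45° azimuth
• VAD voiced frames = 500

Results
Number of frames      493
STD                   1.9°
Localization error    0.081°

$$\text{Localization error} = \frac{1}{N} \sum_{i=1}^{N} \left( \varphi_i - \hat{\varphi}_i \right)$$

where $\varphi_i$ is the true and $\hat{\varphi}_i$ the estimated azimuth for frame $i$.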
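Results (3/3)
o Localization results at -45° azimuth
• VAD voiced frames = 500

Results
Number of frames      494
STD                   1.3°
Localization error    0.058°

$$\text{Localization error} = \frac{1}{N} \sum_{i=1}^{N} \left( \varphi_i - \hat{\varphi}_i \right)$$

where $\varphi_i$ is the true and $\hat{\varphi}_i$ the estimated azimuth for frame $i$.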
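The two figures reported per source, the mean localization error and the STD, can be computed from per-frame azimuth estimates as in the sketch below. Reading the first term of the error formula as the true azimuth and the second as the estimate is an assumption, as is the use of the population standard deviation; the function name and sample values are made up for illustration.

```python
import statistics

def localization_stats(estimates_deg, true_azimuth_deg):
    """Return (mean signed error, standard deviation of the estimates) in degrees."""
    errors = [true_azimuth_deg - est for est in estimates_deg]
    mean_error = sum(errors) / len(errors)
    std = statistics.pstdev(estimates_deg)
    return mean_error, std

# Four hypothetical per-frame estimates for a source at 45 degrees.
mean_error, std = localization_stats([44.0, 46.0, 45.0, 47.0], 45.0)
```

Because the error is signed, positive and negative deviations cancel, which is why the mean error can be far smaller than the STD, as in the reported tables.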
Conclusion
o Presented a framework for sound source localization
• Using a six-channel spherical microphone array, based on
◦ SRP-MVDR algorithms weighted with PHAT
◦ VAD for extracting voiced frames of speech
◦ Optimization using a clustering method for accurate localization
o Produces convincing results, within an accuracy of ±2°, at 15 dB SNR
o The MVDR-weighted SRP localizer estimated the sound sources with an STD of ±1.3° in DOA
More Related Content

What's hot

Multirate DSP
Multirate DSPMultirate DSP
Multirate DSP
@zenafaris91
 
Radar Systems- Unit-II : CW and Frequency Modulated Radar
Radar Systems- Unit-II : CW and Frequency Modulated RadarRadar Systems- Unit-II : CW and Frequency Modulated Radar
Radar Systems- Unit-II : CW and Frequency Modulated Radar
VenkataRatnam14
 
Antinoise system & Noise Cancellation
Antinoise system & Noise CancellationAntinoise system & Noise Cancellation
Antinoise system & Noise Cancellation
Gujarat Technological University
 
Earmold acoustics
Earmold acousticsEarmold acoustics
Earmold acoustics
Mark Shaver
 
Specific features of hearing aids
Specific features of hearing aidsSpecific features of hearing aids
Specific features of hearing aids
Pra_buddha
 
SPEECH CODING
SPEECH CODINGSPEECH CODING
SPEECH CODING
Shradheshwar Verma
 
9 mod analog_am_fm (1)
9 mod analog_am_fm (1)9 mod analog_am_fm (1)
9 mod analog_am_fm (1)mirla manama
 
Application of DSP in Biomedical science
Application of DSP in Biomedical scienceApplication of DSP in Biomedical science
Application of DSP in Biomedical science
Taslima Yasmin Tarin
 
5. 2 ray propagation model part 1
5. 2 ray propagation model   part 15. 2 ray propagation model   part 1
5. 2 ray propagation model part 1
JAIGANESH SEKAR
 
N 1-awp-lecture-notes-final
N 1-awp-lecture-notes-finalN 1-awp-lecture-notes-final
N 1-awp-lecture-notes-final
15010192
 
Wavelet
WaveletWavelet
Wavelet
Amr Nasr
 
Adaptive filter
Adaptive filterAdaptive filter
Adaptive filter
A. Shamel
 
Speech technology basics
Speech technology   basicsSpeech technology   basics
Speech technology basics
Hemaraja Nayaka S
 
Earlang ejercicios
Earlang ejerciciosEarlang ejercicios
Earlang ejercicios
Pablo Barbecho
 
MIMO
MIMOMIMO
DSP_FOEHU - Lec 08 - The Discrete Fourier Transform
DSP_FOEHU - Lec 08 - The Discrete Fourier TransformDSP_FOEHU - Lec 08 - The Discrete Fourier Transform
DSP_FOEHU - Lec 08 - The Discrete Fourier Transform
Amr E. Mohamed
 

What's hot (20)

Multirate DSP
Multirate DSPMultirate DSP
Multirate DSP
 
Radar Systems- Unit-II : CW and Frequency Modulated Radar
Radar Systems- Unit-II : CW and Frequency Modulated RadarRadar Systems- Unit-II : CW and Frequency Modulated Radar
Radar Systems- Unit-II : CW and Frequency Modulated Radar
 
Antinoise system & Noise Cancellation
Antinoise system & Noise CancellationAntinoise system & Noise Cancellation
Antinoise system & Noise Cancellation
 
Earmold acoustics
Earmold acousticsEarmold acoustics
Earmold acoustics
 
Specific features of hearing aids
Specific features of hearing aidsSpecific features of hearing aids
Specific features of hearing aids
 
SPEECH CODING
SPEECH CODINGSPEECH CODING
SPEECH CODING
 
9 mod analog_am_fm (1)
9 mod analog_am_fm (1)9 mod analog_am_fm (1)
9 mod analog_am_fm (1)
 
Application of DSP in Biomedical science
Application of DSP in Biomedical scienceApplication of DSP in Biomedical science
Application of DSP in Biomedical science
 
Final ppt
Final pptFinal ppt
Final ppt
 
Smart antenna
Smart antennaSmart antenna
Smart antenna
 
Mimo
MimoMimo
Mimo
 
5. 2 ray propagation model part 1
5. 2 ray propagation model   part 15. 2 ray propagation model   part 1
5. 2 ray propagation model part 1
 
N 1-awp-lecture-notes-final
N 1-awp-lecture-notes-finalN 1-awp-lecture-notes-final
N 1-awp-lecture-notes-final
 
Dsp lecture vol 7 adaptive filter
Dsp lecture vol 7 adaptive filterDsp lecture vol 7 adaptive filter
Dsp lecture vol 7 adaptive filter
 
Wavelet
WaveletWavelet
Wavelet
 
Adaptive filter
Adaptive filterAdaptive filter
Adaptive filter
 
Speech technology basics
Speech technology   basicsSpeech technology   basics
Speech technology basics
 
Earlang ejercicios
Earlang ejerciciosEarlang ejercicios
Earlang ejercicios
 
MIMO
MIMOMIMO
MIMO
 
DSP_FOEHU - Lec 08 - The Discrete Fourier Transform
DSP_FOEHU - Lec 08 - The Discrete Fourier TransformDSP_FOEHU - Lec 08 - The Discrete Fourier Transform
DSP_FOEHU - Lec 08 - The Discrete Fourier Transform
 

Viewers also liked

FPGA Based Acoustic Source Localization Project
FPGA Based Acoustic Source Localization ProjectFPGA Based Acoustic Source Localization Project
FPGA Based Acoustic Source Localization Project
Shristi Pradhan
 
Plane wave decomposition and beamforming for directional spatial sound locali...
Plane wave decomposition and beamforming for directional spatial sound locali...Plane wave decomposition and beamforming for directional spatial sound locali...
Plane wave decomposition and beamforming for directional spatial sound locali...Muhammad Imran
 
Sound source localization
Sound source localizationSound source localization
Sound source localization
guest10652b
 
Kinect Microphone Array case study
Kinect Microphone Array case studyKinect Microphone Array case study
Kinect Microphone Array case study
Muhammad Imran
 
Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...
Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...
Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...
Shristi Pradhan
 
Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...
Muhammad Imran
 

Viewers also liked (7)

FPGA Based Acoustic Source Localization Project
FPGA Based Acoustic Source Localization ProjectFPGA Based Acoustic Source Localization Project
FPGA Based Acoustic Source Localization Project
 
Plane wave decomposition and beamforming for directional spatial sound locali...
Plane wave decomposition and beamforming for directional spatial sound locali...Plane wave decomposition and beamforming for directional spatial sound locali...
Plane wave decomposition and beamforming for directional spatial sound locali...
 
Sound source localization
Sound source localizationSound source localization
Sound source localization
 
Kinect Microphone Array case study
Kinect Microphone Array case studyKinect Microphone Array case study
Kinect Microphone Array case study
 
Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...
Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...
Food Preservation for Farming Communities in Nepal: A Low Cost Engineering So...
 
Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...
 
recommender_systems
recommender_systemsrecommender_systems
recommender_systems
 

Similar to Sound Source Localization

Voice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringVoice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency Filtering
Tejus Adiga M
 
2015-04 PhD defense
2015-04 PhD defense2015-04 PhD defense
2015-04 PhD defenseNil Garcia
 
Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016
SaruwatariLabUTokyo
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016
SaruwatariLabUTokyo
 
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdfA_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
Bala Murugan
 
A novel high resolution doa estimation design algorithm of close sources sign...
A novel high resolution doa estimation design algorithm of close sources sign...A novel high resolution doa estimation design algorithm of close sources sign...
A novel high resolution doa estimation design algorithm of close sources sign...
eSAT Journals
 
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
IJERA Editor
 
International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...
International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...
International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...CSCJournals
 
OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...
OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...
OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...
56260891
 
Room Transfer Function Estimation and Room Equalization in Noise Environments
Room Transfer Function Estimation and Room Equalization in Noise EnvironmentsRoom Transfer Function Estimation and Room Equalization in Noise Environments
Room Transfer Function Estimation and Room Equalization in Noise Environments
IJERA Editor
 
F0483340
F0483340F0483340
Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...
Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...
Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...
IJMTST Journal
 
IMPLEMENTATION OF WBSS IN LOW SNR.pptx
IMPLEMENTATION OF WBSS IN LOW SNR.pptxIMPLEMENTATION OF WBSS IN LOW SNR.pptx
IMPLEMENTATION OF WBSS IN LOW SNR.pptx
RockFellerSinghRusse
 
SP Study1018 Paper Reading
SP Study1018 Paper ReadingSP Study1018 Paper Reading
SP Study1018 Paper Reading
Mori Takuma
 
Design and Implementation of Polyphase based Subband Adaptive Structure for N...
Design and Implementation of Polyphase based Subband Adaptive Structure for N...Design and Implementation of Polyphase based Subband Adaptive Structure for N...
Design and Implementation of Polyphase based Subband Adaptive Structure for N...
Pratik Ghotkar
 
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral SubtractionSpeech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
IOSRJVSP
 
Bayesian modelling and computation for Raman spectroscopy
Bayesian modelling and computation for Raman spectroscopyBayesian modelling and computation for Raman spectroscopy
Bayesian modelling and computation for Raman spectroscopy
Matt Moores
 
ROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTION
ROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTIONROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTION
ROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTION
Riccardo Bernardini
 
Getting started with chemometric classification
Getting started with chemometric classificationGetting started with chemometric classification
Getting started with chemometric classification
Alex Henderson
 
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...
CSCJournals
 

Similar to Sound Source Localization (20)

Voice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringVoice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency Filtering
 
2015-04 PhD defense
2015-04 PhD defense2015-04 PhD defense
2015-04 PhD defense
 
Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016
 
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdfA_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
 
A novel high resolution doa estimation design algorithm of close sources sign...
A novel high resolution doa estimation design algorithm of close sources sign...A novel high resolution doa estimation design algorithm of close sources sign...
A novel high resolution doa estimation design algorithm of close sources sign...
 
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
 
International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...
International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...
International Journal of Biometrics and Bioinformatics(IJBB) Volume (1) Issue...
 
OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...
OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...
OFDM_Channel Estimation Techniques Based on Pilot Arrangement in OFDM Syste...
 
Room Transfer Function Estimation and Room Equalization in Noise Environments
Room Transfer Function Estimation and Room Equalization in Noise EnvironmentsRoom Transfer Function Estimation and Room Equalization in Noise Environments
Room Transfer Function Estimation and Room Equalization in Noise Environments
 
F0483340
F0483340F0483340
F0483340
 
Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...
Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...
Spectrum Sensing Detection with Sequential Forward Search in Comparison to Kn...
 
IMPLEMENTATION OF WBSS IN LOW SNR.pptx
IMPLEMENTATION OF WBSS IN LOW SNR.pptxIMPLEMENTATION OF WBSS IN LOW SNR.pptx
IMPLEMENTATION OF WBSS IN LOW SNR.pptx
 
SP Study1018 Paper Reading
SP Study1018 Paper ReadingSP Study1018 Paper Reading
SP Study1018 Paper Reading
 
Design and Implementation of Polyphase based Subband Adaptive Structure for N...
Design and Implementation of Polyphase based Subband Adaptive Structure for N...Design and Implementation of Polyphase based Subband Adaptive Structure for N...
Design and Implementation of Polyphase based Subband Adaptive Structure for N...
 
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral SubtractionSpeech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
 
Bayesian modelling and computation for Raman spectroscopy
Bayesian modelling and computation for Raman spectroscopyBayesian modelling and computation for Raman spectroscopy
Bayesian modelling and computation for Raman spectroscopy
 
ROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTION
ROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTIONROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTION
ROBUST RECONSTRUCTION FOR CS-BASED FETAL BEATS DETECTION
 
Getting started with chemometric classification
Getting started with chemometric classificationGetting started with chemometric classification
Getting started with chemometric classification
 
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...
 

Recently uploaded

Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
Kamal Acharya
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
aqil azizi
 
Basic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparelBasic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparel
top1002
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
Divya Somashekar
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application
SyedAbiiAzazi1
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
symbo111
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
Aditya Rajan Patra
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 

Recently uploaded (20)

Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
 
Basic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparelBasic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparel
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 

Sound Source Localization

  • 1. 1 November 2016 HANYANG UNIVERSITY ARCHITECTURAL ACOUSTICS LAB Office +82-2-2220-1795 | Fax +82-2-2220-4794 http://acoustics.hanyang.ac.kr Muhammad Imran, Jin Yong Jeon December 12, 2015 A Steered-Response Power (SRP) based Framework for Sound Source Localization using Microphone Arrays in Reverberant Rooms for Enhancement of Speech Intelligibility 42. Jahrestagung für Akustik, 14.-17. März 2016
  • 2. 1 November 2016 HANYANG UNIVERSITY ARCHITECTURAL ACOUSTICS LAB 2 o Introduction o Background and Motivation o Sound Source Localization • Methodology ◦ VAD: Voice Activity Detection ◦ SRP (Beamforming) filters ◦ PHAT-weighting • Real-time Framework and Implementation • Optimization and Clustering o Results o Conclusion Contents
  • 3. 1 November 2016 HANYANG UNIVERSITY ARCHITECTURAL ACOUSTICS LAB 3 o Sound Source Localization and Tracking using microphone arrays for • Room Acoustics measurements • Teleconference Systems o Traditional Methods • Time-delay estimation (TDOA) techniques between microphone pairs using Correlation Function, ignoring ◦ Ambient noise ◦ Reflections from surrounding ◦ Reverberation in closed space o Therefore • Producing poor results in terms of Precision, Resolution and Robustness • Require additional post-processing to track multiple sources in real time applications • Limited bandwidth Introduction (Background and Motivation)
  • 4. Sound Source Localization
      o Methods for sound source localization using microphone arrays
      o Time difference of arrival estimation (TDOA)
        • Generalized cross-correlation (GCC)
        • Weighting function
        • Optimum detection in reverberant environments
        • Improved signal-to-noise ratio (SNR)
      o Steered Response Power (SRP)
        • Weighting function as beamformer
        • Source localization and tracking
        • Robust in reverberant conditions
  • 5. Methodology (1/3)
      o The method is based on steered-response power (SRP)
      o The power at the beamformer output as a function of the look direction $c$ is
          $P_{BF}(c) = D(c)^H S D(c)$
        where $S$ is the cross-power matrix and $D(c)$ is the array directivity
      o Weighted steered-response power, with the MVDR beamformer as weight:
          $w = \frac{D^H(c) S^{-1}}{D^H(c) S^{-1} D(c)}$
      o After simplifications:
          $P_{MVDR}(c) = \frac{1}{D(c)^H S^{-1} D(c)}$
  • 6. Methodology (2/3)
      o Combining the bins
        • Combine the signals $P_{MVDR}(c, k)$ from the different frequency bins
        • The approach used is PHAT weighting:
            $P_{SSL}(c) = \frac{1}{K} \sum_{k=1}^{K} \frac{M}{X_k^H X_k} P_{MVDR}(c, k)$
          where $X_k$ is the input vector of length $M$ containing the input signals for this frequency bin from all microphones
      o Information about the noise variance can be used to improve the final beamformer:
            $N_k = \frac{1}{M} \sum_{i} N_i(k)$
      o Therefore,
            $P_{SSL}(c) = \frac{1}{K} \sum_{k=1}^{K} \frac{M}{q X_k^H X_k + (1 - q) N_k} P_{MVDR}(c, k)$
  • 7. Methodology (3/3)
      o Post-processing (simple clustering)
      o The algorithm used is the so-called "Bucket Clustering"
      o Step 1: Grouping the measurements
        • Based on single-frame information of azimuth $\varphi$, elevation $\theta$ and the standard deviations $\sigma_\varphi$ and $\sigma_\theta$, the microphone-array working volume is divided into $M$ sections (50% overlapping):
            $M = 4 \times \frac{\varphi_{max} - \varphi_{min}}{6 \sigma_\varphi} \times \frac{\theta_{max} - \theta_{min}}{6 \sigma_\theta}$
      o Step 2: Number of cluster candidates
        • Apply a threshold defined as the average confidence of sections with more than one measurement:
            $C_{Th} = \frac{1}{L} \sum_{i=1}^{M} \sum_{j=1}^{N_i} C_{ij}$
          where $C_{ij}$ is the confidence of the $j$-th measurement in the $i$-th section, $N_i$ is the number of measurements in the $i$-th section, $M$ is the number of sections, and $L$ is the number of sections with more than one measurement
      o Step 3: Averaging the measurements in each cluster candidate
  • 8. Real-time framework
      o Sound capture by a 3D microphone array (6-channel)
      o Data is passed to a framing and windowing block
      o Short-time discrete Fourier transform (DFT)
      o Voice Activity Detectors (VAD)
        • VADs are used for detecting active signals
        • Based on energy and spectral shifts for each frame
      o Localization block (source estimation)
        • MVDR weights for each frequency bin
        • PHAT weighting for combining all frequency bins
      o Optimization (improving the localization estimates by averaging several measurements)
        • Simple clustering
  • 9. Evaluation: Measurement setup and procedure
      o Microphone array:
        • 6-channel orthogonal array
      o Speech sources:
        • Three speech sources placed at 0°, 45° and -45° azimuth
        • Speech duration = 20 s
        • 1.5 m from the array
        • Pure speech mixed with pink noise
        • SNR is 15 dB
      o Figure labels: Source 01 (0°), Source 02 (-45°), Source 03 (+45°)
  • 10. Results (1/3)
      o Localization results at 0° azimuth
        • VAD voiced frames = 500
        • Number of frames: 496; STD: 2.03; localization error: 0.027
      o Localization error = $\frac{1}{N} \sum_{i=1}^{N} (\varphi_i - \hat{\varphi}_i)$
  • 11. Results (2/3)
      o Localization results at 45° azimuth
        • VAD voiced frames = 500
        • Number of frames: 493; STD: 1.9; localization error: 0.081
      o Localization error = $\frac{1}{N} \sum_{i=1}^{N} (\varphi_i - \hat{\varphi}_i)$
  • 12. Results (3/3)
      o Localization results at -45° azimuth
        • VAD voiced frames = 500
        • Number of frames: 494; STD: 1.3; localization error: 0.058
      o Localization error = $\frac{1}{N} \sum_{i=1}^{N} (\varphi_i - \hat{\varphi}_i)$
  • 13. Conclusion
      o Presented a framework for sound source localization
        • Using a six-channel spherical microphone array, based on
          ◦ SRP-MVDR algorithms weighted with PHAT
          ◦ VAD for extracting voiced frames of speech
          ◦ Optimization using a clustering method for accurate localization
      o Produces convincing results within an accuracy of ±2° and works well at 15 dB SNR
      o The MVDR-weighted SRP localizer estimated the sound sources with an STD of ±1.3° in DOA
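
The localization block of slides 5 and 6 (an MVDR power per frequency bin, combined across bins with a PHAT-style weight) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the far-field steering vector, the azimuth-only search grid, the speed of sound, the diagonal loading term, the averaging of the cross-power matrix over frames, and all function names are hypothetical choices made for the example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed value, not given in the slides)


def steering_vector(mic_pos, azimuth_deg, freq_hz):
    """Far-field D(c) for a source in the horizontal plane (assumption).

    mic_pos: (M, 3) microphone coordinates in metres.
    """
    az = np.deg2rad(azimuth_deg)
    unit = np.array([np.cos(az), np.sin(az), 0.0])
    # Microphones closer to the source receive the wavefront earlier.
    return np.exp(2j * np.pi * freq_hz * (mic_pos @ unit) / SPEED_OF_SOUND)


def mvdr_power(D, S, loading=1e-3):
    """P_MVDR(c) = 1 / (D^H S^-1 D) for one frequency bin.

    A small diagonal loading is added for numerical stability; this is
    an extra assumption and is not part of the slide formula.
    """
    M = len(D)
    S_loaded = S + loading * np.real(np.trace(S)) / M * np.eye(M)
    return 1.0 / np.real(np.vdot(D, np.linalg.solve(S_loaded, D)))


def srp_mvdr_phat(X, freqs_hz, mic_pos, grid_deg):
    """Combine per-bin MVDR power with the weight M / (X_k^H X_k).

    X: complex array of shape (frames, K bins, M microphones).
    Returns one power value per candidate azimuth in grid_deg.
    """
    n_frames, K, M = X.shape
    P = np.zeros(len(grid_deg))
    for k in range(K):
        Xk = X[:, k, :]
        # Cross-power matrix S = E[x x^H], estimated by averaging frames.
        S = Xk.T @ Xk.conj() / n_frames
        # trace(S) is X_k^H X_k averaged over the same frames.
        phat = M / np.real(np.trace(S))
        for g, az in enumerate(grid_deg):
            D = steering_vector(mic_pos, az, freqs_hz[k])
            P[g] += phat * mvdr_power(D, S)
    return P / K
```

The estimated azimuth is then the grid point with the largest combined power, for example `grid_deg[np.argmax(P)]`. Normalizing each bin by its input power keeps loud bins from dominating the sum, which is the role the PHAT weight plays in the slides.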

Editor's Notes

  1. Hello. I am presenting a framework for sound source localization using the SRP method, in order to get a clean speech signal from a particular speaker in reverberant environments.
  2. Starting from the background and motivation of the presented work, I will talk about the methodology of the sound localizer, including the use of VAD for voiced-frame detection in speech signals and of PHAT for combining the frequency bins of the beamformer for different microphone pairs of the array. Secondly, a real-time framework will be introduced, with optimization of the DOA results. Finally, results will be shown and discussed.
  3. Sound source localizers and trackers are commonly used for room acoustics measurements and for teleconference systems, where the speakers are detected and localized in reverberant rooms. Traditional methods use the TDOA between a single pair of microphones, computed with GCC. Some recent techniques use different microphone combinations for computing the GCC function with certain weighting functions. But these methods often ignore ambient noise and reverberant room conditions, and therefore offer poor results in terms of precision, resolution and robustness. Additionally, they require a post-processing algorithm to track multiple sources in real-time applications, and they have limited bandwidth.
  4. Generally, sound localization is categorized into two areas. In time difference of arrival (TDOA) estimation, generalized cross-correlation is used to find the time delay among different microphone pairs of the array. Steered response power (SRP) methods steer the array toward the source and are commonly applied for tracking purposes. In this presentation we discuss an SRP-based approach.
  5. These slides present the methodology and the necessary mathematical background for our proposed framework. The general formula for beamforming is shown in the first equation, where $S$ is the cross-power matrix and $D(c)$ is the array directivity. We use MVDR as the weighting function for the beamformer, and after simplification we get the final equation for the beamformer steered in direction $c$. This beamformer is for a single frequency bin.
  6. After getting the beamformer functions for all frequencies of interest, we combine the signals $P_{MVDR}(c, k)$ from the different frequency bins. The approach followed is to use the PHAT weight as a combiner. Here, $X_k$ is the input vector of length $M$ containing the input signals for this frequency bin from all microphones. In addition, the noise variance information is used, and the final beamformer is improved as shown in the last equation.
  7. Post-processing is performed using the so-called "Bucket Clustering" algorithm to improve the confidence level of the measurements and to detect the actual sound source among reflections.
  8. This slide describes the real-time framework of the method. Sound is captured and transformed into the frequency domain. VAD is applied, and the data of the voiced frames are passed to the localization block. Finally, optimization is performed to improve the localization estimates by averaging several measurements using simple clustering.
  9. For evaluation, a 6-channel microphone array is used. Three speech sources were placed at 0°, 45° and -45° azimuth at a distance of 1.5 m from the array, and speech was recorded for 20 s. The speech was mixed with pink noise, and the final speech has an SNR of 15 dB.
  10. Localization results at 0° azimuth
  11. Localization results at 45° azimuth
  12. Localization results at -45° azimuth
  13. We presented a framework for sound source localization using a six-channel spherical microphone array, based on SRP-MVDR algorithms weighted with PHAT. VAD is used for extracting voiced frames of speech, and optimization with a clustering method is used for accurate localization. The framework produces convincing results within an accuracy of ±2° and works well at 15 dB SNR. The MVDR-weighted SRP localizer estimated the sound sources with an STD of ±1.3° in DOA.
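
The front end of the real-time framework described in note 8 (framing, windowing, DFT, then VAD) might look something like the following sketch. It is a simplified illustration only: the frame length, hop size, Hann window, percentile-based noise floor and 6 dB margin are all assumed values, the function names are hypothetical, and only the energy criterion is shown (the spectral-shift criterion mentioned on slide 8 is left out).

```python
import numpy as np


def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into 50%-overlapping, Hann-windowed frames.

    frame_len and hop are illustrative assumptions.
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)


def energy_vad(frames, margin_db=6.0):
    """Flag frames whose spectral energy exceeds an estimated noise floor.

    The noise floor is taken as the 10th percentile of the frame
    energies (an assumption); frames more than margin_db above it are
    marked as active. Returns the boolean mask and the frame spectra.
    """
    spectra = np.fft.rfft(frames, axis=1)
    energy_db = 10.0 * np.log10(np.sum(np.abs(spectra) ** 2, axis=1) + 1e-12)
    noise_floor_db = np.percentile(energy_db, 10)
    return energy_db > noise_floor_db + margin_db, spectra
```

Only the spectra of the frames flagged as active would then be passed on to the localization block, so that silent or noise-only frames do not contribute spurious direction estimates.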