A Model of Moving Source
Localization Based on Binaural Hearing
Faculty of Electrical and Computer
Engineering
Department of Communication
Dr. Masoud Geravanchizadeh
Taher Bagheri
University
of Tabriz
Summer 92
Contents
• Introduction
• Model Architecture
• Simulation Setup
• Results
Introduction
• Localization of multiple sound sources from a
binaural input is a challenging problem that has
applications in hearing prostheses, spatial sound
reproduction, and mobile robotics.
• Binaural localization has received significant
attention in the field of computational auditory scene
analysis (CASA) which is guided by principles in the
perceptual organization of sound by human listeners.
• Two principal localization cues are interaural time
difference (ITD) and interaural level difference
(ILD).
• ITD is commonly referred to as the time difference of
arrival, while ILD is caused by the acoustic effects of the head,
torso, and outer ear.
• The generalized cross correlation (GCC) method is a
well-known approach for ITD estimation.
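As a rough illustration of GCC-based ITD estimation, the sketch below implements a PHAT-weighted variant in plain Python. The function names `dft` and `gcc_phat_delay`, the PHAT weighting, and the naive DFT (used only to avoid external dependencies) are assumptions made for illustration and are not taken from the presentation.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT, used only to keep this sketch dependency-free."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(left, right, max_lag):
    """Return the lag (in samples) by which `right` is delayed relative to `left`."""
    n = len(left)
    spec_left = dft(left)
    spec_right = dft(right)
    # Cross-power spectrum with PHAT weighting (magnitude normalised to 1).
    cross = []
    for a, b in zip(spec_left, spec_right):
        c = b * a.conjugate()
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)
    cc = [v.real for v in dft(cross, inverse=True)]
    # cc[k] holds circular lag k; negative lags wrap to the end of the list.
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = cc[lag % n]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag
```

Dividing the measured lag by the sampling rate gives the ITD in seconds. The correlation here is circular, so in practice the frames would typically be zero-padded first.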
• If it can be assumed that sources are spatially
stationary over a given interval of time, a simple
approach is to first integrate azimuth information
across frequency, then average across time and select
multiple peaks in the resulting azimuth-dependent
response function.
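A minimal sketch of this simple approach follows, assuming the per-unit azimuth evidence is already available as a nested list `scores[m][c][a]` (frame, channel, azimuth index). The names `azimuth_response` and `pick_peaks` and the local-maximum rule are illustrative choices, not part of the presented model.

```python
def azimuth_response(scores):
    """scores[m][c][a]: evidence for azimuth index a in frame m, channel c."""
    n_frames = len(scores)
    n_az = len(scores[0][0])
    response = [0.0] * n_az
    for frame in scores:
        for channel in frame:              # integrate across frequency
            for a, value in enumerate(channel):
                response[a] += value
    return [r / n_frames for r in response]  # average across time


def pick_peaks(response, n_sources):
    """Indices of the n_sources largest local maxima, strongest first."""
    peaks = []
    for a, value in enumerate(response):
        left = response[a - 1] if a > 0 else float("-inf")
        right = response[a + 1] if a + 1 < len(response) else float("-inf")
        if value > left and value >= right:
            peaks.append(a)
    peaks.sort(key=lambda a: response[a], reverse=True)
    return peaks[:n_sources]
```

The peak indices map back to azimuths through whatever azimuth grid was used to build `scores`.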
• There are two main approaches to target tracking that
utilize Bayesian inference.
– Multiple Hypothesis Tracking (MHT)
– Bayesian filtering
• The Bayesian tracker has a closed-form solution only
for a linear process with Gaussian noise, in which case it is
equivalent to the Kalman filter.
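To make the linear-Gaussian case concrete, here is a sketch of one predict/update cycle of a scalar Kalman filter. The random-walk state model and the name `kalman_step` are assumptions chosen for brevity; the presentation does not specify a state model.

```python
def kalman_step(mean, var, measurement, process_var, meas_var):
    """One predict/update cycle of a scalar Kalman filter (random-walk state)."""
    # Predict: the state is assumed to follow x_t = x_{t-1} + process noise.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend prediction and measurement using the Kalman gain.
    gain = pred_var / (pred_var + meas_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var
```

With equal prediction and measurement variances the gain is 0.5, so the updated mean lies halfway between the prediction and the measurement.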
• However, when restricting the size of the array to
only two sensors, as in the case of human audition,
the multisource tracking problem becomes more
challenging.
• Many approaches for tracking are based on full field
calculations which are computationally intensive and
sensitive to assumptions on the structure of the
environment.
• Alternative methods that use only select features of
the acoustic field for localization and environmental
parameter estimation have been proposed.
• One of the best methods proposed extracts the
arrival times and amplitudes of distinct paths from
measured acoustic time series using sequential
Bayesian filtering, namely particle filtering.
• Particle filters are popular for tracking with non-linear
and/or non-Gaussian models.
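The following is a sketch of a bootstrap particle filter tracking a single azimuth, given one noisy azimuth observation per frame. The random-walk motion model, the Gaussian observation likelihood, and all parameter values and names (`particle_filter_step`, `track`, `process_std`, `obs_std`) are assumptions for illustration; the presentation does not give the filter's details.

```python
import math
import random


def particle_filter_step(particles, observation, process_std, obs_std, rng):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: propagate each particle through a random-walk motion model.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:  # every particle is far from the observation
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted posterior mean, computed before resampling.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw particles with probability proportional to their weight.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(observations, n_particles=500, process_std=2.0, obs_std=5.0, seed=0):
    """Track an azimuth (degrees) from a sequence of azimuth observations."""
    rng = random.Random(seed)
    particles = [rng.uniform(-90.0, 90.0) for _ in range(n_particles)]
    estimates = []
    for obs in observations:
        particles, est = particle_filter_step(
            particles, obs, process_std, obs_std, rng)
        estimates.append(est)
    return estimates
```

Because the weights come from an arbitrary likelihood function, the same loop works unchanged if the Gaussian is replaced by a multi-modal or otherwise non-Gaussian observation model.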
Model Architecture
• The model has two main parts: the first localizes the
initial position of the target source, and the second
tracks the target.
• First, the algorithm trains the model on extracted
primary features such as ITD and ILD.
• The next step is to determine the position (azimuth) of the
target in order to initialize the particle filter.
• The model contains the following main pathways:
►Monaural pathway
►Binaural pathway
Monaural Pathway
• Onset synchrony is known to be a strong cue for
across-frequency grouping in auditory scene analysis and
has been shown to influence localization judgments
by human listeners.
• The proposed framework incorporates a monaural
pathway that uses an onset/offset analysis to group T-F
units dominated by the same underlying source.
• The grouping is used to constrain the integration of
binaural cues for azimuth estimation.
• Monaural pathway includes two parts:
• A. Onset/Offset-Based Segmentation
• B. Onset-Based Weights
Figure: Monaural Pathway block diagram, with blocks Onset/Offset-based Segmentation and Onset-based Weights and labels T-F segments, e^E_{c,m}, and w^E_{c,m}.
Onset/Offset-Based Segmentation
• The method first identifies onsets (increases in signal
energy) and offsets (decreases in signal energy)
across time within gammatone sub-bands.
• The set of T-F units between a pair of onset and offset
fronts forms a T-F segment.
• This segmentation system has been used to generate
T-F segments for the left and the right mixture
independently.
Onset-Based Weights
• In challenging acoustic environments, many T-F units
will be corrupted by diffuse noise or reverberation.
• First, the method extracts the signal envelope for each
frequency channel of the left and the right signal by
squaring each sub-band and passing it through a
first-order IIR filter.
• Finally, we compute
$$w^{E}_{c,m} = \left[ \frac{e^{E}_{c}[m] - e^{E}_{c}[m-1]}{e^{E}_{c}[m-1]} \right]^{+}$$
as the weight for unit $u^{E}_{c,m}$.
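The envelope extraction and onset weighting can be sketched as follows. Several details are assumptions not stated on the slides: the superscript "+" is read as the positive part, max(0, x); the IIR smoothing coefficient `alpha` is a free parameter; and the envelope is sampled once per frame at the last sample of each hop (`frame_envelope`). All function names are hypothetical.

```python
def envelope(subband, alpha):
    """Square the sub-band signal and smooth it with a first-order IIR low-pass."""
    env, state = [], 0.0
    for x in subband:
        state = (1.0 - alpha) * x * x + alpha * state
        env.append(state)
    return env


def frame_envelope(env, hop):
    """Sample the envelope once per frame (assumed: last sample of each hop)."""
    return [env[i] for i in range(hop - 1, len(env), hop)]


def onset_weights(frame_env, eps=1e-12):
    """w[m] = max(0, (e[m] - e[m-1]) / e[m-1]); the first frame gets weight 0."""
    weights = [0.0]
    for m in range(1, len(frame_env)):
        prev = frame_env[m - 1]
        rel = (frame_env[m] - prev) / max(prev, eps)
        weights.append(max(0.0, rel))
    return weights
```

Frames where the envelope rises sharply receive large weights, while steady or decaying frames (for example reverberant tails) receive a weight of zero.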
Figure: Binaural Pathway block diagram, with blocks Auditory Periphery and Binaural Feature Extraction (ITD & ILD), GMM model of ITD and ILD, and Model Training, and labels Pc(τ,λ|θ), Pc(τc,m,λc,m|θ), and noise.
Binaural Pathway
• Binaural pathway contains three stages:
• A. Auditory Periphery and Binaural Feature
Extraction
• B. Azimuth-Dependent Binaural Model
• C. Model Training
Binaural Feature Extraction
• The binaural pathway consists of a low-level feature
extraction stage where we calculate the ITD, denoted
τc,m, and ILD, denoted λc,m, for each T-F unit pair.
• A T-F unit is an elemental sound component from one
frame, indexed by m, and one filter channel, indexed
by c.
• We calculate ITD as the maximum peak in a running
cross-correlation between T-F units $u^{L}_{c,m}$ and $u^{R}_{c,m}$, where
we consider time lags between -1 and 1 ms.
• So ITD is:
$$C(c,m,\tau) = \frac{\sum_{n=0}^{T_n-1} u^{L}_{c,m}\!\left(m\frac{T_n}{2}-n\right) u^{R}_{c,m}\!\left(m\frac{T_n}{2}-n-\tau\right)}{\sqrt{\sum_{n=0}^{T_n-1} \left(u^{L}_{c,m}\!\left(m\frac{T_n}{2}-n\right)\right)^{2} \sum_{n=0}^{T_n-1} \left(u^{R}_{c,m}\!\left(m\frac{T_n}{2}-n-\tau\right)\right)^{2}}}$$
$$\tau_{c,m} = \underset{\tau \in L}{\arg\max}\; C(c,m,\tau)$$
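A sketch of the ITD computation for one T-F unit pair is given below. It follows the same lag convention as the cross-correlation above (left sample n is paired with right sample n - lag), but, as a simplifying assumption, sums only over the samples where both frames overlap instead of reading beyond the frame. The names `normalized_xcorr` and `estimate_itd` are hypothetical.

```python
import math


def normalized_xcorr(left, right, lag):
    """Normalised correlation of left[n] with right[n - lag] over the overlap."""
    num = left_energy = right_energy = 0.0
    for n in range(len(left)):
        k = n - lag
        if 0 <= k < len(right):
            num += left[n] * right[k]
            left_energy += left[n] ** 2
            right_energy += right[k] ** 2
    denom = math.sqrt(left_energy * right_energy)
    return num / denom if denom > 0.0 else 0.0


def estimate_itd(left, right, sample_rate, max_itd=1e-3):
    """Lag in seconds, within [-max_itd, max_itd], that maximises the correlation.

    A positive value means the left signal lags the right one.
    """
    max_lag = int(round(max_itd * sample_rate))
    best = max(range(-max_lag, max_lag + 1),
               key=lambda lag: normalized_xcorr(left, right, lag))
    return best / sample_rate
```

At a 16 kHz sampling rate the 1 ms limit corresponds to lags of -16 to 16 samples, so the ITD resolution is 62.5 µs unless the correlation peak is interpolated.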
• ILD corresponds to the energy ratio in dB between
$u^{L}_{c,m}$ and $u^{R}_{c,m}$:
$$\lambda_{c,m} = 10\log_{10}\left(\frac{\sum_{n=0}^{T_n-1}\left(u^{L}_{c,m}\!\left(m\frac{T_n}{2}-n\right)\right)^{2}}{\sum_{n=0}^{T_n-1}\left(u^{R}_{c,m}\!\left(m\frac{T_n}{2}-n\right)\right)^{2}}\right)$$
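The ILD of one T-F unit pair reduces to a few lines. In this sketch the name `ild_db` and the small constant `eps`, which guards against silent frames, are additions for illustration.

```python
import math


def ild_db(left, right, eps=1e-12):
    """Energy ratio (left over right) of one T-F unit pair, in dB."""
    left_energy = sum(x * x for x in left)
    right_energy = sum(x * x for x in right)
    return 10.0 * math.log10((left_energy + eps) / (right_energy + eps))
```

A left frame with twice the amplitude of the right frame has four times the energy, which gives an ILD of about 6 dB.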
Azimuth-Dependent Binaural Model
• The model independently captures the frequency-
dependent pattern of ITD and ILD values due to
direct-path propagation, which we refer to as direct-
path cues.
• The azimuth-dependent model of ITD and ILD is
denoted Pc(τ,λ|θ), which represents the
likelihood of observing a pair of ITD and ILD values
in frequency channel c given energy from a point
source at azimuth θ.
• Due to spatial aliasing, the probability space for
observed ITDs in higher frequency channels is
multi-modal. We therefore use a mixture of Gaussians to
capture it:
$$P_c(\tau \mid \theta, r) = \sum_{k=1}^{K_c} \rho^{\tau}_{c,k}(\theta, r)\, \mathcal{N}\!\left(\tau \mid \mu^{\tau}_{c,k}(\theta, r),\, \sigma^{\tau}_{c,k}(\theta, r)\right)$$
• The ILD likelihood is well described by a single
Gaussian:
$$P_c(\lambda \mid \theta, r) = \mathcal{N}\!\left(\lambda \mid \mu^{\lambda}_{c}(\theta, r),\, \sigma^{\lambda}_{c}(\theta, r)\right)$$
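Evaluating these likelihoods for one frequency channel and one azimuth can be sketched as below. Two assumptions are made that the slides do not confirm: the σ parameters are treated as standard deviations, and the joint ITD/ILD likelihood is taken as the product of the two terms. The function names and the parameter tuples are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def itd_likelihood(tau, weights, means, stds):
    """Mixture-of-Gaussians likelihood of an ITD for one channel and azimuth."""
    return sum(w * gaussian_pdf(tau, m, s)
               for w, m, s in zip(weights, means, stds))


def ild_likelihood(lam, mean, std):
    """Single-Gaussian likelihood of an ILD for one channel and azimuth."""
    return gaussian_pdf(lam, mean, std)


def joint_log_likelihood(tau, lam, itd_params, ild_params):
    """Log of the ITD/ILD likelihood, assuming the two terms multiply."""
    p = itd_likelihood(tau, *itd_params) * ild_likelihood(lam, *ild_params)
    return math.log(max(p, 1e-300))  # floor avoids log(0) for outliers
```

Here `itd_params` is a `(weights, means, stds)` tuple with one entry per mixture component, and `ild_params` is a `(mean, std)` pair.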
Model Training
• In this work, we generate training mixtures by
combining a point source with a simulated diffuse
noise, and in doing so, avoid capturing environment-
specific effects.
• Only the head-related transfer functions (HRTFs) of
the binaural setup are known.
• We simulate a point source by filtering monaural
signals using the HRTF for a given azimuth.
• The diffuse noise is created by passing uncorrelated
noise signals through each of the HRTFs for the
binaural setup and then adding them together.
• Given a set of training data for a specific azimuth,
the model measures τ and λ from each pair of mixture T-F
units and calculates p.
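The construction of a training mixture described above can be sketched as follows. The HRIRs are assumed to be supplied as a dictionary mapping azimuth to a `(left_ir, right_ir)` pair of equal-length impulse responses; that layout, the white Gaussian noise, the `gain` parameter, and all function names are assumptions for illustration.

```python
import random


def convolve(signal, kernel):
    """Direct-form linear convolution."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def point_source(mono, hrir_pair):
    """Binaural point source: filter a monaural signal with the left/right HRIR."""
    left_ir, right_ir = hrir_pair
    return convolve(mono, left_ir), convolve(mono, right_ir)


def diffuse_noise(hrirs, n_samples, seed=0):
    """Sum of uncorrelated noises, each filtered by a different azimuth's HRIR pair."""
    rng = random.Random(seed)
    ir_len = len(next(iter(hrirs.values()))[0])
    left = [0.0] * (n_samples + ir_len - 1)
    right = [0.0] * (n_samples + ir_len - 1)
    for hrir_pair in hrirs.values():
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        l, r = point_source(noise, hrir_pair)
        left = [a + b for a, b in zip(left, l)]
        right = [a + b for a, b in zip(right, r)]
    return left, right


def mix(source, noise, gain):
    """Add scaled diffuse noise to the binaural point source."""
    return tuple([s + gain * n for s, n in zip(src_ch, noise_ch)]
                 for src_ch, noise_ch in zip(source, noise))
```

Varying `gain` changes the relative strength of the point source and the diffuse noise in the training mixture.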
Localization Framework
• The binaural pathway extracts azimuth-dependent
information from each T-F unit pair while the
monaural pathway groups T-F units that are likely to
be dominated by the same source.
• The final stage of the proposed system then integrates
this information and produces a set of N azimuth
estimates.
• To perform localization, we first postulate a set of
N possible azimuths, where we assume N is known.
• For each simultaneous stream or T-F segment, we
find the most likely azimuth from the postulated set
and integrate likelihood scores over all streams and
segments.
• The process generates a total likelihood for each
postulated set of azimuths, and we choose the set that
maximizes this value.
• Formally, let $I^{E}$ be the total number of simultaneous
streams and T-F segments from ear signal E.
• Assuming conditional independence between T-F
units, the weighted log-likelihood for $s^{E}_{i}$ is then:
$$\beta^{E}_{i}(\theta) = \sum_{c,m \in s^{E}_{i}} w^{E}_{c,m} \ln\left(P_c(\tau_{c,m}, \lambda_{c,m} \mid \theta)\right)$$
• We search for the most likely set of N azimuths using:
$$\hat{\Theta} = \arg\max \left( \sum_{i=1}^{I^{L}} \beta^{L}_{i}\!\left(\theta_{\hat{y}^{L}_{i}}\right) + \sum_{j=1}^{I^{R}} \beta^{R}_{j}\!\left(\theta_{\hat{y}^{R}_{j}}\right) \right)$$
Simulation Setup
• To test this model, the Roomsim simulator has been
used in this study.
• Roomsim simulates the acoustics of a simple
rectangular prism room and is written in
the MATLAB m-code programming language.
• The image method of simulating room acoustics is
often used to provide a means of generating signals
incorporating “sufficiently realistic” reverberation
and directional cues for the testing of audio/speech
processing algorithms.
• The foundation on which RoomSim is built is the
publication of a Fortran routine by Allen and Berkley
in 1979.
The RoomSim Program
• The program simulates the geometrical acoustics of
a perfect rectangular parallelepiped enclosure using
the image-source model to produce an impulse
response from each omni-directional primary
source to a directional receiver system.
• This software combines the image method for
reverberation with HRTF measurements made using a
KEMAR dummy head.
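For readers unfamiliar with the image-source model, the sketch below builds the impulse response of a rectangular room for an omni-directional source and a single omni-directional receiver. It is not Roomsim's code: it assumes one frequency-independent reflection coefficient `beta` for all six walls, nearest-sample delays, and no HRTF or air absorption, and the function name and parameters are hypothetical.

```python
import math
from itertools import product


def image_source_rir(room, source, receiver, beta, max_order,
                     sample_rate, length, c=343.0):
    """Impulse response of a shoebox room via the image-source method."""
    rir = [0.0] * length
    per_axis = []
    for size, src in zip(room, source):
        images = []
        for n in range(-max_order, max_order + 1):
            for q in (0, 1):
                # Mirror the source across the two walls of this axis.
                position = (1 - 2 * q) * src + 2 * n * size
                reflections = abs(n - q) + abs(n)
                images.append((position, reflections))
        per_axis.append(images)
    for (x, rx), (y, ry), (z, rz) in product(*per_axis):
        order = rx + ry + rz
        if order > max_order:
            continue
        dist = math.sqrt((x - receiver[0]) ** 2 +
                         (y - receiver[1]) ** 2 +
                         (z - receiver[2]) ** 2)
        delay = int(round(dist / c * sample_rate))
        if delay < length:
            # Each reflection attenuates by beta; amplitude decays as 1/(4*pi*d).
            rir[delay] += beta ** order / (4.0 * math.pi * max(dist, 1e-3))
    return rir
```

To obtain binaural responses in the way the slides describe, each image contribution would be convolved with the HRIR for its arrival direction before being accumulated.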
• The simulation of a head utilizes Head Related
Transfer Function (HRTF) data, actually Head
Related Impulse Response (HRIR) data, provided
from measurements made on a Kemar dummy
head at the Center for Image Processing and
Integrated Computing (CIPIC), University of
California.
RoomSim Operation
• In operation, the user specifies the dimensions of the
room, its surface materials, the type, location, and
orientation of the receiver system, and the location of
the primary source(s).
• This can be done by submitting a Microsoft
Excel spreadsheet form or a text file, or by selecting a
MATLAB *.mat file in which a configuration
from a previous run was saved.
• If a simulated head has been selected the response
from each quantized image-source direction is
convolved with the relevant HRIR data.
• The individual image-source responses are then
accumulated to form the complete pressure impulse
response from each primary source to the receiver,
and the result is plotted and saved to file.
Figure 1. RoomSim
setup, importing user
parameters like room
dimensions, humidity,
temperature ...
Figure 2. Room
dimensions and
source(s) and
receivers
positions.
Figure 3. spectrum
of left and right ears
• As mentioned, the simulator saves the user parameters as
MATLAB files, so the configuration can be reused by other
MATLAB functions in any application.
• In this study, we save the Roomsim configuration in a
MATLAB file and use it in the tracking algorithm.
Results
• To evaluate the model, we use a set of binaural impulse
responses (BIRs).
• The simulated BIRs, which we refer to as the Kemar
set, are generated using the Roomsim package.
• We create a library of BIRs by generating room
configurations in which room size, array position, and
array orientation are varied.
• BIRs are generated for azimuths between -90°
and 90°, spaced by 5°, at distances of 4, 4, and 3 m.
• To train the binaural models, we generate
anechoic BIRs for the same azimuths using the HRTF
measurements directly.
• Speech sources are selected from the CIPIC database, by
a selected Kemar BIR.
• The model has been tested in an anechoic room with
additive noise such as babble noise, restaurant noise, …
Figure 6. without
noise and
reverberation.
Figure 7. different
noise.
Figure 8. with
reverberation.
Figure . with
reverberation and
noise.
• Table 1. Mean results of three experiments.
• T60 = 600 ms

Noise                     Accuracy (%)    Reverberation                       Accuracy (%)
Clean                     99.2            Clean                               99.2
Babble Noise (5 dB)       98.5            T60 = 600 ms                        98
Babble Noise (15 dB)      97.9            T60 = 600 ms + Babble Noise         97.4
Restaurant Noise (5 dB)   97.8            T60 = 600 ms + Restaurant Noise     97.0
Car Noise (5 dB)          98.9            T60 = 600 ms + Car Noise            97.6
Conclusion
• The model does not give good results when tracking more
than one source.
• Compared with the previous tracking
model (Roman and Wang 2008), the proposed model shows better results
under the same noisy conditions.
• Model performance in reverberant environments is not
as good as expected.
Future works
• In this work, we assumed prior knowledge of the
number of sources, and thus a key problem for future
work is estimating the number of sources.
• Using joint visual and auditory information will lead
to better results.
ramrag33
 
An Introduction to the Compiler Designss
An Introduction to the Compiler DesignssAn Introduction to the Compiler Designss
An Introduction to the Compiler Designss
ElakkiaU
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
IJECEIAES
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
Mahmoud Morsy
 
Data Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason WebinarData Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason Webinar
UReason
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
Anant Corporation
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
Atif Razi
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
MDSABBIROJJAMANPAYEL
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
IJECEIAES
 
BRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdfBRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdf
LAXMAREDDY22
 
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
171ticu
 
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.pptUnit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
KrishnaveniKrishnara1
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
Madan Karki
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
Nada Hikmah
 
People as resource Grade IX.pdf minimala
People as resource Grade IX.pdf minimalaPeople as resource Grade IX.pdf minimala
People as resource Grade IX.pdf minimala
riddhimaagrawal986
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
GauravCar
 
cnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classicationcnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classication
SakkaravarthiShanmug
 

Recently uploaded (20)

原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
 
Data Control Language.pptx Data Control Language.pptx
Data Control Language.pptx Data Control Language.pptxData Control Language.pptx Data Control Language.pptx
Data Control Language.pptx Data Control Language.pptx
 
An Introduction to the Compiler Designss
An Introduction to the Compiler DesignssAn Introduction to the Compiler Designss
An Introduction to the Compiler Designss
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
 
Data Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason WebinarData Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason Webinar
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
 
BRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdfBRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdf
 
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
 
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.pptUnit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
 
People as resource Grade IX.pdf minimala
People as resource Grade IX.pdf minimalaPeople as resource Grade IX.pdf minimala
People as resource Grade IX.pdf minimala
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 
cnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classicationcnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classication
 

M.sc. presentation t.bagheri fashkhami

  • 1. A Model of Moving Source Localization Based on Binaural Hearing. Faculty of Electrical and Computer Engineering, Department of Communication. Dr. Masoud Geravanchizadeh, Taher Bagheri. University of Tabriz, Summer 92
  • 2. Contents • Introduction • Model Architecture • Simulation Setup • Results
  • 3. Introduction • Localization of multiple sound sources from a binaural input is a challenging problem that has applications in hearing prostheses, spatial sound reproduction, and mobile robotics. • Binaural localization has received significant attention in the field of computational auditory scene analysis (CASA), which is guided by principles in the perceptual organization of sound by human listeners.
  • 4. • Two principal localization cues are interaural time difference (ITD) and interaural level difference (ILD). • ITD is commonly referred to as the time difference of arrival, and ILD is due to the effects of the head, torso, and outer ear. • The generalized cross correlation (GCC) method is a well-known approach for ITD estimation.
  • 5. • If it can be assumed that sources are spatially stationary over a given interval of time, a simple approach is to first integrate azimuth information across frequency, then average across time and select multiple peaks in the resulting azimuth-dependent response function.
  • 6. • There are two main approaches to target tracking that utilize Bayesian inference. – Multiple Hypothesis Tracking (MHT) – Bayesian filtering • The Bayesian tracker has a closed-form solution only for a linear process with Gaussian noise, in which case it is equivalent to the Kalman filter.
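The closed-form linear-Gaussian case can be sketched in a few lines. This is only an illustration: the scalar random-walk state model, the noise variances `q` and `r`, the sample measurements, and the function name `kalman_step` are assumptions and do not come from the slides.

```python
def kalman_step(mean, var, z, q, r):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict: assumed random-walk state, x_t = x_{t-1} + noise with variance q.
    mean_pred = mean
    var_pred = var + q
    # Update: assumed measurement model, z_t = x_t + noise with variance r.
    gain = var_pred / (var_pred + r)
    mean_new = mean_pred + gain * (z - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new


# Hypothetical azimuth measurements in degrees, starting from a vague prior.
mean, var = 0.0, 100.0
for z in [10.0, 12.0, 11.0, 13.0]:
    mean, var = kalman_step(mean, var, z, q=1.0, r=4.0)
```

Because every distribution stays Gaussian, the posterior is fully described by `mean` and `var`; that property is lost once the motion or measurement model becomes non-linear or non-Gaussian.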
  • 7. • However, when restricting the size of the array to only two sensors, as in the case of human audition, the multisource tracking problem becomes more challenging. • Many approaches for tracking are based on full-field calculations, which are computationally intensive and sensitive to assumptions on the structure of the environment.
  • 8. • Alternative methods that use only select features of the acoustic field for localization and environmental parameter estimation have been proposed. • One of the best methods proposed extracts arrival times and amplitudes of distinct paths from measured acoustic time series using sequential Bayesian filtering, namely particle filtering. • Particle filters are popular for tracking with non-linear and/or non-Gaussian models.
  • 9. Model Architecture • The model has two main parts: the first localizes the initial position of the target source, and the second tracks the target. • First, the algorithm trains the model on primary extracted features such as ITD and ILD. • The next step determines the position (azimuth) of the target, which is used to initialize the particle filter.
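The localize-then-track idea can be sketched with a bootstrap particle filter seeded at an initial azimuth. This is a minimal sketch, not the presented implementation: the random-walk motion model, the Gaussian observation likelihood, all numeric values, and the name `particle_filter_step` are assumptions.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std, obs_std, rng):
    """One predict/weight/resample cycle of a bootstrap particle filter."""
    # Predict: assumed random-walk motion of the source azimuth (degrees).
    moved = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: assumed Gaussian likelihood of the observed azimuth.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in moved]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # degenerate case: fall back to uniform weights
    # Resample with replacement in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


rng = random.Random(0)
initial_azimuth = 20.0  # stands in for the azimuth from the localization stage
particles = [rng.gauss(initial_azimuth, 5.0) for _ in range(500)]
true_path = [20.0 + 2.0 * t for t in range(20)]  # hypothetical moving source
for azimuth in true_path:
    observation = azimuth + rng.gauss(0.0, 3.0)  # noisy per-frame azimuth estimate
    particles = particle_filter_step(particles, observation, 3.0, 3.0, rng)
estimate = sum(particles) / len(particles)  # posterior-mean azimuth
```

Initializing the particles around the localized azimuth, rather than uniformly over all directions, concentrates them where the source is expected and lets a modest particle count suffice.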
  • 10. • The model contains three main parts: ► Monaural pathway ► Binaural pathway ► Localization framework
  • 11. Monaural Pathway • Onset synchrony is known to be a strong cue for across-frequency grouping in auditory scene analysis and has been shown to influence localization judgments by human listeners. • The proposed framework incorporates a monaural pathway that uses an onset/offset analysis to group time-frequency (T-F) units dominated by the same underlying source.
  • 12. • The grouping is used to constrain the integration of binaural cues for azimuth estimation. • The monaural pathway includes two parts: • A. Onset/Offset-Based Segmentation • B. Onset-Based Weights
  • 13. • Monaural Pathway (block diagram): Onset/Offset-based Segmentation, T-F segments; Onset-based Weights, $w^E_{c,m}$, $e^E_{c,m}$
  • 14. Onset/Offset-Based Segmentation • The method first identifies onsets (increases in signal energy) and offsets (decreases in signal energy) across time within gammatone sub-bands. • The set of T-F units between a pair of onset and offset fronts forms a T-F segment. • This segmentation system has been used to generate T-F segments for the left and the right mixture independently.
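A heavily simplified, single-band version of this segmentation can be sketched as follows. It is an illustration only: it thresholds frame-to-frame energy changes in one sub-band, whereas the slides describe onset and offset fronts that span several gammatone channels. The threshold value, the sample energies, and the name `find_segments` are assumptions.

```python
def find_segments(energy, threshold):
    """Return (onset, offset) frame-index pairs for one sub-band."""
    segments = []
    onset = None
    for m in range(1, len(energy)):
        change = energy[m] - energy[m - 1]
        if onset is None and change > threshold:
            onset = m  # energy rises sharply: a segment starts here
        elif onset is not None and change < -threshold:
            segments.append((onset, m))  # energy falls sharply: the segment ends
            onset = None
    return segments


# Hypothetical per-frame energies containing two bursts.
energy = [0.1, 0.1, 2.0, 2.1, 2.0, 0.2, 0.1, 1.5, 1.6, 0.1]
segments = find_segments(energy, threshold=1.0)
```

Running the same routine separately on the left-ear and right-ear energies mirrors the independent per-ear segmentation described above.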
  • 15. Onset-Based Weights • In challenging acoustic environments, many T-F units will be corrupted by diffuse noise or reverberation. • The method first extracts the signal envelope for each frequency channel of the left and right signals by squaring each sub-band and passing it through a first-order IIR filter.
  • 16. • Finally we compute: $$w^E_{c,m} = \left[\frac{e^E_c[m] - e^E_c[m-1]}{e^E_c[m-1]}\right]^{+}$$ as the weight for unit $u^E_{c,m}$.
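The envelope extraction and weight computation can be sketched as below. The sketch assumes the weight is the positive part of the relative envelope increase between consecutive frames, that `frame_env` holds one envelope value per frame, and that the IIR filter is a one-pole low-pass with coefficient `alpha`; these details and all names are illustrative.

```python
def envelope(subband, alpha):
    """Square the sub-band signal and smooth it with a first-order IIR low-pass."""
    env = []
    prev = 0.0
    for x in subband:
        prev = alpha * prev + (1.0 - alpha) * x * x
        env.append(prev)
    return env


def onset_weights(frame_env):
    """w[m] = max((e[m] - e[m-1]) / e[m-1], 0); the first frame gets weight 0."""
    weights = [0.0]
    for m in range(1, len(frame_env)):
        prev = frame_env[m - 1]
        rel = (frame_env[m] - prev) / prev if prev > 0.0 else 0.0
        weights.append(max(rel, 0.0))
    return weights


# Hypothetical frame-level envelope for one channel of one ear.
frame_env = [1.0, 3.0, 3.0, 1.5, 2.0]
weights = onset_weights(frame_env)
```

Frames where the envelope rises get a positive weight, while steady or decaying frames, which are more likely to be dominated by reverberation, get zero.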
  • 17. • Binaural Pathway (block diagram): Auditory Periphery and Binaural Feature Extraction, ITD & ILD; GMM model of ITD and ILD, $P_c(\tau,\lambda|\theta)$; Model Training, noise; $P_c(\tau_{c,m},\lambda_{c,m}|\theta)$
  • 18. Binaural Pathway • The binaural pathway contains three stages: • A. Auditory Periphery and Binaural Feature Extraction • B. Azimuth-Dependent Binaural Model • C. Model Training
  • 19. Binaural Feature Extraction • The binaural pathway consists of a low-level feature extraction stage where we calculate the ITD, denoted $\tau_{c,m}$, and ILD, denoted $\lambda_{c,m}$, for each T-F unit pair. • A T-F unit is an elemental sound component from one frame, indexed by m, and one filter channel, indexed by c.
  • 20. • We calculate ITD as the maximum peak in a running cross-correlation between T-F units $u^L_{c,m}$ and $u^R_{c,m}$, where we consider time lags between -1 and 1 ms. • So ITD is: $$C(c,m,\tau) = \frac{\sum_{n=0}^{T_n-1} u^L_{c,m}\left(m\frac{T_n}{2}-n\right)\, u^R_{c,m}\left(m\frac{T_n}{2}-n-\tau\right)}{\sqrt{\sum_{n=0}^{T_n-1}\left(u^L_{c,m}\left(m\frac{T_n}{2}-n\right)\right)^2}\;\sqrt{\sum_{n=0}^{T_n-1}\left(u^R_{c,m}\left(m\frac{T_n}{2}-n-\tau\right)\right)^2}}$$ $$\tau_{c,m} = \operatorname*{argmax}_{\tau \in L} C(c,m,\tau)$$
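The ITD search over lags of ±1 ms can be sketched as follows. The sketch works on two equal-length frames rather than a running correlation, uses integer-sample lags, and adopts the sign convention that a positive ITD means the left signal lags the right; the sample rate, test signal, and function names are assumptions.

```python
import math
import random


def normalized_xcorr(left, right, lag):
    """Normalized cross-correlation of two equal-length frames at an integer lag."""
    num = energy_l = energy_r = 0.0
    for n in range(len(left)):
        k = n - lag  # pair left[n] with right[n - lag]
        if 0 <= k < len(right):
            num += left[n] * right[k]
            energy_l += left[n] ** 2
            energy_r += right[k] ** 2
    den = math.sqrt(energy_l * energy_r)
    return num / den if den > 0.0 else 0.0


def estimate_itd(left, right, fs, max_itd=1e-3):
    """Return the lag (in seconds) within +/- max_itd that maximizes the correlation."""
    max_lag = int(round(max_itd * fs))
    lags = range(-max_lag, max_lag + 1)
    best = max(lags, key=lambda lag: normalized_xcorr(left, right, lag))
    return best / fs


# Hypothetical noise burst that reaches the right ear 5 samples before the left.
rng = random.Random(1)
source = [rng.uniform(-1.0, 1.0) for _ in range(400)]
fs = 16000
left = source[:320]
right = source[5:325]  # right[j] == left[j + 5]
itd = estimate_itd(left, right, fs)
```

Restricting the search to ±1 ms keeps the estimate within the physically plausible range for a human-sized head and limits the number of lags to evaluate.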
  • 21. • ILD corresponds to the energy ratio in dB between $u^L_{c,m}$ and $u^R_{c,m}$: $$\lambda_{c,m} = 10\log_{10}\left(\frac{\sum_{n=0}^{T_n-1}\left(u^L_{c,m}\left(m\frac{T_n}{2}-n\right)\right)^2}{\sum_{n=0}^{T_n-1}\left(u^R_{c,m}\left(m\frac{T_n}{2}-n\right)\right)^2}\right)$$
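The ILD computation reduces to a ratio of frame energies. A minimal sketch follows; the small constant `eps`, added here to avoid division by zero on silent frames, and the name `ild_db` are assumptions not present in the slides.

```python
import math


def ild_db(left, right, eps=1e-12):
    """Energy ratio between left and right T-F units, in dB."""
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    return 10.0 * math.log10((energy_l + eps) / (energy_r + eps))


# Hypothetical frames: the left signal has twice the amplitude of the right.
left_frame = [2.0, 2.0]
right_frame = [1.0, 1.0]
ild = ild_db(left_frame, right_frame)
```

Doubling the amplitude quadruples the energy, so this example yields an ILD of about 6 dB, positive because the left ear is louder.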
  • 22. Azimuth-Dependent Binaural Model • The model independently captures the frequency-dependent pattern of ITD and ILD values due to direct-path propagation, which we refer to as direct-path cues. • The azimuth-dependent model of ITD and ILD is denoted $P_c(\tau,\lambda|\theta)$, which represents the likelihood of observing a pair of ITD and ILD values in frequency channel c, given energy from a point source at azimuth θ.
  • 23. • Due to spatial aliasing, the probability space for observed ITDs in higher frequency channels is multi-modal. We therefore use a mixture of Gaussians to capture: $$P_c(\tau \mid \theta, r) = \sum_{k=1}^{K_c} \rho^{\tau}_{c,k}(\theta, r)\, \mathcal{N}\left(\tau \mid \mu^{\tau}_{c,k}(\theta, r), \sigma^{\tau}_{c,k}(\theta, r)\right)$$ • The ILD likelihood is well described by a single Gaussian: $$P_c(\lambda \mid \theta, r) = \mathcal{N}\left(\lambda \mid \mu^{\lambda}_{c}(\theta, r), \sigma^{\lambda}_{c}(\theta, r)\right)$$
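Evaluating such a model for one T-F unit can be sketched as below. The sketch assumes that σ denotes a standard deviation, that the ITD and ILD terms are multiplied to form the joint likelihood, and that the parameters for a given channel and azimuth are already available; all parameter values and names are illustrative.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def itd_likelihood(tau, components):
    """Gaussian mixture over ITD; components are (weight, mean, std) triples."""
    return sum(w * gaussian_pdf(tau, mu, sd) for w, mu, sd in components)


def joint_log_likelihood(tau, lam, itd_components, ild_mean, ild_std):
    """ln P(tau, lambda | theta), assuming ITD and ILD are modeled independently."""
    p = itd_likelihood(tau, itd_components) * gaussian_pdf(lam, ild_mean, ild_std)
    return math.log(max(p, 1e-300))  # floor avoids log(0)


# Hypothetical parameters for one channel and one azimuth (ITD in ms, ILD in dB).
itd_components = [(0.7, 0.30, 0.05), (0.3, -0.70, 0.08)]  # second mode mimics aliasing
score = joint_log_likelihood(0.31, 4.0, itd_components, ild_mean=4.5, ild_std=1.5)
```

The second mixture component illustrates why a single Gaussian is not enough for ITD at high frequencies: an aliased peak roughly one period away can also receive probability mass.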
  • 24. Model Training • In this work, we generate training mixtures by combining a point source with a simulated diffuse noise, and in doing so, avoid capturing environment-specific effects. • Only the head-related transfer functions (HRTFs) of the binaural setup are known. • We simulate a point source by filtering monaural signals using the HRTF for a given azimuth.
  • 25. • The diffuse noise is created by passing uncorrelated noise signals through each of the HRTFs for the binaural setup and then adding them together. • Given a set of training data for a specific azimuth, the model measures τ and λ from each pair of mixture T-F units and calculates p.
  • 26. Localization Framework • The binaural pathway extracts azimuth-dependent information from each T-F unit pair while the monaural pathway groups T-F units that are likely to be dominated by the same source. • The final stage of the proposed system then integrates this information and produces a set of N azimuth estimates.
  • 27. • To perform localization, we first postulate a set of N possible azimuths, where we assume N is known. • For each simultaneous stream or T-F segment, we find the most likely azimuth from the postulated set and integrate likelihood scores over all streams and segments.
  • 28. • The process generates a total likelihood for each postulated set of azimuths, and we choose the set that maximizes this value. • Formally, let $I_E$ be the total number of simultaneous streams and T-F segments from ear signal E.
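  • 29. • Assuming conditional independence between T-F units, the weighted log-likelihood for $s^E_i$ is then: $$\beta^E_i(\theta) = \sum_{c,m \in s^E_i} w^E_{c,m} \ln\left(P_c(\tau_{c,m}, \lambda_{c,m} \mid \theta)\right)$$ • We search for the most likely set of N azimuths using: $$\hat{\Theta} = \operatorname*{argmax}_{\Theta}\left(\sum_{i=1}^{I_L} \beta^L_i\left(\theta_{\hat{y}^L_i}\right) + \sum_{j=1}^{I_R} \beta^R_j\left(\theta_{\hat{y}^R_j}\right)\right)$$ where $\hat{y}^E_i$ indexes the most likely azimuth in the postulated set for $s^E_i$.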
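The search over postulated azimuth sets can be sketched as an exhaustive loop. The sketch assumes the per-segment weighted log-likelihoods have already been computed for every candidate azimuth and pools left-ear and right-ear segments into one list; the scores, the candidate grid, and the name `best_azimuth_set` are illustrative.

```python
from itertools import combinations


def best_azimuth_set(segment_scores, candidates, n_sources):
    """Pick the set of n_sources azimuths with the highest total log-likelihood.

    segment_scores: one dict per segment (both ears pooled), mapping a candidate
    azimuth to that segment's weighted log-likelihood.
    """
    def total(azimuth_set):
        # Each segment is assigned to its most likely azimuth within the set.
        return sum(max(scores[az] for az in azimuth_set) for scores in segment_scores)

    return max(combinations(candidates, n_sources), key=total)


# Hypothetical scores for three segments over a coarse azimuth grid (degrees).
candidates = [-30, 0, 30]
segment_scores = [
    {-30: -1.0, 0: -5.0, 30: -9.0},
    {-30: -8.0, 0: -6.0, 30: -2.0},
    {-30: -1.5, 0: -4.0, 30: -7.0},
]
estimate = best_azimuth_set(segment_scores, candidates, n_sources=2)
```

Exhaustive enumeration grows combinatorially with the number of candidates and sources, so it is practical only for small N; it is used here purely to make the objective explicit.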
  • 30. Simulation Setup • The Roomsim simulator was used to test the model in this study. • Roomsim simulates the acoustics of a simple rectangular-prism room and is written in MATLAB m-code.
  • 31. • The image method of simulating room acoustics is often used to generate signals incorporating "sufficiently realistic" reverberation and directional cues for testing audio/speech processing algorithms. • The foundation on which Roomsim is built is a Fortran routine published by Allen and Berkley in 1979.
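The core idea of the image method can be sketched by mirroring the source across each wall of a rectangular room. This sketch covers first-order reflections only and ignores wall absorption, both simplifications; the room size, positions, speed of sound, and function names are assumptions, and this is not Roomsim's code.

```python
import math


def first_order_images(source, room):
    """Mirror the source across each of the six walls of a box-shaped room.

    source: (x, y, z) position; room: (Lx, Ly, Lz) with one corner at the origin.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # reflect across the wall plane
            images.append(tuple(image))
    return images


def arrival_delay(position, receiver, speed_of_sound=343.0):
    """Propagation delay in seconds from a (real or image) source to the receiver."""
    return math.dist(position, receiver) / speed_of_sound


# Hypothetical 4 m x 4 m x 3 m room.
room = (4.0, 4.0, 3.0)
source = (1.0, 2.0, 1.5)
receiver = (3.0, 2.0, 1.5)
images = first_order_images(source, room)
direct = arrival_delay(source, receiver)
reflections = sorted(arrival_delay(img, receiver) for img in images)
```

Each image source contributes a delayed copy of the signal, and higher-order reflections follow by mirroring the images again; summing the contributions yields the room impulse response.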
  • 32. The Roomsim Program • The program simulates the geometrical acoustics of a perfect rectangular parallelepiped enclosure using the image-source model to produce an impulse response from each omni-directional primary source to a directional receiver system. • This software combines the image method for reverberation with HRTF measurements made using a KEMAR dummy head.
  • 33. • The simulation of a head utilizes Head Related Transfer Function (HRTF) data, actually Head Related Impulse Response (HRIR) data, obtained from measurements made on a KEMAR dummy head at the Center for Image Processing and Integrated Computing (CIPIC), University of California.
  • 34. Roomsim Operation • In operation, the user specifies the dimensions of the room, its surface materials, the type, location, and orientation of the receiver system, and the location of the primary source(s). • This can be done by submitting a Microsoft Excel spreadsheet form or a text file, or by selecting a MATLAB *.mat file containing a configuration saved from a previous run.
  • 35. • If a simulated head has been selected, the response from each quantized image-source direction is convolved with the relevant HRIR data. • The individual image-source responses are then accumulated to form the complete pressure impulse response from each primary source to the receiver, and the result is plotted and saved to file.
  • 36. Figure 1. Roomsim setup, importing user parameters like room dimensions, humidity, temperature ...
  • 37. Figure 2. Room dimensions and source and receiver positions.
  • 38. Figure 3. Spectrum at the left and right ears.
  • 39. • As mentioned, the simulator saves user parameters as MATLAB files so that the configuration can be reused by other MATLAB functions. • In this study, we save the Roomsim configuration to a MATLAB file and use it in the tracking algorithm.
  • 40. Results • To evaluate the model, we use a set of binaural impulse responses (BIRs). • The simulated BIRs, which we refer to as the Kemar set, are generated using the Roomsim package. • We create a library of BIRs by generating room configurations in which room size, array position, and array orientation are set to different values.
  • 41. • BIRs are generated for azimuths between -90° and 90°, spaced by 5°, at distances of 4, 4, and 3 m. • In order to train binaural models, we generate anechoic BIRs for the same azimuths using the HRTF measurements directly. • Speech sources are selected from the CIPIC database, by a selected Kemar BIR. • The model has been tested in an anechoic room with additive noise such as babble noise, restaurant noise, …
  • 42. Figure 6. Without noise and reverberation.
  • 43. Figure 7. Different noise.
  • 45. Figure. With reverberation and noise.
  • 46. • Table 1. Mean results over three experiments (T60 = 600 ms).
      Noise                  | Accuracy (%) | Reverberation                   | Accuracy (%)
      Clean                  | 99.2         | Clean                           | 99.2
      Babble Noise (5 dB)    | 98.5         | T60 = 600 ms                    | 98
      Babble Noise (15 dB)   | 97.9         | T60 = 600 ms + Babble Noise     | 97.4
      Restaurant Noise (5 dB)| 97.8         | T60 = 600 ms + Restaurant Noise | 97.0
      Car Noise (5 dB)       | 98.9         | T60 = 600 ms + Car Noise        | 97.6
  • 47. Conclusion • The model does not give good results when tracking more than one source. • Compared with the previous tracking model (Roman and Wang 2008), the proposed model gives better results under the same noise conditions. • Model performance in reverberant environments is not as good as expected.
  • 48. Future work • In this work, we assumed prior knowledge of the number of sources, and thus a key problem for future work is estimating the number of sources. • Using joint visual and auditory information will lead to better results.