Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Presentation WASPAA - A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization
1. A POLYNOMIAL EIGENVALUE DECOMPOSITION MUSIC APPROACH FOR
BROADBAND SOUND SOURCE LOCALIZATION
October, 2021
Aidan Hogg, Vincent Neo, Stephan Weiss, Christine Evers and Patrick Naylor
Electrical and Electronic Engineering, Imperial College London, UK
Electrical and Electronic Engineering, University of Strathclyde, Glasgow, UK
Electronics and Computer Science, University of Southampton, UK
2. MOTIVATION
SOUND SOURCE LOCALIZATION
Sound source localization is an important task for a multitude of
applications, including robot audition and voice-controlled smart
devices.
Direction of arrival estimates are essential in providing angular
positional information for localization.
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 1/26
4. SOURCE MODEL
MUSIC ALGORITHM
Consider the scenario with a microphone array and a far-field sound
source. The sound source signals arrive with delays expressed in a
steering vector .
Data model:
x(𝑛) =
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 3/26
5. SOURCE MODEL
MUSIC ALGORITHM
Consider the scenario with a microphone array and a far-field sound
source. The sound source signals arrive with delays expressed in a
steering vector a0.
Data model:
x(𝑛) = 𝑠0(𝑛) ⋅ a0
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 3/26
6. SOURCE MODEL
MUSIC ALGORITHM
Consider the scenario with a microphone array and a far-field sound
source. The sound source signals arrive with delays expressed in a
steering vector a0.
Data model:
x(𝑛) = 𝑠0(𝑛) ⋅ a0
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 3/26
7. SOURCE MODEL
MUSIC ALGORITHM
Consider the scenario with a microphone array and a far-field sound
source. The sound source signals arrive with delays expressed in a
steering vector a0, a1.
Data model:
x(𝑛) = 𝑠0(𝑛) ⋅ a0 + 𝑠1(𝑛) ⋅ a1
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 3/26
8. SOURCE MODEL
MUSIC ALGORITHM
Consider the scenario with a microphone array and a far-field sound
source. The sound source signals arrive with delays expressed in a
steering vector a0, a1.
Data model:
x(𝑛) = 𝑠0(𝑛) ⋅ a0 + 𝑠1(𝑛) ⋅ a1
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 3/26
9. SOURCE MODEL
MUSIC ALGORITHM
Consider the scenario with a microphone array and a far-field sound
source. The sound source signals arrive with delays expressed in a
steering vector a0, a1, ⋯ , a𝑅.
Data model:
x(𝑛) = 𝑠0(𝑛) ⋅ a0 + 𝑠1(𝑛) ⋅ a1 + ⋯ + 𝑠𝑅(𝑛) ⋅ a𝑅
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 3/26
10. SOURCE MODEL
MUSIC ALGORITHM
Consider the scenario with a microphone array and a far-field sound
source. The sound source signals arrive with delays expressed in a
steering vector a0, a1, ⋯ , a𝑅.
Data model:
x(𝑛) =
𝑅
∑
𝑟=0
𝑠𝑟(𝑛) ⋅ a𝑟
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 3/26
11. STEERING VECTOR
MUSIC ALGORITHM
A signal 𝑠(𝑛) arriving at the array can be characterised by the delays
of its wavefront (neglecting attenuation):
⎡
⎢
⎢
⎢
⎣
𝑥1(𝑛)
𝑥2(𝑛)
⋮
𝑥𝑀 (𝑛)
⎤
⎥
⎥
⎥
⎦
=
⎡
⎢
⎢
⎢
⎣
𝑠(𝑛 − 𝜏1)
𝑠(𝑛 − 𝜏2)
⋮
𝑠(𝑛 − 𝜏𝑀 )
⎤
⎥
⎥
⎥
⎦
=
⎡
⎢
⎢
⎢
⎣
𝛿(𝑛 − 𝜏1)
𝛿(𝑛 − 𝜏2)
⋮
𝛿(𝑛 − 𝜏𝑀 )
⎤
⎥
⎥
⎥
⎦
∗ 𝑠(𝑛)
Z
−
→ a𝜃(𝑧)𝑆(𝑧) (1)
If evaluated at a narrowband normalised angular frequency Ω𝑖, the
time delays 𝜏𝑚 in the broadband steering vector a𝜃(𝑧) collapse to
phase shifts in the narrowband steering vector a𝜃,Ω𝑖
(𝑧),
a𝜃,Ω𝑖
(𝑧) = a𝜃(𝑧)|𝑧=𝑒−𝑗Ω𝑖 =
⎡
⎢
⎢
⎢
⎣
𝑒−𝑗𝜏1Ω𝑖
𝑒−𝑗𝜏2Ω𝑖
⋮
𝑒−𝑗𝜏𝑀Ω𝑖
⎤
⎥
⎥
⎥
⎦
. (2)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 4/26
12. SPACE-TIME COVARIANCE MATRIX
MUSIC ALGORITHM
If the noisy and reverberant signal, 𝑥𝑚(𝑛), at the 𝑚-th microphone
for discrete-time sample 𝑛 = 0, 1, … , 𝑁, is
𝑥𝑚(𝑛) = h𝑇
𝑚s0(𝑛) + 𝑣𝑚(𝑛) (3)
To capture the temporal correlations of the speech signals at
different microphones, we require a space-time covariance matrix
𝓡xx(𝜏) = 𝔼{x(𝑛)x𝑇
(𝑛 − 𝜏)}, (4)
where
x𝑇
(𝑛) = [𝑥1(𝑛), 𝑥2(𝑛) ⋯ 𝑥𝑀 (𝑛)] (5)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 5/26
14. POLYNOMIAL EVD
MUSIC ALGORITHM
Due to the signal model the PEVD of (4) can be partitioned by
thresholding the eigenvalues to give
̃
𝓡xx(𝑧) ≈ [ 𝓤 ̃
𝑠(𝑧) 𝓤 ̃
𝑣(𝑧) ] ⎡
⎢
⎣
Λ
Λ
Λ ̃
𝑠(𝑧) 0
0 Λ
Λ
Λ ̃
𝑣(𝑧)
⎤
⎥
⎦
⎡
⎢
⎣
𝓤𝑃
̃
𝑠(𝑧)
𝓤𝑃
̃
𝑣(𝑧)
⎤
⎥
⎦
,
where {.} ̃
𝑠 and {.} ̃
𝑣 represent the orthogonal signal and noise
subspace components.
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 7/26
15. DIAGONALISED MATRIX OF EIGENVALUES
MUSIC ALGORITHM
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
-40 0 40
0
1
2
3
lag τ
r
ij
[τ
]
Diagonalised matrix of eigenvectors produced by the SMD algorithm for the
noise-free case with a direction of arrival (DoA) of 45∘
.
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 8/26
16. POLYNOMIAL MUSIC
MUSIC ALGORITHM
Generalising from MUSIC, the following quantity is defined
Γ𝜃(𝑧) = a𝑃
𝜃 (𝑧)𝓤𝑣(𝑧)𝓤𝑃
𝑣 (𝑧)a𝜃(𝑧), (6)
which is used to compute the pseudo-spectrogram for
Spatio-Spectral P-MUSIC,
P𝑆𝑆𝑃−𝑀𝑈 (𝜃, Ω) =
1
Γ𝜃(𝑧)
∣
𝑧=𝑒−𝑗Ω
, (7)
where frequency Ω is obtained by evaluating 𝑧 on the unit circle.
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 9/26
18. SSP-MUSIC ENHANCEMENT 1 - WINDOWING
CONTRIBUTIONS
(Group delay response for tapered windows & rectangular window for fractional delay τ = 2.4)
▶ A broadband steering vector can be realised by sampling a sinc
function which leads to an infinite impulse response
▶ truncation results in inaccuracies towards 𝑓𝑠
2 (Alrmah, 2015)
▶ using a tapered window to create a finite length filter dramatically
improves the accuracy (Selva, IEEE TSP 2008)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 11/26
19. BROADBAND STEERING VECTOR EXAMPLE
CONTRIBUTIONS
Time
−20 −10 0 10 20
A
n
g
le
o
f
A
r
r
iv
a
l
0.0
0.5
1.0
1.5
2.0
2.5
3.0
−0.2
0.0
0.2
0.4
0.6
0.8
1.0
Broadband Steering Vector for m=0
Time
−20 −10 0 10 20
A
n
g
le
o
f
A
r
r
iv
a
l
0.0
0.5
1.0
1.5
2.0
2.5
3.0
−0.2
0.0
0.2
0.4
0.6
0.8
Broadband Steering Vector for m=1
Time
−20 −10 0 10 20
A
n
g
le
o
f
A
r
r
iv
a
l
0.0
0.5
1.0
1.5
2.0
2.5
3.0
−0.2
0.0
0.2
0.4
0.6
0.8
Broadband Steering Vector for m=2
Time
−20 −10 0 10 20
A
n
g
le
o
f
A
r
r
iv
a
l
0.0
0.5
1.0
1.5
2.0
2.5
3.0
−0.2
0.0
0.2
0.4
0.6
0.8
Broadband Steering Vector for m=3
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 12/26
20. SSP-MUSIC ENHANCEMENT 2 - TEMPORAL SUPPORT
CONTRIBUTIONS
We modify SSP-MUSIC to only include the direct-path to reduce the
impact of reverberation on the direction of arrival (DoA) estimation.
If microphone array geometry is known, the largest direct-path delay
can inform the choice of the temporal lag support, 𝑊.
̃
𝓡xx(𝑧) ≈ ∑
𝑊
𝜏=−𝑊
Rxx(𝜏)𝑧−𝜏
, (8)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 13/26
21. SSP-MUSIC ENHANCEMENT 3 - ACTIVE REGIONS
CONTRIBUTIONS
1000
2000
3000
(a)
Frequency
[Hz]
5 10 15 20 25
[dB]
4.00
6.00
8.00
10.00
(b)
Amplitude
[dB]
1000
2000
3000
(c)
Frequency
[Hz]
0 45 90 135 180 225 270 315 360
Direction of Arrival [◦
]
0.30
0.40
0.50
(d)
Amplitude
[dB]
Ground Truth
Pseudospectrum
SSP-MUSIC
IFB-MUSIC
Illustrative example of a 100 ms frame
from Exp-2 (SNR: -10 dB, T60: 0.25 s).
(a) pseudo-spectrogram of SSP-MUSIC,
(b) pseudo-spectrum of SSP-MUSIC,
(c) pseudo-spectrogram of IFB-MUSIC,
(d) pseudo-spectrum of IFB-MUSIC.
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 14/26
23. EVALUATION METRICS
RESULTS
A ‘HIT’ is when a sound source (speaker) has been detected once
within a ± 15∘
collar applied around the ground-truth azimuth.
A ‘MISS’ is when a speaker change has not been detected within this
collar and a FA is when a detection falls outside of a ground-truth
azimuth collar.
The Hit Rate and False Alarm Rate are, therefore, defined as,
Hit Rate =
HITs
HITs + MISSs
%, False Alarm Rate =
FAs
HITs + FAs
% (9)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 16/26
24. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 25 dB, T60: 0.25 s
0 1 2 3
Time [s]
40
50
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
25
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
25. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 20 dB, T60: 0.25 s
0 1 2 3
Time [s]
40
50
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
20
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
26. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.25 s
0 1 2 3
Time [s]
40
50
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
27. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 10 dB, T60: 0.25 s
0 1 2 3
Time [s]
40
50
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
10
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
28. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 5 dB, T60: 0.25 s
0 1 2 3
Time [s]
40
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
5
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
29. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 0 dB, T60: 0.25 s
0 1 2 3
Time [s]
50
75
100
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
0
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
30. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: -5 dB, T60: 0.25 s
0 1 2 3
Time [s]
50
100
150
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
-5
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
31. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: -10 dB, T60: 0.25 s
0 1 2 3
Time [s]
50
100
150
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
-10
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
32. SNR - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: -15 dB, T60: 0.25 s
0 1 2 3
Time [s]
50
100
150
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
-15
dB,
T60:
0.25
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 17/26
33. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 25 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
34. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 20 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
35. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
36. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 10 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
37. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 5 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
38. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 0 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
39. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: -5 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
40. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: -10 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
41. SNR - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: -15 dB, T60: 0.25 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 18/26
42. SNR - HIT RATES
RESULTS
EX1: SINGLE SOUND SOURCE IN A SIMULATED ROOM
EX2: TWO SOUND SOURCES IN A SIMULATED ROOM
−15 −10 −5 0 5 10 15 20 25
SNR [dB]
40
60
80
100
Hit
Rate
[%]
EX1-SSP EX1-IBF EX2-SSP EX2-IBF
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 19/26
43. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.1 s
0 1 2 3
Time [s]
40
50
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
0.1
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
44. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.3 s
0 1 2 3
Time [s]
40
50
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
0.3
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
45. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.5 s
0 1 2 3
Time [s]
40
50
60
70
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
0.5
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
46. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.7 s
0 1 2 3
Time [s]
20
40
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
0.7
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
47. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.9 s
0 1 2 3
Time [s]
0
20
40
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
0.9
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
48. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.1 s
0 1 2 3
Time [s]
0
20
40
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
1.1
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
49. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.3 s
0 1 2 3
Time [s]
0
20
40
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
1.3
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
50. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.5 s
0 1 2 3
Time [s]
0
20
40
60
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
1.5
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
51. T60 - SINGLE SOUND SOURCE IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.7 s
0 1 2 3
Time [s]
0
25
50
75
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
(a)
Exp-1
(SNR:
15
dB,
T60:
1.7
s)
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 20/26
52. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.1 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
53. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.3 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
54. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.5 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
55. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.7 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
56. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 0.9 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
57. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.1 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
58. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.3 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
59. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.5 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
60. T60 - TWO SOUND SOURCES IN A SIMULATED ROOM
RESULTS
SNR: 15 dB, T60: 1.7 s
0 1 2 3 4 5 6
Time [s]
0
100
200
300
Direction
of
Arrival
[
◦
]
Ground Truth
VAD
IFB-MUSIC
SSP-MUSIC
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 21/26
61. T60 - HIT RATES
RESULTS
EX1: SINGLE SOUND SOURCE IN A SIMULATED ROOM
EX2: TWO SOUND SOURCES IN A SIMULATED ROOM
0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6
T60 [s]
50
60
70
80
90
100
Hit
Rate
[%]
EX1-SSP EX1-IBF EX2-SSP EX2-IBF
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 22/26
62. LOCATA RESULTS
RESULTS
Method SSP-MUSIC IFB-MUSIC
Metric HR FAR HR FAR
Recording
1 90.0 10.0 95.0 5.0
2 64.7 35.3 67.6 32.4
3 95.1 4.9 75.6 24.4
Comparison of SSP-MUSIC against IFB-MUSIC on the first 3 LOCATA recordings for Task 1.
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 23/26
64. KEY CONTRIBUTIONS
CONCLUSION
In this talk, we have...
... developed and explored the potential of SSP-MUSIC, which is a
polynomial extension of MUSIC.
... proposed some enhancements for sound source localization using
SSP-MUSIC.
... demonstrated that SSP-MUSIC is more robust to noise and
reverberation.
A. Hogg at el | A Polynomial Eigenvalue Decomposition MUSIC Approach for Broadband Sound Source Localization 25/26