This document discusses approaches for detecting the fundamental frequency (f0) of monophonic audio signals. It describes time-domain methods like zero-crossing rate analysis and autocorrelation, as well as frequency-domain techniques like harmonic product spectrum, harmonic sum spectrum, and spectral maximum likelihood. It also covers auditory-motivated approaches and notes that while older, tweaked versions of these methods remain state-of-the-art for monophonic f0 detection.
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
07-03-03-ACA-Tonal-Monof0.pdf
1. Introduction to Audio Content Analysis
module 7.3.3: fundamental frequency detection in monophonic signals
alexander lerch
2. overview intro mono f0 time domain frequency domain summary
introduction
overview
corresponding textbook section
section 7.3.3
lecture content
• established approaches to monophonic pitch tracking in
▶ time domain
▶ frequency domain
learning objectives
• define the task of monophonic pitch tracking
• summarize the principles of time-domain ˆ
f0-trackers and describe one approach in
detail
• summarize the principles of frequency-domain ˆ
f0-trackers and describe one approach
in detail
module 7.3.3: fundamental frequency detection in monophonic signals 1 / 15
3. overview intro mono f0 time domain frequency domain summary
introduction
overview
corresponding textbook section
section 7.3.3
lecture content
• established approaches to monophonic pitch tracking in
▶ time domain
▶ frequency domain
learning objectives
• define the task of monophonic pitch tracking
• summarize the principles of time-domain ˆ
f0-trackers and describe one approach in
detail
• summarize the principles of frequency-domain ˆ
f0-trackers and describe one approach
in detail
module 7.3.3: fundamental frequency detection in monophonic signals 1 / 15
4. overview intro mono f0 time domain frequency domain summary
fundamental frequency
introduction
remember
Fourier series: every (quasi-)periodic sound is a combination of sinusoidals with integer
frequency ratios
x(t) ≈ x(t + T̂0)
x(t) ≈
∞
X
k=−∞
a(k)ejω0kt
ˆ
f0: musically, perceptually most “relevant” frequency
module 7.3.3: fundamental frequency detection in monophonic signals 2 / 15
5. overview intro mono f0 time domain frequency domain summary
fundamental frequency
introduction
remember
Fourier series: every (quasi-)periodic sound is a combination of sinusoidals with integer
frequency ratios
x(t) ≈ x(t + T̂0)
x(t) ≈
∞
X
k=−∞
a(k)ejω0kt
ˆ
f0: musically, perceptually most “relevant” frequency
module 7.3.3: fundamental frequency detection in monophonic signals 2 / 15
6. overview intro mono f0 time domain frequency domain summary
pitch detection
task definition
detect the fundamental frequency ˆ
f0
monophonic: only one fundamental frequency at a time
related subtasks:
• voice activity: detect when there is no voice/no fundamental frequency
• note segmentation
▶ note start time and stop time
▶ average note frequency
▶ average note velocity
▶ vibrato detection
• frequency to pitch mapping
module 7.3.3: fundamental frequency detection in monophonic signals 3 / 15
7. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
zero crossing rate
ZCR per block (bad)
T̂0(n) =
2 · ie(n) − is(n)
fS ·
ie(n)
P
i=is(n)
|sign [x(i)] − sign [x(i − 1)]|
average period length
T̂0(n) =
2
Z − 1
Z−2
X
j=0
∆tZC(j).
variants:
• create distance histogram
and choose maximum
• also use distances between
local extrema
module 7.3.3: fundamental frequency detection in monophonic signals 4 / 15
matlab
source:
plotF0Zcr.m
8. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
zero crossing rate
ZCR per block (bad)
T̂0(n) =
2 · ie(n) − is(n)
fS ·
ie(n)
P
i=is(n)
|sign [x(i)] − sign [x(i − 1)]|
average period length
T̂0(n) =
2
Z − 1
Z−2
X
j=0
∆tZC(j).
variants:
• create distance histogram
and choose maximum
• also use distances between
local extrema
module 7.3.3: fundamental frequency detection in monophonic signals 4 / 15
matlab
source:
plotF0Zcr.m
9. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
zero crossing rate
ZCR per block (bad)
T̂0(n) =
2 · ie(n) − is(n)
fS ·
ie(n)
P
i=is(n)
|sign [x(i)] − sign [x(i − 1)]|
average period length
T̂0(n) =
2
Z − 1
Z−2
X
j=0
∆tZC(j).
variants:
• create distance histogram
and choose maximum
• also use distances between
local extrema
module 7.3.3: fundamental frequency detection in monophonic signals 4 / 15
matlab
source:
plotF0Zcr.m
10. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auto correlation function
find lag of ACF maximum
rxx (η, n) =
ie(n)−η
X
i=is(n)
x(i) · x(i + η)
T̂0(n) = argmax rxx (η, n)
variants:
• center clipping
x
χ(x)
−cL +cL
x
χ′
(x)
−cL +cL
−1
+1
module 7.3.3: fundamental frequency detection in monophonic signals 5 / 15
11. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auto correlation function
find lag of ACF maximum
rxx (η, n) =
ie(n)−η
X
i=is(n)
x(i) · x(i + η)
T̂0(n) = argmax rxx (η, n)
variants:
• center clipping
x
χ(x)
−cL +cL
x
χ′
(x)
−cL +cL
−1
+1
module 7.3.3: fundamental frequency detection in monophonic signals 5 / 15
12. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auto correlation function
module 7.3.3: fundamental frequency detection in monophonic signals 5 / 15
matlab
source:
plotF0Acf.m
13. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
average magnitude difference function
find lag of AMDF minimum
AMDFxx (η, n) =
1
ie(n) − is(n) + 1
ie(n)−η
X
i=is(n)
|x(i) − x(i + η)|
variants:
• AMDF-weighted ACF
r′
xx (η, n) =
rxx (η, n)
AMDFxx (η, n) + 1
module 7.3.3: fundamental frequency detection in monophonic signals 6 / 15
14. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
average magnitude difference function
find lag of AMDF minimum
AMDFxx (η, n) =
1
ie(n) − is(n) + 1
ie(n)−η
X
i=is(n)
|x(i) − x(i + η)|
variants:
• AMDF-weighted ACF
r′
xx (η, n) =
rxx (η, n)
AMDFxx (η, n) + 1
module 7.3.3: fundamental frequency detection in monophonic signals 6 / 15
15. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
average magnitude difference function
module 7.3.3: fundamental frequency detection in monophonic signals 6 / 15
matlab
source:
plotF0Amdf.m
16. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
harmonic product spectrum 1/2
XHPS(k, n) =
O
Y
j=1
|X(j · k, n)|2
ˆ
f0(n) = argmax XHPS(k, n)
first published in the 1960s by Noll
1
A. M. Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum, and a Maximum
Likelihood Estimate,” in Proceedings of the Symposium on Computer Processing in Communications, vol. 19, Brooklyn: Polytechnic Press of the
University of Brooklyn, 1969, pp. 779–797.
module 7.3.3: fundamental frequency detection in monophonic signals 7 / 15
matlab
source:
plotF0HpsMethod.m
17. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
harmonic product spectrum 2/2
module 7.3.3: fundamental frequency detection in monophonic signals 8 / 15
matlab
source:
plotF0Hps.m
18. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
harmonic sum spectrum
sum instead product sum
XHSS(k, n) =
O
X
j=1
|X(j · k, n)|2
• advantage
▶ robust against missing harmonics
• disadvantage
▶ less pronounced peak
module 7.3.3: fundamental frequency detection in monophonic signals 9 / 15
19. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
ACF of magnitude spectrum
rXX (η, n) =
K/2−1
X
k=−K/2
|X(k, n)| · |X(k + η, n)|
⇒ detect maximum location
module 7.3.3: fundamental frequency detection in monophonic signals 10 / 15
matlab
source:
plotF0AcfOfFft.m
20. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
cepstral pitch detection
1 compute cepstrum
2 detect periodicities
module 7.3.3: fundamental frequency detection in monophonic signals 11 / 15
matlab
source:
plotF0Cepstrum.m
21. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
spectral maximum likelihood
create template matrix with (smoothed) delta pulses for all possible frequencies
compute the cross correlation (lag = 0) between spectrum and all templates
pick the result with the highest correlation ⇒ frequency estimate
module 7.3.3: fundamental frequency detection in monophonic signals 12 / 15
22. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
spectral maximum likelihood
create template matrix with (smoothed) delta pulses for all possible frequencies
compute the cross correlation (lag = 0) between spectrum and all templates
pick the result with the highest correlation ⇒ frequency estimate
module 7.3.3: fundamental frequency detection in monophonic signals 12 / 15
23. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
spectral maximum likelihood
create template matrix with (smoothed) delta pulses for all possible frequencies
compute the cross correlation (lag = 0) between spectrum and all templates
pick the result with the highest correlation ⇒ frequency estimate
module 7.3.3: fundamental frequency detection in monophonic signals 12 / 15
matlab
source:
plotF0Template.m
24. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auditory-motivated pitch tracking 1/2
1 filterbank of band pass filters (e.g., mel scale)
2 HWR
3 smoothing
4 within-band periodicity estimate (e.g. ACF)
5 combination of bands
module 7.3.3: fundamental frequency detection in monophonic signals 13 / 15
25. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auditory-motivated pitch tracking 1/2
1 filterbank of band pass filters (e.g., mel scale)
2 HWR
3 smoothing
4 within-band periodicity estimate (e.g. ACF)
5 combination of bands
module 7.3.3: fundamental frequency detection in monophonic signals 13 / 15
26. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auditory-motivated pitch tracking 1/2
1 filterbank of band pass filters (e.g., mel scale)
2 HWR
3 smoothing
4 within-band periodicity estimate (e.g. ACF)
5 combination of bands
module 7.3.3: fundamental frequency detection in monophonic signals 13 / 15
27. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auditory-motivated pitch tracking 1/2
1 filterbank of band pass filters (e.g., mel scale)
2 HWR
3 smoothing
4 within-band periodicity estimate (e.g. ACF)
5 combination of bands
module 7.3.3: fundamental frequency detection in monophonic signals 13 / 15
28. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auditory-motivated pitch tracking 1/2
1 filterbank of band pass filters (e.g., mel scale)
2 HWR
3 smoothing
4 within-band periodicity estimate (e.g. ACF)
5 combination of bands
module 7.3.3: fundamental frequency detection in monophonic signals 13 / 15
29. overview intro mono f0 time domain frequency domain summary
monophonic fundamental frequency detection
auditory-motivated pitch tracking 2/2
1 filterbank output
2 half wave
rectification
3 smoothed output
module 7.3.3: fundamental frequency detection in monophonic signals 14 / 15
matlab
source:
plotF0Auditory.m
30. overview intro mono f0 time domain frequency domain summary
summary
lecture content
basic commonality
• all approaches look for periodicity
▶ waveform similarity in time domain
▶ equidistant harmonics/peaks in freq domain
state-of-the-art
• despite the age of the presented methods, tweaked versions of the presented
approaches are still often considered state-of-the-art
• combinations of different approaches can be robust
module 7.3.3: fundamental frequency detection in monophonic signals 15 / 15