Pitch detection in Tabla accompaniment

CSE, Indian Institute of Technology Bombay
Pitch Detection of Singing Voice in Tabla Accompaniment
Ashutosh Bapat
(03305010)

Outline
 Motivation
 Music transcription
 Pitch & pitch detection
 Signal characteristics
 Two-way mismatch procedure
 Post processing
 DP based smoothing
 Pitch correction
 Experimental evaluation
 Conclusion

Automatic Music Transcription (AMT) system
 Converts acoustic musical signal to symbolic
representation
 Documents musical attributes
 Pitch
 Timbre
 Rhythm
 Pitch is the most salient
 Melody = pitch contour

AMT for Indian classical music
 Components of Indian classical and semi-classical
music
 Melody line sung by a single main voice
o Gamakas are described by a detailed pitch contour
 Accompaniment of tabla and tanpura
 Musicological and pedagogical applications
 Rich archives of audio recordings => need for a
reliable PDA for singing voice pitch tracking in tabla
accompaniment

Pitch
 Many definitions exist
 Pitch of a signal is defined as the fundamental
frequency of an approximately harmonic pattern in
spectral representation of signal.
 Pitch period is defined as average length of several
periods of the signal.
$$F_0 = \frac{1}{T_0}$$

Pitch detection
• Input: musical signal
• Output: pitch contour
[Figure: song waveform and pitch contour]

Pitch Detection Algorithm
 Preprocessor: Data reduction and enhancement
 Commonly used method: Filtering
 Basic Extractor: Estimates single/multiple pitch
candidates per frame
 Commonly used method: ACF
 Post processing:
 Measure of reliability of each candidate
 Smoothness of pitch contour
o Commonly used method: Dynamic programming
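
The basic-extractor stage with ACF can be illustrated with a minimal sketch. The function name `acf_pitch`, the search range `fmin`/`fmax`, the sampling rate and the synthetic test tone are all assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def acf_pitch(frame, fs, fmin=80.0, fmax=800.0):
    """Estimate the pitch (Hz) of one frame from its autocorrelation peak.

    fmin and fmax bound the search range; both are illustrative values.
    """
    frame = frame - np.mean(frame)
    # Full autocorrelation, keeping only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(acf) - 1)
    # The lag with the largest autocorrelation is taken as the pitch period.
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / best_lag

# Synthetic complex tone at 300 Hz with five harmonics (assumed test signal).
fs = 22050
t = np.arange(2048) / fs
tone = sum(np.sin(2 * np.pi * 300 * k * t) / k for k in range(1, 6))
estimate = acf_pitch(tone, fs)
print(estimate)
```

Because the lag is an integer number of samples, the estimate is quantised; a real extractor would usually interpolate around the peak.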

Singing voice
• Pitch evolves
continuously
• Shows inflexions such as bends, stresses, oscillations, etc.

Classification of tabla strokes
 Number of drums
 Simple strokes: Na, Ge, Ke, Tit, Tun etc.
 Complex strokes: Dha, Dhin, Dhun
 Harmonicity
 Harmonic: Na, Tin, Tun, Ge
 Inharmonic: Ke, Tit
 Rate of Decay:
 Slowly decaying: Na, Ge, Tin, Tun
 Fast decaying: Ke, Tit

Harmonic interference: Na

Single partial interference: Ge

Noisy interference: Ke

Pitch detection of mixed song
[Figure: waveform of song mixed with stroke Na; pitch contour by ACF]

Two-way mismatch procedure
TWM error, F0 = 300 Hz
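
The slides do not spell out the TWM error itself, so the following is only a sketch of the commonly cited two-way mismatch formulation (Maher and Beauchamp). The parameter values `p`, `q`, `r`, `rho`, the function name `twm_error` and the idealised peak list are assumptions, not necessarily what was used in this work.

```python
import numpy as np

def twm_error(f0, peak_freqs, peak_amps, p=0.5, q=1.4, r=0.5, rho=0.33):
    """Two-way mismatch error of trial pitch f0 against measured spectral peaks."""
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    peak_amps = np.asarray(peak_amps, dtype=float)
    a_max = peak_amps.max()
    n_harm = int(np.ceil(peak_freqs.max() / f0))
    predicted = f0 * np.arange(1, n_harm + 1)

    # Distance between every predicted harmonic and every measured peak.
    dist = np.abs(predicted[:, None] - peak_freqs[None, :])

    # Predicted-to-measured: each predicted harmonic vs. its nearest peak.
    nearest = dist.argmin(axis=1)
    df = dist[np.arange(n_harm), nearest] * predicted ** (-p)
    err_pm = np.sum(df + (peak_amps[nearest] / a_max) * (q * df - r))

    # Measured-to-predicted: each measured peak vs. its nearest harmonic.
    df = dist.min(axis=0) * peak_freqs ** (-p)
    err_mp = np.sum(df + (peak_amps / a_max) * (q * df - r))

    return err_pm / n_harm + rho * err_mp / len(peak_freqs)

# Idealised peaks of a 300 Hz complex tone (assumed data).
peaks = [300.0, 600.0, 900.0, 1200.0, 1500.0]
amps = [1.0, 0.5, 1 / 3, 0.25, 0.2]
candidates = np.arange(100.0, 500.0, 1.0)
errors = [twm_error(f, peaks, amps) for f in candidates]
best_f0 = candidates[int(np.argmin(errors))]
print(best_f0)
```

The pitch estimate for a frame is the trial F0 that minimises this error, which is why the error curve shows a minimum at the correct pitch.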

TWM and ACF: harmonic interference
• Complex tone of 450 Hz + signal
simulating Na
• In TWM error we can see
minimum at correct pitch
• In ACF all peaks are at lags
corresponding to 790 Hz
[Figure panels: TWM error, ACF, magnitude plot]

TWM and ACF: single partial interference
• Complex tone of 300 Hz mixed with a single partial with
amplitude varied from 0 to 100.
• TWM is more robust than ACF
[Figure panels: 400 Hz, 450 Hz]

TWM and ACF correlograms
• Correlograms of complex tone of 300 Hz mixed with stroke
Na
• Notice horizontal line at 300 Hz in TWM
• The ACF gives no clue at lag 73 (which corresponds to 300 Hz)
[Figure panels: ACF correlogram, TWM error correlogram]

TWM pitch contour
• Pitch contour of song mixed with stroke Na
• Notice large pitch artifacts during strokes

Post processing

DP based smoothing
 Smoothing based on
 Measure of reliability of pitch candidates
 Smoothness of pitch contour
 Measurement cost: $E(p, j)$
 Smoothness cost: $W(p, p')$
 Local transition cost: $T(p(j), p(j-1), j) = E(p(j), j) + W(p(j-1), p(j))$
 Global transition cost: $S(p(1), \ldots, p(j), \ldots, p(N)) = \sum_{j=1}^{N} T(p(j), p(j-1), j)$
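
A minimal sketch of how a global cost of this form can be minimised with dynamic programming: each frame contributes its measurement cost E plus a smoothness cost W for the transition from the previous frame. The function name `dp_smooth`, the array layout, the placeholder smoothness function and the example numbers are assumptions; the first frame is assumed to contribute only its measurement cost.

```python
import numpy as np

def dp_smooth(cands, E, W):
    """Pick one pitch candidate per frame so that the summed cost is minimal.

    cands, E: arrays of shape (n_frames, n_cands) with candidate pitches
    and their measurement costs. W(p_prev, p_cur) returns the smoothness cost.
    """
    n_frames, n_cands = cands.shape
    total = np.empty((n_frames, n_cands))
    back = np.zeros((n_frames, n_cands), dtype=int)
    total[0] = E[0]  # assumption: no transition cost for the first frame
    for j in range(1, n_frames):
        # trans[a, b]: best cost ending in candidate a, plus the jump a -> b.
        trans = total[j - 1][:, None] + W(cands[j - 1][:, None], cands[j][None, :])
        back[j] = trans.argmin(axis=0)
        total[j] = E[j] + trans.min(axis=0)
    # Backtrack from the cheapest final candidate.
    path = np.empty(n_frames, dtype=int)
    path[-1] = int(total[-1].argmin())
    for j in range(n_frames - 1, 0, -1):
        path[j - 1] = back[j, path[j]]
    return cands[np.arange(n_frames), path]

# Placeholder smoothness cost (hypothetical): relative pitch jump.
def rel_jump(p_prev, p_cur):
    return np.abs(p_cur - p_prev) / np.maximum(p_prev, p_cur)

cands = np.array([[300.0, 790.0]] * 3)
E = np.array([[0.1, 0.9], [0.5, 0.4], [0.1, 0.9]])  # frame 2 slightly prefers 790 Hz
smoothed = dp_smooth(cands, E, rel_jump)
print(smoothed)
```

In the middle frame the 790 Hz candidate has the lower measurement cost, but jumping to it and back costs more than staying at 300 Hz, so the smoothed contour stays at 300 Hz.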

Smoothness cost
• The width of bell varies
proportional to pitch
• Pitch variation at high
pitches is expected to be
more than that at low
pitches
• Saturates at high values
$$W(p, p') = 1 - e^{-\frac{(p - p')^2}{\sigma^2}}, \qquad \sigma = s * pc$$
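
The behaviour described in the bullets (a bell whose width grows with pitch, and a cost that saturates) can be sketched as an inverted Gaussian. This is one reading of the slide, not a confirmed formula: the exact exponent, the scale factor `s = 0.1` and the choice of the previous pitch as the one that sets the width are all assumptions.

```python
import numpy as np

def smoothness_cost(p, p_prev, s=0.1):
    """Inverted-bell cost for moving from pitch p_prev to pitch p.

    sigma is proportional to the previous pitch (assumed), so the same
    jump in Hz is cheaper at high pitches than at low ones.
    """
    sigma = s * p_prev
    return 1.0 - np.exp(-((p - p_prev) ** 2) / sigma ** 2)

low = smoothness_cost(230.0, 200.0)     # 30 Hz jump at a low pitch
high = smoothness_cost(430.0, 400.0)    # same jump at a higher pitch
far = smoothness_cost(1000.0, 300.0)    # very large jump: cost saturates near 1
print(low, high, far)
```

Saturation means that two very large jumps cost almost the same, so a single outlier frame is not penalised without bound.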
σ

Pitch contour after applying DP
• Smoothened pitch contour
• Suppresses fast pitch variations
• May introduce errors where tabla is absent

Pitch correction
• Searches for deepest local minimum in 6% range near pitch
estimated by DP
• Corrects most of the fine errors
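
The correction step can be sketched as a constrained search over an error curve. The sketch assumes the curve is a per-frame error function sampled on a frequency grid (for example the TWM error), that "6% range" means within 6% on either side of the DP estimate, and that the DP estimate is kept when no local minimum is found; the function name and the synthetic curve are hypothetical.

```python
import numpy as np

def correct_pitch(freqs, error, f_dp, rel_range=0.06):
    """Return the frequency of the deepest local minimum of `error`
    within rel_range of the DP estimate f_dp."""
    freqs = np.asarray(freqs, dtype=float)
    error = np.asarray(error, dtype=float)
    idx = np.arange(1, len(freqs) - 1)  # interior points only
    is_min = (error[idx] < error[idx - 1]) & (error[idx] <= error[idx + 1])
    in_range = np.abs(freqs[idx] - f_dp) <= rel_range * f_dp
    cand = idx[is_min & in_range]
    if cand.size == 0:
        return f_dp  # assumption: fall back to the DP estimate
    return freqs[cand[np.argmin(error[cand])]]

# Synthetic error curve with a single minimum at 305 Hz (assumed data).
freqs = np.arange(250.0, 351.0, 1.0)
error = (freqs - 305.0) ** 2 / 1000.0
corrected = correct_pitch(freqs, error, f_dp=295.0)
unchanged = correct_pitch(freqs, error, f_dp=340.0)
print(corrected, unchanged)
```

With the DP estimate at 295 Hz the minimum at 305 Hz lies inside the search range and replaces it; with an estimate of 340 Hz no local minimum is in range and the estimate is left alone.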

Experimental evaluation
 Test samples
 Samples produced by digitally adding tabla strokes Na,
Ge, Ke to pure song waveforms sung with syllable /la/
and /aa/
 Algorithms
 TWM: TWM alone
 TDP: TWM + DP
 TDC: TWM + DP + PC
 Errors
 Fine error: error magnitude between 3% and 6%
 Gross error: error magnitude above 6%
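
A minimal sketch of how fine and gross error rates could be computed from an estimated and a reference pitch contour. It assumes the reference is the pitch of the pure song before the tabla strokes were added, that errors are relative to the reference, and that both 3% and 6% count as fine errors; the function name and the example values are hypothetical.

```python
import numpy as np

def error_rates(estimated, reference):
    """Return (fine, gross) error rates in percent of frames."""
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(reference, dtype=float)
    rel = np.abs(est - ref) / ref * 100.0         # relative error per frame, in %
    fine = np.mean((rel >= 3.0) & (rel <= 6.0)) * 100.0
    gross = np.mean(rel > 6.0) * 100.0
    return fine, gross

reference = [300.0, 300.0, 300.0, 300.0]
estimated = [300.0, 312.0, 330.0, 303.0]          # errors of 0%, 4%, 10%, 1%
rates = error_rates(estimated, reference)
print(rates)
```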

Results
• DP decreases the number of gross errors but increases the number of fine errors
• PC decreases the number of fine errors
• Performance is better for songs with slowly varying pitch contours
Error rates in percentage (F: fine errors, G: gross errors)

Song with many fast variations of pitch
Stroke   TWM F   TWM G   TDP F   TDP G   TDC F   TDC G
Na       0.0     49.3    4.6     13.4    2.1     14.8
Ge       0.0     20.9    3.9     2.1     3.4     2.5
Ke       0.0     25.7    4.9     5.1     0.2     5.1

Song with slowly varying pitch contour
Stroke   TWM F   TWM G   TDP F   TDP G   TDC F   TDC G
Na       4.7     14.7    11.6    1.6     5.3     2.1
Ge       0.0     22.5    8.1     4.4     1.5     4.1
Ke       0.0     17.9    7.2     2.5     0.0     2.5

Errors after application of DP + PC
• Errors remaining after application of DP and pitch correction
are found in regions with fast variations in pitch

Conclusion
 Importance of music transcription
 Characteristics of tabla strokes
 Two-way mismatch PDA
 Results showing improvements by application of DP
smoothing and pitch correction
 Applications in building a pitch detector for Indian
classical and semi-classical music

Future work
 Combination of ACF and TWM to take advantage of
 Lower computational complexity of ACF
 ACF’s robustness to noise, thus better results in Ke
 Classification of frames by presence/absence of
tabla strokes
 Use pitch estimated by DP and pitch correction only in
frames containing tabla stroke
 Application of advanced techniques:
 adaptive windowing, peak selection, selective search
 Pitch tracking in case of complex strokes like Dha
and words like TiReKiTa

Thank you

State space formulation of DP
