Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
5[Y a M U[ M 3 [MOT _ R[ [PUO
6 _O U U[ U PUM 3 a_UO 5[ [ M
.
6[O [ M ET _U_ B _ M U[
a_UO E OT [ [Se 9 [a
6 M Y [R R[ YM ...
Aa U
[PaO U[
a_UO 5[ [ M
M P 6M M_ _
[Pe 6 _O U [ _
M P C _ M U[ _
[Pe BM
B [O __U S
3 UOM U[ _
DaYYM e M P
a a B _ O Ub _...
Aa U
[PaO U[
a_UO 5[ [ M
M P 6M M_ _
[Pe 6 _O U [ _
M P C _ M U[ _
[Pe BM
B [O __U S
3 UOM U[ _
DaYYM e M P
a a B _ O Ub _...
[PaO U[
6
Chapter 1
Introduction
Information technology (IT) is constantly shaping the world that we live in. Its ad-
vanc...
[ UbM U[
3a [YM UO Ya_UO P _O U U[
7
[ UbM U[
] ]_Ue]gl& <]fWbiYel&
YUeW & IUX]b&
IYWb YaXUg]ba
GYXU[b[l& =XhWUg]ba&
=agYegU]a Yag&
=a UaWYX _]fgYa]a[
b chgUg]...
[ UbM U[
IYceYfYagUg]ba
Y[ YagUg]ba
] ]_Ue]gl
Dbg]Zf <]fWbiYel
GUggYea UeUWgYe]mUg]ba
DY_bX]W
<YfWe]cg]ba
9
5[Y a_UO B [V O
• Serra, X. (2011). A multicultural approach to music information research. In Proc. of ISMIR, pp. 151–156...
PUM M a_UO
11
]aXhfgUa]
BUhfghi B( ?Ua[h_]
UeaUg]W
M][aYf Af Ue
E YU [ [Se
§ E[ UO1 4M_ U OT GM U _ MO [__ M U_ _
§ [Pe1 BU OT [R T P[YU M Y [PUO _[a O
§ ClSM1 [PUO R MY c[ W
§ DbM M1 3 ...
A [ a U U _ M P 5TM S _
§ A [ a U U _
13
A [ a U U _ M P 5TM S _
§ A [ a U U _
§ H _ aPU P Ya_UO MPU U[
14
A [ a U U _ M P 5TM S _
§ A [ a U U _
§ H _ aPU P Ya_UO MPU U[
§ : [ T[ UO MO[a_ UOM OTM MO U_ UO_
15
A [ a U U _ M P 5TM S _
§ A [ a U U _
§ H _ aPU P Ya_UO MPU U[
§ : [ T[ UO MO[a_ UOM OTM MO U_ UO_
• http://www.music- ir....
A [ a U U _ M P 5TM S _
§ A [ a U U _
§ H _ aPU P Ya_UO MPU U[
§ : [ T[ UO MO[a_ UOM OTM MO U_ UO_
§ [PUO M _ à NaU PU S N...
A [ a U U _ M P 5TM S _
§ A [ a U U _
§ H _ aPU P Ya_UO MPU U[
§ : [ T[ UO MO[a_ UOM OTM MO U_ UO_
§ [PUO M _ à NaU PU S N...
A [ a U U _ M P 5TM S _
§ 5TM S _
19
A [ a U U _ M P 5TM S _
§ 5TM S _
§ Y [bU_M [ e Ya_UO MPU U[
Time (s)
10 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0
200
400
600
8...
A [ a U U _ M P 5TM S _
§ 5TM S _
§ Y [bU_M [ e Ya_UO MPU U[
§ [PUO M _O U U[ OTM SU S M P U P RU P
0 2 4 6 8 10
Time (s)
...
A [ a U U _ M P 5TM S _
§ 5TM S _
§ Y [bU_M [ e Ya_UO MPU U[
§ [PUO M _O U U[ OTM SU S M P U P RU P
§ [ _ M PM P R O R ]a ...
A [ a U U _ M P 5TM S _
§ 5TM S _
§ Y [bU_M [ e Ya_UO MPU U[
§ [PUO M _O U U[ OTM SU S M P U P RU P
§ [ _ M PM P R O R ]a ...
4 [MP ANV O Ub _
§ 5[Y U M U[ Oa M U[ [R O[ [ M [R 3
§ 6U_O[b e OTM MO UfM U[ [R Y [PUO M _
§ 3a [YM UO lSM O[S U U[
24
5[Y a M U[ M EM_W_
DY_bX]W
<YfWe]cg]ba
25
5[Y a M U[ M EM_W_
Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba
DY_bX]W
<YfWe]cg]ba
26
5[Y a M U[ M EM_W_
Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba
DY_bX]W
<YfWe]cg]ba
Elxf Y[ YagUg]ba
27
5[Y a M U[ M EM_W_
Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba
DY_bX]W
<YfWe]cg]ba
DY_bX]W ] ]_Ue]gl
Elxf Y[ YagUg]ba
28
5[Y a M U[ M EM_W_
Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba
DY_bX]W
<YfWe]cg]ba
GUggYea <]fWbiYel
DY_bX]W ] ]_Ue]gl
Elxf Y[ YagUg]...
5[Y a M U[ M EM_W_
Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba
DY_bX]W
<YfWe]cg]ba
GUggYea UeUWgYe]mUg]ba
GUggYea <]fWbiYel
DY_bX]W ]...
5[Y a M U[ M EM_W_
Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba
9hgb Ug]W Ix[U IYWb[a]g]ba
DY_bX]W
<YfWe]cg]ba
GUggYea <]fWbiYel
DY_bX...
a_UO 5[ [ M M P 6M M_ _
32
Chapter 3
Music Corpora and Datasets
3.1 Introduction
A research corpus is a collection of data...
5[Y a_UO 5[ [ M
33
5[Y a_UO 5[ [ M
§ E MY RR[
34
5[Y a_UO 5[ [ M
§ E MY RR[
§ B [O Pa M P P _US O U UM
• Serra, X. (2014). Creating research corpora for the computational ...
5[Y a_UO 5[ [ M
36
5[Y a_UO 5[ [ M
9hX]b =X]gbe]U_ YgUXUgU
<f Dhf]WVeU]am
37
5M M UO M P :U Pa_ M U a_UO 5[ [ M
8 9
.=3
.20891573
9 5
38
A MOO __ a_UO 5[ [ M
8 9
.=3
.20891573
9 5
39
A MOO __ a_UO 5[ [ M
U Uf
YWg]baf
DY_bX]W c eUfYf
40
E _ 6M M_ _
§ E[ UO P URUOM U[
§ el_ D SY M U[
§ [PUO DUYU M U e
§ ClSM C O[S U U[
41
[Pe 6 _O U [ _ M P C _ M U[ _
42
Chapter 4
Melody Descriptors and
Representations
4.1 Introduction
In this chapter, we des...
[Pe 6 _O U [ _ M P C _ M U[ _
Kba]W AXYag]Z]WUg]ba
#=iU_hUg]ba
Elxf Y[ YagUg]ba
GeYXb ]aUag DY_bXl =fg] Ug]ba
UaX cbfg ceb...
[Pe 6 _O U [ _ M P C _ M U[ _
Kba]W AXYag]Z]WUg]ba
#=iU_hUg]ba
Elxf Y[ YagUg]ba
GeYXb ]aUag DY_bXl =fg] Ug]ba
UaX cbfg ceb...
E[ UO P URUOM U[ 1 3 [MOT _
Method Features Feature Distribution Tonic Selection
MRS (Sengupta et al., 2005) Pitch (Datta,...
5[Y M M Ub 7bM aM U[
§ D b Y T[P_
§ DUd PUb _ PM M_ _
TIDCM1
TIDCM2
TIDCM3
TIDIITM1
TIDIITM2
TIDIISc
,1+
3-/
.,2
-2
.1,
//...
C _a _
Methods
TIDCM1 TIDCM2 TIDCM3 TIDIISc TIDIITM1 TIDIITM2
TP TPC TP TPC TP TPC TP TPC TP TPC TP TPC
MRH1 - 81.4 69.6 8...
C _a _
CM2 CM3 IITM2 IITM1
60
65
70
75
80
85
90
95
100
PerformanceAccuracy(%)
JS
SG
RH1
RH2
AB2
AB3
Dataset
Accuracy(%)
40...
C _a _
CM2 CM3 IITM2 IITM1
60
65
70
75
80
85
90
95
100
PerformanceAccuracy(%)
JS
SG
RH1
RH2
AB2
AB3
Dataset
Accuracy(%)
40...
C _a _1 DaYYM e
§ a U U OT O M__URUOM U[
§ 3aPU[
§ bM UM [ Pa M U[
§ 5[ _U_ MPU U[ S P
U _ aY
§ BU OT Y M MW UOW
§ 3aPU[ Y...
B P[YU M BU OT 7_ UYM U[
DY_bX]U
• Salamon, J. & Gómez, E. (2012). Melody extraction from polyphonic music signals using p...
el_ DbM M D SY M U[
§ [PUO _ SY M U[
§ 6 UYU Y [PUO T M_ _
178 180 182 184 186 188 190
−200
0
200
400
600
800
Time (s)
F0f...
el_ DbM M D SY M U[
Predominant pitch estimation Tonic identification
Audio
Nyās segments
Pred. pitch estimation
and repre...
el_ DbM M D SY M U[
Predominant pitch estimation Tonic identification
Audio
Nyās segments
Pred. pitch estimation
and repre...
el_ DbM M D SY M U[
Predominant pitch estimation Tonic identification
Audio
Nyās segments
Histogram
computation
Segmentati...
el_ DbM M D SY M U[
Predominant pitch estimation Tonic identification
Audio
Nyās segments
Histogram
computation
Segmentati...
el_ DbM M D SY M U[
Predominant pitch estimation Tonic identification
Audio
Nyās segments
Histogram
computation
Segmentati...
6M M_
§ k l R[ YM O _ :U Pa_ M U
§ 3 [ M U[ _1 Ya_UOUM 2)- e M _ MU U S
§ ) -/ el_ _ SY _
657
IYWbeX]a[f <heUg]ba 9eg]fgf ...
C _a _118 MELODY DESCRIPTORS AND REPRESENTATIONS
Segmentation Feat. DTW Tree k-NN NB LR SVM
PLS
FL 0.356 0.407 0.447 0.248...
C _a _1 DaYYM e
§ B [ [_ P _ SY M U[
§ M a d MO U[
§ >[OM R M a _
§ BU O cU_ U M _ SY M U[
§ BM YM OTU S 6EH
§ >[OM 5[ d a...
[PUO BM B [O __U S
61
Chapter 5
Melodic Pattern Processing:
Similarity, Discovery and
Characterization
5.1 Introduction
In...
[PUO BM B [O __U S
62
[PUO BM B [O __U S
63
[PUO BM B [O __U S
64
7dU_ U S 3 [MOT _ R[ 3
Method Task
Melody
Representation
Segmentation
Similarity
Measure
Speed-up #R¯agas #Rec #Patt #Occ
...
[PUO BM B [O __U S
Supervised Approach
Unsupervised Approach
Supervised Approach
66
[PUO BM B [O __U S
Supervised Approach
Unsupervised Approach
Supervised Approach
67
[PUO BM B [O __U S
§ 6M M_ _Uf
§ [c PS NUM_
§ :aYM [ _ UYU M U[ _
Unsupervised Approach
Supervised Approach
68
[PUO BM B [O __U S
DY_bX]W
] ]_Ue]gl
GUggYea
<]fWbiYel
GUggYea
UeUWgYe]mUg]ba
hcYei]fYX LafhcYei]fYX LafhcYei]fYX
69
[PUO BM B [O __U S
DY_bX]W
] ]_Ue]gl
GUggYea
<]fWbiYel
GUggYea
UeUWgYe]mUg]ba
hcYei]fYX LafhcYei]fYX LafhcYei]fYX
70
[PUO DUYU M U e
71
[PUO DUYU M U e
Time (s)
10 2 3 4 5 6 7 8 9 10 11 12 13 14 15
00
00
00
00
00
00
00
U__Ya[Y
72
[PUO DUYU M U e
U__Ya[Y
73
+ , - .
[PUO DUYU M U e
Predominant Pitch Estimation
Pitch Normalization
Uniform Time-scaling
Distance Computation
Melodic Similar...
[PUO DUYU M U e
U c_]a[ eUgY8
PYf ) Eb& N Ug ]aX8
PYf ) Eb& b hW 8
=hW_]XYUa be
<laU ]W g] Y Uec]a[ #<KN &
GUeU YgYef8
Pre...
[PUO DUYU M U e
n+ & 01& / & . & --p m
sg Standard deviation of
computation
Q Heaviside step functi
u Flatness measure
˜u ...
§ Eb abe U_]mUg]ba
§ Kba]W
§ Q]aZ
§ Q,.
§ Q+,
§ DYUa
§ DYX]Ua
§ Q abe
§ DYX]Ua UVfb_hgY XYi]Ug]ba
[PUO DUYU M U e
Predomin...
[PUO DUYU M U e
§ vbZZ
§ vba n (3& (3/& +( & +( /& +(+p
Predominant Pitch Estimation
Pitch Normalization
Uniform Time-scal...
[PUO DUYU M U e
§ =hW_]XYUa
§ <laU ]W g] Y Uec]a[
§ ?_bVU_ WbafgeU]ag #-
§ CbWU_ WbafgeU]ag #
Predominant Pitch Estimation...
[PUO DUYU M U e
Predominant Pitch Estimation
Pitch Normalization
Uniform Time-scaling
Distance Computation
Melodic Similar...
7bM aM U[ 1 D a
E GUggYeaf
+ E IUaXb
fY[ Yagf
Kbc +
aYUeYfg
aY][ Vbef
DYUa
9iYeU[Y
GeYW]f]ba
81
6M M_
49
IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW(
49
IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW(
UeaUg]W hf]W XUgUfYg
]aXhfgUa] hf]...
C _a _
ODIC SIMILARITY: APPROACHES AND EVALUATION 131
Dataset MAP Srate Norm TScale Dist
MSDcmd
iitm
0.413 w67 Zmean Woff ...
C _a _
ODIC SIMILARITY: APPROACHES AND EVALUATION 131
Dataset MAP Srate Norm TScale Dist
MSDcmd
iitm
0.413 w67 Zmean Woff ...
C _a _1 DaYYM e
U c_]a[ eUgY
Ebe U_]mUg]ba
K] Y fWU_]a[
<]fgUaWY
CbWU_ WbafgeU]ag
?_bVU_ WbafgeU]ag
][
DYUa
Eb WbafYafhf
<...
C _a _1 DaYYM e
Bab a La ab a
G eUfY Y[ YagUg]ba
UeaUg]W
]aXhfgUa]
• Gulati, S., Serrà, J., & Serra, X. (2015). An evaluat...
Y [bU S [PUO DUYU M U e
Predominant Pitch Estimation
Pitch Normalization
Uniform Time-scaling
Distance Computation
Melodic...
Y [bU S [PUO DUYU M U e
]aXhfgUa] hf]W
88
Y [bU S [PUO DUYU M U e
• G. E. Batista, X. Wang, and E. J Keogh. A complexity-invariant distance measure for time series,...
Y [bU S [PUO DUYU M U e
Predominant Pitch Est. + Post Processing
Pitch Normalization
Svara Duration Truncation
Distance Co...
6M M_
49
IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW(
49
IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW(
UeaUg]W hf]W XUgUfYg
]aXhfgUa] hf]...
C _a _ MELODIC PATTERN PROCESSING
MSDhmd
CM
Norm MB MDT MCW1 MCW2
Ztonic 0.45 (0.25) 0.52 (0.24) - -
Zmean 0.25 (0.20) 0.3...
C _a _ MELODIC PATTERN PROCESSING
MSDhmd
CM
Norm MB MDT MCW1 MCW2
Ztonic 0.45 (0.25) 0.52 (0.24) - -
Zmean 0.25 (0.20) 0.3...
C _a _1 DaYYM e
<heUg]ba gehaWUg]ba
b c_Yk]gl NY][ g]a[
UeaUg]W
]aXhfgUa]
UeaUg]W
• Gulati, S., Serrà, J., & Serra, X. (20...
[PUO BM B [O __U S
DY_bX]W
] ]_Ue]gl
GUggYea
<]fWbiYel
GUggYea
UeUWgYe]mUg]ba
hcYei]fYX LafhcYei]fYX LafhcYei]fYX
95
BM 6U_O[b e
AageU eYWbeX]a[
GUggYea <]fWbiYel
AagYe eYWbeX]a[
GUggYea YUeW
IUa IYZ]aY Yag
I+
I,
I-
96
M O[ PU S BM 6U_O[b e
§ 6e MYUO UY cM U S 6EH
?_bVU_ WbafgeU]ag
#+ _Ya[g
Eb CbWU_ WbafgeU]ag
gYc4 n# &+ & #+&+ & #+& p
97
CM W C RU Y
CbWU_ WbafgeU]ag
gYc4 n#,&+ & #+&+ & #+&, p
CbWU_ Wbfg ZhaWg]ba
#M+& M,& M-& M.
98
5[Y a M U[ M 5[Y dU e
F#a, à U c_Yf
F#a, à UaX]XUgYf
-/ k #0 k 0 k / k (/ 6 -+ D]__]ba
<]fgUaWY Cb Ye :bhaXf
99
>[c 4[a P_ R[ 6EH
<KehY 7 <C:
<C: 7 <CUfg
<CUfg
Kbc E dhYhY
100
>[c 4[a P_ R[ 6EH
§ 5M_OMP P [c N[a P_ JRakthanmanon et al. (2013)K
§ U _ >M_ N[a P JKim et al. (2001)K
§ >4L [ST JKeogh e...
6M M_
§ D P M _1 /0 (((
§ D M OT M _1 j)- ((( (((
49
IYWbeX]a[f <heUg]ba
UeaUg]W hf]W
102
C _a _
103
C _a _1 5[Y a M U[ _
§ M O[ PU S1 )&,) E
§ O[ PU S1 ) &, E
>]efg CUfg
C:TBYb[ T=H
C:TBYb[ T=
/,
,-
+
./
/+
-
Cb Ye VbhaX A...
C _a _
160 MELODIC PATTER
Seed Category V1 V2 V3 V4
S1 0.92 0.92 0.91 0.89
S2 0.68 0.73 0.73 0.66
S3 0.35 0.34 0.35 0.35
T...
C _a _
160 MELODIC PATTER
Seed Category V1 V2 V3 V4
S1 0.92 0.92 0.91 0.89
S2 0.68 0.73 0.73 0.66
S3 0.35 0.34 0.35 0.35
T...
C _a _
AageU eYWbeX]a[
AagYe eYWbeX]a[
+ >GI à 2 KGI
+ >GI à / KGI
• Gulati, S., Serrà, J., Ishwar, V., & Serra, X. (2014)...
[PUO BM B [O __U S
6U__UYU M M _ bM M _
108
[PUO BM B [O __U S
DY_bX]W
] ]_Ue]gl
GUggYea
<]fWbiYel
GUggYea
UeUWgYe]mUg]ba
hcYei]fYX LafhcYei]fYX LafhcYei]fYX
109
[PUO BM 5TM MO UfM U[
9MYMWM_ h 3 M Wl _
5[Y [_U U[ _ OURUO M _
ClSM Y[ UR_
110
[PUO BM 5TM MO UfM U[
Inter-recording Pattern
Detection
Intra-recording Pattern
Discovery
Data Processing
Pattern Network
...
[PUO BM 5TM MO UfM U[
Inter-recording Pattern
Detection
Intra-recording Pattern
Discovery
Data Processing
Pattern Network
...
[PUO BM 5TM MO UfM U[
• M. EJ Newman, “The structure and function of complex networks,” Society for Industrial and Applied...
[PUO BM 5TM MO UfM U[
• S. Maslov and K. Sneppen, “Specificity and stability in topology of protein networks,” Science, vo...
[PUO BM 5TM MO UfM U[
• V. D. Blondel, J. L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in l...
[PUO BM 5TM MO UfM U[
Pattern Network
Generation
Network
Filtering
Community Detection and
Characterization
b ha]gl KlcY E...
[PUO BM 5TM MO UfM U[
Pattern Network
Generation
Network
Filtering
Community Detection and
Characterization
Ix[U X]fge]Vhg...
[PUO BM 5TM MO UfM U[
Pattern Network
Generation
Network
Filtering
Community Detection and
Characterization
IYWbeX]a[ X]fg...
[PUO BM 5TM MO UfM U[
Pattern Network
Generation
Network
Filtering
Community Detection and
Characterization
5.24: Graphica...
7bM aM U[
49
IYWbeX]a[f <heUg]ba Ix[Uf b cbf]g]baf
+ Ix[U k + b ha]gl 6 + G eUfYf
+ Dhf]W]Uaf
120
C _a _
5.5 CHARACTERIZATION OF MELODIC PATTERNS 173
R¯aga µr sr R¯aga µr sr
Hamsadhv¯ani 0.84 0.23 B¯egad. a 0.88 0.11
K¯a...
C _a _
5.5 CHARACTERIZATION OF MELODIC PATTERNS 173
R¯aga µr sr R¯aga µr sr
Hamsadhv¯ani 0.84 0.23 B¯egad. a 0.88 0.11
K¯a...
C _a _1 DaYYM e
DYUaIx[U# DYUaG eUfY# IUg]a[f 5 (3,(1/ 5
(2/
• Gulati, S., Serrà, J., Ishwar, V., & Serra, X. (2016). Disc...
3a [YM UO ClSM C O[S U U[
124
Chapter 6
Automatic R¯aga Recognition
6.1 Introduction
In this chapter, we address the task ...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
38BACKGROUND
Methods Svara set
Svara
salience
Svara
intonation
¯ar¯ohana-
avr¯ohana
Melodic
phra...
3a [YM UO ClSM C O[S U U[
2.4RELATEDWORKININDIANARTMUSIC39
Method Tonal Feature Tonic
Identification
Feature Recognition Me...
2.4RELATEDWORKININDIANARTMUSIC39
Method Tonal Feature Tonic
Identification
Feature Recognition Method #R¯agas Dataset
(Dur....
6M M_
49
IYWbeX]a[f <heUg]ba CYUX 9eg]fgfbaWYegfIx[Uf
49
IYWbeX]a[f <heUg]ba CYUX 9eg]fgfIY_YUfYfIx[Uf
UeaUg]W hf]W XUgUfY...
B [ [_ P 3 [MOT _
time (s)
10
time (s)
10
30
0 20 40 60 80 100 120
Index
0
20
40
60
80
100
120
Index
(a)
0 20 40 60 80 100...
B [ [_ P 3 [MOT _
time (s)
10
time (s)
10
30
0 20 40 60 80 100 120
Index
0
20
40
60
80
100
120
Index
(a)
0 20 40 60 80 100...
BT M_ NM_ P ClSM C O[S U U[
Inter-recording
Pattern Detection
Intra-recording
Pattern Discovery
Data
Processing
Music coll...
BT M_ NM_ P ClSM C O[S U U[
Inter-recording
Pattern Detection
Intra-recording
Pattern Discovery
Data
Processing
Music coll...
BT M_ NM_ P ClSM C O[S U U[
141
BT M_ NM_ P ClSM C O[S U U[
142
BT M_ NM_ P ClSM C O[S U U[
Kbc]W DbXY_]a[ o KYkg _Uff]Z]WUg]ba
G eUfYf
Ix[U
IYWbeX]a[f
NbeXf
Kbc]W
<bWh Yagf
143
BT M_ NM_ P ClSM C O[S U U[
Kbc]W DbXY_]a[ o KYkg _Uff]Z]WUg]ba
KYe >eYdhYaWl t AaiYefY <bWh Yag
>eYdhYaWl #K> A<>
144
E 6 M a 7d MO U[
Vocabulary
Extraction
Term-frequency
Feature Extraction
Feature
Normalization
145
E 6 M a 7d MO U[
Vocabulary
Extraction
Term-frequency
Feature Extraction
Feature
Normalization
146
E 6 M a 7d MO U[
Vocabulary
Extraction
Term-frequency
Feature Extraction
Feature
Normalization
+ .
, ,
+ +
- - -
147
E 6 M a 7d MO U[
Vocabulary
Extraction
Term-frequency
Feature Extraction
Feature
Normalization
ies except the ones that co...
C _a _
2 PATTERN-BASED R ¯AGA RECOGNITION 185
Method Feature SVML SGD NBM NBG RF LR 1-NN
MVSM
f1 51.04 55 37.5 54.37 25.41...
C _a _
2 PATTERN-BASED R ¯AGA RECOGNITION 185
Method Feature SVML SGD NBM NBG RF LR 1-NN
MVSM
f1 51.04 55 37.5 54.37 25.41...
C _a _1 DaYYM e
beX]U wYague #, +-
GebcbfYX
BbXhe] Yg U_( #, +.
G <>h__
K> A<> #>-
G <cUeU
1-
01
//
3,
2-
UeaUg]W
#
]aXhfg...
7 [ 3 M e_U_
R1-S.anmukhapriya
R2-K¯api
R3-Bhairavi
R4-Madhyam¯avati
R5-Bilahari
R6-M¯ohana˙m
R7-Sencurut.t.i
R8-´Sr¯ıranj...
7 [ 3 M e_U_
R1-S.anmukhapriya
R2-K¯api
R3-Bhairavi
R4-Madhyam¯avati
R5-Bilahari
R6-M¯ohana˙m
R7-Sencurut.t.i
R8-´Sr¯ıranj...
7 [ 3 M e_U_
R1-S.anmukhapriya
R2-K¯api
R3-Bhairavi
R4-Madhyam¯avati
R5-Bilahari
R6-M¯ohana˙m
R7-Sencurut.t.i
R8-´Sr¯ıranj...
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
Upcoming SlideShare
Loading in …5
×

Computational Approaches for Melodic Description in Indian Art Music Corpora

Presentation for my PhD defense, Music Technology Group, Barcelona, Spain.
Resources: http://compmusic.upf.edu/node/304

Short abstract:
Automatically describing contents of recorded music is crucial for interacting with large volumes of audio recordings, and for developing novel tools to facilitate music pedagogy. Melody is a fundamental facet in most music traditions and, therefore, is an indispensable component in such description. In this thesis, we develop computational approaches for analyzing high-level melodic aspects of music performances in Indian art music (IAM), with which we can describe and interlink large amounts of audio recordings. With its complex melodic framework and well-grounded theory, the description of IAM melody beyond pitch contours offers a very interesting and challenging research topic. We analyze melodies within their tonal context, identify melodic patterns, compare them both within and across music pieces, and finally, characterize the specific melodic context of IAM, the rāgas. All these analyses are done using data-driven methodologies on sizable curated music corpora. Our work paves the way for addressing several interesting research problems in the field of music information research, as well as developing novel applications in the context of music discovery and music pedagogy.

  • Be the first to comment

  • Be the first to like this

Computational Approaches for Melodic Description in Indian Art Music Corpora

  1. 1. 5[Y a M U[ M 3 [MOT _ R[ [PUO 6 _O U U[ U PUM 3 a_UO 5[ [ M . 6[O [ M ET _U_ B _ M U[ a_UO E OT [ [Se 9 [a 6 M Y [R R[ YM U[ M P 5[YYa UOM U[ E OT [ [SU _ F Ub _U M B[Y a MN M 4M O [ M D MU ET _U_ PU O [ 1 ET _U_ O[YYU Y YN _1 . )/ [b YN ().
  2. 2. Aa U [PaO U[ a_UO 5[ [ M M P 6M M_ _ [Pe 6 _O U [ _ M P C _ M U[ _ [Pe BM B [O __U S 3 UOM U[ _ DaYYM e M P a a B _ O Ub _ClSM C O[S U U[ 4
  3. 3. Aa U [PaO U[ a_UO 5[ [ M M P 6M M_ _ [Pe 6 _O U [ _ M P C _ M U[ _ [Pe BM B [O __U S 3 UOM U[ _ DaYYM e M P a a B _ O Ub _ClSM C O[S U U[ Chapter 1 Introduction Information technology (IT) is constantly shaping the world that we live in. Its ad- vancements have changed human behavior and influenced the way we connect with our environment. Music being a universal socio-cultural phenomena has been deeply influenced by these advancements. The way music is created, stored, disseminated, listened to, and even learned has changed drastically over the last few decades. A massive amount of audio music content is now available on demand. Thus, it becomes necessary to develop computational techniques that can process and automatically de- scribe large volumes of digital music content to facilitate novel ways of interaction with it. There are different information sources such as editorial metadata, social data, and audio recordings that can be exploited to generate a description of music. Melody, along with harmony and rhythm, is a fundamental facet of music, and therefore, an essential component in its description. In this thesis, we focus on describing melodic aspects of music through an automated analysis of audio content. 1.1 Motivation “It is melody that enables us to distinguish one work from another. It is melody that human beings are innately able to reproduce by singing, humming, and whistling. It is melody that makes music memorable: we are likely to recall a tune long after we have forgotten its text.” (Selfridge-Field, 1998) The importance of melody in our musical experiences makes its analysis and descrip- tion a crucial component in music content processing. It becomes even more import- ant for melody dominant music traditions such as Indian art music (IAM), where the concept of harmony (functional harmony as understood in common practice) does not exist, and the complex melodic structure takes the central role in music aesthetics. 1 Chapter 3 Music Corpora and Datasets 3.1 Introduction A research corpus is a collection of data compiled to study a research problem. A well designed research corpus is representative of the domain under study. It is practically infeasible to work with the entire universe of data. Therefore, to ensure scalability of information retrieval technologies to real-world scenarios, it is to important to develop and test computational approaches using a representative data corpus. Moreover, an easily accessible data corpus provides a common ground for researchers to evaluate their methods, and thus, accelerates knowledge sharing and advancement. Not every computational task requires the entire research corpus for development and evaluation of approaches. Typically a subset of the corpus is used in a specific research task. We call this subset a test corpus or test dataset. The models built over a test dataset can later be extended to the entire research corpus. Test dataset is a static collection of data specific to an experiment, as opposed to a research corpus, which can evolve over time. Therefore, different versions of the test dataset used in a specific experiment should be retained for ensuring reproducibility of the research results. Note that a test dataset should not be confused with the training and testing split of a dataset, which are the terms used in the context of a cross validation experimental setup. In MIR, a considerable number of the computational approaches follow a data-driven methodology, and hence, a well curated research corpus becomes a key factor in de- termining the success of these approaches. Due to the importance of a good data corpus in research, building a corpus in itself is a fundamental research task (MacMul- len, 2003). MIR can be regarded as a relatively new research area within information retrieval, which has primarily gained popularity in last two decades. Even today, a significant number of the studies in MIR use ad-hoc procedures to build a collection of data to be used in the experiments. Quite often the audio recordings used in the experiments are taken from a researcher’s personal music collection. Availability of a good representative research corpus has been a challenge in MIR (Serra, 2014). This 61 Chapter 4 Melody Descriptors and Representations 4.1 Introduction In this chapter, we describe methods for extracting relevant melodic descriptors and low-level melody representation from raw audio signals. These melodic features are used as inputs to perform higher level melodic analyses by the methods described in the subsequent chapters. Since these features are used in a number of computational tasks, we consolidate and present these common processing blocks in the current chapter. This chapter is largely based on our published work presented in Gulati et al. (2014a,b,c). There are four sections in this chapter, and in each section, we describe the extraction of a different melodic descriptor or a representation. In Section 4.2, we focus on the task of automatically identifying the tonic pitch in an audio recording. Our main objective is to perform a comparative evalu- ation of different existing methods and select the best method to be used in our work. In Section 4.3, we present the method used to extract the predominant pitch from audio signals, and describe the post-processing steps to reduce frequently occurring errors. In Section 4.4, we describe the process of segmenting the solo percussion re- gions (Tani sections) in the audio recordings of Carnatic music. In Section 4.5, we describe our ny¯as-based approach for segmenting melodies in Hindustani music. 87 Chapter 5 Melodic Pattern Processing: Similarity, Discovery and Characterization 5.1 Introduction In this chapter, we present our methodology for discovering musically relevant melodic patterns in sizable audio collections of IAM. We address three main computational tasks involved in this process: melodic similarity, pattern discovery and characteriz- ation of the discovered melodic patterns. We refer to these different tasks jointly as melodic pattern processing. “Only by repetition can a series of tones be characterized as something definite. Only repetition can demarcate a series of tones and its purpose. Repetition thus is the basis of music as an art” (Schenker et al., 1980) Repeating patterns are at the core of music. Consequently, analysis of patterns is fundamental in music analysis. In IAM, recurring melodic patterns are the building blocks of melodic structures. They provide a base for improvisation and composition, and thus, are crucial to the analysis and description of r¯agas, compositions, and artists in this music tradition. A detailed account of the importance of melodic patterns in IAM is provided in Section 2.3.2. To recapitulate, from the literature review presented in Section 2.4.2 and Section 2.5.2 we see that the approaches for pattern processing in music can be broadly put into two categories (Figure 5.1). One of the types of approaches perform pattern detection (or matching) and follow a supervised methodology. In these approaches the system knows a priori the pattern to be extracted. Typically such a system is fed with exem- plar patterns or queries and is expected to extract all of their occurrences in a piece of 121 Chapter 6 Automatic R¯aga Recognition 6.1 Introduction In this chapter, we address the task of automatically recognizing r¯agas in audio re- cordings of IAM. We describe two novel approaches for r¯aga recognition that jointly capture the tonal and the temporal aspects of melody. The contents of this chapter are largely based on our published work in Gulati et al. (2016a,b). R¯aga is a core musical concept used in compositions, performances, music organiz- ation, and pedagogy of IAM. Even beyond the art music, numerous compositions in Indian folk and film music are also based on r¯agas (Ganti, 2013). R¯aga is therefore one of the most desired melodic descriptions of a recorded performance of IAM, and an important criterion used by listeners to browse its audio music collections. Despite its significance, there exists a large volume of audio content whose r¯aga is incorrectly labeled or, simply, unlabeled. A computational approach to r¯aga recognition will al- low us to automatically annotate large collections of audio music recordings. It will enable r¯aga-based music retrieval in large audio archives, semantically-meaningful music discovery and musicologically-informed navigation. Furthermore, a deeper understanding of the r¯aga framework from a computational perspective will pave the way for building applications for music pedagogy in IAM. R¯aga recognition is the most studied research topic in MIR of IAM. There exist a considerable number of approaches utilizing different characteristic aspects of r¯agas such as svara set, svara salience and ¯ar¯ohana-avr¯ohana. A critical in-depth review of the existing approaches for r¯aga recognition is presented in Section 2.4.3, wherein we identify several shortcomings in these approaches and possible avenues for scientific contribution to take this task to the next level. Here we provide a short summary of this analysis: Nearly half of the number of the existing approaches for r¯aga recognition do not utilize the temporal aspects of melody at all (Table 2.3), which are crucial 177 Chapter 7 Applications 7.1 Introduction The research work presented in this thesis has several applications. A number of these applications have already been described in Section 1.1. While some applications such as r¯aga-based music retrieval and music discovery are relevant mainly in the context of large audio collections, there are several applications that can be developed on the scale of the music collection compiled in the CompMusic project. In this chapter, we present some concrete examples of such applications that have already incorporated parts of our work. We provide a brief overview of Dunya, a collection of the music corpora and software tools developed during the CompMusic project and, Sar¯aga and Riy¯az, the mobile applications developed within the technology transfer project, Culture Aware MUsic Technologies (CAMUT). In addition, we present three web demos that showcase parts of the outcomes of our computational methods. We also briefly present one of our recent studies that performs musicologically motivated exploration of melodic structures in IAM. It serves as an example of how our methods can facilitate investigations in computational musicology. 7.2 Dunya Dunya55 comprises the music corpora and the software tools that have been developed as part of the CompMusic project. It includes data for five music traditions - Hindus- tani, Carnatic, Turkish Makam, Jingju and Andalusian music. By providing access to the gathered data (such as audio recordings and metadata), and the generated data (such as audio descriptors) in the project, Dunya aims to facilitate study and explor- ation of relevant musical aspects of different music repertoires. The CompMusic corpora mainly comprise audio recordings and complementary information that de- scribes those recordings. This complementary information can be in the form of the 55http://dunya.compmusic.upf.edu/ 207 Chapter 8 Summary and Future Perspectives 8.1 Introduction In this thesis, we have presented a number of computational approaches for analyzing melodic elements at different levels of melodic organization in IAM. In tandem, these approaches generate a high-level melodic description of the audio collections in this music tradition. They build upon each other to finally allow us to achieve the goals that we set at the beginning of this thesis, which are: To curate and structure representative music corpora of IAM that comprise au- dio recordings and the associated metadata, and use that to compile sizable and well annotated tests datasets for melodic analyses. To develop data-driven computational approaches for the discovery and char- acterization of musically relevant melodic patterns in sizable audio collections of IAM To devise computational approaches for automatically recognizing r¯agas in re- corded performances of IAM. Based on the results presented in this thesis, we can now say that our goals are suc- cessfully met. We started this thesis by presenting our motivation behind the analysis and description of melodies in IAM, highlighting the opportunities and challenges that this music tradition offers within the context of music information research and the CompMusic project (Chapter 1). We provided a brief introduction to IAM and its melodic organization, and critically reviewed the existing literature on related topics within the context of MIR and IAM (Chapter 2). In Chapter 3, we provided a comprehensive overview of the music corpora and data- sets that were compiled and structured in our work as a part of the CompMusic project. To the best of our knowledge, these corpora comprise the largest audio collections of IAM along with curated metadata and automatically extracted music 221 Chapter 2 Background 2.1 Introduction In this chapter, we present our review of the existing literature related with the work presented in this thesis. In addition, we also present a brief overview of the relevant music and mathematical background. We start with a brief discussion on the termin- ology used in this thesis (Section 2.2). Subsequently, we provide an overview of the selected music concepts to better understand the computational tasks addressed in our work (Section 2.3). We then present a review of the relevant literature, which we divide into two parts. We first present a review of the work done in computational ana- lysis of IAM (Section 2.4). This includes approaches for tonic identification, melodic pattern processing and automatic r¯aga recognition. In the second part, we present relevant work done in MIR in general, including topics related to pattern processing (detection and discovery), and key estimation (Section 2.5). Finally, we provide a brief overview of selected scientific concepts (Section 2.6). 2.2 Terminology In this section, we provide our working definition of selected terms that we have used throughout the thesis. To begin with, it is important to understand the mean- ing of melody in the context of this thesis. As we see from the literature, defining melody in itself has been a challenge, with no consensus on a single definition of melody (Gómez et al., 2003; Salamon, 2013). We do not aim here to formally define melody in the universal sense, but present its working definition within the scope of this thesis. Before we proceed, it is important to understand the setup of a concert of IAM. Every performance of IAM has a lead artist, who plays the central role (also literally positioned in the center of the stage), and all other instruments are considered as accompaniments. There is a main melody line by the lead performer, and typic- ally, also a melodic accompaniment that follows the main melody. A more detailed 13 5
  4. 4. [PaO U[ 6 Chapter 1 Introduction Information technology (IT) is constantly shaping the world that we live in. Its ad- vancements have changed human behavior and influenced the way we connect with our environment. Music being a universal socio-cultural phenomena has been deeply influenced by these advancements. The way music is created, stored, disseminated, listened to, and even learned has changed drastically over the last few decades. A massive amount of audio music content is now available on demand. Thus, it becomes necessary to develop computational techniques that can process and automatically de- scribe large volumes of digital music content to facilitate novel ways of interaction with it. There are different information sources such as editorial metadata, social data, and audio recordings that can be exploited to generate a description of music. Melody, along with harmony and rhythm, is a fundamental facet of music, and therefore, an essential component in its description. In this thesis, we focus on describing melodic aspects of music through an automated analysis of audio content. 1.1 Motivation “It is melody that enables us to distinguish one work from another. It is melody that human beings are innately able to reproduce by singing, humming, and whistling. It is melody that makes music memorable: we are likely to recall a tune long after we have forgotten its text.” (Selfridge-Field, 1998) The importance of melody in our musical experiences makes its analysis and descrip- tion a crucial component in music content processing. It becomes even more import- ant for melody dominant music traditions such as Indian art music (IAM), where the concept of harmony (functional harmony as understood in common practice) does not exist, and the complex melodic structure takes the central role in music aesthetics. 1
  5. 5. [ UbM U[ 3a [YM UO Ya_UO P _O U U[ 7
  6. 6. [ UbM U[ ] ]_Ue]gl& <]fWbiYel& YUeW & IUX]b& IYWb YaXUg]ba GYXU[b[l& =XhWUg]ba& =agYegU]a Yag& =a UaWYX _]fgYa]a[ b chgUg]baU_ hf]Wb_b[l 8
  7. 7. [ UbM U[ IYceYfYagUg]ba Y[ YagUg]ba ] ]_Ue]gl Dbg]Zf <]fWbiYel GUggYea UeUWgYe]mUg]ba DY_bX]W <YfWe]cg]ba 9
  8. 8. 5[Y a_UO B [V O • Serra, X. (2011). A multicultural approach to music information research. In Proc. of ISMIR, pp. 151–156. <YfWe]cg]ba <UgU Xe]iYa h_gheY U UeY FcYa UWWYff IYcebXhW]V]_]gl 10
  9. 9. PUM M a_UO 11 ]aXhfgUa] BUhfghi B( ?Ua[h_] UeaUg]W M][aYf Af Ue
  10. 10. E YU [ [Se § E[ UO1 4M_ U OT GM U _ MO [__ M U_ _ § [Pe1 BU OT [R T P[YU M Y [PUO _[a O § ClSM1 [PUO R MY c[ W § DbM M1 3 M [S[a_ [ M [ § 3 mTM M Mb mTM M1 3_O PU S P _O PU S [S __U[ § 5TM M 1 3N_ MO U[ [R T Y [PUO [S __U[ § [ UR_1 5TM MO U_ UO Y [PUO M _ § 9MYMWM_1 9 UPU S Y [PUO S _ a _ 5M M UO Ya_UO 12
  11. 11. A [ a U U _ M P 5TM S _ § A [ a U U _ 13
  12. 12. A [ a U U _ M P 5TM S _ § A [ a U U _ § H _ aPU P Ya_UO MPU U[ 14
  13. 13. A [ a U U _ M P 5TM S _ § A [ a U U _ § H _ aPU P Ya_UO MPU U[ § : [ T[ UO MO[a_ UOM OTM MO U_ UO_ 15
  14. 14. A [ a U U _ M P 5TM S _ § A [ a U U _ § H _ aPU P Ya_UO MPU U[ § : [ T[ UO MO[a_ UOM OTM MO U_ UO_ • http://www.music- ir.org/mirex/wiki/MIREX_HOME • Salamon, J. & Gómez, E. (2012). Melody extraction from polyphonic music signals using pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20(6), 1759–1770. DAI=O /DAI=O 3 /<: AE<A9E 2 16 < <
  15. 15. A [ a U U _ M P 5TM S _ § A [ a U U _ § H _ aPU P Ya_UO MPU U[ § : [ T[ UO MO[a_ UOM OTM MO U_ UO_ § [PUO M _ à NaU PU S N [OW_ 17
  16. 16. A [ a U U _ M P 5TM S _ § A [ a U U _ § H _ aPU P Ya_UO MPU U[ § : [ T[ UO MO[a_ UOM OTM MO U_ UO_ § [PUO M _ à NaU PU S N [OW_ § ClSM R MY c[ W1 • Martinez, J. L. (2001). Semiosis in Hindustani Music. Motilal Banarsidass Publishers. 18 rK Y ex[U ]f beY Z]kYX g Ua U bXY& UaX _Yff Z]kYX g Ua g Y Y_bXl& VYlbaX g Y bXY UaX f beg bZ Y_bXl& UaX e]W Ye Vbg g Ua U []iYa bXY be U []iYa Y_bXl(s RDUeg]aYm& G(30S DbXY 5 Ix[U 5 DY_bXl
  17. 17. A [ a U U _ M P 5TM S _ § 5TM S _ 19
  18. 18. A [ a U U _ M P 5TM S _ § 5TM S _ § Y [bU_M [ e Ya_UO MPU U[ Time (s) 10 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 200 400 600 800 000 200 400 20
  19. 19. A [ a U U _ M P 5TM S _ § 5TM S _ § Y [bU_M [ e Ya_UO MPU U[ § [PUO M _O U U[ OTM SU S M P U P RU P 0 2 4 6 8 10 Time (s) 100 0 100 200 300 400 500 600 Frequency(Cents) S m g Characteristic movement • http://www.freesound.org/people/sankalp/sounds/360428/ 21
  20. 20. A [ a U U _ M P 5TM S _ § 5TM S _ § Y [bU_M [ e Ya_UO MPU U[ § [PUO M _O U U[ OTM SU S M P U P RU P § [ _ M PM P R O R ]a Oe E[ UO U OT 9 #++ 9 #++0 : #+,- ? #+ - ? #32 22
  21. 21. A [ a U U _ M P 5TM S _ § 5TM S _ § Y [bU_M [ e Ya_UO MPU U[ § [PUO M _O U U[ OTM SU S M P U P RU P § [ _ M PM P R O R ]a Oe E[ UO U OT § >[ S MaPU[ O[ PU S_ q+ FLI 23
  22. 22. 4 [MP ANV O Ub _ § 5[Y U M U[ Oa M U[ [R O[ [ M [R 3 § 6U_O[b e OTM MO UfM U[ [R Y [PUO M _ § 3a [YM UO lSM O[S U U[ 24
  23. 23. 5[Y a M U[ M EM_W_ DY_bX]W <YfWe]cg]ba 25
  24. 24. 5[Y a M U[ M EM_W_ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba DY_bX]W <YfWe]cg]ba 26
  25. 25. 5[Y a M U[ M EM_W_ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba DY_bX]W <YfWe]cg]ba Elxf Y[ YagUg]ba 27
  26. 26. 5[Y a M U[ M EM_W_ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba DY_bX]W <YfWe]cg]ba DY_bX]W ] ]_Ue]gl Elxf Y[ YagUg]ba 28
  27. 27. 5[Y a M U[ M EM_W_ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba DY_bX]W <YfWe]cg]ba GUggYea <]fWbiYel DY_bX]W ] ]_Ue]gl Elxf Y[ YagUg]ba time (s) 10 time (s) 10 29
  28. 28. 5[Y a M U[ M EM_W_ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba DY_bX]W <YfWe]cg]ba GUggYea UeUWgYe]mUg]ba GUggYea <]fWbiYel DY_bX]W ] ]_Ue]gl Elxf Y[ YagUg]ba time (s) 10 time (s) 10 30
  29. 29. 5[Y a M U[ M EM_W_ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba 9hgb Ug]W Ix[U IYWb[a]g]ba DY_bX]W <YfWe]cg]ba GUggYea <]fWbiYel DY_bX]W ] ]_Ue]gl Elxf Y[ YagUg]ba time (s) 10 time (s) 10 31 GUggYea UeUWgYe]mUg]ba
  30. 30. a_UO 5[ [ M M P 6M M_ _ 32 Chapter 3 Music Corpora and Datasets 3.1 Introduction A research corpus is a collection of data compiled to study a research problem. A well designed research corpus is representative of the domain under study. It is practically infeasible to work with the entire universe of data. Therefore, to ensure scalability of information retrieval technologies to real-world scenarios, it is to important to develop and test computational approaches using a representative data corpus. Moreover, an easily accessible data corpus provides a common ground for researchers to evaluate their methods, and thus, accelerates knowledge sharing and advancement. Not every computational task requires the entire research corpus for development and evaluation of approaches. Typically a subset of the corpus is used in a specific research task. We call this subset a test corpus or test dataset. The models built over a test dataset can later be extended to the entire research corpus. Test dataset is a static collection of data specific to an experiment, as opposed to a research corpus, which can evolve over time. Therefore, different versions of the test dataset used in a specific experiment should be retained for ensuring reproducibility of the research results. Note that a test dataset should not be confused with the training and testing split of a dataset, which are the terms used in the context of a cross validation experimental setup. In MIR, a considerable number of the computational approaches follow a data-driven methodology, and hence, a well curated research corpus becomes a key factor in de- termining the success of these approaches. Due to the importance of a good data corpus in research, building a corpus in itself is a fundamental research task (MacMul- len, 2003). MIR can be regarded as a relatively new research area within information retrieval, which has primarily gained popularity in last two decades. Even today, a significant number of the studies in MIR use ad-hoc procedures to build a collection of data to be used in the experiments. Quite often the audio recordings used in the experiments are taken from a researcher’s personal music collection. Availability of a good representative research corpus has been a challenge in MIR (Serra, 2014). This 61
  31. 31. 5[Y a_UO 5[ [ M 33
  32. 32. 5[Y a_UO 5[ [ M § E MY RR[ 34
  33. 33. 5[Y a_UO 5[ [ M § E MY RR[ § B [O Pa M P P _US O U UM • Serra, X. (2014). Creating research corpora for the computational study of music: the case of the Compmusic project. In Proc. of the 53rd AES Int. Conf. on Semantic Audio. London. • Srinivasamurthy, A., Koduri, G. K., Gulati, S., Ishwar, V., & Serra, X. (2014). Corpora for music information research in Indian art music. In Int. Computer Music Conf./Sound and Music Computing Conf., pp. 1029–1036. Athens, Greece. 35
  34. 34. 5[Y a_UO 5[ [ M 36
  35. 35. 5[Y a_UO 5[ [ M 9hX]b =X]gbe]U_ YgUXUgU <f Dhf]WVeU]am 37
  36. 36. 5M M UO M P :U Pa_ M U a_UO 5[ [ M 8 9 .=3 .20891573 9 5 38
  37. 37. A MOO __ a_UO 5[ [ M 8 9 .=3 .20891573 9 5 39
  38. 38. A MOO __ a_UO 5[ [ M U Uf YWg]baf DY_bX]W c eUfYf 40
  39. 39. E _ 6M M_ _ § E[ UO P URUOM U[ § el_ D SY M U[ § [PUO DUYU M U e § ClSM C O[S U U[ 41
  40. 40. [Pe 6 _O U [ _ M P C _ M U[ _ 42 Chapter 4 Melody Descriptors and Representations 4.1 Introduction In this chapter, we describe methods for extracting relevant melodic descriptors and low-level melody representation from raw audio signals. These melodic features are used as inputs to perform higher level melodic analyses by the methods described in the subsequent chapters. Since these features are used in a number of computational tasks, we consolidate and present these common processing blocks in the current chapter. This chapter is largely based on our published work presented in Gulati et al. (2014a,b,c). There are four sections in this chapter, and in each section, we describe the extraction of a different melodic descriptor or a representation. In Section 4.2, we focus on the task of automatically identifying the tonic pitch in an audio recording. Our main objective is to perform a comparative evalu- ation of different existing methods and select the best method to be used in our work. In Section 4.3, we present the method used to extract the predominant pitch from audio signals, and describe the post-processing steps to reduce frequently occurring errors. In Section 4.4, we describe the process of segmenting the solo percussion re- gions (Tani sections) in the audio recordings of Carnatic music. In Section 4.5, we describe our ny¯as-based approach for segmenting melodies in Hindustani music. 87
  41. 41. [Pe 6 _O U [ _ M P C _ M U[ _ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba Elxf Y[ YagUg]ba GeYXb ]aUag DY_bXl =fg] Ug]ba UaX cbfg cebWYff]a[ KUa] Y[ YagUg]ba 43
  42. 42. [Pe 6 _O U [ _ M P C _ M U[ _ Kba]W AXYag]Z]WUg]ba #=iU_hUg]ba Elxf Y[ YagUg]ba GeYXb ]aUag DY_bXl =fg] Ug]ba UaX cbfg cebWYff]a[ KUa] Y[ YagUg]ba 44
  43. 43. E[ UO P URUOM U[ 1 3 [MOT _ Method Features Feature Distribution Tonic Selection MRS (Sengupta et al., 2005) Pitch (Datta, 1996) NA Error minimization MRH1/MRH2 (Ranjani et al., 2011) Pitch (Boersma & Weenink, 2001) Parzen-window-based PDE GMM fitting MJS (Salamon et al., 2012) Multi-pitch salience (Salamon et al., 2011) Multi-pitch histogram Decision tree MSG (Gulati et al., 2012) Multi-pitch salience (Salamon et al., 2011) Multi-pitch histogram Decision tree Predominant melody (Salamon & Gómez, 2012) Pitch histogram Decision tree MAB1 (Bellur et al., 2012) Pitch (De Cheveigné & Kawahara, 2002) GD histogram Highest peak MAB2 (Bellur et al., 2012) Pitch (De Cheveigné & Kawahara, 2002) GD histogram Template matching MAB2 (Bellur et al., 2012) Pitch (De Cheveigné & Kawahara, 2002) GD histogram Highest peak MCS (Chordia & ¸Sentürk, 2013) ABBREVIATIONS: NA=Not applicable; GD=Group Delay, PDE=Probability Density Estimate Table 2.1: Summary of the existing tonic identification approaches. • Salamon, J., Gulati, S., & Serra, X. (2012). A multipitch approach to tonic identification in Indian classical music. In Proc. of Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 499–504. Porto, Portugal. • Gulati, S., Salamon, J., & Serra, X. (2012). A two-stage approach for tonic identification in indian art music. In Proc. of the 2nd CompMusic Workshop, pp. 119–127. Istanbul, Turkey: Universitat Pompeu Fabra. Method Features Feature Distribution Tonic Selection MRS (Sengupta et al., 2005) Pitch (Datta, 1996) NA Error minimization MRH1/MRH2 (Ranjani et al., 2011) Pitch (Boersma & Weenink, 2001) Parzen-window-based PDE GMM fitting MJS (Salamon et al., 2012) Multi-pitch salience (Salamon et al., 2011) Multi-pitch histogram Decision tree MSG (Gulati et al., 2012) Multi-pitch salience (Salamon et al., 2011) Multi-pitch histogram Decision tree Predominant melody (Salamon & Gómez, 2012) Pitch histogram Decision tree MAB1 (Bellur et al., 2012) Pitch (De Cheveigné & Kawahara, 2002) GD histogram Highest peak MAB2 (Bellur et al., 2012) Pitch (De Cheveigné & Kawahara, 2002) GD histogram Template matching MAB2 (Bellur et al., 2012) Pitch (De Cheveigné & Kawahara, 2002) GD histogram Highest peak MCS (Chordia & ¸Sentürk, 2013) ABBREVIATIONS: NA=Not applicable; GD=Group Delay, PDE=Probability Density Estimate Table 2.1: Summary of the existing tonic identification approaches. Dh_g]c]gW % _Uff]Z]WUg]ba GeYXb ]aUag c]gW % gY c_UgY)cYU c]W 45
  44. 44. 5[Y M M Ub 7bM aM U[ § D b Y T[P_ § DUd PUb _ PM M_ _ TIDCM1 TIDCM2 TIDCM3 TIDIITM1 TIDIITM2 TIDIISc ,1+ 3-/ .,2 -2 .1, // - - +/ +.. +, 1 =kWYecgf DYUa XheUg]ba # ]af 46
  45. 45. C _a _ Methods TIDCM1 TIDCM2 TIDCM3 TIDIISc TIDIITM1 TIDIITM2 TP TPC TP TPC TP TPC TP TPC TP TPC TP TPC MRH1 - 81.4 69.6 84.9 73.2 90.8 81.8 83.6 92.1 97.4 80.2 86.9 MRH2 - 63.2 65.7 78.2 68.5 83.5 83.6 83.6 94.7 97.4 83.8 88.8 MAB1 - - - - - - - - 89.5 89.5 - - MAB2 - 88.9 74.5 82.9 78.5 83.4 72.7 76.4 92.1 92.1 86.6 89.1 MAB3 - 86 61.1 80.5 67.8 79.9 72.7 72.7 94.7 94.7 85 86.6 MJS - 88.9 87.4 90.1 88.4 91 75.6 77.5 89.5 97.4 90.8 94.1 MSG - 92.2 87.8 90.9 87.7 90.5 79.8 85.3 97.4 97.4 93.6 93.6 Table 4.1: Accuracies (%) for tonic pitch (TP) and tonic pitch-class (TPC) identification by seven methods on six different datasets using only audio data. The best accuracy obtained for each dataset is highlighted using bold text. The dashed horizontal line divides the methods based on supervised learning (MJS and MSG) and those based on expert knowledge (MRH1, MRH2, MAB1, MAB2 and MAB3). TP column for TIDCM1 is marked as ‘-’, because it consists of only instrumental excerpts for which we not evaluate tonic pitch accuracy. MAB1 is only evaluated on TIDIITM1 since it works on the whole concert recording. 94 MELODY DESCRIPTORS AND REPRESENTATION Methods TIDCM1 TIDCM2 TIDCM3 TIDIISc TIDIITM1 TIDIITM2 MRH1 87.7 83.5 88.9 87.3 97.4 91.7 MRH2 79.55 76.3 82 85.5 97.4 91.5 MAB1 - - - - 97.4 - MAB2 92.3 91.5 94.2 81.8 97.4 91.1 MAB3 87.5 86.7 90.9 81.8 94.7 89.9 MJS 88.9 93.6 92.4 80.9 97.4 92.3 MSG 92.2 90.9 90.5 85.3 97.4 93.6 3aPU[ 3aPU[ MPM M 47
  46. 46. C _a _ CM2 CM3 IITM2 IITM1 60 65 70 75 80 85 90 95 100 PerformanceAccuracy(%) JS SG RH1 RH2 AB2 AB3 Dataset Accuracy(%) 40 50 60 70 80 90 100 PerformanceAccuracy(%) JS SG R H 1R H 2 AB2AB3 JS SG R H 1R H 2 AB2AB3 JS SG R H 1R H 2AB2 AB3 JS SG R H 1R H 2 AB2AB3 Tonic Pitch Tonic Pitch−class Hindustani Carnatic Male Female Accuracy(%) Tonic Pitch-class Tonic Pitch 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H Pa E Ma E Othe CM1 CM2 CM3 IISCB1 IITM1 IITM 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Error CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 Errors(%) Datasets 0 5 10 15 20 25 30 PerformanceAccuracy(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors Hindustani! Carnatic Male Female Accuracy(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors Hindustani! Carnatic Male Female 0 5 10 15 20 25 30 PerformanceAccuracy(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors Hindustani! Carnatic Male Female 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H P M O CM1 CM2 CM3 IISCB1 IITM1 II 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2A Pa Errors Ma Error Other Er CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 Errors(%) Datasets 48
  47. 47. C _a _ CM2 CM3 IITM2 IITM1 60 65 70 75 80 85 90 95 100 PerformanceAccuracy(%) JS SG RH1 RH2 AB2 AB3 Dataset Accuracy(%) 40 50 60 70 80 90 100 PerformanceAccuracy(%) JS SG R H 1R H 2 AB2AB3 JS SG R H 1R H 2 AB2AB3 JS SG R H 1R H 2AB2 AB3 JS SG R H 1R H 2 AB2AB3 Tonic Pitch Tonic Pitch−class Hindustani Carnatic Male Female Accuracy(%) Tonic Pitch-class Tonic Pitch 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H Pa E Ma E Othe CM1 CM2 CM3 IISCB1 IITM1 IITM 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Error CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 Errors(%) Datasets 0 5 10 15 20 25 30 PerformanceAccuracy(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors Hindustani! Carnatic Male Female Accuracy(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors Hindustani! Carnatic Male Female 0 5 10 15 20 25 30 PerformanceAccuracy(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors Hindustani! Carnatic Male Female 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 CM1 CM2 CM3 IISCB1 IITM1 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H P M O CM1 CM2 CM3 IISCB1 IITM1 II 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2A Pa Errors Ma Error Other Er CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 0 5 10 15 20 25 30 Errors(%) JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 JS SGR H 1R H 2AB2AB3 Pa Errors Ma Errors Other Errors CM1 CM2 CM3 IISCB1 IITM1 IITM2 Errors(%) Datasets <heUg]ba bZ YkWYecg UgY[bel ]fY UaU_lf]f <YgU]_YX Yeebe UaU_lf]f UgY[bel ]fY Yeebe 9aU_lf]f 49
  48. 48. C _a _1 DaYYM e § a U U OT O M__URUOM U[ § 3aPU[ § bM UM [ Pa M U[ § 5[ _U_ MPU U[ S P U _ aY § BU OT Y M MW UOW § 3aPU[ Y MPM M § D _U Ub [ Pa M U[ § O[ _U_ MPU U[ S P U _ aY • Gulati, S., Bellur, A., Salamon, J., Ranjani, H. G., Ishwar, V., Murthy, H. A., & Serra, X. (2014). Automatic tonic identification in Indian art music: approaches and evaluation. JNMR, 43(1), 53–71. 50
  49. 49. B P[YU M BU OT 7_ UYM U[ DY_bX]U • Salamon, J. & Gómez, E. (2012). Melody extraction from polyphonic music signals using pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20(6), 1759–1770. • Bogdanov, D., Wack, N., Gómez, E., Gulati, S., Herrera, P., Mayor, O., Roma, G., Salamon, J., Zapata, J., & Serra, X. (2013). Essentia: an audio analysis library for music information retrieval. In ISMIR, pp. 493–498. 51
  50. 50. el_ DbM M D SY M U[ § [PUO _ SY M U[ § 6 UYU Y [PUO T M_ _ 178 180 182 184 186 188 190 −200 0 200 400 600 800 Time (s) F0frequency(cents) N1 N5N4N3N2 52
  51. 51. el_ DbM M D SY M U[ Predominant pitch estimation Tonic identification Audio Nyās segments Pred. pitch estimation and representation Histogram computation SegmentationSvara identification Segmentation Local Feature extractionContextual Local + Contextual Segment classification Segment fusion Segment classification and fusion 53
  52. 52. el_ DbM M D SY M U[ Predominant pitch estimation Tonic identification Audio Nyās segments Pred. pitch estimation and representation Histogram computation SegmentationSvara identification Segmentation Local Feature extractionContextual Local + Contextual Segment classification Segment fusion Segment classification and fusion 54
  53. 53. el_ DbM M D SY M U[ Predominant pitch estimation Tonic identification Audio Nyās segments Histogram computation SegmentationSvara identification Segmentation Local Feature extractionContextual Local + Contextual Segment classification Segment fusion Segment classification and fusion Pred. pitch estimation and representation 55
  54. 54. el_ DbM M D SY M U[ Predominant pitch estimation Tonic identification Audio Nyās segments Histogram computation SegmentationSvara identification Segmentation Local Feature extractionContextual Local + Contextual Segment classification Segment fusion Segment classification and fusion Pred. pitch estimation and representation 0.12, 0.34, 0.59, 0.23, 0.54 0.21, 0.24, 0.54, 0.54, 0.42 0.32, 0.23, 0.34, 0.41, 0.63 0.66, 0.98, 0.74, 0.33, 0.12 0.90, 0.42, 0.14, 0.83, 0.76 56
  55. 55. el_ DbM M D SY M U[ Predominant pitch estimation Tonic identification Audio Nyās segments Histogram computation SegmentationSvara identification Segmentation Local Feature extractionContextual Local + Contextual Segment classification Segment fusion Segment classification and fusion Pred. pitch estimation and representation 0.12, 0.34, 0.59, 0.23, 0.54 0.21, 0.24, 0.54, 0.54, 0.42 0.32, 0.23, 0.34, 0.41, 0.63 0.66, 0.98, 0.74, 0.33, 0.12 0.90, 0.42, 0.14, 0.83, 0.76 Nyās Non-nyās Nyās Nyās Non-nyās 57
  56. 56. 6M M_ § k l R[ YM O _ :U Pa_ M U § 3 [ M U[ _1 Ya_UOUM 2)- e M _ MU U S § ) -/ el_ _ SY _ 657 IYWbeX]a[f <heUg]ba 9eg]fgf Ix[Uf 58
  57. 57. C _a _118 MELODY DESCRIPTORS AND REPRESENTATIONS Segmentation Feat. DTW Tree k-NN NB LR SVM PLS FL 0.356 0.407 0.447 0.248 0.449 0.453 FC 0.284 0.394 0.387 0.383 0.389 0.406 FL+FC 0.289 0.414 0.426 0.409 0.432 0.437 Proposed FL 0.524 0.672 0.719 0.491 0.736 0.749 FC 0.436 0.629 0.615 0.641 0.621 0.673 FL+FC 0.446 0.682 0.708 0.591 0.725 0.735 Table 4.3: F-scores for ny¯as boundary detection using PLS method and the proposed seg- mentation method. Results are shown for different classifiers (Tree, k-NN, NB, LR, SVM) and local (FL), contextual (FC) and local together with contextual (FL+FC) features. DTW is the baseline method used for comparison. F-score for the random baseline obtained using BR2 is 0.184. Segmentation Feat. DTW Tree k-NN NB LR SVM PLS FL 0.553 0.685 0.723 0.621 0.727 0.722 FC 0.251 0.639 0.631 0.690 0.688 0.674 FL+FC 0.389 0.694 0.693 0.708 0.722 0.706 Segmentation Feat. DTW Tree k-NN NB LR SVM PLS FL 0.356 0.407 0.447 0.248 0.449 0.453 FC 0.284 0.394 0.387 0.383 0.389 0.406 FL+FC 0.289 0.414 0.426 0.409 0.432 0.437 Proposed FL 0.524 0.672 0.719 0.491 0.736 0.749 FC 0.436 0.629 0.615 0.641 0.621 0.673 FL+FC 0.446 0.682 0.708 0.591 0.725 0.735 Table 4.3: F-scores for ny¯as boundary detection using PLS method and the proposed seg- mentation method. Results are shown for different classifiers (Tree, k-NN, NB, LR, SVM) and local (FL), contextual (FC) and local together with contextual (FL+FC) features. DTW is the baseline method used for comparison. F-score for the random baseline obtained using BR2 is 0.184. Segmentation Feat. DTW Tree k-NN NB LR SVM PLS FL 0.553 0.685 0.723 0.621 0.727 0.722 FC 0.251 0.639 0.631 0.690 0.688 0.674 FL+FC 0.389 0.694 0.693 0.708 0.722 0.706 Proposed FL 0.546 0.708 0.754 0.714 0.749 0.758 FC 0.281 0.671 0.611 0.697 0.689 0.697 FL+FC 0.332 0.672 0.710 0.730 0.743 0.731 :bhaXUelCUVY_ 59
  58. 58. C _a _1 DaYYM e § B [ [_ P _ SY M U[ § M a d MO U[ § >[OM R M a _ § BU O cU_ U M _ SY M U[ § BM YM OTU S 6EH § >[OM 5[ d aM R M a _ • Gulati, S., Serrà, J., Ganguli, K. K., & Serra, X. (2014). Landmark detection in Hindustani music melodies. In ICMC-SMC, pp. 1062-1068. 60
  59. 59. [PUO BM B [O __U S 61 Chapter 5 Melodic Pattern Processing: Similarity, Discovery and Characterization 5.1 Introduction In this chapter, we present our methodology for discovering musically relevant melodic patterns in sizable audio collections of IAM. We address three main computational tasks involved in this process: melodic similarity, pattern discovery and characteriz- ation of the discovered melodic patterns. We refer to these different tasks jointly as melodic pattern processing. “Only by repetition can a series of tones be characterized as something definite. Only repetition can demarcate a series of tones and its purpose. Repetition thus is the basis of music as an art” (Schenker et al., 1980) Repeating patterns are at the core of music. Consequently, analysis of patterns is fundamental in music analysis. In IAM, recurring melodic patterns are the building blocks of melodic structures. They provide a base for improvisation and composition, and thus, are crucial to the analysis and description of r¯agas, compositions, and artists in this music tradition. A detailed account of the importance of melodic patterns in IAM is provided in Section 2.3.2. To recapitulate, from the literature review presented in Section 2.4.2 and Section 2.5.2 we see that the approaches for pattern processing in music can be broadly put into two categories (Figure 5.1). One of the types of approaches perform pattern detection (or matching) and follow a supervised methodology. In these approaches the system knows a priori the pattern to be extracted. Typically such a system is fed with exem- plar patterns or queries and is expected to extract all of their occurrences in a piece of 121
  60. 60. [PUO BM B [O __U S 62
  61. 61. [PUO BM B [O __U S 63
  62. 62. [PUO BM B [O __U S 64
  63. 63. 7dU_ U S 3 [MOT _ R[ 3 Method Task Melody Representation Segmentation Similarity Measure Speed-up #R¯agas #Rec #Patt #Occ Ishwar et al. (2012) Distinction Continuous GT annotations HMM - 5c NA 6 431 Ross & Rao (2012) Detection Continuous Pa ny¯as DTW Segmentation 1h 2 2 107 Ross et al. (2012) Detection Continuous, SAX-12,1200 Sama location Euc., DTW Segmentation 3h 4 3 107 Ishwar et al. (2013) Detection Stationary point, Continuous Brute-force RLCS Two stage method 2c 47 4 173 Rao et al. (2013) Distinction Continuous GT annotations DTW - 1h 8 3 268 Dutta & Murthy (2014a) Discovery Stationary point, Continuous Brute-force RLCS - 5c 59 - - Dutta & Murthy (2014b) Detection Stationary point, Continuous NA Modified- RLCS Two stage method 1c 16 NA 59 Rao et al. (2014) Distinction Continuous GT annotations DTW - 1h 8 3 268 Distinction Continuous GT annotations HMM - 5c NA 10 652 Ganguli et al. (2015) Detection BSS, Transcription - Smith- Waterman Discretization 34h 50 NA 1075 h Hindustani music collection c Carnatic music collection ABBREVIATIONS: #Rec=Number of recordings; #Patt=Number of unique patterns; #Occ=Total number of annotated occurrences of the patterns; GT=Ground truth; NA=Not available; “-”= Not applicable; Euc.=Euclidean distance. Table 2.2: Summary of the methods proposed in the literature for melodic pattern processing in IAM. Note that all of these studies were published during the course of our work. <YgYWg]ba #/ <]fg]aWg]ba #. <]fWbiYel #+ , +, , +0 65
  64. 64. [PUO BM B [O __U S Supervised Approach Unsupervised Approach Supervised Approach 66
  65. 65. [PUO BM B [O __U S Supervised Approach Unsupervised Approach Supervised Approach 67
  66. 66. [PUO BM B [O __U S § 6M M_ _Uf § [c PS NUM_ § :aYM [ _ UYU M U[ _ Unsupervised Approach Supervised Approach 68
  67. 67. [PUO BM B [O __U S DY_bX]W ] ]_Ue]gl GUggYea <]fWbiYel GUggYea UeUWgYe]mUg]ba hcYei]fYX LafhcYei]fYX LafhcYei]fYX 69
  68. 68. [PUO BM B [O __U S DY_bX]W ] ]_Ue]gl GUggYea <]fWbiYel GUggYea UeUWgYe]mUg]ba hcYei]fYX LafhcYei]fYX LafhcYei]fYX 70
  69. 69. [PUO DUYU M U e 71
  70. 70. [PUO DUYU M U e Time (s) 10 2 3 4 5 6 7 8 9 10 11 12 13 14 15 00 00 00 00 00 00 00 U__Ya[Y 72
  71. 71. [PUO DUYU M U e U__Ya[Y 73 + , - .
  72. 72. [PUO DUYU M U e Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 74
  73. 73. [PUO DUYU M U e U c_]a[ eUgY8 PYf ) Eb& N Ug ]aX8 PYf ) Eb& b hW 8 =hW_]XYUa be <laU ]W g] Y Uec]a[ #<KN & GUeU YgYef8 Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 75
  74. 74. [PUO DUYU M U e n+ & 01& / & . & --p m sg Standard deviation of computation Q Heaviside step functi u Flatness measure ˜u Flatness threshold ° Complexity weightin w Sampling rate of the V Maximum error para method z Complexity estimate Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 76
  75. 75. § Eb abe U_]mUg]ba § Kba]W § Q]aZ § Q,. § Q+, § DYUa § DYX]Ua § Q abe § DYX]Ua UVfb_hgY XYi]Ug]ba [PUO DUYU M U e Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 77
  76. 76. [PUO DUYU M U e § vbZZ § vba n (3& (3/& +( & +( /& +(+p Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 78
  77. 77. [PUO DUYU M U e § =hW_]XYUa § <laU ]W g] Y Uec]a[ § ?_bVU_ WbafgeU]ag #- § CbWU_ WbafgeU]ag # Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 79
  78. 78. [PUO DUYU M U e Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 k k k 80
  79. 79. 7bM aM U[ 1 D a E GUggYeaf + E IUaXb fY[ Yagf Kbc + aYUeYfg aY][ Vbef DYUa 9iYeU[Y GeYW]f]ba 81
  80. 80. 6M M_ 49 IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW( 49 IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW( UeaUg]W hf]W XUgUfYg ]aXhfgUa] hf]W XUgUfYg 82
  81. 81. C _a _ ODIC SIMILARITY: APPROACHES AND EVALUATION 131 Dataset MAP Srate Norm TScale Dist MSDcmd iitm 0.413 w67 Zmean Woff DDTW_L1_G90 0.412 w67 Zmean Won DDTW_L1_G10 0.411 w100 Zmean Woff DDTW_L1_G90 MSDhmd iitb 0.552 w100 Ztonic Woff DDTW_L0_G90 0.551 w67 Ztonic Woff DDTW_L0_G90 0.547 w50 Ztonic Woff DDTW_L0_G90 MAP score and details of the parameter settings for the three best performing MSDcmd iitm and MSDhmd iitb dataset. Srate: sampling rate of the melody representation, malization technique, TScale: uniform time-scaling and Dist: distance measure. an average precision (MAP), a typical evaluation measure in information Manning et al., 2008). Mean average precision (MAP) is computed by mean of the average precision values of each query in the dataset. This way, single number to evaluate and compare the performance of a variant. In sess if the difference in the performance of any two variants is statistically , we use the Wilcoxon signed rank-test (Wilcoxon, 1945) with p < 0.01. To e for multiple comparisons, we apply the Holm-Bonferroni method (Holm, us, considering that we compare 560 different variants, we effectively use a CMD HMD 134 MELODIC PATTERN PR Dataset MAP Srate Norm TScale Dist MSDcmd iitm 0.279 w67 Zmean Won DDTW_L1_G10 0.277 w67 ZtonicQ12 Won DDTW_L1_G10 0.275 w100 ZtonicQ12 Won DDTW_L1_G10 MSDhmd iitb 0.259 w40 ZtonicQ12 Won DDTW_L1_G90 0.259 w100 ZtonicQ12 Won DDTW_L1_G90 0.259 w67 ZtonicQ12 Won DDTW_L1_G90 Table 5.2: MAP score and details of the parameter settings for the three best perf83
  82. 82. C _a _ ODIC SIMILARITY: APPROACHES AND EVALUATION 131 Dataset MAP Srate Norm TScale Dist MSDcmd iitm 0.413 w67 Zmean Woff DDTW_L1_G90 0.412 w67 Zmean Won DDTW_L1_G10 0.411 w100 Zmean Woff DDTW_L1_G90 MSDhmd iitb 0.552 w100 Ztonic Woff DDTW_L0_G90 0.551 w67 Ztonic Woff DDTW_L0_G90 0.547 w50 Ztonic Woff DDTW_L0_G90 MAP score and details of the parameter settings for the three best performing MSDcmd iitm and MSDhmd iitb dataset. Srate: sampling rate of the melody representation, malization technique, TScale: uniform time-scaling and Dist: distance measure. an average precision (MAP), a typical evaluation measure in information Manning et al., 2008). Mean average precision (MAP) is computed by mean of the average precision values of each query in the dataset. This way, single number to evaluate and compare the performance of a variant. In sess if the difference in the performance of any two variants is statistically , we use the Wilcoxon signed rank-test (Wilcoxon, 1945) with p < 0.01. To e for multiple comparisons, we apply the Holm-Bonferroni method (Holm, us, considering that we compare 560 different variants, we effectively use a CMD HMD 134 MELODIC PATTERN PR Dataset MAP Srate Norm TScale Dist MSDcmd iitm 0.279 w67 Zmean Won DDTW_L1_G10 0.277 w67 ZtonicQ12 Won DDTW_L1_G10 0.275 w100 ZtonicQ12 Won DDTW_L1_G10 MSDhmd iitb 0.259 w40 ZtonicQ12 Won DDTW_L1_G90 0.259 w100 ZtonicQ12 Won DDTW_L1_G90 0.259 w67 ZtonicQ12 Won DDTW_L1_G90 Table 5.2: MAP score and details of the parameter settings for the three best perf GUeU YgYef GUggYea WUgY[be]Yf GUggYea fY[ YagUg]ba 84
  83. 83. C _a _1 DaYYM e U c_]a[ eUgY Ebe U_]mUg]ba K] Y fWU_]a[ <]fgUaWY CbWU_ WbafgeU]ag ?_bVU_ WbafgeU]ag ][ DYUa Eb WbafYafhf <KN N]g AaiUe]Uag AaiUe]Uag Kba]W Eb WbafYafhf <KN N]g bhg N]g bhg UeaUg]W # (. D9G ]aXhfgUa] # (// D9G GUeU YgYe 85
  84. 84. C _a _1 DaYYM e Bab a La ab a G eUfY Y[ YagUg]ba UeaUg]W ]aXhfgUa] • Gulati, S., Serrà, J., & Serra, X. (2015). An evaluation of methodologies for melodic similarity in audio recordings of Indian art music. In ICASSP, pp. 678–682. 86
  85. 85. Y [bU S [PUO DUYU M U e Predominant Pitch Estimation Pitch Normalization Uniform Time-scaling Distance Computation Melodic Similarity Audio1 Audio2 8h_gheY fcYW]Z]W]g]Yf 87
  86. 86. Y [bU S [PUO DUYU M U e ]aXhfgUa] hf]W 88
  87. 87. Y [bU S [PUO DUYU M U e • G. E. Batista, X. Wang, and E. J Keogh. A complexity-invariant distance measure for time series, in SDM, volume 11, pp. 699–710, 2011 b c_Yk]gl 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 Time (s) 500 0 500 1000 1500 Frequency(Cent) 1 2 3 U Y c eUfY UeaUg]W hf]W 89 142 MELODIC P based distance (DDTW) in order to compute the final similarit We compute ° as: ° = max(zi,zj) min(zi,zj) zi = 2 s N 1 Â i=1 (ˆpi ˆpi+1)2 where, zi is the complexity estimate of a melodic pattern of le 142 MELODIC based distance (DDTW) in order to compute the final simila We compute ° as: ° = max(zi,zj) min(zi,zj) zi = 2 s N 1 Â i=1 (ˆpi ˆpi+1)2 where, zi is the complexity estimate of a melodic pattern of is the pitch value of the ith sample. We explore two variants o
  88. 88. Y [bU S [PUO DUYU M U e Predominant Pitch Est. + Post Processing Pitch Normalization Svara Duration Truncation Distance Computation + Complexity Weight Melodic Similarity Audio1 Audio2 Partial Transcription Elxf fY[ YagUg]ba n (+& (-& (/& (1/& +( & +(/& ,( p 90
  89. 89. 6M M_ 49 IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW( 49 IYWbeX]a[f <heUg]ba Ix[UfGUggYea FWW( UeaUg]W hf]W XUgUfYg ]aXhfgUa] hf]W XUgUfYg 91
  90. 90. C _a _ MELODIC PATTERN PROCESSING MSDhmd CM Norm MB MDT MCW1 MCW2 Ztonic 0.45 (0.25) 0.52 (0.24) - - Zmean 0.25 (0.20) 0.31 (0.23) - - Ztetra 0.40 (0.23) 0.47 (0.23) - - MSDcmd CM Norm MB MDT MCW1 MCW2 Ztonic 0.39 (0.29) 0.42 (0.29) 0.41 (0.28) 0.41 (0.29) Zmean 0.39 (0.26) 0.45 (0.28) 0.43 (0.27) 0.45 (0.27) Ztetra 0.45 (0.26) 0.50 (0.27) 0.49 (0.28) 0.51 (0.27) 5.3: MAP scores for the two datasets MSDhmd CM and MSDcmd CM for the four method vari- MB, MDT, MCW1 and MCW2 and for different normalization techniques. Standard devi- of average precision is reported within round brackets. rst analyse the results for the MSDhmd CM dataset. From Table 5.3 (upper half), e that the proposed method variant that applies a duration truncation performs than the baseline method for all the normalization techniques. Moreover, this ence is found to be statistically significant in each case. The results for the hmd CM in this table correspond to F =500 ms, for which we obtain the highest ac- y compared to the other F values as shown in Figure 5.10. Furthermore, we see tonic results in the best accuracy for the MSDhmd CM for all the method variants and fference is found to be statistically significant in each case. From Table 5.3, we e a high standard deviation of the average precision values. This is because some MB MDT H1 H2 H3 H4 H5 MB MB MB MBMDT MDT MDT MDT MB MDT MCW2 MB MB MB MBMDT MDT MDT C1 C2 C3 C4 C5 MDTMCW2 MCW2 MCW2 MCW2 0.1 0.3 0.5 0.75 1.0 1.5 2.0 0.44 0.46 0.48 0.50 0.52 MAP HMD CMD (s) 92
  91. 91. C _a _ MELODIC PATTERN PROCESSING MSDhmd CM Norm MB MDT MCW1 MCW2 Ztonic 0.45 (0.25) 0.52 (0.24) - - Zmean 0.25 (0.20) 0.31 (0.23) - - Ztetra 0.40 (0.23) 0.47 (0.23) - - MSDcmd CM Norm MB MDT MCW1 MCW2 Ztonic 0.39 (0.29) 0.42 (0.29) 0.41 (0.28) 0.41 (0.29) Zmean 0.39 (0.26) 0.45 (0.28) 0.43 (0.27) 0.45 (0.27) Ztetra 0.45 (0.26) 0.50 (0.27) 0.49 (0.28) 0.51 (0.27) 5.3: MAP scores for the two datasets MSDhmd CM and MSDcmd CM for the four method vari- MB, MDT, MCW1 and MCW2 and for different normalization techniques. Standard devi- of average precision is reported within round brackets. rst analyse the results for the MSDhmd CM dataset. From Table 5.3 (upper half), e that the proposed method variant that applies a duration truncation performs than the baseline method for all the normalization techniques. Moreover, this ence is found to be statistically significant in each case. The results for the hmd CM in this table correspond to F =500 ms, for which we obtain the highest ac- y compared to the other F values as shown in Figure 5.10. Furthermore, we see tonic results in the best accuracy for the MSDhmd CM for all the method variants and fference is found to be statistically significant in each case. From Table 5.3, we e a high standard deviation of the average precision values. This is because some MB MDT H1 H2 H3 H4 H5 MB MB MB MBMDT MDT MDT MDT MB MDT MCW2 MB MB MB MBMDT MDT MDT C1 C2 C3 C4 C5 MDTMCW2 MCW2 MCW2 MCW2 0.1 0.3 0.5 0.75 1.0 1.5 2.0 0.44 0.46 0.48 0.50 0.52 MAP HMD CMD (s) <heUg]ba gehaWUg]ba Fcg] U_ XheUg]ba8 b c_Yk]gl NY][ g]a[ 93
  92. 92. C _a _1 DaYYM e <heUg]ba gehaWUg]ba b c_Yk]gl NY][ g]a[ UeaUg]W ]aXhfgUa] UeaUg]W • Gulati, S., Serrà, J., & Serra, X. (2015). Improving melodic similarity in Indian art music using culture-specific melodic characteristics. In ISMIR, pp. 680–686. 94
  93. 93. [PUO BM B [O __U S DY_bX]W ] ]_Ue]gl GUggYea <]fWbiYel GUggYea UeUWgYe]mUg]ba hcYei]fYX LafhcYei]fYX LafhcYei]fYX 95
  94. 94. BM 6U_O[b e AageU eYWbeX]a[ GUggYea <]fWbiYel AagYe eYWbeX]a[ GUggYea YUeW IUa IYZ]aY Yag I+ I, I- 96
  95. 95. M O[ PU S BM 6U_O[b e § 6e MYUO UY cM U S 6EH ?_bVU_ WbafgeU]ag #+ _Ya[g Eb CbWU_ WbafgeU]ag gYc4 n# &+ & #+&+ & #+& p 97
  96. 96. CM W C RU Y CbWU_ WbafgeU]ag gYc4 n#,&+ & #+&+ & #+&, p CbWU_ Wbfg ZhaWg]ba #M+& M,& M-& M. 98
  97. 97. 5[Y a M U[ M 5[Y dU e F#a, à U c_Yf F#a, à UaX]XUgYf -/ k #0 k 0 k / k (/ 6 -+ D]__]ba <]fgUaWY Cb Ye :bhaXf 99
  98. 98. >[c 4[a P_ R[ 6EH <KehY 7 <C: <C: 7 <CUfg <CUfg Kbc E dhYhY 100
  99. 99. >[c 4[a P_ R[ 6EH § 5M_OMP P [c N[a P_ JRakthanmanon et al. (2013)K § U _ >M_ N[a P JKim et al. (2001)K § >4L [ST JKeogh et al. (2001)K § 7M e MNM P[ U S • Kim, S. W., Park, S., & Chu, W. W. (2001). An index-based approach for similarity search supporting time warping in large sequence databases. In 17th International Conference on Data Engineering, pp. 607–614. • Keogh, E. & Ratanamahatana, C. A. (2004). Exact indexing of dynamic time warping. Knowledge and Information Systems, 7(3), 358–386. • Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., & Keogh, E. (2013). Addressing big data time series: mining trillions of time series subsequences under dynamic time warping. ACM Transactions on Knowledge Discovery from Data (TKDD), 7(3), 10:1–10:31 101
  100. 100. 6M M_ § D P M _1 /0 ((( § D M OT M _1 j)- ((( ((( 49 IYWbeX]a[f <heUg]ba UeaUg]W hf]W 102
  101. 101. C _a _ 103
  102. 102. C _a _1 5[Y a M U[ _ § M O[ PU S1 )&,) E § O[ PU S1 ) &, E >]efg CUfg C:TBYb[ T=H C:TBYb[ T= /, ,- + ./ /+ - Cb Ye VbhaX AageU eYWbeX]a[ # AagYe eYWbeX]a[ # + K 6 + +, 10 33 104
  103. 103. C _a _ 160 MELODIC PATTER Seed Category V1 V2 V3 V4 S1 0.92 0.92 0.91 0.89 S2 0.68 0.73 0.73 0.66 S3 0.35 0.34 0.35 0.35 Table 5.5: MAP scores for four variants of rank refinement method (Vi) for e (S1, S2 and S3). S1 S2 S3 S1 S2 S3 105
  104. 104. C _a _ 160 MELODIC PATTER Seed Category V1 V2 V3 V4 S1 0.92 0.92 0.91 0.89 S2 0.68 0.73 0.73 0.66 S3 0.35 0.34 0.35 0.35 Table 5.5: MAP scores for four variants of rank refinement method (Vi) for e (S1, S2 and S3). S1 S2 S3 S1 S2 S3 2, cUggYeaf CbWU_ Wbfg ZhaWg]ba <]fgUaWY fYcUeUV]_]gl AageU if AagYe eYWbeX]a[ 106
  105. 105. C _a _ AageU eYWbeX]a[ AagYe eYWbeX]a[ + >GI à 2 KGI + >GI à / KGI • Gulati, S., Serrà, J., Ishwar, V., & Serra, X. (2014). Mining melodic patterns in large audio collections of Indian art music. In SITIS-MIRA, pp. 264–271. CbWU_ Wbfg ZhaWg]ba à ]gl V_bW 107
  106. 106. [PUO BM B [O __U S 6U__UYU M M _ bM M _ 108
  107. 107. [PUO BM B [O __U S DY_bX]W ] ]_Ue]gl GUggYea <]fWbiYel GUggYea UeUWgYe]mUg]ba hcYei]fYX LafhcYei]fYX LafhcYei]fYX 109
  108. 108. [PUO BM 5TM MO UfM U[ 9MYMWM_ h 3 M Wl _ 5[Y [_U U[ _ OURUO M _ ClSM Y[ UR_ 110
  109. 109. [PUO BM 5TM MO UfM U[ Inter-recording Pattern Detection Intra-recording Pattern Discovery Data Processing Pattern Network Generation Network Filtering Community Detection and Characterization Rāga motifs Carnatic music collection Melodic Pattern Discovery Pattern Characterization 111
  110. 110. [PUO BM 5TM MO UfM U[ Inter-recording Pattern Detection Intra-recording Pattern Discovery Data Processing Pattern Network Generation Network Filtering Community Detection and Characterization Rāga motifs Carnatic music collection Melodic Pattern Discovery Pattern Characterization 112
  111. 111. [PUO BM 5TM MO UfM U[ • M. EJ Newman, “The structure and function of complex networks,” Society for Industrial and Applied Mathematics (SIAM) review, vol. 45, no. 2, pp. 167–256, 2003. LaX]eYWgYX Pattern Network Generation Network Filtering Community Detection and Characterization 113
  112. 112. [PUO BM 5TM MO UfM U[ • S. Maslov and K. Sneppen, “Specificity and stability in topology of protein networks,” Science, vol. 296, no. 5569, pp. 910– 913, 2002. Pattern Network Generation Network Filtering Community Detection and Characterization 114
  113. 113. [PUO BM 5TM MO UfM U[ • V. D. Blondel, J. L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, no. 10, pp. P10008, 2008. • S Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, no. 3, pp. 75–174, 2010. Pattern Network Generation Network Filtering Community Detection and Characterization 115
  114. 114. [PUO BM 5TM MO UfM U[ Pattern Network Generation Network Filtering Community Detection and Characterization b ha]gl KlcY EbXYf IYWbeX]a[f Ix[Uf ?U U U ][ ][ ][ b cbf]g]ba c eUfYf DYX]h CYff CYff Ix[U c eUfYf DYX]h DYX]h CYff 116
  115. 115. [PUO BM 5TM MO UfM U[ Pattern Network Generation Network Filtering Community Detection and Characterization Ix[U X]fge]Vhg]ba 4: Graphical representation of the melodic pattern network after filterin The detected communities in the network are indicated by different colors these communities are shown on the right. ommunities we empirically devise a goodness measure G, which de od that a community Cq represents a r¯aga motif. We propose to use G = NL4 c, an estimate of the likelihood of r¯aga a0 q in Cq, L = Aq,1 N , N = 5 bhag Ix[U _] Y_] bbX - + + 3 5 = 117
  116. 116. [PUO BM 5TM MO UfM U[ Pattern Network Generation Network Filtering Community Detection and Characterization IYWbeX]a[ X]fge]Vhg]ba N = 5 bhag 1x2 + 2x2 + 3x1 5 Yageb]X , , + munities we empirically devise a goodness measure G, which den hat a community Cq represents a r¯aga motif. We propose to use G = NL4 c, estimate of the likelihood of r¯aga a0 q in Cq, L = Aq,1 N , how uniformly the nodes of the community are distributed over a c = Â Lb l=1 l ·Bq,l N . = 118
  117. 117. [PUO BM 5TM MO UfM U[ Pattern Network Generation Network Filtering Community Detection and Characterization 5.24: Graphical representation of the melodic pattern network after filt ld ˜D. The detected communities in the network are indicated by different col es of these communities are shown on the right. e communities we empirically devise a goodness measure G, which elihood that a community Cq represents a r¯aga motif. We propose to us G = NL4 c, L is an estimate of the likelihood of r¯aga a0 q in Cq, A ?bbXaYff YUfheY 119 EbXYf Yageb]X Ix[U _] Y_] bbX
  118. 118. 7bM aM U[ 49 IYWbeX]a[f <heUg]ba Ix[Uf b cbf]g]baf + Ix[U k + b ha]gl 6 + G eUfYf + Dhf]W]Uaf 120
  119. 119. C _a _ 5.5 CHARACTERIZATION OF MELODIC PATTERNS 173 R¯aga µr sr R¯aga µr sr Hamsadhv¯ani 0.84 0.23 B¯egad. a 0.88 0.11 K¯amavardani 0.78 0.17 K¯api 0.75 0.10 Darb¯ar 0.81 0.23 Bhairavi 0.91 0.15 Kaly¯an. i 0.90 0.10 Beh¯ag 0.84 0.16 K¯a ˙mbh¯oji 0.87 0.12 T¯od.¯ı 0.92 0.07 Table 5.7: Mean (µr) and standard deviation (sr) of µ√ for each r¯aga. R¯agas with µr 0.85 Rāgas 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 µ 0 5 10 15 20 25 30 35 Count 121
  120. 120. C _a _ 5.5 CHARACTERIZATION OF MELODIC PATTERNS 173 R¯aga µr sr R¯aga µr sr Hamsadhv¯ani 0.84 0.23 B¯egad. a 0.88 0.11 K¯amavardani 0.78 0.17 K¯api 0.75 0.10 Darb¯ar 0.81 0.23 Bhairavi 0.91 0.15 Kaly¯an. i 0.90 0.10 Beh¯ag 0.84 0.16 K¯a ˙mbh¯oji 0.87 0.12 T¯od.¯ı 0.92 0.07 Table 5.7: Mean (µr) and standard deviation (sr) of µ√ for each r¯aga. R¯agas with µr 0.85 Rāgas 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 µ 0 5 10 15 20 25 30 35 Count GYe c eUfY cYeZbe UaWY GYe ex[U cYeZbe UaWY Dhf]W]Uaf’ U[eYY Yag 122
  121. 121. C _a _1 DaYYM e DYUaIx[U# DYUaG eUfY# IUg]a[f 5 (3,(1/ 5 (2/ • Gulati, S., Serrà, J., Ishwar, V., & Serra, X. (2016). Discovering rāga motifs by characterizing communities in networks of melodic patterns. In ICASSP, pp. 286–290. 123 + , - . / 0 1 2 3 + PYf bhag + , - - +, ,+ ,/ --
  122. 122. 3a [YM UO ClSM C O[S U U[ 124 Chapter 6 Automatic R¯aga Recognition 6.1 Introduction In this chapter, we address the task of automatically recognizing r¯agas in audio re- cordings of IAM. We describe two novel approaches for r¯aga recognition that jointly capture the tonal and the temporal aspects of melody. The contents of this chapter are largely based on our published work in Gulati et al. (2016a,b). R¯aga is a core musical concept used in compositions, performances, music organiz- ation, and pedagogy of IAM. Even beyond the art music, numerous compositions in Indian folk and film music are also based on r¯agas (Ganti, 2013). R¯aga is therefore one of the most desired melodic descriptions of a recorded performance of IAM, and an important criterion used by listeners to browse its audio music collections. Despite its significance, there exists a large volume of audio content whose r¯aga is incorrectly labeled or, simply, unlabeled. A computational approach to r¯aga recognition will al- low us to automatically annotate large collections of audio music recordings. It will enable r¯aga-based music retrieval in large audio archives, semantically-meaningful music discovery and musicologically-informed navigation. Furthermore, a deeper understanding of the r¯aga framework from a computational perspective will pave the way for building applications for music pedagogy in IAM. R¯aga recognition is the most studied research topic in MIR of IAM. There exist a considerable number of approaches utilizing different characteristic aspects of r¯agas such as svara set, svara salience and ¯ar¯ohana-avr¯ohana. A critical in-depth review of the existing approaches for r¯aga recognition is presented in Section 2.4.3, wherein we identify several shortcomings in these approaches and possible avenues for scientific contribution to take this task to the next level. Here we provide a short summary of this analysis: Nearly half of the number of the existing approaches for r¯aga recognition do not utilize the temporal aspects of melody at all (Table 2.3), which are crucial 177
  123. 123. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. 125
  124. 124. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. e I [ ? D G X < a E 126
  125. 125. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. e I [ ? D G X < a E 127
  126. 126. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. e I [ ? D G X < a E 128
  127. 127. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. 129
  128. 128. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. Ix[U 9 Ix[U : Ix[U 130
  129. 129. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. 131
  130. 130. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. 132
  131. 131. 3a [YM UO ClSM C O[S U U[ 38BACKGROUND Methods Svara set Svara salience Svara intonation ¯ar¯ohana- avr¯ohana Melodic phrases Svara discretization Temporal aspects Pandey et al. (2003) • • Yes • Chordia & Rae (2007) • • • Yes • Belle et al. (2009) • • • No Shetty & Achary (2009) • • Yes • Sridhar & Geetha (2009) • • Yes • Koduri et al. (2011) • • Yes Ranjani et al. (2011) • No Chakraborty & De (2012) • Yes Koduri et al. (2012) • • • Both Chordia & ¸Sentürk (2013) • • • No Dighe et al. (2013a) • • • Yes • Dighe et al. (2013b) • • Yes Koduri et al. (2014) • • • No Kumar et al. (2014) • • • • Yes • Dutta et al. (2015)a • No • a This method performs r¯aga verification and not recognition Table 2.3: R¯aga recognition methods proposed in the literature along with the melodic characteristics they utilize to perform the task. We also indicate if a method uses a discrete svara representation of melody. The methods are arranged in chronological order. Ix[U MYe]Z]WUg]ba 133
  132. 132. 3a [YM UO ClSM C O[S U U[ 2.4RELATEDWORKININDIANARTMUSIC39 Method Tonal Feature Tonic Identification Feature Recognition Method #R¯agas Dataset (Dur./Num.)a Audio Type Pandey et al. (2003) Pitch (Boersma & Weenink, 2001) NA Svara sequence HMM and n-Gram 2 - / 31 MP Chordia & Rae (2007) Pitch (Sun, 2000) Manual PCD, PCDD SVM classifier 31 20 / 127 MP Belle et al. (2009) Pitch (Rao & Rao, 2009) Manual PCD (parameterized) k-NN classifier 4 0.6 / 10 PP Shetty & Achary (2009) Pitch (Sridhar & Geetha, 2006) NA #Svaras, Vakra svaras Neural Network classifier 20 - / 90 MP Sridhar & Geetha (2009) Pitch (Lee, 2006) Singer identification Svara set, its sequence String matching 3 - / 30 PP Koduri et al. (2011) Pitch (Rao & Rao, 2010) Brute force PCD k-NN classifier 10 2.82 / 170 PP Ranjani et al. (2011) Pitch (Boersma & Weenink, 2001) GMM fitting PDE SC-GMM and Set matching 7 - / 48 PP Chakraborty & De (2012) Pitch (Sengupta, 1990) Error minimization Svara set Set matching - - / - - Koduri et al. (2012) Predominant pitch (Salamon & Gómez, 2012) Multipitch-based PCD variants k-NN classifier 43 - / 215 PP Chordia & ¸Sentürk (2013) Pitch (Camacho, 2007) Brute force PCD variants k-NN and statistical classifiers 31 20 / 127 MP Dighe et al. (2013a) Chroma (Lartillot et al., 2008) Brute force (v¯adi-based) Chroma, Timbre features HMM 4 9.33 / 56 PP Dighe et al. (2013b) Chroma (Lartillot et al., 2008) Brute force (v¯adi-based) PCD variant RF classifier 8 16.8 / 117 PP Koduri et al. (2014) Predominant pitch (Salamon & Gómez, 2012) Multipitch-based PCD (parameterized) Different classifiers 45? 93/424 PP Kumar et al. (2014) Predominant pitch (Salamon & Gómez, 2012) Brute force PCD + n-Gram distribution SVM classifier 10 2.82 / 170 PP Dutta et al. (2015)* Predominant pitch (Salamon & Gómez, 2012) Cepstrum-based Pitch contours LCS with k-NN 30† 3 / 254 PP a In the case of multiple datasets we list the larger one * This method performs r¯aga verification and not recognition ? Authors do not use all 45 r¯agas at once in a single experiment, but consider groups of 3 r¯agas per experiment † Authors finally use only 17 r¯agas in their experiment ABBREVIATIONS: Dur.: Duration of the dataset, Num.: Number of recordings, NA: Not applicable, ‘-’: Not available, SC-GMM: semi-continuous GMM, MP: Monophonic, PP: Polyphonic Table 2.4: Summary of the R¯aga recognition methods proposed in the literature. The methods are arranged in chronological order. 134
  133. 133. 2.4RELATEDWORKININDIANARTMUSIC39 Method Tonal Feature Tonic Identification Feature Recognition Method #R¯agas Dataset (Dur./Num.)a Audio Type Pandey et al. (2003) Pitch (Boersma & Weenink, 2001) NA Svara sequence HMM and n-Gram 2 - / 31 MP Chordia & Rae (2007) Pitch (Sun, 2000) Manual PCD, PCDD SVM classifier 31 20 / 127 MP Belle et al. (2009) Pitch (Rao & Rao, 2009) Manual PCD (parameterized) k-NN classifier 4 0.6 / 10 PP Shetty & Achary (2009) Pitch (Sridhar & Geetha, 2006) NA #Svaras, Vakra svaras Neural Network classifier 20 - / 90 MP Sridhar & Geetha (2009) Pitch (Lee, 2006) Singer identification Svara set, its sequence String matching 3 - / 30 PP Koduri et al. (2011) Pitch (Rao & Rao, 2010) Brute force PCD k-NN classifier 10 2.82 / 170 PP Ranjani et al. (2011) Pitch (Boersma & Weenink, 2001) GMM fitting PDE SC-GMM and Set matching 7 - / 48 PP Chakraborty & De (2012) Pitch (Sengupta, 1990) Error minimization Svara set Set matching - - / - - Koduri et al. (2012) Predominant pitch (Salamon & Gómez, 2012) Multipitch-based PCD variants k-NN classifier 43 - / 215 PP Chordia & ¸Sentürk (2013) Pitch (Camacho, 2007) Brute force PCD variants k-NN and statistical classifiers 31 20 / 127 MP Dighe et al. (2013a) Chroma (Lartillot et al., 2008) Brute force (v¯adi-based) Chroma, Timbre features HMM 4 9.33 / 56 PP Dighe et al. (2013b) Chroma (Lartillot et al., 2008) Brute force (v¯adi-based) PCD variant RF classifier 8 16.8 / 117 PP Koduri et al. (2014) Predominant pitch (Salamon & Gómez, 2012) Multipitch-based PCD (parameterized) Different classifiers 45? 93/424 PP Kumar et al. (2014) Predominant pitch (Salamon & Gómez, 2012) Brute force PCD + n-Gram distribution SVM classifier 10 2.82 / 170 PP Dutta et al. (2015)* Predominant pitch (Salamon & Gómez, 2012) Cepstrum-based Pitch contours LCS with k-NN 30† 3 / 254 PP a In the case of multiple datasets we list the larger one * This method performs r¯aga verification and not recognition ? Authors do not use all 45 r¯agas at once in a single experiment, but consider groups of 3 r¯agas per experiment † Authors finally use only 17 r¯agas in their experiment ABBREVIATIONS: Dur.: Duration of the dataset, Num.: Number of recordings, NA: Not applicable, ‘-’: Not available, SC-GMM: semi-continuous GMM, MP: Monophonic, PP: Polyphonic Table 2.4: Summary of the R¯aga recognition methods proposed in the literature. The methods are arranged in chronological order. 3a [YM UO ClSM C O[S U U[ 135
  134. 134. 6M M_ 49 IYWbeX]a[f <heUg]ba CYUX 9eg]fgfbaWYegfIx[Uf 49 IYWbeX]a[f <heUg]ba CYUX 9eg]fgfIY_YUfYfIx[Uf UeaUg]W hf]W XUgUfYg ]aXhfgUa] hf]W XUgUfYg 136
  135. 135. B [ [_ P 3 [MOT _ time (s) 10 time (s) 10 30 0 20 40 60 80 100 120 Index 0 20 40 60 80 100 120 Index (a) 0 20 40 60 80 100 120 Index 0 20 40 60 80 100 120 (b) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 G eUfY VUfYX Ix[U IYWb[a]g]ba K<D VUfYX Ix[U IYWb[a]g]ba KbaU_ % KY cbeU_ <]fWeYg]mUg]ba 137
  136. 136. B [ [_ P 3 [MOT _ time (s) 10 time (s) 10 30 0 20 40 60 80 100 120 Index 0 20 40 60 80 100 120 Index (a) 0 20 40 60 80 100 120 Index 0 20 40 60 80 100 120 (b) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 G eUfY VUfYX Ix[U IYWb[a]g]ba K<D VUfYX Ix[U IYWb[a]g]ba KbaU_ % KY cbeU_ 138 <]fWeYg]mUg]ba
  137. 137. BT M_ NM_ P ClSM C O[S U U[ Inter-recording Pattern Detection Intra-recording Pattern Discovery Data Processing Music collection Melodic Pattern Discovery Vocabulary Extraction Term-frequency Feature Extraction Feature Normalization TF-IDF Feature Extraction Feature Matrix Pattern Network Generation Similarity Thresholding Community Detection Melodic Pattern Clustering 139
  138. 138. BT M_ NM_ P ClSM C O[S U U[ Inter-recording Pattern Detection Intra-recording Pattern Discovery Data Processing Music collection Melodic Pattern Discovery Vocabulary Extraction Term-frequency Feature Extraction Feature Normalization TF-IDF Feature Extraction Feature Matrix Pattern Network Generation Similarity Thresholding Community Detection Melodic Pattern Clustering 140
  139. 139. BT M_ NM_ P ClSM C O[S U U[ 141
  140. 140. BT M_ NM_ P ClSM C O[S U U[ 142
  141. 141. BT M_ NM_ P ClSM C O[S U U[ Kbc]W DbXY_]a[ o KYkg _Uff]Z]WUg]ba G eUfYf Ix[U IYWbeX]a[f NbeXf Kbc]W <bWh Yagf 143
  142. 142. BT M_ NM_ P ClSM C O[S U U[ Kbc]W DbXY_]a[ o KYkg _Uff]Z]WUg]ba KYe >eYdhYaWl t AaiYefY <bWh Yag >eYdhYaWl #K> A<> 144
  143. 143. E 6 M a 7d MO U[ Vocabulary Extraction Term-frequency Feature Extraction Feature Normalization 145
  144. 144. E 6 M a 7d MO U[ Vocabulary Extraction Term-frequency Feature Extraction Feature Normalization 146
  145. 145. E 6 M a 7d MO U[ Vocabulary Extraction Term-frequency Feature Extraction Feature Normalization + . , , + + - - - 147
  146. 146. E 6 M a 7d MO U[ Vocabulary Extraction Term-frequency Feature Extraction Feature Normalization ies except the ones that comprise patterns extracted from only a single audio record- ing. Such communities are analogous to the words that only occur within a document and, hence, are irrelevant for modeling a topic. We experiment with three different sets of features f1, f2 and f3, which are similar to the TF-IDF features typically used in text information retrieval. We denote our corpus by R comprising NR = |R| number of recordings. A melodic phrase and a recording is denoted by √and r , respectively f1(√,r) = ( 1, if f(√,r) > 0 0, otherwise (6.1) where, f(√,r) denotes the raw frequency of occurrence of pattern √ in recording r. f1 only considers the presence or absence of a pattern in a recording. In order to investigate if the frequency of occurrence of melodic patterns is relevant for charac- terizing r¯agas, we take f2(√,r) = f(√,r). As mentioned, the melodic patterns that occur across different r¯agas and in several recordings do not aid r¯aga recognition. Therefore, to reduce their effect in the feature vector we employ a weighting scheme, similar to the inverse document frequency (idf) weighting in text retrieval. f3(√,r) = f(√,r)I(√,R) (6.2) ✓ NR ◆ We experiment with three different sets of features f1, f2 and f3, which are similar to the TF-IDF features typically used in text information retrieval. We denote our corpus by R comprising NR = |R| number of recordings. A melodic phrase and a recording is denoted by√and r , respectively f1(√,r) = ( 1, if f(√,r) > 0 0, otherwise (6.1) where, f(√,r) denotes the raw frequency of occurrence of pattern √ in recording r. f1 only considers the presence or absence of a pattern in a recording. In order to investigate if the frequency of occurrence of melodic patterns is relevant for charac- terizing r¯agas, we take f2(√,r) = f(√,r). As mentioned, the melodic patterns that occur across different r¯agas and in several recordings do not aid r¯aga recognition. Therefore, to reduce their effect in the feature vector we employ a weighting scheme, similar to the inverse document frequency (idf) weighting in text retrieval. f3(√,r) = f(√,r)I(√,R) (6.2) I(√,R) = log ✓ NR |{r 2 R :√2 r}| ◆ (6.3) 1 2 3 the TF-IDF features typically used in text information retrieval. We denote our corp by R comprising NR = |R| number of recordings. A melodic phrase and a recordi is denoted by√and r , respectively f1(√,r) = ( 1, if f(√,r) > 0 0, otherwise (6 where, f(√,r) denotes the raw frequency of occurrence of pattern √ in recordi r. f1 only considers the presence or absence of a pattern in a recording. In order investigate if the frequency of occurrence of melodic patterns is relevant for chara terizing r¯agas, we take f2(√,r) = f(√,r). As mentioned, the melodic patterns th occur across different r¯agas and in several recordings do not aid r¯aga recognitio Therefore, to reduce their effect in the feature vector we employ a weighting schem similar to the inverse document frequency (idf) weighting in text retrieval. f3(√,r) = f(√,r)I(√,R) (6 I(√,R) = log ✓ NR |{r 2 R :√2 r}| ◆ (6 and, hence, are irrelevant for modeling a topic. We experiment with three different sets of features f1, f2 and f3, which are similar to the TF-IDF features typically used in text information retrieval. We denote our corpus by R comprising NR = |R| number of recordings. A melodic phrase and a recording is denoted by √and r , respectively f1(√,r) = ( 1, if f(√,r) > 0 0, otherwise (6.1) where, f(√,r) denotes the raw frequency of occurrence of pattern √ in recording r. f1 only considers the presence or absence of a pattern in a recording. In order to investigate if the frequency of occurrence of melodic patterns is relevant for charac- terizing r¯agas, we take f2(√,r) = f(√,r). As mentioned, the melodic patterns tha occur across different r¯agas and in several recordings do not aid r¯aga recognition Therefore, to reduce their effect in the feature vector we employ a weighting scheme similar to the inverse document frequency (idf) weighting in text retrieval. f3(√,r) = f(√,r)I(√,R) (6.2) I(√,R) = log ✓ NR |{r 2 R :√2 r}| ◆ (6.3) 148
  147. 147. C _a _ 2 PATTERN-BASED R ¯AGA RECOGNITION 185 Method Feature SVML SGD NBM NBG RF LR 1-NN MVSM f1 51.04 55 37.5 54.37 25.41 55.83 - f2 45.83 50.41 35.62 47.5 26.87 51.87 - f3 45.83 51.66 67.29 44.79 23.75 51.87 - MPC PCDfull - - - - - - 73.12 MGK PDparam 30.41 22.29 27.29 28.12 42.91 30.83 25.62 PDcontext 54.16 43.75 5.2 33.12 49.37 54.79 26.25 ble 6.1: Accuracy (%) of r¯aga recognition on RRDCMD dataset by MVSM and other methods thods using different features and classifiers. Bold text signifies the best accuracy by a thod among all its variants ndustani music recordings. We therefore do not consider this method for comparing ults on Hindustani music. e authors of MPC courteously ran the experiments on our dataset using the original plementations of the method. For MGK, the authors kindly extracted the features Dparam and PDcontext) using the original implementation of their method and the periments using different classification strategies were done by us. 2.3 Results and Discussion fore we proceed to present our results, we notify readers that the accuracies repor- d in this section for different methods vary slightly from the ones reported in Gulati 6 AUTOMATIC R ¯AGA RECOGNITION Method Feature SVML SGD NBM NBG RF LR 1-NN MVSM f1 71 72.33 69.33 79.33 38.66 74.33 - f2 65.33 64.33 67.66 72.66 40.33 68 - f3 65.33 62.66 82.66 72 41.33 67.66 - MPC PCDfull - - - - - - 91.66 ble 6.2: Accuracy (%) of r¯aga recognition on RRDHMD dataset by MVSM and other methods hods using different features and classifiers. Bold text signifies the best accuracy by a hod among all its variants. 3 4 5 6 7 8 9 10 11 12 13 14 ˜ (bin index) 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20 0.22 ClusteringCoefficient(C) 0 10 20 30 40 50 60 70 Accuracy(%) C(G) C(Gr) Accuracy of MVSM 3 4 5 6 7 8 9 10 11 12 13 14 ˜ (bin index) 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20 0.22 0.24 ClusteringCoefficient(C) 30 40 50 60 70 80 90 Accuracy(%) C(G) C(Gr) Accuracy of MVSM 149
  148. 148. C _a _ 2 PATTERN-BASED R ¯AGA RECOGNITION 185 Method Feature SVML SGD NBM NBG RF LR 1-NN MVSM f1 51.04 55 37.5 54.37 25.41 55.83 - f2 45.83 50.41 35.62 47.5 26.87 51.87 - f3 45.83 51.66 67.29 44.79 23.75 51.87 - MPC PCDfull - - - - - - 73.12 MGK PDparam 30.41 22.29 27.29 28.12 42.91 30.83 25.62 PDcontext 54.16 43.75 5.2 33.12 49.37 54.79 26.25 ble 6.1: Accuracy (%) of r¯aga recognition on RRDCMD dataset by MVSM and other methods thods using different features and classifiers. Bold text signifies the best accuracy by a thod among all its variants ndustani music recordings. We therefore do not consider this method for comparing ults on Hindustani music. e authors of MPC courteously ran the experiments on our dataset using the original plementations of the method. For MGK, the authors kindly extracted the features Dparam and PDcontext) using the original implementation of their method and the periments using different classification strategies were done by us. 2.3 Results and Discussion fore we proceed to present our results, we notify readers that the accuracies repor- d in this section for different methods vary slightly from the ones reported in Gulati 6 AUTOMATIC R ¯AGA RECOGNITION Method Feature SVML SGD NBM NBG RF LR 1-NN MVSM f1 71 72.33 69.33 79.33 38.66 74.33 - f2 65.33 64.33 67.66 72.66 40.33 68 - f3 65.33 62.66 82.66 72 41.33 67.66 - MPC PCDfull - - - - - - 91.66 ble 6.2: Accuracy (%) of r¯aga recognition on RRDHMD dataset by MVSM and other methods hods using different features and classifiers. Bold text signifies the best accuracy by a hod among all its variants. 3 4 5 6 7 8 9 10 11 12 13 14 ˜ (bin index) 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20 0.22 ClusteringCoefficient(C) 0 10 20 30 40 50 60 70 Accuracy(%) C(G) C(Gr) Accuracy of MVSM 3 4 5 6 7 8 9 10 11 12 13 14 ˜ (bin index) 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20 0.22 0.24 ClusteringCoefficient(C) 30 40 50 60 70 80 90 Accuracy(%) C(G) C(Gr) Accuracy of MVSM b cUeY ZYUgheYf gUgY bZ g Y Ueg 150
  149. 149. C _a _1 DaYYM e beX]U wYague #, +- GebcbfYX BbXhe] Yg U_( #, +. G <>h__ K> A<> #>- G <cUeU 1- 01 // 3, 2- UeaUg]W # ]aXhfgUa] # • Gulati, S., Serrà, J., Ishwar, V., Şentürk, S., & Serra, X. (2016). Phrase-based rāga recognition using vector space modeling. In ICASSP, pp. 66–70. 151
  150. 150. 7 [ 3 M e_U_ R1-S.anmukhapriya R2-K¯api R3-Bhairavi R4-Madhyam¯avati R5-Bilahari R6-M¯ohana˙m R7-Sencurut.t.i R8-´Sr¯ıranjani R9-R¯ıtigaul.a R10-Huss¯en¯ı R11-Dhany¯asi R12-At.¯ana R13-Beh¯ag R14-Surat.i R15-K¯amavardani R16-Mukh¯ari R17-Sindhubhairavi R18-Sah¯an¯a R19-K¯anad.a R20-M¯ay¯am¯al.avagaul.a R21-N¯at.a R22-´Sankar¯abharan.a˙m R23-S¯av¯eri R24-Kam¯as R25-T¯od.i R26-B¯egad.a R27-Harik¯ambh¯oji R28-´Sr¯ı R29-Kaly¯an.i R30-S¯ama R31-N¯at.akurinji R32-P¯urv¯ıkal.y¯an.i R33-Yadukulak¯a˙mb¯oji R34-D¯evag¯andh¯ari R35-K¯ed¯aragaul.a R36-¯Anandabhairavi R37-Gaul.a R38-Var¯al.i R39-K¯a˙mbh¯oji R40-Karaharapriya R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 R14 R15 R16 R17 R18 R19 R20 R21 R22 R23 R24 R25 R26 R27 R28 R29 R30 R31 R32 R33 R34 R35 R36 R37 R38 R39 R40 8 1 1 1 1 4 1 1 1 1 1 1 1 1 9 1 1 1 1 5 3 1 2 7 1 1 1 2 1 5 1 1 1 1 1 1 1 5 1 1 1 1 1 1 2 5 1 1 3 12 1 1 2 5 1 1 1 7 1 2 1 1 1 7 2 1 1 11 1 10 1 1 10 2 1 11 1 1 1 2 1 4 1 1 1 9 1 1 1 1 1 1 1 3 1 1 1 1 11 1 12 12 1 3 7 1 10 2 12 1 10 1 1 9 2 1 3 1 1 2 1 2 1 11 1 1 3 1 1 6 1 1 10 1 11 1 1 1 1 5 2 1 1 1 10 1 1 1 8 1 2 1 2 1 1 1 3 1 2 2 1 7 12 1 1 10 1 11 0 1 2 3 4 5 6 7 8 9 10 11 12 152
  151. 151. 7 [ 3 M e_U_ R1-S.anmukhapriya R2-K¯api R3-Bhairavi R4-Madhyam¯avati R5-Bilahari R6-M¯ohana˙m R7-Sencurut.t.i R8-´Sr¯ıranjani R9-R¯ıtigaul.a R10-Huss¯en¯ı R11-Dhany¯asi R12-At.¯ana R13-Beh¯ag R14-Surat.i R15-K¯amavardani R16-Mukh¯ari R17-Sindhubhairavi R18-Sah¯an¯a R19-K¯anad.a R20-M¯ay¯am¯al.avagaul.a R21-N¯at.a R22-´Sankar¯abharan.a˙m R23-S¯av¯eri R24-Kam¯as R25-T¯od.i R26-B¯egad.a R27-Harik¯ambh¯oji R28-´Sr¯ı R29-Kaly¯an.i R30-S¯ama R31-N¯at.akurinji R32-P¯urv¯ıkal.y¯an.i R33-Yadukulak¯a˙mb¯oji R34-D¯evag¯andh¯ari R35-K¯ed¯aragaul.a R36-¯Anandabhairavi R37-Gaul.a R38-Var¯al.i R39-K¯a˙mbh¯oji R40-Karaharapriya R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 R14 R15 R16 R17 R18 R19 R20 R21 R22 R23 R24 R25 R26 R27 R28 R29 R30 R31 R32 R33 R34 R35 R36 R37 R38 R39 R40 8 1 1 1 1 4 1 1 1 1 1 1 1 1 9 1 1 1 1 5 3 1 2 7 1 1 1 2 1 5 1 1 1 1 1 1 1 5 1 1 1 1 1 1 2 5 1 1 3 12 1 1 2 5 1 1 1 7 1 2 1 1 1 7 2 1 1 11 1 10 1 1 10 2 1 11 1 1 1 2 1 4 1 1 1 9 1 1 1 1 1 1 1 3 1 1 1 1 11 1 12 12 1 3 7 1 10 2 12 1 10 1 1 9 2 1 3 1 1 2 1 2 1 11 1 1 3 1 1 6 1 1 10 1 11 1 1 1 1 5 2 1 1 1 10 1 1 1 8 1 2 1 2 1 1 1 3 1 2 2 1 7 12 1 1 10 1 11 0 1 2 3 4 5 6 7 8 9 10 11 12 G eUfY VUfYX ex[Uf 153
  152. 152. 7 [ 3 M e_U_ R1-S.anmukhapriya R2-K¯api R3-Bhairavi R4-Madhyam¯avati R5-Bilahari R6-M¯ohana˙m R7-Sencurut.t.i R8-´Sr¯ıranjani R9-R¯ıtigaul.a R10-Huss¯en¯ı R11-Dhany¯asi R12-At.¯ana R13-Beh¯ag R14-Surat.i R15-K¯amavardani R16-Mukh¯ari R17-Sindhubhairavi R18-Sah¯an¯a R19-K¯anad.a R20-M¯ay¯am¯al.avagaul.a R21-N¯at.a R22-´Sankar¯abharan.a˙m R23-S¯av¯eri R24-Kam¯as R25-T¯od.i R26-B¯egad.a R27-Harik¯ambh¯oji R28-´Sr¯ı R29-Kaly¯an.i R30-S¯ama R31-N¯at.akurinji R32-P¯urv¯ıkal.y¯an.i R33-Yadukulak¯a˙mb¯oji R34-D¯evag¯andh¯ari R35-K¯ed¯aragaul.a R36-¯Anandabhairavi R37-Gaul.a R38-Var¯al.i R39-K¯a˙mbh¯oji R40-Karaharapriya R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 R14 R15 R16 R17 R18 R19 R20 R21 R22 R23 R24 R25 R26 R27 R28 R29 R30 R31 R32 R33 R34 R35 R36 R37 R38 R39 R40 8 1 1 1 1 4 1 1 1 1 1 1 1 1 9 1 1 1 1 5 3 1 2 7 1 1 1 2 1 5 1 1 1 1 1 1 1 5 1 1 1 1 1 1 2 5 1 1 3 12 1 1 2 5 1 1 1 7 1 2 1 1 1 7 2 1 1 11 1 10 1 1 10 2 1 11 1 1 1 2 1 4 1 1 1 9 1 1 1 1 1 1 1 3 1 1 1 1 11 1 12 12 1 3 7 1 10 2 12 1 10 1 1 9 2 1 3 1 1 2 1 2 1 11 1 1 3 1 1 6 1 1 10 1 11 1 1 1 1 5 2 1 1 1 10 1 1 1 8 1 2 1 2 1 1 1 3 1 2 2 1 7 12 1 1 10 1 11 0 1 2 3 4 5 6 7 8 9 10 11 12 9__]YX ex[Uf 154

×