13. Approaches in audio music similarity
1. Low-level spectral descriptors: Aucouturier and Pachet (2004), Pampalk (2006)
2. Incorporate mid-level musical descriptors
3. Combine those with semantic descriptors obtained by automatic classification (e.g. genre, instrument, mood): Bogdanov et al. (2013)
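As a minimal sketch of the low-level spectral approach above: one common recipe in that line of work summarizes a song's MFCC frames as a single Gaussian and compares songs with a symmetrized KL divergence. The frame data below is random stand-in data, and the diagonal-covariance model is a simplifying assumption, not the exact method of the cited papers.

```python
import numpy as np

def gaussian_model(frames):
    # Summarize a song's MFCC-like frames (n_frames x n_dims)
    # as a single diagonal-covariance Gaussian (mean, variance).
    return frames.mean(axis=0), frames.var(axis=0) + 1e-9

def symmetric_kl(model_a, model_b):
    # Symmetrized KL divergence between two diagonal Gaussians,
    # used here as a timbre-similarity score (smaller = more similar).
    (mu_a, var_a), (mu_b, var_b) = model_a, model_b
    kl_ab = 0.5 * np.sum(var_a / var_b + (mu_b - mu_a) ** 2 / var_b
                         - 1 + np.log(var_b / var_a))
    kl_ba = 0.5 * np.sum(var_b / var_a + (mu_a - mu_b) ** 2 / var_a
                         - 1 + np.log(var_a / var_b))
    return kl_ab + kl_ba

rng = np.random.default_rng(0)
song_a = rng.normal(0.0, 1.0, size=(200, 13))   # stand-in MFCC frames
song_b = rng.normal(0.1, 1.0, size=(200, 13))   # similar "timbre"
song_c = rng.normal(2.0, 3.0, size=(200, 13))   # different "timbre"

d_ab = symmetric_kl(gaussian_model(song_a), gaussian_model(song_b))
d_ac = symmetric_kl(gaussian_model(song_a), gaussian_model(song_c))
print(d_ab < d_ac)  # the similar pair scores a smaller divergence
```

Real systems refine this with Gaussian mixtures or full covariances, but the ranking idea is the same.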
14. Personalization (Schedl et al. 2012)
1. Let users control weights
– A lot of effort for a high number of descriptors
– The user must make their preferences explicit
2. Gather ratings of the similarity of pairs of songs → robustness (Urbano et al. 2010)
3. Collection clustering: ask users to group songs in a 2D plot (Stober 2011)
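The first personalization strategy above (user-controlled weights) can be sketched as a weighted combination of per-facet distances. The facet names and distance values here are hypothetical illustrations, not from any of the cited systems.

```python
# Hypothetical per-facet distances between one pair of songs.
facet_distances = {"timbre": 0.2, "rhythm": 0.7, "harmony": 0.4}

def weighted_distance(distances, weights):
    # Combine per-facet distances with user-chosen weights,
    # normalized so the weights sum to 1.
    total = sum(weights.values())
    return sum(weights[f] * d for f, d in distances.items()) / total

# Two users with explicit, different preferences judge the same
# pair of songs differently.
rhythm_fan = {"timbre": 1.0, "rhythm": 5.0, "harmony": 1.0}
timbre_fan = {"timbre": 5.0, "rhythm": 1.0, "harmony": 1.0}
print(weighted_distance(facet_distances, rhythm_fan))  # ≈ 0.586
print(weighted_distance(facet_distances, timbre_fan))  # = 0.3
```

This also illustrates the drawback noted on the slide: with many descriptors, asking the user to set every weight by hand quickly becomes impractical.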
15. Evaluation
• Similarity vs. categorization: artist, genre, instrument, covers, co-occurrence in personal collections and playlists (Berenzweig et al. 2003)
• Surveys (Vignoli and Pauws 2005)
16. But as "similarity is an ill-defined concept", we should evaluate each task separately!
17. Audio Music Similarity Task
• 7000 30-second audio clips drawn from 10 genres: Blues, Jazz, Country/Western, Baroque, Classical, Romantic, Electronica, Hip-Hop, Rock, Hard Rock/Metal
• Songs from the same artist are filtered out
• Evaluation criteria:
– User ratings: not similar, somewhat similar, very similar
– Objective statistics: similarity in terms of genre, artist and album.
• More in the talk by A. Flexer.
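The "objective statistics" criterion above can be sketched as a genre-based precision over a ranked result list. The query and candidate data below are hypothetical, and artist filtering is assumed to have happened upstream.

```python
# Hypothetical candidate list for a Blues query:
# (song_id, genre, distance) triples, same-artist songs already removed.
query_genre = "Blues"
results = [("s1", "Blues", 0.1), ("s2", "Jazz", 0.2),
           ("s3", "Blues", 0.3), ("s4", "Rock", 0.4),
           ("s5", "Blues", 0.5)]

def genre_precision_at_k(results, query_genre, k):
    # Objective statistic: fraction of the k nearest neighbours
    # that share the query's genre label.
    top_k = sorted(results, key=lambda r: r[2])[:k]
    return sum(r[1] == query_genre for r in top_k) / k

print(genre_precision_at_k(results, query_genre, 3))  # 2 of 3 are Blues
```

User ratings (the other criterion) need human graders, which is why such label-based statistics are a cheap complementary proxy.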
18. Tasks related to similarity
• Audio cover identification
• Audio classification
• Query by singing (humming)
• Query by tapping
• Audio-to-score alignment
• Discovery of repeated themes / sections
• Structural segmentation
• Audio fingerprinting
• Symbolic melodic similarity
• …
19. Challenges
1. Music is multimodal, multi-faceted
2. Similarity depends on
a. the user/listener,
b. the repertoire, and
c. the task
Use cases
20. Use case 1:
• Repertoire: symphonic music
• Modalities: audio, score, video, gestures
• Task: structural analysis → visualization
• Personalization: "experts" – listeners exposed to it (me) – naïve listeners (young people?)
Beethoven Symphony No. 3 Eroica
http://phenicx.upf.edu/
21. Modalities
• Audio: dynamics, timbre, tempo, f0 (Grachten et al. 2013) (Bosch and Gómez 2013)
• Score: key, pitch-class sets, orchestration (Martorell and Gómez 2014)
• Video: performers, movement (Bazzica, Liem and Hanjalic 2014)
• Gestures: movement (Sarasúa and Guaus 2014)
• Context: manual annotations (Schedl et al. 2014)
22. Strategies
• Synchronization
• Generate different layers of information
• Personalization:
– Understand user needs: naïve listeners, music experts, performers
– Let them choose by means of visualization, interaction → HCI
23. Use case 2:
• Repertoire: flamenco singing
• Modalities: audio
• Task: style and variant characterization
• Personalization: "experts" – listeners exposed to it (me) – naïve listeners (you?)
http://mtg.upf.edu/research/projects/cofla
24. Melodic similarity
1. Each style is characterized by a common melodic skeleton
2. Spontaneous improvisation: ornamentation, prolongation, rhythmic and melodic modification
Antonio Mairena
Chano Lobato
25. Melodic similarity – style
• Ground truth: style annotations
• Specific standard measures:
– High-level expert-specific features
– Fundamental frequency (dynamic time warping)
– Symbolic-based descriptors
– Chroma similarity (Huson 1998)
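The dynamic-time-warping measure on the fundamental frequency mentioned above can be sketched with a plain DTW over two f0 contours; warping absorbs the timing flexibility of ornamented singing. The contours below are toy semitone sequences, not real flamenco transcriptions.

```python
import math

def dtw_distance(seq_a, seq_b):
    # Plain dynamic time warping between two pitch contours:
    # cumulative cost of the best monotonic alignment.
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip in seq_a
                                 cost[i][j - 1],      # skip in seq_b
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

# Toy f0 contours in MIDI semitones: `variant` is a time-stretched,
# lightly ornamented version of `phrase`; `other_style` differs.
phrase      = [60, 60, 62, 64, 62, 60]
variant     = [60, 60, 60, 62, 63, 64, 64, 62, 60]
other_style = [67, 65, 67, 69, 71, 69]

print(dtw_distance(phrase, variant) < dtw_distance(phrase, other_style))
```

Because DTW compares whole warped contours, it rewards the shared melodic skeleton even when ornamentation and prolongation change the local timing.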
26. Melodic similarity – variants
• Ground truth:
– human judgements
– flamenco experts vs naïve listeners
• Strongest agreement among experts, but differing criteria → no consensus / general solution yet!
– Large-scale user studies (Gómez et al. 2012) (Kroher et al. 2014)
27. Conclusions
• Music is multi-modal, multi-faceted, multi-layer
• Similarity is not a general concept, but depends on
– the task,
– the repertoire, and
– the listener! (and their context…)