13. Approaches in audio music similarity
1. Low-level spectral descriptors: Aucouturier and Pachet (2004), Pampalk (2006)
2. Incorporate mid-level musical descriptors
3. Combine those with semantic descriptors obtained by automatic classification (e.g. genre, instrument, mood): Bogdanov et al. (2013)
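As a minimal sketch of the low-level spectral approach above: one common recipe in that line of work summarizes a song's MFCC frames as a single Gaussian and compares songs with a symmetrized KL divergence. The frame data below is random stand-in data, and the diagonal-covariance model is a simplifying assumption, not the exact method of the cited papers.

```python
import numpy as np

def gaussian_model(frames):
    # Summarize a song's MFCC-like frames (n_frames x n_dims)
    # as a single diagonal-covariance Gaussian (mean, variance).
    return frames.mean(axis=0), frames.var(axis=0) + 1e-9

def symmetric_kl(model_a, model_b):
    # Symmetrized KL divergence between two diagonal Gaussians,
    # used here as a timbre-similarity score (smaller = more similar).
    (mu_a, var_a), (mu_b, var_b) = model_a, model_b
    kl_ab = 0.5 * np.sum(var_a / var_b + (mu_b - mu_a) ** 2 / var_b
                         - 1 + np.log(var_b / var_a))
    kl_ba = 0.5 * np.sum(var_b / var_a + (mu_a - mu_b) ** 2 / var_a
                         - 1 + np.log(var_a / var_b))
    return kl_ab + kl_ba

rng = np.random.default_rng(0)
song_a = rng.normal(0.0, 1.0, size=(200, 13))   # stand-in MFCC frames
song_b = rng.normal(0.1, 1.0, size=(200, 13))   # similar "timbre"
song_c = rng.normal(2.0, 3.0, size=(200, 13))   # different "timbre"

d_ab = symmetric_kl(gaussian_model(song_a), gaussian_model(song_b))
d_ac = symmetric_kl(gaussian_model(song_a), gaussian_model(song_c))
print(d_ab < d_ac)  # the similar pair scores a smaller divergence
```

Real systems refine this with Gaussian mixtures or full covariances, but the ranking idea is the same.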
14. Personalization (Schedl et al. 2012)
1. Let users control weights
– A lot of effort for a high number of descriptors
– The user must make their preferences explicit
2. Gather ratings of the similarity of pairs of songs → robustness (Urbano et al. 2010)
3. Collection clustering: ask users to group songs in a 2D plot (Stober 2011)
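The first personalization strategy above (user-controlled weights) can be sketched as a weighted combination of per-facet distances. The facet names and distance values here are hypothetical illustrations, not from any of the cited systems.

```python
# Hypothetical per-facet distances between one pair of songs.
facet_distances = {"timbre": 0.2, "rhythm": 0.7, "harmony": 0.4}

def weighted_distance(distances, weights):
    # Combine per-facet distances with user-chosen weights,
    # normalized so the weights sum to 1.
    total = sum(weights.values())
    return sum(weights[f] * d for f, d in distances.items()) / total

# Two users with explicit, different preferences judge the same
# pair of songs differently.
rhythm_fan = {"timbre": 1.0, "rhythm": 5.0, "harmony": 1.0}
timbre_fan = {"timbre": 5.0, "rhythm": 1.0, "harmony": 1.0}
print(weighted_distance(facet_distances, rhythm_fan))  # ≈ 0.586
print(weighted_distance(facet_distances, timbre_fan))  # = 0.3
```

This also illustrates the drawback noted on the slide: with many descriptors, asking the user to set every weight by hand quickly becomes impractical.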
15. Evaluation
• Similarity vs. categorization: artist, genre, instrument, covers, co-occurrence in personal collections and playlists (Berenzweig et al. 2003)
• Surveys (Vignoli and Pauws 2005)
16. But as "similarity is an ill-defined concept", we should evaluate each task separately!
17. Audio Music Similarity Task
• 7000 30-second audio clips drawn from 10 genres: Blues, Jazz, Country/Western, Baroque, Classical, Romantic, Electronica, Hip-Hop, Rock, Hard Rock/Metal
• Songs from the same artist are filtered out
• Evaluation criteria:
– User ratings: not similar, somewhat similar, very similar
– Objective statistics: similarity in terms of genre, artist and album.
• More in the talk by A. Flexer.
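The "objective statistics" criterion above can be sketched as a genre-based precision over a ranked result list. The query and candidate data below are hypothetical, and artist filtering is assumed to have happened upstream.

```python
# Hypothetical candidate list for a Blues query:
# (song_id, genre, distance) triples, same-artist songs already removed.
query_genre = "Blues"
results = [("s1", "Blues", 0.1), ("s2", "Jazz", 0.2),
           ("s3", "Blues", 0.3), ("s4", "Rock", 0.4),
           ("s5", "Blues", 0.5)]

def genre_precision_at_k(results, query_genre, k):
    # Objective statistic: fraction of the k nearest neighbours
    # that share the query's genre label.
    top_k = sorted(results, key=lambda r: r[2])[:k]
    return sum(r[1] == query_genre for r in top_k) / k

print(genre_precision_at_k(results, query_genre, 3))  # 2 of 3 are Blues
```

User ratings (the other criterion) need human graders, which is why such label-based statistics are a cheap complementary proxy.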
18. Tasks related to similarity
• Audio cover identification
• Audio classification
• Query by singing (humming)
• Query by tapping
• Audio-to-score alignment
• Discovery of repeated themes / sections
• Structural segmentation
• Audio fingerprinting
• Symbolic melodic similarity
• …
19. Challenges
1. Music is multimodal, multi-faceted
2. Similarity depends on
a. the user/listener,
b. the repertoire, and
c. the task
Use cases
20. Use case 1:
• Repertoire: symphonic music
• Modalities: audio, score, video, gestures
• Task: structural analysis → visualization
• Personalization: "experts" – listeners exposed to it (me) – naïve listeners (young people?)
Beethoven Symphony No. 3 Eroica
http://phenicx.upf.edu/
21. Modalities
• Audio: dynamics, timbre, tempo, f0 (Grachten et al. 2013) (Bosch and Gómez 2013)
• Score: key, pitch-class sets, orchestration (Martorell and Gómez 2014)
• Video: performers, movement (Bazzica, Liem and Hanjalic 2014)
• Gestures: movement (Sarasúa and Guaus 2014)
• Context: manual annotations (Schedl et al. 2014)
22. Strategies
• Synchronization
• Generate different layers of information
• Personalization:
– Understand user needs: naïve listeners, music experts, performers
– Let them choose by means of visualization, interaction → HCI
23. Use case 2:
• Repertoire: flamenco singing
• Modalities: audio
• Task: style and variant characterization
• Personalization: "experts" – listeners exposed to it (me) – naïve listeners (you?)
http://mtg.upf.edu/research/projects/cofla
24. Melodic similarity
1. Each style is characterized by a common melodic skeleton
2. Spontaneous improvisation: ornamentation, prolongation, rhythmic and melodic modification
Antonio Mairena
Chano Lobato
25. Melodic similarity – style
• Ground truth: style annotations
• Specific standard measures:
– High-level expert-specific features
– Fundamental frequency (dynamic time warping)
– Symbolic-based descriptors
– Chroma similarity (Huson 1998)
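The dynamic-time-warping measure on the fundamental frequency mentioned above can be sketched with a plain DTW over two f0 contours; warping absorbs the timing flexibility of ornamented singing. The contours below are toy semitone sequences, not real flamenco transcriptions.

```python
import math

def dtw_distance(seq_a, seq_b):
    # Plain dynamic time warping between two pitch contours:
    # cumulative cost of the best monotonic alignment.
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip in seq_a
                                 cost[i][j - 1],      # skip in seq_b
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

# Toy f0 contours in MIDI semitones: `variant` is a time-stretched,
# lightly ornamented version of `phrase`; `other_style` differs.
phrase      = [60, 60, 62, 64, 62, 60]
variant     = [60, 60, 60, 62, 63, 64, 64, 62, 60]
other_style = [67, 65, 67, 69, 71, 69]

print(dtw_distance(phrase, variant) < dtw_distance(phrase, other_style))
```

Because DTW compares whole warped contours, it rewards the shared melodic skeleton even when ornamentation and prolongation change the local timing.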
26. Melodic similarity – variants
• Ground truth:
– human judgements
– flamenco experts vs naïve listeners
• Strongest agreement among experts, but differing criteria → no consensus / general solution yet!
– Large-scale user studies (Gómez et al. 2012) (Kroher et al. 2014)
27. Conclusions
• Music is multi-modal, multi-faceted, multi-layer
• Similarity is not a general concept, but depends on
– the task,
– the repertoire, and
– the listener! (and their context…)