Music similarity: what for?
emilia.gomez@upf.edu
Building real applications for real users

Concepts and models of similarity
•  Aim of the day: modeling similarity of musical content
   –  Challenges, goals
   –  Formal models vs. informal expert knowledge
[Figure: music perception and four groups of influencing factors, each with examples (Schedl et al. 2011)]
•  Music content. Examples: rhythm, timbre, melody, harmony, loudness, song lyrics
•  Music context. Examples: semantic labels, performer's reputation, album cover artwork, artist's background, music video clips
•  User context. Examples: mood, activities, social context, spatiotemporal context, physiological aspects
•  User properties. Examples: music preferences, musical training, musical experience, demographics, opinion about performer, artist's popularity among friends

Outline
•  Music similarity in MIR → audio
•  Challenges
•  2 projects related to music similarity

Music similarity in Music Information Retrieval
“Help people find music” (Casey et al. 2008) (Grosche et al. 2011):
•  Specificity
•  Granularity / temporal scope
[Figure: keyscape] Keyscape (Sapp 2005) (Martorell & Gómez 2011)

Music similarity in Music Information Retrieval
Different tasks and applications (Grosche et al. 2011):
SIMILARITY

Music similarity measures
•  Task dependent:
   –  Content: audio, score, lyrics, etc.
   –  Musical facets: melody, rhythm, tonality, timbre, instrumentation.
   –  Descriptors.
   –  Weights.
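
The task dependence listed above can be made concrete with a small sketch: one distance per musical facet, combined with weights chosen for the task. The facet names, descriptor vectors, and the `weighted_distance` helper are illustrative assumptions, not part of the original slides.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_distance(track_a, track_b, weights):
    """Combine per-facet distances using task-dependent weights.

    track_a, track_b: dict mapping facet name -> descriptor vector.
    weights: dict mapping facet name -> non-negative weight.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    combined = sum(
        w * euclidean(track_a[facet], track_b[facet])
        for facet, w in weights.items()
    )
    return combined / total


# Hypothetical descriptor values, for illustration only.
a = {"melody": [0.0, 1.0], "rhythm": [1.0, 0.0], "timbre": [0.5, 0.5]}
b = {"melody": [0.0, 1.0], "rhythm": [0.0, 1.0], "timbre": [0.5, 0.5]}

# The same pair of tracks is "identical" for a melody-driven task
# and "different" for a rhythm-driven task.
melody_task = {"melody": 1.0, "rhythm": 0.0, "timbre": 0.0}
rhythm_task = {"melody": 0.0, "rhythm": 1.0, "timbre": 0.0}
d_melody = weighted_distance(a, b, melody_task)
d_rhythm = weighted_distance(a, b, rhythm_task)
```

Changing only the weights changes which tracks count as similar, which is the sense in which the measure is task dependent.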
  
Audio music similarity
1.  Low-level spectral descriptors: Aucouturier and Pachet (2004), Pampalk (2006)
   –  High specificity – global
   –  “Audio quality” (Urbano et al. 2014)
   –  “Timbre” → sound quality
[Figure: two panels, “LTSA Flute − C4” and “LTSA Oboe − C4”; x-axis: frequency (Hz), 0 to 7000; y-axis: spectral magnitude (dB)]
(McAdams and Giordano 2008)
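
As a minimal illustration of a low-level spectral descriptor, the sketch below computes the spectral centroid (the magnitude-weighted mean frequency, often associated with perceived brightness) of two toy spectra. It is not the method of the works cited above; the harmonic frequencies and magnitudes are invented, and the magnitudes are linear rather than in dB.

```python
def spectral_centroid(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of a spectrum, in Hz."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


# First three harmonics of a note near C4 (hypothetical values).
harmonics_hz = [262.0, 524.0, 786.0]

# Invented linear magnitudes: most energy in the fundamental versus
# most energy in the upper harmonics.
flute_like = [1.0, 0.3, 0.1]
oboe_like = [0.4, 1.0, 0.8]

c_flute = spectral_centroid(harmonics_hz, flute_like)
c_oboe = spectral_centroid(harmonics_hz, oboe_like)
```

Two notes with the same pitch can therefore be far apart under a spectral descriptor, which is why such descriptors mostly capture timbre and sound quality.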
Audio music similarity
1.  Low-level spectral descriptors: Aucouturier and Pachet (2004), Pampalk (2006)
2.  Incorporate mid-level musical descriptors:
   –  Rhythm: Foote (2002)
   –  Pitch: Müller et al. (2006), Serrà et al. (2007) → cover version identification, audio-score alignment

Cover version identification
(Gómez et al. 2006)
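
A cover is often performed in a different key, so pitch-based comparison usually needs to be transposition invariant. The sketch below, a simplification and not the cited systems, summarizes each track as a single 12-bin pitch-class (chroma) vector and tries all 12 circular shifts. Real systems compare chroma sequences over time; the vectors and function names here are assumptions.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_transposition(chroma_a, chroma_b):
    """Return (shift, similarity) for the circular shift of chroma_b
    that maximizes cosine similarity with chroma_a."""
    best_shift, best_sim = 0, -1.0
    for shift in range(12):
        rotated = chroma_b[shift:] + chroma_b[:shift]
        sim = cosine(chroma_a, rotated)
        if sim > best_sim:
            best_shift, best_sim = shift, sim
    return best_shift, best_sim


# Toy global chroma vectors: energy on C, E, G for the original,
# and the same pattern two semitones higher (D, F#, A) for the cover.
original = [1.0 if pc in (0, 4, 7) else 0.0 for pc in range(12)]
cover = [1.0 if pc in (2, 6, 9) else 0.0 for pc in range(12)]

shift, similarity = best_transposition(original, cover)
```

Without the shift search, these two vectors share no active pitch class and would look unrelated.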
  
Approaches in audio music similarity
1.  Low-level spectral descriptors: Aucouturier and Pachet (2004), Pampalk (2006)
2.  Incorporate mid-level musical descriptors
3.  Combine those with semantic descriptors obtained by automatic classification (e.g. genre, instrument, mood): Bogdanov et al. (2013)

Personalization
(Schedl et al. 2012)
1.  Let users control weights
   –  A lot of effort when the number of descriptors is high
   –  The user has to make preferences explicit
2.  Gather ratings of the similarity of pairs of songs → robustness (Urbano et al. 2010)
3.  Collection clustering: ask users to group songs in a 2D plot (Stober 2011)
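
One way to connect options 1 and 2 is to estimate the facet weights from pairwise ratings instead of asking the user to set them. The sketch below fits non-negative weights by projected gradient descent on a squared error. It is an illustrative assumption, not a method taken from the cited works, and the distances and ratings are invented.

```python
def fit_weights(distances, ratings, steps=2000, lr=0.1):
    """Fit non-negative facet weights so that the weighted sum of
    per-facet distances approximates each dissimilarity rating.

    distances: list of per-pair lists, one distance per facet.
    ratings: list of user dissimilarity ratings, one per pair.
    """
    n_facets = len(distances[0])
    n_pairs = len(ratings)
    weights = [1.0 / n_facets] * n_facets
    for _ in range(steps):
        grad = [0.0] * n_facets
        for d, r in zip(distances, ratings):
            err = sum(w * x for w, x in zip(weights, d)) - r
            for k in range(n_facets):
                grad[k] += 2.0 * err * d[k] / n_pairs
        # Gradient step, then clamp to keep weights non-negative.
        weights = [max(0.0, w - lr * g) for w, g in zip(weights, grad)]
    return weights


# Hypothetical data: per-pair (melody, timbre) distances and the
# dissimilarity a listener reported for each pair. The ratings were
# generated as 0.8 * melody + 0.2 * timbre.
pair_distances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
pair_ratings = [0.8, 0.2, 1.0, 0.5]

learned = fit_weights(pair_distances, pair_ratings)
```

With noisy real ratings the fit would only be approximate, which is where the robustness concern in point 2 comes in.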
  
Evaluation
•  Similarity vs categorization: artist, genre, instrument, covers, co-occurrence in personal collections and playlists (Berenzweig et al. 2003)
•  Surveys (Vignoli and Pauws 2005)

But
As “similarity is an ill-defined concept”…
we should evaluate each task separately!

Audio Music Similarity Task
•  7000 30-second audio clips drawn from 10 genres: Blues, Jazz, Country/Western, Baroque, Classical, Romantic, Electronica, Hip-Hop, Rock, HardRock/Metal
•  Songs from the same artist are filtered out
•  Evaluation criteria:
   –  User ratings: not similar, somewhat similar, very similar
   –  Objective statistics: similarity in terms of genre, artist and album.
•  More in the talk by A. Flexer.
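
The objective statistics above can be sketched as a genre-agreement score over the top results, with the artist filter applied first. This is a hedged illustration of the idea, not the task's official evaluation code; the track records and the function name are hypothetical.

```python
def genre_agreement_at_k(query, ranked_results, k):
    """Fraction of the first k results that share the query's genre,
    after removing results by the query's own artist."""
    filtered = [t for t in ranked_results if t["artist"] != query["artist"]]
    top = filtered[:k]
    if not top:
        return 0.0
    matches = sum(1 for t in top if t["genre"] == query["genre"])
    return matches / len(top)


# Hypothetical query and ranked result list, for illustration only.
query = {"artist": "A", "genre": "Jazz"}
results = [
    {"artist": "A", "genre": "Jazz"},   # same artist: filtered out
    {"artist": "B", "genre": "Jazz"},
    {"artist": "C", "genre": "Rock"},
    {"artist": "D", "genre": "Jazz"},
]

score = genre_agreement_at_k(query, results, k=3)
```

Filtering the query's artist matters because same-artist tracks tend to share genre and production, which would inflate the score.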
  	
  
Tasks related to similarity
•  Audio cover identification
•  Audio classification
•  Query by singing (humming)
•  Query by tapping
•  Audio to score alignment
•  Discovery of repeated themes / sections
•  Structural segmentation
•  Audio fingerprinting
•  Symbolic melodic similarity
•  …

Challenges
1.  Music is multimodal, multi-faceted
2.  Similarity depends on
   a.  the user/listener,
   b.  the repertoire, and
   c.  the task
Use cases

Use case 1:
•  Repertoire: symphonic music
•  Modalities: audio, score, video, gestures
•  Task: structural analysis → visualization
•  Personalization: “experts” – listeners exposed to it (me) – naïve listeners (young people?)
Beethoven Symphony No. 3 Eroica
http://phenicx.upf.edu/

Modalities
•  Audio: dynamics, timbre, tempo, f0 (Grachten et al. 2013) (Bosch & Gómez 2013)
•  Score: key, pitch-class sets, orchestration (Martorell and Gómez 2014)
•  Video: performers, movement (Bazzica, Liem and Hanjalic 2014)
•  Gestures: movement (Sarasúa and Guaus 2014)
•  Context: manual annotations (Schedl et al. 2014)

Strategies
•  Synchronization
•  Generate different layers of information
•  Personalization:
   –  Understand user needs: naïve listeners, music experts, performers
   –  Let them choose by means of visualization, interaction → HCI

Use case 2:
•  Repertoire: flamenco singing
•  Modalities: audio
•  Task: style and variant characterization
•  Personalization: “experts” – listeners exposed to it (me) – naïve listeners (you?)
http://mtg.upf.edu/research/projects/cofla

Melodic similarity
[Image not available]
1.  Each style is characterized by a common melodic skeleton
2.  Spontaneous improvisation: ornamentation, prolongation, rhythmic and melodic modification
Antonio Mairena, Chano Lobato

Melodic similarity - style
•  Ground truth: style annotations
•  Specific and standard measures:
   –  High-level, expert-specific features
   –  Fundamental frequency (dynamic time warping)
   –  Symbolic-based descriptors
   –  Chroma similarity
(Huson 1998)
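
Dynamic time warping (DTW) suits the fundamental-frequency measure above because it absorbs prolongation: a held note aligns with a single note of the skeleton at no cost. The sketch below is a textbook DTW with an absolute-difference local cost. Representing f0 contours as semitone-quantized sequences is an assumption, and the contours are invented.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences,
    using absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical f0 contours in semitones (MIDI-like note numbers).
skeleton = [60, 62, 64, 62, 60]
prolonged = [60, 60, 62, 64, 64, 62, 60]  # same melody, notes held longer
other_melody = [60, 65, 67, 65, 60]

d_same = dtw_distance(skeleton, prolonged)
d_other = dtw_distance(skeleton, other_melody)
```

Ornamentation that adds new pitches would still raise the cost, so DTW on f0 handles timing variation better than melodic variation.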
Melodic similarity – variants
•  Ground truth:
   –  human judgements
   –  flamenco experts vs naïve listeners
•  Strongest agreement among experts, and different criteria → no consensus / general solution yet!
   –  Large-scale user studies
(Gómez et al. 2012) (Kroher et al. 2014)

Conclusions
•  Music is multi-modal, multi-faceted, multi-layer
•  Similarity is not a general concept, but depends on
   –  the task
   –  the repertoire, and
   –  the listener! (and his context…)
