SlideShare a Scribd company logo
1 of 16
Download to read offline
Introduction to Audio Content Analysis
module 12.1: music similarity
alexander lerch
overview intro approach visual eval summary
introduction
overview
corresponding textbook section
section 12.1
lecture content
• music similarity and its relation to musical genre
• clustering and visualization of feature space
learning objectives
• describe potential issues with algorithms for measuring music similarity
• understand common approaches for timbre similarity
module 12.1: music similarity 1 / 9
overview intro approach visual eval summary
introduction
overview
corresponding textbook section
section 12.1
lecture content
• music similarity and its relation to musical genre
• clustering and visualization of feature space
learning objectives
• describe potential issues with algorithms for measuring music similarity
• understand common approaches for timbre similarity
module 12.1: music similarity 1 / 9
overview intro approach visual eval summary
audio similarity
introduction
perception of music similarity
• multi-dimensional (melodic, rhythmic, sound quality, . . . )
• user dependent
• associative, may also depend on editorial data
• may be context dependent
genres are clusters of musical similarity
⇒ genre classification is a special case of audio similarity measures
instead of assigning (genre) labels, the similarity/distance between (pairs) of files
is measured
module 12.1: music similarity 2 / 9
overview intro approach visual eval summary
audio similarity
introduction
perception of music similarity
• multi-dimensional (melodic, rhythmic, sound quality, . . . )
• user dependent
• associative, may also depend on editorial data
• may be context dependent
genres are clusters of musical similarity
⇒ genre classification is a special case of audio similarity measures
instead of assigning (genre) labels, the similarity/distance between (pairs) of files
is measured
module 12.1: music similarity 2 / 9
overview intro approach visual eval summary
audio similarity
introduction
perception of music similarity
• multi-dimensional (melodic, rhythmic, sound quality, . . . )
• user dependent
• associative, may also depend on editorial data
• may be context dependent
genres are clusters of musical similarity
⇒ genre classification is a special case of audio similarity measures
instead of assigning (genre) labels, the similarity/distance between (pairs) of files
is measured
module 12.1: music similarity 2 / 9
overview intro approach visual eval summary
music similarity
introduction
commonalities with genre classification
• similar set of features
• ambiguous ’ground truth’
• unclear value/impact of low level and high level features
differences to genre classification
• distance/similarity measure instead of categorical classification
tasks
• playlist generation
• music discovery
module 12.1: music similarity 3 / 9
overview intro approach visual eval summary
music similarity
introduction
commonalities with genre classification
• similar set of features
• ambiguous ’ground truth’
• unclear value/impact of low level and high level features
differences to genre classification
• distance/similarity measure instead of categorical classification
tasks
• playlist generation
• music discovery
module 12.1: music similarity 3 / 9
overview intro approach visual eval summary
music similarity
introduction
commonalities with genre classification
• similar set of features
• ambiguous ’ground truth’
• unclear value/impact of low level and high level features
differences to genre classification
• distance/similarity measure instead of categorical classification
tasks
• playlist generation
• music discovery
module 12.1: music similarity 3 / 9
overview intro approach visual eval summary
audio similarity
approaches
standard approach
• define set of features and aggregate them appropriately
• use vector distance in latent space as similarity approximation
modern approach
• train ’meaningful’ representation (e.g., VAE etc.)
• use vector distance in latent space as similarity approximation
module 12.1: music similarity 4 / 9
overview intro approach visual eval summary
audio similarity
visualization in a 2D space
problem
• feature space is high-dimensional
→ cannot be visualized
find mapping to 2D “preserving” (high-dimensional) distance metrics
example:
• Self-Organizing Maps
module 12.1: music similarity 5 / 9
overview intro approach visual eval summary
audio similarity
visualization in a 2D space
problem
• feature space is high-dimensional
→ cannot be visualized
find mapping to 2D “preserving” (high-dimensional) distance metrics
example:
• Self-Organizing Maps
module 12.1: music similarity 5 / 9
overview intro approach visual eval summary
audio similarity
visualization example: SOM 1/2
1 create a map with ’neurons’
2 train
• for each training sample find BMU (best matching unit)
• adapt BMU and neighbors toward training sample
Wv (t + 1) = Wv (t) + θ(u, v, t)α(t) D(t) − Wv (t)

▶ θ(u, v, t): depends on neighborhood distance from BMU
▶ α(t): learning restraint
▶ D(t) training sample
module 12.1: music similarity 6 / 9
https://en.wikipedia.org/wiki/Self-organizing_map
overview intro approach visual eval summary
audio similarity
SOM 2/2
graph from1
1
E. Pampalk, “Islands of Music,” Diploma Thesis, Technische Universität Wien, 2001.
module 12.1: music similarity 7 / 9
overview intro approach visual eval summary
audio similarity
evaluation
as the task is not clearly defined, it often can only be indirectly evaluated
• evaluation through genre/artist/album labels
• evaluation through listening surveys
• qualitative evaluation (picking pairs of samples and discussing them)
lack of ground truth and lack of established metrics
module 12.1: music similarity 8 / 9
overview intro approach visual eval summary
summary
lecture content
music similarity
• even less clearly defined than music genre
processing steps
1 extract features
2 define some distance metric in feature space
clustering algorithms
• work to a certain degree with traditional features
module 12.1: music similarity 9 / 9

More Related Content

More from AlexanderLerch4

More from AlexanderLerch4 (20)

0A-02-ACA-Fundamentals-Convolution.pdf
0A-02-ACA-Fundamentals-Convolution.pdf0A-02-ACA-Fundamentals-Convolution.pdf
0A-02-ACA-Fundamentals-Convolution.pdf
 
12-02-ACA-Genre.pdf
12-02-ACA-Genre.pdf12-02-ACA-Genre.pdf
12-02-ACA-Genre.pdf
 
09-03-ACA-Temporal-Onsets.pdf
09-03-ACA-Temporal-Onsets.pdf09-03-ACA-Temporal-Onsets.pdf
09-03-ACA-Temporal-Onsets.pdf
 
13-00-ACA-Mood.pdf
13-00-ACA-Mood.pdf13-00-ACA-Mood.pdf
13-00-ACA-Mood.pdf
 
09-01-ACA-Temporal-Intro.pdf
09-01-ACA-Temporal-Intro.pdf09-01-ACA-Temporal-Intro.pdf
09-01-ACA-Temporal-Intro.pdf
 
07-06-ACA-Tonal-Chord.pdf
07-06-ACA-Tonal-Chord.pdf07-06-ACA-Tonal-Chord.pdf
07-06-ACA-Tonal-Chord.pdf
 
09-04-ACA-Temporal-BeatHisto.pdf
09-04-ACA-Temporal-BeatHisto.pdf09-04-ACA-Temporal-BeatHisto.pdf
09-04-ACA-Temporal-BeatHisto.pdf
 
09-05-ACA-Temporal-Tempo.pdf
09-05-ACA-Temporal-Tempo.pdf09-05-ACA-Temporal-Tempo.pdf
09-05-ACA-Temporal-Tempo.pdf
 
11-00-ACA-Fingerprinting.pdf
11-00-ACA-Fingerprinting.pdf11-00-ACA-Fingerprinting.pdf
11-00-ACA-Fingerprinting.pdf
 
08-00-ACA-Intensity.pdf
08-00-ACA-Intensity.pdf08-00-ACA-Intensity.pdf
08-00-ACA-Intensity.pdf
 
03-05-ACA-Input-Features.pdf
03-05-ACA-Input-Features.pdf03-05-ACA-Input-Features.pdf
03-05-ACA-Input-Features.pdf
 
07-03-03-ACA-Tonal-Monof0.pdf
07-03-03-ACA-Tonal-Monof0.pdf07-03-03-ACA-Tonal-Monof0.pdf
07-03-03-ACA-Tonal-Monof0.pdf
 
05-00-ACA-Data-Intro.pdf
05-00-ACA-Data-Intro.pdf05-00-ACA-Data-Intro.pdf
05-00-ACA-Data-Intro.pdf
 
07-03-05-ACA-Tonal-F0Eval.pdf
07-03-05-ACA-Tonal-F0Eval.pdf07-03-05-ACA-Tonal-F0Eval.pdf
07-03-05-ACA-Tonal-F0Eval.pdf
 
03-03-01-ACA-Input-TF-Fourier.pdf
03-03-01-ACA-Input-TF-Fourier.pdf03-03-01-ACA-Input-TF-Fourier.pdf
03-03-01-ACA-Input-TF-Fourier.pdf
 
03-06-ACA-Input-FeatureLearning.pdf
03-06-ACA-Input-FeatureLearning.pdf03-06-ACA-Input-FeatureLearning.pdf
03-06-ACA-Input-FeatureLearning.pdf
 
07-03-04-ACA-Tonal-Polyf0.pdf
07-03-04-ACA-Tonal-Polyf0.pdf07-03-04-ACA-Tonal-Polyf0.pdf
07-03-04-ACA-Tonal-Polyf0.pdf
 
03-07-01-ACA-Input-Post-Processing.pdf
03-07-01-ACA-Input-Post-Processing.pdf03-07-01-ACA-Input-Post-Processing.pdf
03-07-01-ACA-Input-Post-Processing.pdf
 
07-02-ACA-Tonal-Music.pdf
07-02-ACA-Tonal-Music.pdf07-02-ACA-Tonal-Music.pdf
07-02-ACA-Tonal-Music.pdf
 
06-00-ACA-Evaluation.pdf
06-00-ACA-Evaluation.pdf06-00-ACA-Evaluation.pdf
06-00-ACA-Evaluation.pdf
 

Recently uploaded

Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 
Microsoft BitLocker Bypass Attack Method.pdf
Microsoft BitLocker Bypass Attack Method.pdfMicrosoft BitLocker Bypass Attack Method.pdf
Microsoft BitLocker Bypass Attack Method.pdf
Overkill Security
 

Recently uploaded (20)

Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
Microsoft BitLocker Bypass Attack Method.pdf
Microsoft BitLocker Bypass Attack Method.pdfMicrosoft BitLocker Bypass Attack Method.pdf
Microsoft BitLocker Bypass Attack Method.pdf
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 

12-01-ACA-Similarity.pdf

  • 1. Introduction to Audio Content Analysis module 12.1: music similarity alexander lerch
  • 2. overview intro approach visual eval summary introduction overview corresponding textbook section section 12.1 lecture content • music similarity and its relation to musical genre • clustering and visualization of feature space learning objectives • describe potential issues with algorithms for measuring music similarity • understand common approaches for timbre similarity module 12.1: music similarity 1 / 9
  • 3. overview intro approach visual eval summary introduction overview corresponding textbook section section 12.1 lecture content • music similarity and its relation to musical genre • clustering and visualization of feature space learning objectives • describe potential issues with algorithms for measuring music similarity • understand common approaches for timbre similarity module 12.1: music similarity 1 / 9
  • 4. overview intro approach visual eval summary audio similarity introduction perception of music similarity • multi-dimensional (melodic, rhythmic, sound quality, . . . ) • user dependent • associative, may also depend on editorial data • may be context dependent genres are clusters of musical similarity ⇒ genre classification is a special case of audio similarity measures instead of assigning (genre) labels, the similarity/distance between (pairs) of files is measured module 12.1: music similarity 2 / 9
  • 5. overview intro approach visual eval summary audio similarity introduction perception of music similarity • multi-dimensional (melodic, rhythmic, sound quality, . . . ) • user dependent • associative, may also depend on editorial data • may be context dependent genres are clusters of musical similarity ⇒ genre classification is a special case of audio similarity measures instead of assigning (genre) labels, the similarity/distance between (pairs) of files is measured module 12.1: music similarity 2 / 9
  • 6. overview intro approach visual eval summary audio similarity introduction perception of music similarity • multi-dimensional (melodic, rhythmic, sound quality, . . . ) • user dependent • associative, may also depend on editorial data • may be context dependent genres are clusters of musical similarity ⇒ genre classification is a special case of audio similarity measures instead of assigning (genre) labels, the similarity/distance between (pairs) of files is measured module 12.1: music similarity 2 / 9
  • 7. overview intro approach visual eval summary music similarity introduction commonalities with genre classification • similar set of features • ambiguous ’ground truth’ • unclear value/impact of low level and high level features differences to genre classification • distance/similarity measure instead of categorical classification tasks • playlist generation • music discovery module 12.1: music similarity 3 / 9
  • 8. overview intro approach visual eval summary music similarity introduction commonalities with genre classification • similar set of features • ambiguous ’ground truth’ • unclear value/impact of low level and high level features differences to genre classification • distance/similarity measure instead of categorical classification tasks • playlist generation • music discovery module 12.1: music similarity 3 / 9
  • 9. overview intro approach visual eval summary music similarity introduction commonalities with genre classification • similar set of features • ambiguous ’ground truth’ • unclear value/impact of low level and high level features differences to genre classification • distance/similarity measure instead of categorical classification tasks • playlist generation • music discovery module 12.1: music similarity 3 / 9
  • 10. overview intro approach visual eval summary audio similarity approaches standard approach • define set of features and aggregate them appropriately • use vector distance in latent space as similarity approximation modern approach • train ’meaningful’ representation (e.g., VAE etc.) • use vector distance in latent space as similarity approximation module 12.1: music similarity 4 / 9
  • 11. overview intro approach visual eval summary audio similarity visualization in a 2D space problem • feature space is high-dimensional → cannot be visualized find mapping to 2D “preserving” (high-dimensional) distance metrics example: • Self-Organizing Maps module 12.1: music similarity 5 / 9
  • 12. overview intro approach visual eval summary audio similarity visualization in a 2D space problem • feature space is high-dimensional → cannot be visualized find mapping to 2D “preserving” (high-dimensional) distance metrics example: • Self-Organizing Maps module 12.1: music similarity 5 / 9
  • 13. overview intro approach visual eval summary audio similarity visualization example: SOM 1/2 1 create a map with ’neurons’ 2 train • for each training sample find BMU (best matching unit) • adapt BMU and neighbors toward training sample Wv (t + 1) = Wv (t) + θ(u, v, t)α(t) D(t) − Wv (t) ▶ θ(u, v, t): depends on neighborhood distance from BMU ▶ α(t): learning restraint ▶ D(t) training sample module 12.1: music similarity 6 / 9 https://en.wikipedia.org/wiki/Self-organizing_map
  • 14. overview intro approach visual eval summary audio similarity SOM 2/2 graph from1 1 E. Pampalk, “Islands of Music,” Diploma Thesis, Technische Universität Wien, 2001. module 12.1: music similarity 7 / 9
  • 15. overview intro approach visual eval summary audio similarity evaluation as the task is not clearly defined, it often can only be indirectly evaluated • evaluation through genre/artist/album labels • evaluation through listening surveys • qualitative evaluation (picking pairs of samples and discussing them) lack of ground truth and lack of established metrics module 12.1: music similarity 8 / 9
  • 16. overview intro approach visual eval summary summary lecture content music similarity • even less clearly defined than music genre processing steps 1 extract features 2 define some distance metric in feature space clustering algorithms • work to a certain degree with traditional features module 12.1: music similarity 9 / 9