Music Information Retrieval
Deema Aloum Noor Orfahly
Overview
Introduction
Music Document
Retrieval
Emotion Detection
What is MIR?
• Music Information Retrieval (MIR): the
interdisciplinary science of retrieving information
from music. MIR is a small but growing field of
research with many real-world applications.
• Objective: make the world’s vast store of music
accessible to all.
• The contributing disciplines: computer science,
information retrieval, audio engineering, digital
sound processing, musicology, library science,
cognitive science, psychology, philosophy and law.
MIR Applications
• Music Document Retrieval
• Recommender System
• Track Separation
• Automatic Music Transcription
• Rights Management
• Emotion Detection
Music Terms - Pitch & Melody
• Pitch is a particular frequency of sound
• E.g., 440 Hz
• A note is a pitch that has been given a name.
• E.g., Western music generally refers to the
440 Hz pitch as A, specifically A4
• A melody is a pattern of pitches
• Only an electronically produced sound can consist of
a single frequency; all other sounds are a mix of
multiple frequencies.
• The mix of frequencies in a sound determines its
timbre
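The pitch-to-note naming above can be sketched in a few lines. This is a minimal illustration, assuming twelve-tone equal temperament with A4 = 440 Hz and the MIDI convention that A4 is note number 69; the function name `note_name` is made up for this example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(freq_hz, a4_hz=440.0):
    """Return the nearest equal-tempered note name for a frequency."""
    # MIDI convention (an assumption here): A4 is note 69, 12 semitones per octave.
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"

print(note_name(440.0))   # A4
print(note_name(880.0))   # A5
print(note_name(261.63))  # C4
```

Frequencies that fall between two notes are rounded to the nearest semitone, which is a simplification.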
Music Terms - Timbre
• In music
– The characteristic quality of sound produced by
a particular instrument or voice; tone color.
• In acoustics and phonetics
– The characteristic quality of a sound,
independent of pitch and loudness
– Depends on the relative strengths
of its component frequencies;
– E.g., A4 played on a guitar is a sound
composed of the following frequencies:
440 Hz, 880 Hz, 1320 Hz, 1760 Hz,
etc.
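The guitar example can be reproduced with a short sketch. The harmonic frequencies are integer multiples of the fundamental; the two amplitude profiles below are invented for illustration and do not describe any real instrument.

```python
def harmonics(f0_hz, count):
    """Frequencies of the first `count` harmonics of a fundamental."""
    return [n * f0_hz for n in range(1, count + 1)]

def relative_strengths(amplitudes):
    """Scale amplitudes so they sum to 1, making two spectra comparable."""
    total = sum(amplitudes)
    return [a / total for a in amplitudes]

print(harmonics(440, 4))  # [440, 880, 1320, 1760]

# Hypothetical amplitude profiles for two instruments playing the same A4.
instrument_a = relative_strengths([1.0, 0.5, 0.25, 0.125])
instrument_b = relative_strengths([1.0, 0.1, 0.6, 0.05])
print(instrument_a != instrument_b)  # True: same pitch, different timbre
```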
Music Document Retrieval
• Music Identification
• Music Similarity
MDR - Music Identification
• Metadata-based Approach:
– Music identification relies on information about
the content rather than the content itself.
– Ex. TOC
• Content-based Approach:
– Ex. Shazam Service
MDR - Music Identification - TOC
• TOC (Table Of Contents): a representation of the
start positions and lengths of the tracks on the disc.
• This feature is highly specific, because it is extremely
rare for different albums to share the same lengths
of tracks in the same order.
• However, slight differences in how CDs are produced, even
from the same source audio material, can yield
different TOCs, which then fail to match each
other.
• Ex. freedb
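The strength and the weakness of a TOC-based key can be seen in a small sketch. The key below is a plain SHA-1 over the track layout and is only a stand-in: it is not the disc ID scheme freedb uses, and the frame counts are invented.

```python
import hashlib

def toc_key(track_starts, track_lengths):
    """Hypothetical lookup key built from a disc's track layout (in frames)."""
    text = ",".join(f"{s}:{n}" for s, n in zip(track_starts, track_lengths))
    return hashlib.sha1(text.encode("ascii")).hexdigest()

pressing_1 = toc_key([150, 18050, 40210], [17900, 22160, 19875])
pressing_2 = toc_key([150, 18050, 40210], [17900, 22160, 19875])
# Same audio, but one track is a single frame longer on this pressing.
pressing_3 = toc_key([150, 18050, 40210], [17900, 22160, 19876])

print(pressing_1 == pressing_2)  # True
print(pressing_1 == pressing_3)  # False
```

An exact-match key like this is highly specific, but a one-frame difference is enough to miss, which is the failure mode described above.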
MDR - Music Identification - Shazam
• Shazam: a mobile app that recognizes music and TV playing
around you.
• It lets you record up to 15 seconds of the song you are
hearing and then identifies it: the artist, the song title,
and the album, with links to YouTube or to buy the song
on iTunes.
MDR - Music Identification - Shazam
The Initial Spectrogram
MDR - Music Identification - Shazam
• Only the most intense sounds in the song are stored, together
with the time at which each appears and its frequency.
The Simplified Spectrogram
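Reducing a spectrogram to its most intense points can be sketched as simple peak picking. This is an illustrative simplification, not Shazam's actual procedure: it assumes the spectrogram is a list of frames of magnitudes and keeps values that exceed a threshold and both neighbours in the same frame.

```python
def strongest_peaks(spectrogram, threshold):
    """Return (time_index, freq_index) of local maxima above a threshold.

    `spectrogram[t][f]` is the magnitude at time frame t and frequency bin f.
    """
    peaks = []
    for t, frame in enumerate(spectrogram):
        for f, mag in enumerate(frame):
            if mag < threshold:
                continue
            # Compare with neighbours in the same frame only (a simplification).
            left = frame[f - 1] if f > 0 else float("-inf")
            right = frame[f + 1] if f + 1 < len(frame) else float("-inf")
            if mag > left and mag > right:
                peaks.append((t, f))
    return peaks

toy = [
    [0.1, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.8],
    [0.7, 0.1, 0.1, 0.1],
]
print(strongest_peaks(toy, threshold=0.5))  # [(0, 1), (1, 3), (2, 0)]
```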
MDR - Music Identification - Shazam
• To store this in the database in a form that is efficient to search
(easy to index), some of the points in the simplified spectrogram are
chosen as “anchor points”, and each is paired with the points in a
nearby region called its “target zone”.
Pairing the anchor point with points in a target zone
MDR - Music Identification - Shazam
• For each point in the target zone, a hash is created that
combines the following:
– F1: the frequency at which the anchor point is located
– F2: the frequency at which the point in the target zone is
located
– T2 - T1: the time difference between the time when the
point in the target zone is located in the song (t2) and the
time when the anchor point is located in the song (t1)
• Each entry is stored as a 64-bit struct: 32 bits for the hash and
32 bits for the time offset and track ID.
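The pairing and hashing steps can be sketched as follows. The slide only says the hash fits in 32 bits; the 10 + 10 + 12 bit split, the `fan_out` parameter, and treating "the next few peaks" as the target zone are all assumptions made for this example.

```python
def pack_hash(f1, f2, dt):
    """Pack two frequency bins and a time delta into one 32-bit integer.

    The bit widths (10 + 10 + 12) are an assumption made for this sketch.
    """
    if not (0 <= f1 < 1024 and 0 <= f2 < 1024 and 0 <= dt < 4096):
        raise ValueError("value does not fit its bit field")
    return (f1 << 22) | (f2 << 12) | dt

def fingerprints(peaks, fan_out=3):
    """Pair each anchor with the next `fan_out` peaks (its target zone).

    `peaks` is a list of (time, freq_bin) tuples sorted by time.
    Returns (hash, anchor_time) pairs.
    """
    result = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            result.append((pack_hash(f1, f2, t2 - t1), t1))
    return result

peaks = [(0, 100), (2, 220), (5, 310), (9, 120)]
for h, t1 in fingerprints(peaks, fan_out=2):
    print(f"{h:#010x} at t={t1}")
```

Hashing pairs rather than single points makes each key far more specific, at the cost of storing several entries per anchor.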
MDR - Music Identification - Shazam
How is the song found from the recorded sample?
• Apply the same fingerprinting to the recorded sample.
• Search the database for a match for each hash generated from
the sample.
• Each match yields two times:
– The time of the hash in the sample (th1)
– The time of the hash in the database song (th2)
• Plot the matches on a scatter graph:
– The horizontal axis (X): th2
– The vertical axis (Y): th1
– Each matching pair (th2, th1) is marked with a small circle.
MDR - Music Identification - Shazam
• If the graph contains many (th1, th2) pairs from the same
song, a diagonal line forms.
Scatter graph of a matching run
MDR - Music Identification - Shazam
• Calculate the difference between th2 and th1 (dth) for each match and
plot the values in a histogram.
• If the sample matches the song, many dth values will be the same,
which shows up as a tall peak in the histogram.
Histogram of a matching run
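The histogram test can be sketched with a counter per candidate track. The database layout (hash mapped to a list of track ID and time pairs) and the toy values are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def best_match(sample_prints, database):
    """Score each track by its most common time-offset difference (dth).

    `sample_prints`: list of (hash, th1) from the recorded sample.
    `database`: dict mapping hash -> list of (track_id, th2).
    Returns (track_id, votes) for the best-scoring track, or None.
    """
    histograms = defaultdict(Counter)
    for h, th1 in sample_prints:
        for track_id, th2 in database.get(h, []):
            histograms[track_id][th2 - th1] += 1  # one vote for this dth
    best = None
    for track_id, hist in histograms.items():
        _, votes = hist.most_common(1)[0]
        if best is None or votes > best[1]:
            best = (track_id, votes)
    return best

database = {
    0xA1: [("song_x", 40), ("song_y", 7)],
    0xB2: [("song_x", 43)],
    0xC3: [("song_x", 47), ("song_y", 90)],
}
sample = [(0xA1, 0), (0xB2, 3), (0xC3, 7)]
print(best_match(sample, database))  # ('song_x', 3)
```

Here every match against song_x has dth = 40, so its histogram has one tall bin, while the matches against song_y are scattered.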
MDR – Similarity Search
• The concept of similarity is less specific than identity.
• There are many different types of musical similarity.
– Two different performances played from the same
notation
– Same composer
– Same function, for example dances
– Same genre
– Same culture
Query by Humming
QBH – Query Formatting
QBH – Query Comparison
• The elements in the database must
have the same representation as
the query.
• Ex. Dynamic Time Warping
Dynamic Time Warping
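A minimal sketch of dynamic time warping on two pitch sequences is shown below. It uses the textbook dynamic-programming recurrence with absolute difference as the local cost; representing melodies as lists of MIDI note numbers is an assumption of this example.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

melody = [60, 62, 64, 62, 60]
hummed_slowly = [60, 60, 62, 62, 64, 64, 62, 60]
print(dtw_distance(melody, hummed_slowly))  # 0.0
print(dtw_distance(melody, [60, 61, 60]) > 0)  # True
```

Because the warping path may repeat elements, a query hummed at a different tempo can still align with the stored melody at zero cost.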
QBH – Ranking evaluation measures
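A. Mean Reciprocal Rank (MRR): the mean of the reciprocal ranks 1/r
of the correct tune over all queries.
• E.g., for three queries with the correct tune at ranks 3, 2, and 1:
MRR = (1/3 + 1/2 + 1)/3 = 11/18, or about 0.61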
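The arithmetic above can be checked with a short sketch. It uses exact fractions so the 11/18 result is visible; the function name is chosen for this example.

```python
from fractions import Fraction

def mean_reciprocal_rank(ranks):
    """Mean of 1/r over the rank r of the correct tune for each query."""
    return sum(Fraction(1, r) for r in ranks) / len(ranks)

mrr = mean_reciprocal_rank([3, 2, 1])
print(mrr, round(float(mrr), 2))  # 11/18 0.61
```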
QBH – Ranking evaluation measures
B. Top-X Hit Rate
• Checks whether the position r of the correct result is
within the first X positions of the returned ranking.
• Mathematically: r(Qi) ≤ X.
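Averaged over a set of queries, the condition r(Qi) ≤ X gives a hit rate. A minimal sketch, with made-up ranks:

```python
def top_x_hit_rate(ranks, x):
    """Fraction of queries whose correct tune is ranked within the first x."""
    hits = sum(1 for r in ranks if r <= x)
    return hits / len(ranks)

ranks = [3, 2, 1, 7]  # hypothetical rank of the correct tune for four queries
print(top_x_hit_rate(ranks, x=3))  # 0.75
print(top_x_hit_rate(ranks, x=1))  # 0.25
```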
Emotions
Emotions?
• Music is a language of emotion.
• Users often want to listen to music in a certain category
of emotions, or to music that puts them
in a certain mood.
• What affects the mood of a song?
– Harmony
– Timbre
– Interpretation
– Lyrics
Challenging Problem
• Ambiguous
– Due to the ambiguities of human emotions.
– Mood interpretation and perception differ between
individuals
• Cross-disciplinary endeavor
– Signal processing
– Machine learning
– Understanding of auditory perception, psychology, and
music theory.
• The mood of a piece may change over its duration
Different Methods
• Contextual text information
– Websites
– Tags
– Lyrics
• Content-based approaches
– Audio
– Images
– Videos
• Combining multiple feature domains
– Audio & Lyrics
– Audio & Tags
– Audio & Images (album covers, artist photos, etc.)
Contextual text information
• Web-Documents
– Artist biographies, album reviews, and song
reviews are rich sources of information about
music.
– Collected from the Internet by
• querying search engines
• monitoring MP3 blogs
• crawling a music website
– Can be noisy
Mood Representation
Categorical psychometrics
• A set of emotional descriptors (tags)
Scalar/dimensional psychometrics
• Mood can be scaled and measured by a
continuum of descriptors or simple
multidimensional metrics.
• Most noted: the two-dimensional Valence-Arousal
(V-A) space
Hevner adjective circle
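Valence-Arousal (V-A) space
• Vertical axis: Arousal (Activation to Deactivation)
• Horizontal axis: Valence (Unpleasant to Pleasant)
• High arousal, pleasant: Excited, Elated, Happy
• High arousal, unpleasant: Tense, Stressed, Upset
• Low arousal, unpleasant: Sad, Depressed, Fatigued
• Low arousal, pleasant: Serene, Relaxed, Calm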
Valence-Arousal (V-A) space
• Simple, powerful way of thinking about the spectrum
of human emotions.
• Both valence and arousal can be defined as
subjective experiences (Russell, 1989).
– Valence describes whether the emotion is positive or
negative
– Arousal describes the level of alertness or energy involved
in the emotion.
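One way to read the V-A space is as four quadrants. The sketch below assumes both values are scaled to [-1, 1] with 0 as the neutral point; the scaling, the treatment of boundary values, and the quadrant labels are choices made for this example.

```python
def va_quadrant(valence, arousal):
    """Map a point in V-A space (both in [-1, 1]) to a coarse mood quadrant."""
    if arousal >= 0:
        return "excited/happy" if valence >= 0 else "tense/upset"
    return "calm/relaxed" if valence >= 0 else "sad/depressed"

print(va_quadrant(0.8, 0.7))    # excited/happy
print(va_quadrant(-0.6, 0.5))   # tense/upset
print(va_quadrant(-0.4, -0.7))  # sad/depressed
print(va_quadrant(0.5, -0.6))   # calm/relaxed
```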
Emotion Recognition Problem
• Multi-class, multi-label classification or regression
problem
• A music piece can be:
– an entire song
– a section of a song (e.g., chorus, verse)
– a fixed-length clip (e.g., 30-second song snippet)
– a short-term segment (e.g., 1 second)
Emotion Classification System
Mood Representation - Vectors
• A single multi-dimensional vector
– Each dimension represents a single emotion (e.g., angry)
or a bipolar pair of emotions (e.g., positive/negative).
• A time series of vectors over a semantic space of emotions
– Tracks changes in emotional content over the
duration of a piece
Mood Representation - Vector Values
• A binary label
– The presence or absence of the emotion
• A real-valued score
– E.g., a Likert scale value
– A probability estimate
• A Likert scale is a psychometric scale commonly used in
research that employs questionnaires. It is the most widely used
approach to scaling responses in survey research.
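The two kinds of vector values can be illustrated with a small sketch. The four-emotion tag set and the 5-point Likert scale are assumptions for the example, as is rescaling ratings to the range [0, 1].

```python
EMOTIONS = ["happy", "sad", "angry", "calm"]  # hypothetical tag set

def binary_labels(tags):
    """1 if the emotion tag is present, else 0."""
    return [1 if e in tags else 0 for e in EMOTIONS]

def likert_to_score(rating, points=5):
    """Rescale a Likert rating in 1..points to a score in [0, 1]."""
    return (rating - 1) / (points - 1)

print(binary_labels({"happy", "calm"}))            # [1, 0, 0, 1]
print([likert_to_score(r) for r in (5, 1, 2, 4)])  # [1.0, 0.0, 0.25, 0.75]
```

A time series of such vectors, one per segment, would track how the mood changes over a piece.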
Emotion Classification System
Annotation
• Labeling tasks are time-consuming, tedious, and
expensive
• Online games (“Games With a Purpose”) can be used to collect labels.
Features
Timbre Features
• Musical instruments usually produce sound waves containing
several frequencies
• The lowest frequency is
– The fundamental frequency f0
– Closely related to pitch
• The second and higher frequencies are
– Called overtones
Editor's Notes

  1. Two well-known examples of music-recommendation systems are Pandora Radio, which is a content-based system, and Last.fm, which is regarded as a metadata-based system.
  2. For example, playing A4 on a guitar will actually result in a sound composed of the following frequencies: 440 Hz, 880 Hz, 1320 Hz, 1760 Hz, etc. The particular strength, or amplitude, of the frequencies results in the timbre.
  3. http://www.mblondel.org/journal/2009/08/31/dynamic-time-warping-theory/
  4. When a user produces a query Q related to a certain tune A, the QBH system returns a ranked list of length N in which tune A is located at position r. The reciprocal rank for that query is defined as 1/r. Mean Reciprocal Rank: the mean value of the reciprocal ranks obtained when the system is evaluated with n queries.
  5. By doing this, we can obtain the average of how many times the QBH system retrieves the correct result among the first X positions.