Anat Gilboa's thesis presentation to the University of Virginia School of Engineering and Applied Science.
Over the course of her final semester, Anat and Dr. Qi tested various methods for determining similarity between songs, using features extracted from metadata in the Million Song Dataset.
2. The Journey
Music Information Retrieval 101
Constructing approaches to a not-so-well-defined problem
Finding good data
Simplifying the problem
Data visualization
Finding not-so-good data
Fall 2014
Today!
Iterating
Machine Learning 101
Spring 2015
3. Let’s find out…
• What makes one song similar to another?
• What are the characteristics by which we can “classify” the genre of a song?
The Problem
5. Music Information Retrieval 101
• Aims to extend the understanding and usefulness of music data through research, development, and application of computational approaches and tools
• Combines concepts and techniques from music, computer science, signal processing, and cognition
• Music information: bibliographical, surveys, tags, scores, MIDI, audio, etc.
7. • Collection of audio features and metadata for 1,000,000 contemporary popular music tracks
• 44,745 unique artists, with dated tracks starting from 1922
• 10,000-song subset (1%, 1.8 GB)
• Each song has a number of features…
The Million Song Dataset
8. loudness
mode
mode confidence
release
release 7digitalid
sections confidence
sections start
segments confidence
segments loudness max
segments loudness max time
segments loudness max start
segments pitches
segments start
segments timbre
similar artists
song hotttnesss
song id
start of fade out
tatums confidence
tatums start
tempo
time signature
time signature confidence
title
track id
track 7digitalid
year
analysis sample rate
artist 7digitalid
artist familiarity
artist hotttnesss
artist id
artist latitude
artist location
artist longitude
artist mbid
artist mbtags
artist mbtags count
artist name
artist playmeid
artist terms
artist terms freq
artist terms weight
audio md5
bars confidence
bars start
beats confidence
beats start
danceability
duration
end of fade in
energy
key
key confidence
key
tempo
Song Fields
9. Numerical Features
Danceability - how danceable a song is. 0 is least danceable, 100 is most danceable.
Duration - the length of the song in seconds.
Energy - the overall energy of the song, 0 is least, 100 is most.
Hotttnesss - the popularity of the song, 0 is least, 100 is most.
Key - the key of the song. 0 is C, 1 is C# and so on.
Liveness - the likelihood that a song was performed in front of an audience. Above 80 is usually live.
Loudness - the overall loudness of the song in decibels.
Mode - the mode of the song, where minor is 0 and major is 1.
Speechiness - how much spoken word is in the song. 0 is least, 100 is most.
Tempo - the most frequently occurring tempo in the song, in beats-per-minute.
Time signature - the number of beats per measure in the song.
Acousticness - how acoustic vs. electric the song is.
Valence - how positive or negative the mood of the song is.
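As a small illustration of the Key encoding described above (0 is C, 1 is C#, and so on), the integer can be mapped back to a pitch name. This is only a sketch: the helper name `key_name` and the choice of sharp spellings throughout are assumptions, not part of the dataset.

```python
# Pitch classes in semitone order, starting at C (sharp spellings assumed).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key):
    """Return the pitch name for an integer key (hypothetical helper).

    The modulo keeps the lookup inside the 12 pitch classes.
    """
    return PITCH_CLASSES[key % 12]


for k in (0, 1, 7, 11):
    print(k, key_name(k))
```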
11. • 8,761 songs
• (thanks, API request timeouts & rate limiting)
• 307 genres, extracted from the Artist API
• k-means centroids
• 3,944 artists
• Between 1 and 11 appearances each in the set
The Facts
12. • Use k-means to create centroids for each genre
• Hypothesis: If there are 307 genres represented, would each be in the same cluster?
• Create a k-nearest-neighbor tool to fetch the k nearest songs to some specified data point
• f(Tempo, Key, K)
Tasks
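The nearest-neighbor task, f(Tempo, Key, K), could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool built for the thesis: the function names, the toy song list, the 250 BPM ceiling used to scale tempo, and the choice to treat key as a circular distance over 12 pitch classes are all hypothetical.

```python
import math

# Hypothetical toy data: (title, tempo in BPM, key as pitch class 0-11).
SONGS = [
    ("song_a", 120.0, 0),
    ("song_b", 122.0, 1),
    ("song_c", 90.0, 7),
    ("song_d", 121.0, 11),
    ("song_e", 180.0, 0),
]

MAX_TEMPO = 250.0  # assumed ceiling, used only to scale tempo differences


def key_distance(key_a, key_b):
    """Circular distance between two pitch classes, scaled to [0, 1]."""
    diff = abs(key_a - key_b) % 12
    return min(diff, 12 - diff) / 6.0


def song_distance(tempo_a, key_a, tempo_b, key_b):
    """Euclidean distance over scaled tempo and circular key terms."""
    tempo_term = (tempo_a - tempo_b) / MAX_TEMPO
    key_term = key_distance(key_a, key_b)
    return math.sqrt(tempo_term ** 2 + key_term ** 2)


def k_nearest_songs(tempo, query_key, k, songs=SONGS):
    """Return the k songs closest to the query point (tempo, query_key)."""
    ranked = sorted(
        songs,
        key=lambda s: song_distance(tempo, query_key, s[1], s[2]),
    )
    return ranked[:k]


# Fetch the three songs nearest to 120 BPM in the key of C.
for title, song_tempo, song_key in k_nearest_songs(120.0, 0, 3):
    print(title, song_tempo, song_key)
```

Scaling matters here: without it, tempo differences of tens of BPM would swamp key differences of at most a few semitones. The circular key term means B (11) counts as one semitone from C (0) rather than eleven.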
The International Society for Music Information Retrieval (ISMIR) is a non-profit organisation which, among other things, oversees the organisation of the ISMIR Conference. The ISMIR Conference is held annually and is the world's leading research forum on processing, searching, organising and accessing music-related data.
The RWC Music Database comprises six original collections: the Popular Music Database (100 songs), Royalty-Free Music Database (15 songs), Classical Music Database (50 pieces), Jazz Music Database (50 pieces), Music Genre Database (100 pieces), and Musical Instrument Sound Database (50 instruments).
MSD is a freely-available collection of audio features and metadata for a million contemporary popular music tracks.
By the way, this is metadata… I didn’t casually download 10,000 songs and build a Hadoop cluster to compute on them, although this could potentially go there…
Each song has a number of features but we’re interested in
I met an engineer who represented Spotify,
Not entirely sure why Aerosmith and Red Hot Chili Peppers have 11 songs, but maybe it’s because they came out with more songs, too.