http://mac.citi.sinica.edu.tw/~yang/
yhyang@ailabs.tw
yang@citi.sinica.edu.tw
Yi-Hsuan Yang Ph.D. 1,2
1 Taiwan AI Labs
2 Research Center for IT Innovation, Academia Sinica
October 26, 2021
About Me
• Ph.D. National Taiwan University 2010
• Research Professor, Music & AI Lab, Academia Sinica, since 2011
• Chief Music Scientist, Taiwan AI Labs, 2019/3‒2023/2
• Over 200 publications
About the Music and AI Lab @ Sinica
About Academia Sinica
‒ National academy of Taiwan, founded in 1928
‒ About 1,000 Full/Associate/Assistant Researchers
About Music and AI Lab (musicai)
‒ Since Sep 2011
‒ Members: PI [me], research assistants, PhD/master students
About the Music AI Team @ Taiwan AI Labs
About Taiwan AI Labs
‒ Privately-funded research organization, founded by Ethan Tu (PTT) in 2017
‒ Three main research areas: 1) HCI, 2) medicine, 3) smart city
About the Music AI team
‒ Members: scientist [me; since March 2019], ML engineers (for models), musicians, program manager, software engineers (for frontend/backend)
(an image of our musicians)
Outline
• Types of music related research/products
• Fundamentals of music signal processing
• Types of data
Types of Music Related Research/Products
• Intelligent ways to analyze, retrieve, and create music
1. Music information analysis: music → features
2. Music information retrieval: query → music
3. Music generation: X → music
Types of Music Related Research/Products
1. Music information
“analysis”
automatic page turner
automatic Karaoke scoring
interactive concert
Types of Music Related Research/Products
1. Music information
“analysis”
chord recognizer
music browsing assistant
Types of Music Related Research/Products
2. Music information “retrieval”
• Search
‒ through keywords/labels (genre, instrument, emotion)
musical event localization
J.-Y. Liu and Y.-H. Yang, "Event localization in music auto-tagging," MM 2016
Types of Music Related Research/Products
2. Music information “retrieval”
• Search
‒ through keywords/labels (genre, instrument, emotion)
‒ through audio examples (humming, audio recording)
Types of Music Related Research/Products
2. Music information “retrieval”
• Match
‒ to match 1) a video clip, 2) a photo slideshow, 3) song lyrics, or 4) a given context
‒ cross-domain retrieval
Types of Music Related Research/Products
2. Music information “retrieval”
• Discover
‒ recommendation: diversity, serendipity, explanations
Types of Music Related Research/Products
2. Music information “retrieval”
• Discover
‒ Context-aware music recommendation
Context / Music / User
Context
• Activity: driving, studying, working, walking
• Mood: happy, sad, angry, relaxed
• Location: home, work, public place
• Social company: alone, w/ friends, w/ strangers
User
• age
• gender
• personality
• cultural background
• musical background
Types of Music Related Research/Products
3. Music creation
https://www.youtube.com/watch?v=k1DgNfz1g_s
http://www.inside.com.tw/2016/05/04/positive-grid-bias-head
https://youtu.be/rL5YKZ9ecpg?t=50m
Types of Music Related Research/Products
1. Music information analysis
• Education, data visualization
2. Music information retrieval
• Search: through keywords (genre, instrument, emotion) or
audio examples (humming or audio recording)
• Match: cross domain retrieval
• Discover: recommendation
3. Music creation
• Google Magenta, Smule AutoRap, Samsung Hum-On,
Positive Grid, Yamaha Vocaloid
ML in Music: “Music Info Retrieval/Analysis”
AI listener: audio (existing song) → score, labels
Music transcription (audio2score)
• audio → note (pitch, onset, offset)
• audio → instrument (flute, cello)
• audio → meter (4/4)
• audio → key (E-flat major)
Music semantic labeling
• audio → genre (classical)
• audio → emotion (yearning)
• audio → other attributes (slow/fast)
Applications in music retrieval, education, archival, etc.
ML in Music: “Music Generation/Synthesis”
AI composer: random seed → score (new song)
AI performer (score2audio): score → audio
ML in Music: “Music Generation/Synthesis”
AI listener: audio (existing songs) → features, score, labels
AI DJ: features → audio (a new song); remix, mashup, etc.
(image from the Internet)
Music AI Research
• Four broad topics
‒ audio → audio: signal processing
‒ audio → score: transcription
‒ score → score: composition
‒ score → audio: synthesis
Fundamentals of Music Signal Processing
• Pitch: which notes are played?
• Rhythm: how fast?
• Timbre: which instrument(s)?
Mozart’s Variationen
(1st phrase)
Music Information Analysis
• music → features
melody
1. pitch
2. onset, offset
3. tempo
accompaniment
4. chord
5. instruments (timbre)
6. source separation
7. key, beat, downbeat, meter
8. Semantic description
─ genre: pop, classical, jazz, rock, R&B, “Tai,” “aboriginal”
─ emotion: happy, angry, sad, relaxed
─ usage: at party, working, driving, reading, sleeping, romance
─ theme: lonely, breakup, celebration, in love, friend, battle
─ vocal timbre: aggressive, breathy, duet, emotional, rapping, screaming
genre listening context
emotion
Fundamentals of Music Signal Processing
Pitch ♪♪♪ ♪♪♪ ♪♪♪
Rhythm
Timbre ♪♪
Karaoke scorer; chord recognizer; page turner
Fundamentals of Music Signal Processing
Pitch ♪♪♪ ♪
Rhythm ♪♪♪
Timbre ♪♪♪ ♪
instrument classifier; content ID; Spotify running
Fundamentals of Music Signal Processing
Pitch ♪♪♪ ♪♪♪ ♪♪♪
Rhythm ♪♪♪ ♪♪♪ ♪♪♪
Timbre ♪♪♪ ♪♪♪ ♪♪♪
similarity search or recommendation; music emotion or genre recognizer; automatic music video generation
Fundamentals of Music Signal Processing
• Listens to music
tempo, instrumentation,
key, time signature, energy,
harmonic & timbral structures
• Reads about music
lyrics, blog posts, reviews,
playlists and discussion forums
• Learns about trends
online music behavior — who's
talking about which artists this
week, what songs are being
streamed or downloaded
• Not everything is in audio
Fundamentals of Music Signal Processing
• Let’s have a look at what we can extract from audio
anyway
• Time-domain waveform
Fundamentals of Music Signal Processing
• Frequency domain
representation
• Spectrogram (obtained
by Short-Time Fourier
Transform)
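A minimal sketch of how a spectrogram can be obtained with the Short-Time Fourier Transform, using only NumPy. The function name `stft_magnitude`, the Hann window, the frame and hop sizes, and the 16 kHz test tone are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def stft_magnitude(signal, frame_length=1024, hop_length=256):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    # Cut the waveform into overlapping, windowed frames
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1)).T

sample_rate = 16000                        # assumed sampling rate in Hz
n_fft = 1024
t = np.arange(sample_rate) / sample_rate   # one second of audio
tone = np.sin(2 * np.pi * 440.0 * t)       # synthetic A4 test tone
spec = stft_magnitude(tone, frame_length=n_fft)

# The strongest frequency bin should sit next to 440 Hz
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / n_fft
```

Longer frames give finer frequency resolution but coarser time resolution, which is the usual trade-off when choosing `frame_length`.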
Fundamentals of Music Signal Processing
• Pitch
• Simple for monophonic
signals (almost table
lookup)
• Challenging for polyphonic signals; known as multi-pitch estimation (MPE)
‒ overlapping partials
‒ missing fundamentals
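The "almost table lookup" for monophonic signals can be illustrated as follows: once a fundamental frequency has been estimated, mapping it to a note name is a short formula plus a 12-entry table. This sketch assumes equal temperament with A4 = 440 Hz and MIDI note numbering; `freq_to_note` is a hypothetical helper, and estimating the frequency itself is not shown.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq_hz, a4_hz=440.0):
    """Map a fundamental frequency to the nearest equal-tempered note name."""
    # MIDI note 69 is A4; each semitone is a factor of 2 ** (1 / 12)
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"

print(freq_to_note(440.0))    # A4
print(freq_to_note(261.63))   # C4 (middle C)
```

For polyphonic signals no such direct mapping exists, because the partials of simultaneous notes overlap in the spectrum.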
L. Su and Y.-H. Yang, "Combining spectral and temporal representations for multipitch estimation of polyphonic music," TASLP 2015
Fundamentals of Music Signal Processing
• Tempo: beats per minute (bpm)
• Onset detection, downbeat estimation, tempo estimation, beat tracking, rhythm pattern extraction
energy-based spectrum-based
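As a rough illustration of the energy-based route, the sketch below builds an onset envelope from frame-wise energy increases and reads the tempo off the envelope's autocorrelation. The function names, the 60 to 180 bpm search range, and the synthetic click track are assumptions made for this example; practical beat trackers are considerably more elaborate.

```python
import numpy as np

def onset_envelope(signal, frame_length=512, hop_length=256):
    """Energy-based onset strength: positive changes in frame energy."""
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    energy = np.array([
        np.sum(signal[i * hop_length : i * hop_length + frame_length] ** 2)
        for i in range(n_frames)
    ])
    # An onset shows up as a sudden rise in energy; keep only the increases
    return np.maximum(0.0, np.diff(energy, prepend=energy[0]))

def estimate_tempo(envelope, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Pick the lag with the strongest autocorrelation inside the bpm range."""
    env = envelope - envelope.mean()
    autocorr = np.correlate(env, env, mode="full")[len(env) - 1:]
    min_lag = int(round(frame_rate * 60.0 / max_bpm))
    max_lag = int(round(frame_rate * 60.0 / min_bpm))
    best_lag = min_lag + int(np.argmax(autocorr[min_lag : max_lag + 1]))
    return 60.0 * frame_rate / best_lag

# Synthetic click track at 120 bpm; the sampling rate is chosen so that
# one beat spans a whole number of frames (an assumption for a clean demo)
sample_rate = 8192
hop_length = 256
signal = np.zeros(8 * sample_rate)
beat_period = sample_rate // 2            # 0.5 s between clicks
for start in range(0, len(signal), beat_period):
    signal[start : start + 64] = 1.0

envelope = onset_envelope(signal, hop_length=hop_length)
tempo = estimate_tempo(envelope, frame_rate=sample_rate / hop_length)
```

A spectrum-based variant would replace the frame energy with a measure of change between consecutive spectrogram columns.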
Fundamentals of Music Signal Processing
• Timbre: difference in time-frequency distribution
‒ odd-to-even harmonic ratio, decay rate, vibrato, etc.
piano solo human voice
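One of the timbre descriptors named above, the odd-to-even harmonic ratio, can be sketched as the energy in the odd-numbered harmonics divided by the energy in the even-numbered ones. This is one common definition among several; the function name and the example amplitude lists are hypothetical, and extracting harmonic amplitudes from audio is not shown.

```python
def odd_to_even_ratio(harmonic_amplitudes):
    """Energy ratio of odd to even harmonics.

    harmonic_amplitudes[0] is the fundamental (1st harmonic),
    harmonic_amplitudes[1] the 2nd harmonic, and so on.
    """
    odd = sum(a * a for a in harmonic_amplitudes[0::2])    # 1st, 3rd, 5th, ...
    even = sum(a * a for a in harmonic_amplitudes[1::2])   # 2nd, 4th, 6th, ...
    return odd / even if even > 0 else float("inf")

# Made-up amplitude profiles for illustration only
odd_dominant = [1.0, 0.1, 0.5, 0.05]       # weak even harmonics
sawtooth_like = [1.0, 1 / 2, 1 / 3, 1 / 4]  # amplitudes fall off as 1/n

ratio_odd_dominant = odd_to_even_ratio(odd_dominant)
ratio_sawtooth = odd_to_even_ratio(sawtooth_like)
```

A tone whose even harmonics are weak yields a much larger ratio than one whose harmonics decay smoothly, which is why the descriptor helps tell instruments apart.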
Fundamentals of Music Signal Processing
• Spectrogram, or its reduced-dimension version, the “Mel-spectrogram,” is usually considered a “raw” feature representation of music
• Can be treated as an image and then processed by
convolutional neural nets (CNN)
figure made by Sander Dieleman
http://benanne.github.io/2014/08/05/spotify-cnns.html
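To make the "reduced-dimension" idea concrete, the sketch below builds a bank of triangular filters spaced evenly on the mel scale and applies it to a spectrogram by matrix multiplication. It assumes the widely used formula mel = 2595 · log10(1 + f / 700); the function names, the 40 mel bands, and the random stand-in spectrogram are illustrative choices, not from the slides.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters with shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # Filter edges are equally spaced in mel, hence denser at low frequencies
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, center, hi = hz_edges[m], hz_edges[m + 1], hz_edges[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

bank = mel_filterbank(n_mels=40, n_fft=1024, sample_rate=16000)

# Stand-in magnitude spectrogram (513 frequency bins x 59 frames)
spectrogram = np.abs(np.random.default_rng(0).normal(size=(513, 59)))
mel_spec = bank @ spectrogram   # 40 mel bands x 59 frames
```

The resulting 2-D array is what would be handed to a CNN as an "image", typically after taking the logarithm.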
Fundamentals of Music Signal Processing
• Chromagram: a better “timbre-invariant” feature
representation for pitch related tasks (e.g. chord
recognition, cover song identification)
‒ merge all the frequency bins
with the same note name
(C, C#, D, D#, …)
‒ 12-dim vector for each
time frame
figure made by Meinard Müller
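The "merge all the frequency bins with the same note name" step can be sketched as below: each spectrogram bin is mapped to its nearest equal-tempered pitch and added to one of 12 pitch classes. The function name, the A4 = 440 Hz tuning, and the toy two-bin input are assumptions for illustration; practical chroma features usually add tuning estimation and smoothing.

```python
import numpy as np

def chromagram(spectrogram, sample_rate, n_fft, a4_hz=440.0):
    """Fold a (n_fft // 2 + 1, n_frames) magnitude spectrogram into 12 pitch classes.

    Row 0 is C, row 1 is C#, ..., row 9 is A, row 11 is B.
    """
    freqs = np.arange(spectrogram.shape[0]) * sample_rate / n_fft
    chroma = np.zeros((12, spectrogram.shape[1]))
    for k in range(1, len(freqs)):   # skip the DC bin, which has no pitch
        midi = int(round(69 + 12 * np.log2(freqs[k] / a4_hz)))
        chroma[midi % 12] += spectrogram[k]
    return chroma

sample_rate, n_fft = 16000, 4096
spec = np.zeros((n_fft // 2 + 1, 3))
spec[113] = 1.0   # bin closest to 440 Hz (A4)
spec[225] = 1.0   # bin closest to 880 Hz (A5)

chroma = chromagram(spec, sample_rate, n_fft)
# Both octaves land in the same pitch class, row 9 (A)
```

Because octave information is discarded, the same chord played in different registers or on different instruments yields a similar chroma vector.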
• Source separation can sometimes be helpful
‒ harmonic/percussion separation: given a mixture, separate
the percussive part from the harmonic part
‒ harmonic: pitch related info
‒ percussive: tempo related info
Fundamentals of Music Signal Processing
(a) original (b) harmonic (c) percussive
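A minimal sketch of one well-known way to do harmonic/percussive separation, median filtering of the spectrogram: sustained tones form horizontal ridges (smooth along time), while drum hits form vertical ridges (smooth along frequency). This is not necessarily the method behind the figure; the function names, the filter length of 17, and the binary masking are assumptions, and `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np

def median_filter_1d(x, size, axis):
    """Running median of odd length `size` along `axis`, with edge padding."""
    pad = [(0, 0)] * x.ndim
    pad[axis] = (size // 2, size // 2)
    padded = np.pad(x, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)

def hpss(spectrogram, size=17):
    """Split a (freq, time) magnitude spectrogram into harmonic and percussive parts."""
    harmonic_like = median_filter_1d(spectrogram, size, axis=1)    # smooth across time
    percussive_like = median_filter_1d(spectrogram, size, axis=0)  # smooth across frequency
    harmonic_mask = harmonic_like >= percussive_like
    return spectrogram * harmonic_mask, spectrogram * ~harmonic_mask

# Toy spectrogram: one sustained tone (row 10) and one drum hit (column 30)
spec = np.zeros((64, 64))
spec[10, :] = 1.0
spec[:, 30] = 1.0

harmonic, percussive = hpss(spec)
```

In this toy case the horizontal line ends up in `harmonic` and the vertical line in `percussive`, and the two parts add back up to the input.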
• Source separation can sometimes be helpful
‒ singing voice separation: given a mixture, separate the
singing voice from the accompaniment
Fundamentals of Music Signal Processing
Fundamentals of Music Signal Processing
• Pitch, tempo, timbre play different roles in different
tasks
• Spectrogram: a basic feature representation
• Multipitch estimation: for better pitch info
• Source separation: might improve the extraction for
pitch, tempo and also timbre
• Feature design (based on domain knowledge) versus
feature learning (data-driven; deep learning)
Types of Data
• Music audio data
─ not sharable due to copyright issues and business interests
─ however, audio features can be shared
─ or, start with copyright-free music (e.g., Free Music Archive)
Types of Data
• Music listening data
‒ from social platforms via e.g., last.fm API, Spotify API
‒ from Twitter: #nowplaying dataset
Types of Data
• Big music text data
─ score, lyrics, review, playlist, tags, Wikipedia, etc
─ not everything is in audio
─ some information is easier to get from non-audio data
Types of Data
• Big sensor data?
─ sensors attached to “things” or “human beings”
Data Science in Music
• The missing “D” in Data Science —
domain knowledge
• Music information retrieval
= musicology
+ signal processing
+ machine learning
+ others
Resources
• Conference proceedings
‒ Int’l Soc. Music Information Retrieval Conf. (ISMIR)
‒ Int’l Conf. Acoustics, Speech, and Signal Processing (ICASSP)
‒ AAAI, IJCAI, ICML, NeurIPS, ICLR, ACM MM
• Transactions
‒ Transactions of the Int’l Soc. Music Information Retrieval (TISMIR)
‒ IEEE/ACM Trans. Audio, Speech, and Language Processing (TASLP)
‒ IEEE Trans. Multimedia (TMM)
Resources
• MIREX (MIR Evaluation eXchange)
‒ Part of ISMIR
‒ http://www.music-ir.org/mirex/wiki/MIREX_HOME
‒ Audio Onset Detection
‒ Audio Beat Tracking
‒ Audio Key Detection
‒ Audio Downbeat Detection
‒ Real-time Audio to Score Alignment (a.k.a. Score Following)
‒ Audio Cover Song Identification
‒ Discovery of Repeated Themes & Sections
‒ Audio Melody Extraction
‒ Query by Singing/Humming
‒ Audio Chord Estimation
‒ Singing Voice Separation
‒ Audio Fingerprinting
‒ Music/Speech Classification/Detection
‒ Audio Offset Detection
Resources
• Courses
‒ Juhan Nam @ KAIST
https://mac.kaist.ac.kr/~juhan/gct634/index.html
‒ Meinard Müller @ Universität Erlangen-Nürnberg
https://www.audiolabs-erlangen.de/fau/professor/mueller/teaching
‒ Juan Bello @ NYU
https://wp.nyu.edu/jpbello/teaching/mir/
‒ CCRMA summer school @ Stanford
https://ccrma.stanford.edu/workshops/music-information-retrieval-mir-2015
‒ Xavier Serra @ UPF, Spain
https://zh-tw.coursera.org/course/audio

FIDO Alliance
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
ViralQR
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 

20211026 taicca 1 intro to mir

  • 1. http://mac.citi.sinica.edu.tw/~yang/ yhyang@ailabs.tw yang@citi.sinica.edu.tw Yi-Hsuan Yang Ph.D. 1,2 1 Taiwan AI Labs 2 Research Center for IT Innovation, Academia Sinica October 26, 2021
  • 2. About Me • Ph.D. National Taiwan University 2010 • Research Professor, Music & AI Lab, Academia Sinica, since 2011 • Chief Music Scientist, Taiwan AI Labs, 2019/3‒2023/2 • Over 200 publications
  • 3. About the Music and AI Lab @ Sinica About Academia Sinica • National academy of Taiwan, founded in 1928 • About 1,000 Full/Associate/Assistant Researchers About Music and AI Lab (musicai) • Since Sep 2011 • Members ‒ PI [me] ‒ research assistants ‒ PhD/master students
  • 4. About the Music AI Team @ About Taiwan AI Labs • Privately-funded research organization, founded by Ethan Tu (PTT) in 2017 • Three main research areas: 1) HCI, 2) medicine, 3) smart city About the Music AI team • Members ‒ scientist [me; since March 2019] ‒ ML engineers (for models) ‒ musicians ‒ program manager ‒ software engineers (for frontend/backend) (an image of our musicians)
  • 5. Outline • Types of music related research/products • Fundamentals of music signal processing • Types of data
  • 6. Types of Music Related Research/Products • Intelligent ways to analyze, retrieve, and create music 1. Music information analysis (music → features) 2. Music information retrieval (query → music) 3. Music generation (X → music)
  • 7. Types of Music Related Research/Products 1. Music information “analysis”: automatic page turner, automatic Karaoke scoring, interactive concert
  • 8. Types of Music Related Research/Products 1. Music information “analysis”: chord recognizer, music browsing assistant
  • 9. Types of Music Related Research/Products 2. Music information “retrieval” • Search ‒ through keywords/labels (genre, instrument, emotion)
  • 10. Types of Music Related Research/Products 2. Music information “retrieval” • Search ‒ through keywords/labels (genre, instrument, emotion) musical event localization J.-Y. Liu and Y.-H. Yang, "Event localization in music auto-tagging," MM 2016
  • 11. Types of Music Related Research/Products 2. Music information “retrieval” • Search ‒ through keywords/labels (genre, instrument, emotion) ‒ through audio examples (humming, audio recording)
  • 12. Types of Music Related Research/Products 2. Music information “retrieval” • Search ‒ through keywords/labels (genre, instrument, emotion) ‒ through audio examples (humming, audio recording) …
  • 13. Types of Music Related Research/Products 2. Music information “retrieval” • Match ‒ to match 1) a video clip, 2) a photo slideshow, 3) song lyrics, or 4) a given context ‒ cross-domain retrieval
  • 14. Types of Music Related Research/Products 2. Music information “retrieval” • Discover ‒ recommendation: diversity, serendipity, explanations
  • 15. Types of Music Related Research/Products 2. Music information “retrieval” • Discover ‒ recommendation: diversity, serendipity, explanations
  • 16. Types of Music Related Research/Products 2. Music information “retrieval” • Discover ‒ Context-aware music recommendation (Context, Music, User) • Context ‒ Activity: driving, studying, working, walking ‒ Mood: happy, sad, angry, relaxed ‒ Location: home, work, public place ‒ Social company: alone, w/ friends, w/ strangers • User ‒ age ‒ gender ‒ personality ‒ cultural background ‒ musical background
  • 17. Types of Music Related Research/Products 3. Music creation
  • 18. Types of Music Related Research/Products 3. Music creation https://www.youtube.com/watch?v=k1DgNfz1g_s
  • 19. Types of Music Related Research/Products 3. Music creation
  • 20. Types of Music Related Research/Products 3. Music creation http://www.inside.com.tw/2016/05/04/positive-grid-bias-head
  • 21. Types of Music Related Research/Products 3. Music creation https://youtu.be/rL5YKZ9ecpg?t=50m
  • 22. Types of Music Related Research/Products 1. Music information analysis • Education, data visualization 2. Music information retrieval • Search: through keywords (genre, instrument, emotion) or audio examples (humming or audio recording) • Match: cross domain retrieval • Discover: recommendation 3. Music creation • Google Magenta, Smule AutoRap, Samsung Hum-On, Positive Grid, Yamaha Vocaloid
  • 23. ML in Music: “Music Info Retrieval/Analysis” • Music transcription (audio2score) ‒ audio → note (pitch, onset, offset) ‒ audio → instrument (flute, cello) ‒ audio → meter (4/4) ‒ audio → key (E-flat major) • Music semantic labeling ‒ audio → genre (classical) ‒ audio → emotion (yearning) ‒ audio → other attributes (slow/fast) • Applications in music retrieval, education, archival, etc. (diagram labels: audio (existing song), AI listener, score, labels)
  • 24. ML in Music: “Music Generation/Synthesis” • Music transcription (audio2score) ‒ audio → note (pitch, onset, offset) ‒ audio → instrument (flute, cello) ‒ audio → meter (4/4) ‒ audio → key (E-flat major) • Music semantic labeling ‒ audio → genre (classical) ‒ audio → emotion (yearning) ‒ audio → other attributes (slow/fast) (diagram labels: random seed, AI composer, score, AI performer (score2audio), audio (new song), labels)
  • 25. ML in Music: “Music Generation/Synthesis” • Music transcription (audio2score) ‒ audio → note (pitch, onset, offset) ‒ audio → instrument (flute, cello) ‒ audio → meter (4/4) ‒ audio → key (E-flat major) • Music semantic labeling ‒ audio → genre (classical) ‒ audio → emotion (yearning) ‒ audio → other attributes (slow/fast) (diagram labels: audio (existing songs), AI listener, features, score, labels, AI DJ, audio (a new song); remix, mashup, etc.) (image from the Internet)
  • 26. Music AI Research • Four broad topics ‒ audio → audio: signal processing ‒ audio → score: transcription ‒ score → score: composition ‒ score → audio: synthesis
  • 27. Outline • Types of music related research/products • Fundamentals of music signal processing • Types of data
  • 28. Fundamentals of Music Signal Processing • Pitch: which notes are played? • Rhythm: how fast? • Timbre: which instrument(s)? Mozart’s Variationen (1st phrase)
  • 29. Music Information Analysis • music → features melody 1. pitch 2. onset, offset 3. tempo
  • 30. Music Information Analysis • music → features accompaniment 4. chord
  • 31. Music Information Analysis • music → features 5. instruments (timbre)
  • 32. Music Information Analysis • music → features 6. source separation 7. key, beat, downbeat, meter
  • 33. Music Information Analysis • music → features 8. Semantic description ─ genre: pop, classical, jazz, rock, R&B, “Tai,” “aboriginal” ─ emotion: happy, angry, sad, relaxed ─ usage: at party, working, driving, reading, sleeping, romance ─ theme: lonely, breakup, celebration, in love, friend, battle ─ vocal timbre: aggressive, breathy, duet, emotional, rapping, screaming (figure panels: genre, listening context, emotion)
  • 34. Fundamentals of Music Signal Processing Pitch ♪♪♪ ♪♪♪ ♪♪♪ Rhythm Timbre ♪♪ Karaoke scorer chord recognizer page turner
  • 35. Fundamentals of Music Signal Processing Pitch ♪♪♪ ♪ Rhythm ♪♪♪ Timbre ♪♪♪ ♪ instrument classifier content ID Spotify running
  • 36. Fundamentals of Music Signal Processing Pitch ♪♪♪ ♪♪♪ ♪♪♪ Rhythm ♪♪♪ ♪♪♪ ♪♪♪ Timbre ♪♪♪ ♪♪♪ ♪♪♪ similarity search or recommendation music emotion or genre recognizer automatic music video generation
  • 37. Fundamentals of Music Signal Processing • Listens to music: tempo, instrumentation, key, time signature, energy, harmonic & timbral structures • Reads about music: lyrics, blog posts, reviews, playlists and discussion forums • Learns about trends: online music behavior (who's talking about which artists this week, what songs are being streamed or downloaded) • Not everything is in audio
  • 38. Fundamentals of Music Signal Processing • Let’s have a look at what we can extract from audio anyway • Time-domain waveform
  • 39. Fundamentals of Music Signal Processing • Frequency domain representation • Spectrogram (obtained by Short-Time Fourier Transform)
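The Short-Time Fourier Transform mentioned on this slide can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation behind the slides: the function name, the Hann window, the frame and hop sizes, and the 1 kHz test tone are all assumptions chosen for the example, and a real system would use an FFT rather than the plain DFT shown here.

```python
import cmath
import math

def stft_magnitude(signal, frame_size=64, hop_size=32):
    """Return one magnitude spectrum per Hann-windowed frame (hypothetical helper)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # keep only the non-negative frequencies
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        spectrum = []
        for k in range(n_bins):
            # Plain DFT of one frame: O(N^2), fine for a small illustration
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames

# Assumed test signal: 0.1 s of a 1 kHz sine sampled at 8 kHz
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]
spec = stft_magnitude(tone)

# The strongest bin of the first frame, converted back to Hz
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of spec is one column of the spectrogram; stacking the rows over time gives the time-frequency image the slide refers to.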
  • 40. Fundamentals of Music Signal Processing • Pitch • Simple for monophonic signals (almost table lookup) • Challenging for polyphonic signals; known as multi-pitch estimation (MPE) ‒ overlapping partials ‒ missing fundamentals L. Su and Y.-H. Yang, "Combining spectral and temporal representations for multipitch estimation of polyphonic music," TASLP 2015
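The "almost table lookup" for monophonic pitch can be illustrated with the standard equal-temperament mapping from frequency to MIDI note number. This is a sketch under stated assumptions: A4 is tuned to 440 Hz, the fundamental frequency has already been estimated, and the function names are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_midi(freq_hz, a4_hz=440.0):
    """Nearest MIDI note number for a frequency (MIDI 69 = A4, assumed 440 Hz)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

def midi_to_name(midi):
    """Note name with octave number, using the convention that MIDI 60 is C4."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

a4_name = midi_to_name(freq_to_midi(440.0))
middle_c = midi_to_name(freq_to_midi(261.63))
```

This lookup only works once a single fundamental frequency is known; with polyphonic audio, overlapping partials and missing fundamentals make that first step the hard part.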
  • 41. Fundamentals of Music Signal Processing • Tempo: beats per minute (bpm) • Onset detection, downbeat estimation, tempo estimation, beat tracking, rhythm pattern extraction (figure panels: energy-based, spectrum-based)
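An energy-based onset detector and a simple tempo estimate can be sketched as below. This is one common textbook approach (frame energy difference followed by autocorrelation), not necessarily the method shown in the slide's figure; the function names, frame sizes, bpm search range, and the synthetic 120 bpm click track are assumptions made for the example.

```python
def onset_strength(signal, frame_size, hop_size):
    """Half-wave rectified frame-to-frame energy increase (hypothetical helper)."""
    energies = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        energies.append(sum(x * x for x in signal[start:start + frame_size]))
    return [max(0.0, later - earlier)
            for earlier, later in zip(energies, energies[1:])]

def estimate_tempo(strength, frames_per_second, min_bpm=60, max_bpm=180):
    """Pick the beat period whose autocorrelation of onset strength is largest."""
    min_lag = int(round(60 * frames_per_second / max_bpm))
    max_lag = int(round(60 * frames_per_second / min_bpm))
    best_bpm, best_score = None, -1.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(a * b for a, b in zip(strength, strength[lag:]))
        if score > best_score:
            best_bpm, best_score = 60 * frames_per_second / lag, score
    return best_bpm

# Assumed test signal: 8 s of clicks every 0.5 s (120 bpm) at a 1 kHz sample rate
sample_rate = 1000
signal = [0.0] * (8 * sample_rate)
for beat_start in range(0, len(signal), sample_rate // 2):
    for n in range(beat_start, beat_start + 10):
        signal[n] = 1.0

strength = onset_strength(signal, frame_size=20, hop_size=10)
tempo_bpm = estimate_tempo(strength, frames_per_second=sample_rate / 10)
```

Restricting the search to 60 to 180 bpm is a design choice that sidesteps part of the usual octave ambiguity, where half or double the true tempo scores almost as well.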
  • 42. Fundamentals of Music Signal Processing • Timbre: difference in time-frequency distribution
  • 43. Fundamentals of Music Signal Processing • Timbre: difference in time-frequency distribution ‒ odd-to-even harmonic ratio, decay rate, vibrato, etc. (figure panels: piano solo, human voice)
  • 44. Fundamentals of Music Signal Processing • Spectrogram, or its reduced-dimension version, the “Mel-spectrogram,” is usually considered a “raw” feature representation of music • Can be treated as an image and then processed by convolutional neural nets (CNN) (figure made by Sander Dieleman, http://benanne.github.io/2014/08/05/spotify-cnns.html)
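The dimension reduction from spectrogram to Mel-spectrogram can be sketched with a bank of triangular filters spaced evenly on the mel scale. This is a minimal sketch, assuming the widely used formula mel = 2595 log10(1 + f/700); the function names, the 8-band bank, and the flat test spectrum are illustrative choices, not taken from the slides.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_bins, sample_rate):
    """Triangular filters mapping n_bins linear-frequency bins to n_mels bands."""
    max_mel = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge points, equally spaced in mel, converted back to Hz
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * (sample_rate / 2) / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def apply_filterbank(bank, spectrum):
    """Weighted sum of spectrum bins for each mel band."""
    return [sum(w * s for w, s in zip(row, spectrum)) for row in bank]

# Reduce a 33-bin spectrum (for example, one frame of a 64-point STFT) to 8 bands
bank = mel_filterbank(n_mels=8, n_bins=33, sample_rate=8000)
mel_frame = apply_filterbank(bank, [1.0] * 33)
```

Applying the same bank to every frame of a spectrogram yields the Mel-spectrogram, which is the kind of image-like input the slide describes feeding to a CNN.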
  • 45. Fundamentals of Music Signal Processing • Chromagram: a better “timbre-invariant” feature representation for pitch-related tasks (e.g. chord recognition, cover song identification) ‒ merge all the frequency bins with the same note name (C, C#, D, D#, …) ‒ 12-dim vector for each time frame (figure made by Meinard Müller)
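Merging frequency bins that share a note name can be sketched as follows for a single frame. This is a simplified illustration: it assumes A4 = 440 Hz, assigns each bin to its nearest equal-tempered pitch, and uses a hypothetical function name and a hand-built test spectrum; practical chroma extractors usually add tuning estimation and smoother bin-to-pitch weighting.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chroma_from_spectrum(magnitudes, sample_rate, frame_size, a4_hz=440.0):
    """Sum the magnitude of every bin into one of 12 pitch classes."""
    chroma = [0.0] * 12
    for k, mag in enumerate(magnitudes):
        if k == 0:
            continue  # the DC bin has no pitch
        freq = k * sample_rate / frame_size
        midi = round(69 + 12 * math.log2(freq / a4_hz))
        chroma[midi % 12] += mag  # octaves collapse onto the same pitch class
    return chroma

# Assumed test frame: energy at 440 Hz (A4) and 880 Hz (A5), bins 10 Hz apart
sample_rate, frame_size = 8000, 800
magnitudes = [0.0] * (frame_size // 2 + 1)
magnitudes[44] = 1.0   # 440 Hz
magnitudes[88] = 0.5   # 880 Hz
chroma = chroma_from_spectrum(magnitudes, sample_rate, frame_size)
strongest = NOTE_NAMES[chroma.index(max(chroma))]
```

Because both octaves of A land in the same pitch class, the 12-dimensional vector keeps the harmonic content while discarding much of the octave and timbre detail.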
  • 46. Fundamentals of Music Signal Processing • Source separation can sometimes be helpful ‒ harmonic/percussive separation: given a mixture, separate the percussive part from the harmonic part ‒ harmonic: pitch related info ‒ percussive: tempo related info (figure panels: (a) original (b) harmonic (c) percussive)
  • 47. Fundamentals of Music Signal Processing • Source separation can sometimes be helpful ‒ singing voice separation: given a mixture, separate the singing voice from the accompaniment
  • 48. Fundamentals of Music Signal Processing • Pitch, tempo, timbre play different roles in different tasks • Spectrogram: a basic feature representation • Multipitch estimation: for better pitch info • Source separation: might improve the extraction for pitch, tempo and also timbre • Feature design (based on domain knowledge) versus feature learning (data-driven; deep learning)
  • 49. Outline • Types of music related research/products • Fundamentals of music signal processing • Types of data
  • 50. Types of Data • Music audio data ─ not sharable due to copyright issues and business interest ─ however, audio features can be shared ─ or, start with copyright-free music (e.g., Free Music Archive)
  • 51. Types of Data • Music listening data ‒ from social platforms via e.g., last.fm API, Spotify API ‒ from Twitter: #nowplaying dataset
  • 52. Types of Data • Big music text data ─ score, lyrics, review, playlist, tags, Wikipedia, etc ─ not everything is in audio ─ some of them are easier to get from non-audio data
  • 53. Types of Data • Big sensor data? ─ sensors attached to “things” or “human beings”
  • 54. Data Science in Music • The missing “D” in Data Science: domain knowledge • Music information retrieval = musicology + signal processing + machine learning + others
  • 55. Resources • Conference proceedings ‒ Int’l Soc. Music Information Retrieval Conf. (ISMIR) ‒ Int’l Conf. Acoustics, Speech, and Signal Processing (ICASSP) ‒ AAAI, IJCAI, ICML, NeurIPS, ICLR, ACM MM • Transactions ‒ Transactions of the Int’l Soc. Music Information Retrieval (TISMIR) ‒ IEEE Trans. Audio, Speech and Language Processing (TASLP) ‒ IEEE Trans. Multimedia (TMM)
  • 56. Resources • MIREX (MIR Evaluation eXchange) ‒ Part of ISMIR ‒ http://www.music-ir.org/mirex/wiki/MIREX_HOME ‒ Audio Onset Detection ‒ Audio Beat Tracking ‒ Audio Key Detection ‒ Audio Downbeat Detection ‒ Real-time Audio to Score Alignment (a.k.a. Score Following) ‒ Audio Cover Song Identification ‒ Discovery of Repeated Themes & Sections ‒ Audio Melody Extraction ‒ Query by Singing/Humming ‒ Audio Chord Estimation ‒ Singing Voice Separation ‒ Audio Fingerprinting ‒ Music/Speech Classification/Detection ‒ Audio Offset Detection
  • 57. Resources • Courses ‒ Juhan Nam @ KAIST https://mac.kaist.ac.kr/~juhan/gct634/index.html ‒ Meinard Müller @ Universität Erlangen-Nürnberg https://www.audiolabs-erlangen.de/fau/professor/mueller/teaching ‒ Juan Bello @ NYU https://wp.nyu.edu/jpbello/teaching/mir/ ‒ CCRMA summer school @ Stanford https://ccrma.stanford.edu/workshops/music-information-retrieval-mir-2015 ‒ Xavier Serra @ UPF, Spain https://zh-tw.coursera.org/course/audio