Machine Learning for Creative AI Applications in Music
Music and Audio Computing Lab,
Research Center for IT Innovation,
Academia Sinica
Yi-Hsuan Yang Ph.D.
http://www.citi.sinica.edu.tw/pages/yang/
yang@citi.sinica.edu.tw
May 24, 2018
ML in Music: “Music Info Retrieval/Analysis”
Music transcription (audio2score): audio (existing song) → score
• audio → note (pitch, onset, offset)
• audio → instrument (flute, cello)
• audio → meter (4/4)
• audio → key (E-flat major)
Music semantic labeling: audio (existing song) → labels
• audio → genre (classical)
• audio → emotion (yearning)
• audio → other attributes (slow/fast)
Applications in music retrieval, education, archival, etc.
ML in Music: “Music Generation/Synthesis”
AI composer: random seed → score (new song)
AI performer (score2audio): score → audio
ML in Music: “Music Generation/Synthesis”
AI listener: audio (existing songs) → features, score, labels
AI DJ: existing songs → audio (a new song)
• concatenation of the three songs
https://www.youtube.com/watch?v=6MCB8Jbndyw
Recap
• ML in Music
  • Music information retrieval/analysis
    • AI listener
      • Music transcription (audio → score)
      • Music semantic labeling (audio → label)
    • For analyzing and indexing existing songs
  • Music generation/synthesis
    • AI composer (random seed → score)
    • AI performer (score → audio)
    • AI DJ (existing songs → new song)
    • For creating new music
ML for Creative AI Applications in Music
• Three examples
1. audio (mixture) → audio (vocal/accompaniment)
2. audio (full song) → audio (highlight)
3. audio (many songs) → audio clip (medley)
ML for Creative AI Applications in Music
• Three examples
1. Separator: mixture → vocal/accompaniment
2. Highlighter: full song → highlight
3. Sequencer: many songs → medley
Figure from Spotify’s ISMIR’17 paper
1) Separator: Goal
[Figure: separator input and output]
1) Separator: Demo
http://ss.ciaua.com/
Evaluation Campaign: SiSEC 2018
[Figure: SiSEC 2018 results; labeled entries include “Ours”, “Sony”, and “Oracle”]
2) Highlighter: Goal
• Extract music highlights
• Application: music browsing, ringtone generation
  • “Automatic DJ mix generation using highlight detection,” ISMIR 2017 (from Clova Line WAVE)
[Figure: “A song” with its 30 sec highlight marked]
2) Highlighter: Methodology
• CNN for emotion prediction + attention
(predicting the weights of different parts of a song)
• Transfer learning: does not need annotations of highlights
“Pop music highlighter: Marking the emotion keypoints,” TISMIR’18
https://remyhuang.github.io/music_thumbnailing/
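
The attention idea on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: it assumes a trained network has already produced one attention score and one emotion-prediction vector per fixed-length chunk of the song. The function names (`softmax`, `attention_pool`, `pick_highlight`), the 5-second chunk length, and the toy scores are hypothetical. Picking the contiguous window with the largest total weight is one plausible way to turn the weights into a highlight.

```python
import math

def softmax(scores):
    """Turn raw per-chunk scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(chunk_preds, weights):
    """Song-level prediction: attention-weighted sum of chunk-level predictions."""
    dim = len(chunk_preds[0])
    return [sum(w * p[d] for w, p in zip(weights, chunk_preds)) for d in range(dim)]

def pick_highlight(weights, chunk_sec, highlight_sec):
    """Return (start_sec, end_sec) of the contiguous window with the largest total weight."""
    n = max(1, min(len(weights), round(highlight_sec / chunk_sec)))
    best_start = max(range(len(weights) - n + 1),
                     key=lambda i: sum(weights[i:i + n]))
    return best_start * chunk_sec, (best_start + n) * chunk_sec

# Toy example: 6 chunks of 5 seconds each, with hypothetical attention scores.
scores = [0.1, 0.2, 2.0, 2.5, 0.3, 0.1]
weights = softmax(scores)
highlight = pick_highlight(weights, chunk_sec=5, highlight_sec=10)
print(highlight)
```

Because the weights are learned as a by-product of the emotion-prediction task, a sketch like this needs no highlight annotations, which matches the transfer-learning point above.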
2) Highlighter: Demo
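周杰倫 (Jay Chou) - 稻香 (Dao Xiang)
光良 (Michael Wong) - 童話 (Tong Hua, “Fairy Tale”)
胡夏 (Hu Xia) - 那些年 (Na Xie Nian, “Those Bygone Years”)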
Linkin Park - Burn It Down
Adam Lambert - Whataya Want from Me
3) Sequencer: Goal
• Find an ordering of music pieces
“Automatic playlist sequencing and transitions,” Proc. ISMIR 2017 (from )
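
One way to read "find an ordering" is as a search over permutations. The sketch below assumes a hypothetical matrix `transition_score[i][j]` that says how well piece `j` follows piece `i`; the slides do not say how such scores are combined, so the exhaustive search and the name `best_ordering` are illustrative assumptions only.

```python
from itertools import permutations

def best_ordering(transition_score):
    """Return the ordering of pieces that maximizes the sum of adjacent transition scores."""
    n = len(transition_score)

    def total(order):
        # Sum the score of every consecutive (previous, next) pair in the ordering.
        return sum(transition_score[a][b] for a, b in zip(order, order[1:]))

    return max(permutations(range(n)), key=total)

# Hypothetical scores for three pieces: entry [i][j] rates "j right after i".
scores = [
    [0.0, 0.9, 0.1],
    [0.2, 0.0, 0.8],
    [0.7, 0.3, 0.0],
]
order = best_ordering(scores)
print(order)
```

Exhaustive search grows factorially with the number of pieces, so it is only practical for short medleys; a greedy or beam search would be the usual substitute for longer lists.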
3) Sequencer: Methodology
• Make the computer play “music puzzle games”
  • Divide a song into several non-overlapping chunks
  • Create positive pairs (R1R2) and negative pairs (R2R1)
• Siamese CNN + similarity embedding
  • [a b c d], [a b c d], [a b c d] [d c a b]
“Generating music medleys via playing music puzzle games,” AAAI’18
https://remyhuang.github.io/music_puzzle_game/
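
The data preparation for the puzzle game can be sketched as below. It is a minimal illustration, assuming a song is already available as a plain sequence of samples or frames. The helper names `split_into_chunks` and `make_pairs`, the label convention (1 for correct order, 0 for swapped), and the chunk length are hypothetical. The Siamese CNN that would be trained on these pairs is not shown.

```python
def split_into_chunks(samples, chunk_len):
    """Divide a sequence into non-overlapping chunks, dropping any short tail."""
    n_chunks = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]

def make_pairs(chunks):
    """Build (first, second, label) triples from consecutive chunks.

    The original order (R1, R2) is a positive pair with label 1;
    the swapped order (R2, R1) is a negative pair with label 0.
    """
    pairs = []
    for first, second in zip(chunks, chunks[1:]):
        pairs.append((first, second, 1))
        pairs.append((second, first, 0))
    return pairs

# Toy "song" of 10 frames, split into chunks of 3 frames (the last frame is dropped).
song = list(range(10))
chunks = split_into_chunks(song, chunk_len=3)
pairs = make_pairs(chunks)
print(len(chunks), len(pairs))
```

The labels come for free from the song's own structure, so no manual annotation is needed to create training data this way.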
3) Sequencer: Demo
Conclusion
• Interesting topics
  • Lyrics + score generation
  • Interactive generation
  • Data-driven approach + music theory
  • Style transfer
  • More AI DJs
