Yi-Hsuan Yang / Music Information Retrieval

Yi-Hsuan Yang is an Associate Research Fellow with Academia Sinica. He received his Ph.D. degree in Communication Engineering from National Taiwan University in 2010, and became an Assistant Research Fellow at Academia Sinica in 2011. He is also an Adjunct Associate Professor with the National Tsing Hua University, Taiwan. His research interests include music information retrieval, machine learning, and affective computing. Dr. Yang was a recipient of the 2011 IEEE Signal Processing Society (SPS) Young Author Best Paper Award, the 2012 ACM Multimedia Grand Challenge First Prize, and the 2014 Ta-You Wu Memorial Research Award of the Ministry of Science and Technology, Taiwan. He is an author of the book Music Emotion Recognition (CRC Press 2011) and was a tutorial speaker on music affect recognition at the International Society for Music Information Retrieval Conference (ISMIR 2012). In 2014, he served as a Technical Program Co-chair of ISMIR, and a Guest Editor of the IEEE Transactions on Affective Computing and the ACM Transactions on Intelligent Systems and Technology.

Published in: Data & Analytics

  1. Music Information Retrieval
     Music & Audio Computing Lab, Research Center for IT Innovation, Academia Sinica
     Yi-Hsuan Yang, Ph.D.
     http://www.citi.sinica.edu.tw/pages/yang/
     yang@citi.sinica.edu.tw
  2. Prelude
     • PI @ Music & Audio Computing Lab, Academia Sinica, since 2011
     • 10420CS 573100 “Music Information Retrieval” @ NTHU, 2016
     https://twtmir.wordpress.com/
     https://teachingmir.wikispaces.com/courses
  3. Outline
     • Types of music related research
     • Fundamentals of music signal processing
     • New opportunities in the big data era
  4. Types of Music Related Research
     1. Music creation
     https://www.youtube.com/watch?v=3OEmzI52stk
  5. Types of Music Related Research
     1. Music creation
     https://www.youtube.com/watch?v=k1DgNfz1g_s
  6. Types of Music Related Research
     1. Music creation
     https://www.youtube.com/watch?v=wj1r9YJ6INA
  7. Types of Music Related Research
     1. Music creation
     http://www.inside.com.tw/2016/05/04/positive-grid-bias-head
  8. Types of Music Related Research
     1. Music creation
     https://youtu.be/rL5YKZ9ecpg?t=50m
  9. Types of Music Related Research
     2. Music information “analysis”
     Examples: automatic page turner, automatic Karaoke scoring, interactive concert
  10. Types of Music Related Research
      2. Music information “analysis”
      Examples: chord recognizer, music browsing assistant
  11. Types of Music Related Research
      3. Music information “retrieval”
      • Search
        ‒ through keywords/labels (genre, instrument, emotion)
  12. Types of Music Related Research
      3. Music information “retrieval”
      • Search
        ‒ through keywords/labels (genre, instrument, emotion)
        ‒ through audio examples (humming, audio recording)
  13. Types of Music Related Research
      3. Music information “retrieval”
      • Match
        ‒ to match 1) a video clip, 2) a photo slideshow, 3) song lyrics, or 4) a given context
        ‒ cross-domain retrieval
  14. Types of Music Related Research
      3. Music information “retrieval”
      • Discover
        ‒ recommendation: diversity, serendipity, explanations
  15. Types of Music Related Research
      3. Music information “retrieval”
      • Discover
        ‒ recommendation: diversity, serendipity, explanations
  16. Types of Music Related Research
      1. Music creation
         • Google Magenta, Smule AutoRap, Samsung Hum-On, Positive Grid, Yamaha Vocaloid
      2. Music information analysis
         • Education, data visualization
      3. Music information retrieval
         • Search: through keywords (genre, instrument, emotion) or audio examples (humming or audio recording)
         • Match: cross-domain retrieval
         • Discover: recommendation
  17. Outline
      • Types of music related research
      • Fundamentals of music signal processing
      • New opportunities in the big data era
  18. Fundamentals of Music Signal Processing
      • Pitch: which notes are played?
      • Tempo: how fast?
      • Timbre: which instrument(s)?
      Example: Mozart’s Variationen (1st phrase)
  19. Fundamentals of Music Signal Processing
      Applications: Karaoke scorer, chord recognizer, page turner
      Pitch: ♪♪♪ ♪♪♪ ♪♪♪
      Tempo: ♪ ♪ ♪
      Timbre: ♪ ♪♪ ♪
  20. Fundamentals of Music Signal Processing
      Applications: instrument classifier, content ID, Spotify running
      Pitch: ♪♪♪ ♪
      Tempo: ♪♪♪
      Timbre: ♪♪♪ ♪
  21. Fundamentals of Music Signal Processing
      Applications: similarity search or recommendation, music emotion or genre recognizer, automatic music video generation
      Pitch: ♪♪♪ ♪♪♪ ♪♪♪
      Tempo: ♪♪♪ ♪♪♪ ♪♪♪
      Timbre: ♪♪♪ ♪♪♪ ♪♪♪
  22. Fundamentals of Music Signal Processing
      • Listens to music: tempo, instrumentation, key, time signature, energy, harmonic & timbral structures
      • Reads about music: lyrics, blog posts, reviews, playlists and discussion forums
      • Learns about trends: online music behavior, such as who's talking about which artists this week and what songs are being streamed or downloaded
      • Not everything is in audio
  23. Fundamentals of Music Signal Processing
      • Let’s have a look at what we can extract from audio anyway
      • Time-domain waveform
  24. Fundamentals of Music Signal Processing
      • Frequency domain representation
      • Spectrogram (obtained by Short-Time Fourier Transform)
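The spectrogram on slide 24 can be sketched in a few lines. This is a minimal illustration, not the lab's actual pipeline: the function name `stft_magnitude`, the Hann window, the frame and hop sizes, and the 8 kHz synthetic test tone are all assumptions made for the example.

```python
import numpy as np

def stft_magnitude(x, frame_len=1024, hop=256):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform.

    Returns an array of shape (frame_len // 2 + 1, n_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequencies of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T

sr = 8000                                   # assumed sample rate (Hz)
t = np.arange(sr) / sr                      # one second of audio
x = np.sin(2 * np.pi * 440.0 * t)           # synthetic 440 Hz tone (assumption)
S = stft_magnitude(x)
freqs = np.fft.rfftfreq(1024, d=1.0 / sr)   # centre frequency of each bin
peak_hz = freqs[S.mean(axis=1).argmax()]    # strongest bin, averaged over time
```

Each column of `S` is the spectrum of one short frame, so a steady tone shows up as a horizontal line at the bin closest to its frequency.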
  25. Fundamentals of Music Signal Processing
      • Pitch
        • Simple for monophonic signals (almost table lookup)
        • Challenging for polyphonic signals; known as multi-pitch estimation (MPE)
          ‒ overlapping partials
          ‒ missing fundamentals
      [Figure: octave (8ve) markers]
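The "almost table lookup" remark for monophonic pitch can be illustrated as follows: once a fundamental frequency has been estimated, naming the note is a closed-form mapping. The sketch assumes 12-tone equal temperament with A4 = 440 Hz and the MIDI convention that note 60 is C4; `hz_to_note` is a hypothetical helper name. It does not address the polyphonic case, where overlapping partials and missing fundamentals make the frequency estimate itself hard.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def hz_to_note(freq_hz, a4_hz=440.0):
    """Map a frequency to the nearest equal-tempered note name, e.g. 'A4'."""
    # 12 semitones per octave; MIDI note 69 is A4 (assumed tuning reference).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1          # MIDI 60 is C4 under this convention
    return f"{NOTE_NAMES[midi % 12]}{octave}"

print(hz_to_note(440.0))    # A4
print(hz_to_note(261.63))   # C4
print(hz_to_note(880.0))    # A5, one octave (8ve) above A4
```

Doubling the frequency always moves the result up exactly one octave while keeping the note name, which is why octave errors are a typical failure mode of pitch estimators.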
  26. Fundamentals of Music Signal Processing
      • Tempo: beats per minute (bpm)
      • Onset detection, downbeat estimation, tempo estimation, beat tracking, rhythm pattern extraction
      Figure panels: energy-based, spectrum-based
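One simple way to connect the energy-based onset curve on slide 26 to a tempo in bpm is to autocorrelate it and pick the strongest lag. The sketch below is an assumption-laden illustration rather than the method shown on the slide: the function names, the 60 to 180 bpm search range, the frame sizes, and the synthetic click track are all made up for the example.

```python
import numpy as np

def onset_envelope(x, frame_len=512, hop=256):
    """Energy-based onset strength: positive frame-to-frame energy increase."""
    n_frames = 1 + (len(x) - frame_len) // hop
    energy = np.array([np.sum(x[i * hop : i * hop + frame_len] ** 2)
                       for i in range(n_frames)])
    return np.maximum(0.0, np.diff(energy))   # keep rises, drop decays

def estimate_tempo(env, frame_rate, bpm_min=60.0, bpm_max=180.0):
    """Pick the beat period with the strongest autocorrelation of the envelope."""
    env = env - env.mean()
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]   # lags 0..N-1
    lag_min = int(round(frame_rate * 60.0 / bpm_max))
    lag_max = int(round(frame_rate * 60.0 / bpm_min))
    best_lag = lag_min + int(np.argmax(ac[lag_min : lag_max + 1]))
    return 60.0 * frame_rate / best_lag

sr, hop = 8000, 250                        # assumed values: 32 frames per second
x = np.zeros(8 * sr)                       # eight seconds of silence
for start in range(0, len(x), sr // 2):    # one click every 0.5 s = 120 bpm
    x[start : start + 100] = 1.0
env = onset_envelope(x, frame_len=500, hop=hop)
tempo = estimate_tempo(env, frame_rate=sr / hop)
```

Restricting the lag range matters: a 120 bpm click track also correlates strongly at 60 bpm, so the search window is what resolves the ambiguity here. Beat tracking and downbeat estimation need more than this single global estimate.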
  27. Fundamentals of Music Signal Processing
      • Timbre: difference in time-frequency distribution
  28. Fundamentals of Music Signal Processing
      • Timbre: difference in time-frequency distribution
        ‒ odd-to-even harmonic ratio, decay rate, vibrato, etc.
      Figure panels: piano solo, human voice
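As one concrete timbre descriptor from slide 28, the odd-to-even harmonic ratio can be sketched as below. The two synthetic tones stand in for real instruments and are assumptions, as are the function names and the choice to read each harmonic from its nearest FFT bin with a known fundamental `f0`.

```python
import numpy as np

def harmonic_amplitudes(x, sr, f0, n_harmonics=8):
    """Magnitude of the FFT bin nearest each harmonic k * f0, k = 1..n."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    bins = np.round(np.arange(1, n_harmonics + 1) * f0 * len(x) / sr).astype(int)
    return spectrum[bins]

def odd_to_even_ratio(amps):
    """Energy in harmonics 1, 3, 5, ... divided by energy in 2, 4, 6, ..."""
    odd = np.sum(amps[0::2] ** 2)     # index 0 holds the 1st harmonic
    even = np.sum(amps[1::2] ** 2)
    return odd / (even + 1e-12)       # small constant avoids division by zero

sr, f0 = 8000, 200.0                  # assumed sample rate and fundamental
t = np.arange(sr) / sr
# Two tones with the same pitch but different harmonic content.
odd_only = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in (1, 3, 5, 7))
all_harm = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 9))
r_odd = odd_to_even_ratio(harmonic_amplitudes(odd_only, sr, f0))
r_all = odd_to_even_ratio(harmonic_amplitudes(all_harm, sr, f0))
```

Both tones have the same pitch, yet `r_odd` is orders of magnitude larger than `r_all`, which is the kind of difference in spectral distribution a timbre feature is meant to capture.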
  29. Fundamentals of Music Signal Processing
      • Spectrogram, or the reduced-dimension version “Mel-spectrogram,” is usually considered as a “raw” feature representation of music
      • Can be treated as an image and then processed by convolutional neural nets (CNN)
      Figure by Sander Dieleman: http://benanne.github.io/2014/08/05/spotify-cnns.html
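The dimension reduction from a spectrogram to a Mel-spectrogram amounts to multiplying by a bank of triangular filters spaced evenly on the mel scale. The following is a minimal sketch under stated assumptions: it uses the common 2595 · log10(1 + f/700) mel formula, 40 bands, a 1024-point FFT at 22050 Hz, and a random matrix as a stand-in spectrogram; none of these values come from the slides.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters, shape (n_mels, n_fft // 2 + 1), evenly spaced in mel."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    top_mel = float(hz_to_mel(sr / 2.0))
    edges = mel_to_hz(np.linspace(0.0, top_mel, n_mels + 2))
    fb = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

# Stand-in magnitude spectrogram: 513 frequency bins x 20 frames (assumption).
spec = np.abs(np.random.default_rng(0).normal(size=(513, 20)))
fb = mel_filterbank(n_mels=40, n_fft=1024, sr=22050)
mel_spec = fb @ spec      # 513 frequency bins reduced to 40 mel bands
```

The resulting `mel_spec` array (usually log-compressed) is the two-dimensional "image" that a CNN would take as input.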
  30. Fundamentals of Music Signal Processing
      • Chromagram: a better “timbre-invariant” feature representation for pitch related tasks (e.g. chord recognition, cover song identification)
        ‒ merge all the frequency bins with the same note name (C, C#, D, D#, …)
        ‒ 12-dim vector for each time frame
      Figure by Meinard Müller
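The bin-merging step behind the chromagram can be sketched as a 12-row folding matrix applied to one spectrum frame. This is a simplified illustration: the hard assignment of each FFT bin to its nearest note, the A4 = 440 Hz tuning, the 27.5 Hz lower cutoff, and the synthetic C major triad are assumptions, and `chroma_matrix` is a hypothetical helper name.

```python
import numpy as np

def chroma_matrix(n_fft, sr, a4_hz=440.0, fmin=27.5):
    """Binary matrix (12, n_fft // 2 + 1) assigning each FFT bin to a pitch class."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fold = np.zeros((12, len(freqs)))
    valid = freqs >= fmin                         # skip DC and very low bins
    midi = np.round(69 + 12 * np.log2(freqs[valid] / a4_hz)).astype(int)
    fold[midi % 12, np.nonzero(valid)[0]] = 1.0   # row 0 = C, 1 = C#, ..., 11 = B
    return fold

sr, n_fft = 8000, 4096                            # assumed analysis settings
t = np.arange(n_fft) / sr
# Synthetic C major triad: C4, E4, G4 (frequencies in Hz).
chord = sum(np.sin(2 * np.pi * f * t) for f in (261.63, 329.63, 392.00))
power = np.abs(np.fft.rfft(chord * np.hanning(n_fft))) ** 2
chroma = chroma_matrix(n_fft, sr) @ power         # 12-dim vector for this frame
top3 = sorted(int(i) for i in np.argsort(chroma)[-3:])
```

Here `top3` holds the indices of the three strongest pitch classes, which for this input are C, E and G (0, 4 and 7). Because octave and harmonic-envelope information is folded away, the same chord played on a different instrument would give a similar vector.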
  31. Fundamentals of Music Signal Processing
      • Source separation can sometimes be helpful
        ‒ harmonic/percussive separation: given a mixture, separate the percussive part from the harmonic part
        ‒ harmonic: pitch related info
        ‒ percussive: tempo related info
      Figure panels: (a) original, (b) harmonic, (c) percussive
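The slides do not say which separation algorithm was used. One commonly used option is median filtering of the spectrogram: harmonic sounds form horizontal ridges (stable over time), percussive sounds form vertical ridges (broad in frequency). The sketch below shows that technique on a toy spectrogram; the kernel size, the binary masking rule and the helper names are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def running_median(S, size, axis):
    """Median over a sliding window of odd length `size` along `axis` (edge-padded)."""
    pad = [(0, 0), (0, 0)]
    pad[axis] = (size // 2, size // 2)
    windows = sliding_window_view(np.pad(S, pad, mode="edge"), size, axis=axis)
    return np.median(windows, axis=-1)

def hpss(S, kernel=9):
    """Split a magnitude spectrogram S (freq x time) into harmonic and percussive parts."""
    H = running_median(S, kernel, axis=1)   # smooth along time: keeps horizontal ridges
    P = running_median(S, kernel, axis=0)   # smooth along frequency: keeps vertical ridges
    harmonic_mask = H >= P                  # binary mask; ties go to the harmonic part
    return S * harmonic_mask, S * ~harmonic_mask

# Toy spectrogram: one steady tone (row 10) plus one click (column 20).
S = np.zeros((40, 50))
S[10, :] = 1.0
S[:, 20] = 1.0
harmonic, percussive = hpss(S)
```

On this toy input the steady tone ends up entirely in `harmonic` and the click in `percussive`, apart from the single shared cell where the tie rule assigns the energy to the harmonic part. Resynthesizing audio would additionally require the phase of the original STFT.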
  32. Fundamentals of Music Signal Processing
      • Source separation can sometimes be helpful
        ‒ singing voice separation: given a mixture, separate the singing voice from the accompaniment
  33. Fundamentals of Music Signal Processing
      • Pitch, tempo, and timbre play different roles in different tasks
      • Spectrogram: a basic feature representation
      • Multi-pitch estimation: for better pitch info
      • Source separation: might improve the extraction of pitch, tempo, and also timbre
      • Feature design (based on domain knowledge) versus feature learning (data-driven; deep learning)
  34. Outline
      • Types of music related research
      • Fundamentals of music signal processing
      • New opportunities in the big data era
  35. New Opportunities in the Big Data Era
      • Big music audio data? No, unless you work for a big company
        ‒ not sharable due to copyright issues and business interests
        ‒ however, audio features can be shared
        ‒ or, start with copyright-free music
      Example: Free Music Archive
  36. New Opportunities in the Big Data Era
      • Big music listening data? Yes, some of it can be crawled from social platform websites
        ‒ from last.fm API, EchoNest API
        ‒ from Twitter: #nowplaying dataset
  37. New Opportunities in the Big Data Era
      • Big music text data? Yes, plenty of data
        ‒ scores, lyrics, reviews, playlists, tags, Wikipedia, etc.
        ‒ not everything is in audio
        ‒ some information is easier to get from non-audio data
  38. New Opportunities in the Big Data Era
      • Big sensor data? Yes, everywhere
        ‒ sensors attached to “things” or “human beings”
        ‒ emerging new applications: 1) music generation, 2) context-aware music recommendation
      Figures from Pinterest and ask.audio
  39. New Opportunities in the Big Data Era
      • The missing “D” in Data Science: domain knowledge
      • Music information retrieval = musicology + signal processing + machine learning + others
  40. Postlude
      • Further reading
        ‒ International Society for Music Information Retrieval Conference (ISMIR)
        ‒ International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
        ‒ MIREX (MIR Evaluation eXchange)
        ‒ IEEE Transactions on Audio, Speech and Language Processing (TASLP)
        ‒ IEEE Transactions on Multimedia (TMM)