CSTalks - Music Information Retrieval - 23 Feb

Published in: Technology

  1. CSTalks: Similarity Measures in Music Information Retrieval Systems. Speaker: Zhonghua Li. Supervisor: Ye Wang. Contact: lizhongh@comp.nus.edu.sg
  2. Where do you search for music? Music is an important part of our way of life. Sources shown: music stores, online music, Mufin.
  4. Outline
     - Music Information Retrieval (MIR)
       - Definition
       - Applications: Search
       - Applications: Recommendation
     - Similarity Measure Methods
       - Text-based Method
       - Audio Feature-based Method
       - Semantic Concept-based Method
       - Multimodal Fusion Method
     - Conclusion and Future Directions
  5. Definition
     Music Information Retrieval is the process of searching for, and finding, music objects, or parts of objects, via a query framed musically and/or in musical terms.
     - Music objects: recordings (wav, mp3, etc.), scores, parts, etc.
     - Musically framed query: singing, humming, keyboard, notation-based, MIDI files, sound files, etc.
     - Musical terms: genre, style, tempo, bibliography, etc.
     - Applications: music search, recommendation, identification, etc.
  7. Applications: Search (text-based music search)
     - Compares a textual query with the metadata.
     - Adopted by most existing systems.
     - Examples: Last.fm, Musicovery, ...
  8. Applications: Search (content-based music search)
     - Compares an audio query with the audio content.
     - Query-by-humming/singing/recording: midomi.
  9. Applications: Search (content-based music search)
     - Compares a tapped rhythm with the audio content.
     - Query-by-tapping: SongTapper.
  10. Applications: Search (system diagram)
     - Online: the user's information need (intention) becomes an explicit query (text, audio, etc.) through query formation, and descriptors are extracted from the query. The intention gap is marked at this stage.
     - Offline: descriptors are extracted from the music documents in the database and indexed. The semantic gap is marked at this stage.
     - Match: a similarity measure compares the query with the music documents in the database.
     - Ranking: relevant documents are ranked by domain-specific criteria (e.g. number of hits).
     - Presentation: the ranked list is presented as results.
  12. Applications: Recommendation (collaborative-filtering-based recommendation)
     - Last.fm: what you (and others) listen to and like.
     - Amazon: customers who shopped for ... also shopped for ...
  13. Applications: Recommendation (collaborative-filtering-based recommendation, example)
     - Users: A, B, and C.
     - Music: 1, 2, ..., 8.
     [Figure: users C, A, and B with the songs each listens to. A similarity measure between A and each of the other users gives one small and one large similarity, and songs are recommended to user A.]
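The example on slide 13 can be sketched roughly as follows. This is a minimal illustration, not the method of Last.fm or Amazon: the listening histories are hypothetical (the figure's exact song assignments are not recoverable), Jaccard overlap is an assumed choice of similarity measure, and `recommend` is an illustrative helper name.

```python
def jaccard(a, b):
    """Overlap between two sets of songs, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, histories, top_n=3):
    """Rank songs unheard by `target`, weighted by similarity to other users."""
    others = {u: s for u, s in histories.items() if u != target}
    sims = {u: jaccard(histories[target], s) for u, s in others.items()}
    scores = {}
    for user, songs in others.items():
        for song in songs - histories[target]:
            scores[song] = scores.get(song, 0.0) + sims[user]
    # Highest score first; ties broken by song id for a stable order.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return sims, ranked[:top_n]


# Hypothetical listening histories for users A, B, C over songs 1..8.
histories = {"A": {1, 2, 3, 4}, "B": {2, 3, 4, 5, 6}, "C": {1, 7, 8}}
sims, recs = recommend("A", histories)
print(sims)  # similarity of A to B and to C
print(recs)  # songs suggested to A
```

With these assumed histories, B shares more songs with A than C does, so B's unheard songs rank ahead of C's.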
  14. Applications: Recommendation (audio content-based recommendation)
     - Recommends songs whose audio content is similar to the songs that you like.
     - Pandora: music experts annotate each song in the music database with about 400 attributes (instrument, vocal, structure, ...). A similarity measure compares these attributes with those of the songs the user listens to and produces recommendations.
  15. Applications: Recommendation (system diagram)
     - Online/offline: the user's information need (intention) is captured as a user profile. Implicit user profiles include ratings, listening history, etc. The intention gap is marked at this stage.
     - Offline: descriptors are extracted from the documents and indexed. The semantic gap is marked at this stage.
     - Match: a similarity measure compares user profiles with music documents or with other user profiles.
     - Ranking: relevant documents are ranked by domain-specific criteria (e.g. number of hits).
     - Presentation: the ranked list is presented as results.
  16. Similarity Measure
     The similarity measure is one of the most fundamental concepts in MIR. It is closely related to:
     - What information music contains.
     - How this information is represented.
     - How to match between them.
     [Diagram: the same pipeline as on the search and recommendation slides (profile/query capture, descriptor extraction, indexing, match, ranking, presentation), with the intention gap and semantic gap marked.]
  17. Music Information Plane
     Similarity can be measured from different aspects.
     - Song 1: New Favorite, by Alison Krauss.
     - Song 2: She Is Beautiful, by Andrew W.K.
     - Dissimilar: Song 1 is female, gentle, slow; Song 2 is male, aggressive, fast.
     - Similar: guitar; tempo of about 162 BPM (beats per minute).
     Source: O. C. Herrada. Music recommendation and discovery in the long tail. PhD thesis, 2008.
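  19. Similarity Measure Methods: Text-based Method (Okapi BM25 ranking)
     Given a query Q containing keywords q_1, ..., q_n, and music documents represented as bags of words, the BM25 ranking function can be formulated as:
       score(D, Q) = \sum_{i=1}^{n} IDF(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot (1 - b + b \cdot |D| / avgdl)}
     - f(q_i, D) is q_i's term frequency (tf) in document D.
     - |D| is the length of document D in words.
     - avgdl is the average document length in the collection.
     - k_1 and b are free parameters, usually set as k_1 = 2.0 and b = 0.75.
     - IDF(q_i) is the inverse document frequency (idf), calculated as:
       IDF(q_i) = \log \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}
       where N is the total number of documents in the collection and n(q_i) is the number of documents containing q_i.
     A document scores highly when the query term appears in it frequently (large f(q_i, D)) and does not appear in many other documents (large IDF).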
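As a rough illustration of the BM25 ranking described on slide 19, the sketch below scores a handful of metadata strings against a text query. The metadata strings, the whitespace tokenizer, and the function name `bm25_scores` are assumptions made for the example; the defaults k1 = 2.0 and b = 0.75 follow the slide.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=2.0, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs
    # n(q): number of documents that contain each term at least once.
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for q in query.lower().split():
            n_q = doc_freq.get(q, 0)
            idf = math.log((n_docs - n_q + 0.5) / (n_q + 0.5))
            f = tf.get(q, 0)
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores


# Hypothetical metadata strings standing in for music documents.
documents = [
    "quartet live session jazz trumpet",
    "trio studio album jazz piano ballad",
    "band stadium tour metal guitar",
    "singer single pop vocal",
    "duo club mix electronic dance",
]
scores = bm25_scores("jazz piano", documents)
best = max(range(len(documents)), key=lambda i: scores[i])
print(best, scores)
```

One caveat of this IDF form: a term that occurs in more than half of the documents gets a negative IDF, so very common terms can lower a score.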
  20. Similarity Measure Methods: Text-based Method
     - Pros
       - Simple and efficient.
     - Cons
       - Affected by noisy or wrong texts.
       - Songs with no text cannot be retrieved.
       - Requires high-level domain knowledge to create good metadata.
       - It is "text retrieval on audio metadata", not pure music retrieval.
  22. Similarity Measure Methods: Audio Feature-based Method
     Pipeline: music -> audio feature extraction -> distribution modeling -> model comparison.
  23. Similarity Measure Methods: Audio Feature-based Method
     Audio feature extraction -> distribution modeling -> model comparison.
     [Figure: the signal is split into frames, and each frame yields a feature vector.]
  24. Existing Works: Audio Feature-based Method
     Audio feature extraction -> distribution modeling -> model comparison.
     - Use low-level features directly
       - Pitch, loudness, MFCC (Blum et al. [3], 1999)
       - Histogram of MFCC (Foote [4], 1997)
       - Spectrum, rhythm, chord changes => single vector (Tzanetakis [5], 2002)
     - Low-level features => higher-level features
       - Cluster MFCC => model comparison (Aucouturier [6], 2002)
       - MFCC => Gaussian Mixture Models => model comparison
       - MFCC => "anchor space", compare probability models (Berenzweig et al. [7], 2003)
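The first two stages of the pipeline (frame-level feature extraction, then distribution modeling) might be sketched as below. To stay dependency-free, this example swaps MFCCs for two much simpler per-frame features (RMS energy and zero-crossing rate) and swaps the Gaussian mixture for a single diagonal Gaussian; both substitutions, the synthetic 440 Hz test tone, and all function names are assumptions of the sketch, not what the cited systems do.

```python
import math


def frames(signal, size, hop):
    """Split a signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def frame_features(frame):
    """Feature vector for one frame: [RMS energy, zero-crossing rate]."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(frame) - 1)]


def gaussian_model(vectors):
    """Summarize frame vectors by per-dimension mean and variance."""
    n, dims = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n for d in range(dims)]
    return mean, var


# Synthetic one-second 440 Hz tone at an assumed 8 kHz sampling rate.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
vectors = [frame_features(f) for f in frames(tone, size=400, hop=200)]
mean, var = gaussian_model(vectors)
print(len(vectors), mean, var)
```

The resulting (mean, variance) pair is the kind of per-song model that the model-comparison stage then compares across songs.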
  25. Similarity Measure Methods: Audio Feature-based Method
     Audio feature extraction -> distribution modeling -> model comparison.
     - Euclidean/cosine distance (for uniform-length feature vectors).
     - Distance between two probability distributions:
       - Kullback-Leibler divergence (KL distance), also called relative entropy. It has no closed form for mixtures of Gaussians.
       - Centroid distance: Euclidean distance between the overall means.
       - Sampling-based method: compute the likelihood of one model given points sampled from another. Very computationally expensive.
       - Earth Mover's Distance.
     Source: A. Berenzweig, B. Logan, D. P. Ellis, and B. P. Whitman. A Large-Scale Evaluation of Acoustic and Subjective Music-Similarity Measures. Computer Music Journal, 28, 2004.
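A few of the comparison options above can be written down directly. The sketch below is illustrative only: it covers Euclidean and cosine distance for fixed-length vectors, plus the KL divergence between two single Gaussians with diagonal covariance, which does have a closed form (unlike the mixture case). The function names and the diagonal-covariance restriction are assumptions of the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """1 minus the cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def kl_diag_gaussians(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(P || Q) for Gaussians with diagonal covariance."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(mean_p, var_p, mean_q, var_q):
    """KL is asymmetric, so sum both directions to get a usable distance."""
    return (kl_diag_gaussians(mean_p, var_p, mean_q, var_q)
            + kl_diag_gaussians(mean_q, var_q, mean_p, var_p))


print(euclidean([0.0, 0.0], [3.0, 4.0]))
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
print(symmetric_kl([0.0], [1.0], [1.0], [1.0]))
```

The centroid distance from the slide corresponds to calling `euclidean` on the two overall mean vectors.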
  26. Similarity Measure Methods: Audio Feature-based Method
     - Pros
       - Can deal with new songs that have little or no text.
       - Saves the human labor of annotating each song manually.
     - Cons
       - Time complexity is relatively high.
       - Features ≠ audio piece: two songs with very similar features may sound very different.
       - Average performance is reaching a glass ceiling of around 65% accuracy.
  28. Similarity Measure Methods: Semantic Concept-based Method
     - Nature of user queries
       - Go far beyond bibliographic text and audio search.
       - Semantically rich.
       - Syntactically undetermined.
       - Examples: "Find me a classical and happy song", "Find me a song to relax", "Find me some songs for parties/weddings/in churches", ...
     - Collaborative (social) tagging is very popular on Web 2.0. Users annotate music with their feelings or opinions: tags, comments, etc.
  29. Similarity Measure Methods: Semantic Concept-based Method
     Tags vs. user queries (Last.fm):
       Tag type          Frequency   Multi-tag search queries
       Genre             68%         51%
       Locale            12%         7%
       Mood              5%          4%
       Opinion           4%          2%
       Instrumentation   4%          5%
       Style             3%          26%
     Sources: Paul Lamere. Social tagging and music information retrieval. Journal of New Music Research, 2008. Klaas Bosteels, Elias Pampalk, and Etienne E. Kerre. Music retrieval based on social tags: a case study. ISMIR, 2008.
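  30. Similarity Measure Methods: Semantic Concept-based Method
     [Diagram: a vocabulary of concepts (classical, jazz, ...; piano, violin, ...; female, male, ...) with one model per concept. The models turn each song into a probability vector over the vocabulary, and the similarity between Song 1 and Song 2 is computed between their probability vectors.]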
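The last step of the diagram, comparing two songs through their concept probability vectors, could look like the sketch below. The slide does not say which measure is used, so Jensen-Shannon divergence is an assumed choice here; the six-word vocabulary is taken from the slide's examples, and the per-concept scores for the two songs are made up for illustration.

```python
import math

# Small vocabulary taken from the concepts named on the slide.
VOCAB = ["classical", "jazz", "piano", "violin", "female", "male"]


def to_distribution(scores):
    """Normalize non-negative concept scores so they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def semantic_similarity(p, q):
    """Map the divergence to a similarity in [0, 1]."""
    return 1.0 - js_divergence(p, q)


# Hypothetical model outputs for two songs, one score per VOCAB entry.
song1 = to_distribution([0.7, 0.1, 0.6, 0.3, 0.8, 0.1])
song2 = to_distribution([0.1, 0.8, 0.5, 0.0, 0.1, 0.9])
print(semantic_similarity(song1, song2))
```

Because the divergence is symmetric and bounded, the resulting similarity does not depend on which song is treated as the query.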
  32. Similarity Measure Methods: Multimodal Method
     - Information keeps growing.
     - One of the most important ongoing trends: combining metadata, audio content, and semantic concepts.
     - Users are important.
  33. Similarity Measure Methods: Multimodal Method
     [Figure labels: document vectors, customization, fuzzy music semantic vector.]
     Source: B. Zhang, J. Shen, Q. Xiang, and Y. Wang. CompositeMap: a novel framework for music similarity measure. SIGIR, 2009.
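One simple way to combine modalities while letting the user customize their importance is a weighted average of per-modality similarities. The sketch below shows only that generic late-fusion idea; it is not the CompositeMap algorithm, and the modality names, weights, similarity values, and the function name `fuse` are all hypothetical.

```python
def fuse(similarities, weights):
    """Weighted average of per-modality similarities (each in [0, 1])."""
    total = sum(weights[m] for m in similarities)
    if total <= 0:
        raise ValueError("at least one modality needs a positive weight")
    return sum(weights[m] * similarities[m] for m in similarities) / total


# Hypothetical similarities between two songs in each modality.
similarities = {"metadata": 0.2, "audio": 0.8, "semantic": 0.5}

# Equal weights versus a user who cares only about how the songs sound.
balanced = fuse(similarities, {"metadata": 1.0, "audio": 1.0, "semantic": 1.0})
audio_only = fuse(similarities, {"metadata": 0.0, "audio": 1.0, "semantic": 0.0})
print(balanced, audio_only)
```

Changing the weights changes the ranking without recomputing any per-modality similarity, which is one reason a late-fusion design is convenient for per-user customization.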
  35. Conclusion and Future Directions
     What makes MIR (and the similarity measure) so tricky? Music information is:
     - Multimodal: audio, metadata, social, ...
     - Multicultural: e.g., modern art, Indian ragas, ...
     - Multirepresentational: audio, MIDI, score, ...
     - Multifaceted: melody, tempo, beat, ...
     Similarity can be measured from different aspects.
  36. Conclusions and Future Directions
     - What do users really want? (Intention gap)
       - User interactions with the system.
       - Learn a good user preference model.
     - What kind of music features can really capture this need? (Semantic gap)
       - Content, tags.
       - Leverage more social data? Comments, ratings, groups, playlists, other user-created information, ...
     - How to fuse multiple sources of information effectively?
       - Identify the relevant/discriminative information aspects.
       - Fusion methods.
  38. References
     [1] O. C. Herrada. Music recommendation and discovery in the long tail. PhD thesis, 2008.
     [2] F. Pachet. Knowledge management and musical metadata. Encyclopedia of Knowledge Management. Idea Group, 2005.
     [3] T. L. Blum, D. F. Keislar, J. A. Wheaton, and E. H. Wold. Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information. U.S. Patent 5,918,223.
     [4] J. T. Foote. Content-based retrieval of music and audio. SPIE, 1997.
     [5] G. Tzanetakis. Manipulation, analysis, and retrieval system for audio signals. PhD thesis, 2002.
     [6] J. J. Aucouturier and F. Pachet. Music similarity measure: What's the use? International Symposium on Music Information Retrieval, 2002.
     [7] A. Berenzweig, D. P. W. Ellis, and S. Lawrence. Anchor space for classification and similarity measurement for music. ICME, 2003.
  39. References
     [8] B. Zhang, J. Shen, Q. Xiang, and Y. Wang. CompositeMap: a novel framework for music similarity measure. SIGIR, 2009.
     [9] B. Whitman and S. Lawrence. Inferring descriptions and similarity for music from community metadata. International Computer Music Conference, 2002.
     [10] M. Schedl, T. Pohle, P. Knees, and G. Widmer. Assigning and visualizing music genre by web-based co-occurrence analysis. ISMIR, 2006.
     [11] B. Whitman and P. Smaragdis. Combining musical and cultural features for intelligent style detection. ISMIR, 2002.
     [12] L. Chen, P. Wright, and W. Nejdl. Improving music genre classification using collaborative tagging data. WSDM, 2009.
     [13] B. Raes. Automatic generation of music metadata. ISMIR, 2009.
