Brave New Task: MusiClef Multimodal Music Tagging

  1. Multimodal Music Tagging Task
     Nicola Orio – University of Padova
     Cynthia C. S. Liem – Delft University of Technology
     Geoffroy Peeters – UMR STMS IRCAM-CNRS, Paris
     Markus Schedl – Johannes Kepler University, Linz
     MediaEval, Pisa, 05/10/2012
  2. Multimodal music tagging
     • Definition
       • Songs of a commercial music library need to be categorized according to their usage in TV and radio broadcasts (e.g. soundtracks, jingles)
     • Practical motivation
       • The search for suitable music for video productions is a major activity for professionals and lay users alike
       • Collaborative filtering systems are increasingly taking over this role, notwithstanding their known limitations (long tail, cold start, …)
       • Annotating professional music libraries is another important professional activity
  3. Human assessment
     • Different sources of information are routinely exploited by professionals to overcome the limitations of individual media
  4. Goals of MusiClef
     • To focus evaluation on professional application scenarios
       • Textual description of music items
     • To enable replication of experiments and results
       • The feature extraction phase is crucial – released features are computed with a public, open-source library (MIRToolbox)
     • To promote the exploitation of multimodal sources of information
       • Content (audio) + context (tags & web pages)
     • To disseminate music-related initiatives
       • Outside the music information retrieval community
  5. Evaluation initiatives – 1
     • MIREX (since 2004)
       • Community-based selection of tasks
       • Many tasks address audio feature extraction algorithms
       • Participants submit algorithms that are run by the organizers
       • Music files are not shared with participants
     • Million Song Dataset (since 2011)
       • Task on music recommendation proposed by the organizers
       • Audio features are computed using proprietary algorithms
       • Only the features are shared with participants
  6. Evaluation initiatives – 2
     • Quaero-Eval (since 2012)
       • Tasks agreed upon with participants
       • Strategies to grant public access to evaluation results
       • Participants run training experiments on a shared repository
       • Runs on the test set are carried out by the organizers
  7. Test collection – 1
     • Individual songs of pop and rock music
       • 1355 songs (from 218 artists)
       • Train (975) / test (380) split
     • Social tags
       • Gathered from the Last.fm API
     • Multilingual sets of web pages related to artists + albums
       • Mined by querying Google
     • Acoustic features: MFCCs (using MIRToolbox) with a window length of 200 ms and 50% overlap (see the sketch below)
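To make the feature settings concrete, here is a minimal MATLAB sketch of the extraction step using MIRToolbox; the file name song.wav is a placeholder, and the 'Frame' argument order (frame length in seconds, hop factor) is as we understand the MIRToolbox interface, not code taken from the released package.

```matlab
% Minimal sketch: frame-wise MFCC extraction with MIRToolbox,
% matching the task settings (200 ms windows, 50% overlap).
% 'song.wav' is a placeholder file name.
a = miraudio('song.wav');            % load the audio file
m = mirmfcc(a, 'Frame', 0.2, 0.5);   % 0.2 s frames, hop = 0.5 * frame length
mfccs = mirgetdata(m);               % coefficients as a numeric matrix
                                     % (one column per frame)
```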
  8. Test collection – 2
     • Test collection created starting from the “500 Greatest Songs of All Time” list (Rolling Stone)
       • An expectedly high number of social tags and web pages
     • Ground truth created by experts in the domain
       • 355 tags selected (167 genre, 288 usage)
       • Tags associated with fewer than 20 songs were discarded (see the sketch below)
     • Reference implementation in MATLAB
       • Gives participants an example of how to run a complete experiment
       • Code for the evaluation is already available
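The tag-pruning rule is simple to express; the sketch below is hypothetical, assuming a binary songs-by-tags annotation matrix A (the variable names and data layout are ours, not from the released data).

```matlab
% Hypothetical sketch of the tag-pruning rule: keep only tags
% annotated on at least 20 songs. A is assumed to be a binary
% songs-by-tags matrix (1 = tag applies to song).
minSongs = 20;
tagCounts = sum(A, 1);          % number of songs per tag
keep = tagCounts >= minSongs;   % logical mask of retained tags
A = A(:, keep);                 % drop rare tag columns
```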
  9. Evaluation measures
     • Standard IR measures:
       • Accuracy
       • Precision
       • Recall
       • Specificity
       • F-measure
     (a per-tag computation is sketched below)
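All five measures follow their standard IR definitions; the MATLAB sketch below computes them per tag from binary prediction and ground-truth vectors. The function name and interface are illustrative, not the released evaluation code.

```matlab
function s = tagMeasures(pred, truth)
% Standard per-tag measures from binary vectors (1 = tag assigned).
% Illustrative only; not the released MusiClef evaluation code.
tp = sum( pred &  truth);  fp = sum( pred & ~truth);
fn = sum(~pred &  truth);  tn = sum(~pred & ~truth);
s.accuracy    = (tp + tn) / (tp + tn + fp + fn);
s.precision   = tp / max(tp + fp, 1);   % guard against 0/0
s.recall      = tp / max(tp + fn, 1);
s.specificity = tn / max(tn + fp, 1);
s.fmeasure    = 2 * s.precision * s.recall / ...
                max(s.precision + s.recall, eps);
end
```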
  10. Examining tags more closely
      • Some tags are more equal than others… (e.g. hard rock, ballroom, melancholic, travel, countryside, bright)
      • Thus, we propose to also analyze results employing a higher-level tag categorization
  11. Tag categorization – 1
      • Affective, mood-related aspects:
        • Activity: the amount of perceived musical activity, without implying strong positive or negative affective qualities (e.g. fast, mellow, lazy)
        • Affective state: affective qualities that can only be connected and attributed to living beings (e.g. aggressive, hopeful)
        • Atmosphere: affective qualities that can be connected to environments (e.g. chaotic, intimate)
  12. Tag categorization – 2
      • Situation, time, and space aspects of the music:
        • Physical situation: concrete physical environments (e.g. city, night)
        • Occasion: implications of time and space, typically connected to social events (e.g. holiday, glamour)
      • Sociocultural genre (e.g. new wave, r&b, punk)
      • Sound qualities:
        • Timbral aspects (e.g. acoustic, bright)
        • Temporal aspects (e.g. beat, groove)
      • Other (e.g. catchy, evocative)
  13. Reference implementation
      • Made in MATLAB and released publicly
      • Simple and straightforward approaches:
        • Individual GMMs for audio, user tags, web pages
        • Tagging process: 1-NN classification using a symmetrized KL divergence (see the sketch below)
      • Scenarios tested:
        • Audio, user tags, web pages individually
        • Majority vote
        • Union
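As a rough illustration of the distance used in the tagging step, here is a minimal MATLAB sketch of a symmetrized KL divergence, simplified to single diagonal-covariance Gaussians. The reference implementation models each item with a GMM, for which the KL divergence has no closed form and must be approximated; this simplification and all names below are ours.

```matlab
function d = symKL(mu1, var1, mu2, var2)
% Symmetrized KL divergence between two diagonal-covariance Gaussians,
% given mean and variance vectors. A simplified stand-in for the
% GMM-based distance in the reference implementation.
kl = @(m1, v1, m2, v2) 0.5 * sum( log(v2 ./ v1) ...
                                  + (v1 + (m1 - m2).^2) ./ v2 - 1 );
d = kl(mu1, var1, mu2, var2) + kl(mu2, var2, mu1, var1);
end
```

A 1-NN pass would then assign each test song the tags of its nearest training song under this distance.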
  14. Baseline results – 1
      • Evaluation of the submitted runs and of the reference implementation
      • Results with different modalities over the full dataset:

        strategy     accuracy   recall   precision   specificity   F-measure
        audio        0.894      0.148    0.127       0.939         0.126
        tags         0.898      0.061    0.039       0.942         0.037
        web pages    0.897      0.050    0.007       0.954         0.011
        majority     0.880      0.123    0.086       0.922         0.086
        union        0.824      0.240    0.115       0.845         0.134
  15. Baseline results – 2
      • [Figure: per-category baseline results over the nine tag categories: 1. activity/energy, 2. affective state, 3. atmosphere, 4. other, 5. situation: occasion, 6. situation: physical, 7. sociocultural genre, 8. sound: temporal, 9. sound: timbral]
  16. Participation
      • Initially a lot of interest: about 8 explicitly interested parties
      • But ultimately just one participant (LUTIN UserLab)
        • Aggregation of estimators
      • Currently investigating what happened to the 7 others
        • So far, it appears ISMIR 2012 was inconveniently close
        • The 3 other MusiClef co-organizers will discuss this there
  17. Conclusions
      • We established a multimodal music tagging benchmark task
      • Special effort went into facilitating deeper tag analysis
      • We would like a 2013 multimodal music benchmark task
        • Depending on survey input
        • Depending on your input
  18. Thank you for your attention!
      For contact and more information: musiclef@dei.unipd.it
