
The MediaEval 2017 AcousticBrainz Genre Task: Content-based Music Genre Recognition from Multiple Sources


Presenters: Dmitry Bogdanov, Universitat Pompeu Fabra, Spain
Alastair Porter, Universitat Pompeu Fabra, Spain
Hendrik Schreiber, tagtraum industries incorporated

Paper: http://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_6.pdf

Video: https://youtu.be/NpN2Fr3go_Y

Authors: Dmitry Bogdanov, Alastair Porter, Julián Urbano, Hendrik Schreiber

Abstract: This paper provides an overview of the AcousticBrainz Genre Task organized as part of the MediaEval 2017 Benchmarking Initiative for Multimedia Evaluation. The task is focused on content-based music genre recognition using genre annotations from multiple sources and large-scale music features data available in the AcousticBrainz database. The goal of our task is to explore how the same music pieces can be annotated differently by different communities following different genre taxonomies, and how this should be addressed by content-based genre recognition systems. We present the task challenges, the employed ground-truth information and datasets, and the evaluation methodology.



  1. AcousticBrainz Genre Task: Content-based music genre recognition from multiple sources
     Dmitry Bogdanov, Alastair Porter (Universitat Pompeu Fabra)
     Julián Urbano (Delft University of Technology)
     Hendrik Schreiber (tagtraum industries incorporated)
  2. Genre recognition in Music Information Retrieval
     ● A popular task in MIR (Sturm 2014)
     ● Only a small number of broad genres (e.g., rock, jazz, classical, electronic)
     ● Almost no studies on more specific genres (subgenres)
     ● Studies don’t consider the subjective nature of genre labels and taxonomies
     ● Treated as a single-label classification problem instead of a multi-label one
     ● Genre hierarchy is not exploited
     ● Small datasets
     B. L. Sturm. 2014. The State of the Art Ten Years After a State of the Art: Future Research in Music Information Retrieval. Journal of New Music Research 43, 2 (2014), 147–172.
  3. AcousticBrainz
     AcousticBrainz: a community database containing music features extracted from audio (https://acousticbrainz.org) (Porter et al. 2015)
     ● Open data computed by open algorithms
     ● Built on submissions from the community
     ● Over 5,600,000 analyzed recordings (tracks)
     ● ~3,000 music features (bags-of-frames)
     ● Statistical information about spectral shape, rhythm, tonality, loudness, etc.
     ● Rich music metadata from MusicBrainz (https://musicbrainz.org)
     ● Lots of data... What can we do with it?
     A. Porter, D. Bogdanov, R. Kaye, R. Tsukanov, and X. Serra. 2015. AcousticBrainz: a community platform for gathering music information obtained from audio. In International Society for Music Information Retrieval (ISMIR’15) Conference. Málaga, Spain, 786–792.
  4. The 2017 AcousticBrainz MediaEval Task
     Content-based music genre recognition from multiple ground-truth sources
     Goal: predict the genre and subgenre of unknown music recordings given precomputed music features
     Task novelty:
     ● Four different genre annotation sources (and taxonomies)
     ● Hundreds of specific subgenres
     ● Multi-label genre classification problem
     ● A very large dataset (~2 million recordings in total)
  5. Sources of genre information
     ● Scraped from internet sources
     ● Discogs (discogs.com) and AllMusic (allmusic.com)
     ● Explicit genre and subgenre annotations at the album level
     ● Predefined taxonomies
     ● AcousticBrainz song → album → genre
  6. Sources of genre information
     ● Tagtraum dataset based on beaTunes
       ○ Consumer application for Windows and Mac by tagtraum industries incorporated
       ○ Encourages users to correct metadata
       ○ Collects anonymized, user-submitted metadata
       ○ The song:genre relationship is 1:n
     ● Last.fm
       ○ Folksonomy tags for each song
       ○ Relative strength (0–100)
     ● Tag cleaning (normalization and blacklisting)
     ● Automatic inference of genre-subgenre relations
  7. Mapping user genre labels to a genre taxonomy
     1. Normalization (lowercasing, smart substitutions, ...)
        R&B → rnb
        Rhythm and Blues → rnb
        R and B → rnb
     2. Removal of unwanted labels via blacklisting (80spop, love, charts, djonly, ...)
     3. Inferring hierarchical relationships via co-occurrence
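The normalization step described on this slide can be sketched as a small function. The substitution table and blacklist below are illustrative fragments only (the task's actual rules are far larger), and the name `normalize_label` is my own, not code from the task.

```python
import re

# Illustrative fragments of the normalization rules shown on the slide;
# the real substitution table and blacklist used by the task are larger.
SUBSTITUTIONS = {
    "r&b": "rnb",
    "rhythm and blues": "rnb",
    "r and b": "rnb",
}
BLACKLIST = {"80spop", "love", "charts", "djonly"}

def normalize_label(raw):
    """Lowercase and collapse whitespace, apply substitutions,
    and drop blacklisted labels (returning None for them)."""
    label = re.sub(r"\s+", " ", raw.strip().lower())
    label = SUBSTITUTIONS.get(label, label)
    if label in BLACKLIST:
        return None  # unwanted label, removed
    return label

print(normalize_label("Rhythm and Blues"))  # rnb
print(normalize_label("Charts"))            # None
```

Mapping every raw tag through one function like this makes the cleaning step deterministic and easy to audit.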
  8. Co-occurrence matrix
     If a song is labeled with Alternative, how often is it also labeled with Rock?
  9. Co-occurrence matrix
     What about the other way around? If a song is labeled with Rock, how often is it also labeled with Alternative?
     Co-occurrence is not symmetric!
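The asymmetric co-occurrence computation behind these two slides can be sketched as follows. The `cooccurrence` helper and the toy song data are my own illustration of how the asymmetry can reveal a genre-subgenre relationship, not the task's implementation.

```python
from collections import defaultdict

def cooccurrence(tag_sets):
    """For each ordered pair (a, b), estimate P(b | a): given a song
    labeled with a, how often is it also labeled with b?
    Note that the resulting matrix is not symmetric."""
    label_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for tags in tag_sets:
        for a in tags:
            label_counts[a] += 1
            for b in tags:
                if a != b:
                    pair_counts[(a, b)] += 1
    return {pair: n / label_counts[pair[0]] for pair, n in pair_counts.items()}

# Toy data: every "alternative" song is also "rock", but not vice versa,
# so the asymmetry suggests the hierarchy rock -> alternative.
songs = [{"rock", "alternative"}, {"rock", "alternative"}, {"rock"}, {"rock"}]
m = cooccurrence(songs)
print(m[("alternative", "rock")])  # 1.0
print(m[("rock", "alternative")])  # 0.5
```

A high P(parent | child) combined with a low P(child | parent), as in the toy data, is the pattern one would threshold on to infer subgenre relations automatically.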
  10. What does the data look like? https://www.youtube.com/watch?v=zlaz7aR7B44
  11. Subjectivity in music genre
     ● Classification tasks typically rely on an agreed answer for ground truth
     ● What should we do if we can’t find agreement between our ground-truth sources?
     ● What if different sources use the same label, but each source has a different definition?
     Reggae → Dub
     Electronic → Ambient Dub
     Electronic → World Fusion
     World, Dub, Fusion
  12. Sub-tasks
     ● Task 1: Build a separate system for each ground-truth dataset
     ● Task 2: Can we benefit from combining different ground truths into one system?
  13. Development and testing dataset split
     ● 4 development and 4 test datasets (70%–15% split, 15% kept for future editions)
     ● Album filter: all recordings from the same album stay in the same split
     ● Each label has at least 40 recordings from 6 release groups in the training dataset (20 from 3 for the test dataset)
     ● Development dataset statistics (table on slide)
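An album filter of this kind can be sketched as below. The function and field names are hypothetical; this only illustrates the idea of keeping all recordings of a release group in one partition, and is not the task's actual split code.

```python
import random
from collections import defaultdict

def album_filtered_split(recordings, dev_frac=0.70, test_frac=0.15, seed=0):
    """Split recordings so that all tracks from one release group (album)
    land in the same partition, mimicking an album filter. `recordings`
    maps recording id -> release-group id (illustrative field names).
    The remainder (~15%) is held out for future editions."""
    by_album = defaultdict(list)
    for rec_id, rg_id in recordings.items():
        by_album[rg_id].append(rec_id)
    albums = sorted(by_album)
    random.Random(seed).shuffle(albums)  # deterministic shuffle for the sketch
    n_dev = int(len(albums) * dev_frac)
    n_test = int(len(albums) * test_frac)
    dev = [r for a in albums[:n_dev] for r in by_album[a]]
    test = [r for a in albums[n_dev:n_dev + n_test] for r in by_album[a]]
    held_out = [r for a in albums[n_dev + n_test:] for r in by_album[a]]
    return dev, test, held_out
```

Splitting by album rather than by track prevents the classifier from scoring well merely by recognizing other tracks of the same release in the training data.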
  14. Results
     Submissions
     ● Participants from five teams
     ● Maximum of 5 submissions per subtask per team
       (5 submissions × 2 tasks × 4 datasets = 40 runs per team)
     ● 115 runs received in total
     Baselines
     ● Random baseline: follows the distribution of labels
     ● Popularity baseline: always predicts the most popular genre
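The two baselines on this slide can be sketched in a few lines; `make_baselines` and its inputs are illustrative names of my own, not code from the task.

```python
import random
from collections import Counter

def make_baselines(train_labels):
    """train_labels: one genre label per training annotation.
    Returns the two baseline predictors described on the slide."""
    counts = Counter(train_labels)
    labels = list(counts)
    weights = [counts[l] for l in labels]
    most_popular = counts.most_common(1)[0][0]

    def random_baseline():
        # sample a genre following the training label distribution
        return random.choices(labels, weights=weights, k=1)[0]

    def popularity_baseline():
        # always predict the single most popular genre
        return most_popular

    return random_baseline, popularity_baseline

train = ["rock"] * 50 + ["jazz"] * 30 + ["classical"] * 20
rnd, pop = make_baselines(train)
print(pop())  # rock
```

Baselines this simple are worth having because, on heavily unbalanced datasets, always predicting the majority genre can already look deceptively strong.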
  15. Methodologies
     ● Manual feature preselection
     ● Classifiers
       ○ Hierarchical (SVMs + extra trees)
       ○ Neural networks
       ○ Random forests
     ● Task 2 (combining datasets)
       ○ Genre similarity based on text string-matching distance, voting
       ○ Genre/subgenre similarity based on co-occurrence, conversion matrix, weighting
  16. Evaluation metrics
     Effectiveness: precision, recall and F-score
     ● Per recording, all labels (genres and subgenres)
     ● Per recording, only genres
     ● Per recording, only subgenres
     ● Per label, all recordings
     ● Per genre label, all recordings
     ● Per subgenre label, all recordings
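Per-recording precision, recall and F-score for multi-label predictions can be computed as below. This is a generic sketch of the standard set-based definitions, not the task's official evaluation code.

```python
def prf(predicted, actual):
    """Precision, recall and F-score for one recording,
    comparing the predicted and ground-truth label sets."""
    tp = len(predicted & actual)  # correctly predicted labels
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(actual) if actual else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy example: one correct genre, one wrong subgenre, two subgenres missed.
pred = {"electronic", "house"}
truth = {"electronic", "ambient", "dub"}
p, r, f = prf(pred, truth)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.5 0.33 0.4
```

Averaging these scores over recordings gives the per-recording metrics on the slide; transposing the loop (fixing a label and iterating over recordings) gives the per-label variants.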
  17. [Figure: per-track and per-label F-measure over all labels (genres and subgenres), comparing team systems such as JKU and DBIS against the popularity and random baselines]
  18. Results on genres vs. subgenres
  19. Conclusions: the task is challenging!
     ● Subgenre recognition is much more difficult: much room for improvement!
     ● Datasets are heavily unbalanced
     ● High recall but poor precision for many systems
     ● The AllMusic dataset is the most difficult
     ● Systems should exploit hierarchies more
     ● No significant improvement from combining genre sources yet
     Team results
     ● JKU consistently proposes the best systems across all datasets
     ● DBIS exploits hierarchies and is significantly better than the baselines
     ● KART, SAM-IRIT and ICSI are similar or close to the baselines
  20. Future directions
     ● AcousticBrainz is an ongoing experiment in collaborative extraction of music knowledge from audio
     ● MediaEval 2017 is our starting point
     ● Integrate promising systems into AcousticBrainz
     Next iteration of the AcousticBrainz Genre Task
     ● Exploit hierarchies to improve predictions at the subgenre level
     ● Better combination of multiple genre annotation sources (Task 2)
     ● New music features?
  21. Reproducibility
     ● Open development data
       ○ Music features computed by open-source software (Essentia)
       ○ Most genre annotations are open and gathered by open-source software (MetaDB)
     ● Open-source code for evaluation and baselines
     ● Open validation datasets (to be published after the workshop)
  22. Thank you!
  23. Differences between genre annotation sources
     Recording "Ambassel" by Dub Colossus
     ● Source 1: Electronic → Ambient Dub and Electronic → Downtempo
     ● Source 2: Electronic → Dub and Hip-Hop and Reggae → Dub
     ● Source 3: World → Worldfusion
     ● Source 4: World → African and World → Worldfusion
     Recording "Como Poden" by In Extremo
     ● Source 1: Pop/Rock → Heavy Metal
     ● Source 2: Rock → Folk Rock
     ● Source 3: Metal → Folk Metal
     ● Source 4: Rock/Pop → Folk Metal and Rock/Pop → Metal
