Mp25: Audio Fingerprinting and metadata correction with Python

  1. Audio fingerprinting and metadata correction with Python. Alastair Porter, November 21, 2011.
  2. Me. Background in Computer Science; Masters, McGill Music Tech. Online: http://github.com/alastair (20/28 music; 11 in python), http://twitter.com/alastairporter
  3. Python as a go-to language. Quick for prototyping; use the same code in a production release; very handy for API access (thin wrapper around urllib2).
  4. Music and Metadata
  5. Music and Metadata. The problem: people are really bad at naming music; inconsistent over releases. The solution: crowdsourcing; get info from as many trusted sources as possible; make renaming take no effort.
  6. MusicBrainz
  7. Amazon
  8. Amazon (cover art)
  9. Last.fm
  10. Last.fm (genre tags)
  11. MusicBrainz
  12. albumidentify: http://github.com/albumidentify/albumidentify
  13. MP3, FLAC, Ogg, CDs
  14. Identification strategy. If there’s a CD TOC, use that (MusicBrainz lookup); if no match, use audio fingerprinting; if no match, do a text lookup (artist/album).
  15. Fingerprinting. Converts an audio signal to a short sequence of numbers; smaller to compare than an entire file; uses perceptual features rather than byte comparison (works with different encodings).
  16. Identification strategy. Fingerprinting gives us a set of candidate tracks; a track could be on many albums (original release, best of, mix album); keep a list of what tracks we have for each album; once we fill all the slots for an album, success!
  17. Metadata strategy. Text information from MusicBrainz; genre from Last.fm; image from Amazon (or folder.jpg); MusicBrainz tells us where these are (don’t need to search); save in every file (text is cheap).
  18. Writing it all out. Custom MP3/ID3 writer; Ogg meta tags; FLAC meta tags; name files Artist/Artist - Year - Album/01 - Artist - Track; ReplayGain!; be a good citizen: submit fingerprints to MusicBrainz.
  19. What’s next. New version of MusicBrainz; new fingerprinter; more metadata; more metadata.
  20. Thanks. More information: MusicBrainz: http://musicbrainz.org; albumidentify: http://github.com/albumidentify/albumidentify; more fingerprinting: http://acoustid.org, http://echoprint.me; Last.fm
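
The two "Identification strategy" slides describe a fallback chain (CD TOC, then fingerprinting, then text lookup) and a slot-filling check (an album is identified once every track position has a matching file). A minimal Python sketch of both ideas follows. It is not albumidentify's actual code: the function names, the (album_id, position) candidate format, and the album_track_counts mapping are all assumptions made for illustration, and the lookups themselves are stand-ins rather than real MusicBrainz calls.

```python
from collections import defaultdict


def first_match(strategies, files):
    """Try each (name, lookup) pair in order; return the first non-empty result.

    `strategies` is a hypothetical list such as
    [("toc", toc_lookup), ("fingerprint", fp_lookup), ("text", text_lookup)].
    """
    for name, lookup in strategies:
        result = lookup(files)
        if result:
            return name, result
    return None


def identify_album(candidates_by_file, album_track_counts):
    """Return (album_id, {position: filename}) for the first album whose
    track slots are all filled, or None if no album is completed.

    `candidates_by_file` maps a filename to the (album_id, position) pairs
    a fingerprint lookup suggested for it (assumed format).
    """
    slots = defaultdict(dict)  # album_id -> {track position -> filename}
    for filename, candidates in candidates_by_file.items():
        for album_id, position in candidates:
            # Keep the first file that claims a slot; a real tool would
            # need a policy for conflicting claims.
            slots[album_id].setdefault(position, filename)
            if len(slots[album_id]) == album_track_counts[album_id]:
                return album_id, dict(slots[album_id])
    return None


if __name__ == "__main__":
    # Made-up data: each track also appears on a 12-track "best-of" album.
    candidates = {
        "a.mp3": [("original", 1), ("best-of", 3)],
        "b.mp3": [("original", 2)],
        "c.mp3": [("original", 3), ("best-of", 7)],
    }
    counts = {"original": 3, "best-of": 12}
    print(identify_album(candidates, counts))
```

Returning as soon as one album is complete keeps the sketch short; tracking every album in parallel, as the slide suggests, is what lets a track that appears on both an original release and a compilation resolve to the release the files actually cover.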
