Mp25: Audio Fingerprinting and metadata correction with Python

  • 1. Audio fingerprinting and metadata correction with Python
    Alastair Porter
    November 21, 2011
  • 2. Me
    Background in Computer Science
    Masters, McGill Music Tech
    Online (20/28 music; 11 in Python)
  • 3. Python as a go-to language
    Quick for prototyping
    Use the same code in a production release
    Very handy for API access (thin wrapper around urllib2)
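A "thin wrapper" for API access might look roughly like the sketch below. It is not code from the talk: the class name `ApiClient`, its methods, and the example host are hypothetical, and it uses Python 3's `urllib.request`, the successor to the `urllib2` module named on the slide.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


class ApiClient:
    """Hypothetical thin wrapper: build a URL, fetch it, decode JSON."""

    def __init__(self, base_url, fetch=None):
        self.base_url = base_url.rstrip("/")
        # `fetch` takes a URL and returns bytes. It can be swapped out
        # so the wrapper can be exercised without network access.
        self.fetch = fetch or self._fetch_over_http

    @staticmethod
    def _fetch_over_http(url):
        with urlopen(url, timeout=10) as response:
            return response.read()

    def url_for(self, path, **params):
        # Sort the parameters so the same query always yields the same URL.
        query = urlencode(sorted(params.items()))
        url = f"{self.base_url}/{path.lstrip('/')}"
        return f"{url}?{query}" if query else url

    def get(self, path, **params):
        return json.loads(self.fetch(self.url_for(path, **params)))


# Offline usage with a stubbed fetch function (the host is a placeholder).
client = ApiClient("https://api.example.invalid/v1/",
                   fetch=lambda url: b'{"ok": true}')
print(client.url_for("release", artist="Low", fmt="json"))
print(client.get("release", artist="Low"))
```

Keeping the fetch step injectable is one way to get the "same code in prototype and production" property the slide mentions: only the transport changes.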
  • 4. Music and Metadata
  • 5. Music and Metadata
    The problem: People are really bad at naming music
    Inconsistent over releases
    The solution: Crowdsourcing
    Get info from as many trusted sources as possible
    Make renaming take no effort
  • 6. MusicBrainz
  • 7. Amazon
  • 8. Amazon (Coverart)
  • 9.
  • 10. (Genre tags)
  • 11. MusicBrainz
  • 12. albumidentify
  • 13. MP3, FLAC, Ogg, CDs
  • 14. Identification strategy
    If there’s a CD TOC, use that (MusicBrainz lookup)
    If no match, use audio fingerprinting
    If no match, do a text lookup (artist/album)
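The three-step fallback on this slide can be sketched as an ordered list of lookups that stops at the first match. The lookup functions here are hypothetical stand-ins that return canned values; the real tool would query MusicBrainz and a fingerprint service instead.

```python
def identify(album, strategies):
    """Try each (name, lookup) pair in order; return the first match."""
    for name, lookup in strategies:
        match = lookup(album)
        if match is not None:
            return name, match
    return None, None


# Hypothetical stand-ins for the three lookups named on the slide.
def lookup_by_toc(album):
    # Only possible when a CD table of contents is available.
    return "release-from-toc" if album.get("toc") else None


def lookup_by_fingerprint(album):
    return "release-from-fingerprint" if album.get("fingerprints") else None


def lookup_by_text(album):
    has_text = album.get("artist") and album.get("title")
    return "release-from-text" if has_text else None


STRATEGIES = [
    ("toc", lookup_by_toc),
    ("fingerprint", lookup_by_fingerprint),
    ("text", lookup_by_text),
]

# A directory of MP3s has no TOC, so the fingerprint step is the first to match.
ripped_files = {"fingerprints": ["fp1", "fp2"], "artist": "A", "title": "B"}
print(identify(ripped_files, STRATEGIES))
```

Ordering the list from most to least reliable source means a cheaper, fuzzier lookup only runs when the stricter ones have failed.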
  • 15. Fingerprinting
    Converts an audio signal to a short sequence of numbers
    Smaller to compare than an entire file
    Perceptual features rather than byte comparison (works with different encodings)
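To illustrate why a perceptual fingerprint survives changes that break a byte comparison, here is a deliberately simplified toy. It is not the fingerprinter used by the talk's tools: it records only whether the energy rises from one frame to the next, and `toy_fingerprint` and its `frame_size` parameter are made up for this example.

```python
import math


def toy_fingerprint(samples, frame_size=256):
    """Return one bit per pair of adjacent frames: 1 if energy rises."""
    energies = [
        sum(s * s for s in samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]


def hamming_distance(a, b):
    """Count the positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


# A 440 Hz tone with a slow loudness envelope, sampled at 8 kHz.
RATE = 8000
samples = [
    math.sin(2 * math.pi * 440 * n / RATE)
    * (1 + 0.5 * math.sin(2 * math.pi * 3 * n / RATE))
    for n in range(8192)
]
# The same audio at half volume: every sample value differs.
quieter = [0.5 * s for s in samples]

original_fp = toy_fingerprint(samples)
quieter_fp = toy_fingerprint(quieter)
print(len(samples), "samples ->", len(original_fp), "bits")
print("samples identical:", samples == quieter)
print("fingerprint distance:", hamming_distance(original_fp, quieter_fp))
```

Real fingerprinters use far richer features (typically spectral ones) and tolerate a small nonzero distance, but the principle is the same: compare compact summaries of how the audio sounds, not the stored bytes.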
  • 16. Identification strategy
    Fingerprinting gives us a set of candidate tracks
    A track could be on many albums (original release, best of, mix album)
    Keep a list of what tracks we have for each album
    Once we fill all the slots for an album, success!
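The slot-filling idea on this slide can be sketched as follows. The data shapes are assumptions made for illustration: each file maps to a list of `(album_id, track_number)` candidates, and `album_track_counts` gives the number of tracks on each candidate album.

```python
from collections import defaultdict


def match_album(file_candidates, album_track_counts):
    """Return the first album whose track slots are all filled.

    file_candidates: {filename: [(album_id, track_number), ...]}
    album_track_counts: {album_id: number_of_tracks}
    """
    slots = defaultdict(dict)  # album_id -> {track_number: filename}
    for filename, candidates in file_candidates.items():
        for album_id, track_number in candidates:
            # Keep the first file seen for each slot.
            slots[album_id].setdefault(track_number, filename)
            if len(slots[album_id]) == album_track_counts[album_id]:
                return album_id, dict(slots[album_id])
    return None, {}


# The first file also appears on a 12-track compilation, but only the
# 2-track original release ends up with every slot filled.
candidates = {
    "a.mp3": [("original", 1), ("best_of", 7)],
    "b.mp3": [("original", 2)],
}
track_counts = {"original": 2, "best_of": 12}
print(match_album(candidates, track_counts))
```

Tracking every candidate album in parallel is what resolves the ambiguity: a single track cannot decide between the original release and a compilation, but the full set of files usually can.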
  • 17. Metadata strategy
    Text information from MusicBrainz
    Genre from
    Image from Amazon (or folder.jpg)
    MusicBrainz tells us where these are (don’t need to search)
    Save in every file (text is cheap)
  • 18. Writing it all out
    Custom MP3/ID3 writer
    Ogg meta tags
    FLAC meta tags
    Name files: Artist/Artist - Year - Album/01 - Artist - Track
    ReplayGain!
    Be a good citizen: Submit fingerprints to MusicBrainz
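The naming scheme on this slide can be sketched as a small path builder. The function name `target_path`, the file extension argument, and the choice to replace "/" with "-" in names are assumptions for this example; the slide only gives the pattern itself.

```python
from pathlib import PurePosixPath


def target_path(artist, year, album, track_number, title, extension):
    """Build Artist/Artist - Year - Album/NN - Artist - Track.ext."""

    def clean(text):
        # A "/" inside a name would otherwise create an extra directory.
        return text.replace("/", "-").strip()

    artist, album, title = clean(artist), clean(album), clean(title)
    return PurePosixPath(
        artist,
        f"{artist} - {year} - {album}",
        f"{track_number:02d} - {artist} - {title}.{extension}",
    )


print(target_path("AC/DC", 1980, "Back in Black", 1, "Hells Bells", "mp3"))
```

Zero-padding the track number keeps files in playing order when a directory is sorted by name.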
  • 19. What’s next
    New version of MusicBrainz
    New fingerprinter
    More metadata
    More metadata
  • 20. Thanks
    More information:
    MusicBrainz:
    albumidentify:
    More fingerprinting: