Music and Metadata
  The problem: People are really bad at naming music
    Naming is inconsistent across releases
  The solution: Crowdsourcing
    Get info from as many trusted sources as possible
    Make renaming take no effort
Identification strategy
  If there's a CD TOC, use that (MusicBrainz lookup)
  If no match, use audio fingerprinting
  If no match, do a text lookup (artist/album)
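The fallback order above can be sketched as a chain of lookups tried in turn. This is only an illustration: `identify`, the three stub lookups, and the release IDs are hypothetical names, not albumidentify's actual API.

```python
from typing import Callable, Optional, Sequence, Tuple

# A lookup takes an album directory and returns a release ID, or None on no match.
Lookup = Callable[[str], Optional[str]]


def identify(album_dir: str,
             strategies: Sequence[Tuple[str, Lookup]]) -> Optional[Tuple[str, str]]:
    """Try each strategy in order; return (strategy name, release ID) for the first hit."""
    for name, lookup in strategies:
        release_id = lookup(album_dir)
        if release_id is not None:
            return name, release_id
    return None


# Hypothetical stubs standing in for the real services.
def toc_lookup(album_dir: str) -> Optional[str]:
    return None  # pretend there is no CD TOC for this directory


def fingerprint_lookup(album_dir: str) -> Optional[str]:
    return "release-123"  # pretend fingerprinting found a match


def text_lookup(album_dir: str) -> Optional[str]:
    return "release-456"  # never reached in this example


STRATEGIES = [
    ("toc", toc_lookup),
    ("fingerprint", fingerprint_lookup),
    ("text", text_lookup),
]

result = identify("/music/unknown", STRATEGIES)
```

Keeping the strategies in an ordered list makes the priority explicit: cheaper and more reliable lookups come first, and later ones only run when earlier ones return nothing.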
Fingerprinting
  Converts an audio signal to a short sequence of numbers
  Smaller to compare than an entire file
  Perceptual features rather than byte comparison (works with different encodings)
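To show why a short sequence of numbers is convenient to compare, here is a minimal sketch that assumes a fingerprint is a list of 32-bit integers and measures similarity as the fraction of differing bits. Both the representation and the metric are assumptions for illustration; the fingerprinter used in the talk may work differently.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits over the overlapping part of two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        raise ValueError("both fingerprints must be non-empty")
    # XOR marks the bits that differ; count them item by item.
    differing = sum(bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in zip(fp_a, fp_b))
    return differing / (32 * n)


# Made-up fingerprints, not output from a real fingerprinter.
original = [0x1234ABCD, 0x0F0F0F0F, 0xDEADBEEF]
reencoded = [0x1234ABCD, 0x0F0F0F0E, 0xDEADBEEF]  # one bit flipped
unrelated = [0xEDCB5432, 0xF0F0F0F0, 0x21524110]  # bitwise complement of original
```

A re-encoded copy of the same recording would be expected to score close to 0, while a different recording scores much higher, so a threshold on this value can stand in for "same track".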
Identification strategy
  Fingerprinting gives us a set of candidate tracks
  A track could be on many albums (original release, best of, mix album)
  Keep a list of what tracks we have for each album
  Once we fill all the slots for an album, success!
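The slot-filling idea can be sketched as follows. The function name, the input shapes, and the album IDs are hypothetical; the sketch assumes each file's candidates arrive as (album ID, track number) pairs and that the track count of each album is known.

```python
from collections import defaultdict


def first_complete_album(file_candidates, album_track_counts):
    """Return (album_id, {track_number: filename}) for the first album whose
    slots are all filled, or None if no album is completed.

    file_candidates: iterable of (filename, [(album_id, track_number), ...])
    album_track_counts: dict mapping album_id to its total number of tracks
    """
    slots = defaultdict(dict)  # album_id -> {track_number: filename}
    for filename, candidates in file_candidates:
        for album_id, track_number in candidates:
            # Keep the first file seen for a slot; later duplicates are ignored.
            slots[album_id].setdefault(track_number, filename)
            if len(slots[album_id]) == album_track_counts.get(album_id):
                return album_id, dict(slots[album_id])
    return None


files = [
    ("01.mp3", [("original", 1), ("best-of", 7)]),
    ("02.mp3", [("original", 2)]),
    ("03.mp3", [("original", 3), ("best-of", 2)]),
]
track_counts = {"original": 3, "best-of": 12}

match = first_complete_album(files, track_counts)
```

Here two of the files also appear on a twelve-track best-of, but only the three-track original release has every slot filled, so that is the album reported.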
Metadata strategy
  Text information from MusicBrainz
  Genre from Last.fm
  Image from Amazon (or folder.jpg)
  MusicBrainz tells us where these are (don't need to search)
  Save in every file (text is cheap)
Writing it all out
  Custom MP3/ID3 writer
  Ogg meta tags
  FLAC meta tags
  Name files: Artist/Artist - Year - Album/01 - Artist - Track
  ReplayGain!
  Be a good citizen: Submit fingerprints to MusicBrainz
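The naming scheme above can be sketched as a small path builder. The function names, the file extension argument, and the replacement of "/" inside names are assumptions added for illustration; the slide only gives the pattern itself.

```python
from pathlib import PurePosixPath


def safe(name: str) -> str:
    """Replace path separators so a name cannot create extra directories (an assumption)."""
    return name.replace("/", "_").strip()


def track_path(artist, year, album, track_number, title, extension):
    """Build Artist/Artist - Year - Album/NN - Artist - Track.ext as a relative path."""
    album_dir = f"{safe(artist)} - {year} - {safe(album)}"
    filename = f"{track_number:02d} - {safe(artist)} - {safe(title)}.{extension}"
    return PurePosixPath(safe(artist)) / album_dir / filename


# Placeholder metadata, not a real release.
path = track_path("Some Artist", 2009, "Some Album", 1, "First Track", "flac")
```

Zero-padding the track number keeps files in album order when a directory is sorted by name.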
What's next
  New version of MusicBrainz
  New fingerprinter
  More metadata
Thanks
  More information:
    MusicBrainz: http://musicbrainz.org
    albumidentify: http://github.com/albumidentify/albumidentify
    More fingerprinting: http://acoustid.org, http://echoprint.me
    Last.fm