Audio fingerprinting and metadata
     correction with Python

           Alastair Porter


         November 21, 2011
Me

     Background in Computer Science
     Master's at McGill (Music Technology)
     Online
         http://github.com/alastair (20/28 music; 11 in Python)
         http://twitter.com/alastairporter
Python as a go-to language

     Quick for prototyping
     Use the same code in a production release
     Very handy for API access (thin wrapper around urllib2)
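A minimal sketch of what a "thin wrapper" for API access can look like. It is written for Python 3, where `urllib.request` takes over the role of `urllib2`. The `ApiClient` class, the `api.example.org` host and the `release` path are hypothetical, not taken from any service named in this talk.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


class ApiClient:
    """Hypothetical thin wrapper: build a URL, fetch it, decode JSON."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def build_url(self, path, **params):
        # Sort the parameters so the same call always yields the same URL.
        query = urlencode(sorted(params.items()))
        url = f"{self.base_url}/{path.lstrip('/')}"
        return f"{url}?{query}" if query else url

    def get(self, path, **params):
        # Performs the actual HTTP request (not called in this sketch).
        with urlopen(self.build_url(path, **params)) as response:
            return json.load(response)


client = ApiClient("https://api.example.org/")
url = client.build_url("release", query="some album", limit=1)
print(url)
```

Keeping URL construction separate from the network call makes the wrapper easy to check without touching the network.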
Music and Metadata

  The problem:
      People are really bad at naming music
      Inconsistent over releases


  The solution:
      Crowdsourcing
      Get info from as many trusted sources as possible
      Make renaming take no effort
MusicBrainz
Amazon
Amazon (Coverart)
Last.fm
Last.fm (Genre tags)
MusicBrainz
albumidentify




  http://github.com/albumidentify/albumidentify
MP3, FLAC, Ogg, CDs
Identification strategy

      If there’s a CD TOC, use that (MusicBrainz lookup)
      If no match, use audio fingerprinting
      If no match, do a text lookup (artist/album)
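The fallback order above can be sketched as a chain of lookups that stops at the first match. The three lookup functions here are hypothetical stand-ins that return canned values; they are not albumidentify's actual functions.

```python
from typing import Callable, Optional, Sequence

# A strategy takes an album directory and returns a release ID, or None.
Strategy = Callable[[str], Optional[str]]


def identify(album_dir: str, strategies: Sequence[Strategy]) -> Optional[str]:
    """Try each strategy in order and return the first match."""
    for strategy in strategies:
        release_id = strategy(album_dir)
        if release_id is not None:
            return release_id
    return None


# Hypothetical stand-ins for the three lookups on the slide.
def lookup_by_toc(album_dir):
    return None  # pretend there is no CD TOC


def lookup_by_fingerprint(album_dir):
    return "release-from-fingerprint"


def lookup_by_text(album_dir):
    return "release-from-text"


result = identify(
    "some/album",
    [lookup_by_toc, lookup_by_fingerprint, lookup_by_text],
)
print(result)
```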
Fingerprinting

     Converts an audio signal to a short sequence of numbers
     Smaller to compare than an entire file
     Perceptual features rather than byte comparison (works
     with different encodings)
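To illustrate why a sequence of numbers is easier to compare than whole files, here is a toy comparison. It assumes a fingerprint is a list of non-negative 32-bit integers and scores two fingerprints by the fraction of bits that differ. This is an assumption for illustration only; real fingerprinters also have to align the two sequences in time, and the values below are made up.

```python
def bit_error_rate(fp_a, fp_b, bits=32):
    """Fraction of differing bits over the overlapping part of two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        raise ValueError("fingerprints must not be empty")
    # XOR leaves a 1 wherever the two values disagree; count those bits.
    differing = sum(bin(a ^ b).count("1") for a, b in zip(fp_a, fp_b))
    return differing / (n * bits)


# Made-up values: the "re-encoded" copy differs from the original in one bit.
original = [0b1010, 0b1111, 0b0000]
reencoded = [0b1010, 0b1110, 0b0000]
unrelated = [0xFFFFFFFF, 0x00000000, 0xFFFFFFFF]

print(bit_error_rate(original, reencoded))
print(bit_error_rate(original, unrelated))
```

A low rate suggests the same recording in a different encoding, whereas a byte-for-byte comparison of the two files would simply fail.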
Identification strategy

      Fingerprinting gives us a set of candidate tracks
      A track could be on many albums (original release, best of,
      mix album)
      Keep a list of what tracks we have for each album
      Once we fill all the slots for an album, success!
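The slot-filling idea can be sketched as follows. The data layout (a dict of candidate `(album_id, track_number)` pairs per file, and a dict of track counts per album) is an assumption made for this example, not albumidentify's internal structure.

```python
from collections import defaultdict


def find_complete_album(file_candidates, album_track_counts):
    """Return (album_id, {track_number: filename}) for the first album
    whose slots are all filled, or None if no album is complete.

    file_candidates: {filename: [(album_id, track_number), ...]}
    album_track_counts: {album_id: number_of_tracks}
    """
    slots = defaultdict(dict)  # album_id -> {track_number: filename}
    for filename, candidates in file_candidates.items():
        for album_id, track_number in candidates:
            # Keep the first file seen for each slot.
            slots[album_id].setdefault(track_number, filename)
    for album_id, filled in slots.items():
        if len(filled) == album_track_counts.get(album_id):
            return album_id, filled
    return None


# "a.mp3" matches a track that is on both the original release and a best-of.
file_candidates = {
    "a.mp3": [("original", 1), ("best-of", 7)],
    "b.mp3": [("original", 2)],
}
album_track_counts = {"original": 2, "best-of": 12}

match = find_complete_album(file_candidates, album_track_counts)
print(match)
```

Here the best-of album has only one of its twelve slots filled, so the two-track original release is the one reported.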
Metadata strategy

     Text information from MusicBrainz
     Genre from Last.fm
     Image from Amazon (or folder.jpg)
     MusicBrainz tells us where these are (don’t need to search)
     Save in every file (Text is cheap)
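A minimal sketch of combining the sources into one set of tags and writing the same set to every file. The function name, the tag keys and the sample values are hypothetical; the only rules taken from the slide are the source of each field and the fallback to folder.jpg.

```python
def build_tags(musicbrainz_text, lastfm_genre=None, amazon_image=None, local_image=None):
    """Merge metadata: text fields, then genre, then cover art with a local fallback."""
    tags = dict(musicbrainz_text)
    if lastfm_genre:
        tags["genre"] = lastfm_genre
    image = amazon_image or local_image  # fall back to folder.jpg if there is no Amazon image
    if image:
        tags["cover"] = image
    return tags


album_text = {"artist": "An Artist", "album": "An Album", "year": 2011}
tags = build_tags(album_text, lastfm_genre="rock", local_image="folder.jpg")

# Text is cheap, so every file gets the full set of tags.
tagged_files = {name: dict(tags) for name in ["01.mp3", "02.mp3"]}
print(tagged_files["01.mp3"])
```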
Writing it all out

      Custom MP3/ID3 writer
      Ogg meta tags
      FLAC meta tags
      Name files
          Artist/Artist - Year - Album/01 - Artist - Track
      ReplayGain!
      Be a good citizen: submit fingerprints to MusicBrainz
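The naming scheme above can be sketched as a small path builder. The `safe` rule for replacing awkward characters is an assumption of this example, and the artist, album and track values are placeholders.

```python
import re


def safe(part):
    """Replace characters that are awkward in file names (an assumed rule)."""
    return re.sub(r'[\\/:*?"<>|]', "_", part).strip()


def target_path(artist, year, album, track_number, title, extension):
    """Build Artist/Artist - Year - Album/01 - Artist - Track.ext"""
    artist = safe(artist)
    album_dir = f"{artist} - {year} - {safe(album)}"
    file_name = f"{track_number:02d} - {artist} - {safe(title)}.{extension}"
    return "/".join([artist, album_dir, file_name])


path = target_path("An/Artist", 2011, "An Album", 1, "A Track", "mp3")
print(path)
```

Zero-padding the track number keeps files in album order when a directory is sorted by name.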
What’s next

     New version of MusicBrainz
     New fingerprinter
     More metadata
     More metadata
Thanks

  More information:
      MusicBrainz: http://musicbrainz.org
      albumidentify:
      http://github.com/albumidentify/albumidentify
      More fingerprinting: http://acoustid.org,
      http://echoprint.me
      Last.fm

Mp25: Audio Fingerprinting and metadata correction with Python
