Introduction to Audio Content Analysis
module 11.0: audio fingerprinting
alexander lerch
overview intro requirements approaches philips shazam summary
introduction
overview
corresponding textbook section
chapter 11
lecture content
• introduction to audio fingerprinting
• in-depth example for fingerprint extraction and retrieval
learning objectives
• discuss goals and limitations of audio fingerprinting systems as compared to
watermarking or cover song detection systems
• describe the processing steps of the Philips fingerprinting system
module 11.0: audio fingerprinting 1 / 14
audio fingerprinting
introduction
objective:
• represent a recording with a compact and unique digest
(→ fingerprint, perceptual hash)
• allow quick matching between previously stored fingerprints and an extracted
fingerprint
applications:
• broadcast monitoring: automate verification for royalties/infringement claims
• value-added services: offer information and metadata
fingerprinting vs. watermarking
fingerprinting:
• identifies recording (but not musical content)
watermarking:
• embeds perceptually “unnoticeable” data block in the audio
• identifies instance of recording
Property                        Fingerprinting  Watermarking
Allows Legacy Content Indexing  +               –
Allows Embedded (Meta) Data     –               +
Leaves Signal Unchanged         +               –
Identification of               Recording       User or Interaction
fingerprint requirements
• accuracy & reliability: minimize false negatives/positives
• robustness & security: robust against distortions and attacks
• granularity: quick identification in a real-time context
• versatility: independent of file format, etc.
• scalability: good database performance
• complexity: implementation possible on embedded devices
general fingerprinting system
database creation: collection of recordings (with recording IDs) → fingerprint extraction → database
identification: unlabeled recording → fingerprint extraction → find match (in database) → ID
brainstorm
How does it work? MD5?
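One way to approach the MD5 question is to try it. The sketch below uses only the Python standard library; the 440 Hz test tone and the one-step change of a single sample are made-up test data. A cryptographic hash is designed so that any change of the input changes the digest completely, which conflicts with the robustness requirement listed earlier: a distorted copy of a recording should still map to (nearly) the same fingerprint.

```python
import hashlib
import math
import struct

def pcm16_bytes(samples):
    """Pack floats in [-1, 1] as little-endian 16-bit PCM."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)

# one second of a 440 Hz tone at 8 kHz (made-up test signal)
fs = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
pcm_original = pcm16_bytes(tone)

# "distorted" copy: a single sample is changed by one quantization step
ints = list(struct.unpack("<%dh" % fs, pcm_original))
ints[1234] += 1
pcm_distorted = struct.pack("<%dh" % fs, *ints)

md5_original = hashlib.md5(pcm_original).hexdigest()
md5_distorted = hashlib.md5(pcm_distorted).hexdigest()

# count how many of the 128 digest bits differ
bit_diff = bin(int(md5_original, 16) ^ int(md5_distorted, 16)).count("1")
print(md5_original)
print(md5_distorted)
print("differing digest bits:", bit_diff)
```

The two digests are unrelated although the signals are perceptually identical, so an exact lookup of the MD5 value would fail for any re-encoded or otherwise distorted recording.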
system example: philips extraction 1/3
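Audio Signal → STFT → Power Spectrum → Filterbank → Frequency Filtering → Temporal Filtering → Quantization
1 pre-processing: downmixing & downsampling (5 kHz)
2 STFT: K = 2048, overlap 31/32
3 log frequency bands: 33 bands from 300–2000 Hz
4 freq derivative: difference between adjacent bands (33 bands → 32 values)
5 time derivative: difference between subsequent blocks (32 values)
6 quantization:
\[
v_\mathrm{FP}(k, n) =
\begin{cases}
1 & \text{if } \Delta E(k, n) - \Delta E(k, n - 1) > 0 \\
0 & \text{otherwise}
\end{cases}
\]
with ∆E(k, n) the frequency derivative (step 4) of band k in block n
⇒ 32 bit subfingerprint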
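The extraction steps 2 to 6 can be sketched in plain Python as follows. This is a minimal illustration under several assumptions that are not stated on the slide: the input is already mono at 5 kHz (step 1 is skipped), a Hann window is used, band energies are sums of power spectrum bins, the frequency derivative is taken as E(k, n) − E(k+1, n), and the first band maps to the most significant bit. The function names and the test signal are hypothetical.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def band_edges(f_lo=300.0, f_hi=2000.0, num_bands=33):
    """Logarithmically spaced band edges in Hz (num_bands + 1 values)."""
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (i / num_bands) for i in range(num_bands + 1)]

def band_energies(block, fs, edges):
    """Power spectrum of one Hann-windowed block, summed per band."""
    K = len(block)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / K) for i in range(K)]
    spectrum = fft([w * s for w, s in zip(window, block)])
    power = [abs(c) ** 2 for c in spectrum[: K // 2 + 1]]
    energies = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        k_lo = math.ceil(lo * K / fs)
        k_hi = math.ceil(hi * K / fs)  # exclusive upper bin
        energies.append(sum(power[k_lo:k_hi]))
    return energies

def subfingerprints(x, fs=5000, K=2048, hop=64):
    """One 32-bit integer per block; hop = K / 32 gives an overlap of 31/32.

    The first block only initializes the time derivative.
    """
    edges = band_edges()
    result, prev_delta = [], None
    for start in range(0, len(x) - K + 1, hop):
        E = band_energies(x[start:start + K], fs, edges)
        # frequency derivative: 33 band energies -> 32 differences
        delta = [E[k] - E[k + 1] for k in range(len(E) - 1)]
        if prev_delta is not None:
            bits = 0
            # time derivative and sign quantization, one bit per band
            for d, d_prev in zip(delta, prev_delta):
                bits = (bits << 1) | (1 if d - d_prev > 0 else 0)
            result.append(bits)
        prev_delta = delta
    return result

# made-up test signal: 0.6 s sweep at fs = 5 kHz
fs = 5000
x = [math.sin(2 * math.pi * (400 + 600 * n / fs) * n / fs) for n in range(3000)]
fps = subfingerprints(x, fs)
print(len(fps), [format(v, "032b") for v in fps[:2]])
```

Each returned integer is one 32 bit subfingerprint; a production implementation would use an optimized FFT library instead of the recursive one shown here.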
system example: philips extraction 2/3
fingerprint
• 256 subsequent subfingerprints ⇒
  • length: 3 s
  • size: 256 · 4 Byte = 1 kByte
example:
• 5 min song:
  \[ 1\,\mathrm{kByte} \cdot \frac{5 \cdot 60\,\mathrm{s}}{3\,\mathrm{s}} = 100\,\mathrm{kByte} \]
• database with 1 million songs (avg. length 5 min):
  \[ 10^6 \cdot 256 \cdot \frac{5 \cdot 60\,\mathrm{s}}{3\,\mathrm{s}} = 25.6 \cdot 10^9 \text{ subfingerprints} \]
  ⇒ 100 GByte storage
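The storage numbers above can be re-computed in a few lines. The constant names are made up for this sketch; the values are the ones given on the slide.

```python
BYTES_PER_SUBFP = 4        # one subfingerprint has 32 bit
SUBFPS_PER_FP = 256        # subfingerprints per fingerprint
FP_LENGTH_S = 3            # seconds covered by one fingerprint
SONG_LENGTH_S = 5 * 60     # average song length
NUM_SONGS = 10 ** 6

fp_bytes = SUBFPS_PER_FP * BYTES_PER_SUBFP                # 1024 Byte = 1 kByte
fps_per_song = SONG_LENGTH_S // FP_LENGTH_S               # 100 fingerprints
song_bytes = fp_bytes * fps_per_song                      # 102400 Byte = 100 kByte
total_subfps = NUM_SONGS * SUBFPS_PER_FP * fps_per_song   # 25.6 * 10^9
total_bytes = total_subfps * BYTES_PER_SUBFP              # 102.4 * 10^9 Byte

print(fp_bytes, song_bytes, total_subfps, total_bytes)
```

The total of 102.4 · 10^9 Byte is roughly the 100 GByte stated above.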
system example: philips extraction 3/3
figure (panels: original, low quality encoding)
matlab source: plotFingerprint.m
system example: philips identification 1/3
database
• contains all subfingerprints for all songs
• previous example database: 25 billion subfingerprints
problem
• how to identify fingerprint efficiently?
system example: philips identification 2/3
simple system:
1 create lookup table with all possible subfingerprints ($2^{32}$) pointing to occurrences
2 assume at least one of the extracted 256 subfingerprints is error-free
⇒ only entries listed at 256 positions of the table have to be checked
3 compute Hamming distance between extracted fingerprint and candidates
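The three steps of the simple system can be sketched as follows. This is an illustration under assumptions, not the original implementation: a Python dictionary stands in for the full table with 2^32 entries, the database is random toy data, and the acceptance threshold of 0.35 for the bit error rate is an assumed parameter that does not appear on the slide. All function names are hypothetical.

```python
import random
from collections import defaultdict

def hamming(a, b):
    """Number of differing bits between two equally long lists of 32-bit ints."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def build_index(database):
    """Step 1: map each subfingerprint value to its (song_id, position) occurrences."""
    index = defaultdict(list)
    for song_id, subfps in database.items():
        for pos, value in enumerate(subfps):
            index[value].append((song_id, pos))
    return index

def identify(query, database, index, max_bit_error_rate=0.35):
    """Steps 2 and 3: look up every query subfingerprint, then compare whole fingerprints.

    Returns (song_id, offset, bit_error_rate) of the best candidate or None.
    """
    best = None
    for q_pos, value in enumerate(query):
        # only exact (error-free) subfingerprint matches produce candidates
        for song_id, pos in index.get(value, []):
            start = pos - q_pos
            ref = database[song_id][start:start + len(query)] if start >= 0 else []
            if len(ref) != len(query):
                continue
            ber = hamming(query, ref) / (32 * len(query))
            if best is None or ber < best[2]:
                best = (song_id, start, ber)
    if best is not None and best[2] <= max_bit_error_rate:
        return best
    return None

# toy database with random subfingerprints (made-up data)
rng = random.Random(0)
database = {name: [rng.getrandbits(32) for _ in range(1000)]
            for name in ("song_a", "song_b")}
index = build_index(database)

# query: 256 subfingerprints from song_b, every other one with a single bit error
query = list(database["song_b"][300:556])
for i in range(0, 256, 2):
    query[i] ^= 1 << (i % 32)

result = identify(query, database, index)
print(result)
```

Half of the query subfingerprints are corrupted and produce no useful candidates, but the remaining error-free ones point to the correct position, where the Hamming distance over the full fingerprint confirms the match.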
system example: philips identification 3/3
variant 1:
• allow one bit error
⇒ workload increase by factor ≈ 33
variant 2:
• introduce concept of bit error probability into fingerprint extraction
▶ small energy difference → high error probability
▶ large energy difference → low error probability
• rank bits per subfingerprint by error probability and check only for bit errors at likely
positions
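Both variants change which table entries are looked up for one extracted subfingerprint. The sketch below is an illustration with hypothetical helper names. For variant 2 it assumes that the energy difference behind each bit is kept during extraction, that `energy_diffs[b]` belongs to bit position b (b = 0 being the least significant bit), and that only the three least reliable bits are toggled; none of these details come from the slide.

```python
from itertools import combinations

def one_bit_error_keys(value):
    """Variant 1: the subfingerprint itself plus all 32 single-bit flips."""
    return [value] + [value ^ (1 << b) for b in range(32)]

def likely_error_keys(value, energy_diffs, num_weak_bits=3):
    """Variant 2: toggle only the bits with the smallest |energy difference|."""
    # rank bit positions by reliability: small magnitude = likely bit error
    weak = sorted(range(32), key=lambda b: abs(energy_diffs[b]))[:num_weak_bits]
    keys = []
    for r in range(num_weak_bits + 1):
        for flips in combinations(weak, r):
            key = value
            for b in flips:
                key ^= 1 << b
            keys.append(key)
    return keys

subfp = 0x12345678  # arbitrary example value

keys_v1 = one_bit_error_keys(subfp)

# made-up energy differences: bits 3, 17, and 30 are close to the decision threshold
diffs = [1.0] * 32
diffs[3], diffs[17], diffs[30] = 0.01, -0.02, 0.05
keys_v2 = likely_error_keys(subfp, diffs)

print(len(keys_v1), len(keys_v2))
```

Variant 1 checks 1 + 32 = 33 table entries per subfingerprint, which is where the workload factor of about 33 comes from. Variant 2 with three weak bits checks 2^3 = 8 entries and also covers multiple bit errors at the unreliable positions.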
other systems: shazam
plot from [1]
[1] A. Wang, “An Industrial Strength Audio Search Algorithm,” in Proceedings of the 4th International Conference on Music Information Retrieval (ISMIR), Washington, 2003.
summary
lecture content
audio fingerprinting
• represent recording with compact, robust, and unique fingerprint
• focus on (perceptual) audio representation rather than “musical” content
• allow efficient matching of this fingerprint with database
often confused with other tasks
1 audio watermarking
2 cover song detection
module 11.0: audio fingerprinting 14 / 14

More Related Content

Similar to 11-00-ACA-Fingerprinting.pdf

Embedded Android : System Development - Part III (Audio / Video HAL)
Embedded Android : System Development - Part III (Audio / Video HAL)Embedded Android : System Development - Part III (Audio / Video HAL)
Embedded Android : System Development - Part III (Audio / Video HAL)
Emertxe Information Technologies Pvt Ltd
 
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Maksim Shudrak
 
Digital Media Production - Future Internet
Digital Media Production - Future InternetDigital Media Production - Future Internet
Digital Media Production - Future Internet
Maarten Verwaest
 
The role of robotic innovation in ore characterization, mineralogy and geomet...
The role of robotic innovation in ore characterization, mineralogy and geomet...The role of robotic innovation in ore characterization, mineralogy and geomet...
The role of robotic innovation in ore characterization, mineralogy and geomet...
Mining On Top
 
Advanced Topics in IP Multicast Deployment
Advanced Topics in IP Multicast DeploymentAdvanced Topics in IP Multicast Deployment
Advanced Topics in IP Multicast Deployment
Arrive Technologies, Inc.
 
Track 3 session 8 - st dev con 2016 - music and voice over ble
Track 3   session 8 - st dev con 2016 - music and voice over bleTrack 3   session 8 - st dev con 2016 - music and voice over ble
Track 3 session 8 - st dev con 2016 - music and voice over ble
ST_World
 
1 Vampir Overview
1 Vampir Overview1 Vampir Overview
1 Vampir Overview
PTIHPA
 
VoIP Security 101 what you need to know
VoIP Security 101   what you need to knowVoIP Security 101   what you need to know
VoIP Security 101 what you need to know
Eric Klein
 
A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)
A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)
A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)
James Clause
 
NetFlow Monitoring for Cyber Threat Defense
NetFlow Monitoring for Cyber Threat DefenseNetFlow Monitoring for Cyber Threat Defense
NetFlow Monitoring for Cyber Threat Defense
Cisco Canada
 
Embedded Android : System Development - Part IV
Embedded Android : System Development - Part IVEmbedded Android : System Development - Part IV
Embedded Android : System Development - Part IV
Emertxe Information Technologies Pvt Ltd
 
practicing what you never preached: sorting and discarding from a practical ...
practicing what you never preached:  sorting and discarding from a practical ...practicing what you never preached:  sorting and discarding from a practical ...
practicing what you never preached: sorting and discarding from a practical ...
FIAT/IFTA
 
Wireless Local area network issues all perfect wireless engineering
Wireless Local area network issues all perfect wireless engineeringWireless Local area network issues all perfect wireless engineering
Wireless Local area network issues all perfect wireless engineering
testerontime
 
Developing Applications Using Host Processing Instead of DSPs
Developing Applications Using Host Processing Instead of DSPsDeveloping Applications Using Host Processing Instead of DSPs
Developing Applications Using Host Processing Instead of DSPs
Videoguy
 
Why choose pan
Why choose panWhy choose pan
Why choose pan
Achmad Yudo
 
IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19
IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19
IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19
Lisa Laxton
 
Packet-to-Packet Applications
Packet-to-Packet ApplicationsPacket-to-Packet Applications
Packet-to-Packet Applications
Videoguy
 
Surf Communication Solutions - Packet To Packet Apps
Surf Communication Solutions - Packet To Packet AppsSurf Communication Solutions - Packet To Packet Apps
Surf Communication Solutions - Packet To Packet Apps
Surf Communication Solutions, Ltd.
 
Sound Recording Glossary
Sound Recording GlossarySound Recording Glossary
Sound Recording Glossary
AidenKelly
 
Video Traffic Management
Video Traffic ManagementVideo Traffic Management
Video Traffic Management
Shenick Network Systems
 

Similar to 11-00-ACA-Fingerprinting.pdf (20)

Embedded Android : System Development - Part III (Audio / Video HAL)
Embedded Android : System Development - Part III (Audio / Video HAL)Embedded Android : System Development - Part III (Audio / Video HAL)
Embedded Android : System Development - Part III (Audio / Video HAL)
 
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
 
Digital Media Production - Future Internet
Digital Media Production - Future InternetDigital Media Production - Future Internet
Digital Media Production - Future Internet
 
The role of robotic innovation in ore characterization, mineralogy and geomet...
The role of robotic innovation in ore characterization, mineralogy and geomet...The role of robotic innovation in ore characterization, mineralogy and geomet...
The role of robotic innovation in ore characterization, mineralogy and geomet...
 
Advanced Topics in IP Multicast Deployment
Advanced Topics in IP Multicast DeploymentAdvanced Topics in IP Multicast Deployment
Advanced Topics in IP Multicast Deployment
 
Track 3 session 8 - st dev con 2016 - music and voice over ble
Track 3   session 8 - st dev con 2016 - music and voice over bleTrack 3   session 8 - st dev con 2016 - music and voice over ble
Track 3 session 8 - st dev con 2016 - music and voice over ble
 
1 Vampir Overview
1 Vampir Overview1 Vampir Overview
1 Vampir Overview
 
VoIP Security 101 what you need to know
VoIP Security 101   what you need to knowVoIP Security 101   what you need to know
VoIP Security 101 what you need to know
 
A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)
A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)
A Technique for Enabling and Supporting Debugging of Field Failures (ICSE 2007)
 
NetFlow Monitoring for Cyber Threat Defense
NetFlow Monitoring for Cyber Threat DefenseNetFlow Monitoring for Cyber Threat Defense
NetFlow Monitoring for Cyber Threat Defense
 
Embedded Android : System Development - Part IV
Embedded Android : System Development - Part IVEmbedded Android : System Development - Part IV
Embedded Android : System Development - Part IV
 
practicing what you never preached: sorting and discarding from a practical ...
practicing what you never preached:  sorting and discarding from a practical ...practicing what you never preached:  sorting and discarding from a practical ...
practicing what you never preached: sorting and discarding from a practical ...
 
Wireless Local area network issues all perfect wireless engineering
Wireless Local area network issues all perfect wireless engineeringWireless Local area network issues all perfect wireless engineering
Wireless Local area network issues all perfect wireless engineering
 
Developing Applications Using Host Processing Instead of DSPs
Developing Applications Using Host Processing Instead of DSPsDeveloping Applications Using Host Processing Instead of DSPs
Developing Applications Using Host Processing Instead of DSPs
 
Why choose pan
Why choose panWhy choose pan
Why choose pan
 
IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19
IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19
IMA/Thales EchoVoice (VOIP) for OpenSimulator Presentation at OSCC19
 
Packet-to-Packet Applications
Packet-to-Packet ApplicationsPacket-to-Packet Applications
Packet-to-Packet Applications
 
Surf Communication Solutions - Packet To Packet Apps
Surf Communication Solutions - Packet To Packet AppsSurf Communication Solutions - Packet To Packet Apps
Surf Communication Solutions - Packet To Packet Apps
 
Sound Recording Glossary
Sound Recording GlossarySound Recording Glossary
Sound Recording Glossary
 
Video Traffic Management
Video Traffic ManagementVideo Traffic Management
Video Traffic Management
 

More from AlexanderLerch4

0A-02-ACA-Fundamentals-Convolution.pdf
0A-02-ACA-Fundamentals-Convolution.pdf0A-02-ACA-Fundamentals-Convolution.pdf
0A-02-ACA-Fundamentals-Convolution.pdf
AlexanderLerch4
 
12-02-ACA-Genre.pdf
12-02-ACA-Genre.pdf12-02-ACA-Genre.pdf
12-02-ACA-Genre.pdf
AlexanderLerch4
 
10-02-ACA-Alignment.pdf
10-02-ACA-Alignment.pdf10-02-ACA-Alignment.pdf
10-02-ACA-Alignment.pdf
AlexanderLerch4
 
09-03-ACA-Temporal-Onsets.pdf
09-03-ACA-Temporal-Onsets.pdf09-03-ACA-Temporal-Onsets.pdf
09-03-ACA-Temporal-Onsets.pdf
AlexanderLerch4
 
13-00-ACA-Mood.pdf
13-00-ACA-Mood.pdf13-00-ACA-Mood.pdf
13-00-ACA-Mood.pdf
AlexanderLerch4
 
09-07-ACA-Temporal-Structure.pdf
09-07-ACA-Temporal-Structure.pdf09-07-ACA-Temporal-Structure.pdf
09-07-ACA-Temporal-Structure.pdf
AlexanderLerch4
 
09-01-ACA-Temporal-Intro.pdf
09-01-ACA-Temporal-Intro.pdf09-01-ACA-Temporal-Intro.pdf
09-01-ACA-Temporal-Intro.pdf
AlexanderLerch4
 
07-06-ACA-Tonal-Chord.pdf
07-06-ACA-Tonal-Chord.pdf07-06-ACA-Tonal-Chord.pdf
07-06-ACA-Tonal-Chord.pdf
AlexanderLerch4
 
15-00-ACA-Music-Performance-Analysis.pdf
15-00-ACA-Music-Performance-Analysis.pdf15-00-ACA-Music-Performance-Analysis.pdf
15-00-ACA-Music-Performance-Analysis.pdf
AlexanderLerch4
 
12-01-ACA-Similarity.pdf
12-01-ACA-Similarity.pdf12-01-ACA-Similarity.pdf
12-01-ACA-Similarity.pdf
AlexanderLerch4
 
09-04-ACA-Temporal-BeatHisto.pdf
09-04-ACA-Temporal-BeatHisto.pdf09-04-ACA-Temporal-BeatHisto.pdf
09-04-ACA-Temporal-BeatHisto.pdf
AlexanderLerch4
 
09-05-ACA-Temporal-Tempo.pdf
09-05-ACA-Temporal-Tempo.pdf09-05-ACA-Temporal-Tempo.pdf
09-05-ACA-Temporal-Tempo.pdf
AlexanderLerch4
 
08-00-ACA-Intensity.pdf
08-00-ACA-Intensity.pdf08-00-ACA-Intensity.pdf
08-00-ACA-Intensity.pdf
AlexanderLerch4
 
03-05-ACA-Input-Features.pdf
03-05-ACA-Input-Features.pdf03-05-ACA-Input-Features.pdf
03-05-ACA-Input-Features.pdf
AlexanderLerch4
 
07-03-03-ACA-Tonal-Monof0.pdf
07-03-03-ACA-Tonal-Monof0.pdf07-03-03-ACA-Tonal-Monof0.pdf
07-03-03-ACA-Tonal-Monof0.pdf
AlexanderLerch4
 
05-00-ACA-Data-Intro.pdf
05-00-ACA-Data-Intro.pdf05-00-ACA-Data-Intro.pdf
05-00-ACA-Data-Intro.pdf
AlexanderLerch4
 
07-03-05-ACA-Tonal-F0Eval.pdf
07-03-05-ACA-Tonal-F0Eval.pdf07-03-05-ACA-Tonal-F0Eval.pdf
07-03-05-ACA-Tonal-F0Eval.pdf
AlexanderLerch4
 
03-03-01-ACA-Input-TF-Fourier.pdf
03-03-01-ACA-Input-TF-Fourier.pdf03-03-01-ACA-Input-TF-Fourier.pdf
03-03-01-ACA-Input-TF-Fourier.pdf
AlexanderLerch4
 
03-06-ACA-Input-FeatureLearning.pdf
03-06-ACA-Input-FeatureLearning.pdf03-06-ACA-Input-FeatureLearning.pdf
03-06-ACA-Input-FeatureLearning.pdf
AlexanderLerch4
 
07-03-04-ACA-Tonal-Polyf0.pdf
07-03-04-ACA-Tonal-Polyf0.pdf07-03-04-ACA-Tonal-Polyf0.pdf
07-03-04-ACA-Tonal-Polyf0.pdf
AlexanderLerch4
 

More from AlexanderLerch4 (20)

0A-02-ACA-Fundamentals-Convolution.pdf
0A-02-ACA-Fundamentals-Convolution.pdf0A-02-ACA-Fundamentals-Convolution.pdf
0A-02-ACA-Fundamentals-Convolution.pdf
 
12-02-ACA-Genre.pdf
12-02-ACA-Genre.pdf12-02-ACA-Genre.pdf
12-02-ACA-Genre.pdf
 
10-02-ACA-Alignment.pdf
10-02-ACA-Alignment.pdf10-02-ACA-Alignment.pdf
10-02-ACA-Alignment.pdf
 
09-03-ACA-Temporal-Onsets.pdf
09-03-ACA-Temporal-Onsets.pdf09-03-ACA-Temporal-Onsets.pdf
09-03-ACA-Temporal-Onsets.pdf
 
13-00-ACA-Mood.pdf
13-00-ACA-Mood.pdf13-00-ACA-Mood.pdf
13-00-ACA-Mood.pdf
 
09-07-ACA-Temporal-Structure.pdf
09-07-ACA-Temporal-Structure.pdf09-07-ACA-Temporal-Structure.pdf
09-07-ACA-Temporal-Structure.pdf
 
09-01-ACA-Temporal-Intro.pdf
09-01-ACA-Temporal-Intro.pdf09-01-ACA-Temporal-Intro.pdf
09-01-ACA-Temporal-Intro.pdf
 
07-06-ACA-Tonal-Chord.pdf
07-06-ACA-Tonal-Chord.pdf07-06-ACA-Tonal-Chord.pdf
07-06-ACA-Tonal-Chord.pdf
 
15-00-ACA-Music-Performance-Analysis.pdf
15-00-ACA-Music-Performance-Analysis.pdf15-00-ACA-Music-Performance-Analysis.pdf
15-00-ACA-Music-Performance-Analysis.pdf
 
12-01-ACA-Similarity.pdf
12-01-ACA-Similarity.pdf12-01-ACA-Similarity.pdf
12-01-ACA-Similarity.pdf
 
09-04-ACA-Temporal-BeatHisto.pdf
09-04-ACA-Temporal-BeatHisto.pdf09-04-ACA-Temporal-BeatHisto.pdf
09-04-ACA-Temporal-BeatHisto.pdf
 
09-05-ACA-Temporal-Tempo.pdf
09-05-ACA-Temporal-Tempo.pdf09-05-ACA-Temporal-Tempo.pdf
09-05-ACA-Temporal-Tempo.pdf
 
08-00-ACA-Intensity.pdf
08-00-ACA-Intensity.pdf08-00-ACA-Intensity.pdf
08-00-ACA-Intensity.pdf
 
03-05-ACA-Input-Features.pdf
03-05-ACA-Input-Features.pdf03-05-ACA-Input-Features.pdf
03-05-ACA-Input-Features.pdf
 
07-03-03-ACA-Tonal-Monof0.pdf
07-03-03-ACA-Tonal-Monof0.pdf07-03-03-ACA-Tonal-Monof0.pdf
07-03-03-ACA-Tonal-Monof0.pdf
 
05-00-ACA-Data-Intro.pdf
05-00-ACA-Data-Intro.pdf05-00-ACA-Data-Intro.pdf
05-00-ACA-Data-Intro.pdf
 
07-03-05-ACA-Tonal-F0Eval.pdf
07-03-05-ACA-Tonal-F0Eval.pdf07-03-05-ACA-Tonal-F0Eval.pdf
07-03-05-ACA-Tonal-F0Eval.pdf
 
03-03-01-ACA-Input-TF-Fourier.pdf
03-03-01-ACA-Input-TF-Fourier.pdf03-03-01-ACA-Input-TF-Fourier.pdf
03-03-01-ACA-Input-TF-Fourier.pdf
 
03-06-ACA-Input-FeatureLearning.pdf
03-06-ACA-Input-FeatureLearning.pdf03-06-ACA-Input-FeatureLearning.pdf
03-06-ACA-Input-FeatureLearning.pdf
 
07-03-04-ACA-Tonal-Polyf0.pdf
07-03-04-ACA-Tonal-Polyf0.pdf07-03-04-ACA-Tonal-Polyf0.pdf
07-03-04-ACA-Tonal-Polyf0.pdf
 


11-00-ACA-Fingerprinting.pdf

  • 1. Introduction to Audio Content Analysis module 11.0: audio fingerprinting alexander lerch
  • 2. introduction: overview
    corresponding textbook section: chapter 11
    lecture content:
      • introduction to audio fingerprinting
      • in-depth example for fingerprint extraction and retrieval
    learning objectives:
      • discuss goals and limitations of audio fingerprinting systems as compared to watermarking or cover song detection systems
      • describe the processing steps of the Philips fingerprinting system
  • 4. audio fingerprinting: introduction
    objective:
      • represent a recording with a compact and unique digest (→ fingerprint, perceptual hash)
      • allow quick matching between previously stored fingerprints and an extracted fingerprint
    applications:
      • broadcast monitoring: automate verification for royalties/infringement claims
      • value-added services: offer information and metadata
  • 7. audio fingerprinting: fingerprinting vs. watermarking
    fingerprinting:
      • identifies recording (but not musical content)
    watermarking:
      • embeds perceptually “unnoticeable” data block in the audio
      • identifies instance of recording

    Property                       | Fingerprinting | Watermarking
    Allows Legacy Content Indexing | +              | –
    Allows Embedded (Meta) Data    | –              | +
    Leaves Signal Unchanged        | +              | –
    Identification of              | Recording      | User or Interaction
  • 9. audio fingerprinting: fingerprint requirements
      • accuracy & reliability: minimize false negatives/positives
      • robustness & security: robust against distortions and attacks
      • granularity: quick identification in a real-time context
      • versatility: independent of file format, etc.
      • scalability: good database performance
      • complexity: implementation possible on embedded devices
  • 15. audio fingerprinting: general fingerprinting system
    [block diagram]
      • database creation: Collection of Recordings (with Recording IDs) → Fingerprint Extraction → Database
      • identification: Unlabeled Recording → Fingerprint Extraction → Find Match (in Database) → ID
  • 16. audio fingerprinting: brainstorm
    How does it work? MD5?
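One way to probe the "MD5?" question is the minimal sketch below. It hashes a synthetic tone and a copy that differs by a single quantization step in one sample. The signal, the sample rate, and the 16-bit PCM packing are assumptions made for illustration only; they are not taken from the slides.

```python
import hashlib
import math
import struct

# Hypothetical stand-in for a recording: 1 s of a 440 Hz sine, 16-bit PCM.
FS_HZ = 8000
samples = [round(0.5 * 32767 * math.sin(2 * math.pi * 440 * n / FS_HZ))
           for n in range(FS_HZ)]

# Perceptually identical copy: one sample changed by one quantization step.
samples_copy = list(samples)
samples_copy[100] += 1

def md5_of_pcm(pcm):
    """MD5 digest of the samples packed as little-endian signed 16-bit integers."""
    return hashlib.md5(struct.pack(f"<{len(pcm)}h", *pcm)).hexdigest()

digest_a = md5_of_pcm(samples)
digest_b = md5_of_pcm(samples_copy)
differing = sum(a != b for a, b in zip(digest_a, digest_b))

print(digest_a)
print(digest_b)
print(f"{differing} of {len(digest_a)} hex digits differ")
```

A cryptographic hash is designed so that any change to the input changes the digest unpredictably. An inaudible modification therefore already breaks the match, which is why a fingerprint has to be derived from perceptually relevant signal properties instead.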
  • 18. audio fingerprinting: system example: philips (extraction 1/3)
    processing chain: Audio Signal → STFT → Power Spectrum → Filterbank → Frequency Filtering → Temporal Filtering → Quantization
      1 pre-processing: downmixing & downsampling (5 kHz)
      2 STFT: K = 2048, overlap 31/32
      3 log frequency bands: 33 bands from 300–2000 Hz
      4 freq derivative: 33 bands
      5 time derivative: 32 bands
      6 quantization: $v_\mathrm{FP}(k,n) = \begin{cases} 1 & \text{if } \Delta E(k,n) - \Delta E(k,n-1) > 0 \\ 0 & \text{otherwise} \end{cases}$
        ⇒ 32 bit subfingerprint
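The last three steps (frequency derivative, time derivative, quantization) can be sketched as follows. This is a minimal illustration, not the reference implementation: it starts from band energies that are assumed to be already computed, it assumes ΔE(k, n) = E(k, n) − E(k + 1, n) for the frequency derivative (the slide does not spell this out), and the bit order inside the 32-bit word is an arbitrary choice. The function name `subfingerprints` and the random test energies are hypothetical.

```python
import random

NUM_BANDS = 33  # log-spaced bands per frame, as on the slide

def subfingerprints(band_energy):
    """band_energy[n][k] is the energy of band k in frame n.

    Returns one 32-bit integer per frame, starting with the second frame,
    because the time derivative needs the previous frame.
    """
    words = []
    for n in range(1, len(band_energy)):
        cur, prev = band_energy[n], band_energy[n - 1]
        word = 0
        for k in range(len(cur) - 1):
            # Frequency derivative (assumed: difference of adjacent bands).
            delta_cur = cur[k] - cur[k + 1]
            delta_prev = prev[k] - prev[k + 1]
            # Time derivative followed by 1-bit quantization.
            bit = 1 if delta_cur - delta_prev > 0 else 0
            word = (word << 1) | bit  # band 0 ends up in the most significant bit
        words.append(word)
    return words

# Hypothetical input: random energies for 257 frames give 256 subfingerprints.
rng = random.Random(0)
energies = [[rng.random() for _ in range(NUM_BANDS)] for _ in range(257)]
block = subfingerprints(energies)
print(len(block), f"{block[0]:032b}")
```

Because only the sign of a difference of differences is kept, a constant gain applied to the signal leaves every bit unchanged, which is one reason the representation tolerates level changes.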
  • 24. audio fingerprinting: system example: philips (extraction 2/3)
    fingerprint:
      • 256 subsequent subfingerprints ⇒
        ▶ length: 3 s
        ▶ size: 256 · 4 Byte = 1 kByte
    example:
      • 5 min song: $1\,\mathrm{kByte} \cdot \frac{5 \cdot 60\,\mathrm{s}}{3\,\mathrm{s}} = 100\,\mathrm{kByte}$
      • database with 1 million songs (avg. length 5 min): $10^6 \cdot 256 \cdot \frac{5 \cdot 60\,\mathrm{s}}{3\,\mathrm{s}} = 25.6 \cdot 10^9$ subfingerprints ⇒ 100 GByte storage
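The arithmetic on this slide can be checked with a short sketch. The constant names are hypothetical. The first part derives the block length from the extraction parameters stated earlier in the deck (5 kHz sample rate, K = 2048, overlap 31/32); the rest uses the rounded 3 s from the slide.

```python
FS_HZ = 5000             # sample rate after downsampling
K = 2048                 # STFT block length in samples
OVERLAP = 31 / 32        # block overlap
SUBFPS_PER_BLOCK = 256   # subfingerprints per fingerprint
SUBFP_BYTES = 4          # 32 bit per subfingerprint

# Hop size: 2048 / 32 = 64 samples, i.e. 12.8 ms per subfingerprint.
hop_s = K * (1 - OVERLAP) / FS_HZ
block_s = SUBFPS_PER_BLOCK * hop_s
print(round(block_s, 2))          # about 3.28 s, rounded to 3 s on the slide

BLOCK_SECONDS = 3                 # rounded value used on the slide
SONG_SECONDS = 5 * 60
NUM_SONGS = 10**6

blocks_per_song = SONG_SECONDS // BLOCK_SECONDS
song_bytes = blocks_per_song * SUBFPS_PER_BLOCK * SUBFP_BYTES
num_subfps = NUM_SONGS * blocks_per_song * SUBFPS_PER_BLOCK
db_bytes = num_subfps * SUBFP_BYTES

print(song_bytes // 1024)         # 100 (kByte per song)
print(num_subfps)                 # 25600000000 subfingerprints
print(db_bytes / 1e9)             # 102.4, i.e. roughly 100 GByte
```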
  • 28. audio fingerprinting: system example: philips (extraction 3/3)
    [figure with panels “original” and “low quality encoding”; matlab source: plotFingerprint.m]
  • 29. audio fingerprinting: system example: philips (identification 1/3)
    database:
      • contains all subfingerprints for all songs
      • previous example database: 25 billion subfingerprints
    problem:
      • how to identify fingerprint efficiently?
  • 32. audio fingerprinting: system example: philips (identification 2/3)
    simple system:
      1 create lookup table with all possible subfingerprints ($2^{32}$) pointing to occurrences
      2 assume at least one of the extracted 256 subfingerprints is error-free ⇒ only entries listed at 256 positions of the table have to be checked
      3 compute Hamming distance between extracted fingerprint and candidates
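The three steps of the simple system can be sketched as below. This is an illustration under stated assumptions rather than the actual Philips implementation: a Python dictionary stands in for the full table with 2^32 entries, the function names (`build_index`, `hamming`, `identify`) are hypothetical, the acceptance threshold `max_ber` is a placeholder value, and the database contains random subfingerprints instead of real audio.

```python
import random
from collections import defaultdict

def build_index(database):
    """Map each subfingerprint value to every (song_id, position) where it occurs."""
    index = defaultdict(list)
    for song_id, subfps in database.items():
        for pos, value in enumerate(subfps):
            index[value].append((song_id, pos))
    return index

def hamming(a, b):
    """Number of differing bits between two equally long subfingerprint lists."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def identify(query, database, index, max_ber=0.35):
    """Return (song_id, start, bit_error_rate) of the best candidate, or None."""
    best = None
    for q_pos, value in enumerate(query):
        # Only table entries for the extracted subfingerprints are inspected.
        for song_id, pos in index.get(value, []):
            start = pos - q_pos
            if start < 0:
                continue
            ref = database[song_id][start:start + len(query)]
            if len(ref) < len(query):
                continue
            ber = hamming(query, ref) / (32 * len(query))
            if ber <= max_ber and (best is None or ber < best[2]):
                best = (song_id, start, ber)
    return best

# Hypothetical database: two "songs" with 1000 random subfingerprints each.
rng = random.Random(1)
database = {name: [rng.getrandbits(32) for _ in range(1000)] for name in ("a", "b")}
index = build_index(database)

# Query: 256 subfingerprints from song "b", one bit error in all but one of them.
query = [v ^ 1 for v in database["b"][300:556]]
query[10] ^= 1  # undo the flip so that this subfingerprint is error-free
result = identify(query, database, index)
print(result[0], result[1], round(result[2], 4))
```

The single error-free subfingerprint is enough to locate the candidate position; the Hamming distance over the whole block then confirms the match even though 255 of the 256 subfingerprints contain a bit error.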
  • 35. audio fingerprinting: system example: philips (identification 3/3)
    variant 1:
      • allow one bit error ⇒ workload increase by factor ≈ 33
    variant 2:
      • introduce concept of bit error probability into fingerprint extraction
        ▶ small energy difference → high error probability
        ▶ large energy difference → low error probability
      • rank bits per subfingerprint by error probability and check only for bit errors at likely positions
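Both variants amount to generating additional lookup candidates per subfingerprint, which the following sketch illustrates. The function names, the `margins` input (the absolute energy difference behind each bit), the bit indexing, and the choice of toggling every combination of the three least reliable bits are assumptions for illustration; the slide only states that bits are ranked by error probability.

```python
def candidates_one_bit_error(subfp, bits=32):
    """Variant 1: the subfingerprint itself plus all values with one flipped bit."""
    return [subfp] + [subfp ^ (1 << i) for i in range(bits)]

def candidates_weak_bits(subfp, margins, num_weak=3):
    """Variant 2: toggle only the least reliable bits.

    margins[i] is the absolute energy difference behind bit i (bit 0 is the
    least significant bit, an assumed convention). Small margins mean a high
    error probability, so those positions are tried first.
    """
    weak = sorted(range(len(margins)), key=lambda i: margins[i])[:num_weak]
    out = []
    for selection in range(2 ** num_weak):
        value = subfp
        for j, bit in enumerate(weak):
            if (selection >> j) & 1:
                value ^= 1 << bit
        out.append(value)
    return out

subfp = 0b1011
print(len(candidates_one_bit_error(subfp)))  # 33 lookups instead of 1

# Hypothetical margins: bits 0, 5, and 9 are close to the decision threshold.
margins = [1.0] * 32
margins[0], margins[5], margins[9] = 0.01, 0.02, 0.03
weak_candidates = candidates_weak_bits(subfp, margins)
print(len(weak_candidates))                  # 8 lookups
```

The count of 33 in variant 1 is the original value plus its 32 single-bit neighbours, which matches the workload factor on the slide. Variant 2 limits the number of lookups to the positions where an error is plausible.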
  • 39. audio fingerprinting: other systems: shazam
    [plot from (1)]
    (1) A. Wang, “An Industrial Strength Audio Search Algorithm,” in Proceedings of the 4th International Conference on Music Information Retrieval (ISMIR), Washington, 2003.
  • 40. summary: lecture content
    audio fingerprinting:
      • represent recording with compact, robust, and unique fingerprint
      • focus on (perceptual) audio representation rather than “musical” content
      • allow efficient matching of this fingerprint with database
    often confused with other tasks:
      1 audio watermarking
      2 cover song detection