SlideShare a Scribd company logo
1 of 10
Download to read offline
Audio-based Near-Duplicate Video
Retrieval with Audio Similarity Learning
Pavlos Avgoustinakis Giorgos Kordopatis-Zilos Symeon Papadopoulos
Andreas L. Symeonidis Ioannis Kompatsiaris
Problem statement
Duplicate Audio Video Retrieval (DAVR)
• Given a video query, search a video database and retrieve videos that
share the same audio content
Query video Retrieved videosVideo database
Motivation
State-of-the-art limitations
• Use of handcrafted approaches for fingerprint extraction
• Rigid aggregation schemes for similarity calculation
• Lack of evaluation benchmarks with user-generated content annotated
based on audio duplicity
Our objective
• Leverage of deep learning and transfer learning
• Composition of evaluation benchmarks
Contribution
• Robust audio-based video similarity calculation
• Transfer learning from a pre-trained CNN
• Audio similarity learning
• Two annotated datasets that serve as benchmarks for DAVR
Feature Extraction
• Generate audio Mel-spectrograms of audio signals
• Divide into overlapping time segments
• Employ a pre-trained CNN (Kumar et al. 2018)
• Max Activation of Convolutions (MAC) on intermediate CNN layers
• Apply PCA whitening and attention-based weighting
A. Kumar, M. Khadkevich, and C. Fugen. “Knowledge transfer from weakly labeled audio using convolutional neural network for sound events and
scenes”. ICASSP, 2018.
Similarity calculation
• Audio Similarity Learning
network
• Four-layer CNN
• Captures the temporal
structures
• Chamfer Similarity
CS(𝑞, 𝑝) =
1
𝑋′ ෍
𝑖=1
𝑋′
max
𝑗∈[1,𝑌′]
Htanh(𝒮 𝜐
𝑞𝑝
(𝑖, 𝑗))
• Generation of the
similarity matrix
𝑆 𝑞𝑝 = 𝑄 ⋅ 𝑃 𝑇
Evaluation datasets
FIVR-200Kα
• FIVR-200K (Kordopatis-Zilos et al., 2019) annotated for Fine-grained Incident
Video Retrieval
• 76 video queries
• 3,392 audio duplicate pairs
SVDα
• SVD (Jiang et al., 2019) annotated for Near-Duplicate Video Retrieval
• 167 video queries
• 1,492 audio duplicate pairs
G. Kordopatis-Zilos, S. Papadopoulos, I. Patras, and I.Kompatsiaris. “Fine-grained Incident Video Retrieval”. IEEE TMM, 2019.
Q. Y. Jiang, Y. He, G. Li, J. Lin, L. Li, and W. J. Li. “SVD: A Large-Scale Short Video Dataset for Near Duplicate Video Retrieval”. ICCV, 2019.
Experimental results
• Duplicate Audio Video Retrieval (DAVR)
FIVR-200Kα SVDα
• Audio speed transformations
Experimental results
• Visual-based video retrieval
Thank you!
Code available in:
https://github.com/mever-team/ausil
With the support of:
Get in touch:
Giorgos Kordopatis-Zilos: georgekordopatis@iti.gr / @g_kordo
MeVer team:
https://mever.iti.gr/web/ / @meverteam

More Related Content

Similar to Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning | Presentation@ICPR2020

Deep Learning - Speaker Recognition
Deep Learning - Speaker Recognition Deep Learning - Speaker Recognition
Deep Learning - Speaker Recognition Sai Kiran Kadam
 
Deep Learning for Automatic Speaker Recognition
Deep Learning for Automatic Speaker RecognitionDeep Learning for Automatic Speaker Recognition
Deep Learning for Automatic Speaker RecognitionSai Kiran Kadam
 
Mpeg 7 video signature tools for content recognition
Mpeg 7 video signature tools for content recognitionMpeg 7 video signature tools for content recognition
Mpeg 7 video signature tools for content recognitionParag Tamhane
 
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...DataScienceConferenc1
 
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...DataScienceConferenc1
 
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...SWAMI06
 
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...Vignesh V Menon
 
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videosAdria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videosCodiax
 
A framework for visual search in broadcasting companies' multimedia archives
A framework for visual search in broadcasting companies' multimedia archives A framework for visual search in broadcasting companies' multimedia archives
A framework for visual search in broadcasting companies' multimedia archives FIAT/IFTA
 
Automated Podcasting System for Universities
Automated Podcasting System for UniversitiesAutomated Podcasting System for Universities
Automated Podcasting System for UniversitiesEducational Technology
 
Similarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSimilarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSymeon Papadopoulos
 
Deep Learning - Speaker Verification, Sound Event Detection
Deep Learning - Speaker Verification, Sound Event DetectionDeep Learning - Speaker Verification, Sound Event Detection
Deep Learning - Speaker Verification, Sound Event DetectionSai Kiran Kadam
 
Music Gesture for Visual Sound Separation
Music Gesture for Visual Sound SeparationMusic Gesture for Visual Sound Separation
Music Gesture for Visual Sound Separationivaderivader
 
Bambi presentation
Bambi presentationBambi presentation
Bambi presentationKlaus Riede
 

Similar to Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning | Presentation@ICPR2020 (20)

Deep Learning - Speaker Recognition
Deep Learning - Speaker Recognition Deep Learning - Speaker Recognition
Deep Learning - Speaker Recognition
 
Barwick video-trial
Barwick video-trialBarwick video-trial
Barwick video-trial
 
Deep Learning for Automatic Speaker Recognition
Deep Learning for Automatic Speaker RecognitionDeep Learning for Automatic Speaker Recognition
Deep Learning for Automatic Speaker Recognition
 
Mpeg 7 video signature tools for content recognition
Mpeg 7 video signature tools for content recognitionMpeg 7 video signature tools for content recognition
Mpeg 7 video signature tools for content recognition
 
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
 
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
[DSC Europe 23] Paweł Ekk-Cierniakowski - Video transcription with deep learn...
 
Video Thumbnail Selector
Video Thumbnail SelectorVideo Thumbnail Selector
Video Thumbnail Selector
 
心理影响.ppt
心理影响.ppt心理影响.ppt
心理影响.ppt
 
phd-mark4
phd-mark4phd-mark4
phd-mark4
 
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
 
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
 
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videosAdria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videos
 
A framework for visual search in broadcasting companies' multimedia archives
A framework for visual search in broadcasting companies' multimedia archives A framework for visual search in broadcasting companies' multimedia archives
A framework for visual search in broadcasting companies' multimedia archives
 
Automated Podcasting System for Universities
Automated Podcasting System for UniversitiesAutomated Podcasting System for Universities
Automated Podcasting System for Universities
 
Similarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSimilarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia content
 
Deep Learning - Speaker Verification, Sound Event Detection
Deep Learning - Speaker Verification, Sound Event DetectionDeep Learning - Speaker Verification, Sound Event Detection
Deep Learning - Speaker Verification, Sound Event Detection
 
Music Gesture for Visual Sound Separation
Music Gesture for Visual Sound SeparationMusic Gesture for Visual Sound Separation
Music Gesture for Visual Sound Separation
 
Paul Wang SOED 2016
Paul Wang SOED 2016Paul Wang SOED 2016
Paul Wang SOED 2016
 
Bambi presentation
Bambi presentationBambi presentation
Bambi presentation
 
Video processing.pptx
Video processing.pptxVideo processing.pptx
Video processing.pptx
 

Recently uploaded

Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 

Recently uploaded (20)

Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 

Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning | Presentation@ICPR2020

  • 1. Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning Pavlos Avgoustinakis Giorgos Kordopatis-Zilos Symeon Papadopoulos Andreas L. Symeonidis Ioannis Kompatsiaris
  • 2. Problem statement Duplicate Audio Video Retrieval (DAVR) • Given a video query, search a video database and retrieve videos that share the same audio content Query video Retrieved videosVideo database
  • 3. Motivation State-of-the-art limitations • Use of handcrafted approaches for fingerprint extraction • Rigid aggregation schemes for similarity calculation • Lack of evaluation benchmarks with user-generated content annotated based on audio duplicity Our objective • Leverage of deep learning and transfer learning • Composition of evaluation benchmarks
  • 4. Contribution • Robust audio-based video similarity calculation • Transfer learning from a pre-trained CNN • Audio similarity learning • Two annotated datasets that serve as benchmarks for DAVR
  • 5. Feature Extraction • Generate audio Mel-spectrograms of audio signals • Divide into overlapping time segments • Employ a pre-trained CNN (Kumar et al. 2018) • Max Activation of Convolutions (MAC) on intermediate CNN layers • Apply PCA whitening and attention-based weighting A. Kumar, M. Khadkevich, and C. Fugen. “Knowledge transfer from weakly labeled audio using convolutional neural network for sound events and scenes”. ICASSP, 2018.
  • 6. Similarity calculation • Audio Similarity Learning network • Four-layer CNN • Captures the temporal structures • Chamfer Similarity CS(𝑞, 𝑝) = 1 𝑋′ ෍ 𝑖=1 𝑋′ max 𝑗∈[1,𝑌′] Htanh(𝒮 𝜐 𝑞𝑝 (𝑖, 𝑗)) • Generation of the similarity matrix 𝑆 𝑞𝑝 = 𝑄 ⋅ 𝑃 𝑇
  • 7. Evaluation datasets FIVR-200Kα • FIVR-200K (Kordopatis-Zilos et al., 2019) annotated for Fine-grained Incident Video Retrieval • 76 video queries • 3,392 audio duplicate pairs SVDα • SVD (Jiang et al., 2019) annotated for Near-Duplicate Video Retrieval • 167 video queries • 1,492 audio duplicate pairs G. Kordopatis-Zilos, S. Papadopoulos, I. Patras, and I.Kompatsiaris. “Fine-grained Incident Video Retrieval”. IEEE TMM, 2019. Q. Y. Jiang, Y. He, G. Li, J. Lin, L. Li, and W. J. Li. “SVD: A Large-Scale Short Video Dataset for Near Duplicate Video Retrieval”. ICCV, 2019.
  • 8. Experimental results • Duplicate Audio Video Retrieval (DAVR) FIVR-200Kα SVDα • Audio speed transformations
  • 10. Thank you! Code available in: https://github.com/mever-team/ausil With the support of: Get in touch: Giorgos Kordopatis-Zilos: georgekordopatis@iti.gr / @g_kordo MeVer team: https://mever.iti.gr/web/ / @meverteam