The TUM Cumulative DTW Approach for the Spoken Web Search Task

Cyril Joder, Felix Weninger, Martin Wöllmer, Björn Schuller

Institute for Human-Machine Communication
Technische Universität München
Summary
•    Not a "system"
•    Low-level features only
•    No ASR
•    Little "engineering"
•    Method of integrating discriminative training into DTW


MediaEval 2012 Workshop
Cumulative DTW (CDTW)
• Limitations of DTW:
       – Only one local cost function (distance)
       – Usually manual parameter tuning
• Idea:
       – Use different local cost functions for each step
       – Automatic learning of these functions as
         combination of general features
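
As a rough illustration of the idea above, the sketch below runs DTW with a separate local cost for each step type, where every cost is a weighted combination of shared frame-pair features. The feature set (`local_features`), the three step types, and the weight layout are assumptions made for illustration and are not taken from the slides. With a plain distance in place of learned weights, it reduces to naive DTW.

```python
import math

# Hypothetical step set: (rows advanced in the query, columns advanced in the utterance).
STEPS = {"diag": (1, 1), "vert": (1, 0), "horiz": (0, 1)}


def local_features(x, y):
    """Illustrative general features for one frame pair: bias, distance, squared distance."""
    d = math.dist(x, y)
    return [1.0, d, d * d]


def step_dtw(query, utterance, weights):
    """DTW in which each step type has its own local cost.

    `weights[name]` is the weight vector applied to `local_features` for step
    `name`. In a learned system these weights would come from training; here
    they are simply passed in.
    """
    n, m = len(query), len(utterance)
    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = 0.0  # simplification: the start cell carries no cost
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            feats = local_features(query[i], utterance[j])
            best = inf
            for name, (di, dj) in STEPS.items():
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                cost = sum(w * f for w, f in zip(weights[name], feats))
                best = min(best, acc[pi][pj] + cost)
            acc[i][j] = best
    return acc[n - 1][m - 1]


# Same distance-only weights for every step: behaves like naive DTW.
plain = {name: [0.0, 1.0, 0.0] for name in STEPS}
# Adding a bias to non-diagonal steps penalises stretching and compression.
penalised = {"diag": [0.0, 1.0, 0.0], "vert": [0.5, 1.0, 0.0], "horiz": [0.5, 1.0, 0.0]}

q = [[0.0], [1.0], [2.0]]
u = [[0.0], [1.0], [1.0], [2.0]]
print(step_dtw(q, u, plain), step_dtw(q, u, penalised))
```

Keeping the features shared and varying only the weights per step is one simple way to let the step type influence the cost without hand-tuning separate distance functions.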


From DTW to CDTW




Softmax?




Features




Decision
• Are the two sequences instances of the
  same word/expression?



• Learning of the parameters.
       – Backpropagation (stochastic gradient descent)
       – Training data: queries/utterances of dev set
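
Backpropagation needs the alignment score to be differentiable in the parameters, which a hard minimum over predecessor cells is only piecewise. A minimal sketch of one common workaround follows, assuming a soft-min (log-sum-exp) relaxation and a logistic decision on the resulting score. The names `softmin`, `decision`, `sgd_step` and the parameters `a`, `b`, `beta`, `lr` are hypothetical, and the exact formulation used in the talk may differ.

```python
import math


def softmin(values, beta=1.0):
    """Smooth lower bound on min(values); approaches the hard min as beta grows."""
    m = min(values)  # shift for numerical stability
    return m - math.log(sum(math.exp(-beta * (v - m)) for v in values)) / beta


def softmin_grad(values, beta=1.0):
    """Gradient of softmin with respect to each input: softmax weights over -beta * values."""
    m = min(values)
    exps = [math.exp(-beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def decision(score, a, b):
    """Logistic decision: estimated probability that two sequences match."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def sgd_step(a, b, score, label, lr=0.1):
    """One stochastic gradient step on the cross-entropy loss of `decision`."""
    err = decision(score, a, b) - label  # derivative of the loss w.r.t. the logit
    return a - lr * err * score, b - lr * err


print(softmin([1.0, 2.0, 3.0], beta=5.0))
print(softmin_grad([1.0, 2.0, 3.0], beta=5.0))
print(sgd_step(0.0, 0.0, score=2.0, label=1))
```

In a full system the gradient weights from `softmin_grad` would be propagated backwards through the accumulated-cost matrix down to the local cost parameters; that part is omitted here.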

Search Procedure




Candidate Search
• Align query with entire
  utterance
       – CDTW with backtracking
       – “Scores” for each point
• Extract potential starts and
  ends
       – Peak-picking of scores
• Filter by duration
       – Only allow warping factors < 2
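
The last two steps above can be sketched as follows. This is an illustration under assumptions: peaks are taken to be simple local maxima of a score sequence, and the "warping factor" is read as the ratio between candidate duration and query duration (in either direction). The function names and the `min_height` parameter are hypothetical.

```python
def pick_peaks(scores, min_height=float("-inf")):
    """Return indices of local maxima of `scores` that exceed `min_height`."""
    peaks = []
    for t in range(1, len(scores) - 1):
        rises = scores[t] > scores[t - 1]
        does_not_rise_after = scores[t] >= scores[t + 1]
        if rises and does_not_rise_after and scores[t] > min_height:
            peaks.append(t)
    return peaks


def filter_by_duration(starts, ends, query_len, max_warp=2.0):
    """Keep (start, end) pairs whose duration is within a factor `max_warp` of the query."""
    candidates = []
    for s in starts:
        for e in ends:
            duration = e - s
            if duration <= 0:
                continue
            warp = max(duration / query_len, query_len / duration)
            if warp < max_warp:
                candidates.append((s, e))
    return candidates


start_scores = [0.0, 1.0, 0.0, 2.0, 2.0, 0.0]
print(pick_peaks(start_scores))
print(filter_by_duration(starts=[0, 10], ends=[12, 30], query_len=10))
```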

CDTW Score Post-Processing
• Same decision function as for learning
       – Many false positives
       – Bias toward some queries
• Heuristic post-processing:
       – For each query, subtract a specific threshold
       – Threshold: 90-th percentile of the CDTW
         scores for that query
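
A minimal sketch of this per-query normalisation, assuming the scores are grouped by query in a dictionary and that the percentile is computed with linear interpolation between order statistics (the slides do not say which percentile definition was used). The names `percentile` and `normalize_scores` are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between sorted values."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def normalize_scores(scores_by_query, q=90):
    """Subtract each query's own q-th percentile from all of its scores."""
    normalized = {}
    for query, scores in scores_by_query.items():
        threshold = percentile(scores, q)
        normalized[query] = [s - threshold for s in scores]
    return normalized


raw = {
    "query_a": [float(v) for v in range(11)],       # scores 0..10
    "query_b": [float(v) + 100 for v in range(11)], # same shape, large offset
}
print(normalize_scores(raw))
```

After the subtraction, both example queries end up on the same scale, so a single global decision threshold no longer favours the query with the higher raw scores.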


Results
run                       devQ-devC   evalQ-devC   devQ-evalC   evalQ-evalC
P(miss)                   55.6%       59.5%        60.2%        54.5%
P(FA)                     1.18%       1.13%        1.17%        1.13%
ATWV                      0.263       0.333        0.164        0.290




• Great improvement over naive DTW
       – Naive DTW: ATWV = 0.065 on devQ-devC (CDTW: 0.263)
• ATWV scores depend on the run

Results

• DET curves similar

• CDTW seems to
  generalize well

• Decision function has
  to be improved




Conclusion
• CDTW: promising results
       – Data-based approach with satisfactory results
       – Significantly outperforms (naive) DTW
       – Good generalization
• Future work:
       – Decision function
       – Acoustic descriptors
       – Integrate "hard" path constraints into search
Thank you.

                          Cyril.Joder@tum.de





More Related Content

Viewers also liked

KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesMediaEval2012
 
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMTUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMMediaEval2012
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonSharon Jimenez
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...MediaEval2012
 
14 10 21_презентация сту
14 10 21_презентация сту14 10 21_презентация сту
14 10 21_презентация стуStanislav Litvinenko
 
How Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingHow Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingMediaEval2012
 
2010 Marketing Plan
2010 Marketing Plan2010 Marketing Plan
2010 Marketing PlanJPemberton15
 
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval2012
 
6dicas– veda 4
6dicas– veda 46dicas– veda 4
6dicas– veda 4souzadea1
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account MatchingMediaEval2012
 
Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skillsJNavarro0321
 
Designinteração– veda 3
Designinteração– veda 3Designinteração– veda 3
Designinteração– veda 3souzadea1
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskMediaEval2012
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskMediaEval2012
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...MediaEval2012
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...MediaEval2012
 
Core companies for eee
Core companies for eeeCore companies for eee
Core companies for eeenarenans
 

Viewers also liked (20)

KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
 
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMTUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharon
 
κειμενο
κειμενοκειμενο
κειμενο
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
 
14 10 21_презентация сту
14 10 21_презентация сту14 10 21_презентация сту
14 10 21_презентация сту
 
How Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingHow Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-Tagging
 
2010 Marketing Plan
2010 Marketing Plan2010 Marketing Plan
2010 Marketing Plan
 
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
 
6dicas– veda 4
6dicas– veda 46dicas– veda 4
6dicas– veda 4
 
10 ρ. δρακουλησ
10 ρ. δρακουλησ10 ρ. δρακουλησ
10 ρ. δρακουλησ
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account Matching
 
Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skills
 
Designinteração– veda 3
Designinteração– veda 3Designinteração– veda 3
Designinteração– veda 3
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing Task
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
 
Papiloma humano
Papiloma humanoPapiloma humano
Papiloma humano
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
 
Core companies for eee
Core companies for eeeCore companies for eee
Core companies for eee
 

Similar to The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task

Acceptance Test Driven Development
Acceptance Test Driven DevelopmentAcceptance Test Driven Development
Acceptance Test Driven DevelopmentMike Douglas
 
Software Engineering Practice - Software Quality Management
Software Engineering Practice - Software Quality ManagementSoftware Engineering Practice - Software Quality Management
Software Engineering Practice - Software Quality ManagementRadu_Negulescu
 
Best Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesBest Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesAlan Said
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...Antonio Tejero de Pablos
 
Agile Business Rhythm
Agile Business RhythmAgile Business Rhythm
Agile Business RhythmGlen Alleman
 
Project m&e & logframe
Project m&e & logframeProject m&e & logframe
Project m&e & logframeWesley Opaki
 
A location-aware embedding technique for accurate landmark recognition
A location-aware embedding technique for accurate landmark recognitionA location-aware embedding technique for accurate landmark recognition
A location-aware embedding technique for accurate landmark recognitionFederico Magliani
 
Benchmarking Execution Performance and Earned Value
Benchmarking Execution Performance and Earned ValueBenchmarking Execution Performance and Earned Value
Benchmarking Execution Performance and Earned ValueAcumen
 
18 - Impact of Process Conformance on the Effects of Test-driven Development
18 - Impact of Process Conformance on the Effects of Test-driven Development18 - Impact of Process Conformance on the Effects of Test-driven Development
18 - Impact of Process Conformance on the Effects of Test-driven DevelopmentESEM 2014
 
Kanban and TOC for Execution Excellence Lean India Summit 2014
Kanban and TOC for Execution Excellence   Lean India Summit 2014Kanban and TOC for Execution Excellence   Lean India Summit 2014
Kanban and TOC for Execution Excellence Lean India Summit 2014Lean India Summit
 
Performance-based Curriculum Architecture Design
Performance-based Curriculum Architecture DesignPerformance-based Curriculum Architecture Design
Performance-based Curriculum Architecture DesignEPPIC Inc.
 
Pdu session challenges in agile
Pdu session   challenges in agilePdu session   challenges in agile
Pdu session challenges in agileBhawani N Prasad
 
Introducing LCS to Digital Design Verification
Introducing LCS to Digital Design VerificationIntroducing LCS to Digital Design Verification
Introducing LCS to Digital Design VerificationDaniele Loiacono
 
Continuous delivery, a plugin for Kanban LKFR14
Continuous delivery, a plugin for Kanban LKFR14Continuous delivery, a plugin for Kanban LKFR14
Continuous delivery, a plugin for Kanban LKFR14Samuel RETIERE
 
Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...
Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...
Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...compatsch
 

Similar to The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task (20)

CCDE Experience
CCDE ExperienceCCDE Experience
CCDE Experience
 
Qual-IT-yes2012
Qual-IT-yes2012Qual-IT-yes2012
Qual-IT-yes2012
 
The Art of Project Estimation
The Art of Project EstimationThe Art of Project Estimation
The Art of Project Estimation
 
Acceptance Test Driven Development
Acceptance Test Driven DevelopmentAcceptance Test Driven Development
Acceptance Test Driven Development
 
Software Engineering Practice - Software Quality Management
Software Engineering Practice - Software Quality ManagementSoftware Engineering Practice - Software Quality Management
Software Engineering Practice - Software Quality Management
 
The art of project estimation
The art of project estimationThe art of project estimation
The art of project estimation
 
Best Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesBest Practices in Recommender System Challenges
Best Practices in Recommender System Challenges
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
 
Agile Business Rhythm
Agile Business RhythmAgile Business Rhythm
Agile Business Rhythm
 
Project m&e & logframe
Project m&e & logframeProject m&e & logframe
Project m&e & logframe
 
A location-aware embedding technique for accurate landmark recognition
A location-aware embedding technique for accurate landmark recognitionA location-aware embedding technique for accurate landmark recognition
A location-aware embedding technique for accurate landmark recognition
 
Benchmarking Execution Performance and Earned Value
Benchmarking Execution Performance and Earned ValueBenchmarking Execution Performance and Earned Value
Benchmarking Execution Performance and Earned Value
 
18 - Impact of Process Conformance on the Effects of Test-driven Development
18 - Impact of Process Conformance on the Effects of Test-driven Development18 - Impact of Process Conformance on the Effects of Test-driven Development
18 - Impact of Process Conformance on the Effects of Test-driven Development
 
Kanban and TOC for Execution Excellence Lean India Summit 2014
Kanban and TOC for Execution Excellence   Lean India Summit 2014Kanban and TOC for Execution Excellence   Lean India Summit 2014
Kanban and TOC for Execution Excellence Lean India Summit 2014
 
Performance-based Curriculum Architecture Design
Performance-based Curriculum Architecture DesignPerformance-based Curriculum Architecture Design
Performance-based Curriculum Architecture Design
 
Pdu session challenges in agile
Pdu session   challenges in agilePdu session   challenges in agile
Pdu session challenges in agile
 
Introducing LCS to Digital Design Verification
Introducing LCS to Digital Design VerificationIntroducing LCS to Digital Design Verification
Introducing LCS to Digital Design Verification
 
Orchestration, Automation and Virtualisation Maturity Model
Orchestration, Automation and Virtualisation Maturity ModelOrchestration, Automation and Virtualisation Maturity Model
Orchestration, Automation and Virtualisation Maturity Model
 
Continuous delivery, a plugin for Kanban LKFR14
Continuous delivery, a plugin for Kanban LKFR14Continuous delivery, a plugin for Kanban LKFR14
Continuous delivery, a plugin for Kanban LKFR14
 
Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...
Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...
Quintin Cutts - Teaching and Learning to Program: Too much doing and not enou...
 

More from MediaEval2012

MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval2012
 
A Multimodal Approach for Video Geocoding
A Multimodal Approach for   Video Geocoding A Multimodal Approach for   Video Geocoding
A Multimodal Approach for Video Geocoding MediaEval2012
 
Brave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingBrave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingMediaEval2012
 
Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012MediaEval2012
 
CUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskCUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskMediaEval2012
 
DCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskDCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskMediaEval2012
 
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...MediaEval2012
 
The CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsThe CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsMediaEval2012
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskMediaEval2012
 
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval2012
 
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...MediaEval2012
 
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...MediaEval2012
 
The MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioMediaEval2012
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodMediaEval2012
 
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...MediaEval2012
 
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskUNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskMediaEval2012
 
ARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationMediaEval2012
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskMediaEval2012
 

More from MediaEval2012 (20)

MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 Opening
 
Closing
ClosingClosing
Closing
 
A Multimodal Approach for Video Geocoding
A Multimodal Approach for   Video Geocoding A Multimodal Approach for   Video Geocoding
A Multimodal Approach for Video Geocoding
 
Brave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingBrave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music Tagging
 
Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012
 
CUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskCUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking Task
 
DCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskDCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
 
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
 
The CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsThe CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and Onwards
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy Task
 
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
 
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
 
mevd2012 esra_
 mevd2012 esra_ mevd2012 esra_
mevd2012 esra_
 
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
 
The MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes Detectio
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic method
 
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
 
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskUNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
 
ARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video Classification
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging Task
 

Recently uploaded

Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task

  • 1. The TUM Cumulative DTW Approach for the Spoken Web Search Task
       Cyril Joder, Felix Weninger, Martin Wöllmer, Björn Schuller
       Institute for Human-Machine Communication, Technische Universität München
  • 2. Summary
       • Not a “system”
       • Low-level features only
       • No ASR
       • Little “engineering”
       • Method of integrating discriminative training into DTW
       Mediaeval 2012 Workshop
  • 3. Cumulative DTW (CDTW)
       • Limitations of DTW:
         – Only one local cost function (distance)
         – Usually manual parameter tuning
       • Idea:
         – Use different local cost functions for each step
         – Automatic learning of these functions as combination of general features
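The idea on slide 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the step set (diagonal, vertical, horizontal), the `features` function and the weight values are all hypothetical, and the sketch keeps the ordinary hard minimum of DTW. The slides say only that each step gets its own local cost, learned as a combination of general features.

```python
import math

# Hypothetical step set: (di, dj) moves in the alignment grid.
STEPS = [(1, 1), (1, 0), (0, 1)]  # diagonal, vertical, horizontal


def features(x, y):
    """Hypothetical 'general features' of a frame pair: absolute difference and a bias."""
    return [abs(x - y), 1.0]


def step_cost(weights, x, y):
    """Local cost for one step type: a weighted combination of the features."""
    return sum(w * f for w, f in zip(weights, features(x, y)))


def dtw_per_step_costs(a, b, step_weights):
    """DTW over sequences a and b where each step type has its own cost function.

    step_weights[k] holds the (assumed, normally learned) weights for STEPS[k].
    Returns the minimal accumulated cost of aligning all of a with all of b.
    """
    n, m = len(a), len(b)
    acc = [[math.inf] * m for _ in range(n)]
    # The first cell is scored with the diagonal step's cost (an arbitrary choice).
    acc[0][0] = step_cost(step_weights[0], a[0], b[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = math.inf
            for k, (di, dj) in enumerate(STEPS):
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0:
                    cand = acc[pi][pj] + step_cost(step_weights[k], a[i], b[j])
                    best = min(best, cand)
            acc[i][j] = best
    return acc[n - 1][m - 1]
```

With identical weights `[1.0, 0.0]` for all three steps, this reduces to classical DTW with an absolute-difference distance. Giving the vertical and horizontal steps a non-zero bias weight penalizes warping, which is the kind of parameter that would otherwise be tuned by hand.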
  • 4. From DTW to CDTW
  • 5. From DTW to CDTW
  • 6. From DTW to CDTW
  • 9. Decision
       • Are the two sequences instances of the same word/expression?
       • Learning of the parameters:
         – Backpropagation (stochastic gradient descent)
         – Training data: queries/utterances of dev set
  • 11. Candidate Search
       • Align query with entire utterance
         – CDTW with backtracking
         – “Scores” for each point
       • Extract potential starts and ends
         – Peak-picking of scores
       • Filter by duration
         – Only allow warping factors < 2
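The last two steps of the candidate search (peak-picking and the duration filter) can be illustrated with a short Python sketch. It rests on assumptions not stated in the slides: peaks are taken as strict local maxima, durations are counted in frames, and the warping factor is read as the ratio between the longer and the shorter of the candidate and query durations. The function names are hypothetical.

```python
def local_peaks(scores):
    """Indices of strict local maxima in a list of scores (simple peak-picking)."""
    return [i for i in range(1, len(scores) - 1)
            if scores[i] > scores[i - 1] and scores[i] > scores[i + 1]]


def filter_by_duration(candidates, query_len, max_warp=2.0):
    """Keep (start, end) frame pairs whose length is compatible with the query.

    Assumption: the warping factor is the ratio between the longer and the
    shorter of the two durations, and it must stay strictly below max_warp.
    """
    kept = []
    for start, end in candidates:
        duration = end - start + 1
        if duration <= 0:
            continue  # an end before its start is not a valid candidate
        warp = max(duration / query_len, query_len / duration)
        if warp < max_warp:
            kept.append((start, end))
    return kept
```

For a query of 10 frames, `filter_by_duration([(0, 9), (0, 19), (0, 4), (0, 14)], 10)` keeps only `(0, 9)` and `(0, 14)`: the candidates of 20 and 5 frames have a warping factor of exactly 2 and are rejected.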
  • 13. CDTW Score Post-Processing
       • Same decision function as for learning
         – Many false positives
         – Bias toward some queries
       • Heuristic post-processing:
         – For each query, subtract a specific threshold
         – Threshold: 90th percentile of the CDTW scores for that query
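The per-query normalization described on slide 13 might look like the following Python sketch. The slides do not say which percentile definition was used, so the linear-interpolation variant here is an assumption, as are the function names and the dictionary layout of the scores.

```python
def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty list")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def normalize_scores(scores_by_query, q=90.0):
    """Subtract from each query's scores that query's own q-th percentile."""
    normalized = {}
    for query, scores in scores_by_query.items():
        threshold = percentile(scores, q)
        normalized[query] = [s - threshold for s in scores]
    return normalized
```

After this step a single global decision threshold of 0 accepts roughly the top 10% of candidates for every query, which counteracts the bias toward queries whose raw scores are systematically high.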
  • 14. Results
       run        devQ-devC   evalQ-devC   devQ-evalC   evalQ-evalC
       P(miss)    55.6%       59.5%        60.2%        54.5%
       P(FA)      1.18%       1.13%        1.17%        1.13%
       ATWV       0.263       0.333        0.164        0.290
       • Great improvement over naive DTW
         – ATWV = 0.065 on devQ-devC
       • ATWV scores depend on the run
  • 15. Results
       • DET curves similar
       • CDTW seems to generalize well
       • Decision function has to be improved
  • 16. Conclusion
       • CDTW: promising results
         – Data-based approach with satisfactory results
         – Significantly outperforms (naive) DTW
         – Good generalization
       • Future work:
         – Decision function
         – Acoustic descriptors
         – Integrate “hard” path constraints into search
  • 17. Thank you. Cyril.Joder@tum.de