The TUM Cumulative DTW Approach for the MediaEval 2012 Spoken Web Search Task
1. The TUM Cumulative DTW Approach for the Spoken Web Search Task
Cyril Joder, Felix Weninger, Martin Wöllmer, Björn Schuller
Institute for Human-Machine Communication
Technische Universität München
2. Summary
• Not a “system”
• Low-level features only
• No ASR
• Little “engineering”
• Method of integrating discriminative training into DTW
MediaEval 2012 Workshop
3. Cumulative DTW (CDTW)
• Limitations of DTW:
– Only one local cost function (distance)
– Usually manual parameter tuning
• Idea:
– Use different local cost functions for each step
– Automatic learning of these functions as a combination of general features
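
The idea above can be illustrated with a minimal sketch. It is not the authors' implementation: the step set `STEPS`, the two-element feature vector returned by `features`, and the use of a hard minimum over predecessors (as in plain DTW) are all assumptions made for illustration, since the slides do not specify them. Each step type gets its own weight vector, and the local cost of a step is the weighted sum of the features of the frame pair it arrives at.

```python
import math

# Hypothetical step set: (query advance, utterance advance).
STEPS = ((1, 1), (1, 0), (0, 1))


def features(x, y):
    """Hypothetical general features of a frame pair: a bias term and the
    squared Euclidean distance between the two frames."""
    dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return (1.0, dist)


def local_cost(weights, feats):
    """Local cost as a weighted combination of the features."""
    return sum(w * f for w, f in zip(weights, feats))


def step_dtw(query, utterance, weights):
    """DTW in which every step type has its own local cost function.

    weights[k] is the weight vector used when arriving via STEPS[k].
    Returns the accumulated cost of the best full alignment.
    """
    n, m = len(query), len(utterance)
    acc = [[math.inf] * m for _ in range(n)]
    # Assumption: the first cell is scored with the diagonal-step weights.
    acc[0][0] = local_cost(weights[0], features(query[0], utterance[0]))
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            feats = features(query[i], utterance[j])
            best = math.inf
            for k, (di, dj) in enumerate(STEPS):
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                best = min(best, acc[pi][pj] + local_cost(weights[k], feats))
            acc[i][j] = best
    return acc[n - 1][m - 1]
```

With identical weights `(0, 1)` for every step this reduces to ordinary DTW with a squared Euclidean distance; giving the non-diagonal steps a positive bias weight penalizes time stretching, which is the kind of parameter that would otherwise be tuned by hand.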
9. Decision
• Are the two sequences instances of the same word/expression?
• Learning of the parameters:
– Backpropagation (stochastic gradient descent)
– Training data: queries/utterances of dev set
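
As a rough illustration of learning a decision by stochastic gradient descent, the sketch below trains a logistic decision on a single alignment score. The slides do not give the form of the decision function, so the logistic model, the parameter names `a` and `b`, and the cross-entropy loss are assumptions; in the actual method the gradient would be propagated further back into the local cost weights, which is not shown here.

```python
import math
import random


def decide(score, a, b):
    """Hypothetical decision function: probability that the two sequences
    are instances of the same word, given their alignment score."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def sgd_train(pairs, lr=0.1, epochs=200, seed=0):
    """Fit a and b by stochastic gradient descent on the cross-entropy loss.

    pairs: list of (alignment_score, label), label 1 for a match, else 0.
    """
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    data = list(pairs)
    for _ in range(epochs):
        rng.shuffle(data)
        for score, label in data:
            p = decide(score, a, b)
            grad = p - label  # derivative of the loss w.r.t. the logit
            a -= lr * grad * score
            b -= lr * grad
    return a, b
```

For example, with toy training pairs in which matching pairs have low alignment cost, `sgd_train([(0.5, 1), (1.0, 1), (4.0, 0), (5.0, 0)])` yields parameters for which low scores are accepted and high scores rejected.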
11. Candidate Search
• Align query with entire utterance
– CDTW with backtracking
– “Scores” for each point
• Extract potential starts and ends
– Peak-picking of scores
• Filter by duration
– Only allow warping factors < 2
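
The last two steps can be sketched as follows. This is an illustration under assumptions, not the authors' code: the peak criterion (a local maximum at or above a hypothetical `threshold`) is not given on the slide, and "warping factor < 2" is read here as requiring the candidate length to lie strictly between half and twice the query length.

```python
def pick_peaks(scores, threshold):
    """Return indices of local maxima whose score reaches the threshold.

    The threshold and the tie-breaking rule are assumptions.
    """
    peaks = []
    for t, s in enumerate(scores):
        left = scores[t - 1] if t > 0 else float("-inf")
        right = scores[t + 1] if t + 1 < len(scores) else float("-inf")
        if s >= threshold and s > left and s >= right:
            peaks.append(t)
    return peaks


def filter_by_duration(starts, ends, query_len, max_warp=2.0):
    """Keep (start, end) frame pairs whose length is within a factor
    max_warp of the query length (assumed reading of the constraint)."""
    kept = []
    for s in starts:
        for e in ends:
            length = e - s + 1
            if length <= 0:
                continue
            if query_len / max_warp < length < query_len * max_warp:
                kept.append((s, e))
    return kept
```

For a 10-frame query, `filter_by_duration([0, 10], [12, 40], 10)` keeps only the candidate spanning frames 0 to 12; the other start/end combinations are either shorter than 5 frames or longer than 20.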
13. CDTW Score Post-Processing
• Same decision function as for learning
– Many false positives
– Bias toward some queries
• Heuristic post-processing:
– For each query, subtract a specific threshold
– Threshold: 90th percentile of the CDTW scores for that query
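
A minimal sketch of this per-query normalization is shown below. The percentile is computed with linear interpolation between order statistics, which is an assumption (the slide does not say which percentile definition was used), and the function and variable names are illustrative.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) with linear interpolation between
    neighbouring order statistics. The interpolation rule is an assumption."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def normalize_scores(scores_by_query, q=90.0):
    """Subtract from each query's scores that query's own q-th percentile."""
    normalized = {}
    for query, scores in scores_by_query.items():
        threshold = percentile(scores, q)
        normalized[query] = [s - threshold for s in scores]
    return normalized
```

After this step, two queries whose raw scores differ only by a constant offset receive identical normalized scores, which is how the subtraction counteracts the bias toward some queries.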
14. Results
Run       devQ-devC   evalQ-devC   devQ-evalC   evalQ-evalC
P(miss)   55.6%       59.5%        60.2%        54.5%
P(FA)     1.18%       1.13%        1.17%        1.13%
ATWV      0.263       0.333        0.164        0.290
• Large improvement over naive DTW
– Naive DTW: ATWV = 0.065 on devQ-devC (CDTW: 0.263)
• ATWV varies with the run (0.164 to 0.333)
15. Results
• DET curves are similar across runs
• CDTW seems to generalize well
• Decision function has to be improved
16. Conclusion
• CDTW: promising results
– Data-based approach with satisfactory results
– Significantly outperforms (naive) DTW
– Good generalization
• Future work:
– Decision function
– Acoustic descriptors
– Integrate “hard” path constraints into search