
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task


  1. The TUM Cumulative DTW Approach for the Spoken Web Search Task
     Cyril Joder, Felix Weninger, Martin Wöllmer, Björn Schuller
     Institute for Human-Machine Communication, Technische Universität München
  2. Summary
     • Not a “system”
     • Low-level features only
     • No ASR
     • Little “engineering”
     • Method of integrating discriminative training into DTW
     Mediaeval 2012 Workshop
  3. Cumulative DTW (CDTW)
     • Limitations of DTW:
       – Only one local cost function (distance)
       – Usually manual parameter tuning
     • Idea:
       – Use different local cost functions for each step
       – Automatic learning of these functions as a combination of general features
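The idea on this slide can be illustrated with a small sketch. The function below reduces to ordinary DTW when every step shares one distance, and it also accepts a separate local cost function per step type. The step set, the `linear_cost` helper, and its absolute-difference features are assumptions made for illustration; the slides do not specify them, and the authors' learned cost functions may look different.

```python
import math

# Allowed steps as (di, dj): diagonal, vertical, horizontal (an assumed step set).
STEPS = [(1, 1), (1, 0), (0, 1)]


def euclidean(x, y):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def linear_cost(weights, bias=0.0):
    """Hypothetical parametric cost: bias plus a weighted sum of
    per-dimension absolute differences (not taken from the slides)."""
    def cost(x, y):
        return bias + sum(w * abs(a - b) for w, a, b in zip(weights, x, y))
    return cost


def dtw(query, utterance, step_costs=None):
    """Minimum cumulative alignment cost between two sequences.

    step_costs maps each step in STEPS to its own local cost function.
    With step_costs=None, all steps share the Euclidean distance,
    which is classical DTW.
    """
    if step_costs is None:
        step_costs = {s: euclidean for s in STEPS}
    n, m = len(query), len(utterance)
    inf = float("inf")
    D = [[inf] * m for _ in range(n)]
    D[0][0] = step_costs[(1, 1)](query[0], utterance[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = inf
            for di, dj in STEPS:
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0:
                    local = step_costs[(di, dj)](query[i], utterance[j])
                    best = min(best, D[pi][pj] + local)
            D[i][j] = best
    return D[n - 1][m - 1]
```

Passing, for example, `{(1, 1): euclidean, (1, 0): linear_cost([1.0], 1.0), (0, 1): linear_cost([1.0], 1.0)}` penalizes every non-diagonal step by a constant, which is the kind of parameter that would otherwise be tuned by hand.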
  4. From DTW to CDTW
  5. From DTW to CDTW
  6. From DTW to CDTW
  7. Softmax?
  8. Features
  9. Decision
     • Are the two sequences instances of the same word/expression?
     • Learning of the parameters:
       – Backpropagation (stochastic gradient descent)
       – Training data: queries/utterances of dev set
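As a rough illustration of a decision learned by stochastic gradient descent, the sketch below fits a logistic function to scalar alignment scores labeled as match or non-match. This is a simplified stand-in: in the approach described on the slides, the gradient is backpropagated into the CDTW cost parameters, whereas here only a hypothetical weight `w` and bias `b` on a precomputed score are learned. The learning rate, epoch count, and cross-entropy loss are assumptions.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_decision(pairs, epochs=200, lr=0.1, seed=0):
    """Fit p(match | score) = sigmoid(w * score + b) by SGD.

    pairs: list of (score, label) with label 1 for "same word/expression"
    and 0 otherwise. Returns the learned (w, b).
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(pairs)
    for _ in range(epochs):
        rng.shuffle(data)
        for score, label in data:
            p = sigmoid(w * score + b)
            grad = p - label  # derivative of cross-entropy w.r.t. the logit
            w -= lr * grad * score
            b -= lr * grad
    return w, b
```

With alignment costs as scores, matching pairs have low values, so the learned `w` comes out negative and `sigmoid(w * score + b) > 0.5` serves as the accept rule.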
  10. Search Procedure
  11. Candidate Search
      • Align query with entire utterance
        – CDTW with backtracking
        – “Scores” for each point
      • Extract potential starts and ends
        – Peak-picking of scores
      • Filter by duration
        – Only allow warping factors < 2
  12. Candidate Search
      • Align query with entire utterance
        – CDTW with backtracking
        – “Scores” for each point
      • Extract potential starts and ends
        – Peak-picking of scores
      • Filter by duration
        – Only allow warping factors < 2
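The last two steps of the candidate search can be sketched as follows. The peak definition (a local maximum above a threshold) and the reading of "warping factor" as the duration ratio between candidate and query, in either direction, are assumptions; the slides do not define them. The function names are hypothetical.

```python
def pick_peaks(scores, threshold):
    """Return indices of local maxima in scores that exceed threshold."""
    peaks = []
    for t in range(1, len(scores) - 1):
        is_local_max = scores[t] > scores[t - 1] and scores[t] >= scores[t + 1]
        if scores[t] > threshold and is_local_max:
            peaks.append(t)
    return peaks


def filter_by_duration(candidates, query_len, max_warp=2.0):
    """Keep (start, end) frame spans whose length is within a factor
    of max_warp (exclusive) of the query length."""
    kept = []
    for start, end in candidates:
        length = end - start + 1
        if length <= 0:
            continue  # skip spans whose end precedes their start
        warp = max(length / query_len, query_len / length)
        if warp < max_warp:
            kept.append((start, end))
    return kept
```

For a 10-frame query, `filter_by_duration([(0, 9), (0, 29), (0, 3)], 10)` keeps only `(0, 9)`, since the other spans are more than twice as long or less than half as long as the query.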
  13. CDTW Score Post-Processing
      • Same decision function as for learning
        – Many false positives
        – Bias toward some queries
      • Heuristic post-processing:
        – For each query, subtract a specific threshold
        – Threshold: 90th percentile of the CDTW scores for that query
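A minimal sketch of this per-query normalization is given below. The slides do not say how the percentile is computed, so the linear-interpolation definition used here is an assumption, as are the function names and the dictionary layout.

```python
def percentile(values, q):
    """q-th percentile (0..100) with linear interpolation between ranks."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + (xs[hi] - xs[lo]) * frac


def normalize_scores(scores_by_query, q=90.0):
    """Subtract each query's own q-th percentile from its scores.

    scores_by_query: dict mapping a query id to the list of CDTW scores
    of all its candidates. Returns a dict with the same keys and order.
    """
    normalized = {}
    for query, scores in scores_by_query.items():
        threshold = percentile(scores, q)
        normalized[query] = [s - threshold for s in scores]
    return normalized
```

After this step, a single global threshold at zero accepts roughly the top 10% of candidates for every query, which counters the bias toward queries that score high everywhere.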
  14. Results
      run       devQ-devC   evalQ-devC   devQ-evalC   evalQ-evalC
      P(miss)   55.6%       59.5%        60.2%        54.5%
      P(FA)     1.18%       1.13%        1.17%        1.13%
      ATWV      0.263       0.333        0.164        0.290
      • Great improvement over naive DTW
        – ATWV = 0.065 on devQ-devC
      • ATWV scores depend on the run
  15. Results
      • DET curves similar
      • CDTW seems to generalize well
      • Decision function has to be improved
  16. Conclusion
      • CDTW: promising results
        – Data-based approach with satisfactory results
        – Significantly outperforms (naive) DTW
        – Good generalization
      • Future work:
        – Decision function
        – Acoustic descriptors
        – Integrate “hard” path constraints into search
  17. Thank you.
      Cyril.Joder@tum.de
