GTTS System for the Spoken Web Search Task at MediaEval 2012

  1. GTTS System for the Spoken Web Search Task at MediaEval 2012
     Amparo Varona, Mikel Penagarikano, Luis Javier Rodríguez Fuentes, Germán Bordel, Mireia Diez
     University of the Basque Country UPV/EHU
     luisjavier.rodriguez@ehu.es
     http://gtts.ehu.es
     MediaEval 2012 - SWS Task - GTTS System - Pisa, October 4, 2012
  2. HEARCH: Search on Broadcast News (ASR + Lemmatization + Index)
  3. HEARCH-P: Search on Parliamentary Sessions (Audio-Text Alignment + Index)
  4. Search on spoken resources on the Internet
       • Searching for text queries (computer)
       • Searching for spoken queries (mobile)
       • Need for a common representation:
           • Acoustic (DTW-like approaches)
           • Phonetic (search on phone lattices)
           • Word-level (ASR-based approaches)
  5. SWS at MediaEval 2012 (for GTTS)
       • Opportunity to enter the field of unrestricted search on spoken resources
       • Opportunity to access development and evaluation data
       • Opportunity to learn the state of the art from experts in the field
       • Our approach: search for the n-best phonetic representations of the queries on phone lattices of the spoken resources
  6. GTTS System: how it works
       • BUT (Brno University of Technology) decoders for Czech, Hungarian and Russian
       • Reduced sets of phonetic classes (IPA clusters)
       • Approximate string matching (n edit operations allowed): Dong Wang's Lattice2Multigram tool
       • Scores: length-normalized, plus a kind of log-likelihood ratio with regard to all the detections in the same audio file
       • Overlapping detections: only the most likely one is retained
       • For each query: the K most likely detections are kept, then z-normalization and a threshold are applied
  7. GTTS System: how it performs
       • Best configurations determined in preliminary experiments on the development dataset
       • Primary: 3-best query phone decodings, 2 edit operations allowed in matching
       • Contrastive: 1-best query phone decoding, 2 edit operations allowed in matching
       • Poor performance !!!
       • Change in the approach: searching for the best detection of each query in each audio document
  8. THANKS !!!
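
The detection post-processing listed on slide 6 (overlap removal, K most likely detections per query, z-normalization, threshold) might look roughly like the sketch below. This is only an illustration and not the GTTS implementation: the `Detection` record, the function names, the default values of `k` and `threshold`, the choice to resolve overlaps per query and per audio file, and the choice to compute the z-normalization over the K retained scores are all assumptions, since the slides do not specify them. The scores are assumed to be already length-normalized, with higher meaning more likely.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Detection:
    """Hypothetical record for one query detection in one audio file."""
    query: str
    audio: str
    start: float   # seconds
    end: float     # seconds
    score: float   # assumed length-normalized; higher = more likely


def remove_overlaps(dets):
    """Keep only the most likely detection among overlapping ones.

    Assumption: two detections compete only if they belong to the same
    query and the same audio file and their time spans intersect.
    """
    kept = []
    for d in sorted(dets, key=lambda x: x.score, reverse=True):
        clash = any(
            k.query == d.query and k.audio == d.audio
            and d.start < k.end and k.start < d.end
            for k in kept
        )
        if not clash:
            kept.append(d)
    return kept


def postprocess(dets, k=3, threshold=0.0):
    """Per query: keep the K best detections, z-normalize, apply a threshold.

    Returns a list of (detection, z_score) pairs.
    """
    by_query = defaultdict(list)
    for d in remove_overlaps(dets):
        by_query[d.query].append(d)

    accepted = []
    for ds in by_query.values():
        top = sorted(ds, key=lambda x: x.score, reverse=True)[:k]
        scores = [d.score for d in top]
        mu = mean(scores)
        sigma = pstdev(scores) or 1.0  # avoid dividing by zero
        for d in top:
            z = (d.score - mu) / sigma
            if z >= threshold:
                accepted.append((d, z))
    return accepted
```

Calling `postprocess(detections, k=3, threshold=0.5)` on a list of `Detection` objects returns the surviving detections paired with their z-scores. In practice `k` and `threshold` would be tuned on the development dataset, as slide 7 describes for the other settings.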