MediaEval 2012
          Spoken Web Search

Florian Metze, Marelie Davel, Etienne Barnard, Xavier
Anguera, Guillaume Gravier, and Nitendra Rajput
                                 Pisa, October 4, 2012
Outline




     The Spoken Web Search Task

     Data and Scoring

     Organizers and Participants

     Results

     Discussion
Organizers



   Florian Metze (Carnegie Mellon)

   Etienne Barnard, Marelie Davel, Charl v. Heerden (North-West University)

   Xavier Anguera (Telefonica Research)

   Guillaume Gravier (IRISA)

   Nitendra Rajput (IBM India)
Real life audio content is very diverse!




                                      “2011 Indian Data”
Spoken Web Search Task Motivation



   Any speech problem can be solved with enough:

      Money, Time, Constraints, Data

   What if we have only one of these: constraints?

      Don’t know what language/dialect is being used? Don’t have much data!

      But don’t have to do Large Vocabulary Speech Recognition,
        “only” content retrieval

   What can be done?

      Port outside resources (i.e. run language-independent/-portable recognizer)

      Build a “zero knowledge” approach (i.e. try to directly identify similar words)
Primary Data Source: “African Data”


 “Lwazi” Corpus
    Lwazi means Knowledge
 Lwazi project aims to develop telephony-based speech-driven information system
 11 South-African languages, 3h-6h of speech per language
    Phone sets, dictionaries, read & spontaneous, …
 3200 utterances used, from 4 languages

 Data obtained during targeted effort
    Meant as resource for speech research, so no “found” data, as “Indian Data”
 E. Barnard, M. Davel, and C. van Heerden, "ASR Corpus design for resource-scarce languages," in Proc. INTERSPEECH, Brighton, UK; Sep. 2009, pp. 2847-2850.
Evaluation Paradigm:
Spoken Term Detection (STD)


   Do not attempt to convert speech to text (full recognition, ASR)

   Attempt to detect the occurrence (or absence) of “keywords”

   STD is not easier than doing ASR

      It requires fewer resources; in particular, no strong language model

   Evaluation metrics:

      (Spoken) Document Retrieval (SDR), when relaxing time constraints

      Actual and Maximum Term Weighted Value (ATWV, MTWV; defined by NIST)
Evaluation Idea – 4 Conditions



   Test development terms on (known) development data

   Test (unknown) evaluation terms on (unknown) evaluation data

   Test development terms on evaluation data

   Test evaluation terms on development data



   Terms provided as audio examples taken from collections

   Systems could be developed with or without external resources (i.e. other
    speech data); it is important to document which ones were used
    (“restricted” vs “open”)
NIST Scoring Tools



   Developed for the NIST 2006 Spoken Term Detection evaluation
      Generates “Actual” and “Maximum Term Weighted Value” (ATWV, MTWV)

      Generates DET curves

   Adapted by us
      ECF = “Experiment Control File” (controls which sections to process)

      RTTM = “Rich Transcription Time Mark” (defines references)

      TLIST = “Term List” files (links term IDs and word dictionary)

   A few parameters to choose
      Different for 2011 and 2012, to better represent characteristics of SWS task
       (thanks, Xavi)

   Best ATWV value is 1, below 0 possible
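
To illustrate why the best value is 1 and why values below 0 can occur, here is a minimal sketch of the term-weighted value as defined for the NIST 2006 STD evaluation: TWV = 1 - average over terms of (P(Miss) + beta * P(FA)), with beta = (C/V) * (1/P(term) - 1). The function names, the toy counts, the one-trial-per-second approximation, and the parameter values (C/V = 0.1 and P(term) = 1e-4, the NIST 2006 defaults, not the SWS settings) are assumptions made for this example; it is not the NIST tool itself.

```python
def false_alarm_weight(cost_fa, value_hit, p_term):
    """beta = (C / V) * (1 / P(term) - 1), following the NIST 2006 STD definition."""
    return (cost_fa / value_hit) * (1.0 / p_term - 1.0)


def term_weighted_value(term_counts, speech_seconds, beta):
    """TWV = 1 - mean over terms of (P_miss + beta * P_FA).

    term_counts: list of (n_true, n_correct, n_false_alarm), one tuple per term.
    Non-target trials are approximated as one per second of speech (assumption).
    """
    losses = []
    for n_true, n_correct, n_false_alarm in term_counts:
        if n_true == 0:
            continue  # a term without reference occurrences has no P_miss
        p_miss = 1.0 - n_correct / n_true
        p_fa = n_false_alarm / (speech_seconds - n_true)
        losses.append(p_miss + beta * p_fa)
    return 1.0 - sum(losses) / len(losses)


# Toy parameter values (NIST 2006 defaults, used here only for illustration).
beta = false_alarm_weight(cost_fa=0.1, value_hit=1.0, p_term=1e-4)

perfect = term_weighted_value([(5, 5, 0), (3, 3, 0)], 3600, beta)   # all hits, no false alarms
silent = term_weighted_value([(5, 0, 0), (3, 0, 0)], 3600, beta)    # system outputs nothing
noisy = term_weighted_value([(5, 0, 50), (3, 0, 50)], 3600, beta)   # only false alarms
print(perfect, silent, noisy)
```

A system that finds everything without false alarms scores 1, a system that outputs nothing scores 0, and a system that mostly produces false alarms drops below 0 because each false alarm is weighted by beta.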
How to Interpret DET Plots



   Most useful plot

   If done right, will give you

      P(Miss) over P(FA) for all decision scores

      A “marker” at the actual decision

      If computed using score, this will be on the line

   Used for evaluation (with score.occ.txt)

   [Figure: Combined DET Plot; Miss probability (in %) over False Alarm probability (in %); curves: Random Performance, Term Wtd. float-primary-test: CTS Subset (Max Val=0.173, Scr=1.276), Term Wtd. float-primary-test : ALL Data (Max Val=0.173, Scr=1.276)]
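
The DET curve described on this slide can be sketched as follows: sweep the decision threshold over all observed scores and record P(Miss) and P(FA) at each one; the "marker" is the point for the threshold the system actually used, so it lies on the curve when decisions are derived from the same scores. This is a simplified illustration that pools all terms, whereas the plots here are labeled term-weighted; the function names and toy data are assumptions for the example.

```python
from statistics import NormalDist


def det_points(detections, n_true, n_nontarget):
    """Return (threshold, P(Miss), P(FA)) for every distinct score.

    detections: list of (score, is_correct) pairs pooled over all terms.
    """
    points = []
    for threshold in sorted({score for score, _ in detections}):
        kept = [ok for score, ok in detections if score >= threshold]
        hits = sum(kept)
        p_miss = 1.0 - hits / n_true
        p_fa = (len(kept) - hits) / n_nontarget
        points.append((threshold, p_miss, p_fa))
    return points


def probit(p):
    """Normal-deviate transform commonly used for DET axes (0 < p < 1)."""
    return NormalDist().inv_cdf(p)


# Toy data: four scored hypotheses, two reference occurrences, 100 non-target trials.
toy = [(0.9, True), (0.8, False), (0.6, True), (0.3, False)]
curve = det_points(toy, n_true=2, n_nontarget=100)
for threshold, p_miss, p_fa in curve:
    print(threshold, p_miss, p_fa)
```

Lowering the threshold moves along the curve toward fewer misses and more false alarms; plotting both axes through `probit` gives the warped percent scale seen in the DET plots.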
2012 Spoken Web Search Participants

 Authors                                       Title

 Haipeng Wang and Tan Lee                      CUHK System for the Spoken Web Search task at
                                               Mediaeval 2012
 Cyril Joder, Felix Weninger, Martin Wöllmer   The TUM Cumulative DTW Approach for the Mediaeval
 and Björn Schuller                            2012 Spoken Web Search Task
 Andi Buzo, Horia Cucu, Mihai Safta,           ARF @ MediaEval 2012: A Romanian ASR-based
 Bogdan Ionescu, and Corneliu Burileanu        Approach to Spoken Term Detection
 Alberto Abad and Ramón F. Astudillo           The L2F Spoken Web Search system for Mediaeval
                                               2012
 Jozef Vavrek, Matus Pleva and Jozef Juhar     TUKE MediaEval 2012: Spoken Web Search using
                                               DTW and Unsupervised SVM
 Amparo Varona, Mikel Penagarikano, Luis       GTTS System for the Spoken Web Search Task at
 Javier Rodriguez-Fuentes, German Bordel,      MediaEval 2012
 and Mireia Diez
 Igor Szoke, Michal Fapšo, and Karel           BUT 2012 APPROACHES FOR SPOKEN WEB
 Veselý                                        SEARCH - MEDIAEVAL 2012
 Aren Jansen, Benjamin Van Durme, and          The JHU-HLTCOE Spoken Web Search System for
 Pascal Clark                                  MediaEval 2012
 Xavier Anguera                                (TID) Telefonica Research System for the Spoken
                                               Web Search task at Mediaeval 2012
Summary of (Primary) Results


Team         Run                                   Type          Dev       Eval

CUHK         cuhk_phnrecgmmasm_p-fusionprf_1       open          0.7824    0.7430
CUHK         cuhk_spch_p-gmmasmprf_1               restricted    0.6776    0.6350
L2F          l2f_12_spch_p-phonetic4_fusion_mv_1   open          0.5313    0.5195
BUT          BUT_spch_p-akws-devterms_1            open          0.4884    0.4918
BUT          BUT_spch_g-DTW-devterms_1             open          0.4426    0.4477
JHU-HLTCOE   jhu_all_spch_p-rails_1                restricted    0.3811    0.3688
TID          sws2012_IRDTW                         restricted    0.3866    0.3301
TUM          tum_spch_p-cdtw_1                     restricted    0.2628    0.2895
ARF          arf_spch_p-asrDTWAlign_w15_a08_b04    open          0.4109    0.2448
GTTS         gtts_spch_p-phone_lattice_1           open          0.0978    0.0809
TUKE         tuke_spch_p-dtwsvm                    restricted    0         0
Development Data, Development Terms

   [Figure: DET plot; Miss probability (in %) over False Alarm probability (in %); includes a “Random Performance” line. Legend, MTWV per run:]

      ARF: 0.471, 0.491, 0.253, 0.487
      BUT: 0.468, 0.493
      CUHK: 0.735, 0.751, 0.787, 0.631, 0.680
      JHU-HLTCOE: 0.382
      L2F: 0.531
      TUKE: 0.000
      TUM: 0.354, 0.337, 0.270
      TID: 0.390, 0.375
      GTTS: 0.098, 0.105
Development Data, Evaluation Terms

   [Figure: DET plot; Miss probability (in %) over False Alarm probability (in %); includes a “Random Performance” line. Legend, MTWV per run:]

      ARF: 0.443 (Scr=0.470), 0.475, 0.016, 0.224, 0.466
      BUT: 0.481, 0.629
      CUHK: 0.769, 0.772, 0.805, 0.687, 0.686
      JHU-HLTCOE: 0.440
      L2F: 0.633
      TUKE: 0.000, 0.257
      TUM: 0.201, 0.396
      TID: 0.498, 0.300
      GTTS: 0.083, 0.109
Evaluation Data, Development Terms

   [Figure: DET plot; Miss probability (in %) over False Alarm probability (in %); includes a “Random Performance” line. Legend, MTWV per run:]

      ARF: 0.317, 0.339, 0.000, 0.167, 0.333
      BUT: 0.383, 0.429
      CUHK: 0.707, 0.715, 0.752, 0.561, 0.620
      JHU-HLTCOE: 0.336
      L2F: 0.486
      TUKE: 0.000
      TUM: 0.236, 0.291, 0.174
      TID: 0.314, 0.472
      GTTS: 0.070, 0.081
Evaluation Data, Evaluation Terms

   [Figure: DET plot; Miss probability (in %) over False Alarm probability (in %); includes a “Random Performance” line. Legend, MTWV per run:]

      ARF: 0.268, 0.310, 0.001, 0.120, 0.306
      BUT: 0.488, 0.530
      CUHK: 0.724, 0.742, 0.762, 0.589, 0.643
      JHU-HLTCOE: 0.384
      L2F: 0.523
      TUKE: 0.000
      TUM: 0.187, 0.164, 0.296
      TID: 0.342, 0.311
      GTTS: 0.070, 0.081
Spoken Web Search Task
Summary 1


   Second time around
      Last year’s participants (mostly) became organizers

      Grew from 5 to ca. 10 participants!!!

      Europe, America, Asia, Africa (where are Australia and Antarctica?)

   Interesting differences in performance
      Thank you all participants! It was fun & interesting.

      Evaluation criteria useful, correct?
Spoken Web Search Task
Summary 2


   Could talk a bit about JHU-HLTCOE’s “RAILS” system

   Next steps?
      Do more joint analysis (hope everybody’s results agree with ours?)

      Shared Publications? ICASSP? Journal?

      Develop task further for next year?

      “Speech Kitchen” idea will be presented later …
 Thank You!
How to Interpret *.occ.txt File



   Coefficients C, V

      Weighting of correct vs incorrect detections

   Probability of a Term

      Expectation of terms

   Average and Maximum TWV

   P(FA) and P(Miss)

   Optimal decision score

   Values used for padding and multi-term detections are missing

   In some rare cases lists different values for total and only sub-class

   Was expecting more questions
Parameters used



   The tools assume you use a “decision score”

      Submit “candidates” with score lower than cutoff

      Submit “detections” with score higher than cutoff

   Enables plotting of DET curves

   Can be confusing

   Used different parameters for African and Indian data sets to reflect different use cases

   KoefV/KoefC are debatable

      What’s the cost of wrong and the benefit of correct detections

   -P Probability-of-Term

      How frequent are terms expected to be?
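
The split into "candidates" and "detections" can be pictured as a hard decision attached to every scored hypothesis. The sketch below is only an illustration: the `Hypothesis` fields, the `tag_decisions` helper, the "YES"/"NO" labels, and the choice to treat a score equal to the cutoff as a detection are assumptions, not the format required by the scoring tools.

```python
from typing import NamedTuple


class Hypothesis(NamedTuple):
    """One scored hypothesis for a term (field names are hypothetical)."""
    term_id: str
    start: float     # seconds
    duration: float  # seconds
    score: float


def tag_decisions(hypotheses, cutoff):
    """Pair each hypothesis with "YES" (detection) or "NO" (candidate)."""
    return [(h, "YES" if h.score >= cutoff else "NO") for h in hypotheses]


hyps = [
    Hypothesis("term_01", 12.3, 0.6, 2.1),  # above the cutoff: a detection
    Hypothesis("term_01", 40.0, 0.5, 0.4),  # below the cutoff: only a candidate
]
tagged = tag_decisions(hyps, cutoff=1.0)
decisions = [decision for _, decision in tagged]
print(decisions)
```

Keeping the below-cutoff candidates in the output is what allows a full DET curve to be drawn, while only the detections count toward the actual TWV.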
How to Interpret score.det.thresh.pdf



   Can be used to analyze decision score behavior

      P(FA) False Alarms

      P(Miss) Missed Detections

      Resulting TWV

   [Figure: Term Wtd. Threshold Plot for float-primary-test : ALL Data; P(Miss), P(FA) and Value plotted over Decision Score; MaxValue 0.173 @ 1.276]
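
The threshold plot amounts to evaluating TWV at every possible decision score: MTWV is the maximum of that curve, and ATWV is its value at the cutoff a system actually chose, so their difference measures how well the cutoff was tuned. A minimal sketch, assuming per-term lists of scored hypotheses, one non-target trial per second of speech, and a toy beta of 999.9; all names and numbers are illustrative.

```python
def twv_at(threshold, detections, n_true, speech_seconds, beta):
    """TWV when only hypotheses with score >= threshold count as detections.

    detections: {term: [(score, is_correct), ...]}
    n_true:     {term: number of reference occurrences (> 0)}
    """
    losses = []
    for term, refs in n_true.items():
        kept = [ok for score, ok in detections.get(term, []) if score >= threshold]
        hits = sum(kept)
        p_miss = 1.0 - hits / refs
        p_fa = (len(kept) - hits) / (speech_seconds - refs)
        losses.append(p_miss + beta * p_fa)
    return 1.0 - sum(losses) / len(losses)


def mtwv(detections, n_true, speech_seconds, beta):
    """Return (maximum TWV, threshold at which it is reached)."""
    thresholds = sorted({s for dets in detections.values() for s, _ in dets})
    return max((twv_at(t, detections, n_true, speech_seconds, beta), t)
               for t in thresholds)


# Toy example: two terms, 1000 seconds of speech.
detections = {
    "a": [(0.9, True), (0.4, False)],
    "b": [(0.7, True), (0.2, True)],
}
n_true = {"a": 1, "b": 2}

best_value, best_threshold = mtwv(detections, n_true, 1000, 999.9)
actual_value = twv_at(0.4, detections, n_true, 1000, 999.9)  # cutoff chosen too low
print(best_value, best_threshold, best_value - actual_value)
```

In this toy case the best threshold is 0.7, while a cutoff of 0.4 lets one false alarm through and loses a hit's worth of value elsewhere, which is the kind of gap the MTWV-ATWV difference charts report.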
Dev-Dev MTWV-ATWV differences
   [Bar chart; vertical axis from 0 to 0.1]
Eval-Eval MTWV-ATWV differences
   [Bar chart; vertical axis from 0 to 0.1]
Dev-Eval MTWV-ATWV differences
   [Bar chart; vertical axis from 0 to 0.25]
Eval-Dev MTWV-ATWV differences
   [Bar chart; vertical axis from 0 to 0.25]

More Related Content

Viewers also liked

Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskMediaEval2012
 
When Ideas and Opportunities Collide
When Ideas and Opportunities CollideWhen Ideas and Opportunities Collide
When Ideas and Opportunities CollideGrow America
 
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval2012
 
Week 2 discussion 2
Week 2 discussion 2Week 2 discussion 2
Week 2 discussion 2LILBIT2012
 
The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012Philip Polstra
 
Event Detection via LDA for the MediaEval2012 SED Task
Event Detection via LDA for the MediaEval2012 SED TaskEvent Detection via LDA for the MediaEval2012 SED Task
Event Detection via LDA for the MediaEval2012 SED TaskMediaEval2012
 
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...MediaEval2012
 
MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval2012
 
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012MediaEval2012
 
Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012MediaEval2012
 
Idea or opportunity?
Idea or opportunity?Idea or opportunity?
Idea or opportunity?Grow America
 
Mentor Strategy Session: Business Plan and Video
Mentor Strategy Session: Business Plan and VideoMentor Strategy Session: Business Plan and Video
Mentor Strategy Session: Business Plan and VideoGrow America
 
Secrets of Storytelling by Candace Klein
Secrets of Storytelling by Candace KleinSecrets of Storytelling by Candace Klein
Secrets of Storytelling by Candace KleinGrow America
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodMediaEval2012
 

MediaEval 2012 Spoken Web Search Task Results

  • 1. MediaEval 2012 Spoken Web Search Florian Metze, Marelie Davel, Etienne Barnard, Xavier Anguera, Guillaume Gravier, and Nitendra Rajput Pisa, October 4, 2012
  • 2. Outline
      - The Spoken Web Search Task
      - Data and Scoring
      - Organizers and Participants
      - Results
      - Discussion
  • 3. Organizers
      - Florian Metze (Carnegie Mellon)
      - Etienne Barnard, Marelie Davel, Charl v. Heerden (North-West University)
      - Xavier Anguera (Telefonica Research)
      - Guillaume Gravier (IRISA)
      - Nitendra Rajput (IBM India)
  • 4. Real life audio content is very diverse! “2011 Indian Data”
  • 5. Spoken Web Search Task Motivation
      - Any speech problem can be solved with enough:
          - Money, Time, Constraints, Data
      - What if we have just one: constraints?
          - Don't know what language/dialect is being used? Don't have much data!
          - But don't have to do Large Vocabulary Speech Recognition, "only" content retrieval
      - What can be done?
          - Port outside resources (i.e. run a language-independent/-portable recognizer)
          - Build a "zero knowledge" approach (i.e. try to directly identify similar words)
  • 6. Primary Data Source: "African Data"
      - "Lwazi" Corpus
          - Lwazi means Knowledge
      - Lwazi project aims to develop telephony-based speech-driven information system
      - 11 South-African languages, 3h-6h of speech per language
          - Phone sets, dictionaries, read & spontaneous, …
      - 3200 utterances used, from 4 languages
      - Data obtained during targeted effort
          - Meant as resource for speech research, so no "found" data, as "Indian Data"
      - E. Barnard, M. Davel, and C. van Heerden, "ASR Corpus design for resource-scarce languages," in Proc. INTERSPEECH, Brighton, UK; Sep. 2009, pp. 2847-2850.
  • 7. Evaluation Paradigm: Spoken Term Detection (STD)
      - Do not attempt to convert speech to text (full recognition, ASR)
      - Attempt to detect the occurrence (or absence) of "keywords"
      - STD is not easier than doing ASR
      - It requires fewer resources: in particular, no strong language model
      - Evaluation metrics:
          - (Spoken) Document Retrieval (SDR), when relaxing time constraints
          - Actual Term Weighted Value (ATWV, MTWV; defined by NIST)
  • 8. Evaluation Idea: 4 Conditions
      - Test development terms on (known) development data
      - Test (unknown) evaluation terms on (unknown) evaluation data
      - Test development terms on evaluation data
      - Test evaluation terms on development data
      - Terms provided as audio examples taken from collections
      - Systems could be developed with or without using external resources (i.e. other speech data); it is important to document which ones were used ("restricted" vs. "open")
  • 9. NIST Scoring Tools
      - Developed for 2006 Spoken Term Detection
      - Generates "Actual" and "Maximum Term Weighted Value" (ATWV, MTWV)
      - Generates DET curves
      - Adapted by us
      - ECF = "Experiment Control File" (controls which sections to process)
      - RTTM = "Rich Transcription Time Mark" (defines references)
      - TLIST = "Term List" Files (links term IDs and word dictionary)
      - A few parameters to choose
      - Different for 2011 and 2012, to better represent characteristics of SWS task (thanks, Xavi)
      - Best ATWV value is 1, below 0 possible
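As a rough illustration of why the best value is 1 and why values below 0 are possible, the sketch below computes a Term Weighted Value from per-term counts, following the NIST 2006 STD definition: TWV = 1 - mean over terms of [P(Miss) + beta * P(FA)], with beta = (C/V) * (1/P_term - 1) and one non-target trial per second of speech. The function name, the input layout and the example counts are assumptions made for illustration; the defaults are the NIST 2006 values, not the parameters chosen for the 2011 or 2012 SWS evaluations.

```python
from dataclasses import dataclass


@dataclass
class TermCounts:
    n_true: int      # reference occurrences of the term
    n_correct: int   # system detections matching a reference occurrence
    n_spurious: int  # system detections matching nothing (false alarms)


def term_weighted_value(counts, speech_seconds, cost=0.1, value=1.0, p_term=1e-4):
    """TWV for one set of hard decisions (hypothetical helper, not the NIST tool)."""
    beta = (cost / value) * (1.0 / p_term - 1.0)
    penalties = []
    for c in counts:
        if c.n_true == 0:
            continue  # P(Miss) is undefined without reference occurrences
        p_miss = 1.0 - c.n_correct / c.n_true
        n_non_target = speech_seconds - c.n_true  # one trial per second of speech
        p_fa = c.n_spurious / n_non_target
        penalties.append(p_miss + beta * p_fa)
    if not penalties:
        raise ValueError("no term has a reference occurrence")
    return 1.0 - sum(penalties) / len(penalties)


# Made-up counts: a perfect system, a silent system, and a very noisy one.
perfect = term_weighted_value([TermCounts(10, 10, 0)], speech_seconds=3600)
silent = term_weighted_value([TermCounts(10, 0, 0)], speech_seconds=3600)
noisy = term_weighted_value([TermCounts(10, 0, 100)], speech_seconds=3600)
```

A system that returns nothing scores 0, so a negative value means the false alarms cost more than the correct detections were worth.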
  • 10. How to Interpret DET Plots
      - Most useful plot
      - If done right, will give you P(Miss) over P(FA) for all decision scores
      - A "marker" at the actual decision
      - If computed using score, this will be on the line
      - Used for evaluation (with score.occ.txt)
      [Figure: Combined DET Plot, Miss probability (in %) over False Alarm probability (in %). Legend: Random Performance; Term Wtd. float-primary-test: CTS Subset, Max Val=0.173, Scr=1.276; Term Wtd. float-primary-test: ALL Data, Max Val=0.173, Scr=1.276]
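The idea of tracing P(Miss) over P(FA) for all decision scores can be sketched as follows. This is a simplified illustration, not the NIST tool: it assumes a pooled list of hypothetical (score, is_target) trials, whereas the real scoring derives non-target trials from the amount of speech. The `probit` helper reflects the normal-deviate axis scaling conventionally used for DET plots, which is why the tick marks on such plots are unevenly spaced.

```python
from statistics import NormalDist


def det_points(trials):
    """trials: list of (score, is_target). Returns [(threshold, p_fa, p_miss), ...]."""
    n_target = sum(1 for _, is_target in trials if is_target)
    n_non_target = len(trials) - n_target
    if n_target == 0 or n_non_target == 0:
        raise ValueError("need at least one target and one non-target trial")
    points = []
    for threshold in sorted({score for score, _ in trials}):
        accepted = [is_target for score, is_target in trials if score >= threshold]
        hits = sum(accepted)
        false_alarms = len(accepted) - hits
        points.append((threshold, false_alarms / n_non_target, 1.0 - hits / n_target))
    return points


def probit(p, eps=1e-6):
    """Map a probability to the normal-deviate scale; clipped to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(p)


# Made-up trials for illustration only.
trials = [(0.9, True), (0.8, False), (0.7, True), (0.4, False), (0.2, True), (0.1, False)]
points = det_points(trials)
curve = [(probit(p_fa), probit(p_miss)) for _, p_fa, p_miss in points]
```

Raising the threshold moves along the curve from many false alarms and few misses towards few false alarms and many misses; the "marker" corresponds to the single point at the system's actual cutoff.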
  • 11. 2012 Spoken Web Search Participants (Authors: Title)
      - Haipeng Wang and Tan Lee: CUHK System for the Spoken Web Search task at Mediaeval 2012
      - Cyril Joder, Felix Weninger, Martin Wöllmer and Björn Schuller: The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
      - Andi Buzo, Horia Cucu, Mihai Safta, Bogdan Ionescu, and Corneliu Burileanu: ARF @ MediaEval 2012: A Romanian ASR-based Approach to Spoken Term Detection
      - Alberto Abad and Ramón F. Astudillo: The L2F Spoken Web Search system for Mediaeval 2012
      - Jozef Vavrek, Matus Pleva and Jozef Juhar: TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
      - Amparo Varona, Mikel Penagarikano, Luis Javier Rodriguez-Fuentes, German Bordel, and Mireia Diez: GTTS System for the Spoken Web Search Task at MediaEval 2012
      - Igor Szoke, Michal Fapšo, and Karel Veselý: BUT 2012 APPROACHES FOR SPOKEN WEB SEARCH - MEDIAEVAL 2012
      - Aren Jansen, Benjamin Van Durme, and Pascal Clark: The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
      - Xavier Anguera (TID): Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
  • 12. Summary of (Primary) Results
      Team        Run                                    Type        Dev     Eval
      CUHK        cuhk_phnrecgmmasm_p-fusionprf_1        open        0.7824  0.7430
      CUHK        cuhk_spch_p-gmmasmprf_1                restricted  0.6776  0.6350
      L2F         l2f_12_spch_p-phonetic4_fusion_mv_1    open        0.5313  0.5195
      BUT         BUT_spch_p-akws-devterms_1             open        0.4884  0.4918
      BUT         BUT_spch_g-DTW-devterms_1              open        0.4426  0.4477
      JHU-HLTCOE  jhu_all_spch_p-rails_1                 restricted  0.3811  0.3688
      TID         sws2012_IRDTW                          restricted  0.3866  0.3301
      TUM         tum_spch_p-cdtw_1                      restricted  0.2628  0.2895
      ARF         arf_spch_p-asrDTWAlign_w15_a08_b04     open        0.4109  0.2448
      GTTS        gtts_spch_p-phone_lattice_1            open        0.0978  0.0809
      TUKE        tuke_spch_p-dtwsvm                     restricted  0       0
  • 13. Development data, development terms
      [Figure: DET plot "Development Data, Development Terms", Miss probability (in %) over False Alarm probability (in %), with a Random Performance reference line. Legend, MTWV per submitted run:]
      - ARF: 0.471, 0.491, 0.253, 0.487
      - BUT: 0.468, 0.493
      - CUHK: 0.735, 0.751, 0.787, 0.631, 0.680
      - JHU-HLTCOE: 0.382
      - L2F: 0.531
      - TUKE: 0.000
      - TUM: 0.354, 0.337, 0.270
      - TID: 0.390, 0.375
      - GTTS: 0.098, 0.105
  • 14. Development data, evaluation terms
      [Figure: DET plot "Development Data, Evaluation Terms", Miss probability (in %) over False Alarm probability (in %), with a Random Performance reference line. Legend, MTWV per submitted run:]
      - ARF: 0.443 (Scr=0.470), 0.475, 0.016, 0.224, 0.466
      - BUT: 0.481, 0.629
      - CUHK: 0.769, 0.772, 0.805, 0.687, 0.686
      - JHU-HLTCOE: 0.440
      - L2F: 0.633
      - TUKE: 0.000, 0.257
      - TUM: 0.201, 0.396
      - TID: 0.498, 0.300
      - GTTS: 0.083, 0.109
  • 15. Evaluation data, development terms
      [Figure: DET plot "Evaluation Data, Development Terms", Miss probability (in %) over False Alarm probability (in %), with a Random Performance reference line. Legend, MTWV per submitted run:]
      - ARF: 0.317, 0.339, 0.000, 0.167, 0.333
      - BUT: 0.383, 0.429
      - CUHK: 0.707, 0.715, 0.752, 0.561, 0.620
      - JHU-HLTCOE: 0.336
      - L2F: 0.486
      - TUKE: 0.000
      - TUM: 0.236, 0.291, 0.174
      - TID: 0.314, 0.472
      - GTTS: 0.070, 0.081
  • 16. Evaluation data, evaluation terms
      [Figure: DET plot "Evaluation Data, Evaluation Terms", Miss probability (in %) over False Alarm probability (in %), with a Random Performance reference line. Legend, MTWV per submitted run:]
      - ARF: 0.268, 0.310, 0.001, 0.120, 0.306
      - BUT: 0.488, 0.530
      - CUHK: 0.724, 0.742, 0.762, 0.589, 0.643
      - JHU-HLTCOE: 0.384
      - L2F: 0.523
      - TUKE: 0.000
      - TUM: 0.187, 0.164, 0.296
      - TID: 0.342, 0.311
      - GTTS: 0.070, 0.081
  • 17. Spoken Web Search Task Summary 1
      - Second time around
      - Last year's participants (mostly) became organizers
      - Grew from 5 to ca. 10 participants!
      - Europe, America, Asia, Africa (where are Australia and Antarctica?)
      - Interesting differences in performance
      - Thank you, all participants! It was fun & interesting.
      - Evaluation criteria useful, correct?
  • 18. Spoken Web Search Task Summary 2
      - Could talk a bit about JHU-HLTCOE's "RAILS" system
      - Next steps?
      - Do more joint analysis (hope everybody's results agree with ours?)
      - Shared Publications? ICASSP? Journal?
      - Develop task further for next year?
      - "Speech Kitchen" idea will be presented later …
  • 20. How to Interpret *.occ.txt File
      - Coefficients C, V
          - Weighting of correct vs incorrect detections
      - Probability of a Term
          - Expectation of terms
      - Average and Maximum TWV
      - P(FA) and P(Miss)
      - Optimal decision score
      - Values used for padding and multi-term detections are missing
      - In some rare cases lists different values for total and only sub-class
      - Was expecting more questions
  • 21. Parameters used
      - The tools assume you use a "decision score"
          - Submit "candidates" with score lower than cutoff
          - Submit "detections" with score higher than cutoff
          - Enables plotting of DET curves
          - Can be confusing
      - Used different parameters for African and Indian data sets to reflect different use cases
      - KoefV/KoefC are debatable
          - What's the cost of wrong and the benefit of correct detections
      - -P Probability-of-Term
          - How frequent are terms expected to be?
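A minimal sketch of the two ideas on this slide, under stated assumptions. `label_hits` splits scored hits into "detections" and "candidates" at a cutoff; the dictionary layout, the "YES"/"NO" labels and the choice to treat a score equal to the cutoff as a detection are hypothetical. `beta` shows how the cost and value coefficients and the term prior combine into the false-alarm weight in the NIST 2006 TWV definition; the values passed in are examples, not the settings used for the African or Indian data.

```python
def beta(koef_c, koef_v, p_term):
    """False-alarm weight: (C / V) * (1 / P_term - 1)."""
    return (koef_c / koef_v) * (1.0 / p_term - 1.0)


def label_hits(hits, cutoff):
    """Mark each hit as a detection ("YES") or a candidate ("NO").

    Every hit keeps its score, so a DET curve can still be drawn from
    the full list; only the "YES" hits count towards the actual TWV.
    """
    return [
        {**hit, "decision": "YES" if hit["score"] >= cutoff else "NO"}
        for hit in hits
    ]


# Made-up hits for one hypothetical query term.
hits = [
    {"file": "utt_001", "start": 1.20, "score": 2.4},
    {"file": "utt_007", "start": 0.35, "score": 0.9},
    {"file": "utt_012", "start": 3.10, "score": 1.3},
]
labelled = label_hits(hits, cutoff=1.276)

# A rarer expected term (smaller prior) makes each false alarm more expensive.
weight_rare = beta(0.1, 1.0, 1e-4)
weight_frequent = beta(0.1, 1.0, 1e-2)
```

Because the weight grows as the assumed term frequency shrinks, the choice of these parameters decides how aggressively a system should prune its hit list.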
  • 22. How to Interpret score.det.thresh.pdf
      - Can be used to analyze decision score behavior
      - P(FA) False Alarms
      - P(Miss) Missed Detections
      - Resulting TWV
      [Figure: Term Wtd. Threshold Plot for float-primary-test: ALL Data. Curves P(Miss), P(FA) and Value over Decision Score; MaxValue 0.173 @ 1.276]
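The threshold plot can be thought of as the TWV evaluated at every possible decision score, with the maximum reported as a value and the score at which it occurs. The sketch below reproduces that sweep on made-up data. The helper names and the input layout (one `(n_true, hits)` pair per term, each hit a `(score, is_correct)` pair) are assumptions for illustration, and the default `beta` of 999.9 is the NIST 2006 value rather than the one used for this task.

```python
def twv_at(terms, speech_seconds, threshold, beta=999.9):
    """TWV when only hits scoring at or above `threshold` are kept."""
    penalties = []
    for n_true, hits in terms:
        kept = [is_correct for score, is_correct in hits if score >= threshold]
        n_correct = sum(kept)
        n_spurious = len(kept) - n_correct
        p_miss = 1.0 - n_correct / n_true
        p_fa = n_spurious / (speech_seconds - n_true)
        penalties.append(p_miss + beta * p_fa)
    return 1.0 - sum(penalties) / len(penalties)


def max_twv(terms, speech_seconds, beta=999.9):
    """Sweep all observed scores; return (maximum TWV, threshold reaching it)."""
    thresholds = sorted({score for _, hits in terms for score, _ in hits})
    return max((twv_at(terms, speech_seconds, t, beta), t) for t in thresholds)


# Made-up example: two terms, each with at least one reference occurrence.
terms = [
    (2, [(0.9, True), (0.6, True), (0.3, False)]),
    (1, [(0.8, True), (0.2, False)]),
]
mtwv, best_threshold = max_twv(terms, speech_seconds=3600)
atwv = twv_at(terms, speech_seconds=3600, threshold=0.2)  # a cutoff set too low
gap = mtwv - atwv
```

The gap between the maximum and the value at the submitted cutoff indicates how well a system's decision threshold was calibrated, independently of how good its scores are.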
  • 23. Dev-Dev MTWV-ATWV differences [Figure: bar chart, vertical axis 0 to 0.1]
  • 24. Eval-Eval MTWV-ATWV differences [Figure: bar chart, vertical axis 0 to 0.1]
  • 25. Dev-Eval MTWV-ATWV differences [Figure: bar chart, vertical axis 0 to 0.25]
  • 26. Eval-Dev MTWV-ATWV differences [Figure: bar chart, vertical axis 0 to 0.25]