SpeechTEK 2009: Optimizing Speech Recognizer Rejection Thresholds

    SpeechTEK 2009: Optimizing Speech Recognizer Rejection Thresholds - Presentation Transcript

    1. Optimizing speech recognizer rejection thresholds Dan Burnett Director of Speech Technologies, Voxeo August 24, 2009
    2. Why this talk? • Sometimes we forget the basics, which are: • Recognizers are not perfect • They can be optimized in a straightforward manner • The simplest optimization is the rejection threshold
    3. The Goal • End user goal: optimal experience • Our goal: determine the user experience for each possible rejection threshold, then choose the optimum threshold • Must compare the true classification of an audio sample against the ASR engine’s classification
    4. True classifications • Assume human-level recognition • App should still distinguish (i.e. possibly behave differently) among the following cases:
        Case | Possible behavior
        No speech in audio sample (nospeech) | Mention that you didn’t hear anything and ask for repeat
        Speech, but not intelligible (unintelligible) | Ask for repeat
        Intelligible speech, but not in app grammar (out-of-grammar) | Encourage in-grammar speech
        Intelligible speech, and within app grammar (in-grammar) | Respond to what person said
    5. ASR Engine Classifications • Silence/nospeech (nospeech) • Reject (rejected) • Recognize (recognized)
    6. Crossing these two . . .
        True \ ASR | nospeech | rejected | recognized
        nospeech | Correct classification | Improperly rejected | Incorrect
        unintelligible | Improperly treated as silence | Correct behavior | Assume incorrect
        out-of-grammar | Improperly treated as silence | Correct behavior | Incorrect
        in-grammar | Improperly treated as silence | Improperly rejected | Either correct or incorrect
    7. Crossing these two . . . Misrecognitions (table repeated from slide 6)
    8. Crossing these two . . . “Misrejections” (table repeated from slide 6)
    9. Crossing these two . . . “Missilences” (table repeated from slide 6)
    10. Three types of errors • Missilences -- called silence, but wasn’t • Misrejections -- rejected inappropriately • Misrecognitions -- recognized inappropriately or incorrectly
    11. Three types of errors (continued) • So how do we evaluate these?
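The crossing table on slide 6 and the three error types can be written down as a small lookup. This is a minimal sketch, not the speaker's code: the label strings, the function name `error_type`, and the choice to count a rejected nospeech sample as a misrejection (the table only says "Improperly rejected") are assumptions made for illustration.

```python
# Hypothetical labels; the slides name the categories but not any data format.
TRUE_CLASSES = ("nospeech", "unintelligible", "out-of-grammar", "in-grammar")
ASR_CLASSES = ("nospeech", "rejected", "recognized")


def error_type(true_class, asr_class, transcript=None, hypothesis=None):
    """Return 'missilence', 'misrejection', 'misrecognition', or None if correct."""
    if true_class not in TRUE_CLASSES or asr_class not in ASR_CLASSES:
        raise ValueError(f"unknown class: {true_class!r} / {asr_class!r}")
    if asr_class == "nospeech":
        # Called silence: only correct when there really was no speech.
        return None if true_class == "nospeech" else "missilence"
    if asr_class == "rejected":
        # Rejection is the right behavior for unintelligible or out-of-grammar speech.
        if true_class in ("unintelligible", "out-of-grammar"):
            return None
        return "misrejection"
    # asr_class == "recognized": only in-grammar speech can be recognized correctly,
    # and only when the recognizer output matches the human transcription.
    if (true_class == "in-grammar" and transcript is not None
            and transcript == hypothesis):
        return None
    return "misrecognition"
```

For example, `error_type("in-grammar", "recognized", "yes", "no")` falls in the "Either correct or incorrect" cell and comes out as a misrecognition because the hypothesis does not match the transcription.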
    12. Evaluating errors 1. Evaluation data set 2. Try every rejection threshold value 3. Plot errors as function of threshold 4. Select optimal value for your app
    13. 1. Evaluation data set(s) • Data selection • Must be representative (“every nth call”) • Ideally at least 100 recordings per grammar path for good confidence in results • Transcription • Goal is to compare against recognition results, so no punctuation, coughs, etc. needed in transcription itself (but good to have in separate comments)
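The data-selection advice (“every nth call”, at least 100 recordings per grammar path) might be checked along these lines. This is a sketch under assumptions: the call records, the `grammar_path` key, and both function names are hypothetical, since the slides do not describe a log format.

```python
from collections import Counter


def every_nth(calls, n, start=0):
    """Systematic sample: keep every n-th call, beginning at index `start`."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return calls[start::n]


def undersampled_paths(sample, min_per_path=100):
    """Return grammar paths with fewer than `min_per_path` recordings in the sample."""
    counts = Counter(call["grammar_path"] for call in sample)
    return {path: count for path, count in counts.items() if count < min_per_path}


# Toy log: 1000 calls alternating between two hypothetical grammar paths.
calls = [{"id": i, "grammar_path": "yes_no" if i % 2 == 0 else "main_menu"}
         for i in range(1000)]
sample = every_nth(calls, 5)
print(len(sample), undersampled_paths(sample))
```

Any path reported by `undersampled_paths` would need more recordings before its error curves can be trusted.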
    14. 2. Try every rejection threshold value • Run the recognizer in batch mode with a rejection threshold of 0 (i.e., no rejection). Remember to collect confidence scores! • Then, for each threshold from 0 to 100, calculate the number of misrecognitions, misrejections, and missilences
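Because the batch run at threshold 0 already recorded a confidence score for every utterance, the sweep can be simulated offline without re-running the recognizer. The sketch below assumes a hypothetical record layout (`true_class`, `transcript`, `hypothesis`, `confidence`), a 0 to 100 confidence scale, and that a score strictly below the threshold means rejection; none of these details come from the slides.

```python
from collections import Counter


def sweep_thresholds(results, thresholds=range(0, 101)):
    """Count the three error types at each rejection threshold.

    Each result is a dict with hypothetical keys:
      true_class  - 'nospeech', 'unintelligible', 'out-of-grammar', or 'in-grammar'
      transcript  - human transcription (None unless in-grammar)
      hypothesis  - recognizer output at threshold 0, or None if it reported nospeech
      confidence  - recognizer confidence on a 0-100 scale (ignored for nospeech)
    """
    curves = {}
    for threshold in thresholds:
        counts = Counter(missilences=0, misrejections=0, misrecognitions=0)
        for r in results:
            if r["hypothesis"] is None:
                # The recognizer heard silence; the threshold cannot change this.
                if r["true_class"] != "nospeech":
                    counts["missilences"] += 1
            elif r["confidence"] < threshold:
                # Simulated rejection (assumes scores below the threshold are rejected).
                if r["true_class"] in ("nospeech", "in-grammar"):
                    counts["misrejections"] += 1
            elif not (r["true_class"] == "in-grammar"
                      and r["hypothesis"] == r["transcript"]):
                # Recognized, but either not in-grammar speech or the wrong result.
                counts["misrecognitions"] += 1
        curves[threshold] = dict(counts)
    return curves


# Four made-up utterances, for illustration only.
results = [
    {"true_class": "in-grammar", "transcript": "yes", "hypothesis": "yes", "confidence": 80},
    {"true_class": "in-grammar", "transcript": "no", "hypothesis": "yes", "confidence": 40},
    {"true_class": "out-of-grammar", "transcript": None, "hypothesis": "no", "confidence": 30},
    {"true_class": "in-grammar", "transcript": "yes", "hypothesis": None, "confidence": 0},
]
curves = sweep_thresholds(results)
print(curves[0], curves[50], curves[100])
```

In this toy data the missilence count stays the same at every threshold, while misrecognitions fall and misrejections rise as the threshold increases.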
    15. 3. Plot errors [Figure: curves for “Misrejections”, Misrecognitions, and “Missilences” plotted against Rejection Threshold (0 to 100), with the Equal Error Rate point marked]
    16. 3. Plot errors [Figure: Total Error (Sum of the error curves) plotted against Rejection Threshold (0 to 100), with the Minimum marked]
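The two points marked on these plots can also be read off numerically. The sketch below assumes the equal error rate is the threshold where the misrejection and misrecognition counts are closest, which is a guess from the plot labels; the counts are hand-made toy numbers, not measurements from any recognizer.

```python
def equal_error_threshold(curves):
    """Threshold where misrejection and misrecognition counts are closest."""
    return min(curves, key=lambda t: abs(curves[t]["misrejections"]
                                         - curves[t]["misrecognitions"]))


def min_total_threshold(curves):
    """Threshold with the smallest sum of all three error counts."""
    return min(curves, key=lambda t: sum(curves[t].values()))


# Hand-made toy counts keyed by rejection threshold.
toy_curves = {
    0:   {"missilences": 2, "misrejections": 0,  "misrecognitions": 20},
    25:  {"missilences": 2, "misrejections": 2,  "misrecognitions": 9},
    50:  {"missilences": 2, "misrejections": 6,  "misrecognitions": 6},
    75:  {"missilences": 2, "misrejections": 14, "misrecognitions": 2},
    100: {"missilences": 2, "misrejections": 30, "misrecognitions": 0},
}
print(equal_error_threshold(toy_curves), min_total_threshold(toy_curves))
```

In this toy data the two thresholds differ, which is the situation in which the equal error rate is not the minimum of the sum.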
    17. 4. Select optimal value
    18. 4. Select optimal value (continued) • Equal-error-rate: not necessarily the optimum
    19. 4. Select optimal value (continued) • Minimum of the sum: good starting point, great for comparing across engines (on same data set only!!)
    20. 4. Select optimal value (continued) • Optimal: depends on your app; some errors may be more critical than others
    21. 4. Select optimal value (continued) • Question: if missilences are not affected by the threshold, why did I include them?
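One way to express “some errors may be more critical than others” is a weighted sum of the error counts. This is an illustrative sketch, not a method given in the talk: the function name `best_threshold`, the weights, and the toy counts are all assumptions.

```python
def best_threshold(curves, weights):
    """Pick the threshold that minimizes a weighted sum of error counts."""
    def cost(threshold):
        return sum(weights[name] * count
                   for name, count in curves[threshold].items())
    return min(curves, key=cost)


# Hand-made toy counts keyed by rejection threshold.
toy_curves = {
    0:   {"missilences": 2, "misrejections": 0,  "misrecognitions": 20},
    25:  {"missilences": 2, "misrejections": 2,  "misrecognitions": 9},
    50:  {"missilences": 2, "misrejections": 6,  "misrecognitions": 6},
    75:  {"missilences": 2, "misrejections": 14, "misrecognitions": 2},
    100: {"missilences": 2, "misrejections": 30, "misrecognitions": 0},
}

equal = {"missilences": 1, "misrejections": 1, "misrecognitions": 1}
# Hypothetical app where acting on a wrong result costs five times as much.
cautious = {"missilences": 1, "misrejections": 1, "misrecognitions": 5}
print(best_threshold(toy_curves, equal), best_threshold(toy_curves, cautious))
```

With equal weights this reduces to the minimum of the sum; weighting misrecognitions more heavily pushes the chosen threshold higher in this toy data.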
    22. Further optimizations • Move OOG into IG category if semantically correct (“You bet” -> “yes”) • Consider additional threshold for confirmation • Optimize endpointer parameters (affects missilences and/or “too much speech”)
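The first two optimizations can be sketched as follows. The synonym table, the function names, and the threshold values in the usage are hypothetical; the slides give only the “You bet” -> “yes” example and the idea of a second threshold for confirmation.

```python
# Hypothetical synonym table: out-of-grammar phrases that mean the same as an
# in-grammar one. Only "you bet" -> "yes" comes from the slides.
SEMANTIC_EQUIVALENTS = {"you bet": "yes", "yep": "yes", "nope": "no"}


def relabel(transcript, grammar):
    """Return (true_class, normalized_transcript) after semantic mapping."""
    text = transcript.strip().lower()
    text = SEMANTIC_EQUIVALENTS.get(text, text)
    return ("in-grammar" if text in grammar else "out-of-grammar", text)


def decide(confidence, reject_below, confirm_below):
    """Three-way decision using a rejection and a confirmation threshold."""
    if reject_below > confirm_below:
        raise ValueError("reject_below must not exceed confirm_below")
    if confidence < reject_below:
        return "reject"
    if confidence < confirm_below:
        return "confirm"  # e.g. ask the caller to confirm what was heard
    return "accept"


print(relabel("You bet", {"yes", "no"}))
print(decide(55, reject_below=40, confirm_below=70))
```

Relabeling the transcriptions this way before the threshold sweep means a recognition of “yes” for a caller who said “You bet” is scored as correct instead of as a misrecognition.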

    At SpeechTEK 2009 in New York on August 24, 2009, …
