Speech Recognition in VoiceXML

Transcript

  • 1. Speech Recognition in VoiceXML: A pre-lunch fun fest! (February 10, 2010, Mark J. Headd)
  • 2. Agenda
    • Specifications (VXML, SRGS, SISR)
    • VoiceXML Properties for Speech
    • Grammars (Structure, Formats)
    • Examining Recognition Results
    • VXML 2.1 features
    • Eat
  • 3. Specifications
    • VoiceXML 2.0 spec (minor adds in 2.1)
      • Section 3 (User Input)
    • Speech Recognition Grammar spec
      • Defines required formats for grammars
      • XML, ABNF formats
    • Semantic Interpretation spec
      • Assigning values based on user input
  • 4. Properties
    • Related to speech rec:
      • inputmodes (defaults to “dtmf voice”)
      • maxnbest (controls # of results returned)
      • confidencelevel
      • sensitivity
      • bargein
      • bargeintype (speech vs. hotword)
      • recordutterance (OoG utterances?)
      • Timing properties (Appendix D of 2.0 spec)
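As a rough illustration of how the properties on this slide are set, the following VoiceXML sketch declares them at document scope and overrides one at field scope. The numeric values, the prompt text, and the grammar file name `city.grxml` are assumptions made up for this example, not settings from the talk.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-scoped properties; all values here are illustrative only -->
  <property name="inputmodes" value="voice"/>      <!-- default is "dtmf voice" -->
  <property name="maxnbest" value="3"/>            <!-- return up to 3 results -->
  <property name="confidencelevel" value="0.5"/>   <!-- reject results below this -->
  <property name="sensitivity" value="0.5"/>
  <property name="bargein" value="true"/>
  <property name="bargeintype" value="speech"/>
  <!-- Two of the timing properties described in Appendix D -->
  <property name="timeout" value="5s"/>
  <property name="completetimeout" value="1s"/>

  <form id="main">
    <field name="city">
      <!-- A property set inside the field overrides the document-level value -->
      <property name="bargeintype" value="hotword"/>
      <prompt>Which city?</prompt>
      <!-- city.grxml is a hypothetical external grammar -->
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
  </form>
</vxml>
```

Properties follow normal scoping, so the narrowest enclosing `<property>` wins for a given field.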
  • 5. Grammars
    • Builtin (Appendix P of 2.0 spec)
    • Inline vs. external
    • Formats
      • ABNF
      • XML
      • JSGF (Prophecy)
    • Semantic Interpretation
      • Filling slots with values
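The three ways of attaching a grammar mentioned on this slide might look like the sketch below. Builtin grammar support is optional for platforms, and the field names, prompts, and `city.grxml` are assumptions for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="grammar_styles">

    <!-- Builtin grammar, selected through the field type (Appendix P) -->
    <field name="confirm" type="boolean">
      <prompt>Do you want to continue?</prompt>
    </field>

    <!-- Inline grammar in the SRGS XML format -->
    <field name="color">
      <prompt>Pick a color.</prompt>
      <grammar version="1.0" root="color" mode="voice" xml:lang="en-US">
        <rule id="color">
          <one-of>
            <item>red</item>
            <item>green</item>
            <item>blue</item>
          </one-of>
        </rule>
      </grammar>
    </field>

    <!-- External grammar; city.grxml is a hypothetical file -->
    <field name="city">
      <prompt>Which city?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>

  </form>
</vxml>
```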
  • 6. Grammars: note the difference between recognition (what the caller said) and interpretation (the value assigned to it).
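One way to see the difference between recognition and interpretation is an SRGS XML grammar with Semantic Interpretation tags, sketched below. Several different phrases are recognized, but each maps to a single slot value. The city names and codes are invented for this example, and the sketch assumes a platform that accepts the `semantics/1.0` tag format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice"
         root="city" tag-format="semantics/1.0">
  <rule id="city" scope="public">
    <one-of>
      <!-- The words are what is recognized; the tag sets the interpretation -->
      <item>philadelphia <tag>out = "PHL";</tag></item>
      <item>philly <tag>out = "PHL";</tag></item>
      <item>new york <tag>out = "NYC";</tag></item>
      <item>the big apple <tag>out = "NYC";</tag></item>
    </one-of>
  </rule>
</grammar>
```

With a grammar like this, a caller who says "philly" would typically produce an utterance of "philly" and an interpretation of "PHL".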
  • 7. Examining Results
    • Field Shadow Variables
      • name$.utterance
      • name$.inputmode
      • name$.interpretation
      • name$.confidence
    • application.lastresult$ Array
      • Array of elements holding last recognition
      • Sorted by confidence, from highest to lowest
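A minimal sketch of reading these values inside a `<filled>` block follows. The field name `city`, the grammar file `city.grxml`, and the 0.6 threshold are assumptions chosen for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Ask for up to 3 results so the lastresult$ array can hold more than one -->
  <property name="maxnbest" value="3"/>

  <form id="ask">
    <field name="city">
      <prompt>Which city?</prompt>
      <!-- city.grxml is a hypothetical external grammar -->
      <grammar src="city.grxml" type="application/srgs+xml"/>

      <filled>
        <!-- Shadow variables for the field named "city" -->
        <log>utterance: <value expr="city$.utterance"/></log>
        <log>inputmode: <value expr="city$.inputmode"/></log>
        <log>interpretation: <value expr="city$.interpretation"/></log>
        <log>confidence: <value expr="city$.confidence"/></log>

        <!-- application.lastresult$ is sorted, so element 0 is the best match -->
        <log>results returned: <value expr="application.lastresult$.length"/></log>
        <log>best utterance: <value expr="application.lastresult$[0].utterance"/></log>

        <!-- Illustrative threshold: ask again on a weak match -->
        <if cond="city$.confidence &lt; 0.6">
          <prompt>I heard <value expr="city$.utterance"/>, but I am not sure.</prompt>
          <clear namelist="city"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```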
  • 8. VXML 2.1 features
    • Dynamic referencing of grammars
      • <grammar srcexpr="'http://host/' + foo + '?bar=' + bar"/>
    • Recording utterance while attempting recognition.
      • OoG utterances?
      • Build library of audio for grammar tuning?
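The two VoiceXML 2.1 features on this slide can be combined roughly as follows: the grammar URI is computed at run time, and the caller's audio is captured so that out-of-grammar utterances can be posted to a server for later tuning. The values of `foo` and `bar`, the variable name `oog_audio`, and the `http://host/save` endpoint are hypothetical, and the sketch assumes the platform supports `recordutterance`.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical values used to build the grammar URI -->
  <var name="foo" expr="'grammars'"/>
  <var name="bar" expr="'cities'"/>

  <form id="ask">
    <field name="city">
      <!-- Record the caller's audio while recognition is attempted -->
      <property name="recordutterance" value="true"/>
      <prompt>Which city?</prompt>

      <!-- srcexpr is evaluated as an ECMAScript expression at run time -->
      <grammar srcexpr="'http://host/' + foo + '?bar=' + bar"
               type="application/srgs+xml"/>

      <nomatch>
        <!-- Out-of-grammar utterance: send the audio to a hypothetical endpoint -->
        <var name="oog_audio" expr="application.lastresult$.recording"/>
        <submit next="http://host/save" method="post"
                enctype="multipart/form-data" namelist="oog_audio"/>
      </nomatch>

      <filled>
        <log>recording duration: <value expr="city$.recordingduration"/></log>
      </filled>
    </field>
  </form>
</vxml>
```

Posting the audio on `<nomatch>` ends the dialog at that point in this sketch; a production application would more likely fetch the upload in the background and reprompt.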
  • 9. Demo
    • Demo code available on GitHub
    • Running against Prophecy 8 on local server
    • Analog line through AudioCodes gateway
  • 10. Grub
    • Time to eat.