Speech Recognition in VoiceXML A pre-lunch fun fest! February 10, 2010 Mark J. Headd
Published in: Technology
Speech Recognition in VoiceXML

  1. Speech Recognition in VoiceXML
       A pre-lunch fun fest!
       February 10, 2010
       Mark J. Headd
  2. Agenda
       • Specifications (VXML, SRGS, SISR)
       • VoiceXML Properties for Speech
       • Grammars (Structure, Formats)
       • Examining Recognition Results
       • VXML 2.1 features
       • Eat
  3. Specifications
       • VoiceXML 2.0 spec (minor adds in 2.1)
           • Section 3 (User Input)
       • Speech Recognition Grammar spec
           • Defines required formats for grammars
           • XML, ABNF formats
       • Semantic Interpretation spec
           • Assigning values based on user input
  4. Properties
       • Related to speech rec:
           • inputmodes (defaults to "dtmf voice")
           • maxnbest (controls the maximum number of results returned)
           • confidencelevel
           • sensitivity
           • bargein
           • bargeintype (speech vs. hotword)
           • recordutterance (OoG utterances?)
           • Timing properties (Appendix D of 2.0 spec)
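
As a rough illustration of how the properties on this slide are set, the sketch below declares them at document level and overrides one inside a field. The values, the form and field names, and the grammar file cities.grxml are assumptions made for the example, not taken from the talk; platforms may also apply their own defaults.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Document-level properties apply to every form unless overridden lower down -->
  <property name="inputmodes" value="voice"/>      <!-- speech only; the default is "dtmf voice" -->
  <property name="maxnbest" value="3"/>            <!-- return up to three candidate results -->
  <property name="confidencelevel" value="0.5"/>   <!-- results below this level raise nomatch -->
  <property name="sensitivity" value="0.5"/>
  <property name="bargein" value="true"/>
  <property name="bargeintype" value="speech"/>    <!-- or "hotword" -->
  <property name="timeout" value="5s"/>            <!-- one of the timing properties -->

  <form id="travel">
    <field name="destination">
      <!-- Field-level override: stricter confidence for this field only -->
      <property name="confidencelevel" value="0.6"/>
      <prompt>Where would you like to go?</prompt>
      <!-- cities.grxml is a hypothetical external grammar -->
      <grammar src="cities.grxml" type="application/srgs+xml"/>
    </field>
  </form>
</vxml>
```

Properties follow the usual VoiceXML scoping rules, so the field-level confidencelevel wins over the document-level one while that field is being collected.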
  5. Grammars
       • Builtin (Appendix P of 2.0 spec)
       • Inline vs. external
       • Formats
           • ABNF
           • XML
           • JSGF (Prophecy)
       • Semantic Interpretation
           • Filling slots with values
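
The three ways of attaching a grammar listed on this slide can be sketched side by side. This is a minimal example under stated assumptions: the form, the field names, and the external file toppings.gram are invented for illustration, and builtin grammar support varies by platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="order">

    <!-- Builtin: the type attribute selects a platform-provided grammar -->
    <field name="quantity" type="number">
      <prompt>How many pizzas?</prompt>
    </field>

    <!-- Inline: the SRGS XML grammar is written directly in the document -->
    <field name="size">
      <prompt>Small, medium, or large?</prompt>
      <grammar type="application/srgs+xml" version="1.0" root="size" mode="voice">
        <rule id="size" scope="public">
          <one-of>
            <item>small</item>
            <item>medium</item>
            <item>large</item>
          </one-of>
        </rule>
      </grammar>
    </field>

    <!-- External: the grammar is fetched by URI; here an ABNF file (hypothetical name) -->
    <field name="topping">
      <prompt>Which topping?</prompt>
      <grammar src="toppings.gram" type="application/srgs"/>
    </field>

  </form>
</vxml>
```

The media types distinguish the two SRGS formats: application/srgs+xml for the XML form and application/srgs for ABNF.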
  6. Grammars
       • Note the difference between recognition (what the caller said) and interpretation (the value assigned to that input).
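
One way to see the recognition/interpretation split is a small SRGS XML grammar with semantic interpretation tags. The sketch below is illustrative only: the phrases and the returned values are assumptions, and it uses the SISR 1.0 "semantics/1.0" tag format, which a given platform may or may not support.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="answer"
         tag-format="semantics/1.0">
  <rule id="answer" scope="public">
    <one-of>
      <!-- Several different utterances map to the same interpretation -->
      <item>yes <tag>out = "true";</tag></item>
      <item>yep <tag>out = "true";</tag></item>
      <item>sure thing <tag>out = "true";</tag></item>
      <item>no <tag>out = "false";</tag></item>
      <item>nope <tag>out = "false";</tag></item>
    </one-of>
  </rule>
</grammar>
```

With a grammar like this, a caller who says "yep" is recognized as the utterance "yep", while the interpretation handed back to the application is the string "true".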
  7. Examining Results
       • Field Shadow Variables
           • name$.utterance
           • name$.inputmode
           • name$.interpretation
           • name$.confidence
       • application.lastresult$ Array
           • Array of elements holding last recognition
           • Sorted by confidence, from highest to lowest
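
The shadow variables and the application.lastresult$ array can be inspected from a filled block. The following is a sketch, assuming a field named answer and a hypothetical external grammar yesno.grxml; it only logs the values, and what the log output looks like depends on the platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Ask for up to three candidates so lastresult$ can hold more than one entry -->
  <property name="maxnbest" value="3"/>

  <form id="confirm">
    <field name="answer">
      <prompt>Do you want to continue?</prompt>
      <!-- yesno.grxml is a hypothetical grammar file -->
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <filled>
        <!-- Shadow variables for the field named "answer" -->
        <log>utterance: <value expr="answer$.utterance"/></log>
        <log>inputmode: <value expr="answer$.inputmode"/></log>
        <log>interpretation: <value expr="answer$.interpretation"/></log>
        <log>confidence: <value expr="answer$.confidence"/></log>

        <!-- Walk the n-best list, highest confidence first (foreach is a 2.1 element) -->
        <foreach item="r" array="application.lastresult$">
          <log><value expr="r.utterance + ' (' + r.confidence + ')'"/></log>
        </foreach>
      </filled>
    </field>
  </form>
</vxml>
```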
  8. VXML 2.1 features
       • Dynamic referencing of grammars
           • <grammar srcexpr="'http://host/' + foo + '?bar=' + bar"/>
       • Recording utterance while attempting recognition.
           • OoG utterances?
           • Build library of audio for grammar tuning?
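
Both 2.1 features can be combined in one small document. This is a sketch under assumptions: the values given to foo and bar, the form and field names, and the host URL pattern (taken from the slide) are placeholders, and whether a recording is available after an out-of-grammar utterance is platform dependent.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Keep the caller's audio while recognition is attempted -->
  <property name="recordutterance" value="true"/>

  <!-- Placeholder values used to build the grammar URI at run time -->
  <var name="foo" expr="'grammars'"/>
  <var name="bar" expr="'cities'"/>

  <form id="lookup">
    <field name="city">
      <prompt>Which city?</prompt>
      <!-- srcexpr is evaluated as ECMAScript each time the grammar is activated -->
      <grammar srcexpr="'http://host/' + foo + '?bar=' + bar"
               type="application/srgs+xml"/>
      <nomatch>
        <!-- Contents of lastresult$ after a nomatch may vary by platform -->
        <log>nomatch heard: <value expr="application.lastresult$.utterance"/></log>
        <reprompt/>
      </nomatch>
      <filled>
        <!-- With recordutterance on, the field gains recording shadow variables -->
        <log>recording length (ms): <value expr="city$.recordingduration"/></log>
        <log>recording size (bytes): <value expr="city$.recordingsize"/></log>
      </filled>
    </field>
  </form>
</vxml>
```

The audio itself is exposed as city$.recording, which an application could post to a server to build the tuning library the slide mentions.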
  9. Demo
       • Demo code available on GitHub
       • Running against Prophecy 8 on local server
       • Analog line through AudioCodes gateway
  10. Grub
       • Time to eat.
