Voxeo Summit 2010: Standards Update: VoiceXML3

At the Voxeo Customer Summit 2010, Dir. of Speech Technologies Dan Burnett provided an update on the evolving VoiceXML 3 standard.

More information at:
http://www.voxeo.com/
http://www.voxeo.com/summit2010
http://blogs.voxeo.com/speakingofstandards/


  1. Standards Update: VoiceXML 3
     Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo (Dir. of Standards, Voxeo)
  2. Voxeo on Standards
     • Develop ahead of standards
     • Make it Open Source
     • Lead in standards creation
     • Lead in standards adoption
     © Voxeo Corporation
  3. Past Leadership
     • W3C
         • VoiceXML 2.0/2.1, SRGS 1.0, SISR 1.0, SSML 1.0
         • CCXML 1.0, SCXML 1.0, EMMA 1.0
     • IETF
         • MRCPv1 extensions, MRCPv2, P-charge-info, SIP security
  4. Where we are now
     • W3C
         • VoiceXML 3, SSML 1.1, Pronunciation Alphabet Registry, Speech in HTML 5
         • CCXML 1.0, SCXML 1.0, EMMA next, MMI architecture
     • IETF, 3GPP
         • MRCPv2, XMPP (incl. multi-party Jingle and multiple chat), Media Control, SIP Overload, SIPREC, CODEC (Speex)
     • JCP
         • JSR 289, 309 – SIP servlets, media control
         • JSR 154, 254 – Java servlets and servlet pages
         • XMPP SIP servlet – submitting to JCP
  5. [Timeline figure] VoiceXML 1.0 (2000), VoiceXML 2.0 (2004), VoiceXML 2.1 (2007), VoiceXML 3 (2010)
  7. V3 Motivations
     • FIA flexibility
     • New features
     • Extensibility
     • Better integration with other W3C languages
  8. V3 is . . .
     • a restructured core
     • some new features
     • convenience elements to mimic VoiceXML 2.1
  9. V3 Architecture
     • Core functionality defined in modules
     • Modules combined with convenience syntax into profiles
  10. Core functionality defined in modules
     • Module behavior defined precisely as state machines
  11. Modules + Conv. Syntax = Profiles
     • Modules grouped into profiles
     • Legacy (V2.1), Basic, Maximal
     • Convenience syntax simplifies authoring
  12. Convenience Syntax
     • New elements and attributes, but no new functionality
     • Behavior defined in terms of core functionality
     • For example, <menu> defined in terms of <form> with grammars and prompts
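As a rough illustration of the <menu>-as-<form> idea on this slide, the sketch below uses VoiceXML 2.1 syntax, where a menu already behaves much like a form with a single field. The ids, prompts, and target forms are made up for the example, and the formal V3 definition may differ from this hand-written equivalent.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">

  <!-- Convenience form: a menu with two spoken choices -->
  <menu id="main">
    <prompt>Say sales or support.</prompt>
    <choice next="#sales">sales</choice>
    <choice next="#support">support</choice>
  </menu>

  <!-- Roughly the same dialog written out as a form:
       one field, option grammars, and explicit transitions -->
  <form id="main_as_form">
    <field name="dept">
      <prompt>Say sales or support.</prompt>
      <option value="sales">sales</option>
      <option value="support">support</option>
      <filled>
        <if cond="dept == 'sales'">
          <goto next="#sales"/>
        <else/>
          <goto next="#support"/>
        </if>
      </filled>
    </field>
  </form>

  <!-- Hypothetical targets so the example is complete -->
  <form id="sales"><block>Connecting you to sales.</block></form>
  <form id="support"><block>Connecting you to support.</block></form>
</vxml>
```

The menu version is shorter to author; the form version shows the grammars, prompts, and transitions that the shorthand stands for.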
  13. Convenience Syntax
     • Definite candidates are
         • menu/choice/enumerate/option
         • error/help/noinput/nomatch shortcuts
         • link
     • Possible (but different) candidates might be
         • if/else/elseif (using SCXML)
         • transfer (using CCXML)
  14. New Stuff
     • New media, SIV functions
     • Session root documents
     • Real-time controls
     • Author-specifiable transition controllers
     • V2 eventing model now async & compatible with DOM Level 3
  15. New Functionality – Video
     • Video: <audio> replaced by <media>, which allows both audio and video

         <media type="audio/x-wav" src="http://www.example.com/resource.wav"/>
         <media type="video/3gpp" src="http://www.example.com/resource.3gp"/>
         <media>
           <!-- inline SSML with audio media fallback -->
           <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
             Ich bin ein Berliner.
           </speak>
           <media type="audio/x-wav" src="ichbineinberliner.wav"/>
         </media>
  16. New Functionality – Media Control
     • Media control: media clipping, speed, and volume control now possible without resorting to SSML

         <media type="audio/x-wav" soundLevel="+6.0dB" speed="50%" repeatcount="2"
                src="http://www.example.com/resource.wav"/>
         <media type="video/3gpp" clipBegin="2s" clipEnd="5s" repeatDur="25s"
                src="http://www.example.com/resource.3gp"/>
  17. New Functionality – SIV
     • SIV: speaker authentication capabilities available as core functionality
         • Enrollment: creates voice model, associates it with id in speaker database
         • Identification: which voice model in speaker database is a match for the speech?
         • Verification: for the claimed id, does the speech match the voice model in the speaker database?
  18. New Control – Session Root
     • Just like application root: <vxml session="blahblah.vxml" ...>
     • Well, not exactly
         • If not specified, no session root
         • Session root change is ignored or causes error
     • First, let’s review application roots
  19. Application Root Review
         A: <vxml>             AppRoot A
         B: <vxml>             AppRoot B
         C: <vxml root="B">    AppRoot B
         D: <vxml root="E">    AppRoot E
         F: <vxml root="E">    AppRoot E
         G: <vxml>             AppRoot G
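For comparison, in VoiceXML 2.x a document names its application root with the application attribute on <vxml>; reading the slide's root="..." as shorthand for that idea is an assumption. The file names and the company variable in this sketch are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- leaf.vxml (hypothetical): names app-root.vxml as its application root -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml"
      application="app-root.vxml">
  <form id="greet">
    <block>
      <!-- "company" is assumed to be declared in app-root.vxml with
           <var name="company" expr="'Example Corp'"/>, which makes it
           visible here in the application scope -->
      <prompt>Welcome to <value expr="application.company"/>.</prompt>
    </block>
  </form>
</vxml>
```

A document that omits the attribute acts as its own root, which matches rows A, B, and G on the slide.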
  20. Session Root
         A: <vxml>                                      No Session Root
         B: <vxml session="C">                          Session Root C
         D: <vxml>                                      Session Root C
         E: <vxml session="F">                          Session Root C
         G: <vxml session="H" requiresession="true">    error.badfetch
  21. Real-time Controls
     • Special grammars that are always active (not just in the wait state)
         • Allows arbitrary speech/DTMF
         • Immediate: volume, speed, skip
         • At next event processing: cancel, goto

         <form>
           <rtc grammar="digit3.grxml" action="volume" params="+5"/>
           <field name="a"> ... </field>
           <field name="b">
             <cancelrtc grammar="digit3.grxml"/>
             ...
           </field>
         </form>

     • Acts as pre-filter on input stream, replacing matches with silence
  22. Transition Controllers
     • Inter-element transitions now under author control
     • Controllers at form, document, application, and perhaps session levels
         • e.g. form controller specifies which form item to execute next
     • Controllers can be in SCXML or another flow control language
     • Default controllers will give FIA behavior in Legacy Profile
  23. Transition Controllers Example 1

         <!-- document-level transition controller controls inter-form transitions -->
         <vxml ...>
           <controller ...>
             <scxml:scxml version="1.0" ...>
               <!-- SCXML code determining which form to go to next -->
             </scxml:scxml>
           </controller>
           <form id="form_a">
             ...
             <goto next="form_b"/> <!-- goto is only a suggestion now -->
           </form>
           <form id="form_b">
             ...
           </form>
           ...
         </vxml>
  24. Transition Controllers Example 2

         <!-- form-level transition controller controls inter-field transitions -->
         <vxml ...>
           <form>
             <controller src="myformbehavior.scxml"/>
             <field name="field_a"> ... </field>
             <field name="field_b"> ... </field>
             <field name="field_c"> ... </field>
             <field name="field_d"> ... </field>
           </form>
           ...
         </vxml>
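One way to picture what a form-level controller file such as myformbehavior.scxml might contain is the SCXML 1.0 sketch below. The state ids reuse the field names from the slide, but the event names (filled, nomatch) and the convention that the active state names the next field to run are assumptions for illustration; the slides do not say how a controller is bound to its form.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- myformbehavior.scxml: hypothetical controller that visits the fields
     in a non-document order (a, c, b, d) instead of the default FIA order -->
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="field_a">

  <state id="field_a">
    <!-- "filled" is an assumed event name, not taken from the slides -->
    <transition event="filled" target="field_c"/>
  </state>

  <state id="field_c">
    <transition event="filled" target="field_b"/>
    <!-- on a recognition failure, start over from the first field -->
    <transition event="nomatch" target="field_a"/>
  </state>

  <state id="field_b">
    <transition event="filled" target="field_d"/>
  </state>

  <state id="field_d">
    <transition event="filled" target="done"/>
  </state>

  <final id="done"/>
</scxml>
```

Under this reading, swapping in a different controller file changes the field order without touching the <field> elements themselves.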
  25. For More V3 Info
     • Follow the work
         • http://www.w3.org/Voice
     • Check out our recent Developer Jam Session
         • http://developers.voiceobjects.com/tech-topics/monthly-jam-sessions/
     • Contact me
         • dburnett at voxeo dot com
     Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo
