
Ken Rehor's presentation at eComm 2008

Transcript of "Ken Rehor's presentation at eComm 2008"

  1. Alphabet Soup: Sorting out Emerging Telephony and Speech Standards
     Ken Rehor, Co-founder, VoiceXML Forum; Founder, Harken Systems, LLC
  2. • Voice Web Telephony Architecture
     • Benefits of Open Interfaces, Protocols, Languages
     • Status and Deployment
  3. Components of a Voice Solution
  4. Break out of the monolithic systems trap
     • Modernize existing proprietary applications without starting from scratch
     • Develop new apps, and incrementally add features in a modular fashion
     • Advantages
       • Faster development
       • Less expensive to develop and maintain
       • Path towards modern, open standards architecture
  5. Voice / Web Application Architecture
     [Diagram labels: Phone user, Any phone, TDM or VoIP, VoiceXML platform, HTTP, Internet or Intranet, Web user, App server]
     App server:
     • Application logic
     • Content and data
     • Transaction processing
     • Database interface
     Content:
     • Grammars
     • Audio / SSML
     • Scripts
     • Images
     • Media
     • Scripts
     Document types: <html>, .wav, <grxml>, <vxml>
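The architecture on this slide has the app server deliver voice content to the VoiceXML platform over HTTP, much as it delivers HTML to a web browser. As a minimal sketch of the kind of document such a server might return, the Python below builds a one-prompt VoiceXML 2.0 page. The function name `build_greeting` and the prompt text are illustrative assumptions, not part of the presentation; only the element names and namespace come from the VoiceXML 2.0 specification.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_greeting(prompt_text: str) -> str:
    """Return a minimal VoiceXML 2.0 document that speaks one prompt.

    Hypothetical helper: a real app server would generate this from
    application logic and serve it over HTTP.
    """
    # Serialize VoiceXML elements without a namespace prefix.
    ET.register_namespace("", VXML_NS)
    vxml = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, f"{{{VXML_NS}}}form")
    block = ET.SubElement(form, f"{{{VXML_NS}}}block")
    prompt = ET.SubElement(block, f"{{{VXML_NS}}}prompt")
    prompt.text = prompt_text
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_greeting("Welcome to the voice web."))
```

Because the document is ordinary XML, any web framework or template engine that can emit HTML could emit it as well, which is the re-use of web technology the slide points at.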
  6. Voice App Architecture and Standards
     [Diagram; labels grouped by kind]
     Components: Caller, Phone Network, VoIP Gateway, Conference / Media Server, Media Mixer / Server, CCXML Browser, VoiceXML Browser, MRCP Client, ASR Server, TTS Server, SIV Server
     Applications: CCXML Call Control Application (CCXML, Scripts), VoiceXML Application (VXML, GRXML, Scripts, Audio)
     Interfaces:
     • Telephony Control Interface: SIP, etc.
     • Dialog Control Interface: SIP, MSCP, etc.
     • Media Control Interface
     Standards, protocols, and formats:
     • HTTP, HTTPS, SOAP
     • T1 / E1, ISDN, SS7, SIP, RFC 2833, RTP
     • MRCP (MRCP v1, MRCP v2), GRXML, SSML
     • G.711, WAV, .au, mp3, etc.
     • SIP, Netann, MSCML, MOML / MSML, MSCP, DMSP, MGCP, etc.
     • VoiceXML 2.0, VoiceXML 2.1, ECMAScript 262
     • Audio, DTMF
  7. Why Standards?
     • Grow an industry
     • Interoperation
     • Lower cost of goods
     • Innovation and evolution
     • Disrupt proprietary markets
       • Ecosystems develop around every open interface
       • Everyone benefits through joint work: reduces design effort
       • Promote technology to the next level
       • Sell more due to larger market
  8. Open Interfaces Enable Innovation
     • Migration: Proprietary, hardware-based solutions to Proprietary software-based solutions to Open Software
     • New Business Models
       • e.g. Voice Service Provider: Separate application from Telephony/Speech resources
     • Separation of concerns
     • Evolve components without starting from scratch
     • Concentrate on innovation rather than duplication
     • Move up the value chain
  9. Voice Web Fundamental Concepts
     • Leverage open, known technology
       • Web protocols, servers, networks, development tools, expertise
     • Distributed Client-Server Architecture
       • Enables new business models and efficient resource utilization
     • Standard/Common high-level language
       • Designed for voice dialogs and telephony
     • Phone number mapped to URL
       • Phone number associated with URL of voice application
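The "phone number mapped to URL" idea can be pictured as a lookup table that a voice platform consults when a call arrives. The sketch below is only an illustration under stated assumptions: the `ROUTES` table, the example numbers and URLs, and the helper names `normalize` and `resolve_application` are all hypothetical, and real platforms provision this mapping in their own ways.

```python
from typing import Optional

# Hypothetical routing table: dialed number -> URL of the voice application.
ROUTES = {
    "+18005550100": "http://apps.example.com/banking/start.vxml",
    "+18005550199": "http://apps.example.com/weather/start.vxml",
}


def normalize(number: str) -> str:
    """Strip spaces and punctuation so differently formatted numbers compare equal."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return "+" + digits


def resolve_application(dialed: str) -> Optional[str]:
    """Return the start URL for the dialed number, or None if none is provisioned."""
    return ROUTES.get(normalize(dialed))


if __name__ == "__main__":
    print(resolve_application("+1 (800) 555-0100"))
```

Once the URL is resolved, the platform fetches it over HTTP, so changing which application answers a number is a matter of editing the table rather than the telephony equipment.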
  10. Visual vs. Voice markup
      Web app UI
      • HTML – Structure
        • Layout
        • Input declaration
        • Transitions
      • Images
      • Audio
      • Video
      • Text
      • Scripts
      Voice Web app UI
      • VoiceXML – Structure
        • Dialog flow
        • Input declaration
        • Transitions
      • Audio
      • Video, Images
      • Text (for TTS)
      • Scripts
  11. Protocols
      Web applications
      • HTTP, HTTPS
      • SIP
      • RTP
      • SOAP
      • WSDL
      • …
      Voice Web applications
      • HTTP, HTTPS
      • SIP
      • RTP
      • SOAP
      • WSDL
      • …
  12. The Telecom Trilogy
      • User Interaction
        • Voice user interface
        • Multimodal user interface
      • Switching
        • Connecting endpoints
        • Moving connections
        • Signaling
      • Media processing
        • ASR, SIV, TTS, Record / Play
        • Conferencing, Mixing, Echo cancellation
        • Endpointing, Coding / Format conversion
  13. Ecosystem at Every Interface
      [Diagram labels: GUI Tool / SDE, Code Generator, Proprietary dialog XML <xml>, Application Server, VoiceXML, GRXML, SSML, Scripts, etc., VoiceXML browser <vxml>, MRCP client, MRCP server, ASR Engine <grxml>, TTS Engine <ssml>, Audio Engine, .wav, VSP: Telephony, Speech, apps]
      • Application Developers
      • VUI designers
      • Voice platforms
      • Tools
      • Service Providers
      • Application Servers
  14. Industry Standards – Global Adoption
      • VoiceXML Forum
        • Nearly 100 member organizations worldwide
        • Platform Certification
        • Speaker Biometrics
        • Collaborating with W3C, ANSI, ISO
      • W3C Speech Interface Framework
        • VoiceXML 2.0/2.1, SRGS 1.0, SSML 1.0, CCXML 1.0
        • SISR 1.0, PLS 1.0
        • Coming: VoiceXML 3.0, SSML 1.1
      • IETF
        • Media Resource Control Protocol (MRCPv2)
        • SIP / VoiceXML media server spec (MEDIACTRL)
  15. W3C Speech Interface Framework
      • VoiceXML
      • SRGS
      • SSML
      • Semantic Interpretation
      • Call Control
      • Pronunciation Lexicon
      • SCXML
      For more information, see: W3C Voice Browser Working Group, http://www.w3.org/Voice/
  16. W3C Speech Interface Framework
      • W3C VoiceXML 2.0
        • W3C Recommendation March 2004
        • Widely implemented
          • Approximately 4 dozen platforms
          • Many service providers worldwide
          • Many tools, countless applications
        • VoiceXML Forum Platform Certification Program
          • 24 certified platforms, more coming
      • W3C VoiceXML 2.1
        • W3C Recommendation April 2007
        • Most platform vendors support it
        • Certification Program and Test suite in progress
      • W3C VoiceXML 3.0
        • Spec in early stages of development
  17. W3C Speech Interface Framework
      • Call Control: W3C CCXML 1.0
        • W3C Working Draft Jan 2007
        • Implementations increasing
      • Pronunciation Lexicon: W3C PLS 1.0
        • Used to describe phonetic information for use in speech recognition and synthesis
        • 2nd Last Call Working Draft Oct 2006
  18. W3C Speech Interface Framework
      • Input grammars: SRGS 1.0
        • W3C Recommendation March 2004
        • Widely implemented
      • Output formatting: SSML 1.0, 1.1
        • SSML 1.0: W3C Recommendation March 2004
        • Widely implemented, but with little real support (most TTS engines ignore the SSML instructions)
        • SSML 1.1: W3C Working Draft June 2007
        • Adds support for Asian, Eastern European, and Middle Eastern languages
      • Semantic Interpretation for Speech Recognition: SISR 1.0
        • W3C Recommendation April 2007
        • Implementations increasing
        • Required for new Platform Certification
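To make "input grammars" concrete, here is a hedged sketch that generates a small SRGS 1.0 XML grammar accepting one phrase from a fixed list. The helper name `build_choice_grammar`, the rule id, and the phrases are assumptions chosen for illustration; the element and attribute names (`grammar`, `rule`, `one-of`, `item`, `root`, `mode`) are those defined by SRGS 1.0.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"  # SRGS 1.0 namespace
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # serialized as xml:lang


def build_choice_grammar(rule_id: str, phrases: list, lang: str = "en-US") -> str:
    """Return an SRGS XML grammar whose root rule matches any one of `phrases`."""
    # Serialize SRGS elements without a namespace prefix.
    ET.register_namespace("", SRGS_NS)
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", "root": rule_id, "mode": "voice", XML_LANG: lang},
    )
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule", {"id": rule_id})
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for phrase in phrases:
        # Each alternative the recognizer may hear becomes one <item>.
        ET.SubElement(one_of, f"{{{SRGS_NS}}}item").text = phrase
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    print(build_choice_grammar("answer", ["yes", "no"]))
```

A VoiceXML page would typically reference such a grammar by URL, so the same file can be shared across applications and recognizers that implement SRGS.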
  19. What's Next?
      • VoiceXML 3.0
        • Video
        • Multimodal integration
        • Speaker Biometrics
        • Cleaner Modularity
      • SCXML 1.0
        • State Chart Markup Language
        • Separate logic from presentation
        • W3C Working Draft Feb 2007
        • Several implementations available
          • Commercial, educational, open source
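The "separate logic from presentation" point can be illustrated with a plain transition table: the dialog flow lives in data, and prompts or screens are attached elsewhere. This is not SCXML itself, only a Python sketch of the state-chart idea; the state names, event names, and the `run` helper are hypothetical.

```python
# Hypothetical dialog flow as a state chart: state -> {event: next state}.
# Nothing here says how a state is rendered (voice prompt, web page, SMS).
TRANSITIONS = {
    "greeting": {"done": "ask_account"},
    "ask_account": {"filled": "confirm", "nomatch": "ask_account"},
    "confirm": {"yes": "goodbye", "no": "ask_account"},
    "goodbye": {},
}


def run(start: str, events: list) -> list:
    """Feed events to the chart and return the sequence of states visited."""
    state = start
    path = [state]
    for event in events:
        # Events with no matching transition leave the state unchanged.
        state = TRANSITIONS[state].get(event, state)
        path.append(state)
    return path


if __name__ == "__main__":
    print(run("greeting", ["done", "nomatch", "filled", "yes"]))
```

Because the table carries no prompts, the same flow could drive a voice dialog and a visual one, which is the kind of reuse the multimodal bullets have in mind.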
  20. Web / Voice ++
      • Standards enable easy integration with other technologies
      • Re-use web technologies
      • Multiple modalities / channels: Voice +
        • SMS
        • Web
        • Chat
        • Mobile
        • Voice Control / Search
  21. "Integration" / "Mashups" / "SOA"
      • Modular architecture
      • Open interfaces
      • Common languages, protocols
      • Combine data, services, modalities
      • Easy adoption of new technologies and features
        • Video
        • Multimodal
        • Biometrics
        • Telephony
  22. Mashups, SOA, Multi-Channel/Modal
      [Diagram labels: POTS, PSTN or VoIP, VXML Browser, Voice UI App, Mobile, Mobile web, IP, Mobile UI App, PC, IP, Web UI App, Presentation logic, Business logic]
  23. For more information:
      • http://www.kenrehor.com
      • http://www.voicexml.org
      • http://www.w3.org/voice
  24. 3rd Party Call Control: CCXML and SIP
      [Diagram labels: Caller, PSTN, Telephony Interface, CCXML Server, VoiceXML Server, Media Server, Media, Telephony Control Interface, Dialog Control Interface, HTTP, Telephony Web Application (CCXML), Voice Web Application (VXML)]
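In this arrangement a CCXML document handles call events and hands the caller to a VoiceXML dialog. The sketch below generates a small CCXML page that accepts an incoming call, starts a dialog, and exits when the dialog ends. It is an illustration under assumptions: the helper name `build_call_handler` and the dialog URL are hypothetical, and the element and event names follow the W3C CCXML 1.0 drafts, which were still Working Drafts at the time of this talk, so a given platform may differ.

```python
import xml.etree.ElementTree as ET

CCXML_NS = "http://www.w3.org/2002/09/ccxml"  # CCXML 1.0 namespace


def _q(name: str) -> str:
    """Qualify an element name with the CCXML namespace."""
    return f"{{{CCXML_NS}}}{name}"


def build_call_handler(dialog_url: str) -> str:
    """Return a CCXML document: accept the call, run one dialog, then exit."""
    # Serialize CCXML elements without a namespace prefix.
    ET.register_namespace("", CCXML_NS)
    ccxml = ET.Element(_q("ccxml"), {"version": "1.0"})
    events = ET.SubElement(ccxml, _q("eventprocessor"))

    # Incoming call is ringing: answer it.
    alerting = ET.SubElement(events, _q("transition"), {"event": "connection.alerting"})
    ET.SubElement(alerting, _q("accept"))

    # Call is connected: start the VoiceXML dialog.
    # The src attribute is an ECMAScript expression, hence the inner quotes.
    connected = ET.SubElement(events, _q("transition"), {"event": "connection.connected"})
    ET.SubElement(connected, _q("dialogstart"), {"src": f"'{dialog_url}'"})

    # Dialog finished: end the CCXML session.
    finished = ET.SubElement(events, _q("transition"), {"event": "dialog.exit"})
    ET.SubElement(finished, _q("exit"))

    return ET.tostring(ccxml, encoding="unicode")


if __name__ == "__main__":
    print(build_call_handler("http://apps.example.com/hello.vxml"))
```

The split mirrors the slide: switching decisions sit in the CCXML page, while everything the caller hears is in the VoiceXML page it references.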
  25. Voice Web Application Architecture
      [Diagram labels: PSTN or IP network, VoiceXML browser, MRCP Client, MRCP Server, ASR Engine <grxml>, TTS Engine <ssml>, audio, <record> audio, .wav, Voice Web Application Server <vxml>, database]
  26. VSOA Interfaces
      • Services can use a combination of interfaces
      • SIP / RTP for media services
        • With data carried in SIP messages
      • VoiceXML / HTTP for dialog services
      • CCXML / HTTP for switching control services
      • All can use SOAP or other web services interfaces
  27. An eComm 2008 presentation – http://eCommMedia.com for more