Ken Rehor's presentation at eComm 2008


    Ken Rehor's presentation at eComm 2008 - Presentation Transcript

    1. Alphabet Soup: Sorting out Emerging Telephony and Speech Standards
      Ken Rehor, Co-founder, VoiceXML Forum; Founder, Harken Systems, LLC
      • Voice Web Telephony Architecture
      • Benefits of Open Interfaces, Protocols, Languages
      • Status and Deployment
    2. Components of a Voice Solution
    3. Break out of the monolithic systems trap
      • Modernize existing proprietary applications without starting from scratch
      • Develop new apps, and incrementally add features in a modular fashion
      • Advantages
        • Faster development
        • Less expensive to develop and maintain
        • Path towards modern, open standards architecture
    4. Voice / Web Application Architecture
      • Phone user (any phone) connects to the VoiceXML platform over TDM or VoIP
      • Web user connects over HTTP (Internet or Intranet)
      • VoiceXML platform connects to the app server over HTTP
      • App server
        • Application logic
        • Content and data
        • Transaction processing
        • Database interface
      • Voice content: <vxml>
        • Grammars (<grxml>)
        • Audio / SSML (.wav)
        • Scripts
      • Web content: <html>
        • Images
        • Media
        • Scripts
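
The architecture on slide 4 has one app server answering both web users and a VoiceXML platform over HTTP. A minimal sketch of that idea follows, assuming a hypothetical balance lookup and hypothetical function names (`account_balance`, `render_html`, `render_vxml`, `handle_request`); none of these come from the presentation, and a real deployment would sit behind an actual web framework.

```python
from xml.sax.saxutils import escape


def account_balance(account_id: str) -> str:
    """Stand-in for application logic and database access (made-up data)."""
    balances = {"1001": "42.50"}
    return balances.get(account_id, "unknown")


def render_html(balance: str) -> str:
    """Markup returned to a web user's browser."""
    return f"<html><body><p>Your balance is {escape(balance)}.</p></body></html>"


def render_vxml(balance: str) -> str:
    """Markup returned to a VoiceXML platform, which speaks the prompt."""
    return (
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">'
        "<form><block>"
        f"<prompt>Your balance is {escape(balance)}.</prompt>"
        "</block></form>"
        "</vxml>"
    )


def handle_request(account_id: str, channel: str) -> str:
    """Same business logic for both channels; only the presentation differs."""
    balance = account_balance(account_id)
    if channel == "voice":
        return render_vxml(balance)
    return render_html(balance)
```

Calling `handle_request("1001", "voice")` yields a VoiceXML document, while any other channel value yields HTML built from the same data.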
    5. Voice App Architecture and Standards (diagram labels)
      • Components: Caller, Phone Network, VoIP Gateway, CCXML Browser, VoiceXML Browser, MRCP Client, Conference / Media Server, Media Mixer / Server, ASR Server, TTS Server, SIV Server
      • Applications: CCXML Call Control Application (CCXML, Scripts), VoiceXML Application (VXML, GRXML, Scripts, Audio)
      • Application fetch: HTTP, HTTPS, SOAP
      • Phone network: T1 / E1, ISDN, SS7, SIP
      • Telephony Control Interface: SIP, etc.
      • Dialog Control Interface: SIP, MSCP, etc.
      • Media Control Interface: SIP, Netann, MSCML, MOML / MSML, MSCP, DMSP, MGCP, etc.
      • Media: RTP, RFC 2833, Audio, DTMF
      • Speech resources: MRCP (MRCP v1, MRCP v2), GRXML, SSML
      • Audio formats: G.711, WAV, .au, mp3, etc.
      • Languages: VoiceXML 2.0, VoiceXML 2.1, ECMAScript 262, SSML
    6. Why Standards?
      • Grow an industry
      • Interoperation
      • Lower cost of goods
      • Innovation and evolution
      • Disrupt proprietary markets
        • Ecosystems develop around every open interface
        • Everyone benefits through joint work: reduces design effort
        • Promote technology to the next level
        • Sell more due to larger market
    7. Open Interfaces Enable Innovation
      • Migration: Proprietary, hardware-based solutions to Proprietary software-based solutions to Open Software
      • New Business Models
        • e.g. Voice Service Provider: Separate application from Telephony/Speech resources
      • Separation of concerns
      • Evolve components without starting from scratch
      • Concentrate on innovation rather than duplication
      • Move up the value chain
      Voice Web Fundamental Concepts
      • Leverage open, known technology
        • Web protocols, servers, networks, development tools, expertise
      • Distributed Client-Server Architecture
        • Enables new business models and efficient resource utilization
      • Standard/Common high-level language
        • Designed for voice dialogs and telephony
      • Phone number mapped to URL
        • Phone number associated with URL of voice application
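
The "phone number mapped to URL" concept on slide 7 can be sketched as a simple lookup. This is only an illustration: the table contents, the example.com URLs, and the names `NUMBER_TO_APP_URL`, `normalize`, and `app_url_for` are assumptions, not part of the presentation or of any platform's API.

```python
from typing import Optional

# Hypothetical routing table: dialed number -> URL of the voice application.
NUMBER_TO_APP_URL = {
    "+18005550100": "http://apps.example.com/banking/start.vxml",
    "+18005550101": "http://apps.example.com/weather/start.vxml",
}


def normalize(dialed: str) -> str:
    """Reduce a dialed string to '+' followed by its digits."""
    digits = "".join(ch for ch in dialed if ch.isdigit())
    return "+" + digits


def app_url_for(dialed: str) -> Optional[str]:
    """Return the voice application URL for a dialed number, or None."""
    return NUMBER_TO_APP_URL.get(normalize(dialed))
```

A platform would then fetch the returned URL over HTTP, much as a web browser fetches a page for a typed address.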
    8. Visual vs. Voice markup
      • Web app UI
        • HTML – Structure
          • Layout
          • Input declaration
          • Transitions
        • Images
        • Audio
        • Video
        • Text
        • Scripts
      • Voice Web app UI
        • VoiceXML – Structure
          • Dialog flow
          • Input declaration
          • Transitions
        • Audio
        • Video, Images
        • Text (for TTS)
        • Scripts
    9. Protocols
      • Web applications
        • HTTP, HTTPS
        • SIP
        • RTP
        • SOAP
        • WSDL
      • Voice Web applications
        • HTTP, HTTPS
        • SIP
        • RTP
        • SOAP
        • WSDL
    10. The Telecom Trilogy
      • User Interaction
        • Voice user interface
        • Multimodal user interface
      • Switching
        • Connecting endpoints
        • Moving connections
        • Signaling
      • Media processing
        • ASR, SIV, TTS, Record / Play
        • Conferencing, Mixing, Echo cancellation
        • Endpointing, Coding / Format conversion
    11. Ecosystem at Every Interface
      • Application Developers
      • VUI designers
      • Voice platforms
      • Tools
      • Service Providers
      • Application Servers
      Diagram labels: GUI Tool / SDE; Code Generator; Proprietary dialog XML (<xml>); Application Server; VoiceXML, GRXML, SSML, Scripts, etc.; VoiceXML browser (<vxml>); MRCP client; MRCP server; ASR Engine (<grxml>); TTS Engine (<ssml>); Audio Engine (.wav); VSP: Telephony, Speech, apps
    12. Industry Standards – Global Adoption
      • VoiceXML Forum
        • Nearly 100 member organizations worldwide
        • Platform Certification
        • Speaker Biometrics
        • Collaborating with W3C, ANSI, ISO
      • W3C Speech Interface Framework
        • VoiceXML 2.0/2.1, SRGS 1.0, SSML 1.0, CCXML 1.0
        • SISR 1.0, PLS 1.0
        • Coming: VoiceXML 3.0, SSML 1.1
      • IETF
        • Media Resource Control Protocol (MRCPv2)
        • SIP / VoiceXML media server spec (MEDIACTRL)
    13. W3C Speech Interface Framework
      • VoiceXML
      • SRGS
      • SSML
      • Semantic Interpretation
      • Call Control
      • Pronunciation Lexicon
      • SCXML
      For more information, see: W3C Voice Browser Working Group http://www.w3.org/Voice/
    14. W3C Speech Interface Framework
      • W3C VoiceXML 2.0
        • W3C Recommendation March 2004
        • Widely implemented
          • Approximately 4 dozen platforms
          • Many service providers worldwide
          • Many tools, countless applications
        • VoiceXML Forum Platform Certification Program
          • 24 certified platforms, more coming
      • W3C VoiceXML 2.1
        • W3C Recommendation June 2007
        • Most platform vendors support it
        • Certification Program and Test suite in progress
      • W3C VoiceXML 3.0
        • Spec in early stages of development
    15. W3C Speech Interface Framework
      • Call Control W3C CCXML 1.0
        • W3C Working Draft Jan 2007
        • Implementations increasing
      • Pronunciation Lexicon W3C PLS 1.0
        • Used to describe phonetic information for use in speech recognition and synthesis
        • 2nd Last Call Working Draft Oct 2006
    16. W3C Speech Interface Framework
      • Input grammars SRGS 1.0
        • W3C Recommendation March 2004
        • Widely implemented
      • Output formatting SSML 1.0, 1.1
        • SSML 1.0 – W3C Recommendation September 2004
        • Widely implemented, but with limited real support (most TTS engines ignore the SSML instructions)
        • SSML 1.1 – W3C Working Draft June 2007
        • Adds support for Asian, Eastern European, and Middle Eastern languages
      • Semantic Interpretation for Speech Recognition SISR 1.0
        • W3C Recommendation April 2007
        • Implementations increasing
        • Required for new Platform Certification
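
To make the roles of SRGS (input grammars) and SSML (output formatting) on slide 16 concrete, here is a minimal sketch that builds one small document of each kind as a string. The helper names (`srgs_grammar`, `ssml_prompt`, `root_tag`) and the yes/no example are assumptions for illustration; the namespaces and element names follow the W3C SRGS 1.0 and SSML 1.0 Recommendations.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def srgs_grammar(rule_id, phrases, lang="en-US"):
    """Build an SRGS 1.0 XML grammar accepting any one of the phrases."""
    items = "".join(f"<item>{escape(p)}</item>" for p in phrases)
    return (
        '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" '
        f'xml:lang={quoteattr(lang)} mode="voice" root={quoteattr(rule_id)}>'
        f"<rule id={quoteattr(rule_id)}><one-of>{items}</one-of></rule>"
        "</grammar>"
    )


def ssml_prompt(text, pause_ms=300, lang="en-US"):
    """Build an SSML 1.0 document: the text followed by a short pause."""
    return (
        '<speak xmlns="http://www.w3.org/2001/10/synthesis" version="1.0" '
        f"xml:lang={quoteattr(lang)}>"
        f'{escape(text)}<break time="{int(pause_ms)}ms"/>'
        "</speak>"
    )


def root_tag(xml_text):
    """Parse the text (raises if not well-formed) and return the root tag."""
    return ET.fromstring(xml_text).tag
```

For example, `srgs_grammar("yesno", ["yes", "no"])` produces a grammar an ASR engine could load, and `ssml_prompt("Please say yes or no.")` produces a prompt for a TTS engine. As the slide notes, how much of the SSML markup an engine honors varies.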
    17. What's Next?
      • VoiceXML 3.0
        • Video
        • Multimodal integration
        • Speaker Biometrics
        • Cleaner Modularity
      • SCXML 1.0
        • State Chart Markup Language
        • Separate logic from presentation
        • W3C Working Draft Feb 2007
        • Several implementations available
          • Commercial, educational, open source
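
The SCXML idea of keeping control logic in a state chart, separate from presentation, can be sketched with a plain transition table. This is not SCXML itself and does not use any SCXML engine; the state names, event names, and the functions `step` and `run` are hypothetical.

```python
# Transition table: (current state, event) -> next state.
# States and events are made up for illustration.
TRANSITIONS = {
    ("idle", "call.incoming"): "greeting",
    ("greeting", "prompt.done"): "collecting",
    ("collecting", "input.nomatch"): "collecting",
    ("collecting", "input.ok"): "confirming",
    ("confirming", "caller.hangup"): "idle",
}


def step(state, event):
    """Return the next state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="idle"):
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Because the table holds only states and events, the same logic could drive a voice dialog or a visual page without change, which is the separation the slide describes.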
    18. Web / Voice ++
      • Standards enable easy integration with other technologies
      • Re-use web technologies
      • Multiple modalities / channels: Voice +
        • SMS
        • Web
        • Chat
        • Mobile
        • Voice Control / Search
    19. "Integration" / "Mashups" / "SOA"
      • Modular architecture
      • Open interfaces
      • Common languages, protocols
      • Combine data, services, modalities
      • Easy adoption of new technologies and features
        • Video
        • Multimodal
        • Biometrics
        • Telephony
    20. Mashups, SOA, Multi-Channel/Modal
      • Devices and networks: POTS (PSTN or VoIP), Mobile (IP), PC (IP)
      • Browsers: VXML Browser, Mobile web
      • Presentation logic: Voice UI App, Mobile UI App, Web UI App
      • Business logic
    21. For more information:
      • http://www.kenrehor.com
      • http://www.voicexml.org
      • http://www.w3.org/voice
    22. 3rd Party Call Control: CCXML and SIP
      • Call path: Caller, PSTN, Telephony Interface, Media Server (Media)
      • CCXML Server: CCXML from the Telephony Web Application over HTTP
      • VoiceXML Server: VXML from the Voice Web Application over HTTP
      • Control interfaces: Telephony Control Interface, Dialog Control Interface
    23. Voice Web Application Architecture
      • PSTN or IP network
      • VoiceXML browser (<vxml>, <record> audio) with MRCP Client
      • MRCP Server: ASR Engine (<grxml>), TTS Engine (<ssml>)
      • Voice Web Application Server: database, audio (.wav)
    24. VSOA Interfaces
      • Services can use a combination of interfaces
      • SIP / RTP for media services
        • With data carried in SIP messages
      • VoiceXML / HTTP for dialog services
      • CCXML / HTTP for switching control services
      • All can use SOAP or other web services interfaces
    25. An eComm 2008 presentation – http://eCommMedia.com for more
