Developing with VoiceXML Building a Video Conference Application
  • Discuss the role of each component:
    • Video Conference Application: implements the business logic (PIN validation, control of the conference mode)
    • Video phone, mobile phone, PC: establish a call with the conference application; transmit and receive audio/video to/from the media server
    • Media server: performs audio mixing, muting of participants, and video switching or mixing
  • Discuss the protocols: SIP for signalling, RTP for media. There are two separate RTP streams per video phone: one for audio and one for video
  • Note that sending an SDP with “a=recvonly” to the participant device is not secure: there is no way to know if the participant device will comply.

Presentation Transcript

  • 2. Developing with VoiceXML Building a Video Conference Application
  • 3. Agenda
    • VoiceXML
    • Video using VoiceXML
    • Components of a Video Conference Server
    • System Architecture
    • SIP & RTP flows
    • JSLEE & Mobicents
    • Software Architecture
    • Controlling Participants
    • Putting it all together
  • 4. VoiceXML – What is it?
    • VoiceXML is an IVR scripting language
    • Used to develop complex IVR applications, such as
      • Phone-based self-help services (i.e., labyrinths)
      • Multi-level auto-attendants
      • Calling card services
    • Standardized by W3C
    • Also used as a control protocol between VoIP application servers and media servers
    • Supported by most media server vendors
  • 5. Developing an Application with VoiceXML
    • This presentation shows how to develop a video conferencing application using VoiceXML and off-the-shelf components
    • We will use the Voxpilot/HP video extensions to VoiceXML
      • Provide playback and recording of video prompts
      • Support multiple video codecs
      • Proposed by Burke (Voxpilot) & McGlashan (HP)
      • Extensions may get integrated into VoiceXML 3.0
    • We will cover the system architecture, components, protocols, and support for multiple audio and video codecs
  • 6. Video Conferencing Application
    • Components for building a video conferencing solution are now much cheaper
      • Good web cam
      • Good headset
      • Video softclient
      • Open source telecom framework
      • Video-enabled media server
    • Small & Medium Enterprises can now afford video conferencing
    • Can even be deployed in home offices
  • 7. Video Conference Application - Goals
    • Low Cost
      • Must use standard components & protocols
    • Easy to Use & Minimal Learning Curve
      • Same interface as existing meet-me conference bridges
      • Advanced interface accessible through web
    • Provide Common Conferencing Features
      • PIN validation
      • Mute one or more participants
      • Prime Speaker
      • Manual or Automated Video Source Control
    • Good Video Quality
      • At least CIF (352x288) @ 15 frames/second
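  • 8. Video Conference – System Architecture
    • [Diagram: endpoints signal the Video Conference Application over SIP; the application controls the Video-Enabled Media Server over SIP, NETANN, and VoiceXML; RTP media flows between the endpoints and the media server]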
  • 9. Major Component Responsibilities
    • Video Conferencing Application
      • Back-to-back SIP user agent (B2BUA)
      • Controls conference participants when conference is up
        • e.g., muting a participant, giving priority to a participant
      • Delegates PIN validation to a VoiceXML script
    • VoiceXML script
      • Validates the PIN (uses a CGI script to access the database)
      • Transfers the call to the conference bridge
    • Media Server
      • Executes VoiceXML script
      • Performs audio mixing
      • Performs video processing
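The PIN check that the VoiceXML script delegates to a CGI script can be sketched as a lookup against the conference database. This is only an illustration: the in-memory `CONFERENCES` table, the function name `validate_pin`, and the returned URI format are assumptions, not the presentation's actual script.

```python
# Hypothetical in-memory stand-in for the database of active conferences,
# phone numbers, and PINs.
CONFERENCES = {
    "conf-1": {"pins": {"5551212": "1234"}},
}


def validate_pin(phone, pin, conferences=CONFERENCES, media_server="MS"):
    """Return the conference SIP URI if the phone/PIN pair matches, else None."""
    for conf_id, conf in conferences.items():
        if conf["pins"].get(phone) == pin:
            return f"sip:{conf_id}@{media_server}"
    return None


print(validate_pin("5551212", "1234"))
print(validate_pin("5551212", "0000"))
```

On success the caller would be transferred to the returned URI; on failure the script would typically re-prompt for the PIN.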
  • 10. Basic Call Flow – SIP & VoiceXML
    • [Sequence diagram; the Video Conference Application is one of the entities shown]
    • 1. SIP INVITE
    • 2. SIP INVITE voicexml=http://as/askpin.vxml
    • 3. HTTP GET
    • 4. SIP 200 OK
    • 5. SIP 200 OK
    • 6. HTTP POST validate.cgi?phone=5551212&pin=1234
    • 7. SIP REFER (Refer-To: sip:conf-1@MS)
    • 8. SIP INVITE sip:conf-1@MS
    • 9. SIP 200 OK
    • 10. SIP reINVITE
    • 11. SIP 200 OK
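The INVITE that starts the VoiceXML script carries the script location in a `voicexml` parameter. A minimal sketch of building such a Request-URI follows. The `dialog` user part is an assumption based on the NETANN convention (RFC 4240); the slide shows only the `voicexml=...` parameter, and `netann_dialog_uri` is a hypothetical helper name.

```python
from urllib.parse import quote


def netann_dialog_uri(media_server, script_url):
    """Build a NETANN-style SIP URI asking the media server to run a VoiceXML script.

    ':' and '/' are left readable; other reserved characters in the script URL
    are percent-encoded so they cannot be confused with SIP URI syntax.
    """
    return f"sip:dialog@{media_server};voicexml={quote(script_url, safe=':/')}"


print(netann_dialog_uri("MS", "http://as/askpin.vxml"))
```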
  • 11. Video Conference Application Software
    • Video application is built on top of JSLEE, a Java real-time framework
    • Database contains a list of active conferences, phone numbers, and PINs
    • Apache provides
      • Web pages
      • Access to VoiceXML scripts
      • Access to media files
      • Execution of CGI scripts
  • 12. Video-enabled Media Server
    • Video-enabled Media Servers are available from many vendors
    • Select a media server that supports Video IVR and Video Conferencing
      • Video Codec: H.263 and H.264 @ CIF resolution (352x288)
    • Video Conferencing mode should have at least:
      • Manual Control
      • Automated control (e.g., follow-me)
    • Audio mixing should provide:
      • Audio Codec: G.711 ulaw/A-Law, G.729, AMR
      • Audio mixing without introducing echo
      • Noise reduction
      • Packet Loss Concealment algorithm
  • 13. Mobicents – a Telecom Framework
    • Mobicents is an open source JSLEE container
      • JSLEE is a Java-based framework for real-time apps
      • JSLEE is to telecom what J2EE is to business apps
      • Mobicents is written by some JBoss developers
    • Mobicents provides:
      • Soft real-time event routing
      • SIP stack
      • Traces, logs, alarms
    • See
  • 14. Mobicents - Internals
  • 15. Video Conference Application – Software Components
  • 16. IVR Service
    • IVR service provides a high-level API for playback and digit collection functions
      • Hides details of SIP and media server protocols
      • Isolates applications from Media Server protocol
        • e.g., if MS protocol changes from VoiceXML to MSCML, only IVR service must change
    • IVR service implemented as a JSLEE Service Building Block (SBB)
  • 17. IVR Service - API
    • Simple API hides IVR complexity
    • Instantiate and send an event to IvrSBB
    • Events supported:
      • CreateConnection
      • Play
      • PlayCollect
      • Release
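The event-driven shape of this API can be illustrated with a toy dispatcher. The class below is not the JSLEE IvrSBB; `IvrSession`, its `on_event` method, and the rule that `Play` and `PlayCollect` require a prior `CreateConnection` are assumptions made for the sketch. Only the four event names come from the slide.

```python
class IvrSession:
    """Toy stand-in for an IVR service that accepts the four events listed above."""

    def __init__(self):
        self.connected = False
        self.log = []  # (event name, parameters) in arrival order

    def on_event(self, name, **params):
        if name == "CreateConnection":
            self.connected = True
        elif name in ("Play", "PlayCollect"):
            # Assumed rule: media operations need an established connection.
            if not self.connected:
                raise RuntimeError(f"{name} requires CreateConnection first")
        elif name == "Release":
            self.connected = False
        else:
            raise ValueError(f"unsupported event: {name}")
        self.log.append((name, params))


session = IvrSession()
session.on_event("CreateConnection")
session.on_event("PlayCollect", prompt="http://as/askpin.vxml", max_digits=4)
session.on_event("Release")
print([name for name, _ in session.log])
```

The caller never touches SIP or the media server protocol directly, which is the isolation the IVR service is meant to provide.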
  • 18. VoiceXML – Using video extensions
    • Playing video clips using VoiceXML 2.0
    • <audio src="http://as/">
    • <audio src="http://as/" />
    • </audio>
      • Two video clips are provided: one for H.263 video clients, the other for H.264 video clients
      • The example uses the VoiceXML fallback audio feature to support both codecs
      • The VoiceXML interpreter will try to play each video clip in the list until it finds one that is compatible with the video codec of the remote device
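The fallback behaviour described above amounts to a first-match search over the clip list. The sketch below models it outside any real VoiceXML interpreter; the `pick_clip` function and the clip file names are hypothetical.

```python
def pick_clip(clips, remote_codecs):
    """Return the URL of the first clip whose codec the remote device supports.

    `clips` is an ordered list of (codec, url) pairs, mirroring the nesting
    order of the <audio> elements; None means no clip is playable.
    """
    for codec, url in clips:
        if codec in remote_codecs:
            return url
    return None


# Hypothetical clip names; the slide's URLs are truncated.
clips = [("H263", "http://as/welcome-h263.3gp"), ("H264", "http://as/welcome-h264.3gp")]
print(pick_clip(clips, {"H264"}))
```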
  • 19. Muting a participant using RFC3264
    • The conference leader can mute a participant
    • This is achieved by the Video Conf App sending a SIP reINVITE with SDP containing “a=sendonly” to the media server:
    • v=0
    • o=Caller 10 20 IP4
    • s=Participant
    • c=IN IP4
    • t=0 0
    • m=audio 5004 RTP/AVP 0
    • a=rtpmap:0 PCMU/8000
    • a=sendonly
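A minimal sketch of generating such an SDP body follows. The `o=` and `c=` lines on the slide have lost their addresses, so the example takes the address as a parameter and uses 192.0.2.10 (a documentation-only address) purely as a placeholder; `build_audio_sdp` is a hypothetical helper. RFC 3264 defines the direction attribute from the point of view of the party sending the SDP, so it is worth verifying against the media server in use which value actually mutes the participant.

```python
DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def build_audio_sdp(address, port=5004, direction="sendonly"):
    """Return an SDP body with one PCMU audio stream and a direction attribute."""
    if direction not in DIRECTIONS:
        raise ValueError(f"invalid direction: {direction}")
    lines = [
        "v=0",
        f"o=Caller 10 20 IN IP4 {address}",
        "s=Participant",
        f"c=IN IP4 {address}",
        "t=0 0",
        f"m=audio {port} RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        f"a={direction}",
    ]
    # SDP lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


print(build_audio_sdp("192.0.2.10"))  # placeholder address, not from the slide
```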
  • 20. Manual Control of the Video Feed
    • The conference leader can manually control the video feed displayed to all participants
      • This is achieved by turning off the video source of all participants except one
      • Send a reINVITE with SDP containing “a=sendonly” applied to video:
    • v=0
    • o=Caller 10 20 IP4
    • s=Participant
    • c=IN IP4
    • t=0 0
    • m=audio 5004 RTP/AVP 0
    • a=rtpmap:0 PCMU/8000
    • a=sendrecv
    • m=video 5006 RTP/AVP 98
    • a=rtpmap:98 H264/90000
    • a=sendonly
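Choosing which participant's video stays on can be sketched as a mapping from participant to the direction attribute placed on that participant's `m=video` section. The function name and the participant labels are hypothetical; the `off` value defaults to the attribute shown on the slide and is left as a parameter because the right value depends on how the media server interprets the offer.

```python
def video_directions(participants, selected, on="sendrecv", off="sendonly"):
    """Map each participant to the video direction attribute for its reINVITE.

    Only `selected` keeps the `on` attribute; every other participant gets `off`.
    """
    if selected not in participants:
        raise ValueError(f"unknown participant: {selected}")
    return {p: (on if p == selected else off) for p in participants}


print(video_directions(["alice", "bob", "carol"], "bob"))
```

One reINVITE per participant would then be sent, each carrying the attribute returned for that participant.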
  • 21. Putting it all Together
    • VoiceXML provides the user interface to the video conference
    • Mobicents provides an easy-to-use real-time framework for telecom applications
      • Mobicents hides SIP complexity
    • Building the business logic for a video conferencing application is no longer difficult
    • Low-cost video phones and softclients make this solution possible
    • Entire solution can be deployed in small and medium businesses