Developing with VoiceXML: Building a Video Conference Application



  1. 2. Developing with VoiceXML: Building a Video Conference Application
  2. 3. Agenda
       • VoiceXML
       • Video using VoiceXML
       • Components of a Video Conference Server
       • System Architecture
       • SIP & RTP flows
       • JSLEE & Mobicents
       • Software Architecture
       • Controlling Participants
       • Putting it all together
  3. 4. VoiceXML – What is it?
       • VoiceXML is an IVR scripting language
       • Used to develop complex IVR applications, such as
           • Phone-based self-help services (i.e., labyrinths)
           • Multi-level auto-attendants
           • Calling card services
       • Standardized by W3C
       • Also used as a control protocol between VoIP application servers and media servers
       • Supported by most media server vendors
  4. 5. Developing an Application with VoiceXML
       • This presentation shows how to develop a video conferencing application using VoiceXML and off-the-shelf components
       • We will use the Voxpilot/HP video extensions to VoiceXML
           • Provide playback and recording of video prompts
           • Support multiple video codecs
           • Proposed by Burke (Voxpilot) & McGlashan (HP)
           • Extensions may get integrated into VoiceXML 3.0
       • We will cover the system architecture, components, protocols, and support for multiple audio and video codecs
  5. 6. Video Conferencing Application
       • Components for building a video conferencing solution are now much cheaper
           • Good web cam
           • Good headset
           • Video softclient
           • Open source telecom framework
           • Video-enabled media server
       • Small and medium enterprises can now afford video conferencing
       • Can even be deployed in home offices
  6. 7. Video Conference Application - Goals
       • Low Cost
           • Must use standard components & protocols
       • Easy to Use & Minimal Learning Curve
           • Same interface as existing meet-me conference bridges
           • Advanced interface accessible through the web
       • Provide Common Conferencing Features
           • PIN validation
           • Mute one or more participants
           • Prime Speaker
           • Manual or Automated Video Source Control
       • Good Video Quality
           • At least CIF (352x288) @ 15 frames/second
  7. 8. Video Conference – System Architecture
       [Diagram: a Video Conference Application and a Video-Enabled Media Server; signaling links labeled "SIP" and "SIP, NETANN, VoiceXML"; media links labeled "RTP"]
  8. 9. Major Component Responsibilities
       • Video Conferencing Application
           • Back-to-back SIP user agent (B2BUA)
           • Controls conference participants while the conference is up
               • e.g., muting a participant, giving priority to a participant
           • Delegates PIN validation to a VoiceXML script
       • VoiceXML script
           • Validates the PIN (uses a CGI script to access a database)
           • Transfers the call to the conference bridge
       • Media Server
           • Executes the VoiceXML script
           • Performs audio mixing
           • Performs video processing
  9. 10. Basic Call Flow – SIP & VoiceXML
       [Diagram: message sequence involving the Video Conference Application, listed in numbered order]
        1. SIP INVITE
        2. SIP INVITE voicexml=http://as/askpin.vxml
        3. HTTP GET
        4. SIP 200 OK
        5. SIP 200 OK
        6. HTTP POST validate.cgi?phone=5551212&pin=1234
        7. SIP REFER (Refer-To: sip:conf-1@MS)
        8. SIP INVITE sip:conf-1@MS
        9. SIP 200 OK
       10. SIP reINVITE
       11. SIP 200 OK
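The INVITE that carries `voicexml=http://as/askpin.vxml` appears to follow the NETANN convention (RFC 4240), in which the Request-URI tells the media server which script to run. The following is a minimal sketch of building such a URI, not the application's actual code. The `dialog` user part is taken from RFC 4240 rather than from the slide, and the host name `ms.example.com` and the function name are hypothetical.

```python
from urllib.parse import quote


def netann_dialog_uri(media_server: str, script_url: str) -> str:
    """Build a NETANN-style SIP URI asking a media server to run a VoiceXML script.

    Assumption: the RFC 4240 "dialog" service form,
    sip:dialog@<media server>;voicexml=<script URL>.
    """
    # "/" and ":" may appear literally in a SIP URI parameter value;
    # characters such as "?", "=" and ";" are percent-encoded.
    encoded = quote(script_url, safe="/:")
    return f"sip:dialog@{media_server};voicexml={encoded}"


# Script URL from the slide; media server host is a placeholder.
uri = netann_dialog_uri("ms.example.com", "http://as/askpin.vxml")
print(uri)
```

Percent-encoding matters once the script URL has a query string, because an unescaped `;` or `=` would otherwise be read as part of the SIP URI's own parameters.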
  10. 11. Video Conference Application Software
       • Video application is built on top of JSLEE, a Java real-time framework
       • Database contains a list of active conferences, phone numbers, and PINs
       • Apache provides
           • Web pages
           • Access to VoiceXML scripts
           • Access to media files
           • Execution of CGI scripts
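The PIN check that the CGI script performs against the database can be sketched as follows. This is an illustration under stated assumptions, not the presenters' script: the table layout, the `conf-1` identifier as the stored value, and the function names are hypothetical, and an in-memory SQLite database stands in for the real one. Only the `phone` and `pin` query parameters come from the slides.

```python
import sqlite3
from typing import Optional
from urllib.parse import parse_qs


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one active conference (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE conferences (conf_id TEXT, phone TEXT, pin TEXT)")
    db.execute("INSERT INTO conferences VALUES ('conf-1', '5551212', '1234')")
    return db


def validate(db: sqlite3.Connection, query_string: str) -> Optional[str]:
    """Return the conference id if phone and PIN match an active conference, else None."""
    params = parse_qs(query_string)
    phone = params.get("phone", [""])[0]
    pin = params.get("pin", [""])[0]
    # Parameterized query: caller-supplied digits never become part of the SQL text.
    row = db.execute(
        "SELECT conf_id FROM conferences WHERE phone = ? AND pin = ?",
        (phone, pin),
    ).fetchone()
    return row[0] if row else None


db = make_demo_db()
print(validate(db, "phone=5551212&pin=1234"))
```

How the result is reported back to the VoiceXML script is not described in the slides, so the sketch simply returns the conference id or `None`.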
  11. 12. Video-enabled Media Server
       • Video-enabled media servers are available from many vendors
       • Select a media server that supports Video IVR and Video Conferencing
           • Video codecs: H.263 and H.264 @ CIF resolution (352x288)
       • Video Conferencing mode should have at least:
           • Manual control
           • Automated control (e.g., follow-me)
       • Audio mixing should provide:
           • Audio codecs: G.711 μ-law/A-law, G.729, AMR
           • Audio mixing without introducing echo
           • Noise reduction
           • Packet loss concealment algorithm
  12. 13. Mobicents – a Telecom Framework
       • Mobicents is an open source JSLEE container
           • JSLEE is a Java-based framework for real-time apps
           • JSLEE is to telecom what J2EE is to business apps
           • Mobicents is written by some JBoss developers
       • Mobicents provides:
           • Soft real-time event routing
           • SIP stack
           • Traces, logs, alarms
  13. 14. Mobicents - Internals
  14. 15. Video Conference Application – Software Components
  15. 16. IVR Service
       • IVR service provides a high-level API for playback and digit collection
           • Hides details of SIP and media server protocols
           • Isolates applications from the media server protocol
               • e.g., if the MS protocol changes from VoiceXML to MSCML, only the IVR service must change
       • IVR service is implemented as a JSLEE Service Building Block (SBB)
  16. 17. IVR Service - API
       • Simple API hides IVR complexity
       • Instantiate and send an event to IvrSBB
       • Events supported:
           • CreateConnection
           • Play
           • PlayCollect
           • Release
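The event-based shape of this API can be illustrated with a small sketch. It is written in Python for brevity and is not the JSLEE or Mobicents API: only the four event names come from the slide, while every field, class, and method name below is a hypothetical stand-in. The point it shows is the isolation described on the previous slide: callers fire events, and only the backend knows the media server protocol.

```python
from dataclasses import dataclass


@dataclass
class CreateConnection:
    call_id: str


@dataclass
class Play:
    call_id: str
    prompt_url: str


@dataclass
class PlayCollect:
    call_id: str
    prompt_url: str
    max_digits: int


@dataclass
class Release:
    call_id: str


class VoiceXmlBackend:
    """Stand-in for the protocol-specific layer; a real one would drive SIP and VoiceXML."""

    def handle(self, event) -> str:
        return f"VoiceXML backend handled {type(event).__name__} for {event.call_id}"


class IvrService:
    """Accepts the four IVR events and forwards them to whichever backend is plugged in."""

    SUPPORTED = (CreateConnection, Play, PlayCollect, Release)

    def __init__(self, backend):
        self.backend = backend

    def fire(self, event) -> str:
        if not isinstance(event, self.SUPPORTED):
            raise TypeError(f"unsupported event: {type(event).__name__}")
        return self.backend.handle(event)


ivr = IvrService(VoiceXmlBackend())
print(ivr.fire(PlayCollect("call-1", "http://as/enter-pin.wav", max_digits=4)))
```

Swapping `VoiceXmlBackend` for, say, an MSCML backend would leave the calling code unchanged, which is the design benefit the slides claim for the IVR service.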
  17. 18. VoiceXML – Using video extensions
       • Playing video clips using VoiceXML 2.0
           <audio src="http://as/">
             <audio src="http://as/" />
           </audio>
           • Two video clips are provided: one for H.263 video clients, the other for H.264 video clients
           • The example uses the VoiceXML fallback-audio feature to support both codecs
           • The VoiceXML interpreter tries each video clip in the list until it finds one that is compatible with the video codec of the remote device
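The nesting pattern above generalizes to any number of clips: each `<audio>` element holds the next candidate as its fallback content. The following sketch generates that markup from an ordered list of clip URLs. The clip file names are hypothetical, since the URLs on the slide are cut off after `http://as/`, and the function name is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List


def fallback_audio(clip_urls: List[str]) -> str:
    """Nest <audio> elements so each clip is the fallback content of the previous one."""
    if not clip_urls:
        raise ValueError("at least one clip URL is required")
    root = ET.Element("audio", {"src": clip_urls[0]})
    parent = root
    for url in clip_urls[1:]:
        # The child element is only rendered if the parent's src cannot be played.
        parent = ET.SubElement(parent, "audio", {"src": url})
    return ET.tostring(root, encoding="unicode")


# Hypothetical clip names: one encoded with H.263, one with H.264.
print(fallback_audio(["http://as/welcome-h263.3gp", "http://as/welcome-h264.3gp"]))
```

The order of the list is the order in which the interpreter would try the clips, so the most widely supported codec would normally go first.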
  18. 19. Muting a participant using RFC 3264
       • The conference leader can mute a participant
       • This is achieved by the Video Conf App sending a SIP reINVITE with SDP containing “a=sendonly” to the media server:
           v=0
           o=Caller 10 20 IP4
           s=Participant
           c=IN IP4
           t=0 0
           m=audio 5004 RTP/AVP 0
           a=rtpmap:0 PCMU/8000
           a=sendonly
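A reINVITE body like the one above can be assembled programmatically. The sketch below is one possible way to do it, not the application's code. The addresses are missing from the slide, so the `address` parameter and the documentation address `192.0.2.10` are assumptions, as is the complete `o=` line layout (taken from the SDP specification). Note that RFC 3264 interprets the direction attribute from the offerer's point of view; the sketch takes it as a parameter and defaults to the value shown on the slide.

```python
VALID_DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def audio_sdp(address: str, port: int = 5004, direction: str = "sendonly") -> str:
    """Build a single-stream audio SDP body with the given direction attribute."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    lines = [
        "v=0",
        # Assumption: "IN IP4 <address>" completes the origin line shown on the slide.
        f"o=Caller 10 20 IN IP4 {address}",
        "s=Participant",
        f"c=IN IP4 {address}",
        "t=0 0",
        f"m=audio {port} RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        f"a={direction}",
    ]
    # SDP lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


# 192.0.2.10 is a documentation-only address used as a placeholder.
print(audio_sdp("192.0.2.10"))
```

Un-muting would be another reINVITE with `direction="sendrecv"`; a real implementation would also increment the session version in the `o=` line each time the description changes.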
  19. 20. Manual Control of the Video Feed
       • The conference leader can manually control the video feed displayed to all participants
           • This is achieved by turning off the video source of all participants except one
           • Send a reINVITE with SDP containing “a=sendonly” applied to video:
           v=0
           o=Caller 10 20 IP4
           s=Participant
           c=IN IP4
           t=0 0
           m=audio 5004 RTP/AVP 0
           a=rtpmap:0 PCMU/8000
           a=sendrecv
           m=video 5006 RTP/AVP 98
           a=rtpmap:98 H264/90000
           a=sendonly
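Selecting one video source amounts to computing a per-participant direction for the video stream while leaving audio untouched. The sketch below shows that bookkeeping. It assumes, beyond what the slide states, that the selected participant keeps `a=sendrecv` on video; the participant names and function names are hypothetical.

```python
from typing import Dict, List


def video_directions(participants: List[str], selected: str) -> Dict[str, str]:
    """Map each participant to the video direction used in that participant's reINVITE."""
    if selected not in participants:
        raise ValueError(f"{selected} is not in the conference")
    # Assumption: the selected source stays sendrecv; everyone else gets the
    # "a=sendonly" video attribute shown on the slide.
    return {p: ("sendrecv" if p == selected else "sendonly") for p in participants}


def media_sections(video_direction: str) -> List[str]:
    """Media-level SDP lines: audio stays sendrecv, only the video direction varies."""
    return [
        "m=audio 5004 RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        "a=sendrecv",
        "m=video 5006 RTP/AVP 98",
        "a=rtpmap:98 H264/90000",
        f"a={video_direction}",
    ]


plan = video_directions(["alice", "bob", "carol"], selected="bob")
for name, direction in plan.items():
    print(name, media_sections(direction)[-1])
```

Because the direction attribute is set per media section, audio mixing continues for everyone while only the video feed is switched.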
  20. 21. Putting it all Together
       • VoiceXML provides the user interface to the video conference
       • Mobicents provides an easy-to-use real-time framework for telecom applications
           • Mobicents hides SIP complexity
       • Building the business logic for a video conferencing application is no longer difficult
       • Low-cost video phones and softclients make this solution possible
       • The entire solution can be deployed in small and medium businesses