CONTENTS Introduction History Usage Sample Program VoiceXml Inputs Advantages of VoiceXml Reference Conclusion
Introduction
VoiceXML (VXML) is the W3C's standard XML format for specifying interactive voice dialogues between a human and a computer. It allows voice applications to be developed and deployed in an analogous way to HTML for visual applications. Just as HTML documents are interpreted by a visual web browser, VoiceXML documents are interpreted by a voice browser.
A common architecture is to deploy banks of voice browsers attached to the Public Switched Telephone Network (PSTN) to allow users to interact with voice applications over the telephone.
History
AT&T, IBM, Lucent, and Motorola formed the VoiceXML Forum in March 1999 to develop a standard markup language for specifying voice dialogs. In March 2000 they published VoiceXML 1.0. The W3C then produced several intermediate versions of VoiceXML 2.0, which reached the final "Recommendation" stage in March 2004.
Usage
Many commercial VoiceXML applications have been deployed, processing millions of telephone calls per day. These applications include order inquiry, package tracking, driving directions, emergency notification, flight tracking, voice access to email, audio news magazines, and voice dialing.
VoiceXML has tags that instruct the voice browser to provide speech synthesis, automatic speech recognition, dialog management, and audio playback. The following is an example of a VoiceXML document:

<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        Hello world!
      </prompt>
    </block>
  </form>
</vxml>

When interpreted by a VoiceXML interpreter, this document outputs "Hello world" as synthesized speech. Typically, HTTP is used as the transport protocol for fetching VoiceXML pages. Some applications use static VoiceXML pages, while others rely on dynamic VoiceXML page generation using an application server such as Tomcat, WebLogic, IIS, or WebSphere.
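To illustrate how a VoiceXML page can hand control to a dynamically generated page over HTTP, here is a minimal sketch. It assumes a VoiceXML 2.0 interpreter that supports the built-in "digits" field type. The form id, the field name, and the URL http://example.com/track are hypothetical and not part of the original material.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="track_package">
    <!-- Assumption: type="digits" is a built-in grammar; support for
         built-in types varies between platforms. -->
    <field name="tracking_number" type="digits">
      <prompt>Please say or key in your tracking number.</prompt>
    </field>
    <block>
      <!-- Hypothetical URL. The interpreter sends the collected value over
           HTTP, and the application server is expected to answer with a
           newly generated VoiceXML document. -->
      <submit next="http://example.com/track"
              namelist="tracking_number" method="get"/>
    </block>
  </form>
</vxml>
```

The interpreter visits the field first and runs the block once the field is filled, so the request goes out only after the caller has supplied a number.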
VoiceXML Inputs
VoiceXML accepts two kinds of input:
- Touch-tone (DTMF) key presses
- Speech
VoiceXML is not the same as general-purpose voice recognition. Personal voice recognition systems (e.g. Dragon NaturallySpeaking or IBM ViaVoice) allow a wide grammar but restrict the number of users. VoiceXML restricts the grammar but allows a wide range of users.
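The restricted-grammar approach described above can be sketched as a single field that accepts either speech or touch-tone input. This is a minimal sketch, assuming a VoiceXML 2.0 interpreter that supports inline SRGS XML grammars. The form id, field name, rule ids, and prompts are illustrative and not taken from the original.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="confirm_order">
    <field name="answer">
      <prompt>Say yes or no, or press 1 for yes and 2 for no.</prompt>

      <!-- Speech input: the grammar is restricted to two words. -->
      <grammar mode="voice" version="1.0" root="yesno" xml:lang="en-US">
        <rule id="yesno" scope="public">
          <one-of>
            <item>yes</item>
            <item>no</item>
          </one-of>
        </rule>
      </grammar>

      <!-- Touch-tone input: only the keys 1 and 2 are accepted. -->
      <grammar mode="dtmf" version="1.0" root="keys">
        <rule id="keys" scope="public">
          <one-of>
            <item>1</item>
            <item>2</item>
          </one-of>
        </rule>
      </grammar>

      <!-- Handlers for silence and for input outside the grammars. -->
      <noinput>I did not hear anything. <reprompt/></noinput>
      <nomatch>Sorry, I did not understand. <reprompt/></nomatch>

      <filled>
        <!-- Assumption: with no semantic tags in the grammars, the field
             holds the raw matched text ("yes", "no", "1" or "2"). -->
        <if cond="answer == 'yes' || answer == '1'">
          <prompt>Your order is confirmed.</prompt>
        <else/>
          <prompt>Your order was not confirmed.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Because each grammar lists only two alternatives, the recognizer does not need to be trained on an individual caller, which is the trade-off the section describes.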
Advantages of VoiceXML
- Allows for the easy implementation of voice interfaces
- Removes the restrictions imposed by tools designed for touch-tone systems
- Runs on existing web infrastructure
Future versions of the standard
VoiceXML 3.0 will be the next major release of VoiceXML, with major new features. It includes a new XML statechart description language called SCXML.
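For readers unfamiliar with statecharts, the sketch below shows what a small SCXML document looks like. It follows the standalone SCXML format published by the W3C. The state ids and event names are hypothetical, and how VoiceXML 3.0 would embed or reference such a document is not covered here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0"
       initial="greeting">
  <!-- Hypothetical call flow: greet the caller, collect input, finish. -->
  <state id="greeting">
    <transition event="greeting.done" target="collecting"/>
  </state>
  <state id="collecting">
    <!-- Return to the same state when the input is not understood. -->
    <transition event="input.nomatch" target="collecting"/>
    <transition event="input.filled" target="finished"/>
  </state>
  <final id="finished"/>
</scxml>
```

Each state lists the events it reacts to and the state to enter next, so the dialog flow is described separately from the prompts and grammars.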