A Glimpse of Voice Technology

This presentation covers speech IVR technology, the basics of IVR, VoiceXML, and how to develop an IVR application.

  1. A Glimpse of Voice Technology. By: Vishad Garg, Momentum India Pvt. Ltd. (vishadg@momentum-tech.com, vishadgarg@gmail.com, 91-9611077772). September 12, 2001. January, 2002. Momentum Confidential.
  2. Agenda: Automated Voice Processing; Voice Portal; VoiceXML; Voice Portal at Work.
  3. Definition: “Automated Voice Processing is the act of answering, routing, and handling phone calls with a computer-based system. The call processing system answers and processes calls according to the needs of the caller and the person and/or company being called.”
  4. Applications of Voice Processing: Interactive Voice Response (IVR); Voice Mail; Automatic Call Distribution (ACD); Audiotext; Predictive Dialer; Voice Portal.
  5. Interactive Voice Response (IVR): “IVR systems facilitate people-to-computer/database communications. They automate the handling of calls by interacting with one or more online databases.” An IVR system works on the premise of: data capture; information delivery; a Computer Telephony Integration (CTI) link.
  6. Voice Mail: “Voice mail enhances people-to-people communication. Voice mail is an umbrella covering a variety of automated voice processing features including voice mailboxes for storing and forwarding messages, voice menus for routing and responding to calls, recorded announcements for selectively disseminating information, and information access to databases.”
  7. Automatic Call Distribution (ACD): “ACD facilitates the distribution of incoming calls, based upon some algorithm, to a group of people (agents) who can answer the calls. It uses ANI and DNIS to do so.”
  8. Audiotext: “Audiotext is a service that allows callers to access prerecorded information on a topic of interest to them. It allows multiple callers to retrieve recorded announcements containing information that would otherwise have been given by a person. The information retrieved is general and not specific to each caller.”
  9. Predictive Dialer: “A predictive dialer launches calls and monitors their progress. Only connected calls are passed to agents.”
  10. Agenda: Automated Voice Processing; Voice Portal; VoiceXML; Voice Portal at Work.
  11. Definition: “The convergence of the richness of the web and the accessibility of the phone is forming a vast new network, a voice portal, where Internet content can be accessed from any phone, anywhere, using the human voice.” “Speech-enabled access to web-based information.”
  12. Voice Portal vs. Web Portal: “Leverages the Internet for application development and delivery.” Phone instead of PC; VoiceXML instead of HTML; a voice browser instead of an ordinary web browser.
  13. Why bring the Internet to voice applications? A standard language enables portability. A high-level, domain-specific language simplifies application development. Voice and web applications can be consolidated. The cost of creating a speech-based portal platform continues to decline. The Internet has raised public expectations, with people growing used to having information at their fingertips when they want it. Once people get accustomed to immediate news, weather reports, or stock quotes over the Internet, the transition to the phone makes perfect sense.
  14. Voice Portal Key Components: Automatic Speech Recognition (ASR); Voice Browser; Text-To-Speech (TTS); VoiceXML.
  15. Automatic Speech Recognition: Automatic Speech Recognition (ASR) is the technology that allows a machine to understand human speech. It takes human speech input, digitizes it, and converts it into a machine-readable string of text. A component called a recognizer then manipulates the text into a form that the recognizer uses to identify what the speaker said.
  16. Voice Browser/Interpreter: A document server processes requests from a client application, the VoiceXML interpreter. The server produces a VXML document in reply, which is processed by the VoiceXML interpreter. The VoiceXML interpreter is responsible for detecting an incoming call, acquiring the initial VoiceXML document, and answering the call. (Diagram labels: Document Server, VXML Browser, Implementation Platform.)
  17. Text-To-Speech (TTS): TTS converts text string input to spoken output. TTS is increasingly being used to speak e-mail and Web-based text to callers.
  18. Agenda: Automated Voice Processing; Voice Portal; VoiceXML; Voice Portal at Work.
  19. What is VXML: Voice Extensible Markup Language. A language for specifying voice/audio dialogs. Voice dialogs use audio prompts and text-to-speech (TTS) for output, and touch-tone keys (DTMF) and automatic speech recognition (ASR) for input. The main input/output device (initially) is the phone.
  20. Goal of VXML: Bring the full power of web development and content delivery to voice response applications. Shield authors from low-level programming and platform-specific details. Enable integration of voice services with data services using the client-server paradigm. A voice service is viewed as a sequence of interaction dialogs between a user and an implementation platform.
  21. Scope of VXML: Output of synthesized speech; output of audio files; recognition of spoken input; recognition of DTMF input; recording of spoken input; telephony features such as call transfer and disconnect.
  22. VXML Concepts: Application; Dialog/Sub-dialog; Session; Grammar; Events.
  23. Application: A set of documents sharing the same application root document. The root document's variables and grammars remain available when transitioning to another document. (Diagram: Root with documents D1, D2, D3.)
  24. Dialog/Sub-dialog: A dialog is an interaction with the user, that is, it plays a prompt or a menu and gets some input. There are two kinds of dialogs, ‘forms’ and ‘menus’. A sub-dialog is like a function call. Sub-dialogs can be used for database queries.
  25. Session: A session begins when the user starts to interact with a VoiceXML interpreter, continues as documents are loaded and processed, and ends when requested by the user.
  26. Grammar: A grammar is a set of phrases that a caller is expected to say during a dialog in response to a particular prompt. A grammar can be as simple as “yes” versus “no” or as large as a list of all the names of people living in a city. A grammar file is a text file with the file extension .grammar.
  27. Events: VXML defines a mechanism for handling events not covered by the form mechanism. Events are thrown by the platform under a variety of circumstances: the user does not respond, the response is not recognized, the user asks for help, etc. Events are caught by catch elements.
  28. Agenda: Automated Voice Processing; Voice Portal; VoiceXML; Voice Portal at Work.
  29. Momentum Voice Portal Development Services: Momentum provides voice portal development services using the latest and preeminent speech-recognition and text-to-speech technology, including Nuance, SpeechWorks, and Fonix.
  30. How We Do It? Requirement Analysis; Prototype; VUI Design; Application Development; Testing; Deployment.
  31. Momentum Travel Voice Portal: Momentum has developed a voice portal demo application, Momentum Travel Voice Portal (MTVP). The MTVP provides a voice user interface for purchasing and reserving travel packages.
  32. Nuance in MTVP: Momentum is using the complete suite of Nuance voice technology, which includes: Nuance 7.0.3 for voice recognition, call control, and recording of prompts; V-Builder for developing the voice user interface (VUI) that defines the flow of interaction; Grammar-Builder to write grammars that represent valid responses; Nuance Speech Objects, a set of reusable components implemented as Java beans. VXML is used as the development language for the VUI.
  33. Demo: To try the MTVP demo, dial either of the following phone numbers in the US: (800) 303-9987 or (415) 869-6909. When the system asks you to enter a PIN, you can dial one of the following PINs: 823272, 823273, or 823274.
  34. Future Plans: We are also planning to embark upon voice-driven e-commerce applications, i.e. V-Commerce, Voice-Enabled Intranet, and Unified Messaging.
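
Slide 16 describes a document server that replies to the interpreter with a VXML document, and slide 24 names the form as one kind of dialog. The following is a minimal sketch of what such a server might generate. It assumes VoiceXML 2.0 element names and namespace (the deck does not say which version it targets); the form id, field name, prompt wording, and the helper names `q` and `build_pin_form` are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # namespace used by VoiceXML 2.0 (assumed version)


def q(tag: str) -> str:
    """Qualify a tag name with the VoiceXML namespace."""
    return f"{{{VXML_NS}}}{tag}"


def build_pin_form(prompt_text: str) -> str:
    """Build a one-field VoiceXML form (hypothetical) and return it as text."""
    ET.register_namespace("", VXML_NS)  # serialize without a namespace prefix
    vxml = ET.Element(q("vxml"), {"version": "2.0"})
    form = ET.SubElement(vxml, q("form"), {"id": "login"})
    # type="digits" selects a built-in grammar for spoken or DTMF digits.
    field = ET.SubElement(form, q("field"), {"name": "pin", "type": "digits"})
    ET.SubElement(field, q("prompt")).text = prompt_text
    # <noinput> and <nomatch> are shorthand catch elements for those two events.
    ET.SubElement(field, q("noinput")).text = "Sorry, I did not hear you."
    ET.SubElement(field, q("nomatch")).text = "Sorry, I did not understand."
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_pin_form("Please enter your PIN."))
```

A real document server would return this text over HTTP in response to the interpreter's request; the sketch only covers building a well-formed document.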

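Slides 26 and 27 describe grammars as the set of phrases a caller is expected to say, and events as what the platform throws when the caller stays silent, is not recognized, or asks for help. The toy sketch below shows how those two ideas might fit together. It is not how a real recognizer works: matching is plain string comparison, and the function names and handler messages are hypothetical. Only the event names `noinput`, `nomatch`, and `help` are taken from VoiceXML.

```python
from typing import Dict, Optional, Set

# The simplest grammar mentioned on slide 26: "yes" versus "no".
YES_NO_GRAMMAR: Set[str] = {"yes", "no"}

# Hypothetical catch handlers, keyed by VoiceXML event name.
CATCH_HANDLERS: Dict[str, str] = {
    "noinput": "Sorry, I did not hear anything.",
    "nomatch": "Sorry, I did not understand.",
    "help": "Please say yes or no.",
}


def classify(utterance: Optional[str], grammar: Set[str]) -> str:
    """Return 'filled' on a grammar match, else the event a platform might throw."""
    if utterance is None or not utterance.strip():
        return "noinput"  # the caller did not respond
    phrase = utterance.strip().lower()
    if phrase == "help":
        return "help"  # the caller asked for help
    if phrase in grammar:
        return "filled"  # the response matched the grammar
    return "nomatch"  # the response was not recognized


def respond(utterance: Optional[str], grammar: Set[str] = YES_NO_GRAMMAR) -> str:
    """Return what the dialog would say next for one caller turn."""
    outcome = classify(utterance, grammar)
    if outcome == "filled":
        return f"You said {utterance.strip().lower()}."
    return CATCH_HANDLERS[outcome]  # plays the role of a catch element
```

For example, `respond(None)` returns the `noinput` message and `respond("maybe")` returns the `nomatch` message.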
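
Slide 7 says an ACD distributes incoming calls to a group of agents "based upon some algorithm", using ANI (the calling number) and DNIS (the dialed number). The deck does not name the algorithm, so the sketch below assumes one possibility: DNIS selects an agent group, a known ANI may be mapped to a preferred agent, and all other calls are handed out round-robin. The class name, agent names, and phone numbers are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Distributor:
    """Toy ACD: DNIS picks the agent group, ANI may pick a preferred agent."""

    def __init__(
        self,
        groups_by_dnis: Dict[str, List[str]],
        preferred_by_ani: Optional[Dict[str, str]] = None,
    ) -> None:
        # One rotating queue of agents per dialed number.
        self._queues: Dict[str, Deque[str]] = {
            dnis: deque(agents) for dnis, agents in groups_by_dnis.items()
        }
        self._preferred: Dict[str, str] = dict(preferred_by_ani or {})

    def route(self, ani: str, dnis: str) -> str:
        """Return the agent who should take a call from `ani` to `dnis`."""
        agents = self._queues[dnis]  # raises KeyError for an unknown DNIS
        preferred = self._preferred.get(ani)
        if preferred in agents:
            return preferred  # known caller goes to the same agent
        agent = agents[0]
        agents.rotate(-1)  # round-robin: move this agent to the back
        return agent
```

With `Distributor({"5550100": ["alice", "bob"]})`, three successive calls to `route("5550199", "5550100")` would go to alice, bob, and alice again. A production ACD would also track agent availability, which this sketch leaves out.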