Topic: Voice Browser
Registration #: 2013-ag-6094
Name: Javaria Kanwal
Supervisor: Miss Uzma Satter
Submitted to
Prof. Akmal Rehan
VOICE BROWSER
WHAT IS A VOICE BROWSER
A voice browser is a device that:
• interprets voice input and voice markup languages
to generate voice output;
• interprets a script which specifies exactly what to verbally
present to the user, as well as when to present each piece of
information.
MOTIVATION
• There are 10 times as many telephones as connected PCs.
• Cell phone usage is growing dramatically.
• Speaking and listening are natural modes of communication.
• Voice is easy to use, even for people with no knowledge of, or a fear
of, computers.
• Voice interaction can escape the physical limitations of keypads and
displays as mobile devices become ever smaller.
KEY TECHNOLOGIES
• Speech Recognition
Voice input → VoXML file → Text
• Speech Synthesis
Text → VoXML file → Output (pre-recorded)
WORLD WIDE WEB CONSORTIUM (W3C)
The World Wide Web Consortium (W3C) develops
interoperable technologies (software and tools) to
lead the web to its full potential as a forum for
information, commerce and communication.
• W3C Speech Interface Framework
  • VoiceXML
  • Speech recognition
  • Call control
VOICEXML
VoiceXML is a dialog markup language designed for
telephony applications, where users are restricted to voice
and DTMF (touch-tone) input.
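As a rough illustration of what such a dialog looks like, the sketch below uses Python to generate a minimal VoiceXML 2.0 document with one form and one field. The form id, field name, and prompt wording are hypothetical; the `digits` field type is one of the built-in VoiceXML grammar types, which is why the same field can be filled by voice or by DTMF keys.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_pin_dialog() -> str:
    """Build a minimal VoiceXML document that asks the caller for a PIN."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    # "pin_form" and "pin" are hypothetical names chosen for this sketch.
    form = ET.SubElement(vxml, "form", {"id": "pin_form"})
    # type="digits" selects a built-in grammar that accepts spoken or keyed digits.
    field = ET.SubElement(form, "field", {"name": "pin", "type": "digits"})
    ET.SubElement(field, "prompt").text = "Please say or key in your PIN."
    # <filled> runs once the field has a value.
    filled = ET.SubElement(field, "filled")
    ET.SubElement(filled, "prompt").text = "Thank you."
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_pin_dialog())
```

A voice browser would fetch a document like this from a web server, speak the prompt, and wait for the caller's answer.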
SPEECH RECOGNITION
SPEECH GRAMMAR
• Speech grammars allow authors to specify rules
covering the sequences of words that users are expected
to say in a particular context.
• These contextual clues allow the recognition engine to
focus on likely utterances, improving the chances of a
correct match.
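A minimal sketch of the idea, assuming a hypothetical pizza-ordering context: the grammar below accepts only utterances of the form "[a] SIZE TOPPING pizza" and rejects everything else. The word lists and the function name `match_order` are illustrative and not taken from any real grammar format.

```python
from typing import Optional

# Hypothetical word lists: the only words allowed in each slot.
SIZES = {"small", "medium", "large"}
TOPPINGS = {"cheese", "mushroom", "pepperoni"}


def match_order(utterance: str) -> Optional[dict]:
    """Return the filled slots, or None if the utterance is outside the grammar."""
    words = utterance.lower().split()
    # Rule: an optional leading "a", then SIZE, then TOPPING, then "pizza".
    if words and words[0] == "a":
        words = words[1:]
    if (
        len(words) == 3
        and words[0] in SIZES
        and words[1] in TOPPINGS
        and words[2] == "pizza"
    ):
        return {"size": words[0], "topping": words[1]}
    return None


if __name__ == "__main__":
    print(match_order("a large cheese pizza"))
    print(match_order("how can I help you"))
```

Because the recognizer only has to choose among a handful of word sequences, it is far less likely to mishear the caller than if it had to consider the whole language.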
STOCHASTIC (N-GRAM) LANGUAGE MODELS
• Speech grammars are of little use for open-ended
prompts, e.g. "How can I help you?"
• The solution is to use stochastic language
models. Such models specify the probability that
one word occurs following certain others. The
probabilities are computed from a collection of
utterances gathered from many users.
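The simplest case is a bigram model, where the probability of a word depends only on the word before it. The sketch below estimates these probabilities by counting word pairs; the three-utterance corpus and the function names are made up for illustration, and a real model would be trained on far more data and would smooth the counts.

```python
from collections import Counter, defaultdict


def train_bigrams(utterances):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for utterance in utterances:
        # "<s>" marks the start of an utterance.
        words = ["<s>"] + utterance.lower().split()
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts


def probability(counts, prev, word):
    """Estimate P(word | prev) as a relative frequency (no smoothing)."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[word] / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical utterances collected from callers.
    corpus = ["check my balance", "check my account", "pay my bill"]
    model = train_bigrams(corpus)
    print(probability(model, "check", "my"))
    print(probability(model, "my", "balance"))
```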
SEMANTIC INTERPRETATION
The recognition process matches an utterance to a
speech grammar, building a parse tree as a
byproduct.
There are two approaches to harvesting semantic
results from the parse tree:
1. Annotating grammar rules with semantic
interpretation tags
2. Representing the results in XML
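Both approaches can be sketched together, under simplifying assumptions: a flat lookup table stands in for the parse tree, each grammar word carries a hypothetical (slot, value) tag, and the collected slots are then written out as XML. The tag table, the slot names, and the `<result>` element are illustrative only and do not follow any particular W3C result format.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic tags attached to grammar words: word -> (slot, value).
TAGS = {
    "boston": ("city", "BOS"),
    "denver": ("city", "DEN"),
    "today": ("day", "0"),
    "tomorrow": ("day", "+1"),
}


def interpret(utterance: str) -> dict:
    """Collect the semantic tags of every tagged word in the utterance."""
    result = {}
    for word in utterance.lower().split():
        if word in TAGS:
            slot, value = TAGS[word]
            result[slot] = value
    return result


def to_xml(result: dict) -> str:
    """Represent the interpretation as a small XML document."""
    root = ET.Element("result")
    for slot, value in result.items():
        ET.SubElement(root, slot).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    slots = interpret("fly to Boston tomorrow")
    print(slots)
    print(to_xml(slots))
```

The application then works with the slot values ("BOS", "+1") rather than with the caller's raw words.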
CALL CONTROL
• Provides fine-grained control of speech (signal processing)
resources and telephony resources in a VoiceXML
telephony platform.
• Will enable application developers to use markup to
perform call screening, whisper call waiting, call
transfer, and more.
• Can be used to transfer a user from one voice
browser to another on a completely different
machine.
APPLICATIONS
Voice browser applications can be divided into three categories:
• Web Browsing
• Limited Information Access
• Spoken Dialog Systems
FUTURE
• Voice browsing will become visual (multi-modal).
• It can be integrated into an OS.
• It can be integrated into every application.
CONCLUSION
• Browser technology is changing very fast these
days, and we are moving from the visual paradigm
to the voice paradigm.
• The voice browser is the technology for entering this
paradigm.
• A voice browser is a device which interprets voice
input and generates voice output.
