Submitted by:
Arijit Chakraborty
(BETI1EC16002)
Submitted to:
Dr. Ranjeet Singh Tomar
Professor, ECE Dept.
CONTENTS
• ABSTRACT
• INTRODUCTION
• MOTIVATION
• BLOCK DIAGRAM
• WORKING PRINCIPLE
• PROPOSED METHODOLOGY
• ADVANTAGES AND DISADVANTAGES
• CONCLUSION
ABSTRACT
Voice-based web access is a rapidly developing
technology, and PHONET is one solution in this area.
PHONET makes information accessible to users who may
not be able to read or write, or who do not have
access to the Internet.
• Unlike a computer interface, a voice interface needs
no keyboard and no mouse, removing these barriers
to Internet access.
• It requires no training.
• It is accessible to anyone with a telephone.
INTRODUCTION
• PHONET combines several complex technologies:
Speech Recognition (SR), Text-to-Speech (TTS)
conversion, and Artificial Intelligence (AI).
• SR, TTS, and AI are integrated into an intelligent
platform (PHONET) that provides voice-based web
access, which involves document processing and
document rendering.
• Document processing follows two approaches:
telephone browsing and transcoding.
• In document rendering, the major problem is
text rendering.
• Companies that deliver content such as news,
weather, horoscopes, and stock quotes over the
phone are called "Voice Portals".
• Voice portals were the first web applications that
tried to integrate websites with voice.
• With the Voice Internet technology PHONET,
anyone can surf, search, send and receive email, and
conduct e-commerce transactions.
• PHONET technology is faster and cheaper.
• The PHONET platform acts as an "Intelligent
Agent" (IA) located between the user and the
Internet.
• The IA automates the process of rendering
information from the Internet to the user in a
meaningful, precise, and pleasant-to-hear audio
format.
MOTIVATION
• When we are in the car or away from the office or
computer, accessing the Web is difficult.
• An increasing number of people prefer an interface
that allows them to hear and speak rather than see
and click or type.
• Some existing Internet users have also identified
problems with the visual Internet experience.
PROPOSED SYSTEM
Fig1:- PROPOSED PLAN FOR THE WORKING OF PHONET
WORKING PRINCIPLE
• Subscribers dial a toll-free number and start
accessing the Internet using voice commands.
• Speech recognition technology in the
company's system allows users to give simple
commands, such as "go to Google" or "read my
email", to get to the Net-based information they
want.
• When the user sends a request to access the
Internet, the request goes to the voice browser.
• If the request is spoken, speech recognition
converts the voice into text.
• Users will be able to quickly locate information
such as breaking news, traffic reports,
directions, or anything else of interest on the
World Wide Web.
• Using text-to-speech technology, an "intelligent agent"
will read the requested information out loud via a
computerized voice, and process the user's voice
commands.
Fig2:- INTELLIGENT AGENT AND KEY FEATURES OF PHONET
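The command handling described in the working principle can be sketched as follows. This is a minimal illustration, not PHONET's actual implementation: it assumes the speech recognizer has already produced text, and the names `Intent`, `parse_command`, and `respond` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Intent(NamedTuple):
    action: str            # e.g. "open_site" or "read_email" (hypothetical labels)
    target: Optional[str]  # site name for "open_site", otherwise None


def parse_command(recognized_text: str) -> Intent:
    """Map text produced by the speech recognizer to an intent."""
    text = recognized_text.strip().lower()
    match = re.fullmatch(r"go to (.+)", text)
    if match:
        return Intent("open_site", match.group(1))
    if text == "read my email":
        return Intent("read_email", None)
    # Anything else is reported back so the agent can ask the user to repeat.
    return Intent("unknown", None)


def respond(intent: Intent) -> str:
    """Build the sentence that a TTS engine would read to the caller."""
    if intent.action == "open_site":
        return f"Opening {intent.target}."
    if intent.action == "read_email":
        return "Reading your email."
    return "Sorry, I did not understand. Please repeat the command."
```

For example, `respond(parse_command("Go to Google"))` yields the sentence the agent would speak before fetching the site. A real system would replace the two fixed patterns with a recognition grammar.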
The technologies employed are:
1. Speech processing
2. Text-to-speech translation
3. Artificial Intelligence
4. Language translation
1. SPEECH PROCESSING
• Speech processing is the study of speech signals and
of the methods used to process them.
• The speech signals are processed in a digital representation.
Fig3:- BASIC FUNCTION OF SPEECH PROCESSING AND DIGITAL
SPEECH PROCESSING REPRESENTATION
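To make "digital representation" concrete, the sketch below samples a tone and quantizes it to integers. The 8 kHz rate and 8-bit depth are assumptions chosen because they are common in telephony; the slides do not state which values PHONET uses, and the function names are hypothetical.

```python
import math


def sample_tone(freq_hz: float, duration_s: float, rate_hz: int = 8000) -> list[float]:
    """Sample a sine tone; amplitudes lie in [-1.0, 1.0]."""
    n = round(duration_s * rate_hz)  # number of samples to take
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]


def quantize(samples: list[float], bits: int = 8) -> list[int]:
    """Map amplitudes in [-1.0, 1.0] to signed integers of the given width."""
    max_level = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return [round(s * max_level) for s in samples]


# One millisecond of a 1 kHz tone at 8 kHz gives 8 integer samples.
tone = sample_tone(freq_hz=1000, duration_s=0.001)
digital = quantize(tone)
```

The list `digital` is the kind of integer sequence that later stages, such as a speech recognizer, would operate on.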
2. TEXT TO SPEECH TRANSLATION
• Text-to-speech translation is the process of
converting text data into speech signals.
Fig4:- TEXT TO SPEECH CONVERSION
3. ARTIFICIAL INTELLIGENCE
• It is the intelligence exhibited by software.
Fig5:- ARTIFICIAL INTELLIGENCE
• PHONET offers the promise of allowing everyone to
access web-based services from any phone.
• Users will be able to choose whether to respond by a
key press or a spoken command.
• The main plan is:
1. Accept the voice commands
2. Output in audio format
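The choice between a key press and a spoken command can be sketched as a menu in which both inputs resolve to the same action. The menu entries, key assignments, and function names below are hypothetical and only illustrate the idea.

```python
from typing import Optional

# Hypothetical menu: each action can be chosen by a keypad digit or a spoken word.
MENU = {
    "news": {"key": "1", "spoken": "news"},
    "email": {"key": "2", "spoken": "email"},
    "weather": {"key": "3", "spoken": "weather"},
}


def resolve_choice(user_input: str) -> Optional[str]:
    """Return the menu action for a key press or recognized word, else None."""
    token = user_input.strip().lower()
    for action, option in MENU.items():
        if token in (option["key"], option["spoken"]):
            return action
    return None


def menu_prompt() -> str:
    """Text that a TTS engine would read out to present the menu."""
    parts = [f"press {o['key']} or say {o['spoken']} for {a}" for a, o in MENU.items()]
    return "Please " + ", ".join(parts) + "."
```

Here `resolve_choice("2")` and `resolve_choice("email")` select the same action, so the rest of the system does not need to know which input method the caller used.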
4. LANGUAGE TRANSLATION
• The IA includes a language translation engine that
dynamically translates web content from one
language into another in real time.
• Thus, a Hindi-speaking person can ask in Hindi to
surf an English website - the Intelligent Agent
would access the English website, extract its
content, translate it into Hindi, and
read it back to the user in Hindi.
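The extract-then-translate step can be sketched with the Python standard library. This is an illustration under stated assumptions: `translate` stands in for a real translation engine and is passed in as any callable, and the names `TextExtractor`, `extract_text`, and `render_translated` are hypothetical.

```python
from html.parser import HTMLParser
from typing import Callable


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> list[str]:
    """Return the visible text chunks of an HTML page in document order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def render_translated(html: str, translate: Callable[[str], str]) -> list[str]:
    """Extract page text and translate each chunk before it is spoken."""
    return [translate(chunk) for chunk in extract_text(html)]
```

Keeping the translator as a parameter means the same extraction code could feed an English-to-Hindi engine or any other language pair.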
• Grammar processing continues with compilation and
optimization, where redundancies are eliminated.
The word vocabulary associated with the grammar is
then processed by a Text-To-Speech (TTS)
pronunciation module that generates phonetic
transcriptions for each word of the grammar.
• Since the TTS engine uses pronunciation rules, it is
not limited to dictionary words. The grammar and
vocabulary are then loaded into the speech
recognizer. This process typically takes about a
second. At the same time, the Web document is
described to the user.
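The grammar steps above can be sketched in a few functions: remove redundant phrases, collect the vocabulary, and transcribe each word with rules rather than a dictionary. The rule table is a toy assumption made for illustration; a real TTS pronunciation module uses a far larger rule set, and all names here are hypothetical.

```python
# Hypothetical letter-to-sound rules, tried in order at each position.
RULES = [("ch", "CH"), ("sh", "SH"), ("th", "TH"), ("oo", "UW"), ("ee", "IY")]


def compile_grammar(phrases: list[str]) -> list[str]:
    """Normalize phrases and drop duplicates while keeping first-seen order."""
    seen: dict[str, None] = {}
    for phrase in phrases:
        normalized = " ".join(phrase.lower().split())
        if normalized:
            seen.setdefault(normalized, None)
    return list(seen)


def vocabulary(grammar: list[str]) -> list[str]:
    """Sorted list of the distinct words used by the grammar."""
    return sorted({word for phrase in grammar for word in phrase.split()})


def transcribe(word: str) -> list[str]:
    """Apply the toy rules left to right; unmatched letters map to themselves."""
    phones, i = [], 0
    while i < len(word):
        for spelling, phone in RULES:
            if word.startswith(spelling, i):
                phones.append(phone)
                i += len(spelling)
                break
        else:
            phones.append(word[i].upper())
            i += 1
    return phones
```

Because `transcribe` works from spelling rules, it produces output for any string, including names that appear in no dictionary, which is the property the slide attributes to the TTS engine.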
• Rendering is achieved by using Page Highlights (a
method to find and speak the key contents of a page),
finding the right and only the relevant content on a linked
page, assembling that content, and providing easy
navigation. These key steps use the information available
in the visual web page itself, together with algorithms
that draw on features such as text content, color, font
size, links, paragraphs, and amount of text. Artificial
Intelligence techniques are used in this automated
rendering process. This is similar to how the human brain
renders a visual page: it selects the information of
interest and then reads it.
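A Page Highlights style ranking can be sketched with a simple score over the features named above. The scoring formula is an assumption made for illustration (the slides do not give the algorithm), color is left out for brevity, and the names `Block`, `highlight_score`, and `page_highlights` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    font_size: float  # in points
    link_count: int   # number of links inside the block


def highlight_score(block: Block) -> float:
    """Hypothetical heuristic: favor large fonts and long text, penalize link-heavy blocks."""
    words = len(block.text.split())
    if words == 0:
        return 0.0
    link_density = min(block.link_count / words, 1.0)
    return block.font_size * words * (1.0 - link_density)


def page_highlights(blocks: list[Block], top_n: int = 2) -> list[str]:
    """Return the text of the top_n highest-scoring blocks, best first."""
    ranked = sorted(blocks, key=highlight_score, reverse=True)
    return [b.text for b in ranked[:top_n]]
```

With this heuristic, a navigation bar made entirely of links scores zero, while a large headline and a long body paragraph rank first, so they would be spoken before anything else.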
ADVANTAGES
• The Web can be accessed through an
ordinary phone
• Email (send, receive, compose, copy, forward, reply,
delete and more)
• Airline reservations and tracking
DISADVANTAGES
• Complexity of the hardware interface
• All users must know English, as the user
interface will be provided in English
CONCLUSION
• It is a new technology that provides a true
audio Internet experience. Using an ordinary
telephone and simple voice commands, users
will be able to surf and hear any Internet
information they desire.
• Any web page will be accessible; access is not
limited to sites written with the Wireless
Application Protocol (WAP).
Fig6:- GOOGLE VOICE SEARCH
Fig7:- SPEECH RECOGNITION
Fig8:- SIMPLE REPRESENTATION OF VOICE RECOGNITION
