SPEECH SYNTHESIS
TECHNOLOGY
Name: K. VIDYAMADHURI
ROLL NO: 14311A1201
IT-A, 2nd year, 2nd semester
CONTENTS
•Introduction
•History
•Construction
•Working
•Applications
•Challenges
INTRODUCTION
• Speech Synthesis is the artificial production of
human speech. A synthesizer can incorporate a model of
the vocal tract and other human voice characteristics to
create a completely "synthetic" voice output.
• A computer system used for this purpose is called
a speech computer or speech synthesizer.
• A text-to-speech (TTS) system converts normal
language text into speech; other systems
render symbolic linguistic representations like phonetic
transcriptions into speech.
HISTORY
• In 1779, the Danish scientist Christian Kratzenstein,
working at the Russian Academy of Sciences, built
models of the human vocal tract that could produce the
five long vowel sounds.
• Wolfgang von Kempelen added models of the tongue and
lips, enabling the machine to pronounce both vowels
and consonants.
• In 1837, Charles Wheatstone produced a "speaking
machine" based on von Kempelen's design, and in 1846,
Joseph Faber exhibited the "Euphonia". Wheatstone's design was
resurrected in 1923 by Paget.
• In the 1930s, Bell Labs developed the vocoder, which
automatically analyzed speech into its fundamental tone
and resonances.
• Building on the vocoder, Homer Dudley developed a
keyboard-operated voice synthesizer called the Voder
(Voice Demonstrator).
• The first computer-based speech synthesis systems
were created in the late 1950s. The first general English
text-to-speech system was developed by Noriko
Umeda et al. in 1968 at the Electrotechnical Laboratory
in Japan.
[Figure: A speech synthesizer from the 1990s]
CONSTRUCTION
• A text-to-speech system (or "engine") is composed of two
parts: a front-end and a back-end.
• The front-end first converts raw text containing symbols like
numbers and abbreviations into the equivalent of written-out
words (text normalization, or tokenization). It then assigns a
phonetic transcription to each word (grapheme-to-phoneme
conversion) and divides and marks the text into prosodic units,
like phrases, clauses, and sentences.
• The back-end—often referred to as the synthesizer—
then converts the symbolic linguistic representation into
sound.
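The front-end steps described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular engine works: the abbreviation table, the digit-by-digit number reading, the tiny pronunciation lexicon and the function names (`normalize`, `to_phonemes`, `prosodic_phrases`) are all assumptions made up for this example.

```python
import re

# Hypothetical lookup tables, kept tiny for illustration only.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]
# Toy lexicon with ARPAbet-like symbols; a real system would use a
# large dictionary plus letter-to-sound rules for unknown words.
LEXICON = {
    "doctor": ["D", "AA", "K", "T", "ER"],
    "two": ["T", "UW"],
    "cats": ["K", "AE", "T", "S"],
}
PUNCTUATION = {",", ".", ";", "!", "?"}


def normalize(text):
    """Step 1: turn raw text into written-out words and punctuation marks."""
    tokens = re.findall(r"[a-z]+\.?|\d+|[,.;!?]", text.lower())
    words = []
    for token in tokens:
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token.isdigit():
            # Simplification: read numbers digit by digit.
            words.extend(DIGITS[int(d)] for d in token)
        elif token.endswith("."):
            # A word followed by a sentence-final full stop.
            words.extend([token[:-1], "."])
        else:
            words.append(token)
    return words


def to_phonemes(words):
    """Step 2: assign a phonetic transcription to each word."""
    return [(w, LEXICON.get(w, ["<unk>"])) for w in words
            if w not in PUNCTUATION]


def prosodic_phrases(words):
    """Step 3: split the word sequence into phrases at punctuation."""
    phrases, current = [], []
    for word in words:
        if word in PUNCTUATION:
            if current:
                phrases.append(current)
                current = []
        else:
            current.append(word)
    if current:
        phrases.append(current)
    return phrases


words = normalize("Dr. Smith has 2 cats, no dogs.")
print(words)
print(prosodic_phrases(words))
print(to_phonemes(["doctor", "two", "cats"]))
```

The output of these three steps is the kind of symbolic linguistic representation that the back-end would then turn into sound.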
[Figure: Block diagram of a text-to-speech engine]
APPROACHES
There are different approaches to speech synthesis, for
example: text-to-speech and concept-to-speech synthesis.
• Concept-to-speech synthesis involves a generation
component that generates a textual expression from
semantic, pragmatic and discourse knowledge. The
speech signal can then be generated from this
expression.
• In text-to-speech synthesis, the text to be spoken is
provided; it is not generated by the system. It must,
however, be analyzed and interpreted in order to convey
the proper pronunciation and emphasis.
SYNTHESIZING TECHNIQUES
Concatenation
Formant
Articulatory
HMM-based
Sinewave
• Concatenative synthesis is based on the concatenation
(or stringing together) of segments of recorded speech.
Generally, concatenative synthesis produces the most
natural-sounding synthesized speech.
• Formant synthesis does not use human speech samples
at runtime. Instead, the synthesized speech output is
created using additive synthesis and an acoustic model
(physical modelling synthesis). Parameters such
as fundamental frequency, voicing, and noise levels are
varied over time to create a waveform of artificial speech.
• Articulatory synthesis refers to computational techniques
for synthesizing speech based on models of the
human vocal tract and the articulation processes
occurring there.
• HMM-based synthesis, also called statistical parametric
synthesis, is a synthesis method based on hidden Markov
models. In this system, the frequency
spectrum (vocal tract), fundamental frequency (voice
source), and duration (prosody) of speech are modeled
simultaneously by HMMs.
• Sinewave synthesis is a technique for synthesizing
speech by replacing the formants (main bands of energy)
with pure-tone whistles.
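As a rough illustration of the sinewave technique listed above, the sketch below replaces each formant with one pure tone and adds the tones together. The sample rate, the frame length, the function name `sinewave_synthesize` and the three formant frequencies are assumptions chosen for the example (the frequencies are only in the general range of an open vowel), not measured data.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def sinewave_synthesize(formant_tracks, samples_per_frame=160):
    """Render formant tracks as a sum of pure tones.

    formant_tracks is a list of frames; every frame is a list of
    (frequency_hz, amplitude) pairs, one pair per formant.
    """
    phases = [0.0] * len(formant_tracks[0])
    samples = []
    for frame in formant_tracks:
        for _ in range(samples_per_frame):
            value = 0.0
            for i, (freq, amp) in enumerate(frame):
                # Accumulating the phase keeps each tone continuous
                # when its frequency changes between frames.
                phases[i] += 2.0 * math.pi * freq / SAMPLE_RATE
                value += amp * math.sin(phases[i])
            samples.append(value)
    return samples


# Hypothetical steady vowel: three tones standing in for three formants,
# held for 50 frames of 10 ms each (0.5 seconds in total).
vowel = [[(700.0, 0.5), (1200.0, 0.3), (2500.0, 0.2)]] * 50
audio = sinewave_synthesize(vowel)
print(len(audio))
```

The returned samples are floating-point values between -1 and 1; to listen to them they could be scaled to 16-bit integers and written to a file with Python's standard `wave` module.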
SYSTEMS PROVIDING SPEECH
SYNTHESIS
• Apple ships VoiceOver, which uses speech synthesis, with
Mac OS on its computers and with iOS on iPhones, iPads and iPods.
• Modern Windows desktop systems can use SAPI
4 and SAPI 5 components to support speech synthesis
and speech recognition. Microsoft Speech Server is a
server-based package for voice synthesis and
recognition. It is designed for network use with web
applications and call centers.
• The Mattel Intellivision game console offered
the Intellivoice Voice Synthesis module in 1982. It
included the SP0256 Narrator speech synthesizer chip
on a removable cartridge.
SYSTEMS PROVIDING TEXT TO
SPEECH SYNTHESIS
• Android has supported speech synthesis (TTS) since
version 1.6.
• A number of applications and add-ons can read web
pages aloud from a web browser or Google Toolbar, such
as Text-to-voice, an add-on for Firefox.
• Some specialized software can narrate RSS feeds.
• Some e-book readers, such as the Amazon
Kindle, PocketBook eBook Reader Pro, and the Bebook
Neo use TTS.
• GPS Navigation units use speech synthesis for
automobile navigation.
[Figure: Applications using text-to-speech software on iOS and Android, respectively]
MARKUP LANGUAGES ON
SPEECH SYNTHESIS
• A number of markup languages have been established
for the rendition of text as speech in an XML-compliant
format. The most recent is Speech Synthesis Markup
Language (SSML), which became a W3C
recommendation in 2004.
• Older speech synthesis markup languages include Java
Speech Markup Language (JSML) and SABLE.
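To give an idea of what such markup looks like, the sketch below builds a small SSML document in Python. The elements used (`speak`, `prosody`, `s`, `break`) come from the SSML recommendation; the helper name `build_ssml` and its parameters are assumptions invented for this example, and how a given speech engine renders the result will vary.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_ssml(sentences, pause_ms=300, rate="medium", lang="en-US"):
    """Wrap sentences in SSML with a pause between consecutive sentences."""
    parts = [
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" '
        f"xml:lang={quoteattr(lang)}>",
        f"<prosody rate={quoteattr(rate)}>",
    ]
    for index, sentence in enumerate(sentences):
        if index > 0:
            parts.append(f'<break time="{int(pause_ms)}ms"/>')
        # escape() protects characters such as & and < in the text.
        parts.append(f"<s>{escape(sentence)}</s>")
    parts.append("</prosody></speak>")
    document = "".join(parts)
    # Raises xml.etree.ElementTree.ParseError if the markup is malformed.
    ET.fromstring(document)
    return document


print(build_ssml(["Welcome to the seminar.", "Questions & answers follow."],
                 pause_ms=500, rate="slow"))
```

The resulting string would be passed to a speech engine that accepts SSML in place of plain text.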
APPLICATIONS
• The longest-standing application has been in screen
readers for people with visual impairment, but text-to-speech
systems are now commonly used by people with dyslexia and
other reading difficulties as well as by pre-literate children.
• Speech synthesis techniques are also used in entertainment
productions such as games and animations.
• In addition, speech synthesis is a valuable computational aid
for the analysis and assessment of speech disorders.
• It can also be used as an educational tool for learning different
accents, as in Google Translate.
CHALLENGES
• Despite large improvements, Speech Synthesis can still
sound a little unnatural.
• The approaches to Speech Synthesis that yield the most
natural speech need considerable resources in terms of
data storage and processing power.
• The process of tokenizing text is rarely straightforward.
Many spellings in English are pronounced differently
depending on context, which makes the correct
pronunciation difficult to determine.
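The tokenizing problem can be made concrete with two toy rules. The sketch below is only an illustration under strong assumptions: it looks at a single neighbouring word, the cue words and the ARPAbet-like symbols are invented for the example, and real systems rely on much richer context.

```python
def expand_dr(next_token):
    """Read 'Dr.' as 'Doctor' before a capitalised name, else as 'Drive'."""
    return "Doctor" if next_token[:1].isupper() else "Drive"


def expand_abbreviations(text):
    """Expand every 'Dr.' in the text using the following token as context."""
    tokens = text.split()
    expanded = []
    for i, token in enumerate(tokens):
        if token == "Dr.":
            following = tokens[i + 1] if i + 1 < len(tokens) else ""
            expanded.append(expand_dr(following))
        else:
            expanded.append(token)
    return " ".join(expanded)


def pronounce_read(previous_word):
    """Choose a pronunciation for 'read' from one word of left context."""
    past_cues = {"have", "has", "had", "was", "already"}
    return "R EH D" if previous_word.lower() in past_cues else "R IY D"


print(expand_abbreviations("Dr. Jones lives on Elm Dr. near us"))
print(pronounce_read("has"), "|", pronounce_read("to"))
```

Rules like these break easily; for example, a street name followed by a capitalised word at the start of the next sentence would be misread, which is why this step is counted among the challenges.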