Speech Processing
• Fundamentals of Digital Speech Processing
1. Anatomy and physiology of speech organs
2. The process of speech production
3. The acoustic theory of speech production
4. Digital models for speech signals
Applications of Speech Processing
• 1. Speech recognition: speech to text
• 2. Speech understanding: the meaning matters rather than the exact words, as in speech translation
• 3. Speech synthesis: text to speech, so the computer can speak to you
• 4. Word processing: checking and correcting spelling, grammar and style
• 5. Text prediction: speeding up word processing
• 6. Automatic summarization: topic identification and summary generation
• 7. Text mining: extracting the necessary data
• Anatomy: the study of the structure of the bodies of people or animals
• Physiology: the study of how the bodies of people and animals function; here it includes understanding the higher-order mechanisms within the human central nervous system that account for speech production
• Acoustics: the scientific study of sound
• Phonetics: the study of the sounds of words and of the sounds used in languages
• Phoneme: the smallest unit of sound that is significant in a language
• Articulation: the action of producing a sound or word clearly, in speech or music
• Linguistics: the study of the way in which language works
• Semantics: the branch of linguistics that deals with the meanings of words and sentences
Speech Processing
[Diagram: speech processing and the fields it draws on]
• Signal processing: Fourier transforms, discrete-time filters, AR(MA) models
• Information theory: entropy, communication theory, rate-distortion theory
• Statistical SP: stochastic models
• Acoustics: psychoacoustics, room acoustics, speech production
• Phonetics
• Algorithms (programming)
ASR: Application
© James Glass, MIT
[Figure: Automatic Speech Recognition block diagram. Labels: Voice Input, Analog to Digital, Acoustic Model, Language Model, Speech Engine, Recognition, Display, Feedback]
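As a rough illustration of how the Acoustic Model and Language Model blocks might work together inside the speech engine, the sketch below picks the candidate word sequence with the highest combined log score. The candidate sentences, all probability values and the `lm_weight` parameter are made-up assumptions for illustration; none of them come from the slides.

```python
import math

# Hypothetical candidate transcriptions with made-up model probabilities.
# "acoustic" stands for P(audio | words), "language" for P(words).
CANDIDATES = {
    "recognize speech": {"acoustic": 0.20, "language": 0.010},
    "wreck a nice beach": {"acoustic": 0.25, "language": 0.0001},
}

def combined_score(probs, lm_weight=1.0):
    """Log-domain score: log P(audio|words) + lm_weight * log P(words)."""
    return math.log(probs["acoustic"]) + lm_weight * math.log(probs["language"])

def decode(candidates, lm_weight=1.0):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda w: combined_score(candidates[w], lm_weight))

print(decode(CANDIDATES))                 # acoustic and language model combined
print(decode(CANDIDATES, lm_weight=0.0))  # acoustic model alone
```

With the language model switched off (`lm_weight=0.0`), the acoustically closest but implausible sentence wins, which is one way to see why both models appear in the diagram.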
Speech Generation
• First, the talker formulates a message (in his mind) that he wants to transmit to the listener via speech.
• The process of message formulation can be thought of as creating printed text that expresses the words of the message.
• The next step is the conversion of the message into a language code.
• This roughly corresponds to converting the printed text of the message into a sequence of phonemes corresponding to the sounds that make up the words, together with the pitch accent associated with those sounds.
• Once the language code is chosen, the talker must execute a series of neuromuscular commands that cause the vocal cords to vibrate when appropriate and shape the vocal tract so that the proper sequence of speech sounds is created and spoken, producing an acoustic signal as the final output.
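The step from printed text to a phoneme sequence can be sketched as a simple dictionary lookup. The tiny lexicon below is a hypothetical example written with the same style of phoneme symbols used elsewhere in this deck; pitch accent is not modelled here.

```python
# Hypothetical pronunciation lexicon (illustrative only, not from the slides).
LEXICON = {
    "she": ["sh", "iy"],
    "sees": ["s", "iy", "z"],
    "me": ["m", "iy"],
}

def text_to_phonemes(text, lexicon=LEXICON):
    """Map each word of the text to its phoneme sequence.

    Raises KeyError for words that are missing from the lexicon.
    """
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(lexicon[word])
    return phonemes

print(text_to_phonemes("She sees me"))
```

A real system would need a far larger lexicon plus rules for unknown words; the lookup only shows the shape of the conversion.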
Speech Recognition
• First, the listener processes the acoustic signal along the basilar membrane in the inner ear, which provides a running spectrum analysis of the incoming signal.
• The neural activity along the auditory nerve is converted into a language code at higher centers of processing within the brain, and message comprehension is achieved.
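A running spectrum analysis can be loosely imitated in software by cutting the signal into overlapping frames and computing a spectrum for each one. The sketch below uses a plain discrete Fourier transform as a stand-in; it is an analogy, not a model of the ear, and the frame length, hop size, sampling rate and test tone are all assumed values.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of the DFT of one frame, for bins 0 .. N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def running_spectrum(signal, frame_len=64, hop=32):
    """Split the signal into overlapping frames and analyse each one."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return [dft_magnitudes(f) for f in frames]

# Test tone: a 1000 Hz sine sampled at 8000 Hz (assumed values).
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(256)]
spectra = running_spectrum(tone)

# Strongest bin of the first frame, converted back to a frequency in Hz.
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
print(peak_bin * fs / 64)
```

Each entry of `spectra` is the spectrum of one short stretch of the signal, so the list as a whole shows how the spectrum evolves over time.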
• The lungs and the associated muscles act as the source of air for exciting the vocal mechanism.
• Muscle force pushes air out of the lungs (shown as a piston pushing up within a cylinder) and through the bronchi and trachea.
• When the vocal cords are tensed, the air flow causes them to vibrate, producing so-called voiced speech sounds.
• When the vocal cords are relaxed, the air flow must pass through a constriction in the vocal tract to produce a sound; it thereby becomes turbulent, producing so-called unvoiced speech sounds.
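The two excitation mechanisms described above are often approximated in digital models by a periodic pulse train (vibrating vocal cords) and by random noise (turbulent air flow). The sketch below generates both; the sampling rate, the 100 Hz pitch and the function names are assumptions chosen for illustration.

```python
import random

def voiced_excitation(n_samples, pitch_period):
    """Impulse train: one pulse every pitch_period samples."""
    return [1.0 if t % pitch_period == 0 else 0.0 for t in range(n_samples)]

def unvoiced_excitation(n_samples, seed=0):
    """Uniform random noise standing in for turbulent air flow."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

fs = 8000            # assumed sampling rate in Hz
period = fs // 100   # a 100 Hz pitch gives 80 samples per period
v = voiced_excitation(400, period)
u = unvoiced_excitation(400)
print(sum(v), len(u))
```

In a fuller model either excitation would then be passed through a filter representing the vocal tract; that stage is left out here.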
Classifications
• 1. Silence (S): no speech is produced
• 2. Unvoiced (U): the vocal cords are not vibrating, so the speech signal is aperiodic or random in nature
• 3. Voiced (V): the vocal cords vibrate periodically as air flows from the lungs, so the speech signal is periodic (more precisely, quasi-periodic)
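One common way to sketch this three-way labelling is with two frame-level measurements: short-time energy (low for silence) and zero-crossing rate (high for noise-like, unvoiced frames). The thresholds and synthetic test frames below are assumptions for illustration and would need tuning on real recordings.

```python
import math
import random

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def classify_frame(frame, energy_floor=1e-4, zcr_threshold=0.25):
    """Label a frame 'S', 'U' or 'V'. Both thresholds are assumed values."""
    if short_time_energy(frame) < energy_floor:
        return "S"
    return "U" if zero_crossing_rate(frame) > zcr_threshold else "V"

fs = 8000  # assumed sampling rate in Hz
silence = [0.0] * 240
voiced = [0.5 * math.sin(2 * math.pi * 120 * t / fs) for t in range(240)]
rng = random.Random(1)
unvoiced = [rng.uniform(-0.3, 0.3) for _ in range(240)]

print(classify_frame(silence), classify_frame(unvoiced), classify_frame(voiced))
```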
Speech Waveform Characteristics
• Loudness
• Voiced/Unvoiced.
• Pitch.
– Fundamental frequency.
• Spectral envelope.
– Formants.
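Pitch, in the sense of fundamental frequency, can be estimated from a voiced frame by looking for the lag at which the frame best matches a shifted copy of itself. The following is a minimal autocorrelation sketch; the search range of 60 to 400 Hz, the sampling rate and the synthetic test frame are assumptions.

```python
import math

def estimate_f0(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak
    within the lag range that corresponds to [f_min, f_max]."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(frame[t] * frame[t + lag] for t in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag

fs = 8000  # assumed sampling rate in Hz
# Synthetic "voiced" frame: a 200 Hz fundamental plus one harmonic at 400 Hz.
frame = [math.sin(2 * math.pi * 200 * t / fs)
         + 0.5 * math.sin(2 * math.pi * 400 * t / fs)
         for t in range(400)]
print(estimate_f0(frame, fs))
```

Formant estimation needs a description of the spectral envelope and is not covered by this sketch.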
Speech Waveform Characteristics (cont.)
[Figure: example waveforms of voiced speech, /ih/, and unvoiced speech, /s/]
Phoneme Hierarchy
Speech sounds
• Vowels: iy, ih, ae, aa, ah, ao, ax, eh, er, ow, uh, uw
• Diphthongs: ay, ey, oy, aw
• Consonants
– Plosive: p, b, t, d, k, g
– Nasal: m, n, ng
– Fricative: f, v, th, dh, s, z, sh, zh, h
– Retroflex liquid: r
– Lateral liquid: l
– Glide: w, y
Language dependent.
About 50 in English.
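The hierarchy maps naturally onto a small lookup table. The sketch below stores the classes and symbols listed on this slide and builds a reverse index from symbol to class; the names `PHONEME_CLASSES`, `CLASS_OF` and `classify_phonemes` are hypothetical and chosen only for illustration.

```python
# Phoneme classes and symbols as listed on the slide.
PHONEME_CLASSES = {
    "vowel": ["iy", "ih", "ae", "aa", "ah", "ao", "ax", "eh", "er", "ow", "uh", "uw"],
    "diphthong": ["ay", "ey", "oy", "aw"],
    "plosive": ["p", "b", "t", "d", "k", "g"],
    "nasal": ["m", "n", "ng"],
    "fricative": ["f", "v", "th", "dh", "s", "z", "sh", "zh", "h"],
    "retroflex liquid": ["r"],
    "lateral liquid": ["l"],
    "glide": ["w", "y"],
}

# Reverse index: phoneme symbol -> class name.
CLASS_OF = {p: cls for cls, plist in PHONEME_CLASSES.items() for p in plist}

def classify_phonemes(phonemes):
    """Pair each phoneme with its class, or 'unknown' if it is not listed."""
    return [(p, CLASS_OF.get(p, "unknown")) for p in phonemes]

print(classify_phonemes(["s", "ih", "ng"]))
print(len(CLASS_OF))  # number of symbols listed on the slide
```

The slide lists fewer symbols than its "about 50" figure, so the table should be read as a sample of each class rather than a complete inventory.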
Signal processing
Digital speech processing
• Speech signals are composed of a sequence of sounds.
• The arrangement of these sounds is governed by the rules of language.
• The study of these rules and their implications in human communication is the domain of linguistics.
• The study and classification of the sounds of speech is called phonetics.
speech processing basics