SlideShare a Scribd company logo
Predictive Modeling of Speech
Dr. Yogananda Isukapalli
PREDICTIVE MODELING OF SPEECH
Linear Predictive Coding (LPC):
is defined as a digital method for encoding an analog signal in
which a particular value is predicted by a linear function of
the past values of the signal
- First proposed method for encoding Human Speech US DoD
- Human speech is produced in the vocal tract which can be
approximated as a variable diameter tube.
- The linear predictive coding (LPC) model is based on a
mathematical approximation of the vocal tract represented
by a tube of a varying diameter
Impulse
Train
Generator
Vocal-cord
Sound Pulse
White-Noise
Generator
Vocal-tract
Filter
Vocal-tract
parameters
Pitched period
Voiced/unvoiced switch
Block Diagram of simplified model for the speech production process
For Voiced
speech
For Unvoiced
speech
Produces a sequence
of impulses equally
spaced
SPEECH PRODUCTION
Ø The model assumes that the sound-generating mechanism
is linearly separable from the intelligence-modulating
vocal-tract filter.
Ø The precise form of the excitation depends on whether the
Speech is voiced or unvoiced.
• Voiced speech sound (such as /i/ in eve) is generated
from a quasi-periodic excitation of vocal-tract.
• Unvoiced speech sound (such as /f/ in fish) is generated
from random sounds produced by turbulent airflow
through a constriction along the vocal tract.
SPEECH PRODUCTION
SPEECH PRODUCTION
Vocal-tract filter is represented by the all-pole transfer function,
å
=
-
+
= M
k
k
k z
a
G
z
H
1
1
)
(
where
G: Gain factor
ak: Real-value coefficient
The form of excitation applied to this filter is changed by
switching between the voiced and unvoiced sounds.
SPEECH PRODUCTION
ü Accurate modeling the short-term power spectral
envelope plays an important role in the quality and
intelligibility of the reconstructed speech.
ü At low bit-rate speech coding, an all-pole filter is
adopted to model the spectral information in LPC
(linear predictive coding)-based coders.
ü By minimizing the MSE between the actual speech
samples and the predicted ones, an optimal set of
weights for an all-pole (synthesis) digital filter can
be determined
Assuming an “all-pole” model, irrespective of speech signal
envelope we are trying to estimate:
• If sound is unvoiced, usual LPC method gives “good
estimate” of filter coefficients ak
• If sound is voiced, usual LPC method gives a “biased
estimate” of ak with the estimate worsening as the
pitch increases (error criterion being MSE)
Example: This example illustrates the limitations of standard
all-pole modeling of periodic waveforms. Specifications are:
• All-pole filter order, M = 12
• Input Signal: periodic pulse sequence with N=32
SPEECH PRODUCTION
The solid line is the original 12-pole envelope, which goes through all the points.
The dashed line is the 12-pole linear predictive model for N=30 spectral lines
SPEECH PRODUCTION
Itakura-Saito distance measure:
Let u(n) be a real-valued, stationary stochastic process, it’s
Fourier Transform Uk is:
( ) 1
-
N
...
0,1,2,
k
1
0
=
= å
-
=
-
N
k
jn
n
k
k
e
u
U w
El-jaroudi and Makhoul used “Itakura-Saito distance measure”
as “error criterion” to overcome this problem to develop a
new model called “discrete all-pole model”.
SPEECH PRODUCTION
Let vector a denote a set of spectral parameters as:
T
M
a
a
a ]
,.....,
,
[
a 2
1
=
Note: See p.no.189-191 in textbook for proof
The auto-correlation function of u(n) for lag m is:
( )
( ) 1
-
N
...
0,1,2,
k
)
(
where
1
-
N
...
0,1,2,
k
1
)
(
1
0
1
0
=
=
=
=
å
å
-
=
-
-
=
N
k
jn
k
N
k
jn
k
k
k
e
m
r
S
e
S
N
m
r
w
w
SPEECH PRODUCTION
The time-reversed impulse response h(-i) of the discrete
frequency-sampled all-pole filter can be obtained from the
relation of predictor coefficients to the auto-correlation as:
å
=
-
=
-
M
k
k k
i
r
a
i
h
0
)
(
)
(
ˆ
å
-
=
÷
÷
ø
ö
ç
ç
è
æ
-
÷
÷
ø
ö
ç
ç
è
æ
-
=
1
0
2
2
1
)
(
ln
)
(
)
(
N
k k
k
k
k
IS
a
S
U
a
S
U
a
D
Itakura-Saito distance measure is given by:
SPEECH PRODUCTION
LPC Vocoder
LPC system for digital transmission and reception of speech
signals over a communication channel
Reproduction
of Speech Signal
From
Channel
Output
Speech
Signal
To
Channel
input
Window
LPC
Analyzer
Coder
Pitch
Detector
Block diagram of LPC Vocoder : a) Transmitter b) receiver
(b)
Decoder
Speech
Synthesizer
(a)
LPC Vocoder
Transmitter:
• Applies window (typically, 10 to 30 ms long) to the input
• Analyzes the input speech block by block
Ø Performs Linear prediction
Ø Pitch Detection
• Finally, following parameters are encoded:
Ø Set of coefficients computed by LPC Analyzer
Ø The Pitch Period
Ø The Gain parameter
Ø The voiced-unvoiced parameters
LPC Vocoder
Receiver:
Performs the inverse operations on the channel output:
• Decode incoming parameters
• Synthesize a speech signal from the parameters

More Related Content

Similar to P4_Predictive_Modeling_Speech.pdf

129966863283913778[1]
129966863283913778[1]129966863283913778[1]
129966863283913778[1]威華 王
 
Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...
Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...
Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...
IOSR Journals
 
Computer aided design of communication systems / Simulation Communication Sys...
Computer aided design of communication systems / Simulation Communication Sys...Computer aided design of communication systems / Simulation Communication Sys...
Computer aided design of communication systems / Simulation Communication Sys...
Makan Mohammadi
 
Paper id 252014135
Paper id 252014135Paper id 252014135
Paper id 252014135IJRAT
 
Enhanced modulation spectral subtraction for IOVT speech recognition application
Enhanced modulation spectral subtraction for IOVT speech recognition applicationEnhanced modulation spectral subtraction for IOVT speech recognition application
Enhanced modulation spectral subtraction for IOVT speech recognition application
IRJET Journal
 
ADC Lab Analysis
ADC Lab AnalysisADC Lab Analysis
ADC Lab Analysis
Kara Bell
 
Enhanced modulation spectral subtraction incorporating various real time nois...
Enhanced modulation spectral subtraction incorporating various real time nois...Enhanced modulation spectral subtraction incorporating various real time nois...
Enhanced modulation spectral subtraction incorporating various real time nois...
IRJET Journal
 
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...Cemal Ardil
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 
Automatic Speech Recognition Incorporating Modulation Domain Enhancement
Automatic Speech Recognition Incorporating Modulation Domain EnhancementAutomatic Speech Recognition Incorporating Modulation Domain Enhancement
Automatic Speech Recognition Incorporating Modulation Domain Enhancement
IRJET Journal
 
Gene's law
Gene's lawGene's law
Gene's law
Hoopeer Hoopeer
 
example based audio editing
example based audio editingexample based audio editing
example based audio editing
Ramin Anushiravani
 
Sampling and Reconstruction of Signal using Aliasing
Sampling and Reconstruction of Signal using AliasingSampling and Reconstruction of Signal using Aliasing
Sampling and Reconstruction of Signal using Aliasing
j naga sai
 
lect03-audio-representation.ppt
lect03-audio-representation.pptlect03-audio-representation.ppt
lect03-audio-representation.ppt
MayankKumar633196
 
lect03-audio-representation.ppt
lect03-audio-representation.pptlect03-audio-representation.ppt
lect03-audio-representation.ppt
mohan s
 
Mini Project- Communications Link Simulation
Mini Project- Communications Link SimulationMini Project- Communications Link Simulation
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATIONHUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
IRJET Journal
 
Biomedical signal modeling
Biomedical signal modelingBiomedical signal modeling
Biomedical signal modeling
Roland Silvestre
 
Analysis of Speech Enhancement Incorporating Speech Recognition
Analysis of Speech Enhancement Incorporating Speech RecognitionAnalysis of Speech Enhancement Incorporating Speech Recognition
Analysis of Speech Enhancement Incorporating Speech Recognition
IRJET Journal
 
PULSE-MODULATION-TECHNIQUES (this is to introduce the pulse modulation techn...
PULSE-MODULATION-TECHNIQUES (this is to introduce the  pulse modulation techn...PULSE-MODULATION-TECHNIQUES (this is to introduce the  pulse modulation techn...
PULSE-MODULATION-TECHNIQUES (this is to introduce the pulse modulation techn...
RYANDAVEDAYAGA
 

Similar to P4_Predictive_Modeling_Speech.pdf (20)

129966863283913778[1]
129966863283913778[1]129966863283913778[1]
129966863283913778[1]
 
Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...
Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...
Analytical Review of Feature Extraction Techniques for Automatic Speech Recog...
 
Computer aided design of communication systems / Simulation Communication Sys...
Computer aided design of communication systems / Simulation Communication Sys...Computer aided design of communication systems / Simulation Communication Sys...
Computer aided design of communication systems / Simulation Communication Sys...
 
Paper id 252014135
Paper id 252014135Paper id 252014135
Paper id 252014135
 
Enhanced modulation spectral subtraction for IOVT speech recognition application
Enhanced modulation spectral subtraction for IOVT speech recognition applicationEnhanced modulation spectral subtraction for IOVT speech recognition application
Enhanced modulation spectral subtraction for IOVT speech recognition application
 
ADC Lab Analysis
ADC Lab AnalysisADC Lab Analysis
ADC Lab Analysis
 
Enhanced modulation spectral subtraction incorporating various real time nois...
Enhanced modulation spectral subtraction incorporating various real time nois...Enhanced modulation spectral subtraction incorporating various real time nois...
Enhanced modulation spectral subtraction incorporating various real time nois...
 
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Automatic Speech Recognition Incorporating Modulation Domain Enhancement
Automatic Speech Recognition Incorporating Modulation Domain EnhancementAutomatic Speech Recognition Incorporating Modulation Domain Enhancement
Automatic Speech Recognition Incorporating Modulation Domain Enhancement
 
Gene's law
Gene's lawGene's law
Gene's law
 
example based audio editing
example based audio editingexample based audio editing
example based audio editing
 
Sampling and Reconstruction of Signal using Aliasing
Sampling and Reconstruction of Signal using AliasingSampling and Reconstruction of Signal using Aliasing
Sampling and Reconstruction of Signal using Aliasing
 
lect03-audio-representation.ppt
lect03-audio-representation.pptlect03-audio-representation.ppt
lect03-audio-representation.ppt
 
lect03-audio-representation.ppt
lect03-audio-representation.pptlect03-audio-representation.ppt
lect03-audio-representation.ppt
 
Mini Project- Communications Link Simulation
Mini Project- Communications Link SimulationMini Project- Communications Link Simulation
Mini Project- Communications Link Simulation
 
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATIONHUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
 
Biomedical signal modeling
Biomedical signal modelingBiomedical signal modeling
Biomedical signal modeling
 
Analysis of Speech Enhancement Incorporating Speech Recognition
Analysis of Speech Enhancement Incorporating Speech RecognitionAnalysis of Speech Enhancement Incorporating Speech Recognition
Analysis of Speech Enhancement Incorporating Speech Recognition
 
PULSE-MODULATION-TECHNIQUES (this is to introduce the pulse modulation techn...
PULSE-MODULATION-TECHNIQUES (this is to introduce the  pulse modulation techn...PULSE-MODULATION-TECHNIQUES (this is to introduce the  pulse modulation techn...
PULSE-MODULATION-TECHNIQUES (this is to introduce the pulse modulation techn...
 

More from Yonas D. Ebren

Group presentation.pptx
Group presentation.pptxGroup presentation.pptx
Group presentation.pptx
Yonas D. Ebren
 
P15_Analog_Filters_P1.pdf
P15_Analog_Filters_P1.pdfP15_Analog_Filters_P1.pdf
P15_Analog_Filters_P1.pdf
Yonas D. Ebren
 
Group Task Presentation.pptx
Group Task Presentation.pptxGroup Task Presentation.pptx
Group Task Presentation.pptx
Yonas D. Ebren
 
bible study.pptx
bible study.pptxbible study.pptx
bible study.pptx
Yonas D. Ebren
 
Algorithms of FFT.pptx
Algorithms of FFT.pptxAlgorithms of FFT.pptx
Algorithms of FFT.pptx
Yonas D. Ebren
 
Applications.pptx
Applications.pptxApplications.pptx
Applications.pptx
Yonas D. Ebren
 
Presentation4.pptx
Presentation4.pptxPresentation4.pptx
Presentation4.pptx
Yonas D. Ebren
 
Presentation3.pptx
Presentation3.pptxPresentation3.pptx
Presentation3.pptx
Yonas D. Ebren
 
Presentation2.pptx
Presentation2.pptxPresentation2.pptx
Presentation2.pptx
Yonas D. Ebren
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
Yonas D. Ebren
 
02._Object-Oriented_Programming_Concepts.ppt
02._Object-Oriented_Programming_Concepts.ppt02._Object-Oriented_Programming_Concepts.ppt
02._Object-Oriented_Programming_Concepts.ppt
Yonas D. Ebren
 

More from Yonas D. Ebren (11)

Group presentation.pptx
Group presentation.pptxGroup presentation.pptx
Group presentation.pptx
 
P15_Analog_Filters_P1.pdf
P15_Analog_Filters_P1.pdfP15_Analog_Filters_P1.pdf
P15_Analog_Filters_P1.pdf
 
Group Task Presentation.pptx
Group Task Presentation.pptxGroup Task Presentation.pptx
Group Task Presentation.pptx
 
bible study.pptx
bible study.pptxbible study.pptx
bible study.pptx
 
Algorithms of FFT.pptx
Algorithms of FFT.pptxAlgorithms of FFT.pptx
Algorithms of FFT.pptx
 
Applications.pptx
Applications.pptxApplications.pptx
Applications.pptx
 
Presentation4.pptx
Presentation4.pptxPresentation4.pptx
Presentation4.pptx
 
Presentation3.pptx
Presentation3.pptxPresentation3.pptx
Presentation3.pptx
 
Presentation2.pptx
Presentation2.pptxPresentation2.pptx
Presentation2.pptx
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
 
02._Object-Oriented_Programming_Concepts.ppt
02._Object-Oriented_Programming_Concepts.ppt02._Object-Oriented_Programming_Concepts.ppt
02._Object-Oriented_Programming_Concepts.ppt
 

Recently uploaded

1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
natyesu
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
laozhuseo02
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
laozhuseo02
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
nirahealhty
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
3ipehhoa
 
Output determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CCOutput determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CC
ShahulHameed54211
 
ER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAEER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAE
Himani415946
 
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptxLiving-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
TristanJasperRamos
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
Gal Baras
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
JungkooksNonexistent
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Sanjeev Rampal
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 

Recently uploaded (16)

1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
 
Output determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CCOutput determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CC
 
ER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAEER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAE
 
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptxLiving-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 

P4_Predictive_Modeling_Speech.pdf

  • 1. Predictive Modeling of Speech Dr. Yogananda Isukapalli
  • 2. PREDICTIVE MODELING OF SPEECH Linear Predictive Coding (LPC): is defined as a digital method for encoding an analog signal in which a particular value is predicted by a linear function of the past values of the signal - First proposed method for encoding Human Speech US DoD - Human speech is produced in the vocal tract which can be approximated as a variable diameter tube. - The linear predictive coding (LPC) model is based on a mathematical approximation of the vocal tract represented by a tube of a varying diameter
  • 3. Impulse Train Generator Vocal-cord Sound Pulse White-Noise Generator Vocal-tract Filter Vocal-tract parameters Pitched period Voiced/unvoiced switch Block Diagram of simplified model for the speech production process For Voiced speech For Unvoiced speech Produces a sequence of impulses equally spaced SPEECH PRODUCTION
  • 4. Ø The model assumes that the sound-generating mechanism is linearly separable from the intelligence-modulating vocal-tract filter. Ø The precise form of the excitation depends on whether the Speech is voiced or unvoiced. • Voiced speech sound (such as /i/ in eve) is generated from a quasi-periodic excitation of vocal-tract. • Unvoiced speech sound (such as /f/ in fish) is generated from random sounds produced by turbulent airflow through a constriction along the vocal tract. SPEECH PRODUCTION
  • 5. SPEECH PRODUCTION Vocal-tract filter is represented by the all-pole transfer function, å = - + = M k k k z a G z H 1 1 ) ( where G: Gain factor ak: Real-value coefficient The form of excitation applied to this filter is changed by switching between the voiced and unvoiced sounds.
  • 6. SPEECH PRODUCTION ü Accurate modeling the short-term power spectral envelope plays an important role in the quality and intelligibility of the reconstructed speech. ü At low bit-rate speech coding, an all-pole filter is adopted to model the spectral information in LPC (linear predictive coding)-based coders. ü By minimizing the MSE between the actual speech samples and the predicted ones, an optimal set of weights for an all-pole (synthesis) digital filter can be determined
  • 7. Assuming an “all-pole” model, irrespective of speech signal envelope we are trying to estimate: • If sound is unvoiced, usual LPC method gives “good estimate” of filter coefficients ak • If sound is voiced, usual LPC method gives a “biased estimate” of ak with the estimate worsening as the pitch increases (error criterion being MSE) Example: This example illustrates the limitations of standard all-pole modeling of periodic waveforms. Specifications are: • All-pole filter order, M = 12 • Input Signal: periodic pulse sequence with N=32 SPEECH PRODUCTION
  • 8. The solid line is the original 12-pole envelope, which goes through all the points. The dashed line is the 12-pole linear predictive model for N=30 spectral lines SPEECH PRODUCTION
  • 9. Itakura-Saito distance measure: Let u(n) be a real-valued, stationary stochastic process, it’s Fourier Transform Uk is: ( ) 1 - N ... 0,1,2, k 1 0 = = å - = - N k jn n k k e u U w El-jaroudi and Makhoul used “Itakura-Saito distance measure” as “error criterion” to overcome this problem to develop a new model called “discrete all-pole model”. SPEECH PRODUCTION
  • 10. Let vector a denote a set of spectral parameters as: T M a a a ] ,....., , [ a 2 1 = Note: See p.no.189-191 in textbook for proof The auto-correlation function of u(n) for lag m is: ( ) ( ) 1 - N ... 0,1,2, k ) ( where 1 - N ... 0,1,2, k 1 ) ( 1 0 1 0 = = = = å å - = - - = N k jn k N k jn k k k e m r S e S N m r w w SPEECH PRODUCTION
  • 11. The time-reversed impulse response h(-i) of the discrete frequency-sampled all-pole filter can be obtained from the relation of predictor coefficients to the auto-correlation as: å = - = - M k k k i r a i h 0 ) ( ) ( ˆ å - = ÷ ÷ ø ö ç ç è æ - ÷ ÷ ø ö ç ç è æ - = 1 0 2 2 1 ) ( ln ) ( ) ( N k k k k k IS a S U a S U a D Itakura-Saito distance measure is given by: SPEECH PRODUCTION
  • 12. LPC Vocoder LPC system for digital transmission and reception of speech signals over a communication channel Reproduction of Speech Signal From Channel Output Speech Signal To Channel input Window LPC Analyzer Coder Pitch Detector Block diagram of LPC Vocoder : a) Transmitter b) receiver (b) Decoder Speech Synthesizer (a)
  • 13. LPC Vocoder Transmitter: • Applies window (typically, 10 to 30 ms long) to the input • Analyzes the input speech block by block Ø Performs Linear prediction Ø Pitch Detection • Finally, following parameters are encoded: Ø Set of coefficients computed by LPC Analyzer Ø The Pitch Period Ø The Gain parameter Ø The voiced-unvoiced parameters
  • 14. LPC Vocoder Receiver: Performs the inverse operations on the channel output: • Decode incoming parameters • Synthesize a speech signal from the parameters