1. 1
Detection of Acoustic Landmarks for
Speech Processing with High
Resolution
M.Tech Credit Seminar
Pushpa Gothwal (09307054)
Supervisor: Prof. P. C. Pandey
Electrical Engineering Department
November 2009
2. 2
Outline
Introduction
Landmarks and their categorization
Landmark detection methods
1. Manual labeling of landmarks
2. Detection of abrupt-consonantal and abrupt landmarks
3. Stop consonant landmark detection method
Summary and future work
3. 3
Introduction
Perception of speech under adverse listening conditions can be improved by processing the speech signal.
Such processing requires detection of landmarks.
Aim: To study three methods of landmark detection and compare their temporal resolution.
5. 5
What is a Landmark?
A landmark is a region of spectral discontinuity in speech.
Landmarks can be categorized as:
– Abrupt-consonantal: marks the closure and release of a constriction. Example: /able/
– Abrupt: marks a change in sound due to glottal activity. Example: /paint/
– Nonabrupt: marks the transition from a semivowel to a vowel and vice versa. Example: /away/
– Vocalic: occurs when the vocal tract is maximally open for a vowel. Example: /bat/
6. 6
An illustration of landmarks. AC = abrupt-consonantal, A = abrupt,
N = nonabrupt, V = vocalic (Liu 1996)
10. 10
Detection of abrupt-consonantal and abrupt landmarks
This method detects two types of landmarks: abrupt-consonantal and abrupt.
The spectrum is divided into 6 bands:
Band Frequency (kHz)
1 0.0-0.4
2 0.8-1.5
3 1.2-2.0
4 2.0-3.5
5 3.5-5.0
6 5.0-8.0
Band 1: monitors glottal activity
Bands 2-5: monitor closure and release of sonorants
Band 6: monitors stops
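The band split above can be sketched in Python as follows. This is a minimal illustration, not the implementation from (Liu 1996): the frame length, hop size, Hann window, summed band energy, and the `step` used for the rate of rise (ROR) are all assumed values, and the function names `band_energies_db` and `rate_of_rise` are hypothetical.

```python
import numpy as np

# Band edges in kHz, taken from the slide.
BANDS_KHZ = [(0.0, 0.4), (0.8, 1.5), (1.2, 2.0),
             (2.0, 3.5), (3.5, 5.0), (5.0, 8.0)]


def band_energies_db(x, fs, frame_len=512, hop=160):
    """Return a (num_frames, 6) array of band energies in dB.

    frame_len and hop are assumed values; fs must be at least
    16 kHz for band 6 (5.0-8.0 kHz) to be fully covered.
    """
    window = np.hanning(frame_len)
    freqs_khz = np.fft.rfftfreq(frame_len, d=1.0 / fs) / 1000.0
    num_frames = 1 + (len(x) - frame_len) // hop
    energies = np.empty((num_frames, len(BANDS_KHZ)))
    for n in range(num_frames):
        frame = x[n * hop: n * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for b, (lo, hi) in enumerate(BANDS_KHZ):
            mask = (freqs_khz >= lo) & (freqs_khz <= hi)
            # Small floor avoids log10(0) in silent frames.
            energies[n, b] = 10.0 * np.log10(power[mask].sum() + 1e-12)
    return energies


def rate_of_rise(energy_db, step=5):
    """Difference of each band energy over `step` frames (assumed ROR)."""
    ror = np.zeros_like(energy_db)
    ror[step:] = energy_db[step:] - energy_db[:-step]
    return ror
```

A large positive ROR value in band 1 would then suggest an onset of glottal activity, and a large negative value an offset. Thresholds for picking peaks are not given on the slide and are left out here.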
12. 12
Detection of abrupt-consonantal and abrupt landmarks (cont.)
Spectrogram of "the money is coming today". The middle figure shows the
energy of band 1, and the bottom figure shows the rate of rise (ROR) of the band (Liu, 1996).
14. 14
Stop consonant landmark detection method
Pass 1
Step 1: Spectrum is divided into 5 bands
Band Frequency (kHz)
1 0.0-0.4 (monitors glottal vibration)
2 0.4-1.2
3 1.2-2.0
4 2.0-3.5
5 3.5-5.0
Bands 2-5 monitor consonant closure and release.
15. 15
Processing stages for landmark detection (Arjun et al., 2008)
Input: speech
Pass 1
1. Short-time spectral analysis
2. Computation of energy peaks and centroids
3. Computation of RORs of energy and centroid
4. Computation of spectral transition index
5. Landmark localization
Output: landmark (Pass 1)
Pass 2
1. Wavelet decomposition around landmarks
2. Computation of short-time energy and ZCR
3. Computation of energy and ZCR RORs
4. Landmark localization
Output: landmark (Pass 2)
16. 16
Stop consonant landmark detection method (cont.)
Step 2 - Computation of energy peaks and centroids in frequency bands
Energy peak:
$$E_p(b,n) = 10 \log_{10}\left(\max_{k_1 \le k \le k_2} |X_n(k)|\right)^{2} \qquad (1)$$
Centroid frequency:
$$f_c(b,n) = \frac{\sum_{k=k_1}^{k_2} k\,|X_n(k)|^{2}}{\sum_{k=k_1}^{k_2} |X_n(k)|^{2}} \cdot \frac{f_s}{N} \qquad (2)$$
where k_1 and k_2 are the lower and upper frequency indices for band b, and n is the frame index.
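Equations (1) and (2) can be evaluated for a single frame and a single band as in the sketch below. It assumes that X_n(k) is the N-point DFT of the frame, that f_s is the sampling frequency, and that the band is specified by its edges in Hz. The function name `energy_peak_and_centroid` and its parameters are hypothetical.

```python
import numpy as np


def energy_peak_and_centroid(frame, fs, f_lo, f_hi):
    """Evaluate Eq. (1) and Eq. (2) for one frame and one band.

    frame : 1-D array of samples (already windowed, if desired)
    fs    : sampling frequency in Hz (assumed meaning of f_s)
    f_lo, f_hi : band edges in Hz, mapped to DFT indices k1 and k2
    """
    N = len(frame)
    mag2 = np.abs(np.fft.rfft(frame)) ** 2      # |X_n(k)|^2
    k1 = int(np.ceil(f_lo * N / fs))            # lower index of band
    k2 = int(np.floor(f_hi * N / fs))           # upper index of band
    k = np.arange(k1, k2 + 1)
    band = mag2[k1:k2 + 1]
    # Eq. (1): peak energy in dB (small floor avoids log10(0)).
    ep_db = 10.0 * np.log10(band.max() + 1e-12)
    # Eq. (2): energy-weighted mean index, converted to Hz by fs/N.
    fc_hz = (k * band).sum() / (band.sum() + 1e-12) * fs / N
    return ep_db, fc_hz
```

For a pure tone that falls inside the band, the centroid returned by this sketch lies close to the tone frequency, which gives a quick sanity check of the index-to-frequency conversion.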
18. 18
Stop consonant landmark detection method (cont.)
Step 4 - Computation of transition index for energy and centroid frequency
$$T_{ec}(n) = \frac{1}{5} \sum_{b=1}^{5} E'_{pn}(b,n)\, f'_{cn}(b,n) \qquad (5)$$
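Equation (5) averages, over the five bands, the product of two per-band tracks. The sketch below assumes these tracks are the ROR tracks of the energy peak and of the centroid frequency, in whatever normalized form Step 3 produces. That step is not shown in these slides, so the inputs are simply taken as given arrays. The function name `transition_index` is hypothetical.

```python
import numpy as np


def transition_index(energy_ror, centroid_ror):
    """Eq. (5): mean over bands of the product of the two ROR tracks.

    Both inputs have shape (num_bands, num_frames); with five bands
    the mean over axis 0 equals (1/5) * sum over b = 1..5.
    """
    energy_ror = np.asarray(energy_ror, dtype=float)
    centroid_ror = np.asarray(centroid_ror, dtype=float)
    if energy_ror.shape != centroid_ror.shape:
        raise ValueError("ROR tracks must have the same shape")
    return (energy_ror * centroid_ror).mean(axis=0)
```

Because the two tracks are multiplied, the index is large only in frames where energy and centroid change together, which is presumably why it is used to localize the landmark.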
19. 19
Stop consonant landmark detection method (cont.)
(a) Waveform for /uka/, (b) ROR for band 1, (c) ROR for band 2, (d) ROR for band 3 (Arjun et al., 2008)
21. 21
Stop consonant landmark detection method (cont.)
(a) Windowed segment used in second pass, (b) energy and ZCR RORs of level 1,
(c) RORs of level 2, and (d) transition index Tez computed from RORs in (b) and (c)
(Arjun et al., 2008)
22. 22
Stop consonant landmark detection method (cont.)
Pass 2
Step 1 - Compute the wavelet decomposition of the speech segment around the landmark
Step 2 - Compute the short-time energy and zero-crossing rate (ZCR)
Step 3 - Compute the RORs of the energy and ZCR
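Steps 2 and 3 of Pass 2 can be sketched as below for any one signal, for example one level of the wavelet decomposition. The wavelet step itself is omitted, and the frame length, hop, and the simple frame-difference definition of ROR are assumptions rather than the values used in (Arjun et al., 2008). The function names are hypothetical.

```python
import numpy as np


def short_time_energy_and_zcr(x, frame_len, hop):
    """Return per-frame energy and zero-crossing rate of signal x."""
    x = np.asarray(x, dtype=float)
    num_frames = 1 + (len(x) - frame_len) // hop
    energy = np.empty(num_frames)
    zcr = np.empty(num_frames)
    for n in range(num_frames):
        frame = x[n * hop: n * hop + frame_len]
        energy[n] = np.sum(frame ** 2)
        signs = np.sign(frame)
        signs[signs == 0] = 1          # treat exact zeros as positive
        # Fraction of adjacent sample pairs whose sign differs.
        zcr[n] = np.count_nonzero(np.diff(signs)) / (frame_len - 1)
    return energy, zcr


def difference_ror(track, step=1):
    """ROR taken as the difference over `step` frames (an assumption)."""
    track = np.asarray(track, dtype=float)
    ror = np.zeros_like(track)
    ror[step:] = track[step:] - track[:-step]
    return ror
```

A jump in the ZCR ROR marks a change from low-frequency to high-frequency content, while a jump in the energy ROR marks a change in level. The slides combine such RORs into the transition index Tez.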
24. 24
Summary
The first method (manual labeling) is time-consuming and tedious, and its
temporal resolution is very poor.
The second method is relatively faster, but it also gives poor temporal
resolution.
The third method gives very high temporal resolution and is also fast.
25. 25
Future Work
To focus on the algorithms for landmark
detection in speech and to improve them for
implementation in a phone-based recognition
system.
26. 26
REFERENCES
[Liu 1996] S. A. Liu, "Landmark detection for distinctive feature-based
speech recognition," J. Acoust. Soc. Am., vol. 100, no. 5, pp. 3417-3430, 1996.
[Arjun et al., 2008] A. R. Jayan, P. C. Pandey and , "Detection of Acoustic
Landmarks with High Resolution for Speech Processing," Proc. 14th
National Conf. Communication.
[Alani et al., 1999] A. Alani and M. Deriche, "A novel approach to speech
segmentation using the wavelet transform," in Proc. 5th Int. Symp. Signal
Processing and Applications (ISSPA'99), pp. 127-129.
[OS 2001] D. O'Shaughnessy, Speech Communications: Human and
Machine, University Press (India).
[L.R., 2008] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech
Signals, Pearson Education Inc. and Dorling Kindersley Publishing Inc., India.