Speech processinglecworkshop

The objective of the presentation is to provide an overview of speech processing.

Published in: Engineering, Technology

  1. 1. Speech Processing. Govind, Center for Computational Engineering & Networking (CEN), Amrita Vishwa Vidyapeetham. Organization: Speech Processing; Prerequisites; Introduction; Speech Production; Representation of Speech Signals.
  2. 2. Outline: Introduction; Human Speech Production and Perception Systems; Representation of Speech in the Time and Frequency Domains; Speech Sounds and Features; Signal Processing Methods for Estimating Speech Features; Speech Processing Applications (Speech Recognition, Speech Synthesis).
  3. 3. Prerequisites: S&S, DSP & ADSP. Prior knowledge required: Signals and Systems; Digital Signal Processing; Advanced DSP.
  4. 4. Prerequisites: Signals and Systems. Classification of signals; LTI systems; correlation and convolution operations; Fourier representations (FS, DTFS, DTFT, DFT, FFT) and the Z-transform; concepts of impulse response, frequency response, etc.
  5. 5. Prerequisites: Digital Signal Processing. Sampling (Nyquist rate, aliasing); FFT implementation of the DFT; design of FIR and IIR filters; structures for the realization of filters; multirate signal processing (filter banks).
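The aliasing item above can be illustrated with a minimal sketch. It is not taken from the slides: the 8 kHz sampling rate, the 5 kHz tone, and the helper names `sample_cosine` and `alias_frequency` are assumptions chosen for illustration. It shows that a tone above the Nyquist frequency (half the sampling rate) produces the same samples as a lower-frequency tone.

```python
import math

def sample_cosine(freq_hz, fs_hz, num_samples):
    """Return num_samples samples of cos(2*pi*freq*t) taken at rate fs_hz."""
    return [math.cos(2.0 * math.pi * freq_hz * n / fs_hz) for n in range(num_samples)]

def alias_frequency(freq_hz, fs_hz):
    """Frequency in [0, fs/2] at which a real tone at freq_hz appears after sampling."""
    folded = freq_hz % fs_hz
    return min(folded, fs_hz - folded)

fs = 8000.0  # assumed sampling rate, so the Nyquist frequency is 4000 Hz
tone_above_nyquist = sample_cosine(5000.0, fs, 16)
tone_alias = sample_cosine(alias_frequency(5000.0, fs), fs, 16)

# The two sample sequences agree up to floating-point rounding.
max_difference = max(abs(a - b) for a, b in zip(tone_above_nyquist, tone_alias))
print(alias_frequency(5000.0, fs), max_difference)
```

In practice an anti-aliasing low-pass filter is applied before sampling so that such out-of-band components do not fold into the band of interest.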
  6. 6. Prerequisites: Advanced DSP. Time-frequency analysis (TFA): TFA by STFT; TFA by Wigner distributions; TFA by wavelets.
  7. 7. References: L. Rabiner, Biing-Hwang Juang and B. Yegnanarayana, "Fundamentals of Speech Recognition", Pearson Education Inc., 2009; Douglas O'Shaughnessy, "Speech Communication", University Press, 2001; Thomas F. Quatieri, "Discrete-Time Speech Signal Processing", Pearson Education Inc., 2004.
  8. 8. Introduction. Information in speech: message, language, accent, speaker, emotions/stress. Applications: recognition (speech recognition, speaker recognition/verification, emotion recognition, etc.) and synthesis (text-to-speech synthesis, speech enhancement, voice conversion).
  9. 9. Applications: Recognition. Table columns: Speech, Objective, Information Extracted. Objective: message; information extracted: "Author of the danger...". Objective: speaker; information extracted: "It's Govind speaking". Objective: speaker claim has to be verified; information extracted: "Hi Govind, your claim is accepted".
  10. 10. Applications: Synthesis. Table columns: Input, Objective, Output. Text-to-speech synthesis: text input (Epochs Occur...); objective: synthesize the text. Speech enhancement: remove noise, remove reverberation, enhance the desired speaker's speech. Voice conversion: convert the source speaker's speech to the target speaker's speech.
  11. 11. What makes automatic processing of speech complicated? It is an interdisciplinary area:
      1) Signal processing: the process of extracting relevant information from the speech signal.
      2) Physics: the science of understanding the relationship between the physical speech signal and the physiological mechanisms that produced it.
      3) Pattern recognition: grouping or classifying patterns of various events in speech.
      4) Communication and information theory: deals with efficient ways of encoding or decoding the parameters of speech, and with efficient search for patterns of interest in speech (dynamic programming, Viterbi search, stack algorithms, etc.).
      5) Linguistics: the relationships between sounds (phonology), the syntax and semantics of a language, and the sense derived from meaning (pragmatics).
      6) Computer science: the study of different algorithms for implementation in software or hardware.
      7) Psychology: understanding the psychological state of the speaker or listener helps with tasks such as emotion analysis.
  12. 12. Speaker-Listener Schematic Diagram in Speech Communication. Figure: Schematic diagram of speech communication (courtesy Rabiner et al.).
  13. 13. Production-Perception Block Diagram. Blocks on the production side: Message Formulation, Language Code, Neuro-Muscular Controls, Vocal Tract System, Acoustic Waveform. Link: Transmission Channel. Blocks on the perception side: Acoustic Waveform, Basilar Membrane Motion, Neural Transduction, Language Translation, Message Understanding. Other labels in the diagram: Text; Phonemes-Prosody; Articulatory Motion; Spectrum Analysis; Feature Extraction, Coding; Phonemes, Words, Sentences; Semantics. Figure: Speech production block diagram (courtesy Rabiner et al.).
  14. 14. Speech Production. Figure: Speech production mechanism (courtesy Thomas F. Quatieri, "Discrete-Time Speech Signal Processing", Chapter 3, p. 58, Pearson Edu., Delhi).
  15. 15. Mechanical Equivalent of Speech Production System. Figure: Speech production mechanism (courtesy Rabiner et al.).
  16. 16. Spectro-Temporal Representation; Classification of Phonemes. Representation of Speech Signal. Figure: Speech signal in the time domain.
  17. 17. Glottal Air Flow During Speech Production. Figure: Glottal air flow (courtesy Rabiner et al.).
  18. 18. Glottal Air Flow: Graphical Illustration. Figure: two panels, "Speech Waveform" and "Glottal Flow: EGG", each plotting amplitude against time (samples); annotations: Speech, EGG, Glottis Vibration.
  19. 19. Classification of Speech Sounds. Silence (S): no speech is produced. Unvoiced (U): the vocal folds are not vibrating. Voiced (V): periodic vibration of the vocal cords. Figure: Speech signal in the time domain, with segments labeled U, S and V.
  20. 20. Classification of Speech Sounds. Separating voiced sounds from unvoiced sounds and silence is known as voiced-non-voiced detection. Issues in voiced-non-voiced detection: it is difficult to tell a weak unvoiced sound from silence, and it is difficult to distinguish weakly periodic voiced sounds from unvoiced sounds.
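The slides do not say how the silence/unvoiced/voiced decision is made. A common heuristic, shown here only as a minimal sketch, combines short-time energy (low for silence) with the zero-crossing rate (high for noise-like unvoiced sounds, low for periodic voiced sounds). The thresholds `energy_floor` and `zcr_threshold`, the 30 ms frame length, and the synthetic test frames are all assumptions for illustration; real speech needs thresholds tuned to the recording conditions.

```python
import math
import random

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0.0) != (b >= 0.0))
    return crossings / (len(frame) - 1)

def classify_frame(frame, energy_floor=1e-4, zcr_threshold=0.25):
    """Label a frame 'S' (silence), 'U' (unvoiced) or 'V' (voiced).

    Both thresholds are illustrative assumptions, not values from the slides.
    """
    if short_time_energy(frame) < energy_floor:
        return "S"
    if zero_crossing_rate(frame) > zcr_threshold:
        return "U"
    return "V"

fs = 8000            # assumed sampling rate in Hz
frame_len = 240      # 30 ms frame at 8 kHz
silence = [0.0] * frame_len
# A 100 Hz sinusoid stands in for a periodic, voiced frame.
voiced_like = [0.5 * math.sin(2.0 * math.pi * 100.0 * i / fs) for i in range(frame_len)]
# Low-level uniform noise stands in for a noise-like, unvoiced frame.
rng = random.Random(0)
unvoiced_like = [rng.uniform(-0.1, 0.1) for _ in range(frame_len)]

labels = [classify_frame(f) for f in (silence, voiced_like, unvoiced_like)]
print(labels)
```

The two difficulties listed on the slide show up directly in this scheme: a weak unvoiced frame can fall below the energy floor and be labeled silence, and a weakly periodic voiced frame can have a zero-crossing rate close to that of noise.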
  21. 21. Spectrograms: Narrow-band & Wide-band.
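The slide gives only the title, so the following is a minimal sketch of the usual distinction rather than the author's method: a long analysis window gives fine frequency resolution (narrow-band spectrogram), while a short window gives fine time resolution at the cost of coarse frequency resolution (wide-band spectrogram). The 32 ms and 4 ms window lengths, the Hamming window, the 50% overlap, and the helper names are assumptions; the naive DFT is used only to keep the example free of third-party libraries.

```python
import cmath
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def magnitude_spectrum(frame):
    """Naive DFT magnitude for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(signal, frame_len, hop):
    """List of windowed magnitude spectra, one per frame."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(magnitude_spectrum(frame))
    return frames

fs = 8000  # assumed sampling rate in Hz
tone = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(512)]

narrow_band = spectrogram(tone, frame_len=256, hop=128)  # 32 ms window, bins fs/256 apart
wide_band = spectrogram(tone, frame_len=32, hop=16)      # 4 ms window, bins fs/32 apart

def peak_bin(spectrum):
    """Index of the largest magnitude in one spectrum."""
    return max(range(len(spectrum)), key=spectrum.__getitem__)

print(len(narrow_band), fs / 256, peak_bin(narrow_band[0]))
print(len(wide_band), fs / 32, peak_bin(wide_band[0]))
```

With the long window the same 64 ms of signal yields few frames with closely spaced frequency bins; with the short window it yields many frames with widely spaced bins, which is why wide-band spectrograms show individual glottal pulses while narrow-band spectrograms show individual pitch harmonics.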
  22. 22. Spectral Envelope from a Long Segment of Speech. Figure: magnitude plotted against frame index and frequency (Hz).
  23. 23. Classification of sound units (tree diagram). Phonemes: Vowels; Diphthongs; Semi-Vowels (Liquids, Glides); Consonants (Nasals, Plosives, Fricatives, Whispers, Affricates). Vowels (Front, Mid, Back): i (eve), I (it), e (hate), [symbol lost] (met), U (...k), u (...t), [symbol lost] (up), a (father), o (Obey), c (...ll). Diphthongs: ay (buy), aw (down), ey (bait), O (boy). Affricates: tz (sports), jh (judge), ch (church). Liquids: l (large), r (run). Glides: w (wit), y (you). Nasals: m (met), n (net), ng (sing). Whispers: h (he). Plosives, voiced: b (ball), d (debt), g (get); unvoiced: k (kit), p (pen), t (ten). Fricatives, voiced: v (vat), dh (that), z (zoo); unvoiced: f (fun), th (thing), s (sat), sh (should).
  24. 24. Representation of sound units in speech. Sounds are classified into vowels and consonants. Vowels are produced by exciting a fixed vocal tract shape with quasi-periodic glottal pulses. Vowels are classified as front, mid or back according to the position of the tongue hump. Front vowels: /i/ ("eve"), /I/ ("it"), /æ/ ("at"), /e/ ("hate"). Mid vowels: /a/ ("father"), /Λ/ ("up"). Back vowels: /U/ ("foot"), /u/ ("boot"), /o/ ("obey"). Another classification is based on vowel length: long and short. Diphthongs are combinations of two vowels: /ay/ as in "buy", /aw/ as in "down", /ey/ as in "bait", /o/ as in "boat", /cy/ as in "boy", etc.
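The description of vowel production above (a fixed vocal tract shape excited by quasi-periodic glottal pulses) is the source-filter view of speech. The following is a minimal sketch under strong simplifying assumptions, not a model from the slides: the glottal source is an ideal impulse train, and the vocal tract is reduced to a single two-pole resonator standing in for one formant. The 100 Hz pitch, the 700 Hz center frequency, the 100 Hz bandwidth, and the function names are illustrative choices.

```python
import math

def impulse_train(num_samples, period):
    """Idealized glottal source: one unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(num_samples)]

def resonator(excitation, center_hz, bandwidth_hz, fs_hz):
    """Two-pole resonator y[n] = x[n] + 2r*cos(theta)*y[n-1] - r^2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius from bandwidth
    theta = 2.0 * math.pi * center_hz / fs_hz       # pole angle from center frequency
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in excitation:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

fs = 8000            # assumed sampling rate in Hz
pitch_period = 80    # samples per glottal cycle, i.e. a 100 Hz pitch at 8 kHz
excitation = impulse_train(1600, pitch_period)
vowel_like = resonator(excitation, center_hz=700.0, bandwidth_hz=100.0, fs_hz=fs)

# After the start-up transient has decayed, the output repeats every pitch period.
print(abs(vowel_like[1000 + pitch_period] - vowel_like[1000]))
```

A more realistic sketch would cascade several resonators, one per formant, and replace the impulse train with a smoother glottal pulse shape; changing the resonator frequencies while keeping the same excitation is what distinguishes one vowel from another in this view.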
  25. 25. Front Vowel. Figure: speech signal and spectrogram panels for the front vowels I (it), e (hate) and i (eve).
  26. 26. Vowel Analysis. Front vowels are found to show high-frequency resonances. Front vowels are distinguished from one another by the tongue height during vowel production. Mid vowels are found to show well-separated, balanced resonant frequencies. Back vowels show almost no energy beyond the low-frequency region.
  27. 27. Diphthongs.
  28. 28. Semivowels. A group of sounds consisting of /w/, /r/, /l/ and /y/, which are difficult to characterize because they are vowel-like in nature. They are characterized by a gliding transition in the vocal tract area function between adjacent phonemes, and are best described as transitional, vowel-like sounds.
  29. 29. Nasal Consonants. A group of sounds consisting of /m/, /n/ and /η/. They are produced with glottal excitation and the vocal tract totally constricted along the oral passageway. The velum is lowered so that air flows through the nasal cavity instead of out through the oral cavity. Due to the acoustic coupling of the oral cavity to the pharynx, anti-resonances are created. /m/, /n/ and /η/ are produced by a constriction at the lips, behind the teeth and at the velum, respectively.
  30. 30. Nasalized Vowels.
  31. 31. Unvoiced Fricatives. Produced by exciting the vocal tract with a turbulent airflow through a narrow constriction. /f/ ("four"), /θ/ ("thing"), /s/ ("sat") and /sh/ ("shut") form the class of unvoiced fricatives. /f/: constriction at the teeth. /s/: constriction near the middle of the oral cavity. /sh/: constriction at the end of the oral tract.
  32. 32. Voiced Fricatives. /v/ ("vat"), /δ/ ("that"), /z/ ("zoo") and /zh/ ("azure") form the class of voiced fricatives. /v/: constriction at the teeth. /z/: constriction near the middle of the oral cavity. /zh/: constriction at the end of the oral tract. Apart from the presence of glottal vibration, the place of articulation is the same as for the corresponding unvoiced fricatives.