Nikolay V. Karpov (nkarpov(а)hse.ru)

Duration
 1 module, 10 weeks, 40 academic hours


Requirements
 3 practical assignments at home using Java, MATLAB,
  or another language (lms.hse.ru)
 Final assessment
   2 Modelling Speech Production Acoustics
   3 Time/Frequency Representation.
    Properties of Digital Filters
   4 Linear Predictive Modelling
   5-6 Speech Coding
   7 Phonetics
   8 Speech Synthesis
   9-10 Speech Recognition
   Lingvocourse.ru
    http://lingvocourse.ru/wiki/index.php/Speech_recognition
   Digital Speech Processing, Synthesis, and Recognition
    / Sadaoki Furui. 2nd ed.
   Speech Analysis Synthesis and Perception
    http://hear.ai.uiuc.edu/ECE537/PDF/main-all.pdf
   Fundamentals of Speech Recognition: A Short Course
    http://speech.tifr.res.in/tutorials/fundamentalOfASR_picone96.pdf
   Speech Processing. 20 lectures in the Spring Term.
    Mike Brookes
    http://www.ee.ic.ac.uk/hp/staff/dmb/courses/speech/speech.htm
   Coding
   Synthesis
   Recognition
   Identity Verification
   Enhancement
What: To transmit/store a speech waveform using as few bits as
possible while retaining high quality
Why: To save bandwidth in telecoms applications and to reduce
memory storage requirements.
How:
 Correlation ⇒ Predictability ⇒ Redundancy
    ◦ Predict waveform samples from previous samples and transmit only the
      prediction error
    ◦ The autocorrelation is the Fourier transform of the power spectrum: a peaky
      spectrum ⇒ strong short-term correlations (~ 0.5 ms)
    ◦ Voiced speech is almost periodic ⇒ strong long-term correlations (~ 10
      ms)
   Devote few bits to the aspects of speech where errors are least
    noticeable
    ◦ High amplitude speech will mask noise at the same frequency
   Ignore aspects of the speech that are inaudible
    ◦ Power spectrum is much more important than precise waveform
    ◦ For aperiodic sounds, the fine detail of the spectrum does not matter
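
The first idea above (predict each sample from previous samples and transmit only the prediction error) can be sketched in a few lines. This is an illustration, not the course's own code: the function names, the order-2 predictor, and the synthetic 200 Hz test tone are all assumptions. It uses the autocorrelation method with the Levinson-Durbin recursion.

```python
import math

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x (no normalisation)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] for x_hat[n] = sum_k a[k] * x[n-k].

    Returns (a, err): a[0] is unused (0.0), err is the minimum prediction-error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def residual(x, a):
    """Prediction error e[n] = x[n] - sum_k a[k] * x[n-k], assuming zeros before n = 0."""
    order = len(a) - 1
    out = []
    for n in range(len(x)):
        pred = sum(a[k] * x[n - k] for k in range(1, order + 1) if n - k >= 0)
        out.append(x[n] - pred)
    return out

# Assumed test signal: a 200 Hz tone sampled at 8 kHz (highly predictable).
fs = 8000
x = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
r = autocorr(x, 2)
a, err = levinson_durbin(r, 2)
e = residual(x, a)
signal_energy = sum(v * v for v in x)
residual_energy = sum(v * v for v in e)
print(f"residual/signal energy = {residual_energy / signal_energy:.4f}")
```

For a strongly correlated signal like this one, the residual carries only a small fraction of the signal energy, which is why it can be coded with fewer bits than the waveform itself.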
What: To convert a text string into a speech waveform
Why: For technology to communicate when a display would be
inconvenient because:
 (a) Too big, (b) Eyes busy, (c) Via phone, (d) In the dark, (e)
  Moving around
Problems:
 The spelling of words doesn't match their sound
    ◦ Pronunciation rules + an exceptions dictionary
   Some words have multiple meanings + sounds
    ◦ Must guess which is the correct sound
   Simplistic speech models sound mechanical
    ◦ Can use extracts from real speech
   Speech sounds are influenced by adjacent phonemes
    ◦ Use phoneme pairs from real speech
   Important words must be slightly louder
    ◦ Must try to understand the text unit
   Voice pitch and talking speed must vary smoothly throughout a
    sentence
    ◦ Must be able to change pitch and speed without affecting formant
      frequencies
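
The "pronunciation rules + an exceptions dictionary" approach from the first problem above can be sketched as follows. Everything here is hypothetical: the tiny rule table, the exception entries, and the phoneme symbols are invented for illustration and are far too small for real text.

```python
# Hypothetical exceptions dictionary: words the rules would get wrong.
EXCEPTIONS = {
    "one": ["w", "ah", "n"],
    "two": ["t", "uw"],
}

# Toy letter-to-sound rules; longer letter groups are listed before shorter ones.
RULES = [
    ("sh", ["sh"]),
    ("ch", ["ch"]),
    ("ee", ["iy"]),
    ("a", ["ae"]),
    ("e", ["eh"]),
    ("i", ["ih"]),
    ("o", ["aa"]),
    ("u", ["ah"]),
]

def to_phonemes(word):
    """Look the word up in the exceptions first, then fall back to the rules."""
    word = word.lower()
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    phonemes = []
    i = 0
    while i < len(word):
        for letters, sounds in RULES:
            if word.startswith(letters, i):
                phonemes.extend(sounds)
                i += len(letters)
                break
        else:
            # No rule matched: assume the letter maps to a phoneme of the same name.
            phonemes.append(word[i])
            i += 1
    return phonemes

print(to_phonemes("ship"))
print(to_phonemes("One"))
```

Checking the exceptions before the rules keeps the rule table small: irregular words are simply listed rather than handled by ever more specific rules.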
What: To convert a speech waveform into text
Why: To communicate and control technology when a keyboard would be
inconvenient because:
 (a) Too big, (b) Hands busy, (c) Via phone, (d) In the dark, (e) Moving
   around
Problems:
 The spelling of words doesn't match their sound
    ◦ Have a big phonetic dictionary
   The waveform of a word varies a lot between different speakers (or even
    the same speaker)
    ◦ Extract features from the speech waveform that are more consistent than the
      waveform
   The extracted features won't be exactly repeatable
    ◦ Characterize them with a probability distribution
   Speech sounds are influenced by adjacent phonemes
    ◦ Use context-dependent probability distributions
   Speaking speed varies enormously
    ◦ Try all possible speaking speeds
   No clear boundary between words or phonemes
    ◦ Try all possible boundaries
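
"Try all possible speaking speeds" does not have to mean brute force. One classic way to do it is dynamic time warping (DTW), which uses dynamic programming to find the best alignment between two sequences. The slide does not name a method, so treat this as one possible sketch; the one-dimensional "features" and the sequences below are made up for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of numbers.

    cost[i][j] is the cheapest way to align the first i items of a
    with the first j items of b, allowing either sequence to be stretched.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

template = [0, 1, 3, 1, 0]
slow = [0, 0, 1, 1, 3, 3, 1, 1, 0, 0]   # same shape spoken at half speed
other = [0, 3, 0, 3, 0]                 # a different shape

print(dtw_distance(template, slow))
print(dtw_distance(template, other))
```

The half-speed copy aligns with the template at zero cost, while the different shape does not, so the distance reflects the shape of the utterance rather than its speed.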
Speech waves convey:
  Speaker meaning
  Individual information
  Emotion of speaker


Phrase (sentence) -> word units -> word -> syllables -> phonemes
  Russian
а э и о у ы п п' б б' м м' ф ф' в в' т т' д д' н н' с
с' з з' р р' л л' ш ж щ җ ц ч й к к' г г' х х'

 English
http://en.wikipedia.org/wiki/English_phonology
   Speakers and listeners divide words into component
    sounds called phonemes.
    ◦ Native speakers agree on the phonemes that make up a
      particular word
    ◦ There are about 42 phonemes in English
 The phonemes in a particular word may vary with
dialect
   The actual sound that corresponds to a particular
    phoneme depends on:
    ◦   the adjacent phonemes in the word or sentence
    ◦   the accent of the speaker
    ◦   the talking speed
    ◦   whether it is a formal or informal occasion
   Turbulence: air moving quickly through a
    small hole (e.g. /s/ in “size”)
   Explosion: pressure built up behind a
    blockage is suddenly released (e.g. /p/ in
    “pop”)
   Vocal Cord (Fold) Vibration
• airflow through vocal folds (vocal cords) reduces the
pressure and they snap shut (Bernoulli effect)
• muscle tension and air pressure buildup force the folds
open again and the process repeats
• frequency of vibration (fx) determined by tension in
vocal folds and pressure from lungs
• for normal breathing and voiceless sounds (e.g. /s/) the
vocal folds are held wide open and don't vibrate
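
Because the vocal folds open and close almost periodically, the vibration frequency can be estimated from the waveform by looking for the lag at which the signal best matches a shifted copy of itself. The sketch below is an assumption-laden illustration, not a method given in the slides: the function name, the 60 to 400 Hz search range, and the synthetic "voiced" signal (a 100 Hz fundamental with two harmonics) are all chosen for the example.

```python
import math

def estimate_f0(x, fs, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency as fs / (lag of the autocorrelation peak)."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

# Assumed test signal: 100 ms of a 100 Hz fundamental plus two weaker harmonics.
fs = 8000
x = [
    math.sin(2 * math.pi * 100 * n / fs)
    + 0.5 * math.sin(2 * math.pi * 200 * n / fs)
    + 0.25 * math.sin(2 * math.pi * 300 * n / fs)
    for n in range(800)
]
f0 = estimate_f0(x, fs)
print(f"estimated fundamental: {f0:.1f} Hz")
```

On real speech this simple peak-picking is fragile (it can lock onto a multiple of the true period, and unvoiced sounds have no clear peak), so practical estimators add voicing decisions and smoothing.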
   Vowel /а/, /о/, /у/
   Consonant
    ◦ Unvoiced
        Fricative /ш/, /щ/, /ф/, /х/
        Plosive /п/, /к/, /т/
        Affricate /ч/, /ц/
    ◦ Voiced
        Fricative /ж/, /җ/, /в/, /р/
        Plosive /б/, /г/, /д/
        Diphthongs /oj/
        Nasal /н/, /м/
        Semivowel /r/, /j/, /w/
   The sound spectrum is modified by the shape
    of the vocal tract. This is determined by
    movements of the jaw, tongue and lips.
   The resonant frequencies of the vocal tract
    cause peaks in the spectrum called formants.
   The first two formant frequencies are roughly
    determined by the distances from the tongue
    hump to the larynx and to the lips
    respectively.
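
To get a feel for where vocal tract resonances lie, a common textbook simplification (not taken from these slides) treats the tract as a uniform tube closed at the larynx and open at the lips, whose resonances fall at odd multiples of c / (4L). The tube length of 0.17 m and the speed of sound of 340 m/s below are assumed typical values; the model ignores the tongue hump, so it only approximates a neutral vowel.

```python
def tube_resonances(length_m, count=3, speed_of_sound=340.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    F_n = (2n - 1) * c / (4 * L) for n = 1, 2, ...
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]

formants = tube_resonances(0.17)
print([round(f) for f in formants])
```

With these assumed values the first three resonances come out near 500, 1500, and 2500 Hz. Moving the tongue hump changes the effective cavity lengths and shifts the first two formants away from these neutral positions.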
Principal characteristics of speech

Principal characteristics of speech

  • 1. Nikolay V. Karpov (nkarpov(а)hse.ru) Duration  1 module, 10 weeks, 40 academic hours Requirements  3 practical works at home using Java, Matlab or others (lms.hse.ru)  Final assessment
  • 2. 2 Modelling Speech Production Acoustics  3 Time/Frequency Representation. Properties of Digital Filters  4 Linear Predictive Modelling  5-6 Speech Coding  7 Phonetics  8 Speech Synthesis  9-10 Speech Recognition
  • 3. Lingvocourse.ru http://lingvocourse.ru/wiki/index.php/Speech_recog nition  Digital speech processing, synthesis, and recognition / Sadaoki Furui.- 2nd ed.,  Speech Analysis Synthesis and Perception http://hear.ai.uiuc.edu/ECE537/PDF/main-all.pdf  FUNDAMETALS OF SPEECH RECOGNITION: A SHORT COURSE http://speech.tifr.res.in/tutorials/fundamentalOfASR_ picone96.pdf  Speech Processing. 20 lectures in the Spring Term. Mike Brookes http://www.ee.ic.ac.uk/hp/staff/dmb/courses/speec h/speech.htm
  • 4.
      • Coding
      • Synthesis
      • Recognition
      • Identity Verification
      • Enhancement
  • 5. What: To transmit/store a speech waveform using as few bits as possible while retaining high quality.
    Why: To save bandwidth in telecoms applications and to reduce memory storage requirements.
    How:
      • Correlation ⇒ Predictability ⇒ Redundancy
        ◦ Predict waveform samples from previous samples and transmit only the prediction error
        ◦ Autocorrelation is the Fourier transform of the power spectrum: a peaky spectrum ⇒ strong short-term correlations (~ 0.5 ms)
        ◦ Voiced speech is almost periodic ⇒ strong long-term correlations (~ 10 ms)
      • Devote few bits to the aspects of speech where errors are least noticeable
        ◦ High-amplitude speech will mask noise at the same frequency
      • Ignore aspects of the speech that are inaudible
        ◦ Power spectrum is much more important than precise waveform
        ◦ For aperiodic sounds, the fine detail of the spectrum does not matter
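The "predict from previous samples and transmit only the prediction error" idea on slide 5 can be sketched in Python as follows. This is only a toy illustration under stated assumptions: the signal is a synthetic 100 Hz tone with one harmonic standing in for voiced speech, the predictor uses a single coefficient, and the names `autocorrelation`, `first_order_residual`, `FS` and `PITCH_HZ` are hypothetical choices that do not come from the slides.

```python
import math

def autocorrelation(x, lag):
    """Unnormalised autocorrelation of x at a non-negative lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def first_order_residual(x):
    """Predict each sample as a * (previous sample); return (a, prediction errors)."""
    a = autocorrelation(x, 1) / autocorrelation(x, 0)  # single predictor coefficient
    residual = [x[0]] + [x[n] - a * x[n - 1] for n in range(1, len(x))]
    return a, residual

def energy(x):
    return sum(v * v for v in x)

# Assumed toy signal: 100 Hz "pitch" plus one harmonic, sampled at 8 kHz.
FS = 8000
PITCH_HZ = 100
signal = [math.sin(2 * math.pi * PITCH_HZ * n / FS)
          + 0.5 * math.sin(2 * math.pi * 2 * PITCH_HZ * n / FS)
          for n in range(800)]

a, residual = first_order_residual(signal)
# How much smaller the error signal is than the original, in dB.
gain_db = 10 * math.log10(energy(signal) / energy(residual))

# Long-term correlation: the autocorrelation peaks again one pitch period later.
period = FS // PITCH_HZ  # 80 samples = 10 ms
r0 = autocorrelation(signal, 0)
r_period = autocorrelation(signal, period) / r0
r_half = autocorrelation(signal, period // 2) / r0

print(f"predictor coefficient a = {a:.3f}")
print(f"prediction gain = {gain_db:.1f} dB")
print(f"normalised autocorrelation at 10 ms = {r_period:.2f}, at 5 ms = {r_half:.2f}")
```

For this smooth toy signal the residual carries far less energy than the waveform itself, which is why coding the error is cheaper. Real speech coders use higher-order predictors and a separate long-term (pitch) predictor; the single coefficient here is a simplification.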
  • 6. What: To convert a text string into a speech waveform.
    Why: For technology to communicate when a display would be inconvenient because:
      • (a) Too big, (b) Eyes busy, (c) Via phone, (d) In the dark, (e) Moving around
    Problems:
      • The spelling of words doesn't match their sound
        ◦ Pronunciation rules + an exceptions dictionary
      • Some words have multiple meanings + sounds
        ◦ Must guess which is the correct sound
      • Simplistic speech models sound mechanical
        ◦ Can use extracts from real speech
      • Speech sounds are influenced by adjacent phonemes
        ◦ Use phoneme pairs from real speech
      • Important words must be slightly louder
        ◦ Must try to understand the text unit
      • Voice pitch and talking speed must vary smoothly throughout a sentence
        ◦ Must be able to change pitch and speed without affecting formant frequencies
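The "pronunciation rules + an exceptions dictionary" approach on slide 6 might look roughly like the sketch below. Everything in it is hypothetical: the one-symbol-per-letter rule table, the ARPAbet-style symbols and the two exception entries are made up for illustration, and a real system would use context-sensitive rules and a far larger dictionary.

```python
# Hypothetical, deliberately tiny letter-to-sound rules (one symbol per letter).
LETTER_RULES = {
    "a": "AE", "c": "K", "d": "D", "e": "EH", "g": "G",
    "n": "N", "o": "AA", "t": "T", "w": "W",
}

# Hypothetical exceptions: words whose spelling does not follow the rules.
EXCEPTIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
}

def pronounce(word):
    """Look the word up in the exceptions first, then fall back to the rules."""
    word = word.lower()
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    # Letters without a rule are silently skipped in this sketch.
    return [LETTER_RULES[ch] for ch in word if ch in LETTER_RULES]

for w in ["cat", "dog", "one", "two"]:
    print(w, pronounce(w))
```

Checking the exceptions before the rules keeps the rule set small: irregular words are simply listed rather than forcing extra special cases into the rules.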
  • 7. What: To convert a speech waveform into text.
    Why: To communicate and control technology when a keyboard would be inconvenient because:
      • (a) Too big, (b) Hands busy, (c) Via phone, (d) In the dark, (e) Moving around
    Problems:
      • The spelling of words doesn't match their sound
        ◦ Have a big phonetic dictionary
      • The waveform of a word varies a lot between different speakers (or even the same speaker)
        ◦ Extract features from the speech waveform that are more consistent than the waveform
      • The extracted features won't be exactly repeatable
        ◦ Characterize them with a probability distribution
      • Speech sounds are influenced by adjacent phonemes
        ◦ Use context-dependent probability distributions
      • Speaking speed varies enormously
        ◦ Try all possible speaking speeds
      • No clear boundary between words or phonemes
        ◦ Try all possible boundaries
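One classical way to "try all possible speaking speeds" is dynamic time warping (DTW). The slides do not name a technique, so DTW is swapped in here purely as an illustration. The sketch compares sequences of scalar features, whereas real recognisers use feature vectors. The function name `dtw_distance` and the three example sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of scalar features.

    Each cell holds the cheapest total cost of aligning a[:i] with b[:j];
    allowing repeats of either sequence's frames models speed changes.
    """
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])           # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b's frame
                                 cost[i][j - 1],      # stretch a's frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]

# Assumed toy data: a template, the same "word" spoken at half speed, and a different pattern.
template = [1.0, 2.0, 3.0, 2.0, 1.0]
slow = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 2.0, 2.0, 1.0, 1.0]
other = [3.0, 1.0, 3.0, 1.0, 3.0]

print("template vs slow :", dtw_distance(template, slow))
print("template vs other:", dtw_distance(template, other))
```

Because the table covers every monotonic alignment, the half-speed version matches the template at zero cost while the unrelated pattern does not. The same dynamic-programming idea underlies the search over boundaries mentioned in the last bullet.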
  • 8. Speech waves convey:
      • Speaker meaning
      • Individual information
      • Emotion of speaker
    Phrase (sentence) -> word units -> word -> syllables -> phonemes
  • 9.
      • Russian: а э и о у ы п п' б б' м м' ф ф' в в' т т' д д' н н' с с' з з' р р' л л' ш ж щ җ ц ч й к к' г г' х х'
      • English: http://en.wikipedia.org/wiki/English_phonology
  • 10.
      • Speakers and listeners divide words into component sounds called phonemes.
        ◦ Native speakers agree on the phonemes that make up a particular word
        ◦ There are about 42 phonemes in English
      • The phonemes in a particular word may vary with dialect
      • The actual sound that corresponds to a particular phoneme depends on:
        ◦ the adjacent phonemes in the word or sentence
        ◦ the accent of the speaker
        ◦ the talking speed
        ◦ whether it is a formal or informal occasion
  • 12.
      • Turbulence: air moving quickly through a small hole (e.g. /s/ in “size”)
      • Explosion: pressure built up behind a blockage is suddenly released (e.g. /p/ in “pop”)
      • Vocal cord (fold) vibration
        ◦ airflow through the vocal folds (vocal cords) reduces the pressure and they snap shut (Bernoulli effect)
        ◦ muscle tension and air pressure buildup force the folds open again and the process repeats
        ◦ frequency of vibration (fx) is determined by the tension in the vocal folds and the pressure from the lungs
        ◦ for normal breathing and voiceless sounds (e.g. /s/) the vocal folds are held wide open and don't vibrate
  • 13.
      • Vowel /а/, /о/, /у/
      • Consonant
        ◦ Unvoiced
            Fricative /ш/, /щ/, /ф/, /х/
            Plosive /п/, /к/, /т/
            Affricate /ч/, /ц/
        ◦ Voiced
            Fricative /ж/, /җ/, /в/, /р/
            Plosive /б/, /г/, /д/
            Diphthongs /oj/
            Nasal /н/, /м/
            Semivowel /r/, /j/, /w/
  • 14.
      • The sound spectrum is modified by the shape of the vocal tract. This is determined by movements of the jaw, tongue and lips.
      • The resonant frequencies of the vocal tract cause peaks in the spectrum called formants.
      • The first two formant frequencies are roughly determined by the distances from the tongue hump to the larynx and to the lips respectively.
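To get a feel for where vocal tract resonances fall, a common textbook idealisation (not taken from the slides) treats the tract as a uniform tube closed at the larynx and open at the lips, whose resonances are F_n = (2n - 1) c / (4 L). The Python sketch below evaluates that formula. The 17 cm tract length, the 340 m/s speed of sound and the function name `tube_resonances` are assumptions chosen for illustration.

```python
def tube_resonances(length_m, count=3, speed_of_sound=340.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]

# Assumed values: a 17 cm vocal tract and c = 340 m/s.
formants = tube_resonances(0.17)
print([round(f) for f in formants])  # roughly 500, 1500 and 2500 Hz
```

A real vocal tract is not uniform: moving the tongue hump, jaw and lips shifts these resonances away from the evenly spaced values, which is what distinguishes one vowel from another.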