Speech acoustics

Published in: Education

  1. Speech acoustics
     - Objectives:
       - Describe relative frequency and intensity of phonemes by voice, manner, and formant frequency.
       - Describe various phonemic cues.
       - Describe speech constraints.
  2. Average speech intensity
     - ~65 dB SPL (~45 dB HL)
     - 30 dB range
     - Any vowel has more power than any consonant
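To put the intensity figures on this slide in linear terms, here is a minimal Python sketch of the standard decibel relations. The function names are illustrative, and the fixed 20 dB offset between dB SPL and dB HL is only an assumption read off the slide's "~65 dB SPL (~45 dB HL)"; the real offset varies with frequency and transducer.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a power (intensity) ratio."""
    return 10 ** (db / 10)


def db_to_pressure_ratio(db: float) -> float:
    """Convert a level difference in dB to a sound-pressure ratio."""
    return 10 ** (db / 20)


def spl_to_hl(spl_db: float, offset_db: float = 20.0) -> float:
    """Rough dB SPL -> dB HL conversion for speech.

    offset_db is an assumption inferred from the slide (65 dB SPL ~ 45 dB HL),
    not a calibrated, frequency-specific reference value.
    """
    return spl_db - offset_db


if __name__ == "__main__":
    # The 30 dB range of speech as a power ratio and as a pressure ratio.
    print(db_to_power_ratio(30))
    print(db_to_pressure_ratio(30))
    print(spl_to_hl(65))
```

Under these relations, the 30 dB range between the weakest and strongest speech sounds corresponds to roughly a thousandfold difference in power.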
  3. Average speech frequency
     - ~50–10,000 Hz
     - Most energy below 1000 Hz
     - Fundamental frequency
       - Men: 100 Hz
       - Women: 200 Hz
       - Children: 300 Hz
       - Crying babies: 500 Hz
       - Cues for talker identity
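As a small worked example, the fundamental frequencies listed on this slide can be converted to the duration of one glottal cycle (period = 1 / frequency). The dictionary and function names below are illustrative only; the frequency values are the slide's.

```python
# Typical fundamental frequencies from the slide, in Hz.
F0_HZ = {"men": 100, "women": 200, "children": 300, "crying babies": 500}


def period_ms(f0_hz: float) -> float:
    """Duration of one cycle in milliseconds for a frequency in Hz."""
    return 1000.0 / f0_hz


PERIODS_MS = {talker: period_ms(f0) for talker, f0 in F0_HZ.items()}

if __name__ == "__main__":
    for talker, period in PERIODS_MS.items():
        print(f"{talker}: {period:.2f} ms per cycle")
```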
  4. Average speech duration
     - Vowels: 130–360 ms
     - Consonants: 20–150 ms
     - Rate: ~5 syllables/second; ~12 phonemes/second
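The rate figures on this slide imply average unit durations, which can be checked with simple arithmetic. This is a sketch using only the slide's numbers; the constant and function names are illustrative.

```python
SYLLABLES_PER_SECOND = 5   # from the slide (approximate)
PHONEMES_PER_SECOND = 12   # from the slide (approximate)


def mean_duration_ms(units_per_second: float) -> float:
    """Average duration of one unit in milliseconds at a given rate."""
    return 1000.0 / units_per_second


MEAN_SYLLABLE_MS = mean_duration_ms(SYLLABLES_PER_SECOND)
MEAN_PHONEME_MS = mean_duration_ms(PHONEMES_PER_SECOND)
PHONEMES_PER_SYLLABLE = PHONEMES_PER_SECOND / SYLLABLES_PER_SECOND

if __name__ == "__main__":
    print(MEAN_SYLLABLE_MS, round(MEAN_PHONEME_MS, 1), PHONEMES_PER_SYLLABLE)
```

The implied average phoneme duration (about 83 ms) falls inside the 20 to 360 ms span of the consonant and vowel ranges quoted on the slide, so the rate and duration figures are mutually consistent.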
  5. Vowel formants
     - Low F1, Low F2
     - Low F1, High F2
     - High F1, Low F2
     - High F1, High F2
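The four F1/F2 combinations on this slide can be illustrated with a small Python sketch that sorts a vowel into one of the quadrants. Everything numeric here is an assumption rather than slide content: the 500 Hz and 1500 Hz split points are hypothetical, and the formant values are approximate textbook averages for adult male talkers, used only as sample input.

```python
F1_SPLIT_HZ = 500    # hypothetical boundary between "low" and "high" F1
F2_SPLIT_HZ = 1500   # hypothetical boundary between "low" and "high" F2

# Approximate (F1, F2) values in Hz for four corner vowels.
# Assumed illustrative data, not taken from the slides.
CORNER_VOWELS = {
    "u": (300, 870),
    "i": (270, 2290),
    "ɑ": (730, 1090),
    "æ": (660, 1720),
}


def formant_quadrant(f1_hz: float, f2_hz: float) -> str:
    """Label a vowel by whether its F1 and F2 are low or high."""
    f1_label = "Low F1" if f1_hz < F1_SPLIT_HZ else "High F1"
    f2_label = "Low F2" if f2_hz < F2_SPLIT_HZ else "High F2"
    return f"{f1_label}, {f2_label}"


if __name__ == "__main__":
    for vowel, (f1, f2) in CORNER_VOWELS.items():
        print(f"/{vowel}/: {formant_quadrant(f1, f2)}")
```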
  6. Vowel formants
  7. Consonants: place, manner, voicing
  8. Consonants: energy bands
  9. Phonemic cues - Stops
     - Closure
       - Voiceless stops – silent period
       - Voiced stops – low-level energy
     - Burst
       - Wide-band energy, ~40 ms
       - Greater intensity for voiceless stops
       - Frequency depends on place
     - Formant transition
       - First formant always rising
       - Second formant transition depends on place
  10. Phonemic cues - Stops
      - Voicing is easier to detect than place
      - For voiced stops
        - Voice-onset time is earlier
        - Energy present at fundamental frequency
        - Burst energy is lower in amplitude
        - Vowels are longer in duration before voiced final stops (“eyes” vs. “ice”)
  11. Phonemic cues - Nasals
      - Always voiced
      - Continuant
      - Nasal resonance
        - highest for /m/
        - lowest for /n/
      - Second formant (frequency and transition) gives place information
  12. Phonemic cues - Fricatives
      - Hissing quality
      - Voiced fricatives
        - Periodic
        - Lower frequency
        - Lower amplitude
        - Greater overall energy (from fundamental)
      - Sibilants (s, z, sh, zh)
        - Higher amplitude than other fricatives
  13. /f/, /θ/, /s/, /ʃ/
  14. Suprasegmental cues
      - Stress
        - changes in fundamental frequency, intensity, duration
      - Intonation
        - changes in fundamental frequency, pitch pattern
        - expresses attitudes, feeling, meaning (command, request, statement)
      - Duration
        - variations in speech sounds due to context of other sounds
  15. Speech constraints
      - Syntactic
        - S = NP (Aux) VP
        - NP = (Det) (AP) N (PP)
          - “the naughty boy in the daycare…”
        - VP = V (NP) (PP) (Adv)
          - “…took the toy away brusquely”
  17. Speech constraints
      - Syntactic
        - The question “What should you eat?”
          - Answer is a noun phrase
        - The question “How should you eat?”
          - Answer is an adverbial phrase
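The phrase-structure rules on the syntactic slides above can be read as a small grammar with optional constituents. The following is a minimal Python sketch of a recognizer for those rules over part-of-speech tags. The rules for AP and PP are assumptions added for illustration (the slides do not define them), as is the tagging of the example sentence, including treating "away" as a preposition-like particle.

```python
# Rules from the slides; a trailing "?" marks an optional constituent.
RULES = {
    "S": ["NP", "Aux?", "VP"],
    "NP": ["Det?", "AP?", "N", "PP?"],
    "VP": ["V", "NP?", "PP?", "Adv?"],
    "AP": ["Adj"],        # assumption: not defined on the slides
    "PP": ["P", "NP?"],   # assumption: not defined on the slides
}


def ends(symbol, tags, start):
    """Return every index at which `symbol` can end when it begins at `start`."""
    if symbol not in RULES:  # terminal: must match the tag at this position
        if start < len(tags) and tags[start] == symbol:
            return {start + 1}
        return set()
    positions = {start}
    for part in RULES[symbol]:
        optional = part.endswith("?")
        name = part.rstrip("?")
        next_positions = set()
        for pos in positions:
            next_positions |= ends(name, tags, pos)
            if optional:
                next_positions.add(pos)  # skip the optional constituent
        positions = next_positions
    return positions


def is_sentence(tags):
    """True if the whole tag sequence can be derived from S."""
    return len(tags) in ends("S", tags, 0)


# "the naughty boy in the daycare took the toy away brusquely"
# (hypothetical tagging for illustration)
EXAMPLE_TAGS = ["Det", "Adj", "N", "P", "Det", "N", "V", "Det", "N", "P", "Adv"]

if __name__ == "__main__":
    print(is_sentence(EXAMPLE_TAGS))
    print(is_sentence(["V", "Det", "N"]))
```

Because every rule begins with or requires a word-consuming element before it can recurse, the recognizer always terminates; tracking sets of end positions lets it handle the ambiguity of where an optional PP attaches.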
  18. Speech constraints
      - Semantic
        - Words in a sentence are related meaningfully
        - “Plug the mouse into the computer”
      - Situational
        - Conversation usually refers to the context of the environment
        - “I like that oat!”
          - Mall vs. Farm
  19. Overlapping cues help protect the signal from noise
      - Speech predictability helps protect the signal from noise
      - Noise can come from
        - the speaker (poor intelligibility, etc.)
        - the environment (distractions, etc.)
        - the listener (ESL, etc.)
  20. Effects of hearing loss on speech perception
      - Objectives:
        - Describe speech characteristics that are lost and that are preserved for hearing losses of various degree, type and configuration.
  21. Auditory Response Area
  22. Auditory Response Area
  23. Auditory Response Area
  24. Speech audiogram
  25. Speech audiogram (thresholds plotted as X)
  26. Speech audiogram
  27. Consonants: energy bands
  28. Consonants: energy bands
  29. Consonants: energy bands
  30. Speech audiogram
  31. Speech audiogram
  35. 34 dots
  36. Correlating SII to speech
      - Adult values (children would be worse)
      - Digits easy
      - Words hard
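The "34 dots" slide and the SII slides appear to refer to a count-the-dots style of audiogram, in which dots scattered over the speech region are counted as audible when they lie at or above the listener's threshold level, and the audible share gives the index. The sketch below assumes that reading. The dot positions and thresholds are made-up toy data, not the published dot pattern, and the function name is illustrative.

```python
def count_the_dots_sii(dots, thresholds_db_hl):
    """Percentage of dots that are audible.

    dots: list of (frequency_hz, level_db_hl) pairs.
    thresholds_db_hl: mapping of frequency_hz -> hearing threshold in dB HL.
    A dot is counted as audible when its level is at or above the threshold
    at its frequency (plotted at or below the threshold line on an audiogram).
    """
    audible = sum(
        1 for freq, level in dots if level >= thresholds_db_hl[freq]
    )
    return 100 * audible / len(dots)


# Hypothetical toy data: 10 dots at four audiometric frequencies.
TOY_DOTS = [
    (500, 30), (500, 40),
    (1000, 25), (1000, 35), (1000, 45),
    (2000, 25), (2000, 35), (2000, 45),
    (4000, 30), (4000, 40),
]

# Hypothetical sloping loss: thresholds worsen toward high frequencies.
SLOPING_LOSS = {500: 20, 1000: 30, 2000: 40, 4000: 50}
NORMAL_HEARING = {500: 0, 1000: 0, 2000: 0, 4000: 0}

if __name__ == "__main__":
    print(count_the_dots_sii(TOY_DOTS, SLOPING_LOSS))
    print(count_the_dots_sii(TOY_DOTS, NORMAL_HEARING))
```

On a chart with 100 dots, each dot is worth one percentage point, so 34 audible dots would presumably correspond to an SII of 34.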
  37. [Audiogram figure; thresholds marked X]
  38. Correlating SII to speech
  42. Deafness
      - No access to average speech
  43. Severe
      - Access to only loudest components of speech
      - Speech production
        - High airflow rate
        - Speech initiation at low lung volumes
        - Poor velar control (nasality)
        - High fundamental frequency
        - Slow speech rate
  44. Moderate
      - Access to louder half of speech, or to loud speech
      - Speech production
        - Substitutions and distortions
        - Errors in affricates, fricatives, and blends
  45. Slight to Mild
      - Access to all but the quietest components of speech
      - Speech production
        - Fewer distortions/substitutions
        - Good intelligibility
  46. Rising vs. Sloping loss
  47. Rising vs. Sloping loss
      - SII = 64
      - SII = 45
