Published in: Education


  1. 1. Phonetics The Creation of Speech Sounds
  2. 2. Anatomy and phonetics <ul><li>Speech sounds are created by air pushed out of the lungs; the air vibrates as it passes through the vocal tract. </li></ul><ul><li>Different positions of the vocal folds, the tongue, the lips, and other articulators in the mouth modify the airflow, producing different speech sounds. </li></ul>
  3. 3. Periodic vibrations <ul><li>When a tuning fork vibrates, it creates successive waves of compression and rarefaction among air molecules. </li></ul>
  4. 4. Periodic vibrations <ul><li>We can plot these changes in density as a sine wave, where the horizontal axis represents time and the vertical axis represents the density of air molecules. </li></ul>
  5. 5. Sine wave (figure; axes: amplitude vs. time)
  6. 6. Phase <ul><li>Two waves can have the same frequency but different phase. Combining them increases the amplitude if they are in phase and decreases it if they are out of phase. </li></ul>
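The effect of phase on a combined wave can be checked numerically. The sketch below is only an illustration: the 100 Hz frequency, the 8000 Hz sampling rate, and the function name `summed_peak` are assumptions chosen for convenience, not values from the slides. It adds two unit-amplitude sine waves of the same frequency and reports the peak of the sum for three phase offsets.

```python
import math

def summed_peak(phase_offset_rad, freq_hz=100.0, sample_rate_hz=8000, num_samples=160):
    """Peak absolute value of two unit-amplitude sine waves added together.

    The second wave has the same frequency as the first but is shifted
    by phase_offset_rad. All default values are illustrative assumptions.
    """
    peak = 0.0
    for i in range(num_samples):
        t = i / sample_rate_hz
        first = math.sin(2 * math.pi * freq_hz * t)
        second = math.sin(2 * math.pi * freq_hz * t + phase_offset_rad)
        peak = max(peak, abs(first + second))
    return peak

in_phase = summed_peak(0.0)                # waves line up
out_of_phase = summed_peak(math.pi)        # shifted by half a cycle
quarter_cycle = summed_peak(math.pi / 2)   # shifted by a quarter cycle

print(f"in phase:                   {in_phase:.3f}")
print(f"half a cycle out of phase:  {out_of_phase:.3f}")
print(f"quarter cycle out of phase: {quarter_cycle:.3f}")
```

With these settings the in-phase pair should peak at about twice the amplitude of either wave, the half-cycle pair should cancel almost completely, and the quarter-cycle pair should fall in between.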
  7. 7. Amplitude <ul><li>Two waves can have the same phase, but different amplitudes. </li></ul>
  8. 8. Complex waves <ul><li>The combination of sine waves can result in a complex wave with virtually any shape. </li></ul>
  9. 9. Complex waves <ul><li>Adding higher harmonics gives this wave a sawtooth shape. </li></ul>
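One way to see how harmonics build a sawtooth is to sum them directly. The sketch below assumes the standard Fourier series for a sawtooth (harmonic k weighted by 1/k with alternating sign); the slides do not state the weights, and the function names and the 100 Hz fundamental are illustrative. It measures how far the partial sum is from an ideal ramp as more harmonics are added.

```python
import math

def harmonic_sum(t, f0_hz, num_harmonics):
    """Partial Fourier series of a sawtooth: harmonic k has weight 1/k, alternating sign."""
    total = 0.0
    for k in range(1, num_harmonics + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        total += sign * math.sin(2 * math.pi * k * f0_hz * t) / k
    return (2 / math.pi) * total

def ideal_sawtooth(t, f0_hz):
    """Ramp from -1 to 1 over each period, crossing zero at t = 0."""
    cycles = t * f0_hz
    return 2 * (cycles - math.floor(cycles + 0.5))

def rms_error(num_harmonics, f0_hz=100.0, samples_per_period=200):
    """Root-mean-square difference between the partial sum and the ideal ramp."""
    total = 0.0
    for i in range(samples_per_period):
        t = i / (samples_per_period * f0_hz)
        diff = harmonic_sum(t, f0_hz, num_harmonics) - ideal_sawtooth(t, f0_hz)
        total += diff * diff
    return math.sqrt(total / samples_per_period)

errors = {n: rms_error(n) for n in (1, 5, 50)}
for n, err in errors.items():
    print(f"{n:2d} harmonics: RMS error {err:.3f}")
```

The error should shrink as harmonics are added, which is the numerical counterpart of the wave on the slide looking more and more like a sawtooth.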
  10. 10. Fourier analysis
  11. 11. Power spectrum of sine wave (figure; axes: amplitude vs. frequency)
  12. 12. Power spectrum <ul><li>This graph shows the fundamental frequency (F0) and the harmonic frequencies (whole-number multiples of F0) associated with that fundamental. </li></ul>
  13. 13. Aperiodic vibration (noise) <ul><li>Noise is characterized by random movement of air molecules. There is no pattern to the vibrations, hence there is energy at all frequencies of sound. </li></ul>
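Fourier analysis recovers the fundamental and its harmonics from a waveform. The following is a minimal sketch using a plain discrete Fourier transform from the Python standard library. The test signal (F0 of 100 Hz with harmonic amplitudes 1.0, 0.5, and 0.25), the sampling settings, and the name `amplitude_spectrum` are assumptions made for illustration; the result is an amplitude spectrum, matching the amplitude-versus-frequency axes of the slide.

```python
import cmath
import math

def amplitude_spectrum(samples):
    """Naive DFT returning one amplitude per frequency bin up to Nyquist.

    Scaling by 2/n makes a sine of amplitude A show up as A in its bin
    (the DC and Nyquist bins are overestimated by 2x; neither matters here).
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        spectrum.append(2 * abs(acc) / n)
    return spectrum

# Illustrative settings: 160 samples at 1600 Hz gives 10 Hz per bin.
SAMPLE_RATE_HZ = 1600
NUM_SAMPLES = 160
F0_HZ = 100
HARMONIC_AMPLITUDES = [1.0, 0.5, 0.25]  # F0, 2*F0, 3*F0

signal = [
    sum(a * math.sin(2 * math.pi * (h + 1) * F0_HZ * i / SAMPLE_RATE_HZ)
        for h, a in enumerate(HARMONIC_AMPLITUDES))
    for i in range(NUM_SAMPLES)
]

spectrum = amplitude_spectrum(signal)
peaks = [(k * SAMPLE_RATE_HZ / NUM_SAMPLES, round(a, 3))
         for k, a in enumerate(spectrum) if a > 0.05]
print(peaks)
```

Energy should appear only at 100, 200, and 300 Hz, that is, at whole-number multiples of F0, with every other bin close to zero.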
  14. 14. Speech waveform
  15. 15. Source-filter model
  16. 16. Sagittal section of the vocal tract (Techmer 1880). Text ©J.J. Ohala, September 2001. Figure labels: lungs, trachea, vocal folds (within the larynx), pharynx, nasal cavity.
  17. 17. Source-filter model
  18. 18. Source-filter model <ul><li>The lungs provide the power. </li></ul><ul><li>The larynx is a valve. </li></ul><ul><li>Air forced through the larynx causes the vocal folds to vibrate (Bernoulli’s principle). </li></ul><ul><li>The vibrations resonate in the upper cavities. </li></ul>
  19. 19. Anatomy: the larynx <ul><li>The larynx is a highly complex structure that houses the vocal folds. It evolved out of the cartilaginous rings that structure the trachea. </li></ul>
  20. 20. Anatomy: the larynx <ul><li>The larynx has two levels of protection for preventing objects from passing into the trachea. </li></ul><ul><ul><li>Epiglottis </li></ul></ul><ul><ul><li>Vocal folds </li></ul></ul>
  21. 21. Anatomy: the larynx <ul><li>A complex set of muscles controls the movement of the vocal folds (which are themselves muscles). </li></ul>
  22. 22. Anatomy: the vocal folds <ul><li>Vibration of the vocal folds. </li></ul><ul><ul><li>Bernoulli’s Principle. </li></ul></ul>
  23. 23. Vocal fold sequence
  24. 24. Glottal waveform
  25. 25. The filter (vocal tract) <ul><li>The vocal tract can cause an increase in amplitude of some harmonics. </li></ul>
  26. 26. Orangutan vocal tract <ul><li>Compare the vocal tract of the orangutan: there is not much volume in which resonance can occur. </li></ul>
  27. 27. Rhesus monkey v-t <ul><li>The tongue takes up most of the space in the v-t of non-human primate species. </li></ul>
  28. 28. Homo sapiens v-t <ul><li>Notice the “L” shape of the v-t. Humans choke to death far more frequently than other species. </li></ul>
  29. 29. Resonance cavity (vocal tract) <ul><li>Position of the tongue creates sub-spaces for resonance of the vocal tone. </li></ul>
  30. 30. The vocal tract
  31. 31. Filter Function for Schwa Vowel
  32. 32. Power Spectrum of Glottal Tone
  33. 33. Output of Vocal Tract Filter Function
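The three preceding slides (filter function, glottal spectrum, output) can be tied together with a toy calculation: the output spectrum is the source spectrum multiplied by the filter function at each harmonic. Everything numeric below is an assumption for illustration, not data from the slides: the 100 Hz fundamental, a glottal source whose harmonics fall off as 1/n², schwa formants at the textbook approximations of 500, 1500, and 2500 Hz, and a simple bell-shaped resonance with a 100 Hz width.

```python
F0_HZ = 100.0
SCHWA_FORMANTS_HZ = (500.0, 1500.0, 2500.0)  # textbook approximation, assumed here

def source_amplitude(harmonic_number):
    """Assumed glottal source: harmonic n has amplitude 1/n^2."""
    return 1.0 / harmonic_number ** 2

def filter_gain(freq_hz, formants_hz=SCHWA_FORMANTS_HZ, bandwidth_hz=100.0):
    """Toy filter function: one bell-shaped resonance peak per formant."""
    return sum(1.0 / (1.0 + ((freq_hz - f) / bandwidth_hz) ** 2) for f in formants_hz)

# Output spectrum = source spectrum x filter function, harmonic by harmonic.
output = {n * F0_HZ: source_amplitude(n) * filter_gain(n * F0_HZ) for n in range(1, 31)}

# A harmonic is a spectral peak if it is stronger than both of its neighbours.
freqs = sorted(output)
spectral_peaks = [
    f for prev, f, nxt in zip(freqs, freqs[1:], freqs[2:])
    if output[f] > output[prev] and output[f] > output[nxt]
]
print(spectral_peaks)
```

Although the source spectrum only ever falls with frequency, the output should show local peaks at the harmonics nearest the formants, which is how the vocal tract imposes a vowel quality on the glottal tone.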
  34. 34. Vocal Tract Filter
  35. 35. Filter Function for Specific Vowels
  36. 36. Vowel formants for English <ul><li>Vowel: F1 (Hz), F2 (Hz) </li></ul><ul><li>/i/: 290, 2500 </li></ul><ul><li>/æ/: 690, 1650 </li></ul><ul><li>/a/: 710, 1200 </li></ul><ul><li>/u/: 310, 900 </li></ul>
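Because each vowel occupies its own region of F1/F2 space, a measured formant pair can be matched to the nearest vowel in the table. The sketch below uses the four formant pairs from the slide; the function name `nearest_vowel` and the use of plain Euclidean distance in Hz are simplifying assumptions (perceptual frequency scales and speaker normalization are ignored).

```python
import math

# Formant values (F1, F2) taken from the slide above.
FORMANTS_HZ = {
    "i": (290, 2500),
    "æ": (690, 1650),
    "a": (710, 1200),
    "u": (310, 900),
}

def nearest_vowel(f1_hz, f2_hz):
    """Return the vowel whose (F1, F2) pair is closest to the measurement."""
    return min(FORMANTS_HZ, key=lambda v: math.dist((f1_hz, f2_hz), FORMANTS_HZ[v]))

# Hypothetical measurements, invented for this example.
for f1, f2 in [(300, 2400), (320, 950), (700, 1300)]:
    print(f"F1={f1} Hz, F2={f2} Hz -> /{nearest_vowel(f1, f2)}/")
```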
  37. 37. American Vowels
  38. 38. Other Vowel Systems
  39. 39. Consonants
  40. 40. <ul><li>Compare the musculature of non-human primates with respect to the articulation of speech sounds. </li></ul>
  41. 41. <ul><li>Humans have far more specific control over the facial muscles that control the lips and jaw. </li></ul>
  42. 43. The modularity of speech <ul><li>Two aspects of modularity </li></ul><ul><ul><li>language vs. general cognitive system </li></ul></ul><ul><ul><li>linguistic subsystems </li></ul></ul><ul><li>The importance of modularity of speech </li></ul><ul><li>Two pieces of evidence supporting the modularity of speech </li></ul><ul><ul><li>problem of invariance </li></ul></ul><ul><ul><li>categorical perception </li></ul></ul>
  43. 44. Two pieces of evidence for speech as a modular system <ul><li>1) Problem of invariance </li></ul><ul><li>The relationship between acoustic stimulus and perceptual experience is complex in the case of speech. The fact that there is no one-to-one correspondence between acoustic cues and perceptual events has been termed the lack of invariance. </li></ul>
  44. 45. Why is the question of modularity important? <ul><li>It is related to the question of how the brain is organized for language, and therefore to language development and language disorders. </li></ul><ul><li>If speech is a modular system, it should: </li></ul><ul><ul><li>have a specialized neurological representation; </li></ul></ul><ul><ul><li>not be based on general cognitive functioning (working memory, episodic memory, and so on) but be specific to language; </li></ul></ul><ul><ul><li>be the basis for the perception of language in young infants and, if damaged, the reason that certain individuals suffer quite specific breakdowns in language functioning. </li></ul></ul>
  45. 47. The phoneme /t/ and its allophones <ul><li>[t] as in stop; [tʰ] as in top; [ɾ] as in little; [ʔ] as in kitten; … </li></ul><ul><li>Each allophone has a different acoustic pattern (pattern 1, pattern 2, pattern 3, pattern 4, pattern 5). </li></ul><ul><li>All are perceived as /t/. </li></ul>
  46. 48. Why does the problem of invariance support the hypothesis that speech is a modular system? <ul><li>When hearing the physical sound [d], how does a listener quickly solve the problem of its real identity (e.g., /t/ or /d/)? </li></ul><ul><li>This is more complex than ordinary auditory perception. </li></ul><ul><li>This suggests that speech is a special mode of perception. </li></ul>
  47. 49. Two pieces of evidence for speech as a modular system <ul><li>2) Categorical perception of initial consonants, based on VOT (Is there any difference between language and other cognitive functions such as vision?) </li></ul><ul><li>To comprehend speech, we must impose an absolute (or categorical) identification on the incoming speech signal rather than simply a relative determination of the various physical characteristics of the signal. </li></ul><ul><li>Auditory cues such as frequency and intensity play a role, but ultimately the result of speech perception is the identification of a stimulus as belonging to one or another category of speech sound. </li></ul>
  48. 50. VOT—voice onset time <ul><li>In the case of oral stops, the airflow is blocked completely, causing pressure to build up. The obstruction in the mouth is then suddenly opened; the released airflow produces a sudden impulse in pressure, causing an audible sound. </li></ul><ul><li>VOT is measured relative to the stop release burst. </li></ul>
  49. 51. VOT – voice onset time <ul><li>On a speech spectrogram, the difference between the voiced sound [ba] and the voiceless sound [pa] can be identified as the time between the release of the sound at the lips and the moment the vocal cords begin vibrating. </li></ul><ul><li>With voiced sounds, the vibration occurs immediately; with voiceless sounds, it occurs after a short delay. This lag, the voice onset time, is an important cue in the perception of the voicing feature. </li></ul>
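Categorical perception along the VOT dimension can be sketched as a hard threshold: VOT varies continuously, but the listener reports only one of two categories. The code below is a deliberately simplified model. The 25 ms boundary is an assumed, illustrative value (the slides give no number), and `perceived_syllable` is a hypothetical name.

```python
def perceived_syllable(vot_ms, boundary_ms=25.0):
    """Map a continuous VOT value to a category label.

    boundary_ms is an assumed category boundary, not a value from the slides.
    """
    return "pa" if vot_ms >= boundary_ms else "ba"

# A continuum of stimuli in equal 10 ms steps of VOT.
continuum_ms = [0, 10, 20, 30, 40, 50, 60]
labels = [perceived_syllable(v) for v in continuum_ms]

for vot, label in zip(continuum_ms, labels):
    print(f"VOT {vot:2d} ms -> [{label}]")
```

Every step along the continuum is the same physical size, yet the label changes only once, where the step crosses the boundary. This is the sense in which identification is absolute rather than a relative judgment of the signal.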