Phonetics: The Creation of Speech Sounds
Anatomy and phonetics Speech sounds are created by air pushed out of the lungs; the air vibrates as it passes through the vocal tract. Different positions of the vocal folds, the tongue, the lips, and the other articulators in the mouth modify the airflow, producing different speech sounds.
Periodic vibrations When a tuning fork vibrates, it creates successive waves of compression and rarefaction among air molecules.
Periodic vibrations We can plot these changes in density as a sine wave, where the horizontal axis represents time and the vertical axis represents the density of air molecules.
Sine Wave [Figure: sine wave; vertical axis amplitude, horizontal axis time]
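The sine wave described on the preceding slides can be sketched numerically. This is a minimal illustration, not part of the original deck: the function name `sine_wave`, the 440 Hz pitch, and the 8000 Hz sample rate are all assumptions chosen for the example.

```python
import math

def sine_wave(frequency_hz, amplitude=1.0, phase_rad=0.0,
              duration_s=0.01, sample_rate_hz=8000):
    """Return (time_s, value) samples of amplitude * sin(2*pi*f*t + phase)."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [
        (n / sample_rate_hz,
         amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz
                              + phase_rad))
        for n in range(n_samples)
    ]

# 440 Hz is an assumed tuning-fork pitch; any frequency works the same way.
samples = sine_wave(frequency_hz=440.0)
for time_s, value in samples[:5]:
    print(f"{time_s:.6f} s  {value:+.3f}")
```

Each value stands for the air density (relative to rest) at one instant; plotting value against time gives the curve named on the slide.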
Phase Two waves can have the same frequency but different phase. Combining two waves results in an increase in amplitude if they are in phase, or a decrease if they are out of phase.
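The effect of phase on the combined amplitude can be checked with a short sketch. The function names, the 100 Hz frequency, and the sample rate are assumptions for illustration only.

```python
import math

def sine(t, frequency_hz, amplitude=1.0, phase_rad=0.0):
    return amplitude * math.sin(2 * math.pi * frequency_hz * t + phase_rad)

def peak_of_sum(phase_difference_rad, frequency_hz=100.0, sample_rate_hz=8000):
    """Largest absolute value of two unit sine waves added over one period."""
    samples_per_period = int(sample_rate_hz / frequency_hz)
    return max(
        abs(sine(n / sample_rate_hz, frequency_hz)
            + sine(n / sample_rate_hz, frequency_hz, phase_rad=phase_difference_rad))
        for n in range(samples_per_period)
    )

in_phase = peak_of_sum(0.0)               # waves reinforce each other
quarter_cycle = peak_of_sum(math.pi / 2)  # partial reinforcement
out_of_phase = peak_of_sum(math.pi)       # waves cancel

print(round(in_phase, 3), round(quarter_cycle, 3), round(out_of_phase, 3))
```

Two unit-amplitude waves in phase give a peak of about 2, a quarter-cycle offset gives about 1.414, and a half-cycle offset gives a peak of essentially zero.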
Amplitude Two waves can have the same phase, but different amplitudes.
Complex waves The combination of sine waves can result in a complex wave with virtually any shape.
Complex waves The addition of higher harmonics gives this wave a sawtooth shape.
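A sketch of how adding harmonics builds a sawtooth: harmonic k is given frequency k times the fundamental and amplitude 1/k, which is the standard Fourier series for a sawtooth. The 100 Hz fundamental and the function names are assumptions; the slide's own figure may use different values.

```python
import math

def sawtooth_approximation(t, fundamental_hz, n_harmonics):
    """Sum the first n_harmonics sine waves; harmonic k has amplitude 1/k."""
    total = sum(
        math.sin(2 * math.pi * k * fundamental_hz * t) / k
        for k in range(1, n_harmonics + 1)
    )
    return (2 / math.pi) * total

def ideal_sawtooth(t, fundamental_hz):
    """Falling ramp from +1 to -1 over each period (the limit of the sum above)."""
    cycle_position = (t * fundamental_hz) % 1.0  # 0 <= position < 1
    return 1.0 - 2.0 * cycle_position

f0 = 100.0      # assumed fundamental frequency, in Hz
t = 0.25 / f0   # a quarter of the way through one period

for n in (1, 3, 10, 50):
    print(n, round(sawtooth_approximation(t, f0, n), 4), ideal_sawtooth(t, f0))
```

With a single harmonic the value at this point is well off the ideal ramp; as more harmonics are added the sum approaches it.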
Fourier analysis
Power spectrum of sine wave [Figure: vertical axis amplitude, horizontal axis frequency]
Power spectrum This graph shows the fundamental frequency and the harmonic frequencies (whole-number multiples of F0) associated with that fundamental.
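Fourier analysis can be illustrated with a direct discrete Fourier transform of a complex wave. This is a sketch under stated assumptions: the fundamental (100 Hz), the three harmonic amplitudes, and the sample rate are invented for the example, and the code reports amplitude per frequency (power would be the square of these values).

```python
import cmath
import math

def amplitude_spectrum(samples, sample_rate_hz, max_frequency_hz):
    """Return [(frequency_hz, amplitude)] from a direct discrete Fourier transform."""
    n = len(samples)
    bin_width_hz = sample_rate_hz / n
    n_bins = int(max_frequency_hz / bin_width_hz) + 1
    spectrum = []
    for k in range(1, n_bins):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * m / n)
            for m, x in enumerate(samples)
        )
        spectrum.append((k * bin_width_hz, 2 * abs(coefficient) / n))
    return spectrum

sample_rate_hz = 8000
f0 = 100.0                                       # assumed fundamental, in Hz
harmonic_amplitudes = {1: 1.0, 2: 0.5, 3: 0.25}  # assumed amplitudes

# 800 samples = 0.1 s of signal, which gives 10 Hz frequency bins.
signal = [
    sum(a * math.sin(2 * math.pi * h * f0 * m / sample_rate_hz)
        for h, a in harmonic_amplitudes.items())
    for m in range(800)
]

peaks = [
    (freq, round(amp, 3))
    for freq, amp in amplitude_spectrum(signal, sample_rate_hz, 500.0)
    if amp > 0.01
]
print(peaks)
```

The only bins with appreciable energy are the fundamental and its whole-number multiples, which is the pattern the slide's graph shows.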
Aperiodic vibration (noise) Noise is characterized by random movement of air molecules. There is no pattern to the vibrations, hence there is energy at all frequencies of sound.
Speech waveform
Source-filter model
Sagittal section of the vocal tract (Techmer 1880). Text ©J.J. Ohala, September 2001. [Figure labels: lungs, trachea, vocal folds (within the larynx), pharynx, nasal cavity]
Source-filter model
Source-filter model The lungs provide the power. The larynx acts as a valve. Air forced through the larynx sets the vocal folds vibrating (Bernoulli’s principle). These vibrations resonate in the upper cavities.
Anatomy: the larynx The larynx is a highly complex structure that houses the vocal folds. It evolved out of the cartilaginous rings that structure the trachea.
Anatomy: the larynx The larynx has two levels of protection that prevent objects from passing into the trachea: the epiglottis and the vocal folds.
Anatomy: the larynx A complex set of muscles controls the movement of the vocal folds (which are themselves muscles).
Anatomy: the vocal folds Vibration of the vocal folds. Bernoulli’s principle.
Vocal fold sequence
Glottal waveform
The filter (vocal tract) The vocal tract can increase the amplitude of some harmonics.
Orangutan vocal tract Compare the vocal tract of the orangutan. There is not much volume in which resonance can occur.
Rhesus monkey v-t The tongue takes up most of the space in the v-t of non-human primate species.
Homo sapiens v-t Notice the “L” shape of the v-t.  Humans choke to death far more frequently than other species.
Resonance cavity (vocal tract) The position of the tongue creates sub-spaces for resonance of the vocal tone.
The vocal tract
Filter Function for Schwa Vowel
Power Spectrum of Glottal Tone
Output of Vocal Tract Filter Function
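The three plots named above relate as output = source × filter. The sketch below is only an illustration of that idea: the 12 dB per octave source roll-off, the schwa formants at 500, 1500, and 2500 Hz, the 100 Hz bandwidth, and the simple resonance-peak shape are all textbook-style assumptions, not values taken from the slides.

```python
def source_amplitude(frequency_hz, f0_hz):
    """Glottal source harmonic amplitude, falling about 12 dB per octave (assumed)."""
    return (f0_hz / frequency_hz) ** 2

def filter_gain(frequency_hz, formants_hz, bandwidth_hz=100.0):
    """Sum of simple resonance peaks, one per formant (an assumed, simplified shape)."""
    half_bandwidth = bandwidth_hz / 2
    return sum(
        1.0 / (1.0 + ((frequency_hz - formant) / half_bandwidth) ** 2)
        for formant in formants_hz
    )

f0 = 100.0                                 # assumed fundamental, in Hz
schwa_formants = (500.0, 1500.0, 2500.0)   # assumed formants for a neutral vowel

harmonics = [f0 * k for k in range(1, 31)]
source = {f: source_amplitude(f, f0) for f in harmonics}
output = {f: source[f] * filter_gain(f, schwa_formants) for f in harmonics}

for f in (400.0, 500.0, 600.0):
    print(f, round(source[f], 4), round(output[f], 4))
```

In the source spectrum the 500 Hz harmonic is weaker than the 400 Hz harmonic, but after filtering it is the stronger of the two, because it sits on a resonance of the tract.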
Vocal Tract Filter
Filter Function for Specific Vowels
Vowel formants for English (frequencies in Hz)
Vowel   F1    F2
/i/     290   2500
/æ/     690   1650
/a/     710   1200
/u/     310    900
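One way to see how two formants are enough to separate these vowels is to label a measured (F1, F2) pair with the closest entry in the table. This is a rough sketch: the function name is hypothetical, and straight-line distance in Hz is an assumption made for simplicity, not a model of how listeners weigh formants.

```python
import math

# (F1, F2) in Hz, copied from the table above.
FORMANTS_HZ = {
    "/i/": (290, 2500),
    "/æ/": (690, 1650),
    "/a/": (710, 1200),
    "/u/": (310, 900),
}

def nearest_vowel(f1_hz, f2_hz):
    """Return the vowel whose (F1, F2) point is closest to the measurement."""
    return min(
        FORMANTS_HZ,
        key=lambda vowel: math.dist((f1_hz, f2_hz), FORMANTS_HZ[vowel]),
    )

print(nearest_vowel(300, 2400))
print(nearest_vowel(700, 1250))
```

A low F1 with a high F2 lands on /i/, while a high F1 with a mid-range F2 lands on /a/.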
American Vowels
Other Vowel Systems
Consonants
Compare the musculature of non-human primates with respect to the articulation of speech sounds.
Humans have far more specific control over the facial muscles that control the lips and jaw.
 
The modularity of speech
Two aspects of modularity
  -- language vs. the general cognitive system
  -- linguistic subsystems
The importance of the modularity of speech
Two pieces of supporting evidence for the modularity of speech
  -- problem of invariance
  -- categorical perception
Two pieces of supporting evidence for speech as a modular system 1) Problem of invariance The relationship between the acoustic stimulus and perceptual experience is complex in the case of speech. The fact that there is no one-to-one correspondence between acoustic cues and perceptual events has been termed the lack of invariance.
Why is the question of modularity important? It is related to the question of how the brain is organized for language, and therefore to language development and language disorders. If speech is a modular system, it would have a specialized neurological representation. That representation would not be based on general cognitive functioning (working memory, episodic memory, and so on) but would be specific to language. It would be the basis for the perception of language in young infants and, if damaged, the reason that certain individuals suffer quite specific breakdowns in language functioning.
 
The phoneme /t/ and its allophones
Allophone   As in     Acoustic signal
[t]         stop      acoustic pattern 1
[tʰ]        top       acoustic pattern 2
[ɾ]         little    acoustic pattern 3
[ʔ]         kitten    acoustic pattern 4
…                     acoustic pattern 5
All of these patterns lead to the perception of /t/.
Why does the problem of invariance support the hypothesis that speech is a modular system? When hearing the physical sound [d], how does a listener quickly solve the problem of its real identity (e.g. /t/ or /d/)? This is more complex than ordinary auditory perception, which suggests that speech is a special mode of perception.
Two pieces of supporting evidence for speech as a modular system 2) Categorical perception of initial consonants, based on VOT (Is there any difference between language and other cognitive functions such as vision?) To comprehend speech, we must impose an absolute (or categorical) identification on the incoming speech signal rather than simply a relative determination of the various physical characteristics of the signal. Auditory cues such as frequency and intensity play a role, but ultimately the result of speech perception is the identification of a stimulus as belonging to one or another category of speech sound.
VOT (voice onset time) In the case of oral stops, the airflow is blocked completely, causing pressure to build up. The obstruction in the mouth is then suddenly released; the escaping air produces a sudden pressure impulse, which is heard as a burst. VOT is measured relative to this stop release burst.
VOT (voice onset time) On a speech spectrogram, the difference between the voiced sound [ba] and the voiceless sound [pa] can be seen in the time between the release of the sound at the lips and the moment the vocal cords begin vibrating. With voiced sounds, the vibration occurs immediately; with voiceless sounds, it occurs after a short delay. This lag, the voice onset time, is an important cue in the perception of the voicing feature.
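Categorical perception along a VOT continuum can be sketched as a hard boundary: every stimulus is assigned to one category or the other, however close it lies to the boundary. The function names and the 25 ms boundary are assumptions for illustration; the slides do not give a boundary value, and the real boundary varies with language and place of articulation.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT: time from the stop release burst to the start of vocal fold vibration."""
    return voicing_onset_ms - release_ms

def perceived_syllable(vot_ms, boundary_ms=25.0):
    """Label a stimulus categorically; boundary_ms is an assumed value."""
    return "[pa]" if vot_ms >= boundary_ms else "[ba]"

# A continuum of synthetic stimuli in equal 10 ms steps of VOT.
continuum_ms = [0, 10, 20, 30, 40, 50, 60]
labels = [perceived_syllable(vot) for vot in continuum_ms]

for vot, label in zip(continuum_ms, labels):
    print(f"VOT {vot:2d} ms -> {label}")

# Burst at 120 ms and voicing at 165 ms give a VOT of 45 ms.
print(perceived_syllable(voice_onset_time_ms(120, 165)))
```

The physical steps are equal, but the labels change only once: stimuli 10 ms apart on the same side of the boundary receive the same label, while the 20 ms and 30 ms stimuli are heard as different syllables.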
 
 
 
 
 
 
 
 
 
