The Neurophysiology of Speech

An introduction to the biology and neurophysiology of human speech. The target audience is researchers and engineers working on speech recognition technology.


  • 1. Neurophysiology of Speech T.S. Yo
  • 2. References● Carlson, N. R. Audition, the body senses, and the chemical senses. Physiology of Behavior, 6th ed., 1998, pp. 185-223.● Carlson, N. R. Human communication. Physiology of Behavior, 6th ed., 1998, pp. 477-508.● Bookheimer, S. Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci., 2002, pp. 151-188.● Friederici, A. D. and Alter, K. Lateralization of auditory language functions: A dynamic dual pathway model. Brain and Language, 89 (2004), 267–276.
  • 3. Outline● Auditory apparatus● MFCC● Lesion study● Neuroimaging● Dynamic dual channel model● Can we design ASR systems by mimicking organic systems?
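  • 4. Auditory system● Figure labels: malleus, incus, stapes, cochlea, vestibule, pinna, tympanic membrane (eardrum), Eustachian tube (auditory tube)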
  • 5. Cochlea
  • 6. Cochlea (2)
  • 7. Auditory Pathway
  • 8. Detecting Acoustic Features● Pitch – High frequencies: place coding – Low frequencies: rate coding● Loudness – Firing rate of cochlear nerve fibers● Timbre – Waveform decomposition
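As a rough illustration of what "waveform decomposition" means for timbre, the sketch below builds a complex tone from a fundamental and two harmonics and then recovers the amplitude of each component with a Fourier transform. The function name `component_amplitudes`, the sample rate, and the chosen frequencies are illustrative assumptions, not values from the slides, and the code is a signal-processing analogy rather than a model of the cochlea.

```python
import numpy as np

def component_amplitudes(tone, sample_rate, frequencies):
    """Return the amplitude of each requested frequency component of `tone`."""
    # Scale the magnitude spectrum so a unit-amplitude sine reads as 1.0.
    spectrum = np.abs(np.fft.rfft(tone)) * 2.0 / tone.size
    bin_freqs = np.fft.rfftfreq(tone.size, d=1.0 / sample_rate)
    result = {}
    for f in frequencies:
        nearest_bin = int(np.argmin(np.abs(bin_freqs - f)))
        result[f] = float(spectrum[nearest_bin])
    return result

# Hypothetical example tone: 220 Hz fundamental plus two weaker harmonics.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate          # exactly one second
parts = {220: 1.0, 440: 0.5, 660: 0.25}           # frequency (Hz) -> amplitude
tone = sum(a * np.sin(2 * np.pi * f * t) for f, a in parts.items())

recovered = component_amplitudes(tone, sample_rate, parts.keys())
print(recovered)
```

Two tones with the same fundamental but different harmonic amplitudes have the same pitch and a different timbre, which is what the recovered amplitudes distinguish.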
  • 9. Localization with Neural Circuits
  • 10. Localization with Neural Circuits
  • 11. Vestibular System
  • 12. MFCC● Mel-Frequency Cepstral Coefficients – Take the Fourier transform of a windowed frame of the signal. – Map the power spectrum obtained above onto the mel scale, using triangular overlapping windows. – Take the logarithm of the power in each mel band. – Take the Discrete Cosine Transform of the list of mel log-powers, as if it were a signal. – The MFCCs are the amplitudes of the resulting spectrum.
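The MFCC steps can be sketched for a single frame with NumPy alone. This is a minimal sketch under stated assumptions, not a reference implementation: the mel formula (2595 log10(1 + f/700)), the Hamming window, the 26 filters, and the 13 coefficients are common choices that the slides do not specify, and all function names are made up for illustration. The logarithm is taken after the mel filter bank is applied, which is the usual ordering in practice.

```python
import numpy as np

def hz_to_mel(f):
    # One widely used mel formula (an assumption; the slides give none).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular overlapping filters, equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii(x):
    """Unnormalized DCT-II, written out explicitly to avoid extra dependencies."""
    n = x.size
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * i + 1) / (2 * n)) @ x

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    windowed = frame * np.hamming(frame.size)          # reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2         # step 1: Fourier transform
    bank = mel_filterbank(n_filters, frame.size, sample_rate)
    mel_energies = bank @ power                        # step 2: mel-scale mapping
    log_mel = np.log(mel_energies + 1e-10)             # log, with a floor to avoid log(0)
    return dct_ii(log_mel)[:n_coeffs]                  # steps 3-4: DCT amplitudes

# Usage on a hypothetical 25 ms frame of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
frame = np.sin(2 * np.pi * 440.0 * np.arange(400) / sample_rate)
coeffs = mfcc_frame(frame, sample_rate)
print(coeffs.shape)
```

A full front end would slide this over overlapping frames and often add pre-emphasis and liftering; those stages are omitted here to keep the correspondence with the listed steps visible.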
  • 13. From the ears to the brain● Ear – Spectral signals. – Frequency analysis, comparable to a Fourier transform, is performed by the cochlea and its neural circuits.● Brain – Two pathways, one in each hemisphere – Left: semantics and syntax – Right: prosody
  • 14. Brain Mechanisms for Language● From lesion study to neuroimaging● Localization of functions● Lateralization● Speech Production and Comprehension● Prosody
  • 15. Lesion Studies● Aphasia – Difficulty in producing or comprehending speech caused by brain damage.● Broca's aphasia – agrammatism – anomia● Wernicke's aphasia – poor speech comprehension
  • 16. Broca's Aphasia● Agrammatism: – difficulty in understanding or using grammar.● Anomia: – difficulty in finding the appropriate word to describe an object, action, or attribute.● Apraxia of speech: – impairment in the ability to program the movements of the tongue, lips, and throat required to produce the proper sequence of speech sounds.
  • 17. Broca's Aphasia Example● "Yes ... Monday ... Dad, and Dad ... hospital, and ... Wednesday, Wednesday, nine o'clock and ... Thursday, ten o'clock ... doctors, two, two ... doctors and ... teeth, yah."
  • 18. Broca's Aphasia
  • 19. Wernicke's Aphasia● Poor speech comprehension● Fluent but meaningless speech● Pure word deafness: – The ability to hear, to speak, and to read and write without being able to comprehend the meaning of speech.
  • 20. Wernicke's Aphasia Example● Examiner: What kind of work have you done?● Patient: We, the kids, all of us, and I, we were working for a long time in the ... you know ... it's the kind of space, I mean place rear to the spedawn ...● Examiner: Excuse me, but I wanted to know what work you have been doing.● Patient: If you had said that, we had said that, poomer, near the fortunate, porpunate, tamppoo, all around the fourth of martz. Oh, I get all confused.
  • 21. Wernicke's Aphasia
  • 22. Neuroimaging Studies● Neuroimaging – Functional magnetic resonance imaging (fMRI) – Positron emission tomography (PET)● Subjects perform cognitive tasks while being scanned.
  • 23. Neuroimaging● fMRI● PET
  • 24. Normalizing Neuroimages● Talairach coordinate space – Origin: anterior commissure – X: [-65, +65] mm – Y: [-90, +70] mm – Z: [-40, +65] mm
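The coordinate ranges above can be read as a bounding box around the anterior commissure. The sketch below encodes that box and checks whether a reported activation peak falls inside it. The helper name `in_talairach_box` and the sample coordinates are hypothetical, and treating the ranges as millimetres is an assumption based on the usual convention.

```python
# Bounding box copied from the slide, assumed to be in millimetres.
# The origin (0, 0, 0) is the anterior commissure.
TALAIRACH_BOUNDS_MM = {
    "x": (-65.0, 65.0),
    "y": (-90.0, 70.0),
    "z": (-40.0, 65.0),
}

def in_talairach_box(x, y, z):
    """Return True if the point lies within the slide's coordinate ranges."""
    point = {"x": x, "y": y, "z": z}
    return all(low <= point[axis] <= high
               for axis, (low, high) in TALAIRACH_BOUNDS_MM.items())

# Hypothetical peak coordinates, for illustration only.
print(in_talairach_box(-48.0, 20.0, 10.0))
print(in_talairach_box(80.0, 0.0, 0.0))
```

A check like this only catches coordinates that cannot belong to the normalized space; it says nothing about which anatomical structure a point falls in.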
  • 25. Semantic Conditions● Same – The lawyer questioned the witness. – The attorney questioned the witness.● Different – The man was attacked by the Doberman. – The man was attacked by the pit bull.
  • 26. Syntactic Conditions● Same – The policeman arrested the thief. – The thief was arrested by the policeman.● Different – The teacher was outsmarted by the student. – The teacher outsmarted the student.
  • 27. Summary by Bookheimer, 2002● The role of the left inferior frontal lobe in semantic processing and dissociations from other frontal lobe language functions.● The organization of categories of objects and concepts in the temporal lobe.● The role of the right hemisphere in comprehending contextual and figurative meaning.
  • 28. Overview by Ahrens, 2007● Past – Functional localization (brain damage)● Present – Narrower localization + discussion of overlap and integration (neuro-imaging techniques)● Future – Language as a brain function (integrate knowledge about timing, context, and individual differences)
  • 29. The Three Myths● Myth 1: Broca's area deals with syntax/production – Fact: Semantics and phonology cluster in different areas of the inferior frontal gyrus (IFG); syntax seems to be distributed throughout the IFG. – Fact: The IFG is activated during non-language tasks.● Myth 2: Wernicke's area deals with semantics/comprehension – Fact: There are functional subdivisions for language in the posterior temporal area.
  • 30. The Three Myths● Myth 3: The right hemisphere is not used when processing language – Fact: The right hemisphere is called upon for many integrative language processes. > Figurative Language and Metaphor > Linguistic Context > Prosody
  • 31. Summary of Neuroimaging Studies
  • 32. Dynamic Dual Pathway Model● Spoken language comprehension requires the coordination of different subprocesses in time.● Segmental information: – phonemes, syntactic elements and lexical-semantic elements.● Suprasegmental information: – accentuation and intonational phrases, i.e., prosody.
  • 33. Localization of Different Subsystems● Segmental information: – syntactic and semantic information are primarily processed in a left-hemispheric temporo-frontal pathway that includes separate circuits for syntactic and semantic information.● Suprasegmental information: – sentence-level prosody is processed in a right-hemispheric temporo-frontal pathway.
  • 34. Dynamic Interaction● Corpus Callosum
  • 35. Can we design ASR systems by imitating the brain?● An open question – Is it possible? Is it more effective?● Complexity – Basic computation rate of a neuron: about 60 Hz – About 10^8 neurons for input and 10^10 in the brain, each with more than 8,000 connections● Training time – How long does it take a human being to learn to understand language?
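The complexity figures above can be turned into a back-of-envelope estimate. The sketch below multiplies the slide's numbers together. Reading 60 Hz as the update rate of every connection and 8,000 as an exact connection count are simplifying assumptions made only for this calculation, so the result is an order-of-magnitude figure at best.

```python
# Figures taken from the slide.
NEURONS_IN_BRAIN = 10**10
CONNECTIONS_PER_NEURON = 8000      # the slide says "more than 8000"; used as a lower bound
RATE_HZ = 60                       # assumed to apply to every connection

total_connections = NEURONS_IN_BRAIN * CONNECTIONS_PER_NEURON
connection_events_per_second = total_connections * RATE_HZ

print(f"connections: {total_connections:.1e}")
print(f"connection events per second: {connection_events_per_second:.1e}")
```

Under these assumptions the brain has on the order of 8 x 10^13 connections and about 4.8 x 10^15 connection events per second, which gives a sense of the scale an imitation would have to match.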
  • 36. Some factors in the human neural system