The Neurophysiology of Speech

An introduction to the biology and neurophysiology of human speech. The target audience is researchers and engineers working on speech recognition technology.

Published in: Technology, Health & Medicine

  1. Neurophysiology of Speech (T.S. Yo)
  2. References
     ● Carlson, N. R. (1998). Audition, the body senses, and the chemical senses. Physiology of Behavior, 6th ed., pp. 185-223.
     ● Carlson, N. R. (1998). Human communication. Physiology of Behavior, 6th ed., pp. 477-508.
     ● Bookheimer, S. (2002). Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci., pp. 151-188.
     ● Friederici, A. D. and Alter, K. (2004). Lateralization of auditory language functions: A dynamic dual pathway model. Brain and Language, 89, 267–276.
  3. Outline
     ● Auditory apparatus
     ● MFCC
     ● Lesion study
     ● Neuroimaging
     ● Dynamic dual channel model
     ● Can we design ASR systems by mimicking organic systems?
  4. Auditory system
     Figure labels: malleus, incus, stapes, cochlea, vestibule, pinna, tympanic membrane, Eustachian (auditory) tube
  5. Cochlea
  6. Cochlea (2)
  7. Auditory Pathway
  8. Detecting Acoustic Features
     ● Pitch
       – High frequencies: place coding
       – Low frequencies: rate coding
     ● Loudness
       – Firing rate in the cochlear nerve
     ● Timbre
       – Waveform decomposition
  9. Localization with Neural Circuits
  10. Localization with Neural Circuits
  11. Vestibular System
  12. MFCC
      ● Mel-frequency cepstral coefficients
        – Take the Fourier transform of a signal.
        – Map the power spectrum obtained above onto the mel scale using triangular overlapping windows, then take the log of the energy in each mel band.
        – Take the discrete cosine transform (DCT) of the list of mel log-energies, as if it were a signal.
        – The MFCCs are the amplitudes of the resulting spectrum.
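The MFCC steps on this slide can be sketched in plain Python. This is a minimal illustration for a single frame, not a production front end. The Hamming window, the mel formula `2595 * log10(1 + f / 700)`, the choice of 20 filters and 12 coefficients, and all function names are assumptions that do not come from the slides. The sketch applies the triangular filters to the power spectrum before taking the log, which is the common ordering.

```python
import cmath
import math


def hz_to_mel(f):
    """Convert Hz to mel (a common formula; assumed, not from the slides)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Step 1: naive DFT power spectrum for bins 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Step 2: triangular overlapping filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    # Fractional DFT-bin position of each filter edge.
    edges = [hz * n_fft / sample_rate for hz in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Return n_coeffs MFCCs for one frame of samples."""
    n = len(frame)
    # Hamming window: an assumption, the slide only says "a signal".
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * t / (n - 1)))
                for t, x in enumerate(frame)]
    power = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, n, sample_rate)
    # Log energy per mel band; the floor avoids log(0).
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                    for row in bank]
    # Steps 3 and 4: DCT-II of the log energies; its amplitudes are the MFCCs.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]


# Example: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * t / SAMPLE_RATE) for t in range(256)]
coeffs = mfcc(frame, SAMPLE_RATE)
```

The naive DFT keeps the example dependency-free; a real system would use an FFT and process overlapping frames.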
  13. From the ears to the brain
      ● Ear
        – Spectral signals.
        – Fourier transform done by neural circuits.
      ● Brain
        – Two pathways in the two hemispheres
        – Left: semantics and syntax
        – Right: prosody
  14. Brain Mechanisms for Language
      ● From lesion study to neuroimaging
      ● Localization of functions
      ● Lateralization
      ● Speech Production and Comprehension
      ● Prosody
  15. Lesion Studies
      ● Aphasia
        – Difficulty in producing or comprehending speech caused by brain damage.
      ● Broca's aphasia
        – agrammatism
        – anomia
      ● Wernicke's aphasia
        – poor speech comprehension
  16. Broca's Aphasia
      ● Agrammatism:
        – difficulty in understanding or using grammar.
      ● Anomia:
        – difficulty in finding the appropriate word to describe an object, action, or attribute.
      ● Apraxia of speech:
        – impairment in the ability to program the movements of the tongue, lips, and throat required to produce the proper sequence of speech sounds.
  17. Broca's Aphasia Example
      ● "Yes ... Monday ... Dad, and Dad ... hospital, and ... Wednesday, Wednesday, nine o'clock and ... Thursday, ten o'clock ... doctors, two, two ... doctors and ... teeth, yah."
  18. Broca's Aphasia
  19. Wernicke's Aphasia
      ● Poor speech comprehension
      ● Fluent but meaningless speech
      ● Pure word deafness:
        – The ability to hear, to speak, and to read and write without being able to comprehend the meaning of speech.
  20. Wernicke's Aphasia Example
      ● Examiner: What kind of work have you done?
      ● Patient: We, the kids, all of us, and I, we were working for a long time in the ... you know ... it's the kind of space, I mean place rear to the spedawn ...
      ● Examiner: Excuse me, but I wanted to know what work you have been doing.
      ● Patient: If you had said that, we had said that, poomer, near the fortunate, porpunate, tamppoo, all around the fourth of martz. Oh, I get all confused.
  21. Wernicke's Aphasia
  22. Neuroimaging Studies
      ● Neuroimaging
        – Functional magnetic resonance imaging (fMRI)
        – Positron emission tomography (PET)
      ● Subjects perform cognitive tasks while being scanned.
  23. Neuroimaging
      ● fMRI
      ● PET
  24. Normalizing Neuroimages
      ● Talairach coordinate space
        – Origin: anterior commissure
        – X: [-65, +65]
        – Y: [+70, -90]
        – Z: [-40, +65]
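As a small illustration of how the coordinate ranges on this slide might be used, the sketch below checks whether a point falls inside the listed bounding box. It assumes the ranges are in millimeters, as is conventional for Talairach coordinates, and that the origin (0, 0, 0) is the anterior commissure. The names `TALAIRACH_BOUNDS_MM` and `in_talairach_bounds` are hypothetical.

```python
# Axis ranges as listed on the slide, written as (min, max).
# Unit assumed to be millimeters; origin at the anterior commissure.
TALAIRACH_BOUNDS_MM = {
    "x": (-65, 65),
    "y": (-90, 70),
    "z": (-40, 65),
}


def in_talairach_bounds(x, y, z):
    """Return True if (x, y, z) lies inside the slide's bounding box."""
    point = {"x": x, "y": y, "z": z}
    return all(lo <= point[axis] <= hi
               for axis, (lo, hi) in TALAIRACH_BOUNDS_MM.items())


# Example: the origin is inside the box; a point at x = -70 is not.
origin_inside = in_talairach_bounds(0, 0, 0)
far_left_inside = in_talairach_bounds(-70, 0, 0)
```

A check like this only validates that a reported activation coordinate is plausible; it does not perform the spatial normalization itself.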
  25. Semantic Conditions
      ● Same
        – The lawyer questioned the witness.
        – The attorney questioned the witness.
      ● Different
        – The man was attacked by the Doberman.
        – The man was attacked by the pit bull.
  26. Syntactic Conditions
      ● Same
        – The policeman arrested the thief.
        – The thief was arrested by the policeman.
      ● Different
        – The teacher was outsmarted by the student.
        – The teacher outsmarted the student.
  27. Summary by Bookheimer, 2002
      ● The role of the left inferior frontal lobe in semantic processing and dissociations from other frontal lobe language functions.
      ● The organization of categories of objects and concepts in the temporal lobe.
      ● The role of the right hemisphere in comprehending contextual and figurative meaning.
  28. Overview by Ahrens, 2007
      ● Past
        – Functional localization (brain damage)
      ● Present
        – Narrower localization + discussion of overlap and integration (neuroimaging techniques)
      ● Future
        – Language as a brain function (integrate knowledge about timing, context, and individual differences)
  29. The Three Myths
      ● Myth 1: Broca's area deals with syntax/production
        – Fact: Semantics and phonology cluster in different areas of the inferior frontal gyrus (IFG); syntax seems to be distributed throughout the IFG.
        – Fact: The IFG is activated during non-language tasks.
      ● Myth 2: Wernicke's area deals with semantics/comprehension
        – Fact: There are functional subdivisions for language in the posterior temporal area.
  30. The Three Myths
      ● Myth 3: The right hemisphere is not used when processing language
        – Fact: The right hemisphere is called upon for many integrative language processes.
          > Figurative language and metaphor
          > Linguistic context
          > Prosody
  31. Summary of Neuroimaging Studies
  32. Dynamic Dual Pathway Model
      ● Spoken language comprehension requires the coordination of different subprocesses in time.
      ● Segmental information:
        – phonemes, syntactic elements and lexical-semantic elements.
      ● Suprasegmental information:
        – accentuation and intonational phrases, i.e., prosody.
  33. Localization of Different Subsystems
      ● Segmental information:
        – syntactic and semantic information are primarily processed in a left hemispheric temporo-frontal pathway including separate circuits for syntactic and semantic information.
      ● Suprasegmental information:
        – sentence level prosody is processed in a right hemispheric temporo-frontal pathway.
  34. Dynamic Interaction
      ● Corpus Callosum
  35. Can we design ASR systems by imitating the brain?
      ● An open question
        – Is it possible? Is it more effective?
      ● Complexity
        – Basic computation rate of a neuron: 60 Hz
        – 10^8 inputs; 10^10 neurons in the brain, each with >8,000 connections
      ● Training time
        – How long does it take a human being to learn to understand language?
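To get a feel for the scale implied by the complexity figures on this slide, the following back-of-envelope sketch multiplies them out. It assumes, purely for illustration, that every neuron fires at the quoted 60 Hz and that each firing counts as one event per connection. The slide does not make that claim, so the result is an order-of-magnitude figure at best. The variable names are hypothetical.

```python
# Figures quoted on the slide.
NEURONS_IN_BRAIN = 10**10
CONNECTIONS_PER_NEURON = 8000   # the slide says ">8000", so this is a lower bound
FIRING_RATE_HZ = 60

# Total connections in the brain: 10^10 * 8000 = 8 * 10^13.
total_connections = NEURONS_IN_BRAIN * CONNECTIONS_PER_NEURON

# Connection events per second if every neuron fired at 60 Hz
# (an illustrative assumption): 8 * 10^13 * 60 = 4.8 * 10^15.
events_per_second = total_connections * FIRING_RATE_HZ
```

Under these assumptions, a simulation would have to handle on the order of 10^15 connection events per second, which is one way to read the slide's point about complexity.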
  36. Some factors in the human neural system
