Cog5 lecppt chapter03

Published in: Education
  • Let’s imagine that you want to put your shoes on, in order to go outdoors. You’d fail in this attempt if you couldn’t recognize your shoes when you saw them.
    Likewise, you know perfectly well how to use a telephone, or how to open a door, or what a chair is for, but you’d never be able to use this knowledge if you couldn’t recognize these ordinary objects when you saw them.
    Recognition is also crucial for learning: if I couldn’t recognize my brother, how would I know that he was at McDonald’s yesterday with his friends?
  • Form perception is the process through which the basic shape and size of an object are seen.
    Object recognition is the process through which the object is identified.
  • Object recognition begins with the detection of simple visual features.
    However, our perception of the visual world goes “beyond the information given” (Bruner, 1973).
    An early twentieth-century movement known as Gestalt psychology captured this idea as “the whole is different from the sum of its parts.”
    Knowledge influences how we put simple features together.
  • The Necker Cube is an example of perception going “beyond the information given.”
    Two different perceptions of depth are possible, given the lines on the page.
  • In the face-vase figure, two interpretations are possible, each based on a different figure/ground organization.
    The other two ambiguous figures also allow for two interpretations.
    This again shows that perception goes “beyond the information given.”
  • For this picture to be perceived correctly, the perceptual system must first decide what goes with what—for example, that Segment B and Segment E are different bits of the same object (even though they are separated by Segment D) and that Segment B and Segment A are different objects (even though they are adjacent and the same color).
    The perceptual system also needs to separate the figures from the ground, and so the horizontal stripes are perceived to be part of the background, continuing behind the blue vase.
  • Similarity: We tend to group these dots into columns rather than rows, grouping dots of similar colors.
    Proximity: We tend to perceive groups, linking dots that are close together.
    Good continuation: We tend to see a continuous green bar rather than two smaller rectangles.
    Closure: We tend to perceive an intact triangle, reflecting our bias toward perceiving closed figures rather than incomplete ones.
    Simplicity: We tend to interpret a form in the simplest way possible. We would see the form on the left as two intersecting rectangles (as shown on right) rather than as a single, 12-sided, irregular polygon.
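The proximity principle above can be illustrated with a toy example. This is only a sketch under simple assumptions: dots lie on a single line, and the function name `group_by_proximity` and the `max_gap` parameter are hypothetical, chosen for illustration. It is not a model of how the visual system actually groups elements.

```python
def group_by_proximity(xs, max_gap):
    """Group 1-D dot positions: a dot joins the current group when it lies
    within max_gap of the previous dot, otherwise it starts a new group."""
    groups = []
    for x in sorted(xs):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous dot: same group
        else:
            groups.append([x])     # large gap: start a new group
    return groups


dots = [0, 1, 2, 6, 7, 8, 12, 13]
print(group_by_proximity(dots, max_gap=1.5))
# -> [[0, 1, 2], [6, 7, 8], [12, 13]]
```

The same dots would form one large group with a generous `max_gap`, which mirrors the idea that grouping depends on relative, not absolute, spacing.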
  • The visual system prefers the simplest explanation possible, avoiding interpretations that involve coincidences.
    This figure is interpreted as two crossed lines, and not two V shapes precisely aligned.
  • Our interpretation of the visual input influences how basic visual features are processed.
    In this image, it is only when the white parts of the figure are treated as figure, and not ground, that the features of letters are analyzed.
    Present basic figure/ground information
  • In this image, the word “perception” is recognized even though most of the features of the component letters are absent from the stimulus.
  • These examples illustrate that the brain areas that analyze basic visual features and the brain areas that analyze large-scale form are interactive, each sending information to the other.
    This again is an example of parallel processing.
  • A first consideration about object recognition is that we can recognize objects when information is incomplete.
    For example, a cat behind a tree is recognized even if only the head and one paw can be seen.
    The context in which objects are viewed can also have a large effect.
  • For instance, this image is likely to be read as “THE CAT” and not “TAE CHT,” even though the letters “H” and “A” are identical here.
  • Recognition might begin with the input pattern’s features—the small elements out of which more complicated patterns are composed.
    Note that here the features are not those of the raw input, but rather those that result from more organized perception of form.
  • Advantages of a feature-based system:
    Features could serve as building blocks, allowing a single object-recognition system to deal with a variety of targets.
    Focusing on features might allow us to concentrate on what is common to otherwise variable objects.
    Experimental tasks such as visual search suggest that features do have priority in perception.
  • In (A), you can immediately spot the vertical, distinguished from the other shapes by just one feature. Likewise, in (B), you can immediately spot the lone green bar in the field of reds. In (C), it takes much longer to find the one red vertical, because now you need to search for a combination of features—not just for red or vertical, but for the one form that has both of these attributes.
  • The last one should be harder.
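The difference between the three search displays can be sketched in code. The following is a simplified illustration, not a model of search times: each item is a hypothetical `(color, orientation)` tuple, and the assumed helper `search_type` only checks whether a single feature is enough to single out the target.

```python
def search_type(target, distractors):
    """Return "feature" if the target differs from every distractor on at
    least one single dimension, otherwise "conjunction"."""
    for dim in range(len(target)):
        if all(d[dim] != target[dim] for d in distractors):
            return "feature"       # one feature alone picks out the target
    return "conjunction"           # only the combination is unique


# (A) one vertical among horizontals, all the same color
print(search_type(("red", "vertical"), [("red", "horizontal")] * 5))
# (B) one green bar among reds
print(search_type(("green", "vertical"), [("red", "vertical")] * 5))
# (C) one red vertical among red horizontals and green verticals
print(search_type(("red", "vertical"),
                  [("red", "horizontal"), ("green", "vertical")] * 3))
```

The first two calls print `feature` and the third prints `conjunction`, matching the intuition that only the last display requires a search for a combination of attributes.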
  • Other data suggest that the detection of features is a distinct step in object recognition:
    A disorder called integrative agnosia, caused by parietal cortex damage, involves a preserved ability to detect whether certain features are present in a display but a disrupted ability to judge how features are bound together in objects.
    A similar dissociation has been produced using transcranial magnetic stimulation (TMS).
  • The tachistoscope is a device for presenting stimuli for precisely controlled amounts of time. Today, computers are used for this purpose.
  • Word stimuli may be followed by a mask—a stimulus designed to disrupt further sensory processing of the words—such as a random string of letters.
  • Visual words can be recognized with extremely brief presentations (e.g., 40 ms) under the right conditions:
    Words that are more frequent in the language are better recognized.
    Words that have been recently seen are better recognized, a phenomenon known as repetition priming.
  • In an experiment demonstrating the word-superiority effect, a procedure known as two-alternative forced choice might be used.
    For example, a word such as “DARK” is briefly presented, and the participant is asked whether an “E” or a “K” was present in the display.
    Participants are more accurate when the letters appear within a word than when they appear by themselves or within a letter string such as “JPSRW.”
  • In other words, how English-like are the letter strings? The more they resemble English, the easier they are to recognize.
  • There is a strong tendency to misread less common letter sequences as if they were more common patterns; irregular patterns are misread as if they were regular patterns. Thus, for example, “TPUM” is likely to be misread as “TRUM” or even “DRUM.” But the reverse errors are rare: “DRUM” is unlikely to be misread as “TRUM” or “TPUM.”
  • One possibility for how the visual system recognizes words is through a system called a feature net.
    The initial layer, at the bottom, comprises detectors for features.
    Subsequent layers detect more complex patterns like letters, and then words.
  • Note in the feature net that there are similarities to how neurons fire and send information to each other.
    The detectors have receptive fields, and they fire a signal when a threshold of stimulation is reached.
    However, the detectors here are probably not individual neurons, but more complex assemblies of neurons.
  • This feature net can explain two of the experimental results discussed earlier:
    Words that are more frequent in the language are better recognized.
    Words that have been recently seen are better recognized, a phenomenon known as repetition priming.
    In both of these situations, detectors that have fired more recently have a higher starting activation level.
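The role of starting activation can be sketched as follows. This is a minimal illustration, assuming a detector fires when its baseline plus the current input reaches a threshold; the class name `Detector`, the `prime` method, and all numeric values are hypothetical and are not taken from the lecture.

```python
class Detector:
    """Toy detector: fires when baseline + input reaches the threshold."""

    def __init__(self, name, baseline=0.0, threshold=1.0):
        self.name = name
        self.baseline = baseline      # starting activation level
        self.threshold = threshold

    def fires(self, input_strength):
        return self.baseline + input_strength >= self.threshold

    def prime(self, boost=0.3):
        # Assumed effect of a recent exposure: a higher starting level.
        self.baseline += boost


frequent = Detector("frequent word", baseline=0.5)   # fired often in the past
rare = Detector("rare word", baseline=0.1)

weak_input = 0.7                      # brief, masked presentation
print(frequent.fires(weak_input))     # True: needs little input to fire
print(rare.fires(weak_input))         # False: same input is not enough
rare.prime()                          # the rare word was seen recently
print(rare.fires(weak_input))         # True: repetition priming
```

In this sketch, frequency and recency are treated the same way, as a raised baseline, which is the point the notes make about both effects.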
  • To explain the word-superiority effect, the finding that letters are recognized better within words than alone or within arbitrary letter strings, we must add another layer to the network that detects bigrams, or letter pairs.
  • The bigram layer also helps the system recover from confusion about individual letters.
    Here, only some letter “O” features were detected, but this is compensated for by the higher baseline activity of the “CO” detector.
  • A similar mechanism explains why an ambiguous stimulus can be perceived as an “A” in some contexts and an “H” in others.
  • One downside to this organization is that it leads to errors of overregularization.
    Here, the presented stimulus is “CQRN” but is likely to be misread as “CORN.” However, the network’s biases usually help achieve correct perception.
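Both the benefit and the cost of the bigram layer can be shown with one small sketch. It assumes that each candidate letter's score is its feature evidence plus the baseline of the bigram it would form with the preceding letter; the function name `read_second_letter` and every number are hypothetical values chosen for illustration.

```python
# Hypothetical baselines: "CO" is a common English pair, "CQ" is not.
BIGRAM_BASELINE = {"CO": 0.4, "CQ": 0.0}


def read_second_letter(first, letter_evidence):
    """Pick the letter whose feature evidence plus bigram baseline is highest."""
    scores = {
        letter: evidence + BIGRAM_BASELINE.get(first + letter, 0.0)
        for letter, evidence in letter_evidence.items()
    }
    return max(scores, key=scores.get)


# Degraded "O": weak evidence, but the "CO" baseline rescues it.
print(read_second_letter("C", {"O": 0.4, "Q": 0.3}))   # O
# Actual "Q" (as in "CQRN"): the same bias now produces an error.
print(read_second_letter("C", {"O": 0.3, "Q": 0.6}))   # O
```

The second call is the overregularization error: the evidence favors "Q", yet the combined score favors the more familiar pair.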
  • What the network “knows” about spelling, or what it “expects” or “infers” about the patterns it sees, is not locally represented in any single detector, but rather is a property of the network as a whole.
    This is an example of distributed knowledge.
  • Note that the errors made by the network are produced by the same mechanisms responsible for its advantages—the ability to deal with ambiguous inputs and to recover from errors.
    It is more likely someone will misperceive a low-frequency word (a low-probability event) than a high-frequency word (high-probability event).
    The network sacrifices a small amount of accuracy for a great deal of efficiency.
  • McClelland and Rumelhart’s (1981) model of word recognition included two additions:
    Excitatory and inhibitory connections between detectors
    Top-down connections from words to letters and letters to features
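The two additions can be illustrated with a very small simulation. This is a loose sketch in the spirit of an interactive network, not McClelland and Rumelhart's actual model: it has only two word units and two letter units for the final position, it covers word-to-letter feedback but not letter-to-feature feedback, and the update rule and all parameter values are assumptions made for illustration.

```python
def clamp(x):
    """Keep activations in the range 0..1."""
    return max(0.0, min(1.0, x))


def run(evidence, steps=3, rate=0.5, inhibition=0.5, top_down=0.5):
    """evidence: bottom-up support for the final letter, e.g. {"K": 0.6, "D": 0.4}."""
    words = {"WORK": 0.0, "WORD": 0.0}
    letters = dict(evidence)
    for _ in range(steps):
        new_words = {}
        for w in words:
            # Excitation from the word's own final letter,
            # inhibition from the competing word.
            rivals = sum(a for other, a in words.items() if other != w)
            net = letters[w[-1]] - inhibition * rivals
            new_words[w] = clamp(words[w] + rate * net)
        words = new_words
        # Top-down feedback: each word supports its own final letter.
        for w, a in words.items():
            letters[w[-1]] = clamp(evidence[w[-1]] + top_down * a)
    return words, letters


words, letters = run({"K": 0.6, "D": 0.4})
print(words)
print(letters)
```

With these assumed values, a small initial advantage for "K" grows over the iterations: "WORK" ends up more active than "WORD", and feedback raises the "K" letter unit above its bottom-up evidence.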
  • Similar feature nets may underlie our perception of objects.
    The recognition by components (RBC) model includes an intermediate layer that is sensitive to geons, basic shapes proposed as the building blocks for all three-dimensional forms.
  • Geons can be identified from virtually any angle of view, and so recognition based on geons is viewpoint independent. Thus, no matter what your position is relative to a cat, you’ll be able to identify its geons and thus identify the cat. Moreover, it seems that most objects can be recognized from just a few geons. As a consequence, geon-based models like RBC can recognize an object even if many of the object’s geons are hidden from view.
  • One piece of evidence supporting the representation of geons is that perceptually degraded pictures are better recognized if geons are preserved.
    Here these items are hard to recognize because the geons are not preserved.
  • Here the geons are preserved and the objects are easier to recognize.
  • Models of object recognition differ on whether object recognition depends on viewpoint.
    In the recognition by components model, geons result in viewpoint-independent recognition.
    Other proposals are viewpoint dependent, requiring the remembered representation to be “rotated” into alignment with the current view.
  • Some evidence suggests that certain categories of objects are perceived using specialized mechanisms.
    In particular, the recognition of faces may involve principles different from those discussed thus far.
  • One source of evidence for specialized face-processing mechanisms comes from prosopagnosia, a type of agnosia in which the visual object-recognition deficit is specific to faces.
  • Perception of and memory for faces are also highly viewpoint dependent, much more so than for other objects.
    Here, inverting faces causes a greater disruption in memory performance compared to inverting houses.
  • This example using an altered photograph of Margaret Thatcher also illustrates how viewpoint dependent the perception of faces is.
    When upside down, these two pictures do not look that different.
  • However, when they are right-side up, the differences between the two faces are immediately clear.
  • Pictures of faces, cars and objects were shown to a car “expert” and a bird “expert.” fMRI data in the left column show activation (in red) in response to faces. The bird expert shows the expected pattern, with face stimuli leading to strong activation of the fusiform face area (FFA); the car expert’s FFA, it turns out, was slightly lower in his brain, and so, even though clearly activated, does not show in this scan. More important, when cars were the stimuli, the car expert (but not the bird expert) showed activation in the FFA. When birds were the stimuli, it was the bird expert who showed FFA activation.
  • Face recognition, in contrast, does not depend on an inventory of a face’s parts; instead, this recognition seems to depend on holistic perception of the face. In other words, the recognition depends on complex relationships created by the face’s overall configuration—the spacing of the eyes relative to the length of the nose, the height of the forehead relative to the width of the face, and so forth.
  • Correct answer: d
    Feedback: Basically, this question is asking you for evidence that perception is bottom-up, which would be answer b. The other three answers have to do with the fact that perception is top-down, or in the eye of the beholder.
  • Correct answer: a
    Feedback: This answer is not evidence for a feature theory of perception because form perception is independent of feature recognition.
  • Correct answer: c
    Feedback: It definitely involves bigrams but is due to experience, not to any predisposition.
  • Correct answer: d
    Feedback: TMS is used to temporarily stop neuronal firing. It does not measure anything.
  • Correct answer: a
    Feedback: Geons are not representations at the feature level but rather are combinations of features and hence serve as an intermediate level of detector.
  • Correct answer: b
    Feedback: Recognition via multiple views is another way to describe the viewpoint-dependent model, which holds that some perspectives will favor object recognition relative to others.
  • Correct answer: d
    Feedback: A lack of ability to move something has to do with apraxia. Aphasia has to do with language, neglect with attention and agnosia with object recognition.

    1. © 2010 by W. W. Norton & Co., Inc. Recognizing Objects: Chapter 3 Lecture Outline
    2. Chapter 3: Recognizing Objects. Lecture Outline: Form Perception; Object Recognition; Word Recognition; Feature Nets; Different Objects, Different Recognition Systems?; Top-down Influences on Object Recognition
    3. Recognizing Objects: Why is object recognition important? Crucial for applying your knowledge; crucial for learning
    4. Form Perception: How do we perceive and recognize objects? Form perception: shape and size; object recognition: identification
    5. Form Perception: Jerome Bruner; Gestalt psychology
    6. Form Perception: Necker Cube. One set of visual features; two possible interpretations; but only one can be seen at a time
    7. Form Perception: Knowledge can change our interpretation
    8. Form Perception: People resolve ambiguity in everyday situations
    9. Form Perception: Your ability to interpret these scenes is governed by a few basic principles
    10. Form Perception: Good continuation; proximity; similarity; closure; simplicity; single objects; how our mind creates objects
    11. Form Perception: Parallel processing
    12. Form Perception: Simpler to interpret this as one X and not two v’s
    13. Form Perception: What is this? Hint: The black is the background.
    14. Form Perception: Proximity, good continuation, closure; letter and word recognition
    15. Form Perception: Brain areas for basic visual features and brain areas for large-scale form are interactive
    16. Object Recognition: Now let’s turn from form perception, the process through which the basic shape and size of an object are seen, and discuss object recognition, the process through which the object is identified
    17. Object Recognition: Can recognize objects even when incomplete; incomplete information; from the back; from the front; context helps
    18. Object Recognition: Same stimulus: H, A
    19. Top-Down Influences on Object Recognition: Bottom-up (or data-driven) processing: stimulus-driven effects; top-down (or concept-driven) processing: knowledge- or expectation-driven effects
    20. Object Recognition: Recognition begins with features, the small elements that result from the organized perception of form
    21. Object Recognition: Features; building blocks; commonalities for variable objects; play a role in visual search
    22. Object Recognition: Visual Search Demo
    23. Object Recognition: Find the vertical line (standing up)
    24. Object Recognition
    25. Object Recognition: Find the green-colored line
    26. Object Recognition
    27. Object Recognition: Find the vertical red-colored line (standing up)
    28. Object Recognition
    29. Object Recognition: Which one was harder?
    30. Object Recognition: Difficulty in judging how more than one feature is bound together in objects; integrative agnosia, parietal cortex damage; disruption of parietal cortex via transcranial magnetic stimulation (TMS)
    31. Word Recognition: Some methodology for studying word recognition: from tachistoscope to computers
    32. Word Recognition
    33. Word Recognition: Masked words; repeated words; 40 ms
    34. Word Recognition: Word-superiority effect: participants asked whether “DARK” contained an “E” or a “K” respond more accurately than when the letter appears within a letter string such as “JPERW”
    35. Word Recognition: Better at identifying letters in a word
    36. Word Recognition: Why word superiority? Probability: how likely is it that letter combinations appear in English?
    37. Word Recognition: Errors also driven by probability; likely to misread words predictably. “TPUM” is likely to be misread as “TRUM” or even “DRUM.” But the reverse errors are rare: “DRUM” is unlikely to be misread as “TRUM” or “TPUM”
    38. Feature Nets: Complex; simple
    39. Feature Nets: “Neural network”; have receptive fields; fire above threshold; like complex assemblies of neurons
    40. Feature Nets: Recent firing = higher starting activation level; frequency leads to higher recency; repetition increases recency
    41. Feature Nets: To explain the word-superiority effect,
    42. Feature Nets: Stronger baseline activity; better recognition; recover from confusion
    43. Feature Nets: TH more frequent; CA and AT more frequent
    44. Feature Nets: Stronger baseline activity; will correct recognition
    45. Feature Nets: Knowledge not locally represented, but rather distributed knowledge
    46. Feature Nets: Errors arise from the network’s ability to deal with ambiguous inputs and to recover from errors; accuracy sacrificed for efficiency
    47. Feature Nets: A much more complex feature net with feedforward and feedback loops; more like a brain
    48. Feature Nets: Building blocks for objects
    49. Feature Nets: Bottom-up recognition; geon recognition leads to object recognition; viewpoint invariant
    50. Object Recognition: Geon Demo
    51. Object Recognition: Write the objects you see
    52. Feature Nets
    53. Feature Nets
    54. Feature Nets
    55. Feature Nets: Recognition by components: viewpoint independent; viewpoint dependent: whole objects need to be rotated
    56. Different Objects, Different Recognition Systems? Some categories are special: faces
    57. Different Objects, Different Recognition Systems? Prosopagnosia is a type of agnosia also known as face blindness
    58. Different Objects, Different Recognition Systems? Houses about the same upright and inverted; faces much worse inverted and much better upright
    59. Different Objects, Different Recognition Systems? Do these two faces look different?
    60. Different Objects, Different Recognition Systems? Do these two faces look different?
    61. Different Objects, Different Recognition Systems? Viewpoint dependence appears when: interpreting faces; expertise is high (e.g., dog judges); specific individuals have to be recognized; configurations of component parts are important
    62. Different Objects, Different Recognition Systems? Face expertise; car expertise; bird expertise
    63. Different Objects, Different Recognition Systems? Holistic processing; composite faces
    64. The Importance of Larger Contexts: Most of the accounts we have covered in this chapter depend on bottom-up processing; however, there is a great deal of knowledge that guides our recognition; later chapters will discuss this further
    65. Chapter 3 Questions
    66. Which of the following is supportive of the claim that perception is in the “eye of the beholder” and not in the stimulus itself? a) When presented with ambiguous letters, the visual system uses context to determine their identity. b) A traffic light can be identified even if partially occluded by a tree branch. c) Whether someone remembers having seen an ambiguous figure (e.g., face-vase) before depends on whether the interpretation of the figure is the same. d) all of the above
    67. Which of the following is evidence for a feature theory of perception? a) The visual system is specialized with cells that detect single features. b) When researchers are able to stabilize the retinal image for an individual, preventing tiny eye movements (saccades) that refresh the rods and cones, the image stays the same. c) In visual search paradigms, in which a single target must be found in an array of other items, target identification is faster when it shares features with the distractors. d) Detecting an embedded figure (including its features) is independent of the way the form is parsed.
    68. When Betty (an English speaker) is shown strings of letters tachistoscopically, they are overregularized to follow the rules of common English spelling. This is because a) of the word superiority effect. b) all humans are predisposed toward the visual configurations evident in “regular” bigrams; this is why English uses them. c) of a lifetime of strengthening the bigram detectors for common English letter pairs. d) Betty is reluctant to give answers that she cannot easily pronounce.
    69. Which of the following methodologies does not measure brain activity or structure? a) magnetic resonance imaging (MRI) b) computerized axial tomography (CT) c) positron emission tomography (PET) d) transcranial magnetic stimulation (TMS)
    70. The use of geons is associated with a) the recognition-by-components (RBC) model. b) the word superiority effect. c) visual masking. d) feature nets.
    71. The “recognition-via-multiple-views” approach to object recognition is also known as _____ recognition. a) viewpoint dependent b) viewpoint independent c) object d) face
    72. Which of the following is the clinical term we use to describe a disturbance in the initiation or organization of voluntary action? a) aphasia b) neglect c) agnosia d) none of the above
