(Seminar presentation) Grounding words in perception and action: computational model. Deb Roy, TRENDS in Cognitive Sciences, 2005. Presented by 최유진, autumn 2011

  1. 최유진
  2. Grounding words in perception and action: computational model. Deb Roy. TRENDS in Cognitive Sciences, Vol. 9, No. 8, August 2005. Thursday, October 13, 2011
  3. Language: English, Russian, Korean, French, Chinese, Japanese, Portuguese, Indian, German, Spanish, Arabic
  4. One's language = one's perspective on the world. The aim is to bring the language of machines together with that of humans. Human - communicates with - Machine.
  5. Deb Roy, Associate Professor of Media Arts and Sciences; Director, Cognitive Machines.
     Roy studies how children learn language, and designs machines that learn to communicate in human-like ways. To enable this work, he has pioneered new data-driven methods for analyzing and modeling human linguistic and social behavior.
     Research areas: artificial intelligence, cognitive modeling, human-machine interaction, data mining, and information visualization.
     http://www.ted.com/talks/deb_roy_the_birth_of_a_word.html
  6. Section 0: Research Background
     We use words to communicate about things and kinds of things, their properties, relations, and actions.
     Analogy between human and machine:
     - Research in robotics and simulated systems grounds language in machine perception and action, mirroring human abilities.
     - The research tradition in computational modeling has moved from a purely symbolic level to connecting symbols with their real-world referents in the physical realm, that is, from purely symbolic models to context-dependent ones.
     Index:
       1. Words about the physical world.
       2. Association between words and perceptual categories.
       3. Modeling context-dependent word use.
       4. Models of infant word learning that process 'first-person-perspective' sensory data.
       5. Richer representational structures: grounding verbs in physical action.
       6. Integration of action and perception in grounding nouns.
       7. Conclusions
  7. Section 1: Words about the physical world.
     • Is human language like a dictionary? Symbolic computational models; real-world referents: ?
     • Computational models and the embodied nature of language: complex crossmodal phenomena, particularly useful in situated language acquisition (physical environment; objects and activities).
     • Implication of the study: the possibility that machines could autonomously acquire and verify beliefs about the world, and communicate about those beliefs in natural language.
     Figure labels: ROUND (visual feature), PUSH (motor control feature), HEAVY (haptic feature).
  8. Section 2: Words - Perceptual Categories: Salient Linguistic Features
     2.1 Language grounding systems and categorization.
     Sensory input -> natural language description. Translation: continuous sensor input (vectors) to linguistic categories.
     e.g. Generative and discriminative models of categorization. A category can be modeled by a prototype (a). Two prototypes can 'compete' (b), leading to a category boundary along the points of equal distance from both prototypes (if non-Euclidean distance measures are used, non-linear boundaries may emerge). Categories may also be modeled by explicitly representing categorical boundaries: in (c), a linear model, height = A * width + B, encodes the same categorical distinction as the prototypes in (b).
  9. Section 2: Words - Perceptual Categories: Salient Linguistic Features
     2.2 Models of color naming: is the perceptual model fixed?
     Mojsilovic's early model in different contexts. Figure labels: "Purple", "Red", "Red wine".
  10. Section 3: Words - Perceptual Categories: Context-dependent Word Use
     3.1 Gärdenfors' model: color distance
     How linguistic convention and visual perception combine to determine word meanings: arbitrary linguistic convention operating within perceptual color constraints.
     e.g. 'Red wine' in Spanish: 'vino tinto' (literally, colored wine); in Catalan: 'vino negro' (black wine). The choice between red (tinto) and black (negro) is an arbitrary linguistic convention.
     Gärdenfors on red vs. white: the distance between the color white and red (dark) wine is greater than the distance between the color white and white (light) wine (measured against the context-independent prototypes).
  11. Section 3: Words - Perceptual Categories: Context-dependent Word Use
     3.2 Regier: spatial distance
     Regier studied graded acceptability judgments of 1) spatial terms: how English speakers judge the term "above" in conjunction with the physical context.
     'The circle is above the block': which of (a), (b), (c)?
     "Above":
     - L1 connects the centers of mass of the regions.
     - L2 connects the closest points between the regions.
     L1 of (b) = L1 of (c); L2 of (a) = L2 of (b).
     Terms: above, near.
  12. Section 3: Words - Perceptual Categories: Context-dependent Word Use
     3.2 Regier (cont.)
     2) Movements: simple movies of objects moving relative to one another are used to visually ground words such as 'through' and 'into'.
     e.g. 'Putting a key into a lock' vs. 'Removing a key from a lock': events distinguished by their initial points vs. their end points.
     3.3 Limitations in spatial semantics and further studies.
     - Lack of functional context, e.g. 'clean behind the couch' vs. 'hide behind the couch': the two phrases use 'behind' in different functional contexts.
  13. Section 4: Models of infant word learning that process 'first-person-perspective' sensory data
     4.1 Cross-channel Early Lexical Learning (CELL)
     "Step into the shoes" of humans and learn from natural sensory data: recordings from natural human environments can be processed directly, without manual transcription.
     CELL computational model: visual categories and spoken words. A model of learning words from sights and sounds.
     CELL vs. blinded system: a 50% gap in accuracy rate.
  14. Section 4: Models of infant word learning that process 'first-person-perspective' sensory data
     4.1 Cross-channel Early Lexical Learning (CELL)
     Method: lexical learning analysis
     1) STM (short-term memory): utterance-context pairs from audio-visual input
        - audio: phonetic representations of spoken sequences (linguistic unit)
        - video: context, i.e. visually observable objects and motion (semantic/contextual unit)
     2) LTM (long-term memory): lexical candidates
        - utterances are decomposed into a set of hypothesized linguistic unit prototypes
        - contexts are decomposed into a set of hypothesized semantic category prototypes
        e.g. bounce - ball, ruf-ruf - dog, vrrooom - car, ... shoes, truck
     Limitations: 1) noise from sensory processing; 2) semantically inappropriate candidates, e.g. 'yeah'.
  15. Summary so far:
     1. Word - perception: indirect processing
        - purely semantic
        - context-dependent
     2. First-person perspective: direct processing
        - CELL (a single object at a time)
        - Eye gaze (multiple objects at once)
     3. What's next? VERB = ACTION.
  16. Section 5: Richer representational structures: grounding verbs in physical action.
     Verbs that refer to physical actions are naturally grounded in representations that encode the temporal flow of events.
     5.1 Siskind: perceptually grounded model of verbs
     - Input: video recordings of human hands moving colored blocks.
     - Primitive relations: contact, support, attachment [Talmy's theory of force dynamics].
     - Semantics of a basic verb = a temporal schema, i.e. an expected sequence of force-dynamic interactions.
       e.g. 'Hand picks up block', each step in subject-verb-object form:
         1. table-supports-block
         2. hand-contacts-block
         3. hand-attached-block
         4. hand-supports-block
     * Allen relations: the 13 possible relations between two time intervals A and B.
     5.2 Bailey et al. developed a system that learns verb semantics and an action control structure, the 'X-schema'.
     - e.g. the difference between 'push' and 'shove'.
  17. Section 6: Integration of action and perception in grounding nouns.
     6.1 Roy: structured networks of motor and sensor primitives, implemented in a conversational robot named Ripley.
     'Hand me the blue one on your right'
     - Ripley maintains a dynamic mental model: a three-dimensional model of its physical environment.
     - The contents of the robot's mental model may be updated based on linguistic, visual, or haptic input. (Ripley remembers the position of an object when it is out of its sensory field.)
     - Multimodal sensory expectation (when Ripley does something -> what the visual system expects):
         Looks at the location -> finds the visual region
         Reaches to the location -> touches and grasps the object
         Grasps the object -> gains control over the object's location; location information is updated
  18. Section 6: Integration of action and perception in grounding nouns.
     6.1 Roy (cont.)
     Ripley's representations and algorithms ground the meaning of verbs, adjectives, and nouns using a unified representational system.
     - VERBS: motor-control actions, similar to X-schemas.
     - ADJECTIVES (object properties): all perceptual properties correspond to actions.
         red: not just a color category, but a category linked to motor programs.
         heavy: a haptic category linked to specific actions.
     - NOUNS: objects linked with locations.
         Ball - round (or color, size, ...) - all of the actions involved.
  19. Section 7: Conclusions
     - Interaction between word use, perception, and action.
     - Further research (Box 3): other aspects of language, such as grammatical composition and functional use in social context.
     - Re-unite sub-fields of AI: computer vision, parsing, information retrieval, machine learning, and planning.
     - The drop in cost of sensor and robotic technology, together with ubiquitous situated computing, will create new forms of situated human-machine communication.
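
The categorization idea on slide 8 (two competing prototypes versus an explicit linear boundary height = A * width + B) can be illustrated with a minimal Python sketch. The prototype names and coordinates below are hypothetical values chosen for illustration, not data from the paper, and Euclidean distance is assumed. The sketch derives A and B as the perpendicular bisector of the two prototypes and classifies a point either way.

```python
import math

# Hypothetical prototypes in (width, height) feature space (illustrative values).
PROTOTYPES = {"tall": (1.0, 4.0), "wide": (4.0, 1.0)}


def nearest_prototype(point, prototypes=PROTOTYPES):
    """Prototype competition: pick the category whose prototype is closest."""
    return min(prototypes, key=lambda name: math.dist(point, prototypes[name]))


def linear_boundary(p1, p2):
    """Return (A, B) so that height = A * width + B is equidistant from p1 and p2.

    Derived from |x - p1|^2 = |x - p2|^2 (the perpendicular bisector).
    Requires the prototypes to differ in height.
    """
    dw, dh = p2[0] - p1[0], p2[1] - p1[1]
    if dh == 0:
        raise ValueError("boundary is vertical; it cannot be written as height = A*width + B")
    c = (p2[0] ** 2 + p2[1] ** 2) - (p1[0] ** 2 + p1[1] ** 2)
    return -dw / dh, c / (2 * dh)


def classify_linear(point, upper="tall", lower="wide", prototypes=PROTOTYPES):
    """Boundary model: `upper` must name the prototype with the greater height."""
    a, b = linear_boundary(prototypes[upper], prototypes[lower])
    width, height = point
    return upper if height > a * width + b else lower


if __name__ == "__main__":
    print(linear_boundary(PROTOTYPES["tall"], PROTOTYPES["wide"]))
    print(nearest_prototype((1.5, 3.0)), classify_linear((1.5, 3.0)))
```

Away from the boundary itself, both functions assign the same label, which is the sense in which the linear model encodes the same distinction as the prototypes. With a non-Euclidean distance the equal-distance boundary is generally no longer a straight line, so this equivalence would not hold.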

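
The temporal schema for 'pick up' on slide 16 can be sketched as an ordered sequence of force-dynamic interactions that must appear, in order, in a stream of observed events. This is a deliberately simplified illustration: the event tuples and the function name `matches_schema` are assumptions made here, and Siskind's model reasons over relations between time intervals (the Allen relations) rather than over a flat ordered list.

```python
# Temporal schema for 'pick up', using the four steps listed on the slide.
PICK_UP = [
    ("table", "supports", "block"),
    ("hand", "contacts", "block"),
    ("hand", "attached", "block"),
    ("hand", "supports", "block"),
]


def matches_schema(observed, schema):
    """Return True if every schema step occurs in `observed`, in schema order.

    Other events may be interleaved between the steps.
    """
    remaining = iter(observed)
    # `step in remaining` consumes the iterator up to the match,
    # so each later step must be found after the earlier ones.
    return all(step in remaining for step in schema)


if __name__ == "__main__":
    # Hypothetical observation stream with one unrelated event interleaved.
    observed = [
        ("table", "supports", "block"),
        ("hand", "contacts", "table"),
        ("hand", "contacts", "block"),
        ("hand", "attached", "block"),
        ("hand", "supports", "block"),
    ]
    print(matches_schema(observed, PICK_UP))
    print(matches_schema(list(reversed(observed)), PICK_UP))
```

Treating the schema as a subsequence keeps the example short; a closer approximation would attach start and end times to each interaction and check how the intervals overlap.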