Fcv rep tenenbaum

  1. How should we represent visual scenes? Common-Sense Core, Probabilistic Programs. Josh Tenenbaum, MIT Brain and Cognitive Sciences, CSAIL. Joint work with Noah Goodman, Chris Baker, Rebecca Saxe, Tomer Ullman, Peter Battaglia, Jess Hamrick and others.
  2. Core of common-sense reasoning. Human thought is structured around a basic understanding of physical objects, intentional agents, and their relations.
     • “Core knowledge” (Spelke, Carey, Leslie, Baillargeon, Gergely…)
     • Intuitive theories (Carey, Gopnik, Wellman, Gelman, Gentner, Forbus, McCloskey…)
     • Primitives of lexical semantics (Pinker, Jackendoff, Talmy, Pustejovsky)
     • Visual scene understanding (Everyone here…)
     From scenes to stories… The key questions: (1) What is the form and content of human common-sense theories of the physical world, intentional agents, and their interaction? (2) How are these theories used to parse visual experience into representations that support reasoning, planning, and communication?
  3. A developmental perspective. A 3-year-old and her dad: Dad: “What’s this a picture of?” Sarah: “A bear hugging a panda bear.” ... Dad: “What is the second panda bear doing?” Sarah: “It’s trying to hug the bear.” Dad: “What about the third bear?” Sarah: “It’s walking away.” But this feels too hard to approach now, so what about looking at younger children (e.g., 12 months or younger)?
  4. Intuitive physics and psychology. Southgate and Csibra, 2009 (13-month-olds); Heider and Simmel, 1944.
  5. Intuitive physics. (Gupta, Efros, Hebert) (Whiting et al.)
  6. Intuitive psychology
  7. Probabilistic generative models
     • Early 1990s to early 2000s
       – Bayesian networks: model the causal processes that give rise to observations; perform reasoning, prediction, and planning via probabilistic inference.
       – The problem: not sufficiently flexible or expressive.
  8. Scene understanding as an inverse problem. The “inverse Pixar” problem: World state (t) → [graphics] → Image (t).
  9. Scene understanding as an inverse problem. The “inverse Pixar” problem: physics links … → World state (t-1) → World state (t) → World state (t+1) → …; graphics maps each world state to its image: Image (t-1), Image (t), Image (t+1).
  10. Probabilistic programs
      • Probabilistic models à la Laplace.
        – The world is fundamentally deterministic (described by a program), and would be perfectly predictable if we could observe all relevant variables.
        – Observations are always incomplete or indirect, so we put probability distributions on what we can’t observe.
      • Compare with Bayesian networks.
        – Thick nodes. Programs are defined over unbounded sets of objects, their properties, states and relations, rather than traditional finite-dimensional random variables.
        – Thick arrows. Programs capture fine-grained causal processes unfolding over space and time, not simply directed statistical dependencies.
        – Recursive. Probabilistic programs can be arbitrarily manipulated inside other programs (e.g., perceptual inferences about entities that make perceptual inferences; entities with goals and plans regarding other agents’ goals and plans).
      • Compare with grammars or logic programs.
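
The "à la Laplace" idea on slide 10 can be illustrated with a toy sketch: a deterministic program plus probability distributions over the variables we cannot observe. Everything here is an assumption made for illustration and is not from the talk: the `world` function (Newton's second law), the priors over mass and push, and the use of rejection sampling as the inference method.

```python
import random

MASSES = (1.0, 2.0, 4.0)  # hypothetical candidate masses


def world(mass, push):
    """Deterministic 'program': acceleration from Newton's second law."""
    return push / mass


def sample_latents(rng):
    """Priors over the unobserved variables (illustrative choices)."""
    mass = rng.choice(MASSES)
    push = rng.uniform(0.0, 8.0)
    return mass, push


def posterior_mass(observed_accel, tol=0.25, n=20000, seed=0):
    """Rejection sampling: keep latent draws whose deterministic outcome
    lands within tol of the observation, then tabulate the masses kept."""
    rng = random.Random(seed)
    counts = {m: 0 for m in MASSES}
    for _ in range(n):
        mass, push = sample_latents(rng)
        if abs(world(mass, push) - observed_accel) <= tol:
            counts[mass] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples matched the observation")
    return {m: c / total for m, c in counts.items()}


if __name__ == "__main__":
    # A small observed acceleration is easier to explain with a heavy object.
    print(posterior_mass(1.0))
```

The program itself contains no randomness; all uncertainty lives in the latent inputs, which is the contrast with a fixed set of finite-dimensional random variables drawn on the slide.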
  11. Probabilistic programs for “inverse Pixar” scene understanding
      • World state: CAD++
      • Graphics
        – Approximate rendering
          • Simple surface primitives
          • Rasterization rather than ray tracing (for each primitive, which pixels does it affect?)
          • Image features rather than pixels
        – Probabilities:
          • Image noise, image features
          • Unseen objects (e.g., due to occlusion)
  12. Probabilistic programs for “inverse Pixar” scene understanding
      • World state: CAD++
      • Graphics
      • Physics
        – Approximate Newton (physical simulation toolkit, e.g., ODE)
          • Collision detection: zone of interaction
          • Collision response: transient springs
          • Dynamics simulation: only for objects in motion
        – Probabilities:
          • Latent properties (e.g., mass, friction)
          • Latent forces
  13. Modeling stability judgments
  14. Modeling stability judgments. Diagram: physics links … → World state (t-1) → World state (t) → World state (t+1) → …; graphics maps each world state to Image (t-1), Image (t), Image (t+1).
  15. Modeling stability judgments. Diagram as on slide 14, with graphics replaced by probabilistic approximate rendering.
  16. Modeling stability judgments. Diagram: physics links … → World state (t-1) → World state (t) → World state (t+1) → …; probabilistic approximate rendering maps each world state to Image (t-1), Image (t), Image (t+1).
  17. Modeling stability judgments. Diagram: probabilistic approximate Newton links … → World state (t-1) → World state (t) → World state (t+1) → …; probabilistic approximate rendering maps each world state to Image (t-1), Image (t), Image (t+1).
  18. Modeling stability judgments. Diagram as on slide 17, with a legend entry marking perceptual uncertainty.
  19. Modeling stability judgments (Hamrick, Battaglia, Tenenbaum, CogSci 2011)
      • Perception: approximate posterior with block positions normally distributed around ground truth, subject to global stability.
      • Reasoning: draw multiple samples from perception; simulate forward with deterministic approximate Newton (ODE).
      • Decision: expectations of various functions evaluated on simulation outputs.
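
The perception, reasoning and decision steps on slide 19 can be sketched as a small Monte Carlo loop. This is a minimal sketch under loud assumptions, not the authors' model: a static centre-of-mass check on a one-dimensional stack stands in for the physics engine, all blocks have equal mass and width, and the slide's "subject to global stability" constraint on perception is omitted. The names `fraction_fallen`, `predicted_instability`, `sigma` and `BLOCK_WIDTH` are hypothetical.

```python
import random

BLOCK_WIDTH = 1.0  # every block has the same width and mass (assumption)


def fraction_fallen(xs):
    """xs: horizontal centres of the blocks, listed bottom to top.

    Stand-in for forward simulation: the sub-stack resting on block i
    topples if its centre of mass lies beyond block i's edges."""
    n = len(xs)
    for i in range(n - 1):
        above = xs[i + 1:]
        centre_of_mass = sum(above) / len(above)
        if abs(centre_of_mass - xs[i]) > BLOCK_WIDTH / 2:
            return len(above) / n  # everything above the failing joint falls
    return 0.0


def predicted_instability(xs_true, sigma=0.1, n_samples=2000, seed=0):
    """Perception: Gaussian noise around the true positions.
    Reasoning: run the deterministic check on each noisy sample.
    Decision: average the fraction of the tower that falls."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        noisy = [x + rng.gauss(0.0, sigma) for x in xs_true]
        total += fraction_fallen(noisy)
    return total / n_samples


if __name__ == "__main__":
    print(predicted_instability([0.0, 0.05, 0.1]))  # well-aligned tower
    print(predicted_instability([0.0, 0.3, 0.6]))   # tower close to tipping
```

Under this sketch a tower that is stable in the ground truth but close to tipping still receives a graded instability score, because some perceptual samples fall over.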
  20. Results. Plot: mean human stability judgment vs. model prediction (expected proportion of tower that will fall).
  21. Simpler alternatives?
  22. The flexibility of common sense (“infinite use of finite means”, “visual Turing test”)
      • Which way will the blocks fall?
      • How far will the blocks fall?
      • If this tower falls, will it knock that one over?
      • If you bump the table, will more red blocks or yellow blocks fall over?
      • If this block had (not) been present, would the tower (still) have fallen over?
      • Which of these blocks is heavier or lighter than the others?
      • …
  23. Direction of fall
  24. Direction and distance of fall
  25. If you bump the table…
  26. If you bump the table… (Battaglia & Tenenbaum, in prep). Plot: mean human judgment vs. model prediction (expected proportion of red vs. yellow blocks that fall).
  27. Experiment 1: Cause/Prevention Judgments (Gerstenberg, Tenenbaum, Goodman, et al., in prep)
  28. Modeling people’s cause/prevention judgments
      • Physics Simulation Model: p(B|A) − p(B|not A)
        – p(B|A): 0 if ball misses, 1 if ball goes in.
        – p(B|not A): assume sparse latent Gaussian perturbations on B’s velocity.
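
The difference p(B|A) − p(B|not A) on slide 28 can be sketched as a counterfactual simulation. The geometry below is hypothetical and is not the experiment's setup: ball B travels in a straight line toward a gate, and "not A" simply means B keeps its pre-collision velocity. Another loud assumption: the noise here is dense (added on every sample), whereas the slide specifies sparse perturbations.

```python
import random


def goes_in(vx, vy, start_y=0.0, gate_x=10.0, gate_half_width=1.0):
    """Ball B starts at (0, start_y) and moves in a straight line.
    It 'goes in' if it crosses x = gate_x inside the gate opening."""
    if vx <= 0:
        return False
    y_at_gate = start_y + vy * (gate_x / vx)
    return abs(y_at_gate) <= gate_half_width


def p_goal_without_a(vx, vy, sigma=0.05, n=5000, seed=0):
    """Estimate p(B | not A): perturb B's velocity with Gaussian noise
    and count how often B would have gone in had A been absent."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        hits += goes_in(vx + rng.gauss(0.0, sigma), vy + rng.gauss(0.0, sigma))
    return hits / n


def causal_score(went_in_actually, vx, vy, **kwargs):
    """p(B|A) - p(B|not A): positive values read as 'A caused it',
    negative values as 'A prevented it'."""
    p_b_given_a = 1.0 if went_in_actually else 0.0
    return p_b_given_a - p_goal_without_a(vx, vy, **kwargs)


if __name__ == "__main__":
    # B was heading wide of the gate but went in after A hit it.
    print(causal_score(True, vx=1.0, vy=0.5))
    # B was heading straight for the gate but missed after A hit it.
    print(causal_score(False, vx=1.0, vy=0.0))
```

The first call in the usage example yields a score near +1 (strong causation) and the second a score near −1 (strong prevention); a ball that was going in anyway gets a score near 0.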
  29. Simulation Model
  30. Intuitive psychology. Diagram: Beliefs (B), Desires (D) → Actions (A). Heider and Simmel, 1944.
  31. Intuitive psychology. Diagram: Beliefs (B), Desires (D) → Actions (A). Pr(A|B,D): Beliefs (B) …, Desires (D) … Heider and Simmel, 1944.
  32. Intuitive psychology. Diagram: Beliefs (B), Desires (D) → [probabilistic approximate planning] → Actions (A). Probabilistic program. Heider and Simmel, 1944.
  33. Intuitive psychology. Diagram: Beliefs (B), Desires (D) → [probabilistic approximate planning] → Actions (A). Probabilistic program. With actions i and states j: in state j, choose action $i^* = \arg\max_i \sum_{j'} p_{ij,j'} \, u_{j'}$. “Inverse economics”, “Inverse optimal control”, “Inverse reinforcement learning”, “Inverse Bayesian decision theory” (Lucas & Griffiths; Jern & Kemp; Tauber & Steyvers; Rafferty & Griffiths; Goodman & Baker; Goodman & Stuhlmuller; Bergen, Evans & Tenenbaum … Ng & Russell; Todorov; Rao; Ziebart, Dey & Bagnell…)
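
The choice rule on slide 33 (pick the action with the highest expected utility over next states) can be written out directly. In this sketch `p[i][j][k]` is read as the probability that action i taken in state j leads to state k, and `u[k]` as the utility of state k; that reading of the slide's symbols is an assumption. The softmax policy is one common way to make the choice probabilistic and is also an assumption, not something the slide specifies.

```python
import math


def expected_utilities(p, u, j):
    """Expected utility of each action i in state j: sum_k p[i][j][k] * u[k]."""
    return [sum(pk * uk for pk, uk in zip(p[i][j], u)) for i in range(len(p))]


def best_action(p, u, j):
    """Deterministic rule: i* = argmax_i of the expected utility."""
    eu = expected_utilities(p, u, j)
    return max(range(len(eu)), key=eu.__getitem__)


def softmax_policy(p, u, j, beta=2.0):
    """Noisy-rational variant: Pr(i) proportional to exp(beta * EU_i).
    Larger beta approaches the deterministic argmax."""
    eu = expected_utilities(p, u, j)
    top = max(eu)
    weights = [math.exp(beta * (e - top)) for e in eu]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    # Two actions, two states; state 1 is the rewarding one.
    p = [
        [[0.9, 0.1], [0.9, 0.1]],  # action 0 mostly leads to state 0
        [[0.2, 0.8], [0.2, 0.8]],  # action 1 mostly leads to state 1
    ]
    u = [0.0, 1.0]
    print(best_action(p, u, j=0))
    print(softmax_policy(p, u, j=0))
```

Inverting a policy of this kind, that is, inferring `u` or beliefs from observed choices, is what the “inverse” framings listed on the slide have in common.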
  34. Goal inference as inverse probabilistic planning (Baker, Tenenbaum & Saxe, Cognition, 2009). Diagram: constraints, goals → [rational planning (MDP)] → actions (Agent). Plot: People vs. Model, both axes from 0 to 1, r = 0.98.
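
Goal inference as inverse planning (slide 34) amounts to applying Bayes' rule over candidate goals, with a planning model supplying the likelihood of each observed action. The sketch below is a deliberately reduced illustration and not the MDP model of the cited paper: the world is a one-dimensional line, the agent is assumed to be a one-step softmax planner that prefers moves reducing its distance to the goal, and the prior over goals is uniform. All names and parameters are hypothetical.

```python
import math

ACTIONS = (-1, +1)  # step left or step right on a line


def action_likelihood(state, action, goal, beta=2.0):
    """Pr(action | state, goal) for a noisy-rational agent that scores each
    move by the negative distance to the goal after taking it."""
    scores = [-beta * abs((state + a) - goal) for a in ACTIONS]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    return weights[ACTIONS.index(action)] / sum(weights)


def goal_posterior(start, actions, goals, beta=2.0):
    """Bayes' rule over goals: uniform prior times the product of
    per-step action likelihoods along the observed trajectory."""
    post = {g: 1.0 / len(goals) for g in goals}
    state = start
    for a in actions:
        for g in goals:
            post[g] *= action_likelihood(state, a, g, beta)
        state += a
    z = sum(post.values())
    return {g: v / z for g, v in post.items()}


if __name__ == "__main__":
    # Three steps to the right favour the goal on the right.
    print(goal_posterior(start=0, actions=[+1, +1, +1], goals=(-5, 5)))
```

With no observed actions the posterior equals the prior; each further step toward one goal shifts belief toward it, which mirrors the graded goal judgments the slide compares against people.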
  35. Theory of mind: joint inferences about beliefs and preferences (Baker, Saxe & Tenenbaum, CogSci 2011). Diagram (Agent): Environment state → [rational perception] → Beliefs; Beliefs, Preferences → [rational planning] → Actions. Food truck scenarios: Preferences, Initial Beliefs.
  36. Goal inference with multiple agents (Baker, Goodman & Tenenbaum, CogSci 2008, in prep). Diagram, repeated for each of two agents: constraints, goals → [rational planning (MDP)] → actions (Agent). Southgate & Csibra: People vs. Model.
  37. Inferring social goals (Baker, Goodman & Tenenbaum, CogSci 2008; Ullman, Baker, Evans, Macindoe & Tenenbaum, NIPS 2009). Diagram, repeated for each of two agents: constraints, goals → [rational planning (MDP)] → actions (Agent). Hamlin, Kuhlmeier, Wynn & Bloom: subject ratings vs. model prediction (two panels).
  38. Conclusions
      • From scenes to stories… What contents of stories are routinely accessed through visual scenes? How can we represent that content for reasoning, communication, prediction and planning?
      • Focus on core knowledge present in preverbal infants: intuitive physics, intuitive psychology.
      • Representations using probabilistic programs: thick nodes (e.g., CAD++), thick arrows (physics, graphics, planning), recursive (inference about inference, goals about goals).
      • Challenges for future work: (1) Integrating physics and psychology. (2) Efficient inference. (3) Learning.
