Brenden studies computational problems that are easier for people than they are for machines. He received his Ph.D. in Cognitive Science from MIT in 2014, and his M.S. and B.S. in Symbolic Systems from Stanford University in 2009. He is a recipient of the Robert J. Glushko Prize for Outstanding Doctoral Dissertation in Cognitive Science. His recent research on Bayesian Program Learning has been covered by many media outlets (New York Times, Washington Post, etc.) and was selected by Scientific American as one of the most important advances of 2016.
Both cognitive science and AI can gain by studying the human solutions to difficult computational problems. Brenden's talk will focus on concept learning and question asking, two problems that people solve far better than machines. People can learn a new concept from fewer examples, and then use their concepts in richer ways: for imagination, extrapolation, and explanation, not just classification. Moreover, learning is often an active process; people can ask rich and probing questions to reduce uncertainty, while algorithms for active learning ask simple and stereotyped queries. He will also discuss work on program induction as a cognitive model and as a potential route to extracting richer concepts from less data, with applications to learning handwritten characters and learning recursive visual concepts from examples. Brenden will end with program synthesis as a model of question asking in simple games.
4. Outline
Concept learning
case study 1: handwritten characters
case study 2: recursive visual concepts
Question asking
case study 3: question asking in simple games
e.g., "Are any ships 3 tiles long?" "Are the blue ship and red ship parallel?"
5. Concepts and questions as programs, learning as program induction
"Are any objects 3 tiles long?"
(> (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple))) 0)
L-system example:
angle = 120
start = F
niter = 3
F ➔ F-G+F+G-F
G ➔ GG
"Are the blue and red objects parallel?"
(= (orient Blue) (orient Red))
BPL generative model (pseudocode):
procedure GENERATETYPE
  ...
  for i = 1 ... κ do
    for j = 1 ... n_i do
      s_ij ← P(s_ij | s_i(j−1))
    end for
    R_i ← P(R_i | S_1, ..., S_(i−1))
  end for
  ψ ← {κ, R, S}
  return @GENERATETOKEN(ψ)
end procedure

procedure GENERATETOKEN(ψ)
  for i = 1 ... κ do
    S_i^(m) ← P(S_i^(m) | S_i)
    L_i^(m) ← P(L_i^(m) | R_i, T_1^(m), ..., T_(i−1)^(m))
    T_i^(m) ← f(L_i^(m), S_i^(m))
  end for
  A^(m) ← P(A^(m))
  I^(m) ← P(I^(m) | T^(m), A^(m))
  return I^(m)
end procedure
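A minimal Python sketch of the same two-stage generative process, with invented primitive sets and placeholder distributions standing in for the learned P(·) terms above; it only illustrates the type-level vs. token-level split, not the actual BPL implementation.

import random

# Hypothetical primitives and placeholder distributions; the real BPL model
# learns these from the Omniglot background set of characters.
PRIMITIVES = ["arc", "line", "hook"]
RELATIONS = ["independent", "attached-at-start", "attached-at-end", "attached-along"]

def generate_type():
    """Type level: sample parts (strokes), sub-parts, and relations."""
    kappa = random.randint(1, 3)                      # number of strokes
    strokes, relations = [], []
    for _ in range(kappa):
        n_sub = random.randint(1, 2)                  # sub-parts per stroke
        strokes.append([random.choice(PRIMITIVES) for _ in range(n_sub)])
        relations.append(random.choice(RELATIONS))    # how the stroke attaches
    return {"strokes": strokes, "relations": relations}

def generate_token(character_type):
    """Token level: add per-example motor noise and placement, then 'render'."""
    token = []
    for stroke, rel in zip(character_type["strokes"], character_type["relations"]):
        jitter = [round(random.gauss(0, 0.1), 3) for _ in stroke]   # motor variance
        location = (random.uniform(0, 1), random.uniform(0, 1))     # start location
        token.append({"sub_parts": stroke, "relation": rel,
                      "jitter": jitter, "start": location})
    return token  # stand-in for the rendered image I^(m)

concept = generate_type()
examples = [generate_token(concept) for _ in range(3)]  # same type, varied tokens
print(len(examples), "tokens sampled from one concept type")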
9. Outline
Concept learning
case study 1: handwritten characters
case study 2: fractal concepts
Question asking: active learning with rich questions
e.g., "Are any ships 3 tiles long?" "Are the blue ship and red ship parallel?"
With Josh Tenenbaum and Russ Salakhutdinov
Lake, Salakhutdinov, & Tenenbaum (2015). Science.
10. How do people learn such rich concepts from very little data?
The speed of learning: "one-shot learning" (e.g., Carey & Bartlett, 1978; Markman, 1989; Tenenbaum, 1999; Bloom, 2000; Smith et al., 2002)
The richness of representation: parsing, generating new examples, generating new concepts ("where are the others?")
11. Concept learning in computer vision: deep neural networks and big data
Training data (ImageNet):
• 1.2 million images
• ~1000 images per category
Architecture:
• dozens of layers
• millions of parameters
input → layers of feature maps → output: "daisy"
12. A testbed domain for one-shot learning
We would like to study one-shot learning in a domain with…
1) Natural, high-dimensional concepts.
2) A reasonable chance of building models that can see most of the structure that people see.
3) Insights that generalize across domains.
15. Human-level concept learning
The speed of learning ("one-shot learning") and the richness of representation: parsing, generating new examples, generating new concepts ("where are the others?").
20. Original characters and human drawings
(Figure: each original character shown alongside human drawings, with stroke order 1-5 for each parse and the five best programs with their probability scores.)
27. Bayesian Program Learning (BPL)
Type level: primitives (1D curvelets, 2D patches, 3D geons, actions, sounds, etc.) → sub-parts → parts → object template, with relations (attached along, attached at start, ...)
Token level: exemplars → raw data
Inference: θ is the latent program and I the raw binary image; a prior on parts, relations, etc. plus a renderer define the generative model, and Bayes' rule gives
P(θ | I) = P(I | θ) P(θ) / P(I)
Key ingredients for learning good programs: compositionality, causality, learning-to-learn
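To make the inference step concrete, a toy Python sketch that applies the Bayes' rule above to a handful of hypothetical candidate parses; the prior and likelihood numbers are placeholders standing in for BPL's learned prior over parts/relations and its stroke-renderer likelihood, not the actual model.

# Toy posterior over candidate latent programs (parses) for one image.
# All numbers are illustrative only.
candidates = {
    "two strokes, attached at start": {"prior": 0.50, "likelihood": 0.02},
    "two strokes, attached along":    {"prior": 0.30, "likelihood": 0.05},
    "one stroke, complex spline":     {"prior": 0.20, "likelihood": 0.01},
}

evidence = sum(c["prior"] * c["likelihood"] for c in candidates.values())  # P(I)

posterior = {name: c["prior"] * c["likelihood"] / evidence                 # Bayes' rule
             for name, c in candidates.items()}

for name, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"P(theta = {name!r} | I) = {p:.3f}")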
34. Principle 1: Compositionality
Building complex representations from simpler parts/primitives.
Parts: wheels, handlebars, posts, seats, motors
Relations: motor powers wheels; handlebars on post; wheels below platform; supports
One-shot learning: Segway
(e.g., Winston, 1975; Fodor, 1975; Marr & Nishihara, 1978; Biederman, 1987)
38. Principle 2: Causality
Representing hypothetical real-world processes that produce perceptual observations.
(analysis-by-synthesis; intuitive theories; concepts as causal explanations)
Same causal process, different examples.
Tree example: "Is it growing too close to my house?" "How will it grow if I trim it?"
Machine caption generation: "an airplane is parked on the tarmac at an airport"; "a group of people standing on top of a beach"; "…riding a horse on a road"
(e.g., computer science: Revow et al., 1996; Hinton & Nair, 2006; cognitive psychology and cognitive neuroscience: Freyd, 1983; Longcamp et al., 2003; James & Gauthier, 2006, 2009)
39. Principle 3: Learning-to-learn
Experience with previous concepts helps for learning new concepts.
(e.g., Harlow, 1949; Schyns, Goldstone, & Thibaut, 1998; Smith et al., 2002)
Statistics learned from background characters: stroke primitives; number of strokes per character (frequency histogram); stroke start positions by stroke order; global transformations; relations between strokes: independent (34%), attached at start (5%), attached at end (11%), attached along (50%).
40. Bayesian Program Learning (BPL)
Generative model spanning the type level (primitives, sub-parts, parts, object template with relations such as "attached along" and "attached at start") and the token level (exemplars, raw data).
Code: https://github.com/brendenlake/BPL
49. Novel, large-scale, reverse-engineering paradigm
Behavioral data (Omniglot) → computational model and behavioral experiment (drawing new examples) → simulated behavior vs. human behavior → visual Turing test (behavioral experiment), repeated for each alternative model.
Standard evaluation paradigm: computational model and behavioral experiment → compare behavior.
50. Experimental design
• Participants (judges) on Amazon Mechanical Turk (N = 147)
• Each judge saw behavior from only one algorithm
• Instructions: a computer program simulates how people draw a new example. Can you tell humans from machines?
• Pre-experiment comprehension tests
• 49 trials, with accuracy displayed after each block of 10
51. Visual Turing Test: generating new examples
Identification (ID) level: % of judges who correctly identified machine vs. human (error bars ± 1 SEM); 50% is indistinguishable.
Bayesian Program Learning models compared: BPL, BPL lesion (no compositionality), BPL lesion (no learning-to-learn).
53. One-shot classification performance
Error rate (%) for people and models, after all models are pre-trained on 30 alphabets of characters.
Bayesian Program Learning models: BPL, BPL lesion (no compositionality), BPL lesion (no learning-to-learn).
Deep neural networks (no causality): Deep Siamese Convnet (Koch et al., 2015), Hierarchical Deep, Deep Convnet.
55. More large-scale behavioral experiments to evaluate the BPL model
"Human or Machine?" judgments for: generating new examples (dynamic); generating new concepts (from type, at the alphabet level); generating new concepts (unconstrained, at the alphabet level).
56. Visual Turing Tests
Identification (ID) level (% of judges who correctly ID machine vs. human) across tasks: generating new exemplars, generating new exemplars (dynamic), generating new concepts (from type), generating new concepts (unconstrained); 50% is indistinguishable.
Bayesian Program Learning models: BPL, BPL lesion (no compositionality), BPL lesion (no learning-to-learn).
http://cims.nyu.edu/~brenden/supplemental/turingtests/turingtests.html
https://github.com/brendenlake/visual-turing-tests
57. Interim conclusions: Case study 1
• Simple visual concepts with real-world complexity.
• A computational model that embodies three principles (compositionality, causality, and learning-to-learn) supports rich concept learning from very few examples.
• In large-scale, multi-layered behavioral evaluations, the model's creative generalizations are difficult to distinguish from human behavior.
• Current and future directions include understanding developmental and neural mechanisms.
59. Outline
Concept learning
case study 1: handwritten characters
case study 2: recursive visual concepts
Question asking: active learning with rich questions
e.g., "Are any ships 3 tiles long?" "Are the blue ship and red ship parallel?"
With Steve Piantadosi
60. If the mind can infer compositional, causal programs from their outputs, what are the limits?
63. What is another example of the same species?
Causal knowledge influences perception and extrapolation.
Angle: 35 degrees
Start symbol: F+FG
F ➔ C0FF-[C1-F+F]+[C2+F-F]G
G ➔ C0FF+[C1+F]+[C3-F]
More similar according to the L-system program vs. more similar according to a deep neural network.
64. "A surface was infected with a new type of alien crystal. The crystal has been growing for some time."
Before infection / After infection
"What do you think the crystal will look like if you let it grow longer?" (choices A, B, C)
68. L-system (Lindenmayer, 1968): a compositional language for expressing causal processes
Angle: 120 degrees
Start symbol: F
F ➔ F-G+F+G-F
G ➔ GG
Legend: "+" right turn; "-" left turn; "F" go straight; "G" go straight
Iteration 1: F-G+F+G-F
Iteration 2: F-G+F+G-F-GG+F-G+F+G-F+GG-F-G+F+G-F
(Each concept can be viewed as an image, as dynamics, or as a symbolic string.)
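A small Python sketch of this expansion and its turtle-style interpretation, with printing standing in for actual drawing; it follows the rules and legend above.

def expand(axiom, rules, n_iter):
    """Rewrite every symbol in parallel n_iter times (Lindenmayer expansion)."""
    s = axiom
    for _ in range(n_iter):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

# The fractal concept from the slide.
rules = {"F": "F-G+F+G-F", "G": "GG"}
string = expand("F", rules, n_iter=2)
print(string)   # F-G+F+G-F-GG+F-G+F+G-F+GG-F-G+F+G-F

# Interpret the string as turtle commands (legend: F/G go straight, +/- turn by 120 degrees).
heading = 0
for ch in string:
    if ch in "FG":
        print(f"draw forward at heading {heading}")
    elif ch == "+":
        heading = (heading + 120) % 360   # right turn
    elif ch == "-":
        heading = (heading - 120) % 360   # left turn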
70. Experiment 1: Classification
• No feedback
• Six choices
• Distractors generated by replacing a rule (with a new rule from the grammar)
• Participants recruited on Amazon Mechanical Turk in the USA (N = 30)
• 24 different fractal concepts
Two conditions: "latent" (Before infection / After infection) and "stepwise" (Before infection / Step 1 / Step 2).
Visual Recursion Task (VRT) [Maurício Martins and colleagues]
75. Bayesian program learning (for recursive visual concepts)
Model: a meta-grammar M (context free) generates an L-system L; together with a depth d, an image renderer produces the image I.
Example L-system: Axiom = F; Angle = 120; F ➔ F-G+F+G-F; G ➔ GG; depth = 2.
Meta-grammar:
Start ➔ XYZ
X ➔ F | G
Z ➔ F | G | ''
Y ➔ F | G | YY | -Y+ | +Y- | ''
Angle ➔ 60 | 90 | 120
Axiom ➔ F
Inference (MCMC algorithm): P(L, d | I) ∝ P(I | L, d) P(L) P(d)
Note: the model has a key advantage: it is given exactly the right programming language. If people can infer programs like these, it is because their "language of thought" is general enough to represent these causal descriptions, and many more…
78. Why is the model better than people? A failure of search?
Human performance vs. chance, compared with a neural net and program induction (with limited MCMC).
Limited search (MCMC) predicts which concepts are easier to learn: r = 0.57 (easy vs. hard concepts).
Rational process models (Griffiths, Vul, Hamrick, Lieder, Goodman, etc.)
79. The "look for a smaller copy" heuristic
(Compare Iteration 2 with Iteration 3 for two example concepts.)
87. Experiment 2: Generation
• Participants recruited on Amazon Mechanical Turk in the USA (N = 30)
• 13 different fractal concepts (subset of the previous experiment)
• No feedback
Two conditions: "latent" (Before infection / After infection) and "stepwise" (Before infection / Step 1 / Step 2 / Step 3).
90. Results: Experiment 2 (Generation)
Measures: precisely right exemplar; individual decisions (clicks).
Baselines: random; deep neural network; always use "all" button; always use "none" button.
Significance: ** p < 0.001, * p < 0.05.
91. What do you think the crystal will look like if you let it grow longer?
(The number above each image indicates response frequency.)
95. Interim conclusions: Case study 2
• Explored a very difficult concept-learning task.
• A computational model that infers causal processes as compositions of primitives.
• People generalized in ways consistent with this model (and inconsistent with alternative models), despite the model's substantial advantages.
• Generation was aided by seeing a sequence of examples, rather than just one, a pattern the model does not fully explain.
98. Outline
Concept learning
case study 1: handwritten characters
case study 2: fractal concepts
Question asking: active learning with rich questions
e.g., "Are any ships 3 tiles long?" "Are the blue ship and red ship parallel?"
With Anselm Rothe and Todd Gureckis
Rothe, Lake, & Gureckis (2016). Proceedings of the 38th Annual Conference of the Cognitive Science Society. (More content in preparation.)
106. Active learning for people and machines
Rich, human questions: "How does it make sound?" "What is the difference between the second and the third?" "Which features are especially important?"
Simple, machine questions: "What is the category label of this object?" (asked again and again)
108. A testbed domain for question asking
We need a task that frees people to ask rich questions, yet is still amenable to formal (ideal observer) modeling.
(Battleship task: Gureckis & Markant, 2009; Markant & Gureckis, 2012, 2014)
109. Experiment 1: Free-form question asking
Ground truth: 3 ships (blue, purple, red); 3 possible sizes (2-4 tiles); ~1.6 million possible configurations of the hidden 6x6 gameboard (columns A-F, rows 1-6).
Goal: identify the hidden gameboard.
Phase 1: Sampling. Random samples turn the hidden gameboard into a partially revealed gameboard.
Phase 2: Question asking, e.g., "Is the red ship horizontal?"
Constraints: one-word answers; no combinations of questions.
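As a sanity check on the "~1.6 million possible configurations" figure, a brute-force Python sketch that enumerates placements of the three ships on the 6x6 grid; it assumes ships occupy contiguous horizontal or vertical runs of 2-4 tiles and may touch but not overlap.

from itertools import product

SIZE = 6  # 6x6 grid, columns A-F and rows 1-6

def placements():
    """All cell sets a single ship of length 2-4 can occupy."""
    out = []
    for length in (2, 3, 4):
        for r, c in product(range(SIZE), repeat=2):
            if c + length <= SIZE:                      # horizontal placement
                out.append(frozenset((r, c + i) for i in range(length)))
            if r + length <= SIZE:                      # vertical placement
                out.append(frozenset((r + i, c) for i in range(length)))
    return out

cells = placements()            # 144 possible placements per ship
count = 0
for blue in cells:
    for red in cells:
        if blue & red:
            continue
        occupied = blue | red
        for purple in cells:
            if not (occupied & purple):
                count += 1      # ships are distinguishable (blue/red/purple)

print(count)                    # on the order of the 1.6 million configurations quoted above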
112. Results: generated questions
18 different game scenarios; for each context (a partially revealed gameboard), participants produced questions such as:
At what location is the top left part of the purple ship?
What is the location of one purple tile?
Is the blue ship horizontal?
Is the red ship 2 tiles long?
Is the purple ship horizontal?
Is the red ship horizontal?
113. Extracting semantics from free-form questions
"How many squares long is the blue ship?" / "How long is the blue ship?" / "How many tiles is the blue ship?" / … → shipsize(blue)
"Is the blue ship horizontal?" / "Does the blue one go from left to right?" / … → horizontal(blue)
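The slides do not say how this mapping was carried out (it was plausibly hand-coded); purely as an illustration of the idea, a hypothetical pattern-based normalizer in Python:

import re

# Hypothetical keyword rules mapping paraphrases to canonical predicates.
RULES = [
    (re.compile(r"how (long|many (tiles|squares))", re.I), "shipsize({color})"),
    (re.compile(r"horizontal|left to right", re.I),        "horizontal({color})"),
]
COLORS = ("blue", "red", "purple")

def canonicalize(question):
    color = next((c for c in COLORS if c in question.lower()), "?")
    for pattern, template in RULES:
        if pattern.search(question):
            return template.format(color=color)
    return None  # unmatched wording would need manual coding

print(canonicalize("How many squares long is the blue ship?"))  # shipsize(blue)
print(canonicalize("Does the blue one go from left to right?")) # horizontal(blue)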
114. Questions generated by participants, organized by query type (N = frequency of each template):
N Location/standard queries
24 What color is at [row][column]?
24 Is there a ship at [row][column]?
31 Is there a [color incl water] tile at [row][column]?
Region queries
4 Is there any ship in row [row]?
9 Is there any part of the [color] ship in row [row]?
5 How many tiles in row [row] are occupied by ships?
1 Are there any ships in the bottom half of the grid?
10 Is there any ship in column [column]?
10 Is there any part of the [color] ship in column [column]?
3 Are all parts of the [color] ship in column [column]?
2 How many tiles in column [column] are occupied by ships?
1 Is any part of the [color] ship in the left half of the grid?
Ship size queries
185 How many tiles is the [color] ship?
71 Is the [color] ship [size] tiles long?
8 Is the [color] ship [size] or more tiles long?
5 How many ships are [size] tiles long?
8 Are any ships [size] tiles long?
2 Are all ships [size] tiles long?
2 Are all ships the same size?
2 Do the [color1] ship and the [color2] ship have the same size?
3 Is the [color1] ship longer than the [color2] ship?
3 How many tiles are occupied by ships?
Ship orientation queries
94 Is the [color] ship horizontal?
7 How many ships are horizontal?
3 Are there more horizontal ships than vertical ships?
1 Are all ships horizontal?
4 Are all ships vertical?
7 Are the [color1] ship and the [color2] ship parallel?
Adjacency queries
12 Do the [color1] ship and the [color2] ship touch?
6 Are any of the ships touching?
9 Does the [color] ship touch any other ship?
2 Does the [color] ship touch both other ships?
Demonstration queries
14 What is the location of one [color] tile?
28 At what location is the top left part of the [color] ship?
5 At what location is the bottom right part of the [color] ship?
118. Experiment 2: Evaluating questions for quality
Context: a partially revealed 6x6 gameboard (e.g., Trial 4).
Participants ranked a list of question options from best to worst, e.g.:
At what location is the top left part of the purple ship?
What is the location of one purple tile?
Is the blue ship horizontal?
Is the red ship 2 tiles long?
Is the purple ship horizontal?
Is the red ship horizontal?
120. How do people think of a question to ask?
What principles and representations can explain the productivity and creativity of question asking?
124. Question asking as program synthesis
Game primitives:
"Color at tile A1?" → (color A1)
"Size of the blue ship?" → (size Blue)
"Orientation of the blue ship?" → (orient Blue)
Primitive operators: (+ X X), (= X X), ...
Novel questions via compositionality:
"Are the blue ship and the red ship parallel?" → (= (orient Blue) (orient Red))
"What is the total size of all the ships?" → (+ (+ (size Blue) (size Red)) (size Purple))
Key ingredients: compositionality, learning-to-learn.
125. Compositionality in question structure
How many ships are three tiles long?
  (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple)))
Are any ships 3 tiles long?
  (> (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple))) 0)
Are all ships three tiles long?
  (= (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple))) 3)
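A sketch that evaluates those three compositions in Python against a hypothetical hidden board, with `size` as a plain dictionary lookup; it mirrors the map/sum/compare structure of the programs above rather than implementing the study's actual interpreter.

# Hypothetical hidden board: ship sizes only, which is all these questions need.
board = {"Blue": 3, "Red": 2, "Purple": 4}

def size(ship):
    return board[ship]

ships = ("Blue", "Red", "Purple")

# (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple)))
n_three_long = sum(size(x) == 3 for x in ships)      # "How many ships are three tiles long?"

# (> ... 0) and (= ... 3) wrap the same sub-program:
any_three_long = n_three_long > 0                    # "Are any ships 3 tiles long?"
all_three_long = n_three_long == len(ships)          # "Are all ships three tiles long?"

print(n_three_long, any_three_long, all_three_long)  # 1 True False for this board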
126. Questions as programs (all questions)
GROUP: QUESTION | FUNCTION | EXPRESSION
location:
What color is at A1? | location | (color A1)
Is there a ship at A1? | locationA | (not (= (color A1) Water))
Is there a blue tile at A1? | locationD | (= (color A1) Blue)
segmentation:
Is there any ship in row 1? | row | (> (+ (map (λ x (and (= (row x) 1) (not (= (color x) Water)))) (set A1 ... F6))) 0)
Is there any part of the blue ship in row 1? | rowD | (> (+ (map (λ x (and (= (row x) 1) (= (color x) Blue))) (set A1 ... F6))) 0)
Are all parts of the blue ship in row 1? | rowDL | (> (+ (map (λ x (and (= (row x) 1) (= (color x) Blue))) (set A1 ... F6))) 1)
How many tiles in row 1 are occupied by ships? | rowNA | (+ (map (λ x (and (= (row x) 1) (not (= (color x) Water)))) (set A1 ... F6)))
Are there any ships in the bottom half of the grid? | rowX2 | ...
Is there any ship in column 1? | col | (> (+ (map (λ x (and (= (col x) 1) (not (= (color x) Water)))) (set A1 ... F6))) 0)
Is there any part of the blue ship in column 1? | colD | (> (+ (map (λ x (and (= (col x) 1) (= (color x) Blue))) (set A1 ... F6))) 0)
Are all parts of the blue ship in column 1? | colDL | (> (+ (map (λ x (and (= (col x) 1) (= (color x) Blue))) (set A1 ... F6))) 1)
How many tiles in column 1 are occupied by ships? | colNA | (+ (map (λ x (and (= (col x) 1) (not (= (color x) Water)))) (set A1 ... F6)))
Is any part of the blue ship in the left half of the grid? | colX1 | ...
ship size:
How many tiles is the blue ship? | shipsize | (size Blue)
Is the blue ship 3 tiles long? | shipsizeD | (= (size Blue) 3)
Is the blue ship 3 or more tiles long? | shipsizeM | (or (= (size Blue) 3) (> (size Blue) 3))
How many ships are 3 tiles long? | shipsizeN | (+ (map (λ x (= (size x) 3)) (set Blue Red Purple)))
Are any ships 3 tiles long? | shipsizeDA | (> (+ (map (λ x (= (size x) 3)) (set Blue Red Purple))) 0)
Are all ships 3 tiles long? | shipsizeDL | (= (+ (map (λ x (= (size x) 3)) (set Blue Red Purple))) 3)
Are all ships the same size? | shipsizeL | (= (map (λ x (size x)) (set Blue Red Purple)))
Do the blue ship and the red ship have the same size? | shipsizeX1 | (= (size Blue) (size Red))
Is the blue ship longer than the red ship? | shipsizeX2 | (> (size Blue) (size Red))
How many tiles are occupied by ships? | totalshipsize | (+ (map (λ x (size x)) (set Blue Red Purple)))
orientation:
Is the blue ship horizontal? | horizontal | (= (orient Blue) H)
How many ships are horizontal? | horizontalN | (+ (map (λ x (= (orient x) H)) (set Blue Red Purple)))
Are there more horizontal ships than vertical ships? | horizontalM | (> (+ (map (λ x (= (orient x) H)) (set Blue Red Purple))) 1)
Are all ships horizontal? | horizontalL | (= (+ (map (λ x (= (orient x) H)) (set Blue Red Purple))) 3)
Are all ships vertical? | verticalL | (= (+ (map (λ x (= (orient x) H)) (set Blue Red Purple))) 0)
Are the blue ship and the red ship parallel? | parallel | (= (orient Blue) (orient Red))
touching:
Do the blue ship and the red ship touch? | touching | (touch Blue Red)
Are any of the ships touching? | touchingA | (or (touch Blue Red) (or (touch Blue Purple) (touch Red Purple)))
Does the blue ship touch any other ship? | touchingXA | (or (touch Blue Red) (touch Blue Purple))
Does the blue ship touch both other ships? | touchingX1 | (and (touch Blue Red) (touch Blue Purple))
demonstration:
What is the location of one blue tile? | demonstration | (draw (select (set A1 ... F6) Blue))*
At what location is the top left part of the blue ship? | topleft | (topleft Blue)
At what location is the bottom right part of the blue ship? | bottomright | (bottomright Blue)
131. Preliminary results for generating questions
Example for an ideal observer maximizing expected information gain (EIG):
(+ (+ (* 100 (size Red)) (* 10 (size Blue))) (size Purple))
Example generated question:
(topleft (map (lambda x (topleft (colortiles x))) (set Blue Red Purple)))
Learning a generative model of questions. Goal: predict human questions in novel scenarios.
Within the questions-as-programs framework, a log-linear model defines a distribution over questions, where the probability of a question is a function of its features f1(·), ..., fK(·). The features include the expected information gain (EIG) of a question in the current context, f1(·), as well as features that encode program length, answer type, and various grammatical operators. The energy of a question x is
E(x) = θ1 f1(x) + θ2 f2(x) + · · · + θK fK(x),    (1)
where θi is the weight assigned to feature fi, and the probability of a question is determined by its energy,
P(x; θ) = exp(−E(x)) / Σ_{x′ ∈ X} exp(−E(x′)),    (2)
so that high-energy questions have a lower probability.
x: question; f(·): features (EIG, length, etc.); θ: trainable parameters.
The model can be evaluated by whether it generates genuinely novel, human-like questions (high-probability questions that were not in the training set); judging the quality and creativity of these questions probes the productivity of the model and mirrors the human ability to ask questions. Future work will extend the framework beyond the Battleship task to other games and goal-directed tasks, for example "Hangman", where the player asks about the presence of letters (e.g., "Are there any 'A's?").
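A minimal Python sketch of the log-linear question model described above: each candidate question gets features (EIG, program length), an energy E(x) = θ·f(x), and a softmax probability in which, following the text, higher-energy questions are less likely. The feature values and weights here are placeholders, not fitted parameters.

import math

# Candidate question programs with placeholder features: (EIG in bits, program length).
questions = {
    "(size Blue)":                    (1.5, 2),
    "(= (orient Blue) (orient Red))": (0.9, 5),
    "(color A1)":                     (0.3, 2),
}
theta = (-2.0, 0.1)   # illustrative weights: high EIG lowers energy, length raises it

def energy(features):
    return sum(w * f for w, f in zip(theta, features))   # E(x) = theta . f(x)

# P(x; theta) = exp(-E(x)) / sum over x' of exp(-E(x'))
z = sum(math.exp(-energy(f)) for f in questions.values())
for q, f in questions.items():
    print(f"{q:32s} P = {math.exp(-energy(f)) / z:.2f}")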
132. compositionality, causality, and learning-to-learn for
building more human-like learning algorithms
133. Future directions: causal, compositional, and embodied concepts
Learning new handwritten characters; learning new gestures; learning new dance moves; learning new spoken words ("Ban Ki-moon", "Kofi Annan").
134. Developmental origins of one-shot learning
With Eliza Kosoy and Josh Tenenbaum
One-shot classification ("Which is another example?") and one-shot generation ("Draw another example"), e.g., Child 1 and Child 2. Are the two abilities linked?
135. Neural mechanisms of one-shot learning
With Shannon Tubridy, Jason Fuller, & Todd Gureckis
Can we decode letter identity from pre-motor cortex, especially for novel letters?
Overlapping representations for reading and writing letters in pre-motor cortex (e.g., Longcamp et al., 2003).
Stimuli: N A B K
136. Question asking in simple goal-directed dialogs
With Anselm Rothe and Todd Gureckis
Concierge: What type of food are you thinking?
Guest: I feel like Italian food.
Concierge: How large is your party?
Guest: Four people.
Concierge: <insert question here>
Candidate questions: "Are you willing to travel between 20 and 30 minutes for a four-star place?" "Should the average entree price be more or less than $20?" "Do you prefer a four-star Italian restaurant or a three-star French one?"
138. Learning to play new video games
How can people learn to play a new game so quickly? What are the underlying cognitive principles?
Compositionality, causality, learning-to-learn; but also intuitive physics and intuitive psychology.
(Lake et al., in press, Behavioral and Brain Sciences)
139. Conclusions
How can people learn such rich concepts from only one or a few examples?
• Bayesian Program Learning answers this question for a range of simple visual concepts.
• It embodies three principles (compositionality, causality, and learning-to-learn) that are likely to be important for rapid learning of rich concepts in many domains.
How can people synthesize novel questions when faced with uncertainty?
• Questions can be represented as programs and synthesized by exploiting compositionality and learning-to-learn.
140. Thank you
Collaborators: Josh Tenenbaum, Russ Salakhutdinov, Steve Piantadosi, Todd Gureckis, Anselm Rothe
Funding: Moore-Sloan Data Science Environment at NYU, NSF Graduate Research Fellowship, the Center for Minds, Brains and Machines (CBMM) funded by NSF STC award CCF-1231216, and ARO MURI contract W911NF-08-1-0242