1. The social Bayesian brain
Brain and Spine Institute - Paris
“Motivation, Brain & Behaviour” group
Jean Daunizeau
4. The Bayesian brain hypothesis
Cerebral information processing was optimized through natural selection [Friston
2005, Fiorillo 2010]
The brain uses a model of the world that (i) is optimal on average, but (ii) can induce
systematic biases [Weiss 2002, Alais 2004]
5. the social Bayesian brain
Theory of mind
= ability to attribute mental states (e.g., beliefs, desires, ...) to others [Premack 1978]
• teaching, persuading, deceiving, … → success in social interactions [Baron-Cohen 1999]
• develops very early in life [Surian 2007, Kovacs 2010]
• impairment → severe psychiatric disorders [Baron-Cohen 1985, Frith 1994, Brüne 2005]
• meta-cognitive insight: others’ behaviour is driven by their beliefs [Frith 2012]
The social “Bayesian brain” hypothesis
• the brain’s model of other brains should assume they are Bayesian too!
→ ToM = meta-Bayesian inference [Daunizeau 2010a, Daunizeau 2010b]
• although it is limited, is ToM optimal in an evolutionary sense?
6. Overview of the talk
Inverse Bayesian Decision Theory
Meta-Bayesian modelling of Theory of Mind
Limited ToM sophistication: did evolution fool us?
8. BDT: from observations to beliefs
• The “amount of information” is related to probability in the sense that the “vagueness”
of one’s belief is well characterized by the dispersion of a probability distribution
that measures how (subjectively) plausible any “state of affairs” is.
• The subjective plausibility of any “state of affairs” is not captured by the objective
frequency of the ensuing observable event. This is because beliefs are shaped by all sorts
of (implicit or explicit) prior assumptions that act as a (potentially very complex) filter
upon sensory observations.
$x$: hidden (unknown) state of the world
$u$: accessible observations or data
• likelihood: $p(u \mid x, m)$
• priors: $p(x \mid m)$
• posterior: $p(x \mid u, m) \propto p(u \mid x, m)\, p(x \mid m)$
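As a minimal sketch of the posterior update on this slide, the snippet below applies Bayes' rule to a discrete set of hidden states. The function name `posterior` and the numerical values are illustrative assumptions, not taken from the talk.

```python
def posterior(prior, likelihood):
    """Bayes' rule on a discrete set of states: p(x|u,m) is proportional to p(u|x,m) p(x|m)."""
    unnormalised = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(unnormalised)  # p(u|m), the normalising constant
    return [w / evidence for w in unnormalised]

# Hypothetical example with two hidden states.
prior = [0.8, 0.2]       # p(x | m): the first state is a priori more plausible
likelihood = [0.1, 0.6]  # p(u | x, m) for the observed u: the data favour the second state
post = posterior(prior, likelihood)
print(post)  # roughly [0.4, 0.6]
```

The same observation would yield a different belief under a different prior, which is the sense in which priors act as a filter upon sensory observations.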
9. BDT: from beliefs to decisions
• Bayesian Decision Theory (BDT) is concerned with how decisions are made or should be
made, in ambiguous or uncertain situations:
- normative: optimal/rational decision? (cf. statistical test)
- descriptive: what do people do? (cf. behavioural economics)
• BDT is bound to a perspective on preferences → utility theory:
- utility functions: surrogate for the task goal (reward contingent on a decision)
- subsumes game theory and control theory
$a$: alternative actions or decisions
$\ell(x, a)$: loss function
→ expected cost (posterior risk): $Q(u, a) = E\left[\ell(x, a) \mid u, m\right]$
→ BDT-optimal decision rule: $a^* = \arg\min_a Q(u, a)$
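The decision rule above can be sketched for discrete states and actions as follows. The state names, action names and loss values are hypothetical and only serve to show that the BDT-optimal action need not match the most plausible state.

```python
def posterior_risk(belief, loss, action):
    """Expected loss of an action under the posterior belief over hidden states."""
    return sum(p * loss(state, action) for state, p in belief.items())

def bdt_decision(belief, loss, actions):
    """Pick the action that minimises the posterior risk."""
    return min(actions, key=lambda a: posterior_risk(belief, loss, a))

# Hypothetical posterior belief and asymmetric loss table (assumed values).
belief = {"rain": 0.3, "dry": 0.7}
loss_table = {
    ("rain", "umbrella"): 0.0, ("dry", "umbrella"): 1.0,
    ("rain", "no umbrella"): 5.0, ("dry", "no umbrella"): 0.0,
}

def loss(state, action):
    return loss_table[(state, action)]

best = bdt_decision(belief, loss, ["umbrella", "no umbrella"])
print(best)
```

Here "dry" is the more plausible state, yet taking the umbrella has the lower expected cost because the loss is asymmetric.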
10. BDT example: speed-accuracy trade-off
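• loss = estimation error + estimation time:
$\ell(x, a, t) = (x - a)^2 + \kappa t$
• generative model:
$p(u_t \mid x, m) = N(x, 1)$
$p(x \mid m) = N(\mu_0, \sigma_0^2)$
• posterior belief:
$p(x \mid u_{1:t}, m) = N(\mu_t, \sigma_t^2)$, with $\sigma_t^2 = \left(1/\sigma_0^2 + t\right)^{-1}$ and $\mu_t = \sigma_t^2 \left(\mu_0/\sigma_0^2 + \sum_{t'=1}^{t} u_{t'}\right)$
• expected cost:
$Q(u, a, t) = (\mu_t - a)^2 + \sigma_t^2 + \kappa t$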
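A minimal sketch of this example, assuming unit-variance Gaussian observations of x, a Gaussian prior with mean `mu0` and variance `var0`, and a loss equal to the squared estimation error plus a linear time cost `kappa * t`. All function and variable names are illustrative.

```python
def posterior_moments(observations, mu0, var0):
    """Posterior mean and variance of x after t unit-variance Gaussian observations."""
    t = len(observations)
    var_t = 1.0 / (1.0 / var0 + t)                 # precisions add up
    mu_t = var_t * (mu0 / var0 + sum(observations))
    return mu_t, var_t

def expected_cost(mu_t, var_t, a, t, kappa):
    """Posterior expectation of (x - a)^2 + kappa * t."""
    return (mu_t - a) ** 2 + var_t + kappa * t

observations = [1.0, 2.0]  # hypothetical data
mu_t, var_t = posterior_moments(observations, mu0=0.0, var0=1.0)
cost = expected_cost(mu_t, var_t, a=mu_t, t=len(observations), kappa=0.1)
print(mu_t, var_t, cost)
```

Reporting the posterior mean (a = mu_t) removes the first term of the expected cost, so what remains is the trade-off between posterior variance, which shrinks with t, and the time cost, which grows with t.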
11. BDT example: speed-accuracy trade-off (2)
[Figure: posterior variance (expected inaccuracy), time cost and expected loss as functions of decision time, for K=1 and K=2. The optimal decision time t* is marked for each K; t* (K=2) lies before t* (K=1).]
12. inverse BDT: the complete class theorem
There (almost) always exists a pair of prior belief $p(x \mid m)$
and loss function $\ell(x, a)$
such that any observed decision can be interpreted as BDT-optimal:
$t^* = \arg\min_t Q(u, a, t)$
→ interpreting BDT-optimal responses (e.g., decision times): weak duality
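One way to read this duality, sketched under assumptions: with unit-variance Gaussian observations, prior variance `var0`, time cost `kappa` and the posterior mean as estimate, the expected cost is 1/(1/var0 + t) + kappa * t. Treating t as continuous, setting its derivative to zero gives t* = 1/sqrt(kappa) - 1/var0 (clipped at zero). The names below are illustrative.

```python
import math

def optimal_decision_time(var0, kappa):
    """Minimiser over t >= 0 of 1 / (1 / var0 + t) + kappa * t (continuous relaxation)."""
    return max(0.0, 1.0 / math.sqrt(kappa) - 1.0 / var0)

# Two different (prior, cost-of-time) pairs that predict the same decision time.
t_vague_prior_costly_time = optimal_decision_time(var0=1.0, kappa=0.25)
t_sharp_prior_cheap_time = optimal_decision_time(var0=0.5, kappa=1.0 / 9.0)
print(t_vague_prior_costly_time, t_sharp_prior_cheap_time)
```

Both calls return a decision time of about 1, so an observed decision time alone cannot tell a more precise prior apart from a lower cost of time.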
13. Meta-Bayesian model
Perceptual model $m^{(1)}$:
- perceptual priors: $p(x_t \mid x_{t-1}, \vartheta, m^{(1)})$
- perceptual likelihood: $p(u_t \mid x_t, \vartheta, m^{(1)})$
Free-energy maximization (optimal learner assumption):
$\lambda_t = f(\lambda_{t-1}, u_t, \vartheta)$, with $f: \lambda_{t-1} \mapsto \arg\max_{\lambda_t} F(u, \lambda)$
- approximate posterior: $q(x_t \mid \lambda_t)$
Posterior risk minimization (optimal decider assumption):
$g = \arg\min_a Q(u, a)$
Response model $m^{(2)}$:
- priors: $p(\vartheta, \theta \mid m^{(2)})$
- likelihood: $p(y \mid \theta, \lambda(\vartheta), m^{(2)})$
Daunizeau et al., 2010a
14. inverse BDT: the meta-Bayesian approach
[Figure: the meta-Bayesian model m(2) applied to a tennis example. Inputs u: ball position; responses y: reaction time. Panels show Andy Murray’s belief and the meta-Bayesian estimate of Andy Murray’s belief, with prior uncertainty and cost of time as the estimated quantities.]
15. Overview of the talk
Inverse Bayesian Decision Theory
Meta-Bayesian modelling of Theory of Mind
Limited ToM sophistication: did evolution fool us?
16. Recursivity and limited sophistication
• Operational definition of ToM: taking the intentional stance [Dennett 1996]
– Infer beliefs and preferences from observed behaviour
– Use them to understand/predict behaviour
• In reciprocal/repeated social interactions, ToM is potentially recursive [Yoshida 2008]
– « I think that you think », « I think that you think that I think », …
– ToM sophistication levels induce different learning rules / behavioural policies
• Questions:
– Does the meta-Bayesian approach realistically capture people’s ToM?
– Can people appeal to these sophistication levels (e.g. by pure reinforcement) or are
these (social) priors that are set by the context?
– What is the inter-individual variability of ToM sophistication?
17. 0-ToM
0-ToM does not apply the intentional stance
→ 0-ToM is a Bayesian agent with:
- beliefs (about non-intentional contingencies)
- preferences
« I think that you will hide behind the tree »
18. 1-ToM
1-ToM learns how the other learns
→ 1-ToM is a meta-Bayesian agent with:
- beliefs (about other’s beliefs and preferences)
- preferences
« I think that you think that I will hide behind the tree »
19. 2-ToM
2-ToM learns how the other learns and her ToM sophistication level
→ 2-ToM is a meta-Bayesian agent with:
- beliefs (about other’s beliefs – about one’s beliefs - and preferences)
- preferences
« I think that you think that I think, … »
20. Recursive meta-Bayesian modelling
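• k-ToM learns how the other learns and her ToM sophistication level:
$\lambda_\tau^{(k)} = f^{(k)}\left(\lambda_{\tau-1}^{(k)}, a_\tau, \theta_1^{(k)}\right)$
• k-ToM acts according to her beliefs and preferences:
$p\left(a_{\tau+1} \mid \lambda_\tau^{(k)}\right) \propto \exp\left(-Q\left(\lambda_\tau^{(k)}, a_{\tau+1}, \theta_2^{(k)}\right)\right)$
• This induces a likelihood for a k+1-ToM observer:
$p\left(a_{\tau+1} \mid \lambda_\tau^{(1,\dots,k)}, \theta, m_{k+1}\right)$, which combines the action likelihoods $p\left(a_{\tau+1} \mid \lambda_\tau^{(k')}\right)$ of levels $k' = 0, \dots, k$
• Deriving the ensuing Free-Energy yields the k+1-ToM learning rule:
$\lambda_\tau^{(k+1)} = f^{(k+1)}\left(\lambda_{\tau-1}^{(k+1)}, a_\tau, \theta_1^{(k+1)}\right)$, with $f: \lambda_\tau^{(k+1)} = \arg\max_{\lambda^{(k+1)}} F^{(k+1)}$
$F^{(k+1)} = \left\langle \ln p\left(a_{\tau+1} \mid \lambda^{(1,\dots,k)}, \theta, m_{k+1}\right) + \ln p\left(\lambda^{(1,\dots,k)}, \theta \mid m_{k+1}\right) - \ln q\left(\lambda^{(1,\dots,k)}, \theta\right) \right\rangle_q$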
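To make the "acts according to her beliefs and preferences" step concrete, here is a generic softmax action rule for a two-option « hide and seek » game. This is a sketch under assumptions, not the talk's exact parameterisation: the payoff values, the `temperature` parameter and both function names are hypothetical.

```python
import math

def softmax_policy(expected_payoffs, temperature):
    """Action probabilities proportional to exp(expected payoff / temperature)."""
    top = max(expected_payoffs)  # subtract the max for numerical stability
    weights = [math.exp((v - top) / temperature) for v in expected_payoffs]
    total = sum(weights)
    return [w / total for w in weights]

def seeker_expected_payoffs(p_hider_picks_first):
    """Assumed payoffs for the seeker: +1 if she finds the hider, -1 otherwise."""
    p = p_hider_picks_first
    return [p * 1 + (1 - p) * -1, (1 - p) * 1 + p * -1]

# The seeker believes the hider picks the first option with probability 0.7.
policy = softmax_policy(seeker_expected_payoffs(0.7), temperature=0.5)
print(policy)
```

An observer who knows this rule can score any observed action under any candidate belief of the seeker, which is what turns the seeker's policy into a likelihood for the observer.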
23. Behavioural task design
[Figure: trial structure under two framings: social framing (game « hide and seek », "You are playing against Player 1") and non-social framing (casino gambling task). Each trial of a session shows the alternative options 1 and 2 (1.2 sec), then the subject’s choice, then feedback (1 sec), e.g. "Well done! You win!".]
24. the social framing effect: group results (N=26)
group-average performance (cumulated earnings after 60 game repetitions)
[Figure: performance against random biased, 0-ToM, 1-ToM and 2-ToM opponents, in the non-social and social framings.]
inter-individual variability in cognitive skills: regression on performance against 1-ToM
[Figure: regression results. Labels: IMT, false belief, Frith-Happe, EQ, WSCT, Go-NoGo, 3-back, mean.]
25. Volterra decompositions: group results
$p(a_t = 1) = s\left(\omega^{0} + \sum_{k} \sum_{\tau} \omega_\tau^{(k)} u_{t-\tau}^{(k)} + \dots\right)$
[Figure: Volterra 1st-order kernels (Volterra weight as a function of lag, 1 to 8) for own action and opponent's action, together with the social minus non-social (S-NS) difference in Volterra weight. Legend: chance level, 0-ToM (acc=86%), 1-ToM (acc=75%), 2-ToM (acc=74%).]
26. Group-level Bayesian model comparison (I)
[Figure: log model evidences (group average): free energies of the Volterra, Nash, WSLS, RL, 0-ToM, 1-ToM, 2-ToM and 3-ToM models, in the non-social and social framings.]
[Figure: model fit, <g(x)|y,m> (modelled choices) versus y (observed choices). 2-ToM’s best fit: subject #7 against 0-ToM (acc=79%). 2-ToM’s worst fit: subject #21 against 0-ToM (acc=43%).]
27. Group-level Bayesian model comparison (II)
[Figure: model families: exceedance probabilities of the no-ToM and ToM families, in the non-social and social framings.]
[Figure: estimated model frequencies (social condition) for the no-ToM family (Volterra, Nash, WSLS, RL) and the ToM family (0-ToM, 1-ToM, 2-ToM, 3-ToM).]
30. Rhesus-macaques: group-results (N=4)
[Figure: log model evidences (group average): free energies of the no-ToM family (Volterra, Nash, WSLS, RL) and the ToM family (0-ToM, 1-ToM, 2-ToM).]
[Figure: model families: exceedance probabilities of the no-ToM and ToM families.]
31. Overview of the talk
Inverse Bayesian Decision Theory
Meta-Bayesian modelling of Theory of Mind
Limited ToM sophistication: did evolution fool us?
33. Being right is as good as being smart
1-ToM predicts 0-ToM
« hide and seek »
« battle of the sexes »
0-ToM predicts 1-ToM
34. Biases in ToM induction
2-ToM vs 2-ToM
3-ToM vs 3-ToM
4-ToM vs 4-ToM
35. Evolutionary game theory
Can we explain the emergence of the natural bound on ToM sophistication?
→ Average adaptive fitness:
• is a function of the behavioural performance, relative to other phenotypes
• depends upon the frequency of other phenotypes within the population
$s_k$: frequency of phenotype k within the population
$\eta_i$: frequency of game i
$Q^{(i)}(\tau)$: expected payoff matrix of game i at round τ
→ Replicator dynamics [Maynard-Smith 1982, Hofbauer 1998]:
$\frac{ds}{dt} = \mathrm{Diag}(s) \left( \sum_i \eta_i Q^{(i)} s - \sum_i \eta_i s^T Q^{(i)} s \right)$
evolutionarily stable states:
$s_\infty = \lim_{t \to \infty} s(t)$
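The replicator dynamics can be sketched with a simple Euler integration. The payoff matrix below is a toy two-phenotype game chosen for illustration, not one of the talk's games, and the step size, number of steps and function names are assumptions.

```python
def mix_games(weights, games):
    """Frequency-weighted sum of the payoff matrices of several games."""
    n = len(games[0])
    return [[sum(w * g[k][j] for w, g in zip(weights, games)) for j in range(n)]
            for k in range(n)]

def replicator_step(s, payoff, dt):
    """One Euler step of ds_k/dt = s_k * (fitness_k - mean fitness)."""
    n = len(s)
    fitness = [sum(payoff[k][j] * s[j] for j in range(n)) for k in range(n)]
    mean_fitness = sum(s[k] * fitness[k] for k in range(n))
    updated = [s[k] + dt * s[k] * (fitness[k] - mean_fitness) for k in range(n)]
    total = sum(updated)  # renormalise to guard against numerical drift
    return [x / total for x in updated]

def simulate(s0, payoff, dt=0.01, steps=5000):
    """Iterate the dynamics from the initial phenotype frequencies s0."""
    s = list(s0)
    for _ in range(steps):
        s = replicator_step(s, payoff, dt)
    return s

# Toy game (assumed payoffs) in which each phenotype does best when it is rare.
toy_game = [[0.0, 3.0], [1.0, 2.0]]
payoff = mix_games([1.0], [toy_game])
steady_state = simulate([0.9, 0.1], payoff)
print(steady_state)
```

In this toy game the frequencies settle close to an even split, a mixed steady state in which neither phenotype takes over the population.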
36. Replicator dynamics and ESS
[Figure: EGT replicator dynamics for « hide and seek » and « battle of the sexes »: traits' frequencies (k=0 to k=4) as a function of evolutionary time (0 to 1000), with the type and frequency of EGT steady states.]
41. The social Bayesian brain: summary
• Meta-Bayesian inference
- the brain’s model of other brains assumes they are Bayesian too!
- there is an inevitable inflation of the uncertainty of others’ beliefs
• Theory of mind:
- reciprocal social interaction → recursive beliefs
- Humans → social framing effect (mentalize or be fooled)
- Macaque monkeys → no intentional stance (but training?)
• Evolution of ToM:
- cooperation+learning during evolution → natural bounds to ToM
sophistication (“being right is as good as being smart”)
- evolutionarily stable ToM distribution = mixed!
42. Dealing with uncertain motives: advice taking task
[Figure: advice taking task. Trial elements: probabilistic cue, informed advice, player’s decision, outcome, progress bar. Gold target = 20 CHF; Silver target = 10 CHF.]
Diaconescu et al., in prep.