Modeling decision making deficits in frontostriatal disorders using reinforcement learning
1. Modeling decision making deficits in frontostriatal disorders
Michael Frank
Laboratory for Neural Computation and Cognition
Brown University
4. Computational Psychiatry and...
Neurogenocomputomics
• Many disorders broadly characterized by changes in motivation
• Several fronto-striatal disorders have substantial genetic heritability
• Individual differences in reinforcement learning?
• But... Candidate gene effects are generally small
• Which genes? Which task? Which measure?
• Need theoretical model! (and converging pharmacology/imaging)
Frank & Fossella, 2011; Maia & Frank, 2011; Huys et al, 2011
8. D1 effects on striatal learning: Positive PE
Three factor learning: presynaptic, postsynaptic and DA
9. D2 effects on striatal learning: Negative PE
Frank 2005
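The three-factor idea on slides 8 and 9 can be sketched in a few lines. This is a simplified illustration, not the network model from Frank (2005): the function name, the symmetric Go/NoGo updates, the learning rate, and the weight clipping are all assumptions made for the example.

```python
def three_factor_update(w_go, w_nogo, pre, post_go, post_nogo, da, lr=0.1):
    """Return updated (w_go, w_nogo) after one outcome (hypothetical helper).

    da > 0 stands for a dopamine burst (positive prediction error),
    da < 0 stands for a dopamine dip (negative prediction error).
    """
    # Three factors: presynaptic activity, postsynaptic activity, dopamine.
    # D1 (Go) synapses strengthen with bursts and weaken with dips.
    w_go = w_go + lr * pre * post_go * da
    # D2 (NoGo) synapses change in the opposite direction (assumed symmetric).
    w_nogo = w_nogo - lr * pre * post_nogo * da

    # Clip to [0, 1] so weights stay interpretable as strengths (assumption).
    def clip(w):
        return min(1.0, max(0.0, w))

    return clip(w_go), clip(w_nogo)


# A burst favors Go; a dip favors NoGo.
after_burst = three_factor_update(0.5, 0.5, pre=1.0, post_go=1.0, post_nogo=1.0, da=1.0)
after_dip = three_factor_update(0.5, 0.5, pre=1.0, post_go=1.0, post_nogo=1.0, da=-1.0)
```

With these toy values, a burst raises the Go weight and lowers the NoGo weight, and a dip does the reverse.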
10. Neural model of basal ganglia and dopamine
Integrates a wide range of data into a single coherent framework
Separate Go and NoGo populations integrate statistics of reinforcement
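[Figure: basal ganglia network diagram. Layers: Input, preSMA, Striatum (Go and NoGo populations), GPe, STN, GPi/SNr, Thalamus, SNc.]
Unit equations shown on the slide:
$y_j \approx \frac{\gamma [V_m - \Theta]_+}{\gamma [V_m - \Theta]_+ + 1}$
$c \frac{dV_m}{dt} = g_e \bar{g}_e [E_e - V_m] + g_i \bar{g}_i [E_i - V_m] + g_l \bar{g}_l [E_l - V_m] + \ldots$
$\mathrm{net} = g_e \approx \langle x_i w_{ij} \rangle + \frac{\beta}{N}$
$\Delta w_{ij} \approx (x_i^{p} y_j^{p}) - (x_i^{t} y_j^{t})$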
Frank, 2005, 2006 J Cog Neurosci, Neural Networks
11. Maximizing Reward via RT Adaptation:
Temporal Utility Integration Task
[Figure, three panels, each plotted against response time (0 to 5000 ms) for the four conditions CEV, DEV, IEV and CEVR: Reward Frequency (probability, 0 to 1.0); Reward Magnitude (# points gained, 0 to 350); Expected Value (frequency × magnitude, 0 to 60).]
12. RL model: Fit to data across all subjects
RL model: adjust RTs as a function of reward prediction errors
Frank, Doll, Oas-Terpstra & Moreno (2009, Nature Neuroscience)
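As a rough illustration of RT adaptation driven by reward prediction errors, the sketch below speeds responses after positive prediction errors and slows them after negative ones. It is a deliberately reduced toy, not the fitted model from the paper: the function name, the parameter names (`alpha_v`, `alpha_go`, `alpha_nogo`), the baseline RT, and the fixed outcome sequence are assumptions for the example.

```python
def adapt_rt(outcomes, baseline_rt=2500.0, alpha_v=0.1, alpha_go=0.5, alpha_nogo=0.5):
    """Return the RT (ms) predicted on each trial for a list of outcomes.

    Hypothetical simplification: outcomes are supplied directly rather than
    generated by the task as a function of the response time.
    """
    value, go, nogo = 0.0, 0.0, 0.0
    rts = []
    for reward in outcomes:
        # Accumulated Go speeds responding; accumulated NoGo slows it.
        rts.append(baseline_rt - go + nogo)
        # Reward prediction error relative to the running value estimate.
        delta = reward - value
        value += alpha_v * delta
        if delta > 0:
            go += alpha_go * delta        # better than expected: speed up
        else:
            nogo += alpha_nogo * -delta   # worse than expected: slow down
    return rts


rts = adapt_rt([100, 0, 0])
```

Here the first outcome is better than expected, so the second RT is faster than baseline; the following omission is worse than expected, so the third RT slows slightly.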
15. Exploration vs Exploitation
• By exploiting learned strategies, we know we can get a certain amount
of reward
• But we don’t know how good it can get. ⇒ Need to explore
• Theory: Explore based on relative uncertainty about whether other
actions might yield better outcomes than status quo (Dayan & Sejnowski 96)
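One way to make "relative uncertainty" concrete is to track a belief distribution over reward probability for each response class and explore toward whichever is more uncertain. The sketch below assumes Beta beliefs over "slow" and "fast" responses and a scaling parameter `epsilon`; these names, the Beta parameterization, and the sign convention are illustrative assumptions, not details given on the slides.

```python
import math


def beta_std(a, b):
    """Standard deviation of a Beta(a, b) distribution (closed form)."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


def exploration_bonus(slow, fast, epsilon):
    """RT shift (ms) toward the more uncertain option (hypothetical helper).

    slow, fast: (a, b) Beta parameters summarizing past outcomes for
    slower and faster responses. Positive values push toward slower RTs.
    """
    return epsilon * (beta_std(*slow) - beta_std(*fast))


# Slow responses barely sampled (Beta(1, 1)); fast responses well sampled.
bonus = exploration_bonus(slow=(1, 1), fast=(5, 5), epsilon=1000.0)
```

With these toy counts the bonus is positive, so the agent is nudged to try slower responses, whose outcomes it knows less about. A negative `epsilon` would instead describe avoidance of the more uncertain option.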
22. EEG reveals temporal dynamics
Relative uncertainty represented prior to choice, and more so in exploratory trials
Cavanagh, Cohen, Figueroa & Frank, under review
23. Negative symptoms in schizophrenia:
Uncertainty-Based Exploration
[Figure, two panels. Uncertainty-driven exploration: ε(uncert) (×1e4) for SZ and CN groups, group difference marked **. Anhedonia & Exploration: ε (×1e4) plotted against Global Anhedonia score (0 to 4), r = -.44, p = .002.]
• Anhedonia = behavioral component of reward seeking (e.g., initiating
social/recreational activities), not capacity to experience pleasure
• Anhedonia related to exploration and not learning from reward prediction errors
Strauss et al, 2011, Biological Psychiatry
24. Obsessive Compulsive Disorder: Aversion to Uncertainty
[Figure: uncertainty-driven exploration parameter ε (×1e4) for CN and OCD groups, shown separately for gains and losses.]
preliminary data, N=17 per group
with Mascha van ’t Wout, Ben Greenberg, Steve Rasmussen
25. Summary
• Dopamine modulates reinforcement learning and choice based on
positive and negative outcomes: patients, pharmacology, genetics,
imaging
• Prefrontal cortex tracks outcome uncertainty so as to reduce it
• Disruption of these mechanisms is associated with fronto-striatal
disorders: Parkinson’s, schizophrenia, OCD
• Models integrate across multiple levels of analysis, from
neural mechanism to abstract computation (see Thomas Wiecki’s
demonstration tomorrow!).
26. Thanks To...
Bradley Doll
Christina Figueroa
Jim Cavanagh
David Badre
Jeff Cockburn
Anne Collins
Thomas Wiecki
Jim Gold
Kent Hutchison
Mascha van ’t Wout
Nicole Long
Mike Cohen
Ahmed Moustafa
Scott Sherman
Lab for Neural Computation and Cognition
The patients