Neurobiological Models of
Instrumental Conditioning
Matthew J. Crossley
Department of Psychological and Brain Sciences	

University of California, Santa Barbara, 93106
I. A neurobiological model of appetitive instrumental
conditioning	

II. Applications of model	

Fast Reacquisition	

Partial Reinforcement Extinction	

Renewal	

III. Temporal-Difference model of DA
Outline
Why Instrumental Conditioning?
• The Ashby lab bread and butter is category
learning	

• Information-Integration category-learning is a
procedural skill	

• Appetitive Instrumental Conditioning is a
procedural skill
• Learned incrementally from feedback	

• Model-free reinforcement learning	

• Habitual control	

• E.g., riding a bike or playing an instrument	

• E.g., radiology
Procedural Skills
Procedural Skills
Where are the tumors?
Procedural Skills
TUMORS!
Procedural Skills Depend on the
Basal Ganglia
• Basal ganglia are a
collection of subcortical
nuclei	

• Interconnect with
cortex in well-defined
circuits	

• Striatum is a major
input structure
Cortex Excites the Striatum
Striatum Inhibits the GPi
GPi Inhibits the Thalamus
High baseline firing
rate
Striatum Disinhibits the Thalamus
Thalamus Excites Cortex
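The circuit named on the preceding slides can be read as a chain of signed connections. The sketch below is only an illustration of that reading: the `PATHWAY` list and the `net_sign` helper are hypothetical names, and multiplying signs ignores firing rates and timing. It shows why two inhibitory links in series (striatum to GPi, GPi to thalamus) amount to disinhibition.

```python
# Signs of the connections listed on the slides:
# +1 = excitatory, -1 = inhibitory.
PATHWAY = [
    ("Cortex", "Striatum", +1),
    ("Striatum", "GPi", -1),
    ("GPi", "Thalamus", -1),
    ("Thalamus", "Cortex", +1),
]


def net_sign(pathway, source, target):
    """Multiply connection signs along the chain from source to target."""
    sign, node = 1, source
    for pre, post, s in pathway:
        if pre == node:
            sign *= s
            node = post
            if node == target:
                return sign
    raise ValueError(f"no path from {source} to {target}")


# Striatal activity has a net positive effect on the thalamus (disinhibition).
print(net_sign(PATHWAY, "Striatum", "Thalamus"))
```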
Dopamine Modulates Activity
Procedural Learning Depends on the
Striatum
• Single-cell recordings	

Carelli, Wolske, & West, 1997; Merchant, Zainos, Hernández, Salinas, & Romo,
1997; Romo, Merchant, Ruiz, Crespo, & Zainos, 1995	

• Lesion studies	

Eacott & Gaffan, 1991; Gaffan & Eacott, 1995; Gaffan & Harrison, 1987;
McDonald & White, 1993, 1994; Packard, Hirsch, & White, 1989; Packard &
McGaugh, 1992	

• Neuropsychological patient studies	

Filoteo, Maddox, & Davis, 2001; Filoteo, Maddox, Salmon, & Song, 2005;
Knowlton, Mangels, & Squire, 1996	

• Neuroimaging	

Nomura et al., 2007; Seger & Cincotta, 2002; Waldschmidt & Ashby, 2011
Striatal Neurons
• Medium spiny projection neurons (MSNs): 96%
• GABA interneurons: 2%
• TANs (cholinergic interneurons): 2%
The TANs are of Particular Interest
• Tonically active; pause in response to excitatory input	

• Presynaptically inhibit cortical input to MSNs	

• Get major input from CM-Pf (thalamus)	

• Learn to pause to stimuli that predict reward
(requires dopamine)
I. A neurobiological model of appetitive instrumental
conditioning	

II. Applications of model	

Fast Reacquisition	

Partial Reinforcement Extinction	

Renewal	

III. Temporal-Difference model of DA
Outline
Model Architecture
Ashby and Crossley (2011)
Learning Occurs at the CTX-MSN
Synapse and at Pf-TAN Synapses
Pf-TAN
Synapse
CTX-MSN
Synapse
Ashby and Crossley (2011)
Network Dynamics: Early Trial
[Figure sequence: network activity on an early trial, ending with the SMA unit]
Response and Feedback
• Model responds if SMA
crosses threshold	

• Model is given feedback after
every trial
CTX-MSN Synaptic Modification
Requires a TANs Pause
• Synaptic Strengthening:	

- Strong presynaptic
activation	

- Strong postsynaptic
activation	
- Elevated DA levels
• Synaptic Weakening:	

- Strong presynaptic
activation	

- Strong postsynaptic
activation
- Depressed DA levels
Arbuthnott, Ingham, & Wickens (2000)	

Calabresi, Pisani, Mercuri, & Bernardi (1996)	

Reynolds & Wickens (2002)
Synaptic Plasticity in the Striatum
Depends on Dopamine (DA)
• Synaptic Strengthening:	

- Strong presynaptic
activation	

- Strong postsynaptic
activation	

- Elevated DA levels
• Synaptic Weakening:	

- Strong presynaptic
activation	

- Strong postsynaptic
activation	

- Depressed DA levels
Arbuthnott, Ingham, & Wickens (2000)	

Calabresi, Pisani, Mercuri, & Bernardi (1996)	

Reynolds & Wickens (2002)
DA Encodes Reward Prediction Error
(RPE)
• Elevated after unexpected
reward	

• Depressed after unexpected
no-reward	

• Stays at baseline when the
expected outcome occurs
Bayer & Glimcher (2005)
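Computing RPE
Obtained feedback on trial n:
$$R_n = \begin{cases} 1 & \text{if positive feedback} \\ 0 & \text{otherwise} \end{cases}$$
Predicted feedback on trial n:
$$P_n = P_{n-1} + \alpha\,(R_{n-1} - P_{n-1})$$
(α: learning rate)
RPE on trial n:
$$\mathrm{RPE}(n) = R_n - P_n$$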
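As a rough illustration of the trial-by-trial computation on this slide, the sketch below tracks the predicted feedback and returns the RPE on each trial. The function name `run_rpe`, the learning-rate value `alpha=0.1`, and the starting prediction `p0=0.0` are assumptions chosen for the example, not values taken from the talk.

```python
def run_rpe(feedback, alpha=0.1, p0=0.0):
    """Return RPE(n) = R_n - P_n for each trial in a feedback sequence.

    feedback: iterable of R_n values (1 = positive feedback, 0 = otherwise).
    alpha:    assumed learning rate for the prediction update.
    p0:       assumed prediction before the first trial.
    """
    rpes = []
    p = p0
    for r in feedback:
        rpes.append(r - p)           # RPE on this trial
        p = p + alpha * (r - p)      # prediction used on the next trial
    return rpes


# Repeated reward shrinks the RPE; an omitted reward then gives a negative RPE.
print(run_rpe([1, 1, 0], alpha=0.5))
```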
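DA Released on Trial n
$$\mathrm{DA}(n) = \begin{cases} 1 & \text{if } \mathrm{RPE} > 1 \\ 0.8\,\mathrm{RPE} + 0.2 & \text{if } -0.25 < \mathrm{RPE} \le 1 \\ 0 & \text{if } \mathrm{RPE} < -0.25 \end{cases}$$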
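The piecewise mapping from RPE to dopamine can be written as a small function. This is a sketch under one assumption: the lower threshold is taken to be -0.25, which is the value that makes the function continuous (0.8 × -0.25 + 0.2 = 0). The name `dopamine` is hypothetical.

```python
def dopamine(rpe):
    """Map a reward prediction error to a DA level in [0, 1].

    Baseline (RPE = 0) is 0.2; DA saturates at 1 and bottoms out at 0.
    The lower threshold of -0.25 is an assumption based on continuity.
    """
    if rpe > 1.0:
        return 1.0
    if rpe > -0.25:
        return 0.8 * rpe + 0.2
    return 0.0


for rpe in (-1.0, -0.25, 0.0, 0.5, 1.0):
    print(rpe, dopamine(rpe))
```

Because the baseline sits at 0.2 rather than 0.5, there is more room for DA to rise above baseline than to fall below it.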
Updating Synapses in the Model
$$
\begin{aligned}
w_{K,J}(n+1) = w_{K,J}(n)
&+ \alpha_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D(n) - D_{base}]^+\,[1 - w_{K,J}(n)] \\
&- \beta_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D_{base} - D(n)]^+\,w_{K,J}(n) \\
&- \gamma_w I_K(n)\,[\theta_{NMDA} - S_J(n)]^+\,[S_J(n) - \theta_{AMPA}]^+\,w_{K,J}(n).
\end{aligned}
$$
Synaptic Strengthening: first line
Synaptic Weakening: second and third lines
Highlighted: Presynaptic Activity, I_K(n)
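A single application of the weight update can be sketched as follows. The reading assumed here is the three-term rule with rectified brackets: one strengthening term gated by DA above baseline, one weakening term gated by DA below baseline, and one weakening term for postsynaptic activation between the AMPA and NMDA thresholds. All parameter values and the names `pos` and `update_weight` are hypothetical and chosen only for illustration.

```python
def pos(x):
    """Rectification [x]^+ : x if positive, otherwise 0."""
    return max(x, 0.0)


def update_weight(w, I, S, D,
                  alpha_w=0.1, beta_w=0.05, gamma_w=0.01,
                  theta_nmda=0.5, theta_ampa=0.1, d_base=0.2):
    """One trial of the CTX-MSN weight update (illustrative parameters).

    w: current weight in [0, 1]   I: presynaptic activity
    S: postsynaptic activation    D: dopamine level on this trial
    """
    # Strengthening: strong post activation and DA above baseline.
    ltp = alpha_w * I * pos(S - theta_nmda) * pos(D - d_base) * (1.0 - w)
    # Weakening: strong post activation and DA below baseline.
    ltd_da = beta_w * I * pos(S - theta_nmda) * pos(d_base - D) * w
    # Weakening: post activation between the AMPA and NMDA thresholds.
    ltd_low = gamma_w * I * pos(theta_nmda - S) * pos(S - theta_ampa) * w
    return w + ltp - ltd_da - ltd_low


print(update_weight(0.5, I=1.0, S=1.0, D=1.0))  # elevated DA: weight grows
print(update_weight(0.5, I=1.0, S=1.0, D=0.0))  # depressed DA: weight shrinks
```

The factors (1 - w) and w keep the weight inside [0, 1] as long as the rate parameters are small.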
Updating Synapses in the Model
$$
\begin{aligned}
w_{K,J}(n+1) = w_{K,J}(n)
&+ \alpha_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D(n) - D_{base}]^+\,[1 - w_{K,J}(n)] \\
&- \beta_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D_{base} - D(n)]^+\,w_{K,J}(n) \\
&- \gamma_w I_K(n)\,[\theta_{NMDA} - S_J(n)]^+\,[S_J(n) - \theta_{AMPA}]^+\,w_{K,J}(n).
\end{aligned}
$$
Synaptic Strengthening: first line
Synaptic Weakening: second and third lines
Highlighted: Postsynaptic Activation, the terms in S_J(n)
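Updating Synapses in the Model
$$
\begin{aligned}
w_{K,J}(n+1) = w_{K,J}(n)
&+ \alpha_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D(n) - D_{base}]^+\,[1 - w_{K,J}(n)] \\
&- \beta_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D_{base} - D(n)]^+\,w_{K,J}(n) \\
&- \gamma_w I_K(n)\,[\theta_{NMDA} - S_J(n)]^+\,[S_J(n) - \theta_{AMPA}]^+\,w_{K,J}(n).
\end{aligned}
$$
Synaptic Strengthening: first line
Synaptic Weakening: second and third lines
Highlighted: Elevated DA, [D(n) - D_base]^+; Depressed DA, [D_base - D(n)]^+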
Network Dynamics: Late Trial
[Figure sequence: network activity on a late trial, ending with the SMA unit]
Model Accounts for Electrophysiological
Recordings from TANs
Ashby and Crossley (2011)
Model Accounts for Electrophysiological
Recordings from MSNs
Ashby and Crossley (2011)
I. A neurobiological model of appetitive instrumental
conditioning	

II. Applications of model	

Fast Reacquisition	

Partial Reinforcement Extinction	

Renewal	

III. Temporal-Difference model of DA
Outline
Fast Reacquisition
Ashby and Crossley (2011)
Fast reacquisition is evidence that extinction
did not erase initial learning
Fast Reacquisition Mechanics
TANs quickly stop pausing, and thereby
protect cortico-striatal synapses
Fast Reacquisition Mechanics
Partial Reinforcement Extinction (PRE)
Extinction is slower when acquisition
is trained with partial reinforcement
PRE Mechanics
TANs take longer to stop pausing
under partial reinforcement
Slowed Reacquisition
                 Condition
Phase            Ext2               Ext8               Prf2            Prf8
Acquisition      VI-30 sec          VI-30 sec          VI-30 sec       VI-30 sec
Extinction       No Reinforcement   No Reinforcement   Lean Schedule   Lean Schedule
Reacquisition    VI-2 min           VI-8 min           VI-2 min        VI-8 min
Woods and Bouton (2007)
Behavioral Results
Crossley, Horvitz, Balsam, & Ashby (in prep)
Modeling Results
Crossley, Horvitz, Balsam, & Ashby (in prep)
TANs don’t stop pausing during
extinction in Prf Conditions
CTX-MSN Synapse Pf-TAN Synapse
Renewal - Basic Design
                       Condition
Phase                  ABA             AAB             ABC
Acquisition            Environment A   Environment A   Environment A
Extinction             Environment B   Environment A   Environment B
Renewal (Extinction)   Environment A   Environment B   Environment C
Bouton et al. (2011)
Renewal
Model Architecture
Crossley, Horvitz, Balsam, & Ashby (in prep)
Synaptic Plasticity at ALL Pf-TAN
Synapses
Crossley, Horvitz, Balsam, & Ashby (in prep)
Renewal
Crossley, Horvitz, Balsam, & Ashby (in prep)
ABA Mechanics
Crossley, Horvitz, Balsam, & Ashby (in prep)
Net Pf-TAN synaptic weight is the average of all
active Pf-TAN synapses
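The averaging rule stated above can be sketched in a few lines. The function name `net_pf_tan_weight`, the example weights, and the return value of 0.0 when no synapse is active are assumptions made for illustration; the slides do not say how the empty case is handled.

```python
def net_pf_tan_weight(weights, active):
    """Average the weights of the Pf-TAN synapses that are currently active.

    weights: list of synaptic weights, one per Pf-TAN synapse.
    active:  list of booleans marking which synapses are active.
    Returns 0.0 when no synapse is active (an assumption).
    """
    active_weights = [w for w, a in zip(weights, active) if a]
    if not active_weights:
        return 0.0
    return sum(active_weights) / len(active_weights)


# Hypothetical example: a strong and a weak synapse active together
# give an intermediate net weight.
print(net_pf_tan_weight([0.9, 0.1, 0.5], [True, True, False]))
```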
Instrumental Conditioning Summary
• The TANs protect learning at CTX-MSN synapses.	

• Manipulations that keep the TANs paused during
extinction leave learning at the CTX-MSN synapse
subject to change.
Untested Physiological
Predictions
• Development of TANs pause precedes
development of category-specific responses in
MSNs	

• TANs should stop pausing during extinction
I. A neurobiological model of appetitive instrumental
conditioning	

II. Applications of model	

Fast Reacquisition	

Partial Reinforcement Extinction	

Renewal	

III. Temporal-Difference (TD) model of DA
Outline
Putting TD into the model
We want to replace the
discrete-trial model of
DA with a continuous
time model
The TD Prediction Error
[Figure: prediction error plotted by trial and time step]
The TD Prediction Error
$$\delta_t = r_t + V(t+1) - V(t)$$
$$r_t = \begin{cases} 1 & \text{if reward at time } t \\ 0 & \text{if no reward at time } t \end{cases}$$
Montague, Dayan, & Sejnowski (1996). Journal of Neuroscience, 16(5), 1936-1947
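To make the within-trial prediction error concrete, the sketch below computes δ at every time step of a trial and uses it to update a table of values, one entry per time step. This is a minimal tabular illustration, not the spiking implementation used in the talk: the names `td_errors` and `td_learn`, the learning rate `lr`, the once-per-trial update, and the convention that the value after the last time step is 0 are all assumptions.

```python
def td_errors(rewards, values):
    """delta_t = r_t + V(t+1) - V(t), taking V past the last step as 0."""
    n_steps = len(rewards)
    deltas = []
    for t in range(n_steps):
        v_next = values[t + 1] if t + 1 < n_steps else 0.0
        deltas.append(rewards[t] + v_next - values[t])
    return deltas


def td_learn(rewards, n_trials=200, lr=0.1):
    """Repeat the same trial, nudging each V(t) by lr * delta_t per trial."""
    values = [0.0] * len(rewards)
    history = []
    for _ in range(n_trials):
        deltas = td_errors(rewards, values)
        values = [v + lr * d for v, d in zip(values, deltas)]
        history.append(deltas)
    return values, history


# Reward arrives on the last of three time steps.
values, history = td_learn([0, 0, 1])
print(history[0])   # first trial: error only at the time of reward
print(history[-1])  # late in training: errors are near zero
```

With a fully predicted reward the error at the time of reward fades, which matches the slide's point that DA responds to surprise rather than to reward itself.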
Model Architecture
Spiking Neuron Driven by
TD prediction error:
$$\delta_t = r_t + V(t+1) - V(t)$$
TANs were removed for
initial TD applications
We Need Modified Learning
Equations
$$
\begin{aligned}
w_{K,J}(n+1) = w_{K,J}(n)
&+ \alpha_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D(n) - D_{base}]^+\,[1 - w_{K,J}(n)] \\
&- \beta_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D_{base} - D(n)]^+\,w_{K,J}(n) \\
&- \gamma_w I_K(n)\,[\theta_{NMDA} - S_J(n)]^+\,[S_J(n) - \theta_{AMPA}]^+\,w_{K,J}(n).
\end{aligned}
$$
Synaptic Strengthening: first line
Synaptic Weakening: second and third lines
DA is no longer modeled on a
discrete trial-by-trial basis!
A Cortico-Striatal Synapse
CaMKII, PP-1 and Striatal Plasticity
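Learning Equations
$$
\begin{aligned}
w(n+1) = w(n)
&+ \alpha_w\,[S_{CaMKII}(t) - S_{CaMKII,base}]^+\,[D_{PP\text{-}1}(t) - D_{base}]^+\,[w_{max} - w(n)]\,dt \\
&- \beta_w\,[S_{CaMKII}(t) - S_{CaMKII,base}]^+\,[D_{base} - D_{PP\text{-}1}(t)]^+\,w(n)\,dt
\end{aligned}
$$
Synaptic Strengthening: first line
Synaptic Weakening: second line
Highlighted: CaMKII Activity, the terms in S_CaMKII(t)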
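One time step of the continuous-time rule might look like the sketch below. It assumes the two-term reading of the slide (strengthening when the PP-1 dopamine signal is above baseline, weakening when it is below, both gated by CaMKII activity above its baseline). The names `pos` and `step_weight` and every parameter value are hypothetical.

```python
def pos(x):
    """Rectification [x]^+ : x if positive, otherwise 0."""
    return max(x, 0.0)


def step_weight(w, s_camkii, d_pp1, dt,
                alpha_w=1.0, beta_w=1.0,
                s_base=0.0, d_base=0.2, w_max=1.0):
    """Advance the synaptic weight by one time step of length dt.

    s_camkii: CaMKII-related activity at this time step.
    d_pp1:    PP-1-related dopamine signal at this time step.
    All default parameter values are illustrative assumptions.
    """
    gate = pos(s_camkii - s_base)
    ltp = alpha_w * gate * pos(d_pp1 - d_base) * (w_max - w) * dt
    ltd = beta_w * gate * pos(d_base - d_pp1) * w * dt
    return w + ltp - ltd


w = 0.5
print(step_weight(w, s_camkii=1.0, d_pp1=1.0, dt=0.01))  # above baseline DA
print(step_weight(w, s_camkii=1.0, d_pp1=0.0, dt=0.01))  # below baseline DA
```

Because each call covers only a short interval dt, the weight now changes throughout a trial rather than once per trial.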
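Learning Equations
$$
\begin{aligned}
w(n+1) = w(n)
&+ \alpha_w\,[S_{CaMKII}(t) - S_{CaMKII,base}]^+\,[D_{PP\text{-}1}(t) - D_{base}]^+\,[w_{max} - w(n)]\,dt \\
&- \beta_w\,[S_{CaMKII}(t) - S_{CaMKII,base}]^+\,[D_{base} - D_{PP\text{-}1}(t)]^+\,w(n)\,dt
\end{aligned}
$$
Synaptic Strengthening: first line
Synaptic Weakening: second line
Highlighted: PP-1 Activity, the terms in D_PP-1(t)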
Acquisition and Extinction
[Figure panels: proportion of responses emitted by trial; CTX-MSN synaptic strength by trial]
MSN and SNc
[Figure panels: MSN output and SNc output, each by trial and time step]
CaMKII and PP-1
DA model learns very quickly that
reward is taken away
[Figure panels: two plots by trial and time step]
Extinction under noncontingent
reward delivery
[Figure panels: proportion of responses emitted by trial; CTX-MSN synaptic strength by trial]
MSN and SNc
[Figure panels: MSN output and SNc output, each by trial and time step]
MSN and SNc
Noncontingent reward delivery
keeps DA surprised
[Figure panels: two plots by trial and time step]
CaMKII and PP-1
Noncontingent reward delivery
keeps DA surprised
[Figure panels: two plots by trial and time step]
Summary and Future Directions
• TANs need to be added to account for
reacquisition, renewal, and other effects after
extinction with noncontingent reward	

• TD model might need to be modified once the
TANs are included and post-extinction effects are
examined
Acknowledgments
Collaborators:	

Greg Ashby	

The Ashby Lab	

Todd Maddox	

Jon Horvitz	

Peter Balsam	

Funding:	

NIMH Grant MH3760-2, 
Todd Wilkinson
