1. Neurobiological Models and
Research Themes
Matthew J. Crossley
Department of Psychological and Brain Sciences
University of California, Santa Barbara, 93106
2. I. A neurobiological model of appetitive instrumental
conditioning
II. Overview of my research
III. Contribution to the Ivry Lab
Talk Goals
3. Why Instrumental Conditioning?
• The Ashby lab's bread and butter is category
learning
• Information-integration (II) category learning is a
procedural skill
• Appetitive instrumental conditioning is a
procedural skill
4. • Procedural Skills
• Model Architecture
• Instrumental Conditioning Applications
• Instrumental Conditioning Summary
Part I Outline
6. • Learned incrementally from feedback
• Model-free reinforcement learning
• Habitual control
• E.g., riding a bike or playing an instrument
• E.g., radiology
Procedural Skills
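The "learned incrementally from feedback" and "model-free" bullets can be illustrated with a minimal sketch. It assumes a simple prediction-error update of a cached value; the function name `update_value` and the learning rate are hypothetical choices for illustration, not parameters from the talk.

```python
def update_value(value, reward, learning_rate=0.1):
    """Move a cached action value a small step toward the obtained reward."""
    prediction_error = reward - value  # feedback minus expectation
    return value + learning_rate * prediction_error


# Repeated rewarded trials: the value creeps toward 1.0 in small increments.
value = 0.0
for _ in range(50):
    value = update_value(value, reward=1.0)
```

No model of the task is consulted here; the value changes only through trial-by-trial feedback, which is the sense in which the learning is incremental and model-free.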
9. Procedural Skills Depend on the
Basal Ganglia
• Basal ganglia are a
collection of subcortical
nuclei
• Interconnect with
cortex in well-defined
circuits
• The striatum is a major
input structure
18. The TANs are of Particular Interest
• TANs (tonically active neurons) fire tonically and pause in response to excitatory input
• Presynaptically inhibit cortical input to MSNs (medium spiny neurons)
• Get major input from the CM-Pf complex (thalamus)
• Learn to pause to stimuli that predict reward
(requires dopamine)
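The gating role described in these bullets can be sketched as follows. This is a toy rate-based illustration, not the spiking model from the talk: the multiplicative gate, the dopamine-dependent update rule, and all function names and parameter values are assumptions made for the example.

```python
def tan_activity(pf_input, pf_tan_weight):
    """Return TAN firing in [0, 1]; 1 is tonic firing, 0 is a full pause."""
    return max(0.0, 1.0 - pf_input * pf_tan_weight)


def update_pf_tan_weight(weight, pf_input, dopamine, rate=0.2):
    """Strengthen the Pf-TAN synapse only when dopamine is present (assumed rule)."""
    return weight + rate * dopamine * pf_input * (1.0 - weight)


def msn_drive(cortical_input, ctx_msn_weight, tan):
    """Cortical drive reaching the MSN after presynaptic inhibition by TANs."""
    return cortical_input * ctx_msn_weight * (1.0 - tan)


# Train the Pf-TAN synapse on rewarded trials (dopamine = 1.0).
weight = 0.0
for _ in range(30):
    weight = update_pf_tan_weight(weight, pf_input=1.0, dopamine=1.0)

drive_before = msn_drive(1.0, 0.5, tan_activity(1.0, 0.0))     # TANs still tonic
drive_after = msn_drive(1.0, 0.5, tan_activity(1.0, weight))   # TANs now pause
```

Before the pause is learned, cortical input is blocked; once the TANs pause to the reward-predicting input, the same cortical input reaches the MSN.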
53. TANs don’t stop pausing during
extinction in Prf Conditions
[Figure panels: CTX-MSN Synapse; Pf-TAN Synapse]
54. Renewal - Basic Design
Phase                 Condition ABA   Condition AAB   Condition ABC
Acquisition           Environment A   Environment A   Environment A
Extinction            Environment B   Environment A   Environment B
Renewal (Extinction)  Environment A   Environment B   Environment C
Bouton et al. (2011)
59. ABA Mechanics
Crossley, Horvitz, Balsam, & Ashby (in prep)
Net Pf-TAN synaptic weight is the average of all
active Pf-TAN synapses
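The averaging rule stated above can be written out as a short sketch. The two-synapse example, with one hypothetical Pf-TAN synapse per environment, and the specific weight values are assumptions for illustration only.

```python
def net_pf_tan_weight(weights, active):
    """Average the weights of the currently active Pf-TAN synapses."""
    active_weights = [w for w, is_active in zip(weights, active) if is_active]
    if not active_weights:
        return 0.0  # no active synapse contributes any drive
    return sum(active_weights) / len(active_weights)


# Hypothetical weights for synapses tied to Environment A and Environment B.
weights = [0.8, 0.2]
only_a = net_pf_tan_weight(weights, [True, False])   # in Environment A
only_b = net_pf_tan_weight(weights, [False, True])   # in Environment B
both = net_pf_tan_weight(weights, [True, True])      # both sets of cues present
```

Under this rule, the net weight depends on which context cues are active, so a change of environment can change the net Pf-TAN drive without any change to individual synapses.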
60. Instrumental Conditioning Summary
• The TANs protect learning at CTX-MSN synapses.
• Manipulations that keep the TANs paused during
extinction leave learning at the CTX-MSN synapse
subject to change.
61. I. A neurobiological model of appetitive
instrumental conditioning
II. Overview of my research
III. Contribution to the Ivry Lab
Talk Goals
65. Many Qualitative Differences Between RB and II
                                   RB    II
Unsupervised learning              Yes   No
Observational learning             Yes   No
Dual-task interference             Yes   No
Time needed to process feedback    Yes   No
Interference from button switch    No    Yes
Interference from feedback delay   No    Yes
II Category Learning is a Procedural Skill
72. System Interaction Theme
• Development of TANs pause precedes
development of category-specific responses in
MSNs
• TANs should stop pausing during extinction (i.e.,
reward removal in instrumental conditioning and
noncontingent feedback in category learning).
• Phasic DA response should be scaled by response-
feedback contingency.
• Do systems cooperate to learn optimal behavior?
• What does it take to get system-switching?
• Does the procedural system learn during declarative
control?
• What mechanistic models describe system switching
throughout learning?
• What is the correct neurobiological model of
system switching?
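The prediction in this list that the phasic DA response is scaled by response-feedback contingency can be sketched as a one-line rule. The multiplicative form and the treatment of contingency as a number in [0, 1] are assumptions for illustration; the slide does not specify how contingency is estimated.

```python
def phasic_dopamine(reward, predicted_reward, contingency):
    """Reward prediction error scaled by response-feedback contingency (assumed form)."""
    if not 0.0 <= contingency <= 1.0:
        raise ValueError("contingency must lie in [0, 1]")
    return contingency * (reward - predicted_reward)


full = phasic_dopamine(1.0, 0.25, contingency=1.0)   # feedback fully contingent
half = phasic_dopamine(1.0, 0.25, contingency=0.5)   # partially contingent
none = phasic_dopamine(1.0, 0.25, contingency=0.0)   # noncontingent feedback
```

With noncontingent feedback the scaled response vanishes in this sketch, which corresponds to the noncontingent-feedback case named in the extinction bullet.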
74. Do Systems Cooperate?
Perfect accuracy is possible with trial-by-trial switching
between RB and II strategies
Ashby & Crossley (2010)
2 days (1200 trials) of training on:
75. Systems Compete
[Figure: Decision-Bound Model Fit Summary. Panels: Information-Integration, Uniform Hybrid, Non-Uniform Hybrid. Best-fitting model types: Guessing, Rule-Based, Information-Integration, Hybrid. Y-axis: Number of Participants (0 to 20).]
Almost nobody was best fit by a hybrid model
Ashby & Crossley (2010)
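To make the model comparison concrete, here is a simplified sketch of how two candidate decision bounds can be scored against a participant's responses. It is a stand-in, not the fitting procedure from Ashby & Crossley (2010): actual decision-bound models are typically fit by maximum likelihood with perceptual and criterial noise, whereas this example only counts agreement. The bounds, stimuli, and responses are all hypothetical.

```python
def rule_based_response(x, y, criterion=50.0):
    """Unidimensional rule: respond using the x dimension only."""
    return "A" if x < criterion else "B"


def information_integration_response(x, y):
    """Diagonal bound: the response depends on both dimensions together."""
    return "A" if y > x else "B"


def proportion_agreement(model, stimuli, responses):
    """Fraction of trials on which the model predicts the observed response."""
    matches = sum(model(x, y) == r for (x, y), r in zip(stimuli, responses))
    return matches / len(responses)


# Hypothetical stimuli and responses from a participant using the diagonal bound.
stimuli = [(20, 80), (80, 20), (30, 10), (70, 90)]
responses = ["A", "B", "B", "A"]

rb_fit = proportion_agreement(rule_based_response, stimuli, responses)
ii_fit = proportion_agreement(information_integration_response, stimuli, responses)
```

The participant would be classified by whichever model accounts for the responses best; a hybrid model would apply different bounds to different regions of stimulus space.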
77. What does it take to get
successful system switching?
[Figure with labels A, B, C, and D]
Behavioral: Crossley, Roeder & Ashby (in prep)
fMRI: Turner, Crossley & Ashby (in prep)
78. Crossley, Roeder & Ashby (in prep)
Successful System-Switching
Training Protocol
• 100 RB trials
• 400 II trials
• 300 intermixed trials
• 100 button-switched
intermixed trials
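The training protocol above can be laid out as a trial schedule. This is a sketch under stated assumptions: the slide gives only the block lengths, so the equal RB/II split within intermixed blocks, the random ordering, and the names `build_schedule` and `intermixed` are hypothetical.

```python
import random


def build_schedule(seed=0):
    """Return a list of (trial_type, button_switched) pairs, 900 trials in total."""
    rng = random.Random(seed)

    def intermixed(n_trials, button_switched):
        # Assumed: equal numbers of RB and II trials in random order.
        half = n_trials // 2
        block = [("RB", button_switched)] * half
        block += [("II", button_switched)] * (n_trials - half)
        rng.shuffle(block)
        return block

    schedule = [("RB", False)] * 100                     # 100 RB trials
    schedule += [("II", False)] * 400                    # 400 II trials
    schedule += intermixed(300, button_switched=False)   # 300 intermixed trials
    schedule += intermixed(100, button_switched=True)    # 100 button-switched intermixed
    return schedule


schedule = build_schedule()
```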
79. Successful System-Switching
Button Switch
Crossley, Roeder & Ashby (in prep)
Persistent button-switch interference on II trials but not RB
trials supports true system switching
[Figure: Button-Switch Interference]
81. Does the procedural system learn during declarative
control?
Conditions
• Transfer Positive
• All Positive
• Transfer Negative
• All Negative
Crossley & Ashby (in prep)
82. Potential for weak bootstrapping
Small but significant hit in the
Transfer Negative condition
during the first 50 trials after
transfer
[Figure: Train and Transfer phases]
Crossley & Ashby (in prep)
83. System Interaction Theme
• Development of TANs pause precedes
development of category-specific responses in
MSNs
• TANs should stop pausing during extinction (i.e.,
reward removal in instrumental conditioning and
noncontingent feedback in category learning).
• Phasic DA response should be scaled by response-
feedback contingency.
• Do systems cooperate to learn optimal behavior?
• What does it take to get system-switching?
• Does the II system learn during RB control?
• What mechanistic models describe system switching
throughout learning?
• What is the correct neurobiological model of
system switching?
88. Category Structure and
Feedback Effects
• Development of TANs pause precedes
development of category-specific responses in
MSNs
• TANs should stop pausing during extinction (i.e.,
reward removal in instrumental conditioning and
noncontingent feedback in category learning).
• Phasic DA response should be scaled by response-
feedback contingency.
• What system learns unstructured categories?
• Does probabilistic feedback induce procedural
learning?
91. The Experiment
Crossley, Madsen & Ashby (in prep)
[Figure: Button-Switch Interference on Accuracy and Reaction Time]
Button-switch effect on unstructured categories suggests
procedural control
92. Learning Under a Dual-Task
• Development of TANs pause precedes
development of category-specific responses in
MSNs
• TANs should stop pausing during extinction (i.e.,
reward removal in instrumental conditioning and
noncontingent feedback in category learning).
• Phasic DA response should be scaled by response-
feedback contingency.
• Hypothesis 1: Dual-task induces procedural control.
• Hypothesis 2: Dual-task only slows the declarative
system down.
RB category learning with a simultaneous numerical Stroop task
93. The Experiment
Paul, Crossley & Ashby (in prep)
• Every participant does either RB or II structures with:
• Single-task, button-switch
• Dual-task, button-switch
95. I. A neurobiological model of appetitive
instrumental conditioning
II. Overview of my research
III. Contribution to the Ivry Lab
Talk Goals
96. I. Lots of room to build spiking networks
Hand / Object Choice networks
Inhibitory Control and Competition Resolution
Supervised learning in the cerebellum
Model of timing in instrumental conditioning
II. Object choice, hand choice, and categorization:
Experiment ideas
Contribution to the Ivry Lab
97. Spiking Networks of Hand and Object Choice
Motivation
• Predictive clarity
• Model-based imaging
• Natural ability to account for
patient data
• Generate new experiments
98. Supervised Learning in the Cerebellum
Hypothesized hand and object choice brain systems
operate with different learning algorithms.
Doya, 2000
99. Spiking Networks of IC and CR
• Role of the hyperdirect
pathway?
• Relationship to our studies of
system switching?
100. I. Many of the tools used to dissociate RB and II
category learning systems might be used to
dissociate hand choice from object choice, and
subsystems thereof.
Feedback delay
Time duration to process feedback
Feedback contingency
Automaticity
Object choice, hand choice, and categorization experiment ideas