Sequence learning in a
model of the basal ganglia
Thesis submitted for MSc in computer science
NTNU, Trondheim 2006-09-08
Stian Soiland
http://soiland.no/master
This presentation
Theory
Control theory & Actor-Critic
Basal Ganglia and pathways
CTRNN
Previous work by Berns & Sejnowski
What was done?
Results
Globus pallidus, Weight learning
Error function
Experiments & Code
Noise, Integrator update rule
CTRNN library
Discussion
Equation or code?
Actor-critic architecture
[Figure: two block diagrams. Left, a classic control system: a controller compares a reference signal against feedback from the output of the environment (controlled system), forming an error (+/−), and issues control signals; disturbances act on the system. Right, the actor-critic architecture: the actor (controller) sends actions (control signals) to the environment based on context, while a critic converts feedback into a reinforcement signal for the actor; disturbances act on the output.]
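The actor-critic loop can be illustrated with a toy tabular example (a hedged sketch, not the thesis model: the single-state bandit environment, softmax actor and learning rates are invented for illustration; the TD error plays the role of the reinforcement signal):

```python
import math
import random

# Toy actor-critic: one state, two actions; action 1 always pays reward 1.
# The critic's value estimate V stands in for the feedback comparison of
# the control diagram; the TD error delta is the reinforcement signal.
random.seed(42)
V = 0.0                 # critic: value estimate of the single state
prefs = [0.0, 0.0]      # actor: action preferences
alpha, beta = 0.1, 0.1  # critic and actor learning rates

def choose(prefs):
    """Softmax action selection over the preferences."""
    exps = [math.exp(p) for p in prefs]
    r = random.random() * sum(exps)
    acc = 0.0
    for action, e in enumerate(exps):
        acc += e
        if r <= acc:
            return action
    return len(prefs) - 1

for _ in range(500):
    a = choose(prefs)
    reward = 1.0 if a == 1 else 0.0  # environment (controlled system)
    delta = reward - V               # TD error: reinforcement signal
    V += alpha * delta               # critic update
    prefs[a] += beta * delta         # actor update

print(prefs[1] > prefs[0])  # the actor learns to prefer the rewarded action
```

The critic learns to predict reward; once the prediction is accurate the reinforcement signal vanishes, which is exactly the dopamine-like behaviour shown later in the Schultz et al. slide.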
Basal ganglia
[Figure: anatomy of the basal ganglia — caudate, putamen, globus pallidus (external and internal segments) and substantia nigra pars reticulata. Wikipedia 2006; adapted from Purves et al. 2003]
Basal ganglia pathways
[Figure: the direct and indirect pathways — excitatory, inhibitory and dopaminergic connections linking cerebral cortex, frontal cortex, striatum, subthalamic nucleus, globus pallidus (external and internal), substantia nigra pars compacta, substantia nigra pars reticulata and thalamus.]
Inhibitory connections
[Figure: excitatory and inhibitory connections between frontal cortex, striatum, globus pallidus and thalamus, with cortex inputs and sensory and other inputs; tonically firing units are marked as firing at rest. With STR at rest, GP is tonically active, the thalamus is inhibited and motor cortex is not excited; with STR excited by cortex, GP is inhibited, the thalamus is disinhibited and motor cortex is excited. Purves et al. 2003]
Basal ganglia in action
[Figure and excerpt from Schultz et al. 1997, "Do dopamine neurons report an error in the prediction of reward?": recordings in alert monkeys performing behavioral acts and receiving rewards. Dopamine neuron outputs appear to code for a difference between the actual reward and learned predictions of its time and magnitude. They emit a positive signal (increased spike production) when an appetitive reward occurs unpredicted (top panel: no prediction, reward occurs); no signal when a predicted reward occurs as predicted, the neuron instead being activated by the reward-predicting conditioned stimulus (middle panel: reward predicted, reward occurs); and a negative signal (depressed spike production), timed exactly to when the reward would have occurred, when a predicted reward fails to occur (bottom panel: reward predicted, no reward occurs). The TD algorithm is well suited to understanding this role of the dopamine signal and the information it constructs.]
Continuous time recurrent neural network (CTRNN)
[Figure: a two-neuron CTRNN — y1(t) with τ=4 and y2(t) with τ=1.0, mutual weights w1,2 and w2,1, biases θ1 and θ2, and external input Ii(t). A plot of voltage against time shows the leaky-integrator potential following the input with a lag.]
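One common formulation of the CTRNN dynamics, τi·dyi/dt = −yi + Σj wij·σ(yj) + θi + Ii(t), can be sketched with plain Euler integration (a hedged sketch: the weights, input and step size below are illustrative; only the two time constants τ=1.0 and τ=4 are taken from the slide):

```python
import math

def sigma(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

# Two neurons, Euler-integrated with step dt.
tau = [1.0, 4.0]      # time constants from the slide
theta = [0.0, 0.0]    # biases theta_1, theta_2 (illustrative)
w = [[0.0, 1.0],      # w[i][j]: weight from neuron j to neuron i
     [1.0, 0.0]]      # mutual connections (illustrative)
y = [0.0, 0.0]        # membrane potentials
dt = 0.1

for step in range(200):                  # simulate t = 0 .. 20
    I = [1.0, 0.0]                       # constant external input to neuron 0
    out = [sigma(v) for v in y]
    y = [y[i] + dt / tau[i] *
         (-y[i] + sum(w[i][j] * out[j] for j in range(2)) + theta[i] + I[i])
         for i in range(2)]

# The tau=4 neuron approaches its equilibrium four times more slowly,
# which is the lag visible in the slide's plot.
print([round(v, 3) for v in y])
```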
Berns & Sejnowski's model
[Figure: block diagram of the model — striatum, globus pallidus, subthalamic nucleus and substantia nigra, connected through the direct and indirect pathways, with thalamus and cortex closing the loop. Each area contains fast and slow units with inputs and outputs; the substantia nigra carries the error signal. Connections are excitatory, inhibitory or weight-modulating, over fixed or dynamic links.]
What have I done?
Early attempts: C++ implementations of CTRNN & B&S
Direct reproduction of Berns & Sejnowski, equation to code
Experiments and tweaks on the direct reproduction
Implementation of CTRNN library
Reproducing B&S using CTRNN (failed)
Reproducing Prescott (2006) using CTRNN (worked)
Reproducing response of globus pallidus
[Figure: activity of GP units 1–5 over the simulation timesteps — Berns & Sejnowski 1998 next to Soiland's reproduction.]
Weight learning
Figure 5 (Berns & Sejnowski 1998): Changes in connection strengths, wij, from learning the sequence 1, 2, 3, 4, 2, 5. The five weights from the five STN units with short time constants to GP unit 2 are shown. The three weights that increased to saturation levels were from STN units 2, 3, and 5 (i.e., those STN units that were not active prior to GP unit 2 being active). Conversely, the weights from STN units 1 and 4 did not increase.
[Plot: Soiland's reproduction — the weights from STN units 1–5 on a 0–1 scale over 180 timesteps.]
Error function
The GP output is compared to the weighted STR input:

e(s) = Σi [Gi(s) − vi(s)Si(s)]

where vi(s) is the weight representing the connection between STR unit i and GP unit i. The error is used for updating the weight of the connection between STN and GP units; each STN unit has connections to all GP units, and the connections are updated using a Hebbian learning rule (Sutton and Barto, 1981).

def calc_error(self):
    error = 0.0
    for i in range(self.inputs):
        error += self.GP[i]
        error -= self.v[i] * self.STR[i]
    return error

[Plot: error (1–4) over 200 timesteps — Berns & Sejnowski 1998 next to Soiland's reproduction.]
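The same error can be computed in one vectorized step; a minimal sketch assuming GP, v and STR are equal-length NumPy arrays (the example values are invented):

```python
import numpy as np

def calc_error(GP, v, STR):
    """e(s) = sum_i [G_i(s) - v_i(s) * S_i(s)]."""
    return float(np.sum(GP - v * STR))

# Three-unit example with invented values
GP = np.array([0.9, 0.8, 0.7])
v = np.array([0.5, 0.5, 0.5])
STR = np.array([1.0, 0.0, 1.0])
print(calc_error(GP, v, STR))  # ≈ 1.4
```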
Experiment: Noise
[Plot: GP firing rate (0.85–1.05) over 100 timesteps, comparing runs without noise, two runs with noise, and a run with fixed bias=1.]
Experiment: Sigmoidal update rule
As shown in the discrete update rule in equation 3.5 on page 24, the sigmoid is applied several times, as simplified in:

B(s) = σ(αB(s − 1) + β)
B(s + 1) = σ(ασ(αB(s − 1) + β) + β)

To investigate the effect of using an iterative sigmoidal update rule, both rules were applied in an Octave (Murphy, 1997) simulation: the function u(t) represents the normal leaky integrator potential, while the function v(t) represents the sigmoidal update rule as in Berns and Sejnowski:

u(t) = u(t − 1) + (1/τ)(−u(t − 1) + I(t))
v(t) = σ(v(t − 1) + (1/τ)(−v(t − 1) + I(t)))

[Plot: voltage over 30 timesteps, showing u(t), sigmoid(u(t)) and v(t) in response to the input.]

The rule for v(t) is a modified version of equation 2.3 on page 16 that includes the sigmoid σ, which in turn is an extended version of equation 2.2 with gain γ and bias β:

σ(x) = 1 / (1 + e^(−γ(x − β)))

Berns and Sejnowski (1998) introduce a term λ determined by the size of the time step:

λ = τ / (τ + ∆t)

Substituting equation 3.4 into equation 3.2 yields the implementation:

def calc_STN(self, i):
    return self.sigmoid(lambd * self.STN[i] -
                        (1 - lambd) * self.effect * self.GP[i/2])
(..)
STN[i] = calc_STN(i)

def sigmoid(self, x):
    return 1.0 / (1.0 + math.exp(-self.gain * (x - self.bias)))
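The difference between the two rules can be re-run in a few lines of Python (a hedged sketch of the Octave comparison: τ, γ, β and the step input are illustrative values, not the thesis parameters):

```python
import math

def sigmoid(x, gain=4.0, bias=0.5):
    """Logistic sigmoid with gain gamma and bias beta (illustrative values)."""
    return 1.0 / (1.0 + math.exp(-gain * (x - bias)))

tau = 4.0
u = v = 0.0
for t in range(30):
    I = 1.0 if t >= 5 else 0.0               # step input switched on at t=5
    u = u + (1.0 / tau) * (-u + I)           # plain leaky integrator
    v = sigmoid(v + (1.0 / tau) * (-v + I))  # sigmoid applied at every step

# u converges towards the input level; v is squashed through the sigmoid
# on every step and settles at a fixed point of the composed map instead.
print(round(u, 3), round(v, 3))
```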
CTRNN library for Python

import ctrnn
net = ctrnn.CTRNN(2)
net.bias[0] = 0.4
net.timeconst[1] = 1.4
net.weight[0,1] = 1.5
net.weight[1,0] = -1.0
net.calc_timestep(); print net.output
net.calc_timestep(); print net.output
[ 0.59868766  0.5       ]
[ 0.47502081  0.6550814 ]

[Diagram: the resulting two-neuron network — y0(t) with τ=1.0 and bias 0.4, y1(t) with τ=1.4, connected by weights 1.5 (y0 → y1) and −1.0 (y1 → y0).]
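A minimal re-implementation of the interface shown above might look like this (a sketch, not the actual ctrnn library: the output vector is assumed to start at zero, and the update follows the discrete rule potential ← potential + ∆t/τ · (−potential + output·W + bias) with a logistic transfer; under those assumptions it happens to reproduce the two outputs printed on the slide):

```python
import numpy as np

class CTRNN:
    """Tiny CTRNN with the slide's interface (sketch, not the real library)."""
    def __init__(self, n, timestep=1.0):
        self.bias = np.zeros(n)
        self.timeconst = np.ones(n)
        self.weight = np.zeros((n, n))  # weight[i, j]: from neuron i to j
        self.potential = np.zeros(n)
        self.output = np.zeros(n)       # assumed to start at zero
        self.timestep = timestep

    def calc_timestep(self):
        # Euler step of the leaky integrator, then logistic transfer
        inputs = self.output @ self.weight
        self.potential += (self.timestep / self.timeconst *
                           (-self.potential + inputs + self.bias))
        self.output = 1.0 / (1.0 + np.exp(-self.potential))

net = CTRNN(2)
net.bias[0] = 0.4
net.timeconst[1] = 1.4
net.weight[0, 1] = 1.5
net.weight[1, 0] = -1.0
net.calc_timestep(); print(net.output)  # [ 0.59868766  0.5 ]
net.calc_timestep(); print(net.output)  # [ 0.47502081  0.6550814 ]
```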
Equations or code?

inputs = numpy.matrixmultiply(output, weight)
change = timestep/timeconst * (-potential + inputs + bias)
potential += change
output = map(transfer, potential)

More readable, but also more verbose. Here output, potential and bias are n-sized vectors, while weight is an n × n sized matrix. The code is easily compared to a version of equation 2.3 on page 16:

y(t + ∆t) = y(t) + (∆t/τ)(−y(t) + u(t) × W + θ)   (3.10)
u(t) = σ(y(t))   (3.11)

Concise, but can be difficult to understand.

In addition to making the code clearer, the matrix operations of NumPy are implemented in compiled languages such as C and FORTRAN and exploit CPU vector features like the AltiVec engine (Diefendorff et al., 2000), which in informal tests on the basal ganglia model gave a considerable speed-up compared to pure Python code, in some cases by a factor of 30.

For this work, experimenting with network shapes and layout was essential, so a high-level view of the calculations seemed like a reasonable approach, justifying using NumPy.

Mathematics can be general, but code is reproducible. Maybe the best is a combination?
Questions?
All code, thesis and presentation at:
http://soiland.no/master