1. Computational Properties of the Hippocampus Increase the Efficiency of Goal-Directed Foraging through Hierarchical Reinforcement Learning
Front. Computational Neurosci.
Chalmers et al.
2016
3. Model-based Reward Learning Algorithm
▪ ‘Model-based’ means the agent uses representations of the environment, expectations, and prospective calculations to make cognitive predictions of future value [1]
▪ Previous artificial intelligence (AI) research suggests the model-based reward learning (MBRL) algorithm as a way to understand how an animal navigates to a goal (reward)
Model-based Q-learning update:
Q(s, a) = R(s, a) + γ · Σ_s′ T(s′ | s, a) · max_a′ Q(s′, a′)
s: current state, a: current action, a′: next action, T: transition probability, R: reward function
[1] Dayan and Berridge, 2014, Cogn Affect Behav Neurosci.
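The model-based update above can be sketched in a few lines of NumPy. This is a minimal toy illustration, not the paper's code: the 2-state environment and all variable names are invented for the example.

```python
import numpy as np

def model_based_q(T, R, gamma=0.9, iters=200):
    """Value iteration with a known model:
    T[s, a, s'] -- transition probabilities, R[s, a] -- expected reward."""
    n_states, n_actions, _ = T.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = Q.max(axis=1)        # V(s') = max_a' Q(s', a')
        Q = R + gamma * (T @ V)  # Q(s,a) = R(s,a) + gamma * sum_s' T(s'|s,a) V(s')
    return Q

# Toy 2-state example: action 1 moves to state 1 and yields reward 1.
T = np.zeros((2, 2, 2))
T[0, 0, 0] = 1.0   # action 0: stay in state 0
T[0, 1, 1] = 1.0   # action 1: go to state 1
T[1, :, 1] = 1.0   # state 1 is absorbing
R = np.array([[0.0, 1.0], [0.0, 0.0]])
Q = model_based_q(T, R)
```

Because the model (T, R) is known, the agent can compute values by prospective calculation rather than trial-and-error sampling, which is the defining feature of the model-based approach.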
4. Goal-directed Navigation based on MBRL
▪ A ‘tree search’ model for goal-directed navigation based on MBRL
▪ A rat faces a maze in which different turns lead to different states and rewards
[2] Daw, 2012, IEEE
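The tree-search idea can be sketched as a depth-limited forward search: expand each possible action, recurse on the successor state, and back up discounted values. The `LineMaze` toy world and all names here are illustrative assumptions, not the model from the paper.

```python
def tree_search(state, model, reward, depth, gamma=0.9):
    """Depth-limited forward search over a known transition model;
    backs up the best discounted value reachable within `depth` steps."""
    if depth == 0:
        return 0.0
    best = float("-inf")
    for action in model.actions(state):
        nxt = model.step(state, action)  # deterministic model for simplicity
        best = max(best, reward(nxt) + gamma * tree_search(nxt, model, reward, depth - 1, gamma))
    return best

class LineMaze:
    """Toy 1-D maze: states 0..4, with the reward placed at state 4."""
    def actions(self, state):
        return [-1, +1]
    def step(self, state, action):
        return min(4, max(0, state + action))

maze = LineMaze()
v = tree_search(2, maze, lambda s: 1.0 if s == 4 else 0.0, depth=3)
```

Like the rat at a choice point, the search considers each turn in sequence before committing to the action whose imagined future is most valuable.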
5. Place Cell and Reward
▪ Location-specific spatial information is processed in the brain by ‘place cells’
▪ Place cells show ‘forward sweeps’ (journey-dependent activity) related to the specific route taken during goal-directed navigation
[3] Grieves et al., 2016, eLIFE
6. Place Cell and Reward
▪ Dopamine is known to modulate synaptic plasticity
▪ The hippocampus, which contains place cells, receives dopaminergic synaptic input from the ventral tegmental area (VTA)
▪ The VTA contains the dopaminergic cell bodies and plays a major role in the reward circuitry of the brain
[4] Edelmann and Lessmann, 2013, Front. Neurosci. [5] Russo and Nestler, 2013, Nat. Rev. Neurosci.
7. Hierarchical Navigation Model
▪ Hierarchical cognitive map model
▫ The Learning Intelligent Distribution Agent (LIDA) has three phases similar to MBRL:
• Understanding
◦ Sensing the environment
◦ Detecting features
◦ Recognizing objects and categories
• Attending
• Action
[6] Madl et al., 2015, Neural Networks
8. Hierarchical Representation
▪ Place cells represent an animal’s spatial information as a hierarchical space
[7] Keinath et al., 2014, Hippocampus [8] Lyttle et al., 2013, Hippocampus
[Figure: standard environment (diam = 35 cm); different visual cue; large environment (diam = 70 cm)]
9. Hierarchical Spatial Information Processing
▪ Animals with lesions of the ventral hippocampus (vH) show delayed acquisition, requiring more trials to learn the task
▪ Animals with lesions of the dorsal hippocampus (dH) never learn to perform the water maze task as well as intact animals
[9] Ruediger et al., 2012, Nat. Neurosci.
10. Research Question
▪ Conventional MBRL algorithms do not fully explain animals’
ability to rapidly adapt to environmental change, or learn
multiple complex tasks
▪ Goal: implement a computational MBRL framework that incorporates features inspired by the computational properties of the hippocampus
▫ Hierarchical representation of space
▫ Forward sweeps
11. Method
▪ Simulated environment
▫ 16 by 48 discrete states in an environment
▫ Four actions (up, down, left and right)
▫ 10 trials in an open arena → boundaries were then added to the arena (5 mazes)
▫ The agent could not see or otherwise sense barriers or rewards
▫ Goal location triggers a ‘reward’ update
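The simulated environment above can be sketched as a small grid world. The class and all names below are illustrative assumptions, not the authors' code; only the 16 × 48 state space, the four actions, and the invisible barriers/goal come from the slide.

```python
class GridEnv:
    """Toy sketch of the simulated environment: a 16 x 48 grid of discrete
    states, four movement actions, and a goal that triggers a reward."""
    ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, rows=16, cols=48, goal=(0, 47)):
        self.rows, self.cols, self.goal = rows, cols, goal
        self.walls = set()  # added barriers; the agent cannot sense them

    def step(self, state, action):
        dr, dc = self.ACTIONS[action]
        nxt = (state[0] + dr, state[1] + dc)
        # Moves into walls or off the grid leave the agent in place.
        if nxt in self.walls or not (0 <= nxt[0] < self.rows and 0 <= nxt[1] < self.cols):
            nxt = state
        reward = 1.0 if nxt == self.goal else 0.0  # reward only at the goal
        return nxt, reward

env = GridEnv()
state, reward = env.step((0, 46), "right")  # step onto the goal
```

Because barriers are invisible, the agent only discovers them by bumping into them and staying in place, which is why the model of the environment must be learned and updated online.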
12. Method
▪ Computational model of place cell and reinforcement learning
[Figure: start-to-goal trajectory; abstracted spatial hierarchy (6 levels); MBRL for forward sweeps; hierarchical planning process]
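The hierarchical planning process can be sketched as coarse-to-fine search: first plan a route over abstract regions, then plan detailed steps only within the regions on that route. The two-level hierarchy and all names below are simplifying assumptions (the model uses 6 abstraction levels).

```python
from collections import deque

def shortest_path(neighbors, start, goal):
    """Breadth-first search over a graph given by a neighbors() function."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        node = frontier.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors(node):
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None

def hierarchical_plan(region_of, region_neighbors, cell_neighbors, start, goal):
    # 1) Coarse plan over abstract regions.
    route = shortest_path(region_neighbors, region_of(start), region_of(goal))
    allowed = set(route)
    # 2) Fine plan restricted to cells whose region lies on the coarse route.
    restricted = lambda c: [n for n in cell_neighbors(c) if region_of(n) in allowed]
    return shortest_path(restricted, start, goal)
```

The efficiency gain is that the fine search never expands cells outside the coarse route, which mirrors the paper's finding that the hierarchical agent uses far fewer computational resources than flat MBRL.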
13. Hippocampal Lesion Simulation
▪ The agent with a simulated vH lesion learns the task more slowly than the unimpaired agent
▪ The agent with a simulated dH lesion has worse asymptotic performance
14. Efficient Spatial Navigation and Adaptation
▪ 10 trials in an open arena → maze
▪ The hierarchical approach adapted more quickly and used far fewer computational resources than the standard MBRL algorithm
15. Trapped by Learning
▪ The non-hierarchical approach is slow to adapt to the added boundaries
▪ The conventional MBRL algorithm requires many steps and much computation to adapt
[Figure: occupancy density of trajectories]
16. Probabilistic Reward Locations
▪ The reward was placed randomly on each trial at one of four locations in a 7 × 7 state environment
▪ The conventional MBRL algorithm could not solve this task
17. Conclusion
▪ The hierarchical representation by place cells enhances goal-directed navigation efficiency, even in a novel environment
▪ The hippocampus’s hierarchical representation of space can support computationally efficient learning and adaptation that may not be possible otherwise
18. Discussion
▪ Problems of previous research on MBRL models
▪ Previous goal-directed navigation models based on neuromorphic neural networks represent the environment as a non-hierarchical space of place cells
▪ The MBRL paradigm with place cells could be adapted to a neuromorphic neural network model based on place cells
▫ State: sensory information → spikes of place cells
▫ Action and transition probability → STDP learning rule between place cells
19. Discussion
▪ The authors assume that place cell activity arises from grid cell activity, and the model assumes that place cells are remapped by new rewards
[Figure: global remapping in place cells; grid structure of grid cells in a novel environment]