1. Representation of object position in various frames
of reference using robotic simulator
Gain Modulation
CNC seminar 5.12.2012
Student: Marcel Švec
Supervisor: doc. Ing. Igor Farkaš, PhD.
2. Annotation
• Humans use several egocentric frames of reference for effective sensorimotor coordination (eye-hand): eye-centered, head-centered, body-centered location…
• How does the brain carry out computations such as coordinate transformations? How are sensory inputs translated into motor outputs?
• Our aim is to implement and evaluate a neural network model capable of coordinate transformation
• The goal of the design is to create an internal representation of the surrounding space in a simulated agent
• We use the iCub robotic simulator for experiments
• Study of area 7a (visual and eye-position neurons); this area most likely performs spatial transformations
3. Experiment One – introduction
• Inspired by the article:
• A back-propagation programmed network that simulates response
properties of a subset of posterior parietal neurons. David Zipser,
Richard A. Andersen (Nature, 1988) [http://dx.doi.org/10.1038/331679a0]
• Excerpt:
• Experiments in macaque monkeys (proposing that area 7a contains visual and eye-position neurons)
• Experimental results from area 7a: eye-position neurons (15%), visual neurons (21%), combination (57%)
• Interaction between eye-position and visual responses (gain fields)
Image source: http://en.wikipedia.org/wiki/Brodmann_area_7
4. Experiment One – spatial gain fields
• Determining the effect of eye position on receptive fields
• visual stimulus always present at the same retinal location
• the monkey's head is fixed and the monkey fixates on a point f
(Zipser, Andersen 1988)
5. Experiment One – neural network model
Proposed neural network model:
• Three-layer, trained with back-propagation
• Input: visual stimulus and eye position
• Output: head-centered coordinates
Model and experimental retinal receptive fields were remarkably similar (Zipser, Andersen 1988)
6. Experiment One – dataset
• Input:
• eye position – vertical and horizontal orientation (angle)
• visual stimulus – images from the left and right eye (processed)
• Output:
• Body-referenced target position expressed by horizontal and vertical slope
• iCub generator:
• controlling iCub (eye limits)
• where to put an object (camera properties)
• eye rotation (keeping the object in the FOV)
• what size (scaling)
• processing (color filter)
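The dataset turns angles (eye position, target slope) into activities of several neurons. The slides do not say how this is done, so the following is only a minimal sketch of one common option, a Gaussian population code with a center-of-mass readout. The angular range, the 5-degree spacing, the tuning width, and the function names are all assumptions made for illustration.

```python
import math

# Hypothetical layout: 21 units with preferred angles 5 degrees apart
# over [-50, 50]. These numbers are illustrative, not taken from the slides.
CENTERS = [-50.0 + 5.0 * i for i in range(21)]
SIGMA = 5.0  # assumed tuning width in degrees


def encode_angle(angle_deg, centers=CENTERS, sigma=SIGMA):
    """Return one Gaussian-tuned activation per preferred angle."""
    return [
        math.exp(-((angle_deg - c) ** 2) / (2.0 * sigma ** 2))
        for c in centers
    ]


def decode_angle(activations, centers=CENTERS):
    """Recover the angle as the activity-weighted mean of preferred angles."""
    total = sum(activations)
    return sum(a * c for a, c in zip(activations, centers)) / total


if __name__ == "__main__":
    code = encode_angle(12.0)
    print(len(code), decode_angle(code))
```

With such a code, neighbouring angles produce overlapping activity patterns, which may make the regression target smoother than a one-hot encoding would.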
7. Experiment One – training, testing, results
• FANN – Fast Artificial Neural Network Library
• C, multilayer neural networks, back-propagation training (incremental, RPROP, Quickprop, batch), cross-platform,
bindings to >15 languages…
• Network model:
• Input layer = eye_tilt + eye_version + left_eye_image + right_eye_image = 11 + 21 + 64*48 + 64*48 = 6176 neurons
• Hidden layer = 250 neurons
• Output layer = x-slope + y-slope = 10 + 10 = 20 neurons (every 20 degrees)
• So far best results:
• Sigmoids, steepness hidden = 0.05 (1/20), output = 0.0715 (1/14)
• Algorithm: RPROP (batch, does not use a learning rate)
• 93 epochs on 1000 patterns, MSE < 0.0001, training took less than 15 minutes; by comparison, incremental training took 200 epochs and 53 minutes to reach MSE < 0.0017
• Average error: x < 4, y < 3.3 degrees, standard deviations: x = 3.6, y = 3.2
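The layer sizes above can be checked with a few lines of arithmetic. The sketch below recomputes the input size, estimates the number of trainable connections, and shows what a small steepness does to a sigmoid. It assumes a fully connected network with one bias unit feeding each non-input layer, and it writes the sigmoid as 1 / (1 + exp(-2·s·x)), which is our reading of the FANN documentation and should be checked against the library version in use. The function names are illustrative.

```python
import math

IMAGE_PIXELS = 64 * 48                 # one processed camera image
N_INPUT = 11 + 21 + 2 * IMAGE_PIXELS   # eye_tilt + eye_version + two images
N_HIDDEN = 250
N_OUTPUT = 10 + 10                     # x-slope + y-slope


def connection_count(n_in, n_hidden, n_out):
    """Weights of a fully connected 3-layer net, assuming one bias unit
    feeding the hidden layer and one feeding the output layer."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


def sigmoid(x, steepness):
    """Logistic function with a steepness parameter s (assumed form)."""
    return 1.0 / (1.0 + math.exp(-2.0 * steepness * x))


if __name__ == "__main__":
    print("input neurons:", N_INPUT)
    print("connections:", connection_count(N_INPUT, N_HIDDEN, N_OUTPUT))
    # With steepness 0.05 the unit is far from saturation at a net input of 20.
    print("sigmoid(20, 0.05):", sigmoid(20.0, 0.05))
```

A low steepness keeps hidden units in their near-linear range even when thousands of pixel inputs are summed, which is one plausible reason why such small values worked best here.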
8. Experiment One – results
Figure: sorted errors on the testing data
• A possible cause of the inaccuracies is the nature of the patterns: objects of different size, shape and orientation may appear at the same location, which means that inputs that differ in visual stimulus (size, shading) may require the same output.
• Solutions (open): another hidden layer, a different network model?
Figure: comparison of actual and desired values of the output neurons
9. Gain fields
Mapping the receptive field of a parietal neuron
Stimulus is present at the same retinal location (head turned left)
Response of the neuron when the head is turned right and left:
• location and shape remain the same
• amplitude (gain) changes
Salinas and Sejnowski (2001)
10. Gain fields – computing
• Gain fields of parietal neurons depend on eye position, head position…
Note: although single neurons are gain-modulated, several neurons have to be combined to recover the complete information (population code)
• r = f(x – a) g(y) // single parietal neuron
• x – retinal location of the stimulus
• y – gaze angle
• f – function for the response to the visual stimulus
• a – location of the peak of function f
• r – amplitude of the response
• g – gain field
• R = F(c1x + c2y) // downstream response
• F – peaked function that represents the receptive field
• One set of downstream neurons may represent the quantity (x + y) while another represents (x – y), both sets being driven by the same population of gain-modulated neurons.
• A very general mechanism.
Salinas and Sejnowski (2001)
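The single-neuron formula r = f(x – a) g(y) can be illustrated numerically. The sketch below assumes a Gaussian f and a rectified linear gain field g; both shapes and all parameter values are illustrative choices, not taken from the cited paper. It shows that changing the gaze angle y scales the response without moving the peak of the receptive field.

```python
import math


def f(u, sigma=10.0):
    """Assumed Gaussian visual tuning around the preferred retinal location."""
    return math.exp(-(u * u) / (2.0 * sigma ** 2))


def g(y, slope=0.02, offset=1.0):
    """Assumed gain field: rectified linear function of gaze angle y."""
    return max(0.0, offset + slope * y)


def response(x, y, a=0.0):
    """r = f(x - a) * g(y) for a single gain-modulated neuron."""
    return f(x - a) * g(y)


def peak_location(y, a=0.0, xs=range(-40, 41)):
    """Retinal location (on an integer grid) with the largest response."""
    return max(xs, key=lambda x: response(x, y, a))


if __name__ == "__main__":
    for gaze in (-20, 0, 20):
        print(gaze, peak_location(gaze, a=5.0), response(5.0, gaze, a=5.0))
```

For every gaze angle the peak stays at x = a, while the amplitude follows g(y), which is the signature of a gain field.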
11. Gain modulation and coordinate transformations
• Gain-modulated neurons are suited to add, subtract and perform other operations essential for coordinate transformations.
• Figure:
• 3 eye positions
• left columns – responses of 4 idealized gain-modulated neurons
• right column – responses of a downstream neuron (weighted sum)
Salinas and Sejnowski (2001)
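The idea that a weighted sum of gain-modulated responses can encode x + y can be sketched in a few lines. The example below is a simplified illustration, not the model from the cited paper: it assumes Gaussian tuning for both the retinal location and the gain field, a dense grid of preferred values instead of the 4 neurons in the figure, and hand-set weights that peak where a + b equals a hypothetical target value.

```python
import math

SIGMA = 5.0                     # assumed tuning width
PREFERRED = range(-60, 61, 2)   # assumed grid of preferred values a and b


def gauss(u, sigma=SIGMA):
    return math.exp(-(u * u) / (2.0 * sigma * sigma))


def gain_modulated(x, y, a, b):
    """r = f(x - a) * g(y - b), with f and g both Gaussian (an assumption)."""
    return gauss(x - a) * gauss(y - b)


def downstream(x, y, target):
    """Weighted sum over the population; the weight of unit (a, b)
    is largest when a + b equals the target value."""
    return sum(
        gauss(a + b - target) * gain_modulated(x, y, a, b)
        for a in PREFERRED
        for b in PREFERRED
    )


if __name__ == "__main__":
    # Different (x, y) pairs with the same sum drive the unit almost equally.
    for x, y in ((10, 0), (0, 10), (5, 5), (0, 0)):
        print(x, y, downstream(x, y, target=10))
```

Replacing the weight term with gauss(a - b - target) would give a downstream unit tuned to x – y from the same population, in line with the point that one set of gain-modulated neurons can feed several different readouts.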
12. Gain modulation
• an extremely widespread mechanism
• non-linear combination of information from several sources
• it affects sensitivity (amplitude, gain), not selectivity (tuning or receptive field properties)
• there are indications that it serves as a basis for computations (coordinate transformations, invariant responses)
• also related to
• Object recognition (invariance)
• Focusing attention
• Motion processing
13. Thanks for your attention
http://masterthesis.itbrutus.com
Cited sources:
• Salinas, E. and T. J. Sejnowski (2001). Gain modulation in the central nervous system: Where behavior, neurophysiology, and computation meet. The Neuroscientist 7, pp. 430–440.
• Zipser, D. and R. A. Andersen (1988). A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature 331, pp. 679–684.