UNIVERSITÀ DEGLI STUDI DI TRIESTE
DIPARTIMENTO DI INGEGNERIA E ARCHITETTURA
Laurea Magistrale in Ingegneria Elettronica e Informatica
Curriculum Applicazioni Informatiche
Evolutionary optimization of
bio-inspired controllers for
modular soft robots: synaptic
pruning and spiking neural
networks
Candidate: Giorgia Nadizar
Supervisor: Prof. Eric Medvet
Co-supervisor: Prof. Stefano Nichele
Academic Year 2020/2021
B > \frac{1}{N} \sum_{i=1}^{N} X_i
(be greater than average)
To those who have believed and continue to believe in me
Abstract
Voxel-based Soft Robots (VSRs) are a form of modular soft robots where
sensors and actuators are distributed over the body. They can be controlled
with Artificial Neural Networks (ANNs) optimized by means of neuroevolu-
tion. Even though these robots are strongly inspired by living organisms,
their controllers still lack biological resemblance. Therefore, we experi-
ment along two research directions to address the following question: can the
introduction of bio-inspired features to the ANNs controlling VSRs positively
impact their performance? Namely, we first analyze the effects of synaptic
pruning on the controller of VSRs. Pruning is a technique that consists in the
removal of some synapses from the ANN and could improve its generalization
ability, while also making it more energy efficient. We find that, with some
forms of pruning, a vast portion of synapses can be removed without causing
detrimental effects. Secondly, we shift to a different ANN paradigm based
on Spiking Neural Networks (SNNs), a more biologically plausible model
of ANNs. SNNs appear promising in terms of adaptability thanks to bio-
inspired self-regulatory mechanisms. Our findings are that SNNs can lead
to significantly better performance than traditional ANN models, especially
when VSRs are faced with unforeseen circumstances.
Abstract (IT)
Voxel-based Soft Robots (VSRs) are a type of modular soft robot in which
sensors and actuators are distributed over the body, and which can be
controlled by Artificial Neural Networks (ANNs) optimized by means of
neuroevolution. Although these robots are strongly inspired by living
organisms, their controllers lack biological resemblance. Therefore, in this
work we experiment along two research directions to answer the following
question: can the introduction of biologically inspired features into the
ANNs controlling VSRs positively impact their performance? In particular,
we first analyze the effects of synaptic pruning on the controllers of VSRs.
Pruning is a technique consisting in the removal of some synapses from the
ANN, which could increase its generalization ability as well as its energy
efficiency. We experimentally observe that, with some forms of pruning, a
vast portion of synapses can be removed without causing detrimental effects.
Secondly, we turn to a different ANN paradigm, based on Spiking Neural
Networks (SNNs), an ANN model characterized by greater biological
resemblance. SNNs appear promising in terms of adaptability thanks to
biologically inspired self-regulation mechanisms. Our results show that SNNs
can significantly improve on the performance of traditional ANN models,
especially when VSRs are placed in unforeseen circumstances.
Contents
Abstract i
Abstract (IT) ii
Introduction xii
1 Background 1
1.1 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Multi Layer Perceptron . . . . . . . . . . . . . . . . . 2
1.1.2 Spiking Neural Networks . . . . . . . . . . . . . . . . . 3
1.2 Evolutionary Algorithms . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 General scheme . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Some examples . . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Neuroevolution . . . . . . . . . . . . . . . . . . . . . . 6
2 Voxel-based Soft Robots 7
2.1 VSR morphology . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 VSR controller . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Sinusoidal controller . . . . . . . . . . . . . . . . . . . 9
2.2.2 Centralized neural controller . . . . . . . . . . . . . . . 9
2.2.3 Distributed neural controller . . . . . . . . . . . . . . 10
2.3 Physical realizations . . . . . . . . . . . . . . . . . . . . . . . 13
I The effects of Synaptic Pruning on Evolved Neural Con-
trollers for Voxel-based Soft Robots 15
3 Related work 16
3.1 Synaptic pruning in the nervous system . . . . . . . . . . . . 16
3.2 Pruning in ANNs . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 Pruning ANNs in the context of statistical learning . . . . . . 18
3.4 Pruning ANNs in the context of neuroevolution . . . . . . . . 18
3.5 Pruning biologically-inspired ANNs . . . . . . . . . . . . . . . 19
4 Pruning techniques 20
5 Experiments and results 23
5.1 Static characterization of pruning variants . . . . . . . . . . . 24
5.2 RQ1: impact on the evolution . . . . . . . . . . . . . . . . . . 28
5.3 RQ2: impact on the adaptability . . . . . . . . . . . . . . . . 32
5.4 Summary of results . . . . . . . . . . . . . . . . . . . . . . . . 34
II Evolved Spiking Neural Networks for controlling Voxel-
based Soft Robots 36
6 Related work 37
7 Spiking Neural Networks as controllers 39
7.1 Spiking Neuron models . . . . . . . . . . . . . . . . . . . . . . 39
7.1.1 Integrate and Fire model . . . . . . . . . . . . . . . . 39
7.1.2 Izhikevich model . . . . . . . . . . . . . . . . . . . . . 41
7.2 Homeostasis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
7.3 Spike Timing Dependent Plasticity . . . . . . . . . . . . . . . 43
7.4 Data encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
7.4.1 Spike trains representation . . . . . . . . . . . . . . . . 45
7.4.2 Input and output converters . . . . . . . . . . . . . . . 47
8 Experiments and results 50
8.1 Input/Output converters tuning . . . . . . . . . . . . . . . . . 51
8.2 RQ1 & RQ2: effectiveness and adaptability of SNNs . . . . . 54
8.2.1 Velocity as fitness measure . . . . . . . . . . . . . . . . 55
8.2.2 Efficiency as fitness measure . . . . . . . . . . . . . . . 68
8.3 RQ3: exploiting modularity . . . . . . . . . . . . . . . . . . . 70
8.4 RQ4: behavioral analysis . . . . . . . . . . . . . . . . . . . . . 73
8.4.1 Behavior features . . . . . . . . . . . . . . . . . . . . . 74
8.4.2 Resulting behaviors . . . . . . . . . . . . . . . . . . . . 75
8.5 Summary of results . . . . . . . . . . . . . . . . . . . . . . . . 77
Conclusions 78
List of Figures
1.1 Graph representation of a fully-connected feed-forward ANN
with three input neurons, two hidden layers with five neurons
each, and three output neurons. . . . . . . . . . . . . . . . . . 2
1.2 Schematic representation of a McCulloch and Pitts neuron
(perceptron). . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Hebbian learning rules. The graphs illustrate the percentage
adjustment of a weight ∆w(%) dependent on the relative tim-
ing between input and output spikes ∆t (image taken from [41]). 4
1.4 General scheme of an EA [16]. . . . . . . . . . . . . . . . . . . 5
2.1 Frames of the two VSR morphologies used in the experiments.
The color of each voxel encodes the ratio between its current
area and its rest area: red indicates contraction, yellow rest
state, and green expansion. The circular sector drawn at the
center of each voxel indicates the current sensed values: sub-
sectors represent sensors and are, where appropriate, inter-
nally divided in slices according to the sensor dimensionality
m. The rays of the vision sensors are shown in red. . . . . . 8
2.2 The mechanical model of the voxel. The four masses are de-
picted in gray, the different components of the scaffolding are
depicted in blue, green, red, and orange, and the ropes are
depicted in black (image taken from [47]). . . . . . . . . . . . 8
2.3 A schematic representation of the centralized controller for a
3×1 VSR with two sensors in each voxel. Blue and red curved
arrows represent the connection of the MLP with inputs (sen-
sors) and outputs (actuators), respectively. . . . . . . . . . . 11
2.4 A schematic representation of the portion of the distributed
neural controller corresponding to one single voxel with two
sensors and nc = 1 communication contact per side. Blue
and red curved arrows represent the connection of the MLP
with inputs (sensors and input communication contacts) and
outputs (actuator and output communication contacts), re-
spectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Some examples of VSR physical implementations. . . . . . . . 14
4.1 Graphical representation of the scopes taken into considera-
tion in the pruning algorithm. . . . . . . . . . . . . . . . . . . 21
5.1 Mean absolute difference e between the output of a pruned
ANN and the output of the corresponding unpruned ANN
vs. the pruning rate ρ, for different ANN structures and with
different pruning criteria (color) and scopes (linetype). . . . 25
5.2 Comparison of the output of pruned and unpruned versions
of two ANNs of different structures: ninput = 10, nlayers = 0,
above, and ninput = 100, nlayers = 2, below. . . . . . . . . . . 27
5.3 Fitness vx (median with lower and upper quartiles across the
10 repetitions) vs. pruning rate ρ, for different pruning criteria
(color), VSR morphologies (plot row), and ANN topologies
(plot column). . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.4 Median, lower quartiles, and upper quartiles of validation ve-
locity vx vs. validation pruning rate ρ of individuals evolved
with and without pruning for different ANN topologies and
VSR morphologies. . . . . . . . . . . . . . . . . . . . . . . . 32
5.5 Different types of terrains employed for measuring VSR adapt-
ability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.6 Median, lower quartiles, and upper quartiles of validation ve-
locity vx vs. validation pruning rate ρ averaged across valida-
tion terrains for different pruning criteria, VSR morphologies,
and ANN topologies. . . . . . . . . . . . . . . . . . . . . . . 34
7.1 Comparison of existing spiking neuron models in terms of
biological plausibility and computational cost (image taken
from [33]). The y axis describes the biological plausibility by
the number of biological features that each model implements,
whereas the x axis accounts for the implementation cost in
terms of number of floating point operations (# of FLOPS)
needed to simulate the model for 1 ms. . . . . . . . . . . . . . 40
7.2 Different firing patterns resulting from different combinations
of values for a, b, c, and d. Electronic version of the fig-
ure and reproduction permissions are freely available at www.
izhikevich.com. . . . . . . . . . . . . . . . . . . . . . . . . . 42
7.3 Schematic representation of the controller for VSRs with its
inputs and outputs at each simulation time step k. . . . . . . 44
7.4 Schematic representation of the MLP controller for VSRs with
its inputs and outputs at each simulation time step k. . . . . 45
7.5 Schematic representation of the SNN controller for VSRs with
its inputs and outputs at each simulation time step k. . . . . 45
7.6 Graphical representation of the encoding schemes and com-
parison on how they work on a sample spike train. The first
row shows the original spike train, the second row the encod-
ing, and the last row how the spike train would be decoded. . 46
7.7 A schematic representation of the SNN controller together
with the input and output converters for a 3 × 1 VSR with
one sensor in each voxel. Blue and red curves represent the
links between the converters and inputs or outputs respectively. 48
8.1 Schematic representation of the configuration employed for
tuning input and output converters parameters. The schema
represents a LIF neuron preceded and followed by converters.
The input converter is fed with a numerical signal and outputs
spike trains to pass on to the neuron (red), which in turn
outputs spike trains that are converted to a numerical signal
by the output converter (blue). . . . . . . . . . . . . . . . . . 51
8.2 Input (dashed) and output (continuous) signals for the system
of Figure 8.1 with the configurations of Table 8.1 (one per
column). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.3 Box plots of the fitness vx distributions of the best VSRs at
the end of evolution for different ANN topologies (plot col-
umn), VSR morphologies (plot row), and types of controller
(color). Above the box plots the p-values resulting from a
Mann-Whitney U test with the null hypothesis that the dis-
tribution originated from MLP controllers is the same as each
one generated by SNN controllers. . . . . . . . . . . . . . . . 57
8.4 Median of the fitness vx of the best individuals during evo-
lution for different ANN topologies (plot column), VSR mor-
phologies (plot row), and types of controller (color). . . . . . . 59
8.5 Box plots of the average vx on unseen terrains of the best indi-
viduals resulting from evolution, for different ANN topologies
(plot column), VSR morphologies (plot row), and types of
controller (color). Above the box plots the p-values resulting
from a Mann-Whitney U test with the null hypothesis that
the distribution originated from MLP controllers is the same
as each one generated by SNN controllers. . . . . . . . . . . . 60
8.6 Raster plots of the spikes produced by the neurons of two
neural controllers with 2 hidden layers during the locomotion
of the VSR (biped) on a flat terrain. Each row corresponds
to a neuron and each color indicates the layer in the SNN the
neuron belongs to. . . . . . . . . . . . . . . . . . . . . . . . . 62
8.7 Membrane potential v and threshold potential vth evolution
over time for a LIF and a LIFH neuron fed with the same
inputs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
8.8 Box plot of velocities distributions for SNN with lmem = 5
and nlayers = 2 evolved with and without homeostasis (color)
and re-assessed on a flat terrain with and without homeostasis
(plot column). . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
8.9 VSR morphologies employed to validate the performance of
LIFH-SNN. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
8.10 Distribution of vx of the best individuals after evolution for
the VSR morphologies of Figure 8.9 (one per column) con-
trolled by a MLP and LIFH-SNN (color). The value above
each couple of box plots is the p-value resulting from a Mann-
Whitney U test with the null hypothesis that the distributions
of the two box plots are the same. . . . . . . . . . . . . . . . 66
8.11 Box plots of the average vx on unseen terrains of the best indi-
viduals after evolution for the VSR morphologies of Figure 8.9
(one per column) controlled by a MLP and LIFH-SNN (color).
The value above each couple of box plots is the p-value result-
ing from a Mann-Whitney U test with the null hypothesis that
the distributions of the two box plots are the same. . . . . . . 66
8.12 Distribution of vx of the best individuals after evolution for
sensor configurations (a) and (b) (one per column) controlled
by a MLP and LIFH-SNN (color). The value above each cou-
ple of box plots is the p-value resulting from a Mann-Whitney
U test with the null hypothesis that the distributions of the
two box plots are the same. . . . . . . . . . . . . . . . . . . . 67
8.13 Box plots of the average vx on unseen terrains of the best indi-
viduals after evolution for sensor configurations (a) and (b) (one
per column) controlled by a MLP and LIFH-SNN (color). The
value above each couple of box plots is the p-value resulting
from a Mann-Whitney U test with the null hypothesis that
the distributions of the two box plots are the same. . . . . . . 67
8.14 Distributions of efficiencies ex and velocities vx of the best
VSRs at the end of evolution for MLP and LIFH-SNN con-
trollers. The value above each couple of box plots is the p-
value resulting from a Mann-Whitney U test with the null
hypothesis that the distributions of the two box plots are the
same. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
8.15 Distributions of average efficiencies ex and velocities vx on
new terrains of the best evolved VSRs for MLP and LIFH-
SNN controllers. The value above each couple of box plots is
the p-value resulting from a Mann-Whitney U test with the
null hypothesis that the distributions of the two box plots are
the same. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
8.16 Box plots of the velocities vx of the best individuals at the
end of evolution for each sensor configuration and controller
architecture (plot column), and controller type (color). The
value above each couple of box plots is the p-value resulting
from a Mann-Whitney U test with the null hypothesis that
the distributions of the two box plots are the same. . . . . . . 72
8.17 Box plots of the velocities vx of the best individuals on unseen
terrains for each sensor configuration and controller architec-
ture (plot column), and controller type (color). The value
above each couple of box plots is the p-value resulting from a
Mann-Whitney U test with the null hypothesis that the dis-
tributions of the two box plots are the same. . . . . . . . . . . 73
8.18 Scatter plot of the first two components resulting from the
PCA analysis for the centralized controller architecture for
different types of neural controllers (color). The velocity vx
of the VSR is proportional to the size of the bubble. . . . . . 76
8.19 Scatter plot of the first two components resulting from the
PCA analysis for the homo-distributed controller architecture
for different types of neural controllers (color). The velocity
vx of the VSR is proportional to the size of the bubble. . . . . 76
8.20 Scatter plot of the first two components resulting from the
PCA analysis for the centralized controller architecture for
different sensor configurations (shape) and neural controllers
(color). The velocity vx of the VSR is is proportional to the
size of the bubble. . . . . . . . . . . . . . . . . . . . . . . . . 77
List of Tables
8.1 Parameter configurations chosen for experimenting with input
and output converters. . . . . . . . . . . . . . . . . . . . . . . 52
8.2 Number of parameters to be optimized by the EA for each
controller and ANN topology. Note that the sinusoidal con-
troller does not depend on the ANN size, but is inserted in
this table for easing references. The number of weights to op-
timize for the MLP is always slightly larger than for SNNs as
it includes each neuron bias, which SNNs do not have. . . . . 56
8.3 Number of parameters to be optimized by the EA for each
controller and VSR morphology. . . . . . . . . . . . . . . . . . 65
8.4 Number of parameters to be optimized by the EA for each
controller and sensor equipment. . . . . . . . . . . . . . . . . 68
8.5 Number of parameters to be optimized by the EA for each
VSR combination of distributed controller, sensor configura-
tion and NN type. . . . . . . . . . . . . . . . . . . . . . . . . 71
List of Algorithms
1 The algorithm for pruning a vector θ of ANN parameters given
the parameters scope, criterion, and ρ. . . . . . . . . . . . . . 21
2 The simple EA used for neuroevolution. . . . . . . . . . . . . 29
Introduction
Through evolution, nature has come up with unique solutions to a wide
range of difficult problems, among which life is surely the most noteworthy.
Hence, various disciplines have taken inspiration from nature to handcraft solu-
tions to many scientific and engineering challenges. The field of robotics is
no exception, as the design of artificial agents is often influenced by how
their natural counterparts work. Namely, natural organisms generally adapt
well when faced with unforeseen circumstances, i.e., new environments, and
such a trait is of paramount importance for robots, which should ideally be as
robust and general as possible.
In this context, Voxel-based Soft Robots (VSRs), a form of modular
robots composed of aggregates of soft cubes (voxels), promise to be suitable
for achieving better generalization performance than their rigid equivalents.
As a matter of fact, VSRs display two properties that make them intrin-
sically more adaptable: (a) softness, which is strongly related
to compliance (observed in innumerable living beings), and (b) modularity,
that can be exploited to produce favorable shapes and react to partial damage
to the body of the agent. In addition, other features strongly bind
them to biological agents: (a) they have sensors to perceive their status
and the environment around them, (b) they move thanks to the actuation
of the voxels, which contract and expand similarly to muscular tissue,
and (c) they can process sensory information to produce actuation by
means of an Artificial Neural Network (ANN). A broad description of VSRs
is provided in Chapter 2.
Due to their high expressiveness and the complex interactions between
their components, VSRs are unfeasible to engineer by hand. For this reason,
designers often rely on optimization to devise morphologies or controllers
that are suitable for a given task. In this work, we focus on the optimization
of VSR controllers applying evolutionary optimization. Chapter 1 is hence
dedicated to providing the reader with some background knowledge on topics
like evolutionary optimization and ANNs.
The controllers of VSRs are currently based on fully-connected
feed-forward ANNs, also known as Multi Layer Perceptrons (MLPs),
which very coarsely approximate biological Neural Networks (NNs), sacrific-
ing biological resemblance in favor of simplicity and lower computational costs.
Therefore, the aim of this work is to introduce biologically inspired features
to the controllers of VSRs, with the purpose of achieving some of the positive
traits that characterize living beings’ brains such as high energy efficiency
and adaptability.
We target the goal along two different research directions, respectively
in Part I and Part II. First, we introduce the biologically inspired technique
of synaptic pruning of the controller, while leaving the computational model
untouched. It is, in fact, widely recognized that pruning in biological neural
networks plays a fundamental role in the development of brains and their
ability to learn. Hence, the target of this part of the work is to investigate
whether pruning can improve the adaptability of artificial agents, while also
easing computation. A broad overview of previous works encompassing prun-
ing is provided in Chapter 3, while a more detailed description of pruning
techniques is available in Chapter 4. Our findings, described in Chapter 5,
are that, with some forms of pruning, a large portion of the connections
can be pruned without strongly affecting robot capabilities. In addition, we
observe sporadic improvements in generalization ability.
As a second part of this work, we shift the computational model of the
ANN to Spiking Neural Networks (SNNs), which mimic natural NNs more
accurately. The main idea behind the model is that, just like biological neu-
rons, Spiking Neurons do not transmit on each propagation cycle, but they
only fire when their membrane potential exceeds a given threshold. There-
fore, information is encoded in the temporal distribution of spikes trans-
mitted by neurons. Peculiar to such NNs is the possibility of enabling some
self-regulatory mechanisms, such as homeostasis, which, together with some
forms of learning, could contribute to dramatically increasing the general-
ization ability of robots. Previous works related to the usage of SNNs are
described in Chapter 6, whereas in Chapter 7 we give an insight into the
challenges faced when employing SNNs as robotic controllers. The results
obtained are described in Chapter 8 and show that SNNs with homeostasis
not only significantly outperform MLPs throughout evolution, but also dis-
play a much higher adaptability to unforeseen circumstances, which is one of
the most desirable yet most elusive traits for artificial agents.
All in all, the outcomes of this work demonstrate how enhancing the
biological resemblance of VSRs can have beneficial effects on the overall
performance achieved. This work could serve as a starting point to improve
the biological plausibility of other aspects of the VSRs, e.g., the sensors, and
to investigate the relationships occurring between modularity and biological
resemblance, e.g., moving towards some primitive form of robotic tissue.
Chapter 1
Background
The aim of this introductory chapter is to provide the reader with enough
background knowledge to successfully understand the techniques utilized in
this work.
1.1 Neural Networks
Artificial Neural Networks (ANNs) are computational models that take in-
spiration from biological neural networks (NNs). ANNs are composed of
artificial neurons interconnected by weighted connections (weights) that re-
semble the different intensities of synaptic contact. Neurons are modeled
as electrically excitable nerve cells which pass electrical signals along the
synapses [50].
ANNs attempt to mimic biological neural circuits in order to reflect their
behavior and adaptive features. Hence, they are often used for solving diffi-
cult problems, like modeling complex relationships between inputs and out-
puts, or finding patterns in data, where hand-coding a solution might not be
feasible.
Feed-forward ANNs are ANNs where information flows in only one di-
rection and cycles are not allowed. Such ANNs are organized into layers:
(a) input, (b) hidden, and (c) output layers. The input layer receives infor-
mation from some source and forwards it to neurons in the hidden layers,
until it reaches the neurons in the output layer. Figure 1.1 displays a fully-
connected feed-forward ANN architecture.
ANNs mainly differ in the model used to compute the neurons' outputs
and in the encoding of information: the next two sections, Section 1.1.1 and
Section 1.1.2, provide a brief overview of the two models employed in this
work. Note that since we are utilizing ANNs as controllers for robotic agents,
we introduce the concept of time in the description of the computation of
the outputs given some inputs.
Figure 1.1: Graph representation of a fully-connected feed-forward ANN
with three input neurons, two hidden layers with five neurons each, and three
output neurons.
1.1.1 Multi Layer Perceptron
The McCulloch and Pitts neuron model (perceptron) is one of the most
commonly used [50] for computing neuron outputs. In this model, at each
propagation cycle, i.e., at every time step k, each neuron calculates the
weighted (according to the incoming weights {w1, w2, · · · , wn}) sum of inputs
{x1(k), x2(k), . . . , xn(k)}, s(k), which is then passed through some activation
function ϕ, to compute the output y(k), as described by Equation (1.1).
Figure 1.2 displays a schematic representation of the model.
y(k) = \varphi(s(k)) = \varphi\left( \sum_{i=1}^{n} w_i x_i(k) \right) \qquad (1.1)
A fully-connected feed-forward ANN consisting of perceptrons is called a
Multi Layer Perceptron (MLP).
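To make Equation (1.1) concrete, the following is a minimal sketch of a perceptron and of an MLP forward pass in Python with NumPy. It is only illustrative: names such as mlp_forward and the choice of tanh as the activation function ϕ are our assumptions, not part of the software used in this work.

    import numpy as np

    def perceptron(x, w, phi=np.tanh):
        # Equation (1.1): weighted sum of the inputs, then activation.
        return phi(np.dot(w, x))

    def mlp_forward(x, weights, phi=np.tanh):
        # weights[l] has shape (n_out, n_in + 1): the extra column is the
        # bias, applied by appending a constant 1 to the layer input.
        for w in weights:
            x = phi(w @ np.append(x, 1.0))
        return x

    # Example: the 3-5-5-3 fully-connected feed-forward ANN of Figure 1.1.
    rng = np.random.default_rng(0)
    weights = [rng.normal(size=(5, 3 + 1)),
               rng.normal(size=(5, 5 + 1)),
               rng.normal(size=(3, 5 + 1))]
    y = mlp_forward(np.array([0.1, -0.2, 0.3]), weights)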
Figure 1.2: Schematic representation of a McCulloch and Pitts neuron
(perceptron).
1.1.2 Spiking Neural Networks
Spiking Neural Networks (SNNs) [32, 70] are a more biologically plausible
type of ANNs, which process information in the form of sequences of spikes.
Differently from neurons in regular ANNs, spiking neurons do not transmit
on each propagation cycle. Instead, similarly to biological neurons, they have
a membrane potential that changes each time the neuron receives a spike in
input, and, whenever the membrane potential surpasses a given threshold,
the neurons produce an output in the form of a spike. Hence, unlike MLPs,
SNNs have an internal state, so they can be considered a form of dynamical
system, where the output at time t depends not only on the input at time t,
but also on the state of the system.
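As an illustration of this statefulness, here is a minimal sketch of a leaky Integrate and Fire neuron update (the model detailed later in Section 7.1.1), written in Python; the parameter values are placeholders, not those used in our experiments.

    def lif_step(v, input_current, v_th=1.0, v_rest=0.0, leak=0.1):
        # Integrate the incoming current, with a leak towards the rest potential.
        v = v + input_current - leak * (v - v_rest)
        if v >= v_th:          # threshold crossed: emit a spike...
            return v_rest, 1   # ...and reset the membrane potential
        return v, 0

    # The output at a given step depends on the accumulated state v,
    # not only on the current input:
    v, spikes = 0.0, []
    for current in [0.4, 0.4, 0.4, 0.0, 0.4]:
        v, s = lif_step(v, current)
        spikes.append(s)   # yields [0, 0, 1, 0, 0]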
In addition, it is worth mentioning that SNNs incorporate a temporal
dimension, as information is encoded in the time distribution of the spike
trains.
SNNs share with biological networks the peculiar trait of enabling some
self-regulatory mechanisms, which should lead to higher performance and in-
creased adaptability.
Neuroplasticity Neuroplasticity [49] is the continuous process of change
of the synapses in the brain in response to sensory stimuli such as pain,
pleasure, smell, sight, taste, hearing, and any other sense a living organism
can have. Following this paradigm, each synapse in SNNs is associated with
a learning rule that determines how to adjust its weight, in order to mimic
biological learning mechanisms. Namely, SNNs incorporate learning, in the
form of Spike Timing Dependent Plasticity (STDP) [41], where weights are
modified depending on the relative timing of input and output spikes—this
form of unsupervised learning in ANNs is referred to as Hebbian learning.
Four different Hebbian learning rules are illustrated in Figure 1.3.
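As a sketch of one such rule, the following implements an exponentially decaying, asymmetric Hebbian update of the kind shown in Figure 1.3; the constants a_plus, a_minus, and tau are illustrative assumptions.

    import math

    def stdp_dw(dt, a_plus=0.1, a_minus=0.12, tau=20.0):
        # Relative weight change for a pair of spikes, with
        # dt = t_output - t_input: positive if the input spike precedes
        # the output spike (potentiation), negative otherwise (depression).
        if dt >= 0:
            return a_plus * math.exp(-dt / tau)
        return -a_minus * math.exp(dt / tau)

    # Causal pair (input 5 ms before output): the synapse is strengthened.
    w = 0.5
    w += w * stdp_dw(5.0)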
Homeostasis Homeostasis in biology is a steady equilibrium of physical
and chemical conditions of a living system. In biological brains, homeostasis
is employed to counterbalance the effect of learning [73], which naturally
tends to unbalance networks, strengthening and weakening synapses. Simi-
larly, in SNNs homeostasis acts on neuron thresholds to maintain balance,
since it is desirable that all neurons have approximately equal firing rates [14],
although the number and weights of their inputs may strongly differ. A more
detailed description of homeostasis in SNNs is available in Section 7.2.
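As a rough sketch of the idea (the actual mechanism we use is described in Section 7.2), a neuron threshold can be nudged towards a target firing rate; the update rule and its constants below are illustrative assumptions, not the exact homeostasis model of this work.

    def homeostasis_step(v_th, fired, target_rate=0.1, eta=0.01):
        # Raise the threshold when the neuron fires more often than the
        # target rate, lower it otherwise, balancing firing rates across
        # neurons regardless of the number and weights of their inputs.
        return v_th + eta * ((1.0 if fired else 0.0) - target_rate)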
Figure 1.3: Hebbian learning rules. The graphs illustrate the percentage
adjustment of a weight ∆w(%) dependent on the relative timing between
input and output spikes ∆t (image taken from [41]).
1.2 Evolutionary Algorithms
An evolutionary algorithm (EA) [16, 71] is a meta-heuristic population-
based¹ optimization algorithm. The idea behind EAs arises from Charles
Darwin’s theory of evolution [12], where the individuals in a population
compete for a limited set of resources, i.e., under environmental pressure,
which as a consequence causes natural selection, i.e., survival and higher
mating likelihood of the fittest individuals. Given an optimization problem,
the definition of a fitness function to be maximized is required, in order
to evaluate the performance of individuals. As in the natural evolutionary
process, individuals showing higher fitness are more prone to survive and
reproduce, causing a general increase in the overall fitness of the population.
In this context, an individual is a candidate solution (phenotype) for
the considered problem, which can be internally represented (genotype) as
itself or as some well-defined data structure, e.g., a numerical vector. Such
a distinction is drawn not only to enhance the biological resemblance of EAs,
but also to allow reuse of the techniques developed for evolving a certain
genotype, i.e., a certain data structure. Clearly, when using an indirect
encoding, i.e., when the genotype is not the solution itself, a proper genotype-
phenotype mapping must be found.
¹ The algorithm processes a set of multiple candidate solutions simultaneously.
1.2.1 General scheme
The general scheme of an EA is displayed in Figure 1.4. First the population
is initialized, either randomly or according to some background knowledge
on the problem. Then, a subset of the population is selected to generate
offspring: this phase is called parent selection, and resembles the biological
event of mating. Just like in nature, the individual generated by two parents
through recombination or crossover inherits traits from both parents. In
addition to crossover, mutation might be applied to generate the new indi-
vidual, in order to enhance diversity in the population, hence favoring the
exploration of new regions of the solution space for the problem. Note that
in some cases it is also possible to generate new individuals from parents by
solely applying mutation. After the generation of a new set of individuals,
environmental pressure is mimicked by computing the fitness of all the in-
dividuals and deciding on its basis which ones should be promoted to the
next generation and which ones should be removed. The entire process is
repeated until a termination condition is met: most commonly a given fit-
ness is reached or a certain computation limit, e.g., number of births or
generations, is exceeded.
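The scheme of Figure 1.4 can be condensed into the following Python sketch of a generational EA; the operator choices (binary tournament parent selection, truncation-based survivor selection) and the parameter values are placeholders, not the exact EA used later in this work (see Algorithm 2).

    import random

    def evolve(init, fitness, crossover, mutate, pop_size=100, n_gen=50):
        population = [init() for _ in range(pop_size)]
        for _ in range(n_gen):
            # Parent selection: binary tournament on fitness.
            def pick():
                return max(random.sample(population, 2), key=fitness)
            # Recombination and mutation produce the offspring.
            offspring = [mutate(crossover(pick(), pick()))
                         for _ in range(pop_size)]
            # Survivor selection: keep the fittest among parents + offspring.
            population = sorted(population + offspring,
                                key=fitness, reverse=True)[:pop_size]
        return max(population, key=fitness)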
Figure 1.4: General scheme of an EA [16].
1.2.2 Some examples
Evolutionary Strategies Evolutionary strategies (ESs) arise from real-
valued function optimization and hence employ vectors of real numbers as
genotypes [13]. In this context, reproduction takes the form of mutating one
or more of the parent’s gene values (real numbers) via a normally distributed
perturbation N(0, σ) with zero mean and a standard deviation of σ [13].
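For a real-valued genotype, such a mutation is a one-liner; σ = 0.1 is a placeholder value.

    import numpy as np

    def es_mutate(genotype, sigma=0.1, rng=np.random.default_rng()):
        # Perturb every gene with independent N(0, sigma) noise.
        return genotype + rng.normal(0.0, sigma, size=genotype.shape)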
Genetic Algorithms The genetic algorithm (GA) approach uses a ge-
netic-like “universal” fixed-length string representation as genotype for in-
dividuals along with string-oriented reproductive operators for producing
offspring [13]. This is advantageous since few, if any, changes to the GA are
required when changing domain, but it is clear that the effectiveness of a GA
strongly depends on the choice of the genotype-to-phenotype mapping [13].
1.2.3 Neuroevolution
Neuroevolution [76] is the application of EAs on NNs, with the ultimate goal
of optimizing their performance. From the perspective of the EA, the NNs
serve as the phenotype. Different features of NNs can be evolved, such as
weights, topology, learning rules, and activation functions.
Historically, the most common way of evolving NNs is the evolution of
weights. In this case, a NN architecture is fixed first, generally a fully-
connected feed-forward NN, and then the synaptic weights are determined
by means of neuroevolution instead of using gradient-based methods. On
the other hand, evolving topologies enables neuroevolution to explore differ-
ent architectures and adapt them to best suit problems without the need for
human design and predefined knowledge. Similarly, evolving learning rules
enables automatic design and discovery of novel learning rule combinations
within the NN. This technique is especially interesting with regard to Ar-
tificial General Intelligence (AGI) [55], because it can be considered as a
process of “learning how to learn”.
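In the fixed-topology weight-evolution case, the genotype is simply the flat vector of all synaptic weights, and the genotype-phenotype mapping reshapes it into per-layer weight matrices. A minimal sketch under these assumptions, compatible with the mlp_forward sketch of Section 1.1.1:

    import numpy as np

    def decode(genotype, layer_sizes):
        # layer_sizes, e.g., [3, 5, 5, 3]; each neuron also has a bias input.
        weights, start = [], 0
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
            size = n_out * (n_in + 1)
            weights.append(genotype[start:start + size].reshape(n_out, n_in + 1))
            start += size
        return weights

    # Genotype length for a 3-5-5-3 MLP: 4*5 + 6*5 + 6*3 = 68 parameters.
    weights = decode(np.zeros(68), [3, 5, 5, 3])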
Chapter 2
Voxel-based Soft Robots
This chapter provides an introduction to the modular soft robots considered
in this work: first a general overview is given, and then a more spe-
cific characterization of morphological features and state-of-the-art control
technologies is presented. Last, a brief outline of studies aimed at physically
realizing them is provided.
Voxel-based Soft Robots (VSRs) [27] are a kind of modular robot com-
posed of several soft cubes (voxels). In this work we consider a 2-D variant
of VSRs in which voxels are actually squares, rather than cubes, since work-
ing in 2-D makes the simulation more lightweight. However, the findings
described in this study are conceptually portable to the 3-D case.
A VSR has a morphology, or body, and a controller, or brain. The mor-
phology consists of the voxels composing the VSR, arranged in a 2-D grid.
The voxels can be equipped with sensors that can provide the controller
with the information regarding the environment and the VSR itself. The
controller is in charge of determining how the area of each voxel varies over
time, possibly based on the readings of the sensors of the VSR.
2.1 VSR morphology
The morphology of a VSR is an arrangement of voxels, i.e., deformable
squares, organized in a 2-D grid, where two neighboring voxels are rigidly
connected. Figure 2.1 shows two examples of VSR morphologies, both com-
posed of 10 voxels.
To produce movement, the size of each voxel varies over time, similarly
to biological muscles. In particular, the actual size of each voxel depends
on external forces, i.e., forces caused by its interaction with other connected
voxels and the ground, and on an actuation value that causes the voxel to
shrink or expand. Namely, at each simulation time step t = k∆tsim, the
actuation value a^{(k)} is assigned by the controller and is defined in [−1, 1],
(a) Biped. (b) Worm.
Figure 2.1: Frames of the two VSR morphologies used in the experiments.
The color of each voxel encodes the ratio between its current area and its rest
area: red indicates contraction, yellow rest state, and green expansion. The
circular sector drawn at the center of each voxel indicates the current sensed
values: subsectors represent sensors and are, where appropriate, internally
divided in slices according to the sensor dimensionality m. The rays of the
vision sensors are shown in red.
where −1 corresponds to maximum requested expansion and 1 corresponds
to maximum requested contraction.
More precisely, the size variation mechanism depends on the mechanical
model of the voxel, either physically implemented or simulated. In this work,
we experiment with 2D-VSR-Sim [47], which models each voxel with four
masses at the corners, some spring-damper systems, which confer softness,
and ropes, which limit the maximum distance two bodies can have. In
this simulator, actuation is modeled as a variation of the rest-length of the
spring-damper systems. Figure 2.2 provides a schematic representation of
the mechanical model employed for one voxel.
Figure 2.2: The mechanical model of the voxel. The four masses are
depicted in gray, the different components of the scaffolding are depicted
in blue, green, red, and orange, and the ropes are depicted in black (image
taken from [47]).
Moreover, a VSR can be equipped with sensors, which are located in its
voxels. At each time step, a sensor S outputs a sensor reading r_S ∈ [0, 1]^m,
with m being the dimensionality of the sensor type. To ensure the output
is defined in [0, 1]^m, sensors employ a soft normalization of the values, using
the tanh function and rescaling. Here, we consider four types of sensor,
described below, and put at most one sensor of each type in each voxel of
the VSR. For each of the following sensors we consider the average of the
last 5 readings as the current sensed value.
• Sensors of type area sense the ratio between the current area of the
voxel and its rest area (m = 1).
• Sensors of type touch sense if the voxel is in contact with the ground
or not and output a value being 1 or 0, respectively (m = 1).
• Sensors of type velocity sense the velocity of the center of mass of the
voxel along the voxel x- and y-axes (m = 2).
• Sensors of type vision sense the distances towards closest objects along
a predefined set of directions: for each direction, the corresponding
element of the sensor reading rS is the distance of the closest object, if
any, from the voxel center of mass along that direction. If the distance
is greater than a threshold d, it is clipped to d. We use the vision
sensor with the following directions with respect to the voxel positive
x-axis: −π/4, −π/8, 0, π/8, π/4; the dimensionality is hence m = 5.
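As a concrete reading of the soft normalization and of the vision clipping described above, here is a minimal Python sketch; the exact rescaling formula is our assumption, not necessarily the 2D-VSR-Sim implementation.

    import numpy as np

    def soft_normalize(raw):
        # tanh squashes any value into [-1, 1]; rescaling maps it to [0, 1].
        return (np.tanh(raw) + 1.0) / 2.0

    def vision_reading(distances, d=10.0):
        # Clip each per-direction distance to the threshold d, then normalize.
        return soft_normalize(np.minimum(np.asarray(distances), d))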
2.2 VSR controller
2.2.1 Sinusoidal controller
The first and simplest form of controller computes each actuation value
starting from the current time, according to a sinusoidal function [27, 11, 35]
(sinusoidal controller). In particular, the actuation value of the i-th voxel
at time t = k∆tsim is set to:
a_i^{(k)} = \sin\left( 2 \pi f_i k \Delta t_{sim} + \phi_i \right) \qquad (2.1)
where f_i is the frequency, and φ_i is the phase, which may be different among
voxels. Given a shape of n voxels, the vector p = [f_1 . . . f_n, φ_1 . . . φ_n] of
frequencies and phases unequivocally defines a controller of this type. Note
that this controller does not exploit sensor readings, hence it is a non-sensing
controller.
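A minimal Python sketch of Equation (2.1) follows; the value of Δt_sim is a placeholder, and only the vector p of frequencies and phases defines the controller.

    import math

    def sinusoidal_controller(p, k, dt_sim=1 / 60):
        # p = [f_1 ... f_n, phi_1 ... phi_n]; returns one actuation per voxel.
        n = len(p) // 2
        freqs, phases = p[:n], p[n:]
        return [math.sin(2 * math.pi * f * k * dt_sim + phi)
                for f, phi in zip(freqs, phases)]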
2.2.2 Centralized neural controller
To benefit from the sensing abilities of VSRs, Talamini et al. [68] suggested
the usage of an ANN to process information coming from sensors to produce
a meaningful actuation. At each simulation time step k∆tsim this type of
controller processes the vector of sensor readings x(k) (2.2) and outputs a
vector of actuation values (2.3), one for each voxel.
x^{(k)} = \left[ s_1^{(k)} \dots s_n^{(k)} \right] \qquad (2.2)
y^{(k)} = \text{ANN}_\theta\left( x^{(k)} \right) = \left[ a_1^{(k)} \dots a_n^{(k)} \right] \qquad (2.3)
This controller can be called centralized as there is a single central ANN
processing the sensory information coming from each voxel to generate all
the actuation values of the VSR. Given a shape of n voxels, an ANN model,
e.g., MLP or SNN, and an ANN architecture (i.e., the number and size of
hidden layers), the value of θ unequivocally defines a centralized controller.
The number |θ| of controller parameters depends on the overall number of
sensors, the number of voxels, the ANN model, and the ANN architecture.
Figure 2.3 depicts a centralized controller for a simple VSR composed
of three voxels, arranged along the x-axis. In this example, each voxel is
equipped with two sensors and the ANN has one inner layer consisting of 5
neurons. As a result, supposing the ANN model is an MLP, this centralized
controller has |θ| = (6 + 1) · 5 + (5 + 1) · 3 = 53 parameters, the +1 being
associated with the bias.
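The count generalizes to any MLP shape; a small Python sketch reproducing the computation above:

    def mlp_param_count(layer_sizes):
        # One bias per neuron, hence the (n_in + 1) term for each layer.
        return sum((n_in + 1) * n_out
                   for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

    # 3x1 VSR with two sensors per voxel: 6 inputs, 5 hidden neurons, 3 outputs.
    assert mlp_param_count([6, 5, 3]) == 53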
2.2.3 Distributed neural controller
As opposed to the centralized controller, Medvet et al. [45] developed a
distributed controller, which aims at exploiting the intrinsic modularity of
VSRs. The key idea is that each voxel is equipped with an ANN, which
processes local inputs to produce the actuation value for such voxel. In order
to enable the transfer of information along the body of the VSR, neighboring
voxels are connected by means of n_c communication contacts per side. Namely,
each ANN reads the sensor values together with the 4n_c values coming
from adjacent voxels, and in turn outputs an actuation signal and 4nc values
to feed to contiguous voxels.
In detail, each ANN takes as input a vector x(k) built as follows:
x^{(k)} = \left[ s^{(k)} \; i_N^{(k)} \; i_E^{(k)} \; i_S^{(k)} \; i_W^{(k)} \right] \qquad (2.4)
where s^{(k)} are the local sensor readings at time k\Delta t_{sim}, and
i_N^{(k)}, i_E^{(k)}, i_S^{(k)}, i_W^{(k)} are the n_c input communication values
coming from the adjacent voxels placed above, right, below, and left—if the voxel
is not connected to another voxel on a given side, the corresponding vector of
communication values is the zero vector 0 ∈ R^{n_c}. The ANN outputs a vector
y^{(k)} built as follows:
y^{(k)} = \text{ANN}_\theta\left( x^{(k)} \right) = \left[ a^{(k)} \; o_N^{(k)} \; o_E^{(k)} \; o_S^{(k)} \; o_W^{(k)} \right] \qquad (2.5)
Figure 2.3: A schematic representation of the centralized controller for a 3×
1 VSR with two sensors in each voxel. Blue and red curved arrows represent
the connection of the MLP with inputs (sensors) and outputs (actuators),
respectively.
where a^{(k)} is the local actuation value, and o_N^{(k)}, o_E^{(k)}, o_S^{(k)},
o_W^{(k)} are the vectors of n_c output communication values going towards the
adjacent voxels placed above, right, below, and left of the voxel.
Figure 2.4 shows a scheme of a distributed neural controller portion cor-
responding to one single voxel.
Output communication values produced by the ANN of a voxel at k − 1
are used by the ANNs of adjacent voxels at k. Let the subscript x, y denote the
position of a voxel in the VSR grid, then communication inputs and outputs
of adjacent ANNs are related as follows:
i_{x,y,N}^{(k)} = o_{x,y+1,S}^{(k-1)} \qquad i_{x,y,E}^{(k)} = o_{x+1,y,W}^{(k-1)}
i_{x,y,S}^{(k)} = o_{x,y-1,N}^{(k-1)} \qquad i_{x,y,W}^{(k)} = o_{x-1,y,E}^{(k-1)}
The distributed controller in a VSR can be instantiated according to two
Figure 2.4: A schematic representation of the portion of the distributed
neural controller corresponding to one single voxel with two sensors and
nc = 1 communication contact per side. Blue and red curved arrows repre-
sent the connection of the MLP with inputs (sensors and input communica-
tion contacts) and outputs (actuator and output communication contacts),
respectively.
design choices: (a) there could be an identical ANN in each voxel, both in
terms of architecture and weights (homo-distributed), or (b) each voxel can
have its own independent ANN that can differ from others in weights, hid-
den layers, and number of inputs (hetero-distributed). The main differences
between the two proposed configurations regard the optimization process
and the allowed sensor equipment of the VSRs. Namely, for a VSR con-
trolled by a homo-distributed controller, each voxel needs to have the same
amount of sensor readings to pass to the controller, to ensure the number of
inputs fed to the ANN is the same. In addition, evolving a single ANN shared
by all voxels requires less exploration, given the reduced number of parameters
to optimize, but requires more fine-tuning to make it adequate for control-
ling each voxel and achieve a good global performance. On the contrary,
the hetero-distributed architecture leaves more freedom, allowing any sensor
configuration, but has a much larger search space in terms of number of
parameters to optimize.
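To make the message passing of Equations (2.4) and (2.5) concrete, here is a minimal Python sketch of one control step of a distributed controller; ann stands for any function mapping the input vector of Equation (2.4) to the output vector of Equation (2.5), and details such as the dictionary-based grid representation are illustrative assumptions.

    import numpy as np

    N, E, S, W = range(4)  # side indices

    def distributed_step(anns, sensors, prev_out, n_c):
        # anns, sensors, prev_out are dicts keyed by voxel position (x, y);
        # prev_out[(x, y)] is the (4, n_c) array of communication values the
        # voxel emitted at step k - 1.
        zeros = np.zeros(n_c)
        actuations, out = {}, {}
        for (x, y), ann in anns.items():
            # Neighbor inputs: i_N = o_S of the voxel above, and so on.
            i_n = prev_out[(x, y + 1)][S] if (x, y + 1) in anns else zeros
            i_e = prev_out[(x + 1, y)][W] if (x + 1, y) in anns else zeros
            i_s = prev_out[(x, y - 1)][N] if (x, y - 1) in anns else zeros
            i_w = prev_out[(x - 1, y)][E] if (x - 1, y) in anns else zeros
            x_in = np.concatenate([sensors[(x, y)], i_n, i_e, i_s, i_w])
            y_out = ann(x_in)  # Equation (2.5): [a, o_N, o_E, o_S, o_W]
            actuations[(x, y)] = y_out[0]
            out[(x, y)] = y_out[1:].reshape(4, n_c)
        return actuations, out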
2.3 Physical realizations
Physically realizing effective VSRs is still an open challenge due to the dif-
ficulty in combining softness with modularity. Nevertheless, several groups
have achieved concrete implementations of VSRs with different levels of
fidelity to their simulated counterparts (see Figure 2.5 for some examples).
The first attempt at physically building VSRs was performed by Hiller
and Lipson [27], who were driven by the idea of designing and manufacturing
locomotion objects in an automated fashion. Their model was based on 3-D
printed silicon foam rubber voxels, whose actuation was determined by envi-
ronment pressure modulation. Anyway, this approach had two main draw-
backs: the rather unpractical actuation and the lack of modularity caused
by the impossibility of disassembling the artifact.
Two more recent works in the direction of physically realizing a VSR are
those of Kriegman et al. [37] and Sui et al. [67], who rely on the softness of
silicone to build voxels and exploit a pneumatic system to achieve actuation.
Even though such works satisfy both the softness and modularity require-
ments, as silicone voxels can be manually assembled, they still suffer from
practical problems like a complex fabrication phase and cable-dependent ac-
tuation.
The latest and probably most ambitious experiment involves living mat-
ter for building self-renewing bio-compatible machines [36]. This technique is
completely different from the previous ones in all aspects, but is not exempt
from problems, which range from the difficulty of fabrication to the limited
lifetime of such cells.
(a) VSR made of silicone foam rubber voxels [27].
(b) Silicone VSR with pneumatic actuation [37].
(c) VSR made of living matter [36].
Figure 2.5: Some examples of VSR physical implementations.
Part I
The effects of Synaptic Pruning
on Evolved Neural Controllers
for Voxel-based Soft Robots
Chapter 3
Related work
This chapter provides an overview of previous research related to synaptic
pruning. Such works span across different fields, from biological studies (Sec-
tion 3.1 and Section 3.5) to works focused on ANNs (Section 3.2, Section 3.3,
and Section 3.4).
3.1 Synaptic pruning in the nervous system
It is observed that animals with larger brains are more likely to have higher
learning abilities [57]. Intuitively, more neurons and synapses result in
more redundancy and, in turn, the ability to learn more and faster [57].
However, beyond an optimal network size, adding more neurons and
synapses deteriorates learning performance [56]. In addition, maintaining a
large brain is expensive in terms of energy [39]. The developmental stage
of the brain is characterized by hyperconnectivity, i.e., the formation of an
excessive number of synapses. Superfluous synapses must then be removed: in
humans, failure to do so may result in neural diseases, e.g., autism,
epilepsy, or schizophrenia [52]. Neural disorders are particularly affected by
network topology. In particular, it is widely recognized that neural behaviors
beneficial for computation and learning, such as power-law scaling of neu-
ral avalanches (criticality), are dependent on the network connectivity [26].
Synaptic pruning itself is carried out by glial cells.
In humans, synapses are eliminated from birth until the mid-twenties [79].
It has been shown that the process of synaptic pruning results in roughly
half of the synapses being eliminated by puberty, while the performance
of the brain is retained [10]. In particular, the identified elimination strat-
egy resulted in the pruning of weaker synapses first, i.e., deletion based on
synaptic efficacy. Such biological findings raise questions with regard to ANN
controllers for artificial agents, and in particular the identification of suitable
network sizes and number of parameters needed to learn a given task. The
possibility of optimizing networks with regard to learning, robustness to
noise, and other factors such as energetic cost is still an open area of
research. In the following sections, several pruning strategies for ANNs are
reviewed.
3.2 Pruning in ANNs
Pruning in the context of ANNs is a sparsification technique consisting in
the removal of connections (synapses) between neurons. It can be moti-
vated by either efficiency (i.e., we want our network to train or evaluate
faster) or by the belief that pruned ANNs are more robust or generalize bet-
ter [28]. Early attempts at pruning ANNs in the Machine Learning (ML)
environment include Optimal Brain Damage (OBD) [40] and L1-norm loss
regularization [21] adapted from LASSO regression [60].
Pruning techniques for ANNs can be categorized as either structured or
unstructured [3]. Structured pruning removes synapses from well-defined
substructures of an ANN, such as a whole neuron or a convolutional filter
in a Convolutional Neural Network. On the contrary, unstructured pruning
removes connections without concern for the geometry of the ANN. Since
structured pruning can lead to a regular pattern of sparsity on the parame-
ters space, it is usually possible to directly take advantage of this sparsity as
far as computation is concerned (e.g., if we remove one neuron from a fully-
connected ANN, we can effectively reduce the dimension of the weights and
bias matrices, thus leading to immediate computational gains). On the other
hand, with unstructured pruning, the resulting irregular sparsity in the pa-
rameter tensors can be taken advantage of only via dedicated software [42]
(e.g., CUSPARSE [51]) or hardware (e.g., NVIDIA Tesla A100¹).
Hoefler et al. [28] list a large number of heuristics for pruning an ANN.
They can be categorized into (a) data-free heuristics, which in principle
require no model evaluation to apply them, and (b) data-driven heuristics,
which require the ANN to be evaluated on some given data points.
In this work, we will be adopting unstructured pruning solutions, adapt-
ing them to the context of neuroevolution. We will experiment with both
data-free heuristics, namely Least-Magnitude Pruning (LMP) [7, 24], which
prunes parameters exhibiting small magnitude, and data-driven heuristics,
namely Contribution Variance Pruning (CVP) [72], which prunes parame-
ters exhibiting low variance across multiple data instances. Moreover, we
will utilize two pruning schemes similar in concept to CVP which prune con-
nections manifesting low signals across data instances. Eventually, we will
be employing random pruning (as done also in [18] and [78]) in order to
obtain a “control group” to verify whether the results obtained by means of
the pruning schemes we exploited are indeed due to the heuristic and not
to randomness.
¹ See https://blogs.nvidia.com/blog/2020/05/14/sparsity-ai-inference/.
3.3 Pruning ANNs in the context of statistical learning
In the context of statistical learning, ANNs are typically trained iteratively
(e.g., using any variant of stochastic gradient descent), updating the param-
eters after each iteration (e.g., using any variant of backpropagation).
In this background, a very useful resource for exploring various pruning
paradigms is [28]. We can distinguish between different kinds of pruning
techniques depending on whether the application of pruning is performed
during or after the training phase. Usually, in the latter case, a performance
drop is noticed in pruned networks: hence a re-training phase follows with
various heuristics, after which the ANN may even improve the performance
of the unpruned model even at high sparsity rates [24, 42, 18, 58]. In the
literature, it is still a matter of debate what is the most effective re-training
schedule [58, 80, 77], as the pressure is high to find well-performing pruned
ANNs trained in a time-efficient fashion, and how these pruned ANNs com-
pare with respect to the unpruned counterpart [1, 2]. Nevertheless, the effect
of unstructured pruning in helping with generalization is well-known in the
literature (e.g., [58]). On the other hand, pruning techniques acting during
training struggle to keep up with the performance of an analogous unpruned
ANN at high pruning rates (for instance, larger than 90 %), even if recent
advances such as [38] show very promising results.
3.4 Pruning ANNs in the context of neuroevolution
Unlike statistical learning, neuroevolution does not employ iterative training
for ANNs. Rather, ANNs usually go through multiple phases of (a) fitness
evaluation, and (b) variation, either via crossover and/or mutation [66]. One
of the foremost approaches to neuroevolution is NEAT [65], which incor-
porates both crossover and mutation to jointly evolve the topology and the
weights of the ANN.
There exist works applying pruning phases in addition to the ones op-
erated by NEAT or one of its variants. For instance, in [62], pruning is
operated on neural controllers in a manner inspired by OBD. The need for
pruning is motivated by numerical conditioning and empirical observations
that EANT [34], a NEAT variant, was already removing a large number of
parameters in the ANNs.
Recently, Gerum et al. [20] experimented with the application of random
pruning to small neural controllers designed to navigate agents through a
maze, concluding that pruning improved generalization. This study is of
particular interest for our work since it presents an approach and setting similar to our
experiments, although our conclusions are different.
3.5 Pruning biologically-inspired ANNs
In the context of ANNs inspired by biological neural networks, Spiking Neu-
ral Networks (SNNs), driven by the early work of Gerstner and Kistler [19],
represent what has been called the “third generation of Neural Network mod-
els” [43]. Despite inheriting the fully-connected structures typical of MLPs,
they differ greatly from their statistical learning counterpart as (a) the input
is encoded in a temporal rather than spatial structure, and (b) the training is
operated in an unsupervised manner using Hebbian-based parameter update
rules [25], thus detaching from the gradient-based methods of statistical
learning.
Motivated by the aforementioned discoveries on human brain connectiv-
ity, some works have experimented with the application of pruning techniques
to SNNs. For instance, Iglesias et al. [31] experimented with the application
of a pruning heuristic similar to CVP to SNNs, although their work was not
focused on producing high-performing models, but rather on observing the patterns
of connectivity after various phases of pruning. Moreover, Shi et al. [61] ex-
perimented with applying LMP to SNNs during training. They were unable,
though, to produce SNNs whose performance was comparable to that of the
unpruned models.
Chapter 4
Pruning techniques
This chapter is devoted to thoroughly describing the pruning techniques
employed in this study.
We consider different forms of pruning of a fully-connected feed-forward
ANN. They share a common working scheme and differ in three parameters
that define an instance of the scheme: the scope, i.e., the subset of connec-
tions that are considered for the pruning, the criterion, defining how those
connections are sorted in order to decide which ones are to be pruned first,
and the pruning rate, i.e., the rate of connections in the scope that are actu-
ally pruned. In all cases, the pruning of a connection corresponds to setting
to 0 the value of the corresponding element θi of the network parameters
vector θ.
Since we are interested in the effects of pruning of ANNs used as con-
trollers for robotic agents, we assume that the pruning can occur during the
life of the agent, at a given time. As a consequence, we may use information
related to the working of the network up to the pruning time, as, e.g., the
actual values computed by the neurons, when defining a criterion.
Algorithm 1 shows the general scheme for pruning. Given the vector
θ of the parameters of the ANN, we first partition its elements, i.e., the
connections between neurons, using the scope parameter (as detailed below):
in Algorithm 1, the outcome of the partitioning is a list (i1, . . . , in) of lists
of indices of θ. Then, for each partition, we sort its elements according to
the criterion, storing the result in a list of indices i. Finally, we set to 0 the
θ elements corresponding to an initial portion of i: the size of the portion
depends on the pruning rate ρ and is ⌊|i|ρ⌋.
We explore three options for the scope parameter and five for the criterion
parameter; concerning the pruning rate ρ ∈ [0, 1], we experimented with
many values (see Chapter 5).
For the scope, as shown in Figure 4.1, we have:
• Network: all the connections are put in the same partition.
function prune(θ):
  (i1, . . . , in) ← partition(θ, scope)
  foreach j ∈ {1, . . . , n} do
    i ← sort(ij, criterion)
    foreach k ∈ {1, . . . , ⌊|i|ρ⌋} do
      θik ← 0
    end
  end
  return θ
end
Algorithm 1: The algorithm for pruning a vector θ of ANN parameters
given the parameters scope, criterion, and ρ.
• Layer: connections are partitioned according to the layer of the desti-
nation neuron (also called post-synaptic neuron).
• Neuron: connections are partitioned according to the destination neu-
ron.
For the criterion, we have:
• Weight: connections are sorted according to the absolute value of the
corresponding weight. This corresponds to LMP (see Chapter 3).
• Signal mean: connections are sorted according to the mean value of
the signal they carried from the beginning of the life of the robot to
the pruning time.
• Absolute signal mean: similar to the previous case, but considering the
mean of the absolute value.
• Signal variance: similar to the previous case, but considering the vari-
ance of the signal. This corresponds to CVP (see Chapter 3).
• Random: connections are sorted randomly.
(a) Network. (b) Layer. (c) Neuron.
Figure 4.1: Graphical representation of the scopes taken into consideration
in the pruning algorithm.
All the criteria work with ascending ordering: lowest values are pruned first.
Obviously, the ordering does not matter for the random criterion. When
we use the signal variance criterion and prune a connection, we take care
to adjust the weight corresponding to the bias of the neuron the pruned
connection goes to by adding the signal mean of the pruned connection: this
basically corresponds to making that connection carry a constant signal.
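To make the scheme concrete, the following Python sketch implements Algorithm 1 for the network scope with the weight criterion; it is a minimal illustration under our own naming assumptions (prune, network_scope, weight_criterion), not an excerpt from the actual code base.

import numpy as np

def prune(theta, partition, criterion, rho):
    # Zero the first floor(|i| * rho) connections of each partition, after
    # sorting the partition indices by the given criterion (ascending:
    # lowest values are pruned first).
    theta = theta.copy()
    for indices in partition(theta):
        ordered = sorted(indices, key=lambda i: criterion(theta, i))
        for i in ordered[: int(np.floor(len(ordered) * rho))]:
            theta[i] = 0.0
    return theta

def network_scope(theta):
    # network scope: all connections in a single partition
    return [list(range(len(theta)))]

def weight_criterion(theta, i):
    # weight criterion (LMP): smallest absolute weights go first
    return abs(theta[i])

# usage: prune half of the connections of a random parameter vector
theta = np.random.default_rng(42).uniform(-1, 1, size=100)
pruned = prune(theta, network_scope, weight_criterion, rho=0.5)
print(int((pruned == 0).sum()))  # -> 50

The other scopes and criteria fit the same interface: a scope is any function returning a list of index lists, and a criterion is any function scoring a single connection.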
We highlight that the three criteria based on signal are data-driven; on
the contrary, the weight and the random criteria are data-free. In other
words, signal-based criteria operate based on the experience the ANN ac-
quired up to pruning time. As a consequence, they constitute a form of
adaptation acting on the time scale of the robot life, that is shorter than
the adaptation that occurs at the evolutionary time scale; that is, they are
a form of learning. As such, we might expect that, on a given robot that
acquires different experiences during the initial stage of its life, pruning may
result in different outcomes. Conversely, the weight criterion always results
in the same outcome, given the same robot. In principle, hence, signal-based
criteria might result in a robot being able to adapt and perform well also in
conditions different from those used for the evolution. We verified this
hypothesis experimentally: we discuss the results in Chapter 5.
Chapter 5
Experiments and results
This chapter presents the research questions addressed in the first part of this
thesis and answers them through experimental evaluation.
Namely, each section analyzes a question in depth, describes the design of
the experiments and the experimental settings, and outlines the obtained
results.
We performed several experiments in order to answer the following re-
search questions:
RQ1 Is the evolution of effective VSR controllers adversely affected by prun-
ing? Does the impact depend on the type of pruning and on the size
of the ANN?
RQ2 Does pruning have an impact on the adaptability of the evolved VSR
controllers to different tasks? Does the impact depend on the type of
pruning and on the size of the ANN?
For answering these questions, we evolved the controller for two different
robots, each with three different ANN topologies: during the evolution, we
enabled different variants of pruning, including, as a baseline, the case of no
pruning. We considered the task of locomotion, in which the goal for the
robot is to travel as fast as possible on a terrain. We describe in detail the
experimental procedure and discuss the results in Section 5.2.
After the evolution, we took each evolved robot and measured its perfor-
mance in locomotion on a set of terrains different than the one used during
the evolution, in order to assess the adaptability of the robot. We describe
the procedure and discuss the results in Section 5.3.
In order to reduce the number of variants of pruning to consider when
answering RQ1 and RQ2, we first performed a set of experiments to assess
the impact of pruning in a static context, i.e., in ANNs not subjected to
evolutionary optimization. We present these experiments and our findings
in the next section.
5.1 Static characterization of pruning variants
We aimed at evaluating the effect of different forms of pruning on ANNs in
terms of how the output changes with respect to no pruning, given the same
input. In order to make this evaluation significant with respect to the use
case of this study, i.e., ANNs employed as controllers for VSRs, we considered
ANNs with a topology that resembles the one used in the next experiments
and fed them with inputs that resemble the readings of the sensors of a VSR
doing locomotion.
In particular, for the ANN topology we considered three input sizes
ninput ∈ {10, 25, 50} and three depths nlayers ∈ {0, 1, 2}, resulting in 3×3 = 9
topologies, all with a single output neuron. For the topologies with inner
layers, we set the inner layer size to the size of the input layer. In terms of the
dimensionality p of the vector θ of the parameters of the ANN, the considered
ANN topologies correspond to values ranging from p = (10 + 1) · 1 = 11,
for ninput = 10 and nlayers = 0, to p = (50 + 1) · 50 + (50 + 1) · 50 + (50 + 1) · 1 = 5151,
for ninput = 50 and nlayers = 2, where the +1 accounts for the bias. We instantiated
10 ANNs for each topology, setting θ by sampling the multivariate uniform
distribution U(−1, 1)p of appropriate size, hence obtaining 90 ANNs.
Concerning the input, we fed the network with sinusoidal signals with a
different frequency for each input, discretized in time with a time step of
∆t = 1/10 s. Precisely, at each time step k, with t = k∆t, we set the ANN
input to x(k), with xi(k) = sin(k∆t / (i + 1)), and we read the single output
y(k) = ANNθ(x(k)).
We considered the 3 × 5 pruning variants (scope and criteria) and 20
values for the pruning rate ρ, evenly distributed in [0, 0.75]. We took each
one of the 90 ANNs and each one of the 300 pruning variants, we applied the
periodic input for 10 s, triggering the actual pruning at t = 5 s, and we mea-
sured the mean absolute difference e of the output ANNθ x(k)

during the
last 5 s, i.e., after pruning, to the output ANNθ̂ x(k)

of the corresponding
unpruned ANN:
e =
1
50
k=100
X
k=50



ANNθ

x(k)

− ANNθ̂

x(k)



 . (5.1)
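The following Python sketch reproduces this measurement in miniature for a shallow ANN (ninput = 10, nlayers = 0), pruned with the weight criterion at ρ = 0.5; the tanh output activation and the helper names are our own illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n_input, dt = 10, 0.1
theta = rng.uniform(-1, 1, size=n_input + 1)        # 10 weights + 1 bias

def ann(theta, x):
    # shallow network: a single tanh output neuron
    return np.tanh(theta[:-1] @ x + theta[-1])

def x_at(k):
    # sinusoidal inputs: x_i(k) = sin(k * dt / (i + 1))
    return np.sin(k * dt / (np.arange(n_input) + 1))

# weight criterion, network scope: zero the smallest-magnitude half of theta
theta_p = theta.copy()
order = np.argsort(np.abs(theta_p))
theta_p[order[: int(np.floor(len(order) * 0.5))]] = 0.0

# mean absolute output difference over the last 5 s (k = 50..100), as in Eq. (5.1)
e = np.mean([abs(ann(theta_p, x_at(k)) - ann(theta, x_at(k)))
             for k in range(50, 101)])
print(e)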
Figure 5.1 summarizes the outcome of this experiment. It displays one
plot for each ANN topology (i.e., combination of nlayers and ninput) and one
line showing the mean absolute difference e, averaged across the 10 ANNs
with that topology, vs. the pruning rate ρ for each pruning variant: the
color of the line represents the criterion, the line type represents the scope.
Larger ANNs are shown towards the bottom right of the matrix of plots.
By looking at Figure 5.1 we can make the following observations. First, the
factor that appears to have the largest impact on the output of the pruned
[Plot: e vs. ρ, one panel per combination of ninput ∈ {10, 25, 50} (columns) and nlayers ∈ {0, 1, 2} (rows); legend: weight, signal mean, abs. signal mean, signal variance, and random criteria (color); network, layer, and neuron scopes (line type).]
Figure 5.1: Mean absolute difference e between the output of a pruned
ANN and the output of the corresponding unpruned ANN vs. the pruning
rate ρ, for different ANN structures and with different pruning criteria (color)
and scopes (linetype).
ANN is the criterion (the color of the line in Figure 5.1). Weight and absolute
signal mean criteria consistently result in lower values for the difference e,
regardless of the scope and of the pruning rate. On the other hand, with
the signal mean criterion, e becomes large even at low pruning rates: for
ρ > 0.1 there seems to be no further increase in e. Interestingly, the random
criterion appears to be less detrimental, in terms of e, than signal mean in
the vast majority of cases. We explain this finding by the kind of input
these ANNs have been fed with, that is, sinusoidal signals: the mean of a
periodic signal whose period is sufficiently shorter than the time elapsed
before pruning is close to 0, and this results in connections that actually
carry some information being pruned. We recall that we chose to use sinusoidal
signals because they are representative of the sensor readings a VSR doing
locomotion could collect, in particular when exhibiting an effective gait,
which likely consists of movements repeated over time.
Second, there are apparently no marked differences among the three val-
ues for the scope parameter. As expected, for the shallow ANNs (with
nlayers = 0) the scope parameter does not play any role, since there is one
single layer and one single output neuron (being the same destination for all
connections).
Third, the pruning rate ρ impacts e as expected: in general, the larger
ρ, the larger e. However, the way e changes as ρ increases seems to depend
on the pruning criterion: for weight and absolute signal mean, Figure 5.1
suggests a linear dependency. For the other criteria, e quickly increases with
ρ and then remains stable, for signal mean, or slowly increases, for signal
variance and random.
Fourth and finally, the ANN topology appears to play a minor role in
determining the impact of pruning. The ANN depth (i.e., nlayers) seems to
impact slightly on the difference between pruning variants: the deeper the
ANN, the fuzzier the difference. Concerning the number of inputs ninput, by
looking at Figure 5.1 we are not able to make any strong claim.
Based on the results of this experiment, summarized in Figure 5.1, we
decided to consider only weight, absolute signal mean, and random criteria
and only the network scope for the next experiments.
To better understand the actual impact of the chosen pruning variants
on the output y(k) of an ANN, we show in Figure 5.2 the case of two ANNs.
The figure shows the value of the output of the unpruned ANN (in gray),
when fed with the input described above (up to t = 20 s), and the outputs
of the 3×4 pruned versions of the same ANN, according to the three chosen
criteria and four values of ρ.
[Plot: output y(k) vs. t = k∆t over 20 s for the two ANNs, @(10, 0) above and @(100, 2) below; lines: unpruned (gray), weight, abs. signal mean, and random criteria at ρ ∈ {0, 0.25, 0.5, 0.75}.]
Figure 5.2: Comparison of the output of pruned and unpruned versions
of two ANNs of different structures: ninput = 10, nlayers = 0, above, and
ninput = 100, nlayers = 2, below.
5.2 RQ1: impact on the evolution
In order to provide an answer to this question, we evolved the controller
for six VSRs, resulting from the combination of two morphologies and three
ANN topologies. For each combination, we optimized the weights of the
ANN with and without pruning.
Figure 2.1 shows the two morphologies. Both consist of 10 voxels and
have several sensors. We put area sensors in each voxel, velocity sensors in
the voxels in the top row, touch sensors in the voxels in the bottom row
(just the two “legs” for the biped), and vision sensors in the voxels of the
rightmost column. As a result, both morphologies correspond to the same
number of inputs and outputs for the ANN, respectively 35 and 10.
Concerning the ANN topologies, we experimented with nlayers ∈ {0, 1, 2}.
For the ANN with inner layers, we set the size of those layers to 35. These
settings resulted in the size p of the parameter vector θ to be 360, 1620, and
2880, respectively.
For each of the six combinations of morphology and ANN topology, we
used three different pruning criteria: weight, absolute signal mean, and ran-
dom, all with network scope, as thoroughly described in Chapter 4. For each
criterion, we employed the following pruning rates: ρ ∈ {0.125, 0.25, 0.5, 0.75}.
Furthermore, we evolved, for each combination, an ANN without pruning to
have a baseline for meaningful comparisons.
To perform evolution we used the simple EA described in Algorithm 2, a
form of ES (Section 1.2.2). At first, npop individuals, i.e., numerical vectors
θ, are put in the initially empty population, all generated by assigning to
each element of the vector a randomly sampled value from a uniform distri-
bution over the interval [−1, 1]. Subsequently, ngen evolutionary iterations
are performed. On every iteration, which corresponds to a generation, the
fittest quarter of the population is chosen to generate npop −1 children, each
obtained by adding values sampled from a normal distribution N(0, σ) to
the element-wise mean µ of all parents. The generated offspring, together
with the fittest individual of the previous generation, end up forming the
population of the next generation, which maintains the fixed size npop.
We used the following EA parameters: npop = 48, ngen = 416 (corre-
sponding to 20 000 fitness evaluations), and σ = 0.35. We verified that, with
these values, the evolution was in general capable of converging to a solution,
i.e., longer evolutions would have resulted in negligible fitness improvements.
We optimized VSRs for the task of locomotion: the goal of the VSR is to
travel as fast as possible on a terrain along the positive x axis. We quantified
the degree of achievement of the locomotion task of a VSR by performing
a simulation of duration tf and measuring the VSR average velocity
vx = (x(tf) − x(ti)) / (tf − ti), x(t) being the position of the robot center
of mass at time t and ti
being the initial time of assessment. In the EA of Algorithm 2 we hence used
function evolve():
  P ← ∅
  foreach i ∈ {1, . . . , npop} do
    P ← P ∪ {0 + U(−1, 1)p}
  end
  foreach g ∈ {1, . . . , ngen} do
    Pparents ← bestIndividuals(P, ⌊|P|/4⌋)
    µ ← mean(Pparents)
    P′ ← {bestIndividuals(P, 1)}
    while |P′| < npop do
      P′ ← P′ ∪ {µ + N(0, σ)p}
    end
    P ← P′
  end
  return bestIndividuals(P, 1)
end
Algorithm 2: The simple EA used for neuroevolution.
vx as fitness for selecting the best individuals. We set tf = 60 s and ti = 20 s
to discard the initial transitory phase. For the controllers with pruning, we
set the pruning time at tp = 20 s.
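A direct Python transcription of Algorithm 2 may help clarify the procedure; fitness() is a placeholder standing in for the simulated locomotion velocity vx, and the toy objective in the usage line is purely illustrative.

import numpy as np

def evolve(fitness, p, n_pop=48, n_gen=416, sigma=0.35, seed=0):
    rng = np.random.default_rng(seed)
    pop = [rng.uniform(-1, 1, size=p) for _ in range(n_pop)]
    for _ in range(n_gen):
        ranked = sorted(pop, key=fitness, reverse=True)   # fittest first
        parents = ranked[: n_pop // 4]                    # fittest quarter
        mu = np.mean(parents, axis=0)                     # element-wise mean
        # the elite survives unchanged; the rest are Gaussian perturbations of mu
        pop = [ranked[0]] + [mu + rng.normal(0.0, sigma, size=p)
                             for _ in range(n_pop - 1)]
    return max(pop, key=fitness)

# toy usage: maximize -||theta||^2 in place of the locomotion fitness
best = evolve(lambda th: -float(th @ th), p=20, n_gen=50)
print(np.linalg.norm(best))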
We remark that the EA of Algorithm 2 constitutes a form of Darwinian
evolution with respect to pruning: the effect of pruning on an individual does
not impact on the genetic material that is passed to the offspring by that
individual. More precisely, the element-wise mean µ is computed by consid-
ering the parents θ vectors before the pruning. We leave the investigation
on pruning effects on neuroevolution in the case of Lamarckian evolution to
future work.
For favoring generalization, we evaluated each VSR on a different ran-
domly generated hilly terrain (Figure 5.5b), i.e., a terrain with hills of vari-
able heights and distances between each other. To avoid propagating VSRs
that were fortunate in the random generation of the terrain, we re-evaluated,
on a new terrain, the fittest individual of each generation when moving it to
the population of the next generation.
For each of the 2 × 3 × (3 × 4 + 1) combinations of VSR morphology,
ANN topology, pruning criterion, and pruning rate (the +1 being associated
with no pruning) we performed 10 independent, i.e., with different random
seeds, evolutionary optimizations of the controller with the aforementioned
EA. We hence performed 780 evolutionary optimizations.
Figure 5.3 summarizes the findings of this experiment. In particular, the
plots show how the pruning rate ρ impacts the fitness of the best individual of
the last generation, for the different VSR morphologies and ANN topologies
employed in the experiment.
[Plot: vx vs. ρ, one panel per VSR morphology (Biped, Worm; rows) and ANN topology (nlayers ∈ {0, 1, 2}; columns); legend: weight, abs. signal mean, and random criteria.]
Figure 5.3: Fitness vx (median with lower and upper quartiles across the
10 repetitions) vs. pruning rate ρ, for different pruning criteria (color), VSR
morphologies (plot row), and ANN topologies (plot column).
What immediately stands out from the plots is that individuals whose
controllers have been pruned with the weight or absolute signal mean criteria
significantly outperform those that have undergone random pruning. This
suggests that randomly pruning controllers at each fitness evaluation is detri-
mental to their evolution. In fact, individuals with a good genotype could
perform poorly after the removal of important connections, while others
could surpass them thanks to a luckier pruning; hence the selection of the fittest
individuals for survival and reproduction could be distorted. Moreover, Fig-
ure 5.3 confirms that the heuristics employed, based on weight and absolute
signal mean criteria (Chapter 4), successfully choose connections that are
less important for the controller to be removed, thus limiting the damage of
connection removal.
In addition, comparing the subplots of Figure 5.3, we observe no marked dif-
ferences between the first and the second row, which leads us to conclude
that the morphology of the VSR does not play a key role in determining
the impact of pruning on the performance of the controller. On the con-
trary, different ANN topologies seem to be affected differently by pruning.
The subplots of the first column (nlayers = 0), in particular, suggest pruning
could have a beneficial impact on shallow networks. However, the upper
and lower quartiles reveal that the distribution of the fitness vx is spread
across a considerably large interval, hence it is difficult to draw any sharp
conclusion on the possible benefits of pruning for such controllers. Con-
versely, for ANN topologies with nlayers ∈ {1, 2}, we can note that a higher
pruning rate ρ leads to weaker performance of the controller. In this case,
the result is in line with our expectations, as an increasing ρ means that
we are removing more connections from the ANN, thus making it harder for
the signal to spread across neurons. Nevertheless, controllers pruned with
a proper heuristic have evolved to achieve results comparable to those who
have not undergone pruning during their evolution, considered here as base-
line. We performed a Mann-Whitney U test with the null hypothesis that,
for each combination of VSR morphology, ANN topology, pruning criterion,
and pruning rate ρ, the distribution of the best fitness is the same as ob-
tained from the corresponding baseline controller, i.e., with the same VSR
morphology and ANN topology, evolved without pruning, and we found that
the p-value is greater than 0.05 in 30 out of 72 cases.
Based on the results of Figure 5.3, we speculate that controllers with
weight and absolute signal mean pruning criteria look robust to pruning be-
cause they result from an evolution in which VSRs are subjected to pruning,
rather than because those kinds of pruning are, in general, not particularly
detrimental. To test this hypothesis, we carried out an additional experi-
ment. We took all the best individuals of the last generations and we re-
assessed them (on a randomly generated hilly terrain similar to the one used
in evolution). For the individuals that were evolved without pruning, we
performed 3 × 4 additional evaluations, introducing pruning after tp = 20 s
with the previously mentioned 3 criteria and 4 rates ρ.
Figure 5.4 shows the outcome of this experiment, i.e., vx on the re-
assessment plotted against the validation pruning rate ρ for both individuals
evolved with (solid line) and without (dashed line) pruning. The foremost
finding is that individuals evolved with pruning visibly outperform the ones
whose ancestors have not experienced pruning, for almost all pruning rates.
This corroborates the explanation we provided above, that is, VSRs whose
ancestors evolved experiencing pruning are more robust to pruning than
VSRs that evolved without pruning.
Besides analyzing the aggregate results, we also examined the behavior
of a few evolved VSRs in a comparative way, i.e., with and without prun-
ing in re-assessment. We found that, interestingly, in some cases the VSR
starts to move effectively only after the pruning: this might suggest that
pruning shaped the evolutionary path to the point that the lack of prun-
ing becomes detrimental, similarly to what happens in the brain of complex
animals (see Chapter 3). We provide videos of a few VSRs exhibiting a
change in their behaviour after pruning at https://youtu.be/-HCHDEb9azY,
https://youtu.be/oOtJKri6vyw, and https://youtu.be/uwrtNezTrx8.
[Plot: validation vx vs. validation ρ, one panel per VSR morphology and ANN topology; legend: weight, abs. signal mean, and random criteria; solid lines for individuals evolved with pruning, dashed for those evolved without.]
Figure 5.4: Median, lower quartiles, and upper quartiles of validation veloc-
ity vx vs. validation pruning rate ρ of individuals evolved with and without
pruning for different ANN topologies and VSR morphologies.
5.3 RQ2: impact on the adaptability
For the sake of this research question, we defined VSR controllers as adaptable
if they are able to effectively accomplish locomotion on terrains they “have
never seen before”, i.e., terrains that none of their ancestors ever experienced
locomotion on. Hence, to assess the adaptability of evolved controllers, we
measured the performance in locomotion of the best individuals of the last
generations on a set of different terrains. We experimented with the following
terrains: (a) flat, (b) hilly with 6 combinations of heights and distances
between hills, (c) steppy with 6 combinations of step heights and widths,
(d) downhill with 2 different inclinations, and (e) uphill with 2 different
inclinations (Figure 5.5). As a result, each individual was re-assessed on
a total of 17 different terrains. Note that, in this experiment, controllers
were not altered in between evolution and re-assessment, i.e., they were re-
evaluated with the same pruning criterion, if any, and pruning rate ρ as
experienced during evolution.
Figure 5.6 displays the outcome of this experiment. Namely, for each
of the different VSR morphologies, ANN topologies, and pruning criteria,
the validation velocity vx (i.e., the re-assessment velocity averaged on the 17
terrains) is plotted against the pruning rate ρ. The results in Figure 5.6 are
(a) Flat. (b) Hilly.
(c) Steppy. (d) Uphill.
(e) Downhill.
Figure 5.5: Different types of terrains employed for measuring VSR adapt-
ability.
[Plot: validation vx vs. ρ, one panel per VSR morphology and ANN topology; legend: weight, abs. signal mean, and random criteria.]
Figure 5.6: Median, lower quartiles, and upper quartiles of validation ve-
locity vx vs. validation pruning rate ρ averaged across validation terrains, for
different pruning criteria, VSR morphologies, and ANN topologies.
coherent with the findings of Section 5.2: comparing the subplots, we can
conclude that the morphology of the VSR is not relevant in determining the
impact of pruning on the adaptability, whereas the ANN topology plays a
key role. In more detail, for shallow networks, pruning seems to enhance
adaptability, whereas it has a slightly detrimental effect for deeper networks
(that are, however, in general better than shallow ones). In any case, for con-
trollers evolved employing weight or absolute signal mean pruning criteria,
the validation results are comparable to those of controllers evolved without
pruning. We performed a Mann-Whitney U test with the null hypothe-
sis that, for each combination of VSR morphology, ANN topology, pruning
criterion, and pruning rate ρ, the distribution of the average validation ve-
locities across all terrains is the same as the one obtained from the validation
of the corresponding baseline controller, i.e., with the same VSR morphology
and ANN topology, evolved without pruning, and we found that the p-value
is greater than 0.05 in 38 out of 72 cases.
5.4 Summary of results
In this chapter we aimed at experimentally characterizing the impact of
pruning on evolved neural controllers for VSRs. First, we performed a pre-
liminary study to choose which pruning criteria to employ, finding that the
less detrimental ones are those based on the value of the synaptic weight
and on the mean of the absolute value of the signal. Then, we addressed
two research questions, regarding the effectiveness and the adaptability of
controllers evolved with pruning. We found that, if applied with a proper
criterion, pruning does not significantly hinder the evolution of efficient con-
trollers for VSRs. In addition, we noticed that the application of pruning
shaped the evolution of controllers, which were more robust to pruning than
those whose ancestors had not experienced it. Last, we have observed that
individuals evolved with pruning do not appear significantly less adaptable
to different tasks, i.e., locomotion on unseen terrains, than those evolved
without pruning.
Part II
Evolved Spiking Neural
Networks for controlling
Voxel-based Soft Robots
Chapter 6
Related work
The aim of this chapter is to briefly go through previous research encom-
passing SNNs.
SNNs, driven by the early work of Gerstner and Kistler [19], represent
what has been called the “third generation of Neural Network models” [43].
Such a model is not only characterized by a much stronger biological resem-
blance than previously proposed ANNs, but is also superior to other NN
models in terms of the number of neurons needed to achieve the same
computation [43].
However, SNNs have one fundamental limitation with respect to the pre-
vious generations of ANNs: the presence of discrete spike events and discon-
tinuities, which hinders general-purpose supervised learning algorithms [22]
based on backpropagation. Disparate methodologies have been proposed to
solve this issue, achieving different degrees of success, such as the transfer of
weights [15, 70] from gradient-descent trained “traditional” ANNs to SNNs
or a more recent event-based backpropagation algorithm [75]. Among these,
an important role is played by evolutionary approaches, which can lead to
SNNs whose generalization ability is comparable to that of standard MLPs
trained through gradient descent based algorithms [54].
Concerning the application of ANNs as controllers for artificial agents,
SNNs appear particularly suitable for the job, thanks to several of their fea-
tures. In particular, Bing et al. [5] highlight (a) speed and energy efficiency,
(b) strong computational capabilities with a reduced number of spikes, and
(c) better learning performance thanks to the spatial-temporal information
incorporated in information processing. Hence, many have experimented in
this direction, applying different training methods, e.g., unsupervised learn-
ing, reinforcement learning or evolutionary optimization, to obtain artificial
agents capable of performing diverse tasks.
The works encompassing the evolutionary optimization of SNNs-based
controllers are most relevant for this study, as we are likewise aiming at
evolving SNNs for controlling robotic agents. The first works in this field
are those by Floreano and Mattiussi [17] and Hagras et al. [23], who exploit
EAs to evolve the weights of a SNN used as controller. The results achieved
are notable, and confirm the strengths of SNNs concerning power consump-
tion and tolerance to noise [23]. Evolutionary optimization has also been
employed to optimize the parameters [4] or the topology of the SNN [30] in
a NEAT-like fashion. The work of Howard and Elfes [30] is especially in-
teresting in this sense, as, thanks to the joint evolution of topology and
weights, SNNs have managed to outperform other NN baselines, e.g., MLPs,
in challenging conditions, displaying higher adaptability. Similar findings
have been presented by Markowska-Kaczmar and Koldowski [44], who have
performed a comparative study of SNNs and MLPs in controlling simulated
vehicles. Such studies are exceptionally significant for our work, as they sug-
gest VSRs could benefit from the usage of SNNs as controllers when faced
with unforeseen circumstances.
More recent research on SNNs strongly relies on neuromorphic compu-
tation [59], which tackles the problem of energy requirements of computing
platforms for the realization of artificial intelligence. In this context, Bing
et al. [6] and Spaeth et al. [63] exploit neuromorphic hardware for robotic
implementations.
The works of Spaeth et al. [63, 64], in particular, are most similar to this
study, as they experiment with a form of modular soft robots analogous to
VSRs. Nevertheless, there are two key factors that distinguish our research
from theirs: (a) our controller model only exploits sensory information to
produce actuation, without the need of a central pattern generator, and
(b) we optimize SNNs via neuroevolution instead of choosing weights by
hand.
Chapter 7
Spiking Neural Networks as
controllers
This chapter is dedicated to providing a more extensive description of SNNs
(briefly introduced in Section 1.1.2). In particular, we describe different
neuron models in Section 7.1, while we extend the contents on homeostasis
and STDP in Section 7.2 and Section 7.3. Last, in Section 7.4 we deal with
more technical aspects related to simulating the neurons and utilizing them
as controllers of a robotic agent.
7.1 Spiking Neuron models
An overview of the wide range of existing spiking neuron models is shown in
Figure 7.1: such models greatly differ both in terms of biological plausibility
and computational cost.
In this work, given that we use SNNs to control a robotic agent, simula-
tion cannot be prohibitively costly, so we focus on the models distributed in
the first half of the x axis of Figure 7.1. In particular, our experimentation
focuses only on the integrate and fire model (Section 7.1.1), leaving the more
biologically plausible Izhikevich model (Section 7.1.2) for future work.
What is common to both models is the representation of the state of
the neuron in terms of its membrane potential, which receives excitatory or
inhibitory contributions from incoming synapses. In addition, both models
require the membrane potential to exceed a threshold to produce a spike in
output. After a spike is produced, the neuron goes through a phase of reset,
which varies according to the chosen model.
7.1.1 Integrate and Fire model
The peculiarity of this model is that the cell membrane is modeled as a ca-
pacitor. In particular, this leads to the neuron being “leaky”, as the summed
Figure 7.1: Comparison of existing spiking neuron models in terms of bi-
ological plausibility and computational cost (image taken from [33]). The y
axis describes the biological plausibility by the number of biological features
that each model implements, whereas the x axis accounts for the implemen-
tation cost in terms of number of floating point operations (# of FLOPS)
needed to simulate the model for 1 ms.
contributions to the potential decay over time with a time constant τm [9].
The leaky integrate and fire (LIF) model is described by Equation (7.1),
where Cm is the membrane capacitance, τm = RmCm is the membrane time
constant, V0 is the resting potential, Is(t) is a current caused by synaptic
inputs, and Iinj(t) is a current injected into the neuron by an electrode [9]:
Cm · dv(t)/dt = −(Cm/τm) [v(t) − V0] + Is(t) + Iinj(t)    (7.1)
When the membrane potential v exceeds the membrane threshold vth, a
spike is released and the membrane potential is reset to the resting membrane
potential V0:
v > vth ⇒ v ← V0    (7.2)
Simplified Integrate and Fire Neuron model
A simplified version of the previously described model can be introduced to
abstract away from its electric circuit nature. In this model, the membrane
potential v is increased directly by the weighted sum of the inputs, and
decays over time by a factor λdecay:
∆v(∆t) = Σ_{i=1}^{n} wi xi − ∆t λdecay v    (7.3)
Similarly to the more complex integrate and fire model, if the potential
v surpasses the threshold vth, an output spike is produced and the membrane
potential returns to the resting membrane potential V0, as described by Equation (7.2).
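A minimal Python sketch of this simplified neuron, assuming illustrative values for vth, V0, and λdecay, could look as follows.

import numpy as np

class SimplifiedIF:
    def __init__(self, v_th=1.0, v_rest=0.0, lambda_decay=0.1):
        self.v_th, self.v_rest, self.lam = v_th, v_rest, lambda_decay
        self.v = v_rest

    def step(self, weights, inputs, dt):
        # Eq. (7.3): weighted input sum minus leaky decay
        self.v += weights @ inputs - dt * self.lam * self.v
        if self.v > self.v_th:          # Eq. (7.2): fire and reset
            self.v = self.v_rest
            return 1
        return 0

# usage: drive the neuron with two constantly spiking inputs
neuron = SimplifiedIF()
w = np.array([0.6, 0.5])
for k in range(5):
    print(k, neuron.step(w, np.array([1.0, 1.0]), dt=0.01))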
7.1.2 Izhikevich model
The Izhikevich model, as visible from its position in the plot of Figure 7.1,
merges the computation efficiency of the integrate and fire model with a
much higher biological plausibility. This model, proposed by Izhikevich [32],
introduces an additional variable u, denoting the membrane recovery, to the
definition of the neuron state. In addition, it uses four parameters, a, b, c,
and d, to describe how the values of v and u are to be updated with time:
dv(t)/dt = 0.04 v²(t) + 5 v(t) + 140 − u(t) + I(t)
du(t)/dt = a (b v(t) − u(t))    (7.4)
Equation (7.4) can be approximated in discrete time, with a time step size
∆t, resulting in:
∆v(∆t) = ∆t (0.04 v² + 5 v + 140 − u + I)
∆u(∆t) = ∆t a (b v − u)    (7.5)
Figure 7.2: Different firing patterns resulting from different combinations
of values for a, b, c, and d. Electronic version of the figure and reproduction
permissions are freely available at www.izhikevich.com.
The aforementioned parameters affect how the value of v and u are ma-
nipulated and different combinations result in neurons with varying firing
patterns (see Figure 7.2):
• a determines the time scale for the recovery variable u. A smaller value
for a results in a slower recovery for u.
• b determines the sensitivity of the recovery variable u with regard to
the fluctuations of the membrane potential v below the firing thresh-
old. A larger value of b makes the recovery variable u follow the
fluctuations of v more strongly.
• c determines the reset value for the membrane potential v following an
output spike.
• d determines the reset value for the recovery variable u following an
output spike.
Given the input values {x1, x2, . . . , xn} and the weights of the incoming
synapses {w1, w2, · · · , wn}, at each time step, the current I is updated with
the parameter b and the weighted sum of the inputs:
I = b + Σ_{i=1}^{n} wi xi    (7.6)
The reset phase in the Izhikevich model is described by
v > vth ⇒ { v ← c ; u ← u + d }    (7.7)
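The following sketch implements the discrete-time update of Equations (7.5) to (7.7); the parameter set (a, b, c, d) is the classic “regular spiking” configuration from [32], and the constant input is purely illustrative.

import numpy as np

class Izhikevich:
    def __init__(self, a=0.02, b=0.2, c=-65.0, d=8.0, v_th=30.0):
        self.a, self.b, self.c, self.d, self.v_th = a, b, c, d, v_th
        self.v, self.u = c, b * c       # typical initial state

    def step(self, weights, inputs, dt=0.1):
        I = self.b + weights @ inputs                           # Eq. (7.6)
        self.v += dt * (0.04 * self.v ** 2 + 5 * self.v + 140 - self.u + I)
        self.u += dt * self.a * (self.b * self.v - self.u)      # Eq. (7.5)
        if self.v > self.v_th:                                  # Eq. (7.7)
            self.v = self.c
            self.u += self.d
            return 1
        return 0

# usage: count spikes over 1 s of simulated time (dt = 0.1 ms)
neuron = Izhikevich()
w = np.array([10.0])
spikes = sum(neuron.step(w, np.array([1.0])) for _ in range(10_000))
print(spikes)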
7.2 Homeostasis
The biological concept of homeostasis has been briefly introduced in Sec-
tion 1.1.2. As previously mentioned, this mechanism aims at balancing the
firing rates of all the neurons across a SNN by regulating the individual
neurons’ thresholds. In particular, the net membrane threshold v*th is given by:

v*th = min( vth + Θ, Σ_{i=1}^{n} wi )    (7.8)
where Θ is increased by Θinc every time the neuron fires, and decays expo-
nentially with time with rate Θdecay. To achieve an equilibrium, each neuron has
its individual Θ, so that a neuron firing very frequently gets an increasingly
large membrane threshold, resulting in a lower firing rate, whereas a neuron
with weak incoming synapses gets an increased firing rate.
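A sketch of this mechanism, with illustrative values for Θinc and the decay rate, is given below.

import numpy as np

class HomeostaticThreshold:
    def __init__(self, v_th=1.0, theta_inc=0.2, theta_decay=0.05):
        self.v_th, self.inc, self.decay = v_th, theta_inc, theta_decay
        self.theta = 0.0                 # per-neuron excess threshold

    def net_threshold(self, weights):
        # Eq. (7.8): capped by the total incoming synaptic weight
        return min(self.v_th + self.theta, float(np.sum(weights)))

    def update(self, fired, dt):
        if fired:
            self.theta += self.inc       # frequent firing raises the bar
        self.theta *= np.exp(-self.decay * dt)

# usage: the threshold grows while the neuron keeps firing
h = HomeostaticThreshold()
w = np.array([0.8, 0.9, 0.7])
for k in range(3):
    print(round(h.net_threshold(w), 3))
    h.update(fired=True, dt=0.1)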
7.3 Spike Timing Dependent Plasticity
Hebbian learning in SNNs takes the form of STDP [41]. Incoming synaptic
weights of each neuron are adjusted on every input and output spike by a
factor ∆w, which depends on the relative timing of pre- and post-synaptic
spikes:

∆t(tout, tin) = tout − tin    (7.9)
This is implemented by calculating the relative timings within a fixed time
frame, called the STDP time window ∆twin, of approximately 40 ms. The weight
change ∆w is calculated according to one of the following rules (displayed
in Figure 1.3):
∆w(∆t) =
    A+ e^(−∆t/τ+)    if ∆t > 0
    −A− e^(∆t/τ−)    if ∆t < 0
    0                if ∆t = 0
Asymmetric Hebbian    (7.10)

∆w(∆t) =
    −A+ e^(−∆t/τ+)   if ∆t > 0
    A− e^(∆t/τ−)     if ∆t < 0
    0                if ∆t = 0
Asymmetric Anti-Hebbian    (7.11)

∆w(∆t) =
    A+ g(∆t)    if g(∆t) > 0
    A− g(∆t)    if g(∆t) < 0
    0           if g(∆t) = 0
Symmetric Hebbian    (7.12)
∆w(∆t) =
    −A+ g(∆t)   if g(∆t) > 0
    −A− g(∆t)   if g(∆t) < 0
    0           if g(∆t) = 0
Symmetric Anti-Hebbian    (7.13)
where g(∆t) is the Difference of Gaussian (DoG) function given by

g(∆t) = 1/(σ+ √(2π)) · e^(−(1/2)(∆t/σ+)²) − 1/(σ− √(2π)) · e^(−(1/2)(∆t/σ−)²)    (7.14)
A+ and A− affect the height of the curve, while τ+ and τ− affect the width or
steepness of the asymmetric Hebbian functions. Recall from Section 1.1.2
that each synapse has its own associated learning rule, which can possibly be
evolved by means of neuroevolution.
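As an example, the asymmetric Hebbian rule of Equation (7.10) can be sketched as follows; A+, A−, τ+, and τ− are illustrative constants.

import math

def asymmetric_hebbian(dt, a_plus=1.0, a_minus=1.0, tau_plus=10.0, tau_minus=10.0):
    # dt = t_out - t_in (ms): pre-before-post potentiates, post-before-pre depresses
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

# usage: spike pairs within a ~40 ms STDP time window
for dt in (-20.0, -5.0, 5.0, 20.0):
    print(dt, round(asymmetric_hebbian(dt), 3))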
7.4 Data encoding
In Section 1.1.2 we highlighted the main differences between MLPs and
SNNs. Namely, the former are characterized by neurons producing nu-
merical values in output on each propagation cycle, whereas the latter
process information in terms of spike trains. Therefore, spike trains need to
be properly encoded in the simulation, in order to effectively propagate val-
ues across the SNN and produce a meaningful output. Section 7.4.1 is hence
dedicated to addressing the problem of the encoding of spike trains.
Moreover, the fact that we are simulating SNNs as controllers for robotic
agents poses additional constraints due to the requirements of the simula-
tor [46]. Figure 7.3 serves as a reminder of the specifications of the VSR
controller (described in Section 2.2): at each simulation time step k, the
controller receives a vector of sensor readings s(k) = {s
(k)
1 , . . . , s
(k)
m } ∈ [0, 1]m
and outputs a vector of actuation values a(k) = {a
(k)
1 , . . . , a
(k)
n } ∈ [−1, 1]n,
with m and n corresponding to the number of sensor readings and actuators,
respectively. If the controller is a MLP, there is no need for further reasoning,
since such ANN can be seen as a vector function, where a(k) = MLPθ(s(k)),
as displayed in Figure 7.4.
Figure 7.3: Schematic representation of the controller for VSRs with its
inputs and outputs at each simulation time step k.
Figure 7.4: Schematic representation of the MLP controller for VSRs with
its inputs and outputs at each simulation time step k.
[Diagram: inside the controller, the input converters I translate s(k) into the spike trains ts(k), SNNθ processes them into the spike trains ta(k), and the output converters O translate ta(k) into a(k).]
Figure 7.5: Schematic representation of the SNN controller for VSRs with
its inputs and outputs at each simulation time step k.
On the other hand, as we have already highlighted, SNNs process in-
formation in terms of spike trains, so we need to introduce converters to
translate decimal values to spike trains and vice versa in order to use them as
controllers for VSRs. Figure 7.5 shows all the elements required to employ a
SNN as a controller for a VSR: on every simulation time step, s(k) is trans-
lated to ts(k) by the input converters I, and ta(k) is converted to a(k) by the
output converters O. The conversion strategy is discussed in Section 7.4.2.
In the figure, ts(k) and ta(k) represent the vectors of input and output spike
trains respectively, where tsi(k) is the spike train produced by the conversion
of the i-th sensor reading and tai(k) is the spike train which is converted
to obtain the i-th actuation value. The data structures used to represent
ts(k) and ta(k) depend on the chosen encoding for spike trains. To summarize,
the calculation of the actuation values with a SNN controller is defined by
a(k) = O(SNNθ(I(s(k)))).
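The pipeline a(k) = O(SNNθ(I(s(k)))) can be sketched in Python as follows; the rate-based encode/decode helpers are illustrative placeholders (the conversion strategies actually employed are the subject of Section 7.4.2), and the identity “SNN” merely echoes its input.

import numpy as np

N_BINS = 10                  # sub-steps per spike train of duration dt_sim

def encode(s):
    # I: sensor reading in [0, 1] -> spike train (rate coding)
    train = np.zeros(N_BINS, dtype=int)
    train[: int(round(s * N_BINS))] = 1
    return train

def decode(train):
    # O: spike train -> actuation value in [-1, 1]
    return 2.0 * train.mean() - 1.0

def control_step(snn_step, s_k):
    t_s = [encode(s) for s in s_k]       # input spike trains
    t_a = snn_step(t_s)                  # propagate through the SNN
    return np.array([decode(t) for t in t_a])

# usage with the identity "SNN"
print(control_step(lambda trains: trains, s_k=[0.0, 0.5, 1.0]))  # -> [-1. 0. 1.]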
7.4.1 Spike trains representation
Since a spike train is produced on each simulation time step k∆tsim and the
simulation frequency fsim is fixed, we can consider spike trains to always
have the same time duration of ∆tsim. In addition, we can assume spikes to
all have the same amplitude, as the weighting factors are only introduced by
synaptic weights. Given these considerations, we can opt between two design
choices, which take inspiration from signal processing. Figure 7.6 provides
a graphical comparison between the two proposed methods for encoding a
spike train.
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 

Evolutionary Optimization of Bio-Inspired Controllers for Modular Soft Robots

1.1.1 Multi Layer Perceptron . . . . . . . . . . . . . . . . . 2
1.1.2 Spiking Neural Networks . . . . . . . . . . . . . . . . 3
1.2 Evolutionary Algorithms . . . . . . . . . . . . . . . . . . . . 4
1.2.1 General scheme . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Some examples . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Neuroevolution . . . . . . . . . . . . . . . . . . . . . 6

2 Voxel-based Soft Robots 7
2.1 VSR morphology . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 VSR controller . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Sinusoidal controller . . . . . . . . . . . . . . . . . . 9
2.2.2 Centralized neural controller . . . . . . . . . . . . . . 9
2.2.3 Distributed neural controller . . . . . . . . . . . . . . 10
2.3 Physical realizations . . . . . . . . . . . . . . . . . . . . . . 13

I The effects of Synaptic Pruning on Evolved Neural Controllers for Voxel-based Soft Robots 15

3 Related work 16
3.1 Synaptic pruning in the nervous system . . . . . . . . . . . . 16
3.2 Pruning in ANNs . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 Pruning ANNs in the context of statistical learning . . . . . 18
3.4 Pruning ANNs in the context of neuroevolution . . . . . . . 18
3.5 Pruning biologically-inspired ANNs . . . . . . . . . . . . . . 19
4 Pruning techniques 20

5 Experiments and results 23
5.1 Static characterization of pruning variants . . . . . . . . . . 24
5.2 RQ1: impact on the evolution . . . . . . . . . . . . . . . . . 28
5.3 RQ2: impact on the adaptability . . . . . . . . . . . . . . . 32
5.4 Summary of results . . . . . . . . . . . . . . . . . . . . . . . 34

II Evolved Spiking Neural Networks for controlling Voxel-based Soft Robots 36

6 Related work 37

7 Spiking Neural Networks as controllers 39
7.1 Spiking Neuron models . . . . . . . . . . . . . . . . . . . . . 39
7.1.1 Integrate and Fire model . . . . . . . . . . . . . . . . 39
7.1.2 Izhikevich model . . . . . . . . . . . . . . . . . . . . 41
7.2 Homeostasis . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
7.3 Spike Timing Dependent Plasticity . . . . . . . . . . . . . . 43
7.4 Data encoding . . . . . . . . . . . . . . . . . . . . . . . . . . 44
7.4.1 Spike trains representation . . . . . . . . . . . . . . . 45
7.4.2 Input and output converters . . . . . . . . . . . . . . 47

8 Experiments and results 50
8.1 Input/Output converters tuning . . . . . . . . . . . . . . . . 51
8.2 RQ1 & RQ2: effectiveness and adaptability of SNNs . . . . . 54
8.2.1 Velocity as fitness measure . . . . . . . . . . . . . . . 55
8.2.2 Efficiency as fitness measure . . . . . . . . . . . . . . 68
8.3 RQ3: exploiting modularity . . . . . . . . . . . . . . . . . . 70
8.4 RQ4: behavioral analysis . . . . . . . . . . . . . . . . . . . . 73
8.4.1 Behavior features . . . . . . . . . . . . . . . . . . . . 74
8.4.2 Resulting behaviors . . . . . . . . . . . . . . . . . . . 75
8.5 Summary of results . . . . . . . . . . . . . . . . . . . . . . . 77

Conclusions 78
List of Figures

1.1 Graph representation of a fully-connected feed-forward ANN with three input neurons, two hidden layers with five neurons each, and three output neurons. . . . 2
1.2 Schematic representation of a McCulloch and Pitts neuron (perceptron). . . . 2
1.3 Hebbian learning rules. The graphs illustrate the percentage adjustment of a weight ∆w(%) dependent on the relative timing between input and output spikes ∆t (image taken from [41]). . . . 4
1.4 General scheme of an EA [16]. . . . 5
2.1 Frames of the two VSR morphologies used in the experiments. The color of each voxel encodes the ratio between its current area and its rest area: red indicates contraction, yellow rest state, and green expansion. The circular sector drawn at the center of each voxel indicates the current sensed values: subsectors represent sensors and are, where appropriate, internally divided in slices according to the sensor dimensionality m. The rays of the vision sensors are shown in red. . . . 8
2.2 The mechanical model of the voxel. The four masses are depicted in gray, the different components of the scaffolding are depicted in blue, green, red, and orange, and the ropes are depicted in black (image taken from [47]). . . . 8
2.3 A schematic representation of the centralized controller for a 3 × 1 VSR with two sensors in each voxel. Blue and red curved arrows represent the connection of the MLP with inputs (sensors) and outputs (actuators), respectively. . . . 11
2.4 A schematic representation of the portion of the distributed neural controller corresponding to one single voxel with two sensors and nc = 1 communication contact per side. Blue and red curved arrows represent the connection of the MLP with inputs (sensors and input communication contacts) and outputs (actuator and output communication contacts), respectively. . . . 12
2.5 Some examples of VSR physical implementations. . . . 14
4.1 Graphical representation of the scopes taken into consideration in the pruning algorithm. . . . 21
5.1 Mean absolute difference e between the output of a pruned ANN and the output of the corresponding unpruned ANN vs. the pruning rate ρ, for different ANN structures and with different pruning criteria (color) and scopes (linetype). . . . 25
5.2 Comparison of the output of pruned and unpruned versions of two ANNs of different structures: ninput = 10, nlayers = 0, above, and ninput = 100, nlayers = 2, below. . . . 27
5.3 Fitness vx (median with lower and upper quartiles across the 10 repetitions) vs. pruning rate ρ, for different pruning criteria (color), VSR morphologies (plot row), and ANN topologies (plot column). . . . 30
5.4 Median, lower quartiles, and upper quartiles of validation velocity vx vs. validation pruning rate ρ of individuals evolved with and without pruning for different ANN topologies and VSR morphologies. . . . 32
5.5 Different types of terrains employed for measuring VSR adaptability. . . . 33
5.6 Median, lower quartiles, and upper quartiles of validation velocity vx vs. validation pruning rate ρ averaged across validation terrains for different pruning criteria, VSR morphologies, and ANN topologies. . . . 34
7.1 Comparison of existing spiking neuron models in terms of biological plausibility and computational cost (image taken from [33]). The y axis describes the biological plausibility by the number of biological features that each model implements, whereas the x axis accounts for the implementation cost in terms of number of floating point operations (# of FLOPS) needed to simulate the model for 1 ms. . . . 40
7.2 Different firing patterns resulting from different combinations of values for a, b, c, and d. Electronic version of the figure and reproduction permissions are freely available at www.izhikevich.com. . . . 42
7.3 Schematic representation of the controller for VSRs with its inputs and outputs at each simulation time step k. . . . 44
7.4 Schematic representation of the MLP controller for VSRs with its inputs and outputs at each simulation time step k. . . . 45
7.5 Schematic representation of the SNN controller for VSRs with its inputs and outputs at each simulation time step k. . . . 45
7.6 Graphical representation of the encoding schemes and comparison on how they work on a sample spike train. The first row shows the original spike train, the second row the encoding, and the last row how the spike train would be decoded. . . . 46
7.7 A schematic representation of the SNN controller together with the input and output converters for a 3 × 1 VSR with one sensor in each voxel. Blue and red curves represent the links between the converters and inputs or outputs respectively. . . . 48
8.1 Schematic representation of the configuration employed for tuning input and output converters parameters. The schema represents a LIF neuron preceded and followed by converters. The input converter is fed with a numerical signal and outputs spike trains to pass on to the neuron (red), which in turn outputs spike trains that are converted to a numerical signal by the output converter (blue). . . . 51
8.2 Input (dashed) and output (continuous) signals for the system of Figure 8.1 with the configurations of Table 8.1 (one per column). . . . 53
8.3 Box plots of the fitness vx distributions of the best VSRs at the end of evolution for different ANN topologies (plot column), VSR morphologies (plot row), and types of controller (color). Above the box plots, the p-values resulting from a Mann-Whitney U test with the null hypothesis that the distribution originated from MLP controllers is the same as each one generated by SNN controllers. . . . 57
8.4 Median of the fitness vx of the best individuals during evolution for different ANN topologies (plot column), VSR morphologies (plot row), and types of controller (color). . . . 59
8.5 Box plots of the average vx on unseen terrains of the best individuals resulting from evolution, for different ANN topologies (plot column), VSR morphologies (plot row), and types of controller (color). Above the box plots, the p-values resulting from a Mann-Whitney U test with the null hypothesis that the distribution originated from MLP controllers is the same as each one generated by SNN controllers. . . . 60
8.6 Raster plots of the spikes produced by the neurons of two neural controllers with 2 hidden layers during the locomotion of the VSR (biped) on a flat terrain. Each row corresponds to a neuron and each color indicates the layer in the SNN the neuron belongs to. . . . 62
8.7 Membrane potential v and threshold potential vth evolution over time for a LIF and a LIFH neuron fed with the same inputs. . . . 63
8.8 Box plot of velocities distributions for SNN with lmem = 5 and nlayers = 2 evolved with and without homeostasis (color) and re-assessed on a flat terrain with and without homeostasis (plot column). . . . 64
8.9 VSR morphologies employed to validate the performance of LIFH-SNN. . . . 65
8.10 Distribution of vx of the best individuals after evolution for the VSR morphologies of Figure 8.9 (one per column) controlled by a MLP and LIFH-SNN (color). The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 66
8.11 Box plots of the average vx on unseen terrains of the best individuals after evolution for the VSR morphologies of Figure 8.9 (one per column) controlled by a MLP and LIFH-SNN (color). The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 66
8.12 Distribution of vx of the best individuals after evolution for sensor configurations (a) and (b) (one per column) controlled by a MLP and LIFH-SNN (color). The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 67
8.13 Box plots of the average vx on unseen terrains of the best individuals after evolution for sensor configurations (a) and (b) (one per column) controlled by a MLP and LIFH-SNN (color). The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 67
8.14 Distributions of efficiencies ex and velocities vx of the best VSRs at the end of evolution for MLP and LIFH-SNN controllers. The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 69
8.15 Distributions of average efficiencies ex and velocities vx on new terrains of the best evolved VSRs for MLP and LIFH-SNN controllers. The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 70
8.16 Box plots of the velocities vx of the best individuals at the end of evolution for each sensor configuration and controller architecture (plot column), and controller type (color). The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 72
8.17 Box plots of the velocities vx of the best individuals on unseen terrains for each sensor configuration and controller architecture (plot column), and controller type (color). The value above each couple of box plots is the p-value resulting from a Mann-Whitney U test with the null hypothesis that the distributions of the two box plots are the same. . . . 73
8.18 Scatter plot of the first two components resulting from the PCA analysis for the centralized controller architecture for different types of neural controllers (color). The velocity vx of the VSR is proportional to the size of the bubble. . . . 76
8.19 Scatter plot of the first two components resulting from the PCA analysis for the homo-distributed controller architecture for different types of neural controllers (color). The velocity vx of the VSR is proportional to the size of the bubble. . . . 76
8.20 Scatter plot of the first two components resulting from the PCA analysis for the centralized controller architecture for different sensor configurations (shape) and neural controllers (color). The velocity vx of the VSR is proportional to the size of the bubble. . . . 77
List of Tables

8.1 Parameter configurations chosen for experimenting with input and output converters. . . . 52
8.2 Number of parameters to be optimized by the EA for each controller and ANN topology. Note that the sinusoidal controller does not depend on the ANN size, but is inserted in this table for easing references. The number of weights to optimize for the MLP is always slightly larger than for SNNs as it includes each neuron bias, which SNNs do not have. . . . 56
8.3 Number of parameters to be optimized by the EA for each controller and VSR morphology. . . . 65
8.4 Number of parameters to be optimized by the EA for each controller and sensor equipment. . . . 68
8.5 Number of parameters to be optimized by the EA for each VSR combination of distributed controller, sensor configuration and NN type. . . . 71
List of Algorithms

1 The algorithm for pruning a vector θ of ANN parameters given the parameters scope, criterion, and ρ. . . . 21
2 The simple EA used for neuroevolution. . . . 29
Introduction

Through evolution, nature has come up with unique solutions to a wide range of difficult problems, among which life is surely the most noteworthy. Hence, various disciplines have taken inspiration from it to handcraft solutions to many scientific and engineering challenges. The field of robotics is no exception, as the design of artificial agents is often influenced by how their natural counterparts work. Namely, natural organisms generally adapt well when faced with unforeseen circumstances, i.e., new environments, and such a trait is of paramount importance for robots, which should ideally be as robust and general as possible.

In this context, Voxel-based Soft Robots (VSRs), a form of modular robots composed of aggregates of soft cubes (voxels), promise to be suitable for achieving better generalization performance than their rigid equivalents. As a matter of fact, VSRs display two properties that make them intrinsically more prone to adaptability: (a) softness, which is strongly related to compliance (observed in innumerable living beings), and (b) modularity, which can be exploited to produce favorable shapes and to react to partial damages in the body of the agent. In addition, other features strongly bind them with biological agents: (a) they have sensors to perceive their status and the environment around them, (b) they move thanks to the actuation of the voxels, which contract and expand similarly to muscular tissue, and (c) they can process sensory information to produce actuation by means of an Artificial Neural Network (ANN). A broad description of VSRs is provided in Chapter 2.

Due to their high expressiveness and the complex interactions between their components, engineering VSRs by hand is unfeasible. For this reason, designers often rely on optimization to devise morphologies or controllers that are suitable for a given task. In this work, we focus on the optimization of VSR controllers applying evolutionary optimization. Chapter 1 is hence dedicated to providing the reader with some background knowledge on topics like evolutionary optimization and ANNs.

VSR controllers are currently based on fully-connected feed-forward ANNs, also known as Multi Layer Perceptrons (MLPs), which very coarsely approximate biological Neural Networks (NNs), sacrificing biological resemblance in favor of simplicity and lower computation costs.
Therefore, the aim of this work is to introduce biologically inspired features into the controllers of VSRs, with the purpose of achieving some of the positive traits that characterize living beings' brains, such as high energy efficiency and adaptability. We target this goal along two different research directions, respectively in Part I and Part II.

First, we introduce the biologically inspired technique of synaptic pruning of the controller, while leaving the computational model untouched. It is, in fact, widely recognized that pruning in biological neural networks plays a fundamental role in the development of brains and their ability to learn. Hence, the target of this part of the work is to investigate whether pruning can improve the adaptability of artificial agents, while also easing computation. A broad overview of previous works encompassing pruning is provided in Chapter 3, while a more detailed description of pruning techniques is available in Chapter 4. Our findings, described in Chapter 5, are that, with some forms of pruning, a large portion of the connections can be pruned without strongly affecting robot capabilities. In addition, we observe sporadic improvements in generalization ability.

As a second part of this work, we shift the computational model of the ANN to Spiking Neural Networks (SNNs), which mimic natural NNs more accurately. The main idea behind the model is that, just like biological neurons, Spiking Neurons do not transmit on each propagation cycle, but only fire when their membrane potential exceeds a given threshold. Therefore, information is encoded in the temporal distribution of spikes transmitted by neurons. Peculiar to such NNs is the possibility to enable some self-regulatory mechanisms, such as homeostasis, which, together with some forms of learning, could contribute to dramatically increasing the generalization ability of robots. Previous works related to the usage of SNNs are described in Chapter 6, whereas in Chapter 7 we give an insight into the challenges faced when employing SNNs as robotic controllers. The results, described in Chapter 8, show that SNNs with homeostasis not only significantly outperform MLPs throughout evolution, but also display a much higher adaptability to unforeseen circumstances, one of the most desirable yet most difficult traits to achieve in artificial agents.

All in all, the outcomes of this work demonstrate how enhancing the biological resemblance of VSRs can have beneficial effects on the overall performance achieved. This work could serve as a starting point to improve the biological plausibility of other aspects of VSRs, e.g., the sensors, and to investigate the relationships occurring between modularity and biological resemblance, e.g., moving towards some primitive form of robotic tissue.
Chapter 1

Background

The aim of this introductory chapter is to provide the reader with enough background knowledge to successfully understand the techniques utilized in this work.

1.1 Neural Networks

Artificial Neural Networks (ANNs) are computational models that take inspiration from biological neural networks (NNs). ANNs are composed of artificial neurons interconnected by weighted connections (weights) that resemble the different intensities of synaptic contact. Neurons are modeled after electrically excitable nerve cells, which pass electrical signals along the synapses [50].

ANNs attempt to mimic biological neural circuits in order to reflect their behavior and adaptive features. Hence, they are often used for solving difficult problems, like modeling complex relationships between inputs and outputs, or finding patterns in data, where hand coding a solution might not be feasible.

Feed-forward ANNs are ANNs where information flows in only one direction and cycles are not allowed. Such ANNs are organized into layers: (a) input, (b) hidden, and (c) output layers. The input layer receives information from some source and forwards it to neurons in the hidden layers, until it reaches the neurons in the output layer. Figure 1.1 displays a fully-connected feed-forward ANN architecture.

ANNs mainly differ in the model used to compute the neurons outputs and in the encoding of information: the next two sections, Section 1.1.1 and Section 1.1.2, provide a brief overview of the two models employed in this work. Note that since we are utilizing ANNs as controllers for robotic agents, we introduce the concept of time in the description of the computation of the outputs given some inputs.
Figure 1.1: Graph representation of a fully-connected feed-forward ANN with three input neurons, two hidden layers with five neurons each, and three output neurons.

1.1.1 Multi Layer Perceptron

The McCulloch and Pitts neuron model (perceptron) is one of the most commonly used [50] for computing neurons outputs. In this model, at each propagation cycle, i.e., at every time step $k$, each neuron calculates the weighted (according to the incoming weights $\{w_1, w_2, \ldots, w_n\}$) sum of inputs $\{x_1(k), x_2(k), \ldots, x_n(k)\}$, $s(k)$, which is then passed through some activation function $\varphi$ to compute the output $y(k)$, as described by Equation (1.1). Figure 1.2 displays a schematic representation of the model.

$y(k) = \varphi(s(k)) = \varphi\left(\sum_{i=1}^{n} w_i x_i(k)\right)$  (1.1)

A fully-connected feed-forward ANN consisting of perceptrons is called a Multi Layer Perceptron (MLP).

Figure 1.2: Schematic representation of a McCulloch and Pitts neuron (perceptron).
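As an illustration, Equation (1.1) can be sketched in a few lines of Python; the choice of tanh as the activation function $\varphi$ is ours, made only for the sake of the example.

```python
import numpy as np

def perceptron_output(weights, inputs, activation=np.tanh):
    # s(k) = sum_i w_i * x_i(k), then y(k) = phi(s(k)) as in Equation (1.1)
    s = np.dot(weights, inputs)
    return activation(s)

# example: a neuron with three inputs and arbitrary weights
y = perceptron_output(np.array([0.5, -1.0, 0.2]), np.array([1.0, 0.3, 0.7]))
```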
1.1.2 Spiking Neural Networks

Spiking Neural Networks (SNNs) [32, 70] are a more biologically plausible type of ANNs, which process information in the form of sequences of spikes. Differently from neurons in regular ANNs, spiking neurons do not transmit on each propagation cycle. Instead, similarly to biological neurons, they have a membrane potential that changes each time the neuron receives a spike in input, and, whenever the membrane potential surpasses a given threshold, the neuron produces an output in the form of a spike. Hence, unlike MLPs, SNNs have an internal status, so they can be considered a form of dynamic system, where the output at time t depends not only on the input at time t, but also on the status of the system. In addition, it is worth mentioning that SNNs incorporate a temporal dimension, as information is encoded in the time distribution of the spike trains.

SNNs share with biological networks the peculiar trait of enabling some self-regulatory mechanisms, which should lead to higher performance and increased adaptability.

Neuroplasticity. Neuroplasticity [49] is the continuous process of change of the synapses in the brain in response to sensory stimuli such as pain, pleasure, smell, sight, taste, hearing, and any other sense a living organism can have. Following this paradigm, each synapse in SNNs is associated with a learning rule that determines how to adjust its weight, in order to mimic biological learning mechanisms. Namely, SNNs incorporate learning in the form of Spike Timing Dependent Plasticity (STDP) [41], where weights are modified depending on the relative timing of input and output spikes—this form of unsupervised learning in ANNs is referred to as Hebbian learning. Four different Hebbian learning rules are illustrated in Figure 1.3.

Homeostasis. Homeostasis in biology is a steady equilibrium of physical and chemical conditions of a living system. In biological brains, homeostasis is employed to counterbalance the effect of learning [73], which naturally tends to unbalance networks, strengthening and weakening synapses. Similarly, in SNNs homeostasis acts on neuron thresholds to maintain balance, since it is desirable that all neurons have approximately equal firing rates [14], although the number and weights of their inputs may strongly differ. A more detailed description of homeostasis in SNNs is available in Section 7.2.

Figure 1.3: Hebbian learning rules. The graphs illustrate the percentage adjustment of a weight ∆w(%) dependent on the relative timing between input and output spikes ∆t (image taken from [41]).
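To make the mechanism concrete, the following minimal Python sketch updates a simplified integrate-and-fire neuron with an optional homeostatic threshold adjustment; all constants are illustrative placeholders, not the parameters of the models detailed in Chapter 7.

```python
def spiking_neuron_step(v, v_th, weighted_input, leak=0.9, th_step=0.01,
                        homeostasis=False):
    """One time step of a simplified spiking neuron.

    v: membrane potential, v_th: firing threshold.
    Returns (spike, v, v_th)."""
    v = leak * v + weighted_input  # leaky integration of incoming spikes
    spike = v >= v_th
    if spike:
        v = 0.0  # reset the membrane potential after firing
        if homeostasis:
            v_th += th_step  # frequent firing raises the threshold...
    elif homeostasis:
        v_th -= th_step  # ...while silence lowers it, balancing firing rates
    return spike, v, v_th
```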
1.2 Evolutionary Algorithms

An evolutionary algorithm (EA) [16, 71] is a meta-heuristic population-based¹ optimization algorithm. The idea behind EAs arises from Charles Darwin's theory of evolution [12], where the individuals in a population compete for a limited set of resources, i.e., are under environmental pressure, which in turn causes natural selection, i.e., survival and higher mating likelihood of the fittest individuals. Given an optimization problem, the definition of a fitness function to be maximized is required, in order to evaluate the performance of individuals. As in the natural evolutionary process, individuals showing higher fitness are more prone to survive and reproduce, causing a general increase in the overall fitness of the population.

In this context, an individual is a candidate solution (phenotype) for the considered problem, which can be internally represented (genotype) as itself or as some well-defined data structure, e.g., a numerical vector. Such a distinction is drawn not only to enhance the biological resemblance of EAs, but also to allow reuse of the techniques developed for evolving a certain genotype, i.e., a certain data structure. Clearly, when using an indirect encoding, i.e., when the genotype is not the solution itself, a proper genotype-phenotype mapping must be found.

¹ The algorithm processes a set of multiple candidate solutions simultaneously.
1.2.1 General scheme

The general scheme of an EA is displayed in Figure 1.4. First, the population is initialized, either randomly or according to some background knowledge on the problem. Then, a subset of the population is selected to generate offspring: this phase is called parent selection, and resembles the biological event of mating. Just like in nature, the individual generated by two parents through recombination or crossover inherits traits from both parents. In addition to crossover, mutation might be applied to generate the new individual, in order to enhance diversity in the population, hence favoring the exploration of new regions of the solution space of the problem. Note that in some cases it is also possible to generate new individuals from parents by solely applying mutation. After the generation of a new set of individuals, environmental pressure is mimicked by computing the fitness of all the individuals and deciding on its basis which ones should be promoted to the next generation and which ones should be removed. The entire process is repeated until a termination condition is met: most commonly, a given fitness is reached or a certain computation limit, e.g., number of births or generations, is exceeded.

Figure 1.4: General scheme of an EA [16].

1.2.2 Some examples

Evolutionary Strategies. Evolutionary strategies (ESs) arise from real-valued function optimization and hence employ vectors of real numbers as genotypes [13]. In this context, reproduction takes the form of mutating one or more of the parent's gene values (real numbers) via a normally distributed perturbation N(0, σ) with zero mean and a standard deviation of σ [13].
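The scheme of Figure 1.4 can be condensed into a minimal, mutation-only Python sketch with real-valued genotypes and the Gaussian perturbation described above; the population size, σ, and the truncation-based selection are arbitrary illustrative choices.

```python
import random

def evolve(fitness, genotype_size, pop_size=50, generations=100, sigma=0.1):
    # initialization: random real-valued genotypes
    population = [[random.gauss(0.0, 1.0) for _ in range(genotype_size)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # termination: fixed number of generations
        # parent selection: keep the fittest half of the population
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        # reproduction: each parent generates one offspring by mutation
        offspring = [[g + random.gauss(0.0, sigma) for g in p] for p in parents]
        # survivor selection: parents and offspring form the next generation
        population = parents + offspring
    return max(population, key=fitness)

# example: maximize a simple fitness (closeness to the origin)
best = evolve(lambda g: -sum(x * x for x in g), genotype_size=5)
```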
Genetic Algorithms. The genetic algorithm (GA) approach uses a genetic-like "universal" fixed-length string representation as genotype for individuals, along with string-oriented reproductive operators for producing offspring [13]. This is advantageous since few, if any, changes to the GA are required when changing domain, but it is clear that the effectiveness of a GA strongly depends on the choice of the genotype-to-phenotype mapping [13].

1.2.3 Neuroevolution

Neuroevolution [76] is the application of EAs to NNs, with the ultimate goal of optimizing their performance. From the perspective of the EA, the NNs serve as the phenotype. Different features of NNs can be evolved, such as weights, topology, learning rules, and activation functions.

Historically, the most common way of evolving NNs is the evolution of weights. In this case, a NN architecture is fixed first, generally a fully-connected feed-forward NN, and then the synaptic weights are determined by means of neuroevolution instead of gradient-based methods. On the other hand, evolving topologies enables neuroevolution to explore different architectures and adapt them to best suit problems without the need for human design and predefined knowledge. Similarly, evolving learning rules enables automatic design and discovery of novel learning rule combinations within the NN. This technique is especially interesting with regards to Artificial General Intelligence (AGI) [55], because it can be considered as a process of "learning how to learn".
Chapter 2

Voxel-based Soft Robots

This chapter provides an introduction to the modular soft robots encompassed in this work: first a general overview is given, and then a more specific characterization of morphological features and state-of-the-art control technologies is presented. Last, a brief outline of studies aimed at physically realizing them is provided.

Voxel-based Soft Robots (VSRs) [27] are a kind of modular robots composed of several soft cubes (voxels). In this work we consider a 2-D variant of VSRs in which voxels are actually squares, rather than cubes, since working in 2-D makes the simulation more lightweight. However, the findings described in this study are conceptually portable to the 3-D case.

A VSR has a morphology, or body, and a controller, or brain. The morphology consists of the voxels composing the VSR, arranged in a 2-D grid. The voxels can be equipped with sensors that provide the controller with information regarding the environment and the VSR itself. The controller is in charge of determining how the area of each voxel varies over time, possibly based on the readings of the sensors of the VSR.

2.1 VSR morphology

The morphology of a VSR is an arrangement of voxels, i.e., deformable squares, organized in a 2-D grid, where two neighboring voxels are rigidly connected. Figure 2.1 shows two examples of VSR morphologies, both composed of 10 voxels.

To produce movement, the size of each voxel varies over time, similarly to biological muscles. In particular, the actual size of each voxel depends on external forces, i.e., forces caused by its interaction with other connected voxels and the ground, and on an actuation value that causes the voxel to shrink or expand. Namely, at each simulation time step $t = k \Delta t_{sim}$, the actuation value $a^{(k)}$ is assigned by the controller and is defined in $[-1, 1]$,
where $-1$ corresponds to maximum requested expansion and $1$ corresponds to maximum requested contraction.

Figure 2.1: Frames of the two VSR morphologies used in the experiments: (a) biped and (b) worm. The color of each voxel encodes the ratio between its current area and its rest area: red indicates contraction, yellow rest state, and green expansion. The circular sector drawn at the center of each voxel indicates the current sensed values: subsectors represent sensors and are, where appropriate, internally divided in slices according to the sensor dimensionality m. The rays of the vision sensors are shown in red.

More precisely, the size variation mechanism depends on the mechanical model of the voxel, either physically implemented or simulated. In this work, we experiment with 2D-VSR-Sim [47], which models each voxel with four masses at the corners, some spring-damper systems, which confer softness, and ropes, which limit the maximum distance two bodies can have. In this simulator, actuation is modeled as a variation of the rest-length of the spring-damper systems. Figure 2.2 provides a schematic representation of the mechanical model employed for one voxel.

Figure 2.2: The mechanical model of the voxel. The four masses are depicted in gray, the different components of the scaffolding are depicted in blue, green, red, and orange, and the ropes are depicted in black (image taken from [47]).
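As a toy analogue of this actuation mechanism (not the actual equations of 2D-VSR-Sim), a 1-D spring-damper whose rest length is modulated by the actuation value can be sketched as follows; all constants are illustrative.

```python
def spring_damper_force(length, speed, a, l_rest=1.0, delta=0.2,
                        k=100.0, c=1.0):
    # the actuation value a in [-1, 1] modulates the rest length:
    # a = 1 asks for maximum contraction, a = -1 for maximum expansion
    l_actuated = l_rest * (1.0 - delta * a)
    # Hooke spring force plus linear damping
    return -k * (length - l_actuated) - c * speed
```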
Moreover, a VSR can be equipped with sensors, which are located in its voxels. At each time step, a sensor S outputs a sensor reading $r_S \in [0, 1]^m$, with m being the dimensionality of the sensor type. To ensure the output is defined in $[0, 1]^m$, sensors employ a soft normalization of the values, using the tanh function and rescaling. Here, we consider four types of sensor, described below, and put at most one sensor of each type in each voxel of the VSR. For each of the following sensors we consider the average of the last 5 readings as the current sensed value.

• Sensors of type area sense the ratio between the current area of the voxel and its rest area (m = 1).
• Sensors of type touch sense whether the voxel is in contact with the ground or not and output a value of 1 or 0, respectively (m = 1).
• Sensors of type velocity sense the velocity of the center of mass of the voxel along the voxel x- and y-axes (m = 2).
• Sensors of type vision sense the distances towards the closest objects along a predefined set of directions: for each direction, the corresponding element of the sensor reading $r_S$ is the distance of the closest object, if any, from the voxel center of mass along that direction. If the distance is greater than a threshold d, it is clipped to d. We use the vision sensor with the following directions with respect to the voxel positive x-axis: $-\frac{1}{4}\pi, -\frac{1}{8}\pi, 0, \frac{1}{8}\pi, \frac{1}{4}\pi$; the dimensionality is hence m = 5.

2.2 VSR controller

2.2.1 Sinusoidal controller

The first and simplest form of controller computes each actuation value from the current time, according to a sinusoidal function [27, 11, 35] (sinusoidal controller). In particular, the actuation value of the i-th voxel at time $t = k \Delta t_{sim}$ is set to:

$a_i^{(k)} = \sin\left(2 f_i \pi k \Delta t_{sim} + \phi_i\right)$  (2.1)

where $f_i$ is the frequency and $\phi_i$ is the phase, which may differ among voxels. Given a shape of n voxels, the vector $p = [f_1 \ldots f_n, \phi_1 \ldots \phi_n]$ of frequencies and phases unequivocally defines a controller of this type. Note that this controller does not exploit sensor readings, hence it is a non-sensing controller.
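Equation (2.1) translates directly into code; a minimal Python sketch of the sinusoidal controller follows, where the value of $\Delta t_{sim}$ is an arbitrary placeholder.

```python
import math

def sinusoidal_controller(freqs, phases, k, dt_sim=1 / 60):
    # a_i(k) = sin(2 * f_i * pi * k * dt_sim + phi_i), as in Equation (2.1)
    return [math.sin(2 * f * math.pi * k * dt_sim + phi)
            for f, phi in zip(freqs, phases)]

# example: a 3-voxel VSR with per-voxel frequencies and phases
actuations = sinusoidal_controller([1.0, 1.0, 0.5], [0.0, math.pi / 2, 0.0], k=10)
```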
2.2.2 Centralized neural controller

To benefit from the sensing abilities of VSRs, Talamini et al. [68] suggested the usage of an ANN to process the information coming from sensors to produce a meaningful actuation. At each simulation time step $k \Delta t_{sim}$, this type of controller processes the vector of sensor readings $\mathbf{x}^{(k)}$ (2.2) and outputs a vector of actuation values (2.3), one for each voxel.

$\mathbf{x}^{(k)} = \left[ s_1^{(k)} \; \ldots \; s_n^{(k)} \right]$  (2.2)

$\mathbf{y}^{(k)} = \mathrm{ANN}_\theta\left(\mathbf{x}^{(k)}\right) = \left[ a_1^{(k)} \; \ldots \; a_n^{(k)} \right]$  (2.3)

This controller can be called centralized, as there is a single central ANN processing the sensory information coming from each voxel to generate all the actuation values of the VSR. Given a shape of n voxels, an ANN model, e.g., MLP or SNN, and an ANN architecture (i.e., the number and size of hidden layers), the value of θ unequivocally defines a centralized controller. The number |θ| of controller parameters depends on the overall number of sensors, the number of voxels, the ANN model, and the ANN architecture.

Figure 2.3 depicts a centralized controller for a simple VSR composed of three voxels, arranged along the x-axis. In this example, each voxel is equipped with two sensors and the ANN has one inner layer consisting of 5 neurons. As a result, supposing the ANN model is a MLP, this centralized controller has $|\theta| = (6 + 1) \cdot 5 + (5 + 1) \cdot 3 = 53$ parameters, the +1 being associated with the bias.

Figure 2.3: A schematic representation of the centralized controller for a 3 × 1 VSR with two sensors in each voxel. Blue and red curved arrows represent the connection of the MLP with inputs (sensors) and outputs (actuators), respectively.
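The parameter count of the example above can be verified with a short sketch that computes |θ| for a fully-connected MLP with biases:

```python
def mlp_num_params(layer_sizes):
    # |theta| = sum over consecutive layers of (n_in + 1) * n_out,
    # the +1 accounting for the bias of each neuron
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# the example of Figure 2.3: 6 inputs, one hidden layer of 5 neurons, 3 outputs
assert mlp_num_params([6, 5, 3]) == (6 + 1) * 5 + (5 + 1) * 3 == 53
```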
2.2.3 Distributed neural controller

As opposed to the centralized controller, Medvet et al. [45] developed a distributed controller that aims at exploiting the intrinsic modularity of VSRs. The key idea is that each voxel is equipped with an ANN, which processes local inputs to produce the actuation value for that voxel. In order to enable the transfer of information along the body of the VSR, neighboring voxels are connected by means of $n_c$ communication contacts per side. Namely, each ANN reads the sensor values together with the $4 n_c$ values coming from adjacent voxels, and in turn outputs an actuation signal and $4 n_c$ values to feed to contiguous voxels.

In detail, each ANN takes as input a vector $\mathbf{x}^{(k)}$ built as follows:

$\mathbf{x}^{(k)} = \left[ \mathbf{s}^{(k)} \; \mathbf{i}_N^{(k)} \; \mathbf{i}_E^{(k)} \; \mathbf{i}_S^{(k)} \; \mathbf{i}_W^{(k)} \right]$  (2.4)

where $\mathbf{s}^{(k)}$ are the local sensor readings at time $k \Delta t_{sim}$, and $\mathbf{i}_N^{(k)}, \mathbf{i}_E^{(k)}, \mathbf{i}_S^{(k)}, \mathbf{i}_W^{(k)}$ are the $n_c$ input communication values coming from the adjacent voxels placed above, right, below, and left—if the voxel is not connected to another voxel on a given side, the corresponding vector of communication values is the zero vector $\mathbf{0} \in \mathbb{R}^{n_c}$. The ANN outputs a vector $\mathbf{y}^{(k)}$ built as follows:

$\mathbf{y}^{(k)} = \mathrm{ANN}_\theta\left(\mathbf{x}^{(k)}\right) = \left[ a^{(k)} \; \mathbf{o}_N^{(k)} \; \mathbf{o}_E^{(k)} \; \mathbf{o}_S^{(k)} \; \mathbf{o}_W^{(k)} \right]$  (2.5)

where $a^{(k)}$ is the local actuation value, and $\mathbf{o}_N^{(k)}, \mathbf{o}_E^{(k)}, \mathbf{o}_S^{(k)}, \mathbf{o}_W^{(k)}$ are the vectors of $n_c$ output communication values going towards the adjacent voxels placed above, right, below, and left of the voxel.

Figure 2.4 shows a scheme of the portion of a distributed neural controller corresponding to one single voxel. Output communication values produced by the ANN of a voxel at $k - 1$ are used by the ANNs of the adjacent voxels at $k$. Let the subscript $x, y$ denote the position of a voxel in the VSR grid; then communication inputs and outputs of adjacent ANNs are related as follows:

$\mathbf{i}_{x,y,N}^{(k)} = \mathbf{o}_{x,y+1,S}^{(k-1)}$
$\mathbf{i}_{x,y,E}^{(k)} = \mathbf{o}_{x+1,y,W}^{(k-1)}$
$\mathbf{i}_{x,y,S}^{(k)} = \mathbf{o}_{x,y-1,N}^{(k-1)}$
$\mathbf{i}_{x,y,W}^{(k)} = \mathbf{o}_{x-1,y,E}^{(k-1)}$
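The wiring above can be sketched as follows; the data layout (a dictionary mapping grid positions to the previous-step output vectors of each side) is our own illustrative choice, not the representation used in 2D-VSR-Sim.

```python
def gather_voxel_inputs(sensors, prev_outputs, x, y, nc=1):
    """Build the input vector of Equation (2.4) for the voxel at (x, y).

    prev_outputs maps (x, y) -> {"N": [...], "E": [...], "S": [...], "W": [...]},
    the communication values emitted at the previous time step."""
    def incoming(nx, ny, side):
        o = prev_outputs.get((nx, ny))
        # a missing neighbor contributes the zero vector in R^nc
        return o[side] if o is not None else [0.0] * nc
    return (list(sensors)
            + incoming(x, y + 1, "S")   # i_N := o_S of the voxel above
            + incoming(x + 1, y, "W")   # i_E := o_W of the voxel on the right
            + incoming(x, y - 1, "N")   # i_S := o_N of the voxel below
            + incoming(x - 1, y, "E"))  # i_W := o_E of the voxel on the left
```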
Figure 2.4: A schematic representation of the portion of the distributed neural controller corresponding to one single voxel with two sensors and nc = 1 communication contact per side. Blue and red curved arrows represent the connection of the MLP with inputs (sensors and input communication contacts) and outputs (actuator and output communication contacts), respectively.

The distributed controller in a VSR can be instantiated according to two design choices: (a) there can be an identical ANN in each voxel, both in terms of architecture and weights (homo-distributed), or (b) each voxel can have its own independent ANN that can differ from the others in weights, hidden layers, and number of inputs (hetero-distributed). The main differences between the two proposed configurations regard the optimization process and the allowed sensor equipment of the VSRs. Namely, for a VSR controlled by a homo-distributed controller, each voxel needs to have the same amount of sensor readings to pass to the controller, to ensure the number of inputs fed to the ANN is the same. In addition, evolving a single ANN shared by all voxels requires less exploration, given the reduced number of parameters to optimize, but requires more fine-tuning to make it adequate for controlling each voxel and achieving a good global performance. On the contrary, the hetero-distributed architecture leaves more freedom, allowing any sensor configuration, but has a much larger search space in terms of number of parameters to optimize.
2.3 Physical realizations

Physically realizing effective VSRs is still an open challenge due to the difficulty of combining softness with modularity. Nevertheless, several groups have achieved concrete implementations of VSRs with different levels of compliance with their simulated counterparts (see Figure 2.5 for some examples).

The first attempt at physically building VSRs was performed by Hiller and Lipson [27], who were driven by the idea of designing and manufacturing locomotion objects in an automated fashion. Their model was based on 3-D printed silicone foam rubber voxels, whose actuation was determined by environment pressure modulation. However, this approach had two main drawbacks: the rather unpractical actuation and the lack of modularity caused by the impossibility of disassembling the artifact.

Two more recent works in the direction of physically realizing a VSR are those of Kriegman et al. [37] and Sui et al. [67], who rely on the softness of silicone to build voxels and exploit a pneumatic system to achieve actuation. Even though such works satisfy both the softness and modularity requirements, as silicone voxels can be manually assembled, they still suffer from practical problems like a complex fabrication phase and cable-dependent actuation.

The latest and probably most ambitious experiment involves living matter for building self-renewing bio-compatible machines [36]. This technique is completely different from the previous ones in all aspects, but is not exempt from problems, which range from the difficulty of fabrication to the limited lifetime of such cells.
Figure 2.5: Some examples of VSR physical implementations: (a) VSR made of silicone foam rubber voxels [27]; (b) silicone VSR with pneumatic actuation [37]; (c) VSR made of living matter [36].
Part I

The effects of Synaptic Pruning on Evolved Neural Controllers for Voxel-based Soft Robots
Chapter 3

Related work

This chapter provides an overview of previous research related to synaptic pruning. Such works span different fields, from biological studies (Section 3.1 and Section 3.5) to works focused on ANNs (Section 3.2, Section 3.3, and Section 3.4).

3.1 Synaptic pruning in the nervous system

It is observed that animals with larger brains are more likely to have higher learning abilities [57]. Intuitively, more neurons and synapses result in more redundancy and, in turn, in the ability to learn more and faster [57]. However, over an optimal network size threshold, adding more neurons and synapses deteriorates learning performance [56]. In addition, maintaining a large brain is expensive in terms of energy [39]. The developmental stage of the brain is characterized by hyperconnectivity, i.e., the formation of an excessive number of synapses, and superfluous synapses are to be removed. For humans, failing to do so may result in neural diseases, e.g., autism, epilepsy, schizophrenia [52]. Neural disorders are particularly affected by network topology. In particular, it is widely recognized that neural behaviors beneficial for computation and learning, such as power-law scaling of neural avalanches (criticality), are dependent on the network connectivity [26].

Synaptic pruning is carried out by glial cells. In humans, synapses are eliminated from birth until the mid-twenties [79]. It has been shown that the process of synaptic pruning results in roughly half of the synapses being eliminated until puberty, while the performance of the brain is retained [10]. In particular, the identified elimination strategy consists in the pruning of weaker synapses first, i.e., deletion based on synaptic efficacy.

Such biological findings raise questions with regard to ANN controllers for artificial agents, in particular concerning the identification of suitable network sizes and numbers of parameters needed to learn a given task. The possibility of optimizing the networks with regard to learning, robustness to noise, and other factors such as energetic cost remains an open area of research. In the following sections, several pruning strategies for ANNs are reviewed.
3.2 Pruning in ANNs

Pruning in the context of ANNs is a sparsification technique consisting in the removal of connections (synapses) between neurons. It can be motivated either by efficiency (i.e., we want our network to train or evaluate faster) or by the belief that pruned ANNs are more robust or generalize better [28]. Early attempts at pruning ANNs in the Machine Learning (ML) environment include Optimal Brain Damage (OBD) [40] and L1-norm loss regularization [21] adapted from LASSO regression [60].

Pruning techniques for ANNs can be categorized as either structured or unstructured [3]. Structured pruning removes synapses from well-defined substructures of an ANN, such as a whole neuron or a convolutional filter in a Convolutional Neural Network. On the contrary, unstructured pruning removes connections without concern for the geometry of the ANN. Since structured pruning can lead to a regular pattern of sparsity in the parameter space, it is usually possible to directly take advantage of this sparsity as far as computation is concerned (e.g., if we remove one neuron from a fully-connected ANN, we can effectively reduce the dimensions of the weight and bias matrices, thus leading to immediate computational gains). On the other hand, with unstructured pruning, the resulting irregular sparsity in the parameter tensors can be taken advantage of only via dedicated software [42] (e.g., CUSPARSE [51]) or hardware (e.g., NVIDIA Tesla A100; see https://blogs.nvidia.com/blog/2020/05/14/sparsity-ai-inference/).

Hoefler et al. [28] list a large number of heuristics for pruning an ANN. They can be categorized into (a) data-free heuristics, which in principle require no model evaluation to be applied, and (b) data-driven heuristics, which require the ANN to be evaluated on some given data points.

In this work, we adopt unstructured pruning solutions, adapting them to the context of neuroevolution. We experiment both with data-free heuristics, namely Least-Magnitude Pruning (LMP) [7, 24], which prunes parameters exhibiting small magnitude, and with data-driven heuristics, namely Contribution Variance Pruning (CVP) [72], which prunes parameters exhibiting low variance across multiple data instances. Moreover, we utilize two pruning schemes similar in concept to CVP, which prune connections carrying low signals across data instances. Eventually, we employ random pruning (as done also in [18] and [78]) as a “control group”, to verify whether the results obtained by means of the other pruning schemes are indeed due to the heuristic and not to randomness.
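As an illustration of unstructured pruning, a minimal sketch of LMP on a flat parameter vector follows; the NumPy-based representation and the function name are illustrative, not the actual implementation used in this work.

import numpy as np

def least_magnitude_pruning(theta: np.ndarray, rho: float) -> np.ndarray:
    """Zero out the fraction rho of parameters with the smallest magnitude."""
    pruned = theta.copy()
    n_pruned = int(np.floor(theta.size * rho))
    if n_pruned > 0:
        smallest = np.argsort(np.abs(theta))[:n_pruned]  # indices of smallest |theta_i|
        pruned[smallest] = 0.0
    return pruned

theta = np.random.uniform(-1.0, 1.0, size=100)
print(np.count_nonzero(least_magnitude_pruning(theta, 0.5)))  # -> 50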
3.3 Pruning ANNs in the context of statistical learning

In the context of statistical learning, ANNs are typically trained iteratively (e.g., using some variant of stochastic gradient descent), updating the parameters after each iteration (e.g., using some variant of backpropagation). In this context, a very useful resource for exploring the various pruning paradigms is [28]. We can distinguish between different kinds of pruning techniques depending on whether pruning is applied during or after the training phase. Usually, in the latter case, a performance drop is noticed in pruned networks: hence, a re-training phase follows, guided by various heuristics, after which the pruned ANN may even improve on the performance of the unpruned model, even at high sparsity rates [24, 42, 18, 58]. In the literature, it is still a matter of debate what the most effective re-training schedule is [58, 80, 77], as the pressure to find well-performing pruned ANNs trained in a time-efficient fashion is high, and how these pruned ANNs compare with respect to their unpruned counterparts [1, 2]. Nevertheless, the effect of unstructured pruning in helping with generalization is well known in the literature (e.g., [58]). On the other hand, pruning techniques acting during training struggle to keep up with the performance of an analogous unpruned ANN at high pruning rates (for instance, larger than 90%), even if recent advances such as [38] show very promising results.

3.4 Pruning ANNs in the context of neuroevolution

Unlike statistical learning, neuroevolution does not employ iterative training for ANNs. Rather, ANNs usually go through multiple phases of (a) fitness evaluation and (b) variation, via crossover and/or mutation [66]. One of the foremost approaches to neuroevolution is NEAT [65], which incorporates both crossover and mutation to jointly evolve the topology and the weights of the ANN.

There exist works applying pruning phases in addition to the ones operated by NEAT or one of its variants. For instance, in [62], pruning is operated on neural controllers in a manner inspired by OBD. The need for pruning is motivated by numerical conditioning and by empirical observations that EANT [34], a NEAT variant, was already removing a large number of parameters from the ANNs.

Recently, Gerum et al. [20] experimented with the application of random pruning to small neural controllers designed to navigate agents through a maze, concluding that pruning improves generalization. This work is of interest here since it presents an approach and a setting similar to those of our experiments, although our conclusions are different.
3.5 Pruning biologically-inspired ANNs

In the context of ANNs inspired by biological neural networks, Spiking Neural Networks (SNNs), driven by the early work of Gerstner and Kistler [19], represent what has been called the “third generation of Neural Network models” [43]. Despite inheriting the fully-connected structures typical of MLPs, they differ greatly from their statistical learning counterparts, as (a) the input is encoded in a temporal rather than spatial structure, and (b) the training is operated in an unsupervised manner using Hebbian-based parameter update rules [25], thus detaching from the gradient-based methods of statistical learning.

Motivated by the aforementioned discoveries on human brain connectivity, some works have experimented with the application of pruning techniques to SNNs. For instance, Iglesias et al. [31] experimented with the application of a pruning heuristic similar to CVP to SNNs, although their work was not focused on producing high-performing models, but rather on observing the patterns of connectivity after various phases of pruning. Moreover, Shi et al. [61] experimented with applying LMP to SNNs during training. They were unable, though, to produce SNNs whose performance was comparable to that of the unpruned models.
Chapter 4
Pruning techniques

This chapter is devoted to thoroughly describing the pruning techniques employed in this study.

We consider different forms of pruning of a fully-connected feed-forward ANN. They share a common working scheme and differ in three parameters that define an instance of the scheme: the scope, i.e., the subset of connections that are considered for the pruning; the criterion, defining how those connections are sorted in order to decide which ones are to be pruned first; and the pruning rate, i.e., the rate of connections in the scope that are actually pruned. In all cases, the pruning of a connection corresponds to setting to 0 the value of the corresponding element θi of the network parameter vector θ.

Since we are interested in the effects of pruning on ANNs used as controllers for robotic agents, we assume that the pruning can occur during the life of the agent, at a given time. As a consequence, when defining a criterion we may use information related to the working of the network up to the pruning time, e.g., the actual values computed by the neurons.

Algorithm 1 shows the general scheme for pruning. Given the vector θ of the parameters of the ANN, we first partition its elements, i.e., the connections between neurons, using the scope parameter (as detailed below): in Algorithm 1, the outcome of the partitioning is a list (i1, . . . , in) of lists of indices of θ. Then, for each partition, we sort its elements according to the criterion, storing the result in a list of indices i. Finally, we set to 0 the elements of θ corresponding to an initial portion of i: the size of the portion depends on the pruning rate ρ and is ⌊|i|ρ⌋.

We explore three options for the scope parameter and five for the criterion parameter; concerning the pruning rate ρ ∈ [0, 1], we experimented with many values (see Chapter 5). For the scope, as shown in Figure 4.1, we have:

• Network: all the connections are put in the same partition.
function prune(θ):
    (i1, . . . , in) ← partition(θ, scope)
    foreach j ∈ {1, . . . , n} do
        i ← sort(ij, criterion)
        foreach k ∈ {1, . . . , ⌊|i|ρ⌋} do
            θ_{i_k} ← 0
        end
    end
    return θ
end

Algorithm 1: The algorithm for pruning a vector θ of ANN parameters, given the parameters scope, criterion, and ρ.

• Layer: connections are partitioned according to the layer of the destination neuron (also called post-synaptic neuron).

• Neuron: connections are partitioned according to the destination neuron.

For the criterion, we have:

• Weight: connections are sorted according to the absolute value of the corresponding weight. This corresponds to LMP (see Chapter 3).

• Signal mean: connections are sorted according to the mean value of the signal they carried from the beginning of the life of the robot to the pruning time.

• Absolute signal mean: similar to the previous case, but considering the mean of the absolute value of the signal.

• Signal variance: similar to the previous case, but considering the variance of the signal. This corresponds to CVP (see Chapter 3).

• Random: connections are sorted randomly.

(a) Network. (b) Layer. (c) Neuron.
Figure 4.1: Graphical representation of the scopes taken into consideration in the pruning algorithm.
All the criteria work with ascending ordering: lowest values are pruned first. Obviously, the ordering does not matter for the random criterion. When we use the signal variance criterion and prune a connection, we take care to adjust the weight corresponding to the bias of the neuron the pruned connection goes to, by adding the signal mean of the pruned connection: this basically corresponds to making that connection carry a constant signal.

We highlight that the three criteria based on the signal are data-driven; on the contrary, the weight and random criteria are data-free. In other words, signal-based criteria operate based on the experience the ANN acquired up to the pruning time. As a consequence, they constitute a form of adaptation acting on the time scale of the robot's life, which is shorter than the time scale of the adaptation occurring during evolution; that is, they are a form of learning. As such, we might expect that, on a given robot that acquires different experiences during the initial stage of its life, the pruning may result in different outcomes. Conversely, the weight criterion always results in the same outcome, given the same robot. In principle, hence, signal-based criteria might result in a robot being able to adapt and perform well also in conditions that are different from those used for the evolution. We verified this hypothesis experimentally: we discuss the results in Chapter 5. A minimal code sketch of the pruning scheme described in this chapter is given below.
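To fix ideas, the following Python sketch mirrors the scheme of Algorithm 1, assuming the connections are stored in a flat NumPy vector; the partition indices (one array per scope partition) and the per-connection criterion values are assumed to be provided by the caller, and the bias adjustment performed for the signal variance criterion is omitted. All names are illustrative.

import numpy as np

def prune(theta, partition_indices, criterion_values, rho):
    """Sketch of Algorithm 1.
    theta: flat vector of connection weights;
    partition_indices: one index array per scope partition (the network scope
        corresponds to a single partition containing all indices);
    criterion_values: per-connection sorting keys, e.g., np.abs(theta) for the
        weight criterion, running signal statistics for the data-driven ones,
        or np.random.rand(theta.size) for the random criterion;
    rho: pruning rate in [0, 1]."""
    theta = theta.copy()
    for idx in partition_indices:
        # ascending order: connections with the lowest criterion value are pruned first
        order = idx[np.argsort(criterion_values[idx])]
        n_pruned = int(np.floor(len(order) * rho))
        theta[order[:n_pruned]] = 0.0
    return theta

# example: network scope with the weight criterion (i.e., LMP)
theta = np.random.uniform(-1.0, 1.0, size=360)
pruned = prune(theta, [np.arange(theta.size)], np.abs(theta), 0.25)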
Chapter 5
Experiments and results

This chapter presents the research questions addressed in the first part of this thesis and answers them through experimental evaluation. Namely, each section analyzes a question in depth, describes the design of the experiments and the experimental settings, and outlines the obtained results.

We performed several experiments in order to answer the following research questions:

RQ1 Is the evolution of effective VSR controllers adversely affected by pruning? Does the impact depend on the type of pruning and on the size of the ANN?

RQ2 Does pruning have an impact on the adaptability of the evolved VSR controllers to different tasks? Does the impact depend on the type of pruning and on the size of the ANN?

To answer these questions, we evolved the controllers for two different robots, each with three different ANN topologies: during the evolution, we enabled different variants of pruning, including, as a baseline, the case of no pruning. We considered the task of locomotion, in which the goal for the robot is to travel as fast as possible on a terrain. We describe the experimental procedure in detail and discuss the results in Section 5.2.

After the evolution, we took each evolved robot and measured its performance in locomotion on a set of terrains different from the one used during the evolution, in order to assess the adaptability of the robot. We describe the procedure and discuss the results in Section 5.3.

In order to reduce the number of pruning variants to consider when answering RQ1 and RQ2, we first performed a set of experiments to assess the impact of pruning in a static context, i.e., on ANNs not subjected to evolutionary optimization. We present these experiments and our findings in the next section.
5.1 Static characterization of pruning variants

We aimed at evaluating the effect of different forms of pruning on ANNs in terms of how the output changes with respect to no pruning, given the same input. In order to make this evaluation significant with respect to the use case of this study, i.e., ANNs employed as controllers for VSRs, we considered ANNs with a topology that resembles the one used in the next experiments and fed them with inputs that resemble the readings of the sensors of a VSR performing locomotion.

In particular, for the ANN topology we considered three input sizes ninput ∈ {10, 25, 50} and three depths nlayers ∈ {0, 1, 2}, resulting in 3 × 3 = 9 topologies, all with a single output neuron. For the topologies with inner layers, we set the inner layer size to the size of the input layer. In terms of the dimensionality p of the vector θ of the parameters of the ANN, the considered topologies correspond to values ranging from p = (10 + 1) · 1 = 11, for ninput = 10 and nlayers = 0, to p = (50 + 1) · 50 + (50 + 1) · 50 + (50 + 1) · 1 = 5151, for ninput = 50 and nlayers = 2, where the +1 accounts for the bias. We instantiated 10 ANNs for each topology, setting θ by sampling the multivariate uniform distribution U(−1, 1)^p of appropriate size, hence obtaining 90 ANNs.

Concerning the input, we fed the network with sinusoidal signals with a different frequency for each input, discretized in time with a time step of ∆t = 1/10 s. Precisely, at each time step k, with t = k∆t, we set the ANN input to $x^{(k)}$, with $x_i^{(k)} = \sin \frac{k \Delta t}{i+1}$, and we read the single output $y^{(k)} = \mathrm{ANN}_{\theta}(x^{(k)})$.

We considered the 3 × 5 pruning variants (scopes and criteria) and 20 values for the pruning rate ρ, evenly distributed in [0, 0.75]. We took each one of the 90 ANNs and each one of the 300 combinations of pruning variant and pruning rate, we applied the periodic input for 10 s, triggering the actual pruning at t = 5 s, and we measured the mean absolute difference e between the output $\mathrm{ANN}_{\theta}(x^{(k)})$ during the last 5 s, i.e., after pruning, and the output $\mathrm{ANN}_{\hat{\theta}}(x^{(k)})$ of the corresponding unpruned ANN:
$$ e = \frac{1}{50} \sum_{k=50}^{k=100} \left| \mathrm{ANN}_{\theta}\left(x^{(k)}\right) - \mathrm{ANN}_{\hat{\theta}}\left(x^{(k)}\right) \right| \qquad (5.1) $$

Figure 5.1 summarizes the outcome of this experiment. It displays one plot for each ANN topology (i.e., combination of nlayers and ninput) and, for each pruning variant, one line showing the mean absolute difference e, averaged across the 10 ANNs with that topology, vs. the pruning rate ρ: the color of the line represents the criterion, the line type represents the scope. Larger ANNs correspond to the plots toward the bottom right of the matrix of plots.

By looking at Figure 5.1 we can make the following observations. First, the factor that appears to have the largest impact on the output of the pruned ANN is the criterion (the color of the line in Figure 5.1).
Figure 5.1: Mean absolute difference e between the output of a pruned ANN and the output of the corresponding unpruned ANN vs. the pruning rate ρ, for different ANN structures and with different pruning criteria (color) and scopes (linetype).
Weight and absolute signal mean criteria consistently result in lower values of the difference e, regardless of the scope and of the pruning rate. On the other hand, with the signal mean criterion, e becomes large even at low pruning rates: beyond ρ ≈ 0.1 there seems to be no further increase in e. Interestingly, the random criterion appears to be less detrimental, in terms of e, than signal mean in the vast majority of cases. We explain this finding by the kind of input these ANNs have been fed with, that is, sinusoidal signals: the mean of signals that are periodic with a period sufficiently shorter than the time before the pruning is close to 0, and this results in connections actually carrying some information being pruned. We recall that we chose to use sinusoidal signals because they are representative of the sensor readings a VSR doing locomotion could collect, in particular when exhibiting an effective gait, which likely consists of movements that are repeated over time.

Second, apparently, there are no marked differences among the three values of the scope parameter. As expected, for the shallow ANNs (with nlayers = 0) the scope parameter does not play any role, since there is one single layer and one single output neuron (the same destination for all connections).

Third, the pruning rate ρ impacts e as expected: in general, the larger ρ, the larger e. However, the way e changes as ρ increases seems to depend on the pruning criterion: for weight and absolute signal mean, Figure 5.1 suggests a linear dependency. For the other criteria, e quickly increases with ρ and then either remains stable, for signal mean, or slowly increases, for signal variance and random.

Fourth and finally, the ANN topology appears to play a minor role in determining the impact of pruning. The ANN depth (i.e., nlayers) seems to slightly impact the difference between pruning variants: the deeper the ANN, the fuzzier the difference. Concerning the number of inputs ninput, looking at Figure 5.1 we are not able to make any strong claim.

Based on the results of this experiment, summarized in Figure 5.1, we decided to consider only the weight, absolute signal mean, and random criteria, and only the network scope, for the next experiments.

To better understand the actual impact of the chosen pruning variants on the output y(k) of an ANN, we show in Figure 5.2 the case of two ANNs. The figure shows the value of the output of the unpruned ANN (in gray), when fed with the input described above (up to t = 20 s), and the outputs of the 3 × 4 pruned versions of the same ANN, according to the three chosen criteria and four values of ρ.
Figure 5.2: Comparison of the output of pruned and unpruned versions of two ANNs of different structures: ninput = 10, nlayers = 0, above, and ninput = 100, nlayers = 2, below.
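To make the measurement of Equation (5.1) concrete, the following sketch computes e for a given pruned/unpruned parameter pair under the sinusoidal inputs described above; ann(theta, x) is a hypothetical stand-in for the actual ANN forward pass, and all other names are illustrative.

import numpy as np

def mean_output_difference(ann, theta_unpruned, theta_pruned, n_input, dt=0.1):
    """Mean absolute output difference e of Equation (5.1).
    ann(theta, x) is an assumed forward-pass function returning the single
    output of the ANN with parameters theta on input x."""
    diffs = []
    for k in range(50, 101):  # last 5 s, i.e., after the pruning at t = 5 s
        # one sinusoid per input, with input-dependent frequency
        x = np.array([np.sin(k * dt / (i + 1)) for i in range(n_input)])
        diffs.append(abs(ann(theta_pruned, x) - ann(theta_unpruned, x)))
    return sum(diffs) / 50.0  # 1/50 prefactor as in Equation (5.1)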
5.2 RQ1: impact on the evolution

In order to answer this question, we evolved the controllers for six VSRs, resulting from the combination of two morphologies and three ANN topologies. For each combination, we optimized the weights of the ANN with and without pruning.

Figure 2.1 shows the two morphologies. Both consist of 10 voxels and have several sensors. We put area sensors in each voxel, velocity sensors in the voxels of the top row, touch sensors in the voxels of the bottom row (just the two “legs” for the biped), and vision sensors in the voxels of the rightmost column. As a result, both morphologies correspond to the same number of inputs and outputs for the ANN, 35 and 10 respectively. Concerning the ANN topologies, we experimented with nlayers ∈ {0, 1, 2}. For the ANNs with inner layers, we set the size of those layers to 35. These settings resulted in a size p of the parameter vector θ of 360, 1620, and 2880, respectively.

For each of the six combinations of morphology and ANN topology, we used three different pruning criteria: weight, absolute signal mean, and random, all with network scope, as thoroughly described in Chapter 4. For each criterion, we employed the following pruning rates: ρ ∈ {0.125, 0.25, 0.5, 0.75}. Furthermore, we evolved, for each combination, an ANN without pruning, to have a baseline for meaningful comparisons.

To perform evolution we used the simple EA described in Algorithm 2, a form of ES (Section 1.2.2). At first, npop individuals, i.e., numerical vectors θ, are put in the initially empty population, each generated by assigning to every element of the vector a value randomly sampled from a uniform distribution over the interval [−1, 1]. Subsequently, ngen evolutionary iterations are performed. On every iteration, which corresponds to a generation, the fittest quarter of the population is chosen to generate npop − 1 children, each obtained by adding values sampled from a normal distribution N(0, σ) to the element-wise mean µ of all parents. The generated offspring, together with the fittest individual of the previous generation, form the population of the next generation, which maintains the fixed size npop.

We used the following EA parameters: npop = 48, ngen = 416 (corresponding to 20 000 fitness evaluations), and σ = 0.35. We verified that, with these values, the evolution was in general capable of converging to a solution, i.e., longer evolutions would have resulted in negligible fitness improvements.

We optimized VSRs for the task of locomotion: the goal of the VSR is to travel as fast as possible on a terrain along the positive x axis. We quantified the degree of achievement of the locomotion task of a VSR by performing a simulation of duration tf and measuring the VSR average velocity
$$ v_x = \frac{x(t_f) - x(t_i)}{t_f - t_i} $$
where x(t) is the position of the robot center of mass at time t and ti is the initial time of assessment. In the EA of Algorithm 2 we hence used vx as the fitness for selecting the best individuals.
function evolve():
    P ← ∅
    foreach i ∈ {1, . . . , npop} do
        P ← P ∪ {0 + U(−1, 1)^p}
    end
    foreach g ∈ {1, . . . , ngen} do
        Pparents ← bestIndividuals(P, ⌊|P|/4⌋)
        µ ← mean(Pparents)
        P′ ← {bestIndividuals(P, 1)}
        while |P′| < npop do
            P′ ← P′ ∪ {µ + N(0, σ)^p}
        end
        P ← P′
    end
    return bestIndividuals(P, 1)
end

Algorithm 2: The simple EA used for neuroevolution.

We set tf = 60 s and ti = 20 s to discard the initial transitory phase. For the controllers with pruning, we set the pruning time at tp = 20 s.

We remark that the EA of Algorithm 2 constitutes a form of Darwinian evolution with respect to pruning: the effect of pruning on an individual does not impact the genetic material that the individual passes to its offspring. More precisely, the element-wise mean µ is computed by considering the parents' θ vectors before the pruning. We leave the investigation of pruning effects on neuroevolution in the case of Lamarckian evolution to future work.

To favor generalization, we evaluated each VSR on a different randomly generated hilly terrain (Figure 5.5b), i.e., a terrain with hills of variable heights and distances between each other. To avoid propagating VSRs that were fortunate in the random generation of the terrain, we re-evaluated, on a new terrain, the fittest individual of each generation when moving it to the population of the next generation.

For each of the 2 × 3 × (3 × 4 + 1) combinations of VSR morphology, ANN topology, pruning criterion, and pruning rate (the +1 being associated with no pruning) we performed 10 independent evolutionary optimizations, i.e., with different random seeds, with the aforementioned EA. We hence performed 780 evolutionary optimizations.

Figure 5.3 summarizes the findings of this experiment. In particular, the plots show how the pruning rate ρ impacts the fitness of the best individual of the last generation, for the different VSR morphologies and ANN topologies employed in the experiment.
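For concreteness, the EA of Algorithm 2 can be rendered as a short Python sketch, assuming a fitness callable that runs a simulation and returns vx; this is an illustrative translation under that assumption, not the actual implementation used in this work.

import numpy as np

def evolve(fitness, p, n_pop=48, n_gen=416, sigma=0.35, seed=0):
    """Sketch of the simple EA of Algorithm 2.
    fitness: callable mapping a parameter vector theta to its v_x;
    p: dimensionality of theta."""
    rng = np.random.default_rng(seed)
    population = [rng.uniform(-1.0, 1.0, size=p) for _ in range(n_pop)]
    for _ in range(n_gen):
        # rank by fitness; note that this re-evaluates the surviving elite too,
        # consistently with the re-evaluation on a new terrain described above
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: n_pop // 4]              # fittest quarter
        mu = np.mean(parents, axis=0)               # element-wise mean of parents
        population = [ranked[0]]                    # the fittest individual survives
        while len(population) < n_pop:
            population.append(mu + rng.normal(0.0, sigma, size=p))
    return max(population, key=fitness)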
Figure 5.3: Fitness vx (median with lower and upper quartiles across the 10 repetitions) vs. pruning rate ρ, for different pruning criteria (color), VSR morphologies (plot row), and ANN topologies (plot column).

What immediately stands out from the plots is how individuals whose controllers have been pruned with the weight or absolute signal mean criteria significantly outperform those which have undergone random pruning. This suggests that randomly pruning controllers at each fitness evaluation is detrimental to their evolution. In fact, individuals with a good genotype could perform poorly after the removal of important connections, while others could surpass them thanks to a luckier pruning; hence, the choice of the fittest individuals for survival and reproduction could be distorted. Moreover, Figure 5.3 confirms that the employed heuristics, based on the weight and absolute signal mean criteria (Chapter 4), successfully select less important connections for removal, thus limiting the damage of connection removal.

In addition, comparing the subplots of Figure 5.3, there are no marked differences between the first and the second row, which leads us to conclude that the morphology of the VSR does not play a key role in determining the impact of pruning on the performance of the controller. On the contrary, different ANN topologies seem to be affected differently by pruning. The subplots of the first column (nlayers = 0), in particular, suggest that pruning could have a beneficial impact on shallow networks. However, the upper and lower quartiles reveal that the distribution of the fitness vx is spread across a considerably large interval, hence it is difficult to draw any sharp conclusion on the possible benefits of pruning for such controllers.
Differently, for ANN topologies with nlayers ∈ {1, 2}, we can note that a higher pruning rate ρ leads to weaker performance of the controller. In this case, the result is in line with our expectations, as an increasing ρ means that we are removing more connections from the ANN, thus making it harder for the signal to spread across neurons. Nevertheless, controllers pruned with a proper heuristic have evolved to achieve results comparable to those of controllers that have not undergone pruning during their evolution, considered here as the baseline. We performed a Mann-Whitney U test with the null hypothesis that, for each combination of VSR morphology, ANN topology, pruning criterion, and pruning rate ρ, the distribution of the best fitness is the same as the one obtained from the corresponding baseline controller, i.e., with the same VSR morphology and ANN topology, evolved without pruning; we found that the p-value is greater than 0.05 in 30 out of 72 cases.

Based on the results of Figure 5.3, we speculate that controllers evolved with the weight and absolute signal mean pruning criteria look robust to pruning because they result from an evolution in which VSRs are subjected to pruning, rather than because those kinds of pruning are, in general, not particularly detrimental. To test this hypothesis, we carried out an additional experiment. We took all the best individuals of the last generations and we re-assessed them (on a randomly generated hilly terrain similar to the one used in evolution). For the individuals that were evolved without pruning, we performed 3 × 4 additional evaluations, introducing pruning after tp = 20 s with the previously mentioned 3 criteria and 4 rates ρ.

Figure 5.4 shows the outcome of this experiment, i.e., vx on the re-assessment plotted against the validation pruning rate ρ, for both individuals evolved with (solid line) and without (dashed line) pruning. The foremost finding is that individuals evolved with pruning visibly outperform the ones whose ancestors have not experienced pruning, for almost all pruning rates. This corroborates the explanation we provided above, that is, VSRs whose ancestors evolved experiencing pruning are more robust to pruning than VSRs that evolved without pruning.

Besides analyzing the aggregate results, we also examined the behavior of a few evolved VSRs in a comparative way, i.e., with and without pruning in the re-assessment. We found that, interestingly, in some cases the VSR starts to move effectively only after the pruning: this might suggest that pruning shaped the evolutionary path to the point that the lack of pruning becomes detrimental, similarly to what happens in the brain of complex animals (see Chapter 3). We provide videos of a few VSRs exhibiting a change in their behavior after pruning at https://youtu.be/-HCHDEb9azY, https://youtu.be/oOtJKri6vyw, and https://youtu.be/uwrtNezTrx8.
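The statistical comparison above (and the analogous one in Section 5.3) can be reproduced with standard tools; the following is a minimal sketch using SciPy's mannwhitneyu, with purely illustrative fitness values in place of the actual experimental data.

from scipy.stats import mannwhitneyu

# best fitness of the 10 runs evolved with pruning vs. the 10 baseline runs
# (illustrative values, not the actual experimental data)
vx_pruned = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7, 4.2, 4.0, 4.6, 4.3]
vx_baseline = [4.5, 4.2, 4.9, 4.1, 4.8, 4.4, 4.6, 4.0, 4.7, 4.3]

stat, p_value = mannwhitneyu(vx_pruned, vx_baseline, alternative="two-sided")
# p_value > 0.05 means the null hypothesis of equal distributions is not rejected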
Figure 5.4: Median, lower quartiles, and upper quartiles of the validation velocity vx vs. the validation pruning rate ρ of individuals evolved with and without pruning, for different ANN topologies and VSR morphologies.

5.3 RQ2: impact on the adaptability

For the sake of this research question, we define VSR controllers as adaptable if they are able to effectively accomplish locomotion on terrains they “have never seen before”, i.e., terrains that none of their ancestors ever experienced locomotion on. Hence, to assess the adaptability of the evolved controllers, we measured the performance in locomotion of the best individuals of the last generations on a set of different terrains. We experimented with the following terrains: (a) flat, (b) hilly with 6 combinations of heights and distances between hills, (c) steppy with 6 combinations of step heights and widths, (d) downhill with 2 different inclinations, and (e) uphill with 2 different inclinations (Figure 5.5). As a result, each individual was re-assessed on a total of 17 different terrains. Note that, in this experiment, controllers were not altered between evolution and re-assessment, i.e., they were re-evaluated with the same pruning criterion, if any, and the same pruning rate ρ as experienced during evolution.

Figure 5.6 displays the outcome of this experiment. Namely, for each of the different VSR morphologies, ANN topologies, and pruning criteria, the validation velocity vx (i.e., the re-assessment velocity averaged over the 17 terrains) is plotted against the pruning rate ρ.
(a) Flat. (b) Hilly. (c) Steppy. (d) Uphill. (e) Downhill.
Figure 5.5: Different types of terrains employed for measuring VSR adaptability.
Figure 5.6: Median, lower quartiles, and upper quartiles of the validation velocity vx vs. the validation pruning rate ρ, averaged across validation terrains, for different pruning criteria, VSR morphologies, and ANN topologies.

The results in Figure 5.6 are coherent with the findings of Section 5.2: comparing the subplots, we can conclude that the morphology of the VSR is not relevant in determining the impact of pruning on the adaptability, whereas the ANN topology plays a key role. More in detail, for shallow networks, pruning seems to enhance adaptability, whereas it has a slightly detrimental effect for deeper networks (which are, however, in general better than shallow ones). In any case, for controllers evolved employing the weight or absolute signal mean pruning criteria, the validation results are comparable to those of controllers evolved without pruning. We performed a Mann-Whitney U test with the null hypothesis that, for each combination of VSR morphology, ANN topology, pruning criterion, and pruning rate ρ, the distribution of the average validation velocities across all terrains is the same as the one obtained from the validation of the corresponding baseline controller, i.e., with the same VSR morphology and ANN topology, evolved without pruning; we found that the p-value is greater than 0.05 in 38 out of 72 cases.

5.4 Summary of results

In this chapter we aimed at experimentally characterizing the impact of pruning on evolved neural controllers for VSRs. First, we performed a preliminary study to choose which pruning criteria to employ, finding that the least detrimental ones are those based on the value of the synaptic weight and on the mean of the absolute value of the signal.
Then, we addressed two research questions, regarding the effectiveness and the adaptability of controllers evolved with pruning. We found that, if applied with a proper criterion, pruning does not significantly hinder the evolution of effective controllers for VSRs. In addition, we noticed that the application of pruning shaped the evolution of controllers, which became more robust to pruning than those whose ancestors had not experienced it. Last, we observed that individuals evolved with pruning do not appear significantly less adaptable to different tasks, i.e., locomotion on unseen terrains, than those evolved without pruning.
Part II
Evolved Spiking Neural Networks for controlling Voxel-based Soft Robots
Chapter 6
Related work

The aim of this chapter is to briefly go through previous research encompassing SNNs.

SNNs, driven by the early work of Gerstner and Kistler [19], represent what has been called the “third generation of Neural Network models” [43]. Such a model is not only characterized by a much stronger biological resemblance than previously proposed ANNs, but it is also dominant over other NN models in terms of the number of neurons needed to achieve the same computation [43].

However, SNNs have one fundamental limitation with respect to the previous generations of ANNs: the presence of discrete spike events and discontinuities, which hinders general-purpose supervised learning algorithms [22] based on backpropagation. Disparate methodologies have been proposed to solve this issue, achieving different degrees of success, such as the transfer of weights [15, 70] from gradient-descent-trained “traditional” ANNs to SNNs, or a more recent event-based backpropagation algorithm [75]. Among these, an important role is played by evolutionary approaches, which can lead to SNNs whose generalization ability is comparable to that of standard MLPs trained through gradient-descent-based algorithms [54].

Concerning the application of ANNs as controllers for artificial agents, SNNs appear particularly suitable for the job, thanks to several of their features. In particular, Bing et al. [5] highlight (a) speed and energy efficiency, (b) strong computational capabilities with a reduced number of spikes, and (c) better learning performance thanks to the spatial-temporal information incorporated in information processing. Hence, many have experimented in this direction, applying different training methods, e.g., unsupervised learning, reinforcement learning, or evolutionary optimization, to obtain artificial agents capable of performing diverse tasks.

The works encompassing the evolutionary optimization of SNN-based controllers are the most relevant for this study, as we are likewise aiming at evolving SNNs for controlling robotic agents. The first works in this field
are those by Floreano and Mattiussi [17] and Hagras et al. [23], who exploit EAs to evolve the weights of a SNN used as a controller. The results achieved are notable, and confirm the strengths of SNNs concerning power consumption and tolerance to noise [23]. Evolutionary optimization has also been employed to optimize the parameters [4] or the topology of the SNN [30] in a NEAT-like fashion. The work of Howard and Elfes [30] is especially interesting in this sense, as, thanks to the conjoined evolution of topology and weights, SNNs have managed to outperform other NN baselines, e.g., MLPs, in challenging conditions, displaying higher adaptability. Similar findings have been presented by Markowska-Kaczmar and Koldowski [44], who have performed a comparative study of SNNs and MLPs in controlling simulated vehicles. Such studies are exceptionally significant for our work, as they suggest VSRs could benefit from the usage of SNNs as controllers when faced with unforeseen circumstances.

More recent research on SNNs strongly relies on neuromorphic computation [59], which tackles the problem of the energy requirements of computing platforms for the realization of artificial intelligence. In this context, Bing et al. [6] and Spaeth et al. [63] exploit neuromorphic hardware for robotic implementations. The works of Spaeth et al. [63, 64], in particular, are the most similar to this study, as they experiment with a form of modular soft robots analogous to VSRs. Nevertheless, there are two key factors that distinguish our research from theirs: (a) our controller model only exploits sensory information to produce actuation, without the need for a central pattern generator, and (b) we optimize SNNs via neuroevolution instead of choosing weights by hand.
Chapter 7
Spiking Neural Networks as controllers

This chapter is dedicated to providing a more extensive description of SNNs (briefly introduced in Section 1.1.2). In particular, we describe different neuron models in Section 7.1, while we extend the contents on homeostasis and STDP in Section 7.2 and Section 7.3. Last, in Section 7.4 we deal with more technical aspects related to simulating the neurons and utilizing them as controllers of a robotic agent.

7.1 Spiking Neuron models

An overview of the wide range of existing spiking neuron models is shown in Figure 7.1: such models greatly differ both in terms of biological plausibility and computational cost.

In this work, given that we use SNNs to control a robotic agent, simulation cannot be unbearably costly, so we focus on the models distributed in the first half of the x axis of Figure 7.1. In particular, our experimentation focuses only on the integrate and fire model (Section 7.1.1), leaving the more biologically plausible Izhikevich model (Section 7.1.2) for future work. What is common to both models is the representation of the state of the neuron in terms of its membrane potential, which receives excitatory or inhibitory contributions from incoming synapses. In addition, both models require the membrane potential to exceed a threshold to produce a spike in output. After a spike is produced, the neuron goes through a reset phase, which varies according to the chosen model.

7.1.1 Integrate and Fire model

The peculiarity of this model is that the cell membrane is modeled as a capacitor. In particular, this makes the neuron “leaky”, as the summed contributions to the membrane potential decay over time with a time constant τm [9].
Figure 7.1: Comparison of existing spiking neuron models in terms of biological plausibility and computational cost (image taken from [33]). The y axis describes the biological plausibility by the number of biological features that each model implements, whereas the x axis accounts for the implementation cost in terms of the number of floating point operations (# of FLOPS) needed to simulate the model for 1 ms.
The leaky integrate and fire (LIF) model is described by Equation (7.1), where Cm is the membrane capacitance, τm = RmCm is the membrane time constant, V0 is the resting potential, Is(t) is the current caused by synaptic inputs, and Iinj(t) is a current injected into the neuron by an electrode [9]:
$$ C_m \frac{dv(t)}{dt} = -\frac{C_m}{\tau_m} \left[ v(t) - V_0 \right] + I_s(t) + I_{\text{inj}}(t) \qquad (7.1) $$
When the membrane potential v exceeds the membrane threshold vth, a spike is released and the membrane potential is reset to the resting potential V0:
$$ v > v_{\text{th}} \Rightarrow v \leftarrow V_0 \qquad (7.2) $$

Simplified Integrate and Fire Neuron model

A simplified version of the previously described model can be introduced to abstract away from its electric circuit nature. In this model, the membrane potential v is increased directly by the weighted sum of the inputs, and decays over time by a factor λdecay:
$$ \Delta v(\Delta t) = \sum_{i=1}^{n} w_i x_i - \Delta t \, \lambda_{\text{decay}} \, v \qquad (7.3) $$
Similarly to the more complex integrate and fire model, if the potential v surpasses the threshold vth, an output spike is produced and the membrane potential returns to the resting potential V0, as described by Equation (7.2).

7.1.2 Izhikevich model

The Izhikevich model, as visible from its position in the plot of Figure 7.1, merges the computational efficiency of the integrate and fire model with a much higher biological plausibility. This model, proposed by Izhikevich [32], introduces an additional variable u, denoting the membrane recovery, into the definition of the neuron state. In addition, it uses four parameters, a, b, c, and d, to describe how the values of v and u are to be updated with time:
$$ \frac{dv(t)}{dt} = 0.04 v^2(t) + 5 v(t) + 140 - u(t) + I(t) \qquad \frac{du(t)}{dt} = a \left( b v(t) - u(t) \right) \qquad (7.4) $$
Equation (7.4) can be approximated in discrete time, with a time step size ∆t, resulting in:
$$ \Delta v(\Delta t) = \Delta t \left( 0.04 v^2 + 5 v + 140 - u + I \right) \qquad \Delta u(\Delta t) = \Delta t \, a \left( b v - u \right) \qquad (7.5) $$
Figure 7.2: Different firing patterns resulting from different combinations of values for a, b, c, and d. Electronic version of the figure and reproduction permissions are freely available at www.izhikevich.com.

The aforementioned parameters affect how the values of v and u are updated, and different combinations result in neurons with varying firing patterns (see Figure 7.2):

• a determines the time scale of the recovery variable u. A smaller value of a results in a slower recovery of u.

• b determines the sensitivity of the recovery variable u to the fluctuations of the membrane potential v below the firing threshold. A larger value of b makes the recovery variable u follow the fluctuations of v more strongly.

• c determines the reset value of the membrane potential v following an output spike.

• d determines the reset value of the recovery variable u following an output spike.

Given the input values {x1, x2, . . . , xn} and the weights of the incoming synapses {w1, w2, . . . , wn}, at each time step the current I is updated with the parameter b and the weighted sum of the inputs:
$$ I = b + \sum_{i=1}^{n} w_i x_i \qquad (7.6) $$
The reset phase in the Izhikevich model is described by
$$ v > v_{\text{th}} \Rightarrow \begin{cases} v \leftarrow c \\ u \leftarrow u + d \end{cases} \qquad (7.7) $$
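A minimal discrete-time sketch of the two neuron models described above follows: the simplified integrate and fire update of Equation (7.3) with the reset of Equation (7.2), and the Izhikevich update of Equations (7.5)-(7.7). The default Izhikevich parameters (a = 0.02, b = 0.2, c = -65, d = 8) are the canonical regular-spiking values from [32]; the threshold and decay constants of the simplified model, like all other names, are illustrative placeholders.

import numpy as np

class SimplifiedIFNeuron:
    """Simplified integrate and fire neuron (Equations (7.2) and (7.3))."""

    def __init__(self, v_th=1.0, v_rest=0.0, lambda_decay=0.01):
        self.v_th, self.v_rest, self.lambda_decay = v_th, v_rest, lambda_decay
        self.v = v_rest  # membrane potential

    def step(self, weights, inputs, dt):
        # weighted input sum increases the potential, which also decays with time
        self.v += np.dot(weights, inputs) - dt * self.lambda_decay * self.v
        if self.v > self.v_th:
            self.v = self.v_rest  # reset to the resting potential (Eq. (7.2))
            return 1  # spike emitted
        return 0

class IzhikevichNeuron:
    """Izhikevich neuron in discrete time (Equations (7.5)-(7.7))."""

    def __init__(self, a=0.02, b=0.2, c=-65.0, d=8.0, v_th=30.0):
        self.a, self.b, self.c, self.d, self.v_th = a, b, c, d, v_th
        self.v, self.u = c, b * c  # membrane potential and recovery variable

    def step(self, weights, inputs, dt):
        i = self.b + np.dot(weights, inputs)  # input current (Eq. (7.6))
        self.v += dt * (0.04 * self.v ** 2 + 5 * self.v + 140 - self.u + i)
        self.u += dt * self.a * (self.b * self.v - self.u)
        if self.v > self.v_th:  # reset phase (Eq. (7.7))
            self.v = self.c
            self.u += self.d
            return 1  # spike emitted
        return 0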
7.2 Homeostasis

The biological concept of homeostasis has been briefly introduced in Section 1.1.2. As previously mentioned, this mechanism aims at balancing the firing rates of all the neurons across a SNN by regulating the individual neuron thresholds. In particular, the net membrane threshold v*th is given by:
$$ v_{\text{th}}^{*} = \min\left( v_{\text{th}} + \Theta, \; \sum_{i=1}^{n} w_i \right) \qquad (7.8) $$
where Θ is increased by Θinc every time the neuron fires, and decays exponentially with time according to Θdecay. To achieve an equilibrium, each neuron has its individual Θ, so that a neuron firing very frequently gets an increasingly large membrane threshold, resulting in a lower firing rate, whereas a neuron with weak incoming synapses gets an increased firing rate.

7.3 Spike Timing Dependent Plasticity

Hebbian learning in SNNs takes the form of STDP [41]. The incoming synaptic weights of each neuron are adjusted on every input and output spike by a factor ∆w, which depends on the relative timing of pre- and post-synaptic spikes:
$$ \Delta t(t_{\text{out}}, t_{\text{in}}) = t_{\text{out}} - t_{\text{in}} \qquad (7.9) $$
This is implemented by calculating the relative timings within a fixed time frame, called the STDP time window ∆twin, of approximately 40 ms. The weight change ∆w is calculated according to one of the following rules (displayed in Figure 1.3):
$$ \Delta w(\Delta t) = \begin{cases} A_{+} e^{-\Delta t / \tau_{+}} & \Delta t > 0 \\ -A_{-} e^{\Delta t / \tau_{-}} & \Delta t < 0 \\ 0 & \Delta t = 0 \end{cases} \qquad \text{Asymmetric Hebbian} \qquad (7.10) $$
$$ \Delta w(\Delta t) = \begin{cases} -A_{+} e^{-\Delta t / \tau_{+}} & \Delta t > 0 \\ A_{-} e^{\Delta t / \tau_{-}} & \Delta t < 0 \\ 0 & \Delta t = 0 \end{cases} \qquad \text{Asymmetric Anti-Hebbian} \qquad (7.11) $$
$$ \Delta w(\Delta t) = \begin{cases} A_{+} g(\Delta t) & g(\Delta t) > 0 \\ A_{-} g(\Delta t) & g(\Delta t) < 0 \\ 0 & g(\Delta t) = 0 \end{cases} \qquad \text{Symmetric Hebbian} \qquad (7.12) $$
$$ \Delta w(\Delta t) = \begin{cases} -A_{+} g(\Delta t) & g(\Delta t) > 0 \\ -A_{-} g(\Delta t) & g(\Delta t) < 0 \\ 0 & g(\Delta t) = 0 \end{cases} \qquad \text{Symmetric Anti-Hebbian} \qquad (7.13) $$
where g(∆t) is the Difference of Gaussians (DoG) function given by
$$ g(\Delta t) = \frac{1}{\sigma_{+} \sqrt{2\pi}} e^{-\frac{1}{2} \left( \frac{\Delta t}{\sigma_{+}} \right)^{2}} - \frac{1}{\sigma_{-} \sqrt{2\pi}} e^{-\frac{1}{2} \left( \frac{\Delta t}{\sigma_{-}} \right)^{2}} \qquad (7.14) $$
A+ and A− affect the height of the curve, while τ+ and τ− affect the width, or steepness, of the Asymmetric Hebbian functions. Recall from Section 1.1.2 that each synapse has its own associated learning rule, which can possibly be evolved by means of neuroevolution.

7.4 Data encoding

In Section 1.1.2 we highlighted the main differences between MLPs and SNNs. Namely, the former are characterized by neurons producing numerical values in output on each propagation cycle, whereas the latter process information in terms of spike trains. Therefore, spike trains need to be properly encoded in the simulation, in order to effectively propagate values across the SNN and produce a meaningful output. Section 7.4.1 is hence dedicated to addressing the problem of the encoding of spike trains.

Moreover, the fact that we are simulating SNNs as controllers for robotic agents poses additional constraints, due to the requirements of the simulator [46]. Figure 7.3 serves as a reminder of the specifications of the VSR controller (described in Section 2.2): at each simulation time step k, the controller receives a vector of sensor readings $s^{(k)} = \{s_1^{(k)}, \dots, s_m^{(k)}\} \in [0, 1]^m$ and outputs a vector of actuation values $a^{(k)} = \{a_1^{(k)}, \dots, a_n^{(k)}\} \in [-1, 1]^n$, with m and n corresponding to the number of sensor readings and actuators, respectively. If the controller is an MLP, there is no need for further reasoning, since such an ANN can be seen as a vector function, where $a^{(k)} = \text{MLP}_{\theta}(s^{(k)})$, as displayed in Figure 7.4.

Figure 7.3: Schematic representation of the controller for VSRs with its inputs and outputs at each simulation time step k.
Figure 7.4: Schematic representation of the MLP controller for VSRs with its inputs and outputs at each simulation time step k.

Figure 7.5: Schematic representation of the SNN controller for VSRs with its inputs and outputs at each simulation time step k.

On the other hand, as we have already highlighted, SNNs process information in terms of spike trains, so we need to introduce converters to translate decimal values into spike trains, and vice versa, in order to use them as controllers for VSRs. Figure 7.5 shows all the elements required to employ a SNN as a controller for a VSR: on every simulation time step, $s^{(k)}$ is translated into $t_s^{(k)}$ by the input converters I, and $t_a^{(k)}$ is converted into $a^{(k)}$ by the output converters O. The conversion strategy is discussed in Section 7.4.2. In the figure, $t_s^{(k)}$ and $t_a^{(k)}$ represent the vectors of input and output spike trains respectively, where $t_{s_i}^{(k)}$ is the spike train produced by the conversion of the i-th sensor reading and $t_{a_i}^{(k)}$ is the spike train which will be converted to obtain the i-th actuation value. The data structures used to represent $t_s^{(k)}$ and $t_a^{(k)}$ depend on the chosen encoding for spike trains. To summarize, the calculation of the actuation values with a SNN controller is defined by $a^{(k)} = O\left(\text{SNN}_{\theta}\left(I\left(s^{(k)}\right)\right)\right)$.

7.4.1 Spike trains representation

Since a spike train is produced on each simulation time step k∆tsim and the simulation frequency fsim is fixed, we can consider spike trains to always have the same time duration ∆tsim. In addition, we can assume all spikes to have the same amplitude, as weighting factors are only introduced by synaptic weights. Given these considerations, we can opt between two design choices, which take inspiration from signal processing. Figure 7.6 provides a graphical comparison between the two proposed methods for encoding a spike train.
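To tie the above together, the control flow $a^{(k)} = O\left(\text{SNN}_{\theta}\left(I\left(s^{(k)}\right)\right)\right)$ of Figure 7.5 can be sketched as follows; the interfaces to_spike_train, propagate, and to_value are hypothetical placeholders, since the concrete conversion strategies are only discussed in Section 7.4.2, and the sketch makes no assumption on how spike trains are represented.

class SNNController:
    """Sketch of the SNN-based controller pipeline of Figure 7.5.
    input_converters, snn, and output_converters are hypothetical stand-ins."""

    def __init__(self, input_converters, snn, output_converters):
        self.input_converters = input_converters    # one per sensor reading
        self.snn = snn                              # maps spike trains to spike trains
        self.output_converters = output_converters  # one per actuator

    def control(self, sensor_readings):
        # s(k) -> t_s(k): each sensor reading in [0, 1] becomes a spike train
        trains_in = [c.to_spike_train(s)
                     for c, s in zip(self.input_converters, sensor_readings)]
        # t_s(k) -> t_a(k): propagate the spike trains through the SNN
        trains_out = self.snn.propagate(trains_in)
        # t_a(k) -> a(k): each output spike train becomes a value in [-1, 1]
        return [c.to_value(t)
                for c, t in zip(self.output_converters, trains_out)]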