Recent experimental advances have made it possible to record up to several hundred neurons simultaneously in the cortex or in the retina. Analysing such data requires mathematical and numerical methods to describe the spatio-temporal correlations in population activity, which can be done with the Maximum Entropy method. A crucial parameter here is the product NR, where N is the number of neurons and R the memory depth of the correlations (how far in the past spike activity affects the current state). Standard statistical mechanics methods are limited to a spatial correlation structure with R = 1 (e.g. the Ising model), whereas methods based on transfer matrices, which allow the analysis of spatio-temporal correlations, are limited to NR = 20.
In the first part of the thesis we propose a modified version of the transfer matrix method, based on a parallel version of the Monte Carlo algorithm, allowing us to reach NR = 100.
In the second part we present EnaS, a C++ library with a graphical user interface developed for neuroscientists. EnaS offers highly interactive tools that allow users to manage data, compute empirical statistics, fit models and visualize results.
Finally, in a third part, we test our method on synthetic and real data sets. The real data sets are retinal recordings provided by our neuroscientist partners. Our non-extensive analysis shows the advantages of considering spatio-temporal correlations for the analysis of retinal spike trains, but it also outlines the limits of Maximum Entropy methods.
For more information about the software that I co-developed with my colleagues, please visit this page:
https://enas.inria.fr/
For more information about the publications, please visit this page:
https://scholar.google.fr/citations?user=L97ZODwAAAAJ
For the thesis, please visit this link:
https://www.theses.fr/178166669
6. Probabilistic models
• Maximum entropy:
  – Spatial (no memory): Ising (Schneidman et al 06), triplets (Ganmor et al 09).
  – Spatio-temporal, limited to 1 time-step memory (Marre et al 09).
  – General framework (Vasquez et al 12), limited to small scale.
• Generalized Linear Model / point process (Hawkes, Linear-Nonlinear model): neurons are considered conditionally independent given the past.
The number of recorded neurons doubles every 8 years!
7. Goal
Develop a framework to fit spatio-temporal maximum entropy models on large-scale spike trains.
Outline:
• Definitions
  – Basic concepts
  – Maximum entropy principle (spatial & spatio-temporal)
• Monte Carlo in the service of large neural spike trains
• Fitting parameters
  – Tests on synthetic data
  – Application on real data
• The EnaS software
• Discussion
8. Goal (outline repeated).
23. Transfer matrix (non-normalized):

    ℒ(w′, w) = e^{H(w_0^D)}  if w′ → w is a legal transition,  0 otherwise.

ℒ is a 2^{ND} × 2^{ND} matrix. By the Perron-Frobenius theorem it has a largest eigenvalue s, with a right eigenvector R(·) and a left eigenvector L(·).

Using the Chapman-Kolmogorov equation:

    μ(w_0^n) = e^{H(w_0^n)} R(w_{n−D+1}^n) / ( s^{n−D} L(w_0^D) )

Pressure:  P[H] = log s.

Direct computation of the Kullback-Leibler divergence between the empirical distribution π^{(T)} and the model μ_ψ:

    d_KL(π^{(T)}, μ_ψ) = P[H] − π^{(T)}[H] − S[π^{(T)}]

(pressure, minus the empirical probability of the potential, minus the entropy).

Average of the monomials:

    μ(m_l) = ∂P[H] / ∂h_l
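To make the construction concrete, here is a minimal Python sketch (not EnaS code) for a toy model with N = 2 neurons and memory depth D = 1, assuming a single hypothetical temporal coupling J·w₀(t)·w₁(t+1): it builds the non-normalized transfer matrix, extracts the Perron-Frobenius eigenvalue and eigenvectors, and checks numerically that the monomial average equals the derivative of the pressure.

```python
import numpy as np

N = 2                              # neurons (tiny, so the matrix is only 4x4)
S = 2 ** N                         # spike patterns per time step
pats = np.array([[(b >> i) & 1 for i in range(N)] for b in range(S)])

h = np.array([-0.4, 0.2])          # illustrative fields
J = 0.3                            # illustrative coupling on m(w, w') = w_0(t) w_1(t+1)

def H(w, wp):                      # potential of a depth-1 block (w at t, w' at t+1)
    return h @ wp + J * w[0] * wp[1]

L = np.array([[np.exp(H(pats[a], pats[b])) for b in range(S)] for a in range(S)])

# Perron-Frobenius: largest eigenvalue s, right eigenvector v, left eigenvector u
w_eig, V = np.linalg.eig(L)
k = np.argmax(w_eig.real)
s = w_eig.real[k]
v = np.abs(V[:, k].real)
w_eig_t, U = np.linalg.eig(L.T)
u = np.abs(U[:, np.argmax(w_eig_t.real)].real)
u = u / (u @ v)                    # normalize so the block probabilities sum to 1

P = np.log(s)                      # pressure

# stationary probability of a block: mu(w, w') = u(w) L(w, w') v(w') / s
mu_block = u[:, None] * L * v[None, :] / s

# average of the monomial m(w, w') = w_0(t) w_1(t+1) ...
m = np.array([[pats[a][0] * pats[b][1] for b in range(S)] for a in range(S)])
avg = (mu_block * m).sum()

# ... should equal dP/dJ, checked here by finite difference
eps = 1e-6
dP = (np.log(np.max(np.linalg.eigvals(L * np.exp(eps * m)).real)) - P) / eps
print(avg, dP)
```

The identity μ(m_l) = ∂P[H]/∂h_l holds because perturbing the coefficient of a monomial tilts the transfer matrix entrywise, which shifts log s by exactly the stationary average of that monomial.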
24. Fitting pipeline:
1. Setting the constraints.
2. Computing the empirical distribution π^{(T)}(m_l).
3. Random set of parameters.
4. Computing the predicted distribution μ_ψ(m_l) (transfer matrix).
5. Comparison; update the parameters.
6. Final set of parameters → predicted distribution.
25. Limitation of the transfer matrix: memory.
ℒ(w′, w) is a 2^{ND} × 2^{ND} matrix, and each entry ℒ(w′, w)(i, j) ∈ ℝ takes 1 byte, so storing it needs

    2^{ND} × 2^{ND} = 2^{2ND} bytes.

Neuron number N = 20, range R = D + 1 = 3  →  1,099,511,627,776 TB.
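These numbers are easy to re-derive; a short Python sketch of the arithmetic (assuming, as on the slide, 1 byte per matrix entry):

```python
N, R = 20, 3
D = R - 1                               # memory depth
entries = 2 ** (N * D) * 2 ** (N * D)   # a 2^{ND} x 2^{ND} matrix
bytes_needed = entries                  # 1 byte per entry, as on the slide
terabytes = bytes_needed / 2 ** 40
print(f"{terabytes:,.0f} TB")           # 1,099,511,627,776 TB
```

With N = 20 and D = 2 this is 2^80 bytes, i.e. 2^40 terabytes, which is why the exact transfer-matrix computation is hopeless beyond NR ≈ 20.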
26. The NR = 20 frontier:
    NR ≤ 20 (small scale) → transfer matrix.
    NR > 20 (large scale) → Monte Carlo.
27. Fitting pipeline (as on slide 24), with the focus on computing the predicted distribution μ_ψ(m_l) via the transfer matrix.
28. Goal (outline repeated).
34. Algorithm review
Start: random spike train of N neurons, length T, with parameters h.
Loop N_flip times:
1. Choose a random event in [D + 1, T − D − 1] and flip it.
2. Compute e^{Δℋ} (Δℋ is computed only between [−D, +D] around the flip).
3. Draw x ∈ [0, 1] uniformly: if e^{Δℋ} > x, accept the change; otherwise reject it.
Result: updated Monte Carlo spike train.
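The flip loop can be sketched in a few lines of Python, assuming a toy pairwise-with-delay potential (the fields h and couplings J below are illustrative, not fitted parameters): flipping event (i, t) only changes the potential terms that touch times t and t+1, which is what makes Δℋ local and cheap.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, D = 5, 200, 1                       # neurons, time steps, memory depth

# hypothetical potential: H = sum_t [ h . w(t) + w(t-1) . J . w(t) ]
h = rng.normal(size=N) * 0.2
J = rng.normal(size=(N, N)) * 0.1

def block_term(w, t):
    """t-th term of H: field at time t plus coupling from time t-1."""
    return h @ w[:, t] + w[:, t - 1] @ J @ w[:, t]

def delta_H(w, i, t):
    """Change of H when event (i, t) is flipped; only the terms at
    times t and t+1 are affected, so the computation is local."""
    span = [s for s in (t, t + 1) if 1 <= s < T]
    before = sum(block_term(w, s) for s in span)
    w[i, t] ^= 1
    after = sum(block_term(w, s) for s in span)
    w[i, t] ^= 1                          # undo; the caller decides
    return after - before

w = rng.integers(0, 2, size=(N, T))       # start: random spike train
accepted = 0
for _ in range(20000):
    i = rng.integers(N)
    t = rng.integers(D + 1, T - D - 1)    # stay away from the boundaries
    if np.exp(delta_H(w, i, t)) > rng.uniform():   # rule from the slide
        w[i, t] ^= 1                      # accept the change
        accepted += 1
```

The acceptance rule min(1, e^{Δℋ}) is the standard Metropolis rule for sampling a distribution proportional to e^{H}.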
35. Hassan Nasser, Olivier Marre, and Bruno Cessac. Spike trains analysis using Gibbs distributions and Monte Carlo method. Journal of Statistical Mechanics: Theory and Experiment, 2013.
37. Goal (outline repeated).
38. Fitting parameters / concept
Maximizing the entropy (difficult, because computing the exact entropy is intractable) is equivalent to minimizing the divergence

    d_KL(π^{(T)}, μ_ψ) = P[H] − π^{(T)}[H] − S[π^{(T)}]

Small scale: easy to compute. Large scale: hard to compute.
Approach: bounding the negative log-likelihood, with relaxation (the divergence shrinks over the iterations, from a big d_KL to a small d_KL).
Dudík, M., Phillips, S., and Schapire, R. (2004). Performance guarantees for regularized maximum entropy density estimation. Proceedings of the 17th Annual Conference on Computational Learning Theory.
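As a small-scale illustration of this equivalence, here is a Python sketch for a toy model with only firing-rate monomials, so that μ[m_l] can be computed exactly by enumeration. It uses plain gradient descent on the divergence (not the bounded Dudík et al. update used in the thesis), relying on ∂d_KL/∂h_l = μ[m_l] − π[m_l]:

```python
import numpy as np
from itertools import product

N = 3
states = np.array(list(product([0, 1], repeat=N)), dtype=float)  # 2^N patterns

def model_averages(h):
    """Exact mu[m_l] for mu(w) proportional to exp(h . w) (small scale: enumeration)."""
    p = np.exp(states @ h)
    p /= p.sum()
    return p @ states                     # averages of m_l(w) = w_l

target = np.array([0.2, 0.5, 0.7])        # illustrative empirical averages pi[m_l]

h = np.zeros(N)
for _ in range(2000):
    # gradient of d_KL(pi, mu_psi) with respect to h_l is mu[m_l] - pi[m_l]
    h -= 0.5 * (model_averages(h) - target)

print(model_averages(h))                  # ≈ [0.2, 0.5, 0.7]
```

At the minimum the model averages match the empirical ones exactly, which is the maximum entropy condition; at large scale the same gradient exists but μ[m_l] must be estimated by Monte Carlo.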
39. Fitting parameters / concept (repeated).
40. Fitting parameters / concept
Bounding the divergence, with relaxation.
Hassan Nasser and Bruno Cessac. Parameters fitting for spatio-temporal maximum entropy distributions: application to neural spike trains. Submitted to Entropy.
42. Fitting pipeline (Monte Carlo):
1. Setting the constraints.
2. Computing the empirical distribution π^{(T)}(m_l).
3. Random set of parameters.
4. Computing the predicted distribution μ_ψ(m_l) by Monte Carlo (fitting).
5. Comparison; update the parameters.
6. Final set of parameters → exact predicted distribution.
43. Updating the target distribution
Second derivative of the pressure:

    ∂²P[H] / ∂h_l ∂h_k = Σ_{n=−∞}^{+∞} C_{lk}(n)

Taylor expansion around the previous distribution (if δH is small):

    μ_{H+δH}(m_l) = μ_H(m_l) + Σ_k (∂²P[H]/∂h_l∂h_k) δh_k
                    + (1/2) Σ_{j,k} (∂³P[H]/∂h_l∂h_k∂h_j) δh_k δh_j + …

New h → new distribution, obtained from the previous Monte Carlo distribution.
Exponential decay of correlations → in practice the sum over n is finite.
The higher-order terms are heavy to compute.
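The Taylor update can be checked numerically on a toy spatial model where everything is computable by enumeration. The sketch below (illustrative, not EnaS code) verifies that the first-order term, with χ_{lk} = ∂²P/∂h_l∂h_k equal to the covariance of the monomials, predicts the new averages up to an O(|δh|²) error:

```python
import numpy as np
from itertools import product

N = 3
states = np.array(list(product([0, 1], repeat=N)), dtype=float)

def stats(h):
    """Exact averages mu[m_l] = dP/dh_l and covariance chi = d2P/dh_l dh_k."""
    p = np.exp(states @ h)
    p /= p.sum()
    mu = p @ states
    chi = states.T @ (p[:, None] * states) - np.outer(mu, mu)
    return mu, chi

h = np.array([0.3, -0.2, 0.1])            # illustrative parameters
dh = np.array([0.01, -0.02, 0.015])       # small parameter update

mu0, chi = stats(h)
mu_taylor = mu0 + chi @ dh                # first-order Taylor prediction
mu_exact, _ = stats(h + dh)
print(np.max(np.abs(mu_taylor - mu_exact)))   # small, of order |dh|^2
```

The same first-order correction is what allows reusing a Monte Carlo estimate of the previous distribution instead of resampling from scratch after every parameter update.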
50. Application on retinal data
Data courtesy: Michael J. Berry II (Princeton University) and Olivier Marre (Institut de la Vision, Paris).
Models: purely spatial pairwise (Schneidman et al 2006) vs. pairwise with 1 time-step memory. Data binned at 20 ms.
51. Real data, 20 neurons: spatial pairwise vs. spatio-temporal pairwise.
52. Real data, 40 neurons: spatial pairwise vs. spatio-temporal pairwise.
53. Goal (outline repeated).
54. Event neural assembly Simulation (EnaS)
    V1 (2007): Thierry Viéville, Bruno Cessac.
    V2 (2010): Juan-Carlos Vasquez / Horacio Rostro-Gonzalez / Hassan Nasser.
    V3 (2014): Selim Kraria (+ graphical user interface).
Goal: analyzing spike trains; sharing research advances with the community.
C++ & Qt (interfaces: Java, Matlab, Python).
55. Architecture
EnaS modules and contributions:
– RasterBlock: data management, formats, empirical statistics, grammar.
– Gibbs Potential: defining models, generating artificial spike trains, fitting, Monte Carlo process (parallelization).
– Graphical User Interface: interactive environment, visualization of stimulus and response simultaneously, demo.
61. Map: C++ data container; sorting in a chosen order. …
62. Map: C++ data container; sorting in a chosen order. … It appeared two times!
63. Architecture (as on slide 55).
64. Parallelization of the Monte Carlo process
The spike train is divided into stripes, and flips are performed in parallel within the stripes.
– Personal multi-processor computer: 2-8 processors → OpenMP.
– Cluster (64-processor machines at INRIA).
– MPI: more processors, but more time consuming in our case.
72. Synthetic data vs. real data
– Synthetic data: the potential shape is known (the monomials are known) → fitting only.
– Real data: the potential shape is unknown (the monomials are unknown) → guessing the shape + fitting.
73. Monomials
Canonical models: Ising, pairwise with delay, triplets, …
– Small scale: the canonical model can be used directly.
– Large scale: big computation time, non-observed monomials, estimation errors → pre-selection of the monomials (Rodrigo Cofre & Bruno Cessac, 40 neurons).
74. Making sense of parameters
The model parameters allow us to:
– evaluate the importance of a particular type of correlations;
– possibly generalize the model prediction to a new stimulus.
76. EnaS: now and future
Now: retina → spike sorting → spike train + stimulus → visualization, empirical analysis, and Maximum Entropy modelling.
Future:
– more empirical observation packages and more neural coding functionalities;
– spike sorting: receptive fields, neuron selection, type identification;
– stimulus design and feature extraction;
– retina models (VirtualRetina).
77. Next …
Starting a company in IT / data analytics:
– First prize in an innovative-project competition (UNICE Foundation).
– Current project: orientation in education using real surveys.
– EnaS is in perspective, in collaboration with INRIA.
Caty Conraux & Vincent Tricard
80. Appendix
• Tuning N_times.
• Tuning N_flip.
• Validating the Monte Carlo algorithm.
• Tuning delta.
• MPI vs. OpenMP: memory.
• Why is MPI not better than OpenMP?
• Computational complexity of the Monte Carlo algorithm.
• Review of Monte Carlo / N_flip.
• Number of iterations for fitting.
• Fluctuations on parameters / non-existing monomials.
• Epsilon on fitting parameters.
• Binning.
• Tests with several stimuli.
• Granot-Atedgi et al 2013.
85. Parallelization
• Multiprocessor computers:
  – Personal computer (2-8 processors).
  – Cluster (64-processor machines at INRIA).
• Parallel programming frameworks:
  – OpenMP: the processors of the same computer divide the tasks (live memory (RAM) is shared).
  – MPI: several processors on each computer share the task (memory is not shared).
4 processors → Time/4; 64 processors → Time/64.
86. MPI
• OpenMP is limited to the number of processors on a single machine.
• With MPI, 64 processors × 10 machines → 640 processors.
• Although we thought it would take less time with MPI, it did not.
Setup: a master computer distributes the whole Monte Carlo spike train over several 64-processor clusters.
At each change of the memory there is a communication between the clusters and the master, so at each flip more time is lost in communication than is gained in computing.
88. Computational complexity
Time taken for running this algorithm:

    Computation time = N · T · N_times · t_{Δℋ},   with t_{Δℋ} = O(L)

1. There is a loop over N · T · N_times (choose a random event, flip it, compute e^{Δℋ}, accept if e^{Δℋ} > x with x ∈ [0,1], otherwise reject).
2. In each iteration, computing e^{Δℋ} requires a loop over the monomials (length L).

On a cluster of 64 processors:
– 40 neurons, Ising: 10 min.
– 40 neurons, pairwise: 20 min.
89. Algorithm review / tuning
Start: random spike train of N neurons, length N_times, with parameters h.
Loop N_flip = N × T × N_times times:
– choose a random event in [D, N_times − D] and flip it;
– compute e^{Δℋ}; draw x ∈ [0,1]; if e^{Δℋ} > x, accept the change, otherwise reject it.
Result: updated Monte Carlo spike train.
90. How many iterations do we need?
• NR < 20: 50 parallel + 100 sequential.
• NR < 150: 100 parallel + 100 sequential.
91. ε on parameters fitting
• Dudík et al. does not allow β_l > π_l or β_l > 1 − π_l; in such a case we set β_l = 0.9 π_l.
• We avoid dividing by 0 ( … / π_l ) by setting the corresponding parameter to −∞.
92. Problem of non-observed monomials
Central limit theorem (fluctuations on the monomial averages):

    π(m_l) = ∂P/∂h_l at h*;   π(m_l) + ε = ∂P/∂h_l + (∂²P/∂h²) δ + … = ∂P/∂h_l + χ δ + …

    → h = h* + δ : fluctuations on the parameters, with δ = χ⁻¹ ε.

χ is the covariance matrix, χ_{lk} = ∂²P / ∂h_l ∂h_k (the pressure is convex).
Computing χ over 1000 potentials shows that a big percentage of χ is zero → χ⁻¹ has big values → the fluctuations δ on the parameters are big.
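A tiny numerical illustration of this amplification (the matrices below are made up for the purpose, not measured): when a monomial is almost never observed, its row of χ is nearly zero, and solving δ = χ⁻¹ε blows the same small fluctuation ε up by orders of magnitude.

```python
import numpy as np

# hypothetical susceptibility matrices chi = d2P / dh_l dh_k:
# one well-conditioned, one with an almost-never-observed monomial
chi_good = np.array([[0.25, 0.05], [0.05, 0.20]])
chi_bad  = np.array([[0.25, 0.00], [0.00, 1e-6]])   # near-zero variance row

eps = np.array([0.01, 0.01])                  # same small fluctuation on the averages

delta_good = np.linalg.solve(chi_good, eps)   # delta = chi^{-1} eps
delta_bad  = np.linalg.solve(chi_bad, eps)

print(np.abs(delta_good).max())               # modest fluctuation on the parameters
print(np.abs(delta_bad).max())                # amplified by orders of magnitude
```

This is the numerical face of the problem above: the same sampling noise ε on the constraints produces wildly different parameter uncertainties depending on the conditioning of χ.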
93. Binning
• Completely changes the statistics.
• 700% more new patterns appear when we bin at 20 ms.
• Should be studied rigorously.
95. Binning: trade-offs
– Loss of information; the biological time scale is lost.
– Denser spike train; fewer non-observed monomials.
Why have spike trains been binned in the literature?
– No clear answer.
– The argument that binning is a substitute for memory is not convincing.
– It might be because binning makes more monomials observable: less dangerous for convexity, so convergence is better guaranteed.
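A short sketch of the binning operation itself (the rates and sizes are illustrative): a neuron is said to spike in a bin if it spiked at least once inside it, and counting distinct spatial patterns before and after shows how strongly the statistics change.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T, b = 10, 5000, 20             # neurons, time steps, bin size (e.g. 20 ms at 1 ms resolution)

raster = (rng.random((N, T)) < 0.02).astype(int)   # sparse illustrative spike train

# binning: a neuron spikes in a bin if it spiked at least once inside it
binned = raster.reshape(N, T // b, b).max(axis=2)

def n_patterns(r):
    """Number of distinct spatial patterns (columns) in a raster."""
    return len({tuple(col) for col in r.T})

print(n_patterns(raster), n_patterns(binned))
```

The binned raster is much denser (the per-bin firing probability is roughly 1 − (1 − p)^b), so patterns that never occur at the native resolution become common after binning, which is exactly the distortion the slide warns about.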
96. Making sense of parameters
One model is fitted per stimulus: Stimulus 1 → μ_{ψ_1}, Stimulus 2 → μ_{ψ_2}, Stimulus 3 → μ_{ψ_3}, Stimulus 4 → μ_{ψ_4}.