Talk on highly nonlinear association analysis in gene expression data using deep neural networks, given at the 7th Network Modeling Workshop, Heidelberg, Germany, 2014.
Patrick Michl
Network Modeling
Autoencoders

Autoencoders are artificial neural networks …
• Artificial neural network
[Figure: a network mapping input data X to output data X′, built from perceptrons and Gaussian units]
[Figure: a perceptron outputs binary values {0, 1}; a Gaussian unit outputs real values in ℝ]
… with multiple hidden layers.
• Multiple hidden layers
[Figure: visible layers (input and output) with hidden layers in between]
Such networks are called deep networks.
Definition (deep network)
Deep networks are artificial neural networks with multiple hidden layers.
Autoencoders have a symmetric topology …
… with an odd number of hidden layers.
• Deep network
• Symmetric topology
The small layer in the center works like an information bottleneck …
… that creates a low-dimensional code for each sample in the input data.
• Information bottleneck
The upper stack does the encoding …
… and the lower stack does the decoding.
• Encoder
• Decoder
Definition (autoencoder)
Autoencoders are deep networks with a symmetric topology and an odd number of hidden layers, containing an encoder, a low-dimensional representation, and a decoder.
Autoencoders can be used to reduce the dimension of data …

Problem: dimensionality of the data
Idea:
1. Train the autoencoder to minimize the distance between input X and output X′
2. Encode X to a low-dimensional code Y
3. Decode the low-dimensional code Y to the output X′
4. Since X′ is reconstructed from Y alone, the code Y is a low-dimensional representation of the data
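To make the four steps concrete, here is a minimal sketch in Python (my own illustration, not code from the talk), assuming a toy dataset and a small PyTorch model trained with mean-squared reconstruction error:

```python
import torch
import torch.nn as nn

# Toy data standing in for gene expression samples.
X = torch.randn(100, 20)            # 100 samples, 20 features

encoder = nn.Sequential(nn.Linear(20, 8), nn.Sigmoid(), nn.Linear(8, 2))
decoder = nn.Sequential(nn.Linear(2, 8), nn.Sigmoid(), nn.Linear(8, 20))
model = nn.Sequential(encoder, decoder)

# Step 1: train so that the output X' reproduces the input X.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for step in range(1000):
    optimizer.zero_grad()
    loss = ((model(X) - X) ** 2).mean()   # distance between X and X'
    loss.backward()
    optimizer.step()

# Steps 2-4: the bottleneck activations Y are the low-dimensional code.
with torch.no_grad():
    Y = encoder(X)                  # shape (100, 2)
```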
… if we can train them!
Training

In feedforward ANNs, backpropagation is a good approach.
Backpropagation
(1) The distance (error) between the current output X′ and the wanted output Y is computed. This gives an error function.
Example (linear neural unit with two inputs)
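The formula of this example did not survive extraction; a plausible reconstruction, assuming a linear unit with inputs $x_1, x_2$, weights $w_1, w_2$, and target output $t$:

$$y = w_1 x_1 + w_2 x_2, \qquad E(w_1, w_2) = \tfrac{1}{2}(y - t)^2, \qquad \frac{\partial E}{\partial w_i} = (y - t)\,x_i$$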
(2) By calculating the negative gradient of the error function, we get a vector that points in a direction which decreases the error.
(3) We update the parameters along this direction to decrease the error.
(4) We repeat these steps until the error converges.
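Steps (1)–(4) put together for the two-input linear unit of the example above (a toy sketch under assumed data, not the talk's code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))        # 50 samples, two inputs
t = X @ np.array([1.5, -0.5])       # targets from a known "true" unit
w = np.zeros(2)                     # initial parameters

for step in range(100):
    y = X @ w                                  # current output
    error = 0.5 * np.mean((y - t) ** 2)        # (1) error function
    grad = X.T @ (y - t) / len(X)              # (2) gradient of the error
    w -= 0.1 * grad                            # (3) step against the gradient
                                               # (4) repeat
```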
… but the problem is the multiple hidden layers!

Problem: deep network
Backpropagation is known to be slow far away from the output layer …
• Very slow training
… and can converge to poor local minima.
• Possibly poor solutions
The task is to initialize the parameters close to a good solution!

Idea: initialize close to a good solution
Therefore the training of autoencoders has a pretraining phase …
• Pretraining
… which uses Restricted Boltzmann Machines (RBMs).
• Restricted Boltzmann Machines
Restricted Boltzmann Machine
• RBMs are Markov random fields
Markov random field: every unit influences every neighbor, and the coupling is undirected.

Motivation (Ising model): a set of magnetic dipoles (spins) is arranged in a graph (lattice) where neighbors are coupled with a given strength.
• Bipartite topology: visible units (v) and hidden units (h)
• Local energies are used to calculate the probabilities of the unit values
Training: contrastive divergence (Gibbs sampling)
[Figure: bipartite RBM with visible units v1–v4 and hidden units h1–h3]
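As an illustration of contrastive divergence, here is a sketch of a single CD-1 update for a small binary RBM (my own code, not the talk's; the four visible and three hidden units mirror the figure):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_v, n_h = 4, 3                              # matches the figure: v1-v4, h1-h3
W = rng.normal(scale=0.1, size=(n_v, n_h))   # couplings w_vh
b_v, b_h = np.zeros(n_v), np.zeros(n_h)      # unit biases

v0 = np.array([1.0, 0.0, 1.0, 1.0])          # one (toy) data vector

# Gibbs sampling chain v0 -> h0 -> v1 -> h1
p_h0 = sigmoid(v0 @ W + b_h)
h0 = (rng.random(n_h) < p_h0).astype(float)
v1 = (rng.random(n_v) < sigmoid(W @ h0 + b_v)).astype(float)
p_h1 = sigmoid(v1 @ W + b_h)

# CD-1 update: pull model statistics toward data statistics
lr = 0.1
W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
b_v += lr * (v0 - v1)
b_h += lr * (p_h0 - p_h1)
```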
Top RBM
The visible units are therefore modeled with Gaussians to encode the data …
[Figure: top RBM with visible units v1–v4 and hidden units h1–h5]
… and many hidden units with sigmoids to encode dependencies.
The objective function is the sum of the local energies. Local energy of a Gaussian visible unit:

$$E_v := -\sum_h w_{vh}\,\frac{x_v}{\sigma_v}\,x_h + \frac{(x_v - b_v)^2}{2\sigma_v^2}$$
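For orientation (this is the standard Gaussian–Bernoulli RBM reading of the energy above, not spelled out on the slide), the resulting sampling distributions are:

$$p(x_h = 1 \mid v) = \mathrm{sigm}\Big(b_h + \sum_v w_{vh}\,\frac{x_v}{\sigma_v}\Big), \qquad p(x_v \mid h) \sim \mathcal{N}\Big(b_v + \sigma_v \sum_h w_{vh}\,x_h,\; \sigma_v^2\Big)$$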
Reduction RBMs
The reduction RBMs use binary units … which can be trained faster than the top layer. Local energies:

$$E_v := -\sum_h w_{vh}\,x_v x_h - b_v x_v, \qquad E_h := -\sum_v w_{vh}\,x_v x_h - b_h x_h$$

[Figure: reduction RBM with visible units v1–v4 and hidden units h1–h3]
After pretraining, backpropagation usually finds good solutions.
• Pretraining: top RBM (GRBM), reduction RBMs, unrolling
• Finetuning: backpropagation
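A compressed, runnable sketch of this schedule (my own illustration; biases are omitted for brevity and the layer sizes are arbitrary, not the talk's):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=30, lr=0.1, rng=np.random.default_rng(0)):
    # Tiny binary RBM trained with the CD-1 update sketched earlier.
    W = rng.normal(scale=0.1, size=(data.shape[1], n_hidden))
    for _ in range(epochs):
        for v0 in data:
            p_h0 = sigmoid(v0 @ W)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            v1 = (rng.random(data.shape[1]) < sigmoid(W @ h0)).astype(float)
            W += lr * (np.outer(v0, p_h0) - np.outer(v1, sigmoid(v1 @ W)))
    return W

# Greedy layer-wise pretraining, then "unrolling" into encoder and decoder.
X = (np.random.default_rng(1).random((20, 8)) > 0.5).astype(float)
weights, data = [], X
for n_hidden in [6, 4, 2]:          # arbitrary toy layer sizes
    W = train_rbm(data, n_hidden)
    weights.append(W)
    data = sigmoid(data @ W)        # pass hidden activations to the next RBM
encoder = weights                   # initializes the encoder stack
decoder = [W.T for W in reversed(weights)]   # mirrored weights for the decoder
# Backpropagation then finetunes the unrolled autoencoder end to end.
```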
The algorithmic complexity of RBM training depends on the network size.
• Time complexity: O(i·n·w), where i is the number of iterations, n the number of nodes, and w the number of weights
• Memory complexity: O(w)
Results
Validation of the results
• Needs information about the true regulation
• Needs information about the descriptive power of the data
Without this information, validation can only be done using artificial datasets!
Artificial datasets
We simulate data in three steps:
Step 1: Choose the number of genes (E+S) and create random, bimodally distributed data.
Step 2: Manipulate the data in a fixed order.
Step 3: Add noise to the manipulated data and normalize the data.
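A hedged sketch of the three steps (the slides do not specify the manipulation, so step 2's nonlinear coupling of effect genes E to source genes S is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_s, n_e = 200, 4, 8     # S source genes, E effect genes (assumed split)

# Step 1: random bimodally distributed data (two Gaussian modes per gene).
modes = rng.integers(0, 2, size=(n_samples, n_s))
S = rng.normal(loc=2.0 * modes - 1.0, scale=0.3)

# Step 2 (assumption): manipulate in a fixed order by coupling the effect
# genes E nonlinearly to the source genes S.
E = np.tanh(S @ rng.normal(size=(n_s, n_e)))

# Step 3: add noise and normalize.
data = np.hstack([S, E]) + rng.normal(scale=0.1, size=(n_samples, n_s + n_e))
data = (data - data.mean(axis=0)) / data.std(axis=0)
```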
We train an autoencoder with 9 hidden layers and 165 hidden nodes:
Layers 1 & 9: 32 hidden units
Layers 2 & 8: 24 hidden units
Layers 3 & 7: 16 hidden units
Layers 4 & 6: 8 hidden units
Layer 5: 5 hidden units
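This topology as a sketch (only the 32-24-16-8-5 mirror comes from the slide; the input dimension n_genes and the sigmoid activations are my assumptions):

```python
import torch.nn as nn

def build_autoencoder(n_genes):
    # 32-24-16-8-5-8-16-24-32 hidden topology from the slide; input size
    # n_genes and sigmoid activations are assumptions.
    sizes = [n_genes, 32, 24, 16, 8, 5, 8, 16, 24, 32, n_genes]
    layers = []
    for a, b in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(a, b), nn.Sigmoid()]
    return nn.Sequential(*layers[:-1])   # linear output for Gaussian-like data
```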
Conclusion

• Autoencoders can improve modeling significantly by reducing the dimensionality of the data.
• Autoencoders preserve complex structures in their multilayer perceptron network. Analysing those networks (for example with knockout tests) could give more structural information.
• The drawback is the high computational cost.
Since the field of deep learning is growing in popularity (face recognition, voice recognition, image transformation), many improvements addressing these computational costs have been made.