Structure learning with deep neural networks
6th Network Modeling Workshop, 6/6/2013
Patrick Michl
Agenda
Autoencoders
Biological Model
Validation & Implementation
Autoencoders
Real-world data is usually high-dimensional …
[Figure: dataset panel with axes x1 and x2, next to an empty model panel]
… which makes structural analysis and modeling complicated!
[Figure: dataset in (x1, x2) next to the open modeling question F(x1, x2) = ?]
Dimensionality reduction techniques like PCA …
… cannot preserve complex structures!
[Figure: PCA projects the dataset onto the linear model x2 = α·x1 + β]
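To make the limitation concrete, here is a minimal sketch (not from the slides) of PCA on data with a nonlinear structure; all names and numbers are illustrative. The linear reconstruction flattens the curve x2 = f(x1) onto a straight line.

```python
# Minimal sketch (not from the slides): PCA via SVD recovers only a
# linear direction, so curved structure in (x1, x2) is lost.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(-2, 2, 500)
x2 = x1 ** 2 + rng.normal(0, 0.1, 500)   # nonlinear structure x2 = f(x1)
X = np.column_stack([x1, x2])

Xc = X - X.mean(axis=0)                  # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                              # first principal direction

# Project onto the first component and reconstruct: a straight line
# x2 = alpha * x1 + beta, whatever the true curvature was.
X1d = Xc @ pc1
X_rec = np.outer(X1d, pc1) + X.mean(axis=0)
print("reconstruction error:", np.mean((X - X_rec) ** 2))
```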
Therefore the analysis of unknown structures …
… needs more sophisticated nonlinear techniques!
[Figure: a nonlinear model x2 = f(x1) fitted to the dataset]
Autoencoders are artificial neural networks …
Autoencoder
• Artificial neural network
[Figure: network mapping input data X to output data X′, built from perceptrons and Gaussian units]
[Figure: activation functions, a perceptron saturating between 0 and 1 and a Gaussian unit with values over ℝ]
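As a hedged illustration of the two unit types named above (my reading of the figure): a perceptron-style sigmoidal unit whose output saturates between 0 and 1, and a Gaussian unit producing real values. All names are mine.

```python
# Minimal sketch (assumption: the figure contrasts a sigmoidal
# perceptron, output in (0, 1), with a real-valued Gaussian unit).
import numpy as np

def perceptron(x, w, b):
    """Sigmoidal unit: output lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def gaussian_unit(x, w, b, sigma, rng):
    """Gaussian unit: real-valued output with mean w.x + b and std sigma."""
    return rng.normal(w @ x + b, sigma)

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])
w = np.array([1.0, 2.0])
print(perceptron(x, w, 0.0))               # in (0, 1)
print(gaussian_unit(x, w, 0.0, 0.5, rng))  # in R
```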
… with multiple hidden layers.
Autoencoder
• Artificial neural network
• Multiple hidden layers
[Figure: visible layers (input data X, output data X′) and hidden layers in between]
Such networks are called deep networks.
Definition (deep network)
Deep networks are artificial neural networks with multiple hidden layers.
Autoencoders have a symmetric topology with an odd number of hidden layers. The small layer in the center works like an information bottleneck that creates a low-dimensional code for each sample in the input data. The upper stack does the encoding and the lower stack does the decoding.

Autoencoder
• Deep network
• Symmetric topology
• Information bottleneck
• Encoder
• Decoder

[Figure: autoencoder with encoder stack, bottleneck, and decoder stack between input data X and output data X′]

Definition (autoencoder)
Autoencoders are deep networks with a symmetric topology and an odd number of hidden layers, containing an encoder, a low-dimensional representation, and a decoder.
Autoencoders can be used to reduce the dimension of data …
Problem: dimensionality of data
Idea:
1. Train the autoencoder to minimize the distance between input X and output X′
2. Encode X to a low-dimensional code Y
3. Decode the low-dimensional code Y to the output X′
4. The code Y is a low-dimensional representation of the data
… if we can train them!
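A minimal sketch of the four-step idea above, assuming a tiny linear autoencoder with a single bottleneck layer; the shapes, data, and learning rate are illustrative, not the author's setup.

```python
# Minimal sketch of the idea above (not the author's code): a tiny linear
# autoencoder with one bottleneck layer, trained to minimize ||X - X'||^2.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))            # input data X (200 samples, 8 dims)

W_enc = rng.normal(0, 0.1, size=(8, 2))  # encoder: 8 -> 2
W_dec = rng.normal(0, 0.1, size=(2, 8))  # decoder: 2 -> 8

lr = 0.01
for step in range(2000):
    Y = X @ W_enc                        # 2. encode X to low-dimensional code Y
    X_out = Y @ W_dec                    # 3. decode Y to the output X'
    err = X_out - X                      # 1. distance between X and X'
    # gradient descent on the reconstruction error
    W_dec -= lr * (Y.T @ err) / len(X)
    W_enc -= lr * (X.T @ (err @ W_dec.T)) / len(X)

print("final error:", np.mean(err ** 2))  # 4. Y is the low-dimensional code
```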
Training: Backpropagation
In feedforward ANNs, backpropagation is a good approach.
Backpropagation
(1) The distance (error) between the current output X′ and the wanted output Y is computed. This gives an error function:
X′ = F(X),  error = ‖X′ − Y‖²
Example (linear neural unit with two inputs) — see the sketch after step (4) below.
(2) Calculating −∇error gives a vector that points in a direction which decreases the error.
(3) We update the parameters along this direction to decrease the error.
(4) We repeat this until the error converges.
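A hedged reconstruction of the announced example (linear neural unit with two inputs), applying steps (1)–(4) with plain gradient descent; the target weights, data, and learning rate are made up for illustration.

```python
# Hedged reconstruction of the example: a linear unit x' = w1*x1 + w2*x2
# trained by the four backpropagation steps above.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # two inputs per sample
w_true = np.array([2.0, -1.0])
Y = X @ w_true                           # wanted output

w = np.zeros(2)
lr = 0.1
for step in range(100):
    X_out = X @ w                        # current output X'
    error = np.mean((X_out - Y) ** 2)    # (1) error function
    grad = 2 * X.T @ (X_out - Y) / len(X)
    w -= lr * grad                       # (2)+(3) step along -grad(error)
print(w, error)                          # (4) repeated until convergence
```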
… the problem is the multiple hidden layers!
Problem: Deep Network
Backpropagation is known to be slow far away from the output layer …
• Very slow training
… and can converge to poor local minima.
• Possibly poor solutions
The task is to initialize the parameters close to a good solution!
Idea: Initialize close to a good solution
Therefore the training of autoencoders has a pretraining phase …
• Pretraining
… which uses Restricted Boltzmann Machines (RBMs).
• Restricted Boltzmann Machines
Restricted Boltzmann Machine
• RBMs are Markov random fields
Markov Random Field
Every unit influences its neighbors; the coupling is undirected.
Motivation (Ising model): a set of magnetic dipoles (spins) is arranged in a graph (lattice) where neighbors are coupled with a given strength.
Restricted Boltzmann Machine
• RBMs are Markov random fields
• Bipartite topology: visible units (v) and hidden units (h)
• Local energies are used to calculate the probabilities of the unit values
Training: contrastive divergence (Gibbs sampling)
[Figure: bipartite graph with visible units v1–v4 and hidden units h1–h3]
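A minimal sketch of contrastive divergence (CD-1) with Gibbs sampling for a binary RBM of the size shown in the figure; the learning rate and data are illustrative, not from the slides.

```python
# Minimal CD-1 sketch (assumptions: binary units, illustrative learning
# rate and data; 4 visible and 3 hidden units as in the figure).
import numpy as np

rng = np.random.default_rng(0)
def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

nv, nh = 4, 3
W = rng.normal(0, 0.1, size=(nv, nh))
bv, bh = np.zeros(nv), np.zeros(nh)

def cd1_step(v0, lr=0.05):
    """One CD-1 update: sample h from v0, resample v, resample h."""
    global W, bv, bh
    ph0 = sigm(bh + v0 @ W)                       # P(h = 1 | v0)
    h0 = (rng.random(nh) < ph0).astype(float)     # Gibbs step: hidden
    pv1 = sigm(bv + W @ h0)                       # P(v = 1 | h0)
    v1 = (rng.random(nv) < pv1).astype(float)     # Gibbs step: visible
    ph1 = sigm(bh + v1 @ W)
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))  # positive - negative phase
    bv += lr * (v0 - v1)
    bh += lr * (ph0 - ph1)

data = (rng.random((50, nv)) < 0.5).astype(float)     # illustrative binary data
for epoch in range(10):
    for v in data:
        cd1_step(v)
```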
Gibbs Sampling
[Figure: Gibbs sampling alternates between sampling the hidden layer given the visible layer and vice versa]
Training: Top RBM
The top-layer RBM transforms real-valued data into binary codes.
V ≔ set of visible units; x_v ≔ value of unit v, with x_v ∈ ℝ for all v ∈ V
H ≔ set of hidden units; x_h ≔ value of unit h, with x_h ∈ {0, 1} for all h ∈ H
Therefore the visible units are modeled with Gaussians to encode the data …
x_v ~ N(b_v + Σ_h w_vh · x_h, σ_v)
σ_v ≔ std. dev. of unit v, b_v ≔ bias of unit v, w_vh ≔ weight of edge (v, h)
[Figure: RBM with visible units v1–v4 and hidden units h1–h5]
… and many hidden units with sigmoids to encode dependencies.
x_h ~ sigm(b_h + Σ_v w_vh · x_v / σ_v)
b_h ≔ bias of unit h
The objective function is the sum of the local energies.
Local Energy
E_h ≔ −Σ_v w_vh · (x_v / σ_v) · x_h + x_h · b_h
E_v ≔ −Σ_h w_vh · (x_v / σ_v) · x_h + (x_v − b_v)² / (2σ_v²)
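A minimal sketch of the Gaussian–binary formulas above; function and variable names are mine, and the sizes follow the figure (4 visible, 5 hidden units).

```python
# Minimal sketch of the formulas above (names are mine).
import numpy as np

rng = np.random.default_rng(0)
def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

nv, nh = 4, 5
W = rng.normal(0, 0.1, size=(nv, nh))    # w_vh
bv, bh = np.zeros(nv), np.zeros(nh)      # biases b_v, b_h
sigma = np.ones(nv)                      # std. dev. sigma_v per visible unit

def sample_visible(h):
    # x_v ~ N(b_v + sum_h w_vh * x_h, sigma_v)
    return rng.normal(bv + W @ h, sigma)

def sample_hidden(v):
    # x_h ~ sigm(b_h + sum_v w_vh * x_v / sigma_v), then a binary sample
    p = sigm(bh + (v / sigma) @ W)
    return (rng.random(nh) < p).astype(float)

def local_energies(v, h):
    # E_v and E_h as on the slide
    E_h = -(W.T @ (v / sigma)) * h + h * bh
    E_v = -(W @ h) * (v / sigma) + (v - bv) ** 2 / (2 * sigma ** 2)
    return E_v, E_h

v = rng.normal(size=nv)
E_v, E_h = local_energies(v, sample_hidden(v))
print(E_v.sum() + E_h.sum())             # objective: sum of local energies
```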
Training: Reduction
The next RBM layer maps the dependency encoding …
Here both layers are binary: x_v ∈ {0, 1} for all v ∈ V and x_h ∈ {0, 1} for all h ∈ H.
… from the upper layer …
x_v ~ sigm(b_v + Σ_h w_vh · x_h)
… to a smaller number of sigmoids …
x_h ~ sigm(b_h + Σ_v w_vh · x_v)
… which can be trained faster than the top layer.
Local Energy
E_v ≔ −Σ_h w_vh · x_v · x_h + x_v · b_v
E_h ≔ −Σ_v w_vh · x_v · x_h + x_h · b_h
Training: Unrolling
The symmetric topology allows us to skip further training.
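A minimal sketch of unrolling, under the common assumption (not spelled out on the slide) that the decoder reuses the transposed pretrained encoder weights, so the mirrored half needs no separate training before finetuning.

```python
# Minimal sketch of unrolling (assumption: decoder weights are the
# transposes of the pretrained encoder weights).
import numpy as np

rng = np.random.default_rng(0)
def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

sizes = [8, 32, 24, 16, 8, 5]            # visible -> ... -> bottleneck
# stand-ins for the weight matrices obtained by RBM pretraining
enc_W = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes, sizes[1:])]
dec_W = [W.T.copy() for W in reversed(enc_W)]   # unrolled decoder

def autoencode(x):
    for W in enc_W:                      # encoder stack
        x = sigm(x @ W)
    for W in dec_W:                      # mirrored decoder stack
        x = sigm(x @ W)
    return x

print(autoencode(rng.normal(size=8)).shape)     # back to the input dimension
```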
After pretraining, backpropagation usually finds good solutions.
Training
• Pretraining: top RBM (GRBM), reduction RBMs, unrolling
• Finetuning: backpropagation
The algorithmic complexity of RBM training depends on the network size.
• Time complexity: O(i·n·w), where i is the number of iterations, n the number of nodes, and w the number of weights
• Memory complexity: O(w)
Agenda
Autoencoders
Biological Model
Validation & Implementation
Network Modeling
Restricted Boltzmann Machines (RBM)
How to model the topological structure?
[Figure: S and E nodes connected through TF nodes]
We identify S and E with the visible layer …
Page 566/6/2013
Patrick Michl
Network Modeling
… and the TFs with the hidden layer in an RBM.
[Figure: RBM with S and E genes as visible units and TFs as hidden units]
Page 576/6/2013
Patrick Michl
Network Modeling
Training the RBM gives us a model.
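A minimal sketch of the mapping, with illustrative gene and TF names (not from the slides): S and E genes become the visible units, TFs become the hidden units, and the fitted weights are read as the regulation model.

```python
# Minimal sketch of the mapping (gene and TF names are illustrative):
# S and E genes form the visible layer, TFs form the hidden layer.
import numpy as np

rng = np.random.default_rng(0)
visible = ["s1", "s2", "s3", "s4", "e1", "e2", "e3", "e4"]   # S and E genes
hidden = ["tf1", "tf2", "tf3"]                               # transcription factors

# one undirected coupling per (gene, TF) pair; RBM training fits these
# weights, and the fitted weights are read as the regulation model
W = {(v, h): rng.normal(0, 0.1) for v in visible for h in hidden}
print(len(W), "gene-TF couplings")
```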
Page 586/6/2013
Patrick Michl
Network Modeling
Agenda
Autoencoders
Biological Model
Implementation & Results
Page 596/6/2013
Patrick Michl
Network Modeling
Results
Validation of the results
• Needs information about the true regulation
• Needs information about the descriptive power of the data
Page 606/6/2013
Patrick Michl
Network Modeling
Without this information, validation can only be done using artificial datasets!
Page 616/6/2013
Patrick Michl
Network Modeling
Artificial datasets
We simulate data in three steps:
Page 626/6/2013
Patrick Michl
Network Modeling
Step 1: Choose the number of genes (E+S) and create random bimodally distributed data.
Page 636/6/2013
Patrick Michl
Network Modeling
Step 2: Manipulate the data in a fixed order.
Page 646/6/2013
Patrick Michl
Network Modeling
Step 3: Add noise to the manipulated data and normalize it.
Page 656/6/2013
Patrick Michl
Network Modeling
Simulation
Step 1: 8 visible nodes (4 E, 4 S). Create random data: random {−1, +1} + N(0, σ = 0.5).
Page 666/6/2013
Patrick Michl
Network Modeling
Step 2: Manipulate the data:
e1 = 0.25·s1 + 0.25·s2 + 0.25·s3 + 0.25·s4
e2 = 0.5·s1 + 0.5·Noise
e3 = 0.5·s1 + 0.5·Noise
e4 = 0.5·s1 + 0.5·Noise
Page 676/6/2013
Patrick Michl
Network Modeling
Step 3: Add noise: N(0, σ = 0.5).
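A hedged reconstruction of the three simulation steps with the slide's numbers; exactly how the independent noise terms enter e2–e4 is my reading of the garbled equations.

```python
# Hedged reconstruction of the three simulation steps above.
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Step 1: 8 visible nodes (4 S, 4 E), bimodal random data
s = rng.choice([-1.0, 1.0], size=(n, 4)) + rng.normal(0, 0.5, (n, 4))

# Step 2: manipulate the data in a fixed order
e1 = 0.25 * s.sum(axis=1)
e2 = 0.5 * s[:, 0] + 0.5 * rng.normal(0, 1, n)
e3 = 0.5 * s[:, 0] + 0.5 * rng.normal(0, 1, n)
e4 = 0.5 * s[:, 0] + 0.5 * rng.normal(0, 1, n)
X = np.column_stack([s, e1, e2, e3, e4])

# Step 3: add noise and normalize
X += rng.normal(0, 0.5, X.shape)
X = (X - X.mean(axis=0)) / X.std(axis=0)
```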
Page 686/6/2013
Patrick Michl
Network Modeling
We analyse the data X with an RBM.
Page 696/6/2013
Patrick Michl
Network Modeling
We train an autoencoder with 9 hidden layers and 165 hidden nodes:
Layers 1 & 9: 32 hidden units
Layers 2 & 8: 24 hidden units
Layers 3 & 7: 16 hidden units
Layers 4 & 6: 8 hidden units
Layer 5: 5 hidden units
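A quick sketch of the topology above, assuming a visible size of 8 as in the simulated dataset; it also counts the weights w that enter the O(i·n·w) complexity noted earlier.

```python
# Sketch: the autoencoder topology above, with a quick weight count
# (the visible size 8 is an assumption matching the simulated data).
sizes = [8, 32, 24, 16, 8, 5, 8, 16, 24, 32, 8]
print(sum(sizes[1:-1]))                       # 165 hidden nodes
w = sum(a * b for a, b in zip(sizes, sizes[1:]))
print(w, "weights")
```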
Page 706/6/2013
Patrick Michl
Network Modeling
We transform the data from X to X′ and reduce the dimensionality.
Page 716/6/2013
Patrick Michl
Network Modeling
We analyse the transformed data X′ with an RBM.
Page 726/6/2013
Patrick Michl
Network Modeling
Let's compare the models.
Page 736/6/2013
Patrick Michl
Network Modeling
Another example with more nodes and a larger autoencoder.
Page 746/6/2013
Patrick Michl
Network Modeling
Conclusion
• Autoencoders can improve modeling significantly by reducing the dimensionality of data.
• Autoencoders preserve complex structures in their multilayer perceptron network. Analysing those networks (for example with knockout tests) could give more structural information.
• The drawback is high computational cost.
Since the field of deep learning is becoming more popular (face recognition, voice recognition, image transformation), many improvements addressing the computational costs have been made.
Page 756/6/2013
Patrick Michl
Network Modeling
Acknowledgement
eilsLABS
Prof. Dr. Rainer König
Prof. Dr. Roland Eils
Network Modeling Group