Structure learning
with deep neuronal networks
6th Network Modeling Workshop, 6/6/2013
Patrick Michl
1. Structure learning with deep neuronal networks. 6th Network Modeling Workshop, 6/6/2013. Patrick Michl
2. Agenda: Autoencoders, Biological Model, Validation & Implementation
3. Real-world data usually is high dimensional … [figure: a dataset and a model in the (x1, x2) plane]
4. … which makes structural analysis and modeling complicated! [figure: dataset and model; F(x1, x2) = ?]
5. Dimensionality reduction techniques like PCA …
6. … cannot preserve complex structures! [figure: PCA fits the linear model x2 = α·x1 + β]
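To make the limitation concrete, here is a minimal sketch (not from the slides; the parabolic toy data, seed, and names are illustrative assumptions): projecting a curved relation onto the first principal component collapses it to a straight line x2 = α·x1 + β, and the structure is lost.

```python
# Minimal sketch: PCA on data with a nonlinear structure x2 = f(x1).
# The first principal component is a straight line, so the curved
# relation is flattened into x2 = alpha*x1 + beta and lost.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, 500)
x2 = x1 ** 2 + rng.normal(0.0, 0.05, 500)      # curved structure plus noise
X = np.column_stack([x1, x2])

Xc = X - X.mean(axis=0)                        # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                                    # first principal axis

code = Xc @ pc1                                # 1-D linear code
X_rec = np.outer(code, pc1) + X.mean(axis=0)   # back-projection lies on a line

print("mean squared reconstruction error:", np.mean((X - X_rec) ** 2))
```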
7. Therefore the analysis of unknown structures … [figure: dataset in the (x1, x2) plane]
8. … needs more considerate nonlinear techniques! [figure: nonlinear model x2 = f(x1)]
9. Autoencoders are artificial neuronal networks … [figure: network mapping input data X to output data X', built from perceptrons and Gaussian units]
10. Autoencoders are artificial neuronal networks … (a perceptron outputs values in {0, 1}, a Gaussian unit values in ℝ)
11. Autoencoders are artificial neuronal networks …
12. … with multiple hidden layers. (The input and output layers are the visible layers.)
13. Such networks are called deep networks.
14. Such networks are called deep networks. Definition (deep network): deep networks are artificial neuronal networks with multiple hidden layers.
15. Such networks are called deep networks.
16. Autoencoders have a symmetric topology …
17. … with an odd number of hidden layers.
18. The small layer in the center works like an information bottleneck …
19. … that creates a low-dimensional code for each sample in the input data.
20. The upper stack does the encoding …
21. … and the lower stack does the decoding.
22. Summary: deep network, symmetric topology, information bottleneck, encoder, decoder. Definition (autoencoder): autoencoders are deep networks with a symmetric topology and an odd number of hidden layers, containing an encoder, a low-dimensional representation and a decoder.
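A small sketch of the topology just defined (the layer sizes and initialization are illustrative assumptions, not taken from the slides): mirroring the encoder's hidden layers yields a symmetric network with an odd number of hidden layers and the bottleneck in the center.

```python
# Minimal sketch: build a symmetric autoencoder topology by mirroring
# the encoder's hidden layer sizes around the bottleneck.
import numpy as np

def autoencoder_shape(visible, encoder_hidden):
    """Mirror the encoder's hidden layers to get the symmetric topology."""
    return [visible] + encoder_hidden + encoder_hidden[-2::-1] + [visible]

layers = autoencoder_shape(visible=100, encoder_hidden=[32, 16, 5])
print(layers)   # [100, 32, 16, 5, 16, 32, 100]: 5 hidden layers (odd), bottleneck 5

# one weight matrix per pair of adjacent layers, small random initialization
rng = np.random.default_rng(0)
weights = [rng.normal(0.0, 0.01, (a, b)) for a, b in zip(layers, layers[1:])]
```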
23. Autoencoders can be used to reduce the dimension of data … Problem: dimensionality of the data. Idea: (1) train the autoencoder to minimize the distance between input X and output X'; (2) encode X to the low-dimensional code Y; (3) decode the code Y to the output X'; (4) the output X' is low dimensional.
24. … if we can train them!
25. In feedforward ANNs, backpropagation is a good approach for training.
26. Backpropagation: (1) the distance (error) between the current output X' and the wanted output Y is computed. This gives an error function: X' = F(X), error = (X' − Y)².
27. Example (linear neuronal unit with two inputs).
28. (2) By calculating −∇error we get a vector that points in a direction which decreases the error. (3) We update the parameters to decrease the error.
29. (4) We repeat this.
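As a sketch of steps (1) to (4), here is plain gradient descent for the slide's example of a linear neuronal unit with two inputs, X' = w1·x1 + w2·x2 + b with error = (X' − Y)². The toy data, learning rate, and iteration count are assumptions.

```python
# Minimal sketch: backpropagation reduced to its core, gradient descent
# on the squared error of a linear unit with two inputs.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5     # the mapping the unit should learn

w, b, lr = np.zeros(2), 0.0, 0.1
for step in range(100):
    Xp = X @ w + b                           # (1) current output X'
    err = np.mean((Xp - Y) ** 2)             #     error function (X' - Y)^2
    grad_w = 2.0 * (Xp - Y) @ X / len(X)     # (2) gradient of the error
    grad_b = 2.0 * np.mean(Xp - Y)
    w -= lr * grad_w                         # (3) update against the gradient
    b -= lr * grad_b                         # (4) repeat
print(err, w, b)                             # err -> 0, w -> (2, -1), b -> 0.5
```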
30. … the problem is the multiple hidden layers! (Problem: deep network)
31. Backpropagation is known to be slow far away from the output layer … (very slow training)
32. … and can converge to poor local minima. (possibly a bad solution)
33. The task is to initialize the parameters close to a good solution! (Idea: initialize close to a good solution)
34. Therefore the training of autoencoders has a pretraining phase …
35. … which uses Restricted Boltzmann Machines (RBMs).
36. RBMs are Markov random fields.
37. Markov random field: every unit influences every neighbor, and the coupling is undirected. Motivation (Ising model): a set of magnetic dipoles (spins) is arranged in a graph (lattice) where neighbors are coupled with a given strength.
38. RBMs have a bipartite topology with visible units (v) and hidden units (h), and use local energies to calculate the probabilities of the values. Training: contrastive divergence (Gibbs sampling). [figure: v1–v4 visible, h1–h3 hidden]
39. Gibbs sampling.
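A minimal sketch of one training step (contrastive divergence with a single Gibbs step, CD-1) for a binary-binary RBM with the bipartite 4-visible/3-hidden layout of the figure. Learning rate, batch, and initialization are assumptions.

```python
# Minimal sketch: CD-1 for a binary RBM. One Gibbs step v0 -> h0 -> v1 -> h1,
# then update weights with the difference of positive and negative statistics.
import numpy as np

rng = np.random.default_rng(2)
sigm = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, bv, bh, lr=0.05):
    ph0 = sigm(v0 @ W + bh)                         # P(h = 1 | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigm(h0 @ W.T + bv)                       # Gibbs step: reconstruct v
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigm(v1 @ W + bh)                         # re-infer h
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)   # positive - negative phase
    bv += lr * (v0 - v1).mean(axis=0)
    bh += lr * (ph0 - ph1).mean(axis=0)
    return W, bv, bh

nv, nh = 4, 3                                       # as in the figure
W = rng.normal(0.0, 0.01, (nv, nh))
bv, bh = np.zeros(nv), np.zeros(nh)
v0 = (rng.random((16, nv)) < 0.5).astype(float)     # a batch of binary data
W, bv, bh = cd1_step(v0, W, bv, bh)
```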
40. Top layer: the top-layer RBM transforms real-valued data into binary codes. V := set of visible units, x_v := value of unit v with x_v ∈ ℝ for all v ∈ V; H := set of hidden units, x_h := value of unit h with x_h ∈ {0, 1} for all h ∈ H.
41. Therefore the visible units are modeled with Gaussians to encode the data: x_v ~ N(b_v + Σ_h w_vh·x_h, σ_v), with σ_v := std. dev. of unit v, b_v := bias of unit v, w_vh := weight of edge (v, h). [figure: v1–v4 visible, h1–h5 hidden]
42. … and many hidden units with sigmoids to encode the dependencies: x_h ~ sigm(b_h + Σ_v w_vh·x_v/σ_v), with b_h := bias of unit h.
43. The objective function is the sum of the local energies: E_h := −Σ_v w_vh·(x_v/σ_v)·x_h + x_h·b_h and E_v := −Σ_h w_vh·(x_v/σ_v)·x_h + (x_v − b_v)²/(2σ_v²).
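The two conditionals of this Gaussian-Bernoulli top layer, written out as a sketch (the sizes follow the figure's 4 visible and 5 hidden units; the parameter values are placeholder assumptions):

```python
# Minimal sketch: sampling in a Gaussian-Bernoulli RBM.
# Hidden:  x_h ~ sigm(b_h + sum_v w_vh * x_v / sigma_v)
# Visible: x_v ~ N(b_v + sum_h w_vh * x_h, sigma_v)
import numpy as np

rng = np.random.default_rng(3)
sigm = lambda x: 1.0 / (1.0 + np.exp(-x))

nv, nh = 4, 5                          # as in the figure: 4 visible, 5 hidden
W = rng.normal(0.0, 0.01, (nv, nh))    # weights w_vh
bv, bh = np.zeros(nv), np.zeros(nh)    # biases b_v, b_h
sigma = np.ones(nv)                    # std. dev. sigma_v per visible unit

def sample_h(xv):
    p = sigm(bh + (xv / sigma) @ W)    # binary code from real-valued input
    return (rng.random(nh) < p).astype(float)

def sample_v(xh):
    mean = bv + W @ xh                 # Gaussian reconstruction of the input
    return rng.normal(mean, sigma)

xh = sample_h(rng.normal(size=nv))
xv = sample_v(xh)
```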
44. Reduction layers: the next RBM layer maps the dependency encoding … Here both layers are binary: x_v ∈ {0, 1} for all v ∈ V and x_h ∈ {0, 1} for all h ∈ H.
45. … from the upper layer … x_v ~ sigm(b_v + Σ_h w_vh·x_h), with b_v := bias of unit v, w_vh := weight of edge (v, h). [figure: v1–v4 visible, h1–h3 hidden]
46. … to a smaller number of sigmoids … x_h ~ sigm(b_h + Σ_v w_vh·x_v), with b_h := bias of unit h.
47. … which can be trained faster than the top layer. Local energies: E_v := −Σ_h w_vh·x_v·x_h + x_v·b_v and E_h := −Σ_v w_vh·x_v·x_h + x_h·b_h.
48. Unrolling: the symmetric topology allows us to skip further training.
49. Unrolling: the symmetric topology allows us to skip further training.
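In code, unrolling amounts to reusing the pretrained stack: the decoder weights are the transposed encoder weights, so the decoder needs no training of its own. A sketch (the pretrained matrices and the decoder biases are assumed given):

```python
# Minimal sketch: unroll a pretrained encoder stack into a full autoencoder.
import numpy as np

def unroll(W_enc, b_enc, b_dec):
    """Encoder weights W1..Wk are reused as decoder weights Wk.T..W1.T."""
    W_dec = [W.T.copy() for W in reversed(W_enc)]
    return W_enc + W_dec, b_enc + b_dec

# e.g. a pretrained stack 8 -> 5 (bottleneck), unrolled to 8 -> 5 -> 8
W, b = unroll([np.zeros((8, 5))], [np.zeros(5)], [np.zeros(8)])
print([m.shape for m in W])   # [(8, 5), (5, 8)]
```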
50. After pretraining, backpropagation usually finds good solutions. Training: pretraining (top RBM (GRBM), reduction RBMs, unrolling), then finetuning (backpropagation).
51. The algorithmic complexity of RBM training depends on the network size. Time complexity: O(i·n·w), with i := number of iterations, n := number of nodes, w := number of weights. Memory complexity: O(w).
52. Agenda: Autoencoders, Biological Model, Validation & Implementation
53. Restricted Boltzmann Machines (RBM): how to model the topological structure? [figure: S, E, TF]
54. We define S and E as the visible data layer …
55. We identify S and E with the visible layer …
56. … and the TFs with the hidden layer in an RBM.
57. The training of the RBM gives us a model.
58. Agenda: Autoencoder, Biological Model, Implementation & Results
59. Validation of the results: it needs information about the true regulation, and information about the descriptive power of the data.
60. Without this information, validation can only be done using artificial datasets!
61. Artificial datasets: we simulate data in three steps.
62. Step 1: choose the number of genes (E + S) and create random bimodally distributed data.
63. Step 2: manipulate the data in a fixed order.
64. Step 3: add noise to the manipulated data and normalize the data.
65. Simulation, step 1: 8 visible nodes (4 E, 4 S). Create random data: random {−1, +1} + N(0, σ = 0.5).
66. Simulation, step 2 (manipulate the data): e1 = 0.25·s1 + 0.25·s2 + 0.25·s3 + 0.25·s4; e2 = 0.5·s1 + 0.5·noise; e3 = 0.5·s1 + 0.5·noise; e4 = 0.5·s1 + 0.5·noise.
67. Simulation, step 3: add noise N(0, σ = 0.5).
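The three steps, written out as a sketch. The {−1, +1} bimodal draw, the mixing weights, and the N(0, σ = 0.5) noise follow slides 65 to 67; the sample size and the scale of the noise term inside step 2 are assumptions.

```python
# Minimal sketch of the simulation: 4 S nodes and 4 E nodes.
import numpy as np

rng = np.random.default_rng(4)
n = 1000                                            # number of samples (assumed)

# Step 1: random bimodal data, {-1, +1} + N(0, 0.5), for s1..s4
S = rng.choice([-1.0, 1.0], size=(n, 4)) + rng.normal(0.0, 0.5, (n, 4))

# Step 2: manipulate the data in a fixed order
E = np.empty((n, 4))
E[:, 0] = 0.25 * S.sum(axis=1)                      # e1 = 0.25*(s1+s2+s3+s4)
for j in (1, 2, 3):                                 # e2..e4 = 0.5*s1 + 0.5*noise
    E[:, j] = 0.5 * S[:, 0] + 0.5 * rng.normal(0.0, 1.0, n)

# Step 3: add noise and normalize
X = np.hstack([S, E]) + rng.normal(0.0, 0.5, (n, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)
```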
68. We analyse the data X with an RBM.
69. We train an autoencoder with 9 hidden layers and 165 hidden nodes: layers 1 & 9: 32 hidden units; layers 2 & 8: 24; layers 3 & 7: 16; layers 4 & 6: 8; layer 5: 5 hidden units.
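For reference, the layer specification written out; the mirrored sizes sum to the 165 hidden nodes stated on the slide.

```python
# Layers 1..9 of the autoencoder's hidden stack, mirrored around the bottleneck.
hidden = [32, 24, 16, 8, 5, 8, 16, 24, 32]
assert len(hidden) == 9 and sum(hidden) == 165
```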
70. We transform the data from X to X' and reduce the dimensionality.
71. We analyse the transformed data X' with an RBM.
72. Let's compare the models.
73. Another example with more nodes and a larger autoencoder.
74. Conclusion: autoencoders can improve modeling significantly by reducing the dimensionality of the data. Autoencoders preserve complex structures in their multilayer perceptron network; analysing those networks (for example with knockout tests) could give more structural information. The drawback is the high computational cost. Since the field of deep learning is becoming more popular (face recognition, voice recognition, image transformation), many improvements in handling the computational costs have been made.
75. Acknowledgement: eilsLABS, Prof. Dr. Rainer König, Prof. Dr. Roland Eils, Network Modeling group.
