deep learning
Algorithms and Applications
Bernardete Ribeiro, bribeiro@dei.uc.pt
University of Coimbra, Portugal
INIT/AERFAI Summer School on Machine Learning, Benicassim 22-26 June 2015
III - Deep Learning Algorithms
elements 3: deep neural networks
outline
∙ Learning in Deep Neural Networks
∙ Deep Learning: Evolution Timeline
∙ Deep Architectures
∙ Restricted Boltzmann Machines (RBMs)
∙ Deep Belief Networks (DBNs)
∙ Deep Models Overall Characteristics
learning in deep neural networks
learning in deep neural networks
1. No general learning algorithm (no free lunch theorem by
Wolpert 1996)
2. Learning algorithms for specific tasks - perception, control,
prediction, planning, reasoning, language understanding
3. Limitations of BP - local minima, optimization challenges
for non-convex objective functions
4. Hinton’s deep belief networks (DBNs) as stack of RBMs
5. LeCun’s energy-based learning for DBNs
deep learning: evolution timeline
1. Perceptron [Frank Rosenblatt, 1959]
2. Neocognitron [K Fukushima, 1980]
3. Convolutional Neural Network (CNN) [LeCun, 1989]
4. Multi-level Hierarchy Networks [Jürgen Schmidhuber, 1992]
5. Deep Belief Networks (DBNs) as stack of RBMs [Geoffrey
Hinton, 2006]
deep architectures
from brain-like computing to deep learning
∙ New empirical and theoretical results have brought deep
architectures into the focus of Machine Learning (ML)
researchers [Larochelle et al., 2007].
∙ Theoretical results suggest that deep architectures are
fundamental for learning the kind of complicated, brain-like
functions that can represent high-level abstractions (e.g.
vision, speech, language) [Bengio, 2009].
deep concepts main idea
deep neural networks
∙ Convolutional Neural Networks (CNNs) [LeCun et al., 1989]
∙ Deep Belief Networks (DBNs) [Hinton et al., 2006]
∙ AutoEncoders (AEs) [Bengio et al., NIPS 2006]
∙ Sparse Autoencoders [Ranzato et al., NIPS 2006]
convolutional neural networks (cnns)
∙ A Convolutional Neural Network consists of two basic
operations:
∙ convolution
∙ pooling
∙ Convolutional and pooling layers
are arranged alternately until
high-level features are obtained
∙ Several feature maps in each
convolutional layer
∙ Weights in the same map are
shared
[Figure: a CNN with alternating convolutional (C1, C3) and subsampling (S2, S4) layers between the input and a fully connected NN; from I Arel, D Rose & T Karnowski, Deep Machine Learning—A New Frontier in Artificial Intelligence Research, IEEE CIM, 2010]
convolutional neural networks (cnns)
∙ Convolution: suppose the size of the layer is d × d
and the size of the receptive fields is r × r, and let γ and x
denote respectively the values of the convolutional
layer and the previous layer:
$$\gamma_{ij} = g\left(\sum_{m=1}^{r}\sum_{n=1}^{r} x_{i+m-1,\,j+n-1}\, w_{m,n} + b\right), \quad i, j = 1, \cdots, (d - r + 1)$$
where g is a nonlinear function.
∙ Pooling follows convolution to reduce the
dimensionality of the features and to introduce
translational invariance into the CNN.
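To make the indexing concrete, here is a minimal NumPy sketch of the convolution step above (an illustrative addition, not from the slides), assuming a sigmoid nonlinearity for g and a single feature map:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_layer(x, w, b, g=sigmoid):
    """Valid 2D convolution: x is d x d, w is r x r, output is (d-r+1) x (d-r+1)."""
    d, r = x.shape[0], w.shape[0]
    out = np.empty((d - r + 1, d - r + 1))
    for i in range(d - r + 1):
        for j in range(d - r + 1):
            # gamma_ij = g(sum_{m,n} x_{i+m-1, j+n-1} w_{m,n} + b), weights w shared across the map
            out[i, j] = g(np.sum(x[i:i + r, j:j + r] * w) + b)
    return out

# Example: a 28 x 28 input and a 5 x 5 receptive field give a 24 x 24 feature map
x = np.random.rand(28, 28)
print(conv_layer(x, w=0.1 * np.random.randn(5, 5), b=0.0).shape)  # (24, 24)
```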
deep belief networks (dbns)
∙ Probabilistic generative models, in
contrast with the discriminative
nature of other NNs
∙ Generative models provide a joint
probability distribution of data
and labels
∙ Unsupervised greedy layer-wise
pre-training followed by final
fine-tuning
[Figure: a DBN for a 28 × 28 pixel image: a stack of RBM layers, each with visible and hidden units, topped by a detection layer whose top-level units combine the labels with the hidden units; based on I Arel, D Rose & T Karnowski, Deep Machine Learning—A New Frontier in Artificial Intelligence Research, IEEE CIM, 2010]
autoencoders (aes)
∙ The auto-encoder has two
components:
∙ the encoder f (mapping x to h) and
∙ the decoder g (mapping h to r)
∙ An auto-encoder is a neural
network that tries to reproduce
its input at its output
[Figure: an autoencoder: the encoder f maps the input x to the code h, and the decoder g maps h to the reconstruction r; based on Y Bengio, I Goodfellow and A Courville, Deep Learning, An MIT Press book (in preparation), www.iro.umontreal.ca/~bengioy/dbook]
deep architectures versus shallow architectures
∙ Deep architectures can be exponentially more efficient
than shallow architectures [Roux and Bengio, 2010].
∙ Functions that can be compactly represented with a Neural
Network (NN) of depth d, may require an exponential number
of computational elements for a network with depth d − 1
[Bengio, 2009].
∙ Since the number of computational elements depends on
the number of training samples available, using shallow
architectures may result in models that generalize
poorly [Bengio, 2009].
∙ As a result, deep architecture models tend to outperform
shallow models such as SVMs [Larochelle et al., 2007].
Restricted Boltzmann Machines
Deep Belief Networks
restricted boltzmann machines
restricted boltzmann machines (rbms)
[Figure: an RBM as a bipartite graph: hidden units h1, h2, h3, · · · , hJ plus a bias unit in one layer, visible units v1, v2, · · · , vI plus a bias unit in the other; the upward pass acts as an encoder and the downward pass as a decoder]
restricted boltzmann machines (rbms)
∙ Unsupervised
∙ Find complex regularities in
training data
∙ Bipartite Graph
∙ visible, hidden layer
∙ Binary stochastic units
∙ On/Off with probability
∙ 1 Iteration
∙ Update Hidden Units
∙ Reconstruct Visible Units
∙ Maximum Likelihood of
training data
restricted boltzmann machines (rbms)
∙ Training goal: the most probable
reproduction of the input
∙ unsupervised data
∙ find the latent factors of the
data set
∙ Adjust the weights to maximize the
probability of the input data
restricted boltzmann machines (rbms)
Given an observed state, the energy of the joint configuration
of the visible units and hidden units (v, h) is given by:
$$E(v, h) = -\sum_{i=1}^{I} c_i v_i - \sum_{j=1}^{J} b_j h_j - \sum_{j=1}^{J}\sum_{i=1}^{I} W_{ji} v_i h_j \,, \quad (1)$$
where W is the matrix of weights, and b and c are the bias
units w.r.t. hidden and visible layers, respectively.
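As a concrete reading of Eq. (1), a small NumPy sketch (illustrative, not from the slides), with W stored as a J × I matrix to match the W_ji indexing:

```python
import numpy as np

def energy(v, h, W, b, c):
    """E(v, h) = -sum_i c_i v_i - sum_j b_j h_j - sum_{j,i} W_ji v_i h_j."""
    return -np.dot(c, v) - np.dot(b, h) - h @ W @ v

# Tiny RBM: I = 4 visible units, J = 3 hidden units
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((3, 4))
b, c = np.zeros(3), np.zeros(4)
v = rng.integers(0, 2, size=4).astype(float)
h = rng.integers(0, 2, size=3).astype(float)
print(energy(v, h, W, b, c))
```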
restricted boltzmann machines (rbms)
The Restricted Boltzmann Machine (RBM) assigns a
probability for each configuration (v, h), using:
$$p(v, h) = \frac{e^{-E(v,h)}}{Z} \,, \quad (2)$$
where Z is a normalization constant called the partition function,
obtained by summing $e^{-E(v,h)}$ over all possible (v, h)
configurations [Bengio, 2009, Hinton, 2010,
Carreira-Perpiñán and Hinton, 2005]:
$$Z = \sum_{v,h} e^{-E(v,h)} \,. \quad (3)$$
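Z sums over all 2^I · 2^J joint configurations, so it is tractable only for toy models; a brute-force sketch (illustrative, not from the slides) makes that explicit:

```python
from itertools import product
import numpy as np

def partition_function(W, b, c):
    """Brute-force Z = sum_{v,h} exp(-E(v,h)); feasible only for tiny I and J."""
    J, I = W.shape
    Z = 0.0
    for v in product([0.0, 1.0], repeat=I):
        for h in product([0.0, 1.0], repeat=J):
            v_a, h_a = np.array(v), np.array(h)
            # exp(-E(v,h)) = exp(c.v + b.h + h^T W v)
            Z += np.exp(np.dot(c, v_a) + np.dot(b, h_a) + h_a @ W @ v_a)
    return Z

# Tiny RBM: I = 4, J = 3 gives 2^7 = 128 terms in the sum
rng = np.random.default_rng(0)
print(partition_function(0.1 * rng.standard_normal((3, 4)), np.zeros(3), np.zeros(4)))
```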
restricted boltzmann machines (rbms)
Since there are no connections between any two units within
the same layer, given a particular random input
configuration, v, all the hidden units are independent of each
other and the probability of h given v becomes:
$$p(h \mid v) = \prod_j p(h_j \mid v) \,, \quad (4)$$
where
$$p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{I} v_i W_{ji}\Big) \,. \quad (5)$$
restricted boltzmann machines (rbms)
Similarly, given a specific hidden state, h, the probability of v
given h is obtained by (6):
$$p(v \mid h) = \prod_i p(v_i \mid h) \,, \quad (6)$$
where:
$$p(v_i = 1 \mid h) = \sigma\Big(c_i + \sum_{j=1}^{J} h_j W_{ji}\Big) \,. \quad (7)$$
restricted boltzmann machines (rbms)
Given a random training vector v, the state of a given hidden
unit j is set to 1 with probability:
$$p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i W_{ji}\Big)$$
Similarly:
$$p(v_i = 1 \mid h) = \sigma\Big(c_i + \sum_j h_j W_{ji}\Big)$$
where $\sigma(x)$ is the sigmoid squashing function $\frac{1}{1 + e^{-x}}$.
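These two conditionals are all that is needed to run an RBM; a minimal sampling sketch (illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_hidden(v, W, b, rng):
    """Sample h given v: h_j = 1 with probability sigma(b_j + sum_i v_i W_ji)."""
    p = sigmoid(b + W @ v)
    return (rng.random(p.shape) < p).astype(float)

def sample_visible(h, W, c, rng):
    """Sample v given h: v_i = 1 with probability sigma(c_i + sum_j h_j W_ji)."""
    p = sigmoid(c + W.T @ h)
    return (rng.random(p.shape) < p).astype(float)

# Usage on a tiny RBM (I = 4, J = 3)
rng = np.random.default_rng(0)
W, b, c = 0.1 * rng.standard_normal((3, 4)), np.zeros(3), np.zeros(4)
v = rng.integers(0, 2, size=4).astype(float)
h = sample_hidden(v, W, b, rng)
print(h, sample_visible(h, W, c, rng))
```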
restricted boltzmann machines (rbms)
The marginal probability assigned to a visible vector, v, is
given by (8):
$$p(v) = \sum_h p(v, h) = \frac{1}{Z} \sum_h e^{-E(v,h)} \,. \quad (8)$$
Hence, given a specific training vector v, its probability can be
raised by adjusting the weights and the biases so as to
lower the energy of that particular vector while raising the
energy of all the others.
restricted boltzmann machines (rbms)
To this end, we can perform stochastic gradient ascent
on the log-likelihood of the training data vectors,
using (9):
$$\frac{\partial \log p(v)}{\partial \theta} = \underbrace{-\sum_h p(h \mid v)\,\frac{\partial E(v, h)}{\partial \theta}}_{\text{positive phase}} + \underbrace{\sum_{v,h} p(v, h)\,\frac{\partial E(v, h)}{\partial \theta}}_{\text{negative phase}} \quad (9)$$
training an rbm
training an rbm
The learning rule for performing stochastic steepest ascent in
the log probability of the training data:
$$\frac{\partial \log p(v)}{\partial w_{ji}} = \langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_\infty \quad (10)$$
where $\langle \cdot \rangle_0$ denotes expectations under the data distribution
($p_0 = p(h \mid v)$) and $\langle \cdot \rangle_\infty$ denotes expectations under the
model distribution
$p_\infty(v, h) = p(v, h)$ [Roux and Bengio, 2008].
mcmc using alternating gibbs sampling
Alternating Gibbs sampling starts the chain at a training vector, $v^{(0)} = x$, and then alternates between updating all hidden units in parallel and updating all visible units in parallel:
$$p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{I} v_i W_{ji}\Big), \qquad p(v_i = 1 \mid h) = \sigma\Big(c_i + \sum_{j=1}^{J} h_j W_{ji}\Big)$$
producing the chain $v^{(0)} \to h^{(0)} \to v^{(1)} \to h^{(1)} \to v^{(2)} \to h^{(2)} \to \cdots \to v^{(\infty)}, h^{(\infty)}$. The statistics $\langle v_i h_j \rangle_0$ are measured after the first hidden update, and $\langle v_i h_j \rangle_\infty$ once the chain reaches equilibrium.

[Figure: the alternating Gibbs chain, with visible layers v(0), v(1), v(2), · · · , v(∞) and hidden layers h(0), h(1), h(2), · · · , h(∞)]
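A sketch of this alternating chain (illustrative, not from the slides; in practice the chain is truncated, as the next section shows):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_chain(x, W, b, c, steps, rng):
    """Alternate h(n) ~ p(h | v(n)) and v(n+1) ~ p(v | h(n)), starting from v(0) = x."""
    v = x
    for _ in range(steps):
        h = (rng.random(b.shape) < sigmoid(b + W @ v)).astype(float)
        v = (rng.random(c.shape) < sigmoid(c + W.T @ h)).astype(float)
    return v, h  # approximates a sample from p(v, h) for large `steps`

rng = np.random.default_rng(0)
W, b, c = 0.1 * rng.standard_normal((3, 4)), np.zeros(3), np.zeros(4)
print(gibbs_chain(rng.integers(0, 2, size=4).astype(float), W, b, c, steps=100, rng=rng))
```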
contrastive divergence algorithm
contrastive divergence (cd–k)
∙ Computing $\langle \cdot \rangle_\infty$ exactly would require running the Gibbs
chain to equilibrium. To solve this problem, Hinton proposed
the Contrastive Divergence algorithm.
∙ CD–k replaces $\langle \cdot \rangle_\infty$ by $\langle \cdot \rangle_k$ for small values of k.
$$\Delta W_{ji} = \eta\big(\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k\big) \quad (11)$$
contrastive divergence (cd–k)
∙ $v^{(0)} \leftarrow x$
∙ Compute the binary (feature) states of the hidden units,
$h^{(0)}$, using $v^{(0)}$
∙ for n ← 1 to k
∙ Compute the “reconstruction” states of the visible units, $v^{(n)}$,
using $h^{(n-1)}$
∙ Compute the “reconstruction” states of the hidden units, $h^{(n)}$,
using $v^{(n)}$
∙ end for
∙ Update the weights and biases according to:
$$\Delta W_{ji} = \eta\big(\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k\big) \quad (12)$$
$$\Delta b_j = \eta\big(\langle h_j \rangle_0 - \langle h_j \rangle_k\big) \quad (13)$$
$$\Delta c_i = \eta\big(\langle v_i \rangle_0 - \langle v_i \rangle_k\big) \quad (14)$$
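Putting the steps together, a hedged NumPy sketch of one CD–k update (an illustration of the rules (12)-(14), not the authors' code; using the hidden probabilities rather than binary samples in the statistics is a common variance-reduction practice from [Hinton, 2010]):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd_k_update(v0, W, b, c, k=1, eta=0.1, rng=None):
    """One CD-k update for a single binary training vector v0; W has shape (J, I)."""
    rng = rng or np.random.default_rng()
    ph0 = sigmoid(b + W @ v0)                        # p(h_j = 1 | v(0))
    h = (rng.random(ph0.shape) < ph0).astype(float)  # binary h(0)
    v = v0
    for _ in range(k):
        pv = sigmoid(c + W.T @ h)                    # "reconstruction" v(n)
        v = (rng.random(pv.shape) < pv).astype(float)
        ph = sigmoid(b + W @ v)                      # hidden states h(n)
        h = (rng.random(ph.shape) < ph).astype(float)
    W += eta * (np.outer(ph0, v0) - np.outer(ph, v))  # eq. (12)
    b += eta * (ph0 - ph)                             # eq. (13)
    c += eta * (v0 - v)                               # eq. (14)
    return W, b, c
```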
deep belief networks (dbns)
deep belief networks (dbns)
[Figure: greedy layer-wise stacking of a DBN. First an RBM is trained between x and h1, modeling p(h1|x) and p(x|h1); a second RBM is stacked between h1 and h2, modeling p(h2|h1) and p(h1|h2); a third between h2 and h3, modeling p(h3|h2) and p(h2|h3)]
deep belief networks (dbns)
∙ Start with a training vector
on the visible units
∙ Update all the hidden units
in parallel
∙ Update all the visible
units in parallel to get a
“reconstruction”
∙ Update the hidden units
again
pre-training and fine tuning
[Figure: RBM pre-training followed by fine-tuning with BP. Left: a stack of RBMs is trained greedily, layer by layer, on the data: data → 500 hidden units → 300 hidden units → 100 hidden units → 10 hidden units. Right: the resulting DBN model is fine-tuned with BP, updating the weights until error < 0.001]
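A sketch of the greedy layer-wise loop in the figure (illustrative; the layer sizes 500-300-100-10 mirror the diagram, cd_k_update is the CD-k sketch given earlier, and supervised BP fine-tuning would follow):

```python
import numpy as np

def pretrain_dbn(X, layer_sizes=(500, 300, 100, 10), k=1, eta=0.1, epochs=10, seed=0):
    """Greedily train one RBM per layer; each layer learns on the previous layer's features."""
    rng = np.random.default_rng(seed)
    rbms = []
    for n_hidden in layer_sizes:
        n_visible = X.shape[1]
        W = 0.01 * rng.standard_normal((n_hidden, n_visible))
        b, c = np.zeros(n_hidden), np.zeros(n_visible)
        for _ in range(epochs):
            for v0 in X:  # plain SGD over single examples, for simplicity
                W, b, c = cd_k_update(v0, W, b, c, k=k, eta=eta, rng=rng)
        rbms.append((W, b, c))
        # Propagate hidden activation probabilities upward as the next layer's data
        X = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return rbms  # followed by supervised fine-tuning with BP
```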
deep belief networks (dbns)
practical considerations
weight initialization
deep belief networks (dbns) - adaptive learning rate
$$\eta_{ji} = \begin{cases} u\,\eta_{ji}^{(\mathrm{old})} & \text{if } \big(\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k\big)\big(\langle v_i h_j \rangle_0^{(\mathrm{old})} - \langle v_i h_j \rangle_k^{(\mathrm{old})}\big) > 0 \\ d\,\eta_{ji}^{(\mathrm{old})} & \text{if } \big(\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k\big)\big(\langle v_i h_j \rangle_0^{(\mathrm{old})} - \langle v_i h_j \rangle_k^{(\mathrm{old})}\big) < 0 \end{cases}$$
where u > 1 and 0 < d < 1 are the factors that increase the per-weight step size when two consecutive gradient estimates agree in sign and decrease it when they disagree.

Lopes et al., Towards Adaptive Learning with Improved Convergence of DBNs on GPUs, Pattern Recognition, 2014
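A sketch of this per-weight rule (illustrative, not the authors' code; the values of u and d are assumptions for the example):

```python
import numpy as np

def adapt_step_sizes(eta, grad, grad_prev, u=1.2, d=0.8):
    """Per-weight step sizes: grow by u where the CD statistic <v_i h_j>_0 - <v_i h_j>_k
    keeps its sign between updates, shrink by d where the sign flips."""
    sign_product = grad * grad_prev
    eta = np.where(sign_product > 0, u * eta, eta)
    eta = np.where(sign_product < 0, d * eta, eta)
    return eta
```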
adaptive step size

[Figure: average reconstruction error (RMSE) over 1000 training epochs for α = 0.1, α = 0.4 and α = 0.7; each panel compares the adaptive step size with fixed learning rates γ ∈ {0.1, 0.4, 0.7}]
convergence results (α = 0.1)

[Figure: training images and their reconstructions after 50, 100, 250, 500, 750 and 1000 epochs, comparing the adaptive step size with a fixed (optimized) learning rate η = 0.4]
deep models characteristics
deep models characteristics
∙ Biological plausibility
∙ DBNs are effective in a wide range of ML problems.
∙ Creating a Deep Belief Network (DBN) model is a
time-consuming and computationally expensive task that
involves training several Restricted Boltzmann Machines
(RBMs) and demands considerable effort.
∙ The adaptive step-size procedure for tuning the learning
rate has been incorporated into the learning model with
excellent results.
∙ Graphics Processing Units (GPUs) can significantly reduce
the convergence time of the data-intensive tasks in DBNs.
Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127.
Carreira-Perpiñán, M. A. and Hinton, G. E. (2005). On contrastive divergence learning. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), pages 33–40.
Hinton, G. E. (2010). A practical guide to training restricted Boltzmann machines. Technical report, Department of Computer Science, University of Toronto.
Larochelle, H., Erhan, D., Courville, A., Bergstra, J., and Bengio, Y. (2007). An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning (ICML 2007), pages 473–480. ACM.
Roux, N. L. and Bengio, Y. (2008). Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, 20(6):1631–1649.
Roux, N. L. and Bengio, Y. (2010). Deep belief networks are compact universal approximators. Neural Computation, 22(8):2192–2207.
Questions?
deep learning
Algorithms and Applications
Bernardete Ribeiro, bribeiro@dei.uc.pt
June 24, 2015
University of Coimbra, Portugal
INIT/AERFAI Summer School on Machine Learning, Benicassim 22-26 June 2015
More Related Content

What's hot

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Show, Attend and Tell: Neural Image Caption Generation with Visual AttentionShow, Attend and Tell: Neural Image Caption Generation with Visual Attention
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Eun Ji Lee
 
Lecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image OperationLecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image Operation
VARUN KUMAR
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationHidekazu Oiwa
 
A new look on performance of small-cell network with design of multiple anten...
A new look on performance of small-cell network with design of multiple anten...A new look on performance of small-cell network with design of multiple anten...
A new look on performance of small-cell network with design of multiple anten...
journalBEEI
 
S.Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S.Duplij, A q-deformed generalization of the Hosszu-Gluskin theoremS.Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S.Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
Steven Duplij (Stepan Douplii)
 
Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...
Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...
Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...
Gurbinder Gill
 
Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)
VARUN KUMAR
 
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
Taegyun Jeon
 
2012 cvpr gtw
2012 cvpr gtw2012 cvpr gtw
2012 cvpr gtw
Chau Phuong
 
論文紹介 Fast imagetagging
論文紹介 Fast imagetagging論文紹介 Fast imagetagging
論文紹介 Fast imagetaggingTakashi Abe
 
Stochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of MultipliersStochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of Multipliers
Taiji Suzuki
 
Collision Detection In 3D Environments
Collision Detection In 3D EnvironmentsCollision Detection In 3D Environments
Collision Detection In 3D EnvironmentsUng-Su Lee
 
Fisher Kernel based Relevance Feedback for Multimodal Video Retrieval
Fisher Kernel based Relevance Feedback for Multimodal Video RetrievalFisher Kernel based Relevance Feedback for Multimodal Video Retrieval
Fisher Kernel based Relevance Feedback for Multimodal Video Retrieval
Ionut Mironica
 
DIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image TransformDIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image Transform
vijayanand Kandaswamy
 
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
홍배 김
 
Restricted Boltzman Machine (RBM) presentation of fundamental theory
Restricted Boltzman Machine (RBM) presentation of fundamental theoryRestricted Boltzman Machine (RBM) presentation of fundamental theory
Restricted Boltzman Machine (RBM) presentation of fundamental theory
Seongwon Hwang
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
홍배 김
 
Optimal nonlocal means algorithm for denoising ultrasound image
Optimal nonlocal means algorithm for denoising ultrasound imageOptimal nonlocal means algorithm for denoising ultrasound image
Optimal nonlocal means algorithm for denoising ultrasound image
Alexander Decker
 
11.optimal nonlocal means algorithm for denoising ultrasound image
11.optimal nonlocal means algorithm for denoising ultrasound image11.optimal nonlocal means algorithm for denoising ultrasound image
11.optimal nonlocal means algorithm for denoising ultrasound imageAlexander Decker
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
홍배 김
 

What's hot (20)

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Show, Attend and Tell: Neural Image Caption Generation with Visual AttentionShow, Attend and Tell: Neural Image Caption Generation with Visual Attention
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
 
Lecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image OperationLecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image Operation
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
 
A new look on performance of small-cell network with design of multiple anten...
A new look on performance of small-cell network with design of multiple anten...A new look on performance of small-cell network with design of multiple anten...
A new look on performance of small-cell network with design of multiple anten...
 
S.Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S.Duplij, A q-deformed generalization of the Hosszu-Gluskin theoremS.Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S.Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
 
Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...
Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...
Efficient Variable Size Template Matching Using Fast Normalized Cross Correla...
 
Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)
 
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
 
2012 cvpr gtw
2012 cvpr gtw2012 cvpr gtw
2012 cvpr gtw
 
論文紹介 Fast imagetagging
論文紹介 Fast imagetagging論文紹介 Fast imagetagging
論文紹介 Fast imagetagging
 
Stochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of MultipliersStochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of Multipliers
 
Collision Detection In 3D Environments
Collision Detection In 3D EnvironmentsCollision Detection In 3D Environments
Collision Detection In 3D Environments
 
Fisher Kernel based Relevance Feedback for Multimodal Video Retrieval
Fisher Kernel based Relevance Feedback for Multimodal Video RetrievalFisher Kernel based Relevance Feedback for Multimodal Video Retrieval
Fisher Kernel based Relevance Feedback for Multimodal Video Retrieval
 
DIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image TransformDIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image Transform
 
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
 
Restricted Boltzman Machine (RBM) presentation of fundamental theory
Restricted Boltzman Machine (RBM) presentation of fundamental theoryRestricted Boltzman Machine (RBM) presentation of fundamental theory
Restricted Boltzman Machine (RBM) presentation of fundamental theory
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
 
Optimal nonlocal means algorithm for denoising ultrasound image
Optimal nonlocal means algorithm for denoising ultrasound imageOptimal nonlocal means algorithm for denoising ultrasound image
Optimal nonlocal means algorithm for denoising ultrasound image
 
11.optimal nonlocal means algorithm for denoising ultrasound image
11.optimal nonlocal means algorithm for denoising ultrasound image11.optimal nonlocal means algorithm for denoising ultrasound image
11.optimal nonlocal means algorithm for denoising ultrasound image
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 

Viewers also liked

A Literature Survey on Mobile-Learning Management Systems
A Literature Survey on Mobile-Learning Management SystemsA Literature Survey on Mobile-Learning Management Systems
A Literature Survey on Mobile-Learning Management Systems
AM Publications
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
Akshay Hegde
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkDEEPASHRI HK
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
台灣資料科學年會
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial Intelligence
Lukas Masuch
 

Viewers also liked (6)

A Literature Survey on Mobile-Learning Management Systems
A Literature Survey on Mobile-Learning Management SystemsA Literature Survey on Mobile-Learning Management Systems
A Literature Survey on Mobile-Learning Management Systems
 
Deep Learning Survey
Deep Learning SurveyDeep Learning Survey
Deep Learning Survey
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial Intelligence
 

Similar to Dl1 deep learning_algorithms

Vector-Based Back Propagation Algorithm of.pdf
Vector-Based Back Propagation Algorithm of.pdfVector-Based Back Propagation Algorithm of.pdf
Vector-Based Back Propagation Algorithm of.pdf
Nesrine Wagaa
 
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
convolutional_neural_networks in deep learning
convolutional_neural_networks in deep learningconvolutional_neural_networks in deep learning
convolutional_neural_networks in deep learning
ssusere5ddd6
 
AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...
AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...
AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...
AILABS Academy
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
tuxette
 
Spectral convnets
Spectral convnetsSpectral convnets
Spectral convnets
xavierbresson
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistry
Kenta Oono
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Soma Boubou
 
MVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priorsMVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priors
Elvis DOHMATOB
 
NN_02_Threshold_Logic_Units.pdf
NN_02_Threshold_Logic_Units.pdfNN_02_Threshold_Logic_Units.pdf
NN_02_Threshold_Logic_Units.pdf
chiron1988
 
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
Seiya Ito
 
talk_NASPDE.pdf
talk_NASPDE.pdftalk_NASPDE.pdf
talk_NASPDE.pdf
Chiheb Ben Hammouda
 
Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...
Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...
Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...
csandit
 
(DL輪読)Variational Dropout Sparsifies Deep Neural Networks
(DL輪読)Variational Dropout Sparsifies Deep Neural Networks(DL輪読)Variational Dropout Sparsifies Deep Neural Networks
(DL輪読)Variational Dropout Sparsifies Deep Neural Networks
Masahiro Suzuki
 
Conference_paper.pdf
Conference_paper.pdfConference_paper.pdf
Conference_paper.pdf
NarenRajVivek
 
00463517b1e90c1e63000000
00463517b1e90c1e6300000000463517b1e90c1e63000000
00463517b1e90c1e63000000Ivonne Liu
 
VoxelNet
VoxelNetVoxelNet
VoxelNet
taeseon ryu
 
The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)
The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)
The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)
Universitat Politècnica de Catalunya
 
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Report Satellite Navigation Systems
Report Satellite Navigation SystemsReport Satellite Navigation Systems
Report Satellite Navigation Systems
Ferro Demetrio
 

Similar to Dl1 deep learning_algorithms (20)

Vector-Based Back Propagation Algorithm of.pdf
Vector-Based Back Propagation Algorithm of.pdfVector-Based Back Propagation Algorithm of.pdf
Vector-Based Back Propagation Algorithm of.pdf
 
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
 
convolutional_neural_networks in deep learning
convolutional_neural_networks in deep learningconvolutional_neural_networks in deep learning
convolutional_neural_networks in deep learning
 
AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...
AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...
AILABS - Lecture Series - Is AI the New Electricity? Topic:- Classification a...
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
 
Spectral convnets
Spectral convnetsSpectral convnets
Spectral convnets
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistry
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
 
MVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priorsMVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priors
 
NN_02_Threshold_Logic_Units.pdf
NN_02_Threshold_Logic_Units.pdfNN_02_Threshold_Logic_Units.pdf
NN_02_Threshold_Logic_Units.pdf
 
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
 
talk_NASPDE.pdf
talk_NASPDE.pdftalk_NASPDE.pdf
talk_NASPDE.pdf
 
Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...
Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...
Objective Evaluation of a Deep Neural Network Approach for Single-Channel Spe...
 
(DL輪読)Variational Dropout Sparsifies Deep Neural Networks
(DL輪読)Variational Dropout Sparsifies Deep Neural Networks(DL輪読)Variational Dropout Sparsifies Deep Neural Networks
(DL輪読)Variational Dropout Sparsifies Deep Neural Networks
 
Conference_paper.pdf
Conference_paper.pdfConference_paper.pdf
Conference_paper.pdf
 
00463517b1e90c1e63000000
00463517b1e90c1e6300000000463517b1e90c1e63000000
00463517b1e90c1e63000000
 
VoxelNet
VoxelNetVoxelNet
VoxelNet
 
The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)
The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)
The Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intelligence)
 
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
 
Report Satellite Navigation Systems
Report Satellite Navigation SystemsReport Satellite Navigation Systems
Report Satellite Navigation Systems
 

More from Armando Vieira

Improving Insurance Risk Prediction with Generative Adversarial Networks (GANs)
Improving Insurance  Risk Prediction with Generative Adversarial Networks (GANs)Improving Insurance  Risk Prediction with Generative Adversarial Networks (GANs)
Improving Insurance Risk Prediction with Generative Adversarial Networks (GANs)
Armando Vieira
 
Predicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithmsPredicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithms
Armando Vieira
 
Boosting conversion rates on ecommerce using deep learning algorithms
Boosting conversion rates on ecommerce using deep learning algorithmsBoosting conversion rates on ecommerce using deep learning algorithms
Boosting conversion rates on ecommerce using deep learning algorithms
Armando Vieira
 
Seasonality effects on second hand cars sales
Seasonality effects on second hand cars salesSeasonality effects on second hand cars sales
Seasonality effects on second hand cars sales
Armando Vieira
 
Visualizations of high dimensional data using R and Shiny
Visualizations of high dimensional data using R and ShinyVisualizations of high dimensional data using R and Shiny
Visualizations of high dimensional data using R and Shiny
Armando Vieira
 
Extracting Knowledge from Pydata London 2015
Extracting Knowledge from Pydata London 2015Extracting Knowledge from Pydata London 2015
Extracting Knowledge from Pydata London 2015Armando Vieira
 
Hidden Layer Leraning Vector Quantizatio
Hidden Layer Leraning Vector Quantizatio Hidden Layer Leraning Vector Quantizatio
Hidden Layer Leraning Vector Quantizatio
Armando Vieira
 
machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...
Armando Vieira
 
Neural Networks and Genetic Algorithms Multiobjective acceleration
Neural Networks and Genetic Algorithms Multiobjective accelerationNeural Networks and Genetic Algorithms Multiobjective acceleration
Neural Networks and Genetic Algorithms Multiobjective accelerationArmando Vieira
 
Optimization of digital marketing campaigns
Optimization of digital marketing campaignsOptimization of digital marketing campaigns
Optimization of digital marketing campaigns
Armando Vieira
 
Credit risk with neural networks bankruptcy prediction machine learning
Credit risk with neural networks bankruptcy prediction machine learningCredit risk with neural networks bankruptcy prediction machine learning
Credit risk with neural networks bankruptcy prediction machine learning
Armando Vieira
 
Online democracy Armando Vieira
Online democracy Armando VieiraOnline democracy Armando Vieira
Online democracy Armando VieiraArmando Vieira
 
Invtur conference aveiro 2010
Invtur conference aveiro 2010Invtur conference aveiro 2010
Invtur conference aveiro 2010Armando Vieira
 
Tourism with recomendation systems
Tourism with recomendation systemsTourism with recomendation systems
Tourism with recomendation systemsArmando Vieira
 
Manifold learning for bankruptcy prediction
Manifold learning for bankruptcy predictionManifold learning for bankruptcy prediction
Manifold learning for bankruptcy predictionArmando Vieira
 
Artificial neural networks for ion beam analysis
Artificial neural networks for ion beam analysisArtificial neural networks for ion beam analysis
Artificial neural networks for ion beam analysisArmando Vieira
 

More from Armando Vieira (20)

Improving Insurance Risk Prediction with Generative Adversarial Networks (GANs)
Improving Insurance  Risk Prediction with Generative Adversarial Networks (GANs)Improving Insurance  Risk Prediction with Generative Adversarial Networks (GANs)
Improving Insurance Risk Prediction with Generative Adversarial Networks (GANs)
 
Predicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithmsPredicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithms
 
Boosting conversion rates on ecommerce using deep learning algorithms
Boosting conversion rates on ecommerce using deep learning algorithmsBoosting conversion rates on ecommerce using deep learning algorithms
Boosting conversion rates on ecommerce using deep learning algorithms
 
Seasonality effects on second hand cars sales
Seasonality effects on second hand cars salesSeasonality effects on second hand cars sales
Seasonality effects on second hand cars sales
 
Visualizations of high dimensional data using R and Shiny
Visualizations of high dimensional data using R and ShinyVisualizations of high dimensional data using R and Shiny
Visualizations of high dimensional data using R and Shiny
 
Dl2 computing gpu
Dl2 computing gpuDl2 computing gpu
Dl2 computing gpu
 
Extracting Knowledge from Pydata London 2015
Extracting Knowledge from Pydata London 2015Extracting Knowledge from Pydata London 2015
Extracting Knowledge from Pydata London 2015
 
Hidden Layer Leraning Vector Quantizatio
Hidden Layer Leraning Vector Quantizatio Hidden Layer Leraning Vector Quantizatio
Hidden Layer Leraning Vector Quantizatio
 
machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...
 
Neural Networks and Genetic Algorithms Multiobjective acceleration
Neural Networks and Genetic Algorithms Multiobjective accelerationNeural Networks and Genetic Algorithms Multiobjective acceleration
Neural Networks and Genetic Algorithms Multiobjective acceleration
 
Optimization of digital marketing campaigns
Optimization of digital marketing campaignsOptimization of digital marketing campaigns
Optimization of digital marketing campaigns
 
Credit risk with neural networks bankruptcy prediction machine learning
Credit risk with neural networks bankruptcy prediction machine learningCredit risk with neural networks bankruptcy prediction machine learning
Credit risk with neural networks bankruptcy prediction machine learning
 
Online democracy Armando Vieira
Online democracy Armando VieiraOnline democracy Armando Vieira
Online democracy Armando Vieira
 
Invtur conference aveiro 2010
Invtur conference aveiro 2010Invtur conference aveiro 2010
Invtur conference aveiro 2010
 
Tourism with recomendation systems
Tourism with recomendation systemsTourism with recomendation systems
Tourism with recomendation systems
 
Manifold learning for bankruptcy prediction
Manifold learning for bankruptcy predictionManifold learning for bankruptcy prediction
Manifold learning for bankruptcy prediction
 
Credit iconip
Credit iconipCredit iconip
Credit iconip
 
Requiem pelo ensino
Requiem pelo ensino Requiem pelo ensino
Requiem pelo ensino
 
Eurogen v
Eurogen vEurogen v
Eurogen v
 
Artificial neural networks for ion beam analysis
Artificial neural networks for ion beam analysisArtificial neural networks for ion beam analysis
Artificial neural networks for ion beam analysis
 

Recently uploaded

Comptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guideComptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guide
GTProductions1
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
Javier Lasa
 
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptxInternet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
VivekSinghShekhawat2
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
3ipehhoa
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
eutxy
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
nirahealhty
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Brad Spiegel Macon GA
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
laozhuseo02
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
laozhuseo02
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
natyesu
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
JungkooksNonexistent
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Sanjeev Rampal
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
keoku
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
ufdana
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
Gal Baras
 

Recently uploaded (20)

Comptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guideComptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guide
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
 
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptxInternet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 

Dl1 deep learning_algorithms

  • 1. deep learning Algorithms and Applications Bernardete Ribeiro, bribeiro@dei.uc.pt University of Coimbra, Portugal INIT/AERFAI Summer School on Machine Learning, Benicassim 22-26 June 2015
  • 2. III - Deep Learning Algorithms 1
  • 3. elements 3: deep neural networks
  • 4. outline ∙ Learning in Deep Neural Networks ∙ Deep Learning: Evolution Timeline ∙ Deep Architectures ∙ Restricted Boltzmann Machines (RBMs) ∙ Deep Belief Networks (DBNs) ∙ Deep Models Overall Characteristics 3
  • 5. learning in deep neural networks
  • 6. learning in deep neural networks 1. No general learning algorithm (no-free lunch theorem by Wolpert 1996) 2. Learning algorithm for specific tasks - perception, control, prediction, planning reasoning, language understanding 3. Limitations of BP - local minima, optimization challenges for non-convex objective functions 4. Hinton’s deep belief networks (DBNs) as stack of RBMs 5. LeCun’s energy based learning for DBNs 5
  • 7. deep learning: evolution timeline 1. Perceptron [Frank Rosenblatt, 1959] 2. Neocognitron [K Fukushima, 1980] 3. Convolutional Neural Network (CNN) [LeCun, 1989] 4. Multi-level Hierarchy Networks [Jurgen Schmidthuber, 1992] 5. Deep Belief Networks (DBNs) as stack of RBMs [Geoffrey Hinton, 2006] 6
  • 9. from brain-like computing to deep learning ∙ New empirical and theoretical results have brought deep architectures into the focus of the Machine Learning (ML) researchers [Larochelle et al., 2007]. ∙ Theoretical results suggest that deep architectures are fundamental to learn the kind of brain-like complicated functions that can represent high-level abstractions (e.g. vision, speech, language) [Bengio, 2009] 8
  • 11. deep neural networks ∙ Convolutional Neural Networks (CNNs) [LeCun et al., 1989] ∙ Deep Belief Networks (DBNs) [Hinton et al, 2006] ∙ AutoEncoders (AEs) [Bengio et al, NIPS 2006] ∙ Sparse Autoencoders [Ranzato et al, NIPS’2006] 10
  • 12. convolutional neural networks (cnns) ∙ Convolutional Neural Network consists of two basic operations ∙ convolutional ∙ pooling ∙ Convolutional and pooling layers are arranged alternately until high-level features are obtained ∙ Several feature maps in each convolutional layer ∙ Weights in the same map are shared NN input C1 S2 C3 S4 1 1 I Arel, D Rose & T Karnowski, Deep Machine Learning—A New Frontier in Artificial Intelligence Research, IEEE, CIM,2010 11
  • 13. convolutional neural networks (cnns) ∙ Convolutional: suppose the size of the layer is d × d and the size of the receptive fields are r × r, γ and x denote respectively the values of the convolutional layer and the previous layer: γij = g( r m=1 r n=1 xi+m−1,j+n−1.wm,n + b) i, j = 1, · · · , (d − r + 1) where g is a nonlinear function. ∙ Pooling is following after convolution to reduce the dimensionality of features and to introduce translational invariance into the CNN network. 12
  • 14. deep belief networks (dbns) ∙ Probabilistic generative models contrasting with the discriminative nature of other NNS ∙ Generative models provide a joint probability distribution of data and labels ∙ Unsupervised greedy-layer-wise pre-training followed by final tuning image 28 x 28 pixels visible hidden visible hidden visible hidden Top Level units Labels Hidden Units RBM Layer RBM Layer RBM Layer Detection Layer 2 2 based on I Arel, D Rose & T Karnowski, Deep Machine Learning—A New Frontier in Artificial Intelligence Research, IEEE, CIM,2010 13
  • 15. autoencoders (aes) ∙ The auto-encoder has two components: ∙ the encoder f (mapping x to h) and ∙ the decoder g (mapping h to r) ∙ An auto-encoder is a neural network that tries to reconstruct its input to its output encoder f … … … … … … decoder g input x code h reconstruction r 3 3 based on Y Bengio, I Goodfellow and A Courville, Deep Learning, An MIT Press book (in preparation), www.iro.umontreal.ca_~bengioy_dbook 14
  • 16. deep architectures versus shallow architectures ∙ Deep architectures can be exponentially more efficient than shallow architectures [Roux and Bengio, 2010]. ∙ Functions that can be compactly represented with a Neural Network (NN) of depth d, may require an exponential number of computational elements for a network with depth d − 1 [Bengio, 2009]. 15
  • 17. deep architectures versus shallow architectures ∙ Deep architectures can be exponentially more efficient than shallow architectures [Roux and Bengio, 2010]. ∙ Functions that can be compactly represented with a Neural Network (NN) of depth d, may require an exponential number of computational elements for a network with depth d − 1 [Bengio, 2009]. ∙ Since the number of computational elements depends on the number of training samples available, using shallow architectures may result in poor generalization models [Bengio, 2009]. ∙ As a result, deep architecture models tend to outperform shallow models such as SVMs [Larochelle et al., 2007]. 15
  • 20. restricted boltzmann machines (rbms) h1 h2 h3 · · · hj · · · hJ 1 bias v1 v2 · · · vi · · · vI 1 bias visible units hidden units decoder encoder 18
  • 21. restricted boltzmann machines (rbms) ∙ Unsupervised ∙ Find complex regularities in training data ∙ Bipartite Graph ∙ visible, hidden layer ∙ Binary stochastic units ∙ On/Off with probability ∙ 1 Iteration ∙ Update Hidden Units ∙ Reconstruct Visible Units ∙ Maximum Likelihood of training data h1 h2 h3 · · · hj · · · hJ 1 bias v1 v2 · · · vi · · · vI 1 bias visible units hidden units encoder 19
  • 22. restricted boltzmann machines (rbms) ∙ Training Goal: Best probable reproduction ∙ unsupervised data ∙ find latent factors of data set ∙ Adjust weights to get maximum probability of input data h1 h2 h3 · · · hj · · · hJ 1 bias v1 v2 · · · vi · · · vI 1 bias visible units hidden units encoder 20
  • 23. restricted boltzmann machines (rbms) Given an observed state, the energy of the joint configuration of the visible units and hidden units (v, h) is given by: E(v, h) = − I i=1 civi − J j=1 bjhj − J j=1 I i=1 Wjivihj , (1) where W is the matrix of weights, and b and c are the bias units w.r.t. hidden and visible layers, respectively. h1 h2 h3 · · · hj · · · hJ 1 bias v1 v2 · · · vi · · · vI 1 bias visible units hidden units decoder encoder 21
  • 24. restricted boltzmann machines (rbms) The Restricted Boltzmann Machine (RBM) assigns a probability for each configuration (v, h), using: p(v, h) = e−E(v,h) Z , (2) where Z is a normalization constant called partition function, obtained by summing up the energy of all possible (v, h) configurations [Bengio, 2009, Hinton, 2010, Carreira-Perpiñán and Hinton, 2005]: Z = v,h e−E(v,h) . (3) 22
  • 25. restricted boltzmann machines (rbms) Since there are no connections between any two units within the same layer, given a particular input configuration v the hidden units are conditionally independent of each other, and the probability of h given v factorizes: p(h | v) = \prod_j p(h_j | v) (4), where p(h_j = 1 | v) = \sigma(b_j + \sum_{i=1}^{I} v_i W_{ji}) (5).
  • 26. restricted boltzmann machines (rbms) Similarly, given a specific hidden state h, the probability of v given h factorizes as in (6): p(v | h) = \prod_i p(v_i | h) (6), where: p(v_i = 1 | h) = \sigma(c_i + \sum_{j=1}^{J} h_j W_{ji}) (7).
  • 27. restricted boltzmann machines (rbms) Given a random training vector v, the state of hidden unit j is set to 1 with probability: p(h_j = 1 | v) = \sigma(b_j + \sum_i v_i W_{ji}). Similarly: p(v_i = 1 | h) = \sigma(c_i + \sum_j h_j W_{ji}), where \sigma(x) is the sigmoid squashing function 1 / (1 + e^{-x}).
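Eqs. (4)-(7) translate directly into sampling routines. The sketch below draws binary unit states from the two conditionals; the function names are our own.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # W is J x I, b the hidden bias (length J), c the visible bias (length I).
    def sample_h_given_v(v, W, b, rng):
        p = sigmoid(b + W @ v)               # Eq. (5): p(h_j = 1 | v)
        return (rng.random(p.shape) < p).astype(float), p

    def sample_v_given_h(h, W, c, rng):
        p = sigmoid(c + W.T @ h)             # Eq. (7): p(v_i = 1 | h)
        return (rng.random(p.shape) < p).astype(float), p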
  • 28. restricted boltzmann machines (rbms) The marginal probability assigned to a visible vector v is given by (8): p(v) = \sum_h p(v, h) = \frac{1}{Z} \sum_h e^{-E(v,h)} (8). Hence, the probability of a specific training vector v can be raised by adjusting the weights and biases so as to lower the energy of that particular vector while raising the energy of all the others.
  • 29. restricted boltzmann machines (rbms) To this end, we can perform a stochastic gradient ascent procedure on the log-likelihood of the training data vectors, using (9): \frac{\partial \log p(v)}{\partial \theta} = \underbrace{- \sum_h p(h | v) \frac{\partial E(v, h)}{\partial \theta}}_{\text{positive phase}} + \underbrace{\sum_{v,h} p(v, h) \frac{\partial E(v, h)}{\partial \theta}}_{\text{negative phase}} (9)
  • 31. training an rbm The learning rule for performing stochastic steepest ascent in the log probability of the training data: \frac{\partial \log p(v)}{\partial W_{ji}} = \langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_\infty (10), where \langle \cdot \rangle_0 denotes expectations under the data distribution (p_0 = p(h | v)) and \langle \cdot \rangle_\infty denotes expectations under the model distribution p_\infty(v, h) = p(v, h) [Roux and Bengio, 2008].
  • 32. mcmc using alternating gibbs sampling [Figure, built up over slides 32-36: starting from the data, v(0) = x, the chain alternately samples h(n) from p(h_j = 1 | v) = \sigma(b_j + \sum_{i=1}^{I} v_i W_{ji}) and v(n+1) from p(v_i = 1 | h) = \sigma(c_i + \sum_{j=1}^{J} h_j W_{ji}); the statistics \langle v_i h_j \rangle_0 are collected at step 0 and \langle v_i h_j \rangle_\infty at the equilibrium of the chain, (v(∞), h(∞))]
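One possible transcription of the alternating chain, reusing the sampling helpers sketched earlier; k is the number of full Gibbs steps run before the statistics are read off.

    import numpy as np

    # Alternating Gibbs sampling from a data vector x, assuming the
    # sample_h_given_v / sample_v_given_h helpers defined above.
    def gibbs_chain(x, W, b, c, k, rng):
        v = x.copy()                           # v(0) = x
        h, _ = sample_h_given_v(v, W, b, rng)  # h(0)
        for _ in range(k):
            v, _ = sample_v_given_h(h, W, c, rng)
            h, _ = sample_h_given_v(v, W, b, rng)
        return v, h                            # (v(k), h(k))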
  • 38. contrastive divergence (cd–k) ∙ Running the Gibbs chain to equilibrium in order to estimate \langle v_i h_j \rangle_\infty is impractical. To solve this problem, Hinton proposed the Contrastive Divergence algorithm. ∙ CD–k replaces \langle \cdot \rangle_\infty by \langle \cdot \rangle_k for small values of k: \Delta W_{ji} = \eta (\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k) (11)
  • 39. contrastive divergence (cd–k) ∙ v(0) ← x ∙ Compute the binary (feature) states of the hidden units, h(0), using v(0) ∙ for n ← 1 to k: ∙ compute the “reconstruction” states of the visible units, v(n), using h(n−1) ∙ compute the “reconstruction” states of the hidden units, h(n), using v(n) ∙ end for ∙ Update the weights and biases (see the sketch below) according to: \Delta W_{ji} = \eta (\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k) (12), \Delta b_j = \eta (\langle h_j \rangle_0 - \langle h_j \rangle_k) (13), \Delta c_i = \eta (\langle v_i \rangle_0 - \langle v_i \rangle_k) (14)
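Putting the loop and the updates (12)-(14) together, a CD-k step for one training vector might look as follows. It assumes k >= 1 and, following common practice (e.g. Hinton's practical guide), uses hidden probabilities rather than sampled states in the statistics and a real-valued reconstruction; both are choices on our part.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # One CD-k update for a single training vector x; W is J x I, b the
    # hidden bias, c the visible bias, eta the learning rate.
    def cd_k_update(x, W, b, c, k, eta, rng):
        v0 = x
        p_h0 = sigmoid(b + W @ v0)                    # p(h | v(0))
        h = (rng.random(p_h0.shape) < p_h0).astype(float)
        for _ in range(k):
            vk = sigmoid(c + W.T @ h)                 # "reconstruction" v(n)
            p_hk = sigmoid(b + W @ vk)                # p(h | v(n))
            h = (rng.random(p_hk.shape) < p_hk).astype(float)
        # Eqs. (12)-(14): <.>_0 from the data, <.>_k from the reconstruction
        W += eta * (np.outer(p_h0, v0) - np.outer(p_hk, vk))
        b += eta * (p_h0 - p_hk)
        c += eta * (v0 - vk)
        return W, b, c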
  • 41. deep belief networks (dbns) [Figure: stacking RBMs one layer at a time; first x ↔ h1 via p(h1 | x) and p(x | h1), then h1 ↔ h2 via p(h2 | h1) and p(h1 | h2), then h2 ↔ h3 via p(h3 | h2) and p(h2 | h3)]
  • 42. deep belief networks (dbns) ∙ Start with a training vector on the visible units ∙ Update all the hidden units in parallel ∙ Update all the visible units in parallel to get a “reconstruction” ∙ Update the hidden units again (a layer-by-layer sketch of the greedy procedure follows below)
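A hedged sketch of greedy layer-wise pre-training: each RBM is trained with the cd_k_update step above, and its hidden activation probabilities become the "data" for the next layer. The function name and hyperparameters are placeholders.

    import numpy as np

    def pretrain_dbn(X, layer_sizes, k, eta, epochs, rng):
        params, data = [], X                   # X: (samples, visible dim)
        for n_hidden in layer_sizes:
            I = data.shape[1]
            W = rng.normal(0.0, 0.01, (n_hidden, I))
            b, c = np.zeros(n_hidden), np.zeros(I)
            for _ in range(epochs):
                for x in data:                 # one CD-k step per sample
                    W, b, c = cd_k_update(x, W, b, c, k, eta, rng)
            params.append((W, b, c))
            # hidden probabilities become the data for the next RBM
            data = 1.0 / (1.0 + np.exp(-(data @ W.T + b)))
        return params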
  • 43. pre-training and fine tuning [Figure: a DBN built as a stack of RBMs, data → 500 hidden units → 300 hidden units → 100 hidden units → 10 hidden units; the RBMs are pre-trained layer by layer, after which the whole stack is fine-tuned with backpropagation (BP), updating the weights until error < 0.001]
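For illustration, the pre-training sketch above could be called with the layer sizes from this slide; the random matrix stands in for real data, and the hyperparameter values are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((1000, 784))                # stand-in for real 28x28 data
    params = pretrain_dbn(X, [500, 300, 100, 10], k=1, eta=0.1,
                          epochs=10, rng=rng)
    # The pre-trained weights then initialize a feed-forward network,
    # which is fine-tuned with backpropagation until the error target is met.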
  • 44. deep belief networks (dbns) [figure-only slide]
  • 47. deep belief networks (dbns) - adaptive learning rate The per-weight step size is adapted according to the sign agreement of successive CD gradients (see the sketch below): \eta_{ji} = \begin{cases} u \, \eta_{ji}^{(old)} & \text{if } (\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k)(\langle v_i h_j \rangle_0^{(old)} - \langle v_i h_j \rangle_k^{(old)}) > 0 \\ d \, \eta_{ji}^{(old)} & \text{if } (\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k)(\langle v_i h_j \rangle_0^{(old)} - \langle v_i h_j \rangle_k^{(old)}) < 0 \end{cases} [Lopes et al., Towards adaptive learning with improved convergence of DBNs on GPUs, Pattern Recognition, 2014]
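The rule maps onto a small per-weight update. In this sketch the growth and shrink factors (u = 1.2, d = 0.8) are hypothetical values chosen for illustration, not those of Lopes et al.

    import numpy as np

    # eta, grad and grad_old are arrays of shape (J, I);
    # grad = <v h>_0 - <v h>_k for the current step, grad_old for the previous.
    def update_step_sizes(eta, grad, grad_old, u=1.2, d=0.8):
        prod = grad * grad_old
        eta = np.where(prod > 0, eta * u, eta)   # gradients agree: grow eta
        eta = np.where(prod < 0, eta * d, eta)   # gradients disagree: shrink
        return eta                               # unchanged where prod == 0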
  • 48. adaptive step size [Figure: three panels of average reconstruction error (RMSE) versus epoch (0-1000), for α = 0.1, 0.4 and 0.7, each comparing the adaptive step size against fixed settings γ = 0.1, 0.4 and 0.7]
  • 49. convergence results (α = 0.1) [Figure: training images and their reconstructions after 50, 100, 250, 500, 750 and 1000 epochs, comparing the adaptive step size against a fixed (optimized) learning rate η = 0.4]
  • 55. deep models characteristics ∙ Biological plausibility ∙ DBNs are effective in a wide range of ML problems. ∙ Creating a Deep Belief Network (DBN) model is a time-consuming and computationally expensive task that involves training several Restricted Boltzmann Machines (RBMs), demanding considerable effort. ∙ The adaptive step-size procedure for tuning the learning rate has been incorporated in the learning model with excellent results. ∙ Graphics Processing Units (GPUs) can significantly reduce the convergence time of the data-intensive tasks in DBNs.
  • 56. references
∙ Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127.
∙ Carreira-Perpiñán, M. A. and Hinton, G. E. (2005). On contrastive divergence learning. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), pages 33–40.
∙ Hinton, G. E. (2010). A practical guide to training restricted Boltzmann machines. Technical report, Department of Computer Science, University of Toronto.
∙ Larochelle, H., Erhan, D., Courville, A., Bergstra, J., and Bengio, Y. (2007). An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning (ICML 2007), pages 473–480. ACM.
∙ Roux, N. L. and Bengio, Y. (2008). Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, 20(6):1631–1649.
∙ Roux, N. L. and Bengio, Y. (2010). Deep belief networks are compact universal approximators. Neural Computation, 22(8):2192–2207.
  • 59. deep learning Algorithms and Applications Bernardete Ribeiro, bribeiro@dei.uc.pt June 24, 2015 University of Coimbra, Portugal INIT/AERFAI Summer School on Machine Learning, Benicassim 22-26 June 2015