SlideShare a Scribd company logo
1 of 25
RBM for unsupervised features
learning of handwritten digits
(MNIST dataset)
Neural networks
Feedforward ANN: compute
units states layer-by-layer
Hopfield networks
Recurrent ANN (Hopfield net): update
units until convergence
Hopfield networks
Energy minima correspond to learned patterns
Bolzmann machine
Stochastic neural network,
Has probability distribution defined:
Low energy states have higher probability
“Partition function”: normalization constant
(Boltzmann distribution is a probability distribution
of particles in a system over various possible states)
Bolzmann machine
Same form of energy as for Hopfield networks:
But, probabilistic units state update:
(Sigmoid function)
Bolzmann machine
v: Visible units, connected to observed data
h: Hidden units, capture dependancies and
patterns in visible variables
Interested in modeling observed data, so
marginalize over hidden variables (x for
visivle variables):
Defining “Free energy”
Allows to define
distribution
of visible states:
Bolzmann machine
Log-likelihood given a training sample (v):
Gradient with respect to model parameters:
RBM
RBM
Approximating gradient
“Contrastive divergence”: for model samples,
initialize Markov chain from the training sample
Gibbs sampling: alternating update of visible and hidden units
Contrastive divergence
Persistent Contrastive Divergence
Run several sampling chains in parallel
...
For example, one for each training sample
Mnist dataset
60,000 training images and 10,000 testing images
28 x 28 grayscale images
pixels connected
to 768 visible units
v0 v1 v2 v3 v4 v5 v6 v7 ... v768
h0 h1 h2 h3 h4 h5 h6 h7 ... h100
use 100 hidden units
Demo
v0 v1 v2 v3 v4 v5 v6 v7 ... v768
h0
Training code
// (positive phase)
// compute hidden nodes activations and probabilities
for (int i = 0; i < n_hidden; i++) {
h_prob[i] = 0.0f;
for (int j = 0; j < n_visible; j++) {
h_prob[i] += c[i] + v_data[j] * W[j * n_hidden + i];
}
h_prob[i] = sigmoid(h_prob[i]);
h[i] = ofRandom(1.0f) > h_prob[i] ? 0.0f : 1.0f;
}
for (int i = 0; i < n_visible * n_hidden; i++) {
pos_weights[i] = v_data[i / n_hidden] * h_prob[i % n_hidden];
}
// (negative phase)
// sample visible nodes
for (int i = 0; i < n_visible; i++) {
v_negative_prob[i] = 0.0f;
for (int j = 0; j < n_hidden; j++) {
v_negative_prob[i] += b[i] + h[j] * W[i * n_hidden + j];
}
v_negative_prob[i] = sigmoid(v_negative_prob[i]);
v_negative[i] = ofRandom(1.0f) > v_negative_prob[i] ? 0.0f :
1.0f;
}
Training code
// and hidden nodes once again
for (int i = 0; i < n_hidden; i++) {
h_negative_prob[i] = 0.0f;
for (int j = 0; j < n_visible; j++) {
h_negative_prob[i] += c[i] + v_negative[j] * W[j * n_hidden + i];
}
h_negative_prob[i] = sigmoid(h_negative_prob[i]);
h_negative[i] = ofRandom(1.0f) > h_negative_prob[i] ? 0.0f : 1.0f;
}
for (int i = 0; i < n_visible * n_hidden; i++) {
neg_weights[i] = v_negative[i / n_hidden] * h_negative_prob[i %
n_hidden];
}
// update weights
for (int i = 0; i < n_visible * n_hidden; i++) {
W[i] += learning_rate * (pos_weights[i] - neg_weights[i]);
}
Training
● Matrix multiplication
● Mini-batch training
● Regularization
● Weights update momentum
Training code
// (positive phase)
clear(h_data_prob, n_hidden * batch_size);
multiply(v_data, W, h_data_prob, batch_size, n_visible,
n_hidden);
getState(h_data_prob, h_data, n_hidden * batch_size);
transpose(v_data, vt, batch_size, n_visible);
multiply(vt, h_data, pos_weights, n_visible, batch_size,
n_hidden);
for (int cdk = 0; cdk < k; cdk++) {
// run update for CD1 or persistent chain for PCD
clear(v_prob, n_visible * batch_size);
transpose(W, Wt, n_visible, n_hidden);
multiply(h, Wt, v_prob, batch_size, n_hidden, n_visible);
getState(v_prob, v, n_visible * batch_size);
// and hidden nodes
clear(h_prob, n_hidden * batch_size);
multiply(v, W, h_prob, batch_size, n_visible, n_hidden);
getState(h_prob, h, n_hidden * batch_size);
}
Training code
for (int i = 0; i < n_visible * n_hidden; i++) {
W_inc[i] *= momentum;
W_inc[i] += learning_rate * (pos_weights[i] -
neg_weights[i])
/ (float) batch_size - weightcost * W[i];
W[i] += W_inc[i];
}
Sparseness & Selectivity
Sparseness & Selectivity
// increasing selectivity
float activity_smoothing = 0.99f;
float total_active = 0.0f;
for (int i = 0; i < n_hidden; i++) {
float activity = pos_weights[i] / (float)batch_size;
mean_activity[i] = mean_activity[i] *
activity_smoothing
+ activity * (1.0f - activity_smoothing);
W[i] += (0.01f - mean_activity[i]) * 0.01f;
total_active += activity;
}
// increasing sparseness
q = activity_smoothing * q
+ (1.0f - activity_smoothing) * (total_active / (float)
n_hidden);
for (int i = 0; i < n_hidden; i++) {
W[i] += (0.1f - q) * 0.01f;
}
Classification with RBM
Classification with RBM
● inputs for some standard discriminative
method (classifier)
● train a separate RBM on each class
Greedy layer-wise training
Higher layers capture more abstract features:

More Related Content

What's hot

15 Machine Learning Multilayer Perceptron
15 Machine Learning Multilayer Perceptron15 Machine Learning Multilayer Perceptron
15 Machine Learning Multilayer PerceptronAndres Mendez-Vazquez
 
Back propagation network
Back propagation networkBack propagation network
Back propagation networkHIRA Zaidi
 
Backpropagation
BackpropagationBackpropagation
Backpropagationariffast
 
Deep Style: Using Variational Auto-encoders for Image Generation
Deep Style: Using Variational Auto-encoders for Image GenerationDeep Style: Using Variational Auto-encoders for Image Generation
Deep Style: Using Variational Auto-encoders for Image GenerationTJ Torres
 
Artificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimizationArtificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimizationMohammed Bennamoun
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier홍배 김
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function홍배 김
 
Convolution Neural Networks
Convolution Neural NetworksConvolution Neural Networks
Convolution Neural NetworksAhmedMahany
 
2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layers2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layersJAEMINJEONG5
 
Neural network in matlab
Neural network in matlab Neural network in matlab
Neural network in matlab Fahim Khan
 
Multi Layer Perceptron & Back Propagation
Multi Layer Perceptron & Back PropagationMulti Layer Perceptron & Back Propagation
Multi Layer Perceptron & Back PropagationSung-ju Kim
 
Learning in Networks: were Pavlov and Hebb right?
Learning in Networks: were Pavlov and Hebb right?Learning in Networks: were Pavlov and Hebb right?
Learning in Networks: were Pavlov and Hebb right?Victor Miagkikh
 
Auto encoders in Deep Learning
Auto encoders in Deep LearningAuto encoders in Deep Learning
Auto encoders in Deep LearningShajun Nisha
 
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...Simplilearn
 
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlowLearning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlowAltoros
 

What's hot (20)

Lesson 38
Lesson 38Lesson 38
Lesson 38
 
15 Machine Learning Multilayer Perceptron
15 Machine Learning Multilayer Perceptron15 Machine Learning Multilayer Perceptron
15 Machine Learning Multilayer Perceptron
 
Back propagation network
Back propagation networkBack propagation network
Back propagation network
 
Backpropagation
BackpropagationBackpropagation
Backpropagation
 
Deep Style: Using Variational Auto-encoders for Image Generation
Deep Style: Using Variational Auto-encoders for Image GenerationDeep Style: Using Variational Auto-encoders for Image Generation
Deep Style: Using Variational Auto-encoders for Image Generation
 
Artificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimizationArtificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimization
 
Max net
Max netMax net
Max net
 
Back propagation method
Back propagation methodBack propagation method
Back propagation method
 
nn network
nn networknn network
nn network
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 
Convolution Neural Networks
Convolution Neural NetworksConvolution Neural Networks
Convolution Neural Networks
 
2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layers2021 03-01-on the relationship between self-attention and convolutional layers
2021 03-01-on the relationship between self-attention and convolutional layers
 
Neural network in matlab
Neural network in matlab Neural network in matlab
Neural network in matlab
 
Unit 1
Unit 1Unit 1
Unit 1
 
Multi Layer Perceptron & Back Propagation
Multi Layer Perceptron & Back PropagationMulti Layer Perceptron & Back Propagation
Multi Layer Perceptron & Back Propagation
 
Learning in Networks: were Pavlov and Hebb right?
Learning in Networks: were Pavlov and Hebb right?Learning in Networks: were Pavlov and Hebb right?
Learning in Networks: were Pavlov and Hebb right?
 
Auto encoders in Deep Learning
Auto encoders in Deep LearningAuto encoders in Deep Learning
Auto encoders in Deep Learning
 
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
 
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlowLearning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
 

Similar to Introduction to RBM for written digits recognition

How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITEgor Bogatov
 
H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14Sri Ambati
 
Java Performance Puzzlers
Java Performance PuzzlersJava Performance Puzzlers
Java Performance PuzzlersDoug Hawkins
 
Lecture 5: Functional Programming
Lecture 5: Functional ProgrammingLecture 5: Functional Programming
Lecture 5: Functional ProgrammingEelco Visser
 
Back propagation
Back propagationBack propagation
Back propagationNagarajan
 
Machine Learning and Go. Go!
Machine Learning and Go. Go!Machine Learning and Go. Go!
Machine Learning and Go. Go!Diana Ortega
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Compiler Construction | Lecture 11 | Monotone Frameworks
Compiler Construction | Lecture 11 | Monotone FrameworksCompiler Construction | Lecture 11 | Monotone Frameworks
Compiler Construction | Lecture 11 | Monotone FrameworksEelco Visser
 
Tutorial on convolutional neural networks
Tutorial on convolutional neural networksTutorial on convolutional neural networks
Tutorial on convolutional neural networksHojin Yang
 
The current program only run one iteration of the KMeans algorithm. .docx
The current program only run one iteration of the KMeans algorithm. .docxThe current program only run one iteration of the KMeans algorithm. .docx
The current program only run one iteration of the KMeans algorithm. .docxtodd241
 
Anomalies in X-Ray Engine
Anomalies in X-Ray EngineAnomalies in X-Ray Engine
Anomalies in X-Ray EnginePVS-Studio
 
Frequency Response with MATLAB Examples.pdf
Frequency Response with MATLAB Examples.pdfFrequency Response with MATLAB Examples.pdf
Frequency Response with MATLAB Examples.pdfSunil Manjani
 
Pointcuts and Analysis
Pointcuts and AnalysisPointcuts and Analysis
Pointcuts and AnalysisWiwat Ruengmee
 
8. Recursion.pptx
8. Recursion.pptx8. Recursion.pptx
8. Recursion.pptxAbid523408
 
Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...
Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...
Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...Flink Forward
 
Hypercritical C++ Code Review
Hypercritical C++ Code ReviewHypercritical C++ Code Review
Hypercritical C++ Code ReviewAndrey Karpov
 
Java Keeps Throttling Up!
Java Keeps Throttling Up!Java Keeps Throttling Up!
Java Keeps Throttling Up!José Paumard
 

Similar to Introduction to RBM for written digits recognition (20)

How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
 
weatherr.pptx
weatherr.pptxweatherr.pptx
weatherr.pptx
 
H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14
 
Java Performance Puzzlers
Java Performance PuzzlersJava Performance Puzzlers
Java Performance Puzzlers
 
Lecture 5: Functional Programming
Lecture 5: Functional ProgrammingLecture 5: Functional Programming
Lecture 5: Functional Programming
 
Back propagation
Back propagationBack propagation
Back propagation
 
Machine Learning and Go. Go!
Machine Learning and Go. Go!Machine Learning and Go. Go!
Machine Learning and Go. Go!
 
Introduction
IntroductionIntroduction
Introduction
 
Compiler Construction | Lecture 11 | Monotone Frameworks
Compiler Construction | Lecture 11 | Monotone FrameworksCompiler Construction | Lecture 11 | Monotone Frameworks
Compiler Construction | Lecture 11 | Monotone Frameworks
 
Tutorial on convolutional neural networks
Tutorial on convolutional neural networksTutorial on convolutional neural networks
Tutorial on convolutional neural networks
 
The current program only run one iteration of the KMeans algorithm. .docx
The current program only run one iteration of the KMeans algorithm. .docxThe current program only run one iteration of the KMeans algorithm. .docx
The current program only run one iteration of the KMeans algorithm. .docx
 
Anomalies in X-Ray Engine
Anomalies in X-Ray EngineAnomalies in X-Ray Engine
Anomalies in X-Ray Engine
 
OSCh7
OSCh7OSCh7
OSCh7
 
OS_Ch7
OS_Ch7OS_Ch7
OS_Ch7
 
Frequency Response with MATLAB Examples.pdf
Frequency Response with MATLAB Examples.pdfFrequency Response with MATLAB Examples.pdf
Frequency Response with MATLAB Examples.pdf
 
Pointcuts and Analysis
Pointcuts and AnalysisPointcuts and Analysis
Pointcuts and Analysis
 
8. Recursion.pptx
8. Recursion.pptx8. Recursion.pptx
8. Recursion.pptx
 
Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...
Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...
Flink Forward Berlin 2017: David Rodriguez - The Approximate Filter, Join, an...
 
Hypercritical C++ Code Review
Hypercritical C++ Code ReviewHypercritical C++ Code Review
Hypercritical C++ Code Review
 
Java Keeps Throttling Up!
Java Keeps Throttling Up!Java Keeps Throttling Up!
Java Keeps Throttling Up!
 

Recently uploaded

WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in sowetomasabamasaba
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...masabamasaba
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Bert Jan Schrijver
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...masabamasaba
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...masabamasaba
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyviewmasabamasaba
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is insideshinachiaurasa2
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburgmasabamasaba
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareJim McKeeth
 

Recently uploaded (20)

WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 

Introduction to RBM for written digits recognition

  • 1. RBM for unsupervised features learning of handwritten digits (MNIST dataset)
  • 2. Neural networks Feedforward ANN: compute units states layer-by-layer
  • 3. Hopfield networks Recurrent ANN (Hopfield net): update units until convergence
  • 4. Hopfield networks Energy minima correspond to learned patterns
  • 5. Bolzmann machine Stochastic neural network, Has probability distribution defined: Low energy states have higher probability “Partition function”: normalization constant (Boltzmann distribution is a probability distribution of particles in a system over various possible states)
  • 6. Bolzmann machine Same form of energy as for Hopfield networks: But, probabilistic units state update: (Sigmoid function)
  • 7. Bolzmann machine v: Visible units, connected to observed data h: Hidden units, capture dependancies and patterns in visible variables Interested in modeling observed data, so marginalize over hidden variables (x for visivle variables): Defining “Free energy” Allows to define distribution of visible states:
  • 8. Bolzmann machine Log-likelihood given a training sample (v): Gradient with respect to model parameters:
  • 9. RBM
  • 10. RBM
  • 11. Approximating gradient “Contrastive divergence”: for model samples, initialize Markov chain from the training sample Gibbs sampling: alternating update of visible and hidden units
  • 13. Persistent Contrastive Divergence Run several sampling chains in parallel ... For example, one for each training sample
  • 14. Mnist dataset 60,000 training images and 10,000 testing images 28 x 28 grayscale images pixels connected to 768 visible units v0 v1 v2 v3 v4 v5 v6 v7 ... v768 h0 h1 h2 h3 h4 h5 h6 h7 ... h100 use 100 hidden units
  • 15. Demo v0 v1 v2 v3 v4 v5 v6 v7 ... v768 h0
  • 16. Training code // (positive phase) // compute hidden nodes activations and probabilities for (int i = 0; i < n_hidden; i++) { h_prob[i] = 0.0f; for (int j = 0; j < n_visible; j++) { h_prob[i] += c[i] + v_data[j] * W[j * n_hidden + i]; } h_prob[i] = sigmoid(h_prob[i]); h[i] = ofRandom(1.0f) > h_prob[i] ? 0.0f : 1.0f; } for (int i = 0; i < n_visible * n_hidden; i++) { pos_weights[i] = v_data[i / n_hidden] * h_prob[i % n_hidden]; } // (negative phase) // sample visible nodes for (int i = 0; i < n_visible; i++) { v_negative_prob[i] = 0.0f; for (int j = 0; j < n_hidden; j++) { v_negative_prob[i] += b[i] + h[j] * W[i * n_hidden + j]; } v_negative_prob[i] = sigmoid(v_negative_prob[i]); v_negative[i] = ofRandom(1.0f) > v_negative_prob[i] ? 0.0f : 1.0f; }
  • 17. Training code // and hidden nodes once again for (int i = 0; i < n_hidden; i++) { h_negative_prob[i] = 0.0f; for (int j = 0; j < n_visible; j++) { h_negative_prob[i] += c[i] + v_negative[j] * W[j * n_hidden + i]; } h_negative_prob[i] = sigmoid(h_negative_prob[i]); h_negative[i] = ofRandom(1.0f) > h_negative_prob[i] ? 0.0f : 1.0f; } for (int i = 0; i < n_visible * n_hidden; i++) { neg_weights[i] = v_negative[i / n_hidden] * h_negative_prob[i % n_hidden]; } // update weights for (int i = 0; i < n_visible * n_hidden; i++) { W[i] += learning_rate * (pos_weights[i] - neg_weights[i]); }
  • 18. Training ● Matrix multiplication ● Mini-batch training ● Regularization ● Weights update momentum
  • 19. Training code // (positive phase) clear(h_data_prob, n_hidden * batch_size); multiply(v_data, W, h_data_prob, batch_size, n_visible, n_hidden); getState(h_data_prob, h_data, n_hidden * batch_size); transpose(v_data, vt, batch_size, n_visible); multiply(vt, h_data, pos_weights, n_visible, batch_size, n_hidden); for (int cdk = 0; cdk < k; cdk++) { // run update for CD1 or persistent chain for PCD clear(v_prob, n_visible * batch_size); transpose(W, Wt, n_visible, n_hidden); multiply(h, Wt, v_prob, batch_size, n_hidden, n_visible); getState(v_prob, v, n_visible * batch_size); // and hidden nodes clear(h_prob, n_hidden * batch_size); multiply(v, W, h_prob, batch_size, n_visible, n_hidden); getState(h_prob, h, n_hidden * batch_size); }
  • 20. Training code for (int i = 0; i < n_visible * n_hidden; i++) { W_inc[i] *= momentum; W_inc[i] += learning_rate * (pos_weights[i] - neg_weights[i]) / (float) batch_size - weightcost * W[i]; W[i] += W_inc[i]; }
  • 22. Sparseness & Selectivity // increasing selectivity float activity_smoothing = 0.99f; float total_active = 0.0f; for (int i = 0; i < n_hidden; i++) { float activity = pos_weights[i] / (float)batch_size; mean_activity[i] = mean_activity[i] * activity_smoothing + activity * (1.0f - activity_smoothing); W[i] += (0.01f - mean_activity[i]) * 0.01f; total_active += activity; } // increasing sparseness q = activity_smoothing * q + (1.0f - activity_smoothing) * (total_active / (float) n_hidden); for (int i = 0; i < n_hidden; i++) { W[i] += (0.1f - q) * 0.01f; }
  • 24. Classification with RBM ● inputs for some standard discriminative method (classifier) ● train a separate RBM on each class
  • 25. Greedy layer-wise training Higher layers capture more abstract features: