An Analysis of mnistclassify.m of
Hinton’s mnistdeepauto example
by Ali Riza SARAL
arsaral((at))yahoo.com
References:
Hinton’s «Lecture 12C _ Restricted Boltzmann Machines»
Hugo Larochelle’s «Neural networks [5.2] _ Restricted Boltzmann machine – inference»
Hugo Larochelle’s «Neural networks [5.4] _ Restricted Boltzmann machine - contrastive divergence»
mnistclassify
• clear all; close all;
• maxepoch=1; % maxepoch=50;
• numhid=250; numpen=250; numpen2=50;
• % numhid=500; numpen=500; numpen2=2000;
• fprintf(1,'Converting Raw files into Matlab format \n');
• converter;dos('erase *.ascii');
• fprintf(1,'Pretraining a deep autoencoder. \n');fprintf(1,'The Science paper used 50 epochs. This uses %3i \n', maxepoch);
• makebatches;
• [numcases numdims numbatches]=size(batchdata); % 100 784 600
Mnistclassify pretrains the network
• fprintf(1,'Pretraining Layer 1 with RBM: %d-%d \n',numdims,numhid); % 784 250
• restart=1;
• rbm;
• hidrecbiases=hidbiases; % 1 250
• save mnistvhclassify vishid hidrecbiases visbiases; % 784 250, 1 250, 1 784
• fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numhid,numpen); % 250 250
• batchdata=batchposhidprobs; % 100 250 600
• numhid=numpen; %250
• restart=1;
• rbm;
• hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases;
• save mnisthpclassify hidpen penrecbiases hidgenbiases;
• %hidpen 250 250, penrecbiases 1 250, hidgenbiases 1 250
• fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen,numpen2); % 250 50
• batchdata=batchposhidprobs;
• numhid=numpen2;
• restart=1;
• rbm;
• hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases;
• save mnisthp2classify hidpen2 penrecbiases2 hidgenbiases2;
• %hidpen2 250 50, penrecbiases2 1 50, hidgenbiases2 1 250
• backpropclassify; %improves the rbm result with backpropagation
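The rbm script invoked for each layer above runs contrastive divergence. As a rough NumPy sketch of one CD-1 update for a binary RBM (an illustrative translation, not Hinton's rbm.m; the function name and learning-rate handling are assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(data, vishid, hidbiases, visbiases, lr=0.1, rng=None):
    """One CD-1 step. data: (numcases, numdims), vishid: (numdims, numhid).
    Returns updated parameters plus the positive hidden probabilities,
    which play the role of batchposhidprobs for the next layer."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Positive phase: hidden probabilities given the data.
    poshidprobs = sigmoid(data @ vishid + hidbiases)      # (numcases, numhid)
    posprods = data.T @ poshidprobs
    # Sample binary hidden states, then reconstruct the visibles.
    hidstates = (poshidprobs > rng.random(poshidprobs.shape)).astype(float)
    negdata = sigmoid(hidstates @ vishid.T + visbiases)
    neghidprobs = sigmoid(negdata @ vishid + hidbiases)
    negprods = negdata.T @ neghidprobs
    n = data.shape[0]
    # Gradient step on the positive/negative statistics difference.
    vishid = vishid + lr * (posprods - negprods) / n
    visbiases = visbiases + lr * (data - negdata).mean(axis=0)
    hidbiases = hidbiases + lr * (poshidprobs - neghidprobs).mean(axis=0)
    return vishid, hidbiases, visbiases, poshidprobs
```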
Backpropclassify.m
• maxepoch=2; %maxepoch=200;
• fprintf(1,'\nTraining discriminative model on MNIST by minimizing cross entropy error. \n');
• fprintf(1,'60 batches of 1000 cases each. \n');
• load ...
• makebatches;
• [numcases numdims numbatches]=size(batchdata); % 100 784 600
• N=numcases; % 100
• %%%% PREINITIALIZE WEIGHTS OF THE DISCRIMINATIVE MODEL
• w1, w2, w3
• %%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS
• set the layer lengths l1, l2, l3, l4, l5
Backpropclassify.m
• test_err=[];
• train_err=[];
• for epoch = 1:maxepoch
•
• %%%%%%%%%%%%%%%%%%%% COMPUTE TRAINING MISCLASSIFICATION ERROR
• %%%%%%%%%%%%%% END OF COMPUTING TRAINING MISCLASSIFICATION ERROR
• %%%%%%%%%%%%%%%%%%%% COMPUTE TEST MISCLASSIFICATION ERROR
• %%%%%%%%%%%%%% END OF COMPUTING TEST MISCLASSIFICATION ERROR
Backpropclassify.m
• for epoch = 1:maxepoch
•
• %%%%%%%%%%%%%%%%%%%% COMPUTE TRAINING MISCLASSIFICATION ERROR
• ...
• %%%%%%%%%%%%%%%%%%%% COMPUTE TEST MISCLASSIFICATION ERROR
• ...
• for batch = 1:numbatches/10
• fprintf(1,'epoch %d batch %d\n',epoch,batch);
• %%%%%%%%%%% COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• %%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• if epoch<2 % original: 6. First update top-level weights, holding other weights fixed.
• [X, fX] = minimize(VV,'CG_CLASSIFY_INIT',max_iter,Dim,w3probs,targets);
• % 510 1, 4 1 = minimize(510 1,'CG_CLASSIFY_INIT', 3, 2 1, 1000 51, 1000 10)
• else
• [X, fX] = minimize(VV,'CG_CLASSIFY',max_iter,Dim,data,targets);
• % 272060 1, 4 1 = minimize(272060 1,'CG_CLASSIFY', 3, 5 1, 1000 784, 1000 10)
• end
• %%%%%%%%%%%%%%% END OF CONJUGATE GRADIENT WITH 3 LINESEARCHES
• end
• save mnistclassify_weights w1 w2 w3 w_class
• save mnistclassify_error test_err test_crerr train_err train_crerr;
• end
COMPUTE TRAINING MISCLASSIFICATION ERROR
• err=0;
• err_cr=0;
• counter=0;
• [numcases numdims numbatches]=size(batchdata); % 100 784 600
• N=numcases; % 100
COMPUTE TRAINING MISCLASSIFICATION ERROR
• for batch = 1:numbatches % 1 : 600
• data = [batchdata(:,:,batch)]; % 100 784
• target = [batchtargets(:,:,batch)]; % 100 10
• data = [data ones(N,1)]; % 100 785
• w1probs = 1./(1 + exp(-data*w1));
• w1probs = [w1probs ones(N,1)]; % 100 785 * 785 250 = 100 250 -> 100 251
• w2probs = 1./(1 + exp(-w1probs*w2));
• w2probs = [w2probs ones(N,1)]; % 100 251 * 251 250 = 100 250 -> 100 251
• w3probs = 1./(1 + exp(-w2probs*w3));
• w3probs = [w3probs ones(N,1)]; % 100 251 * 251 50 = 100 50 -> 100 51
• targetout = exp(w3probs*w_class); % 100 51 * 51 10 = 100 10
• targetout = targetout./repmat(sum(targetout,2),1,10);
• % 100 10 = 100 10 ./ repmat(100 1, 1, 10) = 100 10
• ...
• end
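The batch loop above can be sketched in NumPy terms (an illustrative translation of the MATLAB logic, not the original code). Note how each layer appends a bias column of ones before the next multiply, and how the exp/normalize pair is just a row-wise softmax:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(data, w1, w2, w3, w_class):
    """Forward pass as on the slides: sigmoid layers with appended bias columns,
    then a softmax over the 10 class outputs."""
    n = data.shape[0]
    x = np.hstack([data, np.ones((n, 1))])                 # 100 784 -> 100 785
    h1 = np.hstack([sigmoid(x @ w1), np.ones((n, 1))])     # 100 250 -> 100 251
    h2 = np.hstack([sigmoid(h1 @ w2), np.ones((n, 1))])    # 100 250 -> 100 251
    h3 = np.hstack([sigmoid(h2 @ w3), np.ones((n, 1))])    # 100 50  -> 100 51
    out = np.exp(h3 @ w_class)                             # 100 10
    return out / out.sum(axis=1, keepdims=True)            # rows sum to 1
```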
COMPUTE TRAINING MISCLASSIFICATION ERROR
• for batch = 1:numbatches % 1 : 600
• ...
• [I J]=max(targetout,[],2);
• % 100 1, 100 1 --> I holds the max value, J its column index
• [I1 J1]=max(target,[],2); % max(100 10,[],2) 100 1
•
• counter=counter+length(find(J==J1)); % =6 for the first batch
• err_cr = err_cr - sum(sum( target(:,1:end).*log(targetout))); % cross entropy
• end
• train_err(epoch)=(numcases*numbatches-counter);
• % total number of errors for all the batches in this epoch
• train_crerr(epoch)=err_cr/numbatches;
• % total cross entropy error for the complete batchdata in this epoch
COMPUTE TEST MISCLASSIFICATION ERROR
• err=0;
• err_cr=0;
• counter=0;
• [testnumcases testnumdims testnumbatches]=size(testbatchdata); % 100 784 100
• N=testnumcases; % 100
• for batch = 1:testnumbatches % 1: 100
COMPUTE TEST MISCLASSIFICATION ERROR
• for batch = 1:testnumbatches % 1: 100
• data = [testbatchdata(:,:,batch)]; % 100 784
• target = [testbatchtargets(:,:,batch)]; % 100 10
• data = [data ones(N,1)]; % 100 785
• w1probs = 1./(1 + exp(-data*w1));
• w1probs = [w1probs ones(N,1)]; % 100 785 * 785 250 = 100 250 -> 100 251
• w2probs = 1./(1 + exp(-w1probs*w2));
• w2probs = [w2probs ones(N,1)]; % 100 251 * 251 250 = 100 250 -> 100 251
• w3probs = 1./(1 + exp(-w2probs*w3));
• w3probs = [w3probs ones(N,1)]; % 100 251 * 251 50 = 100 50 -> 100 51
• targetout = exp(w3probs*w_class); % 100 51 * 51 10 = 100 10
• targetout = targetout./repmat(sum(targetout,2),1,10);
• % 100 10 = 100 10 ./ repmat(100 1, 1, 10) = 100 10
COMPUTE TEST MISCLASSIFICATION ERROR
• [I J]=max(targetout,[],2);
• % 100 1, 100 1 --> I holds the max value, J its column index
• [I1 J1]=max(target,[],2); % max(100 10,[],2) 100 1
• counter=counter+length(find(J==J1));
• % =9 for the first batch
• err_cr = err_cr - sum(sum( target(:,1:end).*log(targetout))); % cross entropy
• end
• test_err(epoch)=(testnumcases*testnumbatches-counter);
• % total number of errors for all the batches in this epoch
• test_crerr(epoch)=err_cr/testnumbatches;
• % total cross entropy error for the complete batchdata in this epoch
• fprintf(1,'Before epoch %d Train # misclassified: %d (from %d). Test # misclassified: %d (from %d) \t\t \n',...
epoch,train_err(epoch),numcases*numbatches,test_err(epoch),testnumcases*testnumbatches);
COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• tt=0;
• for batch = 1:numbatches/10
• fprintf(1,'epoch %d batch %d\n',epoch,batch);
• %%%%%%%%%% COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• tt=tt+1;
• data=[];
• targets=[];
• for kk=1:10
• data=[data; batchdata(:,:,(tt-1)*10+kk)]; % 1000 784
• targets=[targets; batchtargets(:,:,(tt-1)*10+kk)]; % 1000 10
• end
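The stacking above, translated into a NumPy sketch (the helper name is illustrative; vstack plays the role of MATLAB's `[data; ...]` vertical concatenation, and tt is kept 1-based as in the loop):

```python
import numpy as np

def combine_minibatches(batchdata, batchtargets, tt, k=10):
    """Stack k consecutive 100-case minibatches into one larger minibatch.
    batchdata: (100, numdims, numbatches), batchtargets: (100, 10, numbatches)."""
    idx = [(tt - 1) * k + kk for kk in range(k)]
    data = np.vstack([batchdata[:, :, i] for i in idx])        # (1000, numdims)
    targets = np.vstack([batchtargets[:, :, i] for i in idx])  # (1000, 10)
    return data, targets
```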
PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• %%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• max_iter=3; if epoch<2 % original: 6. First update top-level weights, holding other weights fixed.
• N = size(data,1); % 1000 (data is 1000 784)
• XX = [data ones(N,1)]; % 1000 785
• w1probs = 1./(1 + exp(-XX*w1));
• w1probs = [w1probs ones(N,1)]; % 1000 785 * 785 250 = 1000 250 -> 1000 251
• w2probs = 1./(1 + exp(-w1probs*w2));
• w2probs = [w2probs ones(N,1)]; % 1000 251 * 251 250 = 1000 250 -> 1000 251
• w3probs = 1./(1 + exp(-w2probs*w3)); %w3probs = [w3probs ones(N,1)]; % 1000 251 * 251 50 = 1000 50 -> 1000 51
• VV = [w_class(:)']'; % 51 10 = 510 1
• Dim = [l4; l5]; % 2 1
• [X, fX] = minimize(VV,'CG_CLASSIFY_INIT',max_iter,Dim,w3probs,targets); % 510 1, 4 1 = minimize(510 1,'CG_CLASSIFY_INIT', 3, 2 1, 1000 51, 1000 10)
• w_class = reshape(X,l4+1,l5); % reshape(X,51,10) = 51 10
Second update the rest of the weights
• else
• VV = [w1(:)' w2(:)' w3(:)' w_class(:)']'; % 272060 1
• Dim = [l1; l2; l3; l4; l5]; % 5 1
• [X, fX] = minimize(VV,'CG_CLASSIFY',max_iter,Dim,data,targets); % 272060 1, 4 1 = minimize(272060 1,'CG_CLASSIFY', 3, 5 1, 1000 784, 1000 10)
• w1 = reshape(X(1:(l1+1)*l2),l1+1,l2); % reshape(X(1 : 785*250), 785, 250) = 785 250
• xxx = (l1+1)*l2; % 785 * 250
• w2 = reshape(X(xxx+1:xxx+(l2+1)*l3),l2+1,l3); % reshape(X(xxx+1 : xxx+251*250), 251, 250) = 251 250
• xxx = xxx+(l2+1)*l3; % 785 * 250 + 251 * 250
• w3 = reshape(X(xxx+1:xxx+(l3+1)*l4),l3+1,l4); % reshape(X(xxx+1 : xxx+251*50), 251, 50) = 251 50
• xxx = xxx+(l3+1)*l4; % 271550
• w_class = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5); % 51 10
• end
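The flatten/reshape bookkeeping above can be sketched as a pack/unpack pair in NumPy (illustrative names; `order='F'` mirrors MATLAB's column-major `(:)` and `reshape`):

```python
import numpy as np

def pack(w1, w2, w3, w_class):
    """Flatten all weight matrices into one vector VV, MATLAB column order."""
    return np.concatenate([w.flatten(order='F') for w in (w1, w2, w3, w_class)])

def unpack(X, l1, l2, l3, l4, l5):
    """Inverse of pack: slice X back into w1, w2, w3, w_class.
    Each layer has one extra bias row, hence the +1 on the row counts."""
    shapes = [(l1 + 1, l2), (l2 + 1, l3), (l3 + 1, l4), (l4 + 1, l5)]
    out, pos = [], 0
    for r, c in shapes:
        out.append(X[pos:pos + r * c].reshape(r, c, order='F'))
        pos += r * c
    return out
```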
End of backpropclassify
• for epoch = 1:maxepoch
• ... The body of backpropclassify
• %%%%%%%%%%% END OF CONJUGATE GRADIENT WITH 3 LINESEARCHES
• end
• save mnistclassify_weights w1 w2 w3 w_class
• save mnistclassify_error test_err test_crerr train_err train_crerr;
• end
Outline
• There are two important loops in mnistclassify:
• the epoch loop and the batch loop.
• The batch loop processes the data in 600 batches.
• The epoch loop sets how many passes are made to approach the final result.
initialize
• load mnistvhclassify
• load mnisthpclassify
• load mnisthp2classify
• w1=[vishid; hidrecbiases]; % 784 250 + 1 250 = 785 250
• w2=[hidpen; penrecbiases]; % 250 250 + 1 250 = 251 250
• w3=[hidpen2; penrecbiases2]; % 250 50 + 1 50 = 251 50
• w_class = 0.1*randn(size(w3,2)+1,10); % randn(51,10) = 51 10
The w1, w2, w3 weights are initialized from the pretrained RBM layers.
• w_class is randomly initialized.
Calculate misclassification
• The epoch loop has three main sections.
• The first two sections compute the misclassification error.
• Training and test data are used to calculate the activation probabilities w1probs, w2probs, w3probs.
• This is done in a batch loop, 600 times in each section.
• targetout is calculated using w3probs and w_class.
• targetout = exp(w3probs*w_class); % 100 51 * 51 10 = 100 10
Find the targetout and target
• targetout = targetout./repmat(sum(targetout,2),1,10);
• % 100 10 = 100 10 ./ repmat(100 1, 1, 10) = 100 10 -- normalize targetout
• Find the column index (1..10) of the maximum value in each row:
• [I J]=max(targetout,[],2);
• % 100 1, 100 1 --> I holds the max value, J its column index
• Find the same in target:
• [I1 J1]=max(target,[],2); % max(100 10,[],2) -> 100 1
End of misclassification calculation
• Count the number of correct results in this batch and calculate the cross entropy:
• counter=counter+length(find(J==J1)); % =6 for the first batch
• err_cr = err_cr - sum(sum( target(:,1:end).*log(targetout))); % cross entropy
• Repeat this for all 600 batches, accumulating the correct-answer counter and the cross entropy err_cr.
• fprintf prints these statistics for each epoch at the end of each test misclassification calculation.
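The counting step above can be sketched in NumPy (an illustrative translation; `argmax` plays the role of MATLAB's `[I J]=max(...,[],2)`, with 0-based indices instead of 1-based):

```python
import numpy as np

def count_correct(targetout, target):
    """Count rows where the predicted class (argmax of the softmax output)
    matches the label's argmax, as counter accumulates per batch."""
    J = targetout.argmax(axis=1)   # predicted class per case
    J1 = target.argmax(axis=1)     # true class per case
    return int((J == J1).sum())
```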
COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• for kk=1:10
• data=[data batchdata(:,:,(tt-1)*10+kk)]; % 1000 784
• targets=[targets batchtargets(:,:,(tt-1)*10+kk)]; % 1000 10
• End
• Calculate data and targets, namely data is
1000 items of 28x28 = 784 and targets 1000
items of 10 probabilities
PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• Calculate w1probs, w2probs, w3probs again for data
• VV = [w_class(:)']'; % 51 10 = 510 1
• Dim = [l4; l5]; % 2 1
•
• [X, fX] = minimize(VV,'CG_CLASSIFY_INIT',max_iter,Dim,w3probs,targets);
• % 510 1, 4 1 = min(510 1,'CG_CLASS...',3, 2 1, 1000 51, 1000 10
• Minimize w_class using w3probs as the data.
• Update w_class so that it is used in the next epoch:
• w_class = reshape(X,l4+1,l5); % reshape(X,51,10) = 51 10
PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• if epoch<2 % original 6 First update top-level weights holding other weights fixed.
• ... (previous page)
• else
• Minimize the weights and the w_class
• VV = [w1(:)' w2(:)' w3(:)' w_class(:)']'; % 272060 1
• [X, fX] = minimize(VV,'CG_CLASSIFY',max_iter,Dim,data,targets);
• % 272060 1, 4 1 = minimize(272060 1,'CG_CLASSIFY', 3, 5 1, 1000 784, 1000 10)
• w1 = reshape(X(1:(l1+1)*l2),l1+1,l2);
• w2 and w3 are reshaped similarly.
• w_class = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5); % 51 10
End of each epoche
• save mnistclassify_weights w1 w2 w3 w_class
• save mnistclassify_error test_err test_crerr train_err train_crerr;
THE PITH
• 1- We create the weight values of a net with RBMs. This net can reconstruct its own input: it is a feature map of the data it was trained on.
• 2- We make an epoch loop with batch loops inside it to iteratively approach the result.
• 3- We use the weights developed in the first stage and a randomly generated w_class to compute the misclassification error.
• We perform conjugate gradient with 3 line searches to minimize the weights and w_class.
THE PITH
• Note that
• [I J]=max(targetout,[],2); % 100 1, 100 1 --> I has the max value and J has its column index
• We use w_class to classify the digit.
• w_class is defined as:
• w_class = 0.1*randn(size(w3,2)+1,10); % randn(51,10) = 51 10
• w_class is a matrix of 51 rows and 10 columns.
• targetout = exp(w3probs*w_class); % 100 51 * 51 10 = 100 10
• targetout is a matrix of 100 rows and 10 columns and is normalized.
• [I J]=max(targetout,[],2); % 100 1, 100 1 --> I has the max value, J its column index
• J is the classified digit: it is the column index of the maximum probability in each row.
CG_CLASSIFY_INIT
• ...
• w_class = reshape(VV,l1+1,l2); % 51 10
• w3probs = [w3probs ones(N,1)]; % 1000 51
• targetout = exp(w3probs*w_class); % 1000 51 * 51 10 = 1000 10
• targetout = targetout./repmat(sum(targetout,2),1,10); % repmat((1000 1),1,10) normalize
Normalize mechanism
• %{
• debug> a=[1,2,3;4,5,6]
• a =
• 1 2 3
• 4 5 6
• debug> sum(a,2)
• ans =
• 6
• 15
• debug> repmat(sum(a,2),1,10)
• ans =
• 6 6 6 6 6 6 6 6 6 6
• 15 15 15 15 15 15 15 15 15 15
• %}
Backpropagate the error
• f = -sum(sum( target(:,1:end).*log(targetout))) ; % cross entropy
• IO = (targetout-target(:,1:end)); % 1000 10
• Ix_class=IO;
• dw_class = w3probs'*Ix_class; % 1000 51' * 1000 10 = 51 10
• df = [dw_class(:)']'; % 510 1
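The cost and gradient above can be sketched in NumPy (an illustrative translation): for softmax plus cross entropy the error signal simplifies to targetout - target, which is exactly why IO is formed that way on the slide.

```python
import numpy as np

def top_layer_grad(w3probs, w_class, target):
    """Cross-entropy cost f and its gradient dw_class w.r.t. w_class,
    matching CG_CLASSIFY_INIT: softmax output, then w3probs' * (out - target)."""
    out = np.exp(w3probs @ w_class)
    targetout = out / out.sum(axis=1, keepdims=True)   # softmax rows
    f = -np.sum(target * np.log(targetout))            # cross entropy
    dw_class = w3probs.T @ (targetout - target)        # e.g. (51, 10)
    return f, dw_class
```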
CG_CLASSIFY
• Reshape VV to produce w1,w2,w3,w_class
• Produce w1probs, w2probs, w3probs
• Produce targetout from w3probs and w_class
• Produce cost value f using target and targetout
• Produce dw1, dw2,dw3 and dw_class
• df = [dw1(:)' dw2(:)' dw3(:)' dw_class(:)']';
• % 272060 1

More Related Content

What's hot

Concurrency in Golang
Concurrency in GolangConcurrency in Golang
Concurrency in GolangOliver N
 
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014PyData
 
The Ring programming language version 1.6 book - Part 82 of 189
The Ring programming language version 1.6 book - Part 82 of 189The Ring programming language version 1.6 book - Part 82 of 189
The Ring programming language version 1.6 book - Part 82 of 189Mahmoud Samir Fayed
 
Basic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python ProgrammersBasic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python ProgrammersAppier
 
Go Concurrency
Go ConcurrencyGo Concurrency
Go Concurrencyjgrahamc
 
Goroutines and Channels in practice
Goroutines and Channels in practiceGoroutines and Channels in practice
Goroutines and Channels in practiceGuilherme Garnier
 
Protocol handler in Gecko
Protocol handler in GeckoProtocol handler in Gecko
Protocol handler in GeckoChih-Hsuan Kuo
 
Assignment no39
Assignment no39Assignment no39
Assignment no39Jay Patel
 
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片Chyi-Tsong Chen
 
Ownership System in Rust
Ownership System in RustOwnership System in Rust
Ownership System in RustChih-Hsuan Kuo
 
Node.js behind: V8 and its optimizations
Node.js behind: V8 and its optimizationsNode.js behind: V8 and its optimizations
Node.js behind: V8 and its optimizationsDawid Rusnak
 
Java Questions
Java QuestionsJava Questions
Java Questionsbindur87
 
Demystifying the Go Scheduler
Demystifying the Go SchedulerDemystifying the Go Scheduler
Demystifying the Go Schedulermatthewrdale
 
TensorFlow local Python XLA client
TensorFlow local Python XLA clientTensorFlow local Python XLA client
TensorFlow local Python XLA clientMr. Vengineer
 

What's hot (17)

Concurrency in Golang
Concurrency in GolangConcurrency in Golang
Concurrency in Golang
 
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
 
The Ring programming language version 1.6 book - Part 82 of 189
The Ring programming language version 1.6 book - Part 82 of 189The Ring programming language version 1.6 book - Part 82 of 189
The Ring programming language version 1.6 book - Part 82 of 189
 
Basic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python ProgrammersBasic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python Programmers
 
Go Concurrency
Go ConcurrencyGo Concurrency
Go Concurrency
 
Goroutines and Channels in practice
Goroutines and Channels in practiceGoroutines and Channels in practice
Goroutines and Channels in practice
 
Protocol handler in Gecko
Protocol handler in GeckoProtocol handler in Gecko
Protocol handler in Gecko
 
Assignment no39
Assignment no39Assignment no39
Assignment no39
 
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
 
Rust
RustRust
Rust
 
Ownership System in Rust
Ownership System in RustOwnership System in Rust
Ownership System in Rust
 
Golang Channels
Golang ChannelsGolang Channels
Golang Channels
 
Node.js behind: V8 and its optimizations
Node.js behind: V8 and its optimizationsNode.js behind: V8 and its optimizations
Node.js behind: V8 and its optimizations
 
Java Questions
Java QuestionsJava Questions
Java Questions
 
Demystifying the Go Scheduler
Demystifying the Go SchedulerDemystifying the Go Scheduler
Demystifying the Go Scheduler
 
Zone IDA Proc
Zone IDA ProcZone IDA Proc
Zone IDA Proc
 
TensorFlow local Python XLA client
TensorFlow local Python XLA clientTensorFlow local Python XLA client
TensorFlow local Python XLA client
 

Similar to Mnistauto 5

Fourier series example
Fourier series exampleFourier series example
Fourier series exampleAbi finni
 
Incorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdfIncorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdfaartechindia
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Béo Tú
 
Theory to consider an inaccurate testing and how to determine the prior proba...
Theory to consider an inaccurate testing and how to determine the prior proba...Theory to consider an inaccurate testing and how to determine the prior proba...
Theory to consider an inaccurate testing and how to determine the prior proba...Toshiyuki Shimono
 
error 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docx
error 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docxerror 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docx
error 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docxSALU18
 
MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709Min-hyung Kim
 
Verilog Lecture4 2014
Verilog Lecture4 2014Verilog Lecture4 2014
Verilog Lecture4 2014Béo Tú
 
Mathematicians: Trust, but Verify
Mathematicians: Trust, but VerifyMathematicians: Trust, but Verify
Mathematicians: Trust, but VerifyAndrey Karpov
 

Similar to Mnistauto 5 (20)

Mnistauto 2
Mnistauto 2Mnistauto 2
Mnistauto 2
 
Looping
LoopingLooping
Looping
 
Fourier series example
Fourier series exampleFourier series example
Fourier series example
 
Incorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdfIncorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdf
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014
 
Mit6 094 iap10_lec03
Mit6 094 iap10_lec03Mit6 094 iap10_lec03
Mit6 094 iap10_lec03
 
PSOGlobalSearching
PSOGlobalSearchingPSOGlobalSearching
PSOGlobalSearching
 
Numerical methods generating polynomial
Numerical methods generating polynomialNumerical methods generating polynomial
Numerical methods generating polynomial
 
SPL 8 | Loop Statements in C
SPL 8 | Loop Statements in CSPL 8 | Loop Statements in C
SPL 8 | Loop Statements in C
 
Error analysis
Error analysisError analysis
Error analysis
 
Recursion in C
Recursion in CRecursion in C
Recursion in C
 
matlab_tutorial.ppt
matlab_tutorial.pptmatlab_tutorial.ppt
matlab_tutorial.ppt
 
matlab_tutorial.ppt
matlab_tutorial.pptmatlab_tutorial.ppt
matlab_tutorial.ppt
 
matlab_tutorial.ppt
matlab_tutorial.pptmatlab_tutorial.ppt
matlab_tutorial.ppt
 
Theory to consider an inaccurate testing and how to determine the prior proba...
Theory to consider an inaccurate testing and how to determine the prior proba...Theory to consider an inaccurate testing and how to determine the prior proba...
Theory to consider an inaccurate testing and how to determine the prior proba...
 
Mnistauto 4
Mnistauto 4Mnistauto 4
Mnistauto 4
 
error 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docx
error 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docxerror 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docx
error 2.pdf101316, 6(46 PM01_errorPage 1 of 5http.docx
 
MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709
 
Verilog Lecture4 2014
Verilog Lecture4 2014Verilog Lecture4 2014
Verilog Lecture4 2014
 
Mathematicians: Trust, but Verify
Mathematicians: Trust, but VerifyMathematicians: Trust, but Verify
Mathematicians: Trust, but Verify
 

More from Ali Rıza SARAL

On the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptxOn the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptxAli Rıza SARAL
 
Human assisted computer creativity
Human assisted computer creativityHuman assisted computer creativity
Human assisted computer creativityAli Rıza SARAL
 
20160308 ars writing large music works
20160308 ars writing large music works20160308 ars writing large music works
20160308 ars writing large music worksAli Rıza SARAL
 
AR+S The Role Of Abstraction In Human Computer Interaction
AR+S   The Role Of Abstraction In Human Computer InteractionAR+S   The Role Of Abstraction In Human Computer Interaction
AR+S The Role Of Abstraction In Human Computer InteractionAli Rıza SARAL
 

More from Ali Rıza SARAL (7)

On the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptxOn the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptx
 
Mnistauto 3
Mnistauto 3Mnistauto 3
Mnistauto 3
 
Mnistauto 1
Mnistauto 1Mnistauto 1
Mnistauto 1
 
Human assisted computer creativity
Human assisted computer creativityHuman assisted computer creativity
Human assisted computer creativity
 
20160308 ars writing large music works
20160308 ars writing large music works20160308 ars writing large music works
20160308 ars writing large music works
 
Komut satırı JAVA
Komut satırı JAVAKomut satırı JAVA
Komut satırı JAVA
 
AR+S The Role Of Abstraction In Human Computer Interaction
AR+S   The Role Of Abstraction In Human Computer InteractionAR+S   The Role Of Abstraction In Human Computer Interaction
AR+S The Role Of Abstraction In Human Computer Interaction
 

Recently uploaded

Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueBhangaleSonal
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfRagavanV2
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringmulugeta48
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 

Recently uploaded (20)

Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
data_management_and _data_science_cheat_sheet.pdf

Mnistauto 5

• 1. An Analysis of mnistclassify.m of Hinton’s mnistdeepauto example
by Ali Riza SARAL
arsaral((at))yahoo.com
References:
Hinton’s «Lecture 12C _ Restricted Boltzmann Machines»
Hugo Larochelle’s «Neural networks [5.2] _ Restricted Boltzmann machine – inference»
Hugo Larochelle’s «Neural networks [5.4] _ Restricted Boltzmann machine - contrastive divergence»
• 2. mnistclassify
• clear all; close all;
• maxepoch=1; % maxepoch=50;
• numhid=250; numpen=250; numpen2=50;
• % numhid=500; numpen=500; numpen2=2000;
• fprintf(1,'Converting Raw files into Matlab format \n');
• converter; dos('erase *.ascii');
• fprintf(1,'Pretraining a deep autoencoder. \n'); fprintf(1,'The Science paper used 50 epochs. This uses %3i \n', maxepoch);
• makebatches;
• [numcases numdims numbatches]=size(batchdata); % 100 784 600
• 3. mnistclassify pretrains the network
• fprintf(1,'Pretraining Layer 1 with RBM: %d-%d \n',numdims,numhid); % 784 250
• restart=1;
• rbm;
• hidrecbiases=hidbiases; % 1 250
• save mnistvhclassify vishid hidrecbiases visbiases; % 784 250, 1 250, 1 784
• fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numhid,numpen); % 250 250
• batchdata=batchposhidprobs; % 100 250 600
• numhid=numpen; % 250
• restart=1;
• rbm;
• hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases;
• save mnisthpclassify hidpen penrecbiases hidgenbiases;
• % hidpen 250 250, penrecbiases 1 250, hidgenbiases 1 250
• fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen,numpen2); % 250 50
• batchdata=batchposhidprobs;
• numhid=numpen2;
• restart=1;
• rbm;
• hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases;
• save mnisthp2classify hidpen2 penrecbiases2 hidgenbiases2;
• % hidpen2 250 50, penrecbiases2 1 50, hidgenbiases2 1 250
• backpropclassify; % improves the RBM result with backpropagation
• 4. Backpropclassify.m
• maxepoch=2; % maxepoch=200;
• fprintf(1,'\nTraining discriminative model on MNIST by minimizing cross entropy error. \n');
• fprintf(1,'60 batches of 1000 cases each. \n');
• load ...
• makebatches;
• [numcases numdims numbatches]=size(batchdata); % 100 784 600
• N=numcases; % 100
• %%%% PREINITIALIZE WEIGHTS OF THE DISCRIMINATIVE MODEL
• w1, w2, w3
• %%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS
• set l1,l2,l3,l4,l5 lengths
• 5. Backpropclassify.m
• test_err=[];
• train_err=[];
• for epoch = 1:maxepoch
• %%%%%%%%%%%%%%%%%%%% COMPUTE TRAINING MISCLASSIFICATION ERROR
• %%%%%%%%%%%%%% END OF COMPUTING TRAINING MISCLASSIFICATION ERROR
• %%%%%%%%%%%%%%%%%%%% COMPUTE TEST MISCLASSIFICATION ERROR
• %%%%%%%%%%%%%% END OF COMPUTING TEST MISCLASSIFICATION ERROR
• 6. Backpropclassify.m
• for epoch = 1:maxepoch
• %%%%%%%%%%%%%%%%%%%% COMPUTE TRAINING MISCLASSIFICATION ERROR
• ...
• %%%%%%%%%%%%%%%%%%%% COMPUTE TEST MISCLASSIFICATION ERROR
• ...
• for batch = 1:numbatches/10
• fprintf(1,'epoch %d batch %d\n',epoch,batch);
• %%%%%%%%%%% COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• %%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• if epoch<2 % original 6. First update top-level weights holding other weights fixed.
• [X, fX] = minimize(VV,'CG_CLASSIFY_INIT',max_iter,Dim,w3probs,targets);
• % 510 1, 4 1 = minimize(510 1,'CG_CLASS...',3, 2 1, 1000 51, 1000 10)
• else
• [X, fX] = minimize(VV,'CG_CLASSIFY',max_iter,Dim,data,targets);
• % 272060 1, 4 1 = minimize(272060 1,'CG_CLA..', 3, 5 1, 1000 784, 1000 10);
• end
• %%%%%%%%%%%%%%% END OF CONJUGATE GRADIENT WITH 3 LINESEARCHES
• end
• save mnistclassify_weights w1 w2 w3 w_class
• save mnistclassify_error test_err test_crerr train_err train_crerr;
• end
• 7. COMPUTE TRAINING MISCLASSIFICATION ERROR
• err=0;
• err_cr=0;
• counter=0;
• [numcases numdims numbatches]=size(batchdata); % 100 784 600
• N=numcases; % 100
• 8. COMPUTE TRAINING MISCLASSIFICATION ERROR
• for batch = 1:numbatches % 1 : 600
• data = [batchdata(:,:,batch)]; % 100 784
• target = [batchtargets(:,:,batch)]; % 100 10 600
• data = [data ones(N,1)]; % 100 785
• w1probs = 1./(1 + exp(-data*w1));
• w1probs = [w1probs ones(N,1)]; % 100 785 * 785 250 = 100 250 -> 100 251
• w2probs = 1./(1 + exp(-w1probs*w2));
• w2probs = [w2probs ones(N,1)]; % 100 251 * 251 250 = 100 250 -> 100 251
• w3probs = 1./(1 + exp(-w2probs*w3));
• w3probs = [w3probs ones(N,1)]; % 100 251 * 251 500 = 100 500 -> 100 501
• targetout = exp(w3probs*w_class); % 100 501 * 501 10 = 100 10
• targetout = targetout./repmat(sum(targetout,2),1,10);
• % 100 10 = 100 10 ./ repmat((100 1), 1, 10) = 100 10
• ...
• end
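The forward pass on this slide (sigmoid layers with a bias column of ones appended after each, then a softmax over 10 classes) can be sketched in Python/NumPy. This is an illustrative reconstruction of the MATLAB shown above, not Hinton's original code; the function name `forward` and the small random shapes used below are assumptions for the demo.

```python
import numpy as np

def forward(data, w1, w2, w3, w_class):
    """Forward pass mirroring the slide: each hidden layer is a logistic
    sigmoid, a column of ones is appended after every layer so the last
    weight row acts as a bias, and the output is a row-normalized softmax."""
    N = data.shape[0]
    x = np.hstack([data, np.ones((N, 1))])              # data = [data ones(N,1)]
    w1probs = 1.0 / (1.0 + np.exp(-x @ w1))
    w1probs = np.hstack([w1probs, np.ones((N, 1))])
    w2probs = 1.0 / (1.0 + np.exp(-w1probs @ w2))
    w2probs = np.hstack([w2probs, np.ones((N, 1))])
    w3probs = 1.0 / (1.0 + np.exp(-w2probs @ w3))
    w3probs = np.hstack([w3probs, np.ones((N, 1))])
    targetout = np.exp(w3probs @ w_class)
    targetout /= targetout.sum(axis=1, keepdims=True)   # ./repmat(sum(...,2),1,10)
    return targetout
```

Broadcasting with `keepdims=True` plays the role of MATLAB's `repmat` here; each output row sums to 1.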
• 9. COMPUTE TRAINING MISCLASSIFICATION ERROR
• for batch = 1:numbatches % 1 : 600
• ...
• [I J]=max(targetout,[],2);
• % 100 1, 100 1 --> I has the max value, J has its index
• [I1 J1]=max(target,[],2); % max(100 10,[],2) = 100 1
• counter=counter+length(find(J==J1)); % =6 for the first batch
• err_cr = err_cr - sum(sum( target(:,1:end).*log(targetout))); % cross entropy
• end
• train_err(epoch)=(numcases*numbatches-counter);
• % total number of errors over all batches in this epoch
• train_crerr(epoch)=err_cr/numbatches;
• % total cross-entropy error for the complete batchdata in this epoch
• 10. COMPUTE TEST MISCLASSIFICATION ERROR
• err=0;
• err_cr=0;
• counter=0;
• [testnumcases testnumdims testnumbatches]=size(testbatchdata);
• % 100 784 100
• N=testnumcases; % 100
• for batch = 1:testnumbatches % 1 : 100
• 11. COMPUTE TEST MISCLASSIFICATION ERROR
• for batch = 1:testnumbatches % 1 : 100
• data = [testbatchdata(:,:,batch)]; % 100 784
• target = [testbatchtargets(:,:,batch)]; % 100 10 = (100 10 100)(:,:,batch)
• data = [data ones(N,1)]; % 100 785
• w1probs = 1./(1 + exp(-data*w1));
• w1probs = [w1probs ones(N,1)]; % 100 785 * 785 250 = 100 250 -> 100 251
• w2probs = 1./(1 + exp(-w1probs*w2));
• w2probs = [w2probs ones(N,1)]; % 100 251 * 251 250 = 100 250 -> 100 251
• w3probs = 1./(1 + exp(-w2probs*w3));
• w3probs = [w3probs ones(N,1)]; % 100 251 * 251 50 = 100 50 -> 100 51
• targetout = exp(w3probs*w_class); % 100 51 * 51 10 = 100 10
• targetout = targetout./repmat(sum(targetout,2),1,10); % = 100 10 ./ repmat((100 1), 1, 10) = 100 10
• 12. COMPUTE TEST MISCLASSIFICATION ERROR
• [I J]=max(targetout,[],2);
• % 100 1, 100 1 --> I has the max value, J has its index
• [I1 J1]=max(target,[],2); % max(100 10,[],2) = 100 1
• counter=counter+length(find(J==J1)); % =9 for the first batch
• err_cr = err_cr - sum(sum( target(:,1:end).*log(targetout))); % cross entropy
• end
• test_err(epoch)=(testnumcases*testnumbatches-counter); % total number of errors over all batches in this epoch
• test_crerr(epoch)=err_cr/testnumbatches;
• % total cross-entropy error for the complete test batchdata in this epoch
• fprintf(1,'Before epoch %d Train # misclassified: %d (from %d). Test # misclassified: %d (from %d) \t\t\n',... epoch,train_err(epoch),numcases*numbatches,test_err(epoch),testnumcases*testnumbatches);
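The error bookkeeping on slides 9 and 12 (argmax along each row, counting matches, and accumulating the cross-entropy) could look like this in NumPy. A minimal sketch, not the original MATLAB; the function name `batch_errors` is an assumption.

```python
import numpy as np

def batch_errors(targetout, target):
    """Count correct classifications and the cross-entropy for one batch.
    targetout: (N,10) softmax outputs; target: (N,10) one-hot labels.
    np.argmax plays the role of [I J]=max(...,[],2): it returns the index
    of the row maximum (MATLAB's J), here 0-based instead of 1-based."""
    J = np.argmax(targetout, axis=1)    # predicted class per row
    J1 = np.argmax(target, axis=1)      # true class per row
    correct = int(np.sum(J == J1))      # counter = counter + length(find(J==J1))
    cross_entropy = -np.sum(target * np.log(targetout))
    return correct, cross_entropy
```

The misclassification count for the epoch is then `numcases*numbatches` minus the accumulated `correct`, as in `train_err(epoch)` above.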
• 13. COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• tt=0;
• for batch = 1:numbatches/10
• fprintf(1,'epoch %d batch %d\n',epoch,batch);
• %%%%%%%%%% COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• tt=tt+1;
• data=[];
• targets=[];
• for kk=1:10
• data=[data; batchdata(:,:,(tt-1)*10+kk)]; % 1000 784
• targets=[targets; batchtargets(:,:,(tt-1)*10+kk)]; % 1000 10
• end
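The stacking of 10 minibatches of 100 cases each into one 1000-case batch can be sketched in NumPy, keeping the MATLAB layout of a 3-D array with the batch index last. An illustrative reconstruction; the function name `combine_minibatches` is an assumption.

```python
import numpy as np

def combine_minibatches(batchdata, batchtargets, tt, k=10):
    """Concatenate minibatches (tt-1)*k+1 .. tt*k (1-based tt, as in the
    MATLAB loop) along the case dimension. batchdata has shape
    (cases, dims, batches); the result has k*cases rows."""
    idx = range((tt - 1) * k, tt * k)   # 0-based batch indices for this chunk
    data = np.vstack([batchdata[:, :, i] for i in idx])
    targets = np.vstack([batchtargets[:, :, i] for i in idx])
    return data, targets
```

With 100-case minibatches this yields the 1000 x 784 `data` and 1000 x 10 `targets` used by the conjugate-gradient step.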
• 14. PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• %%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• max_iter=3;
• if epoch<2 % original 6. First update top-level weights holding other weights fixed.
• N = size(data,1); % 1000 (of 1000 784)
• XX = [data ones(N,1)]; % 1000 785
• w1probs = 1./(1 + exp(-XX*w1));
• w1probs = [w1probs ones(N,1)]; % 1000 785 * 785 250 = 1000 250 -> 1000 251
• w2probs = 1./(1 + exp(-w1probs*w2));
• w2probs = [w2probs ones(N,1)]; % 1000 251 * 251 250 = 1000 250 -> 1000 251
• w3probs = 1./(1 + exp(-w2probs*w3)); %w3probs = [w3probs ones(N,1)]; % 1000 251 * 251 50 = 1000 50 -> 1000 51
• VV = [w_class(:)']'; % 51 10 = 510 1
• Dim = [l4; l5]; % 2 1
• [X, fX] = minimize(VV,'CG_CLASSIFY_INIT',max_iter,Dim,w3probs,targets); % 510 1, 4 1 = minimize(510 1,'CG_CLASS...',3, 2 1, 1000 51, 1000 10)
• w_class = reshape(X,l4+1,l5); % reshape(X,51,10) = 51 10
• 15. Second update the rest of the weights
• else
• VV = [w1(:)' w2(:)' w3(:)' w_class(:)']'; % 272060 1
• Dim = [l1; l2; l3; l4; l5]; % 5 1
• [X, fX] = minimize(VV,'CG_CLASSIFY',max_iter,Dim,data,targets); % 272060 1, 4 1 = minimize(272060 1,'CG_CLA..', 3, 5 1, 1000 784, 1000 10);
• w1 = reshape(X(1:(l1+1)*l2),l1+1,l2); % reshape(X(1:785*250),785,250) = 785 250
• xxx = (l1+1)*l2; % 785 * 250
• w2 = reshape(X(xxx+1:xxx+(l2+1)*l3),l2+1,l3); % reshape over the next 251*250 entries = 251 250
• xxx = xxx+(l2+1)*l3; % 785 * 250 + 251 * 250
• w3 = reshape(X(xxx+1:xxx+(l3+1)*l4),l3+1,l4); % reshape(X(xxx+1:xxx+251*50),251,50) = 251 50
• xxx = xxx+(l3+1)*l4; % 271550
• w_class = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5); % 51 10
• end
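The offset arithmetic on this slide, unpacking the flat vector X back into w1, w2, w3 and w_class (each with one extra bias row), is easy to get wrong. The same bookkeeping can be sketched in NumPy with the layer sizes as parameters rather than the fixed 272060 of the slide. Note one caveat: MATLAB's reshape is column-major while NumPy's default is row-major, so only the shapes, not the element order, match the original.

```python
import numpy as np

def unpack(X, l1, l2, l3, l4, l5):
    """Split the flat parameter vector X into the four weight matrices,
    each with one extra bias row, mirroring the MATLAB reshapes."""
    xxx = 0
    w1 = X[xxx:xxx + (l1 + 1) * l2].reshape(l1 + 1, l2)
    xxx += (l1 + 1) * l2
    w2 = X[xxx:xxx + (l2 + 1) * l3].reshape(l2 + 1, l3)
    xxx += (l2 + 1) * l3
    w3 = X[xxx:xxx + (l3 + 1) * l4].reshape(l3 + 1, l4)
    xxx += (l3 + 1) * l4
    w_class = X[xxx:xxx + (l4 + 1) * l5].reshape(l4 + 1, l5)
    return w1, w2, w3, w_class
```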
• 16. End of backpropclassify
• for epoch = 1:maxepoch
• ... the body of backpropclassify
• %%%%%%%%%%% END OF CONJUGATE GRADIENT WITH 3 LINESEARCHES
• end
• save mnistclassify_weights w1 w2 w3 w_class
• save mnistclassify_error test_err test_crerr train_err train_crerr;
• end
• 17. Outline
• There are two important loops in mnistclassify: the epoch loop and the batch loop.
• The batch loop handles the data in 600 batches.
• The epoch loop determines how many passes are made to approach the final result.
• 18. initialize
• load mnistvhclassify
• load mnisthpclassify
• load mnisthp2classify
• w1=[vishid; hidrecbiases]; % 784 250 + 1 250 = 785 250
• w2=[hidpen; penrecbiases]; % 250 250 + 1 250 = 251 250
• w3=[hidpen2; penrecbiases2]; % 250 500 + 1 500 = 251 500
• w_class = 0.1*randn(size(w3,2)+1,10); % randn(501,10) = 501 10
• The w weights are initialized.
• w_class is randomly initialized.
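Stacking the RBM bias row under the weight matrix, as in w1=[vishid; hidrecbiases], is what lets the ones-column appended to the data pick up the biases during the matrix product. A minimal NumPy sketch of the same construction; the function name `stack_with_bias` is an assumption.

```python
import numpy as np

def stack_with_bias(vishid, hidrecbiases):
    """Stack the (visible x hidden) weight matrix on top of the
    (1 x hidden) bias row, mirroring w1=[vishid; hidrecbiases].
    The trailing ones-column in [data ones(N,1)] then multiplies
    this last row, adding the biases to every case."""
    return np.vstack([vishid, np.asarray(hidrecbiases).reshape(1, -1)])
```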
• 19. Calculate misclassification
• The epoch loop has three main sections.
• The first two sections compute the misclassification error.
• Training and test data are used to calculate the probabilities w1probs, w2probs, w3probs.
• This is done in a batch loop 600 times in each section.
• targetout is calculated using w3probs and w_class:
• targetout = exp(w3probs*w_class); % 100 501 * 501 10 = 100 10
• 20. Find the targetout and target
• Normalize targetout:
• targetout = targetout./repmat(sum(targetout,2),1,10);
• % 100 10 = 100 10 ./ repmat((100 1), 1, 10) = 100 10
• Find the index (1..10) of the maximum value in each row:
• [I J]=max(targetout,[],2);
• % 100 1, 100 1 --> I has the max value, J has its index
• Find the same in target:
• [I1 J1]=max(target,[],2); % max(100 10,[],2) = 100 1
• 21. End of misclassification calculation
• Count the number of correct results in this batch and calculate the cross entropy:
• counter=counter+length(find(J==J1)); % =6 for the first batch
• err_cr = err_cr - sum(sum( target(:,1:end).*log(targetout))); % cross entropy
• Repeat these for all 600 batches, accumulating the error counter and the cross entropy err_cr.
• fprintf this statistic for each epoch at the end of each test misclassification calculation.
• 22. COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• for kk=1:10
• data=[data; batchdata(:,:,(tt-1)*10+kk)]; % 1000 784
• targets=[targets; batchtargets(:,:,(tt-1)*10+kk)]; % 1000 10
• end
• This builds data and targets: data is 1000 items of 28x28 = 784 pixels, and targets is 1000 items of 10 probabilities.
• 23. PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• Calculate w1probs, w2probs, w3probs again for data.
• VV = [w_class(:)']'; % 51 10 = 510 1
• Dim = [l4; l5]; % 2 1
• [X, fX] = minimize(VV,'CG_CLASSIFY_INIT',max_iter,Dim,w3probs,targets);
• % 510 1, 4 1 = minimize(510 1,'CG_CLASS...',3, 2 1, 1000 51, 1000 10)
• Minimize w_class using w3probs as data.
• Update w_class so that it will be used in the next epoch:
• w_class = reshape(X,l4+1,l5); % reshape(X,51,10) = 51 10
• 24. PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• if epoch<2 % original 6. First update top-level weights holding other weights fixed.
• ... previous page
• else
• Minimize the weights and w_class:
• VV = [w1(:)' w2(:)' w3(:)' w_class(:)']'; % 272060 1
• [X, fX] = minimize(VV,'CG_CLASSIFY',max_iter,Dim,data,targets);
• % 272060 1, 4 1 = minimize(272060 1,'CG_CLA..', 3, 5 1, 1000 784, 1000 10);
• w1 = reshape(X(1:(l1+1)*l2),l1+1,l2);
• w2, w3 likewise
• w_class = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5); % 51 10
• 25. End of each epoch
• save mnistclassify_weights w1 w2 w3 w_class
• save mnistclassify_error test_err test_crerr train_err train_crerr;
• 26. THE PITH
• 1- We create the weight values of a net with RBM pretraining. This net can reconstruct its input; it is a feature map of the data that has been used.
• 2- We run an epoch loop with batch loops inside it to repetitively approach the result.
• 3- We use the weights developed at the first stage and a randomly generated w_class to compute the misclassification error.
• We perform conjugate gradient with 3 line searches to minimize the weights and w_class.
• 27. THE PITH
• Note that
• [I J]=max(targetout,[],2); % 100 1, 100 1 --> I has the max value and J has its index
• We use w_class to classify the digit.
• w_class is defined as:
• w_class = 0.1*randn(size(w3,2)+1,10); % randn(501,10) = 501 10
• w_class is a matrix of 501 rows and 10 columns.
• targetout = exp(w3probs*w_class); % 100 501 * 501 10 = 100 10
• targetout is a matrix of 100 rows and 10 columns and is normalized.
• [I J]=max(targetout,[],2); % 100 1, 100 1 --> I has the value, J has its index
• J is the digit and it comes from the index of the max probability value in each row.
• 28. CG_CLASSIFY_INIT
• ...
• w_class = reshape(VV,l1+1,l2); % 51 10
• w3probs = [w3probs ones(N,1)]; % 1000 51
• targetout = exp(w3probs*w_class); % 1000 51 * 51 10 = 1000 10
• targetout = targetout./repmat(sum(targetout,2),1,10); % repmat((1000 1),1,10) normalize
• 29. Normalize mechanism
• %{
• debug> a=[1,2,3;4,5,6]
• a =
•   1 2 3
•   4 5 6
• debug> sum(a,2)
• ans =
•   6
•   15
• debug> repmat(sum(a,2),1,10)
• ans =
•   6 6 6 6 6 6 6 6 6 6
•   15 15 15 15 15 15 15 15 15 15
• %}
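The same normalize mechanism in NumPy, for comparison: broadcasting replaces repmat, so the explicit tiling is only needed to reproduce the MATLAB intermediate. An illustrative sketch.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]], dtype=float)
row_sums = a.sum(axis=1, keepdims=True)   # [[6], [15]], like sum(a,2)
tiled = np.tile(row_sums, (1, 10))        # repmat(sum(a,2),1,10)
normalized = a / row_sums                 # broadcasting; each row now sums to 1
```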
• 30. Backpropagate the error
• f = -sum(sum( target(:,1:end).*log(targetout))); % cross entropy
• IO = (targetout-target(:,1:end)); % 1000 10
• Ix_class=IO;
• dw_class = w3probs'*Ix_class; % 1000 51' * 1000 10 = 51 10
• df = [dw_class(:)']'; % 510 1
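The gradient on this slide uses the standard softmax-plus-cross-entropy simplification: the derivative with respect to the logits is just targetout - target, so dw_class = w3probs' * (targetout - target). A NumPy sketch of the same cost and gradient; the function name `cost_and_grad` is an assumption.

```python
import numpy as np

def cost_and_grad(w3probs, w_class, target):
    """Cross-entropy cost f and gradient dw_class, mirroring the slide:
    IO = targetout - target, dw_class = w3probs' * IO."""
    targetout = np.exp(w3probs @ w_class)
    targetout /= targetout.sum(axis=1, keepdims=True)
    f = -np.sum(target * np.log(targetout))   # cross entropy
    IO = targetout - target                   # gradient w.r.t. the logits
    dw_class = w3probs.T @ IO
    return f, dw_class
```

A finite-difference check confirms the simplification: perturbing one entry of w_class changes f by the corresponding entry of dw_class times the perturbation.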
• 31. CG_CLASSIFY
• Reshape VV to produce w1, w2, w3, w_class.
• Produce w1probs, w2probs, w3probs.
• Produce targetout from w3probs and w_class.
• Produce the cost value f using target and targetout.
• Produce dw1, dw2, dw3 and dw_class.
• df = [dw1(:)' dw2(:)' dw3(:)' dw_class(:)']';
• % 272060 1