Machine(s) Learning with Neural Networks


www.bgoncalves.com
github.com/bmtgoncalves/Neural-Networks
References
• Machine Learning, Andrew Ng
• Neural Networks for Machine Learning, Geoff Hinton
How the Brain “Works” (Cartoon version)
• Each neuron receives input from other neurons
• ~10^11 neurons, each with ~10^4 weights
• Weights can be positive or negative
• Weights adapt during the learning process
• “Neurons that fire together wire together” (Hebb)
• Different areas perform different functions using the same structure (modularity)
How the Brain “Works” (Cartoon version)
Inputs → f(Inputs) → Output
Supervised Learning - Classification
[Table: N×M data matrix; rows Sample 1 … Sample N, columns Feature1 … FeatureM, plus a Label column]
• Dataset formatted as an N×M matrix of N samples and M features
• Each sample belongs to a specific class or has a specific label.
• The goal of classification is to predict to which class a previously unseen sample belongs by learning the defining regularities of each class:
• K-Nearest Neighbor (see the sketch below)
• Support Vector Machine
• Neural Networks
• Two fundamental types of problems:
• Classification (discrete output value)
• Regression (continuous output value)
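As a concrete baseline for this setup, here is a minimal 1-nearest-neighbor classifier over such an N×M matrix. This sketch is not from the deck; the function name and the Euclidean metric are illustrative assumptions.

import numpy as np

def nearest_neighbor_predict(X_train, y_train, x_new):
    """Label a new sample with the label of its closest training sample."""
    distances = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distance to every row
    return y_train[np.argmin(distances)]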
Supervised Learning - Overfitting
[Table: N×M data matrix; rows Sample 1 … Sample N, columns Feature1 … FeatureM, plus a Label column]
• “Learning the noise”
• “Memorization” instead of “generalization”
• How can we prevent it?
• Split the dataset into two subsets: Training and Testing
• Train the model using only the Training dataset and evaluate the results on the previously unseen Testing dataset.
• Different heuristics on how to split (see the sketch below):
• Single split
• k-fold cross validation: split the dataset into k parts, train on k−1 and evaluate on the remaining one; repeat k times and average the results.
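Both split heuristics fit in a few lines of NumPy. This is a sketch, not the deck's code; the function names, the 80/20 default, and the fixed seed are illustrative assumptions.

import numpy as np

def single_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle the samples and hold out a fraction for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    n_test = int(test_fraction * X.shape[0])
    test, train = order[:n_test], order[n_test:]
    return X[train], y[train], X[test], y[test]

def k_fold_indices(N, k=5, seed=0):
    """Yield (train, test) index arrays; each fold plays the test set once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(N), k)
    for i in range(k):
        train = np.concatenate(folds[:i] + folds[i + 1:])
        yield train, folds[i]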
Bias-Variance Tradeoff
Perceptron
[Figure: perceptron. Inputs x1 … xN with weights w1j … wNj and a bias weight w0j (input fixed at 1) are combined into z_j = wᵀx; the activation function σ(z) produces the output a_j.]
Activation Function
• Non-linear function
• Differentiable
• Non-decreasing
• Compute new sets of features
• Each layer builds up a more abstract representation of the data
activation.py
http://github.com/bmtgoncalves/Neural-Networks
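The contents of activation.py are not reproduced on the slide. Below is a minimal sketch consistent with how the file is used later (forward.py imports sigmoid, and nn.py calls sigmoidGradient); the relu and tanh definitions are standard but assumed here.

import numpy as np

def sigmoid(z):
    """Logistic function: squashes z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoidGradient(z):
    """Derivative of the sigmoid, sigma(z) * (1 - sigma(z)), used in backprop."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    """Rectified linear unit: element-wise max(0, z)."""
    return np.maximum(0, z)

def tanh(z):
    """Hyperbolic tangent: squashes z into (-1, 1)."""
    return np.tanh(z)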
Activation Function - Sigmoid
σ(z) = 1 / (1 + e^(−z))
activation.py
• Non-linear function
• Differentiable
• Non-decreasing
• Compute new sets of features
• Each layer builds up a more abstract representation of the data
• Perhaps the most common
http://github.com/bmtgoncalves/Neural-Networks
Activation Function - ReLU
σ(z) = max(0, z)
activation.py
• Non-linear function
• Differentiable (except at z = 0)
• Non-decreasing
• Compute new sets of features
• Each layer builds up a more abstract representation of the data
• Results in faster learning than with the sigmoid
http://github.com/bmtgoncalves/Neural-Networks
Activation Function - tanh
σ(z) = (e^z − e^(−z)) / (e^z + e^(−z))
activation.py
• Non-linear function
• Differentiable
• Non-decreasing
• Compute new sets of features
• Each layer builds up a more abstract representation of the data
http://github.com/bmtgoncalves/Neural-Networks
Perceptron - Forward Propagation
• The output of a perceptron is determined by a sequence of steps:
• obtain the inputs
• multiply the inputs by the respective weights
• calculate the value of the activation function
• (map the activation value to the predicted output)
[Figure: the same perceptron diagram; inputs x1 … xN, weights w0j … wNj with a bias input of 1, output a_j = σ(wᵀx)]
Perceptron - Forward Propagation
import numpy as np

def forward(Theta, X, active):
    N = X.shape[0]

    # Add the bias column
    X_ = np.concatenate((np.ones((N, 1)), X), 1)

    # Multiply by the weights
    z = np.dot(X_, Theta.T)

    # Apply the activation function
    a = active(z)

    return a

if __name__ == "__main__":
    Theta1 = np.load('input/Theta1.npy')
    X = np.load('input/X_train.npy')[:10]

    from activation import sigmoid
    active_value = forward(Theta1, X, sigmoid)
forward.py
http://github.com/bmtgoncalves/Neural-Networks
Perceptron - Training
[Figure: the same perceptron diagram; inputs x1 … xN, weights w0j … wNj with a bias input of 1, output a_j = σ(wᵀx)]
• Training procedure (see the sketch below):
• If the output is correct, do nothing
• If it incorrectly outputs 0, add the input to the weight vector
• If it incorrectly outputs 1, subtract the input from the weight vector
• Guaranteed to converge, if a correct set of weights exists
• Given enough features, perceptrons can learn almost anything
• The specific features used limit what is possible to learn
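The deck does not show code for this rule. Here is a minimal sketch of the classic perceptron update, assuming binary labels y ∈ {0, 1}, a threshold at 0, and the bias folded in as a leading input of 1.

import numpy as np

def perceptron_train(X, y, epochs=100):
    """Classic perceptron rule: add the input on a false 0, subtract it on a false 1."""
    N = X.shape[0]
    X_ = np.concatenate((np.ones((N, 1)), X), 1)  # prepend the bias input of 1
    w = np.zeros(X_.shape[1])
    for _ in range(epochs):
        errors = 0
        for x_i, y_i in zip(X_, y):
            pred = 1 if np.dot(w, x_i) > 0 else 0
            if pred == y_i:
                continue                     # correct: do nothing
            w += x_i if y_i == 1 else -x_i   # false 0: add input; false 1: subtract it
            errors += 1
        if errors == 0:                      # a correct set of weights was found
            break
    return w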
Forward Propagation
• The output of a perceptron is determined by a sequence of steps:
• obtain the inputs
• multiply the inputs by the respective weights
• calculate the output using the activation function
• To create a multi-layer perceptron, simply use the output of one layer as the input to the next one.
• But how can we propagate back the errors and update the weights?
[Figure: two perceptrons chained. Inputs x1 … xN feed the first (weights w0j … wNj, output a_j = σ(wᵀx)); its activations a1 … aN, plus a bias input of 1, feed the second (weights w0k … wNk, output a_k = σ(wᵀa)).]
Backward Propagation of Errors (BackProp)
• BackProp operates in two phases:
• Forward propagate the inputs and calculate the deltas
• Update the weights
• The error at the output is a weighted average difference between the predicted output and the observed one.
• For inner layers there is no “real output”!
Chain-rule
• From the forward propagation described above, we know that the output of a neuron is: y_j = wᵀx
• But how can we calculate how to modify the weights w_ij?
• We take the derivative of the error with respect to the weights!
• Using the chain rule: ∂E/∂w_ij = (∂E/∂y_j) · (∂y_j/∂w_ij)
• And finally we can update each weight in the previous layer: w_ij ← w_ij − α ∂E/∂w_ij
• where α is the learning rate
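A toy numeric version of this update for a single linear neuron with quadratic error; the input, target, weights, and α = 0.1 are illustrative values, not from the deck.

import numpy as np

# One neuron: y = w^T x, with quadratic error E = (y - t)^2 / 2
x = np.array([1.0, 2.0])    # input
t = 1.0                     # target output
w = np.array([0.5, -0.3])   # current weights
alpha = 0.1                 # learning rate

y = np.dot(w, x)        # forward pass: y = w^T x
dE_dy = y - t           # dE/dy for E = (y - t)^2 / 2
dE_dw = dE_dy * x       # chain rule: dE/dw = (dE/dy) * (dy/dw) = (y - t) * x
w = w - alpha * dE_dw   # the gradient-descent update from the slide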
Loss Functions
Note: the cross entropy is complementary to sigmoid activation in the output layer and improves its stability.
• For learning to occur, we must quantify how far off we are from the desired output. There are two common ways of doing this:
• Quadratic error function: E = (1/N) Σₙ |yₙ − aₙ|²
• Cross entropy: J = −(1/N) Σₙ [yₙᵀ log aₙ + (1 − yₙ)ᵀ log(1 − aₙ)]
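Both losses take only a few lines of NumPy. A sketch, assuming one-hot targets y and predictions a as arrays of the same shape (N × K):

import numpy as np

def quadratic_error(y, a):
    """E = (1/N) * sum_n |y_n - a_n|^2"""
    N = y.shape[0]
    return np.sum((y - a) ** 2) / N

def cross_entropy(y, a):
    """J = -(1/N) * sum_n [y_n^T log(a_n) + (1 - y_n)^T log(1 - a_n)]"""
    N = y.shape[0]
    return -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a)) / N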
A practical example - MNIST
[Figure: sample MNIST digit images, an 8 and a 2]
visualize_digits.py

convert_input.py
[Table: N×M data matrix; rows Sample 1 … Sample N, columns Feature1 … FeatureM, plus a Label column]
http://github.com/bmtgoncalves/Neural-Networks
http://yann.lecun.com/exdb/mnist/
A practical example - MNIST
[Table: N×M data matrix; rows Sample 1 … Sample N, columns Feature1 … FeatureM, plus a Label column]
visualize_digits.py

convert_input.py
Features X, labels y
3 layers: 1 input layer, 1 hidden layer, 1 output layer
Pipeline: X → Θ₁ → Θ₂ → argmax
http://github.com/bmtgoncalves/Neural-Networks
http://yann.lecun.com/exdb/mnist/
A practical example - MNIST
[Figure: network with layer sizes 784 → 50 → 10 (plus bias units), trained on 5000 examples; Θ₁ is 50 × 785 and Θ₂ is 10 × 51]
nn.py
forward.py
Pipeline: X → Θ₁ → Θ₂ → argmax
http://github.com/bmtgoncalves/Neural-Networks
A practical example - MNIST
[Figure: layer sizes 784 → 50 → 10 (plus bias units), 5000 examples; Θ₁ is 50 × 785 and Θ₂ is 10 × 51]
Forward Propagation: X → Θ₁ → Θ₂ → argmax

nn.py:
def predict(Theta1, Theta2, X):
    h1 = forward(Theta1, X, sigmoid)
    h2 = forward(Theta2, h1, sigmoid)
    return np.argmax(h2, 1)

forward.py:
def forward(Theta, X, active):
    N = X.shape[0]

    # Add the bias column
    X_ = np.concatenate((np.ones((N, 1)), X), 1)

    # Multiply by the weights
    z = np.dot(X_, Theta.T)

    # Apply the activation function
    a = active(z)

    return a

http://github.com/bmtgoncalves/Neural-Networks
A practical example - MNIST
[Figure: layer sizes 784 → 50 → 10 (plus bias units), 5000 examples; Θ₁ is 50 × 785 and Θ₂ is 10 × 51. Forward propagation carries vectors and matrices through X → Θ₁ → Θ₂ → argmax; backward propagation carries matrices (50 × 784 and 10 × 50) back through the layers.]
nn.py
forward.py
http://github.com/bmtgoncalves/Neural-Networks
Backprop
def backprop(Theta1, Theta2, X, y):
    N = X.shape[0]
    K = Theta2.shape[0]

    J = 0
    Delta2 = np.zeros(Theta2.shape)
    Delta1 = np.zeros(Theta1.shape)

    for i in range(N):  # Forward propagation, saving intermediate results
        a1 = np.concatenate(([1], X[i]))         # Input layer
        z2 = np.dot(Theta1, a1)
        a2 = np.concatenate(([1], sigmoid(z2)))  # Hidden layer
        z3 = np.dot(Theta2, a2)
        a3 = sigmoid(z3)                         # Output layer

        y0 = one_hot(K, y[i])

        # Cross entropy
        J -= np.dot(y0.T, np.log(a3)) + np.dot((1 - y0).T, np.log(1 - a3))

        # Calculate the weight deltas
        delta_3 = a3 - y0
        delta_2 = np.dot(Theta2.T, delta_3)[1:] * sigmoidGradient(z2)
        Delta2 += np.outer(delta_3, a2)
        Delta1 += np.outer(delta_2, a1)

    J /= N
    Theta1_grad = Delta1 / N
    Theta2_grad = Delta2 / N

    return [J, Theta1_grad, Theta2_grad]

nn.py
http://github.com/bmtgoncalves/Neural-Networks
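backprop also relies on a one_hot helper that the deck doesn't show (sigmoidGradient was sketched above alongside activation.py). A minimal version consistent with how it is called, offered as an assumption:

import numpy as np

def one_hot(K, label):
    """Length-K vector with a 1 at position `label` and 0 everywhere else."""
    y = np.zeros(K)
    y[label] = 1.0
    return y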
Training
Theta1 = init_weights(input_layer_size, hidden_layer_size)
Theta2 = init_weights(hidden_layer_size, num_labels)

tol = 1e-3
J_old = 1 / tol
diff = 1

while diff > tol:  # stop once the loss change falls below the tolerance
    J_train, Theta1_grad, Theta2_grad = backprop(Theta1, Theta2, X_train, y_train)

    diff = abs(J_old - J_train)
    J_old = J_train

    # Gradient-descent step with a fixed learning rate of 0.5
    Theta1 -= .5 * Theta1_grad
    Theta2 -= .5 * Theta2_grad

nn.py
http://github.com/bmtgoncalves/Neural-Networks
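Once trained, the fit can be checked on the held-out set with the predict function from earlier. A sketch, assuming X_test and y_test come from the same train/test split as X_train:

# Accuracy on the previously unseen Testing set
predictions = predict(Theta1, Theta2, X_test)
accuracy = np.mean(predictions == y_test)
print("Test accuracy: %.3f" % accuracy)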
Bias-Variance Tradeoff
[Figure: loss/bias vs. model complexity; high bias and low variance at the simple end, low bias and high variance at the complex end, with variance growing as bias falls]
Learning Rate
[Figure: loss vs. epoch under low, high, very high, and the best learning rates]
w_ij ← w_ij − α ∂E/∂w_ij
α defines the size of the step in the direction of ∂E/∂w_ij
Tips
• online learning - update weights after each case
- might be useful to update the model as new data is obtained
- subject to fluctuations
• mini-batch - update weights after a “small” number of cases (see the sketch below)
- batches should be balanced
- if the dataset is redundant, the gradient estimated using only a fraction of the data is a good approximation to the full gradient
• momentum - let the gradient change the velocity of the weight change instead of the value directly
• rmsprop - divide the learning rate for each weight by a running average of “recent” gradients
• learning rate - vary it over the course of the training procedure, and use different learning rates for each weight
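A sketch of mini-batch updates grafted onto the earlier training loop; the function name, batch size, epoch count, and the reuse of backprop on slices are illustrative assumptions.

import numpy as np

def train_minibatch(Theta1, Theta2, X, y, batch_size=64, epochs=10, alpha=0.5):
    """Gradient descent on shuffled mini-batches instead of the full dataset."""
    N = X.shape[0]
    for _ in range(epochs):
        order = np.random.permutation(N)  # reshuffle so batches stay balanced
        for start in range(0, N, batch_size):
            batch = order[start:start + batch_size]
            J, Theta1_grad, Theta2_grad = backprop(Theta1, Theta2, X[batch], y[batch])
            Theta1 -= alpha * Theta1_grad
            Theta2 -= alpha * Theta2_grad
    return Theta1, Theta2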
Neural Network Architectures
Interpretability
“Deep” learning
