Neural Nets from Scratch
This contains the slides (without animations) for a talk I gave on the mathematical foundations of neural nets, and on how one would code a neural net from scratch in Python in a way that makes the math work out.
2. “Don’t get hung up on notation. Understanding the concepts is the important thing here… If you understand the concepts, you can invent your own notation.”
–John Cochrane
3. Table of contents
• What are neural nets used for?
• What are they, mathematically?
• How, and why, do they work?
12. Function approximation example: neural nets
• Neural nets can be written, mathematically, as nested
functions.
• We’ll translate this:
• Into this:
• This will tell us exactly how to “train” the neural net: that
is, how to update the weights to make better
predictions.
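The "nested functions" idea can be sketched in a few lines of Python. This is a minimal illustration with made-up weights and a sigmoid activation (my assumptions, not the slides' exact formulas): the prediction is one function applied to the output of another.

```python
import numpy as np

def sigmoid(z):
    # Squashing activation, applied elementwise
    return 1.0 / (1.0 + np.exp(-z))

def layer1(x, W):
    # Inner function: first-layer weights, then activation
    return sigmoid(W @ x)

def layer2(a, V):
    # Outer function: second-layer weights
    return V @ a

def predict(x, W, V):
    # The prediction is the nested composition layer2(layer1(x))
    return layer2(layer1(x, W), V)

W = np.array([[1.0, -2.0], [0.5, 0.5], [-1.0, 1.0]])  # first-layer weights (illustrative)
V = np.array([[1.0, -1.0, 0.5]])                      # second-layer weights (illustrative)
x = np.array([0.5, -1.0])
print(predict(x, W, V))
```

Writing the net this way is what lets the chain rule, later on, tell us how to update W and V.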
13. Table of contents
• What are neural nets used for?
• What are they, mathematically?
• How, and why, do they work?
14. Goal:
• Break down how neural nets make their
predictions. That is, compute:
• Understand how this computation corresponds to a
typical notion of a neural net:
25. How do we know if this prediction was any good?
Answer: we compute the loss:
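As a concrete sketch, here is a squared-error loss; the 1/2 factor is a common convention I'm assuming, not something stated on the slide. A perfect prediction gives zero loss, and worse predictions give larger loss.

```python
import numpy as np

def loss(pred, y):
    # Squared-error loss between prediction and target
    return 0.5 * np.sum((pred - y) ** 2)

print(loss(np.array([1.0]), np.array([1.0])))  # perfect prediction: 0.0
print(loss(np.array([0.8]), np.array([1.0])))  # imperfect prediction: positive
```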
26. How can we change the weights V
and W to reduce this loss?
Strategy: compute the derivatives:
Then, subtract those derivatives
from the weights themselves:
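The "subtract the derivatives from the weights" step can be shown in one dimension. This is a hedged sketch with a loss I made up, (w − 3)², whose derivative is 2(w − 3); the learning rate is my assumption (the slides just say "subtract").

```python
w = 0.0
lr = 0.1  # learning rate (scales how much of the derivative we subtract)
for _ in range(50):
    grad = 2 * (w - 3)  # derivative of the loss (w - 3)**2
    w = w - lr * grad   # subtract the derivative from the weight
print(w)  # close to 3, where the loss is smallest
```

Each step moves w downhill on the loss, which is exactly what training does to V and W.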
27. Strategy: compute the derivatives:
Since neural nets are nested functions, we’ll compute the
derivatives of each of those functions:
We’ll then be able to compute the derivatives of the loss
with respect to V and W by multiplying these individual
derivatives together. This will work because of the chain rule.
28. A digression on the chain rule:
Because of the chain rule, we can compute the derivative of
a nested function by taking the derivatives of each individual
function and multiplying those derivatives together.
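The chain rule claim is easy to check numerically. Below is a sketch using a nested function of my own choosing, f(g(x)) with f(u) = u² and g(x) = sin(x): multiplying the individual derivatives, f′(g(x)) · g′(x) = 2·sin(x)·cos(x), matches a finite-difference estimate of the derivative.

```python
import math

def fg(x):
    # The nested function f(g(x)) = sin(x)**2
    return math.sin(x) ** 2

x = 0.7
# Chain rule: derivative of the outer function times derivative of the inner
analytic = 2 * math.sin(x) * math.cos(x)
# Numerical check via central finite differences
h = 1e-6
numeric = (fg(x + h) - fg(x - h)) / (2 * h)
print(analytic, numeric)  # the two agree closely
```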
47. A leading online learning platform’s
explanation of backpropagation
This tells you how backpropagation works,
but not why it works.
48. Key takeaways:
• Neural nets are, mathematically, nested functions.
• “Backpropagation” is just computing successive
derivatives of these individual functions.
• Backpropagation works because of the chain rule.
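The takeaways above can be combined into one end-to-end training step. This is a hedged sketch of a two-layer net like the one in the talk: the variable names W and V follow the slides, but the sigmoid activation, squared-error loss, and learning rate are my assumptions. The backward pass is nothing but the derivative of each nested function, multiplied together via the chain rule, then subtracted from the weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, y, W, V, lr=0.1):
    # Forward pass: evaluate each nested function in turn
    z = W @ x        # first-layer weighted sum
    a = sigmoid(z)   # activation
    pred = V @ a     # prediction
    l = 0.5 * np.sum((pred - y) ** 2)  # squared-error loss

    # Backward pass: derivative of each function, chained together
    dpred = pred - y          # dL/dpred
    dV = np.outer(dpred, a)   # dL/dV
    da = V.T @ dpred          # dL/da
    dz = da * a * (1 - a)     # dL/dz (sigmoid derivative)
    dW = np.outer(dz, x)      # dL/dW

    # Update: subtract the derivatives from the weights
    return W - lr * dW, V - lr * dV, l

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))  # first-layer weights
V = rng.standard_normal((1, 3))  # second-layer weights
x = np.array([0.5, -1.0])
y = np.array([1.0])

losses = []
for _ in range(300):
    W, V, l = train_step(x, y, W, V)
    losses.append(l)
print(losses[0], losses[-1])  # the loss shrinks as training proceeds
```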
49. Other resources
• Blog posts:
• Matt Mazur: A Step by Step Backpropagation Example:
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
• Andrej Karpathy: Hacker’s Guide to Neural Networks:
http://karpathy.github.io/neuralnets/
• Andrew Trask: A Neural Network in 11 Lines of Python:
http://iamtrask.github.io/2015/07/12/basic-python-network/
• sethweidman.com: sign up for the newsletter to get
emailed when new content is posted
• Learn about other kinds of neural net architectures at
future versions of this Meetup
51. Table of contents
• What are neural nets used for?
• What are they, mathematically?
• How, and why, do they work?
• Connection to “Deep Learning”
52. Terminology
• “Layers” send “inputs” forwards and “gradients” backwards.
• Each layer has an “activation function”.
• Layers that are neither input nor output are “hidden” layers.
• Sending the “errors” backwards is called “backpropagation”.
53. The underlying math is exactly the same as before, just
described differently.
54. To learn more, visit sethweidman.com.
Sign up for the newsletter!
55. To learn about other neural net architectures, check out
future Chi Town Machine Learning Meetups, featuring
speakers such as Jeremy Watt, Rami Jachi, and Seth
Weidman. Sponsored by Nousot!
56. Key takeaways:
• Neural nets are, mathematically, nested functions.
• “Backpropagation” is just computing successive
derivatives of these individual functions.
• Backpropagation works because of the chain rule.