Neural Nets from Scratch
This contains the slides (without animations) for a talk I gave on the mathematical foundations of neural nets, and on how one would code a neural net from scratch in Python in a way that makes the math work out.
2. “Don’t get hung up on notation. Understanding the concepts is the important thing here… If you understand the concepts, you can invent your own notation.”
–John Cochrane
3. Table of contents
• What are neural nets used for?
• What are they, mathematically?
• How, and why, do they work?
12. Function approximation example: neural nets
• Neural nets can be written, mathematically, as nested
functions.
• We’ll translate this:
• Into this:
• This will tell us exactly how to “train” the neural net: that
is, how to update the weights to make better
predictions.
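The "nested functions" idea can be sketched in a few lines of Python. This is a minimal illustration with made-up weights and a sigmoid activation (my assumptions, not the slides' exact formulas): the prediction is one function applied to the output of another.

```python
import numpy as np

def sigmoid(z):
    # Squashing activation, applied elementwise
    return 1.0 / (1.0 + np.exp(-z))

def layer1(x, W):
    # Inner function: first-layer weights, then activation
    return sigmoid(W @ x)

def layer2(a, V):
    # Outer function: second-layer weights
    return V @ a

def predict(x, W, V):
    # The prediction is the nested composition layer2(layer1(x))
    return layer2(layer1(x, W), V)

W = np.array([[1.0, -2.0], [0.5, 0.5], [-1.0, 1.0]])  # first-layer weights (illustrative)
V = np.array([[1.0, -1.0, 0.5]])                      # second-layer weights (illustrative)
x = np.array([0.5, -1.0])
print(predict(x, W, V))
```

Writing the net this way is what lets the chain rule, later on, tell us how to update W and V.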
13. Table of contents
• What are neural nets used for?
• What are they, mathematically?
• How, and why, do they work?
14. Goal:
• Break down how neural nets make their
predictions. That is, compute:
• Understand how this computation corresponds to a
typical notion of a neural net:
25. How do we know if this prediction was any good?
Answer: we compute the loss:
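As a concrete sketch, here is a squared-error loss; the 1/2 factor is a common convention I'm assuming, not something stated on the slide. A perfect prediction gives zero loss, and worse predictions give larger loss.

```python
import numpy as np

def loss(pred, y):
    # Squared-error loss between prediction and target
    return 0.5 * np.sum((pred - y) ** 2)

print(loss(np.array([1.0]), np.array([1.0])))  # perfect prediction: 0.0
print(loss(np.array([0.8]), np.array([1.0])))  # imperfect prediction: positive
```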
26. How can we change the weights V
and W to reduce this loss?
Strategy: compute the derivatives:
Then, subtract those derivatives
from the weights themselves:
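The "subtract the derivatives from the weights" step can be shown in one dimension. This is a hedged sketch with a loss I made up, (w − 3)², whose derivative is 2(w − 3); the learning rate is my assumption (the slides just say "subtract").

```python
w = 0.0
lr = 0.1  # learning rate (scales how much of the derivative we subtract)
for _ in range(50):
    grad = 2 * (w - 3)  # derivative of the loss (w - 3)**2
    w = w - lr * grad   # subtract the derivative from the weight
print(w)  # close to 3, where the loss is smallest
```

Each step moves w downhill on the loss, which is exactly what training does to V and W.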
27. Strategy: compute the derivatives:
Since neural nets are nested functions, we’ll compute the
derivatives of each of those functions:
We’ll then be able to compute the derivatives of the loss
with respect to V and W by multiplying these individual
derivatives together. This will work because of the chain rule.
28. A digression on the chain rule:
Because of the chain rule, we can compute the derivative of
a nested function by taking the derivatives of each individual
function and multiplying those derivatives together.
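The chain rule claim is easy to check numerically. Below is a sketch using a nested function of my own choosing, f(g(x)) with f(u) = u² and g(x) = sin(x): multiplying the individual derivatives, f′(g(x)) · g′(x) = 2·sin(x)·cos(x), matches a finite-difference estimate of the derivative.

```python
import math

def fg(x):
    # The nested function f(g(x)) = sin(x)**2
    return math.sin(x) ** 2

x = 0.7
# Chain rule: derivative of the outer function times derivative of the inner
analytic = 2 * math.sin(x) * math.cos(x)
# Numerical check via central finite differences
h = 1e-6
numeric = (fg(x + h) - fg(x - h)) / (2 * h)
print(analytic, numeric)  # the two agree closely
```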
47. A leading online learning platform’s
explanation of backpropagation
This tells you how backpropagation works,
but not why it works.
48. Key takeaways:
• Neural nets are, mathematically, nested functions.
• “Backpropagation” is just computing successive
derivatives of these individual functions.
• Backpropagation works because of the chain rule.
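The takeaways above can be combined into one end-to-end training step. This is a hedged sketch of a two-layer net like the one in the talk: the variable names W and V follow the slides, but the sigmoid activation, squared-error loss, and learning rate are my assumptions. The backward pass is nothing but the derivative of each nested function, multiplied together via the chain rule, then subtracted from the weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, y, W, V, lr=0.1):
    # Forward pass: evaluate each nested function in turn
    z = W @ x        # first-layer weighted sum
    a = sigmoid(z)   # activation
    pred = V @ a     # prediction
    l = 0.5 * np.sum((pred - y) ** 2)  # squared-error loss

    # Backward pass: derivative of each function, chained together
    dpred = pred - y          # dL/dpred
    dV = np.outer(dpred, a)   # dL/dV
    da = V.T @ dpred          # dL/da
    dz = da * a * (1 - a)     # dL/dz (sigmoid derivative)
    dW = np.outer(dz, x)      # dL/dW

    # Update: subtract the derivatives from the weights
    return W - lr * dW, V - lr * dV, l

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))  # first-layer weights
V = rng.standard_normal((1, 3))  # second-layer weights
x = np.array([0.5, -1.0])
y = np.array([1.0])

losses = []
for _ in range(300):
    W, V, l = train_step(x, y, W, V)
    losses.append(l)
print(losses[0], losses[-1])  # the loss shrinks as training proceeds
```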
49. Other resources
• Blog posts:
• Matt Mazur: A Step by Step Backpropagation Example:
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
• Andrej Karpathy: Hacker’s Guide to Neural Networks:
http://karpathy.github.io/neuralnets/
• Andrew Trask: A Neural Network in 11 Lines of Python:
http://iamtrask.github.io/2015/07/12/basic-python-network/
• sethweidman.com: sign up for the newsletter to get
emailed when new content is posted
• Learn about other kinds of neural net architectures at
future versions of this Meetup
51. Table of contents
• What are neural nets used for?
• What are they, mathematically?
• How, and why, do they work?
• Connection to “Deep Learning”
52. Terminology
• “Layers” send “inputs” forwards and “gradients” backwards.
• Each layer has an “activation function”.
• Layers that are neither input nor output are “hidden” layers.
• Sending the “errors” backwards is called “backpropagation”.
53. The underlying math is exactly the same as before, just
described differently.
54. To learn more, visit sethweidman.com.
Sign up for the newsletter!
55. To learn about other neural net architectures, check out
future Chi Town Machine Learning Meetups, featuring
speakers such as Jeremy Watt, Rami Jachi, and Seth
Weidman. Sponsored by Nousot!
56. Key takeaways:
• Neural nets are, mathematically, nested functions.
• “Backpropagation” is just computing successive
derivatives of these individual functions.
• Backpropagation works because of the chain rule.