Backprop,
Gradient Descent,
And Auto Differentiation
Sam Abrahams, Memdump LLC
https://goo.gl/tKOvr
7
Link to these slides:
YO!
I am Sam Abrahams
I am a data scientist and engineer.
You can find me on GitHub @samjabrahams
Buy my book:
TensorFlow for Machine Intelligence
1.
Gradient
Descent
Guess and Check
for Adults
Gradient Descent Outline
▣ Problem: fit data
▣ Basic OLS linear regression
▣ Visualize error curve and regression line
▣ Step by step through changes
Scatter Plot
Simple Start: Linear Regression
Ordinary Least Squares Linear Regression
Simple Start: Linear Regression
Simple Start: Linear Regression
▣ Want to find a model that can fit our data
▣ Could do it algebraically…
▣ BUT that doesn’t generalize well
Simple Start: Linear Regression
▣ Step back: what does ordinary linear regression
try to do?
▣ Minimize the sum (or average) of the squared errors
▣ How else could we minimize?
Gradient Descent
▣ Start with a random guess
▣ Use the derivative (gradient when dealing with
multiple variables) to get the slope of the error
curve
▣ Move our parameters to move down the error
curve
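The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the talk: the toy data, the fixed starting guess (used instead of a random one so the run is repeatable), the learning rate, and the step count are all assumptions.

```python
# Toy data lying exactly on y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def cost(w, b):
    """Mean squared error J(w, b) of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, b):
    """Partial derivatives dJ/dw and dJ/db of the mean squared error."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

w, b = 0.0, 0.0          # starting guess (fixed here; normally random)
learning_rate = 0.05     # hypothetical step size
for _ in range(2000):
    dw, db = gradient(w, b)
    w -= learning_rate * dw   # negative slope -> w increases ("move right")
    b -= learning_rate * db
```

Subtracting the gradient is what turns "slope is negative" into "move to the right": the parameters always step downhill on the error curve.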
Single Variable Cost Curve
[Figure: cost J (vertical axis) plotted against a single weight W (horizontal axis)]
Random guess put us here
Slope at the guess: ∂J/∂W
∂J/∂W < 0; move to the right!
[Figure sequence: the same curve, with ∂J/∂W re-evaluated at each new value of W]
1.5
Gradient
Descent Variants
Intelligent descent
into madness
Gradient Descent Variants
▣ There are additional techniques that can help
speed up (or otherwise improve) gradient
descent
▣ The next slides describe some of these!
▣ More details (and some awesome visuals)
here: article by Sebastian Ruder
Gradient Descent
▣ Get the true gradient with respect to all examples
▣ One step = one epoch
▣ Slow and generally infeasible for large training sets
Gradient Descent
Stochastic Gradient Descent
▣ Basic idea: approximate the derivative using only one example
▣ “Online learning”
▣ Update weights after each example
Stochastic Gradient Descent
Mini-Batch Gradient Descent
▣ Similar idea to stochastic gradient descent
▣ Approximate the derivative with a sample batch of examples
▣ Middle ground between “true” stochastic gradient descent and full gradient descent
Mini-Batch Gradient Descent
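The three variants differ only in how many examples feed each gradient estimate, so one loop with a batch-size knob covers all of them. A rough sketch under stated assumptions: the data, the single-weight model `y_hat = w * x`, the learning rate, and the function names are invented for illustration.

```python
import random

# Toy data on y = 3x (made up); the model is y_hat = w * x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]

def grad(w, batch):
    """Average dJ/dw of the squared error over the examples in `batch`."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train(batch_size, epochs=200, learning_rate=0.01, seed=0):
    rng = random.Random(seed)
    w = 0.0
    examples = list(data)
    for _ in range(epochs):
        rng.shuffle(examples)
        # batch_size == len(data): full gradient descent, one step per epoch
        # batch_size == 1:         stochastic, one step per example
        # anything in between:     mini-batch
        for start in range(0, len(examples), batch_size):
            w -= learning_rate * grad(w, examples[start:start + batch_size])
    return w

w_full = train(batch_size=len(data))
w_sgd = train(batch_size=1)
w_mini = train(batch_size=4)
```

On this noise-free toy problem all three settings reach the same weight; on real data the smaller batches trade gradient accuracy for many more updates per epoch.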
Momentum
▣ Idea: if we see multiple gradients in a row with the same direction, we should take larger steps in that direction (an effectively higher learning rate)
▣ Accumulate a “momentum” vector to speed up descent
Without Momentum
Momentum
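A small sketch of the momentum idea, assuming a shallow one-dimensional cost `J(w) = (w - 5)^2 / 20` chosen so that plain gradient descent crawls. The cost, step count, and hyperparameter values are illustrative assumptions.

```python
def grad(w):
    """Derivative of the toy cost J(w) = (w - 5)^2 / 20."""
    return (w - 5.0) / 10.0

def descend(momentum, steps=50, learning_rate=0.1):
    w, velocity = 0.0, 0.0
    for _ in range(steps):
        # Keep a fraction of the previous step and add the new gradient step.
        velocity = momentum * velocity - learning_rate * grad(w)
        w += velocity
    return w

w_plain = descend(momentum=0.0)      # ordinary gradient descent
w_momentum = descend(momentum=0.9)   # accumulated velocity
```

With `momentum=0.0` the loop reduces to ordinary gradient descent; with the same step budget the momentum run ends much closer to the minimum at `w = 5`.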
Nesterov Momentum
▣ Idea: before updating our weights, look ahead to where the accumulated momentum would take us
▣ Adjust our update based on the gradient at that “future” position
Nesterov Momentum
Source: Lecture by Geoffrey Hinton
[Figure legend: momentum vector, gradient/correction, Nesterov steps, standard momentum steps]
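The look-ahead can be sketched as a one-line change to a momentum loop: evaluate the gradient at the position momentum alone would reach, not at the current weights. This is one common formulation, offered as a sketch; the toy cost and hyperparameters are assumptions.

```python
def grad(w):
    """Derivative of the toy cost J(w) = (w - 5)^2 / 20."""
    return (w - 5.0) / 10.0

def nesterov(steps=50, learning_rate=0.1, momentum=0.9):
    w, velocity = 0.0, 0.0
    for _ in range(steps):
        lookahead = w + momentum * velocity   # where momentum alone would take us
        # Correct the velocity using the gradient at the look-ahead point.
        velocity = momentum * velocity - learning_rate * grad(lookahead)
        w += velocity
    return w

w_final = nesterov(steps=500)
```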
AdaGrad
▣ Idea: update individual weights differently depending on how frequently they change
▣ Keeps a running tally of squared past gradients for each weight, and divides each new update by the square root of that tally
▣ Downside: for long-running training, the tally only grows, so eventually all updates diminish
▣ Paper on jmlr.org
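A minimal sketch of the per-weight tally, assuming a two-weight toy cost whose second weight has a ten times steeper gradient. The cost, learning rate, and `eps` value are illustrative assumptions.

```python
import math

def grad(w):
    """Gradient of the toy cost J(w) = w[0]^2 + 10 * w[1]^2."""
    return [2.0 * w[0], 20.0 * w[1]]

def adagrad(steps=500, learning_rate=0.5, eps=1e-8):
    w = [1.0, 1.0]
    tally = [0.0, 0.0]                 # per-weight sum of squared gradients
    for _ in range(steps):
        g = grad(w)
        for i in range(len(w)):
            tally[i] += g[i] ** 2
            # Each weight gets its own step size, shrinking as its tally grows.
            w[i] -= learning_rate * g[i] / (math.sqrt(tally[i]) + eps)
    return w, tally

w_final, tally_final = adagrad()
```

Because the tally never decreases, the divisor only grows; that is the diminishing-update downside noted above.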
AdaDelta / RMSProp
▣ Two slightly different algorithms with the same concept: only keep a window of recent gradients (in practice, an exponentially decaying average of squared gradients) when scaling updates
▣ Seeks to reduce AdaGrad's diminishing-update problem
▣ AdaDelta Paper on arxiv.org
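A sketch of the RMSProp flavor of this idea, in which a decaying average of squared gradients stands in for the window. AdaDelta's additional rescaling step is not shown. The toy cost and hyperparameters are assumptions.

```python
import math

def grad(w):
    """Derivative of the toy cost J(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def rmsprop(steps=2000, learning_rate=0.01, decay=0.9, eps=1e-8):
    w, avg_sq = 0.0, 0.0
    for _ in range(steps):
        g = grad(w)
        # Old squared gradients fade out instead of piling up forever.
        avg_sq = decay * avg_sq + (1.0 - decay) * g * g
        w -= learning_rate * g / (math.sqrt(avg_sq) + eps)
    return w

w_final = rmsprop()
```

With a constant learning rate the weight settles into a small band around the minimum rather than landing on it exactly, because the normalized step size stays close to the learning rate.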
Adam
▣ Adam expands on the concepts introduced with AdaDelta and RMSProp
▣ Uses decaying averages of both the first moment (mean) and the second moment (uncentered variance) of the gradients
▣ Paper on arxiv.org
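The two moments and their bias correction can be sketched for a single weight as follows. The toy cost, the step count, and the learning rate are illustrative assumptions; the `beta1`, `beta2`, and `eps` values follow commonly used defaults.

```python
import math

def grad(w):
    """Derivative of the toy cost J(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def adam(steps=5000, learning_rate=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    w, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # first moment: decaying mean of gradients
        v = beta2 * v + (1.0 - beta2) * g * g    # second moment: decaying mean of squares
        m_hat = m / (1.0 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_final = adam()
```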
2.
Forward & Back
Propagation
The Chain Rule
got the last laugh,
high-school-you
Beyond OLS Regression
▣ Can’t do everything with linear regression!
▣ Nor polynomial…
▣ Why can’t we let the computer figure out how
to model?
Neural Networks: Idea
▣ Chain together non-linear functions
▣ Have lots of parameters that can be adjusted
▣ These “weights” determine the model function
Feed forward neural network
[Figure: a four-layer feed-forward network. Layers l(1) to l(4): input, hidden 1, hidden 2, output. Inputs x1, x2 and a +1 bias unit feed three sigmoid (σ) units plus a +1 bias unit in each hidden layer; three Softmax (SM) units produce the output ŷ. Weight matrices W(1), W(2), W(3) connect consecutive layers, giving activations a(2), a(3), a(4).]
Legend:
$x_i$: input value
$\hat{y}$: output vector
+1: bias (constant) unit
$a^{(l)}$: activation vector for layer l
$W^{(l)}$: weight matrix for layer l
$z^{(l)}$: input into layer l
σ: sigmoid (logistic) function
SM: Softmax function
[Figure sequence: the same diagram, highlighting in turn Layer 1, Layer 2, Layer 3, Layer 4, biases (constant units), input, weight matrices, layer inputs, activation vectors, sigmoid activation function, Softmax activation function, and output]
Layer inputs, $z^{(l)}$:
$z^{(l)} = W^{(l-1)} a^{(l-1)} + b^{(l-1)}$
Forward Propagation
Input vector is passed into the network
Forward Propagation
Input is multiplied with $W^{(1)}$ weight matrix and added with layer 1 biases to calculate $z^{(2)}$
$z^{(2)} = W^{(1)} x + b^{(1)}$
Forward Propagation
Activation value for the second layer is calculated by passing $z^{(2)}$ into some function. In this case, the sigmoid function.
$a^{(2)} = \sigma(z^{(2)})$
Forward Propagation
$z^{(3)}$ is calculated by multiplying $a^{(2)}$ vector with $W^{(2)}$ weight matrix and adding layer 2 biases
$z^{(3)} = W^{(2)} a^{(2)} + b^{(2)}$
Forward Propagation
Similar to previous layer, $a^{(3)}$ is calculated by passing $z^{(3)}$ into the sigmoid function
$a^{(3)} = \sigma(z^{(3)})$
Forward Propagation
$z^{(4)}$ is calculated by multiplying $a^{(3)}$ vector with $W^{(3)}$ weight matrix and adding layer 3 biases
$z^{(4)} = W^{(3)} a^{(3)} + b^{(3)}$
Forward Propagation
For the final layer, we calculate $a^{(4)}$ by passing $z^{(4)}$ into the Softmax function
$a^{(4)} = \mathrm{SM}(z^{(4)})$
Forward Propagation
We then make our prediction based on the final layer’s output
Page of Math
$z^{(2)} = W^{(1)} x + b^{(1)}$
$z^{(3)} = W^{(2)} a^{(2)} + b^{(2)}$
$z^{(4)} = W^{(3)} a^{(3)} + b^{(3)}$
$a^{(2)} = \sigma(z^{(2)})$
$a^{(3)} = \sigma(z^{(3)})$
$a^{(4)} = \hat{y} = \mathrm{SM}(z^{(4)})$
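The six equations on this page translate almost line for line into code. The sketch below uses plain Python lists to stay dependency-free; the helper names and every weight and bias value are made up for illustration, with layer sizes (2 inputs, 3 + 3 hidden units, 3 outputs) read off the network diagram.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-z_i)) for z_i in z]

def softmax(z):
    top = max(z)                                   # subtract the max for numerical stability
    exps = [math.exp(z_i - top) for z_i in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical parameters: 2 inputs -> 3 hidden -> 3 hidden -> 3 outputs.
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.2, -0.1, 0.3], [0.0, 0.5, -0.4], [0.1, 0.1, 0.1]]
b2 = [0.1, 0.0, -0.2]
W3 = [[0.3, 0.2, -0.1], [-0.3, 0.4, 0.2], [0.0, -0.2, 0.5]]
b3 = [0.0, 0.0, 0.0]

def forward(x):
    z2 = add(matvec(W1, x), b1)     # z(2) = W(1) x + b(1)
    a2 = sigmoid(z2)                # a(2) = sigma(z(2))
    z3 = add(matvec(W2, a2), b2)    # z(3) = W(2) a(2) + b(2)
    a3 = sigmoid(z3)                # a(3) = sigma(z(3))
    z4 = add(matvec(W3, a3), b3)    # z(4) = W(3) a(3) + b(3)
    return softmax(z4)              # a(4) = y_hat = SM(z(4))

y_hat = forward([1.0, 2.0])
```

The Softmax output is a probability distribution over the three classes, so its entries are positive and sum to one.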
Goal:
Find which direction to shift weights
How:
Find partial derivatives of the cost with
respect to weight matrices
How (again):
Chain rule the sh*t out of this mofo
DANGER:
MATH
Chain Rule Reminder
Chain rule example
Find derivative with respect to x:
First split into two functions:
Then get derivative of components:
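The formulas from these slides did not survive in this transcript. As a stand-in, here is a worked example of the same three steps (split, differentiate the components, multiply). The function $f(x) = (3x + 1)^2$ is an assumption chosen for illustration and is not necessarily the one used in the talk.

$$
\begin{aligned}
f(x) &= (3x + 1)^2 \\
g(u) &= u^2, \qquad u = h(x) = 3x + 1 \\
\frac{dg}{du} &= 2u, \qquad \frac{dh}{dx} = 3 \\
\frac{df}{dx} &= \frac{dg}{du} \cdot \frac{dh}{dx} = 2u \cdot 3 = 6(3x + 1)
\end{aligned}
$$

Expanding directly gives $f(x) = 9x^2 + 6x + 1$ and $f'(x) = 18x + 6$, which matches.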
DEEPER
Want:
NOTE: “Cancelling out” isn’t how the math actually works. But it’s a handy way to think about it.
Back Prop
Back to backpropagation:
Want:
Return of Page of Math
$z^{(2)} = W^{(1)} x + b^{(1)}$
$z^{(3)} = W^{(2)} a^{(2)} + b^{(2)}$
$z^{(4)} = W^{(3)} a^{(3)} + b^{(3)}$
$a^{(2)} = \sigma(z^{(2)})$
$a^{(3)} = \sigma(z^{(3)})$
$a^{(4)} = \hat{y} = \mathrm{SM}(z^{(4)})$
Partials, step by step
$a^{(4)} = \hat{y} = \mathrm{SM}(z^{(4)})$
With cross entropy loss:
[Figure: the network diagram with a cost node appended after $a^{(4)}$]
Back Propagation
Want:
Partials, step by step
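The derivative formulas on these slides were images and are missing here, so the following is a hedged reconstruction using the standard results for this architecture, not a copy of the speaker's derivation. For Softmax followed by cross-entropy, $\partial J / \partial z^{(4)} = a^{(4)} - y$; each earlier layer multiplies by the transposed weight matrix and by the sigmoid derivative $a(1 - a)$. All parameter values and helper names are made up.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def outer(u, v):
    return [[u_i * v_j for v_j in v] for u_i in u]

def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-t)) for t in z]

def softmax(z):
    top = max(z)
    exps = [math.exp(t - top) for t in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(params, x):
    """Return the activations [x, a2, a3, a4], cached for the backward pass."""
    (W1, b1), (W2, b2), (W3, b3) = params
    a2 = sigmoid([z + b for z, b in zip(matvec(W1, x), b1)])
    a3 = sigmoid([z + b for z, b in zip(matvec(W2, a2), b2)])
    a4 = softmax([z + b for z, b in zip(matvec(W3, a3), b3)])
    return [x, a2, a3, a4]

def cost(params, x, y):
    """Cross-entropy between the one-hot target y and the prediction a4."""
    a4 = forward(params, x)[-1]
    return -sum(y_i * math.log(p) for y_i, p in zip(y, a4))

def backward(params, x, y):
    """Gradients [(dW1, db1), (dW2, db2), (dW3, db3)] via the chain rule."""
    acts = forward(params, x)
    # Softmax followed by cross-entropy: dJ/dz4 = a4 - y.
    delta = [p - y_i for p, y_i in zip(acts[3], y)]
    grads = []
    for layer in (2, 1, 0):
        W, _ = params[layer]
        a_prev = acts[layer]
        grads.append((outer(delta, a_prev), list(delta)))
        if layer > 0:
            # Push the error back through W, then through the sigmoid,
            # whose derivative a * (1 - a) only needs the cached activation.
            back = matvec(transpose(W), delta)
            delta = [d * a * (1.0 - a) for d, a in zip(back, a_prev)]
    return grads[::-1]

# Hypothetical parameters (W, b) for layers 1 to 3 and one training example.
params = [
    ([[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]], [0.0, 0.1, -0.1]),
    ([[0.2, -0.1, 0.3], [0.0, 0.5, -0.4], [0.1, 0.1, 0.1]], [0.1, 0.0, -0.2]),
    ([[0.3, 0.2, -0.1], [-0.3, 0.4, 0.2], [0.0, -0.2, 0.5]], [0.0, 0.0, 0.0]),
]
x, y = [1.0, 2.0], [0.0, 1.0, 0.0]
grads = backward(params, x, y)
```

A quick way to gain confidence in hand-written gradients like these is a finite-difference check: nudge one weight by a tiny amount, recompute the cost, and compare the slope with the corresponding entry of `grads`.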
As programmers...
How do we NOT
do this ourselves?
We’re lazy by trade.
3.
Automatic
Differentiation
Bringing sexy lazy
back
Why not hard code?
▣ Want to iterate fast!
▣ Want flexibility
▣ Want to reuse our code!
Auto-Differentiation: Idea
▣ Use functions that have easy-to-compute
derivatives
▣ Compose these functions to create more
complex super-model
▣ Use the chain rule to get partial derivatives of
the model
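These three bullets can be sketched as a tiny reverse-mode autodiff for scalars. This is a toy illustration of the idea, not how TensorFlow or any other library implements it; the `Value` class and its methods are hypothetical names.

```python
import math

class Value:
    """A scalar that remembers how it was computed (hypothetical helper)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        # Each parent is paired with the local derivative d(self)/d(parent).
        self.parents = parents

    def __add__(self, other):
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def sigmoid(self):
        s = 1.0 / (1.0 + math.exp(-self.data))
        return Value(s, ((self, s * (1.0 - s)),))

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every upstream node."""
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):          # outputs first, inputs last
            for parent, local in node.parents:
                parent.grad += local * node.grad   # chain rule

w, x, b = Value(0.5), Value(2.0), Value(-1.0)
out = (w * x + b).sigmoid()
out.backward()
```

Each operation only knows its own easy derivative; composing them and walking the graph backward yields the partial derivatives of the whole expression with no hand-derived formula.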
What makes a “good” function?
▣ Obvious stuff: differentiable (continuously
and smoothly!)
▣ Simple operations: add, subtract, multiply
▣ Reuse previous computation
Nice functions: sigmoid
Nice functions: hyperbolic tangent
Nice functions: Rectified linear unit
Nice functions: Addition
Nice functions: Multiplication
Good news:
Most of these use activation
values! Can store in cache!
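The formulas for the "nice functions" are missing from this transcript, so here is a sketch using their standard derivatives, each written in terms of the function's own output. The function names and the sample input are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

# Each derivative takes the cached forward output `a`, not the input z.
def sigmoid_grad_from_output(a):
    return a * (1.0 - a)

def tanh_grad_from_output(a):
    return 1.0 - a * a

def relu_grad_from_output(a):
    return 1.0 if a > 0.0 else 0.0   # convention: slope 0 at the kink

z = 0.7
cache = {"sigmoid": sigmoid(z), "tanh": math.tanh(z), "relu": relu(z)}
grads = {
    "sigmoid": sigmoid_grad_from_output(cache["sigmoid"]),
    "tanh": tanh_grad_from_output(cache["tanh"]),
    "relu": relu_grad_from_output(cache["relu"]),
}
```

Addition and multiplication fit the same pattern: the derivative of a sum with respect to either input is 1, and the derivative of a product with respect to one input is the other input, which is already available from the forward pass.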
[Figure: network diagram]
Store activation values for backprop
[Figure: network diagram]
Chain rule takes care of the rest
It’s Over!
Any questions?
Email: sam@memdump.io
GitHub: samjabrahams
Twitter: @sabraha
Presentation template by SlidesCarnival
Neural Network terms
▣ Neuron: a unit that transforms its input via an activation function and passes the result to other neurons and/or the final output
▣ Activation function: a transformation function, typically non-linear (e.g. sigmoid, ReLU); its output for layer l is the activation a(l)
▣ Bias unit: a trainable scalar shift, typically applied to each non-output layer (think of the y-intercept term in a linear function)
▣ Layer: a grouping of “neurons” and biases that (in general) take in values from the same previous neurons and pass values forward to the same targets
▣ Hidden layer: a layer that is neither the input layer nor the output layer
▣ Input layer:
▣ Output layer
Terminology used
▣ Learning rate
▣ Parameters
▣ Training step
▣ Training example
▣ Epoch vs training time
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
 

Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark and TensorFlow Meetup - 08-04-2016