SlideShare a Scribd company logo
1 of 112
Machine Learning Essentials
Part 2: Artificial Neural Networks
Lior King
Lior.King@gmail.com
1
Previously on “Machine Learning Essentials...”
2
Linear regression
Finding the relation between the age
and the salary.
Predicting the salary for any given age
3
Historical
Data points
Experience
Salary
Historical
Data points
Salary (dependent)
Minimize the error
The Error (or Residual) is the offset of
the dependent variable from the
independent variable.
The goal of any regression is to minimize
the error for the training data and to
FIND THE OPTIMAL LINE (or curve in
case of logistic regression).
4
Error
Experience (independent)
Historical
Data points
Salary (dependent)
Minimize the error – sum of square diffs
The error = 𝑖=1
𝑁
(𝑦𝑖 − 𝑦𝑖)2
5
y
Error
𝒚
Experience
Minimize the error with Stochastic Gradient
Descent (SGD)
Error =
1
𝑁 𝑖=1
𝑁
(𝑦𝑖 − 𝑦𝑖)2
N -> number of historical data points
1. Initialize some value for the slope
and intercept.
2. Find the current value of the error
function.
6
Error
Slope
Intercept
3. Find the slope at the current point (partial derivative) and move slightly
downwards in the direction.
4. Repeat until you reach a minimum OR stop after certain number of iterations
Historical
Data points
Salary (dependent)
Experience
Minimize the error
The iterative SGD process will slowly
change the slope and the intercept until
the error is minimal.
7
Multiple Linear Regression
• Simple linear regression:
𝑌 = 𝑏0 + 𝑏1*𝑥1
• Multiple linear regression:
𝑌 = 𝑏0 + 𝑏1*𝑥1 + 𝑏2*𝑥2 + … + 𝑏 𝑛∗𝑥 𝑛
Important note:
You need to exclude variables that will “mess” the prediction and keep the ones
that actually help predicting the desired result.
8
Polynomial Linear Regression
9
Simple linear regression:
𝑌 = 𝑏0 + 𝑏1*𝑥1
Polynomial linear regression:
𝑌 = 𝑏0 + 𝑏1*𝑥1 + 𝑏2∗𝑥1
𝟐
+ … + 𝑏 𝑛∗𝑥1
𝒏
Quadratic: degree = 2
Cubic: degree = 3
10
Artificial Neural Networks - ANN
“Traditional” ML vs. “Representation” ML
• “Traditional” ML based systems rely on experts to decide what features to pay
attention to.
• “Representation” ML based systems figure out by themselves what features to pay
attention to.
• The most common representation ML algorithm is called Artificial Neural Network
• ANN are commonly used for:
• Image/video/audio processing
• Speech recognition
• Natural language processing (NLP)
• Games
11
The neuron
12
Neuron Axon
Dendrites
Synapse
Artificial Neural Networks - ANN
• Inspired by the neurons in the human mind.
• Can learn and organize data and thus create an understanding of relationships.
13
Artificial Neuron
14
Neuron
Input Signal 1 (X1)
Input Signal 2 (X2)
Input Signal n (Xn)
Output Signal
⁞
Independent variables
Dependent variable
Can be:
• Continuous (price)
• Binary (Yes/No)
• Categorical
The neuron behaves like a function
W1
W2
Wn
The neural network flow
In neural networks, the activation functions are non-linear.
15
Activation functions
16
MNIST Example
• NIST = US National Institute of Standards and
Technologies
• MNIST – a subset of NIST’s handwritten digit
data set
• Consists of a training set of 60,000 samples and
a test set of 10,000 samples.
• 28x28 pixels grayscale images and digit labels
for each image.
• http://Yann.lecun.com/exdb/mnist
17
MNIST
18
MNIST example – starting with simple ANN
19
W(783, 9)
W(0, 0)
W(783, 0)
784 Pixels…
…
0 1 2 9
0 1 2 3 4 5 6 7 8 783
7840 weights
W(0, 9)
28x28
Pixels
10 Nodes
20
𝑊0,9…𝑊0,3𝑊0,2𝑊0,1𝑊0,0
𝑊1,9…𝑊1,3𝑊1,2𝑊1,1𝑊1,0
𝑊2,9…𝑊2,3𝑊2,2𝑊2,1𝑊2,0
𝑊3,9…𝑊3,3𝑊3,2𝑊3,1𝑊3,0
𝑊4,9…𝑊4,3𝑊4,2𝑊4,1𝑊4,0
𝑊5,9…𝑊5,3𝑊5,2𝑊5,1𝑊5,0
𝑊6,9…𝑊6,3𝑊6,2𝑊6,1𝑊6,0
𝑊7,9…𝑊7,3𝑊7,2𝑊7,1𝑊7,0
………………
𝑊783,9…𝑊783,3𝑊783,2𝑊783,1𝑊783,0
x x x x x x
x
x
x
10 columns (for 10 digits)
783rows(foreverypixel)
𝑏9…𝑏3𝑏2𝑏1𝑏0
…
𝑋𝑖 𝑊𝑖,0 + b0
…
𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0784 pixels
𝑋0 𝑋1 𝑋2 𝑋3 𝑋4 𝑋5 𝑋6 𝑋7 𝑋783
Softmax
0.20.40.10.30.10.80.20.60.10.2
9876543210
Softmax Softmax Softmax Softmax
+++++
Inference
function
0 1 2 3 9
Softmax function
Activation
function
Wrong !
Biases (1 bias per digit)
Using “softmax” activation function
• In this example we will use “softmax” activation function:
• Good for classification problems.
• Increases the differences so the output gets closer to 1 or closer to 0
21
Loss/error measurement function
22
9876543210
0100000000
9876543210
0.20.40.10.30.10.80.20.60.10.2
“one hot” actual probabilities
Computed probabilities
Cross entropy error measurement function: - 𝐴𝑖log(𝑌𝑖)
A
Y
Minimize the error with Gradient Descent
Optimization Function
Error =
1
𝑁 𝑖=1
𝑁
𝑒𝑖
2
N -> number of historical datapoints
1. Initialize some value for the slope
and intercept.
2. Find the current value of the error
function.
23
Error
Slope
Intercept
3. Find the slope at the current point (partial derivative) and move slightly
downwards in the direction.
4. Repeat until you reach a minimum OR stop after certain number of iterations
Training the neural network
• How can we know what should be the weights and biases?
• Through training the network
• The code will figure out the correct values BY ITSELF
• How does the training work?
1. Starting with zero weights and bias, we multiply the input values by the weights and add the bias
2. We get an incorrect output
But we know what the correct output should be.
1. The system measures the difference between the incorrect output and the correct output. This is
call “loss measurement function”.
• The loss measurement function calculates how big the error is.
2. Now the system will change the weights and biases to minimize the error. This is called
“optimization function” and goes back to step 3 until it cannot reduce the error anymore.
24
Back propagation - adjusting the weights
Get Input
Values
Multiply input values by the
weights and add biases
Run activation
function and get
predictions
Calculate the
distance from the
Correct results
Apply optimization on the
weights to reduce the error
25
Back propagation - adjusting the weights
Get Input
Values
Multiply input values by the
weights and add biases
Run activation
function and get
predictions
Calculate the
distance from the
Correct results
Apply optimization on the
weights to reduce the error
26
9876543210
0100000000
9876543210
0.20.40.10.30.10.80.20.60.10.2
Back propagation - adjusting the weights
Get Input
Values
Multiply input values by the
weights and add biases
Run activation
function and get
predictions
Calculate the
distance from the
Correct results
Apply optimization on the
weights to reduce the error
27
9876543210
0100000000
9876543210
0.20.50.10.30.10.70.20.40.10.1
Back propagation - adjusting the weights
Get Input
Values
Multiply input values by the
weights and add biases
Run activation
function and get
predictions
Calculate the
distance from the
Correct results
Apply optimization on the
weights to reduce the error
28
9876543210
0100000000
9876543210
0.10.60.10.20.10.60.20.30.10.1
Back propagation - adjusting the weights
Get Input
Values
Multiply input values by the
weights and add biases
Run activation
function and get
predictions
Calculate the
distance from the
Correct results
Apply optimization on the
weights to reduce the error
29
9876543210
0100000000
9876543210
0.10.70.10.20.10.40.10.20.10.1
Back propagation - adjusting the weights
Get Input
Values
Multiply input values by the
weights and add biases
Run activation
function and get
predictions
Calculate the
distance from the
Correct results
Apply optimization on the
weights to reduce the error
30
9876543210
0100000000
9876543210
0.10.90.10.100.10.10.10.10.1
Correct!
31
𝑊0,9…𝑊0,3𝑊0,2𝑊0,1𝑊0,0
𝑊1,9…𝑊1,3𝑊1,2𝑊1,1𝑊1,0
𝑊2,9…𝑊2,3𝑊2,2𝑊2,1𝑊2,0
𝑊3,9…𝑊3,3𝑊3,2𝑊3,1𝑊3,0
𝑊4,9…𝑊4,3𝑊4,2𝑊4,1𝑊4,0
𝑊5,9…𝑊5,3𝑊5,2𝑊5,1𝑊5,0
𝑊6,9…𝑊6,3𝑊6,2𝑊6,1𝑊6,0
𝑊7,9…𝑊7,3𝑊7,2𝑊7,1𝑊7,0
………………
𝑊783,9…𝑊783,3𝑊783,2𝑊783,1𝑊783,0
x x x x x x
x
x
x
10 columns (for 10 digits)
783rows(foreverypixel)
𝑏9…𝑏3𝑏2𝑏1𝑏0
…
𝑋𝑖 𝑊𝑖,0 + b0
…
𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0784 pixels
𝑋0 𝑋1 𝑋2 𝑋3 𝑋4 𝑋5 𝑋6 𝑋7 𝑋783
Softmax
0.20.90.10.30.10.20.10.20.10.2
9876543210
Softmax Softmax Softmax Softmax
+++++
Inference
function
0 1 2 3 9
Softmax function
Activation
function
Correct!
TensorFlow
32
Tensor
• An n-dimensional array or list used to represent data
• Defined by the 3 properties:
• Rank: Scalar (number), Vector (1-dim array), Matrix (2-dim array), Cube, etc.
• Shape
• Type
33
TypeShapeRankExample
Int32[]0 (scalar)1
Int32[5]1 (vector)[1, 5, 3, 6, 2]
Int32[2, 5]2 (matrix)[[1, 5, 3, 8, 4], [3, 2, 6, 4, 7] ]
Int32[3, 2, 3]3 (cube)[ [ [1, 6, 3], [2, 4, 3] ]
[ [2, 6, 2], [3, 7, 4] ]
[ [1, 9, 2], [4, 8, 3] ] ]
What is TensorFlow
• The most popular Python library for building ensemble algorithms – mainly NN.
• Initially developed by Google and today it is open sourced
• Provides a library of predefined versions of many common ML algorithms, but also
enables to flexibly create your own algorithm.
• Can harness the GPUs
• Scalable – using “execution master” you can run on a laptop as well as on a large
scale cluster in remote servers.
34
Tensor Features and Tools
• Name property - used to identify elements in the graph
• Name Scope property – used for grouping elements (like “conv1” for 1st conv layer)
• Summary class – has methods for writing summaries to log files. Can capture how
elements change over time.
• TensorBoard – A web server that uses the log files to visualize the computation
graph and training progress. Can be used from remote desktops.
• Common add-ons (for easier developement):
• TFLearn - Simplifies the use of TensorFlow only and can converse with TF data types.
• Keras – Simplification which supports multiple frameworks (including Microsoft CNTK).
35
Training neural networks with TensorFlow
With TensorFlow you need define the following:
1. The input data:
• “Placeholders” – The input training data.
• “Variables” – What we ask TF to compute through training. With neural network these are
weights and biases.
2. The inference function (which is applied on the weights and biases).
3. Loss/error measurement function (example: “Cross Entropy”)
4. Optimization function to minimize loss (example: “Gradient Descent”)
36
TensorFlow - MNIST demo
37
ImplementationConcept
MNIST dataPrepared Data
Sum(X* weight) + bias -> ActivationInference
Cross EntropyLoss Measurement
Gradient descent optimizerOptimize to minimize loss
TensorFlow DEMO
38
Why Convolutional Neural Networks (CNN)
• Problem – Flattening the images caused us to lose the shape information.
• When we see a digit, we recognize the lines and curves.
• We need to “zoom out” slowly from the picture.
39
Hidden layers
40
Deep neural networks
41
Deep Learning
• Use of multi layered neural network is called Deep Learning
• Some applications:
• Natural language processing (NLP)
• Face recognition
• Image analysis (what’s in the picture)
• Image search
• Voice analysis
• Video analysis
42
Convolutional Neural Network(CNN):
X’s and O’s
Says whether a picture is of an X or an O
X or OCNN
A two-dimensional
array of pixels
For example
CNN X
CNN O
Trickier cases
CNN X
CNN O
Deciding is hard
?
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 1 -1 -1 -1
-1 -1 1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 1 -1 -1
-1 -1 -1 1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Computers are literal
x
ConvNets match pieces of the image
=
=
=
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
Features match pieces of the image
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
Filtering: The math behind the match
1. Line up the feature and the image patch.
2. Multiply each image pixel by the corresponding feature pixel.
3. Add them up.
4. Divide by the total number of pixels in the feature.
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 1
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 1
1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 1
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 1
1 1 1
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 1
1 1 1
1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 1
1 1 1
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1
1 1 1
1 1 1
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 -1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1 1 -1
1 1 1
-1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering: The math behind the match
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
1 1 -1
1 1 1
-1 1 1
Filtering: The math behind the match
55
1 1 -1
1 1 1
-1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Convolution: Trying every possible match
=
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
1 -1 -1
-1 1 -1
-1 -1 1
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
=
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
=
=
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Convolution layer
• One image becomes a stack of filtered images
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
1 -1 -1
-1 1 -1
-1 -1 1
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Convolution layer
• One image becomes a stack of filtered images
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Pooling: Shrinking the image stack
1. Pick a window size (usually 2 or 3).
2. Pick a stride (usually 2). A stride = step.
3. Walk your window across your filtered images.
4. From each window, take the maximum value.
1.00
Pooling
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
1.00 0.33
Pooling
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
1.00 0.33 0.55
Pooling
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
1.00 0.33 0.55 0.33
Pooling
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
1.00 0.33 0.55 0.33
0.33
Pooling
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
Pooling
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
0.33 0.55 1.00 0.77
0.55 0.55 1.00 0.33
1.00 1.00 0.11 0.55
0.77 0.33 0.55 0.33
0.55 0.33 0.55 0.33
0.33 1.00 0.55 0.11
0.55 0.55 0.55 0.11
0.33 0.11 0.11 0.33
Pooling layer
• A stack of images becomes a stack of smaller images.
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
0.33 0.55 1.00 0.77
0.55 0.55 1.00 0.33
1.00 1.00 0.11 0.55
0.77 0.33 0.55 0.33
0.55 0.33 0.55 0.33
0.33 1.00 0.55 0.11
0.55 0.55 0.55 0.11
0.33 0.11 0.11 0.33
Normalization
• Keep the math from breaking by tweaking each of the values just a bit.
• Change everything negative to zero.
Rectified Linear Units (ReLUs)
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.77
0.77 0
Rectified Linear Units (ReLUs)
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.77 0 0.11 0.33 0.55 0 0.33
Rectified Linear Units (ReLUs)
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.77 0 0.11 0.33 0.55 0 0.33
0 1.00 0 0.33 0 0.11 0
0.11 0 1.00 0 0.11 0 0.55
0.33 0.33 0 0.55 0 0.33 0.33
0.55 0 0.11 0 1.00 0 0.11
0 0.11 0 0.33 0 1.00 0
0.33 0 0.55 0.33 0.11 0 0.77
Rectified Linear Units (ReLUs)
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
ReLU layer
• A stack of images becomes a stack of images with no negative values.
0.77 0 0.11 0.33 0.55 0 0.33
0 1.00 0 0.33 0 0.11 0
0.11 0 1.00 0 0.11 0 0.55
0.33 0.33 0 0.55 0 0.33 0.33
0.55 0 0.11 0 1.00 0 0.11
0 0.11 0 0.33 0 1.00 0
0.33 0 0.55 0.33 0.11 0 0.77
0.33 0 0.11 0 0.11 0 0.33
0 0.55 0 0.33 0 0.55 0
0.11 0 0.55 0 0.55 0 0.11
0 0.33 0 1.00 0 0.33 0
0.11 0 0.55 0 0.55 0 0.11
0 0.55 0 0.33 0 0.55 0
0.33 0 0.11 0 0.11 0 0.33
0.33 0 0.55 0.33 0.11 0 0.77
0 0.11 0 0.33 0 1.00 0
0.55 0 0.11 0 1.00 0 0.11
0.33 0.33 0 0.55 0 0.33 0.33
0.11 0 1.00 0 0.11 0 0.55
0 1.00 0 0.33 0 0.11 0
0.77 0 0.11 0.33 0.55 0 0.33
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
Layers get stacked
• The output of one becomes the input of the next.
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
0.33 0.55 1.00 0.77
0.55 0.55 1.00 0.33
1.00 1.00 0.11 0.55
0.77 0.33 0.55 0.33
0.55 0.33 0.55 0.33
0.33 1.00 0.55 0.11
0.55 0.55 0.55 0.11
0.33 0.11 0.11 0.33
Deep stacking
• Layers can be repeated several (or many) times.
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
1.00 0.55
0.55 1.00
0.55 1.00
1.00 0.55
1.00 0.55
0.55 0.55
Fully connected layer
• Every value gets a vote
1.00 0.55
0.55 1.00
0.55 1.00
1.00 0.55
1.00 0.55
0.55 0.55
1.00
0.55
0.55
1.00
1.00
0.55
0.55
0.55
0.55
1.00
1.00
0.55
Fully connected layer
• Vote depends on how strongly a value predicts X or O
X
O
1.00
0.55
0.55
1.00
1.00
0.55
0.55
0.55
0.55
1.00
1.00
0.55
Fully connected layer
• Vote depends on how strongly a value predicts X or O
X
O
0.55
1.00
1.00
0.55
0.55
0.55
0.55
0.55
1.00
0.55
0.55
1.00
Fully connected layer
• Future values vote on X or O
X
O
0.9
0.65
0.45
0.87
0.96
0.73
0.23
0.63
0.44
0.89
0.94
0.53
Fully connected layer
• Future values vote on X or O
X
O
0.9
0.65
0.45
0.87
0.96
0.73
0.23
0.63
0.44
0.89
0.94
0.53
Fully connected layer
• Future values vote on X or O
X
O
0.9
0.65
0.45
0.87
0.96
0.73
0.23
0.63
0.44
0.89
0.94
0.53
Fully connected layer
• Future values vote on X or O
X
O
0.9
0.65
0.45
0.87
0.96
0.73
0.23
0.63
0.44
0.89
0.94
0.53
Fully connected layer
• Future values vote on X or O
X
O
0.9
0.65
0.45
0.87
0.96
0.73
0.23
0.63
0.44
0.89
0.94
0.53
Fully connected layer
• Future values vote on X or O
X
O
0.9
0.65
0.45
0.87
0.96
0.73
0.23
0.63
0.44
0.89
0.94
0.53
Fully connected layer
• A list of feature values becomes a list of votes.
X
O
0.9
0.65
0.45
0.87
0.96
0.73
0.23
0.63
0.44
0.89
0.94
0.53
Putting it all together
• A set of pixels becomes a set of votes.
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
X
O
Layer 1 Layer 2 Layer 3 Layer 4 Layer 5
Gradient descent
• For each feature pixel and voting
weight, adjust it up and down a
bit and see how the error
changes.
weighterror
Gradient descent
• For each feature pixel and voting
weight, adjust it up and down a
bit and see how the error
changes.
weighterror
Tuning the CNN
• Architecture
• How many of each type of layer?
• In what order?
• Convolution
• Number of features
• Size of features
• Pooling
• Window size
• Window stride
• Fully Connected
• Number of neurons
CNN - Not just for images
Things closer together are more closely related than things far away:
• 2D Images.
• 3D Images.
• Audio
• Video
• Signal processing
• NLP – semantic parsing, sentence modelling and more.
• Drug discovery - Chemical interactions,
MNIST demo using CNN
108
Machine Learning in the near future
There is a lot of research around ML in the academia and in commercial companies
and a lot of money is invested there….
• ML will be used adopted in much greater scales across almost every industry.
• ML will be embedded everywhere
• Specialized hardware for ML will enable deeper and faster learning
• Machine Learning as a Service (MLaaS) market will grow substantially.
• ML will save more lives.
• ML will automate more repetitive tasks.
109
Why should developers/data
engineers/DBAs invest time in ML?
• Data is the fuel of every ML system – comes from the data platforms DBAs
manage.
• The data preparation before the training is the most time consuming part.
• The DBAs can definitely assist here.
• ML – not just for data scientists (up to a certain level)
• Developers already use ML
• Data engineers use ML.
• ML can be used by DBAs too – why not?
• ML will become more and more easy to use:
• Azure ML
• AWS ML
110
111
The future of AI
?
112
Thank you !
Lior.King@gmail.com

More Related Content

What's hot

What's hot (20)

Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
 
Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...
Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...
Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...
 
Machine Learning Lecture 2 Basics
Machine Learning Lecture 2 BasicsMachine Learning Lecture 2 Basics
Machine Learning Lecture 2 Basics
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network  Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network
 
Generative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsGenerative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variants
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
 
Deep learning simplified
Deep learning simplifiedDeep learning simplified
Deep learning simplified
 
GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...
GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...
GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...
 
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
 
Basics of Machine Learning
Basics of Machine LearningBasics of Machine Learning
Basics of Machine Learning
 
Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
Face recognition and deep learning  โดย ดร. สรรพฤทธิ์ มฤคทัต NECTECFace recognition and deep learning  โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
 
Machine learning the next revolution or just another hype
Machine learning   the next revolution or just another hypeMachine learning   the next revolution or just another hype
Machine learning the next revolution or just another hype
 
Deeplearning in finance
Deeplearning in financeDeeplearning in finance
Deeplearning in finance
 
Deep learning and image analytics using Python by Dr Sanparit
Deep learning and image analytics using Python by Dr SanparitDeep learning and image analytics using Python by Dr Sanparit
Deep learning and image analytics using Python by Dr Sanparit
 
Machine Learning Overview
Machine Learning OverviewMachine Learning Overview
Machine Learning Overview
 
Algorithms Design Patterns
Algorithms Design PatternsAlgorithms Design Patterns
Algorithms Design Patterns
 
Activation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkActivation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural network
 
backpropagation in neural networks
backpropagation in neural networksbackpropagation in neural networks
backpropagation in neural networks
 

Similar to Machine Learning Essentials Demystified part2 | Big Data Demystified

Separating Hype from Reality in Deep Learning with Sameer Farooqui
 Separating Hype from Reality in Deep Learning with Sameer Farooqui Separating Hype from Reality in Deep Learning with Sameer Farooqui
Separating Hype from Reality in Deep Learning with Sameer Farooqui
Databricks
 
Machine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester ElectiveMachine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester Elective
MayuraD1
 
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Simplilearn
 
Auto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdf
Auto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdfAuto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdf
Auto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdf
Kundjanasith Thonglek
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
Uwe Friedrichsen
 

Similar to Machine Learning Essentials Demystified part2 | Big Data Demystified (20)

Deep learning from scratch
Deep learning from scratch Deep learning from scratch
Deep learning from scratch
 
Separating Hype from Reality in Deep Learning with Sameer Farooqui
 Separating Hype from Reality in Deep Learning with Sameer Farooqui Separating Hype from Reality in Deep Learning with Sameer Farooqui
Separating Hype from Reality in Deep Learning with Sameer Farooqui
 
An Introduction to Deep Learning
An Introduction to Deep LearningAn Introduction to Deep Learning
An Introduction to Deep Learning
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
EssentialsOfMachineLearning.pdf
EssentialsOfMachineLearning.pdfEssentialsOfMachineLearning.pdf
EssentialsOfMachineLearning.pdf
 
Machine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester ElectiveMachine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester Elective
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
 
Machine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis IntroductionMachine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis Introduction
 
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ...
 
Nimrita deep learning
Nimrita deep learningNimrita deep learning
Nimrita deep learning
 
Enhancing the performance of kmeans algorithm
Enhancing the performance of kmeans algorithmEnhancing the performance of kmeans algorithm
Enhancing the performance of kmeans algorithm
 
Development of Deep Learning Architecture
Development of Deep Learning ArchitectureDevelopment of Deep Learning Architecture
Development of Deep Learning Architecture
 
Machine Learning from a Software Engineer's perspective
Machine Learning from a Software Engineer's perspectiveMachine Learning from a Software Engineer's perspective
Machine Learning from a Software Engineer's perspective
 
Machine learning from a software engineer's perspective - Marijn van Zelst - ...
Machine learning from a software engineer's perspective - Marijn van Zelst - ...Machine learning from a software engineer's perspective - Marijn van Zelst - ...
Machine learning from a software engineer's perspective - Marijn van Zelst - ...
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
Hands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousryHands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousry
 
Neural-Networks.ppt
Neural-Networks.pptNeural-Networks.ppt
Neural-Networks.ppt
 
NVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflow
NVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflowNVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflow
NVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflow
 
Auto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdf
Auto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdfAuto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdf
Auto-Scaling Apache Spark cluster using Deep Reinforcement Learning.pdf
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
 

More from Omid Vahdaty

Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Omid Vahdaty
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
Omid Vahdaty
 

More from Omid Vahdaty (20)

Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
 
Couchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedCouchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data Demystified
 
The technology of fake news between a new front and a new frontier | Big Dat...
The technology of fake news  between a new front and a new frontier | Big Dat...The technology of fake news  between a new front and a new frontier | Big Dat...
The technology of fake news between a new front and a new frontier | Big Dat...
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 
Making your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data DemystifiedMaking your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data Demystified
 
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
 
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
 
Aerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedAerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data Demystified
 
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
 
AWS Big Data Demystified #4 data governance demystified [security, networ...
AWS Big Data Demystified #4   data governance demystified   [security, networ...AWS Big Data Demystified #4   data governance demystified   [security, networ...
AWS Big Data Demystified #4 data governance demystified [security, networ...
 
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
 
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
Emr spark tuning demystified
Emr spark tuning demystifiedEmr spark tuning demystified
Emr spark tuning demystified
 
Emr zeppelin & Livy demystified
Emr zeppelin & Livy demystifiedEmr zeppelin & Livy demystified
Emr zeppelin & Livy demystified
 
Zeppelin and spark sql demystified
Zeppelin and spark sql demystifiedZeppelin and spark sql demystified
Zeppelin and spark sql demystified
 
Introduction to AWS Big Data
Introduction to AWS Big Data Introduction to AWS Big Data
Introduction to AWS Big Data
 

Recently uploaded

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
MayuraD1
 
Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptx
pritamlangde
 

Recently uploaded (20)

Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech Civil
 
💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...
💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...
💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdf
 
Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptx
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 

Machine Learning Essentials Demystified part2 | Big Data Demystified

  • 1. Machine Learning Essentials Part 2: Artificial Neural Networks Lior King Lior.King@gmail.com 1
  • 2. Previously on “Machine Learning Essentials...” 2
  • 3. Linear regression Finding the relation between the age and the salary. Predicting the salary for any given age 3 Historical Data points Experience Salary
  • 4. Historical Data points Salary (dependent) Minimize the error The Error (or Residual) is the offset of the dependent variable from the independent variable. The goal of any regression is to minimize the error for the training data and to FIND THE OPTIMAL LINE (or curve in case of logistic regression). 4 Error Experience (independent)
  • 5. Historical Data points Salary (dependent) Minimize the error – sum of square diffs The error = 𝑖=1 𝑁 (𝑦𝑖 − 𝑦𝑖)2 5 y Error 𝒚 Experience
  • 6. Minimize the error with Stochastic Gradient Descent (SGD) Error = 1 𝑁 𝑖=1 𝑁 (𝑦𝑖 − 𝑦𝑖)2 N -> number of historical data points 1. Initialize some value for the slope and intercept. 2. Find the current value of the error function. 6 Error Slope Intercept 3. Find the slope at the current point (partial derivative) and move slightly downwards in the direction. 4. Repeat until you reach a minimum OR stop after certain number of iterations
  • 7. Historical Data points Salary (dependent) Experience Minimize the error The iterative SGD process will slowly change the slope and the intercept until the error is minimal. 7
  • 8. Multiple Linear Regression • Simple linear regression: 𝑌 = 𝑏0 + 𝑏1*𝑥1 • Multiple linear regression: 𝑌 = 𝑏0 + 𝑏1*𝑥1 + 𝑏2*𝑥2 + … + 𝑏 𝑛∗𝑥 𝑛 Important note: You need to exclude variables that will “mess” the prediction and keep the ones that actually help predicting the desired result. 8
  • 9. Polynomial Linear Regression 9 Simple linear regression: 𝑌 = 𝑏0 + 𝑏1*𝑥1 Polynomial linear regression: 𝑌 = 𝑏0 + 𝑏1*𝑥1 + 𝑏2∗𝑥1 𝟐 + … + 𝑏 𝑛∗𝑥1 𝒏 Quadratic: degree = 2 Cubic: degree = 3
  • 11. “Traditional” ML vs. “Representation” ML • “Traditional” ML based systems rely on experts to decide what features to pay attention to. • “Representation” ML based systems figure out by themselves what features to pay attention to. • The most common representation ML algorithm is called Artificial Neural Network • ANN are commonly used for: • Image/video/audio processing • Speech recognition • Natural language processing (NLP) • Games 11
  • 13. Artificial Neural Networks - ANN • Inspired by the neurons in the human mind. • Can learn and organize data and thus create an understanding of relationships. 13
  • 14. Artificial Neuron 14 Neuron Input Signal 1 (X1) Input Signal 2 (X2) Input Signal n (Xn) Output Signal ⁞ Independent variables Dependent variable Can be: • Continuous (price) • Binary (Yes/No) • Categorical The neuron behaves like a function W1 W2 Wn
  • 15. The neural network flow In neural networks, the activation functions are non-linear. 15
  • 17. MNIST Example • NIST = US National Institute of Standards and Technologies • MNIST – a subset of NIST’s handwritten digit data set • Consists of a training set of 60,000 samples and a test set of 10,000 samples. • 28x28 pixels grayscale images and digit labels for each image. • http://Yann.lecun.com/exdb/mnist 17
  • 19. MNIST example – starting with simple ANN 19 W(783, 9) W(0, 0) W(783, 0) 784 Pixels… … 0 1 2 9 0 1 2 3 4 5 6 7 8 783 7840 weights W(0, 9) 28x28 Pixels 10 Nodes
  • 20. 20 𝑊0,9…𝑊0,3𝑊0,2𝑊0,1𝑊0,0 𝑊1,9…𝑊1,3𝑊1,2𝑊1,1𝑊1,0 𝑊2,9…𝑊2,3𝑊2,2𝑊2,1𝑊2,0 𝑊3,9…𝑊3,3𝑊3,2𝑊3,1𝑊3,0 𝑊4,9…𝑊4,3𝑊4,2𝑊4,1𝑊4,0 𝑊5,9…𝑊5,3𝑊5,2𝑊5,1𝑊5,0 𝑊6,9…𝑊6,3𝑊6,2𝑊6,1𝑊6,0 𝑊7,9…𝑊7,3𝑊7,2𝑊7,1𝑊7,0 ……………… 𝑊783,9…𝑊783,3𝑊783,2𝑊783,1𝑊783,0 x x x x x x x x x 10 columns (for 10 digits) 783rows(foreverypixel) 𝑏9…𝑏3𝑏2𝑏1𝑏0 … 𝑋𝑖 𝑊𝑖,0 + b0 … 𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0784 pixels 𝑋0 𝑋1 𝑋2 𝑋3 𝑋4 𝑋5 𝑋6 𝑋7 𝑋783 Softmax 0.20.40.10.30.10.80.20.60.10.2 9876543210 Softmax Softmax Softmax Softmax +++++ Inference function 0 1 2 3 9 Softmax function Activation function Wrong ! Biases (1 bias per digit)
  • 21. Using “softmax” activation function • In this example we will use “softmax” activation function: • Good for classification problems. • Increases the differences so the output gets closer to 1 or closer to 0 21
  • 22. Loss/error measurement function 22 9876543210 0100000000 9876543210 0.20.40.10.30.10.80.20.60.10.2 “one hot” actual probabilities Computed probabilities Cross entropy error measurement function: - 𝐴𝑖log(𝑌𝑖) A Y
  • 23. Minimize the error with Gradient Descent Optimization Function Error = 1 𝑁 𝑖=1 𝑁 𝑒𝑖 2 N -> number of historical datapoints 1. Initialize some value for the slope and intercept. 2. Find the current value of the error function. 23 Error Slope Intercept 3. Find the slope at the current point (partial derivative) and move slightly downwards in the direction. 4. Repeat until you reach a minimum OR stop after certain number of iterations
  • 24. Training the neural network • How can we know what should be the weights and biases? • Through training the network • The code will figure out the correct values BY ITSELF • How does the training work? 1. Starting with zero weights and bias, we multiply the input values by the weights and add the bias 2. We get an incorrect output But we know what the correct output should be. 1. The system measures the difference between the incorrect output and the correct output. This is call “loss measurement function”. • The loss measurement function calculates how big the error is. 2. Now the system will change the weights and biases to minimize the error. This is called “optimization function” and goes back to step 3 until it cannot reduce the error anymore. 24
  • 25. Back propagation - adjusting the weights Get Input Values Multiply input values by the weights and add biases Run activation function and get predictions Calculate the distance from the Correct results Apply optimization on the weights to reduce the error 25
  • 26. Back propagation - adjusting the weights Get Input Values Multiply input values by the weights and add biases Run activation function and get predictions Calculate the distance from the Correct results Apply optimization on the weights to reduce the error 26 9876543210 0100000000 9876543210 0.20.40.10.30.10.80.20.60.10.2
  • 27. Back propagation - adjusting the weights Get Input Values Multiply input values by the weights and add biases Run activation function and get predictions Calculate the distance from the Correct results Apply optimization on the weights to reduce the error 27 9876543210 0100000000 9876543210 0.20.50.10.30.10.70.20.40.10.1
  • 28. Back propagation - adjusting the weights Get Input Values Multiply input values by the weights and add biases Run activation function and get predictions Calculate the distance from the Correct results Apply optimization on the weights to reduce the error 28 9876543210 0100000000 9876543210 0.10.60.10.20.10.60.20.30.10.1
  • 29. Back propagation - adjusting the weights Get Input Values Multiply input values by the weights and add biases Run activation function and get predictions Calculate the distance from the Correct results Apply optimization on the weights to reduce the error 29 9876543210 0100000000 9876543210 0.10.70.10.20.10.40.10.20.10.1
  • 30. Back propagation - adjusting the weights Get Input Values Multiply input values by the weights and add biases Run activation function and get predictions Calculate the distance from the Correct results Apply optimization on the weights to reduce the error 30 9876543210 0100000000 9876543210 0.10.90.10.100.10.10.10.10.1 Correct!
  • 31. 31 𝑊0,9…𝑊0,3𝑊0,2𝑊0,1𝑊0,0 𝑊1,9…𝑊1,3𝑊1,2𝑊1,1𝑊1,0 𝑊2,9…𝑊2,3𝑊2,2𝑊2,1𝑊2,0 𝑊3,9…𝑊3,3𝑊3,2𝑊3,1𝑊3,0 𝑊4,9…𝑊4,3𝑊4,2𝑊4,1𝑊4,0 𝑊5,9…𝑊5,3𝑊5,2𝑊5,1𝑊5,0 𝑊6,9…𝑊6,3𝑊6,2𝑊6,1𝑊6,0 𝑊7,9…𝑊7,3𝑊7,2𝑊7,1𝑊7,0 ……………… 𝑊783,9…𝑊783,3𝑊783,2𝑊783,1𝑊783,0 x x x x x x x x x 10 columns (for 10 digits) 783rows(foreverypixel) 𝑏9…𝑏3𝑏2𝑏1𝑏0 … 𝑋𝑖 𝑊𝑖,0 + b0 … 𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0𝑋𝑖 𝑊𝑖,0 + b0784 pixels 𝑋0 𝑋1 𝑋2 𝑋3 𝑋4 𝑋5 𝑋6 𝑋7 𝑋783 Softmax 0.20.90.10.30.10.20.10.20.10.2 9876543210 Softmax Softmax Softmax Softmax +++++ Inference function 0 1 2 3 9 Softmax function Activation function Correct!
  • 33. Tensor • An n-dimensional array or list used to represent data • Defined by the 3 properties: • Rank: Scalar (number), Vector (1-dim array), Matrix (2-dim array), Cube, etc. • Shape • Type 33 TypeShapeRankExample Int32[]0 (scalar)1 Int32[5]1 (vector)[1, 5, 3, 6, 2] Int32[2, 5]2 (matrix)[[1, 5, 3, 8, 4], [3, 2, 6, 4, 7] ] Int32[3, 2, 3]3 (cube)[ [ [1, 6, 3], [2, 4, 3] ] [ [2, 6, 2], [3, 7, 4] ] [ [1, 9, 2], [4, 8, 3] ] ]
  • 34. What is TensorFlow • The most popular Python library for building ensemble algorithms – mainly NN. • Initially developed by Google and today it is open sourced • Provides a library of predefined versions of many common ML algorithms, but also enables to flexibly create your own algorithm. • Can harness the GPUs • Scalable – using “execution master” you can run on a laptop as well as on a large scale cluster in remote servers. 34
  • 35. Tensor Features and Tools • Name property - used to identify elements in the graph • Name Scope property – used for grouping elements (like “conv1” for 1st conv layer) • Summary class – has methods for writing summaries to log files. Can capture how elements change over time. • TensorBoard – A web server that uses the log files to visualize the computation graph and training progress. Can be used from remote desktops. • Common add-ons (for easier developement): • TFLearn - Simplifies the use of TensorFlow only and can converse with TF data types. • Keras – Simplification which supports multiple frameworks (including Microsoft CNTK). 35
  • 36. Training neural networks with TensorFlow With TensorFlow you need define the following: 1. The input data: • “Placeholders” – The input training data. • “Variables” – What we ask TF to compute through training. With neural network these are weights and biases. 2. The inference function (which is applied on the weights and biases). 3. Loss/error measurement function (example: “Cross Entropy”) 4. Optimization function to minimize loss (example: “Gradient Descent”) 36
  • 37. TensorFlow - MNIST demo 37 ImplementationConcept MNIST dataPrepared Data Sum(X* weight) + bias -> ActivationInference Cross EntropyLoss Measurement Gradient descent optimizerOptimize to minimize loss
  • 39. Why Convolutional Neural Networks (CNN) • Problem – Flattening the images caused us to lose the shape information. • When we see a digit, we recognize the lines and curves. • We need to “zoom out” slowly from the picture. 39
  • 42. Deep Learning • Use of multi layered neural network is called Deep Learning • Some applications: • Natural language processing (NLP) • Face recognition • Image analysis (what’s in the picture) • Image search • Voice analysis • Video analysis 42
  • 43. Convolutional Neural Network(CNN): X’s and O’s Says whether a picture is of an X or an O X or OCNN A two-dimensional array of pixels
  • 47. -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Computers are literal x
  • 48. ConvNets match pieces of the image = = =
  • 49. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1 Features match pieces of the image
  • 50. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1
  • 51. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1
  • 52. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1
  • 53. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1
  • 54. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1
  • 55. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 56. Filtering: The math behind the match 1. Line up the feature and the image patch. 2. Multiply each image pixel by the corresponding feature pixel. 3. Add them up. 4. Divide by the total number of pixels in the feature.
  • 57. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 58. 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 59. 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 60. 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 61. 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 62. 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 63. 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 64. 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 65. 1 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 66. 1 1 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 67. 1 1 1 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 68. 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 69. 1 1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 70. 1 1 -1 1 1 1 -1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Filtering: The math behind the match
  • 71. 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 -1 1 1 1 -1 1 1 Filtering: The math behind the match 55 1 1 -1 1 1 1 -1 1 1
  • 72. 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Convolution: Trying every possible match = 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 73. 1 -1 -1 -1 1 -1 -1 -1 1 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 = 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 = = -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
  • 74. Convolution layer • One image becomes a stack of filtered images 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 1 -1 -1 -1 1 -1 -1 -1 1 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
  • 75. Convolution layer • One image becomes a stack of filtered images 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
  • 76. Pooling: Shrinking the image stack 1. Pick a window size (usually 2 or 3). 2. Pick a stride (usually 2). A stride = step. 3. Walk your window across your filtered images. 4. From each window, take the maximum value.
  • 77. 1.00 Pooling 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 78. 1.00 0.33 Pooling 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 79. 1.00 0.33 0.55 Pooling 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 80. 1.00 0.33 0.55 0.33 Pooling 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 81. 1.00 0.33 0.55 0.33 0.33 Pooling 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 82. 1.00 0.33 0.55 0.33 0.33 1.00 0.33 0.55 0.55 0.33 1.00 0.11 0.33 0.55 0.11 0.77 Pooling 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 83. 1.00 0.33 0.55 0.33 0.33 1.00 0.33 0.55 0.55 0.33 1.00 0.11 0.33 0.55 0.11 0.77 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 0.33 0.55 1.00 0.77 0.55 0.55 1.00 0.33 1.00 1.00 0.11 0.55 0.77 0.33 0.55 0.33 0.55 0.33 0.55 0.33 0.33 1.00 0.55 0.11 0.55 0.55 0.55 0.11 0.33 0.11 0.11 0.33
  • 84. Pooling layer • A stack of images becomes a stack of smaller images. 1.00 0.33 0.55 0.33 0.33 1.00 0.33 0.55 0.55 0.33 1.00 0.11 0.33 0.55 0.11 0.77 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 0.33 0.55 1.00 0.77 0.55 0.55 1.00 0.33 1.00 1.00 0.11 0.55 0.77 0.33 0.55 0.33 0.55 0.33 0.55 0.33 0.33 1.00 0.55 0.11 0.55 0.55 0.55 0.11 0.33 0.11 0.11 0.33
  • 85. Normalization • Keep the math from breaking by tweaking each of the values just a bit. • Change everything negative to zero.
  • 86. Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.77
  • 87. 0.77 0 Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 88. 0.77 0 0.11 0.33 0.55 0 0.33 Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 89. 0.77 0 0.11 0.33 0.55 0 0.33 0 1.00 0 0.33 0 0.11 0 0.11 0 1.00 0 0.11 0 0.55 0.33 0.33 0 0.55 0 0.33 0.33 0.55 0 0.11 0 1.00 0 0.11 0 0.11 0 0.33 0 1.00 0 0.33 0 0.55 0.33 0.11 0 0.77 Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
  • 90. ReLU layer • A stack of images becomes a stack of images with no negative values. 0.77 0 0.11 0.33 0.55 0 0.33 0 1.00 0 0.33 0 0.11 0 0.11 0 1.00 0 0.11 0 0.55 0.33 0.33 0 0.55 0 0.33 0.33 0.55 0 0.11 0 1.00 0 0.11 0 0.11 0 0.33 0 1.00 0 0.33 0 0.55 0.33 0.11 0 0.77 0.33 0 0.11 0 0.11 0 0.33 0 0.55 0 0.33 0 0.55 0 0.11 0 0.55 0 0.55 0 0.11 0 0.33 0 1.00 0 0.33 0 0.11 0 0.55 0 0.55 0 0.11 0 0.55 0 0.33 0 0.55 0 0.33 0 0.11 0 0.11 0 0.33 0.33 0 0.55 0.33 0.11 0 0.77 0 0.11 0 0.33 0 1.00 0 0.55 0 0.11 0 1.00 0 0.11 0.33 0.33 0 0.55 0 0.33 0.33 0.11 0 1.00 0 0.11 0 0.55 0 1.00 0 0.33 0 0.11 0 0.77 0 0.11 0.33 0.55 0 0.33 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
  • 91. Layers get stacked • The output of one becomes the input of the next. -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1.00 0.33 0.55 0.33 0.33 1.00 0.33 0.55 0.55 0.33 1.00 0.11 0.33 0.55 0.11 0.77 0.33 0.55 1.00 0.77 0.55 0.55 1.00 0.33 1.00 1.00 0.11 0.55 0.77 0.33 0.55 0.33 0.55 0.33 0.55 0.33 0.33 1.00 0.55 0.11 0.55 0.55 0.55 0.11 0.33 0.11 0.11 0.33
  • 92. Deep stacking • Layers can be repeated several (or many) times. -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1.00 0.55 0.55 1.00 0.55 1.00 1.00 0.55 1.00 0.55 0.55 0.55
  • 93. Fully connected layer • Every value gets a vote 1.00 0.55 0.55 1.00 0.55 1.00 1.00 0.55 1.00 0.55 0.55 0.55 1.00 0.55 0.55 1.00 1.00 0.55 0.55 0.55 0.55 1.00 1.00 0.55
  • 94. Fully connected layer • Vote depends on how strongly a value predicts X or O X O 1.00 0.55 0.55 1.00 1.00 0.55 0.55 0.55 0.55 1.00 1.00 0.55
  • 95. Fully connected layer • Vote depends on how strongly a value predicts X or O X O 0.55 1.00 1.00 0.55 0.55 0.55 0.55 0.55 1.00 0.55 0.55 1.00
  • 96. Fully connected layer • Future values vote on X or O X O 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
  • 97. Fully connected layer • Future values vote on X or O X O 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
  • 98. Fully connected layer • Future values vote on X or O X O 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
  • 99. Fully connected layer • Future values vote on X or O X O 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
  • 100. Fully connected layer • Future values vote on X or O X O 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
  • 101. Fully connected layer • Future values vote on X or O X O 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
  • 102. Fully connected layer • A list of feature values becomes a list of votes. X O 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
  • 103. Putting it all together • A set of pixels becomes a set of votes. -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 X O Layer 1 Layer 2 Layer 3 Layer 4 Layer 5
  • 104. Gradient descent • For each feature pixel and voting weight, adjust it up and down a bit and see how the error changes. weighterror
  • 105. Gradient descent • For each feature pixel and voting weight, adjust it up and down a bit and see how the error changes. weighterror
  • 106. Tuning the CNN • Architecture • How many of each type of layer? • In what order? • Convolution • Number of features • Size of features • Pooling • Window size • Window stride • Fully Connected • Number of neurons
  • 107. CNN - Not just for images Things closer together are more closely related than things far away: • 2D Images. • 3D Images. • Audio • Video • Signal processing • NLP – semantic parsing, sentence modelling and more. • Drug discovery - Chemical interactions,
  • 108. MNIST demo using CNN 108
  • 109. Machine Learning in the near future There is a lot of research around ML in the academia and in commercial companies and a lot of money is invested there…. • ML will be used adopted in much greater scales across almost every industry. • ML will be embedded everywhere • Specialized hardware for ML will enable deeper and faster learning • Machine Learning as a Service (MLaaS) market will grow substantially. • ML will save more lives. • ML will automate more repetitive tasks. 109
  • 110. Why should developers/data engineers/DBAs invest time in ML? • Data is the fuel of every ML system – comes from the data platforms DBAs manage. • The data preparation before the training is the most time consuming part. • The DBAs can definitely assist here. • ML – not just for data scientists (up to a certain level) • Developers already use ML • Data engineers use ML. • ML can be used by DBAs too – why not? • ML will become more and more easy to use: • Azure ML • AWS ML 110