This document contains an icebreaker game to help people get to know each other. It includes 7 questions with scrambled letters that must be unscrambled in 20 seconds. The questions cover topics in artificial intelligence and machine learning: neural networks, machine learning, optimization, linear models, artificial intelligence, logistic regression, and functions.
7. Rules:
- The game includes 1 small example and 7 questions
- Based on the hints given (pictures and scrambled letters), you will
have 20 seconds to come up with the correct word
- The first person to give a correct answer gets 1 point.
24. Basic idea of Univariate LR:
Given some data points (x, y).
We find the red line y = wx + b that best describes / fits the data!
Index  x   y
0      1   2
1      2   4
2      3   6
3      4   8
4      5   10
5      6   12
6      7   14
7      8   16
8      9   18
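For this table the best-fit line can be computed directly; here is a small sketch using the closed-form least-squares formulas (the variable names are our own):

```python
# Fitting y = w*x + b to the table above with closed-form least squares.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2, 4, 6, 8, 10, 12, 14, 16, 18]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Slope: covariance(x, y) / variance(x)
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
b = mean_y - w * mean_x
print(w, b)  # the data lies exactly on y = 2x, so w = 2.0, b = 0.0
```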
25. What does it mean by “best fit”?
We define a function of w and b, called the loss function L, for
instance the sum of squared errors.
The best-fit line is the line that minimizes the loss function!
It’s just an optimization problem
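One standard choice of loss function, matching the SSE named in the model-building steps, is the sum of squared errors over the data points:

```latex
L(w, b) = \sum_{i} \bigl( y_i - (w x_i + b) \bigr)^2
```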
36. Let’s backtrack
There are 4 steps to build a machine learning model
- Step 1: Collect the relevant data (the data points)
- Step 2: Choose a suitable model (linear regression for example)
- Step 3: Choose a loss function with respect to the data (such as SSE)
- Step 4: Minimize the loss function with respect to the parameters using an
optimization algorithm (Gradient Descent)
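The 4 steps can be sketched end to end; this is a minimal illustration assuming SSE loss and plain gradient descent on univariate linear regression (all names and numbers are our own):

```python
# Step 1: collect the data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

# Step 2: choose a model, y_hat = w*x + b
w, b = 0.0, 0.0

# Step 3: choose a loss, here sum of squared errors
def sse(w, b):
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))

# Step 4: minimize the loss with gradient descent
lr = 0.01
for _ in range(2000):
    grad_w = sum(-2 * x * (y - (w * x + b)) for x, y in zip(xs, ys))
    grad_b = sum(-2 * (y - (w * x + b)) for x, y in zip(xs, ys))
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))  # approaches w = 2, b = 0
```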
40. Sigmoid function, an activation function
Yet another function to learn, huh?
Sometimes we want the output of the function to be:
- Strictly increasing and analytic
- Bounded in (0, 1), so it can be read as a probability
The sigmoid function is a perfect fit for both of these
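The sigmoid described above is short enough to write down directly; a quick sketch:

```python
import math

def sigmoid(z):
    # Strictly increasing, smooth, and bounded in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))    # exactly 0.5
print(sigmoid(10))   # close to 1
print(sigmoid(-10))  # close to 0
```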
41. Logistic regression idea
In logistic regression, we apply the sigmoid function on top of linear regression in
order to squeeze the output range into (0, 1)
Basically, that’s sigmoid(linear regression)
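In code, "sigmoid(linear regression)" is literally one line; the weights below are made-up illustration values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression(x, w, b):
    # Sigmoid applied on top of a linear model: output lands in (0, 1).
    return sigmoid(w * x + b)

p = logistic_regression(2.0, w=1.5, b=-1.0)
print(p)  # a probability strictly between 0 and 1
```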
42. What is wrong with sum of squares?
The choice of loss function
With sum of squares, the logistic loss is non-convex, and gradient descent may
only reach a local minimum on non-convex functions.
Since the labels in a logistic problem can only be binary, we can be a little bit
smarter
43. Binary cross entropy
Formula of binary cross entropy
In short, by using this loss function, the optimization problem is now convex.
Note: the multiclass variant of this is cross entropy
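A sketch of binary cross entropy as code, averaging over N examples with 0/1 labels ys and predicted probabilities ps (names are our own):

```python
import math

def binary_cross_entropy(ys, ps):
    # L = -(1/N) * sum( y*log(p) + (1-y)*log(1-p) )
    n = len(ys)
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(ys, ps)
    ) / n

# Confident, correct predictions give a small loss...
low = binary_cross_entropy([1, 0], [0.99, 0.01])
# ...confident, wrong predictions give a large loss.
high = binary_cross_entropy([1, 0], [0.01, 0.99])
print(low, high)
```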
45. Optimization?
Sadly, there is no closed-form solution for Logistic Regression. Here
is the gradient for the univariate case; good luck solving those equations by hand :)
Thus the only practical choice here is Gradient Descent.
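Gradient descent on the univariate case looks like this; a sketch assuming binary cross entropy, where the gradients conveniently reduce to (y_hat - y) terms (data and learning rate are our own toy choices):

```python
import math

# Toy separable data: negative x -> label 0, positive x -> label 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, lr = 0.0, 0.0, 0.1
for _ in range(1000):
    preds = [sigmoid(w * x + b) for x in xs]
    # BCE gradients for sigmoid(w*x + b) simplify to (y_hat - y) terms.
    grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    grad_b = sum((p - y) for p, y in zip(preds, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

print(sigmoid(w * -2.0 + b), sigmoid(w * 2.0 + b))  # near 0 and near 1
```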
47. Insight of Logistic Regression
Motivation for neural networks
Logistic regression can be used to infer whether a feature is active or not
according to the given features. By stacking up multiple logistic regressions, we
get the fundamental idea of a neural network
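The stacking idea can be made concrete with a tiny two-layer network in which every unit is itself a logistic regression. The hand-picked weights below solve XOR, something a single logistic regression cannot do (the weights are our own illustrative values, not learned):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Each neuron is a logistic regression over its inputs.
    return sigmoid(sum(w * i for w, i in zip(weights, inputs)) + bias)

def xor_net(x1, x2):
    h1 = neuron([x1, x2], [20, 20], -10)    # approximates OR
    h2 = neuron([x1, x2], [-20, -20], 30)   # approximates NAND
    return neuron([h1, h2], [20, 20], -30)  # approximates AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(xor_net(a, b)))  # XOR truth table: 0, 1, 1, 0
```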
52. “Later Perceptrons will be able to recognize people
and call out their names and instantly translate
speech in one language to speech or writing in
another language, it was predicted.”
NY Times ━ "New Navy Device Learns By Doing" 7/7/1958
64. “Data scientists call the layer-by-layer process of
matrix multiplication followed by non-linear
activation functions, transforming the feature
space.”
Alando Ballantyne ━ Minsky's "And / Or" Theorem: A Single Perceptron Limitations.
65. Let’s recap!
Neural Network in a nutshell
Step 1: Determine the network structure (the number of layers, which activation
functions to choose).
Step 2: Load the data as input to the Neural Network
Repeat {
Step 3: Forward the data through the network, calculate the Loss function.
Step 4: Backpropagation to find the weight gradients → update the weights.
} Until convergence.
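The recap above can be sketched as a minimal runnable loop: one hidden sigmoid unit, one output sigmoid unit, forward pass plus hand-written backpropagation on a toy 1D task (every name and number here is our own illustration, not a recipe from the slides):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Step 1: network structure -- 1 input -> 1 hidden unit -> 1 output, sigmoid everywhere.
w1, b1, w2, b2 = 0.5, 0.0, 0.5, 0.0

# Step 2: load the data, (x, label) pairs.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

lr = 0.5
for _ in range(3000):  # Repeat { ... } until (approximate) convergence.
    for x, y in data:
        # Step 3: forward pass (loss would be binary cross entropy).
        h = sigmoid(w1 * x + b1)
        y_hat = sigmoid(w2 * h + b2)
        # Step 4: backpropagation (chain rule), then weight update.
        d_out = y_hat - y                 # gradient at the output pre-activation
        d_hid = d_out * w2 * h * (1 - h)  # pushed back through the hidden unit
        w2 -= lr * d_out * h
        b2 -= lr * d_out
        w1 -= lr * d_hid * x
        b1 -= lr * d_hid

preds = [sigmoid(w2 * sigmoid(w1 * x + b1) + b2) for x, _ in data]
print(preds)
```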
68. “A Neural Network can approximate any
well-behaved function 𝟋 by using the same
construction for the first layer and approximating the
identity function with later layers.”
Universal Approximation Theorem