The document discusses key concepts in neural network training: weights, stochastic gradient descent (SGD), learning rate, mini-batch size, and epochs. Weights in a neural network are adjusted during training to minimize loss, typically using stochastic gradient descent. The learning rate determines how large each weight update is. Mini-batch size is the number of samples used to compute each weight update. Epochs count how many times the full training data is passed through the network. The optimal values for these hyperparameters depend on factors such as the dataset and model architecture.
3. The Model: The Network and Its Weights
[Figure: neural-network diagram with individual weights labeled w1,1–w1,4 and w6,1–w6,4]
Real-life neural networks can have hundreds of hidden layers and thousands of neurons per layer, or more. This means millions or billions of weights!
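Weight counts like these can be checked with quick arithmetic: every pair of adjacent fully connected layers contributes (inputs × outputs) weights, plus one bias per output unit. A minimal sketch (the layer sizes are illustrative, not from the slides):

```python
# Hypothetical layer sizes for illustration: a 784-unit input,
# two 512-unit hidden layers, and a 10-unit output.
layer_sizes = [784, 512, 512, 10]

# Each adjacent pair of layers is fully connected, so it contributes
# (units_in * units_out) weights plus one bias per output unit.
total = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    total += n_in * n_out + n_out
print(total)  # already over half a million weights for this small network
```

Even this modest four-layer network has hundreds of thousands of weights, which is why modern networks easily reach millions or billions.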
4. How do you learn the weights?
[Diagram: during Training, a DATA set of example happy and sad images is fed to the AI training algorithm, which produces the MODEL (the weights of the neural network); during Prediction, the model takes a new image and outputs the ANSWER: "Happy".]
• Weights of a neural network are learnt during training
5. How do you learn the weights?
• Weights of a neural network are learnt during training
• Weights are learnt using a mechanism called stochastic
gradient descent (SGD)
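SGD itself is a short loop: predict on a sample, take the gradient of the loss with respect to the weight, and nudge the weight against that gradient. A minimal sketch on a toy one-weight model (all names and numbers here are illustrative, not from the slides):

```python
import random

# Toy problem: learn a single weight w so that w * x fits y = 3 * x.
data = [(x, 3.0 * x) for x in range(1, 11)]
w = 0.0                 # initial weight
learning_rate = 0.01

for epoch in range(100):
    random.shuffle(data)            # "stochastic": random sample order
    for x, y in data:
        pred = w * x
        grad = 2 * (pred - y) * x   # d/dw of the squared error (w*x - y)**2
        w -= learning_rate * grad   # step against the gradient
print(w)
```

After training, w has converged to (approximately) 3, the value that minimizes the squared error on this data.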
6. Parameters of a neural network
• Learning rate: the rate at which you make changes to your model weights
• For example: if the initial weight = 1, should the change to the weight be 0.1, 0.01, or 0.001, resulting in 1.1, 1.01, or 1.001?
• Mini-batch size: the number of samples used to make one change to the weights.
• For example: I will look at 10 sample images to decide how I should change my model weights.
• Epochs: the number of times you go through the entire data set to make changes to the weights.
• For example: if epochs = 5, I have looked at all samples 5 times and changed the weights 5 * (samples / mini-batch size) times.
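The update-count formula in the epochs bullet can be worked through with concrete numbers (the sample count here is illustrative):

```python
# Worked example of updates = epochs * (samples / mini-batch size),
# with illustrative numbers: 1,000 samples, batch size 10, 5 epochs.
samples = 1000
mini_batch_size = 10
epochs = 5

updates_per_epoch = samples // mini_batch_size   # 1000 / 10 = 100
total_updates = epochs * updates_per_epoch       # 5 * 100 = 500
print(updates_per_epoch, total_updates)
```

So with these numbers, the weights are changed 500 times even though each sample is seen only 5 times.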
9. Let us look at these parameters a bit more
• Learning rate: the rate at which you make changes to your model weights
• For example: if the initial weight = 1, should the change to the weight be 0.1, 0.01, or 0.001, resulting in 1.1, 1.01, or 1.001?
• What happens if it is too high or too low?
• Too high: updates can overshoot the minimum, oscillate, or diverge
• Too low: training is very slow and may stall before reaching good weights
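The effect of the learning rate is easiest to see on a toy one-dimensional loss. A minimal sketch, assuming the illustrative loss f(w) = w**2 (not from the slides), whose gradient is 2*w and whose minimum is at w = 0:

```python
# Gradient descent on f(w) = w**2 with different learning rates.
def descend(learning_rate, steps=20, w=1.0):
    for _ in range(steps):
        w -= learning_rate * 2 * w   # step against the gradient of w**2
    return w

print(descend(0.1))     # reasonable: w shrinks toward the minimum at 0
print(descend(0.0001))  # too low: w has barely moved after 20 steps
print(descend(1.1))     # too high: |w| grows every step, so it diverges
```

Each step multiplies w by (1 - 2 * learning_rate), so a rate above 1 flips the sign and grows the magnitude: that is divergence in miniature.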
11. Let us look at these parameters a bit more
● Mini-batch size: the number of samples used to make one change to the weights.
○ For example: I will look at 10 sample images to decide how I should change my model weights.
• What happens if it is too high or too low?
• Too high: not much benefit, takes a lot longer, and costs more
• Too low: you may not get good results
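Mechanically, a mini-batch update averages the per-sample gradients before changing the weights, so larger batches give a smoother gradient estimate at a higher per-step cost. A sketch on a toy one-weight model (the data and model are illustrative):

```python
# Toy data: y = 3 * x, fit with a single weight w.
data = [(float(x), 3.0 * x) for x in range(1, 21)]
w = 0.0
learning_rate = 0.001
mini_batch_size = 10

for start in range(0, len(data), mini_batch_size):
    batch = data[start:start + mini_batch_size]
    # Average the per-sample gradients of (w*x - y)**2 over the batch,
    # then apply ONE weight update for the whole batch.
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    w -= learning_rate * grad
print(w)  # after one pass, w has moved from 0 toward 3
```

With a batch size of 10, one pass over these 20 samples makes only 2 weight updates; batch size 1 would make 20 noisier ones.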
12. Let us look at these parameters a bit more
● Epochs: the number of times you go through the entire data set to make changes to the weights.
○ For example: if epochs = 5, I have looked at all samples 5 times and changed the weights 5 * (samples / mini-batch size) times.
• What happens if it is too high or too low?
• Too high: not much benefit, takes a lot longer, and costs more
• Too low: you may not get good results
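Putting the three hyperparameters together: in a typical training loop the epoch loop is outermost, the data is split into mini-batches inside it, and the learning rate scales each weight update. A skeleton on a toy one-weight model (everything here is illustrative, not from the slides):

```python
import random

def train(data, epochs, mini_batch_size, learning_rate, w=0.0):
    updates = 0
    for _ in range(epochs):                  # epochs: passes over all data
        random.shuffle(data)                 # new sample order each epoch
        for start in range(0, len(data), mini_batch_size):
            batch = data[start:start + mini_batch_size]
            # Average gradient of (w*x - y)**2 over the mini-batch.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * grad        # learning rate scales the step
            updates += 1
    return w, updates

# Toy data: y = 3 * x, 10 samples.
data = [(float(x), 3.0 * x) for x in range(1, 11)]
w, updates = train(data, epochs=5, mini_batch_size=2, learning_rate=0.005)
print(updates)   # 5 * (10 / 2) = 25 updates, matching the slide's formula
```

The returned update count is exactly epochs * (samples / mini-batch size), the formula from the epochs slide.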