1. CS771: Intro to ML
Gradient descent algorithm
• Gradient descent is an optimization algorithm used to
minimize a function.
• The function to be minimized is called the objective
function.
• In machine learning, the objective function is also termed the cost
function or loss function.
• A common loss function is the mean squared difference between actual
values and predictions.
2. Gradient descent algorithm
• Gradient descent is an optimization algorithm used to
minimize some function by iteratively moving in the
direction of steepest descent.
• In machine learning, we use gradient descent to
update the parameters of our model. In linear regression,
for example, the parameters are the coefficients.
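As a minimal sketch of this idea, the following fits a linear regression model by gradient descent on a mean-squared-error loss. The toy data, learning rate, and iteration count are illustrative assumptions, not values from the slides.

```python
import numpy as np

# Toy data: y = 2x + 1 plus a little noise (illustrative, not from the slides)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=50)
y = 2.0 * X + 1.0 + 0.1 * rng.normal(size=50)

w, b = 0.0, 0.0   # the model's parameters (coefficients)
lr = 0.1          # learning rate

for _ in range(500):
    y_hat = w * X + b
    # Gradients of the mean squared error with respect to w and b
    dw = 2.0 * np.mean((y_hat - y) * X)
    db = 2.0 * np.mean(y_hat - y)
    # Step in the direction of steepest descent (the negative gradient)
    w -= lr * dw
    b -= lr * db

print(w, b)  # should end up close to the true values 2.0 and 1.0
```

Each iteration nudges the coefficients opposite to the gradient of the loss, which is exactly the "iteratively moving in the direction of steepest descent" described above.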
5. Learning rate
• The size of these steps is called the learning rate.
• With a high learning rate we cover more ground each step, but we
risk overshooting the lowest point, since the slope of the hill is
constantly changing.
• A low learning rate is more precise, but progress is slow and the
gradient must be recomputed at every step, so it can take a very long
time to reach the bottom.
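This trade-off can be seen on a toy objective f(w) = w², whose gradient is 2w. The three step sizes below are illustrative choices, not values from the slides.

```python
# Minimize f(w) = w**2 (gradient 2w), starting from w = 1.0.
def descend(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2.0 * w   # one gradient-descent step
    return w

print(descend(0.01))  # low rate: still far from the minimum at 0 after 20 steps
print(descend(0.4))   # moderate rate: converges to (almost) 0 quickly
print(descend(1.1))   # high rate: each step overshoots and |w| grows (diverges)
```

With a too-large step the update overshoots the minimum and lands on the other side, farther away than where it started, so the iterates diverge instead of converging.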
10. Local & Global Minima, Maxima

[Figure: a function 𝑓(𝑥) plotted against 𝑥, showing the global maximum, several local maxima, several local minima, and the global minimum.]
11. The tangent is perfectly horizontal at the local minima and maxima.
13. Derivatives

How the derivative itself changes tells us about the function's optima.
The second derivative 𝑓′′(𝑥) can provide this information.

First-derivative test, at a point 𝑥 where 𝑓′(𝑥) = 0:
• 𝑓′ > 0 just before 𝑥 and 𝑓′ < 0 just after 𝑥: 𝑥 is a maximum
• 𝑓′ < 0 just before 𝑥 and 𝑓′ > 0 just after 𝑥: 𝑥 is a minimum
• 𝑓′ = 0 just before and just after 𝑥: 𝑥 may be a saddle

Second-derivative test:
• 𝑓′(𝑥) = 0 and 𝑓′′(𝑥) < 0: 𝑥 is a maximum
• 𝑓′(𝑥) = 0 and 𝑓′′(𝑥) > 0: 𝑥 is a minimum
• 𝑓′(𝑥) = 0 and 𝑓′′(𝑥) = 0: 𝑥 may be a saddle; may need higher derivatives
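The second-derivative test can be checked numerically. The function f(x) = x⁴ − 2x² used here is an illustrative example (not from the slides) with stationary points at x = −1, 0, and 1, and its second derivative is estimated by a central difference.

```python
# Classify the stationary points of f(x) = x**4 - 2*x**2 (at x = -1, 0, 1)
# using a numerical estimate of the second derivative.
def f(x):
    return x**4 - 2 * x**2

def d2f(x, h=1e-4):
    # Central-difference approximation of f''(x)
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

for x in (-1.0, 0.0, 1.0):
    kind = "minimum" if d2f(x) > 0 else "maximum" if d2f(x) < 0 else "inconclusive"
    print(x, kind)   # f'' = 12x**2 - 4: minima at x = -1 and 1, maximum at x = 0
```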
15. Saddle Points
• Points where the derivative is zero but which are neither minima nor maxima.
• The second or a higher derivative may help identify whether a stationary point is a saddle.
• A saddle is a point of inflection where the derivative is also zero.
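A standard illustrative example (not from the slides) is f(x) = x³, which has a saddle at x = 0: the derivative is zero there, the second derivative is zero as well, and the first derivative does not change sign on either side.

```python
# f(x) = x**3: stationary at x = 0, but neither a minimum nor a maximum.
def fprime(x):
    return 3 * x**2      # derivative of x**3

def fsecond(x):
    return 6 * x         # second derivative of x**3

print(fprime(0.0))                           # 0.0: a stationary point
print(fsecond(0.0))                          # 0.0: second-derivative test inconclusive
print(fprime(-0.1) > 0, fprime(0.1) > 0)     # derivative positive on both sides: a saddle
```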
16. Gradient Descent: An Illustration
[Figure: the loss 𝐿(𝒘) plotted against 𝒘, with gradient descent iterates 𝒘(0), 𝒘(1), 𝒘(2), 𝒘(3) reaching the optimum 𝒘∗ in one case and getting stuck at a local minimum in the other.]

• Where the gradient is negative (𝛿𝐿/𝛿𝑤 < 0), we move in the positive direction.
• Where the gradient is positive, we move in the negative direction.
• The learning rate is very important.
• Good initialization is very important.
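The effect of initialization can be sketched on a toy loss with two minima; the function f(w) = (w² − 1)² + 0.3w and the step size below are illustrative assumptions, not taken from the slides. It has a global minimum near w ≈ −1 and a shallower local minimum near w ≈ +1.

```python
# Gradient descent on f(w) = (w**2 - 1)**2 + 0.3*w from two starting points.
def grad(w):
    return 4 * w * (w**2 - 1) + 0.3   # derivative of the toy loss

def run(w, lr=0.01, steps=2000):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

print(run(-0.5))  # converges near the global minimum (w close to -1)
print(run(+0.5))  # gets stuck near the local minimum (w close to +1)
```

Both runs use the same learning rate; only the starting point differs, which is why a good initialization matters when the loss has several minima.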