Deep Learning | Machine Learning
Optimization Technique
Rakshith
Table of Contents
• Basic mathematics
• Introduction to simple Linear Regression
• Why optimization
• Calculation of gradient descent
• Variations in gradient descent
• Miscellaneous topics
1. Batch normalization
2. Memoization
3. Weight initialization
Basic Mathematics
Gradient: An incline, or slope
Tangent: A straight line or plane that touches a curve or curved surface at a point, but if
extended does not cross it at that point.
Basics of Gradient Descent
Trigonometric values:
Angle (degrees)   Angle (radians)   Tan
0                 0                 0
30                0.5236            0.5774
45                0.7854            1
60                1.0472            1.7321
90                1.5708            undefined
Sin = opp/hyp
Cos = adj/hyp
Tan = opp/adj
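As a quick check of the tangent values, here is a minimal Python sketch. It assumes the angles are given in degrees and converts them with math.radians, because math.tan expects its argument in radians. The helper name tan_of_degrees is hypothetical.

```python
import math

def tan_of_degrees(angle_deg):
    """Return the tangent of an angle given in degrees."""
    return math.tan(math.radians(angle_deg))

for angle in (0, 30, 45, 60):
    print(angle, round(tan_of_degrees(angle), 4))

# tan(90 degrees) is undefined; in floating point it shows up as a huge number.
print(90, tan_of_degrees(90))
```

Passing the degree value straight to math.tan would silently treat it as radians and give unrelated numbers.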
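Basics of Gradient Descent
• Tan (the slope of the tangent) becomes 0 where the slope of the curve is zero
• A slope of zero is not always a global minimum; it can also be a local minimum, a maximum, or a saddle point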
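A small Python sketch can make the second point concrete. The function f(x) = x^3 - 3x is a hypothetical example chosen here, not one from the slides: its slope is zero at x = -1 and x = 1, yet neither point is a global minimum.

```python
def f(x):
    """Example curve (hypothetical): f(x) = x^3 - 3x."""
    return x ** 3 - 3 * x

def slope(x):
    """Derivative of f: f'(x) = 3x^2 - 3."""
    return 3 * x ** 2 - 3

# The slope is zero at both points, but x = -1 is a local maximum
# and x = 1 is only a local minimum.
for x in (-1.0, 1.0):
    print(x, f(x), slope(x))

# Further to the left the curve goes lower than the local minimum at x = 1.
print(f(-3.0), f(1.0))
```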
Introduction to simple linear regression:
X      Y
-1     -1
1      2
2      3
4      3
6      5
7      8
19     20     (column totals)
Assumption of linear regression: the relationship between X and Y is linear
[Figure: scatter plot of Y against X, titled "Simple Linear Regression"]
Introduction to simple linear regression:
Equation of a straight line: y = Ax + B
X is given; the slope A and the intercept B need to be found (n is the number of data points).
A = ((∑x)(∑y) - n(∑xy)) / ((∑x)^2 - n∑x^2)
B = ((∑x)(∑xy) - (∑y)(∑x^2)) / ((∑x)^2 - n∑x^2)
X      Y      X sqr   Y sqr   XY
-1     -1     1       1       1
1      2      1       4       2
2      3      4       9       6
4      3      16      9       12
6      5      36      25      30
7      8      49      64      56
19     20     107     112     107     (column totals)
A = 0.932384
B = 0.380783
SSE = 4.62
Why optimization
y = 0.932384 * X + 0.380783, SSE = 4.62
y = 0.9226 * X + 0.3567, SSE = 4.64
y = 0.9157 * X + 0.2777, SSE = 4.78
Every choice of slope and intercept gives a different error. The best-fit line is the one that minimizes the error, and finding it is an optimization problem.
Gradient descent is one optimization technique for finding the minimum error.
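The closed-form coefficients for the six data points, and the SSE of each candidate line, can be checked with a short Python sketch. It uses the standard least-squares formulas for the slope A and the intercept B; the variable and function names are choices made here, not taken from the slides.

```python
xs = [-1, 1, 2, 4, 6, 7]
ys = [-1, 2, 3, 3, 5, 8]
n = len(xs)

sum_x = sum(xs)
sum_y = sum(ys)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Least-squares slope (A) and intercept (B) for y = A*x + B.
denominator = sum_x ** 2 - n * sum_xx
A = (sum_x * sum_y - n * sum_xy) / denominator
B = (sum_x * sum_xy - sum_y * sum_xx) / denominator

def sse(slope, intercept):
    """Sum of squared errors of the line y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

print(round(A, 6), round(B, 6))
for slope, intercept in [(A, B), (0.9226, 0.3567), (0.9157, 0.2777)]:
    print(slope, intercept, round(sse(slope, intercept), 2))
```

Because the least-squares line minimizes the SSE by construction, no other slope and intercept can give a smaller value on the same data.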
Calculation of gradient descent
Let's consider how to calculate gradient descent.
Step 1: To fit a line Ypred = a + bX, start with random values of a and b and calculate the prediction error (SSE).
Step 2: Calculate the error gradient w.r.t the weights
∂SSE/∂a
∂SSE/∂b
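The two partial derivatives can be written out once a form for the error is fixed. The sketch below assumes SSE = ½∑(Y - Ypred)² with Ypred = a + bX; the factor ½ is an assumption made here for convenience and is not stated on the slide.

```latex
\begin{aligned}
\mathrm{SSE} &= \tfrac{1}{2}\sum_i \bigl(Y_i - (a + bX_i)\bigr)^2 \\
\frac{\partial\,\mathrm{SSE}}{\partial a} &= -\sum_i \bigl(Y_i - (a + bX_i)\bigr) \\
\frac{\partial\,\mathrm{SSE}}{\partial b} &= -\sum_i \bigl(Y_i - (a + bX_i)\bigr)\,X_i
\end{aligned}
```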
Step 3: Adjust the weights with the gradients to reach the optimal values where SSE is minimized
Update rules:
1. New a = a - r * ∂SSE/∂a = 0.45 - 0.01 * 3.300 = 0.42
2. New b = b - r * ∂SSE/∂b = 0.75 - 0.01 * 1.545 = 0.73
Here r is the learning rate (0.01), which sets the pace of adjustment to the weights.
Step 4: Use the new a and b for prediction and calculate the new total SSE
With the new prediction, the total SSE has gone down (from 0.677 to 0.553), which means the prediction accuracy has improved.
Step 5: Repeat steps 2 to 4 until further adjustments to a and b no longer reduce the error significantly. At that point, we have arrived at the optimal a and b with the highest prediction accuracy.
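The five steps can be sketched as a short Python loop. This is a minimal illustration, not the exact calculation behind the numbers above: it assumes the error is ½∑(Y - Ypred)², reuses the six (X, Y) points from the linear regression slides, and runs a fixed 2000 iterations instead of testing for convergence.

```python
xs = [-1, 1, 2, 4, 6, 7]
ys = [-1, 2, 3, 3, 5, 8]

def half_sse(a, b):
    """Error of the line Ypred = a + b*X (assumed form: half the SSE)."""
    return 0.5 * sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def gradients(a, b):
    """Partial derivatives of half_sse with respect to a and b."""
    errors = [y - (a + b * x) for x, y in zip(xs, ys)]
    grad_a = -sum(errors)
    grad_b = -sum(e * x for e, x in zip(errors, xs))
    return grad_a, grad_b

a, b = 0.45, 0.75   # Step 1: starting values (borrowed from the update-rule example)
r = 0.01            # learning rate
start_error = half_sse(a, b)

for step in range(2000):               # Step 5: repeat
    grad_a, grad_b = gradients(a, b)   # Step 2: error gradients
    a -= r * grad_a                    # Step 3: adjust the weights
    b -= r * grad_b

final_error = half_sse(a, b)           # Step 4: error with the new a and b
print(round(a, 4), round(b, 4), round(start_error, 4), round(final_error, 4))
```

On this data the loop settles on the same intercept and slope as the closed-form least-squares solution, which is a useful sanity check for the learning rate.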
Extending the idea of Gradient Descent to Neural Networks
Forward propagation
• Initialize weights (one time)
• Feed data
• Compute Y
• Compute loss
Backpropagation
• Compute partial derivatives
• Update weights
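The forward and backward passes can be sketched for the smallest possible network, a single sigmoid neuron. Everything specific here is an assumption for illustration: the toy AND data, the squared-error loss, the learning rate, and the names forward and total_loss.

```python
import math
import random

random.seed(0)

# Hypothetical toy data: the logical AND of two inputs.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]

def forward(params, x):
    """Forward propagation: weighted sum followed by a sigmoid."""
    z = params[0] * x[0] + params[1] * x[1] + params[2]
    return 1.0 / (1.0 + math.exp(-z))

def total_loss(params):
    """Squared-error loss summed over the whole data set."""
    return sum(0.5 * (forward(params, x) - t) ** 2 for x, t in data)

# Initialize weights (one time): two weights and a bias.
params = [random.uniform(-0.5, 0.5) for _ in range(3)]
rate = 0.5
loss_before = total_loss(params)

for epoch in range(2000):
    for x, target in data:                     # feed data
        y = forward(params, x)                 # compute Y
        # Loss for this row is 0.5 * (y - target)^2.
        # Backpropagation: chain rule through the sigmoid.
        delta = (y - target) * y * (1.0 - y)   # dLoss/dz
        params[0] -= rate * delta * x[0]       # update weights
        params[1] -= rate * delta * x[1]
        params[2] -= rate * delta

loss_after = total_loss(params)
print(round(loss_before, 4), round(loss_after, 4))
```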
Flavors of Gradient Descent
By number of samples:
• Gradient Descent
• Stochastic Gradient Descent
• Mini Batch SGD
By learning rate:
• SGD with momentum
• Adagrad
• AdaDelta
• Adam
• Gradient Descent
• Stochastic Gradient Descent
• Mini Batch SGD
Stochastic Gradient Descent
X1 X2 X3 Y
0 0.2 0.9 0
0.22 0.25 0.22 5
0.24 0.6 0.58 2.2
0.33 0.13 0.2 5.9
0.37 0.89 0.55 3.2
0.44 0.3 0.39 1.5
0.44 0.5 0.54 1.8
0.57 0.78 0.53 2.9
0.93 3 1 9.4
1 0.61 0.61 2.3
Feed one row at a time
• Random weight initialization
• The number of coefficients is proportional to the number of columns
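A minimal Python sketch of stochastic gradient descent on the table above follows. It assumes a linear model with one coefficient per feature column plus a bias term, a squared-error loss, and a learning rate of 0.01; none of these choices are stated on the slide.

```python
import random

random.seed(0)

# Rows of the table: (X1, X2, X3, Y).
rows = [
    (0, 0.2, 0.9, 0), (0.22, 0.25, 0.22, 5), (0.24, 0.6, 0.58, 2.2),
    (0.33, 0.13, 0.2, 5.9), (0.37, 0.89, 0.55, 3.2), (0.44, 0.3, 0.39, 1.5),
    (0.44, 0.5, 0.54, 1.8), (0.57, 0.78, 0.53, 2.9), (0.93, 3, 1, 9.4),
    (1, 0.61, 0.61, 2.3),
]

# Random weight initialization: three coefficients plus a bias (assumed).
params = [random.uniform(-0.1, 0.1) for _ in range(4)]

def predict(features):
    return sum(w * x for w, x in zip(params[:3], features)) + params[3]

def sse():
    return sum((row[3] - predict(row[:3])) ** 2 for row in rows)

rate = 0.01
initial_sse = sse()

for epoch in range(200):
    for row in rows:                       # feed one row at a time
        features, target = row[:3], row[3]
        error = predict(features) - target
        for j in range(3):
            params[j] -= rate * error * features[j]
        params[3] -= rate * error

final_sse = sse()
print(round(initial_sse, 2), round(final_sse, 2))
```

Each row triggers its own weight update, so one pass over the table performs ten updates.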
Mini Batch SGD
X1 X2 X3 Y
0 0.2 0.9 0
0.22 0.25 0.22 5
0.24 0.6 0.58 2.2
0.33 0.13 0.2 5.9
0.37 0.89 0.55 3.2
0.44 0.3 0.39 1.5
0.44 0.5 0.54 1.8
0.57 0.78 0.53 2.9
0.93 3 1 9.4
1 0.61 0.61 2.3
Batches: B1, B2, B3, B4, B5, then B1 again
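Splitting the table into mini batches can be sketched as below. The batch size of 2 is an assumption (ten rows divided into the five batches B1 to B5), and iter_batches is a hypothetical helper name.

```python
def iter_batches(rows, batch_size):
    """Yield consecutive slices of rows, each with at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

# Rows of the table: (X1, X2, X3, Y).
rows = [
    (0, 0.2, 0.9, 0), (0.22, 0.25, 0.22, 5), (0.24, 0.6, 0.58, 2.2),
    (0.33, 0.13, 0.2, 5.9), (0.37, 0.89, 0.55, 3.2), (0.44, 0.3, 0.39, 1.5),
    (0.44, 0.5, 0.54, 1.8), (0.57, 0.78, 0.53, 2.9), (0.93, 3, 1, 9.4),
    (1, 0.61, 0.61, 2.3),
]

batches = list(iter_batches(rows, 2))
for number, batch in enumerate(batches, start=1):
    print(f"B{number}", batch)
```

With a batch size of 1 the same helper gives stochastic gradient descent, and with a batch size equal to the number of rows it gives plain (full batch) gradient descent.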
Batch Normalization
Fully connected network:
I/P → L1 → L2 → L3 → L4 → L5 → L6 → O/P
Before feeding data into the network, we normalize it to bring the values onto the same scale. This is simply mean centering and variance scaling (normalization).
In a deep network, a small change in the input can cause a large change in the output because of the many multiplications along the way.
Normalization is generally recommended in neural networks because it leads to faster convergence.
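Mean centering and variance scaling of one input column can be sketched in a few lines of Python. The column values are the X2 column from the earlier table; the function name standardize is a choice made here.

```python
from statistics import mean, pstdev

def standardize(column):
    """Mean-center a column and scale it to unit variance."""
    mu = mean(column)
    sigma = pstdev(column)   # population standard deviation
    return [(value - mu) / sigma for value in column]

x2 = [0.2, 0.25, 0.6, 0.13, 0.89, 0.3, 0.5, 0.78, 3, 0.61]
scaled = standardize(x2)
print([round(value, 3) for value in scaled])
```

After scaling, the column has mean 0 and standard deviation 1, so a large raw value such as 3 no longer dominates the other inputs by scale alone.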
Why batch normalization?
Internal covariate shift
• From L1 to L2 there is not much change, but by the time the data reaches L5 there is a huge shift
• Where to introduce normalization? This is decided heuristically
[Figure: network diagram with labels N1, N2, N3]
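The normalization applied inside the network can be sketched for a single neuron's outputs over one mini batch. This follows the usual batch normalization formula with a scale gamma, a shift beta, and a small eps for numerical stability; the activation values are hypothetical.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one neuron's activations over a mini batch, then scale and shift."""
    n = len(values)
    mu = sum(values) / n
    variance = sum((v - mu) ** 2 for v in values) / n
    return [gamma * (v - mu) / math.sqrt(variance + eps) + beta for v in values]

activations = [0.5, 12.0, -3.0, 7.5]   # hypothetical outputs of one neuron
normalized = batch_norm(activations)
print([round(value, 3) for value in normalized])
```

In a trained network gamma and beta are learned along with the weights; here they are left at their neutral values of 1 and 0.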
Memoization
Instead of computing the same partial derivative repeatedly, why not compute it once and reuse it?
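The idea can be sketched with functools.lru_cache from the Python standard library. The per-layer derivative values and the function name upstream are hypothetical; the point is only that each chain-rule product is computed once and then reused.

```python
from functools import lru_cache

# Hypothetical local derivatives of four stacked layers.
LOCAL_DERIVATIVES = (0.5, 2.0, -1.0, 3.0)
stats = {"calls": 0}

@lru_cache(maxsize=None)
def upstream(k):
    """Chain-rule product of the local derivatives from layer k to the output."""
    stats["calls"] += 1
    if k == len(LOCAL_DERIVATIVES):
        return 1.0
    return LOCAL_DERIVATIVES[k] * upstream(k + 1)

# The gradient for every layer reuses the products cached by the deeper layers.
gradients = [upstream(k) for k in range(len(LOCAL_DERIVATIVES))]
print(gradients, stats["calls"])
```

Without the cache, the four gradient requests would recompute the shared tail of the chain each time; with it, the function body runs only five times in total.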
Weight initialization
Don'ts
1. Never initialize all weights to zero
2. Never initialize the same weight across all neurons; this is called the problem of symmetry
Example: inputs 5, 2 and 4 feed three neurons, and every weight is 3
Neuron 1: 15 + 6 + 12 = 33
Neuron 2: 15 + 6 + 12 = 33
Neuron 3: 15 + 6 + 12 = 33
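The symmetry problem is easy to reproduce in Python. The sketch uses the inputs 5, 2 and 4 and a weight of 3 on every connection, as in the example; the randomly initialized comparison and its value range are assumptions added for contrast.

```python
import random

inputs = [5, 2, 4]

# Same weight on every connection: one row of weights per neuron.
equal_weights = [[3, 3, 3], [3, 3, 3], [3, 3, 3]]
equal_outputs = [sum(w * x for w, x in zip(row, inputs)) for row in equal_weights]
print(equal_outputs)   # [33, 33, 33]

# Random weights (assumed range) break the symmetry.
random.seed(0)
random_weights = [[random.uniform(-1, 1) for _ in inputs] for _ in range(3)]
random_outputs = [sum(w * x for w, x in zip(row, inputs)) for row in random_weights]
print(random_outputs)
```

When every neuron produces the same output, every neuron also receives the same gradient, so the neurons stay identical throughout training.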
3. Never initialize weights to large negative values
• ReLU: dead activations (refer to the activation function presentation)
• Sigmoid: vanishing gradients
Dos
Use random initialization. With random initialization, each neuron learns a different aspect of the data.
Think of each neuron as a base model and the network as a combination of multiple base models. For example, in a random forest each model is built on different attributes, so the ensemble sees a lot of variation and learns very well.
• Zeros: Initializer that generates tensors initialized to 0.
• Ones: Initializer that generates tensors initialized to 1.
• Constant: Initializer that generates tensors initialized to a constant value.
• RandomNormal: Initializer that generates tensors with a normal distribution.
• RandomUniform: Initializer that generates tensors with a uniform distribution.
• TruncatedNormal: Initializer that generates a truncated normal distribution.
• VarianceScaling: Initializer capable of adapting its scale to the shape of weights.
• Orthogonal: Initializer that generates a random orthogonal matrix.
• Identity: Initializer that generates the identity matrix.
• lecun_uniform: LeCun uniform initializer.
• glorot_normal: Glorot normal initializer, also called Xavier normal initializer.
• glorot_uniform: Glorot uniform initializer, also called Xavier uniform initializer.
• he_normal: He normal initializer.
• lecun_normal: LeCun normal initializer.
• he_uniform: He uniform variance scaling initializer.
import keras
• RandomNormal
keras.initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None)
• RandomUniform
keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None)
• TruncatedNormal
keras.initializers.TruncatedNormal(mean=0.0, stddev=0.05, seed=None)
• glorot_normal
keras.initializers.glorot_normal(seed=None)
It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / (fan_in + fan_out))
• glorot_uniform
keras.initializers.glorot_uniform(seed=None)
It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(6 / (fan_in + fan_out))
• he_uniform
keras.initializers.he_uniform(seed=None)
It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(6 / fan_in).
• he_normal
keras.initializers.he_normal(seed=None)
It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / fan_in), where fan_in is the number of input units in the weight tensor.
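The scale formulas quoted for the Glorot and He initializers can be evaluated directly. The sketch below only computes the limits and standard deviations with the Python standard library; it does not call Keras, and the function names and the example layer size are hypothetical.

```python
import math

def glorot_normal_stddev(fan_in, fan_out):
    return math.sqrt(2.0 / (fan_in + fan_out))

def glorot_uniform_limit(fan_in, fan_out):
    return math.sqrt(6.0 / (fan_in + fan_out))

def he_uniform_limit(fan_in):
    return math.sqrt(6.0 / fan_in)

def he_normal_stddev(fan_in):
    return math.sqrt(2.0 / fan_in)

# Hypothetical layer: 3 input units feeding 3 neurons.
fan_in, fan_out = 3, 3
print("glorot_normal stddev:", round(glorot_normal_stddev(fan_in, fan_out), 4))
print("glorot_uniform limit:", round(glorot_uniform_limit(fan_in, fan_out), 4))
print("he_uniform limit:", round(he_uniform_limit(fan_in), 4))
print("he_normal stddev:", round(he_normal_stddev(fan_in), 4))
```

All four scales shrink as the layer gets wider, which keeps the weighted sums from growing with the number of inputs.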
