Neural Networks
Ahmed
Neural Networks ∈ Supervised Machine Learning
History
Neural Networks Architectures
• Standard Artificial Neural Networks (ANN)
• Convolutional Neural Networks (CNN)
• Recurrent Neural Networks (RNN)
Perceptron
• z = sum over i = 0..n of (wi * xi) = W * X
• z = w0*1 + w1*x1 + w2*x2 + ... + wn*xn
• a = g(z), where g is the activation function
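A minimal sketch of the perceptron formulas, in plain Python. The function names and the sample numbers are hypothetical, chosen only for illustration; the first input is assumed to be the constant 1 that multiplies w0.

```python
def perceptron(weights, inputs, g):
    """Return a = g(z), where z = sum of w_i * x_i.

    inputs[0] is assumed to be the constant 1 that multiplies the bias w0.
    """
    z = sum(w * x for w, x in zip(weights, inputs))
    return g(z)


def identity(z):
    # Placeholder activation for the demo; any function g can be passed instead.
    return z


# Hypothetical numbers, not taken from the slides.
a = perceptron([0.5, 1.0, -2.0], [1.0, 3.0, 1.0], identity)
print(a)  # z = 0.5*1 + 1.0*3 - 2.0*1
```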
Activation functions and non-linearity
• tanh(z), ReLU(z), σ(z)
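The three activation functions named on this slide can be sketched with the standard library alone. The last lines illustrate what "non-linearity" means here: ReLU of a sum is generally not the sum of the ReLUs.

```python
import math


def tanh(z):
    return math.tanh(z)


def relu(z):
    # ReLU keeps positive values and clips negative ones to zero.
    return max(0.0, z)


def sigmoid(z):
    # The logistic function, written σ(z) on the slide.
    return 1.0 / (1.0 + math.exp(-z))


# A linear function f would satisfy f(u + v) == f(u) + f(v); ReLU does not.
lhs = relu(-1.0 + 1.0)
rhs = relu(-1.0) + relu(1.0)
print(lhs, rhs)
```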
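[Figure: one layer with inputs x1, x2, x3 and weights w11, w12, w21, w22, w31, w32 feeding two neurons, which compute z1, z2 and the activations a1 = g(z1), a2 = g(z2)]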
[Figure: inputs 1, 2, 3 feed two ReLU neurons; every weight into the first neuron is 0.5 and every weight into the second neuron is 0.4]
• z1^(1) = 0.5*1 + 0.5*2 + 0.5*3 = 3
• a1^(1) = ReLU(3) = 3
• z2^(1) = 0.4*1 + 0.4*2 + 0.4*3 = 2.4
• a2^(1) = ReLU(2.4) = 2.4
[Figure: the same layer, with its activations 3 and 2.4 feeding one further ReLU neuron through the weights 0.2 and 0.8; the output y is shown rounded to 2.5]
• z1^(2) = 0.2*3 + 0.8*2.4 = 2.52
• a1^(2) = ReLU(2.52) = 2.52
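The two-layer calculation above can be reproduced with a short sketch. The helper name `layer` and the nested-list weight layout are assumptions made for this example; the numbers are the ones from the slides.

```python
def relu(z):
    return max(0.0, z)


def layer(weights, inputs):
    """Apply one ReLU layer; weights[j] holds the weights feeding neuron j."""
    return [relu(sum(w * x for w, x in zip(row, inputs))) for row in weights]


x = [1.0, 2.0, 3.0]
W1 = [[0.5, 0.5, 0.5],   # weights into the first neuron
      [0.4, 0.4, 0.4]]   # weights into the second neuron
W2 = [[0.2, 0.8]]        # the single neuron of the next layer

a1 = layer(W1, x)    # first-layer activations, about [3.0, 2.4]
a2 = layer(W2, a1)   # the first layer's output is the next layer's input
print(a1, a2)
```

Feeding `a1` into the second call is the point of the example: each layer's activations become the next layer's inputs.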
Forward propagation
• Z = W * X
• A = g(Z)
• A is the next layer's input: Z = W * A
Learning
[Figure: a single neuron computing z, then a = g(z)]
Learning problem
• The obtained output is 'a'
• The correct output is 'y'
• The error 'J' is a function of 'a' and 'y', for example J(a, y) = (a - y)^2
• How to minimise the error? Change w
• How to find w?
Optimisation: Gradient Descent?
• w = w - α * dJ/dw
Problem statement
• Error in output (J), how to change W?
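A minimal sketch of the gradient descent rule w = w - α * dJ/dw, assuming a single linear neuron a = w * x and the example cost J = (a - y)^2. The training example, starting weight and learning rate are hypothetical values picked so that the loop converges quickly.

```python
def cost(a, y):
    return (a - y) ** 2


x, y = 2.0, 6.0   # hypothetical training example; the ideal weight is 3
w = 0.0           # hypothetical starting weight
alpha = 0.1       # hypothetical learning rate

history = []
for _ in range(20):
    a = w * x                    # linear neuron, g(z) = z
    history.append(cost(a, y))
    dJ_dw = 2 * (a - y) * x      # derivative of (w*x - y)^2 with respect to w
    w = w - alpha * dJ_dw        # the gradient descent update

print(w, history[0], history[-1])
```

Each pass moves w against the slope of J, so the recorded cost shrinks from one iteration to the next.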
How to find dJ/dw? Back propagation
[Figure: computation graph in which w and x give z, z gives a through g, and a gives J; the derivative g'() sits on the step from z to a]
• da/dz = g'(z)
• dz/dw = x
• Chain rule: dJ/dw = (dJ/da) * (da/dz) * (dz/dw)
• dJ/dw = (dJ/da) * g'(z) * x
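The chain rule for dJ/dw can be checked numerically. This sketch assumes a one-input ReLU neuron and the cost J = (a - y)^2 / 2; the values of w, x and y are hypothetical. The analytic product of the three factors is compared with a central finite difference.

```python
def relu(z):
    return max(0.0, z)


def relu_prime(z):
    return 1.0 if z > 0 else 0.0


def cost(a, y):
    return 0.5 * (a - y) ** 2


def dJ_dw(w, x, y):
    z = w * x
    a = relu(z)
    dJ_da = a - y            # derivative of the cost with respect to a
    da_dz = relu_prime(z)    # g'(z)
    dz_dw = x
    return dJ_da * da_dz * dz_dw


w, x, y = 0.5, 2.0, 3.0      # hypothetical values
analytic = dJ_dw(w, x, y)

eps = 1e-6
numeric = (cost(relu((w + eps) * x), y) - cost(relu((w - eps) * x), y)) / (2 * eps)
print(analytic, numeric)
```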
How to find dJ/dx? Back propagation
[Figure: the same computation graph, followed backwards from J through a and z to the input x]
• da/dz = g'(z)
• dz/dx = w
• Chain rule: dJ/dx = (dJ/da) * (da/dz) * (dz/dx)
• dJ/dx = (dJ/da) * g'(z) * w
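For a neuron with several inputs the same rule applies to each input separately: dJ/dx_i = (dJ/da) * g'(z) * w_i. A short sketch, assuming ReLU as g and using hypothetical numbers:

```python
def relu_prime(z):
    return 1.0 if z > 0 else 0.0


def input_gradients(weights, inputs, dJ_da):
    """Return the list of dJ/dx_i for one neuron with z = sum of w_i * x_i."""
    z = sum(w * x for w, x in zip(weights, inputs))
    dJ_dz = dJ_da * relu_prime(z)
    return [dJ_dz * w for w in weights]


# Hypothetical values: z = 0.5*2.0 - 1.0*0.5 = 0.5, so ReLU is active.
grads = input_gradients([0.5, -1.0], [2.0, 0.5], dJ_da=-2.0)
print(grads)
```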
Update parameters
• w_new = w - α * dJ/dw
• Assume α = 0.01
• dJ/dw = (dJ/da) * (da/dz) * x = -88.24 * x
• w1_new = 0.1 - 0.01*(-88.24)*1 = 0.1 + 0.88 ≈ 1.0
• w2_new = 0.2 - 0.01*(-88.24)*2.52 ≈ 2.4
• w3_new = 0.4 - 0.01*(-88.24)*2.88 ≈ 2.9
[Figure: output neuron with weights 0.1, 0.2, 0.4 and inputs 1, 2.5, 2.9 (the rounded activations 2.52 and 2.88 of layer L2); obtained output 1.76]
• Correct value is 90
• Obtained value is 1.76
• Error = 1.76 - 90 = -88.24
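The output-weight update can be reproduced in a few lines. This sketch assumes the unrounded L2 activations 2.52 and 2.88 (the figure shows them as 2.5 and 2.9) and uses ReLU'(1.76) = 1, so dJ/dz equals the error a - y.

```python
alpha = 0.01
w = [0.1, 0.2, 0.4]           # output weights; w[0] multiplies the bias input 1
x = [1.0, 2.52, 2.88]         # bias input and unrounded L2 activations
dJ_dz = 1.76 - 90             # a - y; the ReLU slope at z = 1.76 is 1

w_new = [wi - alpha * dJ_dz * xi for wi, xi in zip(w, x)]
rounded = [round(v, 1) for v in w_new]
print(w_new, rounded)
```

Because the error is negative, every weight attached to a positive input grows, which pushes the output up towards 90.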
Distribute the penalty to previous neurons
• dJ/dx = (dJ/da) * (da/dz) * w = -88.24 * w
• dJ/dx2 = -88.24 * 0.2 = -17.6
• dJ/dx3 = -88.24 * 0.4 = -35.3
[Figure: output neuron with weights 0.1, 0.2, 0.4 and inputs 1, 2.5, 2.9 from layer L2; obtained output 1.76, error -88.24; dJ/da = -88.24 at the output is passed back as dJ/dx2 = -17.6 and dJ/dx3 = -35.3]
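Passing the penalty back is one multiplication per weight. A small sketch with the slide's numbers (the variable names are assumptions for the example):

```python
dJ_dz = -88.24                # penalty at the output neuron (ReLU slope is 1)
w = [0.1, 0.2, 0.4]           # bias weight, then the weights from the two L2 neurons

dJ_dx = [dJ_dz * wi for wi in w]
rounded = [round(v, 1) for v in dJ_dx]
print(rounded)
```

The first entry belongs to the constant bias input, so only the second and third entries are handed on to the L2 neurons.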
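Summary
Feed forward
• z = W * x
• a = g(z)
• J = cost function
Feed backward
• dJ/da = (a - y) at the output, or calculated from the following layer
• da/dz = g'(z)
• dJ/dz = (dJ/da) * (da/dz)
• dJ/dx = (dJ/dz) * W
• dJ/dw = (dJ/dz) * x
[Figure: backward flow through one neuron, from dJ/da through da/dz = g'(z) to dz/dx = wi and dz/dw = x; a layer with inputs x1, x2, x3, weights w11 to w32, neurons z1/a1 and z2/a2, and incoming penalties dJ/da1 and dJ/da2]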
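The summary can be sketched as one function for a single neuron, assuming ReLU as g and J = (a - y)^2 / 2 as the cost (so that dJ/da = a - y). The function name and the sample values are hypothetical.

```python
def relu(z):
    return max(0.0, z)


def relu_prime(z):
    return 1.0 if z > 0 else 0.0


def forward_backward(w, x, y):
    # Feed forward
    z = sum(wi * xi for wi, xi in zip(w, x))
    a = relu(z)
    J = 0.5 * (a - y) ** 2
    # Feed backward
    dJ_da = a - y
    dJ_dz = dJ_da * relu_prime(z)
    dJ_dx = [dJ_dz * wi for wi in w]   # penalty handed to the previous layer
    dJ_dw = [dJ_dz * xi for xi in x]   # gradient used to update the weights
    return a, J, dJ_dx, dJ_dw


# Hypothetical values: z = 0.5*2.0 - 1.0*0.5 = 0.5
a, J, dJ_dx, dJ_dw = forward_backward([0.5, -1.0], [2.0, 0.5], y=1.5)
print(a, J, dJ_dx, dJ_dw)
```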
How about other search and optimisation methods?
[Figure: training cycle: forward propagation, calculate error, back propagation, update parameters, then repeat]
Learning the price of a flat in Al Weibdeh
• Description:
• Ground floor? Yes
• 2 bathrooms
• 3 bedrooms
• The price is 90 K JOD.
[1, 2, 3] -> NN -> 90
ANN: 3 inputs, 1 output, 2 hidden layers
[Figure: network with inputs 1, 2, 3, hidden layers L1 (activations 3.0 and 2.4) and L2 (activations 2.5 and 2.9), and one output neuron; with the output weights 0.1, 0.3, -0.1 the output is 0.6]
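Initialisation
[Figure: the same network with ReLU on every neuron and the initial weights used in the rest of the example: 0.5 into the upper L1 neuron and 0.4 into the lower one; (0.0, 0.2, 0.8) into the upper L2 neuron and (0.0, 0.8, 0.2) into the lower one; (0.1, 0.2, 0.4) into the output neuron; activations 3.0 and 2.4 in L1, 2.5 and 2.9 in L2; output 1.76]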
[Figure: first layer L1 with inputs 1, 2, 3, weights 0.5 into the first ReLU neuron and 0.4 into the second]
• z1^(1) = 0.5*1 + 0.5*2 + 0.5*3 = 3
• a1^(1) = ReLU(3) = 3
• z2^(1) = 0.4*1 + 0.4*2 + 0.4*3 = 2.4
• a2^(1) = ReLU(2.4) = 2.4
Matrix multiplication
(matrices are written row by row, with rows separated by ";")
• Z = W * X
• [z1; z2] = [w11 w21 w31; w12 w22 w32] * [x1; x2; x3]
• [z1; z2] = [0.5 0.5 0.5; 0.4 0.4 0.4] * [1; 2; 3] = [0.5*1 + 0.5*2 + 0.5*3; 0.4*1 + 0.4*2 + 0.4*3] = [3; 2.4]
• a = ReLU(Z) = ReLU([3; 2.4]) = [3; 2.4]
[Figure: the example network again, layers L1 and L2 with ReLU activations, output 1.76]
Second Layer and Output Layer
(the first entry of each input vector is the constant bias input 1)
• Second Layer
• Z = W * X
• [z1; z2] = [0 0.2 0.8; 0 0.8 0.2] * [1; 3; 2.4] = [0*1 + 0.2*3 + 0.8*2.4; 0*1 + 0.8*3 + 0.2*2.4] = [2.52; 2.88]
• a = ReLU([2.52; 2.88]) = [2.52; 2.88]
• Output Layer
• Z = [0.1 0.2 0.4] * [1; 2.52; 2.88] = 0.1*1 + 0.2*2.52 + 0.4*2.88 = 1.76
• y = a = ReLU(Z) = ReLU(1.76) = 1.76
[Figure: the full network with the initial weights; activations 3.0 and 2.4 in L1, 2.5 and 2.9 in L2; obtained output 1.76]
• Obtained value is 1.76, correct value is 90!
• Error = 1.76 - 90 = -88.24
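The whole forward pass of the flat-price example fits in a short sketch. The helper `layer` and the list layout are assumptions for this example; the weights are the initial ones from the slides, and layers L2 and the output layer receive the constant 1 as an extra bias input. Without rounding the output is 1.756, which the slides show as 1.76.

```python
def relu(z):
    return max(0.0, z)


def layer(weights, inputs):
    """Apply one ReLU layer; weights[j] holds the weights feeding neuron j."""
    return [relu(sum(w * x for w, x in zip(row, inputs))) for row in weights]


x = [1.0, 2.0, 3.0]                       # ground floor, bathrooms, bedrooms
W1 = [[0.5, 0.5, 0.5], [0.4, 0.4, 0.4]]   # layer L1, no bias input
W2 = [[0.0, 0.2, 0.8], [0.0, 0.8, 0.2]]   # layer L2, first weight is the bias weight
W3 = [[0.1, 0.2, 0.4]]                    # output layer, first weight is the bias weight

a1 = layer(W1, x)
a2 = layer(W2, [1.0] + a1)
a3 = layer(W3, [1.0] + a2)

error = a3[0] - 90                        # obtained minus correct value
print(a1, a2, a3, error)
```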
The other way around: BackProp
• Cost function: J = (a - y)^2 / 2
• Penalty:
• dJ/da = a - y = 1.76 - 90 = -88.24
[Figure: computation graph in which x and w give z, z gives a, and a gives J]
Calculate new output parameters
• w_new = w - α * dJ/dw
• Assume α = 0.01
• dJ/dw = (dJ/da) * (da/dz) * x = (dJ/dz) * x = -88.24 * x
• [w1_new; w2_new; w3_new] = … slide 18 … = [1; 2.4; 2.9]
[Figure: output neuron with weights 0.1, 0.2, 0.4 and inputs 1, 2.5, 2.9 from layer L2; obtained output 1.76]
• Correct value is 90
• Error = -88.24
Distribute the penalty to L2 neurons
• dJ/dx = (dJ/da) * (da/dz) * w = (dJ/dz) * w = -88.24 * w
• dJ/dx2 = -88.24 * 0.2 = -17.6
• dJ/dx3 = -88.24 * 0.4 = -35.3
[Figure: output neuron with weights 0.1, 0.2, 0.4 and inputs 1, 2.5, 2.9 from layer L2; obtained output 1.76, error -88.24; the penalties dJ/dx2 = -17.6 and dJ/dx3 = -35.3 are attached to the two L2 neurons]
Calculate L2 parameters
• w_new = w - α * (dJ/dz) * X
[Figure: layers L1 and L2; L1 activations 3.0 and 2.4, L2 activations 2.5 and 2.9; weights (0.0, 0.2, 0.8) into the upper L2 neuron and (0.0, 0.8, 0.2) into the lower one; penalties dJ/da2 = -17.6 and dJ/da3 = -35.3]
• Weights connected to upper neuron:
  [w11_new; w21_new; w31_new] = [w11; w21; w31] - α * (dJ/dz) * [x1; x2; x3] = [0; 0.2; 0.8] - 0.01 * (-17.6) * [1; 3; 2.4] ≈ [0.2; 0.7; 1.2]
• Weights connected to lower neuron:
  [w12_new; w22_new; w32_new] = [w12; w22; w32] - α * (dJ/dz) * [x1; x2; x3] = [0; 0.8; 0.2] - 0.01 * (-35.3) * [1; 3; 2.4] ≈ [0.4; 1.9; 1.0]
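The L2 update can be checked with a small helper. The function name `update` is an assumption for this sketch; the penalties are the negative values -17.6 and -35.3 computed for the two L2 neurons, and the ReLU slope is 1 because both neurons are active.

```python
def update(w, dJ_dz, x, alpha):
    """Return w - alpha * dJ/dz * x, element by element."""
    return [wi - alpha * dJ_dz * xi for wi, xi in zip(w, x)]


alpha = 0.01
x = [1.0, 3.0, 2.4]   # bias input and the two L1 activations

upper_new = update([0.0, 0.2, 0.8], -17.6, x, alpha)
lower_new = update([0.0, 0.8, 0.2], -35.3, x, alpha)

print([round(v, 1) for v in upper_new])
print([round(v, 1) for v in lower_new])
```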
Distribute the penalty to L1 neurons
• dJ/dx = (dJ/da) * (da/dz) * w = (dJ/dz) * w
• Which one should I take?!
[Figure: layers L1 and L2; each L1 neuron (activations 3.0 and 2.4) feeds both L2 neurons through the weights 0.2 and 0.8, and the two L2 neurons carry the penalties dJ/da2 = -17.6 and dJ/da3 = -35.3]
Distribute the penalty to L1 neurons
• dJ/dx = (dJ/da) * (da/dz) * w = (dJ/dz) * w
• Take both: each L1 neuron feeds both L2 neurons, so the two contributions are added
• dJ/dx2 = -17.6*0.2 + -35.3*0.8 ≈ -31.8
• dJ/dx3 = -17.6*0.8 + -35.3*0.2 ≈ -21.2 (using the unrounded penalties -17.648 and -35.296)
[Figure: layers L1 and L2 with the L2 penalties dJ/da2 = -17.6 and dJ/da3 = -35.3 passed back to the L1 neurons as dJ/dx2 = -31.8 and dJ/dx3 = -21.2]
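When a neuron feeds several neurons in the next layer, its penalty is the sum of the contributions coming back along each connection. A sketch with the slide's weights, assuming the unrounded L2 penalties (-88.24 * 0.2 and -88.24 * 0.4) so that the result matches -31.8 and -21.2 after rounding:

```python
# Penalties at the upper and lower L2 neurons (ReLU slope is 1 for both).
dJ_dz2 = [-88.24 * 0.2, -88.24 * 0.4]

# W2[j][i] is the weight from L1 neuron i to L2 neuron j (bias weights omitted).
W2 = [[0.2, 0.8],
      [0.8, 0.2]]

# Sum over the L2 neurons j for each L1 neuron i.
dJ_da1 = [sum(dJ_dz2[j] * W2[j][i] for j in range(2)) for i in range(2)]
rounded = [round(v, 1) for v in dJ_da1]
print(rounded)
```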
Calculate new input parameters
• w_new = w - α * (dJ/dz) * X
• Penalties at the L1 neurons: dJ/da2 = -31.8, dJ/da3 = -21.2
• Weights connected to upper neuron:
  [w11_new; w21_new; w31_new] = [w11; w21; w31] - α * (dJ/dz) * [x1; x2; x3] = [0.5; 0.5; 0.5] - 0.01 * (-31.8) * [1; 2; 3] ≈ [0.8; 1.1; 1.5]
• Weights connected to lower neuron:
  [w12_new; w22_new; w32_new] = [w12; w22; w32] - α * (dJ/dz) * [x1; x2; x3] = [0.4; 0.4; 0.4] - 0.01 * (-21.2) * [1; 2; 3] ≈ [0.6; 0.8; 1.0]
[Figure: layer L1 with inputs 1, 2, 3, the old weights 0.5 and 0.4, and activations 3.0 and 2.4]
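The same update rule applied to the first layer, as a sketch with the rounded penalties -31.8 and -21.2. The helper name `update` is an assumption for this example. Note that the third weight of the upper neuron comes out as 0.5 + 0.318 * 3 = 1.454.

```python
def update(w, dJ_dz, x, alpha):
    """Return w - alpha * dJ/dz * x, element by element."""
    return [wi - alpha * dJ_dz * xi for wi, xi in zip(w, x)]


alpha = 0.01
x = [1.0, 2.0, 3.0]   # the network inputs

upper_new = update([0.5, 0.5, 0.5], -31.8, x, alpha)
lower_new = update([0.4, 0.4, 0.4], -21.2, x, alpha)

print([round(v, 3) for v in upper_new])
print([round(v, 3) for v in lower_new])
```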
Update parameters
[Figure: the network redrawn with the updated weights in L1, L2 and the output layer; obtained output 86.60]
• Output: 86.6, Error = 86.6 - 90 = -3.4
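One complete training step (forward pass, back propagation, update, second forward pass) can be sketched as below. The function names and data layout are assumptions for this example, not the author's code. The sketch keeps full precision throughout, so its second output (about 88.8) differs somewhat from the 86.6 on the slide, which was worked out with rounded intermediate values; in both cases the error drops from about -88 to a few units.

```python
def relu(z):
    return max(0.0, z)


def relu_prime(z):
    return 1.0 if z > 0 else 0.0


def forward(layers, x):
    """Return each layer's input vector, each layer's z vector, and the final output."""
    inputs, zs = [], []
    a = x
    for k, W in enumerate(layers):
        if k > 0:
            a = [1.0] + a              # layers after the first get a bias input of 1
        z = [sum(w * v for w, v in zip(row, a)) for row in W]
        inputs.append(a)
        zs.append(z)
        a = [relu(v) for v in z]
    return inputs, zs, a


def train_step(layers, x, y, alpha):
    inputs, zs, a = forward(layers, x)
    dJ_da = [a[0] - y]                 # cost J = (a - y)^2 / 2
    new_layers = [None] * len(layers)
    for k in reversed(range(len(layers))):
        dJ_dz = [d * relu_prime(z) for d, z in zip(dJ_da, zs[k])]
        new_layers[k] = [[w - alpha * dz * v for w, v in zip(row, inputs[k])]
                         for row, dz in zip(layers[k], dJ_dz)]
        # Penalty for the previous layer, computed with the old weights.
        grads = [sum(dz * row[i] for row, dz in zip(layers[k], dJ_dz))
                 for i in range(len(inputs[k]))]
        dJ_da = grads[1:] if k > 0 else grads   # drop the bias entry
    return new_layers, a[0]


layers = [
    [[0.5, 0.5, 0.5], [0.4, 0.4, 0.4]],   # L1
    [[0.0, 0.2, 0.8], [0.0, 0.8, 0.2]],   # L2, first weight is the bias weight
    [[0.1, 0.2, 0.4]],                    # output layer
]
x, y = [1.0, 2.0, 3.0], 90.0

updated, before = train_step(layers, x, y, alpha=0.01)
_, _, after = forward(updated, x)
print(before, after[0])
```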
Examples
https://playground.tensorflow.org/
Neural networks and Deep Learning
Inception: GoogLeNet (Google Network)*
The name actually comes from the movie Inception
*Going deeper with convolutions [Szegedy 2014]
Neural Networks can generate Music!
• 30 seconds of Jazz generated by an RNN.
• https://soundcloud.com/user-559668657/machine-generated-jazz
• Do you like it?

Neural Network Guide: An Overview of Key Concepts in Neural Networks
