# Backpropagation: Understanding How to Update ANNs Weights Step-by-Step

This presentation explains, step by step, how the backpropagation algorithm updates artificial neural network (ANN) weights, using two examples. Readers should have a basic understanding of how ANNs work, partial derivatives, and the multivariate chain rule.

This presentation does not dive directly into the details of the algorithm; it starts by training a very simple network. This is because backpropagation is applied to a network after training: the network must first be trained and produce a prediction error before the benefits of backpropagation can be seen.


1. Backpropagation: Understanding How to Update ANNs Weights Step-by-Step. Ahmed Fawzy Gad, ahmed.fawzy@ci.menofia.edu.eg. Menoufia University, Faculty of Computers and Information, Information Technology.
2. Train then Update. The backpropagation algorithm is used to update the NN weights when they are not able to make correct predictions. Hence, we should train the NN before applying backpropagation. (Diagram: Initial Weights → Training → Prediction)
3. Train then Update. Once the trained network makes wrong predictions, backpropagation is applied to update the weights. (Diagram: Initial Weights → Training → Prediction → Backpropagation → Update)
4. Neural Network Training Example. Training data: X1 = 0.1, X2 = 0.3, desired output = 0.03. Initial weights: W1 = 0.5, W2 = 0.2, b = 1.83. (Diagram: a single neuron with inputs X1, X2, and a bias input of +1.)
5. Network Training. Steps to train our network: 1. Prepare the activation function input (the sum of products between inputs and weights). 2. Compute the activation function output.
6. Network Training: Sum of Products. After calculating the SOP between inputs and weights, it is used as the input to the activation function: s = X1*W1 + X2*W2 + b = 0.1*0.5 + 0.3*0.2 + 1.83 = 1.94
7. Network Training: Activation Function. In this example, the sigmoid activation function is used. Based on the SOP calculated previously, the output is: f(s) = 1/(1 + e^(−s)) = 1/(1 + e^(−1.94)) = 1/(1 + 0.144) = 1/1.144 = 0.874
8. Network Training: Prediction Error. After getting the predicted output, the next step is to measure the prediction error of the network using the squared error function: E = ½(desired − predicted)² = ½(0.03 − 0.874)² = ½(−0.844)² = ½(0.713) = 0.357
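The forward pass above can be sketched in a few lines of Python (a minimal illustration, not the author's code; the variable names are my own):

```python
import math

# Forward pass for the single-neuron example (values from the slides).
X1, X2 = 0.1, 0.3
W1, W2, b = 0.5, 0.2, 1.83
desired = 0.03

s = X1 * W1 + X2 * W2 + b             # sum of products: 1.94
predicted = 1 / (1 + math.exp(-s))    # sigmoid output: ~0.874
E = 0.5 * (desired - predicted) ** 2  # squared error: ~0.356 (slides round to 0.357)

print(round(s, 2), round(predicted, 3), round(E, 3))
```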
9. How to Minimize the Prediction Error? There is a prediction error, and it should be reduced until it reaches an acceptable value. What can we change in order to minimize the error? In our example, the only parameters we can change are the weights. How do we update them? We can use the weights-update equation: W_new = W_old + η(d − Y)X
10. Weights Update Equation. W_new = W_old + η(d − Y)X, where W_new: new updated weights; W_old: current weights, [1.83, 0.5, 0.2]; η: network learning rate, 0.01; d: desired output, 0.03; Y: predicted output, 0.874; X: the input at which the network made the false prediction, [+1, 0.1, 0.3].
11. Weights Update Equation. W_new = W_old + η(d − Y)X = [1.83, 0.5, 0.2] + 0.01(0.03 − 0.874)[+1, 0.1, 0.3] = [1.83, 0.5, 0.2] + (−0.00844)[+1, 0.1, 0.3] = [1.83, 0.5, 0.2] + [−0.00844, −0.00084, −0.0025] = [1.822, 0.499, 0.198]
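The vector update can be reproduced directly. Note the slides round the per-weight delta before adding (−0.0025), so the last digit of the updated W2 differs slightly when computed without that intermediate rounding:

```python
# Perceptron-style update W_new = W_old + eta*(d - Y)*X from the slide,
# with weights ordered [b, W1, W2] and inputs [+1, X1, X2].
eta = 0.01
W_old = [1.83, 0.5, 0.2]
X = [1.0, 0.1, 0.3]
d, Y = 0.03, 0.874

W_new = [w + eta * (d - Y) * x for w, x in zip(W_old, X)]
print([round(w, 3) for w in W_new])  # -> [1.822, 0.499, 0.197]
```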
12. Weights Update Equation. The new weights are: b_new = 1.822, W1_new = 0.499, W2_new = 0.198. Based on the new weights, the network will be re-trained.
13. Weights Update Equation. Continue these operations until the prediction error reaches an acceptable value: 1. Update the weights. 2. Re-train the network. 3. Calculate the prediction error. (Diagram: the network with W1 = 0.499, W2 = 0.198, b = 1.822.)
14. Why Is the Backpropagation Algorithm Important? The simple update rule above does not tell us how each individual weight contributed to the error. The backpropagation algorithm is used to understand the effect of each weight on the prediction error when moving from the old weights to the new weights.
15. Forward vs. Backward Passes. When training a neural network, there are two passes: forward and backward. The goal of the backward pass is to learn how each weight affects the total error; in other words, how changing the weights changes the prediction error.
16. Backward Pass. Let us work with a simpler example: Y = X²Z + H. How do we answer this question: what is the effect on the output Y of a change in the variable X? This question is answered using derivatives. The derivative of Y with respect to X (∂Y/∂X) tells us the effect of changing X on the output Y.
17. Calculating Derivatives. The derivative ∂Y/∂X = ∂/∂X (X²Z + H) can be calculated using two derivative rules: the power rule, ∂/∂X (X²) = 2X, and the constant rule, ∂/∂X (C) = 0. The result is: ∂Y/∂X = 2XZ + 0 = 2XZ
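The rule can be checked numerically with a central finite difference. The X, Z, H values below are arbitrary illustrative picks, not from the slides:

```python
# Numeric sanity check of dY/dX = 2*X*Z for Y = X**2 * Z + H.
def Y(X, Z, H):
    return X**2 * Z + H

X, Z, H = 1.5, 2.0, 3.0  # arbitrary test point
eps = 1e-6
# Central difference approximates the derivative at X.
numeric = (Y(X + eps, Z, H) - Y(X - eps, Z, H)) / (2 * eps)
analytic = 2 * X * Z
print(round(numeric, 4), analytic)  # both ~6.0
```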
18. Prediction Error – Weight Derivative. Just as ∂Y/∂X measures the change in Y with respect to X, the change in the error E with respect to a weight W is ∂E/∂W, where E = ½(desired − predicted)².
19. Prediction Error – Weight Derivative. Start from the error function: E = ½(desired − predicted)², where desired = 0.03 (a constant) and predicted = f(s) = 1/(1 + e^(−s)).
23. Substituting the sigmoid output into the error: E = ½(desired − 1/(1 + e^(−s)))², where s = X1*W1 + X2*W2 + b.
26. Substituting s as well, the error becomes a direct function of the weights: E = ½(desired − 1/(1 + e^(−(X1*W1 + X2*W2 + b))))²
27. Multivariate Chain Rule. The chain from weights to prediction error: weights (W1, W2) → SOP: s = X1*W1 + X2*W2 + b → predicted output: f(s) = 1/(1 + e^(−s)) → prediction error: E = ½(desired − predicted)². Differentiating ∂E/∂W = ∂/∂W (½(desired − 1/(1 + e^(−(X1*W1 + X2*W2 + b))))²) directly is complex, so we use the chain rule.
28. Multivariate Chain Rule. By the chain rule: ∂E/∂W1 = ∂E/∂predicted * ∂predicted/∂s * ∂s/∂W1, and ∂E/∂W2 = ∂E/∂predicted * ∂predicted/∂s * ∂s/∂W2. Let's calculate these individual partial derivatives.
29. Error–Predicted (∂E/∂predicted) Partial Derivative. ∂E/∂predicted = ∂/∂predicted (½(desired − predicted)²) = 2 * ½ (desired − predicted)^(2−1) * (0 − 1) = (desired − predicted) * (−1) = predicted − desired. Substitution: ∂E/∂predicted = 0.874 − 0.03 = 0.844
30. Predicted–SOP (∂predicted/∂s) Partial Derivative. ∂predicted/∂s = ∂/∂s (1/(1 + e^(−s))) = (1/(1 + e^(−s)))(1 − 1/(1 + e^(−s))). Substitution: = (1/(1 + e^(−1.94)))(1 − 1/(1 + e^(−1.94))) = (1/1.144)(1 − 1/1.144) = 0.874(1 − 0.874) = 0.874(0.126) = 0.11
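The sigmoid-derivative identity f'(s) = f(s)(1 − f(s)) used here can be verified numerically at s = 1.94:

```python
import math

# Check the sigmoid derivative identity f'(s) = f(s)*(1 - f(s)) at s = 1.94.
def sigmoid(s):
    return 1 / (1 + math.exp(-s))

s = 1.94
f = sigmoid(s)
identity = f * (1 - f)                 # analytic form from the slide
eps = 1e-6
numeric = (sigmoid(s + eps) - sigmoid(s - eps)) / (2 * eps)
print(round(identity, 3), round(numeric, 3))  # both ~0.11
```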
31. SOP–W1 (∂s/∂W1) Partial Derivative. ∂s/∂W1 = ∂/∂W1 (X1*W1 + X2*W2 + b) = 1 * X1 * W1^(1−1) + 0 + 0 = X1(1) = X1. Substitution: ∂s/∂W1 = 0.1
32. SOP–W2 (∂s/∂W2) Partial Derivative. ∂s/∂W2 = ∂/∂W2 (X1*W1 + X2*W2 + b) = 0 + 1 * X2 * W2^(1−1) + 0 = X2(1) = X2. Substitution: ∂s/∂W2 = 0.3
33. Error–W1 (∂E/∂W1) Partial Derivative. After calculating each individual derivative, we multiply them to get the desired relationship between the prediction error and the weight: ∂E/∂W1 = ∂E/∂predicted * ∂predicted/∂s * ∂s/∂W1 = 0.844 * 0.11 * 0.1 = 0.0093
34. Error–W2 (∂E/∂W2) Partial Derivative. ∂E/∂W2 = ∂E/∂predicted * ∂predicted/∂s * ∂s/∂W2 = 0.844 * 0.11 * 0.3 = 0.028
35. Interpreting Derivatives. Two useful pieces of information come from each derivative: its sign and its magnitude (MAG). Positive sign: increasing/decreasing the weight increases/decreases the error. Negative sign: increasing/decreasing the weight decreases/increases the error. Magnitude: changing the weight by P changes the error by MAG*P (an increase for a positive derivative, a decrease for a negative one). In our example, because both ∂E/∂W1 and ∂E/∂W2 are positive, we would like to decrease the weights in order to decrease the prediction error.
36. Updating Weights. Each weight is updated based on its derivative according to: Wi_new = Wi_old − η * ∂E/∂Wi. Updating W1: W1_new = 0.5 − 0.01 * 0.0093 = 0.49991. Updating W2: W2_new = 0.2 − 0.01 * 0.028 = 0.19972. Continue updating the weights according to their derivatives and re-training the network until reaching an acceptable error.
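Putting the three chain-rule factors together, one gradient-descent step for this single-neuron example can be sketched as follows (the shared factor is computed once; names are my own):

```python
import math

# One gradient step for the single-neuron example:
# dE/dW_i = (predicted - desired) * predicted * (1 - predicted) * X_i.
X1, X2 = 0.1, 0.3
W1, W2, b = 0.5, 0.2, 1.83
desired, eta = 0.03, 0.01

s = X1 * W1 + X2 * W2 + b
predicted = 1 / (1 + math.exp(-s))
# Shared chain-rule factor: dE/dpredicted * dpredicted/ds.
common = (predicted - desired) * predicted * (1 - predicted)

W1 -= eta * common * X1
W2 -= eta * common * X2
print(round(W1, 5), round(W2, 5))  # -> 0.49991 0.19972
```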
37. Second Example: Backpropagation for a NN with a Hidden Layer
38. ANN with Hidden Layer. Training data: X1 = 0.1, X2 = 0.3, desired output = 0.03. Initial weights: W1 = 0.5, W2 = 0.1, W3 = 0.62, W4 = 0.2, W5 = −0.2, W6 = 0.3, b1 = 0.4, b2 = −0.1, b3 = 1.83.
39. ANN with Hidden Layer. As before, the network is first trained (Initial Weights → Training → Prediction), and then backpropagation updates the weights (→ Backpropagation → Update).
41. Forward Pass – Hidden Layer Neurons. h1_in = X1*W1 + X2*W2 + b1 = 0.1*0.5 + 0.3*0.1 + 0.4 = 0.48; h1_out = 1/(1 + e^(−h1_in)) = 1/(1 + e^(−0.48)) = 0.618
42. Forward Pass – Hidden Layer Neurons. h2_in = X1*W3 + X2*W4 + b2 = 0.1*0.62 + 0.3*0.2 − 0.1 = 0.022; h2_out = 1/(1 + e^(−h2_in)) = 1/(1 + e^(−0.022)) = 0.506
43. Forward Pass – Output Layer Neuron. out_in = h1_out*W5 + h2_out*W6 + b3 = 0.618*(−0.2) + 0.506*0.3 + 1.83 = 1.858; out_out = 1/(1 + e^(−out_in)) = 1/(1 + e^(−1.858)) = 0.865
44. Forward Pass – Prediction Error. predicted = out_out = 0.865, desired = 0.03. E = ½(desired − out_out)² = ½(0.03 − 0.865)² = 0.349. To update the weights we need the derivatives ∂E/∂W1, ∂E/∂W2, ∂E/∂W3, ∂E/∂W4, ∂E/∂W5, ∂E/∂W6.
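The whole forward pass of this hidden-layer network can be sketched as:

```python
import math

def sigmoid(s):
    return 1 / (1 + math.exp(-s))

# Forward pass of the 2-hidden-neuron network with the slide's initial weights.
X1, X2, desired = 0.1, 0.3, 0.03
W1, W2, W3, W4, W5, W6 = 0.5, 0.1, 0.62, 0.2, -0.2, 0.3
b1, b2, b3 = 0.4, -0.1, 1.83

h1 = sigmoid(X1 * W1 + X2 * W2 + b1)   # ~0.618
h2 = sigmoid(X1 * W3 + X2 * W4 + b2)   # ~0.5055
out = sigmoid(h1 * W5 + h2 * W6 + b3)  # ~0.865
E = 0.5 * (desired - out) ** 2         # ~0.349

print(round(h1, 3), round(out, 3), round(E, 3))
```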
45. Partial Derivatives Calculation
46. E–W5 (∂E/∂W5) Partial Derivative. ∂E/∂W5 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W5
47. E–W5 (∂E/∂W5) Partial Derivative. ∂E/∂out_out = ∂/∂out_out (½(desired − out_out)²) = 2 * ½ (desired − out_out)^(2−1) * (0 − 1) = (desired − out_out) * (−1) = out_out − desired. Substitution: ∂E/∂out_out = 0.865 − 0.03 = 0.835
48. E–W5 (∂E/∂W5) Partial Derivative. ∂out_out/∂out_in = ∂/∂out_in (1/(1 + e^(−out_in))) = (1/(1 + e^(−out_in)))(1 − 1/(1 + e^(−out_in))). Substitution: = (1/(1 + e^(−1.858)))(1 − 1/(1 + e^(−1.858))) = (1/1.156)(1 − 1/1.156) = 0.865(1 − 0.865) = 0.865(0.135) ≈ 0.117
49. E–W5 (∂E/∂W5) Partial Derivative. ∂out_in/∂W5 = ∂/∂W5 (h1_out*W5 + h2_out*W6 + b3) = 1 * h1_out * (W5)^(1−1) + 0 + 0 = h1_out. Substitution: ∂out_in/∂W5 = 0.618
50. E–W5 (∂E/∂W5) Partial Derivative. ∂E/∂W5 = 0.835 * 0.117 * 0.618 ≈ 0.060
51. E–W6 (∂E/∂W6) Partial Derivative. ∂E/∂W6 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W6
52. E–W6 (∂E/∂W6) Partial Derivative. The first two factors were already computed: ∂E/∂out_out = 0.835, ∂out_out/∂out_in ≈ 0.117
53. E–W6 (∂E/∂W6) Partial Derivative. ∂out_in/∂W6 = ∂/∂W6 (h1_out*W5 + h2_out*W6 + b3) = 0 + 1 * h2_out * (W6)^(1−1) + 0 = h2_out. Substitution: ∂out_in/∂W6 = 0.506
54. E–W6 (∂E/∂W6) Partial Derivative. ∂E/∂W6 = 0.835 * 0.117 * 0.506 ≈ 0.049
55. E–W1 (∂E/∂W1) Partial Derivative. ∂E/∂W1 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W1
56. E–W1 (∂E/∂W1) Partial Derivative. Already computed: ∂E/∂out_out = 0.835, ∂out_out/∂out_in ≈ 0.117
57. E–W1 (∂E/∂W1) Partial Derivative. ∂out_in/∂h1_out = ∂/∂h1_out (h1_out*W5 + h2_out*W6 + b3) = (h1_out)^(1−1) * W5 + 0 + 0 = W5. Substitution: ∂out_in/∂h1_out = −0.2
58. E–W1 (∂E/∂W1) Partial Derivative. ∂h1_out/∂h1_in = ∂/∂h1_in (1/(1 + e^(−h1_in))) = (1/(1 + e^(−h1_in)))(1 − 1/(1 + e^(−h1_in))). Substitution: = (1/(1 + e^(−0.48)))(1 − 1/(1 + e^(−0.48))) = 0.618(1 − 0.618) = 0.236
59. E–W1 (∂E/∂W1) Partial Derivative. ∂h1_in/∂W1 = ∂/∂W1 (X1*W1 + X2*W2 + b1) = X1 * (W1)^(1−1) + 0 + 0 = X1. Substitution: ∂h1_in/∂W1 = 0.1
60. E–W1 (∂E/∂W1) Partial Derivative. ∂E/∂W1 = 0.835 * 0.117 * (−0.2) * 0.236 * 0.1 ≈ −0.0005
61. E–W2 (∂E/∂W2) Partial Derivative. ∂E/∂W2 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W2
62. E–W2 (∂E/∂W2) Partial Derivative. Already computed: ∂E/∂out_out = 0.835, ∂out_out/∂out_in ≈ 0.117, ∂out_in/∂h1_out = −0.2, ∂h1_out/∂h1_in = 0.236
63. E–W2 (∂E/∂W2) Partial Derivative. ∂h1_in/∂W2 = ∂/∂W2 (X1*W1 + X2*W2 + b1) = 0 + X2 * (W2)^(1−1) + 0 = X2. Substitution: ∂h1_in/∂W2 = 0.3
64. E–W2 (∂E/∂W2) Partial Derivative. ∂E/∂W2 = 0.835 * 0.117 * (−0.2) * 0.236 * 0.3 ≈ −0.0014
65. E–W3 (∂E/∂W3) Partial Derivative. ∂E/∂W3 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W3
66. E–W3 (∂E/∂W3) Partial Derivative. Already computed: ∂E/∂out_out = 0.835, ∂out_out/∂out_in ≈ 0.117
67. E–W3 (∂E/∂W3) Partial Derivative. ∂out_in/∂h2_out = ∂/∂h2_out (h1_out*W5 + h2_out*W6 + b3) = 0 + (h2_out)^(1−1) * W6 + 0 = W6. Substitution: ∂out_in/∂h2_out = 0.3
68. E–W3 (∂E/∂W3) Partial Derivative. ∂h2_out/∂h2_in = ∂/∂h2_in (1/(1 + e^(−h2_in))) = (1/(1 + e^(−h2_in)))(1 − 1/(1 + e^(−h2_in))). Substitution: = (1/(1 + e^(−0.022)))(1 − 1/(1 + e^(−0.022))) = 0.506(1 − 0.506) = 0.25
69. E–W3 (∂E/∂W3) Partial Derivative. ∂h2_in/∂W3 = ∂/∂W3 (X1*W3 + X2*W4 + b2) = X1 * (W3)^(1−1) + 0 + 0 = X1. Substitution: ∂h2_in/∂W3 = 0.1
70. E–W3 (∂E/∂W3) Partial Derivative. ∂E/∂W3 = 0.835 * 0.117 * 0.3 * 0.25 * 0.1 ≈ 0.0007
71. E–W4 (∂E/∂W4) Partial Derivative. ∂E/∂W4 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W4
72. E–W4 (∂E/∂W4) Partial Derivative. Already computed: ∂E/∂out_out = 0.835, ∂out_out/∂out_in ≈ 0.117, ∂out_in/∂h2_out = 0.3, ∂h2_out/∂h2_in = 0.25
73. E–W4 (∂E/∂W4) Partial Derivative. ∂h2_in/∂W4 = ∂/∂W4 (X1*W3 + X2*W4 + b2) = 0 + X2 * (W4)^(1−1) + 0 = X2. Substitution: ∂h2_in/∂W4 = 0.3
74. E–W4 (∂E/∂W4) Partial Derivative. ∂E/∂W4 = 0.835 * 0.117 * 0.3 * 0.25 * 0.3 ≈ 0.0022
75. All Error–Weights Partial Derivatives. ∂E/∂W1 ≈ −0.0005, ∂E/∂W2 ≈ −0.0014, ∂E/∂W3 ≈ 0.0007, ∂E/∂W4 ≈ 0.0022, ∂E/∂W5 ≈ 0.060, ∂E/∂W6 ≈ 0.049
76. Updated Weights (η = 0.01). W1_new = 0.5 − 0.01*(−0.0005) = 0.500005; W2_new = 0.1 − 0.01*(−0.0014) = 0.100014; W3_new = 0.62 − 0.01*0.0007 = 0.619993; W4_new = 0.2 − 0.01*0.0022 = 0.199978; W5_new = −0.2 − 0.01*0.060 = −0.2006; W6_new = 0.3 − 0.01*0.049 = 0.29951. Continue updating the weights according to their derivatives and re-training the network until reaching an acceptable error.
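The full backward pass can be sketched by reusing the shared chain-rule factors. The gradients are recomputed here from the raw weights rather than from rounded intermediate values, so the last digits may differ slightly from hand-rounded numbers; the dictionary layout and names are my own:

```python
import math

def sigmoid(s):
    return 1 / (1 + math.exp(-s))

X1, X2, desired = 0.1, 0.3, 0.03
W1, W2, W3, W4, W5, W6 = 0.5, 0.1, 0.62, 0.2, -0.2, 0.3
b1, b2, b3 = 0.4, -0.1, 1.83

# Forward pass (needed to evaluate the derivatives).
h1 = sigmoid(X1 * W1 + X2 * W2 + b1)
h2 = sigmoid(X1 * W3 + X2 * W4 + b2)
out = sigmoid(h1 * W5 + h2 * W6 + b3)

# dE/dout_in = (out - desired) * out * (1 - out), shared by every gradient.
g_out = (out - desired) * out * (1 - out)

grads = {
    "W5": g_out * h1,                       # dout_in/dW5 = h1
    "W6": g_out * h2,                       # dout_in/dW6 = h2
    "W1": g_out * W5 * h1 * (1 - h1) * X1,  # chain through hidden neuron 1
    "W2": g_out * W5 * h1 * (1 - h1) * X2,
    "W3": g_out * W6 * h2 * (1 - h2) * X1,  # chain through hidden neuron 2
    "W4": g_out * W6 * h2 * (1 - h2) * X2,
}
for name, g in grads.items():
    print(name, round(g, 4))
```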