Quadratic form and functional optimization

This slideshow covers a mathematics topic, the quadratic form, in English.
I made the slideshow to understand this topic at a deeper level.
I like linear algebra very much!!

1. Quadratic Form and Functional Optimization. 9th June, 2011. Junpei Tsuji
2. Optimization of a multivariate quadratic function

   $$J(x_1, x_2) = 1.2 + (0.2,\ 0.3)\begin{pmatrix}x_1\\x_2\end{pmatrix} + \frac{1}{2}(x_1,\ x_2)\begin{pmatrix}3 & 1\\ 1 & 4\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}$$
   $$= 1.2 + 0.2\,x_1 + 0.3\,x_2 + \frac{3}{2}x_1^2 + x_1 x_2 + 2 x_2^2$$

   $(x_1, x_2, J) = (0.045, 0.064, 1.1881)$
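   A minimal sketch of how such a quadratic can be minimized numerically, using the coefficients shown on this slide (the variable names A, b, c are mine, not from the slides): the stationary point solves the linear system given by the zero gradient.

   ```python
   import numpy as np

   A = np.array([[3.0, 1.0],
                 [1.0, 4.0]])   # symmetric coefficient matrix from the slide
   b = np.array([0.2, 0.3])     # linear coefficients
   c = 1.2                      # constant term

   # grad J = b + A x = 0  =>  A x* = -b
   x_star = np.linalg.solve(A, -b)
   J_star = c + b @ x_star + 0.5 * x_star @ A @ x_star
   print(x_star, J_star)        # stationary point x* and the value J(x*)
   ```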
3. Quadratic approximation

   By Taylor expansion,
   $$f(\mathbf{x}) \approx \bar{f} + \bar{\mathbf{J}}\cdot(\mathbf{x} - \bar{\mathbf{x}}) + \frac{1}{2}(\mathbf{x} - \bar{\mathbf{x}})^T \bar{\mathbf{H}} (\mathbf{x} - \bar{\mathbf{x}})$$
   (constant + linear form + quadratic form), where
   • $\mathbf{x} := (x_1, x_2, \cdots, x_N)^T$
   • $\bar{f} := f(\bar{\mathbf{x}})$
   • $\bar{\mathbf{J}} := \left(\dfrac{\partial f}{\partial x_1}, \dfrac{\partial f}{\partial x_2}, \cdots, \dfrac{\partial f}{\partial x_N}\right)\Big|_{\mathbf{x}=\bar{\mathbf{x}}}$ is the Jacobian (gradient)
   • $\bar{\mathbf{H}} := \begin{pmatrix}\dfrac{\partial^2 f}{\partial x_1\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_1\partial x_N}\\ \vdots & \ddots & \vdots\\ \dfrac{\partial^2 f}{\partial x_N\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_N\partial x_N}\end{pmatrix}\Bigg|_{\mathbf{x}=\bar{\mathbf{x}}}$ is the Hessian (constant)
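   A small sketch of the gradient and Hessian that appear in this expansion, approximated by central differences (the function `f` is any smooth user-supplied map from R^N to R; the helper names are assumptions, not from the slides).

   ```python
   import numpy as np

   def gradient(f, x, h=1e-5):
       """Central-difference approximation of J = (df/dx_1, ..., df/dx_N)."""
       x = np.asarray(x, dtype=float)
       g = np.zeros_like(x)
       for i in range(x.size):
           e = np.zeros_like(x); e[i] = h
           g[i] = (f(x + e) - f(x - e)) / (2 * h)
       return g

   def hessian(f, x, h=1e-4):
       """Central-difference approximation of H_ij = d^2 f / (dx_i dx_j)."""
       x = np.asarray(x, dtype=float)
       n = x.size
       H = np.zeros((n, n))
       for i in range(n):
           for j in range(n):
               ei = np.zeros(n); ei[i] = h
               ej = np.zeros(n); ej[j] = h
               H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
       return H
   ```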
4. Completing the square

   $$f(\mathbf{x}) = \bar{f} + \bar{\mathbf{J}}\cdot(\mathbf{x} - \bar{\mathbf{x}}) + \frac{1}{2}(\mathbf{x} - \bar{\mathbf{x}})^T \bar{\mathbf{H}} (\mathbf{x} - \bar{\mathbf{x}})$$
   • Let $\bar{\mathbf{x}} = \mathbf{x}^*$ where $\mathbf{J}(\mathbf{x}^*)^T = \mathbf{0}$; then
   $$f(\mathbf{x}) = f^* + \frac{1}{2}(\mathbf{x} - \mathbf{x}^*)^T \mathbf{H}^* (\mathbf{x} - \mathbf{x}^*)$$
   (constant + quadratic form)
5. Completing the square

   $$f(\mathbf{x}) = c + \mathbf{b}^T\mathbf{x} + \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}$$
   Write it in the completed-square form
   $$f(\mathbf{x}) = d + \frac{1}{2}(\mathbf{x} - \mathbf{x}_0)^T\mathbf{A}(\mathbf{x} - \mathbf{x}_0)
   = d + \frac{1}{2}\mathbf{x}_0^T\mathbf{A}\mathbf{x}_0 - \frac{1}{2}\mathbf{x}_0^T(\mathbf{A} + \mathbf{A}^T)\mathbf{x} + \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}$$
   Matching coefficients:
   • $\mathbf{b}^T = -\frac{1}{2}\mathbf{x}_0^T(\mathbf{A} + \mathbf{A}^T)$, so $\mathbf{x}_0^T = -2\mathbf{b}^T(\mathbf{A} + \mathbf{A}^T)^{-1}$ and $\mathbf{x}_0 = -2(\mathbf{A} + \mathbf{A}^T)^{-1}\mathbf{b}$
   • $c = d + \frac{1}{2}\mathbf{x}_0^T\mathbf{A}\mathbf{x}_0$, so $d = c - \frac{1}{2}\mathbf{x}_0^T\mathbf{A}\mathbf{x}_0 = c - 2\mathbf{b}^T(\mathbf{A} + \mathbf{A}^T)^{-1}\mathbf{A}(\mathbf{A} + \mathbf{A}^T)^{-1}\mathbf{b}$
   Therefore,
   $$f(\mathbf{x}) = c - 2\mathbf{b}^T(\mathbf{A} + \mathbf{A}^T)^{-1}\mathbf{A}(\mathbf{A} + \mathbf{A}^T)^{-1}\mathbf{b} + \frac{1}{2}\left(\mathbf{x} + 2(\mathbf{A} + \mathbf{A}^T)^{-1}\mathbf{b}\right)^T\mathbf{A}\left(\mathbf{x} + 2(\mathbf{A} + \mathbf{A}^T)^{-1}\mathbf{b}\right)$$
   • If $\mathbf{A}$ is a symmetric matrix,
   $$f(\mathbf{x}) = c - \frac{1}{2}\mathbf{b}^T\mathbf{A}^{-1}\mathbf{b} + \frac{1}{2}(\mathbf{x} + \mathbf{A}^{-1}\mathbf{b})^T\mathbf{A}(\mathbf{x} + \mathbf{A}^{-1}\mathbf{b})$$
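   A quick numerical check of the symmetric-case identity above (the random test data is illustrative only): both sides of the completed square agree for any x.

   ```python
   import numpy as np

   rng = np.random.default_rng(0)
   A = rng.normal(size=(3, 3)); A = A + A.T          # make A symmetric
   b = rng.normal(size=3)
   c = rng.normal()
   x = rng.normal(size=3)

   lhs = c + b @ x + 0.5 * x @ A @ x
   x0 = -np.linalg.solve(A, b)                        # x0 = -A^{-1} b
   d = c - 0.5 * b @ np.linalg.solve(A, b)            # d  = c - (1/2) b^T A^{-1} b
   rhs = d + 0.5 * (x - x0) @ A @ (x - x0)

   print(np.isclose(lhs, rhs))                        # True (up to floating-point error)
   ```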
6. Quadratic form

   $$f(\mathbf{x}') = \mathbf{x}'^T\mathbf{S}\mathbf{x}'$$
   • where $\mathbf{S}$ is a symmetric matrix.
7. Symmetric matrix

   • A symmetric matrix $\mathbf{S}$ is defined as a matrix that satisfies $\mathbf{S}^T = \mathbf{S}$.
   • A symmetric matrix $\mathbf{S}$ has real eigenvalues $\lambda_i$ and eigenvectors $\mathbf{u}_i$ that form an orthonormal basis:
   $$\mathbf{S}\mathbf{u}_i = \lambda_i\mathbf{u}_i$$
   where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$ and $(\mathbf{u}_i, \mathbf{u}_j) = \delta_{ij}$ ($\delta_{ij}$ is Kronecker's delta).
8. Diagonalization of a symmetric matrix

   • We define an orthogonal matrix $\mathbf{U}$ as follows: $\mathbf{U} = (\mathbf{u}_1, \mathbf{u}_2, \cdots, \mathbf{u}_N)$
   • Then $\mathbf{U}$ satisfies $\mathbf{U}^T\mathbf{U} = \mathbf{I}$, hence $\mathbf{U}^{-1} = \mathbf{U}^T$, where $\mathbf{I}$ is the identity matrix.
   • Moreover,
   $$\mathbf{S}\mathbf{U} = \mathbf{S}(\mathbf{u}_1, \cdots, \mathbf{u}_N) = (\mathbf{S}\mathbf{u}_1, \cdots, \mathbf{S}\mathbf{u}_N) = (\lambda_1\mathbf{u}_1, \cdots, \lambda_N\mathbf{u}_N) = \mathbf{U}\,\mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_N)$$
   $$\therefore \mathbf{S} = \mathbf{U}\,\mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_N)\,\mathbf{U}^T$$
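   A minimal sketch of this diagonalization with NumPy (the example matrix S is arbitrary; np.linalg.eigh returns eigenvalues in ascending order, whereas the slide lists them in descending order).

   ```python
   import numpy as np

   S = np.array([[3.0, 1.0],
                 [1.0, 4.0]])                      # any symmetric matrix

   lam, U = np.linalg.eigh(S)                      # columns of U are orthonormal eigenvectors u_i
   print(np.allclose(U.T @ U, np.eye(2)))          # U^T U = I
   print(np.allclose(U @ np.diag(lam) @ U.T, S))   # S = U diag(lambda) U^T
   ```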
9. Transformation to the principal axes

   $$f(\mathbf{x}') = \mathbf{x}'^T\mathbf{S}\mathbf{x}'$$
   • Now substitute $\mathbf{x}' = \mathbf{U}\mathbf{z}$, where $\mathbf{z} = (z_1, z_2, \cdots, z_N)^T$:
   $$f(\mathbf{U}\mathbf{z}) = (\mathbf{U}\mathbf{z})^T\mathbf{S}(\mathbf{U}\mathbf{z}) = \mathbf{z}^T\mathbf{U}^T\mathbf{S}\mathbf{U}\mathbf{z} = \mathbf{z}^T\,\mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_N)\,\mathbf{z}$$
   $$\therefore f(\mathbf{z}) = \sum_{i=1}^{N}\lambda_i z_i^2$$
10. Contour surface

   • If we set $f(\mathbf{z})$ equal to a constant $c$,
   $$f(\mathbf{z}) = \sum_{i=1}^{N}\lambda_i z_i^2 = c$$
   • When $N = 2$,
     – the locus of $\mathbf{z}$ is an ellipse if $\lambda_1\lambda_2 > 0$;
     – the locus of $\mathbf{z}$ is a hyperbola if $\lambda_1\lambda_2 < 0$.
11. Contour surface (figure): contours of $f(\mathbf{z}) = \sum_{i=1}^{2}\lambda_i z_i^2 = \mathrm{const.}$ in the $(z_1, z_2)$ plane with $\lambda_1\lambda_2 > 0$; the center is a maximal or minimal point. Example: $f(x_1, x_2) = -x_1^2 - 2x_2^2 + 20.0$.
12. Transformation to the principal axes (figure): contours of $f(\mathbf{x}') = \mathrm{const.}$ in the $(x_1', x_2')$ plane. With $\mathbf{x}' = \mathbf{U}\mathbf{z}$, i.e. $\mathbf{z} = \mathbf{U}^T\mathbf{x}'$, the transformation rotates the contours onto the principal axes.
13. Parallel translation (figure): contours of $f(\mathbf{x}) = \mathrm{const.}$ in the $(x_1, x_2)$ plane, centered at $\bar{\mathbf{x}}$; the translation $\mathbf{x}' = \mathbf{x} - \bar{\mathbf{x}}$ moves the center to the origin of the $(x_1', x_2')$ plane.
14. Contour surface of the quadratic function (figure):
   $$f(\mathbf{x}) = f^* + \frac{1}{2}(\mathbf{x} - \mathbf{x}^*)^T\mathbf{H}^*(\mathbf{x} - \mathbf{x}^*)$$
   The contours $f(\mathbf{x}) = \mathrm{const.}$ in the $(x_1, x_2)$ plane are centered at $\mathbf{x}^*$.
15. Contour surface (figure): contours of $f(\mathbf{z}) = \sum_{i=1}^{2}\lambda_i z_i^2 = \mathrm{const.}$ in the $(z_1, z_2)$ plane with $\lambda_1\lambda_2 < 0$; the center is a saddle point. Example: $f(x_1, x_2) = x_1^2 - x_2^2$.
16. Stationary points (figure): $f(x_1, x_2) = x_1^3 + x_2^3 + 3x_1x_2 + 2$ has a maximal point and a saddle point.
17. Stationary points (figure): $f(x_1, x_2) = \exp\left(-\frac{1}{3}x_1^3 + x_1 - x_2^2\right)$ has a saddle point and a maximal point.
18. Newton-Raphson method

   • Newton's method is an approximate solver of $\nabla f(\mathbf{x}) = \mathbf{0}$ for a general (e.g. polynomial) $f(\mathbf{x})$, using a quadratic approximation.
   Quadratic approximation of $f$ around $\mathbf{x}$:
   $$f(\mathbf{x} + \Delta\mathbf{x}) \approx f(\mathbf{x}) + \mathbf{J}(\mathbf{x})\cdot\Delta\mathbf{x} + \frac{1}{2}\Delta\mathbf{x}^T\mathbf{H}(\mathbf{x})\Delta\mathbf{x}$$
   $$\frac{\partial f(\mathbf{x} + \Delta\mathbf{x})}{\partial(\Delta\mathbf{x})} = \mathbf{J}(\mathbf{x})^T + \mathbf{H}(\mathbf{x})\Delta\mathbf{x}$$
   (Figure: one step moves the current point $\mathbf{x}$ to $\mathbf{x} + \Delta\mathbf{x}$, approaching the stationary point $\mathbf{x}^*$ where $\nabla f(\mathbf{x}^*) = \mathbf{0}$.)
19. Algorithm of Newton's method

   Procedure Newton($\mathbf{J}(\mathbf{x})$, $\mathbf{H}(\mathbf{x})$):
   1. Initialize $\mathbf{x}$.
   2. Calculate $\mathbf{J}(\mathbf{x})$ and $\mathbf{H}(\mathbf{x})$.
   3. Solve the following simultaneous equations, giving $\Delta\mathbf{x}$: $\mathbf{J}(\mathbf{x})^T + \mathbf{H}(\mathbf{x})\Delta\mathbf{x} = \mathbf{0}$
   4. Update $\mathbf{x}$ as follows: $\mathbf{x} \leftarrow \mathbf{x} + \Delta\mathbf{x}$
   5. If $\|\Delta\mathbf{x}\| < \delta$ then return $\mathbf{x}$, else go back to 2.
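   A sketch of this procedure in NumPy (the callables `J` and `H`, the tolerance `delta`, and the iteration cap are assumptions of mine; on a quadratic objective the method converges in a single step).

   ```python
   import numpy as np

   def newton(J, H, x0, delta=1e-8, max_iter=100):
       """Find a stationary point by repeatedly solving H(x) dx = -J(x)^T."""
       x = np.asarray(x0, dtype=float)
       for _ in range(max_iter):
           dx = np.linalg.solve(H(x), -J(x))   # step 3: J(x)^T + H(x) dx = 0
           x = x + dx                          # step 4: update
           if np.linalg.norm(dx) < delta:      # step 5: convergence test
               break
       return x

   # Usage on the quadratic from slide 2: gradient b + A x, constant Hessian A.
   A = np.array([[3.0, 1.0], [1.0, 4.0]])
   b = np.array([0.2, 0.3])
   x_min = newton(lambda x: b + A @ x, lambda x: A, np.zeros(2))
   ```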
20. Linear regression

   $$y = f(\mathbf{x}) = \beta_0 + \sum_{j=1}^{P}\beta_j x_j$$
   Given $N$ samples $(\mathbf{x}_i, y_i)$ in a $P$-dimensional input space, we would like to find the $\boldsymbol{\beta}^*$ that minimizes the residual sum of squares (RSS).
21. Linear regression

   $$\min_{\boldsymbol{\beta}}\ \mathrm{RSS}(\boldsymbol{\beta})$$
   • where
   $$\mathrm{RSS}(\boldsymbol{\beta}) = \sum_{i=1}^{N}\left(y_i - f(\mathbf{x}_i)\right)^2 = \sum_{i=1}^{N}\left(y_i - \beta_0 - \sum_{j=1}^{P}\beta_j x_{ij}\right)^2$$
   • Define $\mathbf{X}$, $\mathbf{y}$, $\boldsymbol{\beta}$ as follows:
   $$\mathbf{X} = \begin{pmatrix} x_{11} & \cdots & x_{1P} & 1\\ \vdots & \ddots & \vdots & \vdots\\ x_{N1} & \cdots & x_{NP} & 1 \end{pmatrix},\qquad \mathbf{y} = \begin{pmatrix} y_1\\ \vdots\\ y_N \end{pmatrix},\qquad \boldsymbol{\beta} = \begin{pmatrix} \beta_1\\ \vdots\\ \beta_P\\ \beta_0 \end{pmatrix}$$
   $$\therefore \mathrm{RSS}(\boldsymbol{\beta}) = \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|^2$$
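   A small sketch of this setup (the sample values are made up for illustration): the design matrix X carries a trailing column of ones, matching the layout above, and RSS(beta) is the squared residual norm.

   ```python
   import numpy as np

   x = np.array([[0.1, 1.0],
                 [0.4, 0.7],
                 [0.9, 0.2],
                 [1.3, 1.5]])                     # N = 4 samples, P = 2 inputs (made-up values)
   y = np.array([1.0, 1.2, 0.8, 2.0])

   X = np.hstack([x, np.ones((x.shape[0], 1))])   # columns: x_1, ..., x_P, 1

   def rss(beta):
       r = y - X @ beta
       return r @ r                               # ||y - X beta||^2
   ```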
22. Linear regression

   $$\mathrm{RSS}(\boldsymbol{\beta}) = J(\boldsymbol{\beta}) = \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|^2 = (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^T(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{y} - \mathbf{y}^T\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$
   Using the identities
   • $\dfrac{\partial}{\partial\boldsymbol{\beta}}(\mathbf{a}^T\boldsymbol{\beta}) = \mathbf{a}$
   • $\dfrac{\partial}{\partial\boldsymbol{\beta}}(\boldsymbol{\beta}^T\mathbf{a}) = \mathbf{a}$
   • $\dfrac{\partial}{\partial\boldsymbol{\beta}}(\boldsymbol{\beta}^T\mathbf{A}\boldsymbol{\beta}) = (\mathbf{A} + \mathbf{A}^T)\boldsymbol{\beta}$
   we obtain
   $$J'(\boldsymbol{\beta}) = \frac{\partial J}{\partial\boldsymbol{\beta}} = -2\mathbf{X}^T\mathbf{y} + 2\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$
23. Linear regression

   For $\boldsymbol{\beta}^*$ satisfying $J'(\boldsymbol{\beta}^*) = \mathbf{0}$,
   $$\mathbf{X}^T\mathbf{y} = \mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^*,\qquad \mathbf{y}^T\mathbf{X} = \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}$$
   $$\therefore \boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$
   Substituting back into $J(\boldsymbol{\beta})$,
   $$\therefore J(\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$
   $$\therefore J(\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* + \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* - \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$
   $$\therefore J(\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* + (\boldsymbol{\beta} - \boldsymbol{\beta}^*)^T\mathbf{X}^T\mathbf{X}(\boldsymbol{\beta} - \boldsymbol{\beta}^*) \quad\text{(completing the square)}$$
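   The closed-form minimizer above can be computed with a linear solve rather than an explicit matrix inverse, which is numerically preferable; a short sketch (same assumed data layout as before, with the column of ones last):

   ```python
   import numpy as np

   X = np.hstack([np.array([[0.1, 1.0], [0.4, 0.7], [0.9, 0.2], [1.3, 1.5]]),
                  np.ones((4, 1))])
   y = np.array([1.0, 1.2, 0.8, 2.0])

   beta_star = np.linalg.solve(X.T @ X, X.T @ y)      # beta* = (X^T X)^{-1} X^T y
   beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]  # same result via least squares
   print(np.allclose(beta_star, beta_lstsq))          # True
   ```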
24. Linear regression

   $$J(\boldsymbol{\beta}) = \mathbf{y}^T\mathbf{y} - \boldsymbol{\beta}^{*T}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}^* + (\boldsymbol{\beta} - \boldsymbol{\beta}^*)^T\mathbf{X}^T\mathbf{X}(\boldsymbol{\beta} - \boldsymbol{\beta}^*)$$
   $$= \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}^*\|^2 + (\boldsymbol{\beta} - \boldsymbol{\beta}^*)^T\mathbf{X}^T\mathbf{X}(\boldsymbol{\beta} - \boldsymbol{\beta}^*)$$
   $$= J(\boldsymbol{\beta}^*) + \frac{1}{2}(\boldsymbol{\beta} - \boldsymbol{\beta}^*)^T\mathbf{H}(\boldsymbol{\beta} - \boldsymbol{\beta}^*)$$
   The residual sum of squares (RSS) of linear regression is a quadratic form in $\boldsymbol{\beta}$. (Figure: contours $J(\boldsymbol{\beta}) = \mathrm{const.}$ in the $(\beta_1, \beta_2)$ plane, centered at $\boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$, with $\mathbf{H} = 2\mathbf{X}^T\mathbf{X}$.)
25. Hessian

   • $\mathbf{H} := \left[\dfrac{\partial^2 J}{\partial\beta_i\partial\beta_j}\right] = 2\mathbf{X}^T\mathbf{X}$
   • $\mathbf{H}$ has the following two features:
     – symmetric matrix: $\mathbf{H}^T = \mathbf{H}$
     – positive-definite matrix: $\forall\mathbf{x}\ne\mathbf{0},\ \mathbf{x}^T\mathbf{H}\mathbf{x} > 0$
   Therefore $\boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ is the minimum of $J(\boldsymbol{\beta})$.
26. Analysis of residuals

   $$\mathbf{y}^* = \mathbf{X}\boldsymbol{\beta}^*$$
   • Substituting $\boldsymbol{\beta}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$,
   $$\mathbf{y}^* = \mathbf{X}\boldsymbol{\beta}^* = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y} \qquad \therefore\ \mathbf{y}^* = \mathcal{H}\mathbf{y}\quad (\mathcal{H}\ \text{is the hat matrix})$$
   • The vector of residuals $\mathbf{r}$ can be expressed as follows:
   $$\mathbf{r} = \mathbf{y} - \mathbf{y}^* = \mathbf{y} - \mathcal{H}\mathbf{y} = (\mathbf{I} - \mathcal{H})\mathbf{y}$$
   $$\mathrm{Var}(\mathbf{r}) = \mathrm{Var}\left((\mathbf{I} - \mathcal{H})\mathbf{y}\right) = (\mathbf{I} - \mathcal{H})\,\mathrm{Var}(\mathbf{y})\,(\mathbf{I} - \mathcal{H})^T$$
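   A quick numerical illustration (random made-up data) of a consequence of this construction: the residual vector r = (I - H) y is orthogonal to every column of X.

   ```python
   import numpy as np

   rng = np.random.default_rng(3)
   X = np.hstack([rng.normal(size=(6, 2)), np.ones((6, 1))])   # assumed 6x3 design matrix
   y = rng.normal(size=6)

   H = X @ np.linalg.inv(X.T @ X) @ X.T
   r = (np.eye(6) - H) @ y                  # residuals
   print(np.allclose(X.T @ r, 0))           # True: residuals are orthogonal to the column space
   ```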
27. Analysis of residuals

   $$\mathcal{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$$
   The hat matrix $\mathcal{H}$ is a projection matrix, which satisfies the following equations:
   1. Projection: $\mathcal{H}^2 = \mathcal{H}$
   $$\mathcal{H}^2 = \mathcal{H}\cdot\mathcal{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\cdot\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{X})(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T = \mathcal{H}$$
   2. Orthogonal (symmetric): $\mathcal{H}^T = \mathcal{H}$
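   The two properties are easy to check numerically; a minimal sketch with an arbitrary full-rank design matrix:

   ```python
   import numpy as np

   rng = np.random.default_rng(1)
   X = np.hstack([rng.normal(size=(6, 2)), np.ones((6, 1))])   # assumed 6x3 design matrix

   H = X @ np.linalg.inv(X.T @ X) @ X.T
   print(np.allclose(H @ H, H))     # projection: H^2 = H
   print(np.allclose(H.T, H))       # symmetric:  H^T = H
   ```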
28. Analysis of residuals

   $$\begin{pmatrix} y_1^*\\ \vdots\\ y_N^* \end{pmatrix} = \begin{pmatrix} x_{11} & \cdots & x_{1P} & 1\\ \vdots & \ddots & \vdots & \vdots\\ x_{N1} & \cdots & x_{NP} & 1 \end{pmatrix}\begin{pmatrix} \beta_1^*\\ \vdots\\ \beta_P^*\\ \beta_0^* \end{pmatrix} = \beta_1^*\begin{pmatrix} x_{11}\\ \vdots\\ x_{N1} \end{pmatrix} + \cdots + \beta_P^*\begin{pmatrix} x_{1P}\\ \vdots\\ x_{NP} \end{pmatrix} + \beta_0^*\begin{pmatrix} 1\\ \vdots\\ 1 \end{pmatrix}$$
   i.e. a linear combination of the $P+1$ column vectors $\mathbf{x}_1, \cdots, \mathbf{x}_P, \mathbf{x}_{P+1} = \mathbf{1}$.
29. Analysis of residuals (figure): the observation vector $\mathbf{y}$ lives in the $N$-dimensional space; its fitted value $\mathbf{y}^* = \mathcal{H}\mathbf{y}$ is the projection of $\mathbf{y}$ onto the $(P+1)$-dimensional subspace spanned by the column vectors $\mathbf{x}_1, \cdots, \mathbf{x}_P, \mathbf{1}$.
30. Analysis of residuals

   $$\mathbf{y} = \mathbf{X}\boldsymbol{\beta}$$
   • $\boldsymbol{\beta} = \mathbf{X}^{-1}\mathbf{y}$, where $\mathbf{X}^{-1}$ is the Moore-Penrose generalized inverse.
   1. $N = P$: unique solution; $\mathbf{X}^{-1}$ is the ordinary inverse.
   2. $N < P$: many solutions; $\mathbf{X}^{-1} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}$, and $\boldsymbol{\beta} = \mathbf{X}^{-1}\mathbf{y}$ is the solution with minimum $\|\boldsymbol{\beta}\|$.
   3. $N > P$: no exact solution; $\mathbf{X}^{-1} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$, and $\boldsymbol{\beta} = \mathbf{X}^{-1}\mathbf{y}$ minimizes $\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|^2$.
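   A short sketch of these cases with NumPy's Moore-Penrose pseudoinverse (the random matrices are illustrative; full rank holds with probability 1 here):

   ```python
   import numpy as np

   rng = np.random.default_rng(2)

   # Overdetermined (N > P): pinv(X) = (X^T X)^{-1} X^T; beta minimizes ||y - X beta||^2.
   X_tall = rng.normal(size=(8, 3))
   print(np.allclose(np.linalg.pinv(X_tall),
                     np.linalg.inv(X_tall.T @ X_tall) @ X_tall.T))

   # Underdetermined (N < P): pinv(X) = X^T (X X^T)^{-1}; beta is the minimum-norm solution.
   X_wide = rng.normal(size=(3, 8))
   y_wide = rng.normal(size=3)
   beta = np.linalg.pinv(X_wide) @ y_wide
   print(np.allclose(X_wide @ beta, y_wide))    # exact fit with the smallest ||beta||
   ```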
