Total Least Squares
Total Least Squares (TLS) is an extension of Ordinary Least Squares (OLS).

Data fitting problems are typically overdetermined: there are more
equations than unknowns.

Given an m×n matrix A and an m×1 vector b, we wish to determine x
such that

        ‖D(b − Ax)‖₂ = min,

where D is a diagonal weight matrix.
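As a minimal sketch (the data A, b and weights d below are hypothetical), the weighted problem min ‖D(b − Ax)‖₂ reduces to an ordinary least-squares problem in the scaled pair (DA, Db):

```python
import numpy as np

# Hypothetical overdetermined system: names A, b, d are illustrative.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])          # m x n with m > n
b = np.array([1.1, 1.9, 3.2])       # m x 1 observations
d = np.array([1.0, 2.0, 1.0])       # diagonal of the weight matrix D

# Weighted LS: min ||D(b - Ax)||_2 is ordinary LS on (D A, D b).
x, *_ = np.linalg.lstsq(d[:, None] * A, d * b, rcond=None)

# Equivalent normal-equations form: (A^T D^2 A) x = A^T D^2 b.
x_ne = np.linalg.solve(A.T @ np.diag(d**2) @ A,
                       A.T @ np.diag(d**2) @ b)
print(np.allclose(x, x_ne))
```

Both routes give the same minimizer; `lstsq` is preferred numerically since it avoids forming AᵀA.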
Singular Value Decomposition
We already know that the eigenvectors of a matrix A form a convenient
basis for working with A.

However, for rectangular matrices A (m×n), dim(Ax) ≠ dim(x) and the
concept of eigenvectors does not exist.

Yet AᵀA (n×n) is a symmetric real matrix (A is real), and therefore
there is an orthonormal basis of its eigenvectors {uk}, AᵀA uk = λk uk.

Consider the vectors {vk}:

        vk = A uk / √λk

They are also orthonormal, since

        vjᵀ vk = ujᵀ AᵀA uk / √(λj λk) = λk ujᵀ uk / √(λj λk) = 0   (k ≠ j).
Singular Value Decomposition
Since AᵀA is positive semidefinite, its eigenvalues satisfy λk ≥ 0.

Define the singular values of A as

        σk = √λk

and order them in non-increasing order:

        σ1 ≥ σ2 ≥ … ≥ σn ≥ 0.

Motivation: one can see that if A itself is square and symmetric,
then {uk, σk} are the set of its own eigenvectors and eigenvalues.

For a general matrix A, assume σ1 ≥ σ2 ≥ … ≥ σr > 0 = σr+1 = σr+2 = … = σn.

For the zero singular values,

        A uk = 0·vk,   k = r+1, …, n,

where uk ∈ ℝⁿ and vk ∈ ℝᵐ (the vectors vr+1, …, vm are chosen to
complete v1, …, vr to an orthonormal basis of ℝᵐ).
Singular Value Decomposition
Now we can write:

        AU = A[u1 … un] = [Au1 … Aur  Aur+1 … Aun]
                        = [σ1 v1 … σr vr  0·vr+1 … 0·vn]
                        = [v1 … vr  vr+1 … vm] Σ = VΣ,

where Σ is the m×n matrix with σ1, …, σr, 0, …, 0 on its diagonal and
zeros elsewhere.

Multiplying on the right by Uᵀ and using UUᵀ = I:

        A UUᵀ = V Σ Uᵀ

        A   =   V  ·  Σ  ·  Uᵀ
      (m×n)  (m×m) (m×n) (n×n)
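This factorization can be checked numerically. A sketch using NumPy on an arbitrary rectangular matrix; note that `numpy.linalg.svd` writes A = U S Vᵀ, so numpy's U plays the role of the slides' V, and numpy's Vᵀ rows are the slides' uk (eigenvectors of AᵀA):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))            # arbitrary rectangular matrix

# numpy convention: A = U S Vh (slides' V is numpy's U, slides' U is Vh.T)
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Pad the singular values into an m x n diagonal matrix and reassemble A.
S = np.zeros(A.shape)
np.fill_diagonal(S, s)
print(np.allclose(A, U @ S @ Vh))

# The rows of Vh are eigenvectors of A^T A with eigenvalues s**2.
w = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
print(np.allclose(w, s**2))
```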
SVD: Example
Let us find the SVD of the matrix

        A = [ 1  −1 ]
            [ 2   2 ]

In order to find U, we calculate the eigenvectors of AᵀA:

        AᵀA = [  1  2 ] [ 1  −1 ] = [ 5  3 ]
              [ −1  2 ] [ 2   2 ]   [ 3  5 ]

The characteristic equation (5−λ)² − 9 = 0 gives

        λ1,2 = (10 ± √(100 − 64)) / 2 = 5 ± 3 = 8, 2,

with unit eigenvectors

        u1 = [ 1/√2 ],   u2 = [  1/√2 ]
             [ 1/√2 ]         [ −1/√2 ]
SVD: Example

The corresponding eigenvectors are found from:

        [ 5−λi    3   ] ui = 0
        [  3    5−λi  ]

For λ1 = 8:

        [ −3   3 ] u1 = 0   ⟹   u1 = [ 1/√2 ]
        [  3  −3 ]                    [ 1/√2 ]

For λ2 = 2:

        [ 3  3 ] u2 = 0   ⟹   u2 = [  1/√2 ]
        [ 3  3 ]                    [ −1/√2 ]
SVD: Example
Now we obtain V and Σ:

        A u1 = [ 1  −1 ] [ 1/√2 ] = [  0  ] = 2√2 [ 0 ]  ⟹  v1 = [ 0 ],  σ1 = 2√2;
               [ 2   2 ] [ 1/√2 ]   [ 2√2 ]       [ 1 ]          [ 1 ]

        A u2 = [ 1  −1 ] [  1/√2 ] = [ √2 ] = √2 [ 1 ]   ⟹  v2 = [ 1 ],  σ2 = √2;
               [ 2   2 ] [ −1/√2 ]   [  0 ]      [ 0 ]           [ 0 ]

A = VΣUᵀ:

        [ 1  −1 ] = [ 0  1 ] [ 2√2   0 ] [ 1/√2   1/√2 ]
        [ 2   2 ]   [ 1  0 ] [  0   √2 ] [ 1/√2  −1/√2 ]
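The worked example can be verified with a few lines of NumPy (sign conventions of the singular vectors may differ from the hand computation, but the singular values and the reassembled product must agree):

```python
import numpy as np

# The example matrix from the slides.
A = np.array([[1.0, -1.0],
              [2.0,  2.0]])

U, s, Vh = np.linalg.svd(A)

# Singular values sqrt(8) = 2*sqrt(2) and sqrt(2), as derived by hand.
print(np.allclose(s, [2 * np.sqrt(2), np.sqrt(2)]))

# The factors reassemble A.
print(np.allclose(A, U @ np.diag(s) @ Vh))
```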
But when both the data A and b are contaminated with errors, we seek
(A+E)x = b+r and minimize ‖D(E : r)T‖_F, where D = diag(d1, …, dm)
and T = diag(t1, …, tn+1) are diagonal weight matrices.

Once a minimizing (E : r) is found subject to these constraints,
there exists a vector x̃ such that

        (A + E) x̃ = b + r.
1.   The line of total least squares passes through the centroid of
     the data.
2.   Total least squares minimizes the norm of a linear map.
3.   The line of total least squares corresponds to the smallest
     singular value.
Least Squares
Consider again the set of data points {(ti, bi)}, i = 1, …, m,

and the problem of linear approximation of this set: bi ≈ α + β ti.

In the Least Squares (LS) approach, we defined a set of equations:

        [ b1 ]   [ 1  t1 ]
        [ b2 ] = [ 1  t2 ] x,   i.e.  b = Ax,   x = (α, β)ᵀ,
        [  ⋮ ]   [ ⋮   ⋮ ]
        [ bm ]   [ 1  tm ]

with the solution

        x* = (AᵀA)⁻¹ Aᵀ b,   AᵀA = [  m    Σti  ],   Aᵀb = [ Σbi   ]
                                    [ Σti   Σti² ]          [ Σtibi ]

(all sums over i = 1, …, m).

If α + β ti − bi = ri, then the LS solution x* minimizes the
sum of squared errors:

        ‖Ax − b‖₂² = (Ax − b)ᵀ(Ax − b) = Σk rk².
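A sketch of the line fit above on hypothetical noisy data, checking the explicit normal-equations solution against a library least-squares solver:

```python
import numpy as np

# Hypothetical data: t exact, b = 0.5 + 2 t + noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 20)
b = 0.5 + 2.0 * t + 0.05 * rng.standard_normal(t.size)

m = t.size
A = np.column_stack([np.ones(m), t])           # rows (1, t_i)

# Normal equations exactly as on the slide: x* = (A^T A)^{-1} A^T b.
AtA = np.array([[m,       t.sum()],
                [t.sum(), (t**2).sum()]])
Atb = np.array([b.sum(), (t * b).sum()])
x_star = np.linalg.solve(AtA, Atb)

# Same minimizer as the library LS solver.
print(np.allclose(x_star, np.linalg.lstsq(A, b, rcond=None)[0]))
```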
Least Squares
This approach assumes that in the set of points {(ti, bi)}, i = 1, …, m,
the values of bi are measured with errors while the values of ti
are exact, as demonstrated in the figure.
Total Least Squares
Assume that we rewrite the line equation b = α + βt in the form

        t = (b − α)/β = α′ + β′ b,   α′ = −α/β,  β′ = 1/β.

Then the corresponding LS system becomes:

        [ t1 ]   [ 1  b1 ]
        [ t2 ] = [ 1  b2 ] x,   i.e.  t = Cx,   x = (α′, β′)ᵀ,
        [  ⋮ ]   [ ⋮   ⋮ ]
        [ tm ]   [ 1  bm ]

with the solution

        x* = (CᵀC)⁻¹ Cᵀ t,   CᵀC = [  m    Σbi  ],   Cᵀt = [ Σti   ]
                                    [ Σbi   Σbi² ]          [ Σtibi ]

corresponding to minimization of

        ‖Cx − t‖₂² = (Cx − t)ᵀ(Cx − t) = Σk (α′ + β′ bk − tk)².

This means noise in ti instead of bi, and it generally leads to a
different solution.
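A small illustration on synthetic data that the two regressions genuinely disagree: fitting b on t and fitting t on b, then mapping the second line back into b-on-t form, gives two different slopes whenever the data are noisy (their product is the squared correlation, which is strictly below 1):

```python
import numpy as np

# Synthetic noisy line: b = 1 + 2 t + noise.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 50)
b = 1.0 + 2.0 * t + 0.3 * rng.standard_normal(t.size)

ones = np.ones_like(t)
# LS of b on t (noise assumed in b): b ~ alpha + beta t.
alpha, beta = np.linalg.lstsq(np.column_stack([ones, t]), b, rcond=None)[0]
# LS of t on b (noise assumed in t): t ~ alpha' + beta' b.
alpha_p, beta_p = np.linalg.lstsq(np.column_stack([ones, b]), t, rcond=None)[0]

# Mapping the second fit back gives slope 1/beta_p, which differs from beta.
print(abs(beta - 1.0 / beta_p) > 1e-6)
# Their product is the squared sample correlation, strictly < 1 with noise.
print(beta * beta_p < 1.0)
```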
TLS
To solve the problem with noise along both ti and bi, we rewrite the
line equation as:

        bi − b̄ = β(ti − t̄),

where

        b̄ = (1/m) Σi bi,   t̄ = (1/m) Σi ti.

Now we can write:

        Ax = [ t1−t̄   b1−b̄ ] [  β ]   [ 0 ]
             [ t2−t̄   b2−b̄ ] [ −1 ] = [ 0 ]
             [  ⋮       ⋮   ]          [ ⋮ ]
             [ tm−t̄   bm−b̄ ]          [ 0 ]

An exact solution of this system is possible only if the points
(ti, bi) lie on the same line; in this case rank(A) = 1. This
formulation is symmetric with respect to t and b.
TLS
In practice the rank of A is 2, since the points are noisy and do not
lie on the same line. SVD factorization, and zeroing of the second
singular value, allow us to construct the matrix A1 closest to A with
rank(A1) = 1:

        A = [ t1−t̄   b1−b̄ ]             [ d11   0  ]
            [ t2−t̄   b2−b̄ ] = UΣVᵀ = U [  0   d22 ] [ v11  v12 ]ᵀ
            [  ⋮       ⋮   ]             [  ⋮    ⋮  ] [ v21  v22 ]
            [ tm−t̄   bm−b̄ ]

        A1 = UΣ1Vᵀ = U [ d11  0 ] [ v11  v12 ]ᵀ = [ t1⁰   a t1⁰ ]
                       [  0   0 ] [ v21  v22 ]    [ t2⁰   a t2⁰ ]
                       [  ⋮   ⋮ ]                 [  ⋮      ⋮   ]
                                                  [ tm⁰   a tm⁰ ]
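The centered TLS fit above can be sketched in NumPy. The data are synthetic, with an assumed true slope of 2 and noise in both coordinates; zeroing the second singular value yields the closest rank-1 matrix, whose rows all lie on one line through the centroid, and the TLS slope comes from the dominant right singular vector:

```python
import numpy as np

# Synthetic data with noise in BOTH t and b; true slope is 2.
rng = np.random.default_rng(3)
m = 100
t0 = np.linspace(0.0, 1.0, m)
t = t0 + 0.05 * rng.standard_normal(m)
b = 2.0 * t0 + 0.05 * rng.standard_normal(m)

# Center the data and stack the rows (t_i - tbar, b_i - bbar).
A = np.column_stack([t - t.mean(), b - b.mean()])

# Zero the second singular value: A1 = d11 * u1 v1^T is the closest
# rank-1 matrix to A in the Frobenius norm.
U, s, Vh = np.linalg.svd(A, full_matrices=False)
A1 = s[0] * np.outer(U[:, 0], Vh[0])
print(np.linalg.matrix_rank(A1))

# TLS slope a = v12 / v11 (sign-invariant ratio of the dominant
# right singular vector); it should be close to the true slope 2.
a_tls = Vh[0, 1] / Vh[0, 0]
print(abs(a_tls - 2.0) < 0.2)
```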
TLS
The geometric interpretation of the TLS method: find a constant a and
a set of points {tk⁰} such that the points (tk⁰, a tk⁰) lie closest,
in the L₂ sense, to the data set {(tk, bk)}:

        (t⁰, a) = argmin ‖ ( t̂  b̂ ) − ( t̂⁰  a t̂⁰ ) ‖_F²
                   a, t⁰

                = argmin  Σk [ (tk − tk⁰)² + (bk − a tk⁰)² ],
                  a, tk⁰

where t̂, b̂, t̂⁰ denote the column vectors of the corresponding values.
Frobenius norm:

Thm: Assume that the matrix A ∈ ℝ^{m×n} has rank r > k. The
Frobenius-norm matrix approximation problem

        min ‖A − Z‖_F   subject to   rank(Z) = k

has the solution

        Z = Ak = Uk Σk Vkᵀ,

where Uk = (u1, u2, …, uk), Vk = (v1, v2, …, vk) and
Σk = diag(σ1, σ2, …, σk). The minimum is

        ‖A − Ak‖_F = ( Σ_{i=k+1..p} σi² )^{1/2},   p = min(m, n).
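The theorem (the Eckart–Young result for the Frobenius norm) is easy to verify numerically on a random matrix: truncate the SVD after k terms and check that the approximation error equals the root of the sum of the discarded squared singular values:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

k = 2
# Best rank-k approximation: keep the k leading singular triplets.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k]

# || A - A_k ||_F = sqrt(sigma_{k+1}^2 + ... + sigma_p^2)
err = np.linalg.norm(A - A_k, 'fro')
print(np.allclose(err, np.sqrt((s[k:]**2).sum())))
```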
TLS formulation:

Given a matrix A ∈ ℝ^{m×n}, m > n, and a vector b ∈ ℝ^{m×1}, find
residuals E ∈ ℝ^{m×n} and r ∈ ℝ^{m×1} that minimize the Frobenius
norm

        ‖(E | r)‖_F   subject to the condition   (b + r) ∈ Im(A + E).
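In the unweighted case (D and T identities), this problem is classically solved via the SVD of the augmented matrix (A | b): zeroing the smallest singular value gives the closest rank-n matrix, which defines (A+E | b+r), and the TLS solution comes from the corresponding right singular vector. A sketch on synthetic data (the names and the true solution below are illustrative):

```python
import numpy as np

# Synthetic overdetermined system with a known generating x.
rng = np.random.default_rng(5)
m, n = 30, 2
A = rng.standard_normal((m, n))
x_true = np.array([1.0, -0.5])
b = A @ x_true + 0.01 * rng.standard_normal(m)

# SVD of the augmented matrix C = (A | b).
C = np.column_stack([A, b])
U, s, Vh = np.linalg.svd(C)

# Right singular vector for the smallest singular value spans the
# null space of the closest rank-n matrix; scale so its last entry is -1.
v = Vh[-1]
x_tls = -v[:n] / v[n]

# Zeroing the smallest singular value yields (A+E | b+r), and by
# construction (A+E) x_tls = b + r holds exactly.
C1 = C - s[-1] * np.outer(U[:, n], Vh[n])
A_E, b_r = C1[:, :n], C1[:, n]
print(np.allclose(A_E @ x_tls, b_r))
```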
