10. 3.3 Recursive Least Squares Estimation
If we obtain measurements sequentially and want to update our estimate of x with
each new measurement, we must augment the H matrix and completely
recompute the estimate x_hat. If the number of measurements becomes large,
the computational effort could become prohibitive.
Weighted Least Squares Estimator
11. 3.3 Recursive Least Squares Estimation
Linear Recursive Least Squares Estimator
x_hat_k = x_hat_{k-1} + K_k (y_k - H_k x_hat_{k-1})
K_k is the gain matrix; (y_k - H_k x_hat_{k-1}) is the correction term.
12. 3.3 Recursive Least Squares Estimation
If the measurement noise v_k is zero mean for all k, and x_hat_0 = E(x), then
E(x_hat_k) = E(x_k) for all k (the estimator is unbiased).
13. 3.3 Recursive Least Squares Estimation
The optimality criterion:
Cost function: minimize the sum of the variances of the
estimation errors (J).
14. 3.3 Recursive Least Squares Estimation
Estimation-error covariance:
P_k is positive definite.
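The recursion above can be sketched in NumPy (the names x_hat, P, K, H, R follow the slides; the Joseph-form covariance update is used so that P_k stays positive definite, and the measurement sequence in the driver is made up for illustration):

```python
import numpy as np

def rls_update(x_hat, P, H, y, R):
    """One recursive least-squares step: K is the gain matrix and
    (y - H x_hat) the correction term; the Joseph-form covariance
    update keeps P positive definite."""
    S = H @ P @ H.T + R                       # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)            # gain matrix K_k
    x_hat = x_hat + K @ (y - H @ x_hat)       # correction term
    I_KH = np.eye(P.shape[0]) - K @ H
    P = I_KH @ P @ I_KH.T + K @ R @ K.T       # Joseph form
    return x_hat, P

# Estimate a constant 2-vector from scalar measurements, one at a time.
x_true = np.array([1.0, -2.0])
x_hat, P = np.zeros(2), 100.0 * np.eye(2)
for t in range(10):
    H = np.array([[1.0, float(t)]])
    y = H @ x_true                            # noise-free measurement
    x_hat, P = rls_update(x_hat, P, H, y, np.array([[0.01]]))
```

With noise-free measurements the recursion converges to the true parameters without ever refactoring the full stacked H matrix.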
17. 3.3.1 Alternate Estimator Forms
Sometimes it is useful to write the equations for P_k and K_k in alternate forms. Although these alternate
forms are mathematically identical, they can be beneficial from a computational point of view. They can
also lead to new results, which we will discover in later chapters.
19. 3.3.1 Alternate Estimator Forms
This is a simpler equation for P_k, but numerical computing problems (i.e.,
scaling issues) may cause this expression for P_k to lose positive
definiteness, even when P_{k-1} and R_k are positive definite.
22. 3.3.1 Alternate Estimator Forms
EX. 3.4 In this example, we illustrate the computational advantages of the first
form of the covariance update compared with the third form. Suppose
we have a scalar parameter x and a perfect measurement of it. That is, H_1 = 1
and R_1 = 0. Further suppose that our initial estimation covariance is P_0 = 6, and our
computer provides precision of three digits to the right of the decimal point for
each quantity that it computes. The estimator gain is then
K_1 = P_0 H_1 (H_1 P_0 H_1 + R_1)^{-1}, which with three-digit precision evaluates to 6 x 0.167 = 1.002.
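The rounding effect can be reproduced in a few lines (a sketch; exactly where the machine rounds is an assumption — here every computed quantity is rounded to three decimals, as the example describes):

```python
# Emulate a machine that keeps 3 digits to the right of the decimal point.
def r3(v):
    return round(v, 3)

P0, H1, R1 = 6.0, 1.0, 0.0                  # perfect scalar measurement
K1 = r3(P0 * H1 * r3(1.0 / (H1 * P0 * H1 + R1)))   # 6 * 0.167 = 1.002

# Third (simple) form: P1 = (1 - K1 H1) P0 -- can go negative under rounding.
P_simple = r3(r3(1.0 - K1 * H1) * P0)

# First (Joseph) form: P1 = (1 - K1 H1)^2 P0 + K1^2 R1 -- stays nonnegative.
P_joseph = r3(r3(r3(1.0 - K1 * H1) ** 2) * P0 + K1 ** 2 * R1)
```

The simple form yields a (meaningless) negative variance, while the Joseph form degrades gracefully to zero.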
23. 3.3.2 Curve fitting
We measure data one sample at a time (y1, y2, ...) and want to find the best fit of a
curve to the data. The curve that we fit could be constrained to be
linear, quadratic, sinusoidal, or some other shape, depending on the underlying
problem.
EXAMPLE 3.7
Suppose that we know a priori that the underlying data are a quadratic function of
time. In this case, we have a quadratic data fitting problem. For example, suppose
we are measuring the altitude of a free-falling object. We know from our
understanding of physics that altitude r is a function of the acceleration a due to
gravity, the initial altitude and velocity of the object r_0 and v_0, and time t, as
given by the equation r = r_0 + v_0 t + (a/2) t^2. So if we measure r at various
time instants and fit a quadratic to the resulting r versus t curve, then we have
estimates of the parameters r_0, v_0, and a/2.
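A minimal sketch of this quadratic fit (the numeric values of r_0, v_0, and the gravitational acceleration are made-up illustration data, not from the example):

```python
import numpy as np

# Hypothetical free-fall data (r0, v0, and a below are made-up values).
a = -9.81                                  # acceleration due to gravity (m/s^2)
r0, v0 = 100.0, 5.0                        # initial altitude and velocity
t = np.linspace(0.0, 3.0, 30)
r = r0 + v0 * t + 0.5 * a * t**2           # noise-free altitude samples

# Least-squares fit of r = c0 + c1 t + c2 t^2; the columns of H
# correspond to the parameters r0, v0, and a/2.
H = np.column_stack([np.ones_like(t), t, t**2])
coef, *_ = np.linalg.lstsq(H, r, rcond=None)
r0_hat, v0_hat, half_a_hat = coef
```

Each new sample adds one row to H, which is exactly the situation the recursive estimator of Section 3.3 avoids refactoring.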
31. Introduction
A brief review of Wiener filtering; knowledge of Wiener filtering is not assumed.
Wiener filtering is historically important and still widely used in signal
processing and communication theory.
But Wiener filtering is not used much for state estimation, so this section is optional.
Sensor Fusion Study (AI Robotics KR) Chapter 3 Least Squares Estimation 4 / 44
32. History of Wiener Filtering
Design a linear, time-invariant filter to extract a signal from noise,
approaching the problem from the frequency domain perspective.
Invented during World War II by Norbert Wiener;
first published in 1942, but not known to the public until 1949 [Wie64].
Andrey Kolmogorov actually solved a more general problem earlier
(1941), and Mark Krein also worked on the same problem (1945).
Kolmogorov’s and Krein’s work was independent of Wiener’s work,
and Wiener acknowledges that Kolmogorov’s work predated his own
work [Wie56].
However, Kolmogorov’s and Krein’s work did not become well known
in the Western world until later, since it was published in Russian
[Kol41].
A nontechnical account of Wiener’s work is given in his autobiography
[Wie56]
33. Background (pp. 94-95) I
To set up the presentation of the Wiener filter, we first need to ask
the following question:
Question
How does the power spectrum of a stochastic process 𝑥(𝑡) change
when it goes through an LTI system with impulse response 𝑔(𝑡)?
Output 𝑦(𝑡)
The output 𝑦(𝑡) of the system is given by the convolution of the
impulse response with the input:
𝑦(𝑡) = 𝑔(𝑡) ∗ 𝑥(𝑡)
34. Time Invariance
Because the system is time-invariant, a time shift in the input results in an
equal time shift in the output:
𝑦(𝑡 + 𝛼) = 𝑔(𝑡) ∗ 𝑥(𝑡 + 𝛼)
Convolution Integral
Multiplying the above two equations and writing out the convolutions
as integrals gives
𝑦(𝑡)𝑦(𝑡 + 𝛼) = ∫ 𝑔(𝜏)𝑥(𝑡 − 𝜏)𝑑𝜏 ∫ 𝑔(𝛾)𝑥(𝑡 + 𝛼 − 𝛾)𝑑𝛾
35. Autocorrelation of 𝑦(𝑡)
Taking the expected value of both sides of the above equation gives
the autocorrelation of 𝑦(𝑡) as a function of the autocorrelation of 𝑥(𝑡)
𝐸[𝑦(𝑡)𝑦(𝑡 + 𝛼)] = ∬ 𝑔(𝜏)𝑔(𝛾)𝐸[𝑥(𝑡 − 𝜏)𝑥(𝑡 + 𝛼 − 𝛾)]𝑑𝜏 𝑑𝛾
Shorthand Notation 𝑅 𝑦(𝛼)
… it will be written in shorthand notation as
𝑅 𝑦(𝛼) = ∬ 𝑔(𝜏)𝑔(𝛾)𝑅 𝑥(𝛼 + 𝜏 − 𝛾)𝑑𝜏 𝑑𝛾
36. Taking the Fourier Transform
Take the Fourier transform of the above equation to obtain
∫ R_y(α) e^{−jωα} dα = ∭ g(τ) g(γ) R_x(α + τ − γ) e^{−jωα} dτ dγ dα
37. Power Spectrum of the Output 𝑦(𝑡)
Now we define a new variable of integration β = α + τ − γ and
replace α in the above equation to obtain
S_y(ω) = ∭ g(τ) g(γ) R_x(β) e^{−jωβ} e^{jωτ} e^{−jωγ} dτ dγ dβ
= G(−ω) G(ω) S_x(ω)
(1)
In other words, the power spectrum of the output 𝑦(𝑡) is a function of
the Fourier transform of the impulse response of the system, 𝐺(𝜔),
and the power spectrum of the input 𝑥(𝑡)
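A discrete sanity check of eq. 1: with circular convolution computed via the FFT, the identity is exact up to rounding (for real g, G(−ω) = G*(ω), so the transfer factor is |G(ω)|²; the sequences below are arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 256
x = rng.standard_normal(N)              # arbitrary real input sequence
g = np.exp(-np.arange(N) / 10.0)        # arbitrary real impulse response

# Circular convolution y = g * x, computed in the frequency domain.
y = np.fft.ifft(np.fft.fft(g) * np.fft.fft(x)).real

# Discrete analogue of S_y(w) = G(-w) G(w) S_x(w); for real g,
# G(-w) = conj(G(w)), so the transfer factor is |G(w)|^2.
S_x = np.abs(np.fft.fft(x)) ** 2
S_y = np.abs(np.fft.fft(y)) ** 2
S_y_pred = np.abs(np.fft.fft(g)) ** 2 * S_x
```

This checks the deterministic transform identity; for a stochastic process the same relation holds between the power spectra.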
38. Problem Statement
Design a stable LTI filter to extract a signal from noise.
Quantities of Interest
𝑥(𝑡) = noise free signal
𝑣(𝑡) = additive noise
𝑔(𝑡) = filter impulse response (to be designed)
̂𝑥(𝑡) = output of filter [estimate of 𝑥(𝑡)]
𝑒(𝑡) = estimation error = 𝑥(𝑡) − ̂𝑥(𝑡)
39. Mathematical Expression
These quantities are represented in fig. 1, from which we see that
x̂(t) = g(t) ∗ [x(t) + v(t)]
Taking the Fourier transform,
X̂(ω) = G(ω)[X(ω) + V(ω)]
E(ω) = X(ω) − X̂(ω)
= X(ω) − G(ω)[X(ω) + V(ω)]
= [1 − G(ω)]X(ω) − G(ω)V(ω)
The error signal e(t) is the superposition of the system [1 − G(ω)]
acting on the signal x(t),
and the system G(ω) acting on the noise v(t).
Since the signal and noise are assumed uncorrelated, from eq. 1 we obtain
S_e(ω) = [1 − G(ω)][1 − G(−ω)]S_x(ω) + G(ω)G(−ω)S_v(ω) (2)
40. Variance of the Estimation Error
Recall (Equation 2.92):
S_X(ω) = ∫_{−∞}^{∞} R_X(τ) e^{−jωτ} dτ
R_X(τ) = (1/2π) ∫_{−∞}^{∞} S_X(ω) e^{jωτ} dω
(3)
The variance of the estimation error is obtained from eq. 3 (Equation
2.92) as
E[e²(t)] = (1/2π) ∫ S_e(ω) dω (4)
To find the optimal filter G(ω) we need to minimize E[e²(t)],
which means that we need to know S_x(ω) and S_v(ω), the statistical
properties of the signal x(t) and the noise v(t).
43. 3.4.1 Parametric Filter Optimization I
To simplify the determination of the optimal filter G(ω):
Assume the optimal filter is a first-order, low-pass filter (stable and
causal) with a bandwidth 1/T to be determined by parametric
optimization:
G(ω) = 1 / (1 + Tjω)
This may not be a valid assumption, but it reduces the problem to a
parametric optimization problem.
To simplify the problem further,
suppose that S_x(ω) and S_v(ω) have the following forms:
S_x(ω) = 2σ²β / (ω² + β²),   S_v(ω) = A
44. 3.4.1 Parametric Filter Optimization II
In other words, the noise v(t) is white. From eq. 2 (Equation 3.78)
we obtain
S_e(ω) = (Tjω / (1 + Tjω)) (−Tjω / (1 − Tjω)) (2σ²β / (ω² + β²)) + (1 / (1 + Tjω)) (1 / (1 − Tjω)) A
Now we can substitute S_e(ω) into eq. 4 (Equation 3.79) and
differentiate with respect to T to find
T_opt = √A / (σ√(2β) − β√A)
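The result can be checked numerically (a sketch; the integration window and the search grid for T are arbitrary choices):

```python
import numpy as np

A = sigma = beta = 1.0
w = np.linspace(-500.0, 500.0, 200001)      # wide grid: S_e decays like 1/w^2
dw = w[1] - w[0]

def mse(T):
    """E[e^2] = (1/2pi) * integral of S_e(w) for G(w) = 1/(1 + T j w)."""
    S_e = (T**2 * w**2 / (1 + T**2 * w**2)) * (2 * sigma**2 * beta / (w**2 + beta**2)) \
          + A / (1 + T**2 * w**2)
    return np.sum(S_e) * dw / (2 * np.pi)

# Grid search over the time constant T.
Ts = np.linspace(1.0, 4.0, 301)
T_best = Ts[np.argmin([mse(T) for T in Ts])]
T_formula = np.sqrt(A) / (sigma * np.sqrt(2 * beta) - beta * np.sqrt(A))
```

For A = σ = β = 1 the formula gives T = 1/(√2 − 1), and the minimized mean square error agrees with the 0.914 quoted for the parametric method in the comparison at the end of the chapter.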
45. Example 3.8
If A = σ = β = 1, then the optimal time constant of the filter is
T = 1 / (√2 − 1) ≈ 2.4
and the optimal filter is given as
G(ω) = 1 / (1 + jωT) = (1/T) / (1/T + jω)
g(t) = (1/T) e^{−t/T},   t ≥ 0
Converting this filter to the time domain results in
ẋ̂ = (1/T)(−x̂ + y)
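The continuous filter can be simulated in discrete time (a sketch; the sample period and step input are arbitrary choices, and a zero-order hold on y is assumed):

```python
import numpy as np

def lowpass(y, T, dt):
    """Simulate x_hat' = (1/T)(-x_hat + y) in discrete time, assuming y is
    held constant over each sample period (zero-order hold)."""
    a = np.exp(-dt / T)
    x_hat = np.zeros(len(y))
    for k in range(1, len(y)):
        x_hat[k] = a * x_hat[k - 1] + (1.0 - a) * y[k - 1]
    return x_hat

T_opt = 1.0 / (np.sqrt(2.0) - 1.0)           # the example's optimal time constant
x_hat = lowpass(np.ones(5000), T_opt, 0.01)  # step input y = 1
```

After many time constants the output settles at the input's DC value, as expected of a unity-DC-gain low-pass filter.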
51. Necessary Condition
Now recall from eq. 7 (Equation 2.87):
|R_X(τ)| ≤ R_X(0) (7)
Also, R_x(τ − u) = R_x(u − τ) [i.e., R_x(τ) is even] if x(t) is stationary.
In this case, the above equation can be written as
0 = −2 ∫ η(τ)R_x(τ) dτ + 2 ∬ η(τ)g(u) [R_x(u − τ) + R_v(u − τ)] dτ du
This gives the necessary condition for the optimality of the filter g(t)
as follows:
∫ η(τ) [−R_x(τ) + ∫ g(u) [R_x(u − τ) + R_v(u − τ)] du] dτ = 0 (8)
We need to solve this for g(t) to find the optimal filter.
54. 3.4.3 Noncausal Filter Optimization
If there are no restrictions on the causality of our filter, then
g(t) can be nonzero for t < 0,
which means that our perturbation η(t) can also be nonzero for t < 0.
This means that the quantity inside the square brackets in eq. 8
(Equation 3.92) must be zero.
This results in
R_x(τ) = ∫ g(u) [R_x(u − τ) + R_v(u − τ)] du
= g(τ) ∗ [R_x(τ) + R_v(τ)]
S_x(ω) = G(ω) [S_x(ω) + S_v(ω)]
G(ω) = S_x(ω) / (S_x(ω) + S_v(ω)) (9)
The transfer function of the optimal filter is the ratio of the power
spectrum of the signal x(t) to the sum of the power spectra of
x(t) and the noise v(t).
55. Example 3.9
Consider the system discussed in Example 3.8 with A = β = σ = 1.
The signal and noise power spectra are given as
S_x(ω) = 2/(ω² + 1),   S_v(ω) = 1
From this we obtain the optimal noncausal filter from eq. 9 (Equation
3.93) as
G(ω) = 2/(ω² + 3) = (1/√3) (2√3/(ω² + 3))
g(t) = (1/√3) e^{−√3|t|} ≈ 0.58 e^{−1.73|t|},   t ∈ (−∞, ∞)
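A quick numeric check that g(t) and G(ω) above form a Fourier pair under the convention G(ω) = ∫ g(t) e^{−jωt} dt (the grid limits and sample frequencies are arbitrary choices):

```python
import numpy as np

t = np.linspace(-30.0, 30.0, 600001)        # g has decayed to ~1e-23 at the ends
dt = t[1] - t[0]
g = np.exp(-np.sqrt(3.0) * np.abs(t)) / np.sqrt(3.0)

def G_numeric(w):
    """Numeric Fourier transform of g at frequency w."""
    return np.sum(g * np.exp(-1j * w * t)) * dt

def G_analytic(w):
    return 2.0 / (w**2 + 3.0)
```

At ω = 0 both give 2/3, the DC gain of the noncausal filter.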
56. Partial Fraction Expansion of 𝐺(𝜔)
In order to find a time domain representation of the filter, we perform
a partial fraction expansion of G(ω) to find the causal part and the
anticausal part of the filter:
G(ω) = 1/(√3(jω + √3))  [causal filter]  +  1/(√3(−jω + √3))  [anticausal filter]
From this we see that
X̂(ω) = [1/(√3(jω + √3))] Y(ω) − [1/(√3(jω − √3))] Y(ω)
= X̂_c(ω) + X̂_a(ω)
X̂_c(ω) is the causal part of X̂(ω);
X̂_a(ω) is the anticausal part of X̂(ω).
57.
In the time domain, this can be written as
x̂(t) = x̂_c(t) + x̂_a(t)
ẋ̂_c = −√3 x̂_c + y/√3
ẋ̂_a = √3 x̂_a − y/√3
The ẋ̂_c equation runs forward in time and is therefore causal and stable.
The ẋ̂_a equation runs backward in time and is therefore anticausal and
stable. (If it ran forward in time, it would be unstable.)
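The forward/backward structure can be sketched as follows (Euler integration with an arbitrary step size; running the anticausal equation backward in time is equivalent to running the stable equation forward over the time-reversed signal). For a constant input y = 1, the steady-state output should equal the filter's DC gain G(0) = 2/3:

```python
import numpy as np

SQ3 = np.sqrt(3.0)

def stable_first_order(y, dt):
    """Euler integration of x' = -sqrt(3) x + y/sqrt(3) (the stable direction)."""
    x = np.zeros(len(y))
    for k in range(1, len(y)):
        x[k] = x[k - 1] + dt * (-SQ3 * x[k - 1] + y[k - 1] / SQ3)
    return x

def noncausal_smooth(y, dt):
    """x_hat = x_hat_c + x_hat_a: the causal part runs forward in time; the
    anticausal part is the same stable filter run over the reversed signal."""
    x_c = stable_first_order(y, dt)
    x_a = stable_first_order(y[::-1], dt)[::-1]
    return x_c + x_a

x_hat = noncausal_smooth(np.ones(20000), 0.001)   # constant input y = 1
```

Because the whole signal must be available before the backward pass can run, this is a smoother rather than a real-time filter.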
60. Theory I
If we require a causal filter for signal estimation, then g(t) = 0 for
t < 0, and the perturbation η(t) must be equal to 0 for t < 0. In this
case, eq. 8 gives
R_x(τ) − ∫ g(u) [R_x(u − τ) + R_v(u − τ)] du = 0,   τ ≥ 0 (10)
The initial application of this equation was in the field of astrophysics
in 1894 [Sob63].
Explicit solutions were thought to be impossible, but Norbert Wiener
and Eberhard Hopf became instantly famous when they solved this
equation in 1931.
Their solution was so impressive that the equation became known as
the Wiener-Hopf equation.
61. Solution
To solve eq. 10 (Equation 3.99), postulate some function a(t) that is
arbitrary for t < 0 but is equal to 0 for t ≥ 0. Then we obtain
R_x(τ) − ∫ g(u) [R_x(u − τ) + R_v(u − τ)] du = a(τ)
S_x(ω) − G(ω) [S_x(ω) + S_v(ω)] = A(ω)
(11)
For ease of notation, make the following definition:
S_xv(ω) = S_x(ω) + S_v(ω)
62.
Then eq. 11 (Equation 3.100) becomes
S_x(ω) − G(ω) S⁺_xv(ω) S⁻_xv(ω) = A(ω) (12)
where
S⁺_xv(ω) is the part of S_xv(ω) that has all its poles and zeros in the
LHP (and hence corresponds to a causal time function), and
S⁻_xv(ω) is the part of S_xv(ω) that has all its poles and zeros in the
RHP (and hence corresponds to an anticausal time function).
63.
Eq. 12 (Equation 3.102) can be written as
G(ω) S⁺_xv(ω) = S_x(ω)/S⁻_xv(ω) − A(ω)/S⁻_xv(ω)
G(ω) S⁺_xv(ω) is a causal time function [assuming that g(t) is stable].
A(ω)/S⁻_xv(ω) is an anticausal time function.
64. Transfer Function of the Optimal Filter
Therefore,
G(ω) S⁺_xv(ω) = causal part of [S_x(ω)/S⁻_xv(ω)]
G(ω) = (1/S⁺_xv(ω)) [causal part of S_x(ω)/S⁻_xv(ω)]
(13)
This gives the transfer function of the optimal causal filter.
65. Example 3.10
Consider the system discussed in Section 3.4.1 with A = β = σ = 1.
This was also discussed in Example 3.9. For this example we have
S_x(ω) = 2/(ω² + 1),   S_xv(ω) = (ω² + 3)/(ω² + 1)
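The causal design of eq. 13 needs the spectral factorization S_xv = S⁺_xv S⁻_xv. A candidate split (an assumption here, consistent with the pole/zero conventions above: S⁺ collects the LHP poles and zeros, S⁻ is its mirror image) can be verified numerically:

```python
import numpy as np

w = np.linspace(-10.0, 10.0, 2001)
S_xv = (w**2 + 3.0) / (w**2 + 1.0)

# Candidate factors (an assumption, not taken from the slides): S+ has its
# pole at jw = -1 and zero at jw = -sqrt(3) (both LHP, hence causal);
# S- is the mirror image in the RHP (hence anticausal).
S_plus = (1j * w + np.sqrt(3.0)) / (1j * w + 1.0)
S_minus = (-1j * w + np.sqrt(3.0)) / (-1j * w + 1.0)
```

The product reproduces S_xv on the whole frequency grid, so this split is a valid factorization of the kind eq. 12 requires.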
70. 3.4.5 Comparison I
Comparing the three examples of optimal filter design presented in
this section (Example 3.8, Example 3.9, Example 3.10), it can be shown
that the mean square errors of the filters are as follows [Bro96]:
Parametric optimization method: E[e²(t)] = 0.914
Causal Wiener filter: E[e²(t)] = 0.732
Noncausal Wiener filter: E[e²(t)] = 0.577
As expected, the estimation error decreases when we have fewer
constraints on the filter.
However, the removal of constraints makes the filter design problem
more difficult.
71.
The Wiener filter is not very amenable to state estimation
because of
the difficulty of extending it to MIMO problems with state-variable
descriptions, and
the difficulty of applying it to signals with time-varying statistical
properties.