EE-M110 2006/7, EF L5&6 1/29, v2.0
Lectures 5 & 6:
Least Squares Parameter Estimation
[Contour plot of the quadratic performance function f(θ) against the parameters θ1 and θ2]
Dr Martin Brown
Room: E1k, Control Systems Centre
Email: martin.brown@manchester.ac.uk
Telephone: 0161 306 4672
http://www.eee.manchester.ac.uk/intranet/pg/coursematerial/
EE-M110 2006/7, EF L5&6 2/29, v2.0
L5&6: Resources
Core texts
• Ljung, Chapters 4&7
• Norton, Chapter 4
• On-line, Chapters 4&5
In these two lectures, we’re looking at basic discrete time
representations of linear, time invariant plants and
models and seeing how their parameters can be
estimated using the normal equations.
The key example is the first order, linear, stable RC
electrical circuit which we met last week, and which has
an exponential response.
EE-M110 2006/7, EF L5&6 3/29, v2.0
L5&6: Learning Objectives
L5 Linear models and quadratic performance criterion
– ARX & ARMAX discrete-time, linear systems
– Predictive models, regression and exemplar data
– Residual signal
– Performance criterion
L6 Normal equations, interpretation and properties
– Quadratic cost functions
– Derive the normal equations for parameter estimation
– Examples
We're not too concerned with system dynamics today; we're concentrating on the general form of least squares parameter estimation
EE-M110 2006/7, EF L5&6 4/29, v2.0
Introduction to Parametric System Identification
In a full, physical, linear model, the model’s structure and
coefficients can be determined from first principles
In most cases, we have to estimate/tune the parameters because
of an incomplete understanding about the full system (unknown
drag, …)
We can use exemplar data (input/output examples), {x(t), y(t)}, to
estimate the unknown parameters
Initially assume that the structure is known (unrealistic, but …), and
all that remains to be estimated are the parameter values.
[Block diagram: a plant with parameters θ and a model with estimated parameters θ̂ are driven by the same signals; the control input u(t) and the measurable disturbance w(t) combine to give the effective input x(t) = u(t) + w(t); the plant output y(t) is corrupted by the noise e(t) (and v(t)); the model produces the prediction ŷ(t)]
EE-M110 2006/7, EF L5&6 5/29, v2.0
Recursive Parameter Estimation Framework
where:
θ, θ̂(t-1) are the real and estimated parameter vectors, respectively
u(t) is the control input sequence
y(t), ŷ(t) are the real and estimated outputs, respectively
e(t) is a white noise sequence (output/measurement noise)
w(t) is the disturbance from measurable sources
[Block diagram: a controller generates u(t), which together with the measurable disturbance w(t) drives both the plant (parameters θ) and the model (parameters θ̂(t-1)); the measured plant output y(t) includes the additive noise e(t); the difference between y(t) and the prediction ŷ(t) is fed back to update the model]
EE-M110 2006/7, EF L5&6 6/29, v2.0
Basic Assumptions in System Identification
1) It is assumed that the unobservable disturbances can be
aggregated and represented by a single additive noise e(t).
There may also be input noise. Generally, it is assumed to be
zero-mean, Gaussian
2) The system is assumed to be linear with time-invariant
parameters, so θ is not time-varying. This is only
approximately true within certain limits
3) The input signal u(t) is assumed exactly known. Often there
is noise associated with reading/measuring it
4) The system noise e(t) is assumed to be uncorrelated with
the input process u(t). This is unlikely to be true, for instance
due to feedback of y(t)
5) The input signals need to be sufficiently exciting: they need
to excite all relevant modes in the model for identification and
testing
EE-M110 2006/7, EF L5&6 7/29, v2.0
Discrete-Time Transfer Function Models
On this course, we’re primarily concerned with discrete time signals and
systems.
Real-world physical, mechanical, electrical systems are continuous
Consider the CT resistor-capacitor circuit:
RC dy(t)/dt + y(t) = u(t)
Approximating the derivative over a sample interval Δ:
RC (y(t) - y(t-1))/Δ + y(t-1) = u(t-1)
y(t) = (1 - Δ/RC) y(t-1) + (Δ/RC) u(t-1)
Let q^-1 denote the backward shift operator, q^-1 y(t) = y(t-1); then we have
A(q) y(t) = B(q) u(t)
where
A(q) = 1 - (1 - Δ/RC) q^-1,   B(q) = (Δ/RC) q^-1
NB we can use the c2d() Matlab function to go from the continuous time
(transfer function, state space) domain to the discrete time, z-domain.
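As a rough sketch of the c2d() route mentioned above (the component values and sample time Δ are assumed here, chosen so that Δ/RC = 0.5; note that c2d with a zero-order hold gives slightly different coefficients to the simple difference approximation):
% Continuous-time RC circuit: RC*dy/dt + y = u  =>  G(s) = 1/(RC*s + 1)
RC = 1; delta = 0.5;            % assumed values, so delta/RC = 0.5
Gc = tf(1, [RC 1]);             % continuous-time transfer function
Gd = c2d(Gc, delta, 'zoh')      % discrete-time (z-domain) model
% Compare with the difference equation above:
% y(t) = (1 - delta/RC)*y(t-1) + (delta/RC)*u(t-1)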
EE-M110 2006/7, EF L5&6 8/29, v2.0
Transfer Function/ARX DT LTI Model
The previous model is an example of an AutoRegressive with eXogenous
input (ARX) model, which can be expressed more generally as:
A(q) y(t) = B(q) u(t) + e(t)
A(q) = 1 - a1 q^-1 - … - an q^-n
B(q) = b1 q^-1 + … + bm q^-m
Some comments about the form of this model:
1. The degree of the polynomials determines the complexity of the system’s
response and the number of parameters that have to be estimated. The
roots of A(q) determine system stability
2. a0=1, without loss of generality, so the model can be written as a predictive
model y(t) = a1 y(t-1) + … + b1 u(t-1) + …
3. b0=0, as it is assumed that an input cannot instantly affect the output, and
so there must be at least a delay of one time instant between u & y
(assumes a fast enough sample time, relative to the system dynamics).
4. Typically e ~ N(0, σ²) – independent and identically distributed
5. Close relationship between the q-shift and z-transform
6. When n=0, this produces a finite impulse response
EE-M110 2006/7, EF L5&6 9/29, v2.0
Linear Regression
The ARX system's prediction model can be expressed as
ŷ(t|θ) = B(q) u(t) + (1 - A(q)) y(t)
• Here the model's parameters can be written as θ = [a1, …, an, b1, …, bm]ᵀ
• Treat the model as a deterministic system
• This is natural if the error term is considered to be insignificant or difficult
to guess
• This denotes the model structure M (linear, time invariant, for example),
and a particular model with a parameter value θ, is M(θ).
This can be written as a linear regression structure:
ŷ(t|θ) = xᵀ(t) θ
where
Parameter vector: θ = [a1, …, an, b1, …, bm]ᵀ
Input vector: x(t) = [y(t-1), …, y(t-n), u(t-1), …, u(t-m)]ᵀ
The term regression comes from the statistics literature and provides a
powerful set of techniques for determining the parameters and
interpreting the models. Note that we need access to the previous outputs y(t-1), …
EE-M110 2006/7, EF L5&6 10/29, v2.0
LTI DT ARMAX Model
A more general discrete time, linear time invariant model also includes
Moving Average terms on the error/residual signal:
A(q) y(t) = B(q) u(t) + C(q) e(t)
A(q) = 1 - a1 q^-1 - … - a_na q^-na
B(q) = b1 q^-1 + … + b_nb q^-nb
C(q) = 1 + c1 q^-1 + … + c_nc q^-nc
Here, we describe the equation error term, e(t), as a moving average of
white noise (non-iid measurement errors)
Simple example
y(t) = 0.5y(t-1) + 0.3y(t-2) + 1.2u(t-1) - 0.3u(t-2) + 0.5e(t) + 0.5e(t-1)
This can be written as a pseudolinear regression:
ŷ(t|θ) = (B(q)/C(q)) u(t) + (1 - A(q)/C(q)) y(t)
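A minimal sketch of simulating the simple ARMAX example above with Matlab's filter() function (the step input, noise level and data length are assumptions, not part of the original notes):
% A(q) = 1 - 0.5q^-1 - 0.3q^-2, B(q) = 1.2q^-1 - 0.3q^-2, C(q) = 0.5 + 0.5q^-1
A = [1 -0.5 -0.3]; B = [0 1.2 -0.3]; C = [0.5 0.5];
T = 200;
u = ones(T,1);                           % assumed step input
e = 0.1*randn(T,1);                      % assumed white noise sequence
y = filter(B, A, u) + filter(C, A, e);   % y = (B/A)u + (C/A)e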
EE-M110 2006/7, EF L5&6 11/29, v2.0
Exemplar Training Data
To estimate the unknown parameters θ, we need to collect some
exemplar input-output data; system identification is then a process
of estimating the parameter values that best fit the data.
The data is generated by a system of noisy ARX linear equations of the
form
y = Xθ + e
where
y is a column vector of measured plant outputs (T,1)
X is a matrix of input regressors (T,n+m)
θ is the "true" parameter vector (n+m,1)
e is the error vector (T,1)
Each row of X represents a single input/output sample. Each column of
X represents a time-delayed output or input.
Note that there is a "burn-in" period: the first few outputs y(1), y(2), … are
needed to form the time-delayed regressor [y(t-1), …, y(t-n)] before any
complete samples can be stored.
EE-M110 2006/7, EF L5&6 12/29, v2.0
Example: Data for 1st Order ARX Model
1st Order model representation
First order plant model (exponential decay) with no external disturbances
and additive measurement noise (Slide 7)
Input vector, output signal and parameters
At time t, the 1st order DT model is represented as
Output y(t)
Input x(t) = [y(t-1); u(t-1)]
Parameters θ = [θ1; θ2]
Data
As there are two parameters, if the system is truly first order and there is
no measurement noise on any of the signals, we just need two (linearly
independent) samples to estimate θ.
If there is measurement noise in y(t), we need to collect more data to
reduce the effect of the random noise.
Store X=[y(1) u(1); y(2) u(2); y(3) u(3); …], y=[y(2); y(3); y(4); …]
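As a sketch (not part of the original notes) of building X and the target vector for a general ARX model with n output lags and m input lags, assuming y and u are column vectors of the same length:
n = 1; m = 1;                      % model orders (1st order example)
p = max(n, m);                     % burn-in: the first p samples only appear as regressors
T = length(y);
X  = zeros(T-p, n+m);
yt = zeros(T-p, 1);                % target vector (y in the notes)
for t = p+1:T
    X(t-p, :) = [y(t-1:-1:t-n)' u(t-1:-1:t-m)'];   % [y(t-1)..y(t-n), u(t-1)..u(t-m)]
    yt(t-p)   = y(t);
end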
EE-M110 2006/7, EF L5&6 13/29, v2.0
Prediction Residual Signal
The residual signal (measured - predicted) is defined as:
r(t) = y(t) - ŷ(t)
and can be represented as:
[Block diagram: the plant (parameters θ) and the model (parameters θ̂) are driven by the same input x(t); the measured output y(t), which includes the noise e(t), is compared with the prediction ŷ(t), and the difference is the residual r(t)]
A simple regression interpretation is shown below: each x represents an exemplar
sample from a single input, single output system, the line is the model output ŷ(t),
and the vertical distance from a measured output y(t) to the line is the residual r(t).
[Scatter plot: output measurements (x) around the fitted regression line, with the residual r(t) marked as the gap between y(t) and ŷ(t)]
EE-M110 2006/7, EF L5&6 14/29, v2.0
Measures of Model Goodness
The model's response can be expressed as
ŷ(t) = xᵀ(t) θ̂
where θ̂ is the model's estimated parameter vector and x(t) is
the input vector.
If ŷ(t) = y(t), the model's response is correct for that single
time sample and the residual r(t) = y(t) - ŷ(t) is zero. The
residual's magnitude gives us an idea of the "goodness" of
the parameter vector estimate for that data point.
For a set of measured outputs and predictions {y(t), ŷ(t)}, the
"size" of the residual vector r = y - ŷ is an estimate of the
parameter goodness.
We can determine the size by looking at the norm of r.
EE-M110 2006/7, EF L5&6 15/29, v2.0
Residual Norm Measures
A vector p-norm (of a vector r) is defined by:
||r||_p = ( Σ_{i=1..T} |r_i|^p )^(1/p)
The most common p-norm is the 2-norm:
||r||_2 = ( Σ_{i=1..T} r_i² )^(1/2)
The vector p-norm has the properties that:
• ||r|| ≥ 0
• ||r|| = 0 iff r = 0
• ||kr|| = |k| ||r||
• ||r1+r2|| ≤ ||r1|| + ||r2||
For the residual vector, the norm is only zero if all the
residuals are zero. Otherwise, a small norm means that, on
average, the individual residuals are small in magnitude.
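The norm definitions can be checked directly in Matlab (a trivial sketch with an assumed residual vector):
r = [0.1; -0.3; 0.2];              % assumed residual vector
norm(r, 2)                         % 2-norm: sqrt(sum(r.^2))
norm(r, 1)                         % 1-norm: sum(abs(r))
p = 3;
sum(abs(r).^p)^(1/p)               % general p-norm from the definition above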
EE-M110 2006/7, EF L5&6 16/29, v2.0
Sum of Squared Residuals
The most common discrete time performance index is the
sum of squared residuals (2-norm squared):
f(θ) = ||r||_2² = Σ_{i=1..T} (y_i - ŷ_i)²
For each data point, the model's output is compared
against the plant's, and the error is squared and summed
over all the points.
Any non-zero value for any of the residuals will mean
that the performance index is positive.
The performance function f(θ) is a function of the
parameter values, because some parameter values will
cause large residuals, others will cause small residuals.
We want the parameter values that minimize f(θ) (≥ 0).
EE-M110 2006/7, EF L5&6 17/29, v2.0
Relationship between Noise & Residual
The aim of parameter estimation is to estimate the values of θ that
minimize this performance index (sum of squared residuals or errors,
SSE).
When the model can predict the plant exactly:
r(t) = e(t)
i.e. the residual signal is equal to the additive noise signal.
Note that the SSE is often replaced by the mean squared error (MSE),
defined by
MSE = SSE/T ≈ σ² (the variance of the additive noise signal)
This is the variance of the residual signal. It simply represents the
average squared error and ensures that the performance function
does not depend on the amount of data.
Example: when we have 1000 repeated trials (step responses) of 9 data
points for the DT electric circuit, with additive noise N(0, 0.01):
MSE = ||r||_2² / T = 0.0103 ≈ σ²
RMSE = 0.1015 ≈ σ
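A short sketch of the SSE/MSE/RMSE calculation (the residual vector here is assumed; in practice it would be formed as r = y - X*thetaHat):
r = 0.1*randn(9,1);                % assumed residual vector, e.g. r = y - X*thetaHat
T = length(r);
SSE  = r'*r;                       % sum of squared residuals, ||r||^2
MSE  = SSE/T;                      % approaches the noise variance sigma^2
RMSE = sqrt(MSE);                  % approaches the noise standard deviation sigma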
EE-M110 2006/7, EF L5&6 18/29, v2.0
Example: DT RC Electrical Circuit
Consider the DT, first order, LTI representation of the RC circuit, which is an
ARX model (Slides 7 & 12)
Assume that Δ/RC = 0.5, then:
y(t) = 0.5*y(t-1) + 0.5*u(t-1)
Here the system is initially at rest, y(0)=0. Note that u here refers to a step
signal which is switched on at t=1 (u(0)=0), rather than the control
signal
Assuming that 10 steps are taken, we collect 9 data points for system
identification:
>> X = [y(1:end-1)' u(1:end-1)'];
>> y1 = y(2:end)';
Gaussian random noise of standard deviation 0.05 was also added to y1:
>> y1e = y1 + 0.05*randn(size(y1));
X = [0 0; 0 1; 0.5 1; … ; 0.992 1]      y = [0; 0.5; 0.75; … ; 0.996]
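A minimal sketch of the simulation and data collection described on this slide (the step input and Δ/RC = 0.5 come from the slide; Matlab indexing starts at 1, so y(1) holds the initial condition y(0) = 0):
randn('state', 123456);                % seed used in the notes
N  = 10;
u  = [0 ones(1, N-1)];                 % step switched on after the first sample
y  = zeros(1, N);                      % y(1) = 0, the initial condition
for t = 2:N
    y(t) = 0.5*y(t-1) + 0.5*u(t-1);    % DT RC circuit difference equation
end
X   = [y(1:end-1)' u(1:end-1)'];       % regressors [y(t-1) u(t-1)]
y1  = y(2:end)';                       % noise-free targets
y1e = y1 + 0.05*randn(size(y1));       % noisy measurements
thetaHat = X\y1e                       % least squares estimate, close to [0.5; 0.5]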
EE-M110 2006/7, EF L5&6 19/29, v2.0
Example: Noisy Electric Circuit
Note here, we’re cheating a bit by assuming the exact
measurement y(t-1) is available to the model’s input but only
the noisy measurement ye(t) is available to the model’s output.
NB, in these notes, y() generally denotes the noisy output
X = [0 0; 0 1; 0.5 1; … ; 0.992 1]      y = [0; 0.5; 0.75; … ; 0.996]
ye = y + N(0, 0.01) noise = [0.037; 0.510; 0.774; … ; 1.008]
θ̂ = [0.4756 0.5173]ᵀ
ŷ = X θ̂ = [0; 0.511; 0.744; … ; 0.974]
r = ye - ŷ = [0.0367; -0.001; 0.029; … ; 0.033]
rᵀr = 0.073,   σ̂² = rᵀr/T = 0.0081,   σ̂ = 0.090
NB randn('state', 123456)
EE-M110 2006/7, EF L5&6 20/29, v2.0
Parameter Estimation
An important part of system identification is being able to
estimate the parameters of a linear model, when a
quadratic performance function is used to measure the
model’s goodness.
This produces the well-known normal equations for least
squares estimation:
θ̂ = (XᵀX)^-1 Xᵀ y
• This is a closed form solution
• Efficiently and robustly solved (in Matlab)
• Permits a statistical interpretation
• Can be solved recursively
Investigated over the next 3-4 lectures
EE-M110 2006/7, EF L5&6 21/29, v2.0
Noise-free Parameter Determination
Parameter estimation works by assuming a plant/model
structure, which is taken to be exactly known.
If there are n+m parameters in the model, we can collect
n+m pieces of data (linearly independent – to ensure
that the input/data matrix, X, is invertible):
Xθ = y
and invert the matrix to find the exact parameter values:
θ = X^-1 y
In Matlab, both of the following forms are equivalent:
theta = inv(X)*y;
theta = X\y;
% theta = [0.5; 0.5] for the previous example
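A toy check of the noise-free case (a sketch using two linearly independent samples from the first order circuit, y(t) = 0.5y(t-1) + 0.5u(t-1)):
X = [0   1;            % [y(1) u(1)]
     0.5 1];           % [y(2) u(2)]
y = [0.5; 0.75];       % [y(2); y(3)]
theta = X\y            % exact parameters, [0.5; 0.5]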
EE-M110 2006/7, EF L5&6 22/29, v2.0
Linear Model and Quadratic Performance
When the model is linear and the data is noisy (missing inputs,
unmeasurable disturbances), the Sum Squared Error (SSE)
performance index can be expressed as:
f(θ) = Σ_{i=1..T} (y_i - ŷ_i)² = Σ_{i=1..T} (y_i - x_iᵀθ)² = Σ_i y_i² - 2 Σ_i y_i x_iᵀθ + Σ_i (x_iᵀθ)²
This expression is quadratic in θ. Typically
size(X,1) >> size(X,2)
It is of the form (for 2 inputs/parameters):
f(θ) = 5 - 2θ1 - 3θ2 + 8θ1² + 6θ2² + 0.5θ1θ2
The equivalent system of linear equations Xθ = y is inconsistent (it only
holds exactly with the error term e included)
EE-M110 2006/7, EF L5&6 23/29, v2.0
Quadratic Matrix Representation
This can also be expressed in matrix form:
f(θ) = (y - Xθ)ᵀ(y - Xθ)
     = yᵀy - yᵀXθ - θᵀXᵀy + θᵀXᵀXθ
     = yᵀy - 2θᵀXᵀy + θᵀXᵀXθ
The general form for a quadratic is:
f(θ) = ½ θᵀHθ - θᵀf + c
where
H ∝ XᵀX is the Hessian/covariance matrix, with H_ij = Σ_k x_{k,i} x_{k,j} ∝ E(x_i x_j)
f ∝ Xᵀy is the cross-correlation vector, with f_i = Σ_k x_{k,i} y_k ∝ E(x_i y)
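A sketch (with small assumed numbers) showing that the summation and matrix forms of the SSE agree:
X = [1 2; 3 4; 5 6];  y = [1; 2; 3];       % assumed data
theta = [0.2; 0.1];                        % any trial parameter vector
H = X'*X;                                  % Hessian / covariance matrix
f = X'*y;                                  % cross-correlation vector
c = y'*y;
SSE1 = (y - X*theta)'*(y - X*theta);       % direct form
SSE2 = theta'*H*theta - 2*theta'*f + c;    % matrix quadratic form, equals SSE1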
EE-M110 2006/7, EF L5&6 24/29, v2.0
Normal Equations for a Linear Model
When the parameter vector is optimal:
∂f/∂θ = 0
For a quadratic MSE with a linear model:
∂f/∂θ = ∂/∂θ (θᵀXᵀXθ - 2θᵀXᵀy + yᵀy) = 2XᵀXθ - 2Xᵀy
At optimality (θ = θ̂):
XᵀX θ̂ - Xᵀy = 0   ⇒   θ̂ = (XᵀX)^-1 Xᵀ y
In Matlab, the normal equations are:
thetaHat = inv(X'*X)*X'*y;
thetaHat = pinv(X)*y;
thetaHat = X\y;
EE-M110 2006/7, EF L5&6 25/29, v2.0
Example 1: 2 Parameter Model
Data: 3 data points and 2 unknowns
Find the Least Squares solution to:
[2 -0.8; -1.2 2; 1 -0.85] [θ1; θ2] ≈ [1.1; 0.95; 0.2]
i.e.
X = [2 -0.8; -1.2 2; 1 -0.85],   y = [1.1; 0.95; 0.2]
Form the variance/covariance matrix and cross-correlation vector:
XᵀX = [6.44 -4.85; -4.85 5.36],   Xᵀy = [1.26; 0.85]
Invert the variance/covariance matrix:
(XᵀX)^-1 = [0.4870 0.4404; 0.4404 0.5848]
Least squares solution:
θ̂ = (XᵀX)^-1 Xᵀ y = [0.988 1.052]ᵀ
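The worked example can be checked in a few lines of Matlab (a sketch, using the data as reconstructed above):
X = [2 -0.8; -1.2 2; 1 -0.85];
y = [1.1; 0.95; 0.2];
H = X'*X                     % approx [6.44 -4.85; -4.85 5.36]
f = X'*y                     % approx [1.26; 0.85]
thetaHat  = inv(H)*f         % normal equations, approx [0.988; 1.052]
thetaHat2 = X\y              % same solution via the backslash operator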
EE-M110 2006/7, EF L5&6 26/29, v2.0
Example 2: Electrical Circuit ARX Model
9 exemplars and 2 parameters. Additive measurement noise (see slides 7, 12, 18 & 19):
X = [0 0; 0 1; 0.5 1; … ; 0.992 1],   y = [0.037; 0.510; 0.774; … ; 1.007]
Hessian (variance/covariance) matrix and cross-correlation vector:
H = XᵀX = [5.3489 6.0078; 6.0078 8],   Xᵀy = [5.57; 6.89]
Inverse Hessian matrix:
H^-1 = (XᵀX)^-1 = [1.194 -0.897; -0.897 0.799]
Least squares solution:
θ̂ = (XᵀX)^-1 Xᵀ y = [0.467 0.511]ᵀ
NB randn('state', 123456)
EE-M110 2006/7, EF L5&6 27/29, v2.0
Investigation into the Performance Function
We can "plot" the performance index against different
parameter values in a model.
As shown earlier, f(θ) is a quadratic function of θ.
It is "centred" at θ̂, i.e. f(θ̂) = min f(θ).
The shape (contours) depends on the Hessian matrix XᵀX; this
influences the ability to identify the plant. See the next lectures.
[Contour plot of f over the parameters θ1 and θ2]
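A sketch of how such a plot could be produced for the two-parameter circuit example (assuming X and the noisy targets y1e have been built as on slide 18):
[t1, t2] = meshgrid(0:0.02:1, 0:0.02:1);   % grid of candidate parameter values
f = zeros(size(t1));
for i = 1:numel(t1)
    r = y1e - X*[t1(i); t2(i)];            % residual for this parameter pair
    f(i) = r'*r;                           % SSE performance index
end
contour(t1, t2, f, 30)
xlabel('\theta_1'); ylabel('\theta_2');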
EE-M110 2006/7, EF L5&6 28/29, v2.0
L5&6 Summary
ARX and ARMAX discrete time linear models are widely used
System identification is being considered simply as parameter estimation
The residual vector is used to assess the quality of the model (parameter
vector)
The sum of squared errors/residuals (squared 2-norm) is commonly used to measure
the residual's size, because it can be interpreted in terms of the noise variance
and because it is analytically convenient
For a linear model, the SSE is a quadratic function of the parameters,
which can be differentiated to estimate the optimal parameter via the
normal equations
EE-M110 2006/7, EF L5&6 29/29, v2.0
L5&6 Lab
Theory
• Make sure you can derive the normal equations S22-24
Matlab
1. Implement the DT RC circuit simulation, S18, so you can perform a least
squares parameter estimation given noisy data about the electrical circuit
2. Set the Gaussian random seed, as per S26, and check your estimates
are the same
3. Set a different seed and note that the optimal parameter values are
different
4. Perform the step experiment 10, 100, 1000, … times and note that the
estimated optimal parameter values tend towards the true values of [0.5
0.5].
5. Load the data into the identification toolbox GUI and create a first order
parametric model with model orders [1 1 1]. NB you do not need to
remove the means from the data (why not?). Calculate the model and
view the value of the parameters and the model fit, as well as checking
the step response and validating the model.