Introduction to Hybrid Vehicle System Modeling and Control - 2013 - Liu - Appendix A System Identification State and.pdf

APPENDIX A
SYSTEM IDENTIFICATION:
STATE AND PARAMETER
ESTIMATION TECHNIQUES
Building mathematical models of subsystems and components is one of the most
important tasks in the analysis and design of hybrid vehicle systems. There are
two approaches to building a mathematical model: based on the principles and
mechanism as well as the relative physical and chemical laws describing the
characteristics of a given system or based on the observed system behaviors.
In engineering practice, the architecture of the mathematical model is usually
determined from the first approach, and detailed model parameters are deter-
mined from the second approach. In this chapter, we introduce basic theories and
methodologies used to build a mathematical model and estimate the parameters
of the model.
A.1 DYNAMIC SYSTEMS AND MATHEMATICAL MODELS
A.1.1 Types of Mathematical Models
The models mentioned in this book are one or a set of mathematical equations
that describe the relationship between inputs and outputs of a physical system.
These mathematical equations may have various forms such as algebra equations,
Introduction to Hybrid Vehicle System Modeling and Control, First Edition. Wei Liu.
© 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.
325

326 SYSTEM IDENTIFICATION: STATE AND PARAMETER ESTIMATION TECHNIQUES
differential equations, partial differential equations, or state space equations.
Mathematical models can be classified as:
• Static versus Dynamic Model A static model does not include time, that
is, the behavior of the described system does not vary over time, while a
dynamic model does. Dynamic models are typically described by a differ-
ential equation or difference equation. Laplace and Fouriers transforms can
be applied to the time-invariant dynamic model.
• Time-Varying versus Time-Invariant Model For a time-varying model, the
input–output characteristics vary over time, that is, the model parameters
differ over time, while a time-invariant model does not. Laplace and Fourier
transforms cannot be applied to time-variant systems.
• Deterministic versus Stochastic Model A deterministic model equation is
uniquely determined by parameters and previous states of the variables, but
different initial conditions may result in different solutions in such models.
In contrast, the variables of a stochastic model can only be described by a
stochastic process or probability distribution.
• Continuous versus Discrete Model A continuous model is one in which
the variables are all functions of the continuous-time variable t. A discrete
model differs from the continuous model in that one or more variables
of the model are in the form of either a pulse train or digital code. In
general, a discrete system receives data or information only intermittently at
specific instants of time. For HEV/EV analysis and design, discrete models
are mostly used.
• Linear versus Nonlinear Model If a mathematical model is linear, it means
that all operators in the model present linearity; otherwise the model is
nonlinear. A linear model satisfies the principles of superposition, which
take homogeneity and additivity rules together.
• Lumped-Parameter versus Distributed-Parameter Model A lumped-para-
meter model can be represented by an ordinary differential or difference
equation with time as the independent variable. A distributed-parameter
system model is described by a partial differential equation with space vari-
able and time variable dependence. Heat flow, diffusion processes, and long
transmission lines are typical distributed-parameter systems. Given certain
assumptions, a distributed-parameter system can be converted to a lumped-
parameter system through a finite-analysis method such as a finite-difference
method.
A.1.2 Linear Time-Continuous Systems
A.1.2.1 Input–Output Model of Linear Time-Invariant and Time-Continuous
System For a linear time-invariant and time-continuous system, shown as
Fig. A-1, the input–output relationship is normally described by the linear

DYNAMIC SYSTEMS AND MATHEMATICAL MODELS 327
System
u(t) y(t)
Input Output
Fig. A-1. Input–output system.
differential equation
dn
y
dtn
+ an−1
dn−1
y
dtn−1
+ · · · + a1
dy
dt
+ a0
y = bm
dm
u
dtm
+ bm−1
dm−1
u
dtm−1
+ · · · + b1
du
dt
+ b0u n ≥ m (A.1)
Initial conditions at t = t0 :
di
y
dti

t=t0
i = 0, 1, 2, . . . , n − 1
where u is the input variable and y is the output variable and coefficients ai, bi
are real constants and independent of u and y.
For a dynamic system, once input for t ≥ t0 and the initial conditions at
t = t0 are specified, the output response at t ≥ t0 is determined by solving
equation (A.1).
The transfer function is another way of describing a dynamic system. To
obtain the transfer function of the linear system represented by equation (A.1),
we simply take the Laplace transform on both sides of the equation and assume
zero initial conditions. The result is
(sn
+ an−1sn−1
+ · · · + a1s + a0)Y(s)
= (bmsm
+ bm−1sm−1
+ · · · + b1s + b0)U(s) n ≥ m (A.2)
The transfer function between u(t) and y(t) is defined as the ratio of Y(s) and
U(s); therefore, the transfer function of the system shown in Fig. A-1 is
G(s) =
Y(s)
U(s)
=
bmsm
+ bm−1sm−1
+ · · · + b1s + b0
sn + an−1sn−1 + · · · + a1s + a0
(A.3)
From equation (A.3), it can be seen that the transfer function is an algebraic
equation by which it will be much easier to analyze system performance. The
transfer function (A.3) has the following properties:
• The transfer function (A.3) is defined only for a linear time-invariant system.
• All initial conditions of the system are assumed to be zero.
• The transfer function is independent of the input and output.

Another way of modeling a linear time-invariant system is using the impulse
response (or weighting function). The impulse response of a linear system is
defined as the output response g(τ) of the system when the input is a unit
impulse function δ(t).
The output of the system shown in Fig. A-1 can be described by its impulse
response (or weighting function) g(τ) as
y(t) =
t
τ=0
g(τ)u(t − τ) dτ = g(t) ∗ u(t) = g(t) convolves into u(t) (A.4)
Knowing {g(τ)}|∞
τ=0 and u(t) for τ ≤ t, we can consequently compute the cor-
responding output y(t), τ ≤ t for any input. Thus, the impulse response is a
complete characterization of the system.
The impulse response also directly leads to the definition of transfer functions
of linear time-invariant systems. Taking the Laplace transform on both sides of
equation (A.4), we have the following equation from the real convolution theorem
of Laplace transformation:
L(y(t)) = Y(s) = L(g(t) ∗ u(t)) = L(g(t)) · L(u(t)) = G(s)U(s)
if u(t) = δ(t), then L(u(t)) = U(s) = 1
(A.5)
Equation (A.5) shows that the transfer function is the Laplace transform of the
impulse responseg(t), and the Laplace transform of the output Y(s) is equal to
the product of the transform function G(s) and the Laplace transform of the
input U(s).
A.1.2.2 State Space Model of Linear Time-Invariant and Time-Continuous
System In the state-space form, the relationship between input and output is
written as a first-order differential equation system using a state vector x(t).
This description of a linear dynamic system became a primary approach after
Kalman’s work on modern prediction control (Goodwin and Payne, 1977). It
is especially useful for hybrid vehicle system design in that insight into the
physical mechanisms of the system can more easily be incorporated into a state
space model than into an input–output model.
A linear stime-invariant and time-continuous system can be described by the
state space equation
State equations: ẋ =
dx
dt
= Ax + Bu(t)
Output equations: y(t) = Cx + Du(t) (A.6)
Initial conditions: x(0) = x0
where x is the n × 1 state vector; u is the p × 1 input vector; y is
the q × 1 output vector; A is an n × n coefficient matrix with constant

elements,
A =
⎡
⎢
⎢
⎢
⎣
a11 a12 · · · a1n
a21 a22 · · · a2n
.
.
.
.
.
.
.
.
.
an1 an2 · · · ann
⎤
⎥
⎥
⎥
⎦
(A.7)
B is an n × p coefficient matrix with constant elements,
B =
⎡
⎢
⎢
⎢
⎣
b11 b12 · · · b1p
b21 b22 · · · b2p
.
.
.
.
.
.
.
.
.
bn1 bn2 · · · bnp
⎤
⎥
⎥
⎥
⎦
(A.8)
C is an q × n coefficient matrix with constant elements,
C =
⎡
⎢
⎢
⎢
⎣
c11 c12 · · · c1n
c21 c22 · · · c2n
.
.
.
.
.
.
.
.
.
cq1 cq2 · · · cqn
⎤
⎥
⎥
⎥
⎦
(A.9)
and D is an q × p coefficient matrix with constant elements,
D =
⎡
⎢
⎢
⎢
⎣
d11 d12 · · · d1p
d21 d22 · · · d2p
.
.
.
.
.
.
.
.
.
dq1 dq2 · · · dqp
⎤
⎥
⎥
⎥
⎦
(A.10)
A. Relationship between State Space Equation and Differential Equation of
Dynamic System Let us consider a single-input, single-output, linear time-
invariant system described by an nth-order differential equation (A.1). The
problem is to represent the system (A.1) by a first-order differential equation.
Since the state variables are internal variables of a given system, they may
not be unique and may be dependent on how they are defined. Let us seek a
convenient way to assign the state variables and make them in the form of
equation (A.6) as follows.
For equation (A.6), a way of defining the state variables for equation (A.1) is
ẋ1 =
dx1
dt
= x2
ẋ2 =
dx2
dt
= x3 (A.11)
.
.
.

ẋn−1 =
dxn−1
dt
= xn
ẋn =
dxn
dt
= −a0x1 − a1x2 − a3x2 − · · · − an−2xn−1 − an−1xn + u
where the last state equation is obtained by the highest order derivative term to
the rest of equation (A.6). The output equation is the linear combination of state
variables and input as
y =

b0 − a0bm

x1 +

b1 − a1bm

x2 + · · · +

bm−1 − a0bm

xm + bmu (A.12)
In vector–matrix form, equations (A.11) and (A.12) are written as
State equations: ẋ = Ax + Bu(t)
Output equations: y(t) = Cx + Du(t)
(A.13)
where x is the n × 1 state vector, u is the scalar input and, y is the scalar output.
The coefficient matrices are
A =
⎡
⎢
⎢
⎢
⎣
0 1 0 · · · 0
0 0 1 · · · 0
· · · · · · · · ·
0 0 0 · · · 1
−a0 −a1 −a2 · · · −an−1
⎤
⎥
⎥
⎥
⎦
n×n
B =
⎡
⎢
⎢
⎢
⎢
⎢
⎣
0
0
.
.
.
0
1
⎤
⎥
⎥
⎥
⎥
⎥
⎦
n×1
C =

b0 − a0bm b1 − a1bm · · · bm−1 − am−1bm 0 · · · 0

1×n
D = [bm]1×1
(A.14)
Example A-1 Consider the input–output differential equation of a dynamic
system
...
y + 6ÿ + 41ẏ + 7y = 6u (A.15)
Define a state space equation and output equation of the system.
Solution: Define the state variables as
x1 = 1
6 y x2 = 1
6 ẏ x3 = 1
6 ÿ
Then,
ẋ1 = 1
6 ẏ = x2, ẋ2 = 1
6 ÿ = x3
ẋ3 = 1
6
...
y = −ÿ − 41
6 ẏ − 7
6 y + u = −7x1 − 41x2 − 6x3 = u
y = 6x1

that is, the defined state space and output equations of the system are
⎡
⎣
ẋ1
ẋ2
ẋ3
⎤
⎦ =
⎡
⎣
0 1 0
0 0 1
−7 −41 −6
⎤
⎦
⎡
⎣
x1
x2
x3
⎤
⎦ +
⎡
⎣
0
0
1
⎤
⎦ u
y = [6 0 0]
⎡
⎣
x1
x2
x3
⎤
⎦
B. Relationship between State Space Equation and Transfer Function of Dynamic
System Consider that a linear time-invariant system is described by the state
space model
State space equations: ẋ = Ax + Bu(t)
Output equations: y(t) = Cx + Du(t)
(A.16)
where x(t) is the n × 1 state vector, u(t) is the p × 1 input vector, y(t) is
the q × 1 output vector, and A, B, C, and D are the coefficient matrices with
appropriate dimensions.
Taking the Laplace transform on both sides of equation (A.16) and solving
for X(s), we have
sX(s) − x(0) = AX(s) + BU(s) ⇒ (sI − A)X(s) = x(0) + BU(s)
(A.17)
Furthermore, it can be written as
X(s) = (sI − A)−1
x(0) + (sI − A)−1
BU(s) (A.18)
Assume that the system has zero initial conditions, x(0) = 0; then equation (A.18)
becomes
X(s) = (sI − A)−1
BU(s) (A.19)
The Laplace transform of output equation (A.16) is
Y(s) = CX(s) + DU(s) (A.20)
Substituting equation (A.19) into equation (A.20), we have
Y(s) = C(sI − A)−1
BU(s) + DU(s) = [C(sI − A)−1
B + D]U(s) (A.21)
Thus, the transfer function is defined as
G(s) =
Y(s)
U(s)
= C(sI − A)−1
B + D (A.22)

which is a q × p matrix corresponding to the dimensions of the input and output
variables of the system.
Example A-2 Consider the state space equation of a dynamic system
⎡
⎣
ẋ1
ẋ2
ẋ3
⎤
⎦ =
⎡
⎣
0 1 0
−2 −3 0
−1 1 −3
⎤
⎦
⎡
⎣
x1
x2
x3
⎤
⎦ +
⎡
⎣
0
1
2
⎤
⎦ u
y = [0 0 1]
⎡
⎣
x1
x2
x3
⎤
⎦
(A.23)
Determine the transfer function of the system.
Solution: The corresponding system matrix input and output vectors are
A =
⎡
⎣
0 1 0
−2 −3 0
−1 1 −3
⎤
⎦ B =
⎡
⎣
0
1
2
⎤
⎦ C = [0 0 1]
Let
(sI − A)−1
=
⎡
⎣
s −1 0
2 s + 3 0
1 −1 s + 3
⎤
⎦
−1
=
1
s3 + 6s2 + 11s + 6
⎡
⎣
s2
+ 6s + 9 s + 3 0
−2(s + 3) s(s + 3) 0
s + 5 s − 1 s2
+ 3s + 2
⎤
⎦
From equation (A.22), the transfer function of the system is
G(s) =
Y(s)
U(s)
= C(sI − A)−1
B =
[0 0 1]
s3 + 6s2 + 11s + 6
×
⎡
⎣
s2
+ 6s + 9 s + 3 0
−2(s + 3) s(s + s3) 0
s + 5 s − 1 s2
+ 3s + 2
⎤
⎦
⎡
⎣
0
1
2
⎤
⎦
=
2s2 + 7s + 3
s3 + 6s2 + 11s + 6

C. Controllability and Observability of Dynamic System Since state variables are
internal variables of a dynamic system, it is necessary to ask if the state variables
are controllable by system inputs as well as if they are observable from system
outputs. The system controllability and observability will answer the question,
and they are defined as follows:
The states of a dynamic system are controllable if there exists a piecewise
continuous control u(t) which will drive the state to any arbitrary finite
state x(tf ) from an arbitrary initial state x(t0) in a finite time tf − t0.
The states of a dynamic system are completely observable if the measurement
(output) y(t) contains the information which can completely identify the
state variables x(t) in a finite time tf − t0.
The concepts of controllability and observability are very important in the the-
oretical and practical aspects of modern control theory. The following theorems
provide the criteria judging if the states of a system are controllable and observ-
able or not.
Theorem A-1 For the system described by the system state space
equation (A.13) to be completely state controllable, it is necessary and sufficient
that the following n × np matrix has a rank of n:
M = [B AB A2
B · · · An−1
B] (A.24)
This theorem shows that the condition of controllability depends on the coefficient
matrices A and B of the system described by equation (A.13). The theorem also
gives a way to test the controllability of a given system.
Theorem A-2 For the system described by equation (A.13) to be completely
observable, it is necessary and sufficient that the following qn × n matrix has a
rank of n:
N =
⎡
⎢
⎢
⎢
⎢
⎢
⎣
C
CA
CA2
.
.
.
CAn−1
⎤
⎥
⎥
⎥
⎥
⎥
⎦
(A.25)
This theorem also gives a way to test the state observability of a given system.
The concepts of controllability and observability of a dynamic system were first
introduced by R. E. Kalman in the 1960s (Kalman, 1960a and 1960b). However,
although the criteria of state controllability and observability given by the above
theorems are quite straightforward, it is not very easy to implement for a multiple-
input system (Ljung, 1987).

A.1.3 Linear Discrete System and Modeling
In contrast to the continuous system, the information of discrete-time systems
is acquired at the sampling moment. If the original signal is continuous, the
sampling of the signal at discrete times is a form of signal modulation. A discrete
system is usually described by a difference equation, impulse response, discrete
state space, or impulse transfer function.
For a linear time-invariant discrete system, its input–output relationship is
described by the linear difference equation
y(k) + a1y(k − 1) + · · · + any(k − n)
= b0u(k) + b1u(k − 1) + · · · + bmu(k − m) (n ≥ m) (A.26)
or y(k) +
n

j=1
aj y(k − j) =
m

j=0
bj u(k − j) (n ≥ m)
where u(k) is the input variable and y(k) is the output variable and coefficients
ai, bi are real constants and independent of u(k) and y(k).
If we introduce
A(q) = 1 + a1q−1
+ · · · + anq−n
B(q) = b0 + b1q−1
+ · · · + bmq−m
(A.27)
equation (A.26) can be written in the form
A(q−1
)y(k) = B(q−1
)u(k) (A.28)
Taking the z transform on both sides of the equation and assuming zero initial
conditions, we will obtain the z-transfer function
G(z) =
Y(z)
U(z)
=
B(z−1)
A(z−1)
=
b0 + b1z−1 + · · · + bmz−m
1 + a1z−1 + · · · + anz−n
(A.29)
The state space model of a linear time-invariant discrete system is as follows:
State equations: x(k + 1) = Ax(k) + Bu(k)
Output equations: y(k) = Cx(k) + Du(k)
(A.30)
where x(k) is a n × 1 state vector, u(k) is a p × 1 input vector, y(k) is a q × 1
output vector, and A is an n × n coefficient matrix with constant elements
A =
⎡
⎢
⎢
⎢
⎣
a11 a12 · · · a1n
a21 a22 · · · a2n
.
.
.
.
.
.
.
.
.
an1 an2 · · · ann
⎤
⎥
⎥
⎥
⎦
(A.31)

B is a n × p coefficient matrix with constant elements
B =
⎡
⎢
⎢
⎢
⎣
b11 b12 · · · b1p
b21 b22 · · · b2p
.
.
.
.
.
.
.
.
.
bn1 bn2 · · · bnp
⎤
⎥
⎥
⎥
⎦
(A.32)
C is a q × n coefficient matrix with constant elements
C =
⎡
⎢
⎢
⎢
⎣
c11 c12 · · · c1n
c21 c22 · · · c2n
.
.
.
.
.
.
.
.
.
cq1 cq2 · · · cqn
⎤
⎥
⎥
⎥
⎦
(A.33)
and D is a q × p coefficient matrix with constant elements
D =
⎡
⎢
⎢
⎢
⎣
d11 d12 · · · d1p
d21 d22 · · · d2p
.
.
.
.
.
.
.
.
.
dq1 dq2 · · · dqp
⎤
⎥
⎥
⎥
⎦
(A.34)
A.1.4 Linear Time-Invariant Discrete Stochastic Systems
Hybrid vehicle design and analysis engineers exclusively deal with observations
of inputs and outputs in discrete systems. In this section, we introduce the linear
discrete stochastic systems.
A. Sampling and Shannon’s Sampling Theorem Because of the discrete-time
nature of the hybrid vehicle controller, sampling is a fundamental problem affect-
ing control algorithm design. The Shannon sampling theorem presents the con-
ditions that the information in the original signal will not be lost during the
sampling. It states that the original continuous-time signal can be perfectly recon-
structed if the sampling frequency is equal to or greater than two times the
maximum frequency in the original continuous-time signal spectrum, that is,
ωs =
2π
Ts
≥ 2ωmax (A.35)
The following consequences can be drawn from the theorem:
• To assure a perfect reconstruction of the original signal, the lower bound
of the sampling angular frequency is 2ωmax for the original signal with the
highest frequency component ωmax.

• Or, if the sampling frequency ωsis determined, the highest frequency compo-
nent of the original signal should be less than ωs/2 for it to be reconstructed
perfectly.
• The frequency ωs/2 plays an important role in signal conversions. It is also
called the Nyquist frequency.
In the design of discrete-time systems, selecting an appropriate sampling time
(Ts) is an important design step. Shannon’s sampling theorem gives the condi-
tions assuring that the contained information in the original signal will not be
lost during the sampling process but does not say what happens when the con-
ditions and procedures are not exactly met; therefore, a system design engineer
who deals with the sampling and reconstruction process needs to understand
the original signal thoroughly, particularly in the frequency content. To deter-
mine the sampling frequency, the engineer also needs to comprehend how the
signal is reconstructed through an interpolation and requirement for the recon-
struction error, including the aliasing and interpolation error. Generally speaking,
the smaller Ts is, the closer the sampled signal is to the continuous signal. But if
the Ts is very small, the actual implementation may be more costly. If Ts is too
large, inaccuracies may occur and much information about the true nature will
be lost.
B. Disturbances on a System Based on equation (A.28), the output can be exactly
calculated once the input is known, but this is unrealistic in most cases. The
inputs, outputs, and parameters of a system may vary randomly with time. This
randomness is called disturbance, and it may be of a nature of noise. In most
cases, such random effects can be described by adding a lumped item at the
output of a regular system model [see Fig. A-2 and equation A.36]:
A(q−1
)y(k) = B(q−1
)u(k) + γ (k) (A.36)
A system involving such a disturbance is called a stochastic system, in which
measurement noise and uncontrollable inputs are the main sources and causes
for such a disturbance. The most distinctive feature of a disturbance is that its
value cannot be exactly predicted. However, information about past disturbance
can be important for making quantified guesses about coming values. Hence,
it is natural to employ a probability method to describe the statistical features
of a disturbance. A special case is that if the disturbance term follows a normal
System
g(t)
u(t) y(t)
Input Output
Fig. A-2. System with disturbance.

distribution, then the statistical features are uniquely described by the mean value
μ and the standard deviation σ of the disturbance. Some examples of noise signals
are shown in Fig. A-3. In stochastic system control design, control algorithm
design engineers must understand the characteristics of the noise signal, and
it is necessary to identify whether the behavior of system disturbance/noise is
stationary or not. For a stationary stochastic process, the probability distribution
is the same over time or position; therefore, some parameters obtained by enough
number of tests are valid to describe this type of stochastic process.
In practice, the mean value, the standard deviation or the variance, and the
peak-to-peak values are the simple features to characterize a stationary stochastic
process, although the spectral density function φ(ω), which characterizes the
frequency content of a signal, is the better representation of the time behavior of
a stationary signal. The value φ(ω) ω/(2
) is the average energy of a signal
in a narrow band of width ω centered around ω.
The average energy in the wide range is defined as
σ2
=
1
2π
∞
−∞
φ(ω) dω (A.37)
A signal where φ(ω) is constant is called white noise. Such a signal has its
energy equally distributed among all frequencies.
In HEV control algorithm design, engineers frequently work with signals
described as stochastic processes with deterministic components. This is because
0 20 40 60 80
–4
–2
0
2
4
Band-limited white noise example 2
0 50 100 150 200
–2
–1
0
1
2
Band-limited white noise example 1
0 50 100 150 200
–6
–4
–2
0
2
4
Random
0 50 100 150 200
–1
0
1
2
3
Uniform random
Fig. A-3. Examples of noise signals.

the input sequence of a system or component is deterministic, or at least partly
deterministic, but the disturbances on the system are conveniently described by
random variables, so the system output becomes a stochastic process with deter-
ministic components.
C. Zero-Order Hold and First-Order Hold In a hybrid vehicle system, most orig-
inal signals are continuous. These continuous signals need to be sampled and then
sent to the processor at discrete times. With a uniform sampling period, the con-
tinuous signal u(t), shown in Fig. A-4a, will be sampled at the instances of
time 0, 2T, 3T, . . ., and the sampled values, shown in Fig. A-4b, constitute the
basis of the system information. They are expressed as a discrete-time function
u(kT ) or simplified as u(k). The sample-and-hold system is shown in Fig. A-5a;
ideally, the sampler may be regarded as a switch which closes and opens in an
infinitely short time at which time u(t) is measured. In practice, this assumption
(a) Continuous time t (b) Discrete time kT
0 2T 3T 4T 5T 6T 7T 8T 9T 10T 11T
T
u(kT
)
u(t)
0 2T 3T 4T 5T 6T 7T 8T 9T 10T 11T
T
Fig. A-4. Continuous and sampled-data functions.
Hold
Ideal sampler
u(t) u∗(t) uh(t)
u
h
(t)
u
h
(t)
(a) Sampler-hold system
Discrete time kT
(b) Zero-order hold (c) First-order hold
Discrete time kT
0 2T 3T 4T 5T 6T 7T 8T 9T10T11T
T 0 2T 3T 4T 5T 6T 7T 8T 9T10T11T
T
Fig. A-5. Input and output of sampler and holder.

is justified when the switching duration is very short compared with the sampling
interval Ts of the system. Since the sampled signal u(kT ) is a set of spikes, a
device is needed to hold them to make the controller able to process them. If
the signal is held constant over a sampling interval, it is called the zero-order
hold. If the signal is linearly increasing and decreasing over a sampling interval,
it is called the first-order hold. The input–output relationships of the zero-order
hold and first-order hold are illustrated by Fig. A-5b, c. For a zero-order hold,
the output u∗
(t) holds a constant value during the sampling time period Ts, while
a first-order hold generates a ramp signal uh(t) during the sampling period of
time Ts. Although higher order holds are able to generate more complex and
more accurate wave shapes between the samples, they will complicate the whole
system and make it difficult to analyze and design; infact, they are seldom used
in practice.
D. Input–Output Model of Stochastic System A linear time-invariant stochastic
system can be described by the following input–output difference equation:
y(k) + a1y(k − 1) + · · · + ana
y(k − na)
= b0u(k) + b1u(k − 1) + · · · + bnb
u(k − na) + ξ(k) (A.38)
where {ξ(k)} is a white-noise sequence which directly indicates the error in the
difference equation.
If we introduce
θ = [a1 a2 · · · ana
b1 · · · bnb
]T
A(q−1
) = 1 + a1q−1
+ · · · + ana
q−na
(A.39)
B(q−1
) = b0 + b1q−1
+ · · · + bnb
q−nb
a transfer function form model of equation (A.38) can be obtained as
G(q−1
) =
B(q−1
)
A(q−1)
H(q−1
) =
1
A(q−1)
(A.40)
The model (A.38) or (A.40) is called an ARX model where AR refers to the
autoregressive part and X refer to extra input B(q−1)u(t) that is the exogenous
variable. If a certain degree of flexibility is added to describe the white-noise
error in equation (A.38) such as a moving average of white noise, the following
model is given:
y(k) + a1y(k − 1) + · · · + ana
y(k − n)
= b0u(k) + b1u(k − 1) + · · · + bnb
u(k − n)
+ w(k) + c1w(k − 1) + · · · + cnc
w(k − nc) (A.41)

It can also be written as the form
A

q−1

y(k) = B

q−1

u(k) + C

q−1

w(k) (A.42)
where u(k) is system input, y(k) is output, w(k) is independent white noise, and
A(q−1
), B(q−1
), C(q−1
) are
A

q−1

= 1 + a1q−1
+ · · · + ana
q−na
B

q−1

= b0 + b1q−1
+ · · · + bnb
q−nb
C

q−1

= 1 + c1q−1
+ · · · + cnc
q−nc
θ =

a1 a2 · · · ana
b0 · · · bnb
c1 · · · cnc

T
(A.43)
The model (A.41) is called the ARMAX model, which refers to the autoregressive
moving-average model with exogenous input model. The ARMAX models are
usually used to estimate system parameters online based on real-time measured
series data.
E. State Space Model of Stochastic System The state space model describing a
linear time-invariant stochastic system is
x(k + 1) = Ax(k) + Bu(k) + w(k)
y(k) = Cx(k) + Du(k) + v(k)
(A.44)
where A, B, C, and D are the coefficient matrices with appropriate dimensions and
{w(k)} and {v(k)} are two uncorrelated white-noise sequences with covariance
Q and R, respectively.
Based on the superposition principle, the model (A.44) can be expressed in
terms of two components as
y(k) = y(k) + η(k) (A.45)
where y(k) is the output of the deterministic model
x(k + 1) = Ax(k) + Bu(k) (A.46)
y(k) = Cx(k) + Du(k) (A.47)
where η(k) is a zero-mean stochastic process having the spectral density
η(z) = C (zI − A)−1
Q

z−1
I − AT
−1
CT
+ R (A.48)

PARAMETER ESTIMATION OF DYNAMIC SYSTEMS 341
A.2 PARAMETER ESTIMATION OF DYNAMIC SYSTEMS
In this section, we turn to the problem of parameter estimation of a dynamic
system. There are many different methods that can be used to determine the
parameters of a model, and also there are different criteria on which method
should be selected, but we only briefly introduce the basic principle of the least-
squares estimation method widely used in engineering.
A.2.1 Least Squares
Least squares is a classic method to deal with experimental data to predict the
orbits of planets and comets developed by Gauss in 1795. Unknown parameters
of a model should be chosen in such a way that the sum of the squares of the dif-
ference between actually observed and computed values is a minimum. Assuming
the computed output ŷ is given by the model, the least-squares principle can be
mathematically described as
ŷ(k) = θ1x1(k) + θ2x2(k) + · · · + θnxn(k) = ϕ(k)θ (A.49)
where x1, x2, . . . , xn are known inputs, y is output, and θ1, θ2, . . . , θn are
unknown parameters, and ϕ(k) = [xT
1 (k) xT
2 (k) · · · xT
n (k)]T
, θ = [θ1 θ2
· · · θn]T
.
The pairs of observations {(xi yi), i = 1, 2, . . . , N} are obtained from an
experiment or test. According to Gauss’s principle, the estimated parameters
should make the following cost function minimal:
J(θ) =
N

i=1
ε2
i =
N

i=1
(yi − ŷi)2
=
N

i=1
[yi − ϕ(i)θ]2
= min (A.50)
If we apply partial derivatives in equation (A.50) and make them equal to zero,
that is, ∂J(θ)/∂θ = 0, the solution to the least squares is
θ̂ = (T
)−1
T
Y (A.51)
where = [ϕ(1) · · · ϕ(N)]T
and Y = [y(1) · · · y(N)]T
.
The above least-squares method can be used to estimate the parameter in the
dynamic system described by equation (A.38) or (A.41) with C(q−1
) = 1. If we
assume that a sequence of inputs {u(1),u(2),. . . ,u(N)} has been applied to the
system and the corresponding sequence of output {y(1),y(2),. . . ,y(N)} has been
measured, the following vectors can be configured for the least squares described
by equation (A.51), and the unknown parameters are θ:
θ = [a1 · · · ana
b0 · · · bnb
]T
(A.52)

ϕ = [−y(k − 1) · · · −y(k − na − 1) u(k) · · · u(k − nb)]T
(A.53)
=
⎡
⎢
⎣
ϕT (1)
.
.
.
ϕT
(N)
⎤
⎥
⎦ (A.54)
Y =
⎡
⎢
⎣
y(1)
.
.
.
y(N)
⎤
⎥
⎦ N ≥ na + nb + 1 (A.55)
A.2.2 Statistical Property of Least-Squares Estimator
If we assume that the data are generated from the model
Y = θ0 + ε (A.56)
where θ0 ∈ Rn
is the vector of the theoretical true values of the model parameters,
ε ∈ Rn
is a vector of white noise with zero mean and variance σ2
, that is,
E{ε} = 0 and E{εεT } = σ2I, then the least-squares (LS) estimate of θ0 given
by equation (A.3) has the following properties:
(a) Bias (Expectations) The bias of an estimator is defined as the difference
between the true value of the estimated parameter and the expected value
of the estimate. If the difference is zero, the estimator is called unbiased;
otherwise it is said to be biased. The introduced LS is a unbiased estimator
if the noise is independent and with zero mean:
E{θ̂} = θ0 (A.57)
It can be proven from
E

θ̂

= E

T

−1
T
Y

= E

T

−1
T

θ0 + ε

=

T

−1
T
E

θ0 + ε

= θ0
(b) Variances The LS is the minimum-variance estimator
Var(θ̂) = E

θ̂ − θ0

θ̂ − θ0
T

= σ2

T

−1
(A.58)
This is derived from
E

(θ̂ − θ0)(θ̂ − θ0)T

= E

T

−1
T
Y − θ0

T

−1
T
Y − θ0
T

=

T

−1
T
E

Y − θ0

Y − θ0
T

T

−1
=

T

−1
T
σ2
I

T

−1
= σ2

T

−1
(A.59)
(c) Consistency The consistency property of an estimator means that if the
observation data size N is sufficiently large, the estimator is able to find
the value of θ0 with arbitrary precision. In a mathematical expression, this
means that as N goes to infinity, the estimate θ converges to θ0.
The following proof shows that the LS estimator is a consistent estimator;
that is, the LS estimate θ̂ converges to θ0 as the observation size N tends to
infinity. In mathematical terms, if we define limN→∞{[(1/N)T
]} = and
is nonsingular, then the estimate θ̂ converges to the true value θ0, that is,
limN→∞ E{(θ̂ − θ0)(θ̂ − θ0)T
} = 0.
Proof
lim
N→∞
E{(θ̂ − θ0)(θ̂ − θ0)T
}
= lim
N→∞
{σ2
(T
)−1
} = lim
N→∞

σ2
N

1
N
T

−1

= lim
N→∞
σ2
N
• lim
N→∞

1
N
T

−1

= lim
N→∞
σ2
N
· = 0 (A.60)
Example A-3 Determine the following model parameter a1, a2, b1, b2 by the
least-squares estimation method based on the observed input and output data
{u(1), u(2), · · · , u(N)} and {y(1), y(2), · · · , y(N)}:
y(k) + a1y(k − 1) + a2y(k − 2) = b1u(k − 1) + b2u(k − 2)
Solution: Compared with the least-squares formula given in (A.51), we have
=
⎡
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎣
−y(2) −y(1) u(2) u(1)
−y(3) −y(2) (3) u(2)
.
.
.
.
.
.
.
.
.
.
.
.
−y(N − 2) −y(N − 3) u(N − 2) u(N − 3)
−y(N − 1) −y(N − 2) u(N − 1) u(N − 2)
⎤
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎦

Y =
⎡
⎢
⎢
⎢
⎢
⎢
⎣
y(3)
y(4)
.
.
.
y(N − 1)
y(N)
⎤
⎥
⎥
⎥
⎥
⎥
⎦
θ =
⎡
⎢
⎢
⎣
a1
a2
b1
b2
⎤
⎥
⎥
⎦
and
T
=
⎡
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎣
N−1

i=2
y2(i)
N−1

i=2
y(i)y(i − 1) −
N−1

i=2
y(i)u(i) −
N−1

i=2
y(i)u(i − 1)
N−1

i=2
y(i)y(i − 1)
N−1

i=2
y2(i − 1) −
N−1

i=2
y(i − 1)u(i) −
N−1

i=2
y(i − 1)u(i − 1)
−
N−1

i=2
y(i)u(i) −
N−1

i=2
y(i − 1)u(i)
N−1

i=2
u2
(i) −
N−1

i=2
u(i)u(i − 1)
−
N−1

i=2
y(i)u(i − 1) −
N−1

i=2
y(i − 1)u(i − 1)
N−1

i=2
u(i)u − 1)
N−1

i=2
u2
(i − 1)
⎤
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎦
4×4
T
Y =
⎡
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎢
⎣
−
N−1

i=2
y(i)y(i + 1)
−
N−1

i=2
y(i − 1)y(i + 1)
N−1

i=2
u(i)y(i + 1)
N−1

i=2
u(i − 1)y(i + 1)
⎤
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎥
⎦
If the matrix T
is nonsingular, the estimated parameter θ = [a1, a2, b1, b2]T
is θ̂ = (T )−1T Y. If the persistent excitation is not imposed on the input
signals, the matrix T
will be singular, resulting in that no unique estimate
can be found by the least squares.
A.2.3 Recursive Least-Squares Estimator
In most practical applications, the observed data are obtained sequentially. If the
least-squares estimation has to be solved for N observations, the estimate not
only wastes computational resources but also overoccupies limited memory. It is
necessary to estimate the model parameters in such a way that the N + 1 estimate
is performed based on the results obtained for N observations. Parameter estima-
tion techniques that comply with this requirement are called recursive estimation
methods. As long as the measured input–output data are processed sequentially,
they become available. Recursive estimation methods are also referred to as
online or real-time estimation.
Based on N observations y1, y2, . . . , yN , the least-squares estimate of param-
eter θ̂ is given by equation (A.51) as
θ̂N = (T
N N )−1
T
N YN (A.61)

where
N =
⎡
⎢
⎣
ϕT
(1)
.
.
.
ϕT
(N)
⎤
⎥
⎦ YN =
⎡
⎢
⎣
y(1)
.
.
.
y(N)
⎤
⎥
⎦ (A.62)
To achieve a recursive least-squares algorithm, we first assume that the param-
eters θ̂N have been estimated based on the N known observations. The objective
here is to achieve θ̂N+1 based on θ̂N and just one extra observation yN+1.
First, we define
PN = (T
N N )−1
(A.63)
Then, we have
PN+1 = (T
N+1N+1)−1
=

N
ϕT (N + 1)
T
N
ϕT (N + 1)
!−1
= [T
N N + ϕ(N + 1)ϕT
(N + 1)]−1
(A.64)
Based on the matrix inversion lemma introduced below, we have
PN+1 =

T
N N + ϕ(N + 1)ϕT
(N + 1)

−1
=

T
N N
−1
−

T
N N
−1
ϕT
(N + 1)

I + ϕ(N + 1)

T
N N
−1
ϕT
(N + 1)
−1
ϕ(N + 1)

T
N N
−1
= PN − PN ϕT
(N + 1)

I + ϕ(N + 1)PN ϕT
(N + 1)

−1
ϕ(N + 1)PN
(A.65)
Let
KN = PN ϕT
(N + 1)[I + ϕ(N + 1)PN ϕT
(N + 1)]−1
(A.66)
then
PN+1 = PN − KN ϕ(N + 1)PN (A.67)
Refering to equation (A.61), the following results are achieved from the above
equations:
θ̂N+1 = (T
N+1N+1)−1
T
N+1YN+1
= [PN − KN ϕ(N + 1)PN ]

N
ϕT
(N + 1)
T
YN
yN+1

=

T
N YN + ϕT
(N + 1)yN+1

= PN T
N YN + PN ϕT
(N + 1)yN+1
− KN ϕ(N + 1)PN T
N YN − KN ϕ(N + 1)PN ϕT
(N + 1)yN+1
= θ̂N − KN ϕ(N + 1)θ̂N + KN yN+1
= θ̂N + KN

yN+1 − ϕ(N + 1)θ̂N

(A.68)
so the recursive least-squares estimation method is obtained and summarized as
θ̂(k + 1) = θ̂(k) + K(k)

y(k + 1) − ϕ(k + 1)θ̂(k)

(A.69)
K(k) = P(k)ϕT
(k + 1)

I + ϕ(k + 1)P(k)ϕT
(k + 1)

−1
(A.70)
P(k + 1) = P(k) − K(k)ϕ(k + 1)P(k) (A.71)
Remark 1 The estimate θ̂(k + 1) is achieved by adding a correction to the
previous estimate θ̂(k). The correction is proportional to the difference between
the measured output value of y(k + 1) and prediction ŷ(k + 1) of the y(k + 1)
based on the previous estimate. The components of the gain vector K(k) reflect
how to correct the previous estimate θ̂(k) based on the new observed data.
Remark 2 If 0 and Y0 can be obtained from an initial set of data, the
starting value P0 and θ̂0 may be obtained by evaluating (T
0 0)−1
and θ̂0 =
(T
0 0)−1
T
0 Y0, respectively. If there is no way to get enough initial observa-
tions, P0 may be set as Iρ2
, that is, P0 = Iρ2
, where ρ is a very large number
and θ̂0 may be arbitrary. For large N , the choice of initial value P0 and θ0 is
unimportant.
Matrix Inversion Lemma The matrix inversion lemma states that
(A + BCD)−1
= A−1
− A−1
B

C−1
+ DA−1
B
−1
DA−1
(A.72)
where A, C, and A + BCD are regular square matrices of appropriate size.
Proof: Multiply the left side equation (A.72) by the right side. If the result
equals the identity matrix, the lemma is proven. Thus, we have

A + BCD

A−1
− A−1

C−1
+ DA−1
B
−1
DA−1
#
= AA−1
− AA−1
B

C−1
+ DA−1
B
−1
DA−1
+ BCDA−1
− BCDA−1
B

C−1
+ DA−1
B
−1
DA−1

= I + BCDA−1
− B

C−1
+ DA−1
B
−1
DA−1
− BCDA−1
B

C−1
+ DA−1
B
−1
DA−1
= I + BCDA−1
− B

C−1
+ DA−1
B
−1
+ CDA−1
B

C−1
+ DA−1
B
−1
DA−1
= I + BCDA−1
− B

I + CDA−1
B

C−1
+ DA−1
B
−1
DA−1
= I + BCDA−1
− BC

I + DA−1
B

C−1
+ DA−1
B
−1
DA−1
= I + BCDA−1
− BCDA−1
= I
This completes the proof.
Remark 3 if D = BT
, then
(A + BCBT
)−1
= A−1
− A−1
B(C−1
+ BT
A−1
B)−1
BT
A−1
(A.73)
Remark 4 if C = I, then
(A + BD)−1
= A−1
− A−1
B(I + DA−1
B)−1
DA−1
(A.74)
A.2.4 Least-Squares Estimator for Slow Time-Varying Parameters
The recursive least-squares estimation method described in the previous section
is not directly applicable when the parameters vary over time as new data are
swamped by past data. There are two basic ways to modify the described recursive
method to handle time-varying parameters.
A. Exponential Window Approach (Exponentially Weighted Least Squares) It is
obvious that the cost function (A.50) equally makes use of all observed data.
However, if the system parameters are slowly time varying, the influence of
old data to the parameters is gradually eliminated. The idea of the exponential
window approach is to artificially emphasize the effect of current data by
exponentially weighting past data values, and this is done by using a cost
function with exponential weighting as
J(θ) =
N

i=1
λN−i
ε2
i =
N

i=1
λN−i

yi − ŷi
2
=
N

i=1
λN−i

yi − (i)θ
2
= min
(A.75)

The λ is called the forgetting factor, 0 λ 1, and it is a measure of how fast
the old data are forgotten. The recursive least-squares estimation algorithm by
using cost function (A.75) is given as
θ̂(k + 1) = θ̂(k) + K(k + 1)[y(k + 1) − ϕ(k + 1)θ̂(k)] (A.76)
K(k + 1) = P(k)ϕT
(k + 1)[λ + ϕ(k + 1)P(k)ϕT
(k + 1)]−1
(A.77)
P(k + 1) =
[I − K(k + 1)ϕ(k + 1)]P(k)
λ
(A.78)
Note that λ = 1 gives the standard least-squares estimation algorithm.
B. Rectangular Window Approach The idea of the rectangular window approach
is that the estimate at time k is only based on a finite number of past data, and all
old data are completely discarded. To implement this idea, a rectangular window
with fixed length N is set, and whenever a new set of data is added at each time
point, a set of old data is discarded simultaneously so the active number of the
data points is always kept to N . This approach requires that the last N estimate
θ̂i+N,i+1 and covariance Pi+N,i+1 be stored. For the more detailed algorithm
description, the interested reader is referred to Goodwin and Payne (1977).
A.2.5 Generalized Least-Squares Estimator
In previous sections, we discussed the statistical property of the least-squares
estimation method and stated that the estimate is unbiased if the noise {ξ(k)}
in the model (A.7) is a white noise or a sequence of uncorrelated zero mean
random variables with common variance σ2
. The white-noise assumption is not
a practical reality but is suitable for the low-frequency control system analysis of
a practical system. If the conditions of the uncorrelatedness and zero mean of the
noise sequence {ξ(k)} cannot be satisfied in a system, the statistical properties of
the least-squares estimation will not be guaranteed in general. In this case, the
generalized least-squares (GLS) method can be used and the estimate is unbiased,
which was developed and had been shown to work well in practice (Clarke, 1967;
Söderström, 1974).
The idea of the generalized least-squares estimation method is that the cor-
related sequence {ξ(k)} is considered as the output of a linear filter, which is
driven by a white-noise sequence, that is,
ξ(k) +
p

i=1
ciξ(k − i) = w(k) ⇒ C(q−1
)ξ(k) = w(k) (A.79)
where {w(k)} is a white-noise sequence C(q−1
) = 1 + c1q−1
+ · · · + cpq−p
.
For equation (A.79), we know its z-transfer function is
ξ(z)
w(z)
=
1
C(z−1)
(A.80)

STATE ESTIMATION OF DYNAMIC SYSTEMS 349
Then, the system model may be written as
A(q−1
)y(k) = B(q−1
)u(k) + C(q−1
)w(k) (A.81)
It can be further rewritten as
A(q−1
)y∗
(k) = B(q−1
)u∗
(k) + w(k) (A.82)
where
y∗
(k) =
y(k)
C(q−1)
u∗
(k) =
u(k)
C(q−1)
(A.83)
If {y∗
(k)} and {u∗
(k)} can be calculated, the parameters in A(q−1
) and B(q−1
)
may be estimated by least squares, and the estimate is unbiased. However, the
problem is that C(q−1
) is unknown. Thus, the parameters in C(q−1
) must be esti-
mated along with A(q−1
) and B(q−1
), which results in the following generalized
least-squares estimation method:
1. Set C(q−1
) = 1 and estimate parameter θ̂ in A(q−1
) and B(q−1
).
2. Generate {ξ̂(k)} from ξ(k) = Â(q−1
)y∗
(k) − B̂(q−1
)u∗
(k).
3. Estimate parameter ĉi in C(q−1
) of equation (A.79).
4. Generate {y∗(k)} and {u∗(k)} based on estimated ĉi and equation (A.83).
5. Estimate parameter θ̂ in A(q−1
) and B(q−1
) based on data points {y∗
(k)}
and {u∗
(k)}.
6. If converged, stop; otherwise go to 2.
A.3 STATE ESTIMATION OF DYNAMIC SYSTEMS
Some subsystems or components of a hybrid vehicle system are described by
the state space model equation (A.44). To control an HEV subsystem properly,
the system states sometimes need to be estimated based on observation data. In
1960, Kalman published his famous paper on the linear filtering problem, and his
results were named the Kalman filter, which is a set of mathematical equations
that provide a recursive computation method to estimate the states of a system
in a way minimizing the estimation error (Kalman, 1960). The Kalman filter
technique can be summarized as follows:
State space equations: x(k + 1) = Ax(k) + Bu(k) + w(k)
Output equations: y(k) = Cx(k) + v(k) (A.84)
Noise: E{w(k)} = 0 E{v(k)} = 0
E{x(0)} = μ E{w(j)w(k)} =
j =k
0,
E{w(j)w(k)} =
j=k
Q E{v(j)v(k)} =
j =k
0

Noise: E{w(k)} = 0, E{v(k)} = 0, E{x(0)} = μ,
E{w(j)w(k)} =
j =k
0, E{w(j)w(k)} =
j =k
Q,
E{v(j)v(k)} =
j =k
0, E{v(j)v(k)} =
j=k
R,
E{w(j)v(k)} = 0, E{x(0)w(k)} = 0,
E{x(0)v(k)} = 0,
E{[x(0) − μ][x(0) − μ]T
} = P0, (A.85)
Filter: x̂(k|k) = x̂(k|k − 1)
+ K(k)

y(k) − Cx̂(k|k − 1)

(A.86)
Prediction: x̂(k|k − 1) = Ax̂(k − 1|k − 1) (A.87)
Gain: K(k) = P(k, k − 1)CT
×

CP(k, k − 1)CT
+ R

−1
(A.88)
Filter error covariance: P(k, k) = [I − K(k)C]P(k, k − 1) (A.89)
Prediction error covariance: P(k, k − 1) = AP(k − 1, k − 1)AT
+ Q (A.90)
Initial conditions: x̂(0, 0) = x̂(0) = μ P(0, 0) = P0 (A.91)
For more details on the probabilistic origins and convergence properties of the
Kalman filter, the interested reader is refered to Kalman (1960), Kalman and
Bucy (1961), and Jazwinski (1970).
Example A-4 Consider the system
x(k + 1) = φx(k) + w(k) y(k) = x(k) + v(k) (A.92)
where {w(k)} and {v(k)}, k = 1, 2, . . ., are Gaussian noise sequences with zero
mean and Q and R covariance, respectively. Estimate the state by the Kalman
filter technique and list the several step computation values if we assume
φ = 1, P0 = 100, Q = 25, R = 15.
Solution: The filtering equation is
x̂(k|k) = x̂(k|k−1) + K(k)[y(k) − Cx̂(k|k − 1)] = x̂(k − 1|k − 1)
+ K(k)[y(k) − φx̂(k − 1|k−1)]
From equation (A.90), we have the following prediction error covariance, gain,
and filter error covariance:
P(k, k − 1) = φ2
P(k − 1, k − 1) + Q
K(k) = [φ2
P(k − 1, k − 1) + Q][φ2
P(k − 1, k − 1) + Q + R]−1

JOINT STATE AND PARAMETER ESTIMATION OF DYNAMIC SYSTEMS 351
TABLE A-1. The computation Results of Example A-4
k P (k, k−1) K(k) P (k, k)
0 . . . . . . 100
1 125 0.893 13.40
2 38.4 0.720 10.80
3 35.8 0.704 10.57
4 35.6 0.703 10.55
=
φ2P(k − 1, k − 1) + Q
φ2P(k − 1, k − 1) + Q + R
P(k, k) =
R[φ2
P(k − 1, k − 1) + Q]
φ2P(k − 1, k − 1) + Q + R
For the given φ = 1, P0 = 100, Q = 25, R = 15, the above equations are given
P(k, k − 1) = P(k − 1, k − 1) + 25
K(k) =
P(k − 1, k − 1) + 25
P(k − 1, k − 1) + 40
P(k, k) =
15[P(k − 1, k − 1) + 25]
[P(k − 1, k − 1) + 40]
= 15K(k)
The results of first several steps are listed in Table A-1.
A.4 JOINT STATE AND PARAMETER ESTIMATION OF
DYNAMIC SYSTEMS
In hybrid vehicle applications, it may also be necessary to simultaneously estimate
the states and parameters of a subsystem. We devote this section to describe two
approaches for joint state and parameter estimation of a dynamic system.
A.4.1 Extended Kalman Filter
While the Kalman filter provides a way to estimate the states of a linear dynamic
system, the extended Kalman filter gives a method to estimate the states of a
nonlinear dynamics system. To get the extended Kalman filter, we consider the
nonlinear dynamic system
State space equations: x(k + 1) = f(x(k), u(k)) + w(k)
Output equations: y(k) = g(x(k)) + v(k)
(A.93)

where x(k) is the state variable vector, u(k) is the input variable, and y(k) is the
output variable; {w(k)} and {v(k)} again represent the system and measurement
noises, which are assumed as a Gaussian distribution and independent zero mean
with covariances Q and R, respectively.
The extended Kalman filter algorithm is stated as
Filtering: x̂(k|k) = x̂(k|k − 1) + K(k)[y(k) − g(x̂(k|k − 1))]
P(k|k) =
$
I − K(k)
∂g(x(k))
∂x

ˆ
x(k|k−1)
%
P(k|k − 1)
(A.94)
Prediction: x̂(k|k − 1) = f(x̂(k − 1|k − 1), u(k − 1))
P(k|k − 1) =
∂f
∂x

x̂(k−1)
P(k − 1|k − 1)
∂fT
∂x

x̂(k−1)
+ Q
(A.95)
Gain: K(k) = P(k, k − 1)

∂g(x(k))
∂x

x̂(k|k−1)
!T
×
$
∂g(x(k))
∂x

x̂(k|k−1)
!
P(k, k − 1)C
×

∂g(x(k))
∂x

x̂(k|k−1)
!T
+ R
⎤
⎦
−1
(A.96)
Initial conditions: x̂(0, 0) = x̂(0) = μ P(0, 0) = P0 (A.97)
The iteration process of the extended Kalman filter algorithm is as follows:
1. Get the last state estimate x̂(k − 1|k − 1) and filter error covariance
P(k − 1|k − 1).
2. Compute x̂(k|k − 1) from x̂(k|k − 1) = f(x̂(k − 1|k − 1), u(k − 1)).
3. Compute P(k|k − 1) from
P(k|k − 1) =
∂f
∂x

x̂(k−1)
P(k − 1|k − 1)
∂fT
∂x

x̂(k−1)
+ Q
4. Compute K(k).
5. Compute x̂(k|k) from x̂(k|k) = x̂(k|k − 1) + K(k)[y(k) − g(x̂(k|k − 1))].

JOINT STATE AND PARAMETER ESTIMATION OF DYNAMIC SYSTEMS 353
6. Compute P(k|k) from
P(k|k) =
$
I − K(k)
∂g(x(k))
∂x

x̂(k|k−1)
%
P(k|k − 1)
7. Go to step 1.
The above extended Kalman filter algorithm can be applied to simultaneously
estimate states and parameters of a dynamic system. Let us consider the nonlinear
dynamic system described by a state space equation (A.93) and assume that there
are unknown parameter a in f(x(k), u(k)) and unknown parameter b in g(x(k)).
Then equation (A.93) is written as
State equations: x(k + 1) = f(x(k), u(k), a) + w(k)
Output equations: y(k) = g(x(k), b) + v(k)
(A.98)
In order to estimate parameters a and b, we define the a and b as new states,
described by the equations
a(k + 1) = a(k) b(k + 1) = b(k) (A.99)
Combining equations (A.98) and (A.99) and letting state variable
x∗
(k) = [x(k), a(k), b(k)]T
, an augmented state space equation is obtained as
State space equation: x∗
(k + 1) =
⎛
⎝
f(x(k), u(k), a(k))
a(k)
b(k)
⎞
⎠ +
⎛
⎝
w(k)
0
0
⎞
⎠
Output equation: y(k) = g(x(k), b(k)) + v(k) (A.100)
By applying the extended Kalman filter algorithm to equation (A.100), the aug-
mented state x∗
can be estimated, that is, the estimate of state x as well as
parameter a and b are given. There are many good articles presenting practical
examples of the extended Kalman filter. The interested reader is referred to the
book by Grewal and Andrews (2008).
A.4.2 Singular Pencil Model
The singular pencil (SP) model may be a new class of model for most control
engineers and was first proposed by G. Salut and others and then developed by
many researchers (Salut et al. 1979, 1980; Chen et al. 1986; Aplevich 1981,
1985, 1991). The SP model contains the input–output and state space model as
a subset. Models similar to this form have been called “generalized state space
models,” “descriptor systems,” “tableau equations,” “time-domain input–output
models,” and “implicit linear systems.” An advantage when a system is described

Introduction to Hybrid Vehicle System Modeling and Control - 2013 - Liu - Appendix A System Identification State and.pdf

Recommended

Recommended

More Related Content

Similar to Introduction to Hybrid Vehicle System Modeling and Control - 2013 - Liu - Appendix A System Identification State and.pdf

Similar to Introduction to Hybrid Vehicle System Modeling and Control - 2013 - Liu - Appendix A System Identification State and.pdf (20)

Recently uploaded

Recently uploaded (20)

Introduction to Hybrid Vehicle System Modeling and Control - 2013 - Liu - Appendix A System Identification State and.pdf