Unconstrained Optimal Control
Summary
Mr. Mohamed Mohamed El-Sayed Atyya
Aerospace Engineer
Contents
Page
1 Introduction 4
1.1 Classical and Modern Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Static Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Dynamic Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Optimal Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Plant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.2 Performance Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.3 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.4 Formal Statement of Optimal Control System . . . . . . . . . . . . . . . . . . . . . 6
2 Calculus of Variations and Open-Loop Optimal Control 8
2.1 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.1 Function and Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.2 Increment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.3 Differential and Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Optimum of a Function and a Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Optimum of a Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.2 Optimum of a Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Euler-Lagrange Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.1 Different Cases for Euler-Lagrange Equation . . . . . . . . . . . . . . . . . . . . . 12
2.4 Procedure of Pontryagin Principle for Bolza Problem (Open-Loop Optimal Control) . . . 14
2.4.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.3 Types of Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3 Linear Quadratic Optimal Control Systems I (Regulator Closed-Loop Optimal Con-
trol) 24
3.1 Procedure Summary of Finite-Time Linear Quadratic Regulator System: Time-Varying
Case (Closed-Loop Optimal Control with Fixed tf and Free x(tf )) . . . . . . . . . . . . . 24
3.1.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.1.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Salient Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 LQR System for General Performance Index . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying
Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞)) . . . . . . . . . . . . . . 29
3.4.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.5 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant
Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞)) . . . . . . . . . . . . . . 30
3.5.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.5.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.6 Stability Issues of Time-Invariant Regulator . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.7 Equivalence of Open-Loop and Closed-Loop Optimal Controls . . . . . . . . . . . . . . . . 34
4 Linear Quadratic Optimal Control Systems II (Tracking Closed-Loop Optimal Con-
trol) 38
4.1 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control
with Fixed tf and Free x(tf )) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.1.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.1.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.1.3 Salient Features of Tracking System . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control
with Infinite tf and Free x(∞)) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.3 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control
with Infinite Time-Invariant and Free x(∞)) . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.4 Procedure Summary of Fixed-End-Point Regulator System (Closed-Loop Optimal Control) 50
4.4.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.4.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.5 Procedure Summary of Regulator System with Prescribed Degree of Stability (Closed-
Loop Optimal Control of Infinite Time-Invariant Systems) . . . . . . . . . . . . . . . . . . 52
4.5.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.5.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.6 Closed-Loop Controller Design Using Frequency-Domain (Kalman Equation in Frequency
Domain) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.6.1 Relation Between Open-Loop and Closed-Loop . . . . . . . . . . . . . . . . . . . . 54
4.6.2 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.6.3 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.6.4 Gain Margin and Phase Margin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5 Variational Calculus and Open-Loop Optimal Control for Discrete-Time Systems 57
5.1 Discrete Euler-Lagrange Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.2 Procedure Summary for Discrete-Time Optimal Control System (Open-Loop Optimal
Control) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.2.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.2.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2.3 Types of Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6 Linear Quadratic Optimal Control for Discrete-Time Systems I (Regulator Closed-
Loop Optimal Control) 61
6.1 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System (Closed-Loop
Optimal Control with Fixed kf and Free x(kf )) . . . . . . . . . . . . . . . . . . . . . . . . 61
6.1.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.1.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.2 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System: Steady-State
Condition (Closed-Loop Optimal Control with kf = ∞) . . . . . . . . . . . . . . . . . . . 64
6.2.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.2.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.3 Analytical Solution to the Riccati Equation . . . . . . . . . . . . . . . . . . . . . . . . . . 67
7 Linear Quadratic Optimal Control for Discrete-Time Systems II (Tracking Closed-
Loop Optimal Control) 70
7.1 Procedure Summary of Discrete-Time Linear Quadratic Tracking System (Closed-Loop
Optimal Control with Fixed Linear Time-Invariant and Free x(kf )) . . . . . . . . . . . . . 70
7.1.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
7.1.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
7.2 Closed-Loop Controller Design Using Frequency-Domain (Discrete Kalman Equation in
Frequency Domain) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
7.2.1 Relation Between Open-Loop and Closed-Loop . . . . . . . . . . . . . . . . . . . . 74
7.2.2 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
7.2.3 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
8 Pontryagin Minimum Principle 76
8.1 Procedure Summary of Pontryagin Minimum Principle . . . . . . . . . . . . . . . . . . . . 76
8.1.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.1.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.1.3 Types of Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
8.1.4 Important Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.1.5 Additional Necessary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.2 Optimal Control of Discrete-Time Systems Using the Principle of Optimality of Dynamic
Programming (Regulator Optimal Control with Fixed kf and Free x(kf )) . . . . . . . . . 79
8.2.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.2.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.3 Optimal Control of Continuous-Time Systems Using Hamilton-Jacobi-Bellman (HJB) Ap-
proach (Closed-Loop Optimal Control with Free x(tf )) . . . . . . . . . . . . . . . . . . . . 79
8.3.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.3.2 Solution of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Chapter 1
Introduction
1.1 Classical and Modern Control
The classical (conventional) control theory, concerned with single-input, single-output (SISO) systems, is mainly based on Laplace transform theory and its use in representing systems in block-diagram form:

Y(s)/R(s) = G(s) / [1 + G(s)H(s)],    G(s) = Gc(s)Gp(s)
The modern control theory, concerned with multiple-input, multiple-output (MIMO) systems, is based on state-variable representation in terms of a set of first-order differential (or difference) equations. Here the system (plant) is characterized by state variables, say, in linear time-invariant form as

ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t)
Figure 1.1: Classical Control Configuration
Figure 1.2: Modern Control Configuration
Figure 1.3: Components of a Modern Control System
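The state-variable form above can be simulated directly. Below is a minimal sketch (the plant matrices are arbitrary illustrative values, not taken from a specific example in these notes):

```python
import numpy as np

# Hypothetical 2-state LTI plant: xdot = A x + B u, y = C x + D u
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigenvalues -1 and -2 (stable)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def simulate(x0, u, dt=1e-3, T=5.0):
    """Forward-Euler integration of the state equations with constant input u."""
    x = x0.astype(float).copy()
    for _ in range(int(T / dt)):
        x = x + dt * (A @ x + B @ u)
    return x, C @ x + D @ u

# With u = 0 and A Hurwitz, the state decays toward the origin.
x_final, y_final = simulate(np.array([[1.0], [0.0]]), np.array([[0.0]]))
```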
1.2 Optimization
1.2.1 Static Optimization
Static optimization is concerned with controlling a plant under steady-state conditions, i.e., the system variables are not changing with respect to time. The plant is then described by algebraic equations. Techniques used are ordinary calculus, Lagrange multipliers, and linear and nonlinear programming.
1.2.2 Dynamic Optimization
Dynamic optimization is concerned with the optimal control of plants under dynamic conditions, i.e., the system variables are changing with respect to time, and thus time is involved in the system description. The plant is then described by differential (or difference) equations. Techniques used are search techniques, dynamic programming, variational calculus (or calculus of variations), and the Pontryagin principle.

Figure 1.4: Overview of Optimization
1.3 Optimal Control
The main objective of optimal control is to determine control signals that will cause a process (plant)
to satisfy some physical constraints and at the same time extremize (maximize or minimize) a chosen
performance criterion (performance index or cost function). The formulation of an optimal control problem requires
1. a mathematical description (or model) of the process to be controlled (generally in state variable
form),
2. a specification of the performance index, and
3. a statement of boundary conditions and the physical constraints on the states and/or controls.
1.3.1 Plant
For the purpose of optimization, we describe a physical plant by a set of linear or nonlinear differential
or difference equations.
1.3.2 Performance Index
In modern control theory, the optimal control problem is to find a control which causes the dynamical
system to reach a target or follow a state variable (or trajectory) and at the same time extremize a
performance index which may take several forms as described below.
1. Performance Index for Time-Optimal Control System:

   J = ∫_{t0}^{tf} dt = tf − t0 = t*

2. Performance Index for Fuel-Optimal Control System:
   Assume that the magnitude |u(t)| of the thrust is proportional to the rate of fuel consumption. Then

   J = ∫_{t0}^{tf} |u(t)| dt

   For several controls, we may write it as

   J = ∫_{t0}^{tf} Σ_{i=1}^{m} Ri |ui(t)| dt

   where Ri is a weighting factor.

3. Performance Index for Minimum-Energy Control System:

   J = ∫_{t0}^{tf} uᵀ(t)Ru(t) dt

   where R is a positive definite matrix.

4. Performance Index for Tracking-Optimal Control System:

   J = ∫_{t0}^{tf} xᵀ(t)Qx(t) dt

   where Q is a positive semidefinite matrix.

5. Performance Index for Terminal Control System:
   In the terminal target problem, we are interested in minimizing the error between the desired target position xd(tf) and the actual target position xa(tf) at the end of the maneuver, i.e., at the final time tf. The terminal (final) error is x(tf) = xa(tf) − xd(tf). Taking care of positive and negative values of the error and of weighting factors, we structure the cost function as

   J = xᵀ(tf)Fx(tf)

   where F is a positive semidefinite matrix.

6. Performance Index for General Optimal Control System:

   J = xᵀ(tf)Fx(tf) + ∫_{t0}^{tf} [xᵀ(t)Qx(t) + uᵀ(t)Ru(t)] dt
     = S(x(tf), tf) + ∫_{t0}^{tf} V(x(t), u(t), t) dt

   where R is a positive definite matrix, and Q and F are positive semidefinite matrices. Note that the matrices Q and R may be time varying.
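As a quick numerical illustration (a sketch with made-up matrices and trajectories, not tied to a particular plant), the general quadratic index can be evaluated from sampled data by trapezoidal integration:

```python
import numpy as np

def quadratic_cost(t, x, u, Q, R, F):
    """J = x'(tf) F x(tf) + integral of [x'Qx + u'Ru] over [t0, tf].
    t: (N,) time grid; x: (N, n) state samples; u: (N, m) control samples."""
    running = np.array([xi @ Q @ xi + ui @ R @ ui for xi, ui in zip(x, u)])
    terminal = x[-1] @ F @ x[-1]
    # Trapezoidal rule for the running-cost integral
    return terminal + np.sum((running[:-1] + running[1:]) * np.diff(t)) / 2.0

t = np.linspace(0.0, 2.0, 4001)
x = np.stack([np.exp(-t), -np.exp(-t)], axis=1)  # illustrative state trajectory
u = (-np.exp(-t)).reshape(-1, 1)                 # illustrative control
Q, R, F = np.eye(2), np.eye(1), np.eye(2)
J = quadratic_cost(t, x, u, Q, R, F)
# Analytically for these trajectories: J = 2 e^{-4} + (3/2)(1 - e^{-4})
```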
1.3.3 Constraints
The control u(t) and state x(t) vectors are either unconstrained or constrained depending upon the
physical situation. The unconstrained problem is less involved and gives rise to some elegant results.
From the physical considerations, often we have the controls and states, such as currents and voltages
in an electrical circuit, speed of a motor, thrust of a rocket, constrained as
U_min ≤ u(t) ≤ U_max
X_min ≤ x(t) ≤ X_max
1.3.4 Formal Statement of Optimal Control System
The optimal control systems are studied in three stages.
1. In the first stage, we consider only a performance index of the general form above and use the well-known theory of calculus of variations to obtain optimal functions.
2. In the second stage, we bring in the plant and address the problem of finding the optimal control u*(t) which will drive the plant and at the same time optimize the performance index.
3. Finally, the topic of constraints on the controls and states is considered along with the plant and
performance index to obtain optimal control.
Figure 1.5: Optimal Control Problem
Chapter 2
Calculus of Variations and
Open-Loop Optimal Control
2.1 Basic Concepts
2.1.1 Function and Functional
Function
A variable x is a function of a variable quantity t, (written as x(t) = f(t)), if to every value of t over
a certain range of t there corresponds a value x, i.e., we have a correspondence: to a number t there corresponds a number x. Note that here t need not always be time; it can be any independent variable.
Functional
A variable quantity J is a functional dependent on a function f(x), written as J = J(f(x)), if to each function f(x) there corresponds a value J, i.e., we have a correspondence: to the function f(x) there corresponds a number J. In general, a functional may depend on several functions.
2.1.2 Increment
Increment of a Function
The increment of the function f, denoted by ∆f, is defined as
∆f(t, ∆t) = f(t + ∆t) − f(t)
Example:
If

f(t) = (t1 + t2)²

the increment Δf is

Δf = f(t + Δt) − f(t)
   = (t1 + Δt1 + t2 + Δt2)² − (t1 + t2)²
   = 2(t1 + t2)Δt1 + 2(t1 + t2)Δt2 + (Δt1)² + (Δt2)² + 2Δt1Δt2
Increment of a Functional
The increment of the functional J(f), denoted by ∆J, is defined as
∆J(x(t), δx(t)) = J(x(t) + δx(t)) − J(x(t))
Example:
Find the increment of the functional

J = ∫_{t0}^{tf} [2x²(t) + 1] dt

The increment of J is given by

ΔJ = J(x(t) + δx(t)) − J(x(t))
   = ∫_{t0}^{tf} [2(x(t) + δx(t))² + 1] dt − ∫_{t0}^{tf} [2x²(t) + 1] dt
   = ∫_{t0}^{tf} [4x(t)δx(t) + 2(δx(t))²] dt
2.1.3 Differential and Variation
Differential of a Function
Let us define, at a point t*, the increment of the function f as

Δf = f(t* + Δt) − f(t*)

Expanding f(t* + Δt) in a Taylor series about t*, we get

Δf = f(t*) + (df/dt)|_* Δt + (1/2!)(d²f/dt²)|_* (Δt)² + ... − f(t*)

Neglecting the higher-order terms in Δt,

Δf = (df/dt)|_* Δt = ḟ(t*)Δt = df

Figure 2.1: Increment Δf, Differential df, and Derivative ḟ of a Function f(t)
Example:
If

f(t) = t² + 2t

the increment Δf is

Δf = f(t* + Δt) − f(t*) = (t* + Δt)² + 2(t* + Δt) − (t*² + 2t*)
   = 2t*Δt + 2Δt + higher-order terms
   = 2(t* + 1)Δt = ḟ(t*)Δt
Variation of a Functional
Consider the increment of a functional

ΔJ = J(x(t) + δx(t)) − J(x(t))

Expanding J(x(t) + δx(t)) in a Taylor series, we get

ΔJ = J(x(t)) + (∂J/∂x)δx(t) + (1/2!)(∂²J/∂x²)(δx(t))² + ... − J(x(t))
   = (∂J/∂x)δx(t) + (1/2!)(∂²J/∂x²)(δx(t))² + ...
   = δJ + δ²J + ...

Figure 2.2: Increment ΔJ and the First Variation δJ of the Functional J
Example:
If

J(x(t)) = ∫_{t0}^{tf} [2x²(t) + 3x(t) + 4] dt

the increment ΔJ is

ΔJ = J(x(t) + δx(t)) − J(x(t))
   = ∫_{t0}^{tf} [2(x(t) + δx(t))² + 3(x(t) + δx(t)) + 4 − 2x²(t) − 3x(t) − 4] dt
   = ∫_{t0}^{tf} [4x(t)δx(t) + 2(δx(t))² + 3δx(t)] dt

Considering only the first-order terms, we get the (first) variation as

δJ(x(t), δx(t)) = ∫_{t0}^{tf} [4x(t) + 3] δx(t) dt
2.2 Optimum of a Function and a Functional
2.2.1 Optimum of a Function
The increment of the function, Δf, is used to evaluate the relative extremum points:

Δf = f(t) − f(t*) ≥ 0 → f(t*) is a relative (local) minimum
Δf = f(t) − f(t*) ≤ 0 → f(t*) is a relative (local) maximum

It is well known that the necessary condition for an optimum of a function is that the (first) differential vanishes, i.e., df = 0. The sufficient condition

1. for a minimum is that the second differential is positive, i.e., d²f > 0, and
2. for a maximum is that the second differential is negative, i.e., d²f < 0.

Figure 2.3: (a) Minimum and (b) Maximum of a Function f(t)
2.2.2 Optimum of a Functional
The increment of the functional, ΔJ, is used to evaluate the relative extremum points:

ΔJ = J(x) − J(x*) ≥ 0 → J(x*) is a relative (local) minimum
ΔJ = J(x) − J(x*) ≤ 0 → J(x*) is a relative (local) maximum
Theorem
For x*(t) to be a candidate for an optimum, the (first) variation of J must be zero on x*(t), i.e., δJ(x*(t), δx(t)) = 0 for all admissible values of δx(t). This is a necessary condition. As a sufficient condition, for a minimum the second variation δ²J > 0, and for a maximum δ²J < 0.
2.3 Euler-Lagrange Equation
The Euler-Lagrange (EL) equation can be written as

V_x − (d/dt)(V_ẋ) = 0

where

V_x = ∂V/∂x = V_x(x*(t), ẋ*(t), t)
V_ẋ = ∂V/∂ẋ = V_ẋ(x*(t), ẋ*(t), t)
Since V is a function of the three arguments x*(t), ẋ*(t), and t, and since x*(t) and ẋ*(t) are in turn functions of t, we get

(d/dt)(V_ẋ)|_* = (d/dt)[∂V(x*(t), ẋ*(t), t)/∂ẋ]
             = [(∂²V/∂x∂ẋ)(dx/dt) + (∂²V/∂ẋ∂ẋ)(dẋ/dt) + ∂²V/∂t∂ẋ]|_*
             = V_xẋ ẋ*(t) + V_ẋẋ ẍ*(t) + V_tẋ

The alternate form of the EL equation is then

V_x − V_tẋ − V_xẋ ẋ*(t) − V_ẋẋ ẍ*(t) = 0
2.3.1 Different Cases for Euler-Lagrange Equation
• Case 1: V depends only on ẋ(t) and t, i.e., V = V(ẋ(t), t). Then V_x = 0, and the Euler-Lagrange equation becomes

(d/dt)(V_ẋ) = 0

This leads us to

V_ẋ = ∂V(ẋ(t), t)/∂ẋ = C

where C is a constant of integration.

• Case 2: V depends on ẋ(t) only, i.e., V = V(ẋ(t)). Then V_x = 0, and the Euler-Lagrange equation becomes

(d/dt)(V_ẋ) = 0 → V_ẋ = C

In general, the solution of either case becomes

ẋ*(t) = C1 → x*(t) = C1 t + C2

This is simply the equation of a straight line.

• Case 3: V depends on x(t) and ẋ(t), i.e., V = V(x(t), ẋ(t)). Then V_tẋ = 0. Using the alternate form of the Euler-Lagrange equation, we get

V_x − V_xẋ ẋ*(t) − V_ẋẋ ẍ*(t) = 0

Multiplying the previous equation by ẋ*(t), we have

ẋ*(t)[V_x − V_xẋ ẋ*(t) − V_ẋẋ ẍ*(t)] = 0

This can be rewritten as

(d/dt)[V − ẋ*(t)V_ẋ] = 0 → V − ẋ*(t)V_ẋ = C

The previous equation can be solved using standard techniques such as separation of variables.

• Case 4: V depends on x(t) and t, i.e., V = V(x(t), t). Then V_ẋ = 0, and the Euler-Lagrange equation becomes

∂V(x*(t), t)/∂x = 0

The solution of this equation does not contain any arbitrary constants and therefore, generally speaking, does not satisfy the boundary conditions x(t0) and x(tf). Hence, in general, no solution exists for this variational problem. Only in rare cases, when the function x(t) happens to satisfy the given boundary conditions, is it an optimal function.
Example 1:
Find the curve of minimum length between any two points.
Solution:
It is well known that the solution to this problem is a straight line; however, we use this simple case to illustrate the application of the Euler-Lagrange equation. Consider the arc between two points A and B as shown in Fig. 2.4. Let ds be a small element of arc length, and dx and dt the corresponding small rectangular-coordinate increments. Note that here t is the independent variable representing distance, not time. Then

(ds)² = (dx)² + (dt)²  ⇒  ds = √(1 + ẋ²(t)) dt

The performance index J to be minimized is

J = ∫ ds = ∫_{t0}^{tf} √(1 + ẋ²(t)) dt = ∫_{t0}^{tf} V(ẋ(t)) dt

where V(ẋ(t)) = √(1 + ẋ²(t)). Note that V is a function of ẋ(t) only (Case 2). Applying the Euler-Lagrange equation to the performance index, we get

ẋ*(t) / √(1 + ẋ*²(t)) = C

Solving this equation, we get the optimal solution as

x*(t) = C1 t + C2

This is evidently the equation of a straight line, and the constants C1 and C2 are evaluated from the given boundary conditions. For example, if x(0) = 1 and x(2) = 5, then C1 = 2 and C2 = 1, and the straight line is x*(t) = 2t + 1.
Figure 2.4: Arc Length
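This extremal can be checked numerically. The sketch below (using the illustrative endpoint values x(0) = 1, x(2) = 5 from the text) compares the arc length of the straight line against boundary-preserving perturbations:

```python
import numpy as np

t = np.linspace(0.0, 2.0, 20001)

def arc_length(x):
    """J = integral of sqrt(1 + xdot^2) dt, via numerical derivative and trapezoid rule."""
    xdot = np.gradient(x, t)
    f = np.sqrt(1.0 + xdot**2)
    return np.sum((f[:-1] + f[1:]) * np.diff(t)) / 2.0

x_line = 2.0 * t + 1.0                 # the extremal x*(t) = 2t + 1
J_line = arc_length(x_line)            # should be the chord length from (0,1) to (2,5)

# Perturbations vanishing at both endpoints keep the boundary conditions
# and should not decrease the arc length.
J_pert = [arc_length(x_line + eps * np.sin(np.pi * t / 2.0)) for eps in (0.2, 0.6)]
```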
Example 2:
Find the optimum of

J = ∫_0^2 [ẋ²(t) + 2t x(t)] dt

that satisfies the boundary (initial and final) conditions

x(0) = 1 and x(2) = 5

Solution:

V = ẋ²(t) + 2t x(t)

∂V/∂x − (d/dt)(∂V/∂ẋ) = 0 → 2t − (d/dt)(2ẋ(t)) = 0 → ẍ(t) = t

Solving this simple differential equation, we have

x*(t) = t³/6 + C1 t + C2

where C1 and C2 are constants of integration. Using the given boundary conditions, we have

x(0) = 1 → C2 = 1
x(2) = 5 → C1 = 4/3

With these values for the constants, we finally have the optimal function

x*(t) = t³/6 + (4/3)t + 1
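A numerical sanity check of this example, as a sketch. Note the integrand sign used here, ẋ² + 2tx, is the one consistent with the printed Euler-Lagrange result ẍ = t and the extremal x*(t) = t³/6 + (4/3)t + 1; nearby curves with the same endpoints should not cost less:

```python
import numpy as np

t = np.linspace(0.0, 2.0, 20001)

def cost(x):
    """J = integral of (xdot^2 + 2 t x) dt over [0, 2]."""
    xdot = np.gradient(x, t)
    integrand = xdot**2 + 2.0 * t * x
    return np.sum((integrand[:-1] + integrand[1:]) * np.diff(t)) / 2.0

x_star = t**3 / 6.0 + (4.0 / 3.0) * t + 1.0   # extremal from the EL equation
J_star = cost(x_star)

# Variations vanishing at t = 0 and t = 2 preserve x(0) = 1, x(2) = 5.
J_pert = [cost(x_star + eps * np.sin(np.pi * t / 2.0)) for eps in (0.1, 0.5, 1.0)]
```

Since the functional is convex in ẋ, the cost increase under each perturbation is ε²∫η̇² dt > 0, which the comparison confirms.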
2.4 Procedure of Pontryagin Principle for Bolza Problem (Open-
Loop Optimal Control)
2.4.1 Statement of the Problem
1. Given the plant as

ẋ(t) = f(x(t), u(t), t),

2. the performance index as

J = S(x(tf), tf) + ∫_{t0}^{tf} V(x(t), u(t), t) dt,

3. and the boundary conditions as

x(t0) = x0, with final conditions depending on the system type,

4. find the optimal control.
2.4.2 Solution of the Problem
1. Form the Pontryagin H function

H(x(t), u(t), λ(t), t) = V(x(t), u(t), t) + λᵀ(t) f(x(t), u(t), t)

2. Minimize H w.r.t. u(t),

(∂H/∂u)|_* = 0, and obtain u*(t) = h(x*(t), λ*(t), t)

3. Using the result of Step 2 in Step 1, find the optimal H*:

H*(x*(t), h(x*(t), λ*(t), t), λ*(t), t) = H*(x*(t), λ*(t), t)

4. Solve the set of 2n differential equations

ẋ*(t) = +(∂H/∂λ)|_*  and  λ̇*(t) = −(∂H/∂x)|_*

with initial conditions x0 and the final conditions

[H + ∂S/∂t]|_{*tf} δtf + [(∂S/∂x) − λ(t)]ᵀ|_{*tf} δxf = 0

which are obtained from the cases in Subsec. 2.4.3.

5. Substitute the solutions x*(t), λ*(t) from Step 4 into the expression for the optimal control u*(t) of Step 2.
2.4.3 Types of Systems
Type Substitutions Boundary Conditions
Fixed-final time and fixed-final δtf = 0, x(t0) = x0,
state system, Fig.2.5(a) δxf = 0 x(tf ) = xf
Free-final time and fixed-final δtf = 0, x(t0) = x0, x(tf ) = xf ,
state system, Fig.2.5(b) δxf = 0 H∗
+ ∂S
∂t tf
= 0
Fixed-final time and free-final δtf = 0, x(t0) = x0,
state system, Fig.2.5(c) δxf = 0 λ∗
(tf ) = ∂S
∂x ∗tf
Free-final time and dependent free-final δxf = ˙θ(tf)δtf x(t0) = x0, x(tf ) = θ(tf ),
state system, Fig.2.5(d) H∗
+ ∂S
∂t + ∂S
∂x ∗
− λ∗
(t) ˙θ(t)
tf
= 0
Free-final time and independent free-final δtf = 0, δx(t0) = x0,
state system δxf = 0 H∗
+ ∂S
∂t tf
= 0, ∂S
∂x ∗
− λ∗
(t) tf
= 0
Figure 2.5: Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) FreeFi-
nal Time and Fixed-Final State System, (c) Fixed-Final Time and Free-Final State System, (d) FreeFinal
Time and Free-Final State System
Example 1:
Statement of the Problem:
1. Plant:

ẋ1(t) = x2(t)
ẋ2(t) = u(t)

2. Performance index:

J = (1/2) ∫_{t0}^{tf} u²(t) dt

3. Boundary conditions:

x(0) = [1  2]ᵀ;  x(2) = [1  0]ᵀ
Solution of the Problem:
1. Form the Pontryagin H function:

V(x(t), u(t), t) = V(u(t)) = (1/2)u²(t)
f(x(t), u(t), t) = [f1  f2]ᵀ = [x2(t)  u(t)]ᵀ

H = H(x1(t), x2(t), u(t), λ1(t), λ2(t)) = V(u(t)) + λᵀ(t) f(x(t), u(t), t)
  = (1/2)u²(t) + λ1(t)x2(t) + λ2(t)u(t)

2. Get u*(t):

∂H/∂u = 0 ⇒ u*(t) + λ2*(t) = 0 ⇒ u*(t) = −λ2*(t)

3. Get H*:

H*(x1*(t), x2*(t), u*(t), λ1*(t), λ2*(t)) = (1/2)λ2*²(t) + λ1*(t)x2*(t) − λ2*²(t)
  = λ1*(t)x2*(t) − (1/2)λ2*²(t)

4. Obtain the state and costate equations:

ẋ1*(t) = +(∂H/∂λ1)|_* = x2*(t)
ẋ2*(t) = +(∂H/∂λ2)|_* = −λ2*(t)
λ̇1*(t) = −(∂H/∂x1)|_* = 0
λ̇2*(t) = −(∂H/∂x2)|_* = −λ1*(t)

Solving the previous equations, we have the optimal states and costates

x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control:

u*(t) = −λ2*(t) = C3 t − C4

From the boundary conditions

x1(0) = 1, x2(0) = 2, x1(2) = 1, x2(2) = 0

we get

C1 = 1, C2 = 2, C3 = 3, and C4 = 4

Finally, we have the optimal states, costates, and control:

x1*(t) = 0.5t³ − 2t² + 2t + 1
x2*(t) = 1.5t² − 4t + 2
λ1*(t) = 3
λ2*(t) = −3t + 4
u*(t) = 3t − 4

The system with the optimal controller is shown in the following figures.
Figure 2.6: Optimal Controller
Figure 2.7: Optimal Control and States
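The closed-form results of this example are easy to spot-check numerically; the sketch below verifies the boundary conditions, the state and costate equations, and the stationarity condition:

```python
import numpy as np

t = np.linspace(0.0, 2.0, 20001)

# Optimal trajectories from Example 1
x1 = 0.5 * t**3 - 2.0 * t**2 + 2.0 * t + 1.0
x2 = 1.5 * t**2 - 4.0 * t + 2.0
lam1 = np.full_like(t, 3.0)
lam2 = -3.0 * t + 4.0
u = 3.0 * t - 4.0

# Numerical derivatives for checking the state and costate equations
x1dot = np.gradient(x1, t)
x2dot = np.gradient(x2, t)
lam2dot = np.gradient(lam2, t)
```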
Example 2:
Statement of the Problem:
1. Plant:

ẋ1(t) = x2(t)
ẋ2(t) = u(t)

2. Performance index:

J = (1/2) ∫_{t0}^{tf} u²(t) dt

3. Boundary conditions:

x(0) = [1  2]ᵀ;  x(2) = [0  free]ᵀ
Solution of the Problem:
1. Form the Pontryagin H function:

V(x(t), u(t), t) = V(u(t)) = (1/2)u²(t)
f(x(t), u(t), t) = [f1  f2]ᵀ = [x2(t)  u(t)]ᵀ

H = H(x1(t), x2(t), u(t), λ1(t), λ2(t)) = V(u(t)) + λᵀ(t) f(x(t), u(t), t)
  = (1/2)u²(t) + λ1(t)x2(t) + λ2(t)u(t)

2. Get u*(t):

∂H/∂u = 0 ⇒ u*(t) + λ2*(t) = 0 ⇒ u*(t) = −λ2*(t)

3. Get H*:

H*(x1*(t), x2*(t), u*(t), λ1*(t), λ2*(t)) = (1/2)λ2*²(t) + λ1*(t)x2*(t) − λ2*²(t)
  = λ1*(t)x2*(t) − (1/2)λ2*²(t)

4. Obtain the state and costate equations:

ẋ1*(t) = +(∂H/∂λ1)|_* = x2*(t)
ẋ2*(t) = +(∂H/∂λ2)|_* = −λ2*(t)
λ̇1*(t) = −(∂H/∂x1)|_* = 0
λ̇2*(t) = −(∂H/∂x2)|_* = −λ1*(t)

Solving the previous equations, we have the optimal states and costates

x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control:

u*(t) = −λ2*(t) = C3 t − C4

From the boundary conditions

x1(0) = 1, x2(0) = 2, x1(2) = 0, λ2(tf) = (∂S/∂x2)|_{*tf} = 0 ⇒ λ2(2) = 0

we get

C1 = 1, C2 = 2, C3 = 15/8, and C4 = 15/4

Finally, we have the optimal states, costates, and control:

x1*(t) = (5/16)t³ − (15/8)t² + 2t + 1
x2*(t) = (15/16)t² − (15/4)t + 2
λ1*(t) = 15/8
λ2*(t) = −(15/8)t + 15/4
u*(t) = (15/8)t − 15/4

The system with the optimal controller is shown in the following figure.
Figure 2.8: Optimal Control and States
Example 3:
Statement of the Problem:
1. Plant:

ẋ1(t) = x2(t)
ẋ2(t) = u(t)

2. Performance index:

J = (1/2) ∫_{t0}^{tf} u²(t) dt

3. Boundary conditions:

x(0) = [1  2]ᵀ;  x(tf) = [3  free]ᵀ, with tf free
Solution of the Problem:
1. Form the Pontryagin H function:

V(x(t), u(t), t) = V(u(t)) = (1/2)u²(t)
f(x(t), u(t), t) = [f1  f2]ᵀ = [x2(t)  u(t)]ᵀ

H = H(x1(t), x2(t), u(t), λ1(t), λ2(t)) = V(u(t)) + λᵀ(t) f(x(t), u(t), t)
  = (1/2)u²(t) + λ1(t)x2(t) + λ2(t)u(t)

2. Get u*(t):

∂H/∂u = 0 ⇒ u*(t) + λ2*(t) = 0 ⇒ u*(t) = −λ2*(t)

3. Get H*:

H*(x1*(t), x2*(t), u*(t), λ1*(t), λ2*(t)) = (1/2)λ2*²(t) + λ1*(t)x2*(t) − λ2*²(t)
  = λ1*(t)x2*(t) − (1/2)λ2*²(t)

4. Obtain the state and costate equations:

ẋ1*(t) = +(∂H/∂λ1)|_* = x2*(t)
ẋ2*(t) = +(∂H/∂λ2)|_* = −λ2*(t)
λ̇1*(t) = −(∂H/∂x1)|_* = 0
λ̇2*(t) = −(∂H/∂x2)|_* = −λ1*(t)

Solving the previous equations, we have the optimal states and costates

x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control:

u*(t) = −λ2*(t) = C3 t − C4

From the boundary conditions

x1(0) = 1, x2(0) = 2, x1(tf) = 3,
[H + ∂S/∂t]|_{tf} = 0 ⇒ λ1(tf)x2(tf) − (1/2)λ2²(tf) = 0,
λ2(tf) = (∂S/∂x2)|_{tf} = 0

we get

C1 = 1, C2 = 2, C3 = 4/9, C4 = 4/3, and tf = 3

Finally, we have the optimal states, costates, and control:

x1*(t) = (2/27)t³ − (2/3)t² + 2t + 1
x2*(t) = (2/9)t² − (4/3)t + 2
λ1*(t) = 4/9
λ2*(t) = −(4/9)t + 4/3
u*(t) = (4/9)t − 4/3

The system with the optimal controller is shown in the following figure.
Figure 2.9: Optimal Control and States
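For this free-final-time case, the transversality conditions can be checked numerically, as a sketch: with the constants above, the root of λ2*(t) = 0 recovers tf = 3, and the remaining terminal conditions follow.

```python
import numpy as np
from scipy.optimize import brentq

# Constants of integration from Example 3
C1, C2, C3, C4 = 1.0, 2.0, 4.0 / 9.0, 4.0 / 3.0

x1 = lambda t: (C3 / 6.0) * t**3 - (C4 / 2.0) * t**2 + C2 * t + C1
x2 = lambda t: (C3 / 2.0) * t**2 - C4 * t + C2
lam2 = lambda t: -C3 * t + C4

tf = brentq(lam2, 0.1, 10.0)              # transversality: lam2(tf) = 0
H_tf = C3 * x2(tf) - 0.5 * lam2(tf)**2    # H* = lam1* x2* - (1/2) lam2*^2 at tf
```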
Example 4:
Statement of the Problem:
1. Plant:

ẋ1(t) = x2(t)
ẋ2(t) = u(t)

2. Performance index:

J = (1/2)[x1(2) − 4]² + (1/2)[x2(2) − 2]² + (1/2) ∫_{t0}^{tf} u²(t) dt

3. Boundary conditions:

x(0) = [1  2]ᵀ;  x(2) = free
Solution of the Problem:
1. Form the Pontryagin H function:

V(x(t), u(t), t) = V(u(t)) = (1/2)u²(t)
f(x(t), u(t), t) = [f1  f2]ᵀ = [x2(t)  u(t)]ᵀ

H = H(x1(t), x2(t), u(t), λ1(t), λ2(t)) = V(u(t)) + λᵀ(t) f(x(t), u(t), t)
  = (1/2)u²(t) + λ1(t)x2(t) + λ2(t)u(t)

2. Get u*(t):

∂H/∂u = 0 ⇒ u*(t) + λ2*(t) = 0 ⇒ u*(t) = −λ2*(t)

3. Get H*:

H*(x1*(t), x2*(t), u*(t), λ1*(t), λ2*(t)) = (1/2)λ2*²(t) + λ1*(t)x2*(t) − λ2*²(t)
  = λ1*(t)x2*(t) − (1/2)λ2*²(t)

4. Obtain the state and costate equations:

ẋ1*(t) = +(∂H/∂λ1)|_* = x2*(t)
ẋ2*(t) = +(∂H/∂λ2)|_* = −λ2*(t)
λ̇1*(t) = −(∂H/∂x1)|_* = 0
λ̇2*(t) = −(∂H/∂x2)|_* = −λ1*(t)

Solving the previous equations, we have the optimal states and costates

x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control:

u*(t) = −λ2*(t) = C3 t − C4

From the boundary conditions

x1(0) = 1, x2(0) = 2,
λ1(tf) = (∂S/∂x1)|_{tf} ⇒ λ1*(2) = x1*(2) − 4
λ2(tf) = (∂S/∂x2)|_{tf} ⇒ λ2*(2) = x2*(2) − 2

we get

C1 = 1, C2 = 2, C3 = 3/7, and C4 = 4/7

Finally, we have the optimal states, costates, and control:

x1*(t) = (1/14)t³ − (2/7)t² + 2t + 1
x2*(t) = (3/14)t² − (4/7)t + 2
λ1*(t) = 3/7
λ2*(t) = −(3/7)t + 4/7
u*(t) = (3/7)t − 4/7

The system with the optimal controller is shown in the following figure.
Figure 2.10: Optimal Control and States
Chapter 3
Linear Quadratic Optimal Control
Systems I (Regulator Closed-Loop
Optimal Control)
3.1 Procedure Summary of Finite-Time Linear Quadratic Reg-
ulator System: Time-Varying Case (Closed-Loop Optimal
Control with Fixed tf and Free x(tf ))
3.1.1 Statement of the Problem
1. Given the plant as

ẋ(t) = A(t)x(t) + B(t)u(t)

2. the performance index as

J = (1/2)xᵀ(tf)F(tf)x(tf) + (1/2) ∫_{t0}^{tf} [xᵀ(t)Q(t)x(t) + uᵀ(t)R(t)u(t)] dt

3. and the boundary conditions as

x(t0) = x0, tf fixed, and x(tf) free,

4. find the optimal control, state, and performance index.
3.1.2 Solution of the Problem
1. Solve the matrix differential Riccati equation
   \dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + P(t)B(t)R^{-1}(t)B'(t)P(t)
with final condition P(t = t_f) = F(t_f).
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - B(t)R^{-1}(t)B'(t)P(t)]x^*(t)
with initial condition x(t_0) = x_0.
3. Obtain the optimal control u^*(t) as
   u^*(t) = -K(t)x^*(t); \quad \text{where } K(t) = R^{-1}(t)B'(t)P(t)
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t)
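The four steps above can be sketched numerically as follows. This is a minimal NumPy implementation under assumptions: the plant matrices, weights, horizon, and step count chosen here are illustrative, and simple Euler stepping stands in for a proper ODE solver.

```python
# Finite-time LQR sketch: backward-integrate the matrix DRE, then roll the
# closed-loop state forward. All numerical values are illustrative assumptions.
import numpy as np

A = np.array([[0., 1.], [-2., 1.]])
B = np.array([[0.], [1.]])
Q = np.eye(2); R = np.array([[1.]]); F = np.eye(2)
t0, tf, N = 0.0, 2.0, 2000
dt = (tf - t0) / N
Rinv = np.linalg.inv(R)

# Step 1: Pdot = -PA - A'P - Q + P B R^{-1} B' P, integrated backward from P(tf) = F
P = [None] * (N + 1)
P[N] = F.copy()
for k in range(N, 0, -1):
    Pk = P[k]
    Pdot = -Pk @ A - A.T @ Pk - Q + Pk @ B @ Rinv @ B.T @ Pk
    P[k - 1] = Pk - dt * Pdot          # Euler step backward in time

# Steps 2-3: forward simulation with u*(t) = -K(t) x*(t)
x0 = np.array([1., 2.])
x = x0.copy()
for k in range(N):
    K = Rinv @ B.T @ P[k]              # time-varying Kalman gain
    x = x + dt * ((A - B @ K) @ x)

# Step 4: optimal cost from the initial state
J = 0.5 * x0 @ P[0] @ x0
```

As noted in the salient features below, P(t) is computed offline, backward in time, before the forward simulation runs.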
3.2 Salient Features
1. Riccati Coefficient: The Riccati coefficient matrix P(t) is a time-varying matrix which depends
upon the system matrices A(t) and B(t), the performance index (design) matrices Q(t), R(t) and
F(tf ), and the terminal time tf , but P(t) does not depend upon the initial state x(t0) of the
system.
2. P(t) is symmetric and hence it follows that the n × n order matrix DRE represents a system of
n(n + 1)/2 first order, nonlinear, time-varying, ordinary differential equations.
3. Optimal Control: The optimal control u^*(t) is a minimum (maximum) if the control weighting matrix R(t) is positive definite (negative definite).
4. Optimal State: Using the optimal control u^*(t) in the state equation, we have
   \dot{x}^*(t) = [A(t) - B(t)R^{-1}(t)B'(t)P(t)]x^*(t) = G(t)x^*(t)
The solution of this state differential equation along with the initial condition x(t_0) gives the optimal state x^*(t). Note that there is no stability condition on the closed-loop matrix G(t) as long as we are considering the finite final time (t_f) system.
5. Optimal Cost: The minimum cost J^* is given by
   J^* = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t) \quad \text{for all } t \in [t_0, t_f]
where P(t) is the solution of the matrix DRE, and x^*(t) is the solution of the closed-loop optimal system.
6. Definiteness of the Matrix P(t): Since F(t_f) is positive semidefinite and P(t_f) = F(t_f), we can easily say that P(t_f) is positive semidefinite. We can argue that P(t) is positive definite for all t \in [t_0, t_f). Suppose that P(t) is not positive definite for some t = t_s < t_f; then there exists a corresponding state x^*(t_s) such that the cost \frac{1}{2}x^{*\prime}(t_s)P(t_s)x^*(t_s) \leq 0, which clearly violates the fact that the minimum cost has to be a positive quantity. Hence, P(t) is positive definite for all t \in [t_0, t_f). Since we already know that P(t) is symmetric, we now have that P(t) is a positive definite, symmetric matrix.
7. Computation of Matrix DRE: Under some conditions we can get an analytical solution for the nonlinear matrix DRE. But in general, we may try to solve the matrix DRE by integrating backward from its known final condition.
8. Independence of the Riccati Coefficient Matrix P(t): The matrix P(t) is independent of the optimal state x^*(t). Once the system and the cost are specified, that is, once we are given the plant matrices A(t) and B(t) and the performance index matrices F(t_f), Q(t), and R(t), we can compute the matrix P(t) before the optimal system operates in the forward direction from its initial condition. Typically, we compute (offline) the matrix P(t) backward over the interval t \in [t_f, t_0], store these values, and feed them back when the system is operating in the forward direction over the interval t \in [t_0, t_f].
9. Implementation of the Optimal Control: The block diagram implementing the closed-loop
optimal controller (CLOC) is shown in Fig.3.1. The figure shows clearly that the (CLOC) gets its
values of P(t) externally, after solving the matrix DRE backward in time from t = tf to t = t0 and
hence there is no way that we can implement the closed-loop optimal control configuration on-line.
10. Linear Optimal Control: The optimal feedback control u^*(t) is given by
   u^*(t) = -K(t)x^*(t)
where the Kalman gain K(t) = R^{-1}(t)B'(t)P(t). This optimal control is linear in the state x^*(t), which is one of the nice features of the optimal control of linear systems with quadratic cost functionals. Also, note that the negative feedback in the optimal control relation emerged from the theory of optimal control and was not introduced intentionally in our development.
Figure 3.1: Closed-Loop Optimal Control Implementation
11. Controllability: We do not need a controllability condition on the system for implementing the optimal feedback control, as long as we are dealing with a finite-time (t_f) system, because the contribution of any uncontrollable (and possibly unstable) states to the cost functional is still finite. However, if we consider an infinite time interval, we certainly need the controllability condition.
3.3 LQR System for General Performance Index
Consider a linear, time-varying plant described by
   \dot{x}(t) = A(t)x(t) + B(t)u(t)
with a cost functional
   J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} [x'(t)Q(t)x(t) + 2x'(t)S(t)u(t) + u'(t)R(t)u(t)]\,dt
     = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} x'(t) & u'(t) \end{bmatrix} \begin{bmatrix} Q(t) & S(t) \\ S'(t) & R(t) \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt
where S(t) is the state-control cross-weighting matrix.
Hence, the matrix differential Riccati equation is
   \dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + [P(t)B(t) + S(t)]R^{-1}(t)[B'(t)P(t) + S'(t)]
with the final condition on P(t) as P(t_f) = F(t_f). The optimal control is then given by
   u(t) = -R^{-1}(t)[B'(t)P(t) + S'(t)]x(t)
When S(t) is made zero in the previous analysis, we recover the results shown in Sec. 3.1.
Example:
Statement of the Problem:
1. Plant
   \dot{x}_1(t) = x_2(t)
   \dot{x}_2(t) = -2x_1(t) + x_2(t) + u(t)
2. Performance index
   J = \frac{1}{2}\left[x_1^2(5) + x_1(5)x_2(5) + 2x_2^2(5)\right] + \frac{1}{2}\int_{0}^{5} \left[2x_1^2(t) + 6x_1(t)x_2(t) + 5x_2^2(t) + 0.25u^2(t)\right]dt
3. Boundary conditions
   \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} 2 \\ -3 \end{bmatrix}
Solution of the Problem:
1. Solve the matrix differential Riccati equation
   A = \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad F(t_f) = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 2 \end{bmatrix}; \quad Q = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}; \quad R = \frac{1}{4}; \quad [t_0, t_f] = [0, 5]
Let P(t) be the 2 × 2 symmetric matrix
   P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}
The matrix DRE
   \dot{P}(t) = -P(t)A - A'P(t) - Q + P(t)BR^{-1}B'P(t)
becomes
   \begin{bmatrix} \dot{p}_{11} & \dot{p}_{12} \\ \dot{p}_{12} & \dot{p}_{22} \end{bmatrix} = -P\begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & 1 \end{bmatrix}P - \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix} + P\begin{bmatrix} 0 \\ 1 \end{bmatrix}(4)\begin{bmatrix} 0 & 1 \end{bmatrix}P
Simplifying the matrix DRE,
   \dot{p}_{11}(t) = 4p_{12}^2(t) + 4p_{12}(t) - 2
   \dot{p}_{12}(t) = -p_{11}(t) - p_{12}(t) + 2p_{22}(t) + 4p_{12}(t)p_{22}(t) - 3
   \dot{p}_{22}(t) = -2p_{12}(t) - 2p_{22}(t) + 4p_{22}^2(t) - 5
with final condition P(t_f) = F(t_f):
   \begin{bmatrix} p_{11}(5) & p_{12}(5) \\ p_{12}(5) & p_{22}(5) \end{bmatrix} = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 2 \end{bmatrix}
Figure 3.2: Riccati Coefficients
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - BR^{-1}B'P(t)]x^*(t)
with initial condition x(0) = [2, -3]'
Figure 3.3: Optimal States
3. Obtain the optimal control u^*(t) as
   u^*(t) = -K(t)x^*(t); \quad \text{where } K(t) = R^{-1}B'P(t)
Figure 3.4: Optimal Control
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t)
Figure 3.5: Optimal Performance Index
Figure 3.6: Closed-Loop Optimal Control System
3.4 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞))
This problem cannot always be solved without some special conditions. For example, if any one of the states is uncontrollable and/or unstable, the corresponding performance measure J becomes infinite and makes no physical sense. Thus, we need to impose the condition that the system be completely controllable.
3.4.1 Statement of the Problem
1. Given the plant as
   \dot{x}(t) = A(t)x(t) + B(t)u(t)
2. the performance index as
   J = \frac{1}{2}\int_{t_0}^{\infty} [x'(t)Q(t)x(t) + u'(t)R(t)u(t)]\,dt
3. and the boundary conditions as
   x(t_0) = x_0, \quad x(\infty) \text{ is free},
4. find the optimal control, state and performance index.
3.4.2 Solution of the Problem
1. Solve the matrix differential Riccati equation
   \dot{\hat{P}}(t) = -\hat{P}(t)A(t) - A'(t)\hat{P}(t) - Q(t) + \hat{P}(t)B(t)R^{-1}(t)B'(t)\hat{P}(t)
with final condition \hat{P}(t_f) = 0, where \hat{P}(t) = \lim_{t_f \to \infty} P(t).
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - B(t)R^{-1}(t)B'(t)\hat{P}(t)]x^*(t)
with initial condition x(t_0) = x_0.
3. Obtain the optimal control u^*(t) as
   u^*(t) = -\hat{K}(t)x^*(t); \quad \text{where } \hat{K}(t) = R^{-1}(t)B'(t)\hat{P}(t)
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)\hat{P}(t)x^*(t)
3.5 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞))
There are some implications of the time-invariance and the infinite final time.
1. The infinite time interval case is considered for the following reasons:
(a) We wish to make sure that the state regulator stays near the zero state after the initial transient.
(b) We want to include any special case of large final time.
2. With an infinite final-time interval, including the final cost term does not make practical sense. Hence, the term involving F(t_f) does not appear in the cost functional.
3. With an infinite final-time interval, the system has to be completely controllable. Recall that this controllability condition requires that the controllability matrix
   [B \;\; AB \;\; \ldots \;\; A^{n-1}B]
be nonsingular, i.e., contain n linearly independent column vectors. The controllability requirement guarantees that the optimal cost is finite. On the other hand, if the system is not controllable and some or all of the uncontrollable states are unstable, then the cost functional would be infinite since the control interval is infinite. In such situations, we cannot distinguish the optimal control from the other controls. Alternatively, we can assume that the system is completely stabilizable.
3.5.1 Statement of the Problem
1. Given the plant as
˙x(t) = Ax(t) + Bu(t)
2. the performance index as
J = \frac{1}{2}\int_{t_0}^{\infty} [x'(t)Qx(t) + u'(t)Ru(t)]\,dt
3. and the boundary conditions as
x(t0) = x0, x(∞) = 0
4. find the optimal control, state and performance index.
3.5.2 Solution of the Problem
1. Solve the algebraic Riccati equation
   -\bar{P}A - A'\bar{P} - Q + \bar{P}BR^{-1}B'\bar{P} = 0
where \bar{P} = \lim_{t_f \to \infty} P(t).
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - BR^{-1}B'\bar{P}]x^*(t)
with initial condition x(t_0) = x_0.
Note: the original system matrix A may be unstable, but the optimal closed-loop matrix [A - BR^{-1}B'\bar{P}] is guaranteed to be stable.
3. Obtain the optimal control u^*(t) as
   u^*(t) = -\bar{K}x^*(t); \quad \text{where } \bar{K} = R^{-1}B'\bar{P}
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)\bar{P}x^*(t)
Figure 3.7: Implementation of the Closed-Loop Optimal Control: Infinite Final Time
Example:
Statement of the Problem:
1. Plant
˙x1(t) = x2(t)
˙x2(t) = −2x1(t) + x2(t) + u(t)
2. Performance index
   J = \frac{1}{2}\int_{0}^{\infty} \left[2x_1^2(t) + 6x_1(t)x_2(t) + 5x_2^2(t) + 0.25u^2(t)\right]dt
3. Boundary conditions
   \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} 2 \\ -3 \end{bmatrix}
Solution of the Problem:
1. Solve the algebraic Riccati equation
   A = \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad F(t_f) = 0; \quad Q = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}; \quad R = \frac{1}{4}; \quad [t_0, t_f] = [0, \infty]
Let \bar{P} be the 2 × 2 symmetric matrix
   \bar{P} = \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix}
The algebraic Riccati equation
   0 = -\bar{P}A - A'\bar{P} - Q + \bar{P}BR^{-1}B'\bar{P}
becomes
   \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = -\bar{P}\begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & 1 \end{bmatrix}\bar{P} - \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix} + \bar{P}\begin{bmatrix} 0 \\ 1 \end{bmatrix}(4)\begin{bmatrix} 0 & 1 \end{bmatrix}\bar{P}
Simplifying,
   0 = 4\bar{p}_{12}^2 + 4\bar{p}_{12} - 2
   0 = -\bar{p}_{11} - \bar{p}_{12} + 2\bar{p}_{22} + 4\bar{p}_{12}\bar{p}_{22} - 3
   0 = -2\bar{p}_{12} - 2\bar{p}_{22} + 4\bar{p}_{22}^2 - 5
Taking the positive (stabilizing) solution,
   \bar{P} = \begin{bmatrix} 1.7363 & 0.3660 \\ 0.3660 & 1.4729 \end{bmatrix}
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - BR^{-1}B'\bar{P}]x^*(t)
with initial condition x(0) = [2, -3]'
Figure 3.8: Optimal States
3. Obtain the optimal control u^*(t) as
   u^*(t) = -\bar{K}x^*(t); \quad \text{where } \bar{K} = R^{-1}B'\bar{P}
Figure 3.9: Optimal Control
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)\bar{P}x^*(t)
Figure 3.10: Optimal Performance Index
The original plant is unstable (eigenvalues at 0.5 ± j1.3229) whereas the optimal closed-loop system is stable (eigenvalues at −4.0326 and −0.8590).
Figure 3.11: Closed-Loop Optimal Control System
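The hand computation above can be cross-checked with SciPy's algebraic Riccati equation solver. A minimal sketch (NumPy and SciPy assumed available):

```python
# Cross-check of the infinite-horizon LQR example above with SciPy's ARE solver.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0., 1.], [-2., 1.]])
B = np.array([[0.], [1.]])
Q = np.array([[2., 3.], [3., 5.]])
R = np.array([[0.25]])

Pbar = solve_continuous_are(A, B, Q, R)   # stabilizing ARE solution
Kbar = np.linalg.inv(R) @ B.T @ Pbar      # constant optimal gain
eigs = np.linalg.eigvals(A - B @ Kbar)    # closed-loop eigenvalues
```

The solver returns the stabilizing solution, so the closed-loop eigenvalues have negative real parts even though the open-loop plant is unstable.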
3.6 Stability Issues of Time-Invariant Regulator
Some stability remarks on the infinite-time regulator system:
1. The closed-loop optimal system is not always stable, especially when the original plant is unstable and these unstable states are not weighted in the PI. In order to prevent such a situation, we need the assumption that the pair [A, C] is detectable, where C is any matrix such that C'C = Q; this guarantees the stability of the closed-loop optimal system. This assumption essentially ensures that all the potentially unstable states will show up in the x'(t)Qx(t) part of the performance measure.
2. The Riccati coefficient matrix \bar{P} is positive definite if and only if [A, C] is completely observable.
3. The detectability condition is necessary for stability of the closed-loop optimal system.
4. Thus both detectability and stabilizability conditions are necessary for the existence of a stable closed-loop system.
3.7 Equivalence of Open-Loop and Closed-Loop Optimal Controls
We present a simple example to show an interesting property that an optimal control system can be
solved and implemented as an open-loop optimal control (OLOC) configuration or a closed-loop optimal
control (CLOC) configuration. We will also demonstrate the simplicity of the CLOC.
Example:
Statement of the Problem:
1. Plant
˙x(t) = −3x(t) + u(t)
2. Performance index
J = \int_{0}^{\infty} \left[x^2(t) + u^2(t)\right]dt
3. Boundary conditions
   x(0) = 1; \quad x(\infty) = 0
Open-Loop Optimal Control Solution:
1. Form the Pontryagin H function
   V(x(t), u(t)) = x^2(t) + u^2(t)
   f(x(t), u(t)) = -3x(t) + u(t)
   H = V(x(t), u(t)) + \lambda(t)f(x(t), u(t)) = x^2(t) + u^2(t) + \lambda(t)[-3x(t) + u(t)]
2. Get u^*(t)
   \frac{\partial H}{\partial u} = 0 \Rightarrow 2u^*(t) + \lambda^*(t) = 0 \Rightarrow u^*(t) = -\frac{1}{2}\lambda^*(t)
3. Get H^*
   H^* = x^{*2}(t) - \frac{1}{4}\lambda^{*2}(t) - 3\lambda^*(t)x^*(t)
4. Obtain the state and costate equations
   \dot{x}^*(t) = +\left.\frac{\partial H}{\partial \lambda}\right|_* = -\frac{1}{2}\lambda^*(t) - 3x^*(t)
   \dot{\lambda}^*(t) = -\left.\frac{\partial H}{\partial x}\right|_* = -2x^*(t) + 3\lambda^*(t)
Solving the previous equations, we have the optimal state and costate as
   \ddot{x}^*(t) - 10x^*(t) = 0 \Rightarrow x^*(t) = C_1 e^{\sqrt{10}\,t} + C_2 e^{-\sqrt{10}\,t}
   \lambda^*(t) = 2[-\dot{x}^*(t) - 3x^*(t)] = -2C_1(\sqrt{10} + 3)e^{\sqrt{10}\,t} + 2C_2(\sqrt{10} - 3)e^{-\sqrt{10}\,t}
5. Obtain the optimal control
From the boundary conditions,
   x(\infty) = 0 \Rightarrow C_1 = 0
   x(0) = 1 \Rightarrow C_2 = 1
so that
   x^*(t) = e^{-\sqrt{10}\,t}
   \lambda^*(t) = 2(\sqrt{10} - 3)e^{-\sqrt{10}\,t}
   u^*(t) = -(\sqrt{10} - 3)e^{-\sqrt{10}\,t}
Figure 3.12: Optimal Control and State
Closed-Loop Optimal Control Solution:
1. Solve the algebraic Riccati equation
   A = -3; \quad B = 1; \quad F(t_f) = 0; \quad Q = 2; \quad R = 2; \quad [t_0, t_f] = [0, \infty]
Let \bar{P} be the 1 × 1 symmetric matrix \bar{P} = \bar{p}. The solution of
   -\bar{P}A - A'\bar{P} - Q + \bar{P}BR^{-1}B'\bar{P} = 0
is
   -\bar{p}(-3) - (-3)\bar{p} - 2 + \bar{p}(1)\frac{1}{2}(1)\bar{p} = 0 \Rightarrow \bar{p} = -6 + 2\sqrt{10}
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - BR^{-1}B'\bar{P}]x^*(t) = -\sqrt{10}\,x^*(t) \Rightarrow x^*(t) = C_1 e^{-\sqrt{10}\,t}
   x(0) = 1 \Rightarrow C_1 = 1 \Rightarrow x^*(t) = e^{-\sqrt{10}\,t}
3. Obtain the optimal control u^*(t) as
   u^*(t) = -\bar{K}x^*(t) = -R^{-1}B'\bar{P}x^*(t) = -(\sqrt{10} - 3)e^{-\sqrt{10}\,t}
Figure 3.13: Optimal Control and State
4. Obtain the optimal performance index from
   J^*(t_0) = \frac{1}{2}x^{*\prime}(t_0)\bar{P}x^*(t_0) = -3 + \sqrt{10}
From the previous example, it is clear that
1. from the implementation point of view, the closed-loop optimal controller gain (\sqrt{10} - 3) is much simpler than the open-loop optimal controller (\sqrt{10} - 3)e^{-\sqrt{10}\,t}, which is an exponential function of time, and
2. with a closed-loop configuration, all the advantages of conventional feedback are incorporated.
Figure 3.14: (a) Open-Loop Optimal Controller (OLOC) and (b) Closed-Loop Optimal Controller
(CLOC)
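The equivalence of the two configurations can be checked in a few lines. A sketch (NumPy assumed available) comparing the OLOC and CLOC control histories for the example above:

```python
# Numerical check that the OLOC and CLOC solutions above give the same control.
import numpy as np

pbar = -6.0 + 2.0 * np.sqrt(10.0)   # Riccati coefficient, root of p^2 + 12p - 4 = 0
kbar = pbar / 2.0                    # feedback gain R^{-1} B pbar with R = 2

t = np.linspace(0.0, 2.0, 201)
x_cl = np.exp(-np.sqrt(10.0) * t)                             # closed-loop state, x(0) = 1
u_cloc = -kbar * x_cl                                          # CLOC: u = -kbar * x
u_oloc = -(np.sqrt(10.0) - 3.0) * np.exp(-np.sqrt(10.0) * t)   # OLOC closed form
```

The constant gain kbar equals √10 − 3, so the two control histories coincide, as the analysis shows.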
Chapter 4
Linear Quadratic Optimal Control
Systems II (Tracking Closed-Loop
Optimal Control)
4.1 Procedure Summary of Linear Quadratic Tracking System
(Closed-Loop Optimal Control with Fixed tf and Free x(tf ))
4.1.1 Statement of the Problem
1. Given the plant as
˙x(t) = A(t)x(t) + B(t)u(t)
y(t) = C(t)x(t)
e(t) = z(t) − y(t); z(t) is the desired output
2. the performance index as
J = \frac{1}{2}e'(t_f)F(t_f)e(t_f) + \frac{1}{2}\int_{t_0}^{t_f} [e'(t)Q(t)e(t) + u'(t)R(t)u(t)]\,dt
3. and the boundary conditions as
x(t0) = x0, x(tf ) is free,
4. find the optimal control, state and performance index.
4.1.2 Solution of the Problem
1. Solve the matrix differential Riccati equation
   \dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)E(t)P(t) - V(t)
with final condition
   P(t_f) = C'(t_f)F(t_f)C(t_f)
and the non-homogeneous vector differential equation
   \dot{g}(t) = -[A(t) - E(t)P(t)]'g(t) - W(t)z(t)
with final condition
   g(t_f) = C'(t_f)F(t_f)z(t_f)
where
   E(t) = B(t)R^{-1}(t)B'(t)
   V(t) = C'(t)Q(t)C(t)
   W(t) = C'(t)Q(t)
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - E(t)P(t)]x^*(t) + E(t)g(t)
with initial condition x(t_0) = x_0.
3. Obtain the optimal control u^*(t) as
   u^*(t) = -K(t)x^*(t) + R^{-1}(t)B'(t)g(t); \quad \text{where } K(t) = R^{-1}(t)B'(t)P(t)
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t) - x^{*\prime}(t)g(t) + h(t)
where h(t) is the solution of
   \dot{h}(t) = -\frac{1}{2}g'(t)E(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t)
with final condition
   h(t_f) = \frac{1}{2}z'(t_f)F(t_f)z(t_f)
Figure 4.1: Implementation of the Optimal Tracking System
4.1.3 Salient Features of Tracking System
1. Riccati Coefficient Matrix P(t): We note that the desired output z(t) has no influence on
the matrix differential Riccati equation and its boundary condition. This means that once the
problem is specified in terms of the final time tf, the plant matrices A(t), B(t), and C(t), and the
cost functional matrices F(tf ), Q(t), and R(t), the matrix function P(t) is completely determined.
2. Closed-Loop Eigenvalues: The closed-loop system matrix [A(t) - B(t)R^{-1}(t)B'(t)P(t)] is again independent of the desired output z(t). This means the eigenvalues of the closed-loop optimal tracking system are independent of the desired output z(t).
3. Tracking and Regulator Systems: The main difference between the optimal output tracking
system and the optimal state regulator system is in the vector g(t). As shown in Fig.4.1, one can
think of the desired output z(t) as the forcing function of the closed-loop optimal system which
generates the signal g(t).
4. Also, note that if we make C(t) = I, then V(t) = Q(t). Thus, the matrix DRE in Subsec. 4.1.2 becomes the same matrix DRE as in Subsec. 3.1.2.
Example 1:
Statement of the Problem:
1. Plant
\dot{x}_1(t) = x_2(t)
\dot{x}_2(t) = -2x_1(t) - 3x_2(t) + u(t)
y(t) = x(t)
2. Performance index
   J = [1 - x_1(t_f)]^2 + \int_{t_0}^{t_f} \left\{[1 - x_1(t)]^2 + 0.002u^2(t)\right\}dt
3. Boundary conditions
   x(0) = \begin{bmatrix} -0.5 \\ 0 \end{bmatrix}, \quad t_f = 20, \quad x(t_f) \text{ is free; controls and states are unbounded.}
It is required to keep the state x_1(t) close to 1.
Solution of the Problem:
1. Solve the matrix differential Riccati equation
The state x_1(t) is to be kept close to the reference input z_1(t) = 1, and since there is no condition on state x_2(t), one can choose arbitrarily z_2(t) = 0.
   A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad C = I_2; \quad z(t) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}; \quad Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} = F(t_f); \quad R = 0.004
Let P(t) be the 2 × 2 symmetric matrix
   P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}
The matrix DRE
   \dot{P}(t) = -P(t)A - A'P(t) + P(t)EP(t) - V
with
   E = BR^{-1}B' = \begin{bmatrix} 0 & 0 \\ 0 & 250 \end{bmatrix}; \quad V = C'QC = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}; \quad W = C'Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}
simplifies to
   \dot{p}_{11}(t) = 250p_{12}^2(t) + 4p_{12}(t) - 2
   \dot{p}_{12}(t) = 250p_{12}(t)p_{22}(t) - p_{11}(t) + 3p_{12}(t) + 2p_{22}(t)
   \dot{p}_{22}(t) = 250p_{22}^2(t) - 2p_{12}(t) + 6p_{22}(t)
with final condition
   P(t_f) = C'(t_f)F(t_f)C(t_f) = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}
Figure 4.2: Riccati Coefficients
Let g(t) be the 2 × 1 vector
   g(t) = \begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix}
The vector equation
   \dot{g}(t) = -[A - EP(t)]'g(t) - Wz(t)
simplifies to
   \dot{g}_1(t) = [250p_{12}(t) + 2]g_2(t) - 2
   \dot{g}_2(t) = -g_1(t) + [3 + 250p_{22}(t)]g_2(t)
with final condition
   g(t_f) = C'(t_f)F(t_f)z(t_f) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}
Figure 4.3: g(t) Coefficients
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - EP(t)]x^*(t) + Eg(t)
with initial condition x(t_0) = [-0.5, 0]'
Figure 4.4: Optimal States
3. Obtain the optimal control u^*(t) as
   u^*(t) = -K(t)x^*(t) + R^{-1}B'g(t); \quad \text{where } K(t) = R^{-1}B'P(t)
Figure 4.5: Optimal Control
4. Obtain the optimal performance index from
   J^*(t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) - x^{*\prime}(t_0)g(t_0) + h(t_0) = 42.9092
Figure 4.6: Optimal Performance Index
where h(t) is the solution of
   \dot{h}(t) = -\frac{1}{2}g'(t)E(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t)
with final condition
   h(t_f) = \frac{1}{2}z'(t_f)F(t_f)z(t_f) = 1
Figure 4.7: h(t) Solution
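The backward/forward sweeps of the tracking procedure can be sketched as below (NumPy assumed available; Euler stepping stands in for a proper ODE solver, and the step count is an illustrative choice):

```python
# LQT sketch for the example above: integrate P(t) and g(t) backward,
# then simulate the tracking loop forward from x(0) = [-0.5, 0].
import numpy as np

A = np.array([[0., 1.], [-2., -3.]])
B = np.array([[0.], [1.]])
C = np.eye(2)
Q = F = np.array([[2., 0.], [0., 0.]])
R = np.array([[0.004]])
z = np.array([1., 0.])                    # constant reference z(t)
E = B @ np.linalg.inv(R) @ B.T
V = C.T @ Q @ C
W = C.T @ Q
t0, tf, N = 0.0, 20.0, 20000
dt = (tf - t0) / N

# Backward sweep for P(t) and g(t)
P = [None] * (N + 1); g = [None] * (N + 1)
P[N] = C.T @ F @ C
g[N] = C.T @ F @ z
for k in range(N, 0, -1):
    Pd = -P[k] @ A - A.T @ P[k] + P[k] @ E @ P[k] - V
    gd = -(A - E @ P[k]).T @ g[k] - W @ z
    P[k - 1] = P[k] - dt * Pd
    g[k - 1] = g[k] - dt * gd

# Forward sweep for x*(t), recording the mid-horizon state
x = np.array([-0.5, 0.])
x_mid = None
for k in range(N):
    if k == N // 2:
        x_mid = x.copy()
    x = x + dt * ((A - E @ P[k]) @ x + E @ g[k])
```

With the small control weight R = 0.004, the state x1 is pulled close to the reference value 1 over most of the horizon, matching the plotted optimal states.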
Example 2:
Statement of the Problem:
1. Plant
\dot{x}_1(t) = x_2(t)
\dot{x}_2(t) = -2x_1(t) - 3x_2(t) + u(t)
y(t) = x(t)
2. Performance index
   J = \int_{t_0}^{t_f} \left\{[2t - x_1(t)]^2 + 0.02u^2(t)\right\}dt
3. Boundary conditions
   x(0) = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \quad t_f = 20, \quad x(t_f) \text{ is free; controls and states are unbounded.}
The state x_1(t) is to track the ramp function z_1(t) = 2t.
Solution of the Problem:
1. Solve the matrix differential Riccati equation
As there is no condition on state x_2(t), one can choose arbitrarily z_2(t) = 0.
   A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad C = I_2; \quad z(t) = \begin{bmatrix} 2t \\ 0 \end{bmatrix}; \quad Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}; \quad R = 0.04; \quad F(t_f) = 0
Let P(t) be the 2 × 2 symmetric matrix
   P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}
The matrix DRE
   \dot{P}(t) = -P(t)A - A'P(t) + P(t)EP(t) - V
with
   E = BR^{-1}B' = \begin{bmatrix} 0 & 0 \\ 0 & 25 \end{bmatrix}; \quad V = C'QC = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}; \quad W = C'Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}
simplifies to
   \dot{p}_{11}(t) = 25p_{12}^2(t) + 4p_{12}(t) - 2
   \dot{p}_{12}(t) = 25p_{12}(t)p_{22}(t) - p_{11}(t) + 3p_{12}(t) + 2p_{22}(t)
   \dot{p}_{22}(t) = 25p_{22}^2(t) - 2p_{12}(t) + 6p_{22}(t)
with final condition
   P(t_f) = C'(t_f)F(t_f)C(t_f) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
Figure 4.8: Riccati Coefficients
Let g(t) be the 2 × 1 vector
   g(t) = \begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix}
The vector equation
   \dot{g}(t) = -[A - EP(t)]'g(t) - Wz(t)
simplifies to
   \dot{g}_1(t) = [25p_{12}(t) + 2]g_2(t) - 4t
   \dot{g}_2(t) = -g_1(t) + [3 + 25p_{22}(t)]g_2(t)
with final condition
   g(t_f) = C'(t_f)F(t_f)z(t_f) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
Figure 4.9: g(t) Coefficients
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - EP(t)]x^*(t) + Eg(t)
with initial condition x(t_0) = [-1, 0]'
Figure 4.10: Optimal States
3. Obtain the optimal control u^*(t) as
   u^*(t) = -K(t)x^*(t) + R^{-1}B'g(t); \quad \text{where } K(t) = R^{-1}B'P(t)
Figure 4.11: Optimal Control
4. Obtain the optimal performance index from
   J^*(t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) - x^{*\prime}(t_0)g(t_0) + h(t_0) = 2.0450 \times 10^4
Figure 4.12: Optimal Performance Index
where h(t) is the solution of
   \dot{h}(t) = -\frac{1}{2}g'(t)E(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t)
with final condition
   h(t_f) = \frac{1}{2}z'(t_f)F(t_f)z(t_f) = 0
Figure 4.13: h(t) Solution
4.2 Procedure Summary of Linear Quadratic Tracking System
(Closed-Loop Optimal Control with Infinite tf and Free
x(∞))
4.2.1 Statement of the Problem
1. Given the plant as
˙x(t) = A(t)x(t) + B(t)u(t)
y(t) = C(t)x(t)
e(t) = z(t) − y(t); z(t) is the desired output
2. the performance index as
J = \frac{1}{2}\int_{t_0}^{\infty} [e'(t)Q(t)e(t) + u'(t)R(t)u(t)]\,dt
3. and the boundary conditions as
x(t0) = x0, x(∞) is free,
4. find the optimal control, state and performance index.
4.2.2 Solution of the Problem
1. Solve the matrix differential Riccati equation
   \dot{\hat{P}}(t) = -\hat{P}(t)A(t) - A'(t)\hat{P}(t) + \hat{P}(t)E(t)\hat{P}(t) - V(t)
with final condition
   \hat{P}(t_f) = 0; \quad \text{where } \hat{P}(t) = \lim_{t_f \to \infty} P(t)
and the non-homogeneous vector differential equation
   \dot{\hat{g}}(t) = -[A(t) - E(t)\hat{P}(t)]'\hat{g}(t) - W(t)z(t)
with final condition \hat{g}(t_f) = 0, where
   E(t) = B(t)R^{-1}(t)B'(t); \quad V(t) = C'(t)Q(t)C(t); \quad W(t) = C'(t)Q(t)
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - E(t)\hat{P}(t)]x^*(t) + E(t)\hat{g}(t)
with initial condition x(t_0) = x_0.
3. Obtain the optimal control u^*(t) as
   u^*(t) = -\hat{K}(t)x^*(t) + R^{-1}(t)B'(t)\hat{g}(t); \quad \text{where } \hat{K}(t) = R^{-1}(t)B'(t)\hat{P}(t)
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)\hat{P}(t)x^*(t) - x^{*\prime}(t)\hat{g}(t) + \hat{h}(t)
where \hat{h}(t) is the solution of
   \dot{\hat{h}}(t) = -\frac{1}{2}\hat{g}'(t)E(t)\hat{g}(t) - \frac{1}{2}z'(t)Q(t)z(t)
with final condition \hat{h}(t_f) = 0
4.3 Procedure Summary of Linear Quadratic Tracking System: Time-Invariant Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞))
4.3.1 Statement of the Problem
1. Given the plant as
˙x(t) = Ax(t) + Bu(t)
y(t) = Cx(t)
e(t) = z(t) − y(t); z(t) is the desired output
2. the performance index as
J = \frac{1}{2}\int_{t_0}^{\infty} [e'(t)Qe(t) + u'(t)Ru(t)]\,dt
3. and the boundary conditions as
x(t0) = x0, x(∞) is free,
4. find the optimal control, state and performance index.
4.3.2 Solution of the Problem
1. Solve the algebraic Riccati equation
   -\bar{P}A - A'\bar{P} + \bar{P}E\bar{P} - V = 0
where \bar{P} = \lim_{t_f \to \infty} P(t), and the non-homogeneous vector differential equation
   \dot{\bar{g}}(t) = -[A - E\bar{P}]'\bar{g}(t) - Wz(t)
with final condition \bar{g}(t_f) = 0, where
   E = BR^{-1}B'; \quad V = C'QC; \quad W = C'Q
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - E\bar{P}]x^*(t) + E\bar{g}(t)
with initial condition x(t_0) = x_0.
3. Obtain the optimal control u^*(t) as
   u^*(t) = -\bar{K}x^*(t) + R^{-1}B'\bar{g}(t); \quad \text{where } \bar{K} = R^{-1}B'\bar{P}
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}x^{*\prime}(t)\bar{P}x^*(t) - x^{*\prime}(t)\bar{g}(t) + \bar{h}(t)
where \bar{h}(t) is the solution of
   \dot{\bar{h}}(t) = -\frac{1}{2}\bar{g}'(t)E\bar{g}(t) - \frac{1}{2}z'(t)Qz(t)
with final condition \bar{h}(t_f) = 0
4.4 Procedure Summary of Fixed-End-Point Regulator System
(Closed-Loop Optimal Control)
4.4.1 Statement of the Problem
1. Given the plant as
˙x(t) = A(t)x(t) + B(t)u(t)
2. the performance index as
J = \frac{1}{2}\int_{t_0}^{t_f} [x'(t)Q(t)x(t) + u'(t)R(t)u(t)]\,dt
3. find the optimal control, state and performance index.
4.4.2 Solution of the Problem
The solution of this problem depends on the boundary conditions.
1. If x(t_0) \neq 0 and x(t_f) = 0
(a) Solve the inverse matrix differential Riccati equation
   \dot{M}(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^{-1}(t)B'(t)
with final condition M(t_f) = 0.
(b) Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - B(t)R^{-1}(t)B'(t)M^{-1}(t)]x^*(t)
(c) Obtain the optimal control u^*(t) as
   u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)x^*(t)
(d) Obtain the optimal performance index J^* from
   J^* = \frac{1}{2}x^{*\prime}(t)M^{-1}(t)x^*(t)
2. If x(t_0) = 0 and x(t_f) \neq 0
(a) Solve the inverse matrix differential Riccati equation
   \dot{M}(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^{-1}(t)B'(t)
with initial condition M(t_0) = 0, or final condition M(t_f) = 0.
(b) Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - B(t)R^{-1}(t)B'(t)M^{-1}(t)]x^*(t)
(c) Obtain the optimal control u^*(t) as
   u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)x^*(t)
(d) Obtain the optimal performance index J^* from
   J^* = \frac{1}{2}x^{*\prime}(t)M^{-1}(t)x^*(t)
3. If x(t_0) \neq 0 and x(t_f) \neq 0
(a) Solve the inverse matrix differential Riccati equation
   \dot{M}(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^{-1}(t)B'(t)
and the transformation equation
   \dot{v}(t) = M(t)Q(t)v(t) + A(t)v(t)
with initial conditions
   M(t_0) = 0; \quad v(t_0) = x(t_0)
or with final conditions
   M(t_f) = 0; \quad v(t_f) = x(t_f)
(b) Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - B(t)R^{-1}(t)B'(t)M^{-1}(t)]x^*(t) + B(t)R^{-1}(t)B'(t)M^{-1}(t)v(t)
(c) Obtain the optimal control u^*(t) as
   u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)[x^*(t) - v(t)]
(d) Obtain the optimal performance index J^* from
   J^* = \frac{1}{2}x^{*\prime}(t)M^{-1}(t)x^*(t)
Example:
Statement of the Problem:
1. Plant
   \dot{x}(t) = ax(t) + bu(t)
2. Performance index
   J = \frac{1}{2}\int_{t_0}^{t_f} [qx^2(t) + ru^2(t)]\,dt
3. Boundary conditions
   x(t_0) = x_0; \quad x(t_f) = 0
Solution of the Problem:
1. Solve the inverse matrix differential Riccati equation
   \dot{m}(t) = 2am(t) + qm^2(t) - \frac{b^2}{r}
with final condition m(t_f) = 0; the solution is
   m(t) = \frac{b^2}{r} \cdot \frac{e^{-\beta(t-t_f)} - e^{\beta(t-t_f)}}{(a+\beta)e^{-\beta(t-t_f)} - (a-\beta)e^{\beta(t-t_f)}}
where
   \beta = \sqrt{a^2 + q\frac{b^2}{r}}
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A(t) - B(t)R^{-1}(t)B'(t)M^{-1}(t)]x^*(t) = \left[a - \frac{(a+\beta)e^{-\beta(t-t_f)} - (a-\beta)e^{\beta(t-t_f)}}{e^{-\beta(t-t_f)} - e^{\beta(t-t_f)}}\right]x^*(t)
3. Obtain the optimal control u^*(t) as
   u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)x^*(t) = -\frac{1}{b} \cdot \frac{(a+\beta)e^{-\beta(t-t_f)} - (a-\beta)e^{\beta(t-t_f)}}{e^{-\beta(t-t_f)} - e^{\beta(t-t_f)}}\, x^*(t)
4. Obtain the optimal performance index J^* from
   J^* = \frac{1}{2}x^{*\prime}(t)M^{-1}(t)x^*(t)
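The closed-form m(t) above can be spot-checked numerically. A sketch (NumPy assumed; the constants a, b, q, r, tf are arbitrary illustrative choices) compares a finite-difference derivative of m(t) against the right-hand side of the inverse Riccati equation:

```python
# Spot-check that the closed-form m(t) satisfies mdot = 2am + q m^2 - b^2/r.
import numpy as np

a, b, q, r, tf = -1.0, 1.0, 2.0, 0.5, 3.0   # illustrative constants
beta = np.sqrt(a*a + q*b*b/r)

def m(t):
    ep = np.exp(beta * (t - tf))
    en = np.exp(-beta * (t - tf))
    return (b*b/r) * (en - ep) / ((a + beta)*en - (a - beta)*ep)

t = np.linspace(0.0, tf - 0.1, 200)
h = 1e-6
mdot_fd = (m(t + h) - m(t - h)) / (2*h)       # central-difference derivative
mdot_eq = 2*a*m(t) + q*m(t)**2 - b*b/r        # Riccati right-hand side
```

The final condition m(tf) = 0 holds exactly, since the numerator vanishes at t = tf.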
4.5 Procedure Summary of Regulator System with Prescribed Degree of Stability (Closed-Loop Optimal Control of Infinite-Time, Time-Invariant Systems)
4.5.1 Statement of the Problem
1. Given the plant as
˙x(t) = Ax(t) + Bu(t)
2. the performance index as
J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}[x'(t)Qx(t) + u'(t)Ru(t)]\,dt
3. and the boundary conditions as
x(t0) = x0, x(∞) = 0
4. find the optimal control, state and performance index.
4.5.2 Solution of the Problem
1. Solve the algebraic Riccati equation
   \bar{P}(A + \alpha I) + (A + \alpha I)'\bar{P} + Q - \bar{P}BR^{-1}B'\bar{P} = 0
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - BR^{-1}B'\bar{P}]x^*(t)
with initial condition x(t_0) = x_0.
3. Obtain the optimal control u^*(t) as
   u^*(t) = -R^{-1}B'\bar{P}x^*(t)
4. Obtain the optimal performance index from
   J^* = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)\bar{P}x^*(t_0)
Note: The closed-loop optimal control system has eigenvalues with real parts less than -\alpha. In other words, the state x^*(t) approaches zero at least as fast as e^{-\alpha t}; we then say that the closed-loop optimal system has a degree of stability of at least \alpha.
Example:
Statement of the Problem:
1. Plant
   \dot{x}(t) = -x(t) + u(t)
2. Performance index
   J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}[x^2(t) + u^2(t)]\,dt
3. Boundary conditions
   x(0) = 1; \quad x(\infty) = 0
Solution of the Problem:
1. Solve the algebraic Riccati equation
   A = -1; \quad B = 1; \quad Q = 1; \quad R = 1
   \bar{P}(A + \alpha I) + (A + \alpha I)'\bar{P} + Q - \bar{P}BR^{-1}B'\bar{P} = 0
   \Rightarrow \bar{p}^2 - 2\bar{p}(\alpha - 1) - 1 = 0 \Rightarrow \bar{p} = (\alpha - 1) + \sqrt{(\alpha - 1)^2 + 1}
2. Solve the optimal state x^*(t) from
   \dot{x}^*(t) = [A - BR^{-1}B'\bar{P}]x^*(t) = -\left[\alpha + \sqrt{(\alpha - 1)^2 + 1}\right]x^*(t)
3. Obtain the optimal control u^*(t) as
   u^*(t) = -R^{-1}B'\bar{P}x^*(t) = -\bar{p}x^*(t) = \left[1 - \alpha - \sqrt{(\alpha - 1)^2 + 1}\right]x^*(t)
4. Obtain the optimal performance index J^* from
   J^* = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)\bar{P}x^*(t_0)
It is easy to see that the closed-loop eigenvalue satisfies
   -\alpha - \sqrt{(\alpha - 1)^2 + 1} < -\alpha
This shows the desired result that the optimal system has its eigenvalue to the left of -\alpha.
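The eigenvalue claim can be checked over a range of α values. A short sketch (NumPy assumed; the scalar plant is the one from the example):

```python
# Check the prescribed degree of stability for the scalar example above,
# for several values of alpha.
import numpy as np

a, b, q, r = -1.0, 1.0, 1.0, 1.0
alphas = np.array([0.0, 0.5, 1.0, 2.0])

# Stabilizing root of  2p(a+alpha) + q - p^2 b^2/r = 0  for each alpha
pbars = (a + alphas) + np.sqrt((a + alphas)**2 + q*b*b/r)
closed = a - (b*b/r) * pbars    # closed-loop eigenvalues A - B R^{-1} B' Pbar
```

Each closed-loop eigenvalue lies strictly to the left of −α, confirming the degree-of-stability property.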
4.6 Closed-Loop Controller Design Using the Frequency Domain (Kalman Equation in the Frequency Domain)
4.6.1 Relation Between Open-Loop and Closed-Loop
Consider a controllable, linear, time-invariant plant
   \dot{x}(t) = Ax(t) + Bu(t)
Then the open-loop characteristic polynomial of the system is
   \Delta_o(s) = |sI - A|
and the optimal closed-loop characteristic polynomial is
   \Delta_c(s) = |sI - A + B\bar{K}| = |I + B\bar{K}[sI - A]^{-1}| \cdot |sI - A| = |I + \bar{K}[sI - A]^{-1}B| \, \Delta_o(s)
This is a relation between the open-loop \Delta_o(s) and closed-loop \Delta_c(s) characteristic polynomials. From Fig. 4.14, we note that
1. -\bar{K}[sI - A]^{-1}B is called the loop gain matrix, and
2. I + \bar{K}[sI - A]^{-1}B is termed the return difference matrix.
Figure 4.14: Optimal Closed-Loop Control in Frequency Domain
4.6.2 Statement of the Problem
1. Given the plant as
   \dot{x}(t) = Ax(t) + Bu(t)
2. the performance index as
   J = \frac{1}{2}\int_{t_0}^{\infty} [x'(t)Qx(t) + u'(t)Ru(t)]\,dt
3. and the boundary conditions as
   x(t_0) = x_0, \quad x(\infty) = 0,
4. find the optimal control, assuming that [A, B] is stabilizable and [A, \sqrt{Q}] is observable.
4.6.3 Solution of the Problem
1. Solve the Kalman equation in the frequency domain
   B'[-sI - A']^{-1}Q[sI - A]^{-1}B + R = \left[I + \bar{K}[-sI - A]^{-1}B\right]' R \left[I + \bar{K}[sI - A]^{-1}B\right]
2. Get the optimal feedback u^*(t)
   u^*(t) = -R^{-1}B'\bar{P}x^*(t) = -\bar{K}x^*(t)
4.6.4 Gain Margin and Phase Margin
Rewrite the Kalman equation with s = j\omega as
   B'[-j\omega I - A']^{-1}Q[j\omega I - A]^{-1}B + R = \left[I + \bar{K}[-j\omega I - A]^{-1}B\right]' R \left[I + \bar{K}[j\omega I - A]^{-1}B\right]
The previous result can be viewed as
   M(j\omega) = W'(-j\omega)W(j\omega)
where
   W(j\omega) = R^{1/2}\left[I + \bar{K}[j\omega I - A]^{-1}B\right]
   M(j\omega) = R + B'[-j\omega I - A']^{-1}Q[j\omega I - A]^{-1}B
Note that M(j\omega) \geq R > 0. Using Q = C'C, R = D'D = I, and the notation
   W'(-j\omega)W(j\omega) = ||W(j\omega)||^2
the Kalman equation can be written as
   ||I + \bar{K}[j\omega I - A]^{-1}B||^2 = I + ||C[j\omega I - A]^{-1}B||^2
   ||I + \bar{K}[sI - A]^{-1}B||^2 = I + ||C[sI - A]^{-1}B||^2
This result can be used to find the optimal feedback matrix \bar{K} given the other quantities A, B, Q, and R = I.
Example:
Statement of the Problem:
Find the optimal feedback coefficients for the system
˙x1(t) = x2(t)
˙x2(t) = u(t)
and the performance measure
J = (1/2) ∫₀^∞ [x₁²(t) + x₂²(t) + u²(t)] dt
Solution of the Problem:
First it is easy to identify the various matrices as
   A = [0 1; 0 0]; B = [0; 1]; Q = [1 0; 0 1]; R = 1
Also, note that since Q = R = I, we have C = D = I. Thus, the Kalman equation
   B'[−sI − A']⁻¹Q[sI − A]⁻¹B + R = [I + ¯K[−sI − A]⁻¹B]' R [I + ¯K[sI − A]⁻¹B]
becomes
   1/s⁴ − 1/s² + 1 = ¯k₁₁²/s⁴ + (2¯k₁₁ − ¯k₁₂²)/s² + 1
   ∴ ¯K = [¯k₁₁ ¯k₁₂] = [1 √3]
Then, the optimal feedback control is
   u*(t) = −¯Kx*(t) = −[1 √3]x*(t)
Note: This example can also be solved by the procedure of Sec. 3.1, which gives
   ¯P = [√3 1; 1 √3]
and the optimal control as
   u*(t) = −R⁻¹B'¯Px*(t) = −[1 √3]x*(t)
Figure 4.15: Closed-Loop Optimal Control System with Unity Feedback
Here, we can easily recognize that for a single-input, single-output case, the optimal feedback control system is exactly like a classical feedback control system with unity negative feedback and open-loop transfer function Go(s) = ¯k[sI − A]⁻¹b. Thus, the frequency-domain interpretation in terms of gain margin and phase margin can easily be carried out using a Nyquist, Bode, or other plot of the transfer function Go(s).
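The gain ¯K = [1 √3] can be cross-checked by solving the continuous algebraic Riccati equation numerically; a minimal sketch, assuming SciPy is available (`solve_continuous_are` is SciPy's ARE solver, not part of the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_are  # assumes SciPy is installed

# Double-integrator example from the text
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Solve the continuous ARE and form the state-feedback gain K = R^-1 B' P
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

print(P)  # approximately [[sqrt(3), 1], [1, sqrt(3)]]
print(K)  # approximately [[1, sqrt(3)]]
```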
Chapter 5
Variational Calculus and Open-Loop
Optimal Control for Discrete-Time
Systems
5.1 Discrete Euler-Lagrange Equation
The discrete Euler-Lagrange equation can be written as
   ∂V(x*(k), x*(k+1), k)/∂x*(k) + ∂V(x*(k−1), x*(k), k−1)/∂x*(k) = 0
Example:
Consider the minimization of the functional
   J(x(k₀), k₀) = J = Σ_{k=k₀}^{kf−1} [x(k)x(k+1) + x²(k)]
subject to the boundary conditions x(0) = 2 and x(10) = 5.
Solution:
   V(x*(k), x*(k+1), k) = x(k)x(k+1) + x²(k)
   V(x*(k−1), x*(k), k−1) = x(k−1)x(k) + x²(k−1)
   ∂V(x*(k), x*(k+1), k)/∂x*(k) = x(k+1) + 2x(k)
   ∂V(x*(k−1), x*(k), k−1)/∂x*(k) = x(k−1)
Substituting into the discrete Euler-Lagrange equation,
   ∴ x(k+1) + 2x(k) + x(k−1) = 0 ⇒ x(k+2) + 2x(k+1) + x(k) = 0
With the boundary conditions x(0) = 2 and x(10) = 5,
   x(k) = 2(−1)^k + 0.3k(−1)^k = (2 + 0.3k)(−1)^k
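The closed-form solution can be checked against the recursion and both boundary conditions with a short sketch:

```python
# Check that x(k) = (2 + 0.3k)(-1)^k satisfies the Euler-Lagrange
# recursion x(k+1) + 2x(k) + x(k-1) = 0 and both boundary conditions.
x = [(2.0 + 0.3 * k) * (-1) ** k for k in range(11)]

assert x[0] == 2.0
assert abs(x[10] - 5.0) < 1e-12
for k in range(1, 10):
    assert abs(x[k + 1] + 2.0 * x[k] + x[k - 1]) < 1e-12
```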
5.2 Procedure Summary for Discrete-Time Optimal Control
System (Open-Loop Optimal Control)
5.2.1 Statement of the Problem
1. Given the plant as
   x(k + 1) = A(k)x(k) + B(k)u(k),
2. the performance index as
   J = (1/2) x'(kf)F(kf)x(kf) + (1/2) Σ_{k=k₀}^{kf−1} [x'(k)Q(k)x(k) + u'(k)R(k)u(k)],
3. and the boundary conditions as
   x(k = k₀) = x(k₀); the final conditions depend on the system type,
4. find the optimal control.
5.2.2 Solution of the Problem
1. Form the Pontryagin H function
   H = (1/2)x'(k)Q(k)x(k) + (1/2)u'(k)R(k)u(k) + λ'(k+1)[A(k)x(k) + B(k)u(k)]
2. Minimize H w.r.t. u(k)
   (∂H/∂u(k))* = 0, and obtain u*(k) = −R⁻¹(k)B'(k)λ*(k+1)
3. Using the result of Step 2 in Step 1, find the optimal H*
   H* = H*(x*(k), λ*(k+1))
4. Solve the set of 2n difference equations
   x*(k+1) = ∂H*/∂λ*(k+1) = A(k)x*(k) − B(k)R⁻¹(k)B'(k)λ*(k+1)
   λ*(k) = ∂H*/∂x*(k) = Q(k)x*(k) + A'(k)λ*(k+1)
with initial condition x(k₀) and the final condition
   [H* + ∂S(x*(k), k)/∂k]_{kf} δkf + [∂S(x*(k), k)/∂x* − λ*(k)]_{kf} δx(kf) = 0
which is obtained from the cases in Subsec. 5.2.3.
5. Substitute the solution for λ*(k) from Step 4 into the expression for the optimal control u*(k) of Step 2.
5.2.3 Types of Systems
Type: Fixed-final time and fixed-final state system, Fig. 5.1(a)
   Substitutions: δkf = 0, δx(kf) = 0
   Boundary conditions: x(k = k₀) = x(k₀), x(k = kf) = x(kf)
Type: Fixed-final time and free-final state system, Fig. 5.1(b)
   Substitutions: δkf = 0, δx(kf) arbitrary
   Boundary conditions: x(k = k₀) = x(k₀), λ*(kf) = [∂S(x*(k), k)/∂x*(k)]_{kf} = F(kf)x(kf)
Figure 5.1: Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) Fixed-Final Time and Free-Final State System
Example:
Statement of the Problem:
1. Plant
   x(k + 1) = x(k) + u(k)
2. Performance index
   J(k₀) = (1/2) Σ_{k=k₀}^{kf−1} u²(k)
3. Boundary conditions
   x(k₀ = 0) = 1, x(kf = 10) = 0
Solution of the Problem:
1. Form the Pontryagin H function
   H(x(k), u(k), λ(k+1)) = (1/2)u²(k) + λ(k+1)[x(k) + u(k)]
2. Get u*(k)
   ∂H/∂u(k) = 0 ⇒ u*(k) + λ*(k+1) = 0 ⇒ u*(k) = −λ*(k+1)
3. Get H*
   H*(x*(k), λ*(k+1)) = x*(k)λ*(k+1) − (1/2)λ*²(k+1)
4. Obtain the state and costate equations
   x*(k+1) = ∂H*/∂λ*(k+1) = x*(k) − λ*(k+1)
   λ*(k) = ∂H*/∂x*(k) = λ*(k+1)
Solving the previous equations with the boundary conditions, we have the optimal state and costate as
   x*(k) = 1 − 0.1k
   λ*(k+1) = 0.1
5. Obtain the optimal control
   u*(k) = −0.1
Figure 5.2: State and Costate System
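A quick simulation confirms the open-loop solution; the plant, control, and boundary values below are those of the example:

```python
# Simulate x(k+1) = x(k) + u(k) under the constant optimal control
# u*(k) = -0.1 and confirm it drives x from 1 to 0 in ten steps,
# matching the closed-form x*(k) = 1 - 0.1k.
x = 1.0
traj = [x]
for k in range(10):
    x = x + (-0.1)   # apply u*(k) = -0.1
    traj.append(x)

assert all(abs(traj[k] - (1.0 - 0.1 * k)) < 1e-12 for k in range(11))
assert abs(traj[10]) < 1e-12   # boundary condition x(10) = 0
```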
Chapter 6
Linear Quadratic Optimal Control
for Discrete-Time Systems I
(Regulator Closed-Loop Optimal
Control)
6.1 Procedure Summary of Discrete-Time, Linear Quadratic
Regulator System (Closed-Loop Optimal Control with Fixed
kf and Free x(kf ))
6.1.1 Statement of the Problem
1. Given the plant as
   x(k + 1) = A(k)x(k) + B(k)u(k),
2. the performance index as
   J = (1/2) x'(kf)F(kf)x(kf) + (1/2) Σ_{k=k₀}^{kf−1} [x'(k)Q(k)x(k) + u'(k)R(k)u(k)],
3. and the boundary conditions as
   x(k = k₀) = x(k₀), x(kf) is free, and kf is fixed,
4. find the closed-loop optimal control, state and performance index.
6.1.2 Solution of the Problem
1. Solve the matrix difference Riccati equation
   P(k) = A'(k)P(k+1)[I + E(k)P(k+1)]⁻¹A(k) + Q(k); E(k) = B(k)R⁻¹(k)B'(k)
with final condition P(k = kf) = F(kf).
2. Solve for the optimal state x*(k) from
   L(k) = R⁻¹(k)B'(k)[A'(k)]⁻¹[P(k) − Q(k)]
   x*(k+1) = [A(k) − B(k)L(k)]x*(k)
with initial condition x(k₀) = x₀.
3. Obtain the optimal control u*(k) as
   u*(k) = −L(k)x*(k); where L(k) is the Kalman gain.
4. Obtain the optimal performance index from
   J* = (1/2) x*'(k)P(k)x*(k)
Figure 6.1: Closed-Loop Optimal Controller for Linear Discrete-Time Regulator
Example:
Statement of the Problem:
1. Plant
x1(k + 1) = 0.8x1(k) + x2(k) + u(k)
x2(k + 1) = 0.6x2(k) + 0.5u(k)
2. Performance index
J = x₁²(kf) + 2x₂²(kf) + Σ_{k=k₀}^{kf−1} [0.5x₁²(k) + 0.5x₂²(k) + 0.5u²(k)]
3. Boundary conditions
x1(k0 = 0) = 5, x2(k0) = 3, kf = 10, and x(kf ) is free.
Solution of the Problem:
1. Solve the matrix difference Riccati equation
The various matrices are
   A(k) = [0.8 1.0; 0.0 0.6]; B(k) = [1.0; 0.5]; F(kf) = [2 0; 0 4]; Q(k) = [1 0; 0 1]; R(k) = 1
Let P(k) be the 2 × 2 symmetric matrix
   P(k) = [p₁₁(k) p₁₂(k); p₁₂(k) p₂₂(k)]
The matrix DRE
   P(k) = A'(k)P(k+1)[I + E(k)P(k+1)]⁻¹A(k) + Q(k); E(k) = B(k)R⁻¹(k)B'(k) = [1 0.5; 0.5 0.25]
is solved backward from the final condition P(kf) = F(kf):
   [p₁₁(10) p₁₂(10); p₁₂(10) p₂₂(10)] = [2 0; 0 4]
Figure 6.2: Riccati Coefficients
2. Solve for the optimal state x*(k) from
   L(k) = R⁻¹(k)B'(k)[A'(k)]⁻¹[P(k) − Q(k)]
   x*(k+1) = [A(k) − B(k)L(k)]x*(k)
with initial condition x(k₀) = [5 3]'
Figure 6.3: Optimal States
3. Obtain the optimal control u*(k) as
   u*(k) = −L(k)x*(k)
Figure 6.4: Optimal Control
4. Obtain the optimal performance index from
   J* = (1/2) x*'(k)P(k)x*(k)
Figure 6.5: Optimal Performance Index
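The backward Riccati sweep and forward simulation of this example can be sketched as follows; the gain is computed in the equivalent Kalman form L(k) = [R + B'P(k+1)B]⁻¹B'P(k+1)A rather than the A-inverse form above:

```python
import numpy as np

# Backward sweep of the difference Riccati equation for the example data
A = np.array([[0.8, 1.0], [0.0, 0.6]])
B = np.array([[1.0], [0.5]])
Q = np.eye(2)
R = np.array([[1.0]])
F = np.diag([2.0, 4.0])
kf = 10

E = B @ np.linalg.solve(R, B.T)
P = [None] * (kf + 1)
P[kf] = F                               # final condition P(kf) = F(kf)
for k in range(kf - 1, -1, -1):
    P[k] = A.T @ P[k + 1] @ np.linalg.inv(np.eye(2) + E @ P[k + 1]) @ A + Q

# Forward simulation with the time-varying gain L(k)
x = np.array([[5.0], [3.0]])            # initial condition x(k0) = [5 3]'
for k in range(kf):
    L = np.linalg.solve(R + B.T @ P[k + 1] @ B, B.T @ P[k + 1] @ A)
    x = (A - B @ L) @ x

print(P[0])        # Riccati solution at k = 0
print(x.ravel())   # state after ten optimally controlled steps
```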
6.2 Procedure Summary of Discrete-Time, Linear Quadratic
Regulator System: Steady-State Condition (Closed-Loop
Optimal Control with kf = ∞)
6.2.1 Statement of the Problem
1. Given the plant as
   x(k + 1) = Ax(k) + Bu(k),
2. the performance index as
   J = (1/2) Σ_{k=k₀}^{∞} [x'(k)Qx(k) + u'(k)Ru(k)],
3. and the boundary conditions as
   x(k = k₀) = x(k₀), kf = ∞, x(∞) is free,
4. find the closed-loop optimal control, state and performance index.
6.2.2 Solution of the Problem
1. Solve the matrix algebraic Riccati equation
   ¯P = A'¯P[I + BR⁻¹B'¯P]⁻¹A + Q, or
   ¯P = A'[¯P − ¯PB(B'¯PB + R)⁻¹B'¯P]A + Q
2. Solve for the optimal state x*(k) from
   ¯L = R⁻¹B'(A')⁻¹[¯P − Q]
   x*(k+1) = [A − B¯L]x*(k), or
   ¯Lₐ = (B'¯PB + R)⁻¹B'¯PA
   x*(k+1) = [A − B¯Lₐ]x*(k)
3. Obtain the optimal control u*(k) as
   u*(k) = −¯Lx*(k), or u*(k) = −¯Lₐx*(k)
4. Obtain the optimal performance index from
   J* = (1/2) x*'(k)¯Px*(k)
Figure 6.6: Closed-Loop Optimal Control for Discrete-Time Steady-State Regulator System
Example:
Statement of the Problem:
1. Plant
   x₁(k+1) = 0.8x₁(k) + x₂(k) + u(k)
   x₂(k+1) = 0.6x₂(k) + 0.5u(k)
2. Performance index
   J = Σ_{k=k₀}^{∞} [0.5x₁²(k) + 0.5x₂²(k) + 0.5u²(k)]
3. Boundary conditions
   x₁(k₀ = 0) = 5, x₂(k₀) = 3, kf = ∞, and x(∞) is free.
Solution of the Problem:
1. Solve the matrix algebraic Riccati equation
The various matrices are
   A = [0.8 1.0; 0.0 0.6]; B = [1.0; 0.5]; Q = [1 0; 0 1]; R = 1; F(kf) = 0
Let ¯P be the 2 × 2 symmetric matrix
   ¯P = [¯p₁₁ ¯p₁₂; ¯p₁₂ ¯p₂₂]
Solving the matrix ARE
   ¯P = A'¯P[I + BR⁻¹B'¯P]⁻¹A + Q; BR⁻¹B' = [1 0.5; 0.5 0.25]
   ∴ ¯P = [1.3944 0.3738; 0.3738 1.7803]
2. Solve for the optimal state x*(k) from
   ¯L = R⁻¹B'(A')⁻¹[¯P − Q]
   x*(k+1) = [A − B¯L]x*(k)
with initial condition x(k₀) = [5 3]'
Figure 6.7: Optimal States
3. Obtain the optimal control u*(k) as
   u*(k) = −¯Lx*(k)
Figure 6.8: Optimal Control
4. Obtain the optimal performance index from
   J* = (1/2) x*'(k)¯Px*(k)
Figure 6.9: Optimal Performance Index
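The steady-state ¯P quoted above can be reproduced by iterating the algebraic Riccati equation to a fixed point; a minimal sketch using NumPy:

```python
import numpy as np

# Iterate the steady-state (algebraic) Riccati equation to a fixed point
# for the example data; a sketch of Step 1 of the procedure.
A = np.array([[0.8, 1.0], [0.0, 0.6]])
B = np.array([[1.0], [0.5]])
Q = np.eye(2)
R = np.array([[1.0]])

E = B @ np.linalg.solve(R, B.T)
P = Q.copy()
for _ in range(200):
    P = A.T @ P @ np.linalg.inv(np.eye(2) + E @ P) @ A + Q

# The fixed-point residual of the ARE should vanish at convergence
residual = A.T @ P @ np.linalg.inv(np.eye(2) + E @ P) @ A + Q - P
print(P)  # compare with the P-bar quoted in the text
```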
6.3 Analytical Solution to the Riccati Equation
From the solution of the state and costate,
   [x(k+1); λ(k)] = [A(k) −E(k); Q(k) A'(k)] [x(k); λ(k+1)]
we can get
   [x(k); λ(k)] = H [x(k+1); λ(k+1)]
   H = [A⁻¹ A⁻¹E; QA⁻¹ A' + QA⁻¹E]
Then construct the matrix D,
   D = W⁻¹HW = [M 0; 0 M⁻¹]
where the columns of W are the eigenvectors of H. Note that D is a diagonal matrix of the eigenvalues of H. Now we can get the solution of the Riccati equation as
   P(k) = [W₂₁ + W₂₂T(k)][W₁₁ + W₁₂T(k)]⁻¹
   T(k) = M^{−(kf−k)} T(kf) M^{−(kf−k)}
   T(kf) = −[W₂₂ − F(kf)W₁₂]⁻¹[W₂₁ − F(kf)W₁₁]
The steady-state solution of the Riccati equation is
   ¯P = W₂₁W₁₁⁻¹
To satisfy the condition of symmetry of P(k), use
   P(k) = (1/2)[P(k) + P'(k)]
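The eigenvector construction can be sketched numerically for the example data A = [0.8 1.0; 0.0 0.6], B = [1.0; 0.5], Q = I, R = 1 (assuming A is invertible; the columns of W associated with the eigenvalues of H outside the unit circle supply W₁₁ and W₂₁):

```python
import numpy as np

# Steady-state Riccati solution via the eigenvectors of the backward
# Hamiltonian matrix H; a sketch assuming A is invertible.
A = np.array([[0.8, 1.0], [0.0, 0.6]])
B = np.array([[1.0], [0.5]])
Q = np.eye(2)
R = np.array([[1.0]])
n = A.shape[0]

E = B @ np.linalg.solve(R, B.T)
Ainv = np.linalg.inv(A)
H = np.block([[Ainv,      Ainv @ E],
              [Q @ Ainv,  A.T + Q @ Ainv @ E]])

lam, W = np.linalg.eig(H)
idx = np.argsort(-np.abs(lam))[:n]        # eigenvalues with |lambda| > 1
W11, W21 = W[:n, idx], W[n:, idx]
P_bar = np.real(W21 @ np.linalg.inv(W11)) # P-bar = W21 W11^-1
P_bar = 0.5 * (P_bar + P_bar.T)           # enforce symmetry

print(P_bar)  # compare with the steady-state P-bar of the example
```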
Example:
Consider the following data for a system
   A(k) = [0.8 1.0; 0.0 0.6]; B(k) = [1.0; 0.5]; F(kf) = [2 0; 0 4]; Q(k) = [1 0; 0 1]; R(k) = 1
Solution
Figure 6.10: Riccati Coefficients
   H = [ 1.2500  −2.0833  0.2083  0.1042;
         0        1.6667  0.8333  0.4167;
         1.2500  −2.0833  1.0083  0.1042;
         0        1.6667  1.8333  1.0167 ]
   D = diag(2.1497 + 1.4398i, 2.1497 − 1.4398i, 0.3211 + 0.2151i, 0.3211 − 0.2151i)
   W₁₁ = [ 0.0554 − 0.4243i    0.0554 + 0.4243i; −0.3531 + 0.0891i   −0.3531 − 0.0891i ]
   W₁₂ = [ −0.1100 − 0.4434i   −0.1100 + 0.4434i; −0.0780 − 0.1622i   −0.0780 + 0.1622i ]
   W₂₂ = [ −0.2339 + 0.2417i   −0.2339 − 0.2417i; 0.8036   0.8036 ]
   T(kf) = [ −0.5260 + 0.1644i   −0.2745 − 0.3013i; −0.2745 + 0.3013i   −0.5260 − 0.1644i ]
At steady-state conditions, F(kf) = 0 and kf = ∞, so
   ¯P = [1.3944 0.3738; 0.3738 1.7803]
Chapter 7
Linear Quadratic Optimal Control
for Discrete-Time Systems II
(Tracking Closed-Loop Optimal
Control)
7.1 Procedure Summary of Discrete-Time Linear Quadratic Tracking System (Closed-Loop Optimal Control with Fixed-Final Time and Free x(kf))
7.1.1 Statement of the Problem
1. Given the plant as
   x(k+1) = Ax(k) + Bu(k)
   y(k) = Cx(k)
2. the performance index as
   J(k₀) = (1/2)[Cx(kf) − z(kf)]'F[Cx(kf) − z(kf)] + (1/2) Σ_{k=k₀}^{kf−1} { [Cx(k) − z(k)]'Q[Cx(k) − z(k)] + u'(k)Ru(k) }
3. and the boundary conditions as
   x(k₀) = x₀, x(kf) is free, and kf is fixed,
4. find the optimal control and state.
7.1.2 Solution of the Problem
1. Solve the matrix difference Riccati equation
   P(k) = A'P(k+1)[I + EP(k+1)]⁻¹A + V
   V = C'QC; E = BR⁻¹B'
with final condition P(kf) = C'FC.
2. Solve the vector difference equation
   g(k) = A'[I − (P⁻¹(k+1) + E)⁻¹E] g(k+1) + Wz(k)
   W = C'Q
with final condition g(kf) = C'Fz(kf).
3. Solve for the optimal state x*(k) from
   x*(k+1) = [A − BL(k)]x*(k) + BLg(k)g(k+1)
   L(k) = [R + B'P(k+1)B]⁻¹B'P(k+1)A
   Lg(k) = [R + B'P(k+1)B]⁻¹B'
with initial condition x(k₀) = x₀.
4. Obtain the optimal control u*(k) as
   u*(k) = −L(k)x*(k) + Lg(k)g(k+1)
Figure 7.1: Implementation of Discrete-Time Optimal Tracker
Example:
Statement of the Problem:
1. Plant
x1(k + 1) = 0.8x1(k) + x2(k) + u(k)
x2(k + 1) = 0.6x2(k) + 0.5u(k)
2. Performance index
   J = x₁²(kf) + 2x₂²(kf) + Σ_{k=k₀}^{kf−1} [0.5x₁²(k) + 0.5x₂²(k) + 0.5u²(k)]
3. Boundary conditions
x1(k0 = 0) = 5, x2(k0) = 3, kf = 10, and x(kf ) is free.
It is required to keep the state x1(k) close to 2.
Solution of the Problem:
1. Solve the matrix difference Riccati equation
The state x₁(k) is to be kept close to the reference input z₁(k) = 2 and, since there is no condition on state x₂(k), one can choose arbitrarily z₂(k) = 0.
   A(k) = [0.8 1.0; 0.0 0.6]; B(k) = [1.0; 0.5]; F(kf) = [1 0; 0 0] = Q(k); z(k) = [2; 0]; R(k) = 0.01
The solution of the matrix DRE
   P(k) = A'P(k+1)[I + EP(k+1)]⁻¹A + V
   V = C'QC; E = BR⁻¹B'
with final condition
   P(kf) = C'FC = [1 0; 0 0]
Figure 7.2: Riccati Coefficients
2. Solve the vector difference equation
   g(k) = A'[I − (P⁻¹(k+1) + E)⁻¹E] g(k+1) + Wz(k)
   W = C'Q
with final condition
   g(kf) = C'Fz(kf) = [2; 0]
Figure 7.3: g(t) Coefficients
3. Solve for the optimal state x*(k) from
   x*(k+1) = [A − BL(k)]x*(k) + BLg(k)g(k+1)
   L(k) = [R + B'P(k+1)B]⁻¹B'P(k+1)A
   Lg(k) = [R + B'P(k+1)B]⁻¹B'
with initial condition x(k₀) = [5 3]'
Figure 7.4: Optimal States
4. Obtain the optimal control u*(k) as
   u*(k) = −L(k)x*(k) + Lg(k)g(k+1)
Figure 7.5: Optimal Control
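The tracker procedure can be sketched end-to-end for this example, assuming C = I since the example does not state an output matrix; the g-recursion is rewritten with the push-through identity (P⁻¹ + E)⁻¹ = P(I + EP)⁻¹ to avoid inverting the singular P(kf):

```python
import numpy as np

# Backward sweeps for P(k) and g(k), then forward simulation of the tracker.
A = np.array([[0.8, 1.0], [0.0, 0.6]])
B = np.array([[1.0], [0.5]])
C = np.eye(2)                    # assumed: both states taken as outputs
Q = F = np.diag([1.0, 0.0])
R = np.array([[0.01]])
z = np.array([[2.0], [0.0]])     # constant reference
kf = 10

V = C.T @ Q @ C
E = B @ np.linalg.solve(R, B.T)
W = C.T @ Q

P = [None] * (kf + 1); g = [None] * (kf + 1)
P[kf] = C.T @ F @ C
g[kf] = C.T @ F @ z
for k in range(kf - 1, -1, -1):
    M = np.linalg.inv(np.eye(2) + E @ P[k + 1])
    P[k] = A.T @ P[k + 1] @ M @ A + V
    # (P^-1 + E)^-1 = P (I + E P)^-1, avoiding inversion of P(k+1)
    g[k] = A.T @ (np.eye(2) - P[k + 1] @ M @ E) @ g[k + 1] + W @ z

x = np.array([[5.0], [3.0]])
for k in range(kf):
    S = R + B.T @ P[k + 1] @ B
    L = np.linalg.solve(S, B.T @ P[k + 1] @ A)
    Lg = np.linalg.solve(S, B.T)
    x = (A - B @ L) @ x + B @ (Lg @ g[k + 1])

print(x.ravel())  # x1 should end close to the reference value 2
```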
7.2 Closed-Loop Controller Design Using Frequency-Domain
(Discrete Kalman Equation in Frequency Domain)
7.2.1 Relation Between Open-Loop and Closed-Loop
Consider a discrete controllable, linear, time-invariant plant
x(k + 1) = Ax(k) + Bu(k)
Then, the open-loop characteristic polynomial of the system is
∆o(z) = |zI − A|
and the optimal closed-loop characteristic polynomial is
∆c(z) = |zI − A + B¯L| = |I + B¯L[zI − A]⁻¹| · |zI − A| = |I + ¯L[zI − A]⁻¹B| ∆o(z)
This is a relation between the open-loop ∆o(z) and closed-loop ∆c(z) characteristic polynomials. From
Fig.7.6, we note that
1. −¯L[zI − A]⁻¹B is called the loop gain matrix, and
2. I + ¯L[zI − A]⁻¹B is termed the return difference matrix.
Figure 7.6: Closed-Loop Discrete-Time Optimal Control System
7.2.2 Statement of the Problem
1. Given the plant as
x(k + 1) = Ax(k) + Bu(k)
2. the performance index as
J = (1/2) Σ_{k=k₀}^{kf−1} [x'(k)Qx(k) + u'(k)Ru(k)]
3. and the boundary conditions as
x(k0) = x0, x(kf ) = 0
4. find the optimal control assuming that [A, B] is stabilizable and [A, √Q] is observable.
7.2.3 Solution of the Problem
1. Solve the discrete Kalman equation in the frequency domain
   B'[z⁻¹I − A']⁻¹Q[zI − A]⁻¹B + R = [I + ¯L[z⁻¹I − A]⁻¹B]' (B'¯PB + R) [I + ¯L[zI − A]⁻¹B]
2. Get the optimal feedback control u*(k)
   u*(k) = −¯Lx*(k)
Chapter 8
Pontryagin Minimum Principle
8.1 Procedure Summary of Pontryagin Minimum Principle
8.1.1 Statement of the Problem
1. Given the plant as
˙x(t) = f(x(t), u(t), t),
2. the performance index as
J = S(x(tf), tf) + ∫_{t₀}^{tf} V(x(t), u(t), t) dt,
3. and the boundary conditions as
x(t0) = x0 and tf and x(tf ) = xf are free,
4. find the optimal control.
8.1.2 Solution of the Problem
1. Form the Pontryagin H function
   H(x(t), u(t), λ(t), t) = V(x(t), u(t), t) + λ'(t)f(x(t), u(t), t)
2. Minimize H w.r.t. u(t) ∈ U
   H(x*(t), u*(t), λ*(t), t) ≤ H(x*(t), u(t), λ*(t), t)
3. Solve the set of 2n differential equations
   ẋ* = +(∂H/∂λ)*  and  λ̇* = −(∂H/∂x)*
with initial conditions x₀ and the final condition
   [H + ∂S/∂t]*_{tf} δtf + [∂S/∂x − λ(t)]*_{tf} δxf = 0
which is obtained from the cases in Subsec. 8.1.3.
4. Substitute the solutions of x*(t), λ*(t) from Step 3 into the expression for the optimal control u*(t) of Step 2.
8.1.3 Types of Systems
Type: Fixed-final time and fixed-final state system, Fig. 8.1(a)
   Substitutions: δtf = 0, δxf = 0
   Boundary conditions: x(t₀) = x₀, x(tf) = xf
Type: Free-final time and fixed-final state system, Fig. 8.1(b)
   Substitutions: δtf arbitrary, δxf = 0
   Boundary conditions: x(t₀) = x₀, x(tf) = xf, [H* + ∂S/∂t]_{tf} = 0
Type: Fixed-final time and free-final state system, Fig. 8.1(c)
   Substitutions: δtf = 0, δxf arbitrary
   Boundary conditions: x(t₀) = x₀, λ*(tf) = [∂S/∂x]*_{tf}
Type: Free-final time and dependent free-final state system, Fig. 8.1(d)
   Substitutions: δxf = θ̇(tf)δtf
   Boundary conditions: x(t₀) = x₀, x(tf) = θ(tf), [H* + ∂S/∂t + (∂S/∂x* − λ*(t))'θ̇(t)]_{tf} = 0
Type: Free-final time and independent free-final state system
   Substitutions: δtf arbitrary, δxf arbitrary
   Boundary conditions: x(t₀) = x₀, [H* + ∂S/∂t]_{tf} = 0, [∂S/∂x* − λ*(t)]_{tf} = 0
Figure 8.1: Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) Free-Final Time and Fixed-Final State System, (c) Fixed-Final Time and Free-Final State System, (d) Free-Final Time and Free-Final State System
8.1.4 Important Notes
1. In the previous chapters there were no constraints on the control signal, which does not happen in physical systems. Fig. 8.2 shows that δu(t) may be positive or negative in the region free of constraints; on the constraint boundary, only variations δu(t) directed into the admissible region are allowed.
Figure 8.2: (a) An Optimal Control Function Constrained by a Boundary (b) A Control Variation for
Which −δu(t) Is Not Admissible
2. The condition in Step 2 is the necessary condition for a minimum:
   H(x*(t), u(t), λ*(t), t) − H(x*(t), u*(t), λ*(t), t) ≥ 0
3. The sufficient condition for unconstrained control systems is that the second derivative of the Hamiltonian,
   ∂²H/∂u² (x*(t), u*(t), λ*(t), t) = (∂²H/∂u²)*,
must be positive definite.
8.1.5 Additional Necessary Conditions
1. If the final time tf is fixed and the Hamiltonian H does not depend on time t explicitly, then the
Hamiltonian H must be constant when evaluated along the optimal trajectory; that is
   H(x*(t), u*(t), λ*(t)) = constant = C₁  ∀ t ∈ [t₀, tf]
2. If the final time tf is free or not specified a priori and the Hamiltonian does not depend explicitly on time t, then the Hamiltonian must be identically zero when evaluated along the optimal trajectory; that is,
   H(x*(t), u*(t), λ*(t)) = 0  ∀ t ∈ [t₀, tf]
Example:
Minimize the scalar function
   H = u² − 6u + 7
subject to the constraint relation
   |u| ≤ 2 ⇒ −2 ≤ u ≤ 2
Solution:
Get the unconstrained control as
   ∂H/∂u = 0 ⇒ 2u − 6 = 0 ⇒ u = 3
and the corresponding value of H
   H(3) = 3² − 6 × 3 + 7 = −2
This value u = 3 is certainly outside the constraint (admissible) region, so the constrained minimum lies on the boundary, u* = 2:
   H(u*) = H(2) = 2² − 6 × 2 + 7 = −1
Thus the necessary condition is satisfied:
   H(u*) ≤ H(u) for all admissible u
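Numerically, this constrained minimization amounts to clamping the unconstrained stationary point to the admissible interval (valid here because H is convex in u):

```python
# Minimize H(u) = u^2 - 6u + 7 over the admissible set |u| <= 2 by
# clamping the unconstrained stationary point to the constraint boundary.
H = lambda u: u * u - 6.0 * u + 7.0

u_unc = 3.0                          # from dH/du = 2u - 6 = 0
u_star = max(-2.0, min(2.0, u_unc))  # clamp to [-2, 2]; valid since H is convex

assert u_star == 2.0
assert H(u_star) == -1.0
# H(u*) <= H(u) for every admissible u on a fine grid of [-2, 2]
assert all(H(u_star) <= H(-2.0 + 0.01 * i) for i in range(401))
```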
8.2 Optimal Control of Discrete-Time Systems Using the Prin-
ciple of Optimality of Dynamic Programming (Regulator
Optimal Control with Fixed kf and Free x(kf ))
8.2.1 Statement of the Problem
1. Given the plant as
   x(k+1) = Ax(k) + Bu(k)
2. the performance index as
   J_k = (1/2) x'(kf)Fx(kf) + (1/2) Σ_{i=k}^{kf−1} [x'(i)Qx(i) + u'(i)Ru(i)]; k₀ ≤ k ≤ kf
3. and the boundary conditions as
   x(k₀) = x₀, x(kf) is free, and there are no constraints on the state or control,
4. find the optimal control, state and performance index.
8.2.2 Solution of the Problem
1. Solve the matrix difference Riccati equation backward,
   L(k) = [R + B'P(k+1)B]⁻¹B'P(k+1)A
   P(k) = [A − BL(k)]'P(k+1)[A − BL(k)] + L'(k)RL(k) + Q
with final condition P(kf) = F.
2. Solve for the optimal state x*(k) from
   x*(k+1) = [A − BL(k)]x*(k)
with initial condition x(k₀) = x₀.
3. Obtain the optimal control u*(k) as
   u*(k) = −L(k)x*(k); where L(k) is the Kalman gain.
4. Obtain the optimal performance index from
   J*_k = (1/2) x*'(k)P(k)x*(k)
8.3 Optimal Control of Continuous-Time Systems Using Hamilton-
Jacobi-Bellman (HJB) Approach (Closed-Loop Optimal Con-
trol with Free x(tf ))
8.3.1 Statement of the Problem
1. Given the plant as
˙x(t) = f(x(t), u(t), t)
2. the performance index as
J = S(x(tf), tf) + ∫_{t₀}^{tf} V(x(t), u(t), t) dt
3. and the boundary conditions as
x(t0) = x0; x(tf ) is free
4. find the optimal control.
8.3.2 Solution of the Problem
1. Form the Pontryagin H function
   H(x(t), u(t), J*_x, t) = V(x(t), u(t), t) + J*_x' f(x(t), u(t), t)
2. Minimize H w.r.t. u(t) as
   (∂H/∂u)* = 0, and obtain u*(t) = h(x*(t), J*_x, t)
3. Using the result of Step 2, find the optimal H* function
   H*(x*(t), h(x*(t), J*_x, t), J*_x, t) = H*(x*(t), J*_x, t)
and obtain the HJB equation.
4. Solve the HJB equation
   J*_t + H*(x*(t), J*_x, t) = 0
with boundary condition J*(x*(tf), tf) = S(x(tf), tf). Note that
   J*_t = ∂J*(x*(t), t)/∂t; J*_x = ∂J*(x*(t), t)/∂x*
5. Use the solution J* from Step 4 to evaluate J*_x and substitute into the expression for u*(t) of Step 2 to obtain the optimal control.
Example 1:
Statement of the Problem:
1. Plant
˙x(t) = −2x(t) + u(t)
2. Performance index
   J = (1/2)x²(tf) + (1/2) ∫₀^{tf} [x²(t) + u²(t)] dt
3. find the optimal control.
Solution of the Problem:
1. Form the Hamiltonian function H
   V(x(t), u(t), t) = (1/2)[x²(t) + u²(t)]
   S(x(tf), tf) = (1/2)x²(tf)
   f(x(t), u(t), t) = −2x(t) + u(t)
   H(x*(t), J_x, u*(t), t) = V(x(t), u(t), t) + J_x f(x(t), u(t), t)
                           = (1/2)x²(t) + (1/2)u²(t) + J_x(−2x(t) + u(t))
2. Get u*(t)
   ∂H/∂u = 0 ⇒ u(t) + J_x = 0 ⇒ u(t) = −J_x
3. Get H*
   H* = (1/2)x²(t) + (1/2)(−J_x)² + J_x(−2x(t) − J_x) = −(1/2)J_x² + (1/2)x²(t) − 2x(t)J_x
4. Solve the HJB equation
   J*_t + H* = 0 ⇒ J_t − (1/2)J_x² + (1/2)x²(t) − 2x(t)J_x = 0
with boundary condition
   J*(x*(tf), tf) = S(x(tf), tf) = (1/2)x²(tf)
As the performance index is a quadratic function of the states and controls, we can guess the solution as
   J(x(t)) = (1/2)p(t)x²(t)
where p(t), the unknown function to be determined, has the boundary condition
   J(x(tf)) = (1/2)x²(tf) = (1/2)p(tf)x²(tf) ⇒ p(tf) = 1
   ∴ J_x = p(t)x(t); J_t = (1/2)ṗ(t)x²(t)
Substituting into the HJB equation,
   (1/2)ṗ(t)x*²(t) − (1/2)p²(t)x*²(t) + (1/2)x*²(t) − 2p(t)x*²(t) = 0
   ⇒ [(1/2)ṗ(t) − (1/2)p²(t) + (1/2) − 2p(t)]x*²(t) = 0
   ⇒ (1/2)ṗ(t) − (1/2)p²(t) + (1/2) − 2p(t) = 0
whose solution is
   p(t) = [(√5 − 2) + (√5 + 2)c e^{2√5(t−tf)}] / [1 − c e^{2√5(t−tf)}];  c = (3 − √5)/(3 + √5)
5. Obtain the optimal control
The closed-loop optimal control is
   u*(t) = −p(t)x(t)
Note that as tf → ∞,
   p(t) = p(∞) = ¯p = √5 − 2
and the optimal control is
   u(t) = −(√5 − 2)x(t)
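The Riccati equation ṗ = p² + 4p − 1 implied by Step 4 can be integrated backward from p(tf) = 1 to confirm the limit p → √5 − 2; a sketch with explicit Euler steps, where tf = 5 and the step size are assumed values:

```python
import math

# Integrate the scalar Riccati equation p' = p^2 + 4p - 1 backward from
# p(tf) = 1; for t well before tf, p(t) settles to sqrt(5) - 2.
f = lambda p: p * p + 4.0 * p - 1.0

tf, dt, n = 5.0, 1.0e-3, 5000   # integrate from t = tf back to t = 0
p = 1.0                          # boundary condition p(tf) = 1
for _ in range(n):
    p -= dt * f(p)               # one explicit Euler step backward in time

print(p, math.sqrt(5.0) - 2.0)   # both approximately 0.2361
```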
Example 2:
Statement of the Problem:
1. Plant
˙x(t) = −2x(t) + u(t)
2. Performance index
   J = ∫₀^∞ [x²(t) + u²(t)] dt
3. Find the optimal control.
Solution of the Problem:
1. Form the Hamiltonian function H
   V(x(t), u(t), t) = x²(t) + u²(t)
   f(x(t), u(t), t) = −2x(t) + u(t)
   H(x*(t), J_x, u*(t), t) = V(x(t), u(t), t) + J_x f(x(t), u(t), t) = x²(t) + u²(t) + J_x(−2x(t) + u(t))
2. Get u*(t)
   ∂H/∂u = 0 ⇒ 2u(t) + J_x = 0 ⇒ u(t) = −(1/2)J_x
3. Get H*
   H* = x²(t) + [−(1/2)J_x]² + J_x[−2x(t) − (1/2)J_x] = −(1/4)J_x² + x²(t) − 2x(t)J_x
4. Solve the HJB equation
   J*_t + H* = 0 ⇒ J_t − (1/4)J_x² + x²(t) − 2x(t)J_x = 0
with boundary condition
   J*(x*(tf), tf) = S(x(tf), tf) = 0
As the performance index is a quadratic function of the states and controls, we can guess the solution as
   J(x(t)) = (1/2)p(t)x²(t)
where p(t), the unknown function to be determined, has the boundary condition
   J(x(tf)) = (1/2)p(tf)x²(tf) = 0 ⇒ p(tf) = 0
   ∴ J_x = p(t)x(t); J_t = (1/2)ṗ(t)x²(t)
Substituting into the HJB equation,
   (1/2)ṗ(t)x*²(t) − (1/4)p²(t)x*²(t) + x*²(t) − 2p(t)x*²(t) = 0
   ⇒ [(1/2)ṗ(t) − (1/4)p²(t) + 1 − 2p(t)]x*²(t) = 0
   ⇒ (1/2)ṗ(t) − (1/4)p²(t) + 1 − 2p(t) = 0
whose solution, with the constant fixed by p(tf) = 0, is
   p(t) = [2(√5 − 2) + 2(√5 + 2)c e^{2√5(t−tf)}] / [1 − c e^{2√5(t−tf)}];  c = −(√5 − 2)/(√5 + 2)
5. Obtain the optimal control
The closed-loop optimal control is
   u*(t) = −(1/2)p(t)x(t)
Note that as tf → ∞,
   p(t) = p(∞) = ¯p = 2(√5 − 2)
and the optimal control is
   u(t) = −(√5 − 2)x(t)
References
[1] Desineni Subbaram Naidu, "Optimal Control Systems," Idaho State University, Pocatello, Idaho, USA.
Contacts
mohamed.atyya94@eng-st.cu.edu.eg
 
Time series Analysis
Time series AnalysisTime series Analysis
Time series Analysis
 
TGNDissertationMain
TGNDissertationMainTGNDissertationMain
TGNDissertationMain
 
Ns doc
Ns docNs doc
Ns doc
 
Methods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdfMethods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdf
 
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
 

More from Mohamed Mohamed El-Sayed (20)

Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
 
Numerical solutions for ode (differential equations)
Numerical solutions for ode (differential equations)Numerical solutions for ode (differential equations)
Numerical solutions for ode (differential equations)
 
Curve fitting
Curve fittingCurve fitting
Curve fitting
 
Interpolation
InterpolationInterpolation
Interpolation
 
Numerical integration
Numerical integrationNumerical integration
Numerical integration
 
Numerical differentiation
Numerical differentiationNumerical differentiation
Numerical differentiation
 
Numerical solutions for 1 non linear eq and system of non linear eqs
Numerical solutions for 1 non linear eq and system of non linear eqsNumerical solutions for 1 non linear eq and system of non linear eqs
Numerical solutions for 1 non linear eq and system of non linear eqs
 
Numerical solutions for linear system of equations
Numerical solutions for linear system of equationsNumerical solutions for linear system of equations
Numerical solutions for linear system of equations
 
Worm gears
Worm gearsWorm gears
Worm gears
 
Bevel gears
Bevel gearsBevel gears
Bevel gears
 
Helical gears
Helical gearsHelical gears
Helical gears
 
Brakes
BrakesBrakes
Brakes
 
Clutches
ClutchesClutches
Clutches
 
Springs
SpringsSprings
Springs
 
Chain drives
Chain drivesChain drives
Chain drives
 
V belt and rope drives
V belt and rope drivesV belt and rope drives
V belt and rope drives
 
Flat belt pulleys
Flat belt pulleysFlat belt pulleys
Flat belt pulleys
 
Flat belt drives
Flat belt drivesFlat belt drives
Flat belt drives
 
Power screws
Power screwsPower screws
Power screws
 
Levers
LeversLevers
Levers
 

Recently uploaded

UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 

Recently uploaded (20)

UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 

Optimal control systems

  • 1. Unconstrained Optimal Control: Summary. Mr. Mohamed Mohamed El-Sayed Atyya, Aerospace Engineer
  • 2. Contents

1 Introduction 4
  1.1 Classical and Modern Control 4
  1.2 Optimization 5
    1.2.1 Static Optimization 5
    1.2.2 Dynamic Optimization 5
  1.3 Optimal Control 5
    1.3.1 Plant 5
    1.3.2 Performance Index 5
    1.3.3 Constraints 6
    1.3.4 Formal Statement of Optimal Control System 6
2 Calculus of Variations and Open-Loop Optimal Control 8
  2.1 Basic Concepts 8
    2.1.1 Function and Functional 8
    2.1.2 Increment 8
    2.1.3 Differential and Variation 9
  2.2 Optimum of a Function and a Functional 11
    2.2.1 Optimum of a Function 11
    2.2.2 Optimum of a Functional 11
  2.3 Euler-Lagrange Equation 11
    2.3.1 Different Cases for Euler-Lagrange Equation 12
  2.4 Procedure of Pontryagin Principle for Bolza Problem (Open-Loop Optimal Control) 14
    2.4.1 Statement of the Problem 14
    2.4.2 Solution of the Problem 14
    2.4.3 Types of Systems 15
3 Linear Quadratic Optimal Control Systems I (Regulator Closed-Loop Optimal Control) 24
  3.1 Procedure Summary of Finite-Time Linear Quadratic Regulator System: Time-Varying Case (Closed-Loop Optimal Control with Fixed tf and Free x(tf)) 24
    3.1.1 Statement of the Problem 24
    3.1.2 Solution of the Problem 24
  3.2 Salient Features 25
  3.3 LQR System for General Performance Index 26
  3.4 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞)) 29
    3.4.1 Statement of the Problem 29
    3.4.2 Solution of the Problem 30
  3.5 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞)) 30
    3.5.1 Statement of the Problem 31
    3.5.2 Solution of the Problem 31
  3.6 Stability Issues of Time-Invariant Regulator 34
  3.7 Equivalence of Open-Loop and Closed-Loop Optimal Controls 34
4 Linear Quadratic Optimal Control Systems II (Tracking Closed-Loop Optimal Control) 38
  4.1 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control with Fixed tf and Free x(tf)) 38
    4.1.1 Statement of the Problem 38
    4.1.2 Solution of the Problem 38
    4.1.3 Salient Features of Tracking System 40
  4.2 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control with Infinite tf and Free x(∞)) 48
    4.2.1 Statement of the Problem 48
    4.2.2 Solution of the Problem 48
  4.3 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control with Infinite Time-Invariant and Free x(∞)) 49
    4.3.1 Statement of the Problem 49
    4.3.2 Solution of the Problem 49
  4.4 Procedure Summary of Fixed-End-Point Regulator System (Closed-Loop Optimal Control) 50
    4.4.1 Statement of the Problem 50
    4.4.2 Solution of the Problem 50
  4.5 Procedure Summary of Regulator System with Prescribed Degree of Stability (Closed-Loop Optimal Control of Infinite Time-Invariant Systems) 52
    4.5.1 Statement of the Problem 52
    4.5.2 Solution of the Problem 53
  4.6 Closed-Loop Controller Design Using Frequency-Domain (Kalman Equation in Frequency Domain) 54
    4.6.1 Relation Between Open-Loop and Closed-Loop 54
    4.6.2 Statement of the Problem 54
    4.6.3 Solution of the Problem 54
    4.6.4 Gain Margin and Phase Margin 55
5 Variational Calculus and Open-Loop Optimal Control for Discrete-Time Systems 57
  5.1 Discrete Euler-Lagrange Equation 57
  5.2 Procedure Summary for Discrete-Time Optimal Control System (Open-Loop Optimal Control) 57
    5.2.1 Statement of the Problem 57
    5.2.2 Solution of the Problem 58
    5.2.3 Types of Systems 58
6 Linear Quadratic Optimal Control for Discrete-Time Systems I (Regulator Closed-Loop Optimal Control) 61
  6.1 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System (Closed-Loop Optimal Control with Fixed kf and Free x(kf)) 61
    6.1.1 Statement of the Problem 61
    6.1.2 Solution of the Problem 61
  6.2 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System: Steady-State Condition (Closed-Loop Optimal Control with kf = ∞) 64
    6.2.1 Statement of the Problem 64
    6.2.2 Solution of the Problem 65
  6.3 Analytical Solution to the Riccati Equation 67
7 Linear Quadratic Optimal Control for Discrete-Time Systems II (Tracking Closed-Loop Optimal Control) 70
  7.1 Procedure Summary of Discrete-Time Linear Quadratic Tracking System (Closed-Loop Optimal Control with Fixed Linear Time-Invariant and Free x(kf)) 70
    7.1.1 Statement of the Problem 70
    7.1.2 Solution of the Problem 70
  7.2 Closed-Loop Controller Design Using Frequency-Domain (Discrete Kalman Equation in Frequency Domain) 74
    7.2.1 Relation Between Open-Loop and Closed-Loop 74
    7.2.2 Statement of the Problem 75
    7.2.3 Solution of the Problem 75
8 Pontryagin Minimum Principle 76
  8.1 Procedure Summary of Pontryagin Minimum Principle 76
    8.1.1 Statement of the Problem 76
    8.1.2 Solution of the Problem 76
    8.1.3 Types of Systems 77
    8.1.4 Important Notes 78
    8.1.5 Additional Necessary Conditions 78
  8.2 Optimal Control of Discrete-Time Systems Using the Principle of Optimality of Dynamic Programming (Regulator Optimal Control with Fixed kf and Free x(kf)) 79
    8.2.1 Statement of the Problem 79
    8.2.2 Solution of the Problem 79
  8.3 Optimal Control of Continuous-Time Systems Using Hamilton-Jacobi-Bellman (HJB) Approach (Closed-Loop Optimal Control with Free x(tf)) 79
    8.3.1 Statement of the Problem 79
    8.3.2 Solution of the Problem 80
  • 5. Chapter 1: Introduction

1.1 Classical and Modern Control

Classical (conventional) control theory, concerned with single-input, single-output (SISO) systems, is based mainly on Laplace-transform theory and its use in representing systems in block-diagram form:

Y(s)/R(s) = G(s) / (1 + G(s)H(s)),  where G(s) = Gc(s)Gp(s)

Modern control theory, concerned with multiple-input, multiple-output (MIMO) systems, is based on state-variable representation in terms of a set of first-order differential (or difference) equations. Here the system (plant) is characterized by state variables, say, in linear time-invariant form as

ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t)

Figure 1.1: Classical Control Configuration
Figure 1.2: Modern Control Configuration
Figure 1.3: Components of a Modern Control System
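The state-variable model above can be exercised with a short numerical sketch. This is illustrative only: the plant matrices, the unit-step input, and the simple forward-Euler integrator are assumptions chosen for the demo, not something prescribed by the text.

```python
# Minimal sketch: forward-Euler simulation of the LTI plant
#   x'(t) = A x(t) + B u(t),   y(t) = C x(t) + D u(t)
# using plain Python lists as matrices (single-input case).

def simulate_lti(A, B, C, D, x0, u, dt, steps):
    """Integrate x' = Ax + Bu with forward Euler; return the output history y."""
    n = len(x0)
    x = list(x0)
    ys = []
    for k in range(steps):
        uk = u(k * dt)
        # output equation y = Cx + Du
        y = [sum(C[i][j] * x[j] for j in range(n)) + D[i][0] * uk
             for i in range(len(C))]
        ys.append(y)
        # state equation, advanced one Euler step
        dx = [sum(A[i][j] * x[j] for j in range(n)) + B[i][0] * uk
              for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
    return ys

# Hypothetical example: stable first-order plant x' = -x + u, y = x,
# driven by a unit step. The output approaches the steady-state value 1.
ys = simulate_lti(A=[[-1.0]], B=[[1.0]], C=[[1.0]], D=[[0.0]],
                  x0=[0.0], u=lambda t: 1.0, dt=0.01, steps=1000)
```

The same structure extends directly to MIMO plants by widening B, D, and u.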
  • 6. 1.2 Optimization

1.2.1 Static Optimization
Static optimization is concerned with controlling a plant under steady-state conditions, i.e., the system variables do not change with respect to time. The plant is then described by algebraic equations. Techniques used are ordinary calculus, Lagrange multipliers, and linear and nonlinear programming.

1.2.2 Dynamic Optimization
Dynamic optimization is concerned with the optimal control of plants under dynamic conditions, i.e., the system variables change with respect to time, and thus time enters the system description. The plant is then described by differential (or difference) equations. Techniques used are search techniques, dynamic programming, variational calculus (the calculus of variations), and the Pontryagin principle.

Figure 1.4: Overview of Optimization

1.3 Optimal Control
The main objective of optimal control is to determine control signals that will cause a process (plant) to satisfy some physical constraints and at the same time extremize (maximize or minimize) a chosen performance criterion (performance index or cost function). The formulation of an optimal control problem requires
1. a mathematical description (or model) of the process to be controlled (generally in state-variable form),
2. a specification of the performance index, and
3. a statement of the boundary conditions and the physical constraints on the states and/or controls.

1.3.1 Plant
For the purpose of optimization, we describe a physical plant by a set of linear or nonlinear differential or difference equations.

1.3.2 Performance Index
In modern control theory, the optimal control problem is to find a control which causes the dynamical system to reach a target or follow a state variable (or trajectory) and at the same time extremize a performance index, which may take several forms as described below.

1. Performance Index for Time-Optimal Control System:
J = ∫_{t0}^{tf} dt = tf − t0 = t*

2. Performance Index for Fuel-Optimal Control System: Assume that the magnitude |u(t)| of the thrust is proportional to the rate of fuel consumption. Then
J = ∫_{t0}^{tf} |u(t)| dt
Page 5 of 83
  • 7. For several controls, we may write it as
J = ∫_{t0}^{tf} Σ_{i=1}^{m} Ri |ui(t)| dt
where Ri is a weighting factor.

3. Performance Index for Minimum-Energy Control System:
J = ∫_{t0}^{tf} u'(t) R u(t) dt
where R is a positive definite matrix.

4. Performance Index for Tracking-Optimal Control System:
J = ∫_{t0}^{tf} x'(t) Q x(t) dt
where Q is a positive semi-definite matrix.

5. Performance Index for Terminal Control System: In a terminal target problem, we are interested in minimizing the error between the desired target position xd(tf) and the actual target position xa(tf) at the end of the maneuver, i.e., at the final time tf. The terminal (final) error is x(tf) = xa(tf) − xd(tf). Taking care of positive and negative values of the error, and of weighting factors, we structure the cost function as
J = x'(tf) F x(tf)
where F is a positive semi-definite matrix.

6. Performance Index for General Optimal Control System:
J = x'(tf) F x(tf) + ∫_{t0}^{tf} [x'(t) Q x(t) + u'(t) R u(t)] dt = S(x(tf), tf) + ∫_{t0}^{tf} V(x(t), u(t), t) dt
where R is a positive definite matrix and Q and F are positive semi-definite matrices. Note that the matrices Q and R may be time-varying.

1.3.3 Constraints
The control u(t) and state x(t) vectors are either unconstrained or constrained, depending upon the physical situation. The unconstrained problem is less involved and gives rise to some elegant results. From physical considerations, we often have the controls and states, such as currents and voltages in an electrical circuit, the speed of a motor, or the thrust of a rocket, constrained as
U− ≤ u(t) ≤ U+,  X− ≤ x(t) ≤ X+

1.3.4 Formal Statement of Optimal Control System
Optimal control systems are studied in three stages.
1. In the first stage, we consider just the performance index and use the well-known theory of the calculus of variations to obtain optimal functions.
2. In the second stage, we bring in the plant and address the problem of finding the optimal control u*(t) which will drive the plant and at the same time optimize the performance index.
3. Finally, constraints on the controls and states are considered along with the plant and performance index to obtain the optimal control.
Page 6 of 83
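The general performance index of item 6 can be evaluated numerically for a given trajectory and control. The sketch below is illustrative: the scalar trajectory x(t) = e^(−t), control u(t) = −e^(−t), and weights F, Q, R are hypothetical choices, and the trapezoid rule stands in for the integral.

```python
import math

# Evaluate the scalar quadratic performance index
#   J = F x(tf)^2 + ∫_{t0}^{tf} [Q x(t)^2 + R u(t)^2] dt
# by the trapezoid rule.

def quad_cost(x, u, F, Q, R, t0, tf, n=10000):
    dt = (tf - t0) / n
    integral = 0.0
    for k in range(n + 1):
        t = t0 + k * dt
        w = 0.5 if k in (0, n) else 1.0      # trapezoid end-point weights
        integral += w * (Q * x(t) ** 2 + R * u(t) ** 2) * dt
    return F * x(tf) ** 2 + integral

J = quad_cost(x=lambda t: math.exp(-t), u=lambda t: -math.exp(-t),
              F=1.0, Q=1.0, R=1.0, t0=0.0, tf=5.0)
# Analytically: J = e^{-10} + ∫_0^5 2 e^{-2t} dt = e^{-10} + (1 - e^{-10}) = 1
```

For vector states the scalar squares become the quadratic forms x'Qx and u'Ru.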
  • 8. Figure 1.5: Optimal Control Problem Page 7 of 83
  • 9. Chapter 2: Calculus of Variations and Open-Loop Optimal Control

2.1 Basic Concepts

2.1.1 Function and Functional
Function: A variable x is a function of a variable quantity t (written x(t) = f(t)) if to every value of t over a certain range of t there corresponds a value x; i.e., we have a correspondence: to a number t there corresponds a number x. Note that here t need not always be time, but may be any independent variable.
Functional: A variable quantity J is a functional dependent on a function f(x), written J = J(f(x)), if to each function f(x) there corresponds a value J; i.e., we have a correspondence: to the function f(x) there corresponds a number J. A functional may depend on several functions.

2.1.2 Increment
Increment of a Function: The increment of the function f, denoted ∆f, is defined as
∆f(t, ∆t) = f(t + ∆t) − f(t)
Example: If f(t) = (t1 + t2)², the increment ∆f is
∆f = f(t + ∆t) − f(t) = (t1 + ∆t1 + t2 + ∆t2)² − (t1 + t2)² = 2(t1 + t2)∆t1 + 2(t1 + t2)∆t2 + (∆t1)² + (∆t2)² + 2∆t1∆t2

Increment of a Functional: The increment of the functional J(x), denoted ∆J, is defined as
∆J(x(t), δx(t)) = J(x(t) + δx(t)) − J(x(t))
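The expanded increment formula for f(t) = (t1 + t2)² can be checked numerically. The sample point and perturbations below are arbitrary choices for the demo.

```python
# Check that the expanded increment expression from the text equals the
# direct increment f(t + Δt) − f(t) for f(t1, t2) = (t1 + t2)^2.

def f(t1, t2):
    return (t1 + t2) ** 2

def increment(t1, t2, dt1, dt2):
    return f(t1 + dt1, t2 + dt2) - f(t1, t2)

def expanded(t1, t2, dt1, dt2):
    # the fully expanded expression derived in the example
    return (2 * (t1 + t2) * dt1 + 2 * (t1 + t2) * dt2
            + dt1 ** 2 + dt2 ** 2 + 2 * dt1 * dt2)

lhs = increment(1.0, 2.0, 0.1, -0.05)
rhs = expanded(1.0, 2.0, 0.1, -0.05)
# the two agree exactly (both equal 0.3025 here)
```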
Example: Find the increment of the functional

J = ∫_{t0}^{tf} [2x²(t) + 1] dt

The increment of J is given by

∆J = J(x(t) + δx(t)) − J(x(t)) = ∫_{t0}^{tf} [2(x(t) + δx(t))² + 1] dt − ∫_{t0}^{tf} [2x²(t) + 1] dt = ∫_{t0}^{tf} [4x(t)δx(t) + 2(δx(t))²] dt

2.1.3 Differential and Variation

Differential of a Function: Let us define at a point t* the increment of the function f as

∆f = f(t* + ∆t) − f(t*)

Expanding f(t* + ∆t) in a Taylor series about t*, we get

∆f = f(t*) + (df/dt)*∆t + (1/2!)(d²f/dt²)*(∆t)² + ... − f(t*)

Neglecting the higher order terms in ∆t,

∆f = (df/dt)*∆t = ḟ(t*)∆t = df

Figure 2.1: Increment ∆f, Differential df, and Derivative ḟ of a Function f(t)

Example: If f(t) = t² + 2t, the increment ∆f is

∆f = f(t* + ∆t) − f(t*) = (t* + ∆t)² + 2(t* + ∆t) − (t*² + 2t*) = 2t*∆t + 2∆t + (∆t)²

Neglecting the higher order term (∆t)², the differential is

df = 2(t* + 1)∆t = ḟ(t*)∆t
Variation of a Functional: Consider the increment of a functional

∆J = J(x(t) + δx(t)) − J(x(t))

Expanding J(x(t) + δx(t)) in a Taylor series, we get

∆J = J(x(t)) + (∂J/∂x)δx(t) + (1/2!)(∂²J/∂x²)(δx(t))² + ... − J(x(t)) = (∂J/∂x)δx(t) + (1/2!)(∂²J/∂x²)(δx(t))² + ... = δJ + δ²J + ...

Figure 2.2: Increment ∆J and the First Variation δJ of the Functional J

Example: If

J(x(t)) = ∫_{t0}^{tf} [2x²(t) + 3x(t) + 4] dt

the increment ∆J is

∆J = J(x(t) + δx(t)) − J(x(t)) = ∫_{t0}^{tf} [2(x(t) + δx(t))² + 3(x(t) + δx(t)) + 4 − 2x²(t) − 3x(t) − 4] dt = ∫_{t0}^{tf} [4x(t)δx(t) + 2(δx(t))² + 3δx(t)] dt

Considering only the first order terms, we get the (first) variation as

δJ(x(t), δx(t)) = ∫_{t0}^{tf} [4x(t) + 3] δx(t) dt
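The first-variation formula just derived can be checked numerically: for a small δx, the increment ∆J should agree with δJ up to a second-order term. The nominal curve and variation shape below are arbitrary assumptions chosen for illustration.

```python
import numpy as np

def integral(y, t):
    # Trapezoidal quadrature over the grid t.
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))

t = np.linspace(0.0, 1.0, 4001)
x = np.sin(t)            # arbitrary nominal curve (assumption)
eta = t * (1.0 - t)      # arbitrary variation shape (assumption)

def J(x):
    # J = ∫ [2x² + 3x + 4] dt on [0, 1]
    return integral(2*x**2 + 3*x + 4, t)

eps = 1e-3
dx = eps * eta
dJ_full = J(x + dx) - J(x)                 # increment ΔJ
dJ_first = integral((4*x + 3) * dx, t)     # first variation δJ
# The gap is exactly the second variation, ∫ 2 (δx)² dt = O(ε²).
print(dJ_full - dJ_first)
```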
2.2 Optimum of a Function and a Functional

2.2.1 Optimum of a Function

The increment of the function, ∆f, is used to evaluate relative extremum points:

∆f = f(t) − f(t*) ≥ 0 → f(t*) is a relative local minimum
∆f = f(t) − f(t*) ≤ 0 → f(t*) is a relative local maximum

It is well known that the necessary condition for an optimum of a function is that the (first) differential vanishes, i.e., df = 0. The sufficient condition

1. for a minimum is that the second differential is positive, i.e., d²f > 0, and
2. for a maximum is that the second differential is negative, i.e., d²f < 0.

Figure 2.3: (a) Minimum and (b) Maximum of a Function f(t)

2.2.2 Optimum of a Functional

The increment of the functional, ∆J, is used to evaluate relative extremum points:

∆J = J(x) − J(x*) ≥ 0 → J(x*) is a relative local minimum
∆J = J(x) − J(x*) ≤ 0 → J(x*) is a relative local maximum

Theorem: For x*(t) to be a candidate for an optimum, the (first) variation of J must be zero on x*(t), i.e., δJ(x*(t), δx(t)) = 0 for all admissible values of δx(t). This is a necessary condition. As a sufficient condition, for a minimum the second variation δ²J > 0, and for a maximum δ²J < 0.

2.3 Euler-Lagrange Equation

The Euler-Lagrange equation can be written as

Vx − (d/dt)(Vẋ) = 0

where

Vx = ∂V/∂x evaluated at (x*(t), ẋ*(t), t)
Vẋ = ∂V/∂ẋ evaluated at (x*(t), ẋ*(t), t)
Since V is a function of the three arguments x*(t), ẋ*(t), and t, and x*(t) and ẋ*(t) are in turn functions of t, we get

(d/dt)(Vẋ)* = (∂²V/∂x∂ẋ)* ẋ*(t) + (∂²V/∂ẋ∂ẋ)* ẍ*(t) + (∂²V/∂t∂ẋ)* = Vxẋ ẋ*(t) + Vẋẋ ẍ*(t) + Vtẋ

The alternate form of the Euler-Lagrange equation is

Vx − Vtẋ − Vxẋ ẋ*(t) − Vẋẋ ẍ*(t) = 0

2.3.1 Different Cases for the Euler-Lagrange Equation

• Case 1: V depends on ẋ(t) and t only; that is, V = V(ẋ(t), t). Then Vx = 0, and the Euler-Lagrange equation becomes

(d/dt)(Vẋ) = 0 → Vẋ = ∂V(ẋ(t), t)/∂ẋ = C

where C is a constant of integration.

• Case 2: V depends on ẋ(t) only; that is, V = V(ẋ(t)). Then Vx = 0, and (d/dt)(Vẋ) = 0 → Vẋ = C. In general, the solution becomes

ẋ*(t) = C1 → x*(t) = C1 t + C2

This is simply the equation of a straight line.

• Case 3: V depends on x(t) and ẋ(t); that is, V = V(x(t), ẋ(t)). Then Vtẋ = 0, and using the alternate form of the Euler-Lagrange equation, we get

Vx − Vxẋ ẋ*(t) − Vẋẋ ẍ*(t) = 0

Multiplying the previous equation by ẋ*(t), we have

ẋ*(t)[Vx − Vxẋ ẋ*(t) − Vẋẋ ẍ*(t)] = 0

This can be rewritten as

(d/dt)(V − ẋ*(t)Vẋ) = 0 → V − ẋ*(t)Vẋ = C

This first-order equation can be solved by techniques such as separation of variables.

• Case 4: V depends on x(t) and t; that is, V = V(x(t), t). Then Vẋ = 0, and the Euler-Lagrange equation becomes

∂V(x*(t), t)/∂x = 0

The solution of this equation does not contain any arbitrary constants and therefore, generally speaking, does not satisfy the boundary conditions x(t0) and x(tf). Hence, in general, no solution exists for this variational problem. Only in rare cases, when the function x(t) happens to satisfy the given boundary conditions, does it become an optimal function.
Example 1: Find the curve of minimum length between two points.

Solution: It is well known that the solution to this problem is a straight line; we use it to illustrate the application of the Euler-Lagrange equation in a simple case. Consider the arc between two points A and B as shown in Fig. 2.4. Let ds be a small arc length, and dx and dt the small rectangular coordinate increments. Note that t is the independent variable representing distance and not time. Then

(ds)² = (dx)² + (dt)² → ds = √(1 + ẋ²(t)) dt

The performance index J to be minimized is

J = ∫ ds = ∫_{t0}^{tf} √(1 + ẋ²(t)) dt = ∫_{t0}^{tf} V(ẋ(t)) dt

where V(ẋ(t)) = √(1 + ẋ²(t)). Note that V is a function of ẋ(t) only. Applying the Euler-Lagrange equation (Case 2), we get

ẋ*(t)/√(1 + ẋ*²(t)) = C

Solving this equation, we get the optimal solution

x*(t) = C1 t + C2

This is evidently the equation of a straight line, and the constants C1 and C2 are evaluated from the given boundary conditions. For example, if x(0) = 1 and x(2) = 5, then C1 = 2 and C2 = 1, and the straight line is x*(t) = 2t + 1.

Figure 2.4: Arc Length

Example 2: Find the optimum of

J = ∫_0^2 [ẋ²(t) + 2tx(t)] dt

that satisfies the boundary (initial and final) conditions x(0) = 1 and x(2) = 5.
Solution: Here V = ẋ²(t) + 2tx(t), and the Euler-Lagrange equation gives

∂V/∂x − (d/dt)(∂V/∂ẋ) = 0 → 2t − (d/dt)(2ẋ(t)) = 0 → ẍ(t) = t

Solving this simple differential equation, we have

x*(t) = t³/6 + C1 t + C2

where C1 and C2 are constants of integration. Using the given boundary conditions, we have

x(0) = 1 → C2 = 1
x(2) = 5 → C1 = 4/3

With these values for the constants, we finally have the optimal function

x*(t) = t³/6 + (4/3)t + 1

2.4 Procedure of Pontryagin Principle for Bolza Problem (Open-Loop Optimal Control)

2.4.1 Statement of the Problem

1. Given the plant as ẋ(t) = f(x(t), u(t), t),
2. the performance index as J = S(x(tf), tf) + ∫_{t0}^{tf} V(x(t), u(t), t) dt,
3. and the boundary conditions as x(t0) = x0, with final conditions depending on the system type,
4. find the optimal control.

2.4.2 Solution of the Problem

1. Form the Pontryagin H function

H(x(t), u(t), λ(t), t) = V(x(t), u(t), t) + λ'(t)f(x(t), u(t), t)

2. Minimize H w.r.t. u(t),

(∂H/∂u)* = 0, and obtain u*(t) = h(x*(t), λ*(t), t)

3. Using the result of Step 2 in Step 1, find the optimal H*

H*(x*(t), h(x*(t), λ*(t), t), λ*(t), t) = H*(x*(t), λ*(t), t)
4. Solve the set of 2n differential equations

ẋ* = +(∂H/∂λ)* and λ̇* = −(∂H/∂x)*

with the initial conditions x0 and the final conditions obtained from

[H + ∂S/∂t]*_{tf} δtf + [(∂S/∂x) − λ(t)]*'_{tf} δxf = 0

according to the cases in Subsec. 2.4.3.

5. Substitute the solutions x*(t), λ*(t) from Step 4 into the expression for the optimal control u*(t) of Step 2.

2.4.3 Types of Systems

• Fixed-final time and fixed-final state system, Fig. 2.5(a): substitute δtf = 0, δxf = 0; boundary conditions x(t0) = x0, x(tf) = xf.

• Free-final time and fixed-final state system, Fig. 2.5(b): substitute δxf = 0, with δtf arbitrary; boundary conditions x(t0) = x0, x(tf) = xf, and [H* + ∂S/∂t]_{tf} = 0.

• Fixed-final time and free-final state system, Fig. 2.5(c): substitute δtf = 0, with δxf arbitrary; boundary conditions x(t0) = x0 and λ*(tf) = (∂S/∂x)*_{tf}.

• Free-final time and dependent free-final state system, Fig. 2.5(d): substitute δxf = θ̇(tf)δtf; boundary conditions x(t0) = x0, x(tf) = θ(tf), and [H* + ∂S/∂t + ((∂S/∂x) − λ*(t))'θ̇(t)]_{tf} = 0.

• Free-final time and independent free-final state system: δtf and δxf both arbitrary and independent; boundary conditions x(t0) = x0, [H* + ∂S/∂t]_{tf} = 0, and [(∂S/∂x) − λ*(t)]_{tf} = 0.

Figure 2.5: Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) Free-Final Time and Fixed-Final State System, (c) Fixed-Final Time and Free-Final State System, (d) Free-Final Time and Free-Final State System
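For the double-integrator examples that follow, Step 4's 2n equations can be integrated directly by quadrature; a minimal sympy sketch reproducing the general state/costate solution quoted in those examples (symbol names are arbitrary):

```python
import sympy as sp

t, C1, C2, C3, C4 = sp.symbols('t C1 C2 C3 C4')

# State-costate system ẋ1 = x2, ẋ2 = −λ2, λ̇1 = 0, λ̇2 = −λ1,
# obtained from H = u²/2 + λ1 x2 + λ2 u with u* = −λ2.
l1 = C3                                   # λ̇1 = 0
l2 = sp.integrate(-l1, t) + C4            # λ̇2 = −λ1 → λ2 = −C3 t + C4
x2 = sp.integrate(-l2, t) + C2            # ẋ2 = −λ2
x1 = sp.integrate(x2, t) + C1             # ẋ1 = x2

print(sp.expand(x1))  # cubic in t: C3 t³/6 − C4 t²/2 + C2 t + C1
print(sp.expand(x2))  # quadratic in t: C3 t²/2 − C4 t + C2
```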
Example 1: Statement of the Problem:

1. Plant: ẋ1(t) = x2(t), ẋ2(t) = u(t)
2. Performance index: J = (1/2) ∫_{t0}^{tf} u²(t) dt
3. Boundary conditions: x(0) = [1 2]', x(2) = [1 0]'

Solution of the Problem:

1. Form the Pontryagin H function

V(x(t), u(t), t) = V(u(t)) = (1/2)u²(t)
f(x(t), u(t), t) = [f1 f2]' = [x2(t) u(t)]'
H = H(x1(t), x2(t), u(t), λ1(t), λ2(t)) = V(u(t)) + λ'f(x(t), u(t), t) = (1/2)u²(t) + λ1(t)x2(t) + λ2(t)u(t)

2. Get u*(t):

∂H/∂u = 0 → u*(t) + λ2*(t) = 0 → u*(t) = −λ2*(t)

3. Get H*:

H*(x1*(t), x2*(t), λ1*(t), λ2*(t)) = (1/2)λ2*²(t) + λ1*(t)x2*(t) − λ2*²(t) = λ1*(t)x2*(t) − (1/2)λ2*²(t)

4. Obtain the state and costate equations

ẋ1*(t) = +(∂H/∂λ1)* = x2*(t)
ẋ2*(t) = +(∂H/∂λ2)* = −λ2*(t)
λ̇1*(t) = −(∂H/∂x1)* = 0
λ̇2*(t) = −(∂H/∂x2)* = −λ1*(t)

Solving the previous equations, we have the optimal states and costates

x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control

u*(t) = −λ2*(t) = C3 t − C4

From the boundary conditions x1(0) = 1, x2(0) = 2, x1(2) = 1, x2(2) = 0, we get C1 = 1, C2 = 2, C3 = 3, and C4 = 4. Finally, we have the optimal states, costates and control as

x1*(t) = 0.5t³ − 2t² + 2t + 1
x2*(t) = 1.5t² − 4t + 2
λ1*(t) = 3
λ2*(t) = −3t + 4
u*(t) = 3t − 4

The system with the optimal controller is shown in the following figures.

Figure 2.6: Optimal Controller

Figure 2.7: Optimal Control and States

Example 2: Statement of the Problem:

1. Plant: ẋ1(t) = x2(t), ẋ2(t) = u(t)
2. Performance index: J = (1/2) ∫_{t0}^{tf} u²(t) dt
3. Boundary conditions: x(0) = [1 2]', x1(2) = 0, x2(2) free

Solution of the Problem: Steps 1-4 are identical to those of Example 1 (same plant and performance index), giving

u*(t) = −λ2*(t) = C3 t − C4
x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control: From the boundary conditions

x1(0) = 1, x2(0) = 2, x1(2) = 0, λ2(tf) = (∂S/∂x2)*_{tf} = 0 → λ2(2) = 0

we get C1 = 1, C2 = 2, C3 = 15/8, and C4 = 15/4. Finally, we have the optimal states, costates and control as

x1*(t) = (5/16)t³ − (15/8)t² + 2t + 1
x2*(t) = (15/16)t² − (15/4)t + 2
λ1*(t) = 15/8
λ2*(t) = −(15/8)t + 15/4
u*(t) = (15/8)t − 15/4

The system with the optimal controller is shown in the following figure.

Figure 2.8: Optimal Control and States

Example 3: Statement of the Problem:

1. Plant: ẋ1(t) = x2(t), ẋ2(t) = u(t)
2. Performance index: J = (1/2) ∫_{t0}^{tf} u²(t) dt
3. Boundary conditions: x(0) = [1 2]', tf free, x1(tf) = 3, x2(tf) free
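As a quick numerical cross-check of Example 2 just solved, the sketch below simulates the plant under the derived open-loop control u*(t) = (15/8)t − 15/4 and confirms the hard terminal constraint x1(2) = 0 (the function name plant is an arbitrary choice):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Double integrator ẋ1 = x2, ẋ2 = u under Example 2's open-loop control.
def plant(t, x):
    u = (15.0 / 8.0) * t - 15.0 / 4.0
    return [x[1], u]

sol = solve_ivp(plant, (0.0, 2.0), [1.0, 2.0], rtol=1e-10, atol=1e-12)
x1f = sol.y[0, -1]
print(x1f)  # ≈ 0, as required; x2(2) is left free
```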
Solution of the Problem: Steps 1-4 are identical to those of Example 1 (same plant and performance index), giving

u*(t) = −λ2*(t) = C3 t − C4
x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control: From the boundary conditions

x1(0) = 1, x2(0) = 2, x1(tf) = 3
[H + ∂S/∂t]_{tf} = 0 → λ1(tf)x2(tf) − 0.5λ2²(tf) = 0
λ2(tf) = (∂S/∂x2)_{tf} = 0

we get C1 = 1, C2 = 2, C3 = 4/9, C4 = 4/3, and tf = 3. Finally, we have the optimal states, costates and control as

x1*(t) = (2/27)t³ − (2/3)t² + 2t + 1
x2*(t) = (2/9)t² − (4/3)t + 2
λ1*(t) = 4/9
λ2*(t) = −(4/9)t + 4/3
u*(t) = (4/9)t − 4/3

The system with the optimal controller is shown in the following figure.

Figure 2.9: Optimal Control and States

Example 4: Statement of the Problem:

1. Plant: ẋ1(t) = x2(t), ẋ2(t) = u(t)
2. Performance index: J = (1/2)[x1(2) − 4]² + (1/2)[x2(2) − 2]² + (1/2) ∫_0^2 u²(t) dt
3. Boundary conditions: x(0) = [1 2]', tf = 2, x(2) free
Solution of the Problem: Steps 1-4 are identical to those of Example 1 (same plant and performance index), giving

u*(t) = −λ2*(t) = C3 t − C4
x1*(t) = (C3/6)t³ − (C4/2)t² + C2 t + C1
x2*(t) = (C3/2)t² − C4 t + C2
λ1*(t) = C3
λ2*(t) = −C3 t + C4

5. Obtain the optimal control: From the boundary conditions

x1(0) = 1, x2(0) = 2
λ1(tf) = (∂S/∂x1)_{tf} → λ1*(2) = x1*(2) − 4
λ2(tf) = (∂S/∂x2)_{tf} → λ2*(2) = x2*(2) − 2

we get C1 = 1, C2 = 2, C3 = 3/7, and C4 = 4/7. Finally, we have the optimal states, costates and control as

x1*(t) = (1/14)t³ − (2/7)t² + 2t + 1
x2*(t) = (3/14)t² − (4/7)t + 2
λ1*(t) = 3/7
λ2*(t) = −(3/7)t + 4/7
u*(t) = (3/7)t − 4/7

The system with the optimal controller is shown in the following figure.

Figure 2.10: Optimal Control and States
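The constants in Example 4 follow from two linear transversality equations; a sympy sketch reproducing them (variable names are arbitrary):

```python
import sympy as sp

t, C3, C4 = sp.symbols('t C3 C4')
C1, C2 = 1, 2                      # fixed by x1(0) = 1, x2(0) = 2
x1 = C3/6*t**3 - C4/2*t**2 + C2*t + C1
x2 = C3/2*t**2 - C4*t + C2
lam1 = C3
lam2 = -C3*t + C4

# Transversality conditions at tf = 2:
#   λ1(2) = x1(2) − 4  and  λ2(2) = x2(2) − 2
sol = sp.solve([sp.Eq(lam1, x1.subs(t, 2) - 4),
                sp.Eq(lam2.subs(t, 2), x2.subs(t, 2) - 2)], [C3, C4])
print(sol)  # {C3: 3/7, C4: 4/7}
```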
Chapter 3

Linear Quadratic Optimal Control Systems I (Regulator Closed-Loop Optimal Control)

3.1 Procedure Summary of Finite-Time Linear Quadratic Regulator System: Time-Varying Case (Closed-Loop Optimal Control with Fixed tf and Free x(tf))

3.1.1 Statement of the Problem

1. Given the plant as ẋ(t) = A(t)x(t) + B(t)u(t),
2. the performance index as

J = (1/2)x'(tf)F(tf)x(tf) + (1/2) ∫_{t0}^{tf} [x'(t)Q(t)x(t) + u'(t)R(t)u(t)] dt

3. and the boundary conditions as x(t0) = x0, tf fixed, and x(tf) free,
4. find the optimal control, state and performance index.

3.1.2 Solution of the Problem

1. Solve the matrix differential Riccati equation (DRE)

Ṗ(t) = −P(t)A(t) − A'(t)P(t) − Q(t) + P(t)B(t)R⁻¹(t)B'(t)P(t)

with final condition P(tf) = F(tf).

2. Solve for the optimal state x*(t) from

ẋ*(t) = [A(t) − B(t)R⁻¹(t)B'(t)P(t)]x*(t)

with initial condition x(t0) = x0.

3. Obtain the optimal control u*(t) as

u*(t) = −K(t)x*(t), where K(t) = R⁻¹(t)B'(t)P(t)

4. Obtain the optimal performance index from

J* = (1/2)x*'(t)P(t)x*(t)
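Step 1 is typically carried out numerically by integrating the DRE backward from P(tf) = F(tf). A minimal sketch, using the data of the worked example later in this chapter (the function name dre and the solver settings are arbitrary choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-2.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.array([[2.0, 3.0], [3.0, 5.0]])
R = np.array([[0.25]])
F = np.array([[1.0, 0.5], [0.5, 2.0]])
t0, tf = 0.0, 5.0
Rinv = np.linalg.inv(R)

def dre(t, p):
    # Matrix DRE, flattened so solve_ivp can handle it.
    P = p.reshape(2, 2)
    dP = -P @ A - A.T @ P - Q + P @ B @ Rinv @ B.T @ P
    return dP.ravel()

# Integrating over (tf, t0) runs the DRE backward from P(tf) = F.
sol = solve_ivp(dre, (tf, t0), F.ravel(), rtol=1e-10, atol=1e-12)
P0 = sol.y[:, -1].reshape(2, 2)
assert np.allclose(P0, P0.T, atol=1e-6)       # P(t) stays symmetric
assert np.all(np.linalg.eigvalsh(P0) > 0)     # and positive definite
print(np.round(P0, 4))
```

Over this fairly long horizon, P(t0) has essentially settled to the steady-state (infinite-horizon) Riccati matrix computed in Sec. 3.5.
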
3.2 Salient Features

1. Riccati Coefficient: The Riccati coefficient matrix P(t) is a time-varying matrix which depends upon the system matrices A(t) and B(t), the performance index (design) matrices Q(t), R(t) and F(tf), and the terminal time tf; P(t) does not depend upon the initial state x(t0) of the system.

2. P(t) is symmetric, and hence the n × n matrix DRE represents a system of n(n + 1)/2 first order, nonlinear, time-varying, ordinary differential equations.

3. Optimal Control: The optimal control u*(t) is a minimum (maximum) if the control weighting matrix R(t) is positive definite (negative definite).

4. Optimal State: Using the optimal control u*(t) in the state equation, we have

ẋ*(t) = [A(t) − B(t)R⁻¹(t)B'(t)P(t)]x*(t) = G(t)x*(t)

The solution of this state differential equation along with the initial condition x(t0) gives the optimal state x*(t). Note that there is no stability requirement on the closed-loop matrix G(t) as long as we are considering a finite final time (tf) system.

5. Optimal Cost: The minimum cost J* is given by

J* = (1/2)x*'(t)P(t)x*(t) for all t ∈ [t0, tf]

where P(t) is the solution of the matrix DRE and x*(t) is the solution of the closed-loop optimal system.

6. Definiteness of the Matrix P(t): Since F(tf) is positive semidefinite and P(tf) = F(tf), we can easily say that P(tf) is positive semidefinite. We can argue that P(t) is positive definite for all t ∈ [t0, tf). Suppose P(t) were not positive definite for some t = ts < tf; then there would exist a corresponding state x*(ts) such that the cost (1/2)x*'(ts)P(ts)x*(ts) ≤ 0, which violates the fact that the minimum cost must be a positive quantity. Hence P(t) is positive definite for all t ∈ [t0, tf), and since we already know that P(t) is symmetric, it is a positive definite, symmetric matrix.

7.
Computation of Matrix DRE: Under some conditions we can get an analytical solution for the nonlinear matrix DRE. In general, however, we solve the matrix DRE by integrating backwards from its known final condition.

8. Independence of the Riccati Coefficient Matrix P(t): The matrix P(t) is independent of the optimal state x*(t). Once the system and the cost are specified, that is, once we are given the plant matrices A(t) and B(t) and the performance index matrices F(tf), Q(t), and R(t), we can compute the matrix P(t) independently, before the optimal system operates in the forward direction from its initial condition. Typically, we compute (offline) the matrix P(t) backward from t = tf to t = t0, store the values, and feed them to the controller while the system operates in the forward direction in the interval t ∈ [t0, tf].

9. Implementation of the Optimal Control: The block diagram implementing the closed-loop optimal controller (CLOC) is shown in Fig. 3.1. The figure shows clearly that the CLOC gets its values of P(t) externally, after solving the matrix DRE backward in time from t = tf to t = t0; hence the closed-loop optimal control configuration cannot be computed on-line.

10. Linear Optimal Control: The optimal feedback control u*(t) is given by u*(t) = −K(t)x*(t), where the Kalman gain K(t) = R⁻¹(t)B'(t)P(t). This optimal control is linear in the state x*(t), which is one of the nice features of the optimal control of linear systems with quadratic cost
functionals. Also, note that the negative feedback in the optimal control relation emerged from the theory of optimal control and was not introduced intentionally in our development.

Figure 3.1: Closed-Loop Optimal Control Implementation

11. Controllability: We do not need a controllability condition on the system for implementing the optimal feedback control as long as we are dealing with a finite time (tf) system, because the contribution of any uncontrollable states (even unstable ones) to the cost function is still a finite quantity. However, if we consider an infinite time interval, we certainly need the controllability condition.

3.3 LQR System for General Performance Index

Consider a linear, time-varying plant described by

ẋ(t) = A(t)x(t) + B(t)u(t)

with a cost functional containing a state-control cross term,

J = (1/2)x'(tf)F(tf)x(tf) + (1/2) ∫_{t0}^{tf} [x'(t)Q(t)x(t) + 2x'(t)S(t)u(t) + u'(t)R(t)u(t)] dt
  = (1/2)x'(tf)F(tf)x(tf) + (1/2) ∫_{t0}^{tf} [x'(t) u'(t)] [Q(t) S(t); S'(t) R(t)] [x(t); u(t)] dt

where S(t) is the state-control weighting matrix. The matrix differential Riccati equation becomes

Ṗ(t) = −P(t)A(t) − A'(t)P(t) − Q(t) + [P(t)B(t) + S(t)]R⁻¹(t)[B'(t)P(t) + S'(t)]

with the final condition on P(t) as P(tf) = F(tf). The optimal control is then given by

u(t) = −R⁻¹(t)[B'(t)P(t) + S'(t)]x(t)

When S(t) is set to zero in the previous analysis, we recover the results of Sec. 3.1.
Example: Statement of the Problem:

1. Plant: ẋ1(t) = x2(t), ẋ2(t) = −2x1(t) + x2(t) + u(t)
2. Performance index:

J = (1/2)[x1²(5) + x1(5)x2(5) + 2x2²(5)] + (1/2) ∫_0^5 [2x1²(t) + 6x1(t)x2(t) + 5x2²(t) + 0.25u²(t)] dt

3. Boundary conditions: [x1(0) x2(0)]' = [2 −3]'

Solution of the Problem:

1. Solve the matrix differential Riccati equation. Here

A = [0 1; −2 1], B = [0; 1], F(tf) = [1 0.5; 0.5 2], Q = [2 3; 3 5], R = 1/4, t0 = 0, tf = 5

Let P(t) be the 2 × 2 symmetric matrix

P(t) = [p11(t) p12(t); p12(t) p22(t)]

Substituting into the matrix DRE

Ṗ(t) = −P(t)A − A'P(t) − Q + P(t)BR⁻¹B'P(t)

and simplifying, we get the scalar equations

ṗ11(t) = 4p12²(t) + 4p12(t) − 2
ṗ12(t) = −p11(t) − p12(t) + 2p22(t) + 4p12(t)p22(t) − 3
ṗ22(t) = −2p12(t) − 2p22(t) + 4p22²(t) − 5

with the final condition

P(5) = F(tf) = [1 0.5; 0.5 2]

Figure 3.2: Riccati Coefficients

2. Solve for the optimal state x*(t) from

ẋ*(t) = [A − BR⁻¹B'P(t)]x*(t)

with initial condition x(0) = [2 −3]'.

Figure 3.3: Optimal States

3. Obtain the optimal control u*(t) as

u*(t) = −K(t)x*(t), where K(t) = R⁻¹B'P(t)

Figure 3.4: Optimal Control
4. Obtain the optimal performance index from

J* = (1/2)x*'(t)P(t)x*(t)

Figure 3.5: Optimal Performance Index

Figure 3.6: Closed-Loop Optimal Control System

3.4 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞))

This problem cannot always be solved without some special conditions. For example, if any one of the states is uncontrollable and/or unstable, the corresponding performance measure J becomes infinite and makes no physical sense. Thus, we need to impose the condition that the system be completely controllable.

3.4.1 Statement of the Problem

1. Given the plant as ẋ(t) = A(t)x(t) + B(t)u(t)
2. the performance index as

J = (1/2) ∫_{t0}^{∞} [x'(t)Q(t)x(t) + u'(t)R(t)u(t)] dt

3. and the boundary conditions as x(t0) = x0, x(∞) free,
4. find the optimal control, state and performance index.

3.4.2 Solution of the Problem

1. Solve the matrix differential Riccati equation

dP̂(t)/dt = −P̂(t)A(t) − A'(t)P̂(t) − Q(t) + P̂(t)B(t)R⁻¹(t)B'(t)P̂(t)

with final condition P̂(t = tf) = 0, where P̂(t) = lim_{tf→∞} P(t).

2. Solve for the optimal state x*(t) from

ẋ*(t) = [A(t) − B(t)R⁻¹(t)B'(t)P̂(t)]x*(t)

with initial condition x(t0) = x0.

3. Obtain the optimal control u*(t) as

u*(t) = −K̂(t)x*(t), where K̂(t) = R⁻¹(t)B'(t)P̂(t)

4. Obtain the optimal performance index from

J* = (1/2)x*'(t)P̂(t)x*(t)

3.5 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant Case (Closed-Loop Optimal Control with tf = ∞ and Free x(∞))

Some implications of the time-invariance and the infinite final time are as follows.

1. The infinite time interval case is considered for the following reasons: (a) we wish to make sure that the state regulator stays near the zero state after the initial transient, and (b) we want to include any special case of large final time.

2. With an infinite final-time interval, including a final (terminal) cost makes no practical sense. Hence, the terminal cost term involving F(tf) does not appear in the cost functional.

3. With an infinite final-time interval, the system has to be completely controllable. Recall that this controllability condition requires that the controllability matrix

[B AB ... A^{n−1}B]

be nonsingular, i.e., contain n linearly independent column vectors. The controllability requirement guarantees that the optimal cost is finite. On the other hand, if the system is not controllable and some or all of the uncontrollable states are unstable, then the cost functional is infinite, since the control interval is infinite.
In such situations, we cannot distinguish optimal control from the other controls. Alternatively, we can assume that the system is completely stabilizable. Page 30 of 83
3.5.1 Statement of the Problem

1. Given the plant as ẋ(t) = Ax(t) + Bu(t),
2. the performance index as

J = (1/2) ∫_{t0}^{∞} [x'(t)Qx(t) + u'(t)Ru(t)] dt

3. and the boundary conditions as x(t0) = x0, x(∞) = 0,
4. find the optimal control, state and performance index.

3.5.2 Solution of the Problem

1. Solve the matrix algebraic Riccati equation (ARE)

−P̄A − A'P̄ − Q + P̄BR⁻¹B'P̄ = 0

where P̄ = lim_{tf→∞} P(t).

2. Solve for the optimal state x*(t) from

ẋ*(t) = [A − BR⁻¹B'P̄]x*(t)

with initial condition x(t0) = x0. Note: the original system matrix A may be unstable, but the optimal closed-loop matrix [A − BR⁻¹B'P̄] is stable (under the conditions discussed in Sec. 3.6).

3. Obtain the optimal control u*(t) as

u*(t) = −K̄x*(t), where K̄ = R⁻¹B'P̄

4. Obtain the optimal performance index from

J* = (1/2)x*'(t)P̄x*(t)

Figure 3.7: Implementation of the Closed-Loop Optimal Control: Infinite Final Time
Example: Statement of the Problem:

1. Plant: ẋ1(t) = x2(t), ẋ2(t) = −2x1(t) + x2(t) + u(t)
2. Performance index:

J = (1/2) ∫_0^∞ [2x1²(t) + 6x1(t)x2(t) + 5x2²(t) + 0.25u²(t)] dt

3. Boundary conditions: [x1(0) x2(0)]' = [2 −3]'

Solution of the Problem:

1. Solve the matrix algebraic Riccati equation. Here

A = [0 1; −2 1], B = [0; 1], F(tf) = 0, Q = [2 3; 3 5], R = 1/4, t0 = 0, tf = ∞

Let P̄ be the 2 × 2 symmetric matrix

P̄ = [p̄11 p̄12; p̄12 p̄22]

Substituting into

−P̄A − A'P̄ − Q + P̄BR⁻¹B'P̄ = 0

and simplifying, we get the scalar equations

0 = 4p̄12² + 4p̄12 − 2
0 = −p̄11 − p̄12 + 2p̄22 + 4p̄12p̄22 − 3
0 = −2p̄12 − 2p̄22 + 4p̄22² − 5

Therefore,

P̄ = [1.7363 0.3660; 0.3660 1.4729]

2. Solve for the optimal state x*(t) from

ẋ*(t) = [A − BR⁻¹B'P̄]x*(t)

with initial condition x(0) = [2 −3]'.

Figure 3.8: Optimal States

3. Obtain the optimal control u*(t) as

u*(t) = −K̄x*(t), where K̄ = R⁻¹B'P̄

Figure 3.9: Optimal Control

4. Obtain the optimal performance index from

J* = (1/2)x*'(t)P̄x*(t)

Figure 3.10: Optimal Performance Index
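The hand computation above can be cross-checked with SciPy's algebraic Riccati solver:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-2.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.array([[2.0, 3.0], [3.0, 5.0]])
R = np.array([[0.25]])

# solve_continuous_are solves A'P + PA − P B R⁻¹ B' P + Q = 0,
# which is the ARE above with both sides negated.
P = solve_continuous_are(A, B, Q, R)
print(np.round(P, 4))               # [[1.7363 0.366 ] [0.366  1.4729]]

K = np.linalg.inv(R) @ B.T @ P      # optimal gain K̄ = R⁻¹ B' P̄
poles = np.linalg.eigvals(A - B @ K)
print(np.sort(poles.real))          # ≈ [-4.0326, -0.859]
```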
The original plant is unstable (open-loop eigenvalues at 0.5 ± j1.3229), whereas the optimal closed-loop system is stable (eigenvalues at −4.0326 and −0.8590).

Figure 3.11: Closed-Loop Optimal Control System

3.6 Stability Issues of Time-Invariant Regulator

Some stability remarks on the infinite-time regulator system:

1. The closed-loop optimal system is not always stable, especially when the original plant is unstable and the unstable states are not weighted in the PI. To prevent such a situation, we need the assumption that the pair [A, C] is detectable, where C is any matrix such that C'C = Q; this guarantees the stability of the closed-loop optimal system. The assumption essentially ensures that all potentially unstable states show up in the x'(t)Qx(t) part of the performance measure.

2. The Riccati coefficient matrix P̄ is positive definite if and only if [A, C] is completely observable.

3. The detectability condition is necessary for stability of the closed-loop optimal system.

4. Thus, both detectability and stabilizability conditions are necessary for the existence of a stable closed-loop system.

3.7 Equivalence of Open-Loop and Closed-Loop Optimal Controls

We present a simple example to show an interesting property: an optimal control problem can be solved and implemented in an open-loop optimal control (OLOC) configuration or a closed-loop optimal control (CLOC) configuration. We also demonstrate the simplicity of the CLOC.

Example: Statement of the Problem:

1. Plant: ẋ(t) = −3x(t) + u(t)
2. Performance index: J = ∫_0^∞ [x²(t) + u²(t)] dt
3. Boundary conditions: x(0) = 1, x(∞) = 0

Open-Loop Optimal Control Solution:

1. Form the Pontryagin H function

V(x(t), u(t)) = x²(t) + u²(t)
f(x(t), u(t)) = −3x(t) + u(t)
H = V(x(t), u(t)) + λ(t)f(x(t), u(t)) = x²(t) + u²(t) + λ(t)[−3x(t) + u(t)]

2. Get u*(t):

∂H/∂u = 0 → 2u*(t) + λ*(t) = 0 → u*(t) = −(1/2)λ*(t)

3. Get H*:

H* = x*²(t) − (1/4)λ*²(t) − 3λ*(t)x*(t)

4. Obtain the state and costate equations

ẋ*(t) = +(∂H/∂λ)* = −(1/2)λ*(t) − 3x*(t)
λ̇*(t) = −(∂H/∂x)* = −2x*(t) + 3λ*(t)

Eliminating λ*(t), we have the optimal state and costate as

ẍ*(t) − 10x*(t) = 0 → x*(t) = C1e^{√10 t} + C2e^{−√10 t}
λ*(t) = 2[−ẋ*(t) − 3x*(t)] = −2C1(√10 + 3)e^{√10 t} + 2C2(√10 − 3)e^{−√10 t}

5. Obtain the optimal control: From the boundary conditions,

x(∞) = 0 → C1 = 0
x(0) = 1 → C2 = 1

so that

x*(t) = e^{−√10 t}
λ*(t) = 2(√10 − 3)e^{−√10 t}
u*(t) = −(√10 − 3)e^{−√10 t}

Figure 3.12: Optimal Control and State
Closed-Loop Optimal Control Solution:

1. Solve the algebraic Riccati equation. Here (with the 1/2 convention of Sec. 3.5)

A = −3, B = 1, F(tf) = 0, Q = 2, R = 2, t0 = 0, tf = ∞

Let P̄ be the 1 × 1 matrix P̄ = p̄. The ARE

−P̄A − A'P̄ − Q + P̄BR⁻¹B'P̄ = 0

becomes

−p̄(−3) − (−3)p̄ − 2 + p̄(1)(1/2)(1)p̄ = 0 → p̄ = −6 + 2√10

taking the positive root.

2. Solve for the optimal state x*(t) from

ẋ*(t) = [A − BR⁻¹B'P̄]x*(t) = −√10 x*(t) → x*(t) = C1e^{−√10 t}

x(0) = 1 → C1 = 1 → x*(t) = e^{−√10 t}

3. Obtain the optimal control u*(t) as

u*(t) = −K̄x*(t) = −R⁻¹B'P̄x*(t) = −(√10 − 3)e^{−√10 t}

Figure 3.13: Optimal Control and State

4. Obtain the optimal performance index from

J*(t0) = (1/2)x*'(t0)P̄x*(t0) = −3 + √10

From this example, it is clear that

1. from the implementation point of view, the closed-loop optimal controller (the constant gain √10 − 3) is much simpler than the open-loop optimal controller (√10 − 3)e^{−√10 t}, which is an exponential function of time, and
2. with a closed-loop configuration, all the advantages of conventional feedback are incorporated.
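A quick symbolic check of the scalar CLOC computation (recall the plant is ẋ = −3x + u, so A = −3):

```python
import sympy as sp

p = sp.symbols('p', positive=True)
A, b, q, r = -3, 1, 2, 2          # plant ẋ = −3x + u with Q = 2, R = 2

# Scalar ARE: −2 p A − q + p² b²/r = 0, keeping the positive root.
pbar = sp.solve(sp.Eq(-2*p*A - q + p**2*b**2/r, 0), p)[0]
Kbar = b * pbar / r               # Kalman gain K̄ = R⁻¹ B' p̄

assert sp.simplify(pbar - (-6 + 2*sp.sqrt(10))) == 0
assert sp.simplify(Kbar - (sp.sqrt(10) - 3)) == 0
assert sp.simplify((A - b*Kbar) + sp.sqrt(10)) == 0   # closed-loop pole −√10
print(pbar, Kbar)
```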
Figure 3.14: (a) Open-Loop Optimal Controller (OLOC) and (b) Closed-Loop Optimal Controller (CLOC)
Chapter 4

Linear Quadratic Optimal Control Systems II (Tracking Closed-Loop Optimal Control)

4.1 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control with Fixed tf and Free x(tf))

4.1.1 Statement of the Problem

1. Given the plant as

ẋ(t) = A(t)x(t) + B(t)u(t)
y(t) = C(t)x(t)
e(t) = z(t) − y(t), where z(t) is the desired output,

2. the performance index as

J = (1/2)e'(tf)F(tf)e(tf) + (1/2) ∫_{t0}^{tf} [e'(t)Q(t)e(t) + u'(t)R(t)u(t)] dt

3. and the boundary conditions as x(t0) = x0, x(tf) free,
4. find the optimal control, state and performance index.

4.1.2 Solution of the Problem

1. Solve the matrix differential Riccati equation

Ṗ(t) = −P(t)A(t) − A'(t)P(t) − V(t) + P(t)E(t)P(t)

with final condition P(tf) = C'(tf)F(tf)C(tf), and the non-homogeneous vector differential equation

ġ(t) = −[A(t) − E(t)P(t)]'g(t) − W(t)z(t)

with final condition g(tf) = C'(tf)F(tf)z(tf), where

E(t) = B(t)R⁻¹(t)B'(t)
V(t) = C'(t)Q(t)C(t)
W(t) = C'(t)Q(t)

2. Solve for the optimal state x*(t) from

ẋ*(t) = [A(t) − E(t)P(t)]x*(t) + E(t)g(t)

with initial condition x(t0) = x0.

3. Obtain the optimal control u*(t) as

u*(t) = −K(t)x*(t) + R⁻¹(t)B'(t)g(t), where K(t) = R⁻¹(t)B'(t)P(t)

4. Obtain the optimal performance index from

J* = (1/2)x*'(t)P(t)x*(t) − x*'(t)g(t) + h(t)

where h(t) is the solution of

ḣ(t) = (1/2)g'(t)E(t)g(t) − (1/2)z'(t)Q(t)z(t)

with final condition h(tf) = (1/2)z'(tf)F(tf)z(tf).

Figure 4.1: Implementation of the Optimal Tracking System
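The backward sweep of Step 1 (P(t) together with g(t)) can be sketched for a scalar plant; all numbers below are illustrative assumptions, not taken from the text, and C = 1 is chosen so that V = Q:

```python
import numpy as np
from scipy.integrate import solve_ivp

A, B, C = -1.0, 1.0, 1.0       # scalar plant (assumption)
Q, R, F = 1.0, 0.1, 0.0        # weights (assumption)
t0, tf = 0.0, 5.0
z = lambda t: 1.0              # constant desired output (assumption)
E, V, W = B * (1.0 / R) * B, C * Q * C, C * Q

def sweep(t, y):
    P, g = y
    dP = -P * A - A * P - V + P * E * P     # scalar tracking DRE
    dg = -(A - E * P) * g - W * z(t)        # non-homogeneous g-equation
    return [dP, dg]

# Final conditions P(tf) = C·F·C = 0 and g(tf) = C·F·z(tf) = 0;
# integrating from tf down to t0 runs the sweep backward in time.
sol = solve_ivp(sweep, (tf, t0), [0.0, 0.0], rtol=1e-10, atol=1e-12)
P0, g0 = sol.y[0, -1], sol.y[1, -1]
print(P0, g0)  # both settle to positive steady values over this horizon
```

Note how z(t) drives only the g-equation, consistent with the salient features below: the Riccati sweep itself never sees the desired output.
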
  • 41. 4.1.3 Salient Features of Tracking System 1. Riccati Coefficient Matrix P(t): We note that the desired output z(t) has no influence on the matrix differential Riccati equation and its boundary condition. This means that once the problem is specified in terms of the final time tf, the plant matrices A(t), B(t), and C(t), and the cost functional matrices F(tf ), Q(t), and R(t), the matrix function P(t) is completely determined. 2. Closed Loop Eigenvalues: The closed-loop system matrix [A(t)−B(t)R−I(t)B (t)P(t)] is again independent of the desired output z(t). This means the eigenvalues of the closed-loop, optimal tracking system are independent of the desired output z(t). 3. Tracking and Regulator Systems: The main difference between the optimal output tracking system and the optimal state regulator system is in the vector g(t). As shown in Fig.4.1, one can think of the desired output z(t) as the forcing function of the closed-loop optimal system which generates the signal g(t). 4. Also, note that if we make C(t) = I(t), then V (t) = Q(t). Thus, the matrix DRE in Subsec.4.1.2 becomes the same matrix DRE in Subsec.3.1.2. Example 1: Statement of the Problem: 1. Plant ˙x1(t) = x2(t) ˙x2(t) = −2x1(t) + x2(t) + u(t) y(t) = x(t) 2. Performance index J = [1 − x1(tf )] 2 + tf t0 [1 − x1(t)] 2 + 0.002u2 (t) dt 3. Boundary conditions x(0) =  −0.5 0   , tf = 20, x(tf ) is free, controls and states are unbounded It is required to keep the state x1(t) close to 1. Solution of the Problem: 1. Solve the matrix differential Riccati equation The state x1(t) is to be kept close to the reference input z1(t) = 1 and since there is no condition on state x2(t), one can choose arbitrarily as z2(t) = 0. A =   0 1 −2 −3   ; B =  0 1   ; C = I2; z(t) =  1 0   ; Q =  2 0 0 0   = F(tf ); R = 0.004 Let P(t) be the 2 × 2 symmetric matrix P(t) =  p11(t) p12(t) p12(t) p22(t)   Page 40 of 83
  • 42. The solution of the matrix DRE ˙P(t) = −P(t)A(t) − A (t)P(t) − Q(t) + P(t)E(t)P(t) − V (t) E(t) = B(t)R−1 (t)B (t) =  0 0 0 250   V (t) = C (t)Q(t)C(t) =  2 0 0 0   W(t) = C (t)Q(t) =  2 0 0 0     ˙p11(t) ˙p12(t) ˙p12(t) ˙p22(t)   = −  p11(t) p12(t) p12(t) p22(t)     0 1 −2 −3   −  0 −2 1 −3    p11(t) p12(t) p12(t) p22(t)   +  p11(t) p12(t) p12(t) p22(t)    0 0 0 250    p11(t) p12(t) p12(t) p22(t)   −  2 0 0 0   Simplifying the matrix DRE, ˙p11(t) = 250p2 12(t) + 4p12(t) − 2 ˙p12(t) = 250p12(t)p22(t) − p11(t) + 3p12(t) + 2p22(t) ˙p22(t) = 250p2 22(t) − 2p12(t) + 6p22(t) with final condition P(tf ) =  p11(tf ) p12(tf ) p12(tf ) p22(tf )   = C (tf )F(tf )C(tf ) =  2 0 0 0   Figure 4.2: Riccati Coefficients Let g(t) be the 2 × 1 vector as g(t) =  g1(t) g2(t)   The solution of vector equation ˙g(t) = − [A(t) − E(t)P(t)] g(t) − W(t)z(t)  g1(t) g2(t)   = −      0 1 −2 −3   −  0 0 0 250    p11(t) p12(t) p12(t) p22(t)       g1(t) g2(t)   −  2 0 0 0    1 0   Page 41 of 83
  • 43. Simplifying the equation ˙g1(t) = [250p12(t) + 2] g2(t) − 2 ˙g2(t) = −g1(t) + [3 + 250p22(t)] g2(t) with final condition g(tf ) = C (tf )F(tf )z(tf ) =  2 0   Figure 4.3: g(t) Coefficients 2. Solve the optimal state x∗ (t) from ˙x∗ (t) = [A(t) − E(t)P(t)] x∗ (t) + E(t)g(t) with initial condition x(t0) = −0.5 0 Figure 4.4: Optimal States 3. Obtain the optimal control u∗ (t) as u∗ (t) = −K(t)x∗ (t) + R−1 (t)B (t)g(t); where, K(t) = R−1 (t)B (t)P(t) Page 42 of 83
  • 44. Figure 4.5: Optimal Control 4. Obtain the optimal performance index from J∗ (t0) = 1 2 x∗ (t0)P(t0)x∗ (t0) − x∗ (t0)g(t0) + h(t0) = 42.9092 Figure 4.6: Optimal Performance Index where h(t) is the solution of ˙h(t) = − 1 2 g (t)E(t)g(t) − 1 2 z (t)Q(t)z(t) with final condition h(tf ) = −z (tf )P(tf )z(tf ) = 2 Figure 4.7: h(t) Solution Page 43 of 83
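The backward computations in step 1 of this example can be reproduced numerically. The sketch below (variable and function names are mine, not from the text) integrates the matrix DRE and the g(t) equation backward from tf = 20 with SciPy, using the A, B, Q, R, F, z values of the solution above (C = I, so V = W = Q).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Data of tracking Example 1; E = B R^-1 B' = [[0, 0], [0, 250]].
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.array([[2.0, 0.0], [0.0, 0.0]])
F = Q.copy()
R = 0.004
z = np.array([1.0, 0.0])
E = B @ B.T / R

def backward(t, y):
    P = y[:4].reshape(2, 2)
    g = y[4:]
    dP = -P @ A - A.T @ P + P @ E @ P - Q     # matrix DRE (V = Q here)
    dg = -(A - E @ P).T @ g - Q @ z           # g(t) equation (W = Q here)
    return np.concatenate([dP.ravel(), dg])

tf = 20.0
yf = np.concatenate([F.ravel(), F @ z])       # P(tf) = F, g(tf) = F z = [2, 0]
sol = solve_ivp(backward, [tf, 0.0], yf, rtol=1e-8, atol=1e-10)
P0 = sol.y[:4, -1].reshape(2, 2)
g0 = sol.y[4:, -1]
print(P0)   # Riccati matrix at t0, cf. Figure 4.2
print(g0)   # g at t0, cf. Figure 4.3
```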
  • 45. Example 2: Statement of the Problem: 1. Plant ˙x1(t) = x2(t) ˙x2(t) = −2x1(t) + x2(t) + u(t) y(t) = x(t) 2. Performance index J = tf t0 [2t − x1(t)] 2 + 0.02u2 (t) dt 3. Boundary conditions x(0) =  −1 0   , tf = 20, x(tf ) is free, controls and states are unbounded the state x1(t) track a ramp function z1(t) = 2t. Solution of the Problem: 1. Solve the matrix differential Riccati equation As there is no condition on state x2(t), one can choose arbitrarily as z2(t) = 0. A =   0 1 −2 −3   ; B =  0 1   ; C = I2; z(t) =  2t 0   ; Q =  2 0 0 0   ; R = 0.04; F(tf ) = 0 Let P(t) be the 2 × 2 symmetric matrix P(t) =  p11(t) p12(t) p12(t) p22(t)   The solution of the matrix DRE ˙P(t) = −P(t)A(t) − A (t)P(t) − Q(t) + P(t)E(t)P(t) − V (t) E(t) = B(t)R−1 (t)B (t) =  0 0 0 25   V (t) = C (t)Q(t)C(t) =  2 0 0 0   W(t) = C (t)Q(t) =  2 0 0 0     ˙p11(t) ˙p12(t) ˙p12(t) ˙p22(t)   = −  p11(t) p12(t) p12(t) p22(t)     0 1 −2 −3   −  0 −2 1 −3    p11(t) p12(t) p12(t) p22(t)   +  p11(t) p12(t) p12(t) p22(t)    0 0 0 25    p11(t) p12(t) p12(t) p22(t)   −  2 0 0 0   Simplifying the matrix DRE, ˙p11(t) = 25p2 12(t) + 4p12(t) − 2 ˙p12(t) = 25p12(t)p22(t) − p11(t) + 3p12(t) + 2p22(t) ˙p22(t) = 25p2 22(t) − 2p12(t) + 6p22(t) Page 44 of 83
  • 46. with final condition P(tf ) =  p11(tf ) p12(tf ) p12(tf ) p22(tf )   = C (tf )F(tf )C(tf ) =  0 0 0 0   Figure 4.8: Riccati Coefficients Let g(t) be the 2 × 1 vector as g(t) =  g1(t) g2(t)   The solution of vector equation ˙g(t) = − [A(t) − E(t)P(t)] g(t) − W(t)z(t)  g1(t) g2(t)   = −      0 1 −2 −3   −  0 0 0 25    p11(t) p12(t) p12(t) p22(t)       g1(t) g2(t)   −  2 0 0 0    2t 0   Simplifying the equation ˙g1(t) = [25p12(t) + 2] g2(t) − 4t ˙g2(t) = −g1(t) + [3 + 25p22(t)] g2(t) with final condition g(tf ) = C (tf )F(tf )z(tf ) =  0 0   Figure 4.9: g(t) Coefficients Page 45 of 83
  • 47. 2. Solve the optimal state x∗ (t) from ˙x∗ (t) = [A(t) − E(t)P(t)] x∗ (t) + E(t)g(t) with initial condition x(t0) = −1 0 Figure 4.10: Optimal States 3. Obtain the optimal control u∗ (t) as u∗ (t) = −K(t)x∗ (t) + R−1 (t)B (t)g(t); where, K(t) = R−1 (t)B (t)P(t) Figure 4.11: Optimal Control 4. Obtain the optimal performance index from J∗ (t0) = 1 2 x∗ (t0)P(t0)x∗ (t0) − x∗ (t0)g(t0) + h(t0) = 2.0450 × 104 Page 46 of 83
  • 48. Figure 4.12: Optimal Performance Index where h(t) is the solution of ˙h(t) = − 1 2 g (t)E(t)g(t) − 1 2 z (t)Q(t)z(t) with final condition h(tf ) = −z (tf )P(tf )z(tf ) = 0 Figure 4.13: h(t) Solution Page 47 of 83
4.2 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control with Infinite tf and Free x(∞))
4.2.1 Statement of the Problem
1. Given the plant as
   ẋ(t) = A(t)x(t) + B(t)u(t)
   y(t) = C(t)x(t)
   e(t) = z(t) - y(t);  z(t) is the desired output,
2. the performance index as
   J = (1/2) ∫[t0,tf] [e'(t)Q(t)e(t) + u'(t)R(t)u(t)] dt,  with tf → ∞,
3. and the boundary conditions as x(t0) = x0, x(∞) is free,
4. find the optimal control, state and performance index.
4.2.2 Solution of the Problem
1. Solve the matrix differential Riccati equation
   dP̂(t)/dt = -P̂(t)A(t) - A'(t)P̂(t) + P̂(t)E(t)P̂(t) - V(t)
with final condition P̂(tf) = 0, where P̂(t) = lim(tf→∞) P(t), and the non-homogeneous vector differential equation
   dĝ(t)/dt = -[A(t) - E(t)P̂(t)]' ĝ(t) - W(t)z(t)
with final condition ĝ(tf) = 0, where
   E(t) = B(t)R^-1(t)B'(t);  V(t) = C'(t)Q(t)C(t);  W(t) = C'(t)Q(t)
2. Solve the optimal state x*(t) from
   ẋ*(t) = [A(t) - E(t)P̂(t)] x*(t) + E(t)ĝ(t)
with initial condition x(t0) = x0
3. Obtain the optimal control u*(t) as
   u*(t) = -K̂(t)x*(t) + R^-1(t)B'(t)ĝ(t);  where K̂(t) = R^-1(t)B'(t)P̂(t)
4. Obtain the optimal performance index from
   J* = (1/2) x*'(t)P̂(t)x*(t) - x*'(t)ĝ(t) + ĥ(t)
where ĥ(t) is the solution of
   dĥ(t)/dt = -(1/2) ĝ'(t)E(t)ĝ(t) - (1/2) z'(t)Q(t)z(t)
with final condition ĥ(tf) = 0

4.3 Procedure Summary of Linear Quadratic Tracking System (Closed-Loop Optimal Control with Infinite-Horizon Time-Invariant Plant and Free x(∞))
4.3.1 Statement of the Problem
1. Given the plant as
   ẋ(t) = Ax(t) + Bu(t)
   y(t) = Cx(t)
   e(t) = z(t) - y(t);  z(t) is the desired output,
2. the performance index as
   J = (1/2) ∫[t0,tf] [e'(t)Qe(t) + u'(t)Ru(t)] dt,  with tf → ∞,
3. and the boundary conditions as x(t0) = x0, x(∞) is free,
4. find the optimal control, state and performance index.
4.3.2 Solution of the Problem
1. Solve the matrix algebraic Riccati equation
   -P̄A - A'P̄ + P̄EP̄ - V = 0,  where P̄ = lim(tf→∞) P(t),
and the non-homogeneous vector differential equation
   dḡ(t)/dt = -[A - EP̄]' ḡ(t) - Wz(t)
with final condition ḡ(tf) = 0, where
   E = BR^-1B';  V = C'QC;  W = C'Q
2. Solve the optimal state x*(t) from
   ẋ*(t) = [A - EP̄] x*(t) + Eḡ(t)
with initial condition x(t0) = x0
3. Obtain the optimal control u*(t) as
   u*(t) = -K̄x*(t) + R^-1 B' ḡ(t);  where K̄ = R^-1 B' P̄
4. Obtain the optimal performance index from
   J* = (1/2) x*'(t)P̄x*(t) - x*'(t)ḡ(t) + h̄(t)
where h̄(t) is the solution of
   dh̄(t)/dt = -(1/2) ḡ'(t)Eḡ(t) - (1/2) z'(t)Qz(t)
with final condition h̄(tf) = 0

4.4 Procedure Summary of Fixed-End-Point Regulator System (Closed-Loop Optimal Control)
4.4.1 Statement of the Problem
1. Given the plant as
   ẋ(t) = A(t)x(t) + B(t)u(t),
2. the performance index as
   J = (1/2) ∫[t0,tf] [x'(t)Q(t)x(t) + u'(t)R(t)u(t)] dt,
3. find the optimal control, state and performance index.
4.4.2 Solution of the Problem
The solution of this problem depends on the boundary conditions.
1. If x(t0) ≠ 0 and x(tf) = 0
(a) Solve the inverse matrix differential Riccati equation
   Ṁ(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^-1(t)B'(t)
with final condition M(tf) = 0
(b) Solve the optimal state x*(t) from
   ẋ*(t) = [A(t) - B(t)R^-1(t)B'(t)M^-1(t)] x*(t)
(c) Obtain the optimal control u*(t) as
   u*(t) = -R^-1(t)B'(t)M^-1(t) x*(t)
(d) Obtain the optimal performance index J* from
   J* = (1/2) x*'(t)M^-1(t)x*(t)
2. If x(t0) = 0 and x(tf) ≠ 0
(a) Solve the inverse matrix differential Riccati equation
   Ṁ(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^-1(t)B'(t)
with initial condition M(t0) = 0, or final condition M(tf) = 0
(b) Solve the optimal state x*(t) from
   ẋ*(t) = [A(t) - B(t)R^-1(t)B'(t)M^-1(t)] x*(t)
(c) Obtain the optimal control u*(t) as
   u*(t) = -R^-1(t)B'(t)M^-1(t) x*(t)
(d) Obtain the optimal performance index J* from
   J* = (1/2) x*'(t)M^-1(t)x*(t)
3. If x(t0) ≠ 0 and x(tf) ≠ 0
(a) Solve the inverse matrix differential Riccati equation
   Ṁ(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^-1(t)B'(t)
and the transformation equation
   v̇(t) = M(t)Q(t)v(t) + A(t)v(t)
with initial conditions M(t0) = 0, v(t0) = x(t0), or with final conditions M(tf) = 0, v(tf) = x(tf)
(b) Solve the optimal state x*(t) from
   ẋ*(t) = [A(t) - B(t)R^-1(t)B'(t)M^-1(t)] x*(t) + B(t)R^-1(t)B'(t)M^-1(t)v(t)
(c) Obtain the optimal control u*(t) as
   u*(t) = -R^-1(t)B'(t)M^-1(t)[x*(t) - v(t)]
(d) Obtain the optimal performance index J* from
   J* = (1/2) x*'(t)M^-1(t)x*(t)
Example:
Statement of the Problem:
1. Plant
   ẋ(t) = a x(t) + b u(t)
2. Performance index
   J = (1/2) ∫[t0,tf] [q x²(t) + r u²(t)] dt
  • 53. 3. Boundary conditions x(t0) = x0; x(tf ) = 0 Solution of the Problem: 1. Solve the inverse matrix differential Riccati equation ˙m(t) = 2am(t) + m2 (t)q − b2 r with final condition m(tf ) = 0 so, the solution is m(t) = b2 r e−β(t−tf ) − eβ(t−tf ) (a + β)e−β(t−tf ) − (a − β)eβ(t−tf ) where, β = a2 + q b2 r 2. Solve the optimal state x∗ (t) from ˙x∗ (t) = A(t) − B(t)R−1 (t)B (t)M−1 (t) x∗ (t) = a − (a + β)e−β(t−tf ) − (a − β)eβ(t−tf ) e−β(t−tf ) − eβ(t−tf ) x∗ (t) 3. Obtain the optimal control u∗ (t) as u∗ (t) = −R−1 (t)B (t)M−1 (t)x∗ (t) = 1 b (a + β)e−β(t−tf ) − (a − β)eβ(t−tf ) e−β(t−tf ) − eβ(t−tf ) x∗ (t) 4. Obtain the optimal performance index J∗ from J∗ = 1 2 x∗ (t)M−1 (t)x∗ (t) 4.5 Procedure Summary of Regulator System with Prescribed Degree of Stability (Closed-Loop Optimal Control of Infi- nite Time-Invariant Systems) 4.5.1 Statement of the Problem 1. Given the plant as ˙x(t) = Ax(t) + Bu(t) 2. the performance index as J = 1 2 ∞ t0 e2αt [x (t)Qx(t) + u (t)Ru(t)] dt 3. and the boundary conditions as x(t0) = x0, x(∞) = 0 4. find the optimal control, state and performance index. Page 52 of 83
4.5.2 Solution of the Problem
1. Solve the matrix algebraic Riccati equation
   P̄(A + αI) + (A + αI)'P̄ + Q - P̄BR^-1B'P̄ = 0
2. Solve the optimal state x*(t) from
   ẋ*(t) = [A - BR^-1B'P̄] x*(t)
with initial condition x(t0) = x0
3. Obtain the optimal control u*(t) as
   u*(t) = -R^-1B'P̄ x*(t)
4. Obtain the optimal performance index from
   J* = (1/2) e^(2αt0) x*'(t0) P̄ x*(t0)
Note: The closed-loop optimal control system has eigenvalues with real parts less than -α. In other words, the state x*(t) approaches zero at least as fast as e^(-αt). We then say that the closed-loop optimal system has a degree of stability of at least α.
Example:
Statement of the Problem:
1. Plant
   ẋ(t) = -x(t) + u(t)
2. Performance index
   J = (1/2) ∫[t0,∞) e^(2αt) [x'(t)Qx(t) + u'(t)Ru(t)] dt
3. Boundary conditions
   x(0) = 1;  x(∞) = 0
Solution of the Problem:
1. Solve the matrix algebraic Riccati equation. With A = -1, B = 1, Q = 1, R = 1,
   P̄(A + αI) + (A + αI)'P̄ + Q - P̄BR^-1B'P̄ = 0   ⇒   p̄² - 2p̄(α - 1) - 1 = 0
   ⇒   p̄ = α - 1 + √((α - 1)² + 1)
2. Solve the optimal state x*(t) from
   ẋ*(t) = [A - BR^-1B'P̄] x*(t) = -[α + √((α - 1)² + 1)] x*(t)
3. Obtain the optimal control u*(t) as
   u*(t) = -R^-1B'P̄ x*(t) = -p̄ x*(t) = [1 - α - √((α - 1)² + 1)] x*(t)
4. Obtain the optimal performance index J* from
   J* = (1/2) e^(2αt0) x*'(t0) P̄ x*(t0)
It is easy to see that the closed-loop eigenvalue satisfies
   -α - √((α - 1)² + 1) < -α,
which shows the desired result that the optimal system has its eigenvalue to the left of -α.
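The prescribed-degree-of-stability design above amounts to handing a standard ARE solver the shifted matrix A + αI. A minimal sketch for the scalar example (A = -1, B = Q = R = 1); the value α = 0.5 and the variable names are my choices, not from the text.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

alpha = 0.5                                   # desired degree of stability (my choice)
A = np.array([[-1.0]]); B = np.array([[1.0]])
Q = np.array([[1.0]]);  R = np.array([[1.0]])

# Solve the ARE for the shifted plant A + alpha*I.
P = solve_continuous_are(A + alpha * np.eye(1), B, Q, R)
K = np.linalg.solve(R, B.T @ P)
pole = np.linalg.eigvals(A - B @ K).real[0]   # closed-loop eigenvalue

p_closed_form = alpha - 1 + np.sqrt((alpha - 1) ** 2 + 1)
print(P[0, 0], p_closed_form)                 # should agree
print(pole, -alpha)                           # pole lies to the left of -alpha
```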
  • 55. 4.6 Closed-Loop Controller Design Using Frequency-Domain (Kalman Equation in Frequency Domain) 4.6.1 Relation Between Open-Loop and Closed-Loop Consider a controllable, linear, time-invariant plant ˙x(t) = Ax(t) + Bu(t) Then, the open-loop characteristic polynomial of the system is ∆o(s) = |sI − A| and the optimal closed-loop characteristic polynomial is ∆c(s) = |sI − A + B ¯K| = |I + B ¯K[sI − A]−1 |.[sI − A] = |I + ¯K[sI − A]−1 B|∆o(s) This is a relation between the open-loop ∆o(s) and closed-loop ∆c(s) characteristic polynomials. From Fig.4.14, we note that 1. − ¯K[sI − A]−1 B is called the loop gain matrix, and 2. I + ¯K[sI − A]−1 B is termed return difference matrix. Figure 4.14: Optimal Closed-Loop Control in Frequency Domain 4.6.2 Statement of the Problem 1. Given the plant as ˙x(t) = Ax(t) + Bu(t) 2. the performance index as J = 1 2 ∞ t0 [x (t)Qx(t) + u (t)Ru(t)] dt 3. and the boundary conditions as x(t0) = x0, x(∞) = 0 4. find the optimal control assuming that [A, B] is stabilizable and [A, √ Q] is observable. 4.6.3 Solution of the Problem 1. Solve the Kalman equation in frequency domain B [−sI − A ]−1 Q[sI − A]−1 B + R = I + ¯K[−sI − A]−1 B R I + ¯K[sI − A]−1 B 2. Get the optimal feedback u∗ (t) u∗ (t) = −R−1 B ¯Px∗ (t) = −Kx∗ (t) Page 54 of 83
4.6.4 Gain Margin and Phase Margin
Rewrite the Kalman equation with s = jω as
   B'[-jωI - A']^-1 Q [jωI - A]^-1 B + R = [I + K̄[-jωI - A]^-1 B]' R [I + K̄[jωI - A]^-1 B]
The previous result can be viewed as
   M(jω) = W'(-jω) W(jω)
where
   W(jω) = R^(1/2) [I + K̄[jωI - A]^-1 B]
   M(jω) = R + B'[-jωI - A']^-1 Q [jωI - A]^-1 B
Note that M(jω) ≥ R > 0. Using Q = C'C, R = D'D = I and the notation W'(-jω)W(jω) = ||W(jω)||², the Kalman equation can be written as
   ||I + K̄[jωI - A]^-1 B||² = I + ||C[jωI - A]^-1 B||²
   ||I + K̄[sI - A]^-1 B||² = I + ||C[sI - A]^-1 B||²
This result can be used to find the optimal feedback matrix K̄ given the other quantities A, B, Q, R = I.
Example:
Statement of the Problem: Find the optimal feedback coefficients for the system
   ẋ1(t) = x2(t)
   ẋ2(t) = u(t)
and the performance measure
   J = (1/2) ∫[0,∞) [x1²(t) + x2²(t) + u²(t)] dt
Solution of the Problem: First it is easy to identify the various matrices as
   A = [0 1; 0 0];  B = [0; 1];  Q = [1 0; 0 1];  R = 1
Also, note that since Q = R = I, we have C = I and D = 1. Thus, the Kalman equation becomes
   B'[-sI - A']^-1 Q [sI - A]^-1 B + R = [I + K̄[-sI - A]^-1 B]' R [I + K̄[sI - A]^-1 B]
   1/s⁴ - 1/s² + 1 = k̄11²/s⁴ + (2k̄11 - k̄12²)/s² + 1
   ∴ K̄ = [k̄11  k̄12] = [1  √3]
Then, the optimal feedback control is
   u*(t) = -K̄x*(t) = -[1  √3] x*(t)
Note: This example can also be solved by the procedure of Sec.3.1; we get
   P̄ = [√3 1; 1 √3]
and the optimal control as
   u*(t) = -R^-1 B' P̄ x*(t) = -[1  √3] x*(t)
Figure 4.15: Closed-Loop Optimal Control System with Unity Feedback
Here, we can easily recognize that for a single-input, single-output case, the optimal feedback control system is exactly like a classical feedback control system with unity negative feedback and transfer function Go(s) = k̄[sI - A]^-1 b. Thus, the frequency-domain interpretation in terms of gain margin and phase margin can easily be carried out using a Nyquist, Bode, or some other plot of the transfer function Go(s).
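The frequency-domain result can be cross-checked against the time-domain ARE, and the Kalman (return-difference) equality can be spot-checked numerically. A sketch with the example's data (variable names are mine); it should reproduce P̄ = [√3 1; 1 √3] and K̄ = [1 √3].

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double-integrator example with Q = R = I (so C = I in the Kalman equation).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)     # expect [[sqrt(3), 1], [1, sqrt(3)]]
K = np.linalg.solve(R, B.T @ P)          # expect [1, sqrt(3)]

# Spot-check |1 + K(jwI - A)^-1 B|^2 = 1 + ||C(jwI - A)^-1 B||^2 at a few frequencies.
for w in (0.3, 1.0, 4.0):
    G = np.linalg.inv(1j * w * np.eye(2) - A) @ B
    lhs = abs(1.0 + (K @ G)[0, 0]) ** 2
    rhs = 1.0 + np.linalg.norm(G) ** 2
    assert abs(lhs - rhs) < 1e-9
print(K)
```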
Chapter 5
Variational Calculus and Open-Loop Optimal Control for Discrete-Time Systems

5.1 Discrete Euler-Lagrange Equation
The discrete Euler-Lagrange equation can be written as
   ∂V(x*(k), x*(k+1), k)/∂x*(k) + ∂V(x*(k-1), x*(k), k-1)/∂x*(k) = 0
Example: Consider the minimization of the functional
   J(x(k0), k0) = J = Σ(k=k0 to kf-1) [x(k)x(k+1) + x²(k)]
subject to the boundary conditions x(0) = 2 and x(10) = 5.
Solution:
   V(x(k), x(k+1), k) = x(k)x(k+1) + x²(k)
   V(x(k-1), x(k), k-1) = x(k-1)x(k) + x²(k-1)
   ∂V(x(k), x(k+1), k)/∂x(k) = x(k+1) + 2x(k)
   ∂V(x(k-1), x(k), k-1)/∂x(k) = x(k-1)
   ∂V(x(k), x(k+1), k)/∂x(k) + ∂V(x(k-1), x(k), k-1)/∂x(k) = 0
   ∴ x(k+1) + 2x(k) + x(k-1) = 0   ⇒   x(k+2) + 2x(k+1) + x(k) = 0
With the boundary conditions x(0) = 2 and x(10) = 5, the solution is
   x(k) = 2(-1)^k + 0.3 k (-1)^k

5.2 Procedure Summary for Discrete-Time Optimal Control System (Open-Loop Optimal Control)
5.2.1 Statement of the Problem
1. Given the plant as
   x(k+1) = A(k)x(k) + B(k)u(k),
2. the performance index as
   J = (1/2) x'(kf)F(kf)x(kf) + (1/2) Σ(k=k0 to kf-1) [x'(k)Q(k)x(k) + u'(k)R(k)u(k)],
3. and the boundary conditions as x(k = k0) = x(k0), with the final conditions depending on the system type,
4. find the optimal control.
5.2.2 Solution of the Problem
1. Form the Pontryagin H function
   H = (1/2) x'(k)Q(k)x(k) + (1/2) u'(k)R(k)u(k) + λ'(k+1)[A(k)x(k) + B(k)u(k)]
2. Minimize H w.r.t. u(k),
   (∂H/∂u(k))* = 0,
and obtain
   u*(k) = -R^-1(k)B'(k)λ*(k+1)
3. Using the result of Step 2 in Step 1, find the optimal H*
   H* = H*(x*(k), λ*(k+1))
4. Solve the set of 2n difference equations
   x*(k+1) = ∂H*/∂λ*(k+1) = A(k)x*(k) - B(k)R^-1(k)B'(k)λ*(k+1)
   λ*(k) = ∂H*/∂x*(k) = Q(k)x*(k) + A'(k)λ*(k+1)
with initial conditions x(k0) and the final conditions
   [H* + ∂S(x*(k), k)/∂k](kf) δkf + [∂S(x*(k), k)/∂x* - λ*(k)](kf) δx(kf) = 0
which are obtained from the cases in Subsec.5.2.3.
5. Substitute the solution of λ*(k) from Step 4 into the expression for the optimal control u*(k) of Step 2.
5.2.3 Types of Systems
Type | Substitutions | Boundary Conditions
Fixed-final time and fixed-final state system, Fig.5.1(a) | δkf = 0, δx(kf) = 0 | x(k = k0) = x(k0), x(k = kf) = x(kf)
Fixed-final time and free-final state system, Fig.5.1(b) | δkf = 0, δx(kf) ≠ 0 | x(k = k0) = x(k0), λ*(kf) = [∂S(x*(k), k)/∂x*(k)](kf) = F(kf)x(kf)
Figure 5.1: Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) Fixed-Final Time and Free-Final State System
Example:
Statement of the Problem:
1. Plant
   x(k+1) = x(k) + u(k)
2. Performance index
   J(k0) = (1/2) Σ(k=k0 to kf-1) u²(k)
3. Boundary conditions
   x(k0 = 0) = 1,  x(kf = 10) = 0
Solution of the Problem:
1. Form the Pontryagin H function
   H(x(k), u(k), λ(k+1)) = (1/2) u²(k) + λ(k+1)[x(k) + u(k)]
2. Get u*(k)
   ∂H/∂u(k) = 0   ⇒   u*(k) + λ*(k+1) = 0   ⇒   u*(k) = -λ*(k+1)
3. Get H*
   H*(x*(k), λ*(k+1)) = x*(k)λ*(k+1) - (1/2)λ*²(k+1)
4. Obtain the state and costate equations
   x*(k+1) = ∂H*/∂λ*(k+1) = x*(k) - λ*(k+1)
   λ*(k) = ∂H*/∂x*(k) = λ*(k+1)
Solving the previous equations with the boundary conditions, we have the optimal state and costate as
   x*(k) = 1 - 0.1k
   λ*(k+1) = 0.1
5. Obtain the optimal control
   u*(k) = -0.1
Figure 5.2: State and Costate System
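A tiny simulation confirms the result above: with the constant control u*(k) = -0.1 the plant x(k+1) = x(k) + u(k) moves from x(0) = 1 to x(10) = 0 along x*(k) = 1 - 0.1k (names are mine).

```python
# Forward-simulate the plant under the constant optimal control u* = -0.1.
x = 1.0
traj = [x]
for k in range(10):
    x = x + (-0.1)           # x(k+1) = x(k) + u*(k)
    traj.append(x)
print(traj[5], traj[10])
```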
  • 62. Chapter 6 Linear Quadratic Optimal Control for Discrete-Time Systems I (Regulator Closed-Loop Optimal Control) 6.1 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System (Closed-Loop Optimal Control with Fixed kf and Free x(kf )) 6.1.1 Statement of the Problem 1. Given the plant as x(k + 1) = A(k)x(x) + B(k)u(k), 2. the performance index as J = 1 2 x (kf )F(kf )x(kf ) + 1 2 kf −1 k=k0 [x (k)Q(k)x(k) + u (k)R(k)u(k)] , 3. and the boundary conditions as x(k = k0) = x(k0), x(kf ) is free, and kf is free, 4. find the closed-loop optimal control, state and performance index. 6.1.2 Solution of the Problem 1. Solve the matrix differential Riccati equation P(k) = A (k)P(k + 1)[I + E(k)P(k + 1)]−1 A(k) + Q(k); E(k) = B(k)R−1 (k)B (k) with final condition P(k = kf ) = F(kf ). 2. Solve the optimal state x∗ (k) from L(k) = R−1 (k)B (k)A −1 (k)[P(k) − Q(k)] x∗ (k + 1) = [A(k) − B(k)L(k)]x∗ (k) with initial condition x(k0) = x0 61
  • 63. 3. Obtain the optimal control u∗ (k) as u∗ (k) = −L(k)x∗ (k); where, L(k) is the Kalman gain. 4. Obtain the optimal performance index from J∗ = 1 2 x∗ (k)P(k)x∗ (k) Figure 6.1: Closed-Loop Optimal Controller for Linear Discrete-Time Regulator Example: Statement of the Problem: 1. Plant x1(k + 1) = 0.8x1(k) + x2(k) + u(k) x2(k + 1) = 0.6x2(k) + 0.5u(k) 2. Performance index L = x2 1(kf ) + 2x2 2(kf ) + kf −1 k=k0 0.5x2 1(k) + 0.5x2 2(k) + 0.5u2 (k) 3. Boundary conditions x1(k0 = 0) = 5, x2(k0) = 3, kf = 10, and x(kf ) is free. Solution of the Problem: 1. Solve the matrix differential Riccati equation A(k) =  0.8 1.0 0.0 0.6   ; B(k) =  1.0 0.5   ; F(kf ) =  2 0 0 4   ; Q(k) =  1 0 0 1   ; R(k) = 1 Let P(k) be the 2 × 2 symmetric matrix P(k) =  p11(k) p12(k) p12(k) p22(k)   The solution of the matrix DRE P(k) = A (k)P(k + 1)[I + E(k)P(k + 1)]−1 A(k) + Q(k)  p11(k) p12(k) p12(k) p22(k)   =  0.8 1.0 0.0 0.6    p11(k + 1) p12(k + 1) p12(k + 1) p22(k + 1)       1 0 0 1   +   1 0.5 0.5 0.25    p11(k + 1) p12(k + 1) p12(k + 1) p22(k + 1)      −1 +  0.8 1.0 0.0 0.6    1 0 0 1   Page 62 of 83
  • 64. with final condition P(kf ) = F(kf )  p11(10) p12(10) p12(10) p22(10)   =  2 0 0 4   Figure 6.2: Riccati Coefficients 2. Solve the optimal state x∗ (k) from L(k) = R−1 (k)B (k)A −1 (k)[P(k) − Q(k)] x∗ (k + 1) = [A(k) − B(k)L(k)]x∗ (k) with initial condition x(k0) = 5 3 Figure 6.3: Optimal States 3. Obtain the optimal control u∗ (k) as u∗ (k) = −L(k)x∗ (k) Page 63 of 83
  • 65. Figure 6.4: Optimal Control 4. Obtain the optimal performance index from J∗ = 1 2 x∗ (k)P(k)x∗ (k) Figure 6.5: Optimal Performance Index 6.2 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System: Steady-State Condition (Closed-Loop Optimal Control with kf = ∞) 6.2.1 Statement of the Problem 1. Given the plant as x(k + 1) = Ax(x) + Bu(k), 2. the performance index as J = 1 2 ∞ k=k0 [x (k)Qx(k) + u (k)Ru(k)] , Page 64 of 83
  • 66. 3. and the boundary conditions as x(k = k0) = x(k0), kf = ∞, x(∞) is free, 4. find the closed-loop optimal control, state and performance index. 6.2.2 Solution of the Problem 1. Solve the matrix differential Riccati equation ¯P = A ¯P I + BR−1 B ¯P −1 A + Q, or ¯P = A ¯P − ¯PB B ¯PB + R −1 B ¯P A + Q with final condition P(k = kf ) = F(kf ). 2. Solve the optimal state x∗ (k) from ¯L = R−1 B A −1 [ ¯P − Q] x∗ (k + 1) = [A − B ¯L]x∗ (k), or ¯La = B ¯PB + R −1 B ¯PA x∗ (k + 1) = [A − B ¯La]x∗ (k), 3. Obtain the optimal control u∗ (k) as u∗ (k) = −¯Lx∗ (k), or u∗ (k) = −¯Lax∗ (k), 4. Obtain the optimal performance index from J∗ = 1 2 x∗ (k) ¯Px∗ (k) Figure 6.6: Closed-Loop Optimal Control for Discrete-Time Steady-State Regulator System Example: Statement of the Problem: 1. Plant x1(k + 1) = 0.8x1(k) + x2(k) + u(k) x2(k + 1) = 0.6x2(k) + 0.5u(k) 2. Performance index L = ∞ k=k0 0.5x2 1(k) + 0.5x2 2(k) + 0.5u2 (k) Page 65 of 83
  • 67. 3. Boundary conditions x1(k0 = 0) = 5, x2(k0) = 3, kf = ∞, and x(∞) is free. Solution of the Problem: 1. Solve the matrix differential Riccati equation A(k) =  0.8 1.0 0.0 0.6   ; B(k) =  1.0 0.5   ; Q(k) =  1 0 0 1   ; R(k) = 1; F(kf ) = 0 Let ¯P be the 2 × 2 symmetric matrix ¯P =  ¯p11 ¯p12 ¯p12 ¯p22   The solution of the matrix DRE ¯P = A ¯P I + BR−1 B ¯P −1 A + Q  ¯p11 ¯p12 ¯p12 ¯p22   =  0.8 1.0 0.0 0.6    ¯p11 ¯p12 ¯p12 ¯p22       1 0 0 1   +   1 0.5 0.5 0.25    ¯p11 ¯p12 ¯p12 ¯p22      −1 +  0.8 1.0 0.0 0.6    1 0 0 1   ∴ ¯P =  1.3944 0.3738 0.3738 1.7803   2. Solve the optimal state x∗ (k) from ¯L = R−1 B A −1 [ ¯P − Q] x∗ (k + 1) = [A − B ¯L]x∗ (k) with initial condition x(k0) = 5 3 Figure 6.7: Optimal States Page 66 of 83
  • 68. 3. Obtain the optimal control u∗ (k) as u∗ (k) = −¯Lx∗ (k) Figure 6.8: Optimal Control 4. Obtain the optimal performance index from J∗ = 1 2 x∗ (k) ¯Px∗ (k) Figure 6.9: Optimal Performance Index 6.3 Analytical Solution to the Riccati Equation From the solution of state and costate,  x(k + 1) λ(k)   =  A(k) −E(k) Q(k) A (k)     x(k) λ(k + 1)   Page 67 of 83
we can get
   [x(k); λ(k)] = H [x(k+1); λ(k+1)],  H = [A^-1, A^-1 E; Q A^-1, A' + Q A^-1 E]
Then construct the diagonal matrix D,
   D = W^-1 H W = [M, 0; 0, M^-1]
where the columns of W are the eigenvectors of H. Note that D is a diagonal matrix of the eigenvalues of H. Now we can write the solution of the Riccati equation as
   P(k) = [W21 + W22 T(k)] [W11 + W12 T(k)]^-1
   T(k) = M^-(kf-k) T(kf) M^-(kf-k)
   T(kf) = -[W22 - F(kf)W12]^-1 [W21 - F(kf)W11]
The steady-state solution of the Riccati equation is
   P̄ = W21 W11^-1
To satisfy the condition of symmetry of P(k), use
   P(k) = (1/2)[P(k) + P'(k)]
Example: Consider the following data for a system
   A(k) = [0.8 1.0; 0.0 0.6];  B(k) = [1.0; 0.5];  F(kf) = [2 0; 0 4];  Q(k) = [1 0; 0 1];  R(k) = 1
Solution
Figure 6.10: Riccati Coefficients
H = [1.2500 -2.0833 0.2083 0.1042;
     0      1.6667 0.8333 0.4167;
     1.2500 -2.0833 1.0083 0.1042;
     0      1.6667 1.8333 1.0167]
D = diag(2.1497 + 1.4398i, 2.1497 - 1.4398i, 0.3211 + 0.2151i, 0.3211 - 0.2151i)
W11 = [ 0.0554 - 0.4243i   0.0554 + 0.4243i;
       -0.3531 + 0.0891i  -0.3531 - 0.0891i]
W12 = [-0.1100 - 0.4434i  -0.1100 + 0.4434i;
       -0.0780 - 0.1622i  -0.0780 + 0.1622i]
W22 = [-0.2339 + 0.2417i  -0.2339 - 0.2417i;
        0.8036 + 0.0000i   0.8036 + 0.0000i]
T(kf) = [-0.5260 + 0.1644i  -0.2745 - 0.3013i;
         -0.2745 + 0.3013i  -0.5260 - 0.1644i]
At steady-state conditions, F(kf) = 0 and kf = ∞,
P̄ = [1.3944 0.3738; 0.3738 1.7803]
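The Riccati computations of this chapter can be checked numerically. The sketch below (helper names are mine) runs the finite-horizon backward recursion, solves the steady-state DARE, and also tries the eigenvector method, assuming the text's W11/W21 blocks come from the eigenvectors of H whose eigenvalues lie outside the unit circle (the large-modulus pair in D above).

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Data of the running example.
A = np.array([[0.8, 1.0], [0.0, 0.6]])
B = np.array([[1.0], [0.5]])
Q = np.eye(2); R = np.array([[1.0]]); F = np.diag([2.0, 4.0])
E = B @ np.linalg.solve(R, B.T)                  # E = B R^-1 B'

# 1. Finite-horizon backward recursion P(k) = A'P(k+1)[I + E P(k+1)]^-1 A + Q.
P = F.copy()
for _ in range(10):
    P = A.T @ P @ np.linalg.inv(np.eye(2) + E @ P) @ A + Q

# 2. Steady-state solution via the DARE; the text reports
#    P_bar ~ [[1.3944, 0.3738], [0.3738, 1.7803]].
Pbar = solve_discrete_are(A, B, Q, R)

# 3. Steady-state solution via the eigenvectors of H, P_bar = W21 W11^-1.
Ainv = np.linalg.inv(A)
H = np.block([[Ainv, Ainv @ E], [Q @ Ainv, A.T + Q @ Ainv @ E]])
vals, vecs = np.linalg.eig(H)
idx = np.argsort(-np.abs(vals))[:2]              # two large-modulus eigenvalues
W11, W21 = vecs[:2, idx], vecs[2:, idx]
P_eig = np.real(W21 @ np.linalg.inv(W11))
P_eig = 0.5 * (P_eig + P_eig.T)                  # enforce symmetry, as in the text
print(Pbar)
print(np.max(np.abs(P_eig - Pbar)))
```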
Chapter 7
Linear Quadratic Optimal Control for Discrete-Time Systems II (Tracking Closed-Loop Optimal Control)

7.1 Procedure Summary of Discrete-Time Linear Quadratic Tracking System (Closed-Loop Optimal Control, Linear Time-Invariant Plant with Fixed kf and Free x(kf))
7.1.1 Statement of the Problem
1. Given the plant as
   x(k+1) = Ax(k) + Bu(k)
   y(k) = Cx(k)
2. the performance index as
   J(k0) = (1/2) [Cx(kf) - z(kf)]' F [Cx(kf) - z(kf)] + (1/2) Σ(k=k0 to kf-1) {[Cx(k) - z(k)]' Q [Cx(k) - z(k)] + u'(k)Ru(k)}
3. and the boundary conditions as x(k0) = x0, x(kf) is free, and kf is fixed,
4. find the optimal control and state.
7.1.2 Solution of the Problem
1. Solve the matrix differential Riccati equation
   P(k) = A'P(k+1)[I + EP(k+1)]^-1 A + V
   V = C'QC;  E = BR^-1B'
with final condition P(kf) = C'FC
  • 72. 2. Solve the vector difference equation g(k) = A I − P−1 (k + 1) + E −1 E g(k + 1) + Wz(k) W = C Q with final condition g(kf ) = C Fz(kf ) 3. Solve the optimal state x∗ (k) from x∗ (k + 1) = [A − BL(k)] x∗ (k) + BLg(k)g(k + 1) L(k) = [R + B P(k + 1)B] −1 B P(k + 1)A Lg(k) = [R + B P(k + 1)B] −1 B with initial condition x(k0) = x0 4. Obtain the optimal control u∗ (k) as u∗ (k) = −L(k)x∗ (k) + Lg(k)g(k + 1) Figure 7.1: Implementation of Discrete-Time Optimal Tracker Example: Statement of the Problem: 1. Plant x1(k + 1) = 0.8x1(k) + x2(k) + u(k) x2(k + 1) = 0.6x2(k) + 0.5u(k) Page 71 of 83
  • 73. 2. Performance index L = x2 1(kf ) + 2x2 2(kf ) + kf −1 k=k0 0.5x2 1(k) + 0.5x2 2(k) + 0.5u2 (k) 3. Boundary conditions x1(k0 = 0) = 5, x2(k0) = 3, kf = 10, and x(kf ) is free. It is required to keep the state x1(k) close to 2. Solution of the Problem: 1. Solve the matrix differential Riccati equation The state x1(k) is to be kept close to the reference input z1(k) = 2 and since there is no condition on state x2(k), one can choose arbitrarily as z2(k) = 0. A(k) =  0.8 1.0 0.0 0.6   ; B(k) =  1.0 0.5   ; F(kf ) =  1 0 0 0   = Q(k); z(k) =  2 0   ; R(k) = 0.01 The solution of the matrix DRE P(k) = A P(k + 1) [I + EP(k + 1)] −1 A + V V = C QC E = BR−1 B with final condition P(kf ) = C FC =  1 0 0 0   Figure 7.2: Riccati Coefficients 2. Solve the vector difference equation g(k) = A I − P−1 (k + 1) + E −1 E g(k + 1) + Wz(k) W = C Q Page 72 of 83
  • 74. with final condition g(kf ) = C Fz(kf ) =  2 0   Figure 7.3: g(t) Coefficients 3. Solve the optimal state x∗ (k) from x∗ (k + 1) = [A − BL(k)] x∗ (k) + BLg(k)g(k + 1) L(k) = [R + B P(k + 1)B] −1 B P(k + 1)A Lg(k) = [R + B P(k + 1)B] −1 B with initial condition x(k0) = 5 3 Figure 7.4: Optimal States 4. Obtain the optimal control u∗ (k) as u∗ (k) = −L(k)x∗ (k) + Lg(k)g(k + 1) Page 73 of 83
  • 75. Figure 7.5: Optimal Control 7.2 Closed-Loop Controller Design Using Frequency-Domain (Discrete Kalman Equation in Frequency Domain) 7.2.1 Relation Between Open-Loop and Closed-Loop Consider a discrete controllable, linear, time-invariant plant x(k + 1) = Ax(k) + Bu(k) Then, the open-loop characteristic polynomial of the system is ∆o(z) = |zI − A| and the optimal closed-loop characteristic polynomial is ∆c(z) = |zI − A + B ¯L| = |I + B ¯L[zI − A]−1 |.[zI − A] = |I + ¯L[zI − A]−1 B|∆o(z) This is a relation between the open-loop ∆o(z) and closed-loop ∆c(z) characteristic polynomials. From Fig.7.6, we note that 1. −¯L[zI − A]−1 B is called the loop gain matrix, and 2. I + ¯L[zI − A]−1 B is termed return difference matrix. Figure 7.6: Closed-Loop Discrete-Time Optimal Control System Page 74 of 83
  • 76. 7.2.2 Statement of the Problem 1. Given the plant as x(k + 1) = Ax(k) + Bu(k) 2. the performance index as J = 1 2 kf −1 k=k0 [x (k)Qx(k) + u (k)Ru(k)] 3. and the boundary conditions as x(k0) = x0, x(kf ) = 0 4. find the optimal control assuming that [A, B] is stabilizable and [A, √ Q] is observable. 7.2.3 Solution of the Problem 1. Solve the discrete Kalman equation in frequency domain. B z−1 I − A −1 Q [zI − A] −1 B+R = I + ¯L z−1 I − A −1 B B ¯PB + R I + ¯L [zI − A] −1 B 2. Get the optimal feedback u∗ (t) u∗ (t) = −¯Lx∗ (t) Page 75 of 83
Chapter 8
Pontryagin Minimum Principle

8.1 Procedure Summary of Pontryagin Minimum Principle
8.1.1 Statement of the Problem
1. Given the plant as
   ẋ(t) = f(x(t), u(t), t),
2. the performance index as
   J = S(x(tf), tf) + ∫[t0,tf] V(x(t), u(t), t) dt,
3. and the boundary conditions as x(t0) = x0, with tf and x(tf) = xf free,
4. find the optimal control.
8.1.2 Solution of the Problem
1. Form the Pontryagin H function
   H(x(t), u(t), λ(t), t) = V(x(t), u(t), t) + λ'(t) f(x(t), u(t), t)
2. Minimize H w.r.t. the admissible controls u(t) ∈ U,
   H(x*(t), u*(t), λ*(t), t) ≤ H(x*(t), u(t), λ*(t), t)
3. Solve the set of 2n differential equations
   ẋ* = +(∂H/∂λ)*  and  λ̇* = -(∂H/∂x)*
with initial conditions x0 and the final conditions
   [H + ∂S/∂t]*(tf) δtf + [∂S/∂x - λ(t)]*(tf) δxf = 0
which are obtained from the cases in Subsec.8.1.3.
4. Substitute the solutions of x*(t), λ*(t) from Step 3 into the expression for the optimal control u*(t) of Step 2.
  • 78. 8.1.3 Types of Systems Type Substitutions Boundary Conditions Fixed-final time and fixed-final δtf = 0, x(t0) = x0, state system, Fig.8.1(a) δxf = 0 x(tf ) = xf Free-final time and fixed-final δtf = 0, x(t0) = x0, x(tf ) = xf , state system, Fig.8.1(b) δxf = 0 H∗ + ∂S ∂t tf = 0 Fixed-final time and free-final δtf = 0, x(t0) = x0, state system, Fig.8.1(c) δxf = 0 λ∗ (tf ) = ∂S ∂x ∗tf Free-final time and dependent free-final δxf = ˙θ(tf)δtf x(t0) = x0, x(tf ) = θ(tf ), state system, Fig.8.1(d) H∗ + ∂S ∂t + ∂S ∂x ∗ − λ∗ (t) ˙θ(t) tf = 0 Free-final time and independent free-final δtf = 0, δx(t0) = x0, state system δxf = 0 H∗ + ∂S ∂t tf = 0, ∂S ∂x ∗ − λ∗ (t) tf = 0 Figure 8.1: Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) FreeFi- nal Time and Fixed-Final State System, (c) Fixed-Final Time and Free-Final State System, (d) FreeFinal Time and Free-Final State System Page 77 of 83
8.1.4 Important Notes

1. In the previous chapters there were no constraints on the control signal, which does not happen in physical systems. Fig. 8.2 shows that δu(t) may be positive or negative where the optimal control lies inside the constraint region; on the constraint boundary, only variations pointing into the admissible region are allowed (e.g., −δu(t) is not admissible).

   Figure 8.2: (a) An Optimal Control Function Constrained by a Boundary, (b) A Control Variation for Which −δu(t) Is Not Admissible

2. The condition in Step 2 is the necessary condition for a minimum:
       H(x*(t), u(t), λ*(t), t) − H(x*(t), u*(t), λ*(t), t) ≥ 0
3. For unconstrained control systems, the sufficient condition is that the second derivative of the Hamiltonian,
       (∂²H/∂u²)(x*(t), u*(t), λ*(t), t) = (∂²H/∂u²)*,
   be positive definite.

8.1.5 Additional Necessary Conditions

1. If the final time tf is fixed and the Hamiltonian H does not depend on time t explicitly, then H must be constant when evaluated along the optimal trajectory; that is,
       H(x*(t), u*(t), λ*(t)) = constant = C1   ∀ t ∈ [t0, tf]
2. If the final time tf is free (not specified a priori) and H does not depend explicitly on t, then H must be identically zero when evaluated along the optimal trajectory; that is,
       H(x*(t), u*(t), λ*(t)) = 0   ∀ t ∈ [t0, tf]

Example: Minimize the scalar function H = u² − 6u + 7 subject to the constraint |u| ≤ 2, i.e., −2 ≤ u ≤ 2.

Solution: The unconstrained minimizer follows from
       ∂H/∂u = 0 ⇒ 2u − 6 = 0 ⇒ u = 3,
with H(3) = 3² − 6·3 + 7 = −2. This value u = 3 is certainly outside the constraint (admissible) region. Since H is decreasing on [−2, 2], the constrained minimum is attained at the boundary, u* = 2, with
       H(u*) = H(2) = 2² − 6·2 + 7 = −1,
and the necessary condition H(u*) ≤ H(u) holds for all admissible u.
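The logic of this example — minimize H over the admissible set rather than relying on ∂H/∂u = 0 alone — can be sketched with a brute-force grid search. This is illustrative pure Python, not a method from the text; any scalar minimizer over the interval would do.

```python
def minimize_H_constrained(H, u_min, u_max, n=100001):
    """Minimize a scalar Hamiltonian over the admissible interval
    [u_min, u_max] by grid search; Pontryagin's principle asks for
    the minimizer over the constraint set, not a stationary point."""
    best_u, best_h = u_min, H(u_min)
    for i in range(1, n):
        u = u_min + (u_max - u_min) * i / (n - 1)
        h = H(u)
        if h < best_h:
            best_u, best_h = u, h
    return best_u, best_h

H = lambda u: u * u - 6 * u + 7
u_star, h_star = minimize_H_constrained(H, -2.0, 2.0)
# The unconstrained stationary point u = 3 is inadmissible, so the
# search lands on the boundary value u* = 2 with H(2) = -1.
```

Because H is decreasing on the whole admissible interval, the minimum sits on the constraint boundary, exactly as in the worked example.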
8.2 Optimal Control of Discrete-Time Systems Using the Principle of Optimality of Dynamic Programming (Regulator Optimal Control with Fixed kf and Free x(kf))

8.2.1 Statement of the Problem

1. Given the plant as
       x(k+1) = A x(k) + B u(k),
2. the performance index as
       Jk = ½ x'(kf) F x(kf) + ½ Σ_{k=i}^{kf−1} [x'(k) Q x(k) + u'(k) R u(k)],   i ≤ k ≤ kf,
3. and the boundary conditions as x(k0) = x0, x(kf) free, with no constraints on the state or control,
4. find the optimal control, state, and performance index.

8.2.2 Solution of the Problem

1. Solve the matrix difference Riccati equation backward,
       L(k) = [R + B' P(k+1) B]⁻¹ B' P(k+1) A
       P(k) = [A − B L(k)]' P(k+1) [A − B L(k)] + L'(k) R L(k) + Q,
   with the final condition P(kf) = F.
2. Solve for the optimal state x*(k) from
       x*(k+1) = [A − B L(k)] x*(k)
   with initial condition x(k0) = x0.
3. Obtain the optimal control u*(k) as
       u*(k) = −L(k) x*(k),
   where L(k) is the Kalman gain.
4. Obtain the optimal performance index from
       J*k = ½ x*'(k) P(k) x*(k)

8.3 Optimal Control of Continuous-Time Systems Using the Hamilton-Jacobi-Bellman (HJB) Approach (Closed-Loop Optimal Control with Free x(tf))

8.3.1 Statement of the Problem

1. Given the plant as
       ẋ(t) = f(x(t), u(t), t),
2. the performance index as
       J = S(x(tf), tf) + ∫_{t0}^{tf} V(x(t), u(t), t) dt,
3. and the boundary conditions as x(t0) = x0, x(tf) free,
4. find the optimal control.
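The backward sweep and forward pass of Sec. 8.2.2 (Steps 1–3) can be sketched in a few lines of pure Python for the scalar case; all numeric values below are illustrative, and the matrix version replaces the products with matrix operations.

```python
def lqr_backward_sweep(a, b, f, q, r, kf):
    """Backward sweep of the difference Riccati equation (scalar case):
    returns the time-varying Kalman gains L(0..kf-1) and costs P(0..kf)."""
    P = [0.0] * (kf + 1)
    L = [0.0] * kf
    P[kf] = f                       # final condition P(kf) = F
    for k in range(kf - 1, -1, -1):
        L[k] = b * P[k + 1] * a / (r + b * P[k + 1] * b)
        acl = a - b * L[k]          # closed-loop coefficient A - B L(k)
        P[k] = acl * P[k + 1] * acl + L[k] * r * L[k] + q
    return L, P

def simulate(a, b, L, x0):
    """Forward pass: x(k+1) = (a - b L(k)) x(k) with u(k) = -L(k) x(k)."""
    x = [x0]
    for lk in L:
        x.append((a - b * lk) * x[-1])
    return x

# Illustrative scalar run: a = b = f = q = r = 1 over a two-step horizon.
gains, costs = lqr_backward_sweep(1.0, 1.0, 1.0, 1.0, 1.0, 2)
traj = simulate(1.0, 1.0, gains, 1.0)
```

The optimal cost from Step 4 is then ½·costs[0]·x0², evaluated at the initial state.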
8.3.2 Solution of the Problem

1. Form the Pontryagin H function
       H(x(t), u(t), J*x, t) = V(x(t), u(t), t) + J*x' f(x(t), u(t), t)
2. Minimize H w.r.t. u(t) as
       (∂H/∂u)* = 0
   and obtain
       u*(t) = h(x*(t), J*x, t)
3. Using the result of Step 2, find the optimal H* function
       H*(x*(t), h(x*(t), J*x, t), J*x, t) = H*(x*(t), J*x, t)
   and obtain the HJB equation.
4. Solve the HJB equation
       J*t + H*(x*(t), J*x, t) = 0
   with boundary condition J*(x*(tf), tf) = S(x(tf), tf). Note that
       J*t = ∂J*(x*(t), t)/∂t,   J*x = ∂J*(x*(t), t)/∂x*.
5. Use the solution J* from Step 4 to evaluate J*x and substitute it into the expression for u*(t) of Step 2 to obtain the optimal control.

Example 1:

Statement of the Problem:
1. Plant:
       ẋ(t) = −2x(t) + u(t)
2. Performance index:
       J = ½x²(tf) + ½ ∫₀^{tf} [x²(t) + u²(t)] dt
3. Find the optimal control.

Solution of the Problem:
1. Hamiltonian H:
       V(x(t), u(t), t) = ½[x²(t) + u²(t)],   S(x(tf), tf) = ½x²(tf),   f(x(t), u(t), t) = −2x(t) + u(t)
       H(x(t), u(t), Jx, t) = V + Jx f = ½x²(t) + ½u²(t) + Jx(−2x(t) + u(t))
2. Get u*(t):
       ∂H/∂u = 0 ⇒ u(t) + Jx = 0 ⇒ u*(t) = −Jx
3. Get H*:
       H* = ½x²(t) + ½(−Jx)² + Jx(−2x(t) − Jx) = −½Jx² + ½x²(t) − 2x(t)Jx
4. Solve the HJB equation:
       J*t + H* = 0 ⇒ Jt − ½Jx² + ½x²(t) − 2x(t)Jx = 0
   with boundary condition
       J*(x*(tf), tf) = S(x(tf), tf) = ½x²(tf).
   Since the performance index is a quadratic function of the states and controls, we can guess the solution as
       J(x(t)) = ½ p(t) x²(t),
   where p(t), the unknown function to be determined, has the boundary condition
       J(x(tf)) = ½x²(tf) = ½ p(tf) x²(tf) ⇒ p(tf) = 1.
   Then Jx = p(t) x(t) and Jt = ½ ṗ(t) x²(t), so
       ½ṗ x*² − ½p² x*² + ½x*² − 2p x*² = 0 ⇒ [½ṗ − ½p² + ½ − 2p] x*² = 0 ⇒ ½ṗ − ½p² + ½ − 2p = 0,
   whose solution is
       p(t) = [(√5 − 2) + (√5 + 2) r e^{2√5(t−tf)}] / [1 − r e^{2√5(t−tf)}],   r = (3 − √5)/(3 + √5).
5. Obtain the optimal control. The closed-loop optimal control is
       u*(t) = −p(t) x(t).

   Note: as tf → ∞, p(t) → p̄ = √5 − 2, and the optimal control becomes u(t) = −(√5 − 2) x(t).

Example 2:

Statement of the Problem:
1. Plant:
       ẋ(t) = −2x(t) + u(t)
2. Performance index:
       J = ∫₀^∞ [x²(t) + u²(t)] dt
3. Find the optimal control.

Solution of the Problem:
1. Hamiltonian H:
       V(x(t), u(t), t) = x²(t) + u²(t),   f(x(t), u(t), t) = −2x(t) + u(t)
       H(x(t), u(t), Jx, t) = V + Jx f = x²(t) + u²(t) + Jx(−2x(t) + u(t))
2. Get u*(t):
       ∂H/∂u = 0 ⇒ 2u(t) + Jx = 0 ⇒ u*(t) = −½Jx
3. Get H*:
       H* = x²(t) + (−½Jx)² + Jx(−2x(t) − ½Jx) = −¼Jx² + x²(t) − 2x(t)Jx
4. Solve the HJB equation:
       J*t + H* = 0 ⇒ Jt − ¼Jx² + x²(t) − 2x(t)Jx = 0
   with boundary condition (treating the horizon first as a finite tf with S = 0)
       J*(x*(tf), tf) = S(x(tf), tf) = 0.
   Since the performance index is a quadratic function of the states and controls, we can guess the solution as
       J(x(t)) = ½ p(t) x²(t),
   where p(t), the unknown function to be determined, has the boundary condition
       J(x(tf)) = ½ p(tf) x²(tf) = 0 ⇒ p(tf) = 0.
   Then Jx = p(t) x(t) and Jt = ½ ṗ(t) x²(t), so
       ½ṗ x*² − ¼p² x*² + x*² − 2p x*² = 0 ⇒ [½ṗ − ¼p² + 1 − 2p] x*² = 0 ⇒ ½ṗ − ¼p² + 1 − 2p = 0,
   whose solution satisfying p(tf) = 0 is
       p(t) = 2(√5 − 2) [1 − e^{2√5(t−tf)}] / [1 + (√5 − 2)² e^{2√5(t−tf)}].
5. Obtain the optimal control. The closed-loop optimal control is
       u*(t) = −½ p(t) x(t).

   Note: as tf → ∞, p(t) → p̄ = 2(√5 − 2), and the optimal control is u(t) = −(√5 − 2) x(t).
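As a sanity check on both examples, the scalar Riccati ODEs can be integrated backward from their boundary values with a crude explicit-Euler scheme, confirming that away from tf they settle at the steady values p̄: Example 1 has ṗ = p² + 4p − 1 with p(tf) = 1 and p̄ = √5 − 2; Example 2 has ṗ = ½p² + 4p − 2 with p(tf) = 0 and p̄ = 2(√5 − 2). The step size and horizon below are arbitrary choices.

```python
def integrate_p(pdot, p_tf, tf, dt=1e-3):
    """Integrate a scalar Riccati ODE backward from t = tf to t = 0
    with explicit Euler: p(t - dt) ~ p(t) - dt * pdot(p(t))."""
    p = p_tf
    for _ in range(int(tf / dt)):
        p -= dt * pdot(p)
    return p

# Example 1: dp/dt = p^2 + 4p - 1, p(tf) = 1, steady value sqrt(5) - 2
p1 = integrate_p(lambda p: p * p + 4 * p - 1, 1.0, tf=5.0)

# Example 2: dp/dt = p^2/2 + 4p - 2, p(tf) = 0, steady value 2(sqrt(5) - 2)
p2 = integrate_p(lambda p: 0.5 * p * p + 4 * p - 2, 0.0, tf=5.0)
```

Both integrations reproduce the infinite-horizon feedback u = −(√5 − 2) x obtained in the notes, since u* = −p̄x in Example 1 and u* = −½p̄x in Example 2.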
References

[1] Desineni Subbaram Naidu, "Optimal Control Systems," Idaho State University, Pocatello, Idaho, USA.

Contacts

mohamed.atyya94@eng-st.cu.edu.eg