Lyapunov functions, value functions, and performance bounds

Sean Meyn
Department of Electrical and Computer Engineering
University of Illinois and the Coordinated Science Laboratory

Joint work with R. Tweedie, I. Kontoyiannis, and P. Mehta
Supported in part by NSF (ECS 05 23620, and prior funding) and AFOSR
Objectives

Nonlinear state space model ≡ (controlled) Markov process, with state process X.

Typical form:

    dX(t) = f(X(t), U(t)) dt + σ(X(t), U(t)) dW(t)

with control U and noise W.

Questions: For a given feedback law,
• Is the state process stable?
• Is the average cost finite?  E[c(X(t), U(t))]
• Can we solve the DP equations?  min_u { c(x, u) + D_u h*(x) } = η*
• Can we approximate the average cost η*? The value function h*?
Outline

I.   Markov Models
II.  Representations
III. Lyapunov Theory
IV.  Conclusions

[Diagram: equivalence of ‖P^t(x, ·) − π‖_f → 0, sup_C E_x[S_τC(f)] < ∞,
π(f) < ∞, and the drift condition DV(x) ≤ −f(x) + bI_C(x)]
I. Markov Models
Notation: Generators & Resolvents

Markov chain: X = {X(t) : t ≥ 0}
Countable state space X
Transition semigroup:

    P^t(x, y) = P{X(s + t) = y | X(s) = x},   x, y ∈ X

Generator: For h in a suitable domain of functions,

    Dh(x) = lim_{t→0} (1/t) E[h(X(s + t)) − h(X(s)) | X(s) = x]
          = lim_{t→0} (1/t) (P^t h(x) − h(x))

Rate matrix:

    Dh(x) = Σ_y Q(x, y) h(y),   P^t = e^{Qt}
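A tiny numerical check may help fix ideas. This sketch (Python with numpy/scipy; the two-state rate matrix is an arbitrary illustrative choice, not from the slides) verifies that P^t = e^{Qt} is stochastic and that the finite difference (P^t h − h)/t approaches Qh as t → 0:

    import numpy as np
    from scipy.linalg import expm

    # Hypothetical two-state rate matrix: rows sum to zero.
    Q = np.array([[-2.0,  2.0],
                  [ 1.0, -1.0]])
    h = np.array([1.0, 5.0])          # arbitrary test function on X = {0, 1}

    t = 1e-5
    Pt = expm(Q * t)                  # transition semigroup P^t = e^{Qt}
    print(Pt.sum(axis=1))             # each row sums to 1 (stochastic matrix)
    print((Pt @ h - h) / t)           # finite-difference generator...
    print(Q @ h)                      # ...agrees with Dh = Qh for small t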
Example: MM1 queue (arrival rate α, service rate µ)

Sample paths:

    X(t + ε) ≈  x + 1   with prob. εα
                x − 1   with prob. εµ
                x       with prob. 1 − ε(α + µ)

Rate matrix:

    Q = ⎡ −α     α      0      0    ··· ⎤
        ⎢  µ   −α−µ     α      0    ··· ⎥
        ⎢  0     µ    −α−µ     α    ··· ⎥
        ⎢  0     0      µ    −α−µ   ··· ⎥
        ⎣  ⋮     ⋮      ⋮      ⋮    ⋱   ⎦
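As a concrete aid, here is a minimal sketch of this rate matrix in Python/numpy, truncated to N states so the operators on later slides can be computed as finite arrays (the truncation level and the rates are our illustrative choices; mm1_rate_matrix is a hypothetical helper, not from the slides):

    import numpy as np

    def mm1_rate_matrix(arr, mu, N):
        """Truncated MM1 rate matrix on states {0, 1, ..., N-1}."""
        Q = np.diag(arr * np.ones(N - 1), 1)    # arrivals: x -> x+1 at rate arr
        Q += np.diag(mu * np.ones(N - 1), -1)   # departures: x -> x-1 at rate mu
        Q -= np.diag(Q.sum(axis=1))             # diagonal: each row sums to zero
        return Q

    Q = mm1_rate_matrix(arr=0.8, mu=1.0, N=200)

Later sketches rebuild this matrix inline with the same two diag lines.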
Example: O-U model

[Figure: sample paths with σ_W² = 0 and σ_W² = 1]

Sample paths:

    dX(t) = AX(t) dt + B dW(t),   A n × n, B n × 1, W standard BM

Generator:

    Dh(x) = (Ax)^T ∇h(x) + ½ B^T ∇²h(x) B

For h quadratic, h(x) = ½ x^T P x:

    ∇h(x) = Px,   ∇²h(x) = P

    Dh(x) = ½ x^T(PA + A^T P)x + ½ B^T P B
Notation: Generators & Resolvents

Resolvent:

    R_α = ∫_0^∞ e^{−αt} P^t dt

Resolvent equations:

    R_α = [αI − Q]^{−1}

    Q R_α = R_α Q = α R_α − I
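On a truncated chain the resolvent equations can be checked directly; a sketch (rates, truncation level, and α are our choices, with the MM1 matrix built as in the earlier snippet):

    import numpy as np

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    alpha = 1.0
    I = np.eye(N)
    R = np.linalg.inv(alpha * I - Q)          # R_alpha = [alpha I - Q]^{-1}
    print(np.allclose(Q @ R, alpha * R - I))  # Q R_alpha = alpha R_alpha - I
    print(np.allclose(Q @ R, R @ Q))          # Q and R_alpha commute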
Notation: Generators & Resolvents

Motivation: dynamic programming. For a cost function c,

    h_α(x) = R_α c(x) = Σ_{y∈X} R_α(x, y) c(y)

           = ∫_0^∞ e^{−αt} E[c(X(t)) | X(0) = x] dt

the discounted-cost value function.

Resolvent equation = dynamic programming equation:

    c + Dh_α = α h_α
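A sketch of the same identity numerically: with h_α = R_α c, the vector c + Qh_α equals αh_α exactly (the cost c(x) = x and all parameters are illustrative choices):

    import numpy as np

    N, arr, mu, alpha = 200, 0.8, 1.0, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    c = np.arange(N, dtype=float)             # cost c(x) = x
    h_alpha = np.linalg.solve(alpha * np.eye(N) - Q, c)   # h_alpha = R_alpha c
    print(np.allclose(c + Q @ h_alpha, alpha * h_alpha))  # c + D h_alpha = alpha h_alpha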
Notation: Steady-State Distribution

Invariant (probability) measure π: X is stationary. In particular,

    X(t) ∼ π,   t ≥ 0

Characterizations: for each y ∈ X,

    Σ_{x∈X} π(x) P^t(x, y) = π(y)

    α Σ_{x∈X} π(x) R_α(x, y) = π(y),   α > 0

    Σ_{x∈X} π(x) Q(x, y) = 0
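A sketch of these characterizations on the truncated MM1 chain (our parameters): π is recovered from the null space of Q^T, then the semigroup and resolvent fixed-point properties are checked.

    import numpy as np
    from scipy.linalg import expm, null_space

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    pi = null_space(Q.T)[:, 0]
    pi = pi / pi.sum()                        # pi Q = 0, normalized
    print(np.allclose(pi @ expm(Q * 0.7), pi))   # pi P^t = pi
    alpha = 1.0
    R = np.linalg.inv(alpha * np.eye(N) - Q)
    print(np.allclose(alpha * pi @ R, pi))       # alpha pi R_alpha = pi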
Notation: Relative Value Function

Invariant measure π, cost function c, steady-state mean η.

Relative value function:

    h(x) = ∫_0^∞ E[c(X(t)) − η | X(0) = x] dt

Solution to Poisson's equation (average-cost DP equation):

    c + Dh = η
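On a finite chain, Poisson's equation can be solved directly, which gives a concrete handle on h before the operator-theoretic representation below. A sketch (cost c(x) = x and parameters are illustrative; h is determined only up to an additive constant, and least squares picks one solution):

    import numpy as np
    from scipy.linalg import null_space

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    pi = null_space(Q.T)[:, 0]
    pi = pi / pi.sum()
    c = np.arange(N, dtype=float)             # cost c(x) = x
    eta = pi @ c                              # steady-state mean
    h, *_ = np.linalg.lstsq(Q, eta - c, rcond=None)   # solve Dh = -(c - eta)
    print(np.allclose(c + Q @ h, eta))        # c + Dh = eta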
II. Representations

    π ∝ ν[I − (R − s ⊗ ν)]^{−1}

    h = [I − (R − s ⊗ ν)]^{−1} c̃
Small Functions and Small Measures

ψ-Irreducibility:

    ψ(y) > 0 ⟹ P{X(t) reaches y | X(0) = x} > 0 for all x
    ψ(y) > 0 ⟹ R(x, y) > 0 for all x

Small functions and measures: for a function s and probability ν,

    R(x, y) ≥ s(x)ν(y),   x, y ∈ X,   where R = ∫_0^∞ e^{−t} P^t dt

That is, the resolvent dominates a rank-one matrix:

    R ≥ s ⊗ ν

ψ-Irreducibility justifies the assumption s(x) > 0 for all x,
and WLOG ν = δ_{x*}, where ψ(x*) > 0.
Example: MM1 queue (arrival rate α, service rate µ)

R(x, y) > 0 for all x and y (irreducible in the usual sense).

Conclusion:

    R(x, y) ≥ s(x)ν(y),   where s(x) := R(x, 0), ν := δ_0
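A numerical sketch of the claim (our truncation and rates): with s(x) = R(x, 0) and ν = δ_0, the matrix R − s ⊗ ν is entrywise non-negative.

    import numpy as np

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    R = np.linalg.inv(np.eye(N) - Q)          # R = R_1 = [I - Q]^{-1}
    s = R[:, 0]                               # small function s(x) = R(x, 0)
    nu = np.zeros(N); nu[0] = 1.0             # small measure nu = delta_0
    print((R - np.outer(s, nu) >= -1e-12).all())   # R >= s (x) nu entrywise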
Example: O-U model   dX(t) = AX(t) dt + B dW(t)

R(0, ·) is Gaussian, and full rank if and only if (A, B) is controllable.

Conclusion: under controllability, for any m there is ε > 0 such that

    R(x, A) ≥ s(x)ν(A)   for all x and A,

where s(x) = ε I{‖x‖ ≤ m} and ν is uniform on {‖x‖ ≤ m}.
Potential Matrix

    G(x, y) = Σ_{n=0}^∞ (R − s ⊗ ν)^n (x, y),   i.e.,   G = [I − (R − s ⊗ ν)]^{−1}

Representation of π:

    π ∝ νG,   where νG(y) = Σ_{x∈X} ν(x) G(x, y)
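A sketch checking the representation π ∝ νG against the null-space computation of π (same truncated MM1 chain, same illustrative s and ν):

    import numpy as np
    from scipy.linalg import null_space

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    I = np.eye(N)
    R = np.linalg.inv(I - Q)
    s = R[:, 0]
    nu = np.zeros(N); nu[0] = 1.0
    G = np.linalg.inv(I - (R - np.outer(s, nu)))   # potential matrix
    pi = nu @ G
    pi = pi / pi.sum()                        # pi ∝ nu G, normalized
    pi_ref = null_space(Q.T)[:, 0]
    pi_ref = pi_ref / pi_ref.sum()
    print(np.allclose(pi, pi_ref))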
Representation of h

With G as above, c̃(x) = c(x) − η, and η = Σ_{x∈X} π(x) c(x):

    h = RG c̃ + constant,   where Gc̃(x) = Σ_{y∈X} G(x, y) c̃(y)

If the sum converges, then Poisson's equation is solved:

    c(x) + Dh(x) = η
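A sketch checking h = RGc̃ against Poisson's equation on the truncated chain (cost c(x) = x; s and ν as before):

    import numpy as np

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    I = np.eye(N)
    R = np.linalg.inv(I - Q)
    s = R[:, 0]
    nu = np.zeros(N); nu[0] = 1.0
    G = np.linalg.inv(I - (R - np.outer(s, nu)))
    pi = nu @ G
    pi = pi / pi.sum()
    c = np.arange(N, dtype=float)
    eta = pi @ c
    h = R @ (G @ (c - eta))                   # h = R G c~,  c~ = c - eta
    print(np.allclose(c + Q @ h, eta))        # Poisson's equation holds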
III. Lyapunov Theory

[Diagram: equivalence of ‖P^n(x, ·) − π‖_f → 0, sup_C E_x[S_τC(f)] < ∞,
π(f) < ∞, and the drift condition ∆V(x) ≤ −f(x) + bI_C(x)]
Lyapunov Functions

    DV ≤ −g + bs

General assumptions:
    V : X → (0, ∞)
    g : X → [1, ∞)
    b < ∞, s small
    e.g., s(x) = I_C(x), C finite
Lyapunov Bounds on G   (DV ≤ −g + bs)

The resolvent equation gives RV − V ≤ −Rg + bRs.

Since s ⊗ ν is non-negative,

    −[I − (R − s ⊗ ν)]V ≤ RV − V ≤ −Rg + bRs,

and the left-hand operator is −G^{−1}.

More positivity:   V ≥ GRg − bGRs

Some algebra:      GR = G(R − s ⊗ ν) + (Gs) ⊗ ν ≥ G − I,   Gs ≤ 1

General bound:     GRg ≤ V + 2b
                   Gg ≤ V + g + 2b
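A numerical illustration of the general bounds (the choices V(x) = x², g(x) = 1 + 0.2x and all parameters are ours; b is computed as the smallest constant making DV ≤ −g + bs hold entrywise on the truncated chain):

    import numpy as np

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    I = np.eye(N)
    R = np.linalg.inv(I - Q)
    s = R[:, 0]
    nu = np.zeros(N); nu[0] = 1.0
    G = np.linalg.inv(I - (R - np.outer(s, nu)))

    x = np.arange(N, dtype=float)
    V, g = x**2, 1.0 + 0.2 * x
    drift = Q @ V + g                         # need drift <= b * s everywhere
    pos = drift > 0                           # b only binds where drift > 0
    b = np.max(drift[pos] / s[pos])
    print((G @ (R @ g) <= V + 2 * b + 1e-8).all())   # GRg <= V + 2b
    print((G @ g <= V + g + 2 * b + 1e-8).all())     # Gg  <= V + g + 2b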
Existence of π   (DV ≤ −g + bs)

Condition (V2):   DV ≤ −1 + bs

Representation:   π ∝ νG

Bound:   GRg ≤ V + 2b  ⟹  G(x, X) ≤ V(x) + 2b

Conclusion: π exists as a probability measure on X.
Existence of Moments   (DV ≤ −g + bs)

Condition (V3):   DV ≤ −g + bs

Representation:   π ∝ νG

Bound:   Gg ≤ V + g + 2b

Conclusion: π exists as a probability measure on X,
and the steady-state mean is finite:

    π(g) := Σ_{x∈X} π(x) g(x) ≤ b
Example: MM1 queue, ρ = α/µ

Linear Lyapunov function, V(x) = x:

    DV(x) = Σ_{y=0}^∞ Q(x, y) y
          = α(x + 1) + µ(x − 1) − (α + µ)x
          = −(µ − α),   x > 0

Conclusion: (V2) holds if and only if ρ < 1.
Example: MM1 queue, ρ = α/µ

Quadratic Lyapunov function, V(x) = x²:

    DV(x) = Σ_{y=0}^∞ Q(x, y) y²
          = α(x + 1)² + µ(x − 1)² − (α + µ)x²
          = α(x² + 2x + 1) + µ(x² − 2x + 1) − (α + µ)x²
          = −2(µ − α)x + α + µ

Conclusion: (V3) holds with g(x) = 1 + x if and only if ρ < 1.
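Both drift computations can be confirmed mechanically on the truncated matrix (a sketch with our parameters; the identities hold away from 0 and from the truncation level):

    import numpy as np

    N, arr, mu = 200, 0.8, 1.0
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    x = np.arange(N, dtype=float)
    keep = slice(1, N - 1)                    # x > 0, below the truncation level
    print(np.allclose((Q @ x)[keep], -(mu - arr)))
    print(np.allclose((Q @ x**2)[keep], (-2 * (mu - arr) * x + arr + mu)[keep]))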
Example: O-U model   dX(t) = AX(t) dt + B dW(t)

For h quadratic, h(x) = ½ x^T P x:

    ∇h(x) = Px,   ∇²h(x) = P

    Dh(x) = ½ x^T(PA + A^T P)x + ½ B^T P B

Suppose that P > 0 solves the Lyapunov equation

    PA + A^T P = −I

Then (V3) follows from the identity

    Dh(x) = −½ ‖x‖² + ½ σ_X²,   σ_X² = B^T P B

and h solves Poisson's equation:

    Dh = −g + η,   g(x) = ½ ‖x‖²,   η = ½ σ_X²
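A sketch of this calculation in code: solve the Lyapunov equation with scipy, then verify the drift identity at a random point, using the generator formula Dh(x) = (Ax)^T Px + ½ B^T P B for h = ½ x^T P x (the matrices A and B are hypothetical illustrative choices, with A Hurwitz by construction):

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    A = np.array([[-1.0,  2.0],
                  [ 0.0, -3.0]])              # Hurwitz: eigenvalues -1, -3
    B = np.array([[1.0],
                  [1.0]])
    P = solve_continuous_lyapunov(A.T, -np.eye(2))   # A^T P + P A = -I
    sigma2 = float(B.T @ P @ B)               # sigma_X^2 = B^T P B

    x = np.random.randn(2)
    Dh = (A @ x) @ (P @ x) + 0.5 * sigma2     # generator applied to h at x
    print(np.isclose(Dh, -0.5 * x @ x + 0.5 * sigma2))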
Poisson's Equation   (DV ≤ −g + bs)

Condition (V3):   DV ≤ −g + bs

Representation:   h = RG c̃ + constant,   c̃(x) = c(x) − η

Bound:   RGg ≤ V + 2b,   i.e.,   RGg(x) ≤ V(x) + 2b for all x

Conclusion: if c is bounded by g, then h is bounded by V:

    h(x) ≤ V(x) + 2b
Example: MM1 queue, ρ = α/µ

Poisson's equation with g(x) = x:

    Dh = −g + η

We have (V3) with V a quadratic function of x; recall, with h(x) = x²,

    Dh(x) = −2(µ − α)x + α + µ,   x > 0

Poisson's equation is solved with

    h(x) = ½ (x² + x)/(µ − α),   η = ρ/(1 − ρ)
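A sketch confirming the closed form against the truncated generator (our parameters; the identity Dh = −x + η holds at every state below the truncation level, including x = 0):

    import numpy as np

    N, arr, mu = 200, 0.8, 1.0
    rho = arr / mu
    Q = np.diag(arr * np.ones(N - 1), 1) + np.diag(mu * np.ones(N - 1), -1)
    Q -= np.diag(Q.sum(axis=1))               # truncated MM1 rate matrix

    x = np.arange(N, dtype=float)
    h = 0.5 * (x**2 + x) / (mu - arr)
    keep = slice(0, N - 1)                    # rows below the truncation level
    print(np.allclose((Q @ h)[keep], (-x + rho / (1 - rho))[keep]))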
IV. Conclusions
Final Words

[Diagram: equivalence of ‖P^t(x, ·) − π‖_f → 0, sup_C E_x[S_τC(f)] < ∞,
π(f) < ∞, and the drift condition DV(x) ≤ −f(x) + bI_C(x)]

Just as in linear systems theory, Lyapunov functions provide a
characterization of system properties, as well as a practical
verification tool.

Much is left out of this survey; in particular:

• Converse theory
• Limit theory
• Approximation techniques to construct Lyapunov functions,
  or approximations to value functions
• Application to controlled Markov processes, and
  approximate dynamic programming
References

[1,4] ψ-Irreducible foundations
[2,11,12,13] Mean-field models, ODE models, and Lyapunov functions
[1,4,5,9,10] Operator-theoretic methods; see also the appendix of [2]
[3,6,7,10] Generators and continuous-time models

[1] S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. Cambridge University Press, Cambridge, second edition, 2009. Published in the Cambridge Mathematical Library.
[2] S. P. Meyn. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, 2007. Pre-publication edition online: http://black.csl.uiuc.edu/~meyn.
[3] S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. John Wiley & Sons, New York, 1986.
[4] E. Nummelin. General Irreducible Markov Chains and Non-negative Operators. Cambridge University Press, Cambridge, 1984.
[5] S. P. Meyn and R. L. Tweedie. Generalized resolvents and Harris recurrence of Markov processes. Contemporary Mathematics, 149:227-250, 1993.
[6] S. P. Meyn and R. L. Tweedie. Stability of Markovian processes III: Foster-Lyapunov criteria for continuous-time processes. Adv. Appl. Probab., 25:518-548, 1993.
[7] D. Down, S. P. Meyn, and R. L. Tweedie. Exponential and uniform ergodicity of Markov processes. Ann. Probab., 23(4):1671-1691, 1995.
[8] P. W. Glynn and S. P. Meyn. A Liapounov bound for solutions of the Poisson equation. Ann. Probab., 24(2):916-931, 1996.
[9] I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab., 13:304-362, 2003. Presented at the INFORMS Applied Probability Conference, NYC, July 2001.
[10] I. Kontoyiannis and S. P. Meyn. Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes. Electron. J. Probab., 10(3):61-123 (electronic), 2005.
[11] W. Chen, D. Huang, A. Kulkarni, J. Unnikrishnan, Q. Zhu, P. Mehta, S. Meyn, and A. Wierman. Approximate dynamic programming using fluid and diffusion approximations with applications to power management. Accepted for inclusion in the 48th IEEE Conference on Decision and Control, December 16-18, 2009.
[12] P. Mehta and S. Meyn. Q-learning and Pontryagin's Minimum Principle. Accepted for inclusion in the 48th IEEE Conference on Decision and Control, December 16-18, 2009.
[13] G. Fort, S. Meyn, E. Moulines, and P. Priouret. ODE methods for skip-free Markov chain stability with applications to MCMC. Ann. Appl. Probab., 18(2):664-707, 2008.

See also earlier seminal work by Hordijk, Tweedie, and others; full references in [1].