CSL (Coordinated Science Laboratory)
Mean-field Methods in Estimation & Control
Prashant G. Mehta
Coordinated Science Laboratory
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
EE Seminar, Harvard
Apr. 1, 2011
Introduction: Kuramoto model
Background: Synchronization of coupled oscillators

    dθ_i(t) = ( ω_i + (κ/N) ∑_{j=1}^{N} sin(θ_j(t) − θ_i(t)) ) dt + σ dξ_i(t),   i = 1, …, N

ω_i taken from a distribution g(ω) supported on [1−γ, 1+γ]
γ : measures the heterogeneity of the population
κ : measures the strength of coupling

[Figure: phase diagram in the (γ, κ) plane; incoherence for κ < κ_c(γ), synchrony (locking) above the critical coupling.]

[10] Y. Kuramoto, 1975; [15] Strogatz et al., J. Stat. Phys., 1991
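Below is a minimal simulation sketch of the noisy Kuramoto model above, using an Euler-Maruyama discretization; the parameter values, step size, and the use of the order parameter r = |(1/N) ∑_j e^{iθ_j}| to distinguish incoherence (r ≈ 0) from synchrony (r ≈ 1) are illustrative assumptions, not taken from the talk.

    # Euler-Maruyama simulation of the noisy Kuramoto model (illustrative parameters)
    import numpy as np

    def kuramoto(N=200, kappa=1.0, gamma=0.05, sigma=0.1, dt=0.01, T=50.0, seed=0):
        rng = np.random.default_rng(seed)
        omega = rng.uniform(1 - gamma, 1 + gamma, N)       # natural frequencies ~ g(ω)
        theta = rng.uniform(0, 2 * np.pi, N)               # initial phases
        for _ in range(int(T / dt)):
            # coupling term (κ/N) Σ_j sin(θ_j − θ_i) for every oscillator i
            coupling = (kappa / N) * np.sin(theta[None, :] - theta[:, None]).sum(axis=1)
            theta += (omega + coupling) * dt + sigma * np.sqrt(dt) * rng.standard_normal(N)
            theta %= 2 * np.pi
        return np.abs(np.exp(1j * theta).mean())           # order parameter r at time T

    print("weak coupling  :", kuramoto(kappa=0.05))        # r typically stays small
    print("strong coupling:", kuramoto(kappa=2.0))         # r typically close to 1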
Introduction: Kuramoto model
Movies of the incoherence and synchrony solutions

[Movies: oscillator phases on the unit circle for the incoherence solution (left) and the synchrony solution (right).]
Introduction: Oscillator examples
Motivation: Oscillators in biology

Oscillators in biology; coupled oscillators.

[Images taken from a Google image search.]
Introduction: Oscillator examples
Hodgkin-Huxley type neuron model

    C dV/dt = −g_T · m²_∞(V) · h · (V − E_T) − g_h · r · (V − E_h) − …
    dh/dt = ( h_∞(V) − h ) / τ_h(V)
    dr/dt = ( r_∞(V) − r ) / τ_r(V)

[Figures: neural spike train (membrane voltage vs. time); limit cycle in the (V, h, r) phase space.]

Normal form reduction:    θ̇_i = ω_i + u_i · Φ(θ_i)

[6] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004
Introduction: Role of Neural Rhythms

Question (Fundamental question in neuroscience)
Why is synchrony (neural rhythms) useful? Does it have a functional role?

1. Synchronization: phase transition in a controlled system (motivated by coupled oscillators)
   H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, "Synchronization of Coupled Oscillators is a Game," IEEE TAC
2. Neuronal computations: Bayesian inference; neural circuits as particle filters (Lee & Mumford)
   T. Yang, P. G. Mehta and S. P. Meyn, "A Control-oriented Approach for Particle Filtering," ACC 2011, CDC 2011
3. Learning: synaptic plasticity via long-term potentiation (Hebbian learning); "Neurons that fire together wire together"
   H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, "Learning in Mean-Field Oscillator Games," CDC 2010

Destexhe & Marder, Nature, 2004; Kopell et al., Neuroscience, 2009; Lee & Mumford, J. Opt. Soc., 2003
Part I
Mean-field Control
Collaborators
Huibing Yin Sean P. Meyn Uday V. Shanbhag
“Synchronization of coupled oscillators is a game,” IEEE TAC
“Learning in Mean-Field Oscillator Games,” CDC 2010
“On the Efficiency of Equilibria in Mean-field Oscillator Games,” ACC 2011
Derivation of model: Problem statement
Oscillators dynamic game

Dynamics of the i-th oscillator:

    dθ_i = ( ω_i + u_i(t) ) dt + σ dξ_i,   i = 1, …, N,  t ≥ 0

u_i(t) : control input

The i-th oscillator seeks to minimize the long-run average cost

    η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds
                                            (cost of anarchy)  (cost of control)

θ_{−i} = (θ_j)_{j≠i}
R : control penalty
c(·) : cost function,  c(θ_i; θ_{−i}) = (1/N) ∑_{j≠i} c•(θ_i, θ_j),   c• ≥ 0
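The cost structure can be made concrete with a short Monte Carlo sketch: simulate the population under some admissible decentralized control and time-average the running cost. The control used here is the Kuramoto-type law u_i = −(κ/N) ∑_j sin(θ_i − θ_j), and the pairwise cost is c•(θ_i, θ_j) = (1/2) sin²((θ_i − θ_j)/2), the choice used later in the talk; all numerical parameter values are illustrative assumptions.

    # Monte Carlo estimate of the time-averaged cost η_i under a Kuramoto-type control
    import numpy as np

    def estimate_costs(N=50, kappa=1.0, R=10.0, sigma=0.1, gamma=0.05,
                       dt=0.01, T=100.0, seed=0):
        rng = np.random.default_rng(seed)
        omega = rng.uniform(1 - gamma, 1 + gamma, N)
        theta = rng.uniform(0, 2 * np.pi, N)
        running = np.zeros(N)
        for _ in range(int(T / dt)):
            diff = theta[None, :] - theta[:, None]              # θ_j − θ_i
            u = (kappa / N) * np.sin(diff).sum(axis=1)          # Kuramoto control law
            c = 0.5 * (np.sin(diff / 2) ** 2).sum(axis=1) / N   # cost of anarchy
            running += (c + 0.5 * R * u ** 2) * dt              # plus cost of control
            theta = (theta + (omega + u) * dt
                     + sigma * np.sqrt(dt) * rng.standard_normal(N)) % (2 * np.pi)
        return running / T                                      # η_i estimates, i = 1..N

    print(estimate_costs()[:5])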
Derivation of model: Mean-field model
Mean-field model derivation

    dθ_i = ( ω_i + u_i(t) ) dt + σ dξ_i
    η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds

[Schematic: each oscillator influences, and is influenced by, the mass (population).]

1. Mean-field approximation:

    c(θ_i; θ_{−i}(t)) = (1/N) ∑_{j≠i} c•(θ_i, θ_j)  →  c̄(θ_i, t)   as N → ∞

2. Optimal control of a single oscillator
   Decentralized control structure

[8] M. Huang, P. Caines, and R. Malhamé, IEEE TAC, 2007 [HCM]; [13] Lasry & Lions, Japan. J. Math., 2007
Derivation of model: Derivation steps
Single oscillator with given cost

Dynamics of the oscillator:

    dθ_i = ( ω_i + u_i(t) ) dt + σ dξ_i,   t ≥ 0

The cost function c̄(θ_i(s), s) is assumed known, and replaces c(θ_i; θ_{−i}) in

    η_i(u_i; c̄) = lim_{T→∞} (1/T) ∫_0^T E[ c̄(θ_i(s), s) + (1/2) R u_i²(s) ] ds

HJB equation:

    ∂_t h_i + ω_i ∂_θ h_i = (1/(2R)) (∂_θ h_i)² − c̄(θ, t) + η*_i − (σ²/2) ∂²_θθ h_i

Optimal control law:   u*_i(t) = φ_i(θ, t) = −(1/R) ∂_θ h_i(θ, t)

[1] D. P. Bertsekas (1995); [14] S. P. Meyn, IEEE TAC, 1997
Derivation of model: Derivation steps
Single oscillator with optimal control

Dynamics of the oscillator (with the optimal control substituted):

    dθ_i(t) = ( ω_i − (1/R) ∂_θ h_i(θ_i, t) ) dt + σ dξ_i(t)

Kolmogorov forward equation for the pdf p(θ, t, ω_i):

    FPK:  ∂_t p + ω_i ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p
Derivation of model: Derivation steps
Mean-field approximation

HJB equation for the population:

    ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ, t) + η(ω) − (σ²/2) ∂²_θθ h        ⇒  h(θ, t, ω)

Population density:

    ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p                    ⇒  p(θ, t, ω)

Enforce cost consistency:

    c̄(θ, t) = ∫_Ω ∫_0^{2π} c•(θ, ϑ) p(ϑ, t, ω) g(ω) dϑ dω   ≈   (1/N) ∑_{j≠i} c•(θ, θ_j)
Derivation of model: Derivation steps
Mean-field model

    dθ_i = ( ω_i + u_i(t) ) dt + σ dξ_i
    η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds

Letting N → ∞ and assuming c(θ_i; θ_{−i}) = (1/N) ∑_{j≠i} c•(θ_i, θ_j(t)) → c̄(θ_i, t):

    HJB:  ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ, t) + η* − (σ²/2) ∂²_θθ h    ⇒  h(θ, t, ω)
    FPK:  ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p              ⇒  p(θ, t, ω)
    Consistency:  c̄(ϑ, t) = ∫_Ω ∫_0^{2π} c•(ϑ, θ) p(θ, t, ω) g(ω) dθ dω

[8] Huang et al., IEEE TAC, 2007; [13] Lasry & Lions, Japan. J. Math., 2007; [16] Weintraub et al., NIPS, 2006
Solutions of PDE model: ε-Nash equilibrium
Solution of the PDEs gives an ε-Nash equilibrium

Optimal control law:

    u°_i = −(1/R) ∂_θ h(θ(t), t, ω) |_{ω = ω_i}

Theorem
For the control law u°_i = −(1/R) ∂_θ h(θ(t), t, ω) |_{ω = ω_i}, we have the ε-Nash equilibrium property

    η_i(u°_i; u°_{−i}) ≤ η_i(u_i; u°_{−i}) + O(1/√N),   i = 1, …, N,

for any adapted control u_i.

So, we look for solutions of the PDEs.
Solutions of PDE model: Synchronization
Main result: Synchronization as a solution of the game

Game formulation:

    dθ_i = ( ω_i + u_i ) dt + σ dξ_i
    η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds

Kuramoto model:

    dθ_i = ( ω_i + (κ/N) ∑_{j=1}^{N} sin(θ_j − θ_i) ) dt + σ dξ_i

[Figure: phase diagrams. Game: incoherence for R > R_c(γ), synchrony (locking) below, in the (γ, R^{−1/2}) plane. Kuramoto model: incoherence for κ < κ_c(γ), synchrony above, in the (γ, κ) plane.]

Yin et al., ACC 2010; Strogatz et al., J. Stat. Phys., 1991
Solutions of PDE model: Synchronization
Incoherence solution (PDE)

Incoherence solution:   h(θ, t, ω) = h_0(θ) := 0,    p(θ, t, ω) = p_0(θ) := 1/(2π)

With h ≡ 0 and p ≡ 1/(2π), the mean-field cost c̄(θ, t) is a constant independent of θ, so both equations below are satisfied with η* equal to that constant:

    h(θ, t, ω) = 0:       ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ, t) + η* − (σ²/2) ∂²_θθ h
    p(θ, t, ω) = 1/(2π):  ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p
    c̄(θ, t) = ∫_Ω ∫_0^{2π} c•(θ, ϑ) p(ϑ, t, ω) g(ω) dϑ dω
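A quick numerical check (not in the slides) of the statement above: at the incoherence density p = 1/(2π), the mean-field cost c̄(θ) = ∫ c•(θ, ϑ) p(ϑ) dϑ is constant in θ, with the pairwise cost c•(θ, ϑ) = (1/2) sin²((θ − ϑ)/2) used in the talk; the constant (here 1/4) is the value of η*.

    # Verify that c̄(θ) is independent of θ at the incoherence solution p = 1/(2π)
    import numpy as np

    def cbar(theta, n_grid=4000):
        vartheta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
        integrand = 0.5 * np.sin((theta - vartheta) / 2.0) ** 2
        return integrand.mean()            # equals ∫ c•(θ, ϑ) dϑ / (2π) on a uniform grid

    for theta in [0.0, 1.0, 2.5, 5.0]:
        print(theta, cbar(theta))          # all outputs ≈ 0.25, independent of θ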
Solutions of PDE model: Synchronization
Incoherence solution (Finite population)

Closed-loop dynamics (with u_i = 0):   dθ_i = ω_i dt + σ dξ_i(t)

Average cost:

    η_i = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] dt        (u_i = 0)
        = lim_{T→∞} (1/N) ∑_{j≠i} (1/T) ∫_0^T E[ (1/2) sin²( (θ_i(t) − θ_j(t))/2 ) ] dt
        = (1/N) ∑_{j≠i} ∫_0^{2π} E[ (1/2) sin²( (θ_i(t) − ϑ)/2 ) ] (1/(2π)) dϑ
        = ( (N − 1)/N ) η_0

[Figure: snapshot of the phases spread uniformly around the circle.]

ε-Nash property:

    η_i(u°_i; u°_{−i}) ≤ η_i(u_i; u°_{−i}) + O(1/√N),   i = 1, …, N.
Analysis of phase transition: Numerics
Bifurcation diagram

[Figure: (left) phase diagram in the (γ, R^{−1/2}) plane: incoherence for R > R_c(γ), synchrony (locking) for R < R_c(γ); (right) average cost η(ω) vs. R^{−1/2} for ω = 0.95, 1, 1.05: on the incoherence branch (R > R_c) η(ω) = η_0, on the synchrony branch (R < R_c) η(ω) < η_0.]

    dθ_i = ( ω_i + u_i ) dt + σ dξ_i
    η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds

1. Convergence theory (as N → ∞)
2. Bifurcation analysis
3. Analytical and numerical computations

Yin et al., IEEE TAC, to appear; [17] Yin et al., ACC 2010
Part II
Mean-field Estimation
Collaborators
Tao Yang Sean P. Meyn
“A Control-oriented Approach for Particle Filtering,” ACC 2011
“Feedback Particle Filter with Mean-field Coupling,” submitted to CDC 2011
Problem Definition
Filtering problem

Signal and observation processes:

    dθ(t) = ω dt + σ_B dB(t)   mod 2π                    (1)
    dZ(t) = h(θ(t)) dt + σ_W dW(t)                       (2)

Particle filter:

    dθ_i(t) = ω dt + σ_B dB_i(t) + U_i(t)   mod 2π,   i = 1, …, N.     (3)

Objective: choose the control U_i(t) for the purpose of filtering.
Problem Definition
Filtering problem

Signal and observation processes:

    dX(t) = a(X(t)) dt + σ_B dB(t)                       (1)
    dZ(t) = h(X(t)) dt + σ_W dW(t)                       (2)

X(t) ∈ R : state,  Z(t) ∈ R : observation
a(·), h(·) : C¹ functions
B(t), W(t) : independent standard Wiener processes,  σ_B ≥ 0, σ_W > 0
X(0) ∼ p_0(·)
Define Z_t := σ{Z(s), 0 ≤ s ≤ t}

Objective: estimate the posterior distribution p* of X(t) given Z_t.

Solution approaches:
Linear dynamics: Kalman filter (R. E. Kalman, 1960)
Nonlinear dynamics: Kushner's equation (H. J. Kushner, 1964)
Numerical methods: particle filtering (N. J. Gordon et al., 1993)
Existing Approach: Kalman Filter
Linear case: Kalman filter

Linear Gaussian system:

    dX(t) = a X(t) dt + σ_B dB(t)                        (1)
    dZ(t) = h X(t) dt + σ_W dW(t)                        (2)

Kalman filter:   p* = N(μ(t), Σ(t))

    dμ(t) = a μ(t) dt + ( h Σ(t)/σ_W² ) ( dZ(t) − h μ(t) dt )
                                        (innovation error)
    dΣ(t)/dt = 2a Σ(t) + σ_B² − h² Σ²(t)/σ_W²

[9] R. E. Kalman, Trans. ASME, Ser. D: J. Basic Eng., 1961
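A minimal sketch of the filter above in discrete time (Euler discretization of both the signal/observation model and the mean/variance equations); the parameter values are illustrative assumptions.

    # Continuous-time Kalman filter, Euler-discretized and run on simulated data
    import numpy as np

    a, h, sigma_B, sigma_W = -0.5, 3.0, 1.0, 0.5
    dt, T = 0.01, 20.0
    rng = np.random.default_rng(1)

    X, mu, Sigma = 1.0, 0.0, 1.0          # true state, filter mean, filter variance
    for _ in range(int(T / dt)):
        # simulate signal and observation increments
        X += a * X * dt + sigma_B * np.sqrt(dt) * rng.standard_normal()
        dZ = h * X * dt + sigma_W * np.sqrt(dt) * rng.standard_normal()
        # Kalman filter update driven by the innovation error dZ − h μ dt
        K = h * Sigma / sigma_W ** 2
        mu += a * mu * dt + K * (dZ - h * mu * dt)
        Sigma += (2 * a * Sigma + sigma_B ** 2 - h ** 2 * Sigma ** 2 / sigma_W ** 2) * dt

    print("state:", X, " estimate:", mu, " variance:", Sigma)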
Existing Approach: Nonlinear case: Kushner's equation
Nonlinear case: K-S equation

Signal and observation processes:

    dX(t) = a(X(t)) dt + σ_B dB(t),                      (1)
    dZ(t) = h(X(t)) dt + σ_W dW(t)                       (2)

B(t), W(t) are independent standard Wiener processes.

Kushner's equation: the posterior distribution p* is the solution of a stochastic PDE

    dp* = L†(p*) dt + (h − ĥ) (σ_W²)^{−1} ( dZ(t) − ĥ dt ) p*

where

    ĥ = E[ h(X(t)) | Z_t ] = ∫ h(x) p*(x, t) dx
    L†(p*) = − ∂( p* · a(x) )/∂x + (1/2) σ_B² ∂² p*/∂x²

No closed-form solution in general (closure problem).

[11] H. J. Kushner, SIAM J. Control, 1964
Existing Approach: Nonlinear case: Kushner's equation
Numerical methods: Particle filter

Idea: approximate the posterior p* in terms of particles {X_i(t)}_{i=1}^N, so that

    p* ≈ (1/N) ∑_{i=1}^N δ_{X_i(t)}(·)

Algorithm outline (see the code sketch below):
Initialization: X_i(0) ∼ p*_0(·).
At each time step:
  Importance sampling (for weight update)
  Resampling (for variance reduction)

[5] N. J. Gordon, D. J. Salmond, and A. F. M. Smith, IEE Proceedings F Radar and Signal Processing, 1993
[4] A. Doucet, N. de Freitas, N. Gordon, Springer-Verlag, 2001
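Here is a self-contained sketch of the bootstrap particle filter for a continuous-discrete version of the problem: the proposal is an Euler step of the signal model, the weights come from a Gaussian likelihood, and resampling is multinomial. The drift a(x) = −x/2, observation h(x) = x, and all parameter values are illustrative assumptions.

    # Bootstrap particle filter: importance sampling + resampling (illustrative model)
    import numpy as np

    def a(x):                 # assumed signal drift
        return -0.5 * x

    def h(x):                 # assumed observation function
        return x

    sigma_B, sigma_W, dt, N, steps = 1.0, 0.5, 0.05, 500, 400
    rng = np.random.default_rng(2)

    X = 1.0                                     # true state
    particles = rng.standard_normal(N)          # X_i(0) ~ p*_0
    for _ in range(steps):
        # simulate the signal and a discrete-time observation z_n = h(X) + noise
        X += a(X) * dt + sigma_B * np.sqrt(dt) * rng.standard_normal()
        z = h(X) + sigma_W * rng.standard_normal()
        # importance sampling: propagate particles with the signal model as proposal
        particles += a(particles) * dt + sigma_B * np.sqrt(dt) * rng.standard_normal(N)
        w = np.exp(-0.5 * (z - h(particles)) ** 2 / sigma_W ** 2)
        w /= w.sum()                            # normalized importance weights
        # resampling (multinomial) for variance reduction
        particles = rng.choice(particles, size=N, p=w)

    print("state:", X, " posterior mean estimate:", particles.mean())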
A Control-Oriented Approach to Particle Filtering: Overview
Mean-field filtering

Signal and observation processes:

    dX(t) = a(X(t)) dt + σ_B dB(t)                       (1)
    dZ(t) = h(X(t)) dt + σ_W dW(t)                       (2)

Controlled system (N particles):

    dX_i(t) = a(X_i(t)) dt + σ_B dB_i(t) + U_i(t),   i = 1, …, N        (3)
                                           (mean-field control)

{B_i(t)}_{i=1}^N are independent standard Wiener processes.

Objective: choose the control U_i(t) such that the empirical distribution of the population approximates the posterior distribution p* as N → ∞.

[7] M. Huang, P. E. Caines, and R. P. Malhamé, IEEE TAC, 2007
[18] H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, IEEE TAC, to appear
[12] J. Lasry and P. Lions, Japanese Journal of Mathematics, 2007
A Control-Oriented Approach to Particle Filtering: Example: Linear case
Example: Linear case

Controlled system, for i = 1, …, N:

    dX_i(t) = a X_i(t) dt + σ_B dB_i(t) + U_i(t),
    U_i(t) = ( h Σ(t)/σ_W² ) [ dZ(t) − h ( (X_i(t) + μ(t))/2 ) dt ]        (3)

where

    μ(t) ≈ μ̄^{(N)}(t) = (1/N) ∑_{i=1}^N X_i(t)                        ← sample mean
    Σ(t) ≈ Σ̄^{(N)}(t) = (1/(N−1)) ∑_{i=1}^N ( X_i(t) − μ(t) )²        ← sample variance
A Control-Oriented Approach to Particle Filtering: Example: Linear case
Example: Linear case

Feedback particle filter:

    dX_i(t) = a X_i(t) dt + σ_B dB_i(t) + ( h Σ̄^{(N)}(t)/σ_W² ) [ dZ(t) − h ( (X_i(t) + μ̄^{(N)}(t))/2 ) dt ]      (3)

Mean-field model: let p denote the conditional distribution of X_i(t) given Z_t, with p(x, 0) = N(μ(0), Σ(0)).  Then p = N(μ(t), Σ(t)) where

    dμ(t) = a μ(t) dt + ( h Σ(t)/σ_W² ) ( dZ(t) − h μ(t) dt )
    dΣ(t)/dt = 2a Σ(t) + σ_B² − h² Σ²(t)/σ_W²

Same as the Kalman filter!  As N → ∞, the empirical distribution approximates the posterior p*.
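A simulation sketch of the linear-Gaussian feedback particle filter above, with the gain built from the particles' sample mean and variance, run side by side with the Kalman filter on the same observation path; parameter values are illustrative assumptions.

    # Linear FPF vs. Kalman filter on a common observation path (illustrative parameters)
    import numpy as np

    a, h, sigma_B, sigma_W = -0.5, 3.0, 1.0, 0.5
    dt, steps, N = 0.01, 2000, 500
    rng = np.random.default_rng(3)

    X = 1.0
    Xp = rng.standard_normal(N)                 # particles X_i(0) ~ N(0, 1)
    mu_k, Sig_k = 0.0, 1.0                      # Kalman filter, initialized consistently
    for _ in range(steps):
        X += a * X * dt + sigma_B * np.sqrt(dt) * rng.standard_normal()
        dZ = h * X * dt + sigma_W * np.sqrt(dt) * rng.standard_normal()
        # feedback particle filter with empirical mean/variance in the gain
        mu_N, Sig_N = Xp.mean(), Xp.var(ddof=1)
        gain = h * Sig_N / sigma_W ** 2
        Xp += (a * Xp * dt + sigma_B * np.sqrt(dt) * rng.standard_normal(N)
               + gain * (dZ - h * (Xp + mu_N) / 2 * dt))
        # Kalman filter on the same observations, for comparison
        K = h * Sig_k / sigma_W ** 2
        mu_k += a * mu_k * dt + K * (dZ - h * mu_k * dt)
        Sig_k += (2 * a * Sig_k + sigma_B ** 2 - h ** 2 * Sig_k ** 2 / sigma_W ** 2) * dt

    print("FPF    mean/var:", Xp.mean(), Xp.var(ddof=1))
    print("Kalman mean/var:", mu_k, Sig_k)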
A Control-Oriented Approach to Particle Filtering: Example: Linear case
Comparison plots

    mse = (1/T) ∫_0^T | ( Σ_t^{(N)} − Σ_t ) / Σ_t |² dt

[Figure: mse vs. number of particles N for the bootstrap particle filter (BPF) and the feedback particle filter (FPF), together with sample trajectories of the two filters.]

Parameters: a ∈ [−1, 1], h = 3, σ_B = 1, σ_W = 0.5, dt = 0.01, T = 50 and N ∈ {20, 50, 100, 200, 500, 1000}.
Methodology
Methodology: Variational problem

Continuous-discrete time problem.

Signal and observation processes:

    dX(t) = a(X(t)) dt + σ_B dB(t)
    Z(t_n) = h(X(t_n)) + W(t_n)

Particle filter:

    Filter:   dX_i(t) = a(X_i(t)) dt + σ_B dB_i(t)
    Control:  X_i(t_n) = X_i(t_n^−) + u( X_i(t_n^−) )

Belief maps:

    p*_n(x): conditional pdf of X(t_n) given Z_{t_n}   (Bayes' rule)
    p_n(x):  conditional pdf of X_i(t_n) given Z_{t_n}

Variational problem:

    K-L metric:      KL( p_n ‖ p*_n ) = E_{p_n}[ ln( p_n / p*_n ) ]
    Control design:  u*(x) = arg min_u KL( p_n ‖ p*_n ).
Feedback particle filter
Feedback particle filter (FPF)

Problem:

    dX(t) = a(X(t)) dt + σ_B dB(t),                      (1)
    dZ(t) = h(X(t)) dt + σ_W dW(t)                       (2)

FPF:

    dX_i(t) = a(X_i(t)) dt + σ_B dB_i(t) + v(X_i(t)) dI_i(t) + (1/2) σ_W² v(X_i(t)) v′(X_i(t)) dt      (3)

    Innovation error:       dI_i(t) := dZ(t) − (1/2) ( h(X_i(t)) + ĥ ) dt
    Population prediction:  ĥ = (1/N) ∑_{i=1}^N h(X_i(t)).

Euler-Lagrange equation for the gain function v:

    − ∂/∂x [ (1/p(x,t)) ∂/∂x { p(x,t) v(x,t) } ] = (1/σ_W²) h′(x),                                      (4)

with boundary condition lim_{x→±∞} v(x,t) p(x,t) = 0.
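Integrating (4) once and using the boundary condition (this is eq. (9) in the appendix, −(pv)′ = (h − ĥ)p/σ_W²) gives p(x,t) v(x,t) = (1/σ_W²) ∫_x^∞ ( h(s) − ĥ ) p(s,t) ds, which can be evaluated on a grid. The sketch below does this for an assumed Gaussian density and h(x) = x, for which the formula reproduces the constant Kalman-type gain Σ/σ_W²; it is an illustration of the E-L equation, not the talk's implementation.

    # Gain function v(x,t) from the once-integrated E-L equation, by quadrature
    import numpy as np

    def fpf_gain(x, p, h_vals, sigma_W):
        """Gain v on a uniform grid x, from density values p and observation values h(x)."""
        dx = x[1] - x[0]
        w = p / (p.sum() * dx)                  # normalize the density on the grid
        h_hat = (h_vals * w).sum() * dx         # ĥ = ∫ h p dx
        integrand = (h_vals - h_hat) * w
        # tail integral ∫_x^∞ (h − ĥ) p ds, trapezoid rule accumulated from the left
        cum = np.concatenate(([0.0],
              np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * dx)))
        tail = cum[-1] - cum
        return tail / (sigma_W ** 2 * w)

    x = np.linspace(-6, 6, 2001)
    mu, Sigma, sigma_W = 0.5, 0.8, 0.5
    p = np.exp(-(x - mu) ** 2 / (2 * Sigma)) / np.sqrt(2 * np.pi * Sigma)
    v = fpf_gain(x, p, h_vals=x, sigma_W=sigma_W)
    print(v[len(x) // 2], Sigma / sigma_W ** 2)  # interior values ≈ Σ/σ_W² = 3.2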
Consistency result
Consistency result

p* denotes the conditional pdf of X(t) given Z_t:

    dp* = L†(p*) dt + (h − ĥ)(σ_W²)^{−1} ( dZ(t) − ĥ dt ) p*

p denotes the conditional pdf of X_i(t) given Z_t:

    dp = L† p dt − ∂/∂x ( v p ) dZ(t) − ∂/∂x ( u p ) dt + (σ_W²/2) ∂²/∂x² ( p v² ) dt

Theorem
Consider the two evolution equations for p and p*, defined according to the solution of the Kolmogorov forward equation and the K-S equation, respectively.  Suppose that the control function v(x,t) is obtained according to (4).  Then, provided p(x,0) = p*(x,0), we have for all t ≥ 0,

    p(x,t) = p*(x,t)
Linear Gaussian case
Linear Gaussian case

Linear system:

    dX(t) = a X(t) dt + σ_B dB(t)                        (1)
    dZ(t) = h X(t) dt + σ_W dW(t)                        (2)

    p(x,t) = ( 1/√(2π Σ(t)) ) exp( − (x − μ(t))² / (2 Σ(t)) )

E-L equation:

    − ∂/∂x [ (1/p) ∂/∂x { p v } ] = h/σ_W²    ⇒    v(x,t) = h Σ(t)/σ_W²

FPF:

    dX_i(t) = a X_i(t) dt + σ_B dB_i(t) + ( h Σ(t)/σ_W² ) [ dZ(t) − h ( (X_i(t) + μ(t))/2 ) dt ]      (3)
Example
Example: Filtering of a nonlinear oscillator

Nonlinear oscillator:

    dθ(t) = ω dt + σ_B dB(t)   mod 2π                    (1)
    dZ(t) = h(θ(t)) dt + σ_W dW(t),    h(θ) = (1 + cos θ)/2        (2)

Controlled system (N particles):

    dθ_i(t) = ω dt + σ_B dB_i(t) + v(θ_i(t)) [ dZ(t) − (1/2)( h(θ_i(t)) + ĥ ) dt ]
              + (1/2) σ_W² v v′ dt   mod 2π,   i = 1, …, N,        (3)

where v(θ_i(t)) is obtained via the solution of the E-L equation (with h′(θ) = −(sin θ)/2):

    − ∂/∂θ [ (1/p(θ,t)) ∂/∂θ { p(θ,t) v(θ,t) } ] = − sin θ / (2 σ_W²)      (4)
Example
Example: Filtering of a nonlinear oscillator

Fourier form of p(θ,t):

    p(θ,t) = 1/(2π) + P_s(t) sin θ + P_c(t) cos θ.

Approximate solution of the E-L equation (4) (using a perturbation method):

    v(θ,t)  = (1/(2σ_W²)) { − sin θ + (π/2) [ P_c(t) sin 2θ − P_s(t) cos 2θ ] },
    v′(θ,t) = (1/(2σ_W²)) { − cos θ + π [ P_c(t) cos 2θ + P_s(t) sin 2θ ] }

where

    P_c(t) ≈ P̄_c^{(N)}(t) = (1/(πN)) ∑_{j=1}^N cos θ_j(t),    P_s(t) ≈ P̄_s^{(N)}(t) = (1/(πN)) ∑_{j=1}^N sin θ_j(t).

Reminiscent of coupled oscillators (Winfree).
Example
Simulation Results

Signal and observation processes:

    dθ(t) = 1 dt + 0.5 dB(t)                             (1)
    dZ(t) = ( (1 + cos θ(t))/2 ) dt + 0.4 dW(t)          (2)

Particle system (N = 100):

    dθ_i(t) = 1 dt + σ_B dB_i(t) + U( θ_i(t); P̄_c^{(N)}(t), P̄_s^{(N)}(t) )   mod 2π       (3)

[Figure: (a) first and second Fourier harmonics of the particle distribution vs. time; (b) true state θ(t) and conditional-mean estimate θ̄^N(t) with confidence bounds; (c) estimated pdf over [0, 2π].]
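A simulation sketch of the oscillator FPF with the perturbation (Fourier) approximation of the gain given above. The model parameters follow the simulation slide (ω = 1, σ_B = 0.5, σ_W = 0.4, N = 100); the step size, horizon, and the use of a circular mean as the point estimate are illustrative assumptions.

    # Feedback particle filter for the nonlinear oscillator, with the approximate gain
    import numpy as np

    omega, sigma_B, sigma_W, N = 1.0, 0.5, 0.4, 100
    dt, steps = 0.01, 5000
    rng = np.random.default_rng(4)

    def h(th):
        return 0.5 * (1.0 + np.cos(th))

    theta_true = 0.0
    theta = rng.uniform(0, 2 * np.pi, N)
    for _ in range(steps):
        theta_true = (theta_true + omega * dt
                      + sigma_B * np.sqrt(dt) * rng.standard_normal()) % (2 * np.pi)
        dZ = h(theta_true) * dt + sigma_W * np.sqrt(dt) * rng.standard_normal()
        # empirical Fourier coefficients of the particle population
        Pc = np.cos(theta).sum() / (np.pi * N)
        Ps = np.sin(theta).sum() / (np.pi * N)
        # perturbation solution of the E-L equation: gain v and its derivative v'
        v = (-np.sin(theta) + (np.pi / 2) * (Pc * np.sin(2 * theta)
             - Ps * np.cos(2 * theta))) / (2 * sigma_W ** 2)
        dv = (-np.cos(theta) + np.pi * (Pc * np.cos(2 * theta)
              + Ps * np.sin(2 * theta))) / (2 * sigma_W ** 2)
        h_hat = h(theta).mean()
        dI = dZ - 0.5 * (h(theta) + h_hat) * dt        # innovation error
        theta = (theta + omega * dt + sigma_B * np.sqrt(dt) * rng.standard_normal(N)
                 + v * dI + 0.5 * sigma_W ** 2 * v * dv * dt) % (2 * np.pi)

    est = np.angle(np.exp(1j * theta).mean()) % (2 * np.pi)   # circular mean of particles
    print("true phase:", theta_true, " estimate:", est)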
Thank you!
Website: http://www.mechse.illinois.edu/research/mehtapg
Collaborators
Huibing Yin Tao Yang Sean Meyn Uday Shanbhag
“Synchronization of coupled oscillators is a game,” IEEE TAC, ACC 2010
“Learning in Mean-Field Oscillator Games,” CDC 2010
“On the Efficiency of Equilibria in Mean-field Oscillator Games,” ACC 2011
“A Control-oriented Approach for Particle Filtering,” ACC2011, CDC2011
Part III
Learning in mean-field game
Learning: Q-function approximation
Approx. DP

Optimality equation:

    min_{u_i} H_i(θ, u_i; θ_{−i}(t)) = η*_i,    H_i(θ, u_i; θ_{−i}(t)) := c(θ; θ_{−i}(t)) + (1/2) R u_i² + D_{u_i} h_i(θ, t)

Optimal control law:   u*_i = −(1/R) ∂_θ h_i(θ, t)
Kuramoto law:          u_i^{(Kur)} = −(κ/N) ∑_{j≠i} sin(θ_i − θ_j(t))

Parameterization:

    H_i^{(A_i, φ_i)}(θ, u_i; θ_{−i}(t)) = c(θ; θ_{−i}(t)) + (1/2) R u_i² + (ω_i − 1 + u_i) A_i S^{(φ_i)} + (σ²/2) A_i C^{(φ_i)}

where

    S^{(φ)}(θ, θ_{−i}) = (1/N) ∑_{j≠i} sin(θ − θ_j − φ),    C^{(φ)}(θ, θ_{−i}) = (1/N) ∑_{j≠i} cos(θ − θ_j − φ)

Approximate optimal control (see the sketch below):

    u_i^{(A_i, φ_i)} = arg min_{u_i} { H_i^{(A_i, φ_i)}(θ, u_i; θ_{−i}(t)) } = −(A_i/R) S^{(φ_i)}(θ, θ_{−i})

Watkins & Dayan, Q-learning, 1992; Bertsekas & Tsitsiklis, NDP, 1996; Mehta & Meyn, CDC 2009
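The parameterized control law is simple to evaluate; the sketch below computes u_i = −(A_i/R) S^{(φ_i)}(θ_i, θ_{−i}) for a population of phases, using the basis functions S^{(φ)} and C^{(φ)} defined above. The values of A_i, φ_i and R are placeholders (in the talk they are learned via stochastic approximation); note that A_i = κR and φ_i = 0 recovers the Kuramoto law.

    # Evaluating the approximate optimal control u_i = -(A_i/R) S^(φ_i)(θ_i, θ_{-i})
    import numpy as np

    def S_phi(theta_i, theta_others, phi):
        """S^(φ)(θ, θ_{-i}) = (1/N) Σ_{j≠i} sin(θ − θ_j − φ)."""
        N = theta_others.size + 1
        return np.sin(theta_i - theta_others - phi).sum() / N

    def C_phi(theta_i, theta_others, phi):
        """C^(φ)(θ, θ_{-i}) = (1/N) Σ_{j≠i} cos(θ − θ_j − φ)."""
        N = theta_others.size + 1
        return np.cos(theta_i - theta_others - phi).sum() / N

    def approx_control(theta, A, phi, R):
        """u_i^{(A_i, φ_i)} for every oscillator i in the population."""
        u = np.empty_like(theta)
        for i, th_i in enumerate(theta):
            others = np.delete(theta, i)
            u[i] = -(A[i] / R) * S_phi(th_i, others, phi[i])
        return u

    rng = np.random.default_rng(5)
    theta = rng.uniform(0, 2 * np.pi, 10)
    A, phi, R = np.ones(10), np.zeros(10), 10.0
    print(approx_control(theta, A, phi, R))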
Learning: Stochastic approximation
Learning algorithm

Bellman error (pointwise):

    L^{(A_i, φ_i)}(θ, t) = min_{u_i} { H_i^{(A_i, φ_i)} } − η_i^{(A*_i, φ*_i)}

Stochastic approximation algorithm:

    ẽ(A_i, φ_i) = ∑_{k=1}^{2} | ⟨ L^{(A_i, φ_i)}, φ̃_k(θ) ⟩ |²

    dA_i/dt = −ε dẽ(A_i, φ_i)/dA_i,    dφ_i/dt = −ε dẽ(A_i, φ_i)/dφ_i

Convergence analysis:  A_i(t) → A*,  φ_i(t) → φ*

[Figure: parameter trajectories of the learning algorithm in the synchrony and incoherence regimes.]

Yin et al., CDC 2010
Learning: Bifurcation diagram
Comparison of average cost

    dθ_i = ( ω_i + u_i ) dt + σ dξ_i
    η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds

[Figure: average cost obtained with the learning algorithm, compared against the mean-field (PDE) solution.]
Part IV
Appendix
Analysis of phase transition: Incoherence solution
Overview of the steps

    HJB:  ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ,t) + η* − (σ²/2) ∂²_θθ h    ⇒  h(θ,t,ω)
    FPK:  ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p             ⇒  p(θ,t,ω)
    Consistency:  c̄(ϑ,t) = ∫_Ω ∫_0^{2π} c•(ϑ,θ) p(θ,t,ω) g(ω) dθ dω

Assume c•(ϑ,θ) = c•(ϑ − θ) = (1/2) sin²( (ϑ − θ)/2 ).

Incoherence solution:   h(θ,t,ω) = h_0(θ) := 0,    p(θ,t,ω) = p_0(θ) := 1/(2π)
Analysis of phase transition: Bifurcation analysis
Linearization and spectra

Linearized PDE (about the incoherence solution), with z̃ = (h̃, p̃):

    ∂/∂t z̃(θ,t,ω) = [ −ω ∂_θ h̃ − c̄ − (σ²/2) ∂²_θθ h̃ ;
                       −ω ∂_θ p̃ + (1/(2πR)) ∂²_θθ h̃ + (σ²/2) ∂²_θθ p̃ ]  =:  L_R z̃(θ,t,ω)

Spectrum of the linear operator:

1. Continuous spectrum {S^{(k)}}_{k=−∞}^{+∞},

    S^{(k)} := { λ ∈ C : λ = ± (σ²/2) k² − kωi  for all ω ∈ Ω }

2. Discrete spectrum, given by the characteristic equation

    (1/(8R)) ∫_Ω g(ω) / ( (λ − σ²/2 + ωi)(λ + σ²/2 + ωi) ) dω + 1 = 0.

[Figure: continuous and discrete spectra in the complex plane (γ = 0.1); the discrete eigenvalues (k = 1, 2 branches) move as R decreases.]
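A rough numerical sketch of the discrete-spectrum computation: assuming g(ω) is uniform on [1−γ, 1+γ] (as in the background slide), the characteristic equation can be evaluated by quadrature on a grid of λ and approximate roots located where |F(λ)| is smallest. The values σ² = 0.1 and γ = 0.1 are read off the spectra figure and, like the grid and the R values, are assumptions.

    # Scan the characteristic function F(λ) = (1/8R) ∫ g(ω) dω /((λ−σ²/2+iω)(λ+σ²/2+iω)) + 1
    import numpy as np

    sigma2, gamma = 0.1, 0.1
    omega = np.linspace(1 - gamma, 1 + gamma, 201)
    g = np.full_like(omega, 1.0 / (2 * gamma))          # assumed uniform frequency density
    dw = omega[1] - omega[0]

    def char_fn(lam, R):
        """F(λ) for an array of λ values; quadrature over ω on the last axis."""
        lam = lam[..., None]
        integrand = g / ((lam - sigma2 / 2 + 1j * omega) * (lam + sigma2 / 2 + 1j * omega))
        return integrand.sum(axis=-1) * dw / (8 * R) + 1.0

    re = np.linspace(-0.3, 0.3, 61)
    im = np.linspace(-1.5, -0.5, 101)
    lam_grid = re[None, :] + 1j * im[:, None]

    for R in [60.0, 30.0, 15.0]:
        absF = np.abs(char_fn(lam_grid, R))
        i, j = np.unravel_index(absF.argmin(), absF.shape)
        lam = lam_grid[i, j]
        print(f"R = {R:5.1f}: approximate discrete eigenvalue λ ≈ {lam.real:.3f} {lam.imag:+.3f}i")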
Analysis of phase transition: Bifurcation analysis
Bifurcation diagram (Hamiltonian Hopf)

Characteristic equation:

    (1/(8R)) ∫_Ω g(ω) / ( (λ − σ²/2 + ωi)(λ + σ²/2 + ωi) ) dω + 1 = 0.

Stability proof.

[Figure: (a) continuous spectrum (independent of R) and discrete spectrum (a function of R) in the complex plane, with the bifurcation point where a discrete pair reaches the imaginary axis; (c) the resulting critical curve R_c(γ): incoherence for R > R_c(γ), synchrony below.]

[3] Dellnitz et al., Int. Series Num. Math., 1992
Analysis of phase transition: Numerics
Numerical solution of the PDEs

[Movies: incoherence solution for R = 60; synchrony solution for R = 10.]
Analysis of phase transition: Numerics
Bifurcation diagram

[Figure: phase diagram in the (γ, R^{−1/2}) plane (incoherence for R > R_c(γ), synchrony for R < R_c(γ)), and average cost η(ω) vs. R^{−1/2} for ω = 0.95, 1, 1.05: η(ω) = η_0 on the incoherence branch, η(ω) < η_0 on the synchrony branch.]

    dθ_i = ( ω_i + u_i ) dt + σ dξ_i
    η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds
Analysis of phase transition: Numerics
Bifurcation diagram with another cost

c•(θ, ϑ) = 0.25 ( 1 − cos(θ − ϑ) ):

[Figure: bifurcation diagram vs. R^{−1/2}, and the stationary density p(θ) on the synchrony branch at R = 23.8359.]

c•(θ, ϑ) = 1 − cos(θ − ϑ) − cos 3(θ − ϑ):

[Figure: bifurcation diagram vs. R^{−1/2}, and the corresponding multi-modal stationary densities p(θ).]
Feedback particle filter
Numerical method: particle filter

Basic algorithm (discretized time):

Step 0: Initialization.
  For i = 1, …, N, sample x_0^{(i)} ∼ p_{0|0}(x) and set n = 0.

Step 1: Importance sampling step.
  For i = 1, …, N, sample x̃_n^{(i)} ∼ p^N_{n−1|n−1} K(x_n) = (1/N) ∑_{k=1}^N K( x_n | x_{n−1}^{(k)} ).
  For i = 1, …, N, evaluate the normalized importance weight w_n^{(i)}:

      w_n^{(i)} ∝ g( z_n | x̃_n^{(i)} ),    ∑_{i=1}^N w_n^{(i)} = 1,
      p^N_{n|n}(x) = ∑_{i=1}^N w_n^{(i)} δ_{x̃_n^{(i)}}(x)

Step 2: Resampling.
  For i = 1, …, N, sample x_n^{(i)} ∼ p^N_{n|n}(x).
Feedback particle filter
Calculation of the K-L divergence

Recall the definition of the K-L divergence for densities,

    KL( p_n ‖ p̂*_n ) = ∫_R p_n(s) ln( p_n(s) / p̂*_n(s) ) ds.

We make a coordinate transformation s = x + u(x) and express the K-L divergence as:

    KL( p_n ‖ p̂*_n ) = ∫_R  [ p_n^−(x) / |1 + u′(x)| ]  ln(  p_n^−(x) / ( |1 + u′(x)| p̂*_n( x + u(x) | z_n ) )  )  |1 + u′(x)| dx
Feedback particle filter
Derivation of the Euler-Lagrange equation

Denote:

    L(x, u, u′) = −p_n^−(x) [ ln |1 + u′(x)| + ln( p_n^−(x + u) p_{Z|X}(z_n | x + u) ) ].      (4)

The optimization problem thus is a calculus of variations problem:

    min_u ∫ L(x, u, u′) dx.

The minimizer is obtained via the analysis of the first variation, given by the well-known Euler-Lagrange equation:

    ∂L/∂u = d/dx ( ∂L/∂u′ ).

Explicitly substituting the expression (4) for L, we obtain the expected result.
Feedback particle filter
Simulation Results

When the particle filter does not work well:

[Figure: (a) first and second harmonics; (b) true state θ(t) and conditional-mean estimate θ̄^N(t) with bounds; (c) pdf estimate.]
Feedback particle filter
Derivation of the FPK equation for the FPF

Controlled system (N particles):

    dX_i(t) = a(X_i(t)) dt + U_i(t) + σ_B dB_i(t),   i = 1, …, N        (3)
              (mean-field control)

Using the ansatz U_i(t) = u(X_i(t)) dt + v(X_i(t)) dZ(t), the particle system becomes:

    dX_i(t) = ( a(X_i(t)) + u(X_i(t)) ) dt + σ_B dB_i(t) + v(X_i(t)) dZ(t),   i = 1, …, N        (3)

Then the conditional distribution of X_i(t) given the filtration Z_t, p(x,t), satisfies the forward equation:

    dp = L† p dt − ∂/∂x ( v p ) dZ_t − ∂/∂x ( u p ) dt + (σ_W²/2) ∂²/∂x² ( p v² ) dt.

Note that, after a short calculation, we have

    u(x,t) = v(x,t) [ −(1/2)( h(x) + ĥ ) + (1/2) σ_W² v′(x,t) ].        (4)
Feedback particle filter
Derivation of the FPF for the linear case

The density p is a function of the stochastic process μ_t.  Using Itô's formula,

    dp(x,t) = (∂p/∂μ) dμ_t + (∂p/∂Σ) dΣ_t + (1/2) (∂²p/∂μ²) dμ_t²,

where

    ∂p/∂μ = ( (x − μ_t)/Σ_t ) p,
    ∂p/∂Σ = (1/(2Σ_t)) ( (x − μ_t)²/Σ_t − 1 ) p,
    ∂²p/∂μ² = (1/Σ_t) ( (x − μ_t)²/Σ_t − 1 ) p.

Substituting these into the forward equation, we obtain a quadratic equation A x² + B x = 0, where

    A = dΣ_t − ( 2a Σ_t + σ_B² − h² Σ_t²/σ_W² ) dt,
    B = dμ_t − ( a μ_t dt + ( h Σ_t/σ_W² ) ( dZ_t − h μ_t dt ) ).

This leads to the Kalman filter.
Feedback particle filter
Derivation of the E-L equation

In the continuous-time case, the control is of the form:

    U_t^i = u(X_t^i, t) dt + v(X_t^i, t) dZ_t.        (5)

Substituting this in the E-L BVP for the continuous-discrete time case, we arrive at the formal equation:

    ∂/∂x [ p(x,t) / ( 1 + u′ dt + v′ dZ_t ) ]
        = p(x,t) ∂/∂ũ [ ln p( x + u dt + v dZ_t, t ) + ln p_{dZ|X}( dZ_t | x + u dt + v dZ_t ) ],        (6)

where p_{dZ|X}( dZ_t | · ) = ( 1/√(2π σ_W²) ) exp( −( dZ_t − h(·) )²/(2σ_W²) )  and  ũ = u dt + v dZ_t.

For notational ease, we use primes to denote partial derivatives with respect to x: p is used to denote p(x,t), p′ := ∂p/∂x (x,t), p″ := ∂²p/∂x² (x,t), u′ := ∂u/∂x (x,t), v′ := ∂v/∂x (x,t), etc.  Note that the time t is fixed.
Feedback particle filter
Derivation of the E-L equation, cont. 1

Step 1: The three terms in (6) are simplified as:

    ∂/∂x [ p / (1 + u′ dt + v′ dZ_t) ] = p′ − f_1 dt − ( p′ v′ + p v″ ) dZ_t
    p ∂/∂ũ [ ln p(x + u dt + v dZ_t) ] = p′ + f_2 dt + ( p″ v − p′² v / p ) dZ_t
    p ∂/∂ũ [ ln p_{dZ|X}( dZ_t | x + u dt + v dZ_t ) ] = (p/σ_W²) [ h′ dZ_t + ( σ_W² h′ v − h h′ ) dt ]

where we have used Itô's rules dZ_t² = σ_W² dt, dZ_t dt = 0, etc., and

    f_1 = ( p′ u′ + p u″ ) − σ_W² ( p′ v′² + 2 p v′ v″ ),
    f_2 = ( p″ u − p′² u / p ) + σ_W² v² [ (1/2) p‴ − 3 p″ p′/(2p) + p′³/p² ].

Collecting terms in O(dZ_t) and O(dt), after some simplification, leads to the following ODEs:
Feedback particle filter
Derivation of the E-L equation, cont. 2

    E(v) = (1/σ_W²) h′(x)                                                  (7)
    E(u) = −(1/σ_W²) h(x) h′(x) + h′(x) v + σ_W² G(x, t)                   (8)

where E(v) = −∂/∂x [ (1/p(x,t)) ∂/∂x { p(x,t) v(x,t) } ], and

    G = −2 v′ v″ − (v′)² (ln p)′ + (1/2) v² (ln p)‴.
Feedback particle filter
Derivation of the E-L equation, cont. 3

Step 2: Suppose (u, v) are admissible solutions of the E-L BVP (7)-(8).  Then it is claimed that

    −(p v)′ = ( (h − ĥ)/σ_W² ) p                                           (9)
    −(p u)′ = −( (h − ĥ) ĥ / σ_W² ) p − (1/2) σ_W² ( p v² )″.              (10)

Recall that admissible here means

    lim_{x→±∞} p(x,t) u(x,t) = 0,    lim_{x→±∞} p(x,t) v(x,t) = 0.         (11)

To show (9), integrate (7) once to obtain

    −(p v)′ = (1/σ_W²) h p + C p,
Feedback particle filter
Derivation of E-L equation cont.4
where the constant of integration C = −ĥ/σ_W² is obtained by integrating once more from −∞ to ∞ and using the boundary condition for v in (11). This gives (9). To show (10), we denote its right-hand side by R and claim that

R/p = −hh'/σ_W² + h''v + σ_W² G.   (12)

Equation (10) then follows by using the ODE (8) together with the boundary condition for u in (11). The verification of the claim involves a straightforward calculation, in which (7) is used to obtain expressions for h' and v''. The details of this calculation are omitted on account of space.
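Equation (9) also suggests a direct numerical recipe for the gain: integrate its right-hand side once and divide by p. The sketch below is an editorial addition (the function and variable names are placeholders), checked against the Gaussian case where the gain should equal hΣ/σ_W².

```python
import numpy as np

def trapz(f, x):
    """Simple trapezoidal quadrature on a grid."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def fpf_gain(x, p, h, sigma_W):
    """Solve -(p v)' = (h - h_hat) p / sigma_W^2 (Eq. (9)) for v on a grid,
    using the boundary condition p(x) v(x) -> 0 as x -> +/- infinity."""
    p = p / trapz(p, x)                              # normalize the density
    h_hat = trapz(h * p, x)
    f = (h - h_hat) * p / sigma_W**2                 # right-hand side of (9)
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))))
    pv = cum[-1] - cum                               # p(x) v(x) = integral of f from x to +infinity
    return pv / np.maximum(p, 1e-300)

# Sanity check in the Gaussian/linear case, where v should equal h*Sigma/sigma_W^2:
x = np.linspace(-6.0, 6.0, 4001)
Sigma, h_lin, sigma_W = 0.7, 2.0, 0.5
p = np.exp(-x**2 / (2.0 * Sigma)) / np.sqrt(2.0 * np.pi * Sigma)
v = fpf_gain(x, p, h_lin * x, sigma_W)
print(v[len(x) // 2], h_lin * Sigma / sigma_W**2)    # both should be close to 5.6
```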
Feedback particle filter
Derivation of E-L equation cont.5
Step 3. The E-L equation for v is given by (7). After a short calculation, we have

u(x,t) = v(x,t) ( −(1/2)(h(x) + ĥ) + (1/2) σ_W² v'(x,t) ).   (13)
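As a small worked check of (13) (an editorial addition, with placeholder values): in the linear-Gaussian case h(x) = hx, ĥ = hμ, and v = hΣ/σ_W² is constant (so v' = 0), and (13) reduces to u(x) = −(h²Σ/(2σ_W²))(x + μ), the mean-field control appearing in the linear feedback particle filter.

```python
import numpy as np

# Check that (13) reduces to u(x) = -(h^2 Sigma / (2 sigma_W^2)) (x + mu) when
# h(x) = h*x, h_hat = h*mu, and v = h*Sigma/sigma_W^2 is constant in x.
h, mu, Sigma, sigma_W = 2.0, 0.3, 0.7, 0.5     # placeholder values
x = np.linspace(-3.0, 3.0, 7)
v = h * Sigma / sigma_W**2
u_from_13 = v * (-0.5 * (h * x + h * mu) + 0.5 * sigma_W**2 * 0.0)   # v' = 0
u_closed = -(h**2 * Sigma / (2.0 * sigma_W**2)) * (x + mu)
print(np.allclose(u_from_13, u_closed))        # True
```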
Bibliography
[1] D. P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, MA, 1995.
[2] E. Brown, J. Moehlis, and P. Holmes. On the phase reduction and response dynamics of neural oscillator populations. Neural Computation, 16(4):673–715, 2004.
[3] M. Dellnitz, J. E. Marsden, I. Melbourne, and J. Scheurle. Generic bifurcations of pendula. Int. Series Num. Math., 104:111–122, 1992.
[4] A. Doucet, N. de Freitas, and N. Gordon. Sequential Monte-Carlo Methods in Practice. Springer-Verlag, April 2001.
[5] N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings F Radar and Signal Processing, 140(2):107–113, 1993.
[6] J. Guckenheimer. Isochrons and phaseless sets. J. Math. Biol., 1:259–273, 1975.
[7] M. Huang, P. E. Caines, and R. P. Malhame. Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, 52(9):1560–1571, 2007.
[8] M. Huang, P. E. Caines, and R. P. Malhame. Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, 52(9):1560–1571, 2007.
[9] R. E. Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1):35–45, 1960.
[10] Y. Kuramoto. International Symposium on Mathematical Problems in Theoretical Physics, volume 39 of Lecture Notes in Physics. Springer-Verlag, 1975.
[11] H. J. Kushner. On the differential equations satisfied by conditional probability densities of Markov processes. SIAM J. Control, 2:106–119, 1964.
[12] J.-M. Lasry and P.-L. Lions. Mean field games. Japanese Journal of Mathematics, 2(2):229–260, 2007.
[13] J.-M. Lasry and P.-L. Lions. Mean field games. Japanese Journal of Mathematics, 2(2):229–260, 2007.
[14] S. P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Transactions on Automatic Control, 42(12):1663–1680, December 1997.
[15] S. H. Strogatz and R. E. Mirollo. Stability of incoherence in a population of coupled oscillators. Journal of Statistical Physics, 63:613–635, May 1991.
[16] G. Y. Weintraub, L. Benkard, and B. Van Roy. Oblivious equilibrium: A mean field approximation for large-scale dynamic games. In Advances in Neural Information Processing Systems, volume 18. MIT Press, 2006.
[17] H. Yin, P. G. Mehta, S. P. Meyn, and U. V. Shanbhag. Synchronization of coupled oscillators is a game. In Proc. of 2010 American Control Conference, pages 1783–1790, Baltimore, MD, 2010.
[18] H. Yin, P. G. Mehta, S. P. Meyn, and U. V. Shanbhag. Synchronization of coupled oscillators is a game. IEEE Transactions on Automatic Control (to appear).
More Related Content

Similar to Mean-field Methods in Estimation & Control

Neural Power Amplifier
Neural Power AmplifierNeural Power Amplifier
Neural Power Amplifier
Andrew Doyle
 
PaperNo11-GHEZELBASHHABIBISAFARI-IJMMS
PaperNo11-GHEZELBASHHABIBISAFARI-IJMMSPaperNo11-GHEZELBASHHABIBISAFARI-IJMMS
PaperNo11-GHEZELBASHHABIBISAFARI-IJMMS
Mezban Habibi
 
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Umberto Picchini
 
Somatic Niche Construction
Somatic Niche ConstructionSomatic Niche Construction
Somatic Niche Construction
joshbrsn
 
A comparative study of living cell micromechanical properties by oscillatory ...
A comparative study of living cell micromechanical properties by oscillatory ...A comparative study of living cell micromechanical properties by oscillatory ...
A comparative study of living cell micromechanical properties by oscillatory ...
Angela Zaorski
 
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Infrastructure Facility
 
J.Jochim_Poster_CNS_2016
J.Jochim_Poster_CNS_2016J.Jochim_Poster_CNS_2016
J.Jochim_Poster_CNS_2016
Janina Jochim
 
Kate Toland - Poster Neuro Conference Sept 2015 MK
Kate Toland - Poster Neuro Conference Sept 2015 MKKate Toland - Poster Neuro Conference Sept 2015 MK
Kate Toland - Poster Neuro Conference Sept 2015 MK
Kate Toland
 
Top Trending Article - International Journal of Chaos, Control, Modeling and ...
Top Trending Article - International Journal of Chaos, Control, Modeling and ...Top Trending Article - International Journal of Chaos, Control, Modeling and ...
Top Trending Article - International Journal of Chaos, Control, Modeling and ...
ijccmsjournal
 
Survival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive ModelsSurvival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive Models
Christos Argyropoulos
 
The biological nature of free will
The biological nature of free willThe biological nature of free will
The biological nature of free will
Björn Brembs
 
Mc intosh 2003
Mc intosh 2003Mc intosh 2003
Mc intosh 2003
Jacob Sheu
 
New Insights and Applications of Eco-Finance Networks and Collaborative Games
New Insights and Applications of Eco-Finance Networks and Collaborative GamesNew Insights and Applications of Eco-Finance Networks and Collaborative Games
New Insights and Applications of Eco-Finance Networks and Collaborative Games
SSA KPI
 
isi
isiisi
isi
ahfdv
 
Finite Element Analysis - UNIT-5
Finite Element Analysis - UNIT-5Finite Element Analysis - UNIT-5
Finite Element Analysis - UNIT-5
propaul
 
Theoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in IndiaTheoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in India
PFHub PFHub
 
Theoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in IndiaTheoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in India
Daniel Wheeler
 
pso2015.pdf
pso2015.pdfpso2015.pdf
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
 
Neural Networks with Anticipation: Problems and Prospects
Neural Networks with Anticipation: Problems and ProspectsNeural Networks with Anticipation: Problems and Prospects
Neural Networks with Anticipation: Problems and Prospects
SSA KPI
 

Similar to Mean-field Methods in Estimation & Control (20)

Neural Power Amplifier
Neural Power AmplifierNeural Power Amplifier
Neural Power Amplifier
 
PaperNo11-GHEZELBASHHABIBISAFARI-IJMMS
PaperNo11-GHEZELBASHHABIBISAFARI-IJMMSPaperNo11-GHEZELBASHHABIBISAFARI-IJMMS
PaperNo11-GHEZELBASHHABIBISAFARI-IJMMS
 
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
 
Somatic Niche Construction
Somatic Niche ConstructionSomatic Niche Construction
Somatic Niche Construction
 
A comparative study of living cell micromechanical properties by oscillatory ...
A comparative study of living cell micromechanical properties by oscillatory ...A comparative study of living cell micromechanical properties by oscillatory ...
A comparative study of living cell micromechanical properties by oscillatory ...
 
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
 
J.Jochim_Poster_CNS_2016
J.Jochim_Poster_CNS_2016J.Jochim_Poster_CNS_2016
J.Jochim_Poster_CNS_2016
 
Kate Toland - Poster Neuro Conference Sept 2015 MK
Kate Toland - Poster Neuro Conference Sept 2015 MKKate Toland - Poster Neuro Conference Sept 2015 MK
Kate Toland - Poster Neuro Conference Sept 2015 MK
 
Top Trending Article - International Journal of Chaos, Control, Modeling and ...
Top Trending Article - International Journal of Chaos, Control, Modeling and ...Top Trending Article - International Journal of Chaos, Control, Modeling and ...
Top Trending Article - International Journal of Chaos, Control, Modeling and ...
 
Survival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive ModelsSurvival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive Models
 
The biological nature of free will
The biological nature of free willThe biological nature of free will
The biological nature of free will
 
Mc intosh 2003
Mc intosh 2003Mc intosh 2003
Mc intosh 2003
 
New Insights and Applications of Eco-Finance Networks and Collaborative Games
New Insights and Applications of Eco-Finance Networks and Collaborative GamesNew Insights and Applications of Eco-Finance Networks and Collaborative Games
New Insights and Applications of Eco-Finance Networks and Collaborative Games
 
isi
isiisi
isi
 
Finite Element Analysis - UNIT-5
Finite Element Analysis - UNIT-5Finite Element Analysis - UNIT-5
Finite Element Analysis - UNIT-5
 
Theoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in IndiaTheoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in India
 
Theoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in IndiaTheoretical and Applied Phase-Field: Glimpses of the activities in India
Theoretical and Applied Phase-Field: Glimpses of the activities in India
 
pso2015.pdf
pso2015.pdfpso2015.pdf
pso2015.pdf
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Neural Networks with Anticipation: Problems and Prospects
Neural Networks with Anticipation: Problems and ProspectsNeural Networks with Anticipation: Problems and Prospects
Neural Networks with Anticipation: Problems and Prospects
 

Recently uploaded

Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
Shinana2
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Jeffrey Haguewood
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 

Recently uploaded (20)

Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 

Mean-field Methods in Estimation & Control

  • 1. CSLCOORDINATED SCIENCE LABORATORY Mean-field Methods in Estimation & Control Prashant G. Mehta Coordinated Science Laboratory Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign EE Seminar, Harvard Apr. 1, 2011
  • 2. Introduction Kuramoto model Background: Synchronization of coupled oscillators dθi(t) = ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling [10] Y. Kuramoto, 1975; [15] Strogatz et al., J. Stat. Phy., 1991 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 2 / 38
  • 3. Introduction Kuramoto model Background: Synchronization of coupled oscillators dθi(t) = ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling 1- 1+1 [10] Y. Kuramoto, 1975; [15] Strogatz et al., J. Stat. Phy., 1991 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 2 / 38
  • 4. Introduction Kuramoto model Background: Synchronization of coupled oscillators dθi(t) = ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling [10] Y. Kuramoto, 1975; [15] Strogatz et al., J. Stat. Phy., 1991 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 2 / 38
  • 5. Introduction Kuramoto model Background: Synchronization of coupled oscillators dθi(t) = ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling 0 0.1 0.2 0.1 0.15 0.2 0.25 0.3 Locking Incoherence κ κ < κc(γ) R γ Synchrony Incoherence [10] Y. Kuramoto, 1975; [15] Strogatz et al., J. Stat. Phy., 1991 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 2 / 38
  • 6. Introduction Kuramoto model Movies of incoherence and synchrony solution −1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 −1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 Incoherence Synchrony P. G. Mehta (Illinois) Harvard Apr. 1, 2011 3 / 38
  • 7. Introduction Oscillator examples Motivation: Oscillators in biology Oscillators in biology Coupled oscillators Images taken from google search. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 4 / 38
  • 8. Introduction Oscillator examples Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET ) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train [6] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 5 / 38
  • 9. Introduction Oscillator examples Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET ) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train [6] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 5 / 38
  • 10. Introduction Oscillator examples Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET ) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train −100 −50 0 50 100 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 Vh r Limit cyle r h v [6] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 5 / 38
  • 11. Introduction Oscillator examples Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET ) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train −100 −50 0 50 100 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 Vh r Limit cyle r h v Normal form reduction −−−−−−−−−−−−−→ ˙θi = ωi +ui ·Φ(θi) [6] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 5 / 38
  • 12. Introduction Role of Neural Rhythms Question (Fundamental question in Neuroscience) Why is synchrony (neural rhythms) useful? Does it have a functional role? 1 Synchronization Phase transition in controlled system (motivated by coupled oscillators) H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of Coupled Oscillators is a Game,” TAC 2 Neuronal computations Bayesian inference Neural circuits as particle filters (Lee & Mumford) T. Yang, P. G. Mehta and S. P. Meyn, “ A Control-oriented Approach for Particle Filtering,” ACC 2011, CDC 2011 3 Learning Synaptic plasticity via long term potentiation (Hebbian learning) “Neurons that fire together wire together” H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “ Learning in Mean-Field Oscillator Games,” CDC 2010 Destexhe & Marder, Nature, 2004; Kopell et al., Neuroscience, 2009; Lee & Mumford, J. Opt. Soc, 2003 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 6 / 38
  • 13. Introduction Role of Neural Rhythms Question (Fundamental question in Neuroscience) Why is synchrony (neural rhythms) useful? Does it have a functional role? 1 Synchronization Phase transition in controlled system (motivated by coupled oscillators) H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of Coupled Oscillators is a Game,” TAC 2 Neuronal computations Bayesian inference Neural circuits as particle filters (Lee & Mumford) T. Yang, P. G. Mehta and S. P. Meyn, “ A Control-oriented Approach for Particle Filtering,” ACC 2011, CDC 2011 3 Learning Synaptic plasticity via long term potentiation (Hebbian learning) “Neurons that fire together wire together” H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “ Learning in Mean-Field Oscillator Games,” CDC 2010 Destexhe & Marder, Nature, 2004; Kopell et al., Neuroscience, 2009; Lee & Mumford, J. Opt. Soc, 2003 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 6 / 38
  • 14. Introduction Role of Neural Rhythms Question (Fundamental question in Neuroscience) Why is synchrony (neural rhythms) useful? Does it have a functional role? 1 Synchronization Phase transition in controlled system (motivated by coupled oscillators) H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of Coupled Oscillators is a Game,” TAC 2 Neuronal computations Bayesian inference Neural circuits as particle filters (Lee & Mumford) T. Yang, P. G. Mehta and S. P. Meyn, “ A Control-oriented Approach for Particle Filtering,” ACC 2011, CDC 2011 3 Learning Synaptic plasticity via long term potentiation (Hebbian learning) “Neurons that fire together wire together” H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “ Learning in Mean-Field Oscillator Games,” CDC 2010 Destexhe & Marder, Nature, 2004; Kopell et al., Neuroscience, 2009; Lee & Mumford, J. Opt. Soc, 2003 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 6 / 38
  • 15. Introduction Role of Neural Rhythms Question (Fundamental question in Neuroscience) Why is synchrony (neural rhythms) useful? Does it have a functional role? 1 Synchronization Phase transition in controlled system (motivated by coupled oscillators) H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of Coupled Oscillators is a Game,” TAC 2 Neuronal computations Bayesian inference Neural circuits as particle filters (Lee & Mumford) T. Yang, P. G. Mehta and S. P. Meyn, “ A Control-oriented Approach for Particle Filtering,” ACC 2011, CDC 2011 3 Learning Synaptic plasticity via long term potentiation (Hebbian learning) “Neurons that fire together wire together” H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “ Learning in Mean-Field Oscillator Games,” CDC 2010 Destexhe & Marder, Nature, 2004; Kopell et al., Neuroscience, 2009; Lee & Mumford, J. Opt. Soc, 2003 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 6 / 38
  • 17. Collaborators Huibing Yin Sean P. Meyn Uday V. Shanbhag “Synchronization of coupled oscillators is a game,” IEEE TAC “Learning in Mean-Field Oscillator Games,” CDC 2010 “On the Efficiency of Equilibria in Mean-field Oscillator Games,” ACC 2011 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 8 / 38
  • 18. Derivation of model Problem statement Oscillators dynamic game Dynamics of ith oscillator dθi = (ωi +ui(t))dt +σ dξi, i = 1,...,N, t ≥ 0 ui(t) — control 1- 1+1 ith oscillator seeks to minimize ηi(ui;u−i) = lim T→∞ 1 T T 0 E[ c(θi;θ−i) cost of anarchy + 1 2 Ru2 i cost of control ]ds θ−i = (θj)j=i R — control penalty c(·) — cost function c(θi;θ−i) = 1 N ∑ j=i c• (θi,θj), c• ≥ 0 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 9 / 38
  • 19. Derivation of model Problem statement Oscillators dynamic game Dynamics of ith oscillator dθi = (ωi +ui(t))dt +σ dξi, i = 1,...,N, t ≥ 0 ui(t) — control 1- 1+1 ith oscillator seeks to minimize ηi(ui;u−i) = lim T→∞ 1 T T 0 E[ c(θi;θ−i) cost of anarchy + 1 2 Ru2 i cost of control ]ds θ−i = (θj)j=i R — control penalty c(·) — cost function c(θi;θ−i) = 1 N ∑ j=i c• (θi,θj), c• ≥ 0 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 9 / 38
  • 20. Derivation of model Problem statement Oscillators dynamic game Dynamics of ith oscillator dθi = (ωi +ui(t))dt +σ dξi, i = 1,...,N, t ≥ 0 ui(t) — control ith oscillator seeks to minimize ηi(ui;u−i) = lim T→∞ 1 T T 0 E[ c(θi;θ−i) cost of anarchy + 1 2 Ru2 i cost of control ]ds θ−i = (θj)j=i R — control penalty c(·) — cost function c(θi;θ−i) = 1 N ∑ j=i c• (θi,θj), c• ≥ 0 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 9 / 38
  • 21. Derivation of model Mean-field model Mean-field model derivation dθi = (ωi +ui(t))dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds Influence Influence Mass 1 Mean-field approximation c(θi;θ−i(t)) = 1 N ∑ j=i c• (θi,θj) N→∞ −−−−−−→ ¯c(θi,t) 2 Optimal control of single oscillator Decentralized control structure [8] M. Huang, P. Caines, and R. Malhame, IEEE TAC, 2007 [HCM]; [13] Lasry & Lions, Japan. J. Math, 2007; P. G. Mehta (Illinois) Harvard Apr. 1, 2011 10 / 38
  • 22. Derivation of model Derivation steps Single oscillator with given cost Dynamics of the oscillator dθi = (ωi +ui(t))dt +σ dξi, t ≥ 0 The cost function is assumed known ηi(ui; ¯c) = lim T→∞ 1 T T 0 E c(θi;θ−i) + 1 2 Ru2 i (s) ds ⇑ ¯c(θi(s),s) HJB equation: ∂thi +ωi∂θ hi = 1 2R (∂θ hi)2 − ¯c(θ,t)+η∗ i − σ2 2 ∂2 θθ hi Optimal control law: u∗ i (t) = ϕi(θ,t) = − 1 R ∂θ hi(θ,t) [1] D. P. Bertsekas (1995); [14] S. P. Meyn, IEEE TAC, 1997 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 11 / 38
  • 23. Derivation of model Derivation steps Single oscillator with given cost Dynamics of the oscillator dθi = (ωi +ui(t))dt +σ dξi, t ≥ 0 The cost function is assumed known ηi(ui; ¯c) = lim T→∞ 1 T T 0 E c(θi;θ−i) + 1 2 Ru2 i (s) ds ⇑ ¯c(θi(s),s) HJB equation: ∂thi +ωi∂θ hi = 1 2R (∂θ hi)2 − ¯c(θ,t)+η∗ i − σ2 2 ∂2 θθ hi Optimal control law: u∗ i (t) = ϕi(θ,t) = − 1 R ∂θ hi(θ,t) [1] D. P. Bertsekas (1995); [14] S. P. Meyn, IEEE TAC, 1997 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 11 / 38
  • 24. Derivation of model Derivation steps Single oscillator with given cost Dynamics of the oscillator dθi = (ωi +ui(t))dt +σ dξi, t ≥ 0 The cost function is assumed known ηi(ui; ¯c) = lim T→∞ 1 T T 0 E c(θi;θ−i) + 1 2 Ru2 i (s) ds ⇑ ¯c(θi(s),s) HJB equation: ∂thi +ωi∂θ hi = 1 2R (∂θ hi)2 − ¯c(θ,t)+η∗ i − σ2 2 ∂2 θθ hi Optimal control law: u∗ i (t) = ϕi(θ,t) = − 1 R ∂θ hi(θ,t) [1] D. P. Bertsekas (1995); [14] S. P. Meyn, IEEE TAC, 1997 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 11 / 38
  • 25. Derivation of model Derivation steps Single oscillator with given cost Dynamics of the oscillator dθi = (ωi +ui(t))dt +σ dξi, t ≥ 0 The cost function is assumed known ηi(ui; ¯c) = lim T→∞ 1 T T 0 E c(θi;θ−i) + 1 2 Ru2 i (s) ds ⇑ ¯c(θi(s),s) HJB equation: ∂thi +ωi∂θ hi = 1 2R (∂θ hi)2 − ¯c(θ,t)+η∗ i − σ2 2 ∂2 θθ hi Optimal control law: u∗ i (t) = ϕi(θ,t) = − 1 R ∂θ hi(θ,t) [1] D. P. Bertsekas (1995); [14] S. P. Meyn, IEEE TAC, 1997 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 11 / 38
  • 26. Derivation of model Derivation steps Single oscillator with given cost Dynamics of the oscillator dθi = (ωi +ui(t))dt +σ dξi, t ≥ 0 The cost function is assumed known ηi(ui; ¯c) = lim T→∞ 1 T T 0 E c(θi;θ−i) + 1 2 Ru2 i (s) ds ⇑ ¯c(θi(s),s) HJB equation: ∂thi +ωi∂θ hi = 1 2R (∂θ hi)2 − ¯c(θ,t)+η∗ i − σ2 2 ∂2 θθ hi Optimal control law: u∗ i (t) = ϕi(θ,t) = − 1 R ∂θ hi(θ,t) [1] D. P. Bertsekas (1995); [14] S. P. Meyn, IEEE TAC, 1997 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 11 / 38
  • 27. Derivation of model Derivation steps Single oscillator with optimal control Dynamics of the oscillator dθi(t) = ωi − 1 R ∂θ hi(θi,t) dt +σ dξi(t) Kolmogorov forward equation for pdf p(θ,t,ωi) FPK: ∂t p+ωi∂θ p = 1 R ∂θ [p(∂θ h)]+ σ2 2 ∂2 θθ p P. G. Mehta (Illinois) Harvard Apr. 1, 2011 12 / 38
  • 28. Derivation of model Derivation steps Mean-field Approximation HJB equation for population ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t)+η(ω)− σ2 2 ∂2 θθ h h(θ,t,ω) Population density ∂t p+ω∂θ p = 1 R ∂θ [p(∂θ h)]+ σ2 2 ∂2 θθ p p(θ,t,ω) Enforce cost consistency ¯c(θ,t) = Ω 2π 0 c• (θ,ϑ)p(ϑ,t,ω)g(ω)dϑ dω ≈ 1 N ∑ j=i c• (θ,ϑ) P. G. Mehta (Illinois) Harvard Apr. 1, 2011 13 / 38
  • 29. Derivation of model Derivation steps Mean-field model dθi = (ωi +ui(t))dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds c(θi;θ−i) = 1 N ∑ j=i c• (θi,θj(t)) N→∞ −−−→ ¯c(θi,t) Letting N → ∞ and assume c(θi,θ−i) → ¯c(θi,t) HJB: ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t) +η∗ − σ2 2 ∂2 θθ h ⇒ h(θ,t,ω) FPK: ∂t p+ω∂θ p = 1 R ∂θ [p( ∂θ h )]+ σ2 2 ∂2 θθ p ⇒ p(θ,t,ω) ¯c(ϑ,t) = Ω 2π 0 c• (ϑ,θ) p(θ,t,ω) g(ω)dθ dω [8] Huang et al., IEEE TAC, 2007; [13] Lasry & Lions, Japan. J. Math, 2007; [16] Weintraub et al., NIPS, 2006; P. G. Mehta (Illinois) Harvard Apr. 1, 2011 14 / 38
  • 30. Derivation of model Derivation steps Mean-field model dθi = (ωi +ui(t))dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds Influence Influence Mass Letting N → ∞ and assume c(θi,θ−i) → ¯c(θi,t) HJB: ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t) +η∗ − σ2 2 ∂2 θθ h ⇒ h(θ,t,ω) FPK: ∂t p+ω∂θ p = 1 R ∂θ [p( ∂θ h )]+ σ2 2 ∂2 θθ p ⇒ p(θ,t,ω) ¯c(ϑ,t) = Ω 2π 0 c• (ϑ,θ) p(θ,t,ω) g(ω)dθ dω [8] Huang et al., IEEE TAC, 2007; [13] Lasry & Lions, Japan. J. Math, 2007; [16] Weintraub et al., NIPS, 2006; P. G. Mehta (Illinois) Harvard Apr. 1, 2011 14 / 38
  • 31. Derivation of model Derivation steps Mean-field model dθi = (ωi +ui(t))dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds Influence Influence Mass Letting N → ∞ and assume c(θi,θ−i) → ¯c(θi,t) HJB: ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t) +η∗ − σ2 2 ∂2 θθ h ⇒ h(θ,t,ω) FPK: ∂t p+ω∂θ p = 1 R ∂θ [p( ∂θ h )]+ σ2 2 ∂2 θθ p ⇒ p(θ,t,ω) ¯c(ϑ,t) = Ω 2π 0 c• (ϑ,θ) p(θ,t,ω) g(ω)dθ dω [8] Huang et al., IEEE TAC, 2007; [13] Lasry & Lions, Japan. J. Math, 2007; [16] Weintraub et al., NIPS, 2006; P. G. Mehta (Illinois) Harvard Apr. 1, 2011 14 / 38
  • 32. Derivation of model Derivation steps Mean-field model dθi = (ωi +ui(t))dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds Influence Influence Mass Letting N → ∞ and assume c(θi,θ−i) → ¯c(θi,t) HJB: ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t) +η∗ − σ2 2 ∂2 θθ h ⇒ h(θ,t,ω) FPK: ∂t p+ω∂θ p = 1 R ∂θ [p( ∂θ h )]+ σ2 2 ∂2 θθ p ⇒ p(θ,t,ω) ¯c(ϑ,t) = Ω 2π 0 c• (ϑ,θ) p(θ,t,ω) g(ω)dθ dω [8] Huang et al., IEEE TAC, 2007; [13] Lasry & Lions, Japan. J. Math, 2007; [16] Weintraub et al., NIPS, 2006; P. G. Mehta (Illinois) Harvard Apr. 1, 2011 14 / 38
  • 33. Derivation of model Derivation steps Mean-field model dθi = (ωi +ui(t))dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds Influence Influence Mass Letting N → ∞ and assume c(θi,θ−i) → ¯c(θi,t) HJB: ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t) +η∗ − σ2 2 ∂2 θθ h ⇒ h(θ,t,ω) FPK: ∂t p+ω∂θ p = 1 R ∂θ [p( ∂θ h )]+ σ2 2 ∂2 θθ p ⇒ p(θ,t,ω) ¯c(ϑ,t) = Ω 2π 0 c• (ϑ,θ) p(θ,t,ω) g(ω)dθ dω [8] Huang et al., IEEE TAC, 2007; [13] Lasry & Lions, Japan. J. Math, 2007; [16] Weintraub et al., NIPS, 2006; P. G. Mehta (Illinois) Harvard Apr. 1, 2011 14 / 38
  • 34. Solutions of PDE model ε-Nash equilibrium Solution of PDEs gives ε-Nash equilibrium Optimal control law uo i = − 1 R ∂θ h(θ(t),t,ω) ω=ωi Theorem For the control law uo i = − 1 R ∂θ h(θ(t),t,ω) ω=ωi , we have the ε-Nash equilibrium property ηi(uo i ;uo −i) ≤ ηi(ui;uo −i)+O( 1 √ N ), i = 1,...,N, for any adapted control ui. So, we look for solutions of PDEs. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 15 / 38
  • 35. Solutions of PDE model ε-Nash equilibrium Solution of PDEs gives ε-Nash equilibrium Optimal control law uo i = − 1 R ∂θ h(θ(t),t,ω) ω=ωi Theorem For the control law uo i = − 1 R ∂θ h(θ(t),t,ω) ω=ωi , we have the ε-Nash equilibrium property ηi(uo i ;uo −i) ≤ ηi(ui;uo −i)+O( 1 √ N ), i = 1,...,N, for any adapted control ui. So, we look for solutions of PDEs. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 15 / 38
  • 36. Solutions of PDE model ε-Nash equilibrium Solution of PDEs gives ε-Nash equilibrium Optimal control law uo i = − 1 R ∂θ h(θ(t),t,ω) ω=ωi Theorem For the control law uo i = − 1 R ∂θ h(θ(t),t,ω) ω=ωi , we have the ε-Nash equilibrium property ηi(uo i ;uo −i) ≤ ηi(ui;uo −i)+O( 1 √ N ), i = 1,...,N, for any adapted control ui. So, we look for solutions of PDEs. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 15 / 38
  • 37. Solutions of PDE model Synchronization Main result: Synchronization as a solution of the game Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γγ Incoherence R > Rc(γ) Synchrony Incoherence dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 1- 1+1 Yin et al., ACC 2010 Strogatz et al., J. Stat. Phy., 1991 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 16 / 38
  • 38. Solutions of PDE model Synchronization Main result: Synchronization as a solution of the game Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γγ Incoherence R > Rc(γ) Synchrony Incoherence dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 0 0.1 0.2 0.1 0.15 0.2 0.25 0.3 Locking Incoherence κ κ < κc(γ) R γ Synchrony Incoherence dθi = ωi + κ N N ∑ j=1 sin(θj −θi) dt +σ dξi Yin et al., ACC 2010 Strogatz et al., J. Stat. Phy., 1991 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 16 / 38
  • 39. Solutions of PDE model Synchronization Incoherence solution (PDE) Incoherence solution h(θ,t,ω) = h0(θ) := 0 p(θ,t,ω) = p0(θ) := 1 2π h(θ,t,ω) = 0 ⇒ ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t)+η∗ − σ2 2 ∂2 θθ h ∂t p+ω∂θ p = 1 R ∂θ [p(∂θ h)]+ σ2 2 ∂2 θθ p ¯c(θ,t) = Ω 2π 0 c• (θ,ϑ)p(ϑ,t,ω)g(ω)dϑ dω P. G. Mehta (Illinois) Harvard Apr. 1, 2011 17 / 38
  • 40. Solutions of PDE model Synchronization Incoherence solution (PDE) Incoherence solution h(θ,t,ω) = h0(θ) := 0 p(θ,t,ω) = p0(θ) := 1 2π h(θ,t,ω) = 0 ⇒ ∂th+ω∂θ h = 1 2R (∂θ h)2 − ¯c(θ,t)+η∗ − σ2 2 ∂2 θθ h p(θ,t,ω) = 1 2π ⇒ ∂t p+ω∂θ p = 1 R ∂θ [p(∂θ h)]+ σ2 2 ∂2 θθ p ¯c(θ,t) = Ω 2π 0 c• (θ,ϑ)p(ϑ,t,ω)g(ω)dϑ dω P. G. Mehta (Illinois) Harvard Apr. 1, 2011 17 / 38
  • 41. Solutions of PDE model Synchronization Incoherence solution (Finite population) Closed-loop dynamics dθi = (ωi + ui =0 )dt +σ dξi(t) Average cost ηi = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i =0 ]dt = lim T→∞ 1 N ∑ j=i 1 T T 0 E[1 2 sin2 θi(t)−θj(t) 2 ]dt = 1 N ∑ j=i 2π 0 E[1 2 sin2 θi(t)−ϑ 2 ] 1 2π dϑ = N −1 N η0 −1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 ε-Nash property ηi(uo i ;uo −i) ≤ ηi(ui;uo −i)+O( 1 √ N ), i = 1,...,N. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 18 / 38
  • 42. Solutions of PDE model Synchronization Incoherence solution (Finite population) Closed-loop dynamics dθi = (ωi + ui =0 )dt +σ dξi(t) Average cost ηi = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i =0 ]dt = lim T→∞ 1 N ∑ j=i 1 T T 0 E[1 2 sin2 θi(t)−θj(t) 2 ]dt = 1 N ∑ j=i 2π 0 E[1 2 sin2 θi(t)−ϑ 2 ] 1 2π dϑ = N −1 N η0 −1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 ε-Nash property ηi(uo i ;uo −i) ≤ ηi(ui;uo −i)+O( 1 √ N ), i = 1,...,N. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 18 / 38
  • 43. Analysis of phase transition Numerics Bifurcation diagram Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γγ Incoherence R > Rc(γ) Synchrony Incoherence R−1/2 η(ω) 0. 1 0.15 0. 2 0.25 0. 3 0.35 0. 1 0.15 0. 2 0.25 ω = 0.95 ω = 1 ω = 1.05 R > Rc η(ω) = η0 R < Rc η(ω) < η0 dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 1 Convergence theory (as N → ∞) 2 Bifurcation analysis 3 Analytical and numerical computations Yin et al., IEEE TAC, To appear; [17] Yin et al., ACC2010 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 19 / 38
  • 44. Analysis of phase transition Numerics Bifurcation diagram Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γγ Incoherence R > Rc(γ) Synchrony Incoherence R−1/2 η(ω) 0. 1 0.15 0. 2 0.25 0. 3 0.35 0. 1 0.15 0. 2 0.25 ω = 0.95 ω = 1 ω = 1.05 R > Rc η(ω) = η0 R < Rc η(ω) < η0 incoherence soln. dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 1 Convergence theory (as N → ∞) 2 Bifurcation analysis 3 Analytical and numerical computations Yin et al., IEEE TAC, To appear; [17] Yin et al., ACC2010 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 19 / 38
  • 45. Analysis of phase transition Numerics Bifurcation diagram Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γγ Incoherence R > Rc(γ) Synchrony Incoherence R−1/2 η(ω) 0. 1 0.15 0. 2 0.25 0. 3 0.35 0. 1 0.15 0. 2 0.25 ω = 0.95 ω = 1 ω = 1.05 R > Rc η(ω) = η0 R < Rc η(ω) < η0 synchrony soln. dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 1 Convergence theory (as N → ∞) 2 Bifurcation analysis 3 Analytical and numerical computations Yin et al., IEEE TAC, To appear; [17] Yin et al., ACC2010 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 19 / 38
  • 46. Analysis of phase transition Numerics Bifurcation diagram Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γγ Incoherence R > Rc(γ) Synchrony Incoherence R−1/2 η(ω) 0. 1 0.15 0. 2 0.25 0. 3 0.35 0. 1 0.15 0. 2 0.25 ω = 0.95 ω = 1 ω = 1.05 R > Rc η(ω) = η0 R < Rc η(ω) < η0 dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 1 Convergence theory (as N → ∞) 2 Bifurcation analysis 3 Analytical and numerical computations Yin et al., IEEE TAC, To appear; [17] Yin et al., ACC2010 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 19 / 38
  • 48. Collaborators Tao Yang Sean P. Meyn “A Control-oriented Approach for Particle Filtering,” ACC 2011 “Feedback Particle Filter with Mean-field Coupling,” submitted to CDC 2011 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 21 / 38
  • 49. Problem Definition Filtering problem Signal, observation processes: dθ(t) = ω dt +σB dB(t) mod 2π (1) dZ(t) = h(θ(t))dt +σW dW(t) (2) - Particle filter: dθi(t) = ω dt +σB dBi(t)+Ui(t) mod 2π, i = 1,...,N. (3) Objective: To choose control Ui(t) for the purpose of filtering. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 22 / 38
  • 50. Problem Definition Filtering problem Signal, observation processes: dθ(t) = ω dt +σB dB(t) mod 2π (1) dZ(t) = h(θ(t))dt +σW dW(t) (2) - Particle filter: dθi(t) = ω dt +σB dBi(t)+Ui(t) mod 2π, i = 1,...,N. (3) Objective: To choose control Ui(t) for the purpose of filtering. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 22 / 38
  • 51. Problem Definition Filtering problem Signal, observation processes: dθ(t) = ω dt +σB dB(t) mod 2π (1) dZ(t) = h(θ(t))dt +σW dW(t) (2) - Particle filter: dθi(t) = ω dt +σB dBi(t)+Ui(t) mod 2π, i = 1,...,N. (3) Objective: To choose control Ui(t) for the purpose of filtering. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 22 / 38
  • 52. Problem Definition Filtering problem Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t) (1) dZ(t) = h(X(t))dt +σW dW(t) (2) X(t) ∈ R – state, Z(t) ∈ R – observation a(·),h(·) – C1 functions B(t),W(t) –ind. standard Wiener processes, σB ≥ 0,σW > 0 X(0) ∼ p0(·) Define Z t := σ{Z(s),0 ≤ s ≤ t} Objective: estimate the posterior distribution p∗ of X(t) given Z t . Solution approaches: Linear dynamics: Kalman filter (R. E. Kalman, 1960) Nonlinear dynamics: Kushner’s equation (H. J. Kushner, 1964) Numerical Methods: Particle filtering (N. J. Gordon et. al, 1993) P. G. Mehta (Illinois) Harvard Apr. 1, 2011 23 / 38
  • 53. Problem Definition Filtering problem Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t) (1) dZ(t) = h(X(t))dt +σW dW(t) (2) X(t) ∈ R – state, Z(t) ∈ R – observation a(·),h(·) – C1 functions B(t),W(t) –ind. standard Wiener processes, σB ≥ 0,σW > 0 X(0) ∼ p0(·) Define Z t := σ{Z(s),0 ≤ s ≤ t} Objective: estimate the posterior distribution p∗ of X(t) given Z t . Solution approaches: Linear dynamics: Kalman filter (R. E. Kalman, 1960) Nonlinear dynamics: Kushner’s equation (H. J. Kushner, 1964) Numerical Methods: Particle filtering (N. J. Gordon et. al, 1993) P. G. Mehta (Illinois) Harvard Apr. 1, 2011 23 / 38
  • 54. Problem Definition Filtering problem Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t) (1) dZ(t) = h(X(t))dt +σW dW(t) (2) X(t) ∈ R – state, Z(t) ∈ R – observation a(·),h(·) – C1 functions B(t),W(t) –ind. standard Wiener processes, σB ≥ 0,σW > 0 X(0) ∼ p0(·) Define Z t := σ{Z(s),0 ≤ s ≤ t} Objective: estimate the posterior distribution p∗ of X(t) given Z t . Solution approaches: Linear dynamics: Kalman filter (R. E. Kalman, 1960) Nonlinear dynamics: Kushner’s equation (H. J. Kushner, 1964) Numerical Methods: Particle filtering (N. J. Gordon et. al, 1993) P. G. Mehta (Illinois) Harvard Apr. 1, 2011 23 / 38
  • 55. Existing Approach Kalman Filter Linear case: Kalman Filter Linear Gaussian system: dX(t) = aX(t)dt +σB dB(t) (1) dZ(t) = hX(t)dt +σW dW(t) (2) Kalman filter: p∗ = N(µ(t),Σ(t)) dµ(t) = aµ(t)dt + hΣ(t) σ2 W (dZ(t)−hµ(t)dt) innovation error dΣ(t) dt = 2aΣ(t)+σ2 B − h2Σ2(t) σ2 W [9] R. E. Kalman, Trans. ASME, Ser. D: J. Basic Eng.,1961 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 24 / 38
  • 56. Existing Approach Kalman Filter Linear case: Kalman Filter Linear Gaussian system: dX(t) = aX(t)dt +σB dB(t) (1) dZ(t) = hX(t)dt +σW dW(t) (2) Kalman filter: p∗ = N(µ(t),Σ(t)) dµ(t) = aµ(t)dt + hΣ(t) σ2 W (dZ(t)−hµ(t)dt) innovation error dΣ(t) dt = 2aΣ(t)+σ2 B − h2Σ2(t) σ2 W [9] R. E. Kalman, Trans. ASME, Ser. D: J. Basic Eng.,1961 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 24 / 38
  • 57. Existing Approach Nonlinear case: Kushner’s equation Nonlinear case: K-S equation Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t), (1) dZ(t) = h(X(t))dt +σW dW(t) (2) B(t),W(t) are independent standard Wiener processes. Kushner’s equation: the posterior distribution p∗ is a solution of a stochastic PDE: dp∗ = L † (p∗ )dt +(h− ˆh)(σ2 W )−1 (dZ(t)− ˆhdt)p∗ where ˆh = E[h(X(t))|Z t ] = h(x)p∗ (x,t)dx L † (p∗ ) = − ∂(p∗ ·a(x)) ∂x + 1 2 σ2 B ∂2 p∗ ∂x2 No closed-form solution in general. Closure problem. [11] H. J. Kushner, SIAM J. Control, 1964 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 25 / 38
  • 58. Existing Approach Nonlinear case: Kushner’s equation Nonlinear case: K-S equation Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t), (1) dZ(t) = h(X(t))dt +σW dW(t) (2) B(t),W(t) are independent standard Wiener processes. Kushner’s equation: the posterior distribution p∗ is a solution of a stochastic PDE: dp∗ = L † (p∗ )dt +(h− ˆh)(σ2 W )−1 (dZ(t)− ˆhdt)p∗ where ˆh = E[h(X(t))|Z t ] = h(x)p∗ (x,t)dx L † (p∗ ) = − ∂(p∗ ·a(x)) ∂x + 1 2 σ2 B ∂2 p∗ ∂x2 No closed-form solution in general. Closure problem. [11] H. J. Kushner, SIAM J. Control, 1964 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 25 / 38
  • 59. Existing Approach Nonlinear case: Kushner’s equation Nonlinear case: K-S equation Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t), (1) dZ(t) = h(X(t))dt +σW dW(t) (2) B(t),W(t) are independent standard Wiener processes. Kushner’s equation: the posterior distribution p∗ is a solution of a stochastic PDE: dp∗ = L † (p∗ )dt +(h− ˆh)(σ2 W )−1 (dZ(t)− ˆhdt)p∗ where ˆh = E[h(X(t))|Z t ] = h(x)p∗ (x,t)dx L † (p∗ ) = − ∂(p∗ ·a(x)) ∂x + 1 2 σ2 B ∂2 p∗ ∂x2 No closed-form solution in general. Closure problem. [11] H. J. Kushner, SIAM J. Control, 1964 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 25 / 38
  • 60. Existing Approach Nonlinear case: Kushner’s equation Numerical methods: Particle filter Idea: Approximate posterior p∗ in terms of particles {Xi(t)}N i=1 so p∗ ≈ 1 N N ∑ i=1 δXi(t)(·) Algorithm outline Initialization Xi(0) ∼ p∗ 0(·). At each time step: Importance sampling (for weight update) Resampling (for variance reduction) [5], N. J. Gordon, D. J. Salmond, and A. F. M. Smith, IEEE Proceedings on Radar and Signal Processing, 1993 [4], A. Doucet, N. Freitas, N. Gordon, Springer-Verlag, 2001 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 26 / 38
  • 61. Existing Approach Nonlinear case: Kushner’s equation Numerical methods: Particle filter Idea: Approximate posterior p∗ in terms of particles {Xi(t)}N i=1 so p∗ ≈ 1 N N ∑ i=1 δXi(t)(·) Algorithm outline Initialization Xi(0) ∼ p∗ 0(·). At each time step: Importance sampling (for weight update) Resampling (for variance reduction) [5], N. J. Gordon, D. J. Salmond, and A. F. M. Smith, IEEE Proceedings on Radar and Signal Processing, 1993 [4], A. Doucet, N. Freitas, N. Gordon, Springer-Verlag, 2001 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 26 / 38
  • 62. A Control-Oriented Approach to Particle Filtering Overview Mean-field filtering Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t) (1) dZ(t) = h(X(t))dt +σW dW(t) (2) Controlled system (N particles): dXi(t) = a(Xi(t))dt +σB dBi(t)+ Ui(t) mean field control , i = 1,...,N (3) {Bi(t)}N i=1 are ind. standard Wiener Processes. Objective: Choose control Ui(t) such that the empirical distribution of the population approximates the posterior distribution p∗ as N → ∞. [7] M. Huang, P. E. Caines, and R. P. Malhame, IEEE TAC, 2007 [18] H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, IEEE TAC, To be appeared [12] J. Lasry and P. Lions, Japanese Journal of Mathematics, 2007 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 27 / 38
  • 63. A Control-Oriented Approach to Particle Filtering Overview Mean-field filtering Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t) (1) dZ(t) = h(X(t))dt +σW dW(t) (2) Controlled system (N particles): dXi(t) = a(Xi(t))dt +σB dBi(t)+ Ui(t) mean field control , i = 1,...,N (3) {Bi(t)}N i=1 are ind. standard Wiener Processes. Objective: Choose control Ui(t) such that the empirical distribution of the population approximates the posterior distribution p∗ as N → ∞. [7] M. Huang, P. E. Caines, and R. P. Malhame, IEEE TAC, 2007 [18] H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, IEEE TAC, To be appeared [12] J. Lasry and P. Lions, Japanese Journal of Mathematics, 2007 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 27 / 38
  • 64. A Control-Oriented Approach to Particle Filtering Overview Mean-field filtering Signal, observation processes: dX(t) = a(X(t))dt +σB dB(t) (1) dZ(t) = h(X(t))dt +σW dW(t) (2) Controlled system (N particles): dXi(t) = a(Xi(t))dt +σB dBi(t)+ Ui(t) mean field control , i = 1,...,N (3) {Bi(t)}N i=1 are ind. standard Wiener Processes. Objective: Choose control Ui(t) such that the empirical distribution of the population approximates the posterior distribution p∗ as N → ∞. [7] M. Huang, P. E. Caines, and R. P. Malhame, IEEE TAC, 2007 [18] H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, IEEE TAC, To be appeared [12] J. Lasry and P. Lions, Japanese Journal of Mathematics, 2007 P. G. Mehta (Illinois) Harvard Apr. 1, 2011 27 / 38
  • 65. A Control-Oriented Approach to Particle Filtering Example: Linear case Example: Linear case Controlled system: for i = 1,...N: dXi(t) = aXi(t)dt +σB dBi(t)+ hΣ(t) σ2 W [dZ(t)−h Xi(t)+ µ(t) 2 dt] Ui(t) (3) where µ(t) ≈ ¯µ(N) (t) = 1 N N ∑ i=1 Xi(t) ←− sample mean Σ(t) ≈ ¯Σ(N) (t) = 1 N −1 N ∑ i=1 (Xi(t)− µ(t))2 ←− sample variance P. G. Mehta (Illinois) Harvard Apr. 1, 2011 28 / 38
  • 66. A Control-Oriented Approach to Particle Filtering Example: Linear case Example: Linear case Feedback particle filter: dXi(t) = aXi(t)dt +σB dBi(t)+ h¯Σ(N)(t) σ2 W [dZ(t)−h Xi(t)+ ¯µ(N)(t) 2 dt] (3) Mean-field model: Let p denote cond. dist. of Xi(t) given Z t with p(x,0) = N(µ(0),Σ(0)). Then p = N(µ(t),Σ(t)) where dµ(t) = aµ(t)dt + hΣ(t) σ2 W (dZ(t)−hµ(t)) dΣ(t) dt = 2aΣ(t)+σ2 B − h2Σ2(t) σ2 W Same as Kalman filter! As N → ∞, the empirical distribution approximates the posterior p∗ P. G. Mehta (Illinois) Harvard Apr. 1, 2011 29 / 38
  • 67. A Control-Oriented Approach to Particle Filtering Example: Linear case Comparison plots mse = 1 T T 0 Σ (N) t −Σt Σt 2 dt 10 1 10 2 10 3 10 −3 10 −2 10 −1 10 0 N Error FPF −0.5 0 0.5 BPF −0.5 0 Bootstrap (BPF) Feedback (FPF) a ∈ [−1,1],h = 3,σB = 1,σW = 0.5, dt = 0.01,T = 50 and N ∈ [20,50,100,200,500,1000]. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 30 / 38
  • 68. Methodology Methodology: Variational problem Continuous-discrete time problem. ....... Signal, observ processes: dX(t) = a(X(t))dt +σB dB(t) Z(tn) = h(X(tn))+W(tn) Particle filter Filter: dXi(t) = a(Xi(t)dt +σB dBi(t) Control: Xi(tn) = Xi(t− n )+u(Xi(t− n )) control Belief map. p∗ n(x): cond. pdf of X(t)|Z t Bayes rule pn(x): cond. pdf of Xi(t)|Z t Variational problem: K-L Metric: KL(pn p∗ n) = Epn [ln pn p∗ n ] Control design: u∗ (x) = arg minuKL(pn p∗ n). P. G. Mehta (Illinois) Harvard Apr. 1, 2011 31 / 38
  • 69. Methodology Methodology: Variational problem Continuous-discrete time problem. ....... Signal, observ processes: dX(t) = a(X(t))dt +σB dB(t) Z(tn) = h(X(tn))+W(tn) Particle filter Filter: dXi(t) = a(Xi(t)dt +σB dBi(t) Control: Xi(tn) = Xi(t− n )+u(Xi(t− n )) control Belief map. p∗ n(x): cond. pdf of X(t)|Z t Bayes rule pn(x): cond. pdf of Xi(t)|Z t Variational problem: K-L Metric: KL(pn p∗ n) = Epn [ln pn p∗ n ] Control design: u∗ (x) = arg minuKL(pn p∗ n). P. G. Mehta (Illinois) Harvard Apr. 1, 2011 31 / 38
  • 70. Methodology Methodology: Variational problem Continuous-discrete time problem. ....... Signal, observ processes: dX(t) = a(X(t))dt +σB dB(t) Z(tn) = h(X(tn))+W(tn) Particle filter Filter: dXi(t) = a(Xi(t)dt +σB dBi(t) Control: Xi(tn) = Xi(t− n )+u(Xi(t− n )) control Belief map. p∗ n(x): cond. pdf of X(t)|Z t Bayes rule pn(x): cond. pdf of Xi(t)|Z t Variational problem: K-L Metric: KL(pn p∗ n) = Epn [ln pn p∗ n ] Control design: u∗ (x) = arg minuKL(pn p∗ n). P. G. Mehta (Illinois) Harvard Apr. 1, 2011 31 / 38
  • 71. Feedback particle filter Feedback particle filter(FPF) Problem: dX(t) = a(X(t))dt +σB dB(t), (1) dZ(t) = h(X(t))dt +σW dW(t) (2) FPF: dXi(t) = a(Xi(t))dt +σB dBi(t) +v(Xi(t))dIi(t)+ 1 2 σ2 W v(Xi(t))v (Xi(t))dt (3) Innovation error dIi(t)=:dZ(t)− 1 2 (h(Xi(t))+ ˆh)dt Population prediction ˆh = 1 N N ∑ i=1 h(Xi(t)). Euler-Lagrange equation: − ∂ ∂x 1 p(x,t) ∂ ∂x {p(x,t)v(x,t)} = 1 σ2 W h (x), (4) with boundary condition lim x→±∞ v(x,t)p(x,t) = 0. P. G. Mehta (Illinois) Harvard Apr. 1, 2011 32 / 38
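Integrating the E-L equation (4) once (as in the appendix derivation, equation (9)) gives −(p v)′ = ((h − ĥ)/σ_W²) p, so the gain can be computed from p and h by a single quadrature. A minimal numerical sketch of that computation, with an assumed grid and a sanity check against the linear-Gaussian case, where v should reduce to the constant Kalman gain hΣ/σ_W²:

```python
import numpy as np

def fpf_gain(x, p, h, sigW):
    """Gain v(x) from the integrated E-L equation -(p v)' = (h - h_hat) p / sigW^2."""
    dx = x[1] - x[0]
    p = p / np.trapz(p, x)                     # normalize the density
    h_hat = np.trapz(h * p, x)                 # population prediction of h
    integrand = (h - h_hat) * p / sigW**2
    pv = -np.cumsum(integrand) * dx            # p(x) v(x) = -(1/sigW^2) ∫_{-inf}^x (h - h_hat) p ds
    return pv / np.maximum(p, 1e-12)

# Check on the linear-Gaussian case: v should be ~ h * Sigma / sigW^2 (a constant)
x = np.linspace(-8, 8, 4001)
mu, Sig, hcoef, sigW = 0.3, 0.7, 3.0, 0.5
p = np.exp(-(x - mu)**2 / (2 * Sig)) / np.sqrt(2 * np.pi * Sig)
v = fpf_gain(x, p, hcoef * x, sigW)
print(v[len(x) // 2], hcoef * Sig / sigW**2)   # both approximately 8.4
```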
  • 74. Consistency result. p* denotes the conditional pdf of X(t) given Z_t: dp* = L†(p*) dt + (h − ĥ)(σ_W²)^{−1} (dZ(t) − ĥ dt) p*. p denotes the conditional pdf of X_i(t) given Z_t: dp = L† p dt − (∂/∂x)(v p) dZ(t) − (∂/∂x)(u p) dt + (σ_W²/2) (∂²/∂x²)(p v²) dt. Theorem: Consider the two evolution equations for p and p*, defined according to the solution of the Kolmogorov forward equation and the Kushner-Stratonovich equation, respectively. Suppose the control function v(x,t) is obtained according to (4). Then, provided p(x,0) = p*(x,0), we have p(x,t) = p*(x,t) for all t ≥ 0.
  • 75. Linear Gaussian case. Linear system: dX(t) = a X(t) dt + σ_B dB(t) (1); dZ(t) = h X(t) dt + σ_W dW(t) (2). Gaussian density: p(x,t) = (1/√(2πΣ(t))) exp(−(x − µ(t))²/(2Σ(t))). E-L equation: −(∂/∂x)[(1/p)(∂/∂x){p v}] = h/σ_W² ⇒ v(x,t) = h Σ(t)/σ_W². FPF: dX_i(t) = a X_i(t) dt + σ_B dB_i(t) + (h Σ(t)/σ_W²) [dZ(t) − h (X_i(t) + µ(t))/2 dt] (3).
  • 79. Example: Filtering of a nonlinear oscillator. Nonlinear oscillator: dθ(t) = ω dt + σ_B dB(t) mod 2π (1); dZ(t) = h(θ(t)) dt + σ_W dW(t), with h(θ) = (1 + cos θ)/2 (2). Controlled system (N particles): dθ_i(t) = ω dt + σ_B dB_i(t) + v(θ_i(t)) [dZ(t) − (1/2)(h(θ_i(t)) + ĥ) dt] + (1/2) σ_W² v v′ dt mod 2π, i = 1,...,N (3), where v(θ_i(t)) is obtained from the solution of the E-L equation −(∂/∂θ)[(1/p(θ,t))(∂/∂θ){p(θ,t) v(θ,t)}] = h′(θ)/σ_W² = −sin θ/(2σ_W²) (4).
  • 81. Example: Filtering of a nonlinear oscillator. Fourier form of p(θ,t): p(θ,t) = 1/(2π) + P_s(t) sin θ + P_c(t) cos θ. Approximate solution of the E-L equation (4) (using a perturbation method): v(θ,t) = (1/(2σ_W²)) { −sin θ + (π/2) [P_c(t) sin 2θ − P_s(t) cos 2θ] }, v′(θ,t) = (1/(2σ_W²)) { −cos θ + π [P_c(t) cos 2θ + P_s(t) sin 2θ] }, where P_c(t) ≈ P̄_c^(N)(t) = (1/(πN)) ∑_{j=1}^N cos θ_j(t) and P_s(t) ≈ P̄_s^(N)(t) = (1/(πN)) ∑_{j=1}^N sin θ_j(t). Reminiscent of coupled oscillators (Winfree).
  • 84. Example: Simulation results. Signal and observation processes: dθ(t) = 1 dt + 0.5 dB(t) (1); dZ(t) = ((1 + cos θ(t))/2) dt + 0.4 dW(t) (2). Particle system (N = 100): dθ_i(t) = 1 dt + σ_B dB_i(t) + U(θ_i(t); P̄_c^(N)(t), P̄_s^(N)(t)) mod 2π (3).
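A minimal sketch of this particle system, using the perturbation-method gain from the previous slide; the model constants follow the slide (ω = 1, σ_B = 0.5, σ_W = 0.4, N = 100), while the step size, horizon, and the circular-mean estimator are assumptions:

```python
import numpy as np

omega, sigB, sigW = 1.0, 0.5, 0.4
dt, T, N = 0.01, 20.0, 100
rng = np.random.default_rng(1)

h = lambda th: 0.5 * (1.0 + np.cos(th))          # observation function
theta = rng.uniform(0, 2 * np.pi)                # true oscillator phase
thetai = rng.uniform(0, 2 * np.pi, size=N)       # particles

for _ in range(int(T / dt)):
    theta = (theta + omega * dt + sigB * np.sqrt(dt) * rng.normal()) % (2 * np.pi)
    dZ = h(theta) * dt + sigW * np.sqrt(dt) * rng.normal()

    # Empirical Fourier coefficients of the particle population
    Pc = np.mean(np.cos(thetai)) / np.pi
    Ps = np.mean(np.sin(thetai)) / np.pi
    # Perturbation-method gain and its derivative (previous slide)
    v  = (-np.sin(thetai) + (np.pi / 2) * (Pc * np.sin(2 * thetai) - Ps * np.cos(2 * thetai))) / (2 * sigW**2)
    dv = (-np.cos(thetai) + np.pi * (Pc * np.cos(2 * thetai) + Ps * np.sin(2 * thetai))) / (2 * sigW**2)

    hhat = np.mean(h(thetai))
    dI = dZ - 0.5 * (h(thetai) + hhat) * dt      # innovation error
    dBi = np.sqrt(dt) * rng.normal(size=N)
    thetai = (thetai + omega * dt + sigB * dBi + v * dI + 0.5 * sigW**2 * v * dv * dt) % (2 * np.pi)

# Circular mean of the particles as the state estimate
est = np.angle(np.mean(np.exp(1j * thetai))) % (2 * np.pi)
print("true phase:", theta, " estimate:", est)
```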
  • 85. Example: Simulation results. [Figure: (a) state θ(t) and conditional-mean estimate θ̄_N(t) versus time; (b) PDF estimate on [0, 2π] versus time; (c) first and second harmonics of the estimate, with bounds.]
  • 86. Thank you! Website: http://www.mechse.illinois.edu/research/mehtapg. Collaborators: Huibing Yin, Tao Yang, Sean Meyn, Uday Shanbhag. Papers: "Synchronization of coupled oscillators is a game," IEEE TAC, ACC 2010; "Learning in Mean-Field Oscillator Games," CDC 2010; "On the Efficiency of Equilibria in Mean-field Oscillator Games," ACC 2011; "A Control-oriented Approach for Particle Filtering," ACC 2011, CDC 2011.
  • 87. Part III Learning in mean-field game
  • 88. Learning: Q-function approximation. Approximate DP. Optimality equation: min_{u_i} H_i(θ, u_i; θ_{−i}(t)) = η_i*, where H_i(θ, u_i; θ_{−i}(t)) := c(θ; θ_{−i}(t)) + (1/2) R u_i² + D_{u_i} h_i(θ,t). Optimal control law: u_i* = −(1/R) ∂_θ h_i(θ,t); Kuramoto law: u_i^(Kur) = −(κ/N) ∑_{j≠i} sin(θ_i − θ_j(t)). Parameterization: H_i^(A_i,φ_i)(θ, u_i; θ_{−i}(t)) = c(θ; θ_{−i}(t)) + (1/2) R u_i² + (ω_i − 1 + u_i) A_i S^(φ_i) + (σ²/2) A_i C^(φ_i), where S^(φ)(θ, θ_{−i}) = (1/N) ∑_{j≠i} sin(θ − θ_j − φ) and C^(φ)(θ, θ_{−i}) = (1/N) ∑_{j≠i} cos(θ − θ_j − φ). Approximate optimal control: u_i^(A_i,φ_i) = arg min_{u_i} { H_i^(A_i,φ_i)(θ, u_i; θ_{−i}(t)) } = −(A_i/R) S^(φ_i)(θ, θ_{−i}). Watkins & Dayan, Q-learning, 1992; Bertsekas & Tsitsiklis, NDP, 1996; Mehta & Meyn, CDC 2009
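A minimal sketch of evaluating the approximate optimal control u_i = −(A_i/R) S^(φ_i)(θ, θ_{−i}); the population size and parameter values are illustrative. With φ_i = 0 and A_i = κR it reduces to the Kuramoto law, which is the comparison the slide draws:

```python
import numpy as np

def approx_control(i, theta, A, phi, R):
    """u_i = -(A_i / R) * S^(phi_i)(theta_i, theta_{-i}),
    with S^(phi)(theta, theta_{-i}) = (1/N) * sum_{j != i} sin(theta - theta_j - phi)."""
    N = len(theta)
    others = np.delete(theta, i)
    S = np.sum(np.sin(theta[i] - others - phi[i])) / N
    return -(A[i] / R) * S

# Illustrative use: with phi_i = 0 and A_i = kappa * R this is the Kuramoto coupling term
rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, size=50)
A = np.full(50, 2.0)       # A_i = kappa * R with kappa = 2, R = 1
phi = np.zeros(50)
print(approx_control(0, theta, A, phi, R=1.0))
```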
  • 92. Learning: Stochastic approximation. Learning algorithm. Bellman error (pointwise): L^(A_i,φ_i)(θ,t) = min_{u_i} { H_i^(A_i,φ_i) } − η_i^(A_i*,φ_i*). Stochastic approximation algorithm: ẽ(A_i,φ_i) = ∑_{k=1}^{2} |⟨L^(A_i,φ_i), ϕ̃_k(θ)⟩|², with dA_i/dt = −ε dẽ(A_i,φ_i)/dA_i and dφ_i/dt = −ε dẽ(A_i,φ_i)/dφ_i. Convergence analysis: A_i(t) → A*, φ_i(t) → φ*. [Figure: parameter trajectories for the synchrony and incoherence cases.] Yin et al., CDC 2010
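A heavily simplified sketch of the parameter update: the pointwise Bellman error is projected onto two basis functions and (A_i, φ_i) descend the squared projection with finite-difference gradients. The Bellman-error evaluator, the sin/cos basis, and the toy error used in the check are all placeholder assumptions; the slides define the error through H_i^(A_i,φ_i) and η_i:

```python
import numpy as np

def projected_error(A, phi, bellman_error, theta_grid, bases):
    """Squared projection of the pointwise Bellman error onto the basis functions."""
    L = bellman_error(theta_grid, A, phi)
    dth = theta_grid[1] - theta_grid[0]
    coeffs = [np.sum(L * b(theta_grid)) * dth for b in bases]
    return sum(c**2 for c in coeffs)

def sa_step(A, phi, bellman_error, theta_grid, bases, eps=1e-2, d=1e-4):
    """One gradient-descent step on the projected Bellman error (finite-difference gradients)."""
    e0 = projected_error(A, phi, bellman_error, theta_grid, bases)
    dA = (projected_error(A + d, phi, bellman_error, theta_grid, bases) - e0) / d
    dp = (projected_error(A, phi + d, bellman_error, theta_grid, bases) - e0) / d
    return A - eps * dA, phi - eps * dp

# Toy Bellman error with a known minimum at A = 1, phi = 0 (purely illustrative)
toy_error = lambda th, A, phi: (A - 1.0) * np.sin(th) + phi * np.cos(th)
grid = np.linspace(0, 2 * np.pi, 200, endpoint=False)
bases = [np.sin, np.cos]
A, phi = 2.0, 0.5
for _ in range(500):
    A, phi = sa_step(A, phi, toy_error, grid, bases)
print(A, phi)   # converges to approximately 1.0, 0.0
```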
  • 94. Learning: Bifurcation diagram. Comparison of average cost. Dynamics: dθ_i = (ω_i + u_i) dt + σ dξ_i. Average cost: η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds. [Figure: average cost of the learning solution compared with the mean-field solution.]
  • 98. Analysis of phase transition: Incoherence solution. Overview of the steps. HJB: ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ,t) + η* − (σ²/2) ∂²_θθ h ⇒ h(θ,t,ω). FPK: ∂_t p + ω ∂_θ p = (1/R) ∂_θ [p (∂_θ h)] + (σ²/2) ∂²_θθ p ⇒ p(θ,t,ω). Mean-field cost: c̄(ϑ,t) = ∫_Ω ∫_0^{2π} c•(ϑ,θ) p(θ,t,ω) g(ω) dθ dω. Assume c•(ϑ,θ) = c•(ϑ − θ) = (1/2) sin²((ϑ − θ)/2). Incoherence solution: h(θ,t,ω) = h_0(θ) := 0, p(θ,t,ω) = p_0(θ) := 1/(2π).
  • 99. Analysis of phase transition: Bifurcation analysis. Linearization and spectra. Linearized PDE (about the incoherence solution), for z̃ = (h̃, p̃): ∂_t h̃ = −ω ∂_θ h̃ − c̄ − (σ²/2) ∂²_θθ h̃, ∂_t p̃ = −ω ∂_θ p̃ + (1/(2πR)) ∂²_θθ h̃ + (σ²/2) ∂²_θθ p̃; compactly, ∂_t z̃(θ,t,ω) = L_R z̃(θ,t,ω). Spectrum of the linear operator: 1. Continuous spectrum {S^(k)}_{k=−∞}^{+∞}, S^(k) := { λ ∈ C : λ = ±(σ²/2) k² − i k ω, ω ∈ Ω }. 2. Discrete spectrum, with characteristic equation (1/(8R)) ∫_Ω g(ω) / ((λ − σ²/2 + iω)(λ + σ²/2 + iω)) dω + 1 = 0. [Figure: spectrum in the complex plane for γ = 0.1; the k = 1, 2 branches are shown and the discrete eigenvalues move as R decreases.]
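A minimal numerical sketch for locating a discrete eigenvalue from the characteristic equation, assuming a uniform frequency distribution g(ω) on [1 − γ, 1 + γ]; the quadrature, the parameter values (R, γ, σ), and the initial guess are all assumptions, and the root returned depends on that guess:

```python
import numpy as np
from scipy.optimize import fsolve

def char_eq(lam_ri, R, gamma, sigma):
    """Real/imag parts of 1/(8R) * ∫ g(ω) dω / ((λ - σ²/2 + iω)(λ + σ²/2 + iω)) + 1."""
    lam = lam_ri[0] + 1j * lam_ri[1]
    omega = np.linspace(1 - gamma, 1 + gamma, 2001)
    g = np.full_like(omega, 1.0 / (2 * gamma))       # uniform g(ω) on [1-γ, 1+γ] (assumed)
    integrand = g / ((lam - sigma**2 / 2 + 1j * omega) * (lam + sigma**2 / 2 + 1j * omega))
    val = np.trapz(integrand, omega) / (8 * R) + 1.0
    return [val.real, val.imag]

R, gamma, sigma = 32.0, 0.1, 0.3                     # illustrative values
root = fsolve(char_eq, x0=[-0.02, -0.98], args=(R, gamma, sigma))
print("discrete eigenvalue approx.:", root[0] + 1j * root[1])
```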
  • 103. Analysis of phase transition: Bifurcation analysis. Bifurcation diagram (Hamiltonian Hopf). Characteristic equation: (1/(8R)) ∫_Ω g(ω) / ((λ − σ²/2 + iω)(λ + σ²/2 + iω)) dω + 1 = 0. Stability proof. [Figures: (a) continuous spectrum (independent of R) and discrete spectrum (a function of R) in the complex plane, with the bifurcation point marked; (c) phase diagram in the (γ, R) plane: incoherence for R > R_c(γ), synchrony otherwise.] [3] Dellnitz et al., Int. Series Num. Math., 1992
  • 106. Analysis of phase transition: Numerics. Numerical solution of the PDEs. [Figures: incoherence solution for R = 60; synchrony solution for R = 10.]
  • 108. Analysis of phase transition: Numerics. Bifurcation diagram. [Figures: phase diagram in the (γ, R^{−1/2}) plane, with incoherence (locking) for R > R_c(γ) and synchrony otherwise; average cost η(ω) for ω = 0.95, 1, 1.05, with η(ω) = η_0 on the incoherence branch (R > R_c) and η(ω) < η_0 on the synchrony branch (R < R_c).] Dynamics: dθ_i = (ω_i + u_i) dt + σ dξ_i. Average cost: η_i(u_i; u_{−i}) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_{−i}) + (1/2) R u_i² ] ds.
  • 111. Analysis of phase transition: Numerics. Bifurcation diagrams with other costs. For c•(θ,ϑ) = 0.25 (1 − cos(θ − ϑ)): [Figure: bifurcation diagram in R^{−1/2}, together with the stationary density p(θ) at R = 23.8359.] For c•(θ,ϑ) = 1 − cos(θ − ϑ) − cos 3(θ − ϑ): [Figure: bifurcation diagram in R^{−1/2} and the corresponding stationary densities p(θ).]
  • 115. Feedback particle filter: Numerical method, particle filter. Basic algorithm (discretized time). Step 0, Initialization: for i = 1,...,N, sample x_0^(i) ∼ p_{0|0}(x) and set n = 0. Step 1, Importance sampling: for i = 1,...,N, sample x̃_n^(i) ∼ p^N_{n−1|n−1} K(x_n) = (1/N) ∑_{k=1}^N K(x_n | x_{n−1}^(k)); for i = 1,...,N, evaluate the normalized importance weights w_n^(i) ∝ g(z_n | x̃_n^(i)), with ∑_{i=1}^N w_n^(i) = 1, and set p^N_{n|n}(x) = ∑_{i=1}^N w_n^(i) δ_{x̃_n^(i)}(x). Step 2, Resampling: for i = 1,...,N, sample x_n^(i) ∼ p^N_{n|n}(x).
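A minimal sketch of this bootstrap algorithm for the linear-Gaussian model used in the earlier comparison, with the Euler-discretized dynamics playing the role of the kernel K, a Gaussian likelihood g, and multinomial resampling; step size, horizon, and noise seeds are assumptions:

```python
import numpy as np

a, h, sigB, sigW = -0.5, 3.0, 1.0, 0.5
dt, T, N = 0.01, 5.0, 500
rng = np.random.default_rng(3)

X = 1.0
particles = rng.normal(1.0, 1.0, size=N)          # Step 0: initialization

for _ in range(int(T / dt)):
    # Simulate the signal and the (discretized) observation z_n
    X += a * X * dt + sigB * np.sqrt(dt) * rng.normal()
    z = h * X * dt + sigW * np.sqrt(dt) * rng.normal()

    # Step 1: importance sampling through the transition kernel K,
    # then weights proportional to the likelihood g(z_n | x)
    particles += a * particles * dt + sigB * np.sqrt(dt) * rng.normal(size=N)
    logw = -(z - h * particles * dt)**2 / (2 * sigW**2 * dt)
    w = np.exp(logw - logw.max())
    w /= w.sum()

    # Step 2: (multinomial) resampling from the weighted empirical distribution
    particles = rng.choice(particles, size=N, p=w)

print("true state:", X, " BPF estimate:", particles.mean())
```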
  • 116. Feedback particle filter: Calculation of the K-L divergence. Recall the definition of the K-L divergence for densities: KL(p_n ‖ p̂*_n) = ∫_R p_n(s) ln( p_n(s)/p̂*_n(s) ) ds. With the coordinate transformation s = x + u(x), the K-L divergence is expressed as KL(p_n ‖ p̂*_n) = ∫_R ( p_n^−(x)/|1 + u′(x)| ) ln( ( p_n^−(x)/|1 + u′(x)| ) / p̂*_n(x + u(x) | z_n) ) |1 + u′(x)| dx.
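A small numerical check of this K-L objective: trapezoidal quadrature of ∫ p ln(p/q), compared against the closed form for two Gaussians (purely illustrative; the grid and the test densities are assumptions):

```python
import numpy as np

def kl_divergence(p, q, x):
    """KL(p || q) = ∫ p ln(p/q) dx on a grid, with a small floor to avoid log(0)."""
    p = p / np.trapz(p, x)
    q = q / np.trapz(q, x)
    return np.trapz(p * np.log(np.maximum(p, 1e-300) / np.maximum(q, 1e-300)), x)

x = np.linspace(-10, 10, 4001)
gauss = lambda x, m, s2: np.exp(-(x - m)**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
p, q = gauss(x, 0.0, 1.0), gauss(x, 1.0, 2.0)

# Closed form for Gaussians: ln(s_q/s_p) + (s_p^2 + (m_p - m_q)^2) / (2 s_q^2) - 1/2
exact = 0.5 * np.log(2.0) + (1.0 + 1.0) / (2 * 2.0) - 0.5
print(kl_divergence(p, q, x), exact)   # both approximately 0.3466
```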
  • 117. Feedback particle filter: Derivation of the Euler-Lagrange equation. Denote L(x, u, u′) = −p_n^−(x) [ ln|1 + u′(x)| + ln( p_n^−(x + u) p_{Z|X}(z_n | x + u) ) ] (4). The optimization problem is thus a calculus-of-variations problem: min_u ∫ L(x, u, u′) dx. The minimizer is obtained via the analysis of the first variation, given by the well-known Euler-Lagrange equation ∂L/∂u = (d/dx)(∂L/∂u′). Explicitly substituting the expression (4) for L, we obtain the expected result.
  • 118. Feedback particle filter: Simulation results. A case where the (bootstrap) particle filter does not work well. [Figure: (a) state θ(t) and conditional-mean estimate θ̄_N(t) versus time; (b) PDF estimate on [0, 2π] versus time; (c) first and second harmonics of the estimate, with bounds.]
  • 119. Feedback particle filter: Derivation of the FPK equation for the FPF. Controlled system (N particles): dX_i(t) = a(X_i(t)) dt + U_i(t) (mean-field control) + σ_B dB_i(t), i = 1,...,N (3). Using the ansatz U_i(t) = u(X_i(t)) dt + v(X_i(t)) dZ(t), the particle system becomes dX_i(t) = ( a(X_i(t)) + u(X_i(t)) ) dt + σ_B dB_i(t) + v(X_i(t)) dZ(t), i = 1,...,N (3). Then the conditional distribution of X_i(t) given the filtration Z_t, p(x,t), satisfies the forward equation dp = L† p dt − (∂/∂x)(v p) dZ_t − (∂/∂x)(u p) dt + (σ_W²/2) (∂²/∂x²)(p v²) dt. A short calculation then gives u(x,t) = v(x,t) [ −(1/2)(h(x) + ĥ) + (1/2) σ_W² v′(x,t) ] (4).
  • 120. Feedback particle filter: Derivation of the FPF for the linear case. The density p is a function of the stochastic processes µ_t and Σ_t. Using Itô's formula, dp(x,t) = (∂p/∂µ) dµ_t + (∂p/∂Σ) dΣ_t + (1/2)(∂²p/∂µ²) (dµ_t)², where ∂p/∂µ = ((x − µ_t)/Σ_t) p, ∂p/∂Σ = (1/(2Σ_t)) [ (x − µ_t)²/Σ_t − 1 ] p, and ∂²p/∂µ² = (1/Σ_t) [ (x − µ_t)²/Σ_t − 1 ] p. Substituting these into the forward equation, we obtain a quadratic equation A x² + B x = 0, where A = dΣ_t − ( 2a Σ_t + σ_B² − h² Σ_t²/σ_W² ) dt and B = dµ_t − ( a µ_t dt + (h Σ_t/σ_W²)(dZ_t − h µ_t dt) ). This leads to the Kalman filter.
  • 121. Feedback particle filter: Derivation of the E-L equation. In the continuous-time case, the control is of the form U_t^i = u(X_t^i, t) dt + v(X_t^i, t) dZ_t (5). Substituting this into the E-L BVP for the continuous-discrete time case, we arrive at the formal equation (∂/∂x) { p(x,t) / (1 + u′ dt + v′ dZ_t) } = p(x,t) (∂/∂ũ) { ln p(x + u dt + v dZ_t, t) + ln p_{dZ|X}(dZ_t | x + u dt + v dZ_t) } (6), where p_{dZ|X}(dZ_t | ·) = (1/√(2πσ_W²)) exp( −(dZ_t − h(·))²/(2σ_W²) ) and ũ = u dt + v dZ_t. For notational ease, primes denote partial derivatives with respect to x: p denotes p(x,t), p′ := ∂p/∂x(x,t), p″ := ∂²p/∂x²(x,t), u′ := ∂u/∂x(x,t), v′ := ∂v/∂x(x,t), etc. Note that the time t is fixed.
  • 122. Feedback particle filter: Derivation of the E-L equation, cont. 1. Step 1: The three terms in (6) are simplified as (∂/∂x) { p / (1 + u′ dt + v′ dZ_t) } = p′ − f_1 dt − (p′ v′ + p v″) dZ_t; p (∂/∂ũ) ln p(x + u dt + v dZ_t, t) = p′ + f_2 dt + (p″ v − p′² v/p) dZ_t; p (∂/∂ũ) ln p_{dZ|X}(dZ_t | x + u dt + v dZ_t) = (p/σ_W²) [ h′ dZ_t + (h′ v − h h′) dt ], where we have used Itô's rules dZ_t² = σ_W² dt, dZ_t dt = 0, etc., and f_1 = (p′ u′ + p u″) − σ_W² ( p′ (v′)² + 2 p v′ v″ ), f_2 = (p″ u − p′² u/p) + σ_W² v² ( (1/2) p‴ − 3 p″ p′/(2p) + p′³/p² ). Collecting terms in O(dZ_t) and O(dt), after some simplification, leads to the following ODEs:
  • 123. Feedback particle filter: Derivation of the E-L equation, cont. 2. E(v) = (1/σ_W²) h′(x) (7), E(u) = −(1/σ_W²) h(x) h′(x) + h′(x) v + σ_W² G(x,t) (8), where E(v) = −(∂/∂x) [ (1/p(x,t)) (∂/∂x) { p(x,t) v(x,t) } ] and G = −2 v′ v″ − (v′)² (ln p)′ + (1/2) v² (ln p)‴.
  • 124. Feedback particle filter: Derivation of the E-L equation, cont. 3. Step 2: Suppose (u, v) are admissible solutions of the E-L BVP (7)-(8). Then it is claimed that −(p v)′ = ((h − ĥ)/σ_W²) p (9) and −(p u)′ = −((h − ĥ) ĥ/σ_W²) p − (1/2) σ_W² (p v²)″ (10). Recall that admissible here means lim_{x→±∞} p(x,t) u(x,t) = 0 and lim_{x→±∞} p(x,t) v(x,t) = 0 (11). To show (9), integrate (7) once to obtain −(p v)′ = (1/σ_W²) h p + C p,
  • 125. Feedback particle filter: Derivation of the E-L equation, cont. 4. where the constant of integration C = −ĥ/σ_W² is obtained by integrating once again from −∞ to ∞ and using the boundary conditions (11) for v. This gives (9). To show (10), we denote its right-hand side as R and claim that (R/p)′ = −h h′/σ_W² + h′ v + σ_W² G (12). The equation (10) then follows by using the ODE (8) together with the boundary conditions (11) for u. The verification of the claim involves a straightforward calculation, where (7) is used to obtain expressions for h′ and v′. The details of this calculation are omitted on account of space.
  • 126. Feedback particle filter: Derivation of the E-L equation, cont. 5. Step 3: The E-L equation for v is given by (7). A short calculation then gives u(x,t) = v(x,t) [ −(1/2)(h(x) + ĥ) + (1/2) σ_W² v′(x,t) ] (13).
  • 127. Bibliography. Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, Massachusetts, 1995. Eric Brown, Jeff Moehlis, and Philip Holmes. On the phase reduction and response dynamics of neural oscillator populations. Neural Computation, 16(4):673–715, 2004. M. Dellnitz, J. E. Marsden, I. Melbourne, and J. Scheurle. Generic bifurcations of pendula. Int. Series Num. Math., 104:111–122, 1992. A. Doucet, N. de Freitas, and N. Gordon. Sequential Monte Carlo Methods in Practice. Springer-Verlag, 2001. N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings F: Radar and Signal Processing, 140(2):107–113, 1993. J. Guckenheimer. Isochrons and phaseless sets. J. Math. Biol., 1:259–273, 1975. M. Huang, P. E. Caines, and R. P. Malhame. Large-population cost-coupled LQG problems with nonuniform agents: individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, 52(9):1560–1571, 2007. R. E. Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1):35–45, 1960. Y. Kuramoto. International Symposium on Mathematical Problems in Theoretical Physics, volume 39 of Lecture Notes in Physics. Springer-Verlag, 1975. H. J. Kushner. On the differential equations satisfied by conditional probability densities of Markov processes. SIAM J. Control, 2:106–119, 1964. J.-M. Lasry and P.-L. Lions. Mean field games. Japanese Journal of Mathematics, 2(2):229–260, 2007. Sean P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Transactions on Automatic Control, 42(12):1663–1680, 1997. S. H. Strogatz and R. E. Mirollo. Stability of incoherence in a population of coupled oscillators. Journal of Statistical Physics, 63:613–635, 1991. G. Y. Weintraub, L. Benkard, and B. Van Roy. Oblivious equilibrium: a mean field approximation for large-scale dynamic games. In Advances in Neural Information Processing Systems, volume 18. MIT Press, 2006. H. Yin, P. G. Mehta, S. P. Meyn, and U. V. Shanbhag. Synchronization of coupled oscillators is a game. In Proc. of 2010 American Control Conference, pages 1783–1790, Baltimore, MD, 2010. H. Yin, P. G. Mehta, S. P. Meyn, and U. V. Shanbhag. Synchronization of coupled oscillators is a game. IEEE Trans. Automat. Control, to appear.