Synchronization of coupled oscillators is a game
Prashant G. Mehta1
1Coordinated Science Laboratory
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
University of Maryland, March 4, 2010
Acknowledgment: AFOSR, NSF
Huibing Yin Sean P. Meyn Uday V. Shanbhag
H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of coupled oscillators is a game,” ACC 2010
P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 2 / 69
Millennium bridge
Video of the London Millennium Bridge (from YouTube)
[11] S. H. Strogatz et al., Nature, 2005
Classical Kuramoto model

$$ d\theta_i(t) = \Big( \omega_i + \frac{\kappa}{N} \sum_{j=1}^{N} \sin(\theta_j(t) - \theta_i(t)) \Big)\, dt + \sigma\, d\xi_i(t), \qquad i = 1,\dots,N $$

ωi taken from distribution g(ω) over [1−γ, 1+γ]
γ — measures the heterogeneity of the population
κ — measures the strength of coupling

[Figure: phase diagram in the (γ, κ) plane — incoherence for κ < κc(γ), synchrony (locking) above the boundary]

[6] Y. Kuramoto (1975)
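The model above can be simulated directly. Below is a minimal Euler-Maruyama sketch (all parameter values are illustrative assumptions, not from the talk) that distinguishes synchrony from incoherence via the order parameter r = |N⁻¹ Σⱼ exp(iθⱼ)|:

```python
import numpy as np

rng = np.random.default_rng(0)
N, dt, T = 100, 0.01, 20.0
gamma, sigma = 0.05, 0.05                    # illustrative heterogeneity and noise
# frequencies omega_i drawn from a uniform g(omega) on [1-gamma, 1+gamma]
omega = rng.uniform(1 - gamma, 1 + gamma, N)

def order_parameter(kappa):
    """Euler-Maruyama for d(theta_i) = (omega_i + (kappa/N) sum_j sin(theta_j - theta_i)) dt + sigma d(xi_i)."""
    theta = rng.uniform(0, 2 * np.pi, N)     # random initial phases
    for _ in range(int(T / dt)):
        coupling = (kappa / N) * np.sin(theta[None, :] - theta[:, None]).sum(axis=1)
        theta += (omega + coupling) * dt + sigma * np.sqrt(dt) * rng.standard_normal(N)
    return abs(np.exp(1j * theta).mean())    # r in [0, 1]: near 0 incoherent, near 1 locked

r_sync = order_parameter(kappa=4.0)  # strong coupling, well above kappa_c: locking
r_inc = order_parameter(kappa=0.0)   # no coupling: incoherence
print(r_sync, r_inc)
```

With strong coupling the phases lock and r approaches 1; with no coupling the heterogeneous frequencies keep the phases spread out and r stays small.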
Movies of incoherence and synchrony solution
Incoherence Synchrony
Problem statement

Dynamics of the i-th oscillator:

$$ d\theta_i = (\omega_i + u_i(t))\, dt + \sigma\, d\xi_i, \qquad i = 1,\dots,N, \quad t \ge 0 $$

ui(t) — control

The i-th oscillator seeks to minimize

$$ \eta_i(u_i; u_{-i}) = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\Big[ \underbrace{c(\theta_i; \theta_{-i})}_{\text{cost of anarchy}} + \underbrace{\tfrac{1}{2} R u_i^2}_{\text{cost of control}} \Big]\, ds $$

θ−i = (θj)j≠i
R — control penalty
c(·) — cost function:

$$ c(\theta_i; \theta_{-i}) = \frac{1}{N} \sum_{j \ne i} c^{\bullet}(\theta_i, \theta_j), \qquad c^{\bullet} \ge 0 $$
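The cost of anarchy is easy to evaluate numerically. The sketch below uses the pairwise cost c•(θ, ϑ) = ½ sin²((θ − ϑ)/2) adopted later in the talk; the particular phase values are illustrative:

```python
import numpy as np

def pairwise_cost(theta, vartheta):
    # c_bullet(theta, vartheta) = (1/2) sin^2((theta - vartheta)/2) >= 0
    return 0.5 * np.sin((theta - vartheta) / 2) ** 2

def anarchy_cost(i, theta):
    # c(theta_i; theta_{-i}) = (1/N) sum_{j != i} c_bullet(theta_i, theta_j)
    others = np.delete(theta, i)
    return pairwise_cost(theta[i], others).sum() / len(theta)

theta = np.array([0.0, 0.0, np.pi])  # two aligned oscillators, one anti-phase
aligned = anarchy_cost(0, theta)     # pays only for the anti-phase neighbor: (1/3)(0 + 1/2) = 1/6
print(aligned)
```

An oscillator aligned with the rest of the population pays nothing; the cost grows as its phase separates from its neighbors.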
1 Motivation
Why a game?
Why Oscillators?
2 Problems and results
Problem statement
Main results
3 Derivation of model
Overview
Derivation steps
PDE model
4 Analysis of phase transition
Incoherence solution
Bifurcation analysis
Numerics
5 Learning
Q-function approximation
Steepest descent algorithm
Motivation Why a game?
Quiz
In the video you just watched, why were the
individuals walking strangely?
A. To show respect to the Queen.
B. Anarchists in the crowd were trying to destabilize the bridge.
C. They were stepping to the beat of the soundtrack "Walk Like an
Egyptian."
D. The individuals were trying to maintain their balance.
Motivation Why a game?
“Rational irrationality”
“—behavior that, on the individual level, is perfectly reasonable but
that, when aggregated in the marketplace, produces calamity.”
Examples
Millennium bridge
Financial market
John Cassidy, “Rational Irrationality: The real reason that capitalism is so crash-prone,” The New Yorker, 2009
Motivation Why Oscillators?
Hodgkin-Huxley type neuron model

$$ C \frac{dV}{dt} = -g_T \cdot m_\infty^2(V) \cdot h \cdot (V - E_T) - g_h \cdot r \cdot (V - E_h) - \dots $$

$$ \frac{dh}{dt} = \frac{h_\infty(V) - h}{\tau_h(V)}, \qquad \frac{dr}{dt} = \frac{r_\infty(V) - r}{\tau_r(V)} $$

[Figure: neural spike train — voltage vs. time]
[Figure: limit cycle in the (V, h, r) state space]

Normal form reduction:

$$ \dot{\theta}_i = \omega_i + u_i \cdot \Phi(\theta_i) $$

[4] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004
Problems and results Problem statement
Finite oscillator model

Dynamics of the i-th oscillator:

$$ d\theta_i = (\omega_i + u_i(t))\, dt + \sigma\, d\xi_i, \qquad i = 1,\dots,N, \quad t \ge 0 $$

ui(t) — control

The i-th oscillator seeks to minimize

$$ \eta_i(u_i; u_{-i}) = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\Big[ \underbrace{c(\theta_i; \theta_{-i})}_{\text{cost of anarchy}} + \underbrace{\tfrac{1}{2} R u_i^2}_{\text{cost of control}} \Big]\, ds $$

θ−i = (θj)j≠i, R — control penalty, and the cost function is

$$ c(\theta_i; \theta_{-i}) = \frac{1}{N} \sum_{j \ne i} c^{\bullet}(\theta_i, \theta_j), \qquad c^{\bullet} \ge 0 $$
Problems and results Main results
1. Synchronization is a solution of the game

Mean-field game model:

$$ d\theta_i = (\omega_i + u_i)\, dt + \sigma\, d\xi_i, \qquad \eta_i(u_i; u_{-i}) = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\big[ c(\theta_i; \theta_{-i}) + \tfrac{1}{2} R u_i^2 \big]\, ds $$

[Figure: phase diagram in the (γ, R^{-1/2}) plane — incoherence for R > Rc(γ), synchrony (locking) otherwise] (Yin et al., ACC 2010)

Classical Kuramoto model:

$$ d\theta_i = \Big( \omega_i + \frac{\kappa}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i) \Big)\, dt + \sigma\, d\xi_i $$

[Figure: phase diagram in the (γ, κ) plane — incoherence for κ < κc(γ), synchrony (locking) otherwise] (Strogatz et al., J. Stat. Phys., 1992)
Problems and results Main results
2. Kuramoto control is approximately optimal

[Figure: population density and control laws vs. θ ∈ [0, 2π] for ω = 1, compared with the Kuramoto law]

Approximately optimal control law:

$$ u_i = -\frac{A_i^*}{R} \frac{1}{N} \sum_{j \ne i} \sin(\theta - \theta_j(t)) $$

Learning algorithm:

$$ \frac{dA_i}{dt} = -\varepsilon\, \dots $$

[Figure: trajectories of Ai(t) converging to A*; κ = 0.01, R = 1000]

Yin et al., CDC 2010
Derivation of model Overview
Overview of model derivation

$$ d\theta_i = (\omega_i + u_i(t))\, dt + \sigma\, d\xi_i, \qquad \eta_i(u_i; u_{-i}) = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\big[ \bar{c}(\theta_i, t) + \tfrac{1}{2} R u_i^2 \big]\, ds $$

[Diagram: each oscillator influences, and is influenced by, the mass]

1 Mean-field approximation. Assumption:

$$ c(\theta_i; \theta_{-i}(t)) = \frac{1}{N} \sum_{j \ne i} c^{\bullet}(\theta_i, \theta_j) \;\xrightarrow{\;N\to\infty\;}\; \bar{c}(\theta, t) $$

2 Optimal control of single oscillator
Decentralized control structure

[5] M. Huang, P. Caines, and R. Malhame, IEEE TAC, 2007 [HCM]
Derivation of model Derivation steps
Single oscillator with given cost

Dynamics of the oscillator:

$$ d\theta_i = (\omega_i + u_i(t))\, dt + \sigma\, d\xi_i, \qquad t \ge 0 $$

The cost function is assumed known: c(θi; θ−i) is replaced by the mean-field cost c̄(θi(s), s) in

$$ \eta_i(u_i; \bar{c}) = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\big[ \bar{c}(\theta_i(s), s) + \tfrac{1}{2} R u_i^2(s) \big]\, ds $$

HJB equation:

$$ \partial_t h_i + \omega_i \partial_\theta h_i = \frac{1}{2R} (\partial_\theta h_i)^2 - \bar{c}(\theta, t) + \eta_i^* - \frac{\sigma^2}{2} \partial^2_{\theta\theta} h_i $$

Optimal control law:

$$ u_i^*(t) = \varphi_i(\theta, t) = -\frac{1}{R} \partial_\theta h_i(\theta, t) $$

[1] D. P. Bertsekas (1995); [9] S. P. Meyn, IEEE TAC, 1997
Derivation of model Derivation steps
Single oscillator with optimal control

Dynamics of the oscillator under the optimal control law:

$$ d\theta_i(t) = \Big( \omega_i - \frac{1}{R} \partial_\theta h_i(\theta_i, t) \Big)\, dt + \sigma\, d\xi_i(t) $$

Fokker-Planck equation for the pdf p(θ, t, ωi):

$$ \text{FPK:}\quad \partial_t p + \omega_i \partial_\theta p = \frac{1}{R} \partial_\theta \big[ p\, (\partial_\theta h) \big] + \frac{\sigma^2}{2} \partial^2_{\theta\theta} p $$

[7] A. Lasota and M. C. Mackey, “Chaos, Fractals and Noise,” Springer 1994
Derivation of model Derivation steps
Mean-field approximation

HJB equation for the population, solved for h(θ, t, ω):

$$ \partial_t h + \omega \partial_\theta h = \frac{1}{2R} (\partial_\theta h)^2 - \bar{c}(\theta, t) + \eta(\omega) - \frac{\sigma^2}{2} \partial^2_{\theta\theta} h $$

Population density p(θ, t, ω):

$$ \partial_t p + \omega \partial_\theta p = \frac{1}{R} \partial_\theta \big[ p\, (\partial_\theta h) \big] + \frac{\sigma^2}{2} \partial^2_{\theta\theta} p $$

Enforce cost consistency:

$$ \bar{c}(\theta, t) = \int_\Omega \int_0^{2\pi} c^{\bullet}(\theta, \vartheta)\, p(\vartheta, t, \omega)\, g(\omega)\, d\vartheta\, d\omega \;\approx\; \frac{1}{N} \sum_{j \ne i} c^{\bullet}(\theta, \theta_j) $$
Derivation of model PDE model
Summary

$$ \text{HJB:}\quad \partial_t h + \omega \partial_\theta h = \frac{1}{2R} (\partial_\theta h)^2 - \bar{c}(\theta, t) + \eta^* - \frac{\sigma^2}{2} \partial^2_{\theta\theta} h \;\Rightarrow\; h(\theta, t, \omega) $$

$$ \text{FPK:}\quad \partial_t p + \omega \partial_\theta p = \frac{1}{R} \partial_\theta \big[ p\, (\partial_\theta h) \big] + \frac{\sigma^2}{2} \partial^2_{\theta\theta} p \;\Rightarrow\; p(\theta, t, \omega) $$

$$ \text{Mean-field approx.:}\quad \bar{c}(\vartheta, t) = \int_\Omega \int_0^{2\pi} c^{\bullet}(\vartheta, \theta)\, p(\theta, t, \omega)\, g(\omega)\, d\theta\, d\omega $$

1 Bellman’s optimality principle (H, J, B)
2 Propagation of chaos (F, P, K, McKean, Vlasov, . . . )
3 Mean-field approximation (Boltzmann, Kac, . . . )
4 Connection to Nash game (Weintraub, HCM, Altman, . . . )
Derivation of model PDE model
1. Solution of PDE gives ε-Nash equilibrium

Optimal control law:

$$ u_i^{o} = -\frac{1}{R} \partial_\theta h(\theta(t), t, \omega) \Big|_{\omega = \omega_i} $$

ε-Nash property (as N → ∞):

$$ \eta_i(u_i^{o}; u_{-i}^{o}) \le \eta_i(u_i; u_{-i}^{o}) + O\Big( \frac{1}{\sqrt{N}} \Big), \qquad i = 1,\dots,N. $$

So, we look for solutions of the PDEs.
Derivation of model PDE model
2. Incoherence solution (PDE)

Assume

$$ c^{\bullet}(\vartheta, \theta) = c^{\bullet}(\vartheta - \theta) = \tfrac{1}{2} \sin^2\Big( \frac{\vartheta - \theta}{2} \Big) $$

Incoherence solution:

$$ h(\theta, t, \omega) = h_0(\theta) := 0, \qquad p(\theta, t, \omega) = p_0(\theta) := \frac{1}{2\pi} $$

This pair solves the coupled HJB, FPK, and cost-consistency equations. The optimal control is

$$ u = -\frac{1}{R} \partial_\theta h = 0 $$

so there is no cost of control. The average cost is

$$ \bar{c}(\theta, t) = \int_\Omega \int_0^{2\pi} \tfrac{1}{2} \sin^2\Big( \frac{\theta - \vartheta}{2} \Big) \frac{1}{2\pi}\, g(\omega)\, d\vartheta\, d\omega = \frac{1}{4} $$

$$ \eta^*(\omega) = \bar{c}(\theta, t) = \frac{1}{4} =: \eta_0 \quad \text{for all } \omega \in \Omega $$
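The value η0 = 1/4 is easy to verify numerically: averaging ½ sin²((θ − ϑ)/2) against the uniform density 1/(2π) gives 1/4 for every θ. A quick sketch:

```python
import numpy as np

# Average of (1/2) sin^2((theta - vartheta)/2) over vartheta ~ Uniform[0, 2*pi).
# A uniform grid without the right endpoint is a midpoint-type rule, which is
# highly accurate for smooth periodic integrands.
grid = np.linspace(0, 2 * np.pi, 100000, endpoint=False)
cbars = [np.mean(0.5 * np.sin((theta - grid) / 2) ** 2) for theta in (0.0, 1.0, 2.5)]
print(cbars)  # each value is 1/4, independent of theta
```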
Derivation of model PDE model
2. Incoherence solution (Finite population)

Closed-loop dynamics with ui = 0:

$$ d\theta_i = (\omega_i + \underbrace{u_i}_{=0})\, dt + \sigma\, d\xi_i(t) $$

Average cost:

$$ \eta_i = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\big[ c(\theta_i; \theta_{-i}) + \underbrace{\tfrac{1}{2} R u_i^2}_{=0} \big]\, dt = \lim_{T\to\infty} \frac{1}{N} \sum_{j \ne i} \frac{1}{T} \int_0^T \mathsf{E}\Big[ \tfrac{1}{2} \sin^2\Big( \frac{\theta_i(t) - \theta_j(t)}{2} \Big) \Big]\, dt $$

$$ = \frac{1}{N} \sum_{j \ne i} \int_0^{2\pi} \mathsf{E}\Big[ \tfrac{1}{2} \sin^2\Big( \frac{\theta_i(t) - \vartheta}{2} \Big) \Big] \frac{1}{2\pi}\, d\vartheta = \frac{N-1}{N}\, \eta_0 $$

ε-Nash property:

$$ \eta_i(u_i^{o}; u_{-i}^{o}) \le \eta_i(u_i; u_{-i}^{o}) + O\Big( \frac{1}{\sqrt{N}} \Big), \qquad i = 1,\dots,N. $$
Derivation of model PDE model
3. Synchronization is a solution of the game

$$ d\theta_i = (\omega_i + u_i)\, dt + \sigma\, d\xi_i, \qquad \eta_i(u_i; u_{-i}) = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\big[ c(\theta_i; \theta_{-i}) + \tfrac{1}{2} R u_i^2 \big]\, ds, \qquad \eta(\omega) = \min_{u_i} \eta_i(u_i; u_{-i}^{o}) $$

[Figure: phase diagram in the (γ, R^{-1/2}) plane — incoherence for R > Rc(γ), synchrony (locking) otherwise]
[Figure: η(ω) vs. R^{-1/2} for ω = 0.95, 1, 1.05 — η(ω) = η0 on the incoherence branch (R > Rc), η(ω) < η0 on the synchrony branch (R < Rc)]
[Figure: snapshot of the synchrony solution density at t = 38.24]

Yin et al., “Synchronization of coupled oscillators is a game,” ACC 2010
Analysis of phase transition Incoherence solution
Overview of the steps

$$ \text{HJB:}\quad \partial_t h + \omega \partial_\theta h = \frac{1}{2R} (\partial_\theta h)^2 - \bar{c}(\theta, t) + \eta^* - \frac{\sigma^2}{2} \partial^2_{\theta\theta} h \;\Rightarrow\; h(\theta, t, \omega) $$

$$ \text{FPK:}\quad \partial_t p + \omega \partial_\theta p = \frac{1}{R} \partial_\theta \big[ p\, (\partial_\theta h) \big] + \frac{\sigma^2}{2} \partial^2_{\theta\theta} p \;\Rightarrow\; p(\theta, t, \omega) $$

$$ \bar{c}(\vartheta, t) = \int_\Omega \int_0^{2\pi} c^{\bullet}(\vartheta, \theta)\, p(\theta, t, \omega)\, g(\omega)\, d\theta\, d\omega $$

Assume

$$ c^{\bullet}(\vartheta, \theta) = c^{\bullet}(\vartheta - \theta) = \tfrac{1}{2} \sin^2\Big( \frac{\vartheta - \theta}{2} \Big) $$

Incoherence solution:

$$ h(\theta, t, \omega) = h_0(\theta) := 0, \qquad p(\theta, t, \omega) = p_0(\theta) := \frac{1}{2\pi} $$
Analysis of phase transition Bifurcation analysis
Linearization and spectra

Linearized PDE (about the incoherence solution), with z̃ = (h̃, p̃):

$$ \frac{\partial}{\partial t} \tilde{z}(\theta, t, \omega) = \begin{pmatrix} -\omega \partial_\theta \tilde{h} - \tilde{\bar{c}} - \frac{\sigma^2}{2} \partial^2_{\theta\theta} \tilde{h} \\[4pt] -\omega \partial_\theta \tilde{p} + \frac{1}{2\pi R} \partial^2_{\theta\theta} \tilde{h} + \frac{\sigma^2}{2} \partial^2_{\theta\theta} \tilde{p} \end{pmatrix} =: \mathcal{L}_R \tilde{z}(\theta, t, \omega) $$

Spectrum of the linear operator:

1 Continuous spectrum {S^{(k)}}, k = −∞, …, +∞:

$$ S^{(k)} := \Big\{ \lambda \in \mathbb{C} \;\Big|\; \lambda = \pm \frac{\sigma^2}{2} k^2 - k \omega i \text{ for some } \omega \in \Omega \Big\} $$

[Figure: spectrum in the complex plane for γ = 0.1 — branches k = 1, 2; the discrete eigenvalues move as R decreases]

2 Discrete spectrum. Characteristic equation:

$$ \frac{1}{8R} \int_\Omega \frac{g(\omega)}{\big(\lambda - \frac{\sigma^2}{2} + \omega i\big)\big(\lambda + \frac{\sigma^2}{2} + \omega i\big)}\, d\omega + 1 = 0. $$
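In the homogeneous case γ = 0, where g(ω) concentrates at ω = 1, the characteristic equation reduces to a quadratic in λ and can be checked in closed form. The sketch below (parameter values illustrative) verifies the roots λ = −i ± (σ⁴/4 − 1/(8R))^{1/2} and the collision point at which the eigenvalue pair merges, the Hamiltonian Hopf scenario discussed next:

```python
import numpy as np

sigma, R = 1.0, 1.0  # illustrative values
# For g(omega) = delta(omega - 1) the characteristic equation
#   1/(8R) * 1/((lam - sigma^2/2 + 1j)*(lam + sigma^2/2 + 1j)) + 1 = 0
# reduces to (lam + 1j)^2 = sigma^4/4 - 1/(8R).
s = np.sqrt(complex(sigma**4 / 4 - 1 / (8 * R)))
residuals = []
for lam in (-1j + s, -1j - s):
    term = 1 / (8 * R * (lam - sigma**2 / 2 + 1j) * (lam + sigma**2 / 2 + 1j))
    residuals.append(abs(term + 1))
print(residuals)  # both ~ 0: the closed-form roots satisfy the equation

# The two roots collide when sigma^4/4 = 1/(8R), i.e. at R_c = 1/(2*sigma^4),
# matching the phase-transition boundary R_c(0) stated later in the talk.
R_c = 1 / (2 * sigma**4)
print(R_c)
```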
Analysis of phase transition Bifurcation analysis
Bifurcation diagram (Hamiltonian Hopf)

Characteristic equation:

$$ \frac{1}{8R} \int_\Omega \frac{g(\omega)}{\big(\lambda - \frac{\sigma^2}{2} + \omega i\big)\big(\lambda + \frac{\sigma^2}{2} + \omega i\big)}\, d\omega + 1 = 0. $$

Stability proof: [3] Dellnitz et al., Int. Series Num. Math., 1992

[Figure (a): spectrum in the complex plane — continuous spectrum independent of R, discrete spectrum a function of R, bifurcation point where the eigenvalue pair collides]
[Figure (c): phase diagram in the (γ, R) plane — incoherence for R > Rc(γ), synchrony below the boundary]
Analysis of phase transition Numerics
Numerical solution of PDEs

[Movie: incoherence solution; R = 60]
[Movie: synchrony solution; R = 10]
Analysis of phase transition Numerics
Bifurcation diagram

$$ d\theta_i = (\omega_i + u_i)\, dt + \sigma\, d\xi_i, \qquad \eta_i(u_i; u_{-i}) = \lim_{T\to\infty} \frac{1}{T} \int_0^T \mathsf{E}\big[ c(\theta_i; \theta_{-i}) + \tfrac{1}{2} R u_i^2 \big]\, ds $$

[Figure: phase diagram in the (γ, R^{-1/2}) plane — incoherence for R > Rc(γ), synchrony (locking) otherwise]
[Figure: η(ω) vs. R^{-1/2} for ω = 0.95, 1, 1.05 — η(ω) = η0 on the incoherence branch (R > Rc), η(ω) < η0 on the synchrony branch (R < Rc)]
Learning Q-function approximation
Comparison to Kuramoto law

Control law u = ϕ(θ, t, ω)

[Figure: population density and control laws vs. θ ∈ [0, 2π] for ω = 0.95, 1, 1.05, compared with the Kuramoto law]

Equivalent control law in the Kuramoto oscillator:

$$ u_i^{(\text{Kur})} = \frac{\kappa}{N} \sum_{j=1}^{N} \sin(\theta_j(t) - \theta_i) \;\overset{N\to\infty}{\approx}\; \kappa_0 \sin(\vartheta_0 + t - \theta_i) $$
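The N → ∞ limit on the right can be illustrated numerically: when the population is concentrated around the synchronized phase ϑ0 + t, the averaged coupling collapses to a single sinusoid. In this sketch the Gaussian spread and the effective gain κ0 = κ E[cos δ] are illustrative assumptions, not quantities from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)
kappa, t, vartheta0 = 1.0, 2.0, 0.3   # illustrative values
delta = rng.normal(0.0, 0.1, 200000)  # small deviations from synchrony
theta_j = vartheta0 + t + delta       # population phases theta_j(t)

theta_i = 1.0                                   # phase of the i-th oscillator
lhs = kappa * np.sin(theta_j - theta_i).mean()  # (kappa/N) sum_j sin(theta_j - theta_i)
kappa0 = kappa * np.cos(delta).mean()           # effective amplitude
rhs = kappa0 * np.sin(vartheta0 + t - theta_i)  # kappa0 sin(vartheta0 + t - theta_i)
print(lhs, rhs)  # the two agree up to O(1/sqrt(N)) sampling error
```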
Learning Q-function approximation

Optimality equation:

$$ \min_{u_i} \Big\{ \underbrace{c(\theta; \theta_{-i}(t)) + \tfrac{1}{2} R u_i^2 + D^{u_i} h_i(\theta, t)}_{=:\, H_i(\theta, u_i; \theta_{-i}(t))} \Big\} = \eta_i^* $$

Optimal control law and Kuramoto law:

$$ u_i^* = -\frac{1}{R} \partial_\theta h_i(\theta, t), \qquad u_i^{(\text{Kur})} = -\frac{\kappa}{N} \sum_{j \ne i} \sin(\theta_i - \theta_j(t)) $$

Parameterization:

$$ H_i^{(A_i, \phi_i)}(\theta, u_i; \theta_{-i}(t)) = c(\theta; \theta_{-i}(t)) + \tfrac{1}{2} R u_i^2 + (\omega_i - 1 + u_i)\, A_i S^{(\phi_i)} + \frac{\sigma^2}{2} A_i C^{(\phi_i)} $$

where

$$ S^{(\phi)}(\theta, \theta_{-i}) = \frac{1}{N} \sum_{j \ne i} \sin(\theta - \theta_j - \phi), \qquad C^{(\phi)}(\theta, \theta_{-i}) = \frac{1}{N} \sum_{j \ne i} \cos(\theta - \theta_j - \phi) $$

Approximately optimal control:

$$ u_i^{(A_i, \phi_i)} = \arg\min_{u_i} \big\{ H_i^{(A_i, \phi_i)}(\theta, u_i; \theta_{-i}(t)) \big\} = -\frac{A_i}{R} S^{(\phi_i)}(\theta, \theta_{-i}) $$
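The parameterized control can be written down directly. A minimal sketch of the basis functions S^(φ), C^(φ) and the resulting law u = −(A/R) S^(φ); this sketch averages over the N − 1 neighbors, and the parameter values are illustrative:

```python
import numpy as np

def S(theta, others, phi):
    # S_phi(theta, theta_{-i}): average of sin(theta - theta_j - phi) over j != i
    return np.mean(np.sin(theta - others - phi))

def C(theta, others, phi):
    # C_phi(theta, theta_{-i}): average of cos(theta - theta_j - phi) over j != i
    return np.mean(np.cos(theta - others - phi))

def approx_control(theta, others, A, phi, R):
    # Minimizing (1/2) R u^2 + u * A * S over u gives u = -(A/R) * S.
    return -(A / R) * S(theta, others, phi)

others = np.array([0.1, 0.2, 0.3])
u = approx_control(0.2, others, A=2.0, phi=0.0, R=10.0)
print(u)  # theta sits at the mean neighbor phase, so S = 0 and the control vanishes
```

An oscillator ahead of the mean neighbor phase receives a negative (retarding) control, which is the Kuramoto-like attraction toward the population.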
Learning Steepest descent algorithm

Bellman error, pointwise:

$$ \mathcal{L}^{(A_i, \phi_i)}(\theta, t) = \min_{u_i} \big\{ H_i^{(A_i, \phi_i)} \big\} - \eta_i^{(A_i^*, \phi_i^*)} $$

Simple gradient descent algorithm:

$$ \tilde{e}(A_i, \phi_i) = \sum_{k=1}^{2} \big| \langle \mathcal{L}^{(A_i, \phi_i)}, \tilde{\varphi}_k(\theta) \rangle \big|^2 $$

$$ \frac{dA_i}{dt} = -\varepsilon \frac{d\tilde{e}(A_i, \phi_i)}{dA_i}, \qquad \frac{d\phi_i}{dt} = -\varepsilon \frac{d\tilde{e}(A_i, \phi_i)}{d\phi_i} \qquad (*) $$

Theorem (Convergence)
Assume the population is in synchrony and the i-th oscillator updates according to (∗). Then

$$ A_i(t) \to A^* = \frac{1}{2\sigma^2} $$

and the pointwise Bellman error is $ \mathcal{L}^{(A_i, 0)}(\theta, t) = \varepsilon(R) \cos^2(\theta - t) $, where $ \varepsilon(R) = \frac{1}{16 R \sigma^4} $.
Learning Steepest descent algorithm
Phase transition

Suppose all oscillators use the approximately optimal control law

$$ u_i = -\frac{A^*}{R} \frac{1}{N} \sum_{j \ne i} \sin(\theta_i - \theta_j(t)) $$

Then the phase transition boundary is

$$ R_c(\gamma) = \begin{cases} \dfrac{1}{2\sigma^4} & \text{if } \gamma = 0 \\[8pt] \dfrac{1}{4\sigma^2 \gamma} \tan^{-1}\Big( \dfrac{2\gamma}{\sigma^2} \Big) & \text{if } \gamma > 0 \end{cases} $$

[Figure: trajectories of Ai(t) converging to A*; κ = 0.01, R = 1000]
[Figure: phase transition boundary in the (γ, R) plane, PDE vs. learning — incoherence above, synchrony below]
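The boundary formula is straightforward to evaluate. A small sketch checking that the γ → 0 limit of the heterogeneous branch recovers 1/(2σ⁴) (since tan⁻¹ x ≈ x for small x), and that heterogeneity lowers R_c:

```python
import numpy as np

def R_c(gamma, sigma):
    # Phase-transition boundary from the slide above.
    if gamma == 0:
        return 1.0 / (2 * sigma**4)
    return np.arctan(2 * gamma / sigma**2) / (4 * sigma**2 * gamma)

sigma = 1.0
print(R_c(0.0, sigma))    # homogeneous population: 1/(2 sigma^4)
print(R_c(1e-9, sigma))   # continuous in gamma at 0
print(R_c(0.2, sigma))    # heterogeneity lowers the critical R
```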
Thank you!
Website: http://www.mechse.illinois.edu/research/mehtapg
Huibing Yin Sean P. Meyn Uday V. Shanbhag
H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of coupled oscillators is a game,” ACC 2010
Bibliography

[1] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, Massachusetts, 1995.
[2] Eric Brown, Jeff Moehlis, and Philip Holmes. On the phase reduction and response dynamics of neural oscillator populations. Neural Computation, 16(4):673–715, 2004.
[3] M. Dellnitz, J. E. Marsden, I. Melbourne, and J. Scheurle. Generic bifurcations of pendula. Int. Series Num. Math., 104:111–122, 1992.
[4] J. Guckenheimer. Isochrons and phaseless sets. J. Math. Biol., 1:259–273, 1975.
[5] Minyi Huang, Peter E. Caines, and Roland P. Malhame. Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, 52(9):1560–1571, 2007.
[6] Y. Kuramoto. International Symposium on Mathematical Problems in Theoretical Physics, volume 39 of Lecture Notes in Physics. Springer-Verlag, 1975.
[7] Andrzej Lasota and Michael C. Mackey. Chaos, Fractals and Noise. Springer, 1994.
[8] P. Mehta and S. Meyn. Q-learning and Pontryagin’s Minimum Principle. To appear, 48th IEEE Conference on Decision and Control, December 16–18, 2009.
[9] Sean P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Transactions on Automatic Control, 42(12):1663–1680, December 1997.
[10] S. H. Strogatz and R. E. Mirollo. Stability of incoherence in a population of coupled oscillators. Journal of Statistical Physics, 63:613–635, May 1991.
[11] Steven H. Strogatz, Daniel M. Abrams, Bruno Eckhardt, and Edward Ott. Theoretical mechanics: Crowd synchrony on the millennium bridge. Nature, 438:43–44, 2005.


Maryland 2010

  • 1. CSLCOORDINATED SCIENCE LABORATORY Synchronization of coupled oscillators is a game Prashant G. Mehta1 1Coordinated Science Laboratory Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign University of Maryland, March 4, 2010 Acknowledgment: AFOSR, NSF
  • 2. Huibing Yin Sean P. Meyn Uday V. Shanbhag H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of coupled oscillators is a game,” ACC 2010 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 2 / 69
  • 3. Millennium bridge Video of London Millennium bridge from youtube [11] S. H. Strogatz et al., Nature, 2005 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 3 / 69
  • 4. Classical Kuramoto model dθi(t) = � ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) � dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling [6] Y. Kuramoto (1975) P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 4 / 69
  • 5. Classical Kuramoto model dθi(t) = � ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) � dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling 1- 1+1 [6] Y. Kuramoto (1975) P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 4 / 69
  • 6. Classical Kuramoto model dθi(t) = � ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) � dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling [6] Y. Kuramoto (1975) P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 4 / 69
  • 7. Classical Kuramoto model dθi(t) = � ωi + κ N N ∑ j=1 sin(θj(t)−θi(t)) � dt +σ dξi(t), i = 1,...,N ωi taken from distribution g(ω) over [1−γ,1+γ] γ — measures the heterogeneity of the population κ — measures the strength of coupling 0 0.1 0.2 0.1 0.15 0.2 0.25 0.3 Locking Incoherence κ κ < κc(γ) γ Synchrony Incoherence [6] Y. Kuramoto (1975) P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 4 / 69
  • 8. Movies of incoherence and synchrony solution Incoherence Synchrony P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 5 / 69
  • 9. Problem statement Dynamics of ith oscillator dθi = (ωi +ui(t))dt +σ dξi, i = 1,...,N, t ≥ 0 ui(t) — control 1- 1+1 ith oscillator seeks to minimize ηi(ui;u−i) = lim T→∞ 1 T � T 0 E[ c(θi;θ−i) � �� � cost of anarchy + 1 2Ru2 i � �� � cost of control ]ds θ−i = (θj)j�=i R — control penalty c(·) — cost function c(θi;θ−i) = 1 N ∑ j�=i c• (θi,θj), c• ≥ 0 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 6 / 69
  • 10. Problem statement Dynamics of ith oscillator dθi = (ωi +ui(t))dt +σ dξi, i = 1,...,N, t ≥ 0 ui(t) — control 1- 1+1 ith oscillator seeks to minimize ηi(ui;u−i) = lim T→∞ 1 T � T 0 E[ c(θi;θ−i) � �� � cost of anarchy + 1 2Ru2 i � �� � cost of control ]ds θ−i = (θj)j�=i R — control penalty c(·) — cost function c(θi;θ−i) = 1 N ∑ j�=i c• (θi,θj), c• ≥ 0 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 6 / 69
  • 11. Problem statement Dynamics of ith oscillator dθi = (ωi +ui(t))dt +σ dξi, i = 1,...,N, t ≥ 0 ui(t) — control 1- 1+1 ith oscillator seeks to minimize ηi(ui;u−i) = lim T→∞ 1 T � T 0 E[ c(θi;θ−i) � �� � cost of anarchy + 1 2Ru2 i � �� � cost of control ]ds θ−i = (θj)j�=i R — control penalty c(·) — cost function c(θi;θ−i) = 1 N ∑ j�=i c• (θi,θj), c• ≥ 0 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 6 / 69
  • 12. 1 Motivation Why a game? Why Oscillators? 2 Problems and results Problem statement Main results 3 Derivation of model Overview Derivation steps PDE model 4 Analysis of phase transition Incoherence solution Bifurcation analysis Numerics 5 Learning Q-function approximation Steepest descent algorithm
  • 13. Motivation Why a game? Quiz In the video you just watched, why were the individuals walking strangely? A. To show respect to the Queen. B. Anarchists in the crowd were trying to destabilize the bridge. C. They were stepping to the beat of the soundtrack "Walk Like an Egyptian." D. The individuals were trying to maintain their balance. P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 8 / 69
  • 14. Motivation Why a game? Quiz In the video you just watched, why were the individuals walking strangely? A. To show respect to the Queen. B. Anarchists in the crowd were trying to destabilize the bridge. C. They were stepping to the beat of the soundtrack "Walk Like an Egyptian." D. The individuals were trying to maintain their balance. P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 8 / 69
  • 15. Motivation Why a game? Quiz In the video you just watched, why were the individuals walking strangely? A. To show respect to the Queen. B. Anarchists in the crowd were trying to destabilize the bridge. C. They were stepping to the beat of the soundtrack "Walk Like an Egyptian." D. The individuals were trying to maintain their balance. P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 8 / 69
  • 16. Motivation Why a game? Quiz In the video you just watched, why were the individuals walking strangely? A. To show respect to the Queen. B. Anarchists in the crowd were trying to destabilize the bridge. C. They were stepping to the beat of the soundtrack "Walk Like an Egyptian." D. The individuals were trying to maintain their balance. P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 8 / 69
  • 17. Motivation Why a game? Quiz In the video you just watched, why were the individuals walking strangely? A. To show respect to the Queen. B. Anarchists in the crowd were trying to destabilize the bridge. C. They were stepping to the beat of the soundtrack "Walk Like an Egyptian." D. The individuals were trying to maintain their balance. P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 8 / 69
  • 18. Motivation Why a game? “Rational irrationality” “—behavior that, on the individual level, is perfectly reasonable but that, when aggregated in the marketplace, produces calamity.” Examples Millennium bridge Financial market John Cassidy, “Rational Irrationality: The real reason that capitalism is so crash-prone,” The New Yorker, 2009 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 9 / 69
  • 19. Motivation Why a game? “Rational irrationality” “—behavior that, on the individual level, is perfectly reasonable but that, when aggregated in the marketplace, produces calamity.” Examples Millennium bridge Financial market John Cassidy, “Rational Irrationality: The real reason that capitalism is so crash-prone,” The New Yorker, 2009 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 9 / 69
  • 20. 1 Motivation Why a game? Why Oscillators? 2 Problems and results Problem statement Main results 3 Derivation of model Overview Derivation steps PDE model 4 Analysis of phase transition Incoherence solution Bifurcation analysis Numerics 5 Learning Q-function approximation Steepest descent algorithm
  • 21. Motivation Why Oscillators? Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train [4] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 11 / 69
  • 22. Motivation Why Oscillators? Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train [4] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 11 / 69
  • 23. Motivation Why Oscillators? Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train −100 −50 0 50 100 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 Vh r Limit cyle r h v [4] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 11 / 69
  • 24. Motivation Why Oscillators? Hodgkin-Huxley type Neuron model C dV dt = −gT ·m2 ∞(V)·h·(V −ET) −gh ·r ·(V −Eh)−...... dh dt = h∞(V)−h τh(V) dr dt = r∞(V)−r τr(V) 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 −150 −100 −50 0 50 100 Voltage time Neural spike train −100 −50 0 50 100 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 Vh r Limit cyle r h v Normal form reduction −−−−−−−−−−−−−→ ˙θi = ωi +ui ·Φ(θi) [4] J. Guckenheimer, J. Math. Biol., 1975; [2] J. Moehlis et al., Neural Computation, 2004 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 11 / 69
  • 25. 1 Motivation Why a game? Why Oscillators? 2 Problems and results Problem statement Main results 3 Derivation of model Overview Derivation steps PDE model 4 Analysis of phase transition Incoherence solution Bifurcation analysis Numerics 5 Learning Q-function approximation Steepest descent algorithm
  • 26. Problems and results Problem statement Finite oscillator model Dynamics of ith oscillator dθi = (ωi +ui(t))dt +σ dξi, i = 1,...,N, t ≥ 0 ui(t) — control 1- 1+1 ith oscillator seeks to minimize ηi(ui;u−i) = lim T→∞ 1 T � T 0 E[ c(θi;θ−i) � �� � cost of anarchy + 1 2Ru2 i � �� � cost of control ]ds θ−i = (θj)j�=i R — control penalty c(·) — cost function c(θi;θ−i) = 1 N ∑ j�=i c• (θi,θj), c• ≥ 0 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 13 / 69
  • 27. 1 Motivation Why a game? Why Oscillators? 2 Problems and results Problem statement Main results 3 Derivation of model Overview Derivation steps PDE model 4 Analysis of phase transition Incoherence solution Bifurcation analysis Numerics 5 Learning Q-function approximation Steepest descent algorithm
  • 28. Problems and results Main results 1. Synchronization is a solution of game Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γ Incoherence R > Rc(γ) Synchrony Incoherence dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T � T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 1- 1+1 Yin et al., ACC 2010 Strogatz et al., J. Stat. Phy., 1992 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 15 / 69
  • 29. Problems and results Main results 1. Synchronization is a solution of game Locking 0 0.1 0.2 0.15 0.2 0.25 R−1/ 2 γ Incoherence R > Rc(γ) Synchrony Incoherence dθi = (ωi +ui)dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T � T 0 E[c(θi;θ−i)+ 1 2 Ru2 i ]ds 0 0.1 0.2 0.1 0.15 0.2 0.25 0.3 Locking Incoherence κ κ < κc(γ) γ Synchrony Incoherence dθi = � ωi + κ N N ∑ j=1 sin(θj −θi) � dt +σ dξi Yin et al., ACC 2010 Strogatz et al., J. Stat. Phy., 1992 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 15 / 69
  • 30. Problems and results Main results 2. Kuramoto control is approximately optimal −0.2 0 0.2 0.4 0.6 ω = 1 Kuramoto Population Density Control laws 0 π 2π θ ui = − A∗ i R 1 N ∑ j�=i sin(θ −θj(t)) 0 50 100 150 200 250 300 2 2.5 3 3.5 4 4.5 5 5.5 6 t k = 0.01; R = 1000 A i A* Learning algorithm: dAi dt = −ε ... Yin et.al. CDC 2010 P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 16 / 69
  • 31. 1 Motivation Why a game? Why Oscillators? 2 Problems and results Problem statement Main results 3 Derivation of model Overview Derivation steps PDE model 4 Analysis of phase transition Incoherence solution Bifurcation analysis Numerics 5 Learning Q-function approximation Steepest descent algorithm
  • 32. Derivation of model Overview Overview of model derivation dθi = (ωi +ui(t))dt +σ dξi ηi(ui;u−i) = lim T→∞ 1 T � T 0 E[¯c(θi,t)+ 1 2 Ru2 i ]ds Influence Influence Mass 1 Mean-field approximation Assumption: c(θi;θ−i(t)) = 1 N ∑ j�=i c• (θi,θj) N→∞ −−−−−−→ ¯c(θ,t) 2 Optimal control of single oscillator Decentralized control structure [5] M. Huang, P. Caines, and R. Malhame, IEEE TAC, 2007 [HCM] P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 18 / 69
  • 34. Derivation of model Derivation steps Single oscillator with given cost
Dynamics of the oscillator: dθ_i = (ω_i + u_i(t)) dt + σ dξ_i, t ≥ 0
The cost function is assumed known, with c(θ_i; θ_−i) replaced by c̄(θ_i(s), s):
η_i(u_i; c̄) = lim_{T→∞} (1/T) ∫_0^T E[ c̄(θ_i(s), s) + ½ R u_i²(s) ] ds
HJB equation: ∂_t h_i + ω_i ∂_θ h_i = (1/(2R)) (∂_θ h_i)² − c̄(θ, t) + η_i* − (σ²/2) ∂²_θθ h_i
Optimal control law: u_i*(t) = ϕ_i(θ, t) = −(1/R) ∂_θ h_i(θ, t)
[1] D. P. Bertsekas (1995); [9] S. P. Meyn, IEEE TAC, 1997
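The control law simply extracts the phase gradient of the relative value function. A tiny sketch, with a hypothetical one-mode value function h(θ) = A cos(θ − φ) standing in for the HJB solution (A, φ, R are illustrative, not from the slides):

```python
import math

R = 10.0          # control penalty (illustrative)
A, phi = 0.5, 0.3 # hypothetical one-mode value function h(theta) = A*cos(theta - phi)

def h(theta):
    return A * math.cos(theta - phi)

def u_star(theta, d=1e-6):
    # u* = -(1/R) * dh/dtheta, via a central difference
    dh = (h(theta + d) - h(theta - d)) / (2 * d)
    return -dh / R

theta0 = 1.2
# analytically dh/dtheta = -A*sin(theta - phi), so u* = (A/R)*sin(theta - phi)
analytic = (A / R) * math.sin(theta0 - phi)
```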
  • 35. Derivation of model Derivation steps Single oscillator with optimal control
Dynamics of the oscillator: dθ_i(t) = ( ω_i − (1/R) ∂_θ h_i(θ_i, t) ) dt + σ dξ_i(t)
Fokker-Planck (FPK) equation for the pdf p(θ, t, ω_i):
∂_t p + ω_i ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p
[7] A. Lasota and M. C. Mackey, "Chaos, Fractals and Noise," Springer 1994
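The FPK equation can be checked numerically in the uncontrolled case (h ≡ 0, so the (1/R) term drops out). A minimal explicit finite-difference sketch on the circle, with illustrative grid and parameter choices, relaxes a perturbed initial density toward the uniform density 1/(2π) while conserving probability mass:

```python
import math

# Solve dp/dt + omega*dp/dtheta = (sigma^2/2)*d2p/dtheta2 on [0, 2*pi), periodic.
n, omega, sigma = 64, 1.0, 0.5
dx = 2 * math.pi / n
dt, steps = 0.005, 8000          # CFL-safe for these illustrative parameters
D = sigma ** 2 / 2               # diffusion coefficient

# perturbed initial density: p(theta, 0) = (1 + 0.5*cos(theta)) / (2*pi)
p = [(1 + 0.5 * math.cos(j * dx)) / (2 * math.pi) for j in range(n)]
for _ in range(steps):
    # central differences for advection and diffusion, periodic wrap via % n
    p = [p[j]
         - omega * dt * (p[(j + 1) % n] - p[(j - 1) % n]) / (2 * dx)
         + D * dt * (p[(j + 1) % n] - 2 * p[j] + p[(j - 1) % n]) / dx ** 2
         for j in range(n)]

mass = sum(p) * dx
dev = max(abs(v - 1 / (2 * math.pi)) for v in p)
```

By final time t = 40 the first Fourier mode has decayed by roughly e^{-Dt}, so the density is flat to well within plotting accuracy.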
  • 36. Derivation of model Derivation steps Mean-field approximation
HJB equation for the population: ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ, t) + η(ω) − (σ²/2) ∂²_θθ h ⇒ h(θ, t, ω)
Population density: ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p ⇒ p(θ, t, ω)
Enforce cost consistency: c̄(θ, t) = ∫_Ω ∫_0^{2π} c•(θ, ϑ) p(ϑ, t, ω) g(ω) dϑ dω ≈ (1/N) ∑_{j≠i} c•(θ, θ_j)
  • 38. Derivation of model PDE model Summary
HJB: ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ, t) + η* − (σ²/2) ∂²_θθ h ⇒ h(θ, t, ω)
FPK: ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p ⇒ p(θ, t, ω)
Mean-field approx.: c̄(ϑ, t) = ∫_Ω ∫_0^{2π} c•(ϑ, θ) p(θ, t, ω) g(ω) dθ dω
1 Bellman's optimality principle (H, J, B)
2 Propagation of chaos (F, P, K, McKean, Vlasov, ...)
3 Mean-field approximation (Boltzmann, Kac, ...)
4 Connection to Nash game (Weintraub, HCM, Altman, ...)
  • 43. Derivation of model PDE model 1. Solution of PDE gives ε-Nash equilibrium
Optimal control law: u_i° = −(1/R) ∂_θ h(θ(t), t, ω) |_{ω=ω_i}
ε-Nash property (as N → ∞): η_i(u_i°; u_−i°) ≤ η_i(u_i; u_−i°) + O(1/√N), i = 1, ..., N.
So, we look for solutions of PDEs.
  • 47. Derivation of model PDE model 2. Incoherence solution (PDE)
Incoherence solution: h(θ, t, ω) = h_0(θ) := 0, p(θ, t, ω) = p_0(θ) := 1/(2π)
h = 0 solves the HJB equation: ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ, t) + η* − (σ²/2) ∂²_θθ h
p = 1/(2π) solves the FPK equation: ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p
c̄(θ, t) = ∫_Ω ∫_0^{2π} c•(θ, ϑ) p(ϑ, t, ω) g(ω) dϑ dω
  • 48. Derivation of model PDE model 2. Incoherence solution (PDE)
Assume c•(ϑ, θ) = c•(ϑ − θ) = ½ sin²((ϑ − θ)/2)
Incoherence solution: h(θ, t, ω) = h_0(θ) := 0, p(θ, t, ω) = p_0(θ) := 1/(2π)
Optimal control: u = −(1/R) ∂_θ h = 0 (no cost of control)
Average cost: c̄(θ, t) = ∫_Ω ∫_0^{2π} ½ sin²((θ − ϑ)/2) (1/(2π)) g(ω) dϑ dω
η*(ω) = c̄(θ, t) = 1/4 =: η_0 for all ω ∈ Ω
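The value η_0 = 1/4 follows because ½ sin²((θ − ϑ)/2) = ¼ (1 − cos(θ − ϑ)) and the cosine averages to zero against the uniform density, independently of θ. A quick quadrature check:

```python
import math

def cbar(theta, n=2000):
    # c̄(theta) = ∫_0^{2π} (1/2) sin^2((theta - v)/2) * (1/(2π)) dv, midpoint rule
    s = 0.0
    for k in range(n):
        v = 2 * math.pi * (k + 0.5) / n
        s += 0.5 * math.sin((theta - v) / 2) ** 2
    return s * (2 * math.pi / n) / (2 * math.pi)
```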
  • 49. Derivation of model PDE model 2. Incoherence solution (Finite population)
Closed-loop dynamics: dθ_i = (ω_i + u_i) dt + σ dξ_i(t), with u_i = 0
Average cost (the control term ½ R u_i² vanishes):
η_i = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_−i) + ½ R u_i² ] dt
= lim_{T→∞} (1/N) ∑_{j≠i} (1/T) ∫_0^T E[ ½ sin²((θ_i(t) − θ_j(t))/2) ] dt
= (1/N) ∑_{j≠i} ∫_0^{2π} E[ ½ sin²((θ_i(t) − ϑ)/2) ] (1/(2π)) dϑ
= ((N − 1)/N) η_0
ε-Nash property: η_i(u_i°; u_−i°) ≤ η_i(u_i; u_−i°) + O(1/√N), i = 1, ..., N.
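The (N − 1)/N factor comes purely from the 1/N normalization of the coupling cost: each of the N − 1 pairwise terms averages to η_0 = ¼ when the phase difference is uniform. A sketch of the bookkeeping:

```python
import math

def eta_i(N, n=1000):
    # average one pairwise cost over a uniform phase difference on [0, 2*pi):
    # E[(1/2) sin^2(delta/2)] with delta = 2*pi*(k+0.5)/n (midpoint rule)
    pair = sum(0.5 * math.sin(math.pi * (k + 0.5) / n) ** 2 for k in range(n)) / n
    # eta_i = (1/N) * sum over the N-1 terms j != i
    return (N - 1) / N * pair
```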
  • 51. Derivation of model PDE model 3. Synchronization is a solution of game
dθ_i = (ω_i + u_i) dt + σ dξ_i
η_i(u_i; u_−i) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_−i) + ½ R u_i² ] ds
η(ω) = min_{u_i} η_i(u_i; u_−i°)
Phase diagram in (γ, R^{−1/2}): incoherence for R > R_c(γ), synchrony otherwise.
Cost curves for ω = 0.95, 1, 1.05: η(ω) = η_0 for R > R_c; η(ω) < η_0 for R < R_c.
Snapshot of the synchrony solution (population density at t = 38.24).
Yin et al., "Synchronization of coupled oscillators is a game," ACC 2010
  • 55. Analysis of phase transition Incoherence solution Overview of the steps
HJB: ∂_t h + ω ∂_θ h = (1/(2R)) (∂_θ h)² − c̄(θ, t) + η* − (σ²/2) ∂²_θθ h ⇒ h(θ, t, ω)
FPK: ∂_t p + ω ∂_θ p = (1/R) ∂_θ [ p (∂_θ h) ] + (σ²/2) ∂²_θθ p ⇒ p(θ, t, ω)
c̄(ϑ, t) = ∫_Ω ∫_0^{2π} c•(ϑ, θ) p(θ, t, ω) g(ω) dθ dω
Assume c•(ϑ, θ) = c•(ϑ − θ) = ½ sin²((ϑ − θ)/2)
Incoherence solution: h(θ, t, ω) = h_0(θ) := 0, p(θ, t, ω) = p_0(θ) := 1/(2π)
  • 57. Analysis of phase transition Bifurcation analysis Linearization and spectra
Linearized PDE (about the incoherence solution), for z̃ = (h̃, p̃):
∂_t z̃(θ, t, ω) = ( −ω ∂_θ h̃ − c̄ − (σ²/2) ∂²_θθ h̃ ,  −ω ∂_θ p̃ + (1/(2πR)) ∂²_θθ h̃ + (σ²/2) ∂²_θθ p̃ ) =: L_R z̃(θ, t, ω)
Spectrum of the linear operator:
1 Continuous spectrum {S^(k)}_{k=−∞}^{+∞}, S^(k) := { λ ∈ C : λ = ±(σ²/2) k² − ikω, ω ∈ Ω }
2 Discrete spectrum, given by the characteristic eqn:
(1/(8R)) ∫_Ω g(ω) / [ (λ − σ²/2 + iω)(λ + σ²/2 + iω) ] dω + 1 = 0.
  • 61. Analysis of phase transition Bifurcation analysis Bifurcation diagram (Hamiltonian Hopf)
Characteristic eqn: (1/(8R)) ∫_Ω g(ω) / [ (λ − σ²/2 + iω)(λ + σ²/2 + iω) ] dω + 1 = 0.
Stability proof: [3] Dellnitz et al., Int. Series Num. Math., 1992
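A quick check is possible in the special case γ = 0 (g a point mass at ω = 1), where the characteristic equation reduces to the closed form (λ + i)² = σ⁴/4 − 1/(8R): the discrete eigenvalue pair collides at λ = −i exactly when R = 1/(2σ⁴), which matches the γ = 0 value of R_c(γ) quoted on the phase-transition slide. A small sketch (σ illustrative):

```python
import cmath

sigma = 0.5
Rc = 1 / (2 * sigma ** 4)   # collision point for gamma = 0

def discrete_eigs(R, sigma):
    # roots of (1/(8R)) / ((lam - s^2/2 + i)(lam + s^2/2 + i)) + 1 = 0 with omega = 1,
    # i.e. (lam + i)^2 = sigma^4/4 - 1/(8R)
    root = cmath.sqrt(sigma ** 4 / 4 - 1 / (8 * R))
    return -1j + root, -1j - root

at_rc = discrete_eigs(Rc, sigma)          # eigenvalue collision at lam = -i
below = discrete_eigs(0.5 * Rc, sigma)    # pair on the imaginary axis
above = discrete_eigs(2.0 * Rc, sigma)    # pair split into +/- real parts
```

The ± symmetry of the real parts reflects the Hamiltonian structure of the linearized operator.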
  • 66. Analysis of phase transition Numerics Numerical solution of PDEs
Incoherence solution for R = 60; synchrony solution for R = 10.
  • 67. Analysis of phase transition Numerics Bifurcation diagram
dθ_i = (ω_i + u_i) dt + σ dξ_i
η_i(u_i; u_−i) = lim_{T→∞} (1/T) ∫_0^T E[ c(θ_i; θ_−i) + ½ R u_i² ] ds
Phase diagram in (γ, R^{−1/2}): incoherence for R > R_c(γ), synchrony otherwise.
Cost curves for ω = 0.95, 1, 1.05: η(ω) = η_0 for R > R_c; η(ω) < η_0 for R < R_c.
  • 72. Learning Q-function approximation Comparison to Kuramoto law
Control law u = ϕ(θ, t, ω), plotted for ω = 0.95, 1, 1.05 over θ ∈ [0, 2π] against the population density and the Kuramoto law.
Equivalent control law in the Kuramoto oscillator:
u_i^(Kur) = (κ/N) ∑_{j=1}^N sin(θ_j(t) − θ_i) ≈ κ_0 sin(ϑ_0 + t − θ_i) as N → ∞
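The N → ∞ form of the Kuramoto law rests on the order-parameter identity (1/N) ∑_j sin(θ_j − θ_i) = r sin(ψ − θ_i), where r e^{iψ} = (1/N) ∑_j e^{iθ_j}; in the mean-field limit the order parameter tends to a deterministic rotating wave, giving the one-mode law κ_0 sin(ϑ_0 + t − θ_i). The finite-N identity can be verified directly (phases below are illustrative):

```python
import cmath
import math

theta = [0.3, 1.7, 2.9, 4.2, 5.5]   # illustrative phases
i = 2

# left-hand side: the raw coupling sum seen by oscillator i
lhs = sum(math.sin(t - theta[i]) for t in theta) / len(theta)

# right-hand side via the order parameter r*exp(1j*psi)
z = sum(cmath.exp(1j * t) for t in theta) / len(theta)
r, psi = abs(z), cmath.phase(z)
rhs = r * math.sin(psi - theta[i])
```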
  • 73. Learning Q-function approximation Optimality equation
min_{u_i} { c(θ; θ_−i(t)) + ½ R u_i² + D_{u_i} h_i(θ, t) } = η_i*, with H_i(θ, u_i; θ_−i(t)) denoting the bracketed expression
Optimal control law: u_i* = −(1/R) ∂_θ h_i(θ, t)
Kuramoto law: u_i^(Kur) = −(κ/N) ∑_{j≠i} sin(θ_i − θ_j(t))
Parameterization:
H_i^(A_i,φ_i)(θ, u_i; θ_−i(t)) = c(θ; θ_−i(t)) + ½ R u_i² + (ω_i − 1 + u_i) A_i S^(φ_i) + (σ²/2) A_i C^(φ_i)
where S^(φ)(θ, θ_−i) = (1/N) ∑_{j≠i} sin(θ − θ_j − φ), C^(φ)(θ, θ_−i) = (1/N) ∑_{j≠i} cos(θ − θ_j − φ)
Approx. optimal control: u_i^(A_i,φ_i) = argmin_{u_i} { H_i^(A_i,φ_i)(θ, u_i; θ_−i(t)) } = −(A_i/R) S^(φ_i)(θ, θ_−i)
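The closed form of the approximate optimal control comes from minimizing only the u-dependent part of H^(A_i,φ_i), which is the quadratic ½ R u² + u A_i S^(φ_i), with argmin u = −(A_i/R) S^(φ_i). A brute-force check over a grid, with illustrative phases and parameters:

```python
import math

R, A, phi = 10.0, 0.8, 0.2
theta, others = 1.0, [0.5, 2.0, 3.1, 4.7]   # theta_i and theta_{-i} (illustrative)
N = len(others) + 1

# S^(phi)(theta, theta_{-i}) = (1/N) sum_{j != i} sin(theta - theta_j - phi)
S = sum(math.sin(theta - tj - phi) for tj in others) / N

def H_u_part(u):
    # only the u-dependent terms of H matter for the argmin
    return 0.5 * R * u ** 2 + u * A * S

grid = [k / 10000 - 1 for k in range(20001)]   # u in [-1, 1], step 1e-4
u_best = min(grid, key=H_u_part)
```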
  • 78. Learning Steepest descent algorithm
Pointwise Bellman error: L^(A_i,φ_i)(θ, t) = min_{u_i} { H_i^(A_i,φ_i) } − η_i^(A_i*,φ_i*)
Simple gradient descent algorithm:
ẽ(A_i, φ_i) = ∑_{k=1}^2 |⟨L^(A_i,φ_i), ϕ̃_k(θ)⟩|²
dA_i/dt = −ε dẽ(A_i, φ_i)/dA_i,  dφ_i/dt = −ε dẽ(A_i, φ_i)/dφ_i   (∗)
Theorem (Convergence). Assume the population is in synchrony and the ith oscillator updates according to (∗). Then A_i(t) → A* = 1/(2σ²), and the pointwise Bellman error is L^(A_i,0)(θ, t) = ε(R) cos²(θ − t), where ε(R) = 1/(16Rσ⁴).
  • 79. Learning Steepest descent algorithm Phase transition
Suppose all oscillators use the approx. optimal control law u_i = −(A*/R) (1/N) ∑_{j≠i} sin(θ_i − θ_j(t)).
Then the phase transition boundary is
R_c(γ) = 1/(2σ⁴) if γ = 0,  R_c(γ) = (1/(4σ²γ)) tan⁻¹(2γ/σ²) if γ > 0
Learning trace: A_i → A* (k = 0.01, R = 1000). In the (γ, R) plane the PDE and learning boundaries between incoherence and synchrony agree.
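The boundary formula is easy to evaluate. The sketch below (σ illustrative) checks that the γ > 0 branch is continuous with the γ = 0 value 1/(2σ⁴), and that R_c decreases as the population becomes more heterogeneous, consistent with the phase diagram (incoherence for R > R_c, so heterogeneity favors incoherence):

```python
import math

def Rc(gamma, sigma):
    # phase-transition boundary under the approximately optimal control law
    if gamma == 0:
        return 1 / (2 * sigma ** 4)
    return math.atan(2 * gamma / sigma ** 2) / (4 * sigma ** 2 * gamma)

sigma = 0.5
```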
  • 80. Thank you! Website: http://www.mechse.illinois.edu/research/mehtapg Huibing Yin Sean P. Meyn Uday V. Shanbhag H. Yin, P. G. Mehta, S. P. Meyn and U. V. Shanbhag, “Synchronization of coupled oscillators is a game,” ACC 2010
  • 81. Bibliography Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, Massachusetts, 1995. Eric Brown, Jeff Moehlis, and Philip Holmes. On the phase reduction and response dynamics of neural oscillator populations. Neural Computation, 16(4):673–715, 2004. M. Dellnitz, J.E. Marsden, I. Melbourne, and J. Scheurle. Generic bifurcations of pendula. Int. Series Num. Math., 104:111–122, 1992. J. Guckenheimer. Isochrons and phaseless sets. J. Math. Biol., 1:259–273, 1975. Minyi Huang, Peter E. Caines, and Roland P. Malhame. P. G. Mehta (UIUC) Univ. of Maryland Mar. 4, 2010 69 / 69
  • 82. Bibliography Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, 52(9):1560–1571, 2007. Y. Kuramoto. International Symposium on Mathematical Problems in Theoretical Physics, volume 39 of Lecture Notes in Physics. Springer-Verlag, 1975. Andrzej Lasota and Michael C. Mackey. Chaos, Fractals and Noise. Springer, 1994. P. Mehta and S. Meyn. Q-learning and Pontryagin's Minimum Principle. To appear, 48th IEEE Conference on Decision and Control, December 16-18, 2009. Sean P. Meyn.
  • 83. Bibliography The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Transactions on Automatic Control, 42(12):1663–1680, December 1997. S. H. Strogatz and R. E. Mirollo. Stability of incoherence in a population of coupled oscillators. Journal of Statistical Physics, 63:613–635, May 1991. Steven H. Strogatz, Daniel M. Abrams, Bruno Eckhardt, and Edward Ott. Theoretical mechanics: Crowd synchrony on the Millennium Bridge. Nature, 438:43–44, 2005.