A Game Theoretical Approach to Multi-Agent Synchronization

Sofie Haesaert, DCSC
co-authors: prof. dr. R. Babuška and prof. dr. F.L. Lewis
March 28, 2012

Linear-Quadratic Discrete-Time Graphical Game
Game-Theoretical Solution
Example: Five Agent Synchronization

Multi-Agent Synchronization

Leader-Follower synchronization:
Cooperative
Game Theory
Communication graph

Multitude of applications in:
Computer science
Spacecraft
Unmanned air vehicles



Outline

Communication Graph
Local Tracking Error
Performance Indices

Global Nash Equilibrium
Coupled Riccati Equations




Leader-Follower Synchronization

State of the leader agent:
\[ z_0(k+1) = A z_0(k) \]
State of the i-th agent:
\[ z_i(k+1) = A z_i(k) + B_i u_i(k) \quad \forall i \in \{1, \dots, N\} \]

Objective:
\[ z_i(k) \to z_0(k) \quad \forall i \in \{1, \dots, N\} \]

[Figure: leader z0 and agents z1, ..., z5]

[Wang and Chen, 2002]


Communication Graph

Graph G
Nodes $V(G) = \{z_1, z_2, \dots, z_N\}$
Edges $E \subseteq V \times V$
Edge weights $e_{ij}$
Pinning gains $g_i$

[Figure: communication graph with leader z0 and agents z1, ..., z5; edges e13, e23, e34, e35, e45; pinning gains g1, g2, g3]
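
The graph data on this slide can be collected into a weight matrix and a vector of pinning gains. The following is a minimal sketch in Python with NumPy, not the authors' code. It rests on assumptions the slides do not state: that e_ij is stored at row i, column j (agent i uses information from agent j), that the graph is undirected so e_ij = e_ji, and that all weights equal 1 as in the five-agent example. The names `E`, `g` and `neighbors` are hypothetical.

```python
import numpy as np

N = 5
# Edges named in the figure (1-based agent indices): e13, e23, e34, e35, e45.
edges = [(1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]

# E[i, j] holds e_ij (0-based indices); unit weights are an assumption.
E = np.zeros((N, N))
for i, j in edges:
    E[i - 1, j - 1] = 1.0
    E[j - 1, i - 1] = 1.0  # assumption: undirected graph, e_ij = e_ji

# Pinning gains g_1, g_2, g_3 connect agents 1 to 3 to the leader z_0.
g = np.array([1.0, 1.0, 1.0, 0.0, 0.0])

def neighbors(i):
    """Return the neighbor set N_i as 0-based indices j with e_ij != 0."""
    return [j for j in range(N) if E[i, j] != 0.0]
```

With a directed graph, only the line marked as an assumption would change.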




Local Tracking Error:
\[ x_i(k) = \sum_{j \in N_i} e_{ij} (z_{i,k} - z_{j,k}) + g_i (z_{i,k} - z_{0,k}) \]

Dynamics:
\[ x_i(k+1) = A x_{i,k} + \Big( \sum_{j \in N_i} e_{ij} + g_i \Big) B_i u_{i,k} - \sum_{j \in N_i} e_{ij} B_j u_{j,k} \]
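
The local tracking error is a weighted sum of state differences, which can be evaluated directly. The following minimal sketch assumes the agent states are stacked in an array `z` of shape (N, n) and that `E[i, j]` holds e_ij with zeros for non-neighbors and on the diagonal; the function name and argument layout are hypothetical.

```python
import numpy as np

def local_tracking_error(i, z, z0, E, g):
    """Return x_i = sum_j e_ij (z_i - z_j) + g_i (z_i - z_0).

    z has shape (N, n), z0 has shape (n,), E has shape (N, N), g has shape (N,).
    Entries E[i, j] that are zero drop agent j from the sum.
    """
    diffs = z[i] - z  # row j holds z_i - z_j
    return E[i] @ diffs + g[i] * (z[i] - z0)
```

When every agent state equals the leader state, all differences vanish and each x_i is zero, which is the synchronization objective expressed in error coordinates.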




Local Tracking Error Dynamics:
\[ x_i(k+1) = A x_{i,k} + \Big( \sum_{j \in N_i} e_{ij} + g_i \Big) B_i u_{i,k} - \sum_{j \in N_i} e_{ij} B_j u_{j,k} \]

The states $x_i$ of the agents in the graph can be combined into the global state:
\[ x(k) = [x_1^T(k) \; x_2^T(k) \; \dots \; x_N^T(k)]^T \]

[Figure: communication graph with node states x1, ..., x5]

[Khoo, Xie, and Man, 2009]
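
Stacking the local error dynamics gives global dynamics of the form x(k+1) = A_bar x(k) + sum_j B_bar_j u_j(k). The slides do not spell out these matrices, so the sketch below is an assumption derived from the local dynamics: A_bar is block diagonal with A, and block row i of B_bar_j is (sum_l e_il + g_i) B_i when j = i and -e_ij B_j otherwise. The function name is hypothetical.

```python
import numpy as np

def global_matrices(A, B_list, E, g):
    """Build A_bar and the list of B_bar_j for the stacked error state.

    Assumes E[i, i] == 0 and E[i, j] == 0 for agents j that are not in N_i.
    """
    N, n = len(B_list), A.shape[0]
    A_bar = np.kron(np.eye(N), A)  # block diagonal: every agent shares A
    B_bars = []
    for j, B_j in enumerate(B_list):
        B_bar = np.zeros((N * n, B_j.shape[1]))
        for i in range(N):
            if i == j:
                weight = E[i].sum() + g[i]  # own input enters with (d_i + g_i)
            else:
                weight = -E[i, j]           # neighbor input enters with -e_ij
            B_bar[i * n:(i + 1) * n, :] = weight * B_j
        B_bars.append(B_bar)
    return A_bar, B_bars
```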


Performance Index

Each agent optimizes its own performance index, consisting of:
Local tracking error $x_i$
Cost for own actions $u_{i,k}$
Cost for actions of neighbors $u_{j,k}$

\[ J_i = \sum_{k=0}^{\infty} \Big( x_{i,k}^T Q_{ii} x_{i,k} + u_{i,k}^T R_{ii} u_{i,k} + \sum_{j \in N_i} u_{j,k}^T R_{ij} u_{j,k} \Big) \]

[Vamvoudakis and Lewis, 2011]
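
For a recorded trajectory, the performance index can be approximated by truncating the infinite sum. The sketch below is an illustration only; the data layout (`x_traj[k]` for x_{i,k}, `u_traj[k][j]` for u_{j,k}, `R[i][j]` for R_ij) and the function name are hypothetical.

```python
import numpy as np

def truncated_cost(i, x_traj, u_traj, Q_ii, R, neighbors_i):
    """Sum the stage costs of agent i over the recorded time steps."""
    total = 0.0
    for x_k, u_k in zip(x_traj, u_traj):
        total += float(x_k @ Q_ii @ x_k)            # local tracking error
        total += float(u_k[i] @ R[i][i] @ u_k[i])   # own action
        for j in neighbors_i:
            total += float(u_k[j] @ R[i][j] @ u_k[j])  # neighbor actions
    return total
```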


All agents follow a policy $\pi_i$ such that:
\[ u_i(k) = \pi_i x(k) \quad \forall i \in \{1, \dots, N\} \]

Definition (Global Nash Equilibrium)
An N-tuple of policies $\Pi = \{\pi_1^*, \dots, \pi_N^*\}$ constitutes a global Nash equilibrium solution for an N-agent game if each agent's policy is a best response to the policies of all the other agents in the graph.

[Basar and Olsder, 1999]


The expected cost of the i-th agent in the global Nash equilibrium can be expressed as:
\[ V_{\Pi,i}(x(k)) = x^T(k) S_i x(k) \]

with:
$x(k)$: the global tracking error state
$\Pi$: the N-tuple of policies $\{\pi_1^*, \dots, \pi_N^*\}$
$S_i$: the Riccati matrix for the i-th agent



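The coupled Riccati equations are
\[ S_i = \bar{Q}_i + \Lambda^T S_i \Lambda + \sum_{j \in \{N_i, i\}} \pi_j^{*T} R_{ij} \pi_j^* \quad \forall i \in N \]

with:
$\bar{Q}_i$: the state weighting, such that $x^T(k) \bar{Q}_i x(k) = x_i^T(k) Q_{ii} x_i(k)$
$\Lambda$: the global closed-loop matrix, $\Lambda = \Big( I + \sum_{i \in N} \bar{B}_i R_{ii}^{-1} \bar{B}_i^T S_i \Big)^{-1} \bar{A}$
$\pi_i^*$: the policies, $\pi_i^* = -R_{ii}^{-1} \bar{B}_i^T S_i \Lambda$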
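
The expression for the closed-loop matrix follows from substituting the policies into the global error dynamics. The short derivation below is a sketch under one assumption that the slides do not state explicitly: that the global dynamics have the form x(k+1) = \bar{A} x(k) + \sum_i \bar{B}_i u_i(k), with \Lambda defined by x(k+1) = \Lambda x(k).

```latex
\begin{align*}
x(k+1) &= \bar{A}\, x(k) + \sum_{i \in N} \bar{B}_i u_i(k),
  \qquad u_i(k) = \pi_i^* x(k) = -R_{ii}^{-1} \bar{B}_i^T S_i \Lambda\, x(k) \\
\Lambda &= \bar{A} - \sum_{i \in N} \bar{B}_i R_{ii}^{-1} \bar{B}_i^T S_i \Lambda \\
\Big( I + \sum_{i \in N} \bar{B}_i R_{ii}^{-1} \bar{B}_i^T S_i \Big) \Lambda &= \bar{A} \\
\Lambda &= \Big( I + \sum_{i \in N} \bar{B}_i R_{ii}^{-1} \bar{B}_i^T S_i \Big)^{-1} \bar{A}
\end{align*}
```

The last step requires the matrix in parentheses to be invertible.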



The coupled Riccati equations are
\[ S_i = \bar{Q}_i + \Lambda^T S_i \Lambda + \sum_{j \in \{N_i, i\}} \Lambda^T S_j \bar{B}_j R_{jj}^{-1} R_{ij} R_{jj}^{-1} \bar{B}_j^T S_j \Lambda \quad \forall i \in N \]

The positive definite solution of the coupled Riccati equations:
asymptotically stabilizes the states $x_i$ for all $i \in \{1, \dots, N\}$
gives the global Nash equilibrium solution



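Difference Coupled Riccati Equations

Iterative solution of the coupled Riccati equations:
\[ S_i^{q+1} = \bar{Q}_i + (\Lambda^q)^T S_i^q \Lambda^q + \sum_{j \in \{N_i, i\}} (\Lambda^q)^T S_j^q \bar{B}_j R_{jj}^{-1} R_{ij} R_{jj}^{-1} \bar{B}_j^T S_j^q \Lambda^q \quad \forall i \in N \]

The iteration is well defined if the inverse of $I + \sum_{i \in N} \bar{B}_i R_{ii}^{-1} \bar{B}_i^T S_i^q$ exists for all q.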
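
The iteration can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation. It assumes that the global matrices A_bar and B_bar_j are supplied by the caller, that R_ij and S_i are symmetric, and that the iteration starts from S_i^0 = Q_bar_i (the slides do not specify an initial value). The function name and argument layout are hypothetical.

```python
import numpy as np

def riccati_iteration(A_bar, B_bars, Q_bars, R, neighbors, num_iters=100):
    """Run the difference coupled Riccati equations for num_iters steps.

    R[i][j] is R_ij, neighbors[i] lists N_i, and the sum in the update
    runs over N_i and agent i itself. Returns the list of S_i and Lambda.
    """
    N = len(B_bars)
    dim = A_bar.shape[0]
    S = [Q.copy() for Q in Q_bars]  # assumption: S_i^0 = Q_bar_i
    Lam = None
    for _ in range(num_iters):
        # Lambda^q = (I + sum_i B_bar_i R_ii^-1 B_bar_i^T S_i^q)^-1 A_bar
        M = np.eye(dim)
        for i in range(N):
            M = M + B_bars[i] @ np.linalg.solve(R[i][i], B_bars[i].T @ S[i])
        Lam = np.linalg.solve(M, A_bar)  # raises LinAlgError if M is singular
        # Gains K_j = R_jj^-1 B_bar_j^T S_j^q Lambda^q, so that pi_j = -K_j
        K = [np.linalg.solve(R[j][j], B_bars[j].T @ S[j] @ Lam) for j in range(N)]
        S_next = []
        for i in range(N):
            S_i = Q_bars[i] + Lam.T @ S[i] @ Lam
            for j in list(neighbors[i]) + [i]:
                S_i = S_i + K[j].T @ R[i][j] @ K[j]
            S_next.append(S_i)
        S = S_next
    return S, Lam
```

For a single agent with scalar dynamics the update reduces to the standard discrete-time Riccati recursion, which gives a simple sanity check. No convergence guarantee is implied for general graphs.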



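Synchronization

Consider the following dynamics:
\[ z_0(k+1) = A z_0(k) \]
\[ z_i(k+1) = A z_i(k) + B_i u_i(k) \quad \forall i \in \{1, \dots, N\} \]

with:
\[ A = \begin{bmatrix} 0.995 & 0.0998 \\ -0.0998 & 0.995 \end{bmatrix}, \quad B_1 = B_2 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad B_3 = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \]
\[ B_4 = B_5 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad Q_{ii} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad R_{ij} = 1 \quad \text{for all } i, j \in \{1, \dots, N\} \]

[Figure: communication graph with leader z0 and agents z1, ..., z5; all edge weights and pinning gains equal to 1]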
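
The matrix A in this example is close to a planar rotation by 0.1 rad (cos 0.1 ≈ 0.995, sin 0.1 ≈ 0.0998), so the leader state follows a slowly rotating, nearly undamped orbit that the agents have to track. The following minimal sketch simulates the leader alone; the initial state in the usage note and the function name are arbitrary choices for illustration.

```python
import numpy as np

# Leader dynamics matrix from the example.
A = np.array([[0.995, 0.0998],
              [-0.0998, 0.995]])

def simulate_leader(z0_init, steps):
    """Return the trajectory z_0(0), ..., z_0(steps) of z_0(k+1) = A z_0(k)."""
    traj = [np.asarray(z0_init, dtype=float)]
    for _ in range(steps):
        traj.append(A @ traj[-1])
    return np.array(traj)

# Largest eigenvalue magnitude of A; a value near 1 means almost no decay.
spectral_radius = max(abs(np.linalg.eigvals(A)))
```

Calling `simulate_leader([1.0, 0.0], 63)` covers roughly one full revolution of the leader state.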



Synchronization: z

[Figure: states z of the leader z0 and agents z1, ..., z5]



Synchronization: The local tracking error x

[Figure: leader z0 and local tracking errors x1, ..., x5]



Conclusion

Exact solution of the linear-quadratic discrete-time graphical game.

Future work:

Use the exact solution to quantify the accuracy of approximate (learning)
algorithms for the linear-quadratic discrete-time graphical game.



Thank you for your time

Are there any questions?

