Extortion Strategies in the Prisoner’s Dilemma
Companion poster to “Extortion on the Front Lines”
Chris Hughes
Introduction
“Extortion on the Front Lines” aims to provide a clear and concise
introduction to the Prisoner’s Dilemma game in a way that can be
understood by an undergraduate mathematician.
• I discuss the key developments in the field, providing the reader
with the necessary knowledge to explore recent publications.
• I describe how the Prisoner’s Dilemma can be applied to WW1
trench warfare [1] and, using historical sources, create a model
inspired by this.
• I place particular emphasis on the derivation and applications of
zero-determinant extortion strategies [3], as well as exploring the
effects of modifying a strategy’s parameters.
• I demonstrate the robustness of zero-determinant extortion strategies
against an unwitting adaptive opponent through examples.
The Iterated Prisoner’s Dilemma
The Iterated Prisoner’s Dilemma (IPD) is a two-player game in which
selfish individuals attempt to maximise their respective payoffs by
balancing cooperation with competition [1].
Each round, players X and Y can choose between two available
moves: Cooperate (C) or Defect (D).
Rules of the game
• If both players choose to Cooperate (C), both players receive
R points - this is the Reward for mutual cooperation.
• If both players choose to Defect (D), both players receive P
points - this is the Punishment for mutual defection.
• If one player Defects while the other Cooperates, the Defecting
player receives T points and the Cooperating player receives
S points - one player has embraced the Temptation to defect
while the other has received the Sucker’s payoff.
This can be summarised in a Payoff Matrix:

                           Player 2
                      Cooperate   Defect
Player 1  Cooperate   (R, R)      (S, T)
          Defect      (T, S)      (P, P)

Payoff values (table):

              1000   1100    900   1000   1100    900
              1000    900   1100    950   1100    900
Temptation    1824   2042   1606   1832   2006   1642
Reward        1607   2000   1213   1666   1767   1446
Punishment    1002   1300    900   1050   1102    902
Suckers        588    647    529    588    647    529

Temptation    1824   1606   2042   1724   2006   1642
Reward        1607   1213   2000   1467   1767   1446
Punishment    1002    900   1300    950   1102    902
Suckers        588    529    647    559    647    529
We can also define the payoff vectors for both players as
$$S_X = (R, S, T, P) \quad \text{and} \quad S_Y = (R, T, S, P). \tag{1}$$
The payoffs in an IPD game must be strictly ordered such that
T > R > P > S and 2R > T + S.
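As a quick illustration, the two conditions can be checked mechanically. The sketch below uses the values (T, R, P, S) = (1824, 1607, 1002, 588) quoted on this poster; the function name `is_valid_ipd` is a hypothetical helper introduced only for this example.

```python
# Payoff values quoted on this poster (obtained there via Lanchester modelling).
T, R, P, S = 1824, 1607, 1002, 588


def is_valid_ipd(T, R, P, S):
    """Return True if (T, R, P, S) satisfies both IPD payoff conditions."""
    strictly_ordered = T > R > P > S
    # Mutual cooperation must pay more than taking turns to exploit each other.
    cooperation_beats_alternation = 2 * R > T + S
    return strictly_ordered and cooperation_beats_alternation


print(is_valid_ipd(T, R, P, S))
```

The second condition matters because, without it, two players could do better by alternating between (C, D) and (D, C) than by cooperating every round.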
For interesting analysis, and to eliminate any endgame tactics, we
introduce a probability w that the game will end at the end of the
current round - placing the players in a situation in which they are
unsure if they will meet again.
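As a small worked step (an addition for illustration, assuming that w is the same in every round and that rounds end independently), the number of rounds N is geometrically distributed, which gives the expected length of a game:

```latex
\Pr(N = n) = w\,(1 - w)^{n-1}, \qquad
\mathbb{E}[N] = \sum_{n=1}^{\infty} n\,w\,(1 - w)^{n-1} = \frac{1}{w}.
```

For example, w = 0.01 corresponds to an expected game length of 100 rounds, although neither player knows in advance which round will be the last.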
Strategies in an IPD
The strategy of a player determines how the player chooses his next
move.
A memory-one strategy depends only on the previous round of the
game.
For players X and Y, with respective moves x and y, the four possible
outcomes from each round of an IPD can be represented, from the
perspective of player X, as
$$xy \in \{CC, CD, DC, DD\}.$$
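We can represent player X’s memory-one strategy as
$p = (p_{CC}, p_{CD}, p_{DC}, p_{DD})$ such that:
$$p_{CC} = \Pr(X_{n+1} = C \mid X_n = C, Y_n = C)$$
$$p_{CD} = \Pr(X_{n+1} = C \mid X_n = C, Y_n = D)$$
$$p_{DC} = \Pr(X_{n+1} = C \mid X_n = D, Y_n = C)$$
$$p_{DD} = \Pr(X_{n+1} = C \mid X_n = D, Y_n = D)$$
where $X_n$ and $Y_n$ denote the moves of players X and Y in round n.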
We can similarly express player Y’s strategy, corresponding to the
outcomes $yx \in \{CC, CD, DC, DD\}$.
As we only consider memory-one strategies, we can describe the IPD game
as a four-state Markov chain with state space {CC, CD, DC, DD}.
The stationary distribution v of this Markov chain is the probability
distribution of outcomes for any given round (in the long run). This governs
the long-run expected payoffs of each player by the formulas
$$P_X = S_X \cdot v, \qquad P_Y = S_Y \cdot v.$$
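The construction above can be sketched numerically. The following is a minimal illustration, not the project's code: the name q for player Y's strategy, the two example strategies, and the use of power iteration to approximate v are all assumptions made for this example. It assumes the chain has a unique stationary distribution, which holds for the strategies used here.

```python
def transition_matrix(p, q):
    """4x4 transition matrix over states (CC, CD, DC, DD), written from X's view.

    p = (pCC, pCD, pDC, pDD) is X's strategy, indexed by X's view xy.
    q = (qCC, qCD, qDC, qDD) is Y's strategy, indexed by Y's view yx,
    so X's state CD is Y's state DC and vice versa.
    """
    q_from_x_view = (q[0], q[2], q[1], q[3])
    rows = []
    for px, py in zip(p, q_from_x_view):
        # px, py: probabilities that X and Y cooperate in the next round.
        rows.append([px * py, px * (1 - py), (1 - px) * py, (1 - px) * (1 - py)])
    return rows


def stationary_distribution(M, steps=5000):
    """Approximate v with v = vM by repeated multiplication (power iteration)."""
    v = [0.25] * 4
    for _ in range(steps):
        v = [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]
    total = sum(v)
    return [x / total for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


T, R, P, S = 1824, 1607, 1002, 588   # payoff values quoted on this poster
S_X = (R, S, T, P)
S_Y = (R, T, S, P)

p = (0.9, 0.1, 0.8, 0.2)   # hypothetical strategy for X
q = (0.7, 0.2, 0.6, 0.3)   # hypothetical strategy for Y
v = stationary_distribution(transition_matrix(p, q))
P_X, P_Y = dot(S_X, v), dot(S_Y, v)
print(v, P_X, P_Y)
```

Power iteration is used only because it needs no external libraries; solving the linear system v = vM directly would serve equally well.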
Press & Dyson’s Contribution
Using undergraduate linear algebra, Press & Dyson showed [3] that the
dot product of the stationary distribution v with any vector f can be
expressed as a 4 × 4 determinant, in which:
• one column is f
• one column is entirely under the control of player X - only involving
the four probabilities that describe X’s strategy,
• one column is entirely under the control of player Y.
Key Observation
One player, by choosing a strategy to ensure the column they control is
proportional to f, can independently force the dot product of v with
f to be zero.
If a player is able to select a strategy p which, for some α, β, γ ∈ R,
satisfies
$$\tilde{p} = (p_{CC} - 1,\; p_{CD} - 1,\; p_{DC},\; p_{DD}) = \alpha S_X + \beta S_Y + \gamma \mathbf{1},$$
then, regardless of his opponent’s strategy, the following linear
relation will be enforced between the expected payoffs of the players:
$$\alpha P_X + \beta P_Y + \gamma = 0.$$
A strategy p satisfying this condition is a zero-determinant (ZD) strategy.
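The enforced relation can be checked numerically. The sketch below is illustrative only: the coefficients α, β, γ are arbitrary values chosen so that the resulting strategy consists of valid probabilities, the opponent strategies q are hypothetical, and the helper names `zd_strategy` and `long_run_payoffs` are introduced for this example. The stationary distribution is approximated by power iteration.

```python
T, R, P, S = 1824, 1607, 1002, 588   # payoff values quoted on this poster
S_X = (R, S, T, P)
S_Y = (R, T, S, P)


def zd_strategy(alpha, beta, gamma, eps=1e-12):
    """Recover p = (pCC, pCD, pDC, pDD) from p~ = alpha*S_X + beta*S_Y + gamma*1."""
    p_tilde = [alpha * sx + beta * sy + gamma for sx, sy in zip(S_X, S_Y)]
    p = [p_tilde[0] + 1, p_tilde[1] + 1, p_tilde[2], p_tilde[3]]
    if not all(-eps <= x <= 1 + eps for x in p):
        raise ValueError("coefficients do not give valid probabilities")
    # Clip away floating-point noise at the boundaries.
    return tuple(min(1.0, max(0.0, x)) for x in p)


def long_run_payoffs(p, q, steps=5000):
    """Long-run payoffs (P_X, P_Y) from the approximate stationary distribution."""
    qx = (q[0], q[2], q[1], q[3])  # Y's strategy re-indexed to X's view
    M = [[a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)]
         for a, b in zip(p, qx)]
    v = [0.25] * 4
    for _ in range(steps):
        v = [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]
    total = sum(v)
    v = [x / total for x in v]
    return (sum(s * x for s, x in zip(S_X, v)),
            sum(s * x for s, x in zip(S_Y, v)))


# Illustrative coefficients only, picked so that p is a valid strategy.
alpha, beta, gamma = 1 / 4000, -2 / 4000, 1607 / 4000
p = zd_strategy(alpha, beta, gamma)

opponents = [(0.7, 0.2, 0.6, 0.3), (0.1, 0.9, 0.5, 0.5), (0.5, 0.5, 0.5, 0.5)]
residuals = []
for q in opponents:
    P_X, P_Y = long_run_payoffs(p, q)
    residuals.append(alpha * P_X + beta * P_Y + gamma)
print(p, residuals)   # each residual should be close to zero
```

The residuals stay near zero for every opponent tried, which is the sense in which player X enforces the relation unilaterally.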
Applications of Zero-determinant Strategies
The applications detailed in the project are:
• Unilaterally setting an opponent’s score - fixing an opponent’s long-term payoff to some value between the
P and R payoffs, regardless of their strategy.
• Extorting an opponent - one player enforces a relation under which he gains a greater payoff than his
opponent.
• Extorting an adapting player - enforcing an unfair relation against a player who adapts his strategy.
It is also demonstrated why a player is unable to set his own payoff using a ZD strategy.
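As a worked sketch of the first application (derived here from the ZD condition with α = 0, writing p_CC, p_CD, p_DC, p_DD for the four entries of X's strategy; it is not copied from the project), player X chooses p_CC and p_DD freely and the remaining two probabilities follow:

```latex
\begin{aligned}
\alpha = 0:\quad & \tilde{p} = \beta S_Y + \gamma \mathbf{1}
  \;\Longrightarrow\; \beta P_Y + \gamma = 0, \\
& p_{CC} - 1 = \beta R + \gamma, \qquad p_{DD} = \beta P + \gamma, \\
& p_{CD} = \frac{p_{CC}(T - P) - (1 + p_{DD})(T - R)}{R - P}, \qquad
  p_{DC} = \frac{(1 - p_{CC})(P - S) + p_{DD}(R - S)}{R - P}, \\
& P_Y = -\frac{\gamma}{\beta}
      = \frac{(1 - p_{CC})\,P + p_{DD}\,R}{(1 - p_{CC}) + p_{DD}}.
\end{aligned}
```

The final expression is a weighted average of P and R, which is why the opponent's score can only be set between those two payoffs. With the payoff values used on this poster, choosing p_CC = 0.9 and p_DD = 0.1 gives p_CD ≈ 0.828, p_DC ≈ 0.237 and P_Y = 1304.5.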
When considering only two players, with one extorting the other, ZD strategies are extremely robust. It can
be proved [2] that, when facing an unwitting adapting player, a player using a ZD extortion strategy will always
receive his maximum long-term score, regardless of how his opponent adapts. This is demonstrated through
several examples. I also demonstrate that it is impossible for both players to extort each other simultaneously.
Figure 1: Paths taken by an adapting player (P2) and his extorter (P1) in four different instances, with both players
arriving at their maximum scores in each case. (Plot of expected payoff per round against steps; series: P1 Max,
P1 Score-1 to Score-4, P2 Max, P2 Score-1 to Score-4.)
Acknowledgements
I would like to thank my supervisor Dr. Gustav W. Delius for his enthusiasm, guidance and dedication, and for providing
excellent supervision over the course of this project. I would also like to thank Rebecca Nelson for inspiring and
motivating me to work to the best of my ability.
References
[1] Axelrod R. The Evolution of Cooperation. New York: Basic Books; 1984.
[2] Chen J.; Zinger A. The Robustness of Zero-Determinant Strategies in Iterated Prisoner’s Dilemma Games.
Journal of Theoretical Biology 357; 2014. pp. 46-54.
[3] Press W.; Dyson F. Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent.
Proceedings of the National Academy of Sciences 109; 2012.
[4] Taylor J.G. Lanchester Models of Warfare, vol. 1; 1983.
This work uses non-conventional values (T, R, P, S) = (1824, 1607, 1002, 588) for
both players. These were obtained through Lanchester combat modelling [4].
Using results from [3], we are able to consider only memory-one strategies in our analysis of
zero-determinant extortion strategies, without loss of generality. This is fully explained in the project.
In Figure 1, the adapting player’s strategy was simulated using the gradient descent method. The application of
this method to the example is fully detailed in the project.
