Cooperation and Reputation

Vincent Traag
June 29, 2010

Outline
1. Introduction
2. Cooperative Mechanisms
3. Indirect Reciprocity
4. Proposed model

Cooperation
Cooperation (and defection)
• Organizations (also Wikipedia, open source software, . . . )
◮ Why do people contribute?
• Worker ants in colonies
◮ Why do workers help without individual benefit?
• Prudent parasites in hosts
◮ Why do parasites not replicate faster?
• Human body
◮ Why do cells not replicate faster?
Central question
If defecting (not cooperating) is a real option, why (and how) has
cooperation evolved?

Formal cooperation (and defection)
Prisoner’s Dilemma
• The game has two options: donating or not donating.
• Donate at a cost c > 0 to benefit someone else with benefit
b > c.
• Agents are paired, and play a round of donating or not.
• Cooperators C donate, defectors D do not donate.
This can be summarized in the payoff matrix
$$A = \begin{array}{c|cc} & C & D \\ \hline C & b - c & -c \\ D & b & 0 \end{array}$$
Defectors dominate
Whatever strategy you encounter (C or D), it is always better to defect.

Evolutionary Stability (static)
Definition (Nash equilibrium)
Strategy i is a Nash equilibrium if $A_{ii} \ge A_{ji}$ for all $j \ne i$,
and is a strict Nash equilibrium if $A_{ii} > A_{ji}$ for all $j \ne i$.
Players cannot benefit by switching from strategy i if it is a Nash
equilibrium.
Definition (ESS)
Strategy i is an Evolutionarily Stable Strategy (ESS) if, for all $j \ne i$,
$A_{ii} > A_{ji}$, or ($A_{ii} = A_{ji}$ and $A_{ij} > A_{jj}$).
A population of players with strategy i cannot be ‘invaded’ by a
small number of players using a different strategy.
Strict Nash ⟹ ESS ⟹ Nash
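The two definitions can be checked mechanically for pure strategies. The sketch below applies them to the donation game; the values b = 3 and c = 1 and the helper names `is_nash` and `is_ess` are illustrative assumptions, not part of the slides.

```python
# Hypothetical example values; the slides only require b > c > 0.
b, c = 3.0, 1.0

# Payoff matrix of the donation game, rows and columns ordered (C, D).
# A[i][j] is the payoff of strategy i when meeting strategy j.
A = [[b - c, -c],
     [b, 0.0]]


def is_nash(A, i, strict=False):
    """Pure strategy i is (strict) Nash if A[i][i] >= (>) A[j][i] for all j != i."""
    others = [j for j in range(len(A)) if j != i]
    if strict:
        return all(A[i][i] > A[j][i] for j in others)
    return all(A[i][i] >= A[j][i] for j in others)


def is_ess(A, i):
    """Pure strategy i is an ESS if, for every j != i, either
    A[i][i] > A[j][i], or A[i][i] == A[j][i] and A[i][j] > A[j][j]."""
    for j in range(len(A)):
        if j == i:
            continue
        if A[i][i] > A[j][i]:
            continue
        if A[i][i] == A[j][i] and A[i][j] > A[j][j]:
            continue
        return False
    return True


C, D = 0, 1
print("C is Nash:", is_nash(A, C), "| C is ESS:", is_ess(A, C))
print("D is strict Nash:", is_nash(A, D, strict=True), "| D is ESS:", is_ess(A, D))
```

With these values defection passes both tests and cooperation passes neither, which is the "defectors dominate" statement in matrix form.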

Mixed strategies
• There are n different ‘pure’ strategies (e.g. Cooperate, Defect).
• Mixed strategy p is: play ‘pure’ strategy i with probability $p_i$.
• Average payoff for ‘pure’ strategy i versus p is then $(Ap)_i$.
• Average payoff for mixed strategy q versus p is then $q^\top A p$.
Stability revisited
Strategy p is
(Strict) Nash if $p^\top A p \ge q^\top A p$ for all $q \ne p$ (with $>$ for strict Nash)
ESS if, for all $q \ne p$, $p^\top A p > q^\top A p$, or
($p^\top A p = q^\top A p$ and $p^\top A q > q^\top A q$)
There always exists a mixed strategy Nash equilibrium.

Dynamical View
• Natural to model game dynamics in an evolutionary context.
• Survival of the fittest (fitness = payoff).
Definition (Replicator equation)
Population with $i = 1, \ldots, n$ different mixed strategies $p_i$
$x_i$: relative abundance (frequency)
$\bar{p} = \sum_i p_i x_i$: average strategy
$f_i = p_i^\top A \bar{p}$: expected payoff
$\bar{f} = \bar{p}^\top A \bar{p}$: average payoff
Evolution of the population given by
$$\dot{x}_i = x_i (f_i - \bar{f}) = x_i (p_i - \bar{p})^\top A \bar{p}.$$
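A minimal numerical sketch of the replicator equation, assuming the population consists of the two pure strategies C and D only (so the average strategy equals the frequency vector x) and using a plain forward-Euler step. The values b = 3, c = 1, the step size and the function name `replicator_step` are illustrative assumptions.

```python
def replicator_step(x, A, dt):
    """One forward-Euler step of dx_i/dt = x_i (f_i - f_bar) for pure strategies."""
    n = len(x)
    # Expected payoff of each pure strategy against the current population.
    f = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Average payoff in the population.
    f_bar = sum(x[i] * f[i] for i in range(n))
    return [x[i] + dt * x[i] * (f[i] - f_bar) for i in range(n)]


# Hypothetical example values; the slides only require b > c > 0.
b, c = 3.0, 1.0
A = [[b - c, -c],
     [b, 0.0]]

x = [0.9, 0.1]  # start with 90% cooperators, 10% defectors
for _ in range(5000):
    x = replicator_step(x, A, dt=0.01)

print("final frequencies (C, D):", x)
```

In the donation game the payoff difference between C and D is −c at every population state, so the cooperator frequency decays towards zero from any interior starting point.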

Stability (dynamic)
Fixed points
• Total population always $\sum_i x_i = 1$.
• Dynamics are restricted to unit simplex $S_n$.
• Fixed point $x^*$ then $p_i^\top A \bar{p} = \bar{p}^\top A \bar{p}$ for $x_i > 0$.
Nash and ESS vs. fixed points
• If $x^*$ is (strictly) Nash, then it is a (stable) fixed point.
• If the fixed point $x^*$ is stable, it is a Nash equilibrium.
• If $x^*$ is ESS then it is a stable fixed point.
• An interior ESS $x^*$ is globally stable.

Overview
What are possible mechanisms to get cooperation?
Payoff matrix
$$A = \begin{array}{c|cc} & C & D \\ \hline C & b - c & -c \\ D & b & 0 \end{array}$$
Mechanisms
• Kin selection ($r > c/b$)
Cooperate because offspring benefit from your cooperation. Basis
of the ‘selfish gene’, or ‘inclusive fitness’.
• Direct reciprocity ($w > c/b$)
Cooperate because of possible future payoffs.
• Indirect reciprocity ($q > c/b$)
Cooperate because someone else may cooperate with you in the
future.

Kin selection
Kin and gene
• Focus is on the gene: how can the gene spread?
• If coefficient of kinship $r > c/b$, the cooperative gene will spread.
Game theoretic dynamic view
• Let $0 \le r \le 1$ be the assortativity.
• Average payoff (cooperators $x$, defectors $1 - x$)
$$f_C(x) = r(b - c) + (1 - r)\,\bigl(x(b - c) - (1 - x)c\bigr)$$
$$f_D(x) = (1 - r)\,x b$$
• Dynamics $\dot{x} = x(1 - x)(f_C - f_D)$; $x^* = 1$ is stable if $r > c/b$.

Kin selection
Change in payoff
• Average payoff (cooperators $x$, defectors $1 - x$)
$$f_C(x) = r(b - c) + (1 - r)\,\bigl(x(b - c) - (1 - x)c\bigr)$$
$$f_D(x) = (1 - r)\,x b$$
• Gives payoff matrix
$$A = \begin{array}{c|cc} & C & D \\ \hline C & b - c & rb - c \\ D & (1 - r)b & 0 \end{array}$$
• Cooperation is ESS if $b - c > (1 - r)b$, hence if $r > c/b$.
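The threshold r > c/b follows because the payoff gap f_C − f_D does not depend on x: expanding the two expressions above gives rb − c. The sketch below checks this numerically; b = 3, c = 1 and the sampled values of r and x are arbitrary assumptions chosen for illustration.

```python
def f_C(x, r, b, c):
    """Average payoff of a cooperator with assortativity r."""
    return r * (b - c) + (1 - r) * (x * (b - c) - (1 - x) * c)


def f_D(x, r, b, c):
    """Average payoff of a defector with assortativity r."""
    return (1 - r) * x * b


# Hypothetical example values; c / b = 1/3 here.
b, c = 3.0, 1.0

for r in (0.2, 0.5):
    gaps = [f_C(x, r, b, c) - f_D(x, r, b, c) for x in (0.0, 0.25, 0.5, 1.0)]
    # Every entry equals r*b - c, independent of x.
    print("r =", r, "gap f_C - f_D:", gaps, "cooperation favoured:", r * b > c)
```

The matrix entries on this slide are the same functions evaluated at x = 1 (column C) and x = 0 (column D).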

Reciprocity
Cooperate because of possible future rewards.
Iterated Prisoner’s Dilemma
• Play the PD game multiple times.
• Usually probability w to play another round.
• Huge number of possible strategies.
• No definite ESS.
Framework
• Play on average $k = 1/(1 - w)$ rounds, then apply selection.
• Expected payoff $a_{ij}$ of strategy i vs j.
• Then apply earlier framework (ESS, replicator).

Some strategies
Example (Always)
Defect/cooperate on all rounds.
Other CDDDDCC
AllD  DDDDDDD
AllC  CCCCCCC
Example (Win-Stay, Lose-Shift)
Change strategy if losing, keep it otherwise.
Other CDDDDCC
WSLS  CCDCDCC
Example (Tit-for-tat)
Start cooperating, then repeat opponent.
Other CDDDDCC
TFT   CCDDDDC
Example (Generous Tit-for-tat)
As TFT, but cooperates after defection with probability p.
Other CDDDDCC
GTFT  CCDDCDC
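The deterministic examples above can be reproduced with a few lines of code. This is a sketch under one stated assumption: for WSLS, a round counts as a "win" when the opponent cooperated and as a "loss" when the opponent defected. The function names are illustrative. GTFT is omitted because its moves are random.

```python
def all_d(own_history, other_history):
    """Always defect."""
    return "D"


def all_c(own_history, other_history):
    """Always cooperate."""
    return "C"


def tft(own_history, other_history):
    """Tit-for-tat: cooperate first, then copy the opponent's previous move."""
    return other_history[-1] if other_history else "C"


def wsls(own_history, other_history):
    """Win-Stay, Lose-Shift (assumed rule): keep the previous move if the
    opponent cooperated last round, switch if the opponent defected."""
    if not own_history:
        return "C"
    if other_history[-1] == "C":
        return own_history[-1]
    return "D" if own_history[-1] == "C" else "C"


def play_against(strategy, other_moves):
    """Return the strategy's moves against a fixed sequence of opponent moves."""
    own = []
    for t in range(len(other_moves)):
        # The strategy only sees the opponent's moves from earlier rounds.
        own.append(strategy(own, other_moves[:t]))
    return "".join(own)


other = "CDDDDCC"
for name, strategy in [("AllD", all_d), ("AllC", all_c), ("WSLS", wsls), ("TFT", tft)]:
    print(name.ljust(5), play_against(strategy, other))
```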

Stability of reciprocity (TFT)
TFT vs. AllD
• TFT will cooperate in the first round, then defect subsequently.
• Expected payoff matrix
$$A = \begin{array}{c|cc} & \text{TFT} & \text{AllD} \\ \hline \text{TFT} & (b - c)/(1 - w) & -c \\ \text{AllD} & b & 0 \end{array}$$
• TFT is ESS when $(b - c)/(1 - w) > b$, or $w > c/b$.
TFT vs. AllC
• TFT is neutral vs AllC, neither is ESS.
• Expected payoff always $(b - c)/(1 - w)$ for both TFT and AllC.

Cyclic behaviour
Weaknesses of TFT
• TFT population can drift towards AllC.
• TFT does not restore cooperation after errors:
TFT CCDCDCDD
TFT CCCDCDDD
• Generous TFT (GTFT) sometimes cooperates unreciprocally.
• GTFT can correct errors but is still neutral vs AllC.
[Figure: diagram connecting TFT, GTFT, AllC and AllD.]

Introduction
Why are kin selection and reciprocity not sufficient?
Insufficient explanation
• Humans cooperate also with non-kin.
• Humans cooperate in non-iterative situations.
Indirect reciprocity
• Cooperate with those who have cooperated with others in the past.
• Brings reputation into play.
• How to respond to reputation?
• How to determine new reputation?

Indirect Reciprocity
Cooperate because others will return the favor.
Reputation
• Cooperation increases reputation, defection decreases it.
• Cooperate with those who have a good reputation.
• Defect against those who have a bad reputation.
Action and assessment
• Many other possible interactions between cooperation and
reputation.
• Should it be ‘bad’ or ‘good’ to cooperate with ‘bad’ agents?
• Should you cooperate only to increase your own reputation?

Image score
Definition (Image score, reputation)
• Integer status $-5 \le S_i \le 5$ known to all.
• Cooperating increases the status by 1.
• Defecting decreases the status by 1.
Definition (Discriminator Strategy)
• Cooperative threshold $-5 \le k_j \le 6$.
• If status $S_i \ge k_j$ cooperate, otherwise defect.
• Strategy $k_j = -5$ corresponds to AllC.
• Strategy $k_j = 6$ corresponds to AllD.
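A small sketch of the two definitions above. Clamping the status to the range −5..5 is an assumption about how the bounds are enforced; the function names are illustrative.

```python
S_MIN, S_MAX = -5, 5  # range of the image score stated on the slide


def update_score(score, cooperated):
    """Raise the score by 1 after cooperating, lower it by 1 after defecting.
    Clamping at the bounds is an assumption, not stated on the slide."""
    step = 1 if cooperated else -1
    return max(S_MIN, min(S_MAX, score + step))


def discriminator(k, recipient_score):
    """Cooperate (True) if the recipient's score is at least the threshold k."""
    return recipient_score >= k


all_scores = range(S_MIN, S_MAX + 1)
print("k = -5 cooperates with everyone:", all(discriminator(-5, s) for s in all_scores))
print("k = 6 cooperates with no one:", not any(discriminator(6, s) for s in all_scores))
print("k = 0 helps score 2:", discriminator(0, 2), "| helps score -1:", discriminator(0, -1))
```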

Image score
Simulation
• Have n agents playing m rounds of donating.
• Each agent i has a threshold $k_i$ and reputation $S_i$.
• Reproduce offspring proportional to payoff.
Results of simulation
• Cooperative strategies ($k_i \le 0$) prevail without mutation.
• Cycles of Discriminator → AllC → AllD with mutation.

Some simple analytics
Simple image score
• Only good (1) or bad (0) reputation.
• Conditional cooperation (CC): cooperate if reputation is good.
• Probability q to know reputation of defector.
CC vs AllD
• Payoff matrix
$$A = \begin{array}{c|cc} & \text{CC} & \text{AllD} \\ \hline \text{CC} & b - c & -c(1 - q) \\ \text{AllD} & b(1 - q) & 0 \end{array}$$
• Conditional Cooperation is ESS when $q > c/b$.

Other reputation dynamics
Morals
• Defecting against a defector: bad in image score.
• What action should be regarded as good?
• When to cooperate, when to defect?
Columns: reputation of donor and recipient. Rows C and D: action of donor; each entry ∗ is the new reputation, which can be either Good or Bad. Last row: the action, which can be either Cooperate or Defect.
     GG GB BG BB
C    ∗  ∗  ∗  ∗
D    ∗  ∗  ∗  ∗
     ∗  ∗  ∗  ∗

Some reputation dynamics
Image scoring
     GG GB BG BB
C    G  G  G  G
D    B  B  B  B
Standing
     GG GB BG BB
C    G  G  G  G
D    B  G  B  B
Judging
     GG GB BG BB
C    G  B  G  B
D    B  G  B  B
Shunning
     GG GB BG BB
C    G  B  G  B
D    B  B  B  B
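The four assessment rules above are small lookup tables, so they can be encoded directly. In this sketch the column order (GG, GB, BG, BB) is read as (donor reputation, recipient reputation), as in the preceding slide; the names `DYNAMICS` and `new_reputation` are illustrative.

```python
# Column order: reputation of donor, then reputation of recipient.
STATES = ("GG", "GB", "BG", "BB")

# For each rule: new reputation of the donor after action C or D,
# one letter per column of STATES.
DYNAMICS = {
    "image scoring": {"C": "GGGG", "D": "BBBB"},
    "standing":      {"C": "GGGG", "D": "BGBB"},
    "judging":       {"C": "GBGB", "D": "BGBB"},
    "shunning":      {"C": "GBGB", "D": "BBBB"},
}


def new_reputation(rule, donor, recipient, action):
    """Look up the donor's new reputation ('G' or 'B') under the given rule."""
    column = STATES.index(donor + recipient)
    return DYNAMICS[rule][action][column]


# A good donor defects against a bad recipient: the rules disagree.
for rule in DYNAMICS:
    print(rule.ljust(14), new_reputation(rule, "G", "B", "D"))
```

The printed column shows where the rules differ: only standing and judging treat a good donor's defection against a bad recipient as good.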

Leading eight
Best strategies
• In total 2,048 different possible strategies.
• There are 8 strategies (leading eight) that perform best (highest
payoff, and ESS).
     GG GB BG BB
C    G  ∗  G  ∗
D    B  G  B  ∗
     C  D  C  ×
Maintenance of cooperation
Mark defectors
Punish defectors
Forgive defectors
Apologize

Subjective reputation
• Unrealistic that everybody knows the reputation of everybody.
• Introduce a subjective (private) reputation.
• ‘Observe’ only a few interactions.
Observing
• Probability q of observing an interaction.
• Cooperation declines with lower q.
• Diverging reputations cause further errors.
• Good agents may defect against bad agents, but not all agree on who is bad.

Synchronize reputations
Synchronizing reputations
• Spread local information to synchronize reputations.
• Players ‘gossip’ about each other to share information.
• Who starts gossip, who spreads it, and how should it be interpreted?
Lying, cheating and defecting
• ‘False’ gossip may spread.
• Spreading rumours unconditionally allows liars to invade.
• Liars cannot invade conditional rumour spreaders.

Empirical evidence
Directly observable
• Humans seem to be using image scoring.
• Norm (help if S > k) can be different across groups.
• Standing strategy might be too ‘demanding’.
• Generates trust, also in subsequent games.
With gossip
• Gossip effective to spread information on reputation.
• Even in presence of direct observation, gossip has an effect.
• More gossip increases the effect.

Current research
Research questions
• What population structure can result from gossip?
• How stable are certain population structures?
Desired properties
• Have subjective reputations.
• Influenced by ‘local’ gossip.
• In the absence of gossip, rely on own observations.
• More gossip should have more influence.
• Have an analytically tractable model.

Simple model
• Start with some simple model and obtain some results.
• Somewhat arbitrary choices, which might be varied later on.
Basics
1. Each agent has a reputation of the other: $S_{ij}$.
2. Everybody plays and cooperates/defects based on reputation.
3. Everybody gossips the result of the interaction.
4. Update reputation based on own observation and gossip.

Reputation and cooperation
One interaction
• Suppose agent i and j interact
• Each agent has a reputation of the other: $S_{ij}$ and $S_{ji}$
• Probabilities to cooperate $\alpha_{ij}$ and $\alpha_{ji}$ depend on reputation.
Approximation to image score
• Image score effectively uses a Heaviside step function:
$$\alpha_{ij} = \Theta(S_{ij} - k)$$
• We propose a continuous version (for now, $k = 0$)
$$\alpha_{ij} = \frac{1}{1 + e^{-\gamma(S_{ij} - k)}}$$
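To see how the continuous version relates to the step function, the sketch below evaluates both. The convention that the step function returns 1 at S = k (matching "cooperate if S ≥ k") and the sample values of γ are assumptions.

```python
import math


def heaviside_coop(S, k=0.0):
    """Image-score rule: cooperate with certainty if S >= k (assumed convention at S == k)."""
    return 1.0 if S >= k else 0.0


def logistic_coop(S, k=0.0, gamma=1.0):
    """Continuous version: probability to cooperate given reputation S."""
    return 1.0 / (1.0 + math.exp(-gamma * (S - k)))


for gamma in (1.0, 5.0, 50.0):
    row = [round(logistic_coop(S, gamma=gamma), 3) for S in (-1.0, -0.1, 0.0, 0.1, 1.0)]
    print("gamma =", gamma, row)
print("step function ", [heaviside_coop(S) for S in (-1.0, -0.1, 0.0, 0.1, 1.0)])
```

Larger γ makes the logistic curve steeper, so it approaches the step function everywhere except at S = k, where it stays at 1/2.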

Individual strategy
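The four different outcomes have the following probabilities (rows: action of player i, columns: action of player j):
$$\begin{array}{c|cc} & C & D \\ \hline C & \alpha_{ij}\alpha_{ji} & \alpha_{ij}(1 - \alpha_{ji}) \\ D & (1 - \alpha_{ij})\alpha_{ji} & (1 - \alpha_{ij})(1 - \alpha_{ji}) \end{array}$$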
Individual strategy
• +1 for ‘good’ actions, −1 for ‘bad’ actions to reputation.
• TFT-like: Consider CC and DC as good.
• We currently study WSLS-like: Consider CC and DD as good.
$$\Delta_i S_{ij}(t) = \alpha_{ij}\alpha_{ji} + (1 - \alpha_{ij})(1 - \alpha_{ji}) - (1 - \alpha_{ij})\alpha_{ji} - \alpha_{ij}(1 - \alpha_{ji}) = (2\alpha_{ij} - 1)(2\alpha_{ji} - 1)$$
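The expected change can be computed from the outcome table for any choice of "good" outcomes. The sketch below does this for both rules mentioned above; outcome labels put player i's action first, and the function names and sample probabilities are illustrative assumptions.

```python
def outcome_probs(a_ij, a_ji):
    """Probability of each joint outcome; first letter is i's action, second is j's."""
    return {
        "CC": a_ij * a_ji,
        "CD": a_ij * (1 - a_ji),
        "DC": (1 - a_ij) * a_ji,
        "DD": (1 - a_ij) * (1 - a_ji),
    }


def expected_change(a_ij, a_ji, good):
    """Expected reputation change: +1 for outcomes in `good`, -1 for the rest."""
    probs = outcome_probs(a_ij, a_ji)
    return sum(p if outcome in good else -p for outcome, p in probs.items())


a_ij, a_ji = 0.3, 0.8  # hypothetical cooperation probabilities
wsls_like = expected_change(a_ij, a_ji, good={"CC", "DD"})
tft_like = expected_change(a_ij, a_ji, good={"CC", "DC"})

print("WSLS-like:", wsls_like, "closed form:", (2 * a_ij - 1) * (2 * a_ji - 1))
print("TFT-like: ", tft_like, "closed form:", 2 * a_ji - 1)
```

Under the TFT-like rule the same sum collapses to 2α_ji − 1, since only j's action matters.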

Gossiping
Who gossips?
• To whom should you gossip?
• What gossip should you trust?
• Pass on the gossip?
• Currently: no further spreading, talk to cooperative people.
Gossip about what?
• Gossip about reputation?
• Gossip about last interaction?
• Currently: last interaction.

Gossiping
Consider all neighbours k when updating the reputation $S_{ij}$.
[Figure: triangle of agents i, j and k. The link between i and j is the link to be updated. Does i ‘like’ k? Will k gossip to i? What action has j taken to k?]
Change in reputation after gossiping
$$\Delta_g S_{ij}(t) = \sum_{k \ne i,j} \alpha_{ki}(2\alpha_{ik} - 1)(2\alpha_{jk} - 1)$$

Reputation dynamics
Reputation
• Combine change from individual strategy and from gossiping.
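• Balance the two changes with a ‘social influence’ parameter
$0 \le \lambda \le 1$.
$$\Delta S_{ij}(t) = \underbrace{(1 - \lambda)(2\alpha_{ij} - 1)(2\alpha_{ji} - 1)}_{\text{Individual strategy}} + \underbrace{\lambda \sum_{k \ne i,j} \alpha_{ki}(2\alpha_{ik} - 1)(2\alpha_{jk} - 1)}_{\text{Gossip influence}}$$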
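A direct transcription of the combined update into code, as a sketch. The matrix `alpha` holds the cooperation probabilities α_ij with unused diagonal entries; the function name, the four-agent example and λ = 0.5 are illustrative assumptions.

```python
def delta_S(alpha, i, j, lam):
    """Expected change of S_ij: (1 - lam) * individual term + lam * gossip term."""
    n = len(alpha)
    individual = (2 * alpha[i][j] - 1) * (2 * alpha[j][i] - 1)
    gossip = sum(
        alpha[k][i] * (2 * alpha[i][k] - 1) * (2 * alpha[j][k] - 1)
        for k in range(n)
        if k != i and k != j
    )
    return (1 - lam) * individual + lam * gossip


# Hypothetical example: four agents who all cooperate with each other.
n = 4
alpha = [[1.0] * n for _ in range(n)]

print("no gossip  :", delta_S(alpha, 0, 1, lam=0.0))
print("with gossip:", delta_S(alpha, 0, 1, lam=0.5))
```

In the all-cooperate example each of the n − 2 third parties contributes +1 to the gossip sum, so the change equals (1 − λ) + λ(n − 2).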

Analytics
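Obtain differential equation
• Assume that for an interval $\Delta t < 1$, the probability to interact is $\Delta t$.
• Then we can take the limit $\lim_{\Delta t \to 0} \Delta S_{ij}(t)/\Delta t$.
• The derivative $\dot{S}_{ij}$ can be written in terms of $\alpha_{ij}$; we obtain
$$\dot{S}_{ij} = \frac{\dot{\alpha}_{ij}}{\gamma(1 - \alpha_{ij})\alpha_{ij}}$$
Differential equation becomes (with rescaled time $\tau = \gamma t$)
$$\dot{\alpha}_{ij} = \alpha_{ij}(1 - \alpha_{ij}) \Bigl[ (1 - \lambda)(2\alpha_{ij} - 1)(2\alpha_{ji} - 1) + \lambda \sum_{k \ne i,j} \alpha_{ki}(2\alpha_{ik} - 1)(2\alpha_{jk} - 1) \Bigr]$$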

No gossip
• When gossip is not present the differential equation is simple:
$$\dot{\alpha}_{ij} = \alpha_{ij}(1 - \alpha_{ij})(2\alpha_{ij} - 1)(2\alpha_{ji} - 1)$$
• Only dependent on $\alpha_{ij}$ and $\alpha_{ji}$.
• Only stable fixed point: $\alpha^*_{ij} = \alpha^*_{ji} = 1$.
[Figure: plot on the unit square; both axes run from 0.0 to 1.0.]
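The no-gossip dynamics for a single pair can be explored numerically. This is a minimal forward-Euler sketch; the step size, number of steps and starting point (0.6, 0.7) are arbitrary assumptions, and other starting points need not end at (1, 1).

```python
def step(a_ij, a_ji, dt):
    """One forward-Euler step of the coupled no-gossip equations for one pair."""
    da_ij = a_ij * (1 - a_ij) * (2 * a_ij - 1) * (2 * a_ji - 1)
    da_ji = a_ji * (1 - a_ji) * (2 * a_ji - 1) * (2 * a_ij - 1)
    return a_ij + dt * da_ij, a_ji + dt * da_ji


def integrate(a_ij, a_ji, dt=0.01, steps=10000):
    """Integrate the pair dynamics over rescaled time steps * dt."""
    for _ in range(steps):
        a_ij, a_ji = step(a_ij, a_ji, dt)
    return a_ij, a_ji


final = integrate(0.6, 0.7)
print("from (0.6, 0.7):", final)
print("corner (1, 1) stays put:", integrate(1.0, 1.0))
```

Starting with both probabilities above 1/2, both derivatives are positive, so the pair moves towards mutual cooperation.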

Stability of fixed points
Two classes of fixed points
• Let $S_n$ be the unit hypercube of dimension n.
• First class of fixed points consists of the corners of $S_n$.
• That is, $\alpha^*_{ij} \in \{0, 1\}$ for all ij.
• Second class is outside the corners (internal points).
• That is, there is at least one $\alpha^*_{ij} \notin \{0, 1\}$.
Stability of points
• Points in the corner are easily classified as (un)stable
• Internal points more difficult.
• It seems that most internal points are non-hyperbolic.
• Possibly some (limit) cycles may exist.

Corner points
• All corner points are fixed points.
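• Jacobian of $\dot{\alpha} = F(\alpha)$ defined as
$$\nabla F = \left. \begin{pmatrix} \dfrac{\partial f_{12}}{\partial \alpha_{12}} & \cdots & \dfrac{\partial f_{12}}{\partial \alpha_{n(n-1)}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_{n(n-1)}}{\partial \alpha_{12}} & \cdots & \dfrac{\partial f_{n(n-1)}}{\partial \alpha_{n(n-1)}} \end{pmatrix} \right|_{\alpha^*}$$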
• For corner points, only $\partial f_{ij} / \partial \alpha_{ij}$ is non-zero.
Condition for stability in corners:
$$(1 - 2\alpha^*_{ij}) \left[ (1 - \lambda)(2\alpha^*_{ij} - 1)(2\alpha^*_{ji} - 1) + \lambda (k^+_{ij} - k^-_{ij}) \right] < 0$$
where $k^\pm_{ij}$ is the number of matches/differences between i and j.

Stable groups
Groups
• One special case of corner points
• Cooperate within group, defect between groups
• Working out stability conditions gives
$$n_c > \frac{1}{\lambda}$$
• Social influence $\lambda$ induces lower bound on group size.
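The group-size bound can be checked by evaluating the corner stability condition on every link of a two-group configuration. In this sketch the balance term k⁺ − k⁻ is computed as the gossip sum of the differential equation evaluated at the corner, which is an interpretation of the slide; the function names and the group sizes are illustrative assumptions.

```python
def corner_is_stable(alpha, lam):
    """Check the corner stability condition for every ordered pair (i, j)."""
    n = len(alpha)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Gossip sum at the corner; assumed to equal k+ minus k-.
            balance = sum(
                alpha[k][i] * (2 * alpha[i][k] - 1) * (2 * alpha[j][k] - 1)
                for k in range(n)
                if k != i and k != j
            )
            bracket = (1 - lam) * (2 * alpha[i][j] - 1) * (2 * alpha[j][i] - 1) + lam * balance
            if (1 - 2 * alpha[i][j]) * bracket >= 0:
                return False
    return True


def two_groups(n_c):
    """Two groups of size n_c: cooperate within a group, defect between groups."""
    n = 2 * n_c
    return [[1 if (i < n_c) == (j < n_c) else 0 for j in range(n)] for i in range(n)]


print("n_c = 3, lambda = 0.5  (1/lambda = 2):", corner_is_stable(two_groups(3), 0.5))
print("n_c = 3, lambda = 0.25 (1/lambda = 4):", corner_is_stable(two_groups(3), 0.25))
```

For links between groups the condition becomes (1 − λ) − λ(n_c − 1) < 0, which rearranges to n_c > 1/λ; links within a group are stable for any n_c ≥ 2.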

Invasion from AllD
AllD
• Suppose system in equilibrium α∗ = (1, 1, . . . , 1).
• Add a number of defectors (AllD).
• Relationships between gossiping cooperators unaffected.
• Only reputation of defector changes.
New reputation equilibrium
• Let i be a cooperator, and j a defector, then
$$\dot{\alpha}_{ij} = \alpha_{ij}(1 - \alpha_{ij}) \left[ (1 - \lambda)(1 - 2\alpha_{ij}) - \lambda(n_c - 1) \right]$$
• Stable fixed point $\alpha^*_{ij} = \frac{1 - \lambda n_c}{2(1 - \lambda)}$ exists if $n_c < 1/\lambda$ (otherwise 0).

Invasion from AllD
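• In equilibrium, expected payoff $A_{cc}$ of cooperator vs. itself is
$$A_{cc} = (b - c)\,\frac{n_c(n_c - 1)}{n^2}$$
• Expected payoff $A_{dc}$ of defector vs. cooperator is
$$A_{dc} = b\,\frac{1 - \lambda n_c}{2(1 - \lambda)}\,\frac{n_c n_d}{n^2}$$
• Condition $A_{cc} > A_{dc}$ reduces to
$$1 - \frac{(1 - \lambda n_c)\,n_d}{2(1 - \lambda)(n_c - 1)} > \frac{c}{b}$$
• Since $c/b < 1$, AllD cannot invade if the left-hand side is larger than 1. This
reduces to
$$n_c > \frac{1}{\lambda}$$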
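The invasion condition can be evaluated for concrete numbers. This sketch assumes n = n_c + n_d, which the slides do not state explicitly, and uses hypothetical values for b, c, λ and the group sizes; the function names are illustrative.

```python
def coop_with_defector(n_c, lam):
    """Equilibrium probability that a cooperator helps a defector:
    (1 - lam * n_c) / (2 * (1 - lam)) if n_c < 1 / lam, otherwise 0."""
    if lam * n_c >= 1:
        return 0.0
    return (1 - lam * n_c) / (2 * (1 - lam))


def cooperators_resist(b, c, n_c, n_d, lam):
    """True if the cooperator payoff A_cc exceeds the defector payoff A_dc."""
    n = n_c + n_d  # assumption: n is the total population size
    A_cc = (b - c) * n_c * (n_c - 1) / n**2
    A_dc = b * coop_with_defector(n_c, lam) * n_c * n_d / n**2
    return A_cc > A_dc


b, c, lam = 3.0, 1.0, 0.1  # hypothetical values; 1 / lam = 10
print("n_c = 20, n_d = 5 :", cooperators_resist(b, c, n_c=20, n_d=5, lam=lam))
print("n_c = 2,  n_d = 10:", cooperators_resist(b, c, n_c=2, n_d=10, lam=lam))
```

In the first case n_c exceeds 1/λ, so defectors receive no cooperation and cannot invade. In the second case the small group still helps defectors often enough for them to earn more.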

Invasion from AllD
Group size
• Two regimes of behavior: $n_c < 1/\lambda$ and $n_c > 1/\lambda$.
• In first regime, some cooperation with defectors.
• Amount of cooperation decreases with group size $n_c$ and social
influence $\lambda$.
• In second regime, defectors can never invade.
• But by earlier stability of groups, $n_c > 1/\lambda$.
• So, always stable against invasion from AllD.
