Case File
efta-efta01199747DOJ Data Set 9OtherDS9 Document EFTA01199747
Date
Unknown
Source
DOJ Data Set 9
Reference
efta-efta01199747
Pages
29
Persons
0
Integrity
No Hash Available
Extracted Text (OCR)
Text extracted via OCR from the original document. May contain errors from the scanning process.
3
4 "
"
Christian Hilbe' ", Bin Wub, Arne Traulsenb, and Martin A. Nowak"
5
,,n,Cooperation and control in multiplayer social dilemmas
0:7.6.9 °Program for Evolutionary Dynamics, Harvard University, Cambridge, MA 0213t °Department for Evolutionary Theory, Max Planck Institute for
6
Evolutionary Biology, 20306 Plan, Germany; and 'Department of Organismic and Evolutionary Biology and Department of Mathematics, Harvard University,
7
Cambridge, MA 02138
8
9
10
II
12
13
14
5
16
17
I8
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
Edited by Joshua B. Plotkin, University of Pennsylvania, Philadelphia, PA. and accepted by the Editorial Board September 26, 2010 (received for review
April 30, 2010)
Direct reciprocity and conditional cooperation are important mecha-
nisms to prevent free riding in social dilemmas. However, in large
groups, these mechanisms may become ineffective because they re-
quire single individuals to have a substantial influence on their peers.
However, the recent discovery of zero-determinant strategies in the
iterated prisoner's dilemma suggests that we may have underesti-
mated the degree of control that a single player can exert. Here,
we develop a theory for zero-determinant strategies for multiplayer
social dilemmas, with any number of involved players. We distinguish
several particularly interesting subclasses of strategies: fair strategies
ensure that the own payoff matches the average payoff of the group;
extortionate strategies allow a player to perform above average;
and generous strategies let a player perform below average. We use
this theory to descnbe strategies that sustain cooperation. induding
generalized variants of Tit-for-Tat and Win-Stay Lose-Shift. Moreover,
we explore two models that show how individuals can further enhance
their strategic options by coordinating their play with others. Our
results highlight the importance of individual control and coordination
to succeed in large groups
evolutionary game theory I alliances I public goods game I
volunteer's dilemma I cooperation
C
ooperation among self-interested individuals is generally
difficult to achieve (1-3), but typically the free rider problem
is aggravated even further when groups become large (4-9). In
small communities, cooperation can often be stabilized by forms
of direct and indirect reciprocity (10-17). For large groups, how-
ever, it has been suggested that these mechanisms may turn out to
be ineffective, as it becomes more difficult to keep track of the
reputation of others and because the individual influence on others
diminishes (4-8). To prevent the tragedy of the commons and to
40
compensate for the lack of individual control, many successful
41
communities have thus established central institutions that enforce
42
mutual cooperation (18-22).
43
However, a recent discovery suggests that we may have un-
derestimated the amount of control that single players can exert in
repeated games. For the repeated prisoners dilemma, Press and
Dyson (23) have shown the existence of zero-determinant strategies
(or ZD strategies), which allow a player to unilaterally enforce
a linear relationship between the own payoff and the coplayer's
48
payoff, irrespective of the coplayer's actual strategy. The class of
49
zero-determinant strategies is surprisingly rich: for example, a player
who wants to ensure that the own payoff will always match the
coplayer's payoff can do so by applying a fair ZD strategy, like Tit-
for-Tat. On the other hand, a player who wants to outperform the
respective opponent can do so by slightly tweaking the Tit-for-Tat
strategy to the own advantage, thereby giving rise to extortionate
54
ZD strategies. The discovery of such strategies has prompted sev-
55
eral theoretical studies, exploring how different ZD strategies
56
evolve under various evolutionary conditions (24-30).
57
ZD strategies are not confined to the repeated prisoner's di-
58
lemma. Recently published studies have shown that ZD strate-
59
gies also exist in other repeated two player games (29) or in
repeated public goods games (31). Herein, we will show that such
strategies exist for all symmetric social dilemmas, with an arbi-
61
trary number of participants. We use this theory to describe
62
which ZD strategies can be used to enforce fair outcomes or to
44
45
46
47
50
5I
52
53
prevent free riders from taking over. Our results, however, are
not restricted to the space of ZD strategies. By extending the
techniques introduced by Press and Dyson (23) and Akin (27), we
also derive exact conditions when generalized versions of Grim, Tit-
for-Tat, and Win-Stay Lase-Shift allow for stable cooperation. In
this way, we find that most of the theoretical solutions for the re-
peated prisoner's dilemma can be directly transferred to repeated
dilemmas with an arbitrary number of involved players.
In addition, we also propose two models to explore how indi-
viduals can further enhance their strategic options by coordinating
their play with others. To this end, we extend the notion of ZD
strategies for single players to subgroups of players (to which we
refer as ZD alliances). We analyze two models of ZD alliances,
depending on the degree of coordination between the players.
When players form a strategy alliance, they only agree on the set
of alliance members, and on a common strategy that each alliance
member independently applies during the repeated game. When
players form a synchronized alliance, on the other hand, they
agree to act as a single entity, with all alliance members playing the
same action in a given round. We show that the strategic power of
ZD alliances depends on the size of the alliance, the applied
strategy of the allies, and on the properties of the underlying social
dilemma. Surprisingly, the degree of coordination only plays a role
as alliances become large (in which case a synchronized alliance
has more strategic options than a strategy alliance).
To obtain these results, we consider a repeated social dilemma
between n players. In each round of the game, players can decide
whether to cooperate (C) or to defect (D). A player's payoff
depends on the player's own decision and on the decisions of all
other group members (Fig. 1A): in a group in which/ of the other
group members cooperate, a cooperator receives the payoff al,
whereas a defector obtains b,. We assume that payoffs satisfy the
Significance
Many of the world's most pressing problems, like the prevention
of climate change, have the form of a large-scale social dilemma
with numerous involved players. Previous results in evolutionary
game theory suggest that multiplayer dilemmas make it partic-
ularly difficult to achieve mutual cooperation because of the lack
of individual control in large groups. Herein, we extend the
theory of zero-determinant strategies to multiplayer games to
describe which strategies maintain cooperation. Moreover, we
propose two simple models of alliances in multiplayer dilemmas.
The effect of these alliances is determined by their size, the
strategy of the allies, and the properties of the social dilemma.
When a single individual's strategic options are limited, forming
an alliance can result in a drastic leverage.
Author contrbutions: B.W. initiated the project; CN. B.W. AT, and M.A.N. designed
research; C.H.,B.W., AT., and MAN. performed research; At and MAN. analysed data
and C H and 6 W. wrote the paper.
The authors declare no conflict of interest.
'Mks artkle Is a PNAS Direct Submission. J.B.P. is a guest editor Invited by the mow
Board.
Freely available online through the PNAS open access option.
'To whom correspondence should be addressed. Email: nitiorlasriarraisethi.
'nth article contains supporting information online aUnww.pnas.orgiloalruWsupplidoi:10.
ion/linos motion imxsowir mental.
b.;
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
97
98
99
100
101
102
103
1(14
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
vnewpnas.orgrcglidoi/10.1073/pnas.1407887111
PNAS Early Edition I 1 of 6
EFTA01199747
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
I50
I51
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
ISO
181
182
183
184
185
186
following three properties that are characteristic for social
dilemmas (corresponding to the individual-centered interpretation
of altruism in ref. 32): (i) irrespective of the own strategy, players
prefer the other group members to cooperate (aft' ≥ aj and bp,. ≥ bj
for allj); (ii) within any mixed group, defectors obtain strictly higher
payoffs than cooperators
> aj for all j); and (iii) mutual co-
operation is favored over mutual defection (an_i> bo). To illustrate
our results, we will discuss two particular examples of multiplayer
games (Fig. 1B). In the first example, the public goods game (33),
cooperators contribute an amount c > 0 to a common pool. knowing
that total contributions are multiplied by r (with 1 <r < it) and evenly
shared among all group members. Thus, a cooperator's payoff is
rc
a = (j + 1)/n — c, whereas defectors yield bj=rcj/n. In the second
example, the volunteer's dilemma (34), at least one group member
has to volunteer to bear a cost c> 0 in order for all group members
to derive a benefit h>c. Therefore, cooperators obtain cif= b —c
(irrespective of j), whereas defectors yield hj = b if j a 1 and bo =0.
Both examples (and many more, such as the collective risk dilemma)
(7, 8, 35) are simple instances of multiplayer social dilemmas.
We assume that the social dilemma is repeated, such that in-
dividuals can react to their coplayers' past actions (for simplicity,
we will focus here on the case of an infinitely repeated game). As
usual, payoffs for the repeated game are defined as the average
payoff that players obtain over all rounds. In general, strategies
for such repeated games can become arbitrarily complex, as
subjects may condition their behavior on past events and on the
round number in nontrivial ways. Nevertheless, as in pairwise
games, ZD strategies turn out to be surprisingly simple.
A
B
3
2
g
C
0
2
4
6
a
10
Number of cooperators
among co-players
a.1
1}.2
....
2
1
0
Cooperators payoff
an-r
arr.2 ... az
as
no
Defectors payoff
bn-s
bn-2
b2
br
bo
2
a l
0
Volunteers Dilemma
Dete c. ‘.
ctor
4
,
b0.0. to 4,
1
.—e—e—crathr
arb—C
0
2
4
6
8
10
Number et cocperabng co-64eyere
Plumber of oocperalfrig co-players
ZD Alliance
Outsiders
1
1
Fig. 1. Illustration of the model assumptions for repeated soda! dilemma (A)
We consider symmetric n-player soda' dilemmas in which each player can either
cooperate or defect The players payoff depends on its own decision and on the
number of other group members who decide to cooperate. (B) We will discuss
two particular examples: the public goods game (in which payoffs are pro-
portional to the number of cooperators) and the volunteers dilemma (as the
most simple example of a nonlinear social dilemma). (C) In adcfrtion to individual
strategies, we will also explore how subjects can enhance their strategic options
by coordinating their play with other group members. We refer to the members
of such a ID alliance as allies, and we call group member that are not part of
the 2D alliance outsiders. Outsiders are not restricted to any particular strategy.
Some or all of the outsiders may even form their own alliance.
Results
Memory-One Strategies and Akin's lemma ZD strategies are
memory-one strategies (23, 36); they only condition their behavior
on the outcome of the previous round. Memory-one strategies can
be written as a vector p= (Pca
Pc o.PnA-4
Poo)- The
entries ps) denote the probability to cooperate in the next round,
given that the player previously played S E {C. O} and that j of the
coplayers cooperated (in the SI Tar, we present an extension in
which players additionally take into account who of the coplayers
cooperated). A simple example of a memory-one strategy is the
strategy Repeat. prim, which simply reiterates the own move of the
previous round, pic.i7 =1 and 147 =0. In addition, memory-one
strategies need to specify a cooperation probability pa for the first
round. However, our results will often be independent of the initial
play, and in that case we will drop Po.
Let us consider a repeated game in which a focal player with
memory-one strategy p interacts with n —1 arbitrary coplayers
(who are not restricted to any particular strategy). Let vs4(r)
denote the probability that the outcome of round t is (S,j). Let
v(0=
(t)
vao(t)] be the vector of these probabilities. A
limit distribution v is a limit point for a' —• co of the sequence
tv(1)+
+v(t)Wr. The entries vs, of such a limit distribution
correspond to the fraction of rounds in which the focal player
finds herself in state (S. j) over the course of the game.
There is a surprisingly powerful relationship between a focal
player's memory-one strategy and the resulting limit distribution
of the iterated game. To show this relationship, let qc(r) be the
probability that the focal player cooperates in round r. By definition
of pRiv we can write qc(r) = pRA° • v(1)=Evcs-i (0+ ... +vco(0).
Similarly, we can express the probability that the focal player
cooperates in the next round as qc(r + I) = p • v(t). It follows that
qc(r +1)— qc(t)=(p— pRc") • v(t). Summing up over all rounds
from 1 to t, and dividing by t. yields (p — pR•17)• iv(I)+
v(r))/r= [qc(r+ I) —qc(1)1/r, which has absolute value at most
IA By taking the limit r
co we can conclude that
(p —pRe0)•v=0.
This relation between a player's memory-one strategy and the
resulting limit distribution will prove to be extremely useful.
Because the importance of Eq. 1 has been first highlighted by Akin
(27) in the context of the pairwise prisoner's dilemma, we will refer
to it as Akin's lemma. We note that Akin's lemma is remarkably
general, because it neither makes any assumptions on the specific
game being played nor does it make any restrictions on the strat-
egies applied by the remaining n —1 group members.
zero
-Determinant Strategies in Multiplayer Social Dilemmas. As an
application of Akin's lemma, we will show in the following that
single players can gain an unexpected amount of control over
the resulting payoffs in a multiplayer social dilemma. To this
end, we first need to introduce some further notation. For
a focal player i, let us write the possible payoffs in a given round
as a vector g = (es), with g'n =a) and eDi =b). Similarly, let us
write the average payoffs ores coplayers as r= (gr), where
the entries are given by k g., =fra) + (n —j-1)br ibl(n — 1) and
gilo =frafri +(n —j —1)64/(n — I). Finally, let 1 denote the 2n-
dimensional vector with all entries being one. Using this notation, we
can write player Ps payoff in the repeated game as x' = g' • v, and the
average payoff of ts coplayers as
=
• v. Moreover, by defini-
tion of v as a limit distribution. it follows that I • v= 1. After these
preparations. let us assume player f applies the memory-one strategy
P=
+4 +//C+71.
[2]
with a, p, and y being parameters that can be chosen by player i
(with the only restriction that p#0). Due to Akin's lemma, we
can conclude that such a player enforces the relationship
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
2 of 6 I www.pnes.orgfcgildoi/10.1073Mnas.1407837 III
Hilbe et al.
EFTA01199748
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
0 = (p - pile/ • v = (cre +fie +71)v =ad +fir' +y.
131
Player i's strategy thus guarantees that the resulting payoffs of
the repeated game obey a linear relationship, irrespective of how
the other group members play. Moreover, by appropriately
choosing the parameters a, ft, and y, the player has direct control
on the form of this payoff relation. As in Press and Dyson (23),
who were first to discover such strategies for the prisoner's di-
lemma, we refer to the memory-one strategies in Eq. 2 as zero-
determinant strategies or ZD strategies.
For our purpose, it will be convenient to proceed with
a slightly different representation of ZD strategies. Using the
parameter transformation 1=-71(a+ fi), s = —al fi, and ¢=—p,
ZD strategies take the form
p =
+ OKI -s)(!1-gi) +
—
[4]
and the enforced payoff relationship according to Eq. 3 becomes
e 1 =ski +(i -s)1.
We refer to 1 as the baseline payoff of the ZD strategy and to s as
the strategy's slope. Both parameters allow an intuitive interpre-
tation: when all players adopt the same ZD strategy p such that
x' =x-', it follows from Eq. 5 that each player yields the payoff 1.
The value of s determines how the mean payoff of the other
group members e' varies with d. The parameter 0 does not
have a direct effect on Eq. 5: however, the magnitude of ¢ de-
termines how fast payoffs converge to this linear payoff relation-
ship as the repeated game proceeds (37).
The parameters 1. s. and efr of a ZD strategy cannot be chosen
arbitrarily, because the entries psi are probabilities that need to
satisfy 0 <psi < 1. In general, the admissible parameters depend
on the specific social dilemma being played. In SI Tier, we show
that exactly those relations 5 can be enforced for which either
s= 1 (in which case the parameter 1 in the definition of ZD
strategies becomes irrelevant) or for which / and s < 1 satisfy
j
—a-_1
—j— I b 41.1
max fb.
</< mM
+n
osisx-i
n —
1 —s
osisn-i
n — 1
—s
161
It follows that feasible baseline payoffs are bounded by the payoffs
for mutual cooperation and mutual defection, b0 s! < a„_i, and that
the slope needs to satisfy —1/(n — 1) <s < 1. With s sufficiently
close to 1, any baseline payoff between bo and a„_ can be achieved.
Moreover, because the conditions in Eq. 6 become increasingly re-
strictive as the group size n increases, larger groups make it more
difficult for players to enforce specific payoff relationships.
Important Examples of ZD Strategist In the following, we discuss
some examples of ZD strategies. At first, let us consider a player
who sets the slope to s = 1. By Eq. 5, such a player enforces the
payoff relation x' =x', such that's payoff matches the average
payoff of the other group members. We call such ZD strategies
fair. As shown in Fig. 24, fair strategies do not ensure that all
group members get the same payoff; due to our definition of
social dilemmas, unconditional defectors always outperform
unconditional cooperators, no matter whether the group also
contains fair players. Instead, fair players can only ensure that
they do not take any unilateral advantage of their peers. Our
characterization 6 implies that all social dilemmas permit a
player to be fair, irrespective of the group size. As an example,
consider the strategy proportional lit -for-Tat (pTFT), for which
the probability to cooperate is simply given by the fraction of
cooperators among the coplayers in the previous round
pTFTs- = n —I
[71
For pairwise games, this definition of pTFT simplifies to Tit-for-
Tat, which is a fair ZD strategy (23). However, also for the public
goods game and for the volunteer's dilemma, pTFT is a ZD
strategy, because it can be obtained from Eq. 4 by setting s= 1
and ¢=1/c, with c being the cost of cooperation.
As another interesting subclass of ZD strategies, let us con-
sider strategies that choose the mutual defection payoff as
baseline payoff, 1=60, and that enforce a positive slope 0 <s < 1.
The enforced payoff relation 5 becomes se"' =sx' + (1 —s)bo, im-
plying that on average the other group members only get
a fraction s of any surplus over the mutual defection payoff.
Moreover, as the slope s is positive, the payoffs x' and le are
positively related. As a consequence, the collective best reply for
the remaining group members is to maximize i's payoffs by
cooperating in every round. In analogy to Press and Dyson (23),
we call such ZD strategies extortionate, and we call the quantity
x= 1/s the extortion factor. For games in which 1=14=0, Eq. 5
shows that the extortion factor can be written as x = Je/x-I. Large
extortion factors thus signal a substantial inequality in favor of
player i. Extortionate strategies are particularly powerful in so-
cial dilemmas in which mutual defection leads to the lowest
group payoff (as in the public goods game and in the volunteer's
dilemma). In that case, they enforce the relation Ki > cc; on
average, player i performs at least as well as the other group
members (as also depicted in Fig. 2B). As an example, let us
consider a public goods game and a Z1D strategy pEr with 1=0,
=n IK" —r)sc+rcl. for which Eq. 4 implies
- I [i (1 scir+t -Ir)si.
n - I
Ig]
independent of the players own move Se {C.D}. In the limit
s I, pa approaches the fair strategypTFT. Ass decreases from
1, the cooperation probabilities of ey are increasingly biased to
the own advantage. Extortionate strategies exist for all social
dilemmas (this follows from condition [6] by setting 1=b0 and
choosing an s close to 1). However, larger groups make extor-
tion more difficult. For example, in public goods games with
n > r fir — I), players cannot be arbitrarily extortionate any longer
as [6] implies that there is an upper bound on,' (SI Tea).
As the benevolent counterpart to extortioners, Stewart and
350
Plotkin described a set of generous strategies for the iterated
351
prisoner's dilemma (24, 28). Generous players set the baseline
352
payoff to the mutual cooperation payoff I= a„._i while still
enforcing a positive slope 0<s < 1. This results in the payoff con
353
relation c' =so' + (1 —5)0„_1. In particular, for games in which
354
mutual cooperation is the optimal outcome for the group (as in
the public goods game and in the prisoner's dilemma but not in
the volunteer's dilemma), the payoff of a generous player sat-
isfies x' <
(Fig. 2C). For the example of a public goods game,
we obtain a generous ZD strategy p' by setting 1=rc—c and
ep
—r)sc +rc], such that
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
1
Ge
I
„
ti
1 n(r —1)
Ps n
+ (I s) n —I r+(n—r)s.
Ig1
For s
1, ptt approaches the fair strategy pTFT, whereas lower
values of s make pcw more cooperative. Again, such generous
strategies exist for all social dilemmas, but the extent to which
players can be generous depends on the particular social di-
lemma and on the size of the group.
As a last interesting class of ZD strategies, let us consider players
who chooses =0. By Eq. 5, such players enforce the payoff relation
x'=1, meaning that they have unilateral control over the mean
payoff of the other group members (for the prisoner's dilemma,
such equalizer strategies were first discovered in ref. 38). However,
345
346
347
348
349
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
Hilbe et el.
PNAS lefty Edition I 3 o16
EFTA01199749
373
374
375
376
377
378
379
380 Q:26
the other group members. This does not imply that
o.s
OA
groupmembers yield
0.
381
all other
the same payoff. (8)
D
20
40
60
DO
IDO
40
BO
0
20
BO
100
D
20
40
60
HO
IGO
Flotnd Number
Round
Round Number
For games in which mutual defection leads to the
382
lowest group payoff, extortionate players ensure that their payoffs are above average. (C) In games in which mutual cooperation is the social optimum,
383
generous players let their coplayers gain higher payoffs. The three graphs depict the case of a public goods game with r =4, c =1, and group size n=20. For
384
the strategies of the other group members, we used random memory-one strategies, where the cooperation probabilities were independently drawn from
385
a uniform distribution. For the strategies of the focal player, we used (A)pift (8) p” with s =0.8, and (C) pc* with s =0.8.
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
1101
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
Fig. 2. Characteristic dynamics of payoffs over the
course of the game for three different 2D strategies.
Each panel depicts the payoff of the focal player fr'
(blue) and the average payoff of the other group
members g-' (red) by thick lines. Additionally, the
individual payoffs of the other group members are
shown as thin red lines. (A) A fair player ensures
that the own payoff matches the mean payoff of
A
2.5
gists
. as
FEW strategy
Focal
13,
paysasi
c
Mee
OretiOMeoters
B
2_5
ExtallCilete Strategy
Focal
01aYOr
°0e(
, I
gray marrtors
C
2.5
Gene/00$ SeategY
Other
*9v members
.
'Sc
Focal
Fiery
unlike extortionate and generous strategies, equalizer strategies
typically cease to exist once the group size exceeds a critical
threshold. For the example of a public goods game this thresh-
old is given by n =211(r — 1). For larger groups, single players
cannot determine the mean payoff of their peers any longer.
Stable Cooperation In Multipbyer Sodal Dilemmas. Let us next ex-
plore which ZU strategies give rise to a Nash equilibrium with
stable cooperation. In S/ Text, we prove that such ZD strategies
need to have two properties: they need to be generous (by setting
/ =a„_i and s> 0), but they must not be too generous [the slope
needs to satisfy s ≥ (n — 2)/(n —1)1. In particular, whereas in the
repeated prisoner's dilemma any generous strategy with s> 0 is
a Nash equilibrium (27, 28), larger group sizes make it increasingly
difficult to uphold cooperation. In the limit of infinitely large
groups, it follows that s needs to approach 1, suggesting that ZD
strategies need to become fair. For the public goods game. this
implies that stable cooperation can always be achieved when
players cooperate in the first round and adopt proportional Tit-
for-Tat thereafter. Interestingly, this strategy has received little
attention in the previous literature. Instead, researchers have fo-
cused on other generalized versions of Tit-for-Tat, which oo-
operate if at least k coplayers cooperated in the previous round (4,
39, 40). Such memory-one strategies take the form psi =0 if j <k
and psi = 1 if ≥k. Unlike pTFT, these threshold strategies nei-
ther enforce a linear relation between payoffs, nor do they induce
fair outcomes, suggesting that pTFT may be the more natural
generalization of Tit-for-Tat in large-scale social dilemmas.
In addition to the stable ZD strategies, Akin's lemma also
allows us to characterize all pure memory-one strategies that
sustain mutual cooperation. In SI Text, we show that any such
strategy p needs to satisfy the following four conditions
an-i —ao
pea-I=1. PC A-2= 0, pal S
and
an-i —00
PODS bn_i — a „..1'
with no restrictions being imposed on the other envies psi. The
first conditionpe„_i = 1 ensures that individuals continue to play C
after mutual cooperation; the second condition pc,,- 2 = 0 guaran-
tees that any unilateral deviation is punished; and the last two
conditions describe whether players are allowed to revert to co-
operation after rounds with almost uniform defection. Surprisingly,
only these last two conditions depend on the specific payoffs of the
social dilemma As an application, condition 10 imply that the
threshold variants of Tit-for-Tat discussed above are only a Nash
equilibrium if they use the most stringent threshold: k =n —1. Such
unforgiving strategies, however, have the disadvantage that they are
often susceptible to errors: already a small probability that players
fail to cooperate may cause a complete breakdown of cooperation
(41). Instead, the stochastic simulations by Hauert and Schuster (5)
showed that successful strategies tend to cooperate after mutual
cooperation and after mutual defection lie., Pea-t =Pao = 1 and
psi =0 for all other states (S.j)]. We refer to such a behavior as
WSLS, because for painvise dilemmas it corresponds to the Win-
Stay, Lose-Shift strategy described by ref. 36. Because of condition
1101, WSLS is a Nash equilibrium if and only if the social dilemma
satisfies (6„_1 +b0)12sa„_i. For the example of a public goods
game, this condition simplifies to r> 2n1(n + 1). which is always
fulfilled for r≥ 2. For social dilemmas that meet this condition,
WSLS provides a stable route to cooperation that is robust to errors.
Zero-Determinant Alliances. In agreement with most of the theo-
retical literature on repeated social dilemmas, our previous
analysis is based on the assumption that individuals act in-
dependently. As a result, we observed that a player's strategic
options typically diminish with group size. As a countermeasure,
subjects may try to gain strategic power by coordinating their
strategies with others. In the following, we thus extend our the-
ory of ZD strategies for single individuals to subgroups of play-
ers. We refer to these subgroups as ZD alliances. Because the
strategic power of ZD alliances is likely to depend on the exact
mode of coordination between the allies, we consider two dif-
ferent models: when subjects form a strategy alliance, they only
agree on the set of alliance members and on a common ZD
strategy that each ally independently applies. During the actual
game, there is no further communication between the allies.
Strategy alliances can thus be seen as a boundary case of co-
ordinated play, which requires a minimum amount of coordi-
nation. Alternatively, we also analyze synchronized alliances, in
which all allies synchronize their actions in each round (i.e., the
allies cooperate collectively, or they defect collectively). In ef-
fect, such a synchronized alliance thus behaves like a new entity
that has a higher leverage than each player individually. Syn-
chronized alliances thus may be considered as a boundary case of
coordinated play that requires substantial coordination.
To model strategy alliances, let us consider a group of is4
allies, with 1 <nA cu. We assume that all allies make a binding
agreement that they will play according to the same ZD strategy
p during the repeated game. Because the ZD strategy needs to
allow allies to differentiate between the actions of the other allies
and the outsiders, we need to consider a more general state space
than before. The state space now takes the form (S.P.j-4). The
first entry S corresponds to the focal player's own play in the pre-
vious round, j4 gives the number of cooperators among the other
allies, andsA is the number of cooperators among the outsider& A
memory-one strategy p again needs to specify a cooperation prob-
abilitypsiAtA for each of the passible states. Using this state space,
we can define ZD strategies for a player i in a strategy alliances as
p= pReP+Okl —sX/1 — g9+ g —(n4 — l)wAgi —(n —
Ill]
The vector r 4 contains the average payoff of the other allies for
each possible state, and g-4 is the corresponding vector for the
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
4 of 6 I www.poes.orgicgikloii10.10TYpnas.140T88211t
Hilbe et al.
EFTA01199750
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
outsiders. The weights w-4 ≥ 0 and iv4 ≥ 0 are additional para-
meters that determine the relative importance of outsiders
and other allies, being subject to the constraint (n4 -1)w4+
(n -nA)w-A = 1. In the special case of a single player forming
an alliance, nA = 1, this guarantees that the two definitions of
ZD strategies 4 and 11 are equivalent.
Similarly to the case of single individuals, we can apply Akin's
lemma to show that strategy alliances enforce a linear relation-
ship between their own mean payoff ? and the mean payoff of
the outsiders rir4 (for details, see SI Text)
x-4 =eV + (1 - sit
112]
where the slope of the alliance is given by s4 =
- (nA - I )w4]/
[1 — (n4 - I)r 4]. A strategy alliance can enforce exactly those
payoff relationships 12 for which either el = 1 or for which I
and sA c 1 satisfy the conditions
max a( b, — -A
1 -3A
bt —
≤/≤
min
+n —j— 1
—a,
114-1Q9I-I
n —nA 1 -34 '
113)
Interestingly, to reach this strategic power, an alliance needs to
put a higher weight on the within-alliance payoffs (i.e., w4 needs
to exceed ;1r4; SI Tett), such that the allies are stronger affected
by what the other allies do, as opposed to the actions of the
outsiders. For single player alliances, n4 = 1. condition 13 again
simplifies to the previous condition 6. However, as the alliance
size nA increases, condition 13 becomes easier to satisfy. Larger
alliances can therefore enforce more extreme payoff relationships.
For the example of a public goods game, we noted that single
players cannot be arbitrarily extortionate when n > r I (r - 1). Alli-
ances, on the other hand, only need to be sufficiently large.
nA Ina (r -1)1r. Once an alliance has this critical mass, there are
no bounds to extortion.
In a similar way, we can also analyze the strategic possibilities
of a synchronized alliance. Because synchronized alliances act as
a single entity, they transform the symmetric social dilemma
between n independent players to an asymmetric game between
n —n4 + 1 independent players. From the perspective of the al-
liance, the state space now takes the form (S.j), where Se (C. Et)
is the common action of all allies and where 0 ≤j ≤n — via is the
number of cooperators among the outsiders. ZD strategies for
the synchronized alliance can be defined analogously to ZD
strategies for single players
p = pReP + 0[(1
g-4) +g4 —
1141
with le being the payoff vector for the allies and g-4 being the
payoff vector of the outsiders. For a single player alliance, n4 = 1,
this again reproduces the definition of ZD strategies in 4. By
applying Akin's lemma to Eq. 14, we conclude that synchronized
alliances enforce sr-A =SAKA +(I —SA)!, which is the same as re-
lationship 12 for strategy alliances. Surprisingly, we even find that
for reasonable alliance sizes, nA S n/2, strategy alliances and syn-
chronized alliances have the same set of enforceable parameters I
and ?, as given by Eq. 13 (see SI Tea for details). Thus, for the
two models of ZD alliances considered here, the exact mode of
coordination is irrelevant for the alliance's strategic power unless
the alliance has reached a substantial size.
Table 1 gives an overview of our findings on ZD strategies and
ZD alliances in multiplayer social dilemmas. It shows that, al-
though generally, ZD strategies exist for all group sizes, the power
of single players to enforce particular outcomes typically dimin-
ishes or disappears in large groups. Forming ZD alliances then
allows players to increase their strategic scope. The impact of
a given ZD alliance, however, depends on the specific social di-
lemma: although ZD alliances can become arbitrarily powerful in
public goods games, their strategic options remain limited in the
volunteer's dilemma.
Discussion
When Press and Dyson (23) discovered the new class of ZD
strategies for the repeated prisoner's dilemma, this came as a big
surprise (24, 25): after more than five decades of research, it
seemed unlikely that any major property of the prisoner's di-
lemma has been overlooked. For repeated multiplayer dilemmas
the situation is different. Although various Folk theorems guar-
567
antee that cooperation is also feasible in large groups (42, 43),
568
there has been considerably less theoretical research on the
569
evolution of cooperation in repeated multiplayer dilemmas (4, 5,
570
39, 40). This may be due to the higher complexity: the mathe-
571
matics of repeated n-player dilemmas seems to be more intricate,
and numerical investigations are impeded because the time to
572
compute payoffs increases exponentially in the number of play-
573
ers (5). Nevertheless, we showed here that many of the results for
574
the repeated prisoner's dilemma can be directly transferred to
575
general social dilemmas, with an arbitrary number of involved
576
subjects. The foundation for this progress is a new framework, ()Az
577
provided by Akin's lemma and the theory of Press and Dyson.
578
Using this framework. we extended the theory of repeated
multiplayer dilemmas into three direction& The first and most
immediate direction is our finding that ZD strategies exist in all
social dilemmas. These strategies allow players to unilaterally dic-
tate linear payoff relations, irrespective of the specific social di-
lemma being played, irrespective of the group size, and irrespective
of the counter measures taken by the other group members. In
particular. we showed that any social dilemma allows players to be
fair, extortionate, or generous. Each of these strategy classes has its
own particular strengths: extortionate strategies give a player a rel-
ative advantage compared with the other group members; fair
strategies help to avoid further inequality within a group; and
generous strategies allow players to revers to mutual cooperation
when a coplayer defected by accident. At the same time, ZD
strategies are remarkably simple. For example, to be fair in
a public goods game, players only need to apply a rule called
proportional Tit-for-Tat: if j of the n —1 other group members
cooperated in the previous round, then cooperate with prob-
ability j/(tt — 1) in the following round. Extortionate and gen-
erous strategies can be obtained in a similar way, by slightly
modifying pTFT to the own advantage or to the advantage of
the others.
As the second direction, we explored which ZD strategies and
which pure memory-one strategies can be used to sustain co-
operation in multiplayer dilemmas. Among ZD strategies, such
strategies need to be generous (such that players never try to
outperform their peers) (27, 28), but at the same time they must
not be too generous. The right degree of generosity depends on the
size of the group but not on the specific social dilemma being
played. As a rule of thumb, we obtain that in larger groups, subjects
are required to show less generosity.
As the last direction, we extended the concept of zero-
determinant strategies from single players to subgroups of players,
to which we refer to as ZD alliances. Depending on the degree of
coordination, we explored two forms of ZD alliances: members
of a strategy alliance only agree on using a common ZD strategy
during the game, but they do not coordinate each of their
decisions; members of a synchronized alliance, on the other
hand, act as a single entiryt—they either all cooperate or they all
defect in a given round. The effect of such ZD alliances depends
on the size of the alliance, the applied strategy, and the prop-
erties of the underlying social dilemma. In general, we find that
by coordinating their play with others, subjects can increase their
strategic options considerably. The exact mode of coordination,
however, only turns out to play a minor role: As long as the size
of the ZD alliance is below half the group size, strategy alliances
and synchronized alliances have the same strategic power. In ad-
dition to their static properties, ZD strategies for the prisoner's
dilemma also have a remarkable dynamic component (23, 44):
559
560
561
562
563
564
565
566
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
Mix et el.
PNAS Early Edition I 5 ed 6
EFTA01199751
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
Q:13
659
660
661
662
663
664 Q:14
665
666
667
668
669
Q:15
670
671
672
673
674 Qit;
675
676
677
678
679
680
681
682
Table 1. Strategic power of different ZD strategies for three different social dilemmas
Strategy
Typical
Prisoner's
dass
property
dilemma
Public goods game
Volunteer's dilemma
Fair strategies
Extortionate
strategies
Generous
strategies
Equalizers
A A =A
>,r 4
eA = i
Always exist
Always exist
Always exist
Always exist
Always exist
In large groups, single players cannot be
arbitrarily extortionate, but sufficiently large
ZD alliances can be arbitrarily extortionate
In large groups, single players cannot be
arbitrarily generous, but sufficiently large ZD
alliances can be arbitrarily generous
May not be feasible for single players, but is
always feasible for sufficiently large ZD alliances
Always exist
Even large ZD alliances cannot be
arbitrarily extortionate
Do not ensure that own payoff is
below average
Only feasible if size of ZD alliance is
nA = n -1, can only enforce 1=b —c
Analogously to the case of individual players, 20 alliances are fair when they set ? = 1; they are extortionate when 1=130 and 0<s-4 <1; they are generous
for i=a,,_i and 0 <54 <1; and they are equalizers when ? =0. For each of the three considered social dilemmas, we explore whether a given ZD strategy is
feasible by examining the respective conditions in ref. 13. In the repeated prisoner's dilemma, single players can exert all strategic behaviors (23, 28, 29). Other
social dilemmas either require players to form alliances to gain sufficient control (as in the public goods game), or they only allow for limited forms of control
(as in the volunteer's dilemma). These results hold both for strategy alliances and for synchronized alliances.
when a player commits himself to an extortionate ZD strategy,
then adapting coplayers team to cooperate over time. Numerical
simulations in the SI show an analogous result for multiplayer
dilemmas: when ZD alliances apply strategies with a positive
slope, they can trigger a positive group dynamics among the out-
siders. The magnitude of this dynamic effect again depends on the
size of the alliance, and on the applied strategy of the allies.
Here, we focused on ZD strategies; but the toolbox that we
apply (in particular Akin's lemma) is more general. As an ex-
ample, we identified all pure memory-one strategies that allow
for stable cooperation, including the champion of the repeated
prisoner's dilemma, Win-Stay Lose-Shift (36,45). We expect that
1. Hardin G (1968) The tragedy of the commons. The populationproblem has notechnical
solution; it requite fundamental extension in morality. Science 162(3315101243-1248.
2. Olson M(1971) The Logic or Collective Action: Public Goods and the Theory Ol Groups
(Harvard Univ Press, Cambridge, 6,M), Revised Ed.
3. Nowak MA (2006) Five rules for the evolution of cooperation. Science 314(6805k
1560-1563.
4. Boyd R. Rkherson P) 41988) The evolution of reciprocity in sizable groups. r Theo, Dial
132(3):337-356.
5. Hauert C, Schuette' HG (1997) Effects of increasing the number of players and memory
size in the iterated prisoners dilemma A numerical approads. Prue Not Sci 264:513-51g
6. Suzuki 5, Akyama E (2037)6w:take of indirect reciprocity in groups of various sizes
and comparison with direct reciprocity. l Theo, CON 245(3)539-552.
7. Santos FC, Padsoto 1M 42011) Risk of collective failure provides an escape from the
tragedy of the commons. Prot Nate Arad Sc) USA 108(26):10421-10425.
8. Amu Chakra M. Traulsen A (2012) Evolutionary dynamics of strategic behavior h
a colkctive-risk dilemma. PLOS Comput Biol8(8)z1002652.
9. Yang Vi, et al. (2013) Nonlinear effects of group size oncollective action and resource
outcomes. hoc Ned Aced Sri USA 110(27):10916-10921.
10. Trivets RL (1971) The ay.:Mien of reciprocal altruism. C) Rev BM:4635-57.
Axelrod R 11984) The Evolution of Cooperation Maw Books. New York).
12. Alexander R (1987) The Biology of Moral Systems (Aldine de Gruyter, New York).
13. Wedekind C. Milinski M (2000) Cooperation through image goring in humans, Sci-
ence 288(5467)350-852.
14. Henrichl, et al. (2001) Cooperation, reciprocity and punishment in fifteen small scale
societies. Ans Eton Rev 91:73-78.
IS. Nowak MA Sigmund K (2E05) Evolution of irarect redprocity. Nature 437(70641291-1298.
16. van Veelen M, Garcia 1, Rand DG, Nowak MA (2012) Direct reciprocity in structured
POPulatiOns. hoc Had AOC( SO USA I09(25):9929-9934.
17. 2agonky BM. Reiter 16, Chatterjee K. Nowak MA (2013) Forgiver triumphs in atter-
nating Prisoner's Dilemma. PLOS ONE 8412):e80814.
IS. YamagisN T (1986) The provision of a sanctioning system as a public good Jeers Sou
Psycho? 51:110-116.
19. Ostrom E 41990) Governing the Commons: The Evolution or Mstitutions for Collective
Action (Cambridge Univ Press, Cambridge, UK).
20. Sigmund K, De Silva H, Traulsen A, Hauert C (2010) Social learning promotes in-
stitutions for governing the commons. Nature 466(73003614363.
21. Guala F (2012) Reciprocity: Weak or strong? What punishment experiments do (and
do nod demonstrate. Behav Drain SO 35(0:1-15.
22. Hite C. Traulsen A. RON Tv Milinski M (2014) Democratic decisions establish stable
authorities that overcame the paradox of second.order punishment. Proc Natl Aced
So USA 111(2):752-756.
23. Press W14, Dyson Fl (2012) Iterated Prisoner's Dilemma contains strategies that
dominate any evolutionary opponent. Proc Natl Aced Sri USA 109(26)10609-10413.
them will be further applications of Aldn's lemma to come. Such
applications may include, for instance, a characterization of all
Nash equilibria among the stochastic memory-one strategies or
an analysis of how alliances are formed and whether evolutionary
forces favor particular alliances over others (46, 47).
Overall, our results reveal how single players in multiplayer
games can increase their control by choosing the right strategies
and how they can increase their strategic options by joining
forces with others.
ACKNOWLEDGMENTS. C.H. gratefully acknowledges generous funding by
the SchrOdinger scholarship of the Austrian Science Fund 03475).
24. Stewart Al. Plotkin 18 (2012) Extortion and cooperation in the prisoners dilemma.
Proc Had had Sd USA (09(26):10134-10135.
25. Ball (2012) PhysicistS suggest selfishness can par' Nature.
0:17
26. Hike C Novak MA, Sigmund K (2013) The evolution of extortion in iterated pris-
oners dilemma games. Proc Natl Aced Sci USA 110(17)£9134918.
27. Akin E (2013) Stable cooperative solutions for the iterated prisoners diemma.
0:18
28. Stewart Al. Plotkin Rs (2013) From extortion to generosity, evolution In the Iterated
Prisoners Dilemma. Proc Natl Arad Sc) USA 110438):15348-15353.
29. Hite C. Nowak MA, Traulsen A (2013) Adaptive dynamics of extortion and come&
ante. PLOS ONE 801):e77886.
30. Szolnoki A, Pere M (2014) Evolution of extortion in structured populations. Phys Rev E
Rat NOnlin Soft Matter PhyS 89:022804.
ct:19
31. Pan L, Hao 0, Rang Z Zhou (2014)2ero-determinant strategies in the iterated public
goods game.
32. Kerr B, G.:dirty-Smith P, Feldman MW (2004) What is altruism? Trends Er& Evoll9(3):
135-140.
33. Ledyard JO (1995) Public goods: A suivey of experimental research. The Handbook of
Experimental Economia, eds Kagel JR Roth AE (Princeton Univ Press, Princeton).
Q:21
34. Olekmann A (1985) VOlunteerS dilemma. I Contact &whit 29:605-610.
ctaz
35. Milinski M, Sommerfeld RD, Kra beck II-1, Reed FA Marotthe 1 (2008) the collective-
risk social dilemma and the prevention of simulated dangerous climate change, Par
Had Arad Sul LISA 105(71:2291-2294.
36. Nowak M. Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-
for-tat in the Prisoner's Dilemma game. Nature 36404321:56-58.
37. Mite C, ROM T, Milinski M (2014) Extortion subdues human players but is finally
Punished In the prisoners dilemma. Nat Common 5:3976.
31. Boedijst MC, Nowak MA Sigmund K (1997) Equal pay for all Prisoners-Am Math Mon
104303-307.
0:23
39. KurOkawa 5, Ihara Y (2009) Emergence of cooperation in public goods games. floc
Biel Sri 276(1660)1379-1384.
40. Van seeoreed 5. Pacheco 10.4 Lenaerts T. Santos FC (2012) Emergence of falmess in
repeated group interactions. Phys Rev Lett 10/305k158104.
41. Sigmund K (2010) The Calculus of Selfishness (Princeton Vniv Press, Princeton).
42. Friedman 1 (1971) A non-cooperative equiltdurn for supergames. Rev Eton Stud 38:1-12.0:24
43. Fudenberg D. Tirok 10998) Game Theory (MIT Press. cancnoee. MA). 6M Ed.
44. Own 1, Zinger A (2014) The robustness Of zero-ckterminant strategies in iterated
Prisoners Dilemma games. / Theo, OM 357:46-54.
45. Kraires DP, KrainesVY (1989) Pavlov and the prisoners dilemma TheogrDecis 28.47-79.0:B
46. Peleg B, Sudhalter P (2003)4160dt/et/on to the Theory of Cooperathe Games (Springer,
Berlin).
47. mesterton.ciatons M, Gavrilets S. Gravner 1, Akcay E (2011) Models of coalition or
alliance format:kn..; Meer Biel 274(0:187-204.
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
6 of 6 I www.pnaS.OrgfCgild0i/10.1073/pnaS.1407887 1 1
Hilbe et al.
EFTA01199752
AUTHOR QUERIES
AUTHOR PLEASE ANSWER ALL QUERIES
1
Q: 1_Please contact [email protected] if you have questions about the editorial
changes, this list of queries, or the figures in your article. Please include your manuscript number in
the subject line of all email correspondence; your manuscript number is 201407887.
Q: 2_Please (i) review the author affiliation and footnote symbols carefully, (ii) check the order of the
author names, and (iii) check the spelling of all author names, initials, and affiliations. Please check
with your coauthors about how they want their names and affiliations to appear. To confirm that the
author and affiliation lines are correct, add the comment "OK" next to the author line. This is your
final opportunity to correct any errors prior to publication. Misspelled names or missing initials will
affect an author's searchability. Once a manuscript publishes online, any corrections (if approved)
will require publishing an erratum; there is a processing fee for approved erratum.
Q: 3_Please review and confirm your approval of the short title: Cooperation and control in multiplayer
dilemmas. If you with to make further changes, please adhere to the 50-character limit. (NOTE: The
short title is used only for the mobile app and the RSS feed.)
Q: 4_Please review the information in the author contribution footnote carefully. Please make sure that the
information is correct and that the correct author initials are listed. Note that the order of author
initials matches the order of the author line per journal style. You may add contributions to the list in
the footnote; however, funding should not be an author's only contribution to the work.
Q: 5_You have chosen the open access option for your paper and have agreed to pay an additional $1350
(or $1000 if your institution has a site license). Please confirm this is correct and note your approval
in the margin.
Q: 6_Please verify that all supporting information (SI) citations are correct. Note, however, that the
hyperlinks for SI citations will not work until the article is published online. In addition, SI that is
not composed in the main SI PDF (appendices, datasets, movies, and "Other Supporting Information
Files") have not been changed from your originally submitted file and so are not included in this set
of proofs. The proofs for any composed portion of your SI are included in this proof as subsequent
pages following the last page of the main text. If you did not receive the proofs for your SI, please
contact [email protected].
Q: 7_Please check the order of your keywords and approve or reorder them as necessary.
Q: 8_If your article contains links to websites (other than the SI links for your article), please verify that
the links are valid and will direct readers to the proper webpage.
Q: 9_Please provide department/section for affiliation a.
Q: 10_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
`This": "This results in the ...."
EFTA01199753
AUTHOR QUERIES
AUTHOR PLEASE ANSWER ALL QUERIES
2
Q: 11_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
`This": "This may be due of ...."
Q: 12_PNAS discourages claims of priority; is this truly new? If not, please either (a) replace the term new
with a term such as "previously unidentified" or (b) remove it altogether to avoid the claim of
priority.
Q: 13_For ref. 5, please provide issue number.
Q: 14_For ref. 10, please provide issue number.
Q: 15_For ref. 14, please provide issue number.
Q: 16_For ref. 18, please provide issue number.
Q: 17_For ref. 25, please provide volume, issue, and page range.
Q: 18_For ref. 27, please provide journal title, volume, issue, and page range.
Q: 19_For ref. 30, please provide issue number.
Q: 20_For ref. 31, please provide journal title, volume, issue, and page range.
Q: 21_For ref. 33, please provide page range.
Q: 22_For ref. 34, please provide issue number.
Q: 23_For ref. 38, please provide issue number.
Q: 24_For ref 42, please provide issue number.
Q: 25_For ref. 45, please provide issue number.
Q: 26_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
`This": "This does not imply that ...."
EFTA01199754
6
Supporting Information
64
3
6',
4
66
Hilbe et al. 10.1073/pnas.1407887111
5
67
6
SI Text
68
7
In the following. we show how the mathematical framework introduced by Press and Dyson (I) and Akin (2) can be extended to explore
69
8
cooperation and control in multiplayer social dilemmas. We begin by defining the setup of repeated social dilemmas, and then we discuss the
70
9
existence and the properties of zero-determinant strategies. In particular, we study ZD strategies that allow a player to differentiate between
71
10
the actions of different coplayers. We also identify strategies that give rise to stable cooperation. To this end, we focus on two strategy classes:
72
11
ZD strategies and pure memory-one strategies. Then, we investigate how individuals can extend their strategic options by coordinating their
73
12
behaviors with others, and we apply our results to two examples of multiplayer dilemmas: the public goods game and the volunteer's di-
74
13
lemma. The appendix contains the proofs for our propositions.
75
14
76
Setup of the Model: Repeated Multiplayer Dilemmas. We consider repeated social dilemmas between n players (as illustrated in Fig. 1). In
15
each round, players may either cooperate (C) or defect (D), and the players payoffs for each round depend on their own action and on the
77
16
number of cooperators among the other group members. Specifically, in a group with/ other cooperators, a cooperator receives the payoff
78
17
ai, whereas a defector obtains b1. To qualify as a social dilemma, we assume that one-shot payoffs satisfy the following three conditions (3):
79
18
i) Independent of the own action, players prefer their coplayers to be cooperative
80
19
81
/0
8/
21
ar t a ag and btu a Di for all/with 0 <j< n —1.
[S1]
83
22
84
23
85
24
ii) Within each mixed group, defectors strictly outperform cooperators
86
15
87
/6
88
1g
KI
or all4with 0 Si <n —1.
[Sy]
89
90
iii) Mutual cooperation is preferred over metal defection %yr'
root
17
/9
91
30
92
31
en- t > ba•
[S3]
93
32
94
33
As particular examples of such social dilemmas, we discuss the linear public goods game (4) and the volunteer's dilemma (5) later.
95
M Ro
We will assume here that the social dilemma is repeated infinitely often. This is merely done for simplicity of the argument; similar
96
35
results can be obtained for finitely many rounds or for games with discounting of the future as in refs. 6 and 7. In repeated games,
97
36
a player's strategy needs to specify how to act in given round, depending on the outcomes of the previous rounds. Given the strategies
98
of all group members, let us denote player is expected payoff in round t as ai(r). The payoffs for the repeated game are defined as the
37
99
average payoff per round
38
100
1 T
39
101
= lim
CO T EAT).
[S4]
40
d
r -*
102
ml
41
103
42 oa
In the following, we will assume that these limits exist. This always holds, for example, when players only base their decisions on a finite
104
43
(but arbitrarily large) number of past rounds.
105
44
106
45
Zero-Determinant Strategies for Multiplayer Dilemmas.
107
46
Memory.one strategies and Akin's lemma. Although in general, strategies for repeated games can be arbitrarily complicated, we showed that
108
players can achieve a remarkable control over the possible payoff relations by referring to the outcome of the last round only. In particular, we
47
109
focused on players who only consider their own move in the previous round and the number of cooperators in the previous round (this is
48
a consequence of our assumption that the game is symmetric, such that payoffs do not depend on who of the coplayers cooperated, but only on
110
49
1 l l
how many). Such strategies are particularly relevant when players can only observe the outcome of the game, but not the coplayers' individual
50
actions. In the context of alliances, however, it is useful to consider a slightly more general strategy set, which allows players to distinguish
112
51
between different coplayers. In this section, we will therefore develop a more general theory of ZD strategies.
113
52
To this end, let us denote player i's action in a given round as 51e {C,D}, and let a= (S1
Sa) e (C.Dr denote the overall
114
53
outcome of that round. A memory-one strategy is a rule that tells a player what to do in the next round, given the outcome of the
115
54
previous round. Formally, such memory-one strategies correspond to a map that takes the outcome of the previous round a as input
116
55
and that returns the cooperation probability p,, for the next round, p= (ps)„,ic..Dr. For example, player i's strategy Repeat, which
117
56
simply reiterates the own move from the previous round, takes the form p
P with
118
58
'Ns,.
=
so
0 if Si =D .
gel,
{1 if S, =C
[SS]
119
120
57
59
121
60
Additionally a memory-one strategy also needs to specify a cooperation probability for the first round of the game. Because the outcome
122
61
of infinitely repeated games is often independent of the first round, the initial cooperation probability is typically neglected (I, 2, 8-13).
123
62
In the following, we will therefore only specify a player's initial cooperation probability when necessary.
124
Hite et el. www.pnes.orgAglkontentishort/1407811711t
1 of 19
EFTA01199755
125
If all members of the group use memory strategies, the calculation of payoffs according to Eq. S4 becomes particularly simple. In that
187
126
case, the repeated game can be described as a Markov chain, as the outcome in the next round only depends on the outcome of the
188
127
previous round (14-16). Although the assumption of memory-one strategies often simplifies calculations, we will show below that the
189
128
properties of ZD strategies hold irrespective of the strategies of the other group members (in particular, ZD strategists do not require
190
129
their coplayers to apply memory-one strategies).
191
130
For a game between n players with arbitrary but fixed strategies, let ve(t) be the probability that the outcome of the rth round is a and
192
131
let v(t) = [v"(t)]eelcol^ be the vector of these probabilities. For example, for pairwise games v(t) becomes kat). vcn(r), vrac(0.van(01.
193
132
A limit distribution visa limit point for t-. co of the sequence (v(I) + ...v(t)]/t. The entries of v. of such a limit distribution cor-
194
133
respond to the fraction of rounds in which the group members find themselves in state a e {CO}" over the course of the game. If one
195
134
of the players applies a memory-one strategy p, Akin's lemma again guarantees that there is a powerful relationship between p and v
196
135
(which can be shown with literally the same proof as in the main text).
197
136
198
137
Lonna (Akin's Lemma). Suppose the focal player applies an athitraty memory-one strategy p. Then, for any limiting distribution v (irrespective
199
138
of the outcome of the initial round), we have
200
139
201
140
(p — pRel
• v = 0,
IS6]
202
141
203
142
where the product refers to the usual scalar product, p•v= r__
,._,...(arpos•
204
143
We note that Akin's lemma makes no assumptions on the payoff structure of the game (in particular it also applies to games
205
144
that do not have the form of a social dilemma). Moreover, there are no restrictions on the strategies applied by the remaining
206
145
n —1 group members.
207
146
Zero-detenninant strategies. To define zero-determinant strategies, let us first introduce some further notation. For a given round outcome
208
14Pt3
GE {C. D}". let lad denote the total number of cooperators (i.e., 'al equals to the number of Cs in a). This allows us to write player i's
209
148
payoff g, for that round as
210
149
211
150
IS7]
212
egl. .S.) = { aro; I if :St —=DC.
151
213
152
2
Let g' = &e,),„.(cmr be the corresponding payoff vector. Using this notation, we can write player i's payoff in round t as ne(t) = g • v(t),
153
21514
and i's expected payoff for the repeated social dilemma according to Eq. S4 becomes d= g' • v. Finally, let 1 = (1).,Icar denote the
154
216
vector with all entries being equal to one. By definition of v, it follows that 1. v= I. We can now introduce ZD strategies as follows.
155
217
156
Theorem (Press and Dyson). Let a, 4, and y be parameters such that E mpi0 O. If player i applies a memory one strategy of the form
218
157
219
158
II
220
159
P = PReP +cle + E fife + r I •
PSI
221
160
Oi
222
162
161
223
then, itrespettive of the strategies of the remaining n —1 group members, payoffs obey the equation
224
163
225
164
a:if+ Eft"? + y =O.
091
226
165
or
227
166
228
We refer to strategies of the form S8 as zero-de:eminent strategies or ZD strategies.
229
167
Proof Follows immediately from Akin's lemma, because
168
230
169
231
170
232
171
0 = (p — pate ) • v = (agt + E 6e + rt •v = azi + E isys + T.
pie]
233
172
n"
or
234
173
235
174
By using ZD strategies, a player can thus enforce a linear payoff relation between his own payoff and the payoffs of the coplayers.
236
Moreover, by appropriately choosing the parameters a, fir and y, the player has direct control on the form of this payoff relation.
175
237
For our purpose, it will be convenient to use a slightly different representation of ZD strategies. For a player i who applies a ZD
176
238
strategy, let us consider the following parameter transformation:
177
239
178
240
179
-r
- a
4
241
180
i_
s=,.-•
Mt' =
(SII]
242
a +Ek.efik.
Lkonk
rAiiil'
a
iii i =°.
46=Efik•
L-
k
181
k4i
243
182
Using these new parameters, ZD strategies take the form
244
183
245
184
246
185
P = PRI' +Ok
i ''' Eine
+ ( 1
•t)111 ,
15121
247
186
i*
248
Hite et el. www.pnes.orgicglkontentishort/140783711 I
2 of 19
EFTA01199756
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
3%4
309
310
{
I +95[(1 0a
it
(blel akr1-01
—Ial
subject to the constraints #*0, w, =0, and E74,w, =1, which directly arise from the definitions in Eq. S11. When player i applies such
a ZD strategy, the enforced payoff relation according to Eq. S9 becomes
rr'= sx' + (1 —s)/.
IS13)
where
=
iH7d is the weighted average payoff of the coplayers. We refer to! as the baseline payoff of the ZD strategy, to s as the
slope, and tow = (wi ) as the strategy's weights. The parameter # does not have a direct effect on Eq. S13; however, the magnitude of #
determines how fast payoffs converge to the enforced payoff relation as the game proceeds (7).
Examples (the Impact of different weights):
i) Equal weight on all coplayers. Suppose player i applies a ZD strategy with weights voi = I/(n — I) for all j 01. According to Eq. S12,
the entries of such a ZD strategy have the form
Po=
95[(1—s)(1-41) +
if Si =D.
if Si =C
(S141
The cooperation probabilities of player i thus only depend on the player's own action Si in the previous round and on the number of
cooperators H. That is, the ZD strategies that we discussed in the main text are exactly those ZD strategies that use the same weight for
each of their coplayers. According to Eq. 513, such strategies enforce tr, ' =sr' +(I —s)/, with tr-' being the arithmetic mean of all coplayers
payoffs tr-' =E
1(rt — I). In Fig. SI, we illustrate this relationship for two different social dilemmas (the public goods game and the
volunteer's dilemma) and for two different ZD strategies (a generous and an extortionate ZD strategy).
Full weight on one coplayer. Let us now suppose instead that player i chooses w such that all entries are zero except for w, =I for
some j# i. It follows that
1 +0(1 —s)(1—akh,)
1 +01.saiel..i
+(1 s)I]
Pe=
4,[sblei — am_i+ (1 —s)/]
0(1 —0 —No
if Si=Si=C
ifSe=C,Si =D
if Si =D.S)=C
if Si =Si =D.
ISIS]
That is, player i's reaction only depends on the number of cooperators. the own move, and on player ts move. The enforced payoff
relation 513 becomes Kr =sd + (I —s)1.
A player cannot enforce arbitrary payoff relations 513 because the parameters I. s, w, and ¢ need to be set such that the resulting
cooperation probabilities according to Eq. S12 are in the unit interval. We thus say that a payoff relation (Ls. w) is enforceable if there
is a ¢#0 such that the resulting ZD strategy p satisfies p,, [0, 1] for all possible outcomes aE (C.Dr. The following gives some
necessary conditions for enforceable payoff relations.
Proposition 1 (Necessary Conditions for Enforceable Payoff Relations).Any enforceable payoff relation (1.s.w) satisfies —r t.'. <s < 1, and ifs < I
then be <1 <an-i. Moreover, the parameter ep# 0 needs to be chosen such that ifi> 0.
In addition to these necessary conditions, one can also give a characterization of all possible payoff relations.
Proposition 2 (Enforceable Payoff Relations). Consider a payoff relation (1,s. w) for player i such that ovi> 0 ford/ j. Let fv) denote the sum of
the j smallest entries of w (ercluding the entry W1), and let % =0. Then (1.4,w) is enforceable for a given social dilemma if and only if either
s =1 or
b. —
max
{
riy(bi —ai-1)}</< min
a
—a,)}
{
0≤iln -1
1—s
hsism-1
1
I —s
IS16I
Remark (some observations on enforceable payoff relations):
i) A direct consequence of Proposition 2 is that ZD strategies exist for all social dilemmas and for all weights w with In> 0 (one only
needs to set s =1). Moreover, if all weights are positive, i.e., iv, > 0 for all jOi, it also follows that any baseline payoff between
be s! <an_ i is enforceable for some s < I (because b)>al _i for all j; one only needs to choose an s that is sufficiently close to one).
ii) For a given slope and given weights, we also note that if the baseline payoffs It and /2 are enforceable, then so is any baseline payoff
between Ii and h.
iii) In the special case of equal weights on all coplayers, wi = 1/(n — I) for all j, the sum of the j smallest entries of w is simply given by
=j/(n —1). In that case, a payoff relation is enforceable if and only if either s= 1 or
max {b -
j
bi-aj-}<7 <
+n
-j -1 biti -a)}.
(5171
—
01•93-1 ' n -1 1-s
twsx.-I
n-1
1-s
Fig. S2 gives an illustration of the enforceable payoff relations, again for the two examples of a public goods game and a volunteer's
dilemma. In particular, in both examples the space of enforceable payoff relations shrinks with group size. This holds more generally:
if the payoff advantage of a defector br i —a1 does not increase with group size, then Proposition 2 implies that larger group sizes n
make it more difficult to enforce specific payoff relations.
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
HlIbe et el. www.poes.org/cglkontentishort/140788711I
3 of 19
EFTA01199757
373
iv) In the special case of full weight on one coplayer, i.e., we = I for some k 0 i and all other entries of w are zero, the sum of the/
435
374
smallest entries is either ii7= 0 (if/ <n — I) or go) =I (if i =n —1). Because bm ≥ hi and am ≥ at, a payoff relation is enforceable if
436
375
either s =1 or
437
376
438
377
(5181
439
max {Dn.-2. iril-
fbft-1 <!< min{b
a°, al }.
I —s
1 —s
378
440
379
441
For games in which ba- 2 >al, condition S18 cannot be satisfied. In particular, the condition cannot be satisfied for social dilemmas
380
4t
with group size n > 3 (because for such social dilemmas ba_2 >ai follows from S1 and S2). Thus, in large groups, the only feasible
381
ZD strategies are those with s = 1, i.e., the only payoff relationship that player i can enforce between his own payoff and player is
443
382
payoff is the fair payoff relationship, s =j/.
444
383
v) A comparison of Eqs. S17 and SIR thus shows that the set of enforceable payoff relations is larger if player i uses the same weight
445
384,ts
for all coplayers. This holds more generally: if a payoff relation (ts.w) is enforceable for some weight vector w, then it is also
446
385
enforceable for the weight vector that puts equal weight on all coplayers (this follows from Proposition 2 because go) becomes
447
386
largest when all coplayers have the same weight).
44S
387
449
388
Nash Equilibria of Repeated Multiplayer Dilemmas. Various Folk theorems have shown how repetition can be used to sustain cooperation in
450
389
social dilemmas (17. IS). To prove these Folk theorems, one typically constructs specific strategies that give rise to given payoff
451
390
combinations (xi
x") and shows then that these strategies are indeed an equilibrium of the repeated game. Herein, we ask a somewhat
452
391
different question: within a certain strategy class, what are all Nash equilibria? We respond to this question for two different strategy classes,
453
392
the class of ZD strategies, and the class of pure memory-one strategies. The equilibria among these two strategy classes will prove to be stable
454
393
against any mutant strategy (i.e., we do not need to assume that mutants are restricted to ZD strategies or memoryone strategies).
455
394
To simplify the analysis, we will focus on symmetric memory-one strategies for which the cooperation probability only depends on the
456
395
own move in the previous round and on the number of cooperators in the previous round (but not on who of the coplayers cooperated).
457
396
In that case, each possible outcome a can be identified with a pair (S, j), where S e (C,D} is the player's own move and/ is the number
458
39b.
of cooperators among the other n —1 group members. This allows us to represent memory-one strategies as vectors of the form
459
398
p = (ps1). Using an analogous notation as in the previous section, v= (v51) corresponds to the frequency of observing each of the states
460
399
(S,j) over the course of the game, g' = (gi .) corresponds to player rs payoffs, and r = (r) corresponds to the average payoffs of i's
.54
s)
461
400
coplayers (using the arithmetic mean r = n—Li_ E rne). Because symmetric memory-one strategies are a subset of all memory-one
462
401
strategies, the previous results (in particular Akin's lemma and the Press and Dyson theorem) naturally carry over.
463
402
Nash equilibria among ZO strategies. Consider a group in which all players apply the ZD strategy p with parameters 1, s, 0, and w) = 1/(n —1)
464
403
for all/ #i. and let us suppose the nth player considers deviating. We will refer to the first n — 1 players as the residents and to the nth
465
4D4
player as the mutant, and we denote a resident's payoff by x and the mutant's payoff by I. Because residents apply a ZD strategy, Eq. S13
466
405
implies that each of them enforces the relationship
467
406
468
n
I
407
n —1
—2 n+n l ir=sn+(l—s)!,
(519]
469
408
470
409
which can be rewritten as
471
410
472
411
it = Ar+ (I —siz)!,
(S20]
473
412
474
413
with
475
414
476
415
ea = s(n — 1) — (n —2).
(S21]
477
416
478
417
That is, then —1 residents collectively enforce a linear relationship between their own payoff Jr and the payoff of the mutant ir, with the
479
418
same baseline payoff I and with slope st. By using the same strategy as the residents, the mutant yields the payoff if = fr. Forst< I, Eq. S20
480
then implies that ir = A=l. Forst =s =1, the value of I is a free parameter that does not have an effect on the entries of the ZD strategy; to
419
481
be consistent with the cases < 1, we define! to be the payoff that the strategy yields if applied by all group members (this payoff depends on
420
the strategy's cooperation probability in the initial round). Thus, by using the residents' strategy the mutant yields the payoff !.
482
421
For p to be a Nash equilibrium, a minimum requirement is that the mutant must not have an incentive to switch to a different ZD
483
4//
strategy 0, with parameters i and
,--th ci <1. Such a mutant would enforce the payoff relation
484
423
485
424
n = lir+ (1 —1)I.
(S22]
486
425
487
426
Solving the two linear equations S20 and S22 for the mutant's payoff yields
488
427
489
428
it _ 1(1 —IR) 14.TR(1-1)
(S23]
490
429
1 —isia
491
43
4310
The mutant has no incentive to deviate when 1
492
<1, that is when
493
432
494
433
0 —0:71(1 —1)
495
u I - `
SO.
(524]
434
I —31/1
496
lite et el. www.petes.ceseglicOntenthhcet/140703711I
4 of 19
EFTA01199758
497
Because —
d
3,—ri ≤s <1 (by Proposition I) and because --4 ct <1 (by assumption), the denominator of Eq. S24 is positive. We can
559
498
therefore
iitinguish three cases.
560
499
561
i) sR =0. In that case 1-1=0, and player n cannot improve his payoff by deviating.
500
562
ii) sR > 0. In that case 1 —/ ≤0 if and only if I <1. To prevent the mutant from deviating, the residents thus need to apply a strategy
Jul
563
with maximum possible baseline payoff, 1=a„_i.
502
iii) sR <O. Then 1-1 <0 if and only if ? — / > 0. To prevent the mutant from deviating, the residents' ZD strategy needs to set 1 to the
564
503
minimum value 1=bo.
565
504
566
505
This result also holds if mutants are not restricted to ZD strategies.
567
506
568
Proposition 3 (Nash Equilibria Among ZD Strategies). Considers social dilemma in which mutual cooperation is the best outcome for the group
507
569
whereas mutual defection is the burst possible outcome
508
be < onisit
<
. jak.. I + (n —Dbr
570
509
571
510
MU MA + l(11—Db) <6,„_1.
IS25)
572
511
n
osisn
n
573
512
574
Let p be a ZD strategy with parameters 4 s, and # and let I R = (n - 1)5—(n —2). Then p is a Nash equilibrium if and only if one of the
513
575
following three cases holds:
514
576
515
577
516
5R =0 and bo <I <a,r-i ; sit > 0 and 1=a„..t;
and .sR <0 and / = bo.
578
517
579
518
Remark (some observations for stable z0 strategies):
580
519
i) The three conditions in Proposition 3 do not depend on 0. Whether a ZD strategy is stable only depends on the payoff relation that
581
520
it enforces, but not on the exact strategy that gives rise to this payoff relation.
582
521
ii) The three conditions do not directly depend on the sign of s, but on the sign of .5R = (n —1)5 — (st — 2). Whether a given ZD strategy
583
522
is stable thus depends on the group size.
584
523
iii) The second condition is of particular interest, because it states that stable mutual cooperation can be achieved by ZD strategies
585
524
with 1=an_ t and s> (n — 2/n —1). For pairwise games these are exactly the generous ZD strategies with I =4,,_i and s> 0 (2, 10).
586
519 7
As the group size increases, generous strategies need to approach the fair strategies (with s= I) to allow for stable cooperation.
587
526
588
Corollary 1 (Convexity of the Set of Nash Equilibria). Consider a social dilemma in which mutual cooperation is the best outcome for the group.
527
Suppose pi and p* are tno ZD smmgies that both give rise to stable cooperation (i.e., l' =I" = tr„_i and ?> (n —21n —1), se> (n —21n — 1)).
589
528
Then any linear combination p =.1p' + (I —A)p° with 0 < A <1 is afro a ZD strategy that gMtc rise to stable cooperation.
590
529
Proof A direct computation shows that p can be written as a ZD strategy with parameters / =an_i, 46 =20' + (1 —2)0' >0, and
591
530
s= (IA24' +(I— A)erfill[Ab' + (1 —A)01}≥ (ti — 2/n —1).
592
531
A similar result can also be shown for ZD strategies that lead to mutual defection.
593
532
Nash equilibria among pure memory-one strategies. As another application of Akin's lemma, we will show in the following which pure memory-
594
533
one strategies allow for stable cooperation in multiplayer social dilemmas. To this end, let us again consider a group in which the first n — 1
595
534
players apply some pure memory-one strategy p= (psi ) and in which the nth player considers deviating. Let v denote a limit distribution
596
535
from the perspective of the n —1 resident players and let t be the corresponding limit distribution from the perspective of the mutant
597
536
player. The following observation will be useful:
598
537
Lemma 1 (Relationship Between Limit Distributions). If the residents apply a pure memory-one strategy, psie (0, 1) for all (S,j), the entries of
599
538
v swish,
600
539
601
540
602
541
ye.; = 0
for all j<n —2
(S261
603
542
varpi= 0 for all j> 1.
604
543
60
Moreover, the limit distributions v and i are related by
544
6056
545
607
546
ka-i =vca-t, i)c.o =YD.&
(S27]
608
547
lion-I=VCA-2,
i)D.0 = VD.0,
609
548
610
549
and 1,si= 0 for all other states (54).
611
550
Proof As all residents use the same pure strategy, they play the same action in any given round. Thus, if one of the residents cooperates,
612
then there are at least n —2 other cooperators, and therefore the probability to end up in a state with less than n-2 other cooperators
551
613
is zero, vcd =0 for j <n —2. The same argument shows that vD0 =0 forj> 1. Finally, for Eq. S27, we note that the mutant is in the state
552
614
(Cis — 1) if and only if each of the residents is in the state (Can —1). Similarly, the mutant is in the states (C. 0), (An -1), and (D.0) if
553
615
and only if the residents are in the state (D, 1), (Cat — 2), and (D.0), respectively.
554
616
555
Proposition 4 (Pure Memory-One Strategies That Give Rise to Stable Cooperation). Consider a social dilemma in which there is an incentive to
617
556
deviate from mutual cooperation, 1),,,_1>a„_1. Let p be a pure memory-one strategy that cooperates in the first round and that sticks to
618
557
cooperation as long as all other playas do so (i.e., pc.„-I = 1). Then p is a Nash equilibrium if and only, if the entries of p satisfy the three
619
558
conditions
620
HIt . et el. www.pnes.orgicglkontentishort/140783711 I
5 of 19
EFTA01199759
621
683
622
N A-2 =0
IS28a)
684
623
685
624
[bbl
686
bn-i—a,,-t
687
626
PD.I S an-1 — a0
625
an
—be
688
627
9
628
Poo S bff_i
-1
—an_i'
IS28O
68
690
629
691
630
Remark (some remarks on Proposition 4):
692
631
i) The full proof of Proposition 4 is given in the appendix; the step (*) follows from a straightforward computation of payoffs for two
693
632
possible mutant strategies. The step (a) requires a more sophisticated argument; it is exactly in this second step where Akin's
694
633
lemma comes into play.
695
634
ii) According to Proposition 4, the stability of a cooperative and pure memory-one strategy p is solely determined by the four entries
696
635
pc.„_] =1,pc,,_2= 0, pDA, and pp°. This observation is a consequence of Lemma 1, which allowed us to neglect all other entries of
697
636
p. For pairwise games, Lemma 1 is not required, and thus pairwise games allow more general versions of Proposition 4, which are
698
63f 4
then also valid for mixed memory-one strategies (2, 6).
699
638
Examples (for stable memory-one strategies):
700
639
701
i) The proofs of the various Folk theorems are often based on trigger strategies, which relentlessly play D after any deviation from
640
702
the equilibrium path (17). An example of such a strategy is Grim, for which pc,' =I andps,, =0 for all other states (S, j). Because
641
Grim satisfies the three conditions in Eq. S28, Grain is indeed a Nash equilibrium for all multiplayer dilemmas.
703
642
ii) In their analysis of multiplayer dilemmas, refs. 19 and 20 consider the performance of generalized versions of Tit-for-Tat. These
704
643
TPTk strategies cooperate if at least k other group members have cooperated in the previous round, i.e.,psi = 1 if and only if j≥ k.
705
644
As the conditions 528 are only satisfied for k =n —1, it follows that TPTa-i is the only Nash equilibrium among these generalized
706
645
versions of Tit-for-Tat.
707
646
iii) Unfortunately, neither Grim nor TF7i_i is robust under the realistic assumption that players sometimes commit errors (16). For
708
647
stochastic simulations of variants of the public goods game, ref. 15 found that evolution may promote strategies that only
7®
648
cooperate after mutual cooperation or after mutual defection, i.e., Pea-1 =Pao = I. and psi =0 for all other states (S.j). We refer
710
to such strategies asWSLI For the prisoner's dilemma, this strategy corresponds to the Win-Stay Lose-Shift behavior described by
649
711
ref. 14. According to Eq. S28, WSLS is a Nash equilibrium if and only if the social dilemma satisfies (b„_, +b0)12 ≤an_1. In Fig. S3,
650
we illustrate the stability of these strategies when the social dilemma is a public goods game.
712
651
713
652
With a similar approach, we can also characterize the pure memory-one strategies that result in defection.
714
653
715
654
Proposition S (Pure Memory-One Strategies That Give Rise to Stable Defection). Consider a social dilemma with bo> ao and bn-t
ern...t. Let p
716
be a pure memory-one strany that defects in the first round and that sticks to defection as long as all other players do so (Le., PD.o =0). Then
655
717
p is a Nash equilibrium if and only if at least one of the following nvo conditions is satisfied
656
718
657
719
b„_1 +ao
658
pal =0; and n
=pca-2=0 and
<be.
2
659
ren-I
(529]
720
721
660
722
Remark (some remarks on Proposition 5):
661
723
662
i) As a special case of the above proposition, we conclude that AllD is an equilibrium for any social dilemma with ba_i ≥an_1 and
724
663
Do a ao.
725
ii) Conversely, mutual defection is never stable in social dilemmas with b0 <ao (such as the volunteers dilemma). If bo <ao, it follows
664
726
from condition Si that be <ai for all/. As a consequence, an equilibrium in which everyone defects can always be invaded by
66
6
728
5
727
a player who switches to Al1C.
66
667
729
668
Coordinated Play and Zero-Determinant Alliances. In the previous sections, we analyzed the amount of control that single individuals can
730
669
exert over their coplayers in repeated multiplayer dilemmas. Now we are interested in the question whether individuals can gain a higher
731
amount of control by coordinating their play with other group members. To this end, let us consider a set of n-4 < n players, who agree on
670
732
671
a joint strategy. We will refer to these players as the allies and to the remaining group members as the outsiders (as depicted in Fig. 1).
733
672
Depending on the degree of coordination, one can think of various forms of coordinated play. In the following we are going to
734
673
explore two modes of coordination:
735
674
i) Strategy alliances. Here, players only agree on the set of alliance members and on a common memory-one strategy that each ally
736
675
applies. During the actual game, there is no further communication taking place. Strategy alliances thus require a minimum amount
737
676
of coordination. This alliance type seems particularly relevant when allies do not have the possibility to communicate during the
738
677
game or when it is too costly to coordinate the allies actions in each round.
739
678
ii) Synchronized alliances. Alternatively, players may synchronize their decisions in each round such that all allies play the same action. In
740
679
effect, a synchronized alliance thus behaves like a new entity that has a higher leverage than each single player.
741
680
We will use the symbols ri-4 to refer to the payoff of an ally and C A to refer to the average payoff of the outsiders. In the following,
742
681
we investigate which linear payoff relationships ZD alliances can enforce and how their strategic strength depends on the mode of
743
682
coordination.
744
Hite et el. www.poes.oroRglkontentishoit./1407837111
6 of 19
EFTA01199760
745
Strategy alliances. To model strategy alliances, let us consider a group of nA allies, with 1 ≤nit <n. We assume that the allies agree on
807
746
using the same ZD strategy p during the game. The parameters of this ZD strategy are given by /, s, and 0> O. To allow allies to
808
747
differentiate between the actions of other allies and outsiders, the weights w= (wi) of the ZD strategy may depend on a player's
809
748
membership in the alliance. That is, if player i is a member of the alliance, then
810
749
811
750
wA
if j is a member of the alliance
812
751
wi'4 = { w"v1 if j is an outsider.
(530]
813
752
814
753
with w4 ≥ 0 and nr4 ≥ 0 such that the weights sum up to one, (nA —1)w4 + (n —ItA)nrA = 1. Because all allies use the same strategy, it
815
754
follows that they all get the same payoff. Moreover, by Eq. 513, each of the allies enforces the payoff relationship
816
755
817
756
(x4 — 1)wAr? + (n —nA))v-All-A = srA + (I —sy.
(S31]
818
757
819
758
This payoff relationship can be rewritten as
820
759
821
760
fir4 = sANA + (1 —s4)1.
(S32]
822
761
823
762
such that
824
763
825
764
826
? —
s— (nA
1 — OA — 1)
—1)"'4
765
(533]
827
wA'
766
828
767
829
is the effective slope of the strategy alliance. For HA = 1 /(n — 1), we recover the case of equal weights on all group members. For general
768
830
weights; A, we say that the payoff relationship S32 with parameters (/,s4) is enforceable if we can find 0> 0 and 0 < wA < 1/(nA —1)
769
such that all entries of the resulting ZD strategy according to Eq. S12 are in the unit interval. The following gives the corresponding
831
770
characterization.
832
771
833
772
Proposition 6 (Enforceable Payoff Relations for Strategy Alliances).A strategy alliance can enforce the payoff relation (1.?) if and only if either
834
773
? =I or
835
774
836
j
bi — ar , 1
n — f — I bio —a/
775
sA <1 and sr:7,0 {b,
</ <
min
•to +—
ISM]
837
0
776
n—nA I —et
nA-isisn-i 1
n — nA
I —sA }.
838
777
839
Moreover, if nA <n12, then —1 <— nA fin —nA)<sA < I.
778
Remark (on strategy alliances):
840
779
841
780
i) Earlier we saw that individuals typically lose their strategic power when groups become large. The above proposition shows
842
781
that players can regain control by forming alliances. In particular, the space of enforceable payoff relations (i,sA) increases
843
with the size of the alliance n4. Larger alliances can therefore enforce more extreme payoff relationships, as illustrated in
782
844
Fig. S4.
783
845
ii) Somewhat surprisingly, it follows from the proof of Proposition 6 that the set of enforceable payoff relationships becomes maximal
784
when 1A approaches II (nA -1) (and therefore w-A -. 0). The most powerful alliances are those in which the outsiders' actions
846
785
only have an infinitesimal influence.
847
786
iii) In contrast to the case of ZD strategies for individual players, we note that the effective slope ? for alliances does not need to be
848
787
bounded from below. As an example, let us assume the alliance has reached a size n4 such that ba_„.. ≤a„A_I (which can only
849
788
happen when nA > n/2). Because b„,A is an upper bound for the left side of Eq. S34 and a...I_ 1 is a lower bound for the right side
850
789
of Eq. S34, it follows that any I with b„,,A 515a,,.4_1 can be enforced, irrespective of the value of sA <1.
851
790
iv) However, for alliances that have reached a size nA such that b„_„A <a„A _i, the theory of ZD strategies becomes somewhat less
852
791
relevant: such alliances are better off by cooperating in each round (if all allies cooperate, their payoff is at least anA _I, whereas if
853
they all defect, their payoff is at most b„_„.4). In other words, if n4 individuals are able to make a binding agreement that they will
792
854
all play the same strategy, in a social dilemma with b„_„A <a„A_I, then unconditional cooperation is a dominant strategy for the
793
855
alliance.
794
856
795
Synchronized afflattes. In the previous scenario, we assumed that each of the allies decides independently whether to cooperate in a given
857
796
round. Let us now turn to a scenario in which allies meet after each round to decide which action they collectively play in the next
858
797
round. As a result, the alliance members act as a single entity, in a game with n —nA coplayers. To investigate such a scenario, let us
859
798
first adapt our notation correspondingly. For a given set of allies, let (S,j) refer to the outcome in which all allies choose SE {C,D},
860
799
and in which j of the outsiders cooperate. As in previous sections, the limit distribution v = (vs.') corresponds to the fraction of
861
800
rounds the alliance finds herself in state (S.,i) over the course of the game. A memory-one strategy for a synchronized alliance is
862
801
a vector p= (ps.1)—given the outcome (S,j) of the previous round, the cooperation probabilitypsi is used to determine whether all
863
802
allies cooperate or all allies defect in the next round. The synchronized alliance uses the strategy Repeat if cooperation probabilities
864
803
are given by
865
804
866
805
Rep... f 1 if S=C
[535]
867
806
Ps.) — 1 0 if S=O.
868
lite et el. www.roes.orgleglkontenthhott/140788711I
7 of 19
EFTA01199761
869
870
871
872
873
874
875
With literally the same proof as in the main text, one can verify Akin's lemma for synchronized alliances: if the alliance applies
a memory-one strategy p then any corresponding limit distribution v satisfies (p — pReP) • v = 0. Let us next write down the possible
payoffs in a given round. The payoff vector gA for the synchronized alliance has the entries
...A
1 a„or i if S=C
(S36 ]
55I =
bi
if S=D,
931
932
933
934
935
936
937
876
and the corresponding vector g-A that contains the average payoffs of the outsiders (using the arithmetic mean) takes the form
938
877
939
878
ja„A,,j_i +(n — nA —))b,,At, if S=C
940
n —nA
IS37]
fait+(n —nA —.Obi
if S=D.
879
880
Ss/ —
941
942
881
943
n —nA
882
944
883
Using these payoff vectors, the payoff of each ally is given by x-4 = gA • v, and the mean payoff of the outsiders is sr-4 =r
4 • v.
945
884
Analogously to the case of individual players, we can define ZD strategies for synchronized alliances as strategies of the form
946
885
947
886
P=PR`P+cre+fig-A +71.
IS38]
94
887
949
888
with I being the memory-one strategy with all entries being one and with a, II, and y being parameters of the ZD strategy (with fi 0 0).
950
889
Akin's lemma then implies that such alliances enforce the relationship
951
890
891
ourA + fir -A + y =O.
1S39]
952
953
892
Again, we stress the fact that this relationship holds irrespective of the strategies of the outsiders: even if outsiders notice that they are
954
893
facing a synchronized alliance, there is nothing they can do to
the above
relationship. As above, we use the
955
894
prevent
payoff
parameter
transformation 1= —y/(a +11), sA = --alp, and # = —it to write ZD strategies as follows:
956
895
957
896
p =pRy+ ek[sAg4 —g- A + (1 —sA)11].
IS40]
958
897
959
898
With these new parameters, the enforced payoff relationship according to Eq. S39 takes the usual form
960
899
961
900
yr- A = sAr4 + (1 —sA)l.
IS411
962
901
963
902
Proposition 7 (Enforceable Payoff Relations for Synchronized Alliances). A synchronized alliance can enforce the payoff relation (1,sA) if and
964
903
only if either sA = I or
965
904
966
905
sA #1 and
j
bi-aj-i} ,
n -j- I biti -ai
<
967
si
min
{a + —
--
IS42]
n -$1-4 1 —sA
— w4-isisx-i 1
n—nA 1 —s•A }"
906
osmisnA{b I
968
907
969
908
Moreover, if nA <n12, then —1 < —nA 1(n — nA) < et < I.
970
909
Remark (on synchronized alliances):
971
910
i) For not too large alliances with nA <n12, Propositions 6 and 7 imply exactly the same conditions on enforceable payoff relation-
972
911
ships. Thus, although strategy alliances require considerably less coordination between the allies, they have the same strategic
973
912
power as synchronized alliances.
974
913
ii) When nA > 42, synchronized alliances may be able to enforce a strictly larger set of payoff relationships, because they are not
975
914
restricted to payoff relationships with sA s 1. Whether relationships with s-4 > I are enforceable depends on the social dilemma.
976
915
When the social dilemma satisfies b„_,,A <a„A_I then condition S42 can be satisfied for any b„_„A <I <a,,A_I by choosing sA > 1
sufficiently large. Conversely, when b„_„4 >a„.4_1 then only slopes s-4 < I are feasible (because for sA > 1 the left side of Eq. S42 is
977
916
917
strictly larger than h„-„A, whereas the right side is strictly lower than a„.4_1).
iii) Overall, we conclude that synchronized alliances are more powerful than strategy alliances if and only if the alliance has reached
978
979
918
a size nA such that ba„A <a„A_I. However, as noted for strategy alliances, the condition b„,A <a,,A_, transforms the social
980
919
dilemma into a game in which mutual cooperation is the best strategy for the alliance, such that the notion of ZD alliances
981
920
becomes less important.
982
9
983
911
21
923
Here we explored the strategic power of alliances, assuming that the allies agree on a joint ZD strategy. Given these results, one may
ask which ZD strategy the allies should agree on and which combinations of alliance strategies and outsider strategies form an
984
985
9;4
925
equilibrium of the game between allies and outsiders. This question is different from the questions explored above; there we considered
a homogeneous group of players, and we explored which strategies are stable if applied by all group members. To explore equilibria for
games with ZD alliances, one needs to distinguish between the strategies of the allies and the strategies of the outsiders. This inherent
986
987
917
asymmetry makes the equilibrium analysis more intricate, and we thus leave this question for future work
988
989
928
Applications. In this section, we apply our theory to two particular examples of multiplayer social dilemmas: the public goods game and the
990
929
volunteer's dilemma. For simplicity, we will focus here on symmetric strategies that only depend on the number of cooperators but not
991
930
on the cooperators' identities (for ZD strategies this implies that we consider the case of equal weights on all coplayers).
992
Hite et el. www.pnes.oraftglkontenthbort/140788711 I
8 of 19
EFTA01199762
993
Public goods games. In a public goods game, each player of a group can cooperate by contributing an amount c > 0 to a public pool. Total
1055
994
contributions are multiplied by a factor r with 1 <r <n and evenly shared among all group members. Thus, payoffs are given by
1056
995
1057
996
is) =
(j+1)m
,.
fir
1058
—
c. and ui
—.
(8431
997
n
= n
1059
998
1060
999
Some of the properties of ZD strategies for public goods games have been recently described by (13), using an independent approach.
1061
1000
Here we complement and extend these results.
1062
ID strategies for public goods games. Plugging the payoff values 843 into representation S14 shows that ZD strategies have the form
1001
1063
1002
1003
1004
1005
1006
1007
I +0[(1 —s)(/— g
a
r +c)
ci
n
-nn, I
if Si =C
Psi =
# [(I —s)(/-1—
re) + '
ci
n
n 1
if S, =D.
1S44]
1064
1065
1066
1067
1068
1069
1008
To explore which payoff relationships (1,$) a single player can enforce, we use the characterization given in Eq. 817. Because the
1070
1009
payoffs of the public goods game are linear in the number of coplayersj, the corresponding conditions become particularly simple (as
1071
1010
only the boundary cases j= 0 and j =n — 1 need to be eobsinereodi:ire.:licoenclude+tIliatsa single player can enforce a linear payoff relation
1072
1011
with parameters O) if either s= 1 or
1073
1012
1074
1013
1075
1014
(n —1)tr
c
(r
n)c
c
18481
1076
1015
n
1-s-
n
1077
1016
1078
1017
Fig. S2 shows the set of all pairs (1.$) that satisfy the above constraints for various group sizes n. We get the following conclusions for the
1079
1018
existence of extortionate strategies, generous strategies, and equalizers, depending on the size n of the group:
1080
1019
i) Extortionate strategies (7= bo = 0). Let us ask which slopes s an extortionate player can enforce. The inequalities 845 then imply that
1081
1020
slopes s> (r— 1)/r can always be enforced, irrespective of the group size /1. However, slopes s < (r —1)/r are only enforceable if the
1082
1021
group size is sufficiently small
-
1083
or
lee
more
1022
1084
1023
r(I -s)
1085
1024
" S
(S46]
r(1 -s)- I oe
1086
1025
1087
1026
We conclude that in large groups, n -• co, only extortionate strategies with s < (r - 1)/r are feasible.
1088
1027
i) Generous strategies (/ = a„..i = is —c). Using the inequalities 545, one can show that generous players can enforce exactly the same
1089
1028
slopes as extortionate players.
1090
1029
ii) Equalizers (s = 0). For equalizers, the inequalities S4S imply there are three regimes: (a) if n Sri(' —1), all baseline payoffs
1091
1030
0 <I <rc —c can be enforced; (b) if r/(r —1) <n < 2r/(r — I), only a limited subset of baseline payoffs 0 <1 <re —c can be enforced;
1092
1031
and (c) if n > 2r/(r — I), there are no equalizers.
1093
1032
1094
In particular, we conclude that for a given multiplication factor r > 1 the set of equalizer strategies disappears as groups become large.
1033
1095
Strategy alliances in the public goods game. By Proposition 6, strategy alliances with nA members can enforce a linear relation with
1034
parameters (l. sA) if and only if either sA =I or if the two following inequalities hold:
1096
1035
1097
1036
0<1<rc—c
1098
1037
(n—n
1038
)rc
c
(nsA — n)c
c
(S47]
1099
1100
1039
n
1 —sA — /
—
n
+ 1 —sx
1101
1040
1102
For the special cases of extortionate strategies, generous strategies, and equalizers, these inequalities allow us to derive the following
1041
1103
conclusions:
1042
1104
1043
i) Extortionate strategies (1 =b0 =0). We can rewrite the inequalities S47 to obtain a critical threshold on the fraction of alliance
1105
members that is needed to enforce a certain slope sA
1044
1106
1045
1107
1046
nA r(I—sA)— 1
1108
>
18481
1047
n — r(I-4)
'
1109
1048
1110
1050
becomes
particular, if an alliance wants to enforce arbitrarily high extortion factors x—• co, then sA = 1/x —.0 and the critical threshold
becomes nA in > (r— 1)/r. This condition is always satisfied if nA =n -1, implying that an alliance with n — I members can always
Ill'
be arbitrarily extortionate toward the remaining group member.
1051
1113
ii) Generous strategies (l=cen-i = rc —c). The inequalities S47 lead to the same threshold for nA pi as in the case of extortionate
1052
strategies, as given in Eq. S48.
1114
1053
iii) Equalizers (s = 0). For equalizers, the inequalities S47 lead to two critical thresholds; to be able to set the payoffs of the outsiders
1115
1054
to any value between 0 < / < or —c, the fraction of the allies needs to satisfy
1116
HIE. et al. wwwones.orolcolkontenthbort/1407837111
9 of 19
EFTA01199763
1117
1179
1118
nA r —1
>
1S49]
1180
1119
II
r
1181
1120
182
However, to be able to set the payoffs of the outsiders to some value between 0 <I < rc —c, the number of allies only needs to exceed
183
1122
1183
1122
1184
1123
nA (n -2)(r - 1)
1185
1124
≥
IS50]
1186
n
n + (n -2)r '
1125
1187
1126
1188
1127
1189
1128
Nash equilibria for the public goods game. In the following, let us describe a few strategies that allow for stable cooperation in the public
1190
1129
goods game. According to Proposition 3, this can be achieved by using a ZD strategy with parameters 1=4,,-i, 4>O, and
1191
(n -2/n -1) <s < 1. When we choose the boundary case s=1 and efr= 1/c (which is the maximum value of 0, given the constraint
1130
1192
0 <psi S 1), the resulting ZD strategy according to Eq. S44 is proportional Tit-for-Tat with entries
1131
1193
1132
1194
1133
pTFTsi =
i l
(S51]
1195
n
1134
1196
1135
1197
which is independent of the player's own move. This rule says that the player's cooperation probability is given by the fraction of
1136
cooperators among the coplayers in the previous round (additionally, we need to specify the cooperation probability for the first round,
1198
1137
which needs to be set to one).
1199
1138
Another boundary case is given by the choice s = (rt - 2/n — 1) and 0= [n(n -1)]/(c[n(n - 2) +r]) (which is again the maximum value
1200
1139
of 0). We refer to the resulting ZD strategy as generous Tit-for-Tat, which has entries
1201
1140
1202
1141
gTFTsi
j
n -j -1 n(r - 1)
1552]
1203
1142
n - 1+ n - I
(n -2)n +I
1204
1143
1205
1144
Also gTFT is independent of the player's own move, and it is generally more cooperative than prit because gTFTsi > pTFTsi for all
1206
1145
j <n - 1.
1207
1146
A last boundary case is given by 0-.0, in which case the resulting ZD strategy approaches the strategy Repeat, independent of the
1208
1147
choice of (n — 2/n — 1) <s <I. Due to Corollary 1, we can conclude that any linear combination of these three strategies of the form
1209
1148
1210
1149
p = Ai • p7FT + ArgTFT +A3 . Repeat,
(S53]
1211
P
150
/
lift '
with 0 <Ak <1, A3 < 1, and Ai +.22 +A3 =I is also a stable ZD strategy.
1213
Among the pure memory-one strategies, Proposition 4 allows us to conclude that Grim and TFT„..i are always Nash equilibria.
1152
1214
Moreover. the strategy WSLS is a Nash equilibrium if r≥ 2n /(n + 1), as illustrated in Fig. S3.
1153
1215
Volunteer's dilemma. In the volunteer's dilemma, at least one of the players needs to cooperate and pay a cost c > 0 in order for all group
1154
members to derive a benefit b>c> O. Thus, the payoffs are given by
1216
1155
1217
1156
ai = b —c for all b and b, =b if j> 1 and bo = 0.
(S54]
1218
1157
1219
1158
ID strategies for the volunteer's dilemma. According to Eq. S14, the ZD strategies with equal weight on all coplayers have the form
1220
1159
1221
1160
0/2
1 + 41( 1 —s)(1—b +c)—n; j : I c] if S, =C
1161
/3
1162
1163
Psi = 0[(1
— s ) (1 — b) + 1+2
ci
if S, =Dfl
1S551
0/4
l
ms
1164
1226
1165
96 (I —4 1
if S, =D.j =O.
1227
1166
1228
1167
By condition S17, exactly those parameters 1 ands can be enforced for which either s=1 or
1229
1168
1230
1169
1
c
s
max{ 0,6 --
—}
/sb -c.
15561
1231
1170
n -1 1 —s
1232
1171
1233
1172
This set of enforceable payoff relations is illustrated in Fig, S28. In the special case of extortionate strategies (1=0), condition S56
1234
1173
implies that a given slopes can only be enforced for sufficiently small groups
1235
1174
n <I +
c
1236
1175
b(1-s)'
15571
1237
1176
1238
1177
For equalizers (s = 0), the inequalities in Eq. S56 imply that a player can only determine the average payoff of the coplayers if n=2,
1239
1178
and then the only enforceable baseline-payoff is I= b -c.
1240
lite et el. www.imes.org/eglkontenthhort/140788711t
10 of 19
EFTA01199764
1241
Strategy alliances in the volunteer's dilemma. For strategy alliances, the enforceable payoff relationships according to Eq. S34 become
1303
1242
1304
1243
max I 0. b —
1
c
—}
<1Sb — c.
IS58]
1305
1244
n —
n4
I —
sA
—
1306
1245
1307
1246
In particular, alliances that aim to enforce an extortionate relationship (with 1=0 and some ? >0) need to have a critical size
1308
1247
1309
list
c
1310
1248
> 1
(SS9]
1249
n —
nb(I —sA).
1311
1250
1312
1251
It follows that alliances cannot be arbitrarily extortionate (setting ? =0 on the right side implies that such alliances would need to
1313
1151
satisfy n't > n - 1). Instead, even a large alliance of size n4 =n —1 can only enforce slopes with s4> (b —c)/ b.
1314
The performance of ZD alliances against adapting outsiders. In addition to the static properties of ZD strategies, Press and Dyson (1) also
1253
1315
highlighted a remarkable dynamic property of ZD strategies: when a player with a fixed extortionate ZD strategy is matched with an
1254
1316
adapting opponent who is free to change his strategy over time, then the adapting opponent will move toward more cooperation. In the
1255
1317
following we present simulations suggesting that ZD alliances can have a similar effect.
1-x'
1318
To explore the performance of ZD alliances against adapting outsiders, we consider a group of n subjects. Let us assume that the
1257
1319
players (1.
n4} form a synchronized alliance with ti4 <n and that the allies commit themselves to act according to a ZD strategy
1258
1320
with parameters 1, r 1 , and # (similar results could also be obtained under the assumption that the allies form a strategy alliance instead
1259
1321
of a synchronized alliance). Moreover, let us assume that each of the outsiders applies some arbitrary memory-one strategy. To p-
1260
1322
rametrize memory-one strategies, we note that the possible outcomes of a single round of the game can be written as
1261
1323
0= (S-4.S„.40
S„), where SA E (CD) is the joint action of the allies and where Si e {C. D} is the action of each of the outsiders. As
1/6)
1324
a result, there are 211-11A+1 possible outcomes a. The memory-one strategies for the outsiders are thus modeled as vectors p = (p,) with
1263
1325
2"-" A+I entriespa G [0,1].
12.64
1326
We assume that the ZD alliance and the outsiders interact in a series of repeated games. Although the strategy of the ZD alliance is
1265
assumed to be fixed, outsiders are allowed to adapt their strategy from one repeated game to the next. Specifically, we assume that in each
1327
1266
1328
time step, the group interacts in a repeated public goods game, resulting in the payoff el for each of the allies and the payoffs x) for each
1267
outsider/. Because all players use memory-one strategies, these payoffs can be calculated using a Markov-chain approach (15). In the
1329
1268
1330
next time step, one of the outsiders is randomly picked to change his strategy from p to p'. The entries of the new strategy pi are
1269
1331
independently drawn from a normal distribution around the old strategy (using an SD of 0.01). If the outsider's payoff using the new
1270
strategy is xi, then we assume that the outsider keeps the new strategy with probability
1332
1271
1333
1272
1
1334
1273
P — I 4. exp (—racy' — sin
(S60]
1335
1274
1336
1275
Otherwise, the outsider rejects the new strategy and continues to use the old strategy. The parameter fa> 0 corresponds to the strength
1337
1276
of selection. In the limit of weak selection (a—. 0, this yields p= 1/2, such that the choice between the new and the old strategy is fully
1338
1277
random. For the simulations, we consider the case of strong selection (we used m=100); in that case, the new strategy is likely to be
1339
1278
adopted if x' > x, and it is likely to be rejected when le <x. This elementary process, in which outsiders are allowed to experiment with
1340
1279
new strategies, is repeated for r time steps. For the initial population of outsiders, we assume that all outsiders start with unconditional
1341
1280
defection.
1342
1281
Fig. S5 reports the outcome of this evolutionary scenario for three different ZD strategies: a fair strategy, an extortionate strategy, and
1343
1282
a generous strategy. In all three scenarios, the outsiders become more cooperative over time; as a consequence, also the allies cooperate
1344
1283
more often, because all three ZD strategies are conditionally cooperative (Fig. S5, Upper). However, there are clear differences in the
1345
1284
final cooperation rates between the three scenarios. Extortionate alliances seem to be least successful to incentivize cooperation,
1346
whereas fair alliances tend to achieve full cooperation in the long run. This success of fair strategies can be attributed to their higher
1 85
1347
1286
slope values: because fair strategies uses =1, they perfectly recoup outsiders for increasing their cooperation rates. Both other strategy
1348
1287
classes use slopes with s < 1, which makes it less attractive for the outsiders to become more cooperative. As depicted in the lower
1349
panels of Fig. 85, fair ZD alliances therefore also yield the highest payoffs by the end of the simulation.
1288
1350
1289
Of course, the numerical simulations presented here only provide a snapshot of the full dynamical properties of ZD strategies (which
1351
deserve a careful analysis on their own). The simulations serve as a proof of principle: as previously shown for the iterated prisoner's
190
1352
dilemma (1, 21), ZD strategies can be used to generate a positive group dynamics in multiplayer dilemmas.
1291
1353
1292
Appendix Proofs.
1354
1293
Proof of Proposition 1: By the definition of ZD strategies. the cooperation probabilities after mutual cooperation and mutual defection are
1355
1294
given by
1356
1295
1357
1296
row n=
1 +0(1 - s)(1 —a„_t)
1358
1297
P61]
1359
po D) = 0(1 —s)ff — bo).
1298
1360
1299
As these two entries need to be in the unit interval, it follows that
1361
1300
1362
1301
0( 1 —s)(1—a„_i ) 5 0
1363
(S621
1302
0 S 0(1 —*/ — AO-
1364
HIlbe et el. www.pnes.orsecglkontenthhoit./140783711I
II et 19
EFTA01199765
1365
Adding up these two inequalities implies 0(1 —s)(bo —a„..1)≤0, and because of Eq. S3
1427
1366
1428
1367
0(1 -s)a. O.
1368
(8631
1429
1430
1369
Analogously. let us consider outcomes a in which all players but one cooperate (i.e., a is a permutation of (C,...,C, D), in which case
1431
1370
1432
1371
Pe . { 1+ O[sa„_2 — (I — wl )a„_2 — wibn-i + (I —s)/) if the defector is a coplayerf 0 i
(364]
1433
1372
O6.-1 -an-2 4' (1 -s)/I
if the defector is player i
1434
1373
1374
Because all these entries pe need to be in the unit interval
1435
1375
O[sa,,-2— (1— wl )an -2 — Wibe-i + (I —s)/}≤ 0
143743386
1376
1S65]
1
0 ≤ egsba-t — an-2 + (1 —s)1.
4
1377
1439
1378
Adding up these inequalities yields #(s+ wi)(b, _1 — a„_2)≥ 0 for all jOi, and because of Eq. S2
1440
1379
1441
1380
0(s + tv.,) a 0 for all j*i.
1381
1566]
1442
1443
1382
Combining the inequalities 563 and 566 then yields
1444
1383
1445
1384
0(1 +ii/j) ≥ 0 for all /0/.
(567]
1446
1385
1447
1386
and because at least one of the iv' is larger than zero (because all wi sum up to one), it follows that #≥ 0. The restriction efr$ 0 then
1448
1387
implies 417 O. Due to the inequalities S63 and 866, we may also conclude that —minfowi ss ≤ I. Because minf,,,w, ≤ 1/(n — 1) (again
1449
1388
because the wi sum up to one), it follows that —1/(n — 1)≤s ≤ 1.
1450
For s# 1, the inequalities S62 and S63 imply bo ≤/≤a„_,.
1389
1451
Proof of Proposition 2: For a given a= (Si
S',,), the entries p„ of a ZD strategy according to Eq. 512 can be written as
1390
1452
1391
1453
pn = lir + 46[(1 —s)(1-10,) + E wi(gf, —gni.
1568]
1392
1454
1393
114
1455
1394
with Al, given by Eq. SS and with t ie and g, given by Eq. S7. Let e denote the set of i's coplayers who cooperate in state a, nd let up
1395
145
14567
denote the corresponding set of coplayers who defect. Using this notation, the entry pe is given by
1396
1458
1397
1459
1398
I +# [(I -s)(/ -aki..i) - E w, 0,,,, -al,-0I if S, = C
1460
1399
AAP
Pe =
1569]
1461
1400
1462
1401
#1 1 -s)(/-4,1)+ E w,(blei -ale !)
if S, =D.
1463
1402
icor
1464
1403
1465
1404
Because et, > 0 can be chosen arbitrarily small, the conditionpeep, I] is thus satisfied for all a if and only if the following inequalities hold
1466
1405
1467
(1 —s)(/—akfri) — Eivi(bm — apt ') SO
for all a with S, =C
1406
1468
Se
1407
(I
/
-s)( -4 4) + E iv,o,,,, -nicht) ≥ 0 for all a with SD
, =
.
18701
1469
1408
1470
1409
160r
1471
1410
1472
Ifs =1, these inequalities are independent of the parameter I. and they are satisfied for any social dilemma (because wi≥ 0 for all/ by
1411
assumption and because bk,i >tilt , by Eq. S2). Fors < 1, we may divide the above inequalities by (I —s)> 0, implying that Eq. S70 is
1473
1412
equivalent to
1474
1413
1475
jadow)(biel —°101-1)
1414
E
1476
atel-I +
I —s
≥l for all a with Si =C
1415
(S71]
1477
1416
Enew)(biel - °101-1)
1478
1417
voi —
≤/ for all a with Si =D.
1479
1—s
1418
1480
1419
The inequalities S71 in turn are satisfied if and only if
1481
1420
1482
1421
if ,150161 —alel-I))
EGew)(biel—alel-0)
1483
max thcei
≤/≤ min atei.., +
(S72]
1422
I —s
elvic
I —s
.
1484
elY,•11
1423
1485
1424
Because the terms (blei- Ott! )/(1—s) are positive, the respective maxima and minima are attained by choosing the weights wi as small
1486
1425
as possible. That is, for a given number of total cooperators lal, the extrema are attained for those states a for which E irrews and
1487
1426
E ja,,,,w, are minimal. This observation implies that condition S72 is equivalent to
1488
Hite et el. www.pnes.orgfcglkontentMort/1407811711 I
12 of 19
EFTA01199766
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
O
max
b
1—s
—a/A))
<1<
min
+
asisn-I
Ittn_j_i 1—s
(bit , —ai)}
with 45 being the sum of the j smallest entries in (vi)
.
Proof of Proposition 3: We already know that for a ZD strategy to be a Nash equilibrium, one of the three conditions need to be fulfilled
(otherwise there would be a different ZD strategy that yields a higher payoff). Conversely, let us assume that one of the three conditions
of the proposition is fulfilled.
I) Ifsit =0, then by Eq. S20, the mutant's payoff is 1=1, irrespective of the mutant's strategy. In particular, there is no incentive to deviate.
ii) Suppose s R > 0,1= an-I, and let us assume to the contrary that the zero-determinant strategy is not a Nash equilibrium. Then there
is a mutant strategy such that 1> a,,A. Because the residents collectively enforce the relation if =st k+ (1 —sR)a,,_i and because
s' >0, we can conclude k>a„_i. However, then the average payoff of all group members exceeds a„..i, contradicting the
assumption that an_i is the maximum average payoff per round.
iii) Under the assumption that be is the minimum average payoff per round, the case s R <0 and 1 = bo can be treated analogously to
the previous case.
Proof of Proposition 4: Because p is a Nash equilibrium, the payoff of any mutant strategy j, satisfies I <an-1. Let us first consider a mutant
who applies the strategy
i.e., h i =0 for all (S. j). Because the mutant never cooperates,
=0 for all j, and by Lemma 1 also
Poi =0 for allj except/ e {0.n — 1}. The values of fia„_i and ODA can be obtained by calculating the left eigenvectors of the transition matrix
(An -1)
(D.0)
n — 1)
(D,O)
PC4-2
I - PC„-2 •
PD.0
1 -pD.0
If we hadpt .-2 =1, then the assumption that p players start with cooperation would imply i Da_i =1, such that the payoff of AHD was
ir=b„_, >a„..1. Because this contradicts the assumption tha p is a Nash equilibrium, we conclude that pc.„_2 =O.
A calculation of the left eigenvector of Eq. S74 with respect to the eigenvalue 1 then yields
As a result, the payoff of AID is
~tDn 11 .[P
no
+Pas)
1 / (I -Fpoo)
+ bo
k = b„_APD„,,A + bean!
1 +pn.o
oot
The requirement ir <an-I then implies the condition S28c.
As another special case of a possible mutant strategy, let us consider a mutant p such that pca_l =0, and pso =I for all other states
(S,j). With a similar calculation as in the previous case, one can determine this mutant's payoff as
(ao-i + bo-0 'Nu + 0 0
= °n-iPco-1+bo-ipao-t
+koPco
•
1 + 2pn.1
The requirement if <€2,,_1 implies condition S28b.
Suppose the n —1 residents apply the pure memory-one strategy p. Due to Akin's lemma
n-I
n-I
E (pci- Ovci + Expo; =0,
fro
which because of Lemma 1 and
=1, N o-2=0 simplifies to
VC„-z=paimi +pnevn.o.
Then, irrespective of the mutant's strategy, the payoff k satisfies
— a n_t = r .(1 (ai—aa_Octi + Qtt,
Lemma
=
(ae — a o.4)ke
+ (bo_i — a o..4)1t.o.4+(bo —
)ip.°
)
Lemma I,
=
Igo
+(bo-3 —°n-i)vca-2+ (bo —ao-i)voa
Etl
=
l"0
— an-0(Prkiva3 +PD.ovass)+ (b0 — an-dvao
(528bal
= frbn-I
(72e-t — clo)] • V Al + [(bn-I "an-I)PD.0
(ao-i
bad • VD.0
S
0 ,
that is, p is a Nash equilibrium.
1551
073]
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
IS74]
1570
1571
1572
1573
1574
1575
1576
1577
(S751
1578
1579
1580
1581
1582
IS761
1583
1584
1585
1586
1587
1588
IS77]
1589
1590
1591
1592
1593
1594
(S781
1595
1596
1597
1598
1599
(S791
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
Hite et el. www.pnas.oegfcglkontent/thort/140788711
13 of 19
EFTA01199767
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
164gio
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
Proof of Proposition 5: The proof follows along the same lines as the proof of Proposition 4.
Suppose pis a Nash equilibrium and assume thatpal 00 (because p is a pure strategy it follows that pal =1). We have to show that
these assumptions imply pc,' =pca-2 =0 and (b,,-.1+a0)12<bo. To this end, let us first consider a mutant with strategy AMC, such
that psi =I for all (S./). Because of Lemma 1, and because the mutant always cooperates, the only possible outcomes in a given round
(from the perspective of the mutant) are (C. is — 1) and (C.0). The transition matrix is given by
The limit distribution of this transition matrix is
such that the payoff of ABC becomes
(C,ri —1)
(C.0)
— 1)
(c 0)
PCA-1
I
1
0
(
1
Pco
1—Pc„ -1
)
,
(PcA-1) = 2 —Pca-1
2 PC„-1
+ (1 —
= an-a0c0-1+ agto
2 —Pc„-1
If we had pc.„-I =1, this payoff would equal to ir=a„_i >60=x, contradicting our assumption that p is a Nash equilibrium. Thus,
pc„-1 = 0.
Let us now consider another mutant strategy p with PD.o=1 and psi =0 for all other states. Again by constructing the transition
matrix [with possible states (C. n —1). (C.0), (D.n —1), and (D,0)], one can compute the payoff of this mutant as
N-1+ —Pc.-2)(bo+ao)
=
3
2Pcn-2
If pc,,_2 =1, then 1=6„-i >bo= if, again contradicting the assumption that p is a Nash equilibrium. Therefore, isc.,,_2= 0, and the
mutant's payoff becomes ir= (b„_i +b0 +a0)/3. For p to be an equilibrium, this payoff needs to satisfy it <b0, which yields
(b„..) +ao)/2sbo.
Let us first consider the case pDA =0. Because the memory-one strategy also prescribes to defect in the first round and because
pao= 0, it follows that all residents play defect throughout the game, irrespective of the strategy of the mutant. Thus, any mutant's
payoff can be written as k=i)a.obo +itoao <bo= A. This shows that p is a Nash equilibrium.
Let us now consider the second case, pc,,_) = pc„..2 =0. Without loss of generality, we can also set pal = 1, and by assumption,
pa0= 0. Under these conditions, Alcin's lemma becomes
= ka-1 +VC a-2.
Then, irrespective of the strategy of the mutant, the mutant's payoff satisfies
rt-1
ir - b0= Da / - bo)fti +(bi -bo)Poi
i-0
Ltm=ma
+(ao -bo)Pc.o+ (bn-i
Isma I,
=
-bo)vc,„_) +(ao -60)voi + (b,,_) - bo)vc,,,_2
Ell I CI(an-1 - bo)vcA-1 + (ao -bo)(Vca-1+VCA-2)
bO)9C4 -2
= (an-1 no - 2b0)' VC„-1 + (bn-i +a0 - 2b0). van-2 5 0,
with the last inequality being due to the fact that (b„_) + ao)/2 sb0 and a,,..) <b,,_). Again, we conclude that p is a Nash equilibrium.
Proof of Proposition 6: Due to our construction, strategy alliances require each ally to apply a ZD strategy p with parameter I, s, and 0
and weights w. To enforce an effective slope r 1, Eq. S33 implies that the parameter s needs to be chosen such that
s = sA + (11A
DIVA (] —SA).
For? =1, we get s=1, and Proposition 2 guarantees that the payoff relationship is enforceable (independent of I and the weights wA).
Let us now assume that s <I (and therefore? < I). Because of Proposition 2, the alliance can enforce the payoff relationship (I.?) if
and only if we can find an appropriate weight vector w= (w)) such that
1675
1676
1677
1678
1679
1680
1681
(MO)
1682
1683
1684
1685
1686
1687
1688
1689
lS811
1690
1691
1692
1693
1694
1695
1696
(582)
1697
1698
1699
1700
1701
1702
1703
l 5831
1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1
ISM)
714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
(5851
1732
1733
1734
1735
1736
Me et el. www.pnes.orgicglkontentishoit/140788711I
14 et 19
EFTA01199768
1737
1799
max {Is' — 1 _ 08.4 _ oirt l'nfi-i
l —sii 1 </ ≤ min
asisn-I fa, + l _ friA _ owA 1 _ 5,4 }.
(S861
1800
-i
1738
1739
asssn
i)
gin- I
bit. —a)
1801
1740
1741
As in Proposition 2,W) refers to the sum of the smallest entries in w (excluding the entry corresponding to the focal player). Because w
1802
1803
only has the two possible entries w4 and tv-A, we can write 67 as
1742
1804
1743
1805
1744
fivA
if wA <w- A. j <nA —1
1806
1745
1746
w., —_ {(n A — Owl + (1 — nA +1)W-A if wA <w-A. j> n A - I
1807
jw-A
if wA >w-A, tot -n4
15871
1808
1747
(I +11-4 -n)tvA + (n -nlitr A if wA >WA, j > it -nA
1809
1748
181
1749
The constraint (nA — 1)IvA + (is — nA)w-A =I implies iv-A= [1— (nA —1)w/fin —nA). By plugging this into Eq. S87, we can calculate
1811
the expression
1750
1812
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
iwA
1
<
if w.4 <_ -1' j
nA -1
n-1,
if iiP4 <
, j> nA - 1
1— (nA —1)wA
1
is —j— 1
if
1 - (nA -1)wA
n -flA
MB]
1 - (nA -1)wA
1
1
if HA > _.
j< It -nA
n —1
n —nA
1 - (n - j -1)wA
if wA
1 i.j> n —nA.
1— (nA —1)wA
1813
1814
1815
1816
1817
1818
1819
182109
1821
1822
1823
1762
Due to condition 586. the space of enforceab e payoff relations becomes maximal when we choose the weight wA such that
1824
1763
k/[1— (nA —1)wAl becomes maximal. Eq. S88 suggests that by/[I — (nA —1)wAi is monotonically increasing in wA. Thus, considering
1825
1764
the restriction 0 <wA <1/(nA — 1), the maximum is attained for HA-. 1/(nA -1), which also implies 1A> 1/(n- 1). From Eq. S88
1826
1765
we obtain
1827
1766
1828
1767
1829
1768
fi m
ii
_ { n n A
co
if jsn-nA
1S891
1830
1769
.a -Inn' -If 1 — (nA —1)wA
1831
if j >n — nA .
1770
1832
1771
Thus, for wA sufficiently close to 1/(nA-1), condition S86 is satisfied if and only if
1833
1772
1834
1773
1774
0sisn-n-4
' n — it— 1 .5"
nA -min -1 '
is .11'-
1
SA ) .
max {bi—
j
A i
r): 1 )
<I <
min
{a, +n —*
,
i 1 .n.
bs —ai
(S981
1835
1836
1775
1837
I 71011
This is condition S34. Moreover, if nA <n/2, setting j=n/2 in the left side of Eq. S90 suggests
1838
1777
1839
1778
n/2 bna — a/t12-1
bn/2
I.
1591]
1840
1779
n — nA
1 — sA
1841
1780
1842
1781
Similarly, setting j=n/2 — 1 in the right side of Eq. 890 leads to
1843
1782
n —n/2 k ip —anfz-a
1844
1783
1784
1≤64/2-1 + n _ nA
1 — sA
' 1S92]
1845
1846
1785
Summing up these two inequalities shows that sA > — [nA /(n —nA)[.
1847
1786
Proof of Proposition 7: The proof follows the lines of Propositions I and 2. By its definition (Eq. S40), a ZD strategy for a synchronized
1848
1787
alliance has the form
1849
1788
1790
1789
1791
1792
1793
1794
1795
1796
1797
1798
1+0[0 —sA)(1—a„,,,n_i )—II: nn7 i (b„Ay —an.,n_id
18509
1851
if S =C
1852
Psi =
{
P 931
1853
4,[(1-sA)(/-M+
4 01-ai_1)]
if S=O,
1854
1855
for 0 <j S n —nA. The conditions pci <1 and poi ≥ 0 imply the following sign constraints
1856
1857
1858
., n -nA -j ,..
0
(S9421
1859
a. 0[(1 -sio- a„An_ij
n n.„1 (a04,5 - anAti-id
1860
Hltbe et al. www.pfsas.org/eglkontenthhortn40703711I
15 et 19
EFTA01199769
1861
1923
0 ≤ 0 [(I -sA)e -1)1) +n -
*
1.4 0 1- afri)].
1862
[S94b]
1924
1863
1925
1864
1926
for all 0 ≤j < n -n-4. Setting j=n—nA in Eq. S94a yields
1865
1927
1866
441 —sA)(1 — an.° 1. 0.
(S951
1928
1867
1929
1868
and setting j=0 in Eq. S94b yields
1930
1869
1931
1870
0≤0(1—sA)(1—bo).
(S961
1932
1871
1933
1872
Adding up Eqs. S95 and S96 shows that
1934
1873
1935
1874
#(1 -sA) ≥ O.
1S971
1936
1875
1937
1876
For s 4 =1, we can ensure that 0 ≤psi S I for all entries in Eq. S93 by choosing a 0> 0 that is sufficiently small. Let us therefore
1938
1877
assume ? # 1. Because we also have 0# 0 by definition, condition S97 becomes 0(1 — s-4) > 0. Dividing the sign constraints in Eq. S94
1939
1878
by 0(1 —sA) then implies
1940
1879
1941
1880
max {b.-
J
2—} Vs
min la„A4)_i +n -"A -j bnAli - anA+)-1).
(8981
1942
i
b. -a._i
1881
0s)sn-nA
I n flA I
SA
ssisn-nA
n —11A
I —sA
1943
1
1
882
944
1883
This condition in turn is equivalent to condition S42.
1945
Conversely, suppose Eq. S42 is satisfied for some (/,?) with : A O 1. It follows that the sign constraints S94 are satisfied for every
1884
1946
choice of A. subject to the condition 0(1 —sA)> 0, which implies that the entriespsi in Eq. S93 satisfypci s I and ppi ≥ 0 for all/ By
18112
choosing # sufficiently close to zero, we can also ensure that pc) >0 and ppo ≤1. This shows that the payoff relationship (1,sA) is
1947
188
enforceable.
1948
1887
Finally, suppose nA ≤n/2. Setting/ =0 in Eq. S94a yields
1949
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1899
1898
1900
0[(1 -sA) (I - a„A _I) - (b„A - a„A_ I)) 5,. O.
1S991
and setting j = nA in Eq. S94b results in
„A
0 ≤ 0 [(1 —?)(/ — b„.i) + n — n
..1 (bo — anA ..1 )1 .
1S1001
Adding up these two inequalities shows that
/I
.4
H
0 (s4 +
≥ O.
n ItA
01011
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1901
Combining Eqs. S97 and SIO1 gives 0> 0, which in turn implies —[nA /(n —n-4)] ≤sA ≤ I.
1963
1902
1964
1903
1. Press WH, Dyson PI (2012) Iterated Prisoners Dilemma contains strategies that dominate any evolutionary opponent. Prot Nati Arad Sc) USA 109126):10409-10411
1965
wan
2. Akin E (2013) Stable cooperative solution for the iterated prisoner's dilemma.
1966
3. Kerr B, Godfrey-Smith P. Feldman MW (2004) What is altruism? Trends (col Evol 190i:13S-140.
1905
4. ledyard 10 (1995) Pubk goods: A survey of experimental research. The Handbook of Experiments Economic. eds Kagel lit Roth AE (Princeton Univ Press. Prin(eton).
1967
19061e
S. Diekmann A (1985) Volunteer's dilemma. 1 Conflict Resoled 29:60S-610.
1968
i96I"
6. HAW C. Trauisen A. Sigmund K Partners co nvals7 Strategies to, the Iterated prisoners dilemma. Working Paper. 2013.
1969
7. HAW C Rohl T. Milinski IA (2014) Extortion subdues human players but Is finally punished in the prisoners dilemma. Nat Common 5:3976.
1908
8. Stewart Al, Plotkin M (2012) Extortion and cooperation in the prisoner's dilemma Poe Nail Aced Sci USA 109(26):10134-1013S.
1970
1909
9. Hilbe C. Nowak MA. Sigmund K (2013) Evolution of extortion In Iterated Prisoners Dilemma games. Prot Nate Atad Sc? USA 110071:6913-6918.
1971
10. Stewart Al, Plotkin 1B(2013) From extortion to generosity, evolution in the Iterated Prisoner's Dilemma. Proc Nat1 Aced So USA 110(30:15348-15353.
1910
1972
11. Hite C, Novak MA, Trauben A (2013) Adaptive dynamics of extortion and compliance. PtoS ONE 8111)x+77886.
191m
n. szomoki A. Pert M (2014) Evolution of extortion In structured populations. Phys Rev E stet monansott matter PhyS 89:022804.
1973
iusin
13. Pan 1., Ito D, Rang 2, thou T (2014) Zero-determinant strategies in the iterated pubic goods game.
1974
14. NOwak 6A. Sigmund K (1993) A Strategy 01 win-Start lose-shill that outperforms tit-fa-tat in the Prisoners Dilemma game. Nature 344(6432):56-93.
19 Via
IS. Haven C. Schuster HG (1997) Effects of increasing the number of players and memory size In the iterated prisoners dilemma: a numerical approach. I00( WO SO 26t513-519.
1975
1914
IL Sigmund K (2010) The Calculus of Se/Pshness (Princeton Mir Press, Princeton).
1976
19 811
A. Friedman 1O971) A noncooperative eguikbrium for suPergarnes. Rev Ron Stud 313i-12.
1977
1B. Fudenberg D. Mackin E (19.96) the folk theorem in repeated games with discounting or with incomplete information. Econometrica S4(3):533-554.
1916
19. Boyd R. Richerson Pi (1988) The evolution of reciprodty in sizable grown./ Meer anal 132(3):337-356.
1978
1917
20. Kurokawa 5, tiara V (2009) emergence of cooperation In pude( goods games. Prix Old SCi 2760660):1379-13M.
1979
21. Chen 1, Zinger A (2014) The robustness of zero-determinant strategies in iterated Prisoner's Dilemma games. 1 Theor Bid 387%6-54.
1918
1980
1919
1981
1920
1982
1921
1983
1922
1984
Hilbe et al. www.pnas.OrgAgikontentishort/1407887111
16 of 19
EFTA01199770
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
A
towage peon
or co-players
R'
Public Goods Game
•c -c ltd
10.03
1.
Payoff of
bcal player
B
Volunteers Dilemma
Average00AB
OICO-PlOYenl
*o av ha
Payoff of
fccatplayer
Fig. 51. Illustration of 20 strategies in the case of equal weights, iv) =1/(n —1), for all j ii, and for (A) the linear public goods game and (8) the volunteers
dilemma. The blue-shaded area represent all feasible payoffs, with the x axis representing the payoff of player i and they axis representing the mean payoff
of is coplayers. The dashed diagonal gives the payoff combinations for which rri =re'. In both graphs, the strategy of player i is fixed to some ZD strategy,
wfiereas for the coplayers we sampled 104 random memory-one strategies. Red dots represent the resulting payoff combinations, and the gray line gives the
2002
prediction according to Eq. 513. For both graphs, we considered an infinitely repeated game in a group of size n=0. Parameters: (A) public goods game
2003
1a1= Q+ 1)rcin — c and bi =jrcinj with r = 2.4 and c = 1. For the strategy of player i, we used a generous ZD strategy with parameters / =rc—c, s = 2/3, 4=1/2.
2004
(8) Volunteers dilemma (ai = b -
blip =b, and by =0) with b=1.S, c=1; player i applies an extortionate strategy with parameters 1=0, s =9/10, 0=1/2.
2005
2006
2007
A
Public Goods Game
2008
snot
2009
2010
1
2011
2012
*It
2014
*AS-
n
n
2016
b
et
2015
2017
2018
o
Bawer*
of
l
Baseline
2019
payoff
-e
Parr(
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2093
2032
2094
2033
2095
2034
2096
2035
2097
2036
2098
2037
2099
2038
2100
2039
2101
2040
2102
2041
2103
2042
2104
2043
2105
2044
2106
2045
2107
2046
2108
2013
B
Volunteers Dilemma
Slope.
Fig. 52. Enforceable payoff relations in the case of equal weight on all coplayers for (A) the linear public goods game and (8) the volunteers dilemma. A pair
(1,$) is enforceable for a given group size n if the point is within the respectively shaded area. The set of enforceable pairs for large n is a subset of the re-
spective set for smaller n, i.e., the set of enforceable pairs shrinks with increasing group size. Parameters: (A) linear public goods game with r = 2.4, c=1; (B)
volunteer's dilemma with b=1.5, c=1.
2047
2048
2049
2050
2051
2052
2053
2054
2055
2056
2057
2058
2059
2060
2061
2062
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
2078
2079
2080
2081
2082
2083
2084
2085
2086
2087
2088
2089
2090
2091
2092
Hite et al. www.poes.org/cgikontentishortiMON14711I
17 of 19
EFTA01199771
2109
2110
2111
2112
2113
2114
2115
2116
2117
2118
2119
2120
2121
2122
2123
2124
2125
2126
2127
2128
2129
2130
2131
2132
2133
2134
Slopes
2135
2136
1
2137
2138
2139
2140
2141
2142
2143
2144
2145
2146
2147
2148
2149
2150
2151
2152
2153
-1
2154
2155
2156
2157
2158
2159
2160
2161
2162
2163
2164
2165
2166
2167
2168
2169
2170
5
a
a
to
04
g
2
2
4
5
6
7
Group size n
Grim & TFTn-1
WSLS
Grim & TFT
Ng. 53. Stable memory-one strategies for the linear public goods game. The figure illustrates for which parameter regions the strategies Grim, TETA, and
WSLS are Nash equilibria, provided that the public goods game constitutes a social dilemma
1 <r <n). Grim and TfT,i are always Nash equilibria. 7P7 for
k <n— I is never a Nash equilibrium. WSLS is a Nash equilibrium when (N.' +b0)/2 san.i, which yields r z2n,t(n +1). In particular, WSLS is always a Nash
equilibrium when r22, irrespective of group size.
=1
=2
3
Baseline
payoff
Ng. 54. Strategic power of strategy alliances in the public goods game. Each colored area illustrates the set of enforceable payoff relations according to
Proposition 6 for different alliance sizes M. T e set of enforceable payoff relations for small nA is a subset of the respective set with larger nA. Consequently,
the larger the alliance, the more extreme payoff relations it can enforce. Parameters: linear public goods game with r= 2 4. c =1, and group size n=10.
2171
2172
2173
2174
2175
2176
2177
2178
2179
2180
2181
2182
2183
2184
2185
2186
2187
2188
2189
2190
2191
2192
2193
2194
2195
2196
2197
2198
2199
2200
2201
2202
2203
2204
2205
2206
2207
2208
2209
2210
2211
2212
2213
2214
2215
2216
2217
2218
2219
2220
1177
2223
2224
2225
2226
2227
2228
2229
2230
2231
2232
Hite et el. www.pnes.OrgeglkOntenthhcet/140788711I
18 of 19
EFTA01199772
2233
2234
2235
2236
2237
2238
2239
2240
1141
2242
2243
2244
2245
2246
2247
2248
2249
2250
2251
2252
2253
2254
2255
2256
2257
2258
2259
2260
1161
2262
2263
2264
2265
2266
2267
2268
2269
2270
2271
2272
2273
2274
2275
2276
2277
2278
2279
2280
2281
2282
2283
2284
2285
2286
2287
2288
2289
2290
2291
2292
2293
2294
A
i.o
a 0.B
5 0.6
10.4
02
C.2 0.0
a
2.0
1g 1.6
ros
0.5
Os
Extortionate alliance
100.000
200.000
AIlance
OWSklerS
0
100.000
lime
200.000
B
l.0
OA
OA
0.4
02
OA
Fair alliance
PAisnce ( Outsiders
2.0
Is
1.0
os
OA
100.000
200.000
Miaow
100.400
0
Time
200.000
C
Generous alliance
1.0
OA
OA
0.4
02
00
0
2.0
I.5
IA
OA
AO
100.000
200.000
row=
Outsiders
ance
0
100.000
lime
200400
Fog. SS. Performance of three different 2O alliances against adaptive outsiders. For each simulation, the strategy of the 2O alliance was fixed, whereas
outsiders were allowed to adapt their strategy as described in the text. (Upper) Average cooperation rate during each repeated game. (Lower) Resulting
payoffs for allies and outsiders. fad) panel depicts the average of 20 simulations. All simulations were run for a public goods game with r =3 and c= 1, in
a group of size n = S with n" =2 allies. For the strategies of the ZO alliances we used (A) an extortionate strategy with =0, s = 0.8, and #=1/2; (8) proportional
Tit-for-Tat with s = 1 and 0=1; and ft') a generous strategy with i= rc — c= Z 5=0.8, and #=1/2.
PNAS prooi
Embargoed
2295
2296
2297
2298
2299
2300
2301
2302
2303
2304
2305
2306
2307
2308
2309
2310
2311
2312
2313
2314
2315
2316
2317
2318
2319
2320
2321
2322
2323
2324
2325
2326
2327
2328
2329
2330
2331
2332
2333
2334
2335
2336
2337
2338
2339
2340
2341
2342
2343
2344
2345
2346
2347
2348
2349
2350
2351
2352
2353
2354
2355
2356
14110e et el. mvw.pelaS.049/CgikOntent/160a/1407687111
19 of 19
EFTA01199773
AUTHOR QUERIES
AUTHOR PLEASE ANSWER ALL QUERIES
1
Q: 1_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This is merely done...."
Q: 2_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This always holds ...."
Q: 3_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This allows us to...."
Q: 4_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This holds more generally...."
Q: 5_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This holds more generally ...."
Q: 6_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This allows us to...."
Q: 7_Please clarify if (2, 10) refers to references or equation numbers.
Q: 8_Please clarify if (2, 6) is equations or references.
Q: 9_Please rewrite the following for clarity: "...is also a stable...."
Q: 10_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
`This": "This shows that...."
Q: 11_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
`This": "This is condition ...."
Q: 12_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": `This shows that the...."
Q: 13_For SI ref. 2, please provide journal name, volume, issue, and page range.
Q: 14_For SI ref. 5, please provide issue number.
Q: 15_Is SI ref. 6 published. References must be published to be listed in the reference list. Please provide
type of reference and all necessary publishing information or delete if unpublished and renumber
reference section accordingly.
Q: 16_For SI ref. 12, please provide issue number.
Q: 17_For SI ref. 13, please provide journal name, volume, issue, and page range.
Q: 18_For SI ref. 15, please provide issue number.
EFTA01199774
AUTHOR QUERIES
AUTHOR PLEASE ANSWER ALL QUERIES
2
Q: 19_For SI ref. 17, please provide issue number.
EFTA01199775
Technical Artifacts (39)
View in Artifacts BrowserEmail addresses, URLs, phone numbers, and other technical indicators extracted from this document.
Domain
www.imes.orgDomain
www.pfsas.orgDomain
www.poes.orgEmail
[email protected]Phone
(3315101243Phone
1231
1170Phone
1237
1176Phone
1258
1320Phone
1291
1353Phone
1389
1451Phone
1435
1375Phone
1560-1563Phone
1589
1590Phone
1660)1379Phone
1760
1761Phone
1845
1846Phone
1849
1788Phone
1894
1895Phone
1899
1898Phone
2760660Phone
291-2294Phone
3003614363Phone
3743386Phone
6404321Phone
641291-1298Phone
714
1715Phone
7687111Phone
7811711Phone
7837111Phone
7887111Phone
913-6918Phone
9134918Phone
929-9934Phone
931
1853Wire Ref
ReferencesWire Ref
referenceWire Ref
referencesWire Ref
referringForum Discussions
This document was digitized, indexed, and cross-referenced with 1,400+ persons in the Epstein files. 100% free, ad-free, and independent.
Annotations powered by Hypothesis. Select any text on this page to annotate or highlight it.