From: "Nowak, Martin" <=I From: "Nowak, Martin"

Case File

efta-efta01199747DOJ Data Set 9Other

DS9 Document EFTA01199747

Date

Unknown

Source

DOJ Data Set 9

Reference

efta-efta01199747

Pages

Persons

Integrity

No Hash Available

Summary

Ask AI About This Document

0upvotesShare

Post Reddit

Save Post Watch

Extracted Text (OCR)

EFTA Disclosure

Text extracted via OCR from the original document. May contain errors from the scanning process.

3 4 " " Christian Hilbe' ", Bin Wub, Arne Traulsenb, and Martin A. Nowak" 5 ,,n,Cooperation and control in multiplayer social dilemmas 0:7.6.9 °Program for Evolutionary Dynamics, Harvard University, Cambridge, MA 0213t °Department for Evolutionary Theory, Max Planck Institute for 6 Evolutionary Biology, 20306 Plan, Germany; and 'Department of Organismic and Evolutionary Biology and Department of Mathematics, Harvard University, 7 Cambridge, MA 02138 8 9 10 II 12 13 14 5 16 17 I8 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 Edited by Joshua B. Plotkin, University of Pennsylvania, Philadelphia, PA. and accepted by the Editorial Board September 26, 2010 (received for review April 30, 2010) Direct reciprocity and conditional cooperation are important mecha- nisms to prevent free riding in social dilemmas. However, in large groups, these mechanisms may become ineffective because they re- quire single individuals to have a substantial influence on their peers. However, the recent discovery of zero-determinant strategies in the iterated prisoner's dilemma suggests that we may have underesti- mated the degree of control that a single player can exert. Here, we develop a theory for zero-determinant strategies for multiplayer social dilemmas, with any number of involved players. We distinguish several particularly interesting subclasses of strategies: fair strategies ensure that the own payoff matches the average payoff of the group; extortionate strategies allow a player to perform above average; and generous strategies let a player perform below average. We use this theory to descnbe strategies that sustain cooperation. induding generalized variants of Tit-for-Tat and Win-Stay Lose-Shift. Moreover, we explore two models that show how individuals can further enhance their strategic options by coordinating their play with others. Our results highlight the importance of individual control and coordination to succeed in large groups evolutionary game theory I alliances I public goods game I volunteer's dilemma I cooperation C ooperation among self-interested individuals is generally difficult to achieve (1-3), but typically the free rider problem is aggravated even further when groups become large (4-9). In small communities, cooperation can often be stabilized by forms of direct and indirect reciprocity (10-17). For large groups, how- ever, it has been suggested that these mechanisms may turn out to be ineffective, as it becomes more difficult to keep track of the reputation of others and because the individual influence on others diminishes (4-8). To prevent the tragedy of the commons and to 40 compensate for the lack of individual control, many successful 41 communities have thus established central institutions that enforce 42 mutual cooperation (18-22). 43 However, a recent discovery suggests that we may have un- derestimated the amount of control that single players can exert in repeated games. For the repeated prisoners dilemma, Press and Dyson (23) have shown the existence of zero-determinant strategies (or ZD strategies), which allow a player to unilaterally enforce a linear relationship between the own payoff and the coplayer's 48 payoff, irrespective of the coplayer's actual strategy. The class of 49 zero-determinant strategies is surprisingly rich: for example, a player who wants to ensure that the own payoff will always match the coplayer's payoff can do so by applying a fair ZD strategy, like Tit- for-Tat. On the other hand, a player who wants to outperform the respective opponent can do so by slightly tweaking the Tit-for-Tat strategy to the own advantage, thereby giving rise to extortionate 54 ZD strategies. The discovery of such strategies has prompted sev- 55 eral theoretical studies, exploring how different ZD strategies 56 evolve under various evolutionary conditions (24-30). 57 ZD strategies are not confined to the repeated prisoner's di- 58 lemma. Recently published studies have shown that ZD strate- 59 gies also exist in other repeated two player games (29) or in repeated public goods games (31). Herein, we will show that such strategies exist for all symmetric social dilemmas, with an arbi- 61 trary number of participants. We use this theory to describe 62 which ZD strategies can be used to enforce fair outcomes or to 44 45 46 47 50 5I 52 53 prevent free riders from taking over. Our results, however, are not restricted to the space of ZD strategies. By extending the techniques introduced by Press and Dyson (23) and Akin (27), we also derive exact conditions when generalized versions of Grim, Tit- for-Tat, and Win-Stay Lase-Shift allow for stable cooperation. In this way, we find that most of the theoretical solutions for the re- peated prisoner's dilemma can be directly transferred to repeated dilemmas with an arbitrary number of involved players. In addition, we also propose two models to explore how indi- viduals can further enhance their strategic options by coordinating their play with others. To this end, we extend the notion of ZD strategies for single players to subgroups of players (to which we refer as ZD alliances). We analyze two models of ZD alliances, depending on the degree of coordination between the players. When players form a strategy alliance, they only agree on the set of alliance members, and on a common strategy that each alliance member independently applies during the repeated game. When players form a synchronized alliance, on the other hand, they agree to act as a single entity, with all alliance members playing the same action in a given round. We show that the strategic power of ZD alliances depends on the size of the alliance, the applied strategy of the allies, and on the properties of the underlying social dilemma. Surprisingly, the degree of coordination only plays a role as alliances become large (in which case a synchronized alliance has more strategic options than a strategy alliance). To obtain these results, we consider a repeated social dilemma between n players. In each round of the game, players can decide whether to cooperate (C) or to defect (D). A player's payoff depends on the player's own decision and on the decisions of all other group members (Fig. 1A): in a group in which/ of the other group members cooperate, a cooperator receives the payoff al, whereas a defector obtains b,. We assume that payoffs satisfy the Significance Many of the world's most pressing problems, like the prevention of climate change, have the form of a large-scale social dilemma with numerous involved players. Previous results in evolutionary game theory suggest that multiplayer dilemmas make it partic- ularly difficult to achieve mutual cooperation because of the lack of individual control in large groups. Herein, we extend the theory of zero-determinant strategies to multiplayer games to describe which strategies maintain cooperation. Moreover, we propose two simple models of alliances in multiplayer dilemmas. The effect of these alliances is determined by their size, the strategy of the allies, and the properties of the social dilemma. When a single individual's strategic options are limited, forming an alliance can result in a drastic leverage. Author contrbutions: B.W. initiated the project; CN. B.W. AT, and M.A.N. designed research; C.H.,B.W., AT., and MAN. performed research; At and MAN. analysed data and C H and 6 W. wrote the paper. The authors declare no conflict of interest. 'Mks artkle Is a PNAS Direct Submission. J.B.P. is a guest editor Invited by the mow Board. Freely available online through the PNAS open access option. 'To whom correspondence should be addressed. Email: nitiorlasriarraisethi. 'nth article contains supporting information online aUnww.pnas.orgiloalruWsupplidoi:10. ion/linos motion imxsowir mental. b.; 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 97 98 99 100 101 102 103 1(14 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 vnewpnas.orgrcglidoi/10.1073/pnas.1407887111 PNAS Early Edition I 1 of 6 EFTA01199747 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 I50 I51 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 ISO 181 182 183 184 185 186 following three properties that are characteristic for social dilemmas (corresponding to the individual-centered interpretation of altruism in ref. 32): (i) irrespective of the own strategy, players prefer the other group members to cooperate (aft' ≥ aj and bp,. ≥ bj for allj); (ii) within any mixed group, defectors obtain strictly higher payoffs than cooperators > aj for all j); and (iii) mutual co- operation is favored over mutual defection (an_i> bo). To illustrate our results, we will discuss two particular examples of multiplayer games (Fig. 1B). In the first example, the public goods game (33), cooperators contribute an amount c > 0 to a common pool. knowing that total contributions are multiplied by r (with 1 <r < it) and evenly shared among all group members. Thus, a cooperator's payoff is rc a = (j + 1)/n — c, whereas defectors yield bj=rcj/n. In the second example, the volunteer's dilemma (34), at least one group member has to volunteer to bear a cost c> 0 in order for all group members to derive a benefit h>c. Therefore, cooperators obtain cif= b —c (irrespective of j), whereas defectors yield hj = b if j a 1 and bo =0. Both examples (and many more, such as the collective risk dilemma) (7, 8, 35) are simple instances of multiplayer social dilemmas. We assume that the social dilemma is repeated, such that in- dividuals can react to their coplayers' past actions (for simplicity, we will focus here on the case of an infinitely repeated game). As usual, payoffs for the repeated game are defined as the average payoff that players obtain over all rounds. In general, strategies for such repeated games can become arbitrarily complex, as subjects may condition their behavior on past events and on the round number in nontrivial ways. Nevertheless, as in pairwise games, ZD strategies turn out to be surprisingly simple. A B 3 2 g C 0 2 4 6 a 10 Number of cooperators among co-players a.1 1}.2 .... 2 1 0 Cooperators payoff an-r arr.2 ... az as no Defectors payoff bn-s bn-2 b2 br bo 2 a l 0 Volunteers Dilemma Dete c. ‘. ctor 4 , b0.0. to 4, 1 .—e—e—crathr arb—C 0 2 4 6 8 10 Number et cocperabng co-64eyere Plumber of oocperalfrig co-players ZD Alliance Outsiders 1 1 Fig. 1. Illustration of the model assumptions for repeated soda! dilemma (A) We consider symmetric n-player soda' dilemmas in which each player can either cooperate or defect The players payoff depends on its own decision and on the number of other group members who decide to cooperate. (B) We will discuss two particular examples: the public goods game (in which payoffs are pro- portional to the number of cooperators) and the volunteers dilemma (as the most simple example of a nonlinear social dilemma). (C) In adcfrtion to individual strategies, we will also explore how subjects can enhance their strategic options by coordinating their play with other group members. We refer to the members of such a ID alliance as allies, and we call group member that are not part of the 2D alliance outsiders. Outsiders are not restricted to any particular strategy. Some or all of the outsiders may even form their own alliance. Results Memory-One Strategies and Akin's lemma ZD strategies are memory-one strategies (23, 36); they only condition their behavior on the outcome of the previous round. Memory-one strategies can be written as a vector p= (Pca Pc o.PnA-4 Poo)- The entries ps) denote the probability to cooperate in the next round, given that the player previously played S E {C. O} and that j of the coplayers cooperated (in the SI Tar, we present an extension in which players additionally take into account who of the coplayers cooperated). A simple example of a memory-one strategy is the strategy Repeat. prim, which simply reiterates the own move of the previous round, pic.i7 =1 and 147 =0. In addition, memory-one strategies need to specify a cooperation probability pa for the first round. However, our results will often be independent of the initial play, and in that case we will drop Po. Let us consider a repeated game in which a focal player with memory-one strategy p interacts with n —1 arbitrary coplayers (who are not restricted to any particular strategy). Let vs4(r) denote the probability that the outcome of round t is (S,j). Let v(0= (t) vao(t)] be the vector of these probabilities. A limit distribution v is a limit point for a' —• co of the sequence tv(1)+ +v(t)Wr. The entries vs, of such a limit distribution correspond to the fraction of rounds in which the focal player finds herself in state (S. j) over the course of the game. There is a surprisingly powerful relationship between a focal player's memory-one strategy and the resulting limit distribution of the iterated game. To show this relationship, let qc(r) be the probability that the focal player cooperates in round r. By definition of pRiv we can write qc(r) = pRA° • v(1)=Evcs-i (0+ ... +vco(0). Similarly, we can express the probability that the focal player cooperates in the next round as qc(r + I) = p • v(t). It follows that qc(r +1)— qc(t)=(p— pRc") • v(t). Summing up over all rounds from 1 to t, and dividing by t. yields (p — pR•17)• iv(I)+ v(r))/r= [qc(r+ I) —qc(1)1/r, which has absolute value at most IA By taking the limit r co we can conclude that (p —pRe0)•v=0. This relation between a player's memory-one strategy and the resulting limit distribution will prove to be extremely useful. Because the importance of Eq. 1 has been first highlighted by Akin (27) in the context of the pairwise prisoner's dilemma, we will refer to it as Akin's lemma. We note that Akin's lemma is remarkably general, because it neither makes any assumptions on the specific game being played nor does it make any restrictions on the strat- egies applied by the remaining n —1 group members. zero -Determinant Strategies in Multiplayer Social Dilemmas. As an application of Akin's lemma, we will show in the following that single players can gain an unexpected amount of control over the resulting payoffs in a multiplayer social dilemma. To this end, we first need to introduce some further notation. For a focal player i, let us write the possible payoffs in a given round as a vector g = (es), with g'n =a) and eDi =b). Similarly, let us write the average payoffs ores coplayers as r= (gr), where the entries are given by k g., =fra) + (n —j-1)br ibl(n — 1) and gilo =frafri +(n —j —1)64/(n — I). Finally, let 1 denote the 2n- dimensional vector with all entries being one. Using this notation, we can write player Ps payoff in the repeated game as x' = g' • v, and the average payoff of ts coplayers as = • v. Moreover, by defini- tion of v as a limit distribution. it follows that I • v= 1. After these preparations. let us assume player f applies the memory-one strategy P= +4 +//C+71. [2] with a, p, and y being parameters that can be chosen by player i (with the only restriction that p#0). Due to Akin's lemma, we can conclude that such a player enforces the relationship 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 2 of 6 I www.pnes.orgfcgildoi/10.1073Mnas.1407837 III Hilbe et al. EFTA01199748 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 0 = (p - pile/ • v = (cre +fie +71)v =ad +fir' +y. 131 Player i's strategy thus guarantees that the resulting payoffs of the repeated game obey a linear relationship, irrespective of how the other group members play. Moreover, by appropriately choosing the parameters a, ft, and y, the player has direct control on the form of this payoff relation. As in Press and Dyson (23), who were first to discover such strategies for the prisoner's di- lemma, we refer to the memory-one strategies in Eq. 2 as zero- determinant strategies or ZD strategies. For our purpose, it will be convenient to proceed with a slightly different representation of ZD strategies. Using the parameter transformation 1=-71(a+ fi), s = —al fi, and ¢=—p, ZD strategies take the form p = + OKI -s)(!1-gi) + — [4] and the enforced payoff relationship according to Eq. 3 becomes e 1 =ski +(i -s)1. We refer to 1 as the baseline payoff of the ZD strategy and to s as the strategy's slope. Both parameters allow an intuitive interpre- tation: when all players adopt the same ZD strategy p such that x' =x-', it follows from Eq. 5 that each player yields the payoff 1. The value of s determines how the mean payoff of the other group members e' varies with d. The parameter 0 does not have a direct effect on Eq. 5: however, the magnitude of ¢ de- termines how fast payoffs converge to this linear payoff relation- ship as the repeated game proceeds (37). The parameters 1. s. and efr of a ZD strategy cannot be chosen arbitrarily, because the entries psi are probabilities that need to satisfy 0 <psi < 1. In general, the admissible parameters depend on the specific social dilemma being played. In SI Tier, we show that exactly those relations 5 can be enforced for which either s= 1 (in which case the parameter 1 in the definition of ZD strategies becomes irrelevant) or for which / and s < 1 satisfy j —a-_1 —j— I b 41.1 max fb. </< mM +n osisx-i n — 1 —s osisn-i n — 1 —s 161 It follows that feasible baseline payoffs are bounded by the payoffs for mutual cooperation and mutual defection, b0 s! < a„_i, and that the slope needs to satisfy —1/(n — 1) <s < 1. With s sufficiently close to 1, any baseline payoff between bo and a„_ can be achieved. Moreover, because the conditions in Eq. 6 become increasingly re- strictive as the group size n increases, larger groups make it more difficult for players to enforce specific payoff relationships. Important Examples of ZD Strategist In the following, we discuss some examples of ZD strategies. At first, let us consider a player who sets the slope to s = 1. By Eq. 5, such a player enforces the payoff relation x' =x', such that's payoff matches the average payoff of the other group members. We call such ZD strategies fair. As shown in Fig. 24, fair strategies do not ensure that all group members get the same payoff; due to our definition of social dilemmas, unconditional defectors always outperform unconditional cooperators, no matter whether the group also contains fair players. Instead, fair players can only ensure that they do not take any unilateral advantage of their peers. Our characterization 6 implies that all social dilemmas permit a player to be fair, irrespective of the group size. As an example, consider the strategy proportional lit -for-Tat (pTFT), for which the probability to cooperate is simply given by the fraction of cooperators among the coplayers in the previous round pTFTs- = n —I [71 For pairwise games, this definition of pTFT simplifies to Tit-for- Tat, which is a fair ZD strategy (23). However, also for the public goods game and for the volunteer's dilemma, pTFT is a ZD strategy, because it can be obtained from Eq. 4 by setting s= 1 and ¢=1/c, with c being the cost of cooperation. As another interesting subclass of ZD strategies, let us con- sider strategies that choose the mutual defection payoff as baseline payoff, 1=60, and that enforce a positive slope 0 <s < 1. The enforced payoff relation 5 becomes se"' =sx' + (1 —s)bo, im- plying that on average the other group members only get a fraction s of any surplus over the mutual defection payoff. Moreover, as the slope s is positive, the payoffs x' and le are positively related. As a consequence, the collective best reply for the remaining group members is to maximize i's payoffs by cooperating in every round. In analogy to Press and Dyson (23), we call such ZD strategies extortionate, and we call the quantity x= 1/s the extortion factor. For games in which 1=14=0, Eq. 5 shows that the extortion factor can be written as x = Je/x-I. Large extortion factors thus signal a substantial inequality in favor of player i. Extortionate strategies are particularly powerful in so- cial dilemmas in which mutual defection leads to the lowest group payoff (as in the public goods game and in the volunteer's dilemma). In that case, they enforce the relation Ki > cc; on average, player i performs at least as well as the other group members (as also depicted in Fig. 2B). As an example, let us consider a public goods game and a Z1D strategy pEr with 1=0, =n IK" —r)sc+rcl. for which Eq. 4 implies - I [i (1 scir+t -Ir)si. n - I Ig] independent of the players own move Se {C.D}. In the limit s I, pa approaches the fair strategypTFT. Ass decreases from 1, the cooperation probabilities of ey are increasingly biased to the own advantage. Extortionate strategies exist for all social dilemmas (this follows from condition [6] by setting 1=b0 and choosing an s close to 1). However, larger groups make extor- tion more difficult. For example, in public goods games with n > r fir — I), players cannot be arbitrarily extortionate any longer as [6] implies that there is an upper bound on,' (SI Tea). As the benevolent counterpart to extortioners, Stewart and 350 Plotkin described a set of generous strategies for the iterated 351 prisoner's dilemma (24, 28). Generous players set the baseline 352 payoff to the mutual cooperation payoff I= a„._i while still enforcing a positive slope 0<s < 1. This results in the payoff con 353 relation c' =so' + (1 —5)0„_1. In particular, for games in which 354 mutual cooperation is the optimal outcome for the group (as in the public goods game and in the prisoner's dilemma but not in the volunteer's dilemma), the payoff of a generous player sat- isfies x' < (Fig. 2C). For the example of a public goods game, we obtain a generous ZD strategy p' by setting 1=rc—c and ep —r)sc +rc], such that 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 1 Ge I „ ti 1 n(r —1) Ps n + (I s) n —I r+(n—r)s. Ig1 For s 1, ptt approaches the fair strategy pTFT, whereas lower values of s make pcw more cooperative. Again, such generous strategies exist for all social dilemmas, but the extent to which players can be generous depends on the particular social di- lemma and on the size of the group. As a last interesting class of ZD strategies, let us consider players who chooses =0. By Eq. 5, such players enforce the payoff relation x'=1, meaning that they have unilateral control over the mean payoff of the other group members (for the prisoner's dilemma, such equalizer strategies were first discovered in ref. 38). However, 345 346 347 348 349 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 Hilbe et el. PNAS lefty Edition I 3 o16 EFTA01199749 373 374 375 376 377 378 379 380 Q:26 the other group members. This does not imply that o.s OA groupmembers yield 0. 381 all other the same payoff. (8) D 20 40 60 DO IDO 40 BO 0 20 BO 100 D 20 40 60 HO IGO Flotnd Number Round Round Number For games in which mutual defection leads to the 382 lowest group payoff, extortionate players ensure that their payoffs are above average. (C) In games in which mutual cooperation is the social optimum, 383 generous players let their coplayers gain higher payoffs. The three graphs depict the case of a public goods game with r =4, c =1, and group size n=20. For 384 the strategies of the other group members, we used random memory-one strategies, where the cooperation probabilities were independently drawn from 385 a uniform distribution. For the strategies of the focal player, we used (A)pift (8) p” with s =0.8, and (C) pc* with s =0.8. 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 1101 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 Fig. 2. Characteristic dynamics of payoffs over the course of the game for three different 2D strategies. Each panel depicts the payoff of the focal player fr' (blue) and the average payoff of the other group members g-' (red) by thick lines. Additionally, the individual payoffs of the other group members are shown as thin red lines. (A) A fair player ensures that the own payoff matches the mean payoff of A 2.5 gists . as FEW strategy Focal 13, paysasi c Mee OretiOMeoters B 2_5 ExtallCilete Strategy Focal 01aYOr °0e( , I gray marrtors C 2.5 Gene/00$ SeategY Other *9v members . 'Sc Focal Fiery unlike extortionate and generous strategies, equalizer strategies typically cease to exist once the group size exceeds a critical threshold. For the example of a public goods game this thresh- old is given by n =211(r — 1). For larger groups, single players cannot determine the mean payoff of their peers any longer. Stable Cooperation In Multipbyer Sodal Dilemmas. Let us next ex- plore which ZU strategies give rise to a Nash equilibrium with stable cooperation. In S/ Text, we prove that such ZD strategies need to have two properties: they need to be generous (by setting / =a„_i and s> 0), but they must not be too generous [the slope needs to satisfy s ≥ (n — 2)/(n —1)1. In particular, whereas in the repeated prisoner's dilemma any generous strategy with s> 0 is a Nash equilibrium (27, 28), larger group sizes make it increasingly difficult to uphold cooperation. In the limit of infinitely large groups, it follows that s needs to approach 1, suggesting that ZD strategies need to become fair. For the public goods game. this implies that stable cooperation can always be achieved when players cooperate in the first round and adopt proportional Tit- for-Tat thereafter. Interestingly, this strategy has received little attention in the previous literature. Instead, researchers have fo- cused on other generalized versions of Tit-for-Tat, which oo- operate if at least k coplayers cooperated in the previous round (4, 39, 40). Such memory-one strategies take the form psi =0 if j <k and psi = 1 if ≥k. Unlike pTFT, these threshold strategies nei- ther enforce a linear relation between payoffs, nor do they induce fair outcomes, suggesting that pTFT may be the more natural generalization of Tit-for-Tat in large-scale social dilemmas. In addition to the stable ZD strategies, Akin's lemma also allows us to characterize all pure memory-one strategies that sustain mutual cooperation. In SI Text, we show that any such strategy p needs to satisfy the following four conditions an-i —ao pea-I=1. PC A-2= 0, pal S and an-i —00 PODS bn_i — a „..1' with no restrictions being imposed on the other envies psi. The first conditionpe„_i = 1 ensures that individuals continue to play C after mutual cooperation; the second condition pc,,- 2 = 0 guaran- tees that any unilateral deviation is punished; and the last two conditions describe whether players are allowed to revert to co- operation after rounds with almost uniform defection. Surprisingly, only these last two conditions depend on the specific payoffs of the social dilemma As an application, condition 10 imply that the threshold variants of Tit-for-Tat discussed above are only a Nash equilibrium if they use the most stringent threshold: k =n —1. Such unforgiving strategies, however, have the disadvantage that they are often susceptible to errors: already a small probability that players fail to cooperate may cause a complete breakdown of cooperation (41). Instead, the stochastic simulations by Hauert and Schuster (5) showed that successful strategies tend to cooperate after mutual cooperation and after mutual defection lie., Pea-t =Pao = 1 and psi =0 for all other states (S.j)]. We refer to such a behavior as WSLS, because for painvise dilemmas it corresponds to the Win- Stay, Lose-Shift strategy described by ref. 36. Because of condition 1101, WSLS is a Nash equilibrium if and only if the social dilemma satisfies (6„_1 +b0)12sa„_i. For the example of a public goods game, this condition simplifies to r> 2n1(n + 1). which is always fulfilled for r≥ 2. For social dilemmas that meet this condition, WSLS provides a stable route to cooperation that is robust to errors. Zero-Determinant Alliances. In agreement with most of the theo- retical literature on repeated social dilemmas, our previous analysis is based on the assumption that individuals act in- dependently. As a result, we observed that a player's strategic options typically diminish with group size. As a countermeasure, subjects may try to gain strategic power by coordinating their strategies with others. In the following, we thus extend our the- ory of ZD strategies for single individuals to subgroups of play- ers. We refer to these subgroups as ZD alliances. Because the strategic power of ZD alliances is likely to depend on the exact mode of coordination between the allies, we consider two dif- ferent models: when subjects form a strategy alliance, they only agree on the set of alliance members and on a common ZD strategy that each ally independently applies. During the actual game, there is no further communication between the allies. Strategy alliances can thus be seen as a boundary case of co- ordinated play, which requires a minimum amount of coordi- nation. Alternatively, we also analyze synchronized alliances, in which all allies synchronize their actions in each round (i.e., the allies cooperate collectively, or they defect collectively). In ef- fect, such a synchronized alliance thus behaves like a new entity that has a higher leverage than each player individually. Syn- chronized alliances thus may be considered as a boundary case of coordinated play that requires substantial coordination. To model strategy alliances, let us consider a group of is4 allies, with 1 <nA cu. We assume that all allies make a binding agreement that they will play according to the same ZD strategy p during the repeated game. Because the ZD strategy needs to allow allies to differentiate between the actions of the other allies and the outsiders, we need to consider a more general state space than before. The state space now takes the form (S.P.j-4). The first entry S corresponds to the focal player's own play in the pre- vious round, j4 gives the number of cooperators among the other allies, andsA is the number of cooperators among the outsider& A memory-one strategy p again needs to specify a cooperation prob- abilitypsiAtA for each of the passible states. Using this state space, we can define ZD strategies for a player i in a strategy alliances as p= pReP+Okl —sX/1 — g9+ g —(n4 — l)wAgi —(n — Ill] The vector r 4 contains the average payoff of the other allies for each possible state, and g-4 is the corresponding vector for the 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 4 of 6 I www.poes.orgicgikloii10.10TYpnas.140T88211t Hilbe et al. EFTA01199750 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 outsiders. The weights w-4 ≥ 0 and iv4 ≥ 0 are additional para- meters that determine the relative importance of outsiders and other allies, being subject to the constraint (n4 -1)w4+ (n -nA)w-A = 1. In the special case of a single player forming an alliance, nA = 1, this guarantees that the two definitions of ZD strategies 4 and 11 are equivalent. Similarly to the case of single individuals, we can apply Akin's lemma to show that strategy alliances enforce a linear relation- ship between their own mean payoff ? and the mean payoff of the outsiders rir4 (for details, see SI Text) x-4 =eV + (1 - sit 112] where the slope of the alliance is given by s4 = - (nA - I )w4]/ [1 — (n4 - I)r 4]. A strategy alliance can enforce exactly those payoff relationships 12 for which either el = 1 or for which I and sA c 1 satisfy the conditions max a( b, — -A 1 -3A bt — ≤/≤ min +n —j— 1 —a, 114-1Q9I-I n —nA 1 -34 ' 113) Interestingly, to reach this strategic power, an alliance needs to put a higher weight on the within-alliance payoffs (i.e., w4 needs to exceed ;1r4; SI Tett), such that the allies are stronger affected by what the other allies do, as opposed to the actions of the outsiders. For single player alliances, n4 = 1. condition 13 again simplifies to the previous condition 6. However, as the alliance size nA increases, condition 13 becomes easier to satisfy. Larger alliances can therefore enforce more extreme payoff relationships. For the example of a public goods game, we noted that single players cannot be arbitrarily extortionate when n > r I (r - 1). Alli- ances, on the other hand, only need to be sufficiently large. nA Ina (r -1)1r. Once an alliance has this critical mass, there are no bounds to extortion. In a similar way, we can also analyze the strategic possibilities of a synchronized alliance. Because synchronized alliances act as a single entity, they transform the symmetric social dilemma between n independent players to an asymmetric game between n —n4 + 1 independent players. From the perspective of the al- liance, the state space now takes the form (S.j), where Se (C. Et) is the common action of all allies and where 0 ≤j ≤n — via is the number of cooperators among the outsiders. ZD strategies for the synchronized alliance can be defined analogously to ZD strategies for single players p = pReP + 0[(1 g-4) +g4 — 1141 with le being the payoff vector for the allies and g-4 being the payoff vector of the outsiders. For a single player alliance, n4 = 1, this again reproduces the definition of ZD strategies in 4. By applying Akin's lemma to Eq. 14, we conclude that synchronized alliances enforce sr-A =SAKA +(I —SA)!, which is the same as re- lationship 12 for strategy alliances. Surprisingly, we even find that for reasonable alliance sizes, nA S n/2, strategy alliances and syn- chronized alliances have the same set of enforceable parameters I and ?, as given by Eq. 13 (see SI Tea for details). Thus, for the two models of ZD alliances considered here, the exact mode of coordination is irrelevant for the alliance's strategic power unless the alliance has reached a substantial size. Table 1 gives an overview of our findings on ZD strategies and ZD alliances in multiplayer social dilemmas. It shows that, al- though generally, ZD strategies exist for all group sizes, the power of single players to enforce particular outcomes typically dimin- ishes or disappears in large groups. Forming ZD alliances then allows players to increase their strategic scope. The impact of a given ZD alliance, however, depends on the specific social di- lemma: although ZD alliances can become arbitrarily powerful in public goods games, their strategic options remain limited in the volunteer's dilemma. Discussion When Press and Dyson (23) discovered the new class of ZD strategies for the repeated prisoner's dilemma, this came as a big surprise (24, 25): after more than five decades of research, it seemed unlikely that any major property of the prisoner's di- lemma has been overlooked. For repeated multiplayer dilemmas the situation is different. Although various Folk theorems guar- 567 antee that cooperation is also feasible in large groups (42, 43), 568 there has been considerably less theoretical research on the 569 evolution of cooperation in repeated multiplayer dilemmas (4, 5, 570 39, 40). This may be due to the higher complexity: the mathe- 571 matics of repeated n-player dilemmas seems to be more intricate, and numerical investigations are impeded because the time to 572 compute payoffs increases exponentially in the number of play- 573 ers (5). Nevertheless, we showed here that many of the results for 574 the repeated prisoner's dilemma can be directly transferred to 575 general social dilemmas, with an arbitrary number of involved 576 subjects. The foundation for this progress is a new framework, ()Az 577 provided by Akin's lemma and the theory of Press and Dyson. 578 Using this framework. we extended the theory of repeated multiplayer dilemmas into three direction& The first and most immediate direction is our finding that ZD strategies exist in all social dilemmas. These strategies allow players to unilaterally dic- tate linear payoff relations, irrespective of the specific social di- lemma being played, irrespective of the group size, and irrespective of the counter measures taken by the other group members. In particular. we showed that any social dilemma allows players to be fair, extortionate, or generous. Each of these strategy classes has its own particular strengths: extortionate strategies give a player a rel- ative advantage compared with the other group members; fair strategies help to avoid further inequality within a group; and generous strategies allow players to revers to mutual cooperation when a coplayer defected by accident. At the same time, ZD strategies are remarkably simple. For example, to be fair in a public goods game, players only need to apply a rule called proportional Tit-for-Tat: if j of the n —1 other group members cooperated in the previous round, then cooperate with prob- ability j/(tt — 1) in the following round. Extortionate and gen- erous strategies can be obtained in a similar way, by slightly modifying pTFT to the own advantage or to the advantage of the others. As the second direction, we explored which ZD strategies and which pure memory-one strategies can be used to sustain co- operation in multiplayer dilemmas. Among ZD strategies, such strategies need to be generous (such that players never try to outperform their peers) (27, 28), but at the same time they must not be too generous. The right degree of generosity depends on the size of the group but not on the specific social dilemma being played. As a rule of thumb, we obtain that in larger groups, subjects are required to show less generosity. As the last direction, we extended the concept of zero- determinant strategies from single players to subgroups of players, to which we refer to as ZD alliances. Depending on the degree of coordination, we explored two forms of ZD alliances: members of a strategy alliance only agree on using a common ZD strategy during the game, but they do not coordinate each of their decisions; members of a synchronized alliance, on the other hand, act as a single entiryt—they either all cooperate or they all defect in a given round. The effect of such ZD alliances depends on the size of the alliance, the applied strategy, and the prop- erties of the underlying social dilemma. In general, we find that by coordinating their play with others, subjects can increase their strategic options considerably. The exact mode of coordination, however, only turns out to play a minor role: As long as the size of the ZD alliance is below half the group size, strategy alliances and synchronized alliances have the same strategic power. In ad- dition to their static properties, ZD strategies for the prisoner's dilemma also have a remarkable dynamic component (23, 44): 559 560 561 562 563 564 565 566 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 Mix et el. PNAS Early Edition I 5 ed 6 EFTA01199751 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 Q:13 659 660 661 662 663 664 Q:14 665 666 667 668 669 Q:15 670 671 672 673 674 Qit; 675 676 677 678 679 680 681 682 Table 1. Strategic power of different ZD strategies for three different social dilemmas Strategy Typical Prisoner's dass property dilemma Public goods game Volunteer's dilemma Fair strategies Extortionate strategies Generous strategies Equalizers A A =A >,r 4 eA = i Always exist Always exist Always exist Always exist Always exist In large groups, single players cannot be arbitrarily extortionate, but sufficiently large ZD alliances can be arbitrarily extortionate In large groups, single players cannot be arbitrarily generous, but sufficiently large ZD alliances can be arbitrarily generous May not be feasible for single players, but is always feasible for sufficiently large ZD alliances Always exist Even large ZD alliances cannot be arbitrarily extortionate Do not ensure that own payoff is below average Only feasible if size of ZD alliance is nA = n -1, can only enforce 1=b —c Analogously to the case of individual players, 20 alliances are fair when they set ? = 1; they are extortionate when 1=130 and 0<s-4 <1; they are generous for i=a,,_i and 0 <54 <1; and they are equalizers when ? =0. For each of the three considered social dilemmas, we explore whether a given ZD strategy is feasible by examining the respective conditions in ref. 13. In the repeated prisoner's dilemma, single players can exert all strategic behaviors (23, 28, 29). Other social dilemmas either require players to form alliances to gain sufficient control (as in the public goods game), or they only allow for limited forms of control (as in the volunteer's dilemma). These results hold both for strategy alliances and for synchronized alliances. when a player commits himself to an extortionate ZD strategy, then adapting coplayers team to cooperate over time. Numerical simulations in the SI show an analogous result for multiplayer dilemmas: when ZD alliances apply strategies with a positive slope, they can trigger a positive group dynamics among the out- siders. The magnitude of this dynamic effect again depends on the size of the alliance, and on the applied strategy of the allies. Here, we focused on ZD strategies; but the toolbox that we apply (in particular Akin's lemma) is more general. As an ex- ample, we identified all pure memory-one strategies that allow for stable cooperation, including the champion of the repeated prisoner's dilemma, Win-Stay Lose-Shift (36,45). We expect that 1. Hardin G (1968) The tragedy of the commons. The populationproblem has notechnical solution; it requite fundamental extension in morality. Science 162(3315101243-1248. 2. Olson M(1971) The Logic or Collective Action: Public Goods and the Theory Ol Groups (Harvard Univ Press, Cambridge, 6,M), Revised Ed. 3. Nowak MA (2006) Five rules for the evolution of cooperation. Science 314(6805k 1560-1563. 4. Boyd R. Rkherson P) 41988) The evolution of reciprocity in sizable groups. r Theo, Dial 132(3):337-356. 5. Hauert C, Schuette' HG (1997) Effects of increasing the number of players and memory size in the iterated prisoners dilemma A numerical approads. Prue Not Sci 264:513-51g 6. Suzuki 5, Akyama E (2037)6w:take of indirect reciprocity in groups of various sizes and comparison with direct reciprocity. l Theo, CON 245(3)539-552. 7. Santos FC, Padsoto 1M 42011) Risk of collective failure provides an escape from the tragedy of the commons. Prot Nate Arad Sc) USA 108(26):10421-10425. 8. Amu Chakra M. Traulsen A (2012) Evolutionary dynamics of strategic behavior h a colkctive-risk dilemma. PLOS Comput Biol8(8)z1002652. 9. Yang Vi, et al. (2013) Nonlinear effects of group size oncollective action and resource outcomes. hoc Ned Aced Sri USA 110(27):10916-10921. 10. Trivets RL (1971) The ay.:Mien of reciprocal altruism. C) Rev BM:4635-57. Axelrod R 11984) The Evolution of Cooperation Maw Books. New York). 12. Alexander R (1987) The Biology of Moral Systems (Aldine de Gruyter, New York). 13. Wedekind C. Milinski M (2000) Cooperation through image goring in humans, Sci- ence 288(5467)350-852. 14. Henrichl, et al. (2001) Cooperation, reciprocity and punishment in fifteen small scale societies. Ans Eton Rev 91:73-78. IS. Nowak MA Sigmund K (2E05) Evolution of irarect redprocity. Nature 437(70641291-1298. 16. van Veelen M, Garcia 1, Rand DG, Nowak MA (2012) Direct reciprocity in structured POPulatiOns. hoc Had AOC( SO USA I09(25):9929-9934. 17. 2agonky BM. Reiter 16, Chatterjee K. Nowak MA (2013) Forgiver triumphs in atter- nating Prisoner's Dilemma. PLOS ONE 8412):e80814. IS. YamagisN T (1986) The provision of a sanctioning system as a public good Jeers Sou Psycho? 51:110-116. 19. Ostrom E 41990) Governing the Commons: The Evolution or Mstitutions for Collective Action (Cambridge Univ Press, Cambridge, UK). 20. Sigmund K, De Silva H, Traulsen A, Hauert C (2010) Social learning promotes in- stitutions for governing the commons. Nature 466(73003614363. 21. Guala F (2012) Reciprocity: Weak or strong? What punishment experiments do (and do nod demonstrate. Behav Drain SO 35(0:1-15. 22. Hite C. Traulsen A. RON Tv Milinski M (2014) Democratic decisions establish stable authorities that overcame the paradox of second.order punishment. Proc Natl Aced So USA 111(2):752-756. 23. Press W14, Dyson Fl (2012) Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent. Proc Natl Aced Sri USA 109(26)10609-10413. them will be further applications of Aldn's lemma to come. Such applications may include, for instance, a characterization of all Nash equilibria among the stochastic memory-one strategies or an analysis of how alliances are formed and whether evolutionary forces favor particular alliances over others (46, 47). Overall, our results reveal how single players in multiplayer games can increase their control by choosing the right strategies and how they can increase their strategic options by joining forces with others. ACKNOWLEDGMENTS. C.H. gratefully acknowledges generous funding by the SchrOdinger scholarship of the Austrian Science Fund 03475). 24. Stewart Al. Plotkin 18 (2012) Extortion and cooperation in the prisoners dilemma. Proc Had had Sd USA (09(26):10134-10135. 25. Ball (2012) PhysicistS suggest selfishness can par' Nature. 0:17 26. Hike C Novak MA, Sigmund K (2013) The evolution of extortion in iterated pris- oners dilemma games. Proc Natl Aced Sci USA 110(17)£9134918. 27. Akin E (2013) Stable cooperative solutions for the iterated prisoners diemma. 0:18 28. Stewart Al. Plotkin Rs (2013) From extortion to generosity, evolution In the Iterated Prisoners Dilemma. Proc Natl Arad Sc) USA 110438):15348-15353. 29. Hite C. Nowak MA, Traulsen A (2013) Adaptive dynamics of extortion and come& ante. PLOS ONE 801):e77886. 30. Szolnoki A, Pere M (2014) Evolution of extortion in structured populations. Phys Rev E Rat NOnlin Soft Matter PhyS 89:022804. ct:19 31. Pan L, Hao 0, Rang Z Zhou (2014)2ero-determinant strategies in the iterated public goods game. 32. Kerr B, G.:dirty-Smith P, Feldman MW (2004) What is altruism? Trends Er& Evoll9(3): 135-140. 33. Ledyard JO (1995) Public goods: A suivey of experimental research. The Handbook of Experimental Economia, eds Kagel JR Roth AE (Princeton Univ Press, Princeton). Q:21 34. Olekmann A (1985) VOlunteerS dilemma. I Contact &whit 29:605-610. ctaz 35. Milinski M, Sommerfeld RD, Kra beck II-1, Reed FA Marotthe 1 (2008) the collective- risk social dilemma and the prevention of simulated dangerous climate change, Par Had Arad Sul LISA 105(71:2291-2294. 36. Nowak M. Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit- for-tat in the Prisoner's Dilemma game. Nature 36404321:56-58. 37. Mite C, ROM T, Milinski M (2014) Extortion subdues human players but is finally Punished In the prisoners dilemma. Nat Common 5:3976. 31. Boedijst MC, Nowak MA Sigmund K (1997) Equal pay for all Prisoners-Am Math Mon 104303-307. 0:23 39. KurOkawa 5, Ihara Y (2009) Emergence of cooperation in public goods games. floc Biel Sri 276(1660)1379-1384. 40. Van seeoreed 5. Pacheco 10.4 Lenaerts T. Santos FC (2012) Emergence of falmess in repeated group interactions. Phys Rev Lett 10/305k158104. 41. Sigmund K (2010) The Calculus of Selfishness (Princeton Vniv Press, Princeton). 42. Friedman 1 (1971) A non-cooperative equiltdurn for supergames. Rev Eton Stud 38:1-12.0:24 43. Fudenberg D. Tirok 10998) Game Theory (MIT Press. cancnoee. MA). 6M Ed. 44. Own 1, Zinger A (2014) The robustness Of zero-ckterminant strategies in iterated Prisoners Dilemma games. / Theo, OM 357:46-54. 45. Kraires DP, KrainesVY (1989) Pavlov and the prisoners dilemma TheogrDecis 28.47-79.0:B 46. Peleg B, Sudhalter P (2003)4160dt/et/on to the Theory of Cooperathe Games (Springer, Berlin). 47. mesterton.ciatons M, Gavrilets S. Gravner 1, Akcay E (2011) Models of coalition or alliance format:kn..; Meer Biel 274(0:187-204. 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 6 of 6 I www.pnaS.OrgfCgild0i/10.1073/pnaS.1407887 1 1 Hilbe et al. EFTA01199752 AUTHOR QUERIES AUTHOR PLEASE ANSWER ALL QUERIES 1 Q: 1_Please contact [email protected] if you have questions about the editorial changes, this list of queries, or the figures in your article. Please include your manuscript number in the subject line of all email correspondence; your manuscript number is 201407887. Q: 2_Please (i) review the author affiliation and footnote symbols carefully, (ii) check the order of the author names, and (iii) check the spelling of all author names, initials, and affiliations. Please check with your coauthors about how they want their names and affiliations to appear. To confirm that the author and affiliation lines are correct, add the comment "OK" next to the author line. This is your final opportunity to correct any errors prior to publication. Misspelled names or missing initials will affect an author's searchability. Once a manuscript publishes online, any corrections (if approved) will require publishing an erratum; there is a processing fee for approved erratum. Q: 3_Please review and confirm your approval of the short title: Cooperation and control in multiplayer dilemmas. If you with to make further changes, please adhere to the 50-character limit. (NOTE: The short title is used only for the mobile app and the RSS feed.) Q: 4_Please review the information in the author contribution footnote carefully. Please make sure that the information is correct and that the correct author initials are listed. Note that the order of author initials matches the order of the author line per journal style. You may add contributions to the list in the footnote; however, funding should not be an author's only contribution to the work. Q: 5_You have chosen the open access option for your paper and have agreed to pay an additional $1350 (or $1000 if your institution has a site license). Please confirm this is correct and note your approval in the margin. Q: 6_Please verify that all supporting information (SI) citations are correct. Note, however, that the hyperlinks for SI citations will not work until the article is published online. In addition, SI that is not composed in the main SI PDF (appendices, datasets, movies, and "Other Supporting Information Files") have not been changed from your originally submitted file and so are not included in this set of proofs. The proofs for any composed portion of your SI are included in this proof as subsequent pages following the last page of the main text. If you did not receive the proofs for your SI, please contact [email protected]. Q: 7_Please check the order of your keywords and approve or reorder them as necessary. Q: 8_If your article contains links to websites (other than the SI links for your article), please verify that the links are valid and will direct readers to the proper webpage. Q: 9_Please provide department/section for affiliation a. Q: 10_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after `This": "This results in the ...." EFTA01199753 AUTHOR QUERIES AUTHOR PLEASE ANSWER ALL QUERIES 2 Q: 11_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after `This": "This may be due of ...." Q: 12_PNAS discourages claims of priority; is this truly new? If not, please either (a) replace the term new with a term such as "previously unidentified" or (b) remove it altogether to avoid the claim of priority. Q: 13_For ref. 5, please provide issue number. Q: 14_For ref. 10, please provide issue number. Q: 15_For ref. 14, please provide issue number. Q: 16_For ref. 18, please provide issue number. Q: 17_For ref. 25, please provide volume, issue, and page range. Q: 18_For ref. 27, please provide journal title, volume, issue, and page range. Q: 19_For ref. 30, please provide issue number. Q: 20_For ref. 31, please provide journal title, volume, issue, and page range. Q: 21_For ref. 33, please provide page range. Q: 22_For ref. 34, please provide issue number. Q: 23_For ref. 38, please provide issue number. Q: 24_For ref 42, please provide issue number. Q: 25_For ref. 45, please provide issue number. Q: 26_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after `This": "This does not imply that ...." EFTA01199754 6 Supporting Information 64 3 6', 4 66 Hilbe et al. 10.1073/pnas.1407887111 5 67 6 SI Text 68 7 In the following. we show how the mathematical framework introduced by Press and Dyson (I) and Akin (2) can be extended to explore 69 8 cooperation and control in multiplayer social dilemmas. We begin by defining the setup of repeated social dilemmas, and then we discuss the 70 9 existence and the properties of zero-determinant strategies. In particular, we study ZD strategies that allow a player to differentiate between 71 10 the actions of different coplayers. We also identify strategies that give rise to stable cooperation. To this end, we focus on two strategy classes: 72 11 ZD strategies and pure memory-one strategies. Then, we investigate how individuals can extend their strategic options by coordinating their 73 12 behaviors with others, and we apply our results to two examples of multiplayer dilemmas: the public goods game and the volunteer's di- 74 13 lemma. The appendix contains the proofs for our propositions. 75 14 76 Setup of the Model: Repeated Multiplayer Dilemmas. We consider repeated social dilemmas between n players (as illustrated in Fig. 1). In 15 each round, players may either cooperate (C) or defect (D), and the players payoffs for each round depend on their own action and on the 77 16 number of cooperators among the other group members. Specifically, in a group with/ other cooperators, a cooperator receives the payoff 78 17 ai, whereas a defector obtains b1. To qualify as a social dilemma, we assume that one-shot payoffs satisfy the following three conditions (3): 79 18 i) Independent of the own action, players prefer their coplayers to be cooperative 80 19 81 /0 8/ 21 ar t a ag and btu a Di for all/with 0 <j< n —1. [S1] 83 22 84 23 85 24 ii) Within each mixed group, defectors strictly outperform cooperators 86 15 87 /6 88 1g KI or all4with 0 Si <n —1. [Sy] 89 90 iii) Mutual cooperation is preferred over metal defection %yr' root 17 /9 91 30 92 31 en- t > ba• [S3] 93 32 94 33 As particular examples of such social dilemmas, we discuss the linear public goods game (4) and the volunteer's dilemma (5) later. 95 M Ro We will assume here that the social dilemma is repeated infinitely often. This is merely done for simplicity of the argument; similar 96 35 results can be obtained for finitely many rounds or for games with discounting of the future as in refs. 6 and 7. In repeated games, 97 36 a player's strategy needs to specify how to act in given round, depending on the outcomes of the previous rounds. Given the strategies 98 of all group members, let us denote player is expected payoff in round t as ai(r). The payoffs for the repeated game are defined as the 37 99 average payoff per round 38 100 1 T 39 101 = lim CO T EAT). [S4] 40 d r -* 102 ml 41 103 42 oa In the following, we will assume that these limits exist. This always holds, for example, when players only base their decisions on a finite 104 43 (but arbitrarily large) number of past rounds. 105 44 106 45 Zero-Determinant Strategies for Multiplayer Dilemmas. 107 46 Memory.one strategies and Akin's lemma. Although in general, strategies for repeated games can be arbitrarily complicated, we showed that 108 players can achieve a remarkable control over the possible payoff relations by referring to the outcome of the last round only. In particular, we 47 109 focused on players who only consider their own move in the previous round and the number of cooperators in the previous round (this is 48 a consequence of our assumption that the game is symmetric, such that payoffs do not depend on who of the coplayers cooperated, but only on 110 49 1 l l how many). Such strategies are particularly relevant when players can only observe the outcome of the game, but not the coplayers' individual 50 actions. In the context of alliances, however, it is useful to consider a slightly more general strategy set, which allows players to distinguish 112 51 between different coplayers. In this section, we will therefore develop a more general theory of ZD strategies. 113 52 To this end, let us denote player i's action in a given round as 51e {C,D}, and let a= (S1 Sa) e (C.Dr denote the overall 114 53 outcome of that round. A memory-one strategy is a rule that tells a player what to do in the next round, given the outcome of the 115 54 previous round. Formally, such memory-one strategies correspond to a map that takes the outcome of the previous round a as input 116 55 and that returns the cooperation probability p,, for the next round, p= (ps)„,ic..Dr. For example, player i's strategy Repeat, which 117 56 simply reiterates the own move from the previous round, takes the form p P with 118 58 'Ns,. = so 0 if Si =D . gel, {1 if S, =C [SS] 119 120 57 59 121 60 Additionally a memory-one strategy also needs to specify a cooperation probability for the first round of the game. Because the outcome 122 61 of infinitely repeated games is often independent of the first round, the initial cooperation probability is typically neglected (I, 2, 8-13). 123 62 In the following, we will therefore only specify a player's initial cooperation probability when necessary. 124 Hite et el. www.pnes.orgAglkontentishort/1407811711t 1 of 19 EFTA01199755 125 If all members of the group use memory strategies, the calculation of payoffs according to Eq. S4 becomes particularly simple. In that 187 126 case, the repeated game can be described as a Markov chain, as the outcome in the next round only depends on the outcome of the 188 127 previous round (14-16). Although the assumption of memory-one strategies often simplifies calculations, we will show below that the 189 128 properties of ZD strategies hold irrespective of the strategies of the other group members (in particular, ZD strategists do not require 190 129 their coplayers to apply memory-one strategies). 191 130 For a game between n players with arbitrary but fixed strategies, let ve(t) be the probability that the outcome of the rth round is a and 192 131 let v(t) = [v"(t)]eelcol^ be the vector of these probabilities. For example, for pairwise games v(t) becomes kat). vcn(r), vrac(0.van(01. 193 132 A limit distribution visa limit point for t-. co of the sequence (v(I) + ...v(t)]/t. The entries of v. of such a limit distribution cor- 194 133 respond to the fraction of rounds in which the group members find themselves in state a e {CO}" over the course of the game. If one 195 134 of the players applies a memory-one strategy p, Akin's lemma again guarantees that there is a powerful relationship between p and v 196 135 (which can be shown with literally the same proof as in the main text). 197 136 198 137 Lonna (Akin's Lemma). Suppose the focal player applies an athitraty memory-one strategy p. Then, for any limiting distribution v (irrespective 199 138 of the outcome of the initial round), we have 200 139 201 140 (p — pRel • v = 0, IS6] 202 141 203 142 where the product refers to the usual scalar product, p•v= r__ ,._,...(arpos• 204 143 We note that Akin's lemma makes no assumptions on the payoff structure of the game (in particular it also applies to games 205 144 that do not have the form of a social dilemma). Moreover, there are no restrictions on the strategies applied by the remaining 206 145 n —1 group members. 207 146 Zero-detenninant strategies. To define zero-determinant strategies, let us first introduce some further notation. For a given round outcome 208 14Pt3 GE {C. D}". let lad denote the total number of cooperators (i.e., 'al equals to the number of Cs in a). This allows us to write player i's 209 148 payoff g, for that round as 210 149 211 150 IS7] 212 egl. .S.) = { aro; I if :St —=DC. 151 213 152 2 Let g' = &e,),„.(cmr be the corresponding payoff vector. Using this notation, we can write player i's payoff in round t as ne(t) = g • v(t), 153 21514 and i's expected payoff for the repeated social dilemma according to Eq. S4 becomes d= g' • v. Finally, let 1 = (1).,Icar denote the 154 216 vector with all entries being equal to one. By definition of v, it follows that 1. v= I. We can now introduce ZD strategies as follows. 155 217 156 Theorem (Press and Dyson). Let a, 4, and y be parameters such that E mpi0 O. If player i applies a memory one strategy of the form 218 157 219 158 II 220 159 P = PReP +cle + E fife + r I • PSI 221 160 Oi 222 162 161 223 then, itrespettive of the strategies of the remaining n —1 group members, payoffs obey the equation 224 163 225 164 a:if+ Eft"? + y =O. 091 226 165 or 227 166 228 We refer to strategies of the form S8 as zero-de:eminent strategies or ZD strategies. 229 167 Proof Follows immediately from Akin's lemma, because 168 230 169 231 170 232 171 0 = (p — pate ) • v = (agt + E 6e + rt •v = azi + E isys + T. pie] 233 172 n" or 234 173 235 174 By using ZD strategies, a player can thus enforce a linear payoff relation between his own payoff and the payoffs of the coplayers. 236 Moreover, by appropriately choosing the parameters a, fir and y, the player has direct control on the form of this payoff relation. 175 237 For our purpose, it will be convenient to use a slightly different representation of ZD strategies. For a player i who applies a ZD 176 238 strategy, let us consider the following parameter transformation: 177 239 178 240 179 -r - a 4 241 180 i_ s=,.-• Mt' = (SII] 242 a +Ek.efik. Lkonk rAiiil' a iii i =°. 46=Efik• L- k 181 k4i 243 182 Using these new parameters, ZD strategies take the form 244 183 245 184 246 185 P = PRI' +Ok i ''' Eine + ( 1 •t)111 , 15121 247 186 i* 248 Hite et el. www.pnes.orgicglkontentishort/140783711 I 2 of 19 EFTA01199756 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 3%4 309 310 { I +95[(1 0a it (blel akr1-01 —Ial subject to the constraints #*0, w, =0, and E74,w, =1, which directly arise from the definitions in Eq. S11. When player i applies such a ZD strategy, the enforced payoff relation according to Eq. S9 becomes rr'= sx' + (1 —s)/. IS13) where = iH7d is the weighted average payoff of the coplayers. We refer to! as the baseline payoff of the ZD strategy, to s as the slope, and tow = (wi ) as the strategy's weights. The parameter # does not have a direct effect on Eq. S13; however, the magnitude of # determines how fast payoffs converge to the enforced payoff relation as the game proceeds (7). Examples (the Impact of different weights): i) Equal weight on all coplayers. Suppose player i applies a ZD strategy with weights voi = I/(n — I) for all j 01. According to Eq. S12, the entries of such a ZD strategy have the form Po= 95[(1—s)(1-41) + if Si =D. if Si =C (S141 The cooperation probabilities of player i thus only depend on the player's own action Si in the previous round and on the number of cooperators H. That is, the ZD strategies that we discussed in the main text are exactly those ZD strategies that use the same weight for each of their coplayers. According to Eq. 513, such strategies enforce tr, ' =sr' +(I —s)/, with tr-' being the arithmetic mean of all coplayers payoffs tr-' =E 1(rt — I). In Fig. SI, we illustrate this relationship for two different social dilemmas (the public goods game and the volunteer's dilemma) and for two different ZD strategies (a generous and an extortionate ZD strategy). Full weight on one coplayer. Let us now suppose instead that player i chooses w such that all entries are zero except for w, =I for some j# i. It follows that 1 +0(1 —s)(1—akh,) 1 +01.saiel..i +(1 s)I] Pe= 4,[sblei — am_i+ (1 —s)/] 0(1 —0 —No if Si=Si=C ifSe=C,Si =D if Si =D.S)=C if Si =Si =D. ISIS] That is, player i's reaction only depends on the number of cooperators. the own move, and on player ts move. The enforced payoff relation 513 becomes Kr =sd + (I —s)1. A player cannot enforce arbitrary payoff relations 513 because the parameters I. s, w, and ¢ need to be set such that the resulting cooperation probabilities according to Eq. S12 are in the unit interval. We thus say that a payoff relation (Ls. w) is enforceable if there is a ¢#0 such that the resulting ZD strategy p satisfies p,, [0, 1] for all possible outcomes aE (C.Dr. The following gives some necessary conditions for enforceable payoff relations. Proposition 1 (Necessary Conditions for Enforceable Payoff Relations).Any enforceable payoff relation (1.s.w) satisfies —r t.'. <s < 1, and ifs 0. In addition to these necessary conditions, one can also give a characterization of all possible payoff relations. Proposition 2 (Enforceable Payoff Relations). Consider a payoff relation (1,s. w) for player i such that ovi> 0 ford/ j. Let fv) denote the sum of the j smallest entries of w (ercluding the entry W1), and let % =0. Then (1.4,w) is enforceable for a given social dilemma if and only if either s =1 or b. — max { riy(bi —ai-1)}</< min a —a,)} { 0≤iln -1 1—s hsism-1 1 I —s IS16I Remark (some observations on enforceable payoff relations): i) A direct consequence of Proposition 2 is that ZD strategies exist for all social dilemmas and for all weights w with In> 0 (one only needs to set s =1). Moreover, if all weights are positive, i.e., iv, > 0 for all jOi, it also follows that any baseline payoff between be s! <an_ i is enforceable for some s al _i for all j; one only needs to choose an s that is sufficiently close to one). ii) For a given slope and given weights, we also note that if the baseline payoffs It and /2 are enforceable, then so is any baseline payoff between Ii and h. iii) In the special case of equal weights on all coplayers, wi = 1/(n — I) for all j, the sum of the j smallest entries of w is simply given by =j/(n —1). In that case, a payoff relation is enforceable if and only if either s= 1 or max {b - j bi-aj-}<7 < +n -j -1 biti -a)}. (5171 — 01•93-1 ' n -1 1-s twsx.-I n-1 1-s Fig. S2 gives an illustration of the enforceable payoff relations, again for the two examples of a public goods game and a volunteer's dilemma. In particular, in both examples the space of enforceable payoff relations shrinks with group size. This holds more generally: if the payoff advantage of a defector br i —a1 does not increase with group size, then Proposition 2 implies that larger group sizes n make it more difficult to enforce specific payoff relations. 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 HlIbe et el. www.poes.org/cglkontentishort/140788711I 3 of 19 EFTA01199757 373 iv) In the special case of full weight on one coplayer, i.e., we = I for some k 0 i and all other entries of w are zero, the sum of the/ 435 374 smallest entries is either ii7= 0 (if/ <n — I) or go) =I (if i =n —1). Because bm ≥ hi and am ≥ at, a payoff relation is enforceable if 436 375 either s =1 or 437 376 438 377 (5181 439 max {Dn.-2. iril- fbft-1 <!< min{b a°, al }. I —s 1 —s 378 440 379 441 For games in which ba- 2 >al, condition S18 cannot be satisfied. In particular, the condition cannot be satisfied for social dilemmas 380 4t with group size n > 3 (because for such social dilemmas ba_2 >ai follows from S1 and S2). Thus, in large groups, the only feasible 381 ZD strategies are those with s = 1, i.e., the only payoff relationship that player i can enforce between his own payoff and player is 443 382 payoff is the fair payoff relationship, s =j/. 444 383 v) A comparison of Eqs. S17 and SIR thus shows that the set of enforceable payoff relations is larger if player i uses the same weight 445 384,ts for all coplayers. This holds more generally: if a payoff relation (ts.w) is enforceable for some weight vector w, then it is also 446 385 enforceable for the weight vector that puts equal weight on all coplayers (this follows from Proposition 2 because go) becomes 447 386 largest when all coplayers have the same weight). 44S 387 449 388 Nash Equilibria of Repeated Multiplayer Dilemmas. Various Folk theorems have shown how repetition can be used to sustain cooperation in 450 389 social dilemmas (17. IS). To prove these Folk theorems, one typically constructs specific strategies that give rise to given payoff 451 390 combinations (xi x") and shows then that these strategies are indeed an equilibrium of the repeated game. Herein, we ask a somewhat 452 391 different question: within a certain strategy class, what are all Nash equilibria? We respond to this question for two different strategy classes, 453 392 the class of ZD strategies, and the class of pure memory-one strategies. The equilibria among these two strategy classes will prove to be stable 454 393 against any mutant strategy (i.e., we do not need to assume that mutants are restricted to ZD strategies or memoryone strategies). 455 394 To simplify the analysis, we will focus on symmetric memory-one strategies for which the cooperation probability only depends on the 456 395 own move in the previous round and on the number of cooperators in the previous round (but not on who of the coplayers cooperated). 457 396 In that case, each possible outcome a can be identified with a pair (S, j), where S e (C,D} is the player's own move and/ is the number 458 39b. of cooperators among the other n —1 group members. This allows us to represent memory-one strategies as vectors of the form 459 398 p = (ps1). Using an analogous notation as in the previous section, v= (v51) corresponds to the frequency of observing each of the states 460 399 (S,j) over the course of the game, g' = (gi .) corresponds to player rs payoffs, and r = (r) corresponds to the average payoffs of i's .54 s) 461 400 coplayers (using the arithmetic mean r = n—Li_ E rne). Because symmetric memory-one strategies are a subset of all memory-one 462 401 strategies, the previous results (in particular Akin's lemma and the Press and Dyson theorem) naturally carry over. 463 402 Nash equilibria among ZO strategies. Consider a group in which all players apply the ZD strategy p with parameters 1, s, 0, and w) = 1/(n —1) 464 403 for all/ #i. and let us suppose the nth player considers deviating. We will refer to the first n — 1 players as the residents and to the nth 465 4D4 player as the mutant, and we denote a resident's payoff by x and the mutant's payoff by I. Because residents apply a ZD strategy, Eq. S13 466 405 implies that each of them enforces the relationship 467 406 468 n I 407 n —1 —2 n+n l ir=sn+(l—s)!, (519] 469 408 470 409 which can be rewritten as 471 410 472 411 it = Ar+ (I —siz)!, (S20] 473 412 474 413 with 475 414 476 415 ea = s(n — 1) — (n —2). (S21] 477 416 478 417 That is, then —1 residents collectively enforce a linear relationship between their own payoff Jr and the payoff of the mutant ir, with the 479 418 same baseline payoff I and with slope st. By using the same strategy as the residents, the mutant yields the payoff if = fr. Forst< I, Eq. S20 480 then implies that ir = A=l. Forst =s =1, the value of I is a free parameter that does not have an effect on the entries of the ZD strategy; to 419 481 be consistent with the cases < 1, we define! to be the payoff that the strategy yields if applied by all group members (this payoff depends on 420 the strategy's cooperation probability in the initial round). Thus, by using the residents' strategy the mutant yields the payoff !. 482 421 For p to be a Nash equilibrium, a minimum requirement is that the mutant must not have an incentive to switch to a different ZD 483 4// strategy 0, with parameters i and ,--th ci <1. Such a mutant would enforce the payoff relation 484 423 485 424 n = lir+ (1 —1)I. (S22] 486 425 487 426 Solving the two linear equations S20 and S22 for the mutant's payoff yields 488 427 489 428 it _ 1(1 —IR) 14.TR(1-1) (S23] 490 429 1 —isia 491 43 4310 The mutant has no incentive to deviate when 1 492 <1, that is when 493 432 494 433 0 —0:71(1 —1) 495 u I - ` SO. (524] 434 I —31/1 496 lite et el. www.petes.ceseglicOntenthhcet/140703711I 4 of 19 EFTA01199758 497 Because — d 3,—ri ≤s <1 (by Proposition I) and because --4 ct <1 (by assumption), the denominator of Eq. S24 is positive. We can 559 498 therefore iitinguish three cases. 560 499 561 i) sR =0. In that case 1-1=0, and player n cannot improve his payoff by deviating. 500 562 ii) sR > 0. In that case 1 —/ ≤0 if and only if I <1. To prevent the mutant from deviating, the residents thus need to apply a strategy Jul 563 with maximum possible baseline payoff, 1=a„_i. 502 iii) sR <O. Then 1-1 <0 if and only if ? — / > 0. To prevent the mutant from deviating, the residents' ZD strategy needs to set 1 to the 564 503 minimum value 1=bo. 565 504 566 505 This result also holds if mutants are not restricted to ZD strategies. 567 506 568 Proposition 3 (Nash Equilibria Among ZD Strategies). Considers social dilemma in which mutual cooperation is the best outcome for the group 507 569 whereas mutual defection is the burst possible outcome 508 be < onisit < . jak.. I + (n —Dbr 570 509 571 510 MU MA + l(11—Db) <6,„_1. IS25) 572 511 n osisn n 573 512 574 Let p be a ZD strategy with parameters 4 s, and # and let I R = (n - 1)5—(n —2). Then p is a Nash equilibrium if and only if one of the 513 575 following three cases holds: 514 576 515 577 516 5R =0 and bo 0 and 1=a„..t; and .sR <0 and / = bo. 578 517 579 518 Remark (some observations for stable z0 strategies): 580 519 i) The three conditions in Proposition 3 do not depend on 0. Whether a ZD strategy is stable only depends on the payoff relation that 581 520 it enforces, but not on the exact strategy that gives rise to this payoff relation. 582 521 ii) The three conditions do not directly depend on the sign of s, but on the sign of .5R = (n —1)5 — (st — 2). Whether a given ZD strategy 583 522 is stable thus depends on the group size. 584 523 iii) The second condition is of particular interest, because it states that stable mutual cooperation can be achieved by ZD strategies 585 524 with 1=an_ t and s> (n — 2/n —1). For pairwise games these are exactly the generous ZD strategies with I =4,,_i and s> 0 (2, 10). 586 519 7 As the group size increases, generous strategies need to approach the fair strategies (with s= I) to allow for stable cooperation. 587 526 588 Corollary 1 (Convexity of the Set of Nash Equilibria). Consider a social dilemma in which mutual cooperation is the best outcome for the group. 527 Suppose pi and p* are tno ZD smmgies that both give rise to stable cooperation (i.e., l' =I" = tr„_i and ?> (n —21n —1), se> (n —21n — 1)). 589 528 Then any linear combination p =.1p' + (I —A)p° with 0 < A <1 is afro a ZD strategy that gMtc rise to stable cooperation. 590 529 Proof A direct computation shows that p can be written as a ZD strategy with parameters / =an_i, 46 =20' + (1 —2)0' >0, and 591 530 s= (IA24' +(I— A)erfill[Ab' + (1 —A)01}≥ (ti — 2/n —1). 592 531 A similar result can also be shown for ZD strategies that lead to mutual defection. 593 532 Nash equilibria among pure memory-one strategies. As another application of Akin's lemma, we will show in the following which pure memory- 594 533 one strategies allow for stable cooperation in multiplayer social dilemmas. To this end, let us again consider a group in which the first n — 1 595 534 players apply some pure memory-one strategy p= (psi ) and in which the nth player considers deviating. Let v denote a limit distribution 596 535 from the perspective of the n —1 resident players and let t be the corresponding limit distribution from the perspective of the mutant 597 536 player. The following observation will be useful: 598 537 Lemma 1 (Relationship Between Limit Distributions). If the residents apply a pure memory-one strategy, psie (0, 1) for all (S,j), the entries of 599 538 v swish, 600 539 601 540 602 541 ye.; = 0 for all j<n —2 (S261 603 542 varpi= 0 for all j> 1. 604 543 60 Moreover, the limit distributions v and i are related by 544 6056 545 607 546 ka-i =vca-t, i)c.o =YD.& (S27] 608 547 lion-I=VCA-2, i)D.0 = VD.0, 609 548 610 549 and 1,si= 0 for all other states (54). 611 550 Proof As all residents use the same pure strategy, they play the same action in any given round. Thus, if one of the residents cooperates, 612 then there are at least n —2 other cooperators, and therefore the probability to end up in a state with less than n-2 other cooperators 551 613 is zero, vcd =0 for j <n —2. The same argument shows that vD0 =0 forj> 1. Finally, for Eq. S27, we note that the mutant is in the state 552 614 (Cis — 1) if and only if each of the residents is in the state (Can —1). Similarly, the mutant is in the states (C. 0), (An -1), and (D.0) if 553 615 and only if the residents are in the state (D, 1), (Cat — 2), and (D.0), respectively. 554 616 555 Proposition 4 (Pure Memory-One Strategies That Give Rise to Stable Cooperation). Consider a social dilemma in which there is an incentive to 617 556 deviate from mutual cooperation, 1),,,_1>a„_1. Let p be a pure memory-one strategy that cooperates in the first round and that sticks to 618 557 cooperation as long as all other playas do so (i.e., pc.„-I = 1). Then p is a Nash equilibrium if and only, if the entries of p satisfy the three 619 558 conditions 620 HIt . et el. www.pnes.orgicglkontentishort/140783711 I 5 of 19 EFTA01199759 621 683 622 N A-2 =0 IS28a) 684 623 685 624 [bbl 686 bn-i—a,,-t 687 626 PD.I S an-1 — a0 625 an —be 688 627 9 628 Poo S bff_i -1 —an_i' IS28O 68 690 629 691 630 Remark (some remarks on Proposition 4): 692 631 i) The full proof of Proposition 4 is given in the appendix; the step (*) follows from a straightforward computation of payoffs for two 693 632 possible mutant strategies. The step (a) requires a more sophisticated argument; it is exactly in this second step where Akin's 694 633 lemma comes into play. 695 634 ii) According to Proposition 4, the stability of a cooperative and pure memory-one strategy p is solely determined by the four entries 696 635 pc.„_] =1,pc,,_2= 0, pDA, and pp°. This observation is a consequence of Lemma 1, which allowed us to neglect all other entries of 697 636 p. For pairwise games, Lemma 1 is not required, and thus pairwise games allow more general versions of Proposition 4, which are 698 63f 4 then also valid for mixed memory-one strategies (2, 6). 699 638 Examples (for stable memory-one strategies): 700 639 701 i) The proofs of the various Folk theorems are often based on trigger strategies, which relentlessly play D after any deviation from 640 702 the equilibrium path (17). An example of such a strategy is Grim, for which pc,' =I andps,, =0 for all other states (S, j). Because 641 Grim satisfies the three conditions in Eq. S28, Grain is indeed a Nash equilibrium for all multiplayer dilemmas. 703 642 ii) In their analysis of multiplayer dilemmas, refs. 19 and 20 consider the performance of generalized versions of Tit-for-Tat. These 704 643 TPTk strategies cooperate if at least k other group members have cooperated in the previous round, i.e.,psi = 1 if and only if j≥ k. 705 644 As the conditions 528 are only satisfied for k =n —1, it follows that TPTa-i is the only Nash equilibrium among these generalized 706 645 versions of Tit-for-Tat. 707 646 iii) Unfortunately, neither Grim nor TF7i_i is robust under the realistic assumption that players sometimes commit errors (16). For 708 647 stochastic simulations of variants of the public goods game, ref. 15 found that evolution may promote strategies that only 7® 648 cooperate after mutual cooperation or after mutual defection, i.e., Pea-1 =Pao = I. and psi =0 for all other states (S.j). We refer 710 to such strategies asWSLI For the prisoner's dilemma, this strategy corresponds to the Win-Stay Lose-Shift behavior described by 649 711 ref. 14. According to Eq. S28, WSLS is a Nash equilibrium if and only if the social dilemma satisfies (b„_, +b0)12 ≤an_1. In Fig. S3, 650 we illustrate the stability of these strategies when the social dilemma is a public goods game. 712 651 713 652 With a similar approach, we can also characterize the pure memory-one strategies that result in defection. 714 653 715 654 Proposition S (Pure Memory-One Strategies That Give Rise to Stable Defection). Consider a social dilemma with bo> ao and bn-t ern...t. Let p 716 be a pure memory-one strany that defects in the first round and that sticks to defection as long as all other players do so (Le., PD.o =0). Then 655 717 p is a Nash equilibrium if and only if at least one of the following nvo conditions is satisfied 656 718 657 719 b„_1 +ao 658 pal =0; and n =pca-2=0 and <be. 2 659 ren-I (529] 720 721 660 722 Remark (some remarks on Proposition 5): 661 723 662 i) As a special case of the above proposition, we conclude that AllD is an equilibrium for any social dilemma with ba_i ≥an_1 and 724 663 Do a ao. 725 ii) Conversely, mutual defection is never stable in social dilemmas with b0 <ao (such as the volunteers dilemma). If bo <ao, it follows 664 726 from condition Si that be <ai for all/. As a consequence, an equilibrium in which everyone defects can always be invaded by 66 6 728 5 727 a player who switches to Al1C. 66 667 729 668 Coordinated Play and Zero-Determinant Alliances. In the previous sections, we analyzed the amount of control that single individuals can 730 669 exert over their coplayers in repeated multiplayer dilemmas. Now we are interested in the question whether individuals can gain a higher 731 amount of control by coordinating their play with other group members. To this end, let us consider a set of n-4 < n players, who agree on 670 732 671 a joint strategy. We will refer to these players as the allies and to the remaining group members as the outsiders (as depicted in Fig. 1). 733 672 Depending on the degree of coordination, one can think of various forms of coordinated play. In the following we are going to 734 673 explore two modes of coordination: 735 674 i) Strategy alliances. Here, players only agree on the set of alliance members and on a common memory-one strategy that each ally 736 675 applies. During the actual game, there is no further communication taking place. Strategy alliances thus require a minimum amount 737 676 of coordination. This alliance type seems particularly relevant when allies do not have the possibility to communicate during the 738 677 game or when it is too costly to coordinate the allies actions in each round. 739 678 ii) Synchronized alliances. Alternatively, players may synchronize their decisions in each round such that all allies play the same action. In 740 679 effect, a synchronized alliance thus behaves like a new entity that has a higher leverage than each single player. 741 680 We will use the symbols ri-4 to refer to the payoff of an ally and C A to refer to the average payoff of the outsiders. In the following, 742 681 we investigate which linear payoff relationships ZD alliances can enforce and how their strategic strength depends on the mode of 743 682 coordination. 744 Hite et el. www.poes.oroRglkontentishoit./1407837111 6 of 19 EFTA01199760 745 Strategy alliances. To model strategy alliances, let us consider a group of nA allies, with 1 ≤nit <n. We assume that the allies agree on 807 746 using the same ZD strategy p during the game. The parameters of this ZD strategy are given by /, s, and 0> O. To allow allies to 808 747 differentiate between the actions of other allies and outsiders, the weights w= (wi) of the ZD strategy may depend on a player's 809 748 membership in the alliance. That is, if player i is a member of the alliance, then 810 749 811 750 wA if j is a member of the alliance 812 751 wi'4 = { w"v1 if j is an outsider. (530] 813 752 814 753 with w4 ≥ 0 and nr4 ≥ 0 such that the weights sum up to one, (nA —1)w4 + (n —ItA)nrA = 1. Because all allies use the same strategy, it 815 754 follows that they all get the same payoff. Moreover, by Eq. 513, each of the allies enforces the payoff relationship 816 755 817 756 (x4 — 1)wAr? + (n —nA))v-All-A = srA + (I —sy. (S31] 818 757 819 758 This payoff relationship can be rewritten as 820 759 821 760 fir4 = sANA + (1 —s4)1. (S32] 822 761 823 762 such that 824 763 825 764 826 ? — s— (nA 1 — OA — 1) —1)"'4 765 (533] 827 wA' 766 828 767 829 is the effective slope of the strategy alliance. For HA = 1 /(n — 1), we recover the case of equal weights on all group members. For general 768 830 weights; A, we say that the payoff relationship S32 with parameters (/,s4) is enforceable if we can find 0> 0 and 0 < wA < 1/(nA —1) 769 such that all entries of the resulting ZD strategy according to Eq. S12 are in the unit interval. The following gives the corresponding 831 770 characterization. 832 771 833 772 Proposition 6 (Enforceable Payoff Relations for Strategy Alliances).A strategy alliance can enforce the payoff relation (1.?) if and only if either 834 773 ? =I or 835 774 836 j bi — ar , 1 n — f — I bio —a/ 775 sA <1 and sr:7,0 {b, </ < min •to +— ISM] 837 0 776 n—nA I —et nA-isisn-i 1 n — nA I —sA }. 838 777 839 Moreover, if nA <n12, then —1 <— nA fin —nA)<sA < I. 778 Remark (on strategy alliances): 840 779 841 780 i) Earlier we saw that individuals typically lose their strategic power when groups become large. The above proposition shows 842 781 that players can regain control by forming alliances. In particular, the space of enforceable payoff relations (i,sA) increases 843 with the size of the alliance n4. Larger alliances can therefore enforce more extreme payoff relationships, as illustrated in 782 844 Fig. S4. 783 845 ii) Somewhat surprisingly, it follows from the proof of Proposition 6 that the set of enforceable payoff relationships becomes maximal 784 when 1A approaches II (nA -1) (and therefore w-A -. 0). The most powerful alliances are those in which the outsiders' actions 846 785 only have an infinitesimal influence. 847 786 iii) In contrast to the case of ZD strategies for individual players, we note that the effective slope ? for alliances does not need to be 848 787 bounded from below. As an example, let us assume the alliance has reached a size n4 such that ba_„.. ≤a„A_I (which can only 849 788 happen when nA > n/2). Because b„,A is an upper bound for the left side of Eq. S34 and a...I_ 1 is a lower bound for the right side 850 789 of Eq. S34, it follows that any I with b„,,A 515a,,.4_1 can be enforced, irrespective of the value of sA <1. 851 790 iv) However, for alliances that have reached a size nA such that b„_„A <a„A _i, the theory of ZD strategies becomes somewhat less 852 791 relevant: such alliances are better off by cooperating in each round (if all allies cooperate, their payoff is at least anA _I, whereas if 853 they all defect, their payoff is at most b„_„.4). In other words, if n4 individuals are able to make a binding agreement that they will 792 854 all play the same strategy, in a social dilemma with b„_„A <a„A_I, then unconditional cooperation is a dominant strategy for the 793 855 alliance. 794 856 795 Synchronized afflattes. In the previous scenario, we assumed that each of the allies decides independently whether to cooperate in a given 857 796 round. Let us now turn to a scenario in which allies meet after each round to decide which action they collectively play in the next 858 797 round. As a result, the alliance members act as a single entity, in a game with n —nA coplayers. To investigate such a scenario, let us 859 798 first adapt our notation correspondingly. For a given set of allies, let (S,j) refer to the outcome in which all allies choose SE {C,D}, 860 799 and in which j of the outsiders cooperate. As in previous sections, the limit distribution v = (vs.') corresponds to the fraction of 861 800 rounds the alliance finds herself in state (S.,i) over the course of the game. A memory-one strategy for a synchronized alliance is 862 801 a vector p= (ps.1)—given the outcome (S,j) of the previous round, the cooperation probabilitypsi is used to determine whether all 863 802 allies cooperate or all allies defect in the next round. The synchronized alliance uses the strategy Repeat if cooperation probabilities 864 803 are given by 865 804 866 805 Rep... f 1 if S=C [535] 867 806 Ps.) — 1 0 if S=O. 868 lite et el. www.roes.orgleglkontenthhott/140788711I 7 of 19 EFTA01199761 869 870 871 872 873 874 875 With literally the same proof as in the main text, one can verify Akin's lemma for synchronized alliances: if the alliance applies a memory-one strategy p then any corresponding limit distribution v satisfies (p — pReP) • v = 0. Let us next write down the possible payoffs in a given round. The payoff vector gA for the synchronized alliance has the entries ...A 1 a„or i if S=C (S36 ] 55I = bi if S=D, 931 932 933 934 935 936 937 876 and the corresponding vector g-A that contains the average payoffs of the outsiders (using the arithmetic mean) takes the form 938 877 939 878 ja„A,,j_i +(n — nA —))b,,At, if S=C 940 n —nA IS37] fait+(n —nA —.Obi if S=D. 879 880 Ss/ — 941 942 881 943 n —nA 882 944 883 Using these payoff vectors, the payoff of each ally is given by x-4 = gA • v, and the mean payoff of the outsiders is sr-4 =r 4 • v. 945 884 Analogously to the case of individual players, we can define ZD strategies for synchronized alliances as strategies of the form 946 885 947 886 P=PR`P+cre+fig-A +71. IS38] 94 887 949 888 with I being the memory-one strategy with all entries being one and with a, II, and y being parameters of the ZD strategy (with fi 0 0). 950 889 Akin's lemma then implies that such alliances enforce the relationship 951 890 891 ourA + fir -A + y =O. 1S39] 952 953 892 Again, we stress the fact that this relationship holds irrespective of the strategies of the outsiders: even if outsiders notice that they are 954 893 facing a synchronized alliance, there is nothing they can do to the above relationship. As above, we use the 955 894 prevent payoff parameter transformation 1= —y/(a +11), sA = --alp, and # = —it to write ZD strategies as follows: 956 895 957 896 p =pRy+ ek[sAg4 —g- A + (1 —sA)11]. IS40] 958 897 959 898 With these new parameters, the enforced payoff relationship according to Eq. S39 takes the usual form 960 899 961 900 yr- A = sAr4 + (1 —sA)l. IS411 962 901 963 902 Proposition 7 (Enforceable Payoff Relations for Synchronized Alliances). A synchronized alliance can enforce the payoff relation (1,sA) if and 964 903 only if either sA = I or 965 904 966 905 sA #1 and j bi-aj-i} , n -j- I biti -ai < 967 si min {a + — -- IS42] n -$1-4 1 —sA — w4-isisx-i 1 n—nA 1 —s•A }" 906 osmisnA{b I 968 907 969 908 Moreover, if nA <n12, then —1 < —nA 1(n — nA) < et < I. 970 909 Remark (on synchronized alliances): 971 910 i) For not too large alliances with nA <n12, Propositions 6 and 7 imply exactly the same conditions on enforceable payoff relation- 972 911 ships. Thus, although strategy alliances require considerably less coordination between the allies, they have the same strategic 973 912 power as synchronized alliances. 974 913 ii) When nA > 42, synchronized alliances may be able to enforce a strictly larger set of payoff relationships, because they are not 975 914 restricted to payoff relationships with sA s 1. Whether relationships with s-4 > I are enforceable depends on the social dilemma. 976 915 When the social dilemma satisfies b„_,,A <a„A_I then condition S42 can be satisfied for any b„_„A 1 sufficiently large. Conversely, when b„_„4 >a„.4_1 then only slopes s-4 1 the left side of Eq. S42 is 977 916 917 strictly larger than h„-„A, whereas the right side is strictly lower than a„.4_1). iii) Overall, we conclude that synchronized alliances are more powerful than strategy alliances if and only if the alliance has reached 978 979 918 a size nA such that ba„A <a„A_I. However, as noted for strategy alliances, the condition b„,A <a,,A_, transforms the social 980 919 dilemma into a game in which mutual cooperation is the best strategy for the alliance, such that the notion of ZD alliances 981 920 becomes less important. 982 9 983 911 21 923 Here we explored the strategic power of alliances, assuming that the allies agree on a joint ZD strategy. Given these results, one may ask which ZD strategy the allies should agree on and which combinations of alliance strategies and outsider strategies form an 984 985 9;4 925 equilibrium of the game between allies and outsiders. This question is different from the questions explored above; there we considered a homogeneous group of players, and we explored which strategies are stable if applied by all group members. To explore equilibria for games with ZD alliances, one needs to distinguish between the strategies of the allies and the strategies of the outsiders. This inherent 986 987 917 asymmetry makes the equilibrium analysis more intricate, and we thus leave this question for future work 988 989 928 Applications. In this section, we apply our theory to two particular examples of multiplayer social dilemmas: the public goods game and the 990 929 volunteer's dilemma. For simplicity, we will focus here on symmetric strategies that only depend on the number of cooperators but not 991 930 on the cooperators' identities (for ZD strategies this implies that we consider the case of equal weights on all coplayers). 992 Hite et el. www.pnes.oraftglkontenthbort/140788711 I 8 of 19 EFTA01199762 993 Public goods games. In a public goods game, each player of a group can cooperate by contributing an amount c > 0 to a public pool. Total 1055 994 contributions are multiplied by a factor r with 1 <r <n and evenly shared among all group members. Thus, payoffs are given by 1056 995 1057 996 is) = (j+1)m ,. fir 1058 — c. and ui —. (8431 997 n = n 1059 998 1060 999 Some of the properties of ZD strategies for public goods games have been recently described by (13), using an independent approach. 1061 1000 Here we complement and extend these results. 1062 ID strategies for public goods games. Plugging the payoff values 843 into representation S14 shows that ZD strategies have the form 1001 1063 1002 1003 1004 1005 1006 1007 I +0[(1 —s)(/— g a r +c) ci n -nn, I if Si =C Psi = # [(I —s)(/-1— re) + ' ci n n 1 if S, =D. 1S44] 1064 1065 1066 1067 1068 1069 1008 To explore which payoff relationships (1,$) a single player can enforce, we use the characterization given in Eq. 817. Because the 1070 1009 payoffs of the public goods game are linear in the number of coplayersj, the corresponding conditions become particularly simple (as 1071 1010 only the boundary cases j= 0 and j =n — 1 need to be eobsinereodi:ire.:licoenclude+tIliatsa single player can enforce a linear payoff relation 1072 1011 with parameters O) if either s= 1 or 1073 1012 1074 1013 1075 1014 (n —1)tr c (r n)c c 18481 1076 1015 n 1-s- n 1077 1016 1078 1017 Fig. S2 shows the set of all pairs (1.$) that satisfy the above constraints for various group sizes n. We get the following conclusions for the 1079 1018 existence of extortionate strategies, generous strategies, and equalizers, depending on the size n of the group: 1080 1019 i) Extortionate strategies (7= bo = 0). Let us ask which slopes s an extortionate player can enforce. The inequalities 845 then imply that 1081 1020 slopes s> (r— 1)/r can always be enforced, irrespective of the group size /1. However, slopes s < (r —1)/r are only enforceable if the 1082 1021 group size is sufficiently small - 1083 or lee more 1022 1084 1023 r(I -s) 1085 1024 " S (S46] r(1 -s)- I oe 1086 1025 1087 1026 We conclude that in large groups, n -• co, only extortionate strategies with s < (r - 1)/r are feasible. 1088 1027 i) Generous strategies (/ = a„..i = is —c). Using the inequalities 545, one can show that generous players can enforce exactly the same 1089 1028 slopes as extortionate players. 1090 1029 ii) Equalizers (s = 0). For equalizers, the inequalities S4S imply there are three regimes: (a) if n Sri(' —1), all baseline payoffs 1091 1030 0 2r/(r — I), there are no equalizers. 1093 1032 1094 In particular, we conclude that for a given multiplication factor r > 1 the set of equalizer strategies disappears as groups become large. 1033 1095 Strategy alliances in the public goods game. By Proposition 6, strategy alliances with nA members can enforce a linear relation with 1034 parameters (l. sA) if and only if either sA =I or if the two following inequalities hold: 1096 1035 1097 1036 0<1<rc—c 1098 1037 (n—n 1038 )rc c (nsA — n)c c (S47] 1099 1100 1039 n 1 —sA — / — n + 1 —sx 1101 1040 1102 For the special cases of extortionate strategies, generous strategies, and equalizers, these inequalities allow us to derive the following 1041 1103 conclusions: 1042 1104 1043 i) Extortionate strategies (1 =b0 =0). We can rewrite the inequalities S47 to obtain a critical threshold on the fraction of alliance 1105 members that is needed to enforce a certain slope sA 1044 1106 1045 1107 1046 nA r(I—sA)— 1 1108 > 18481 1047 n — r(I-4) ' 1109 1048 1110 1050 becomes particular, if an alliance wants to enforce arbitrarily high extortion factors x—• co, then sA = 1/x —.0 and the critical threshold becomes nA in > (r— 1)/r. This condition is always satisfied if nA =n -1, implying that an alliance with n — I members can always Ill' be arbitrarily extortionate toward the remaining group member. 1051 1113 ii) Generous strategies (l=cen-i = rc —c). The inequalities S47 lead to the same threshold for nA pi as in the case of extortionate 1052 strategies, as given in Eq. S48. 1114 1053 iii) Equalizers (s = 0). For equalizers, the inequalities S47 lead to two critical thresholds; to be able to set the payoffs of the outsiders 1115 1054 to any value between 0 < / < or —c, the fraction of the allies needs to satisfy 1116 HIE. et al. wwwones.orolcolkontenthbort/1407837111 9 of 19 EFTA01199763 1117 1179 1118 nA r —1 > 1S49] 1180 1119 II r 1181 1120 182 However, to be able to set the payoffs of the outsiders to some value between 0 <I < rc —c, the number of allies only needs to exceed 183 1122 1183 1122 1184 1123 nA (n -2)(r - 1) 1185 1124 ≥ IS50] 1186 n n + (n -2)r ' 1125 1187 1126 1188 1127 1189 1128 Nash equilibria for the public goods game. In the following, let us describe a few strategies that allow for stable cooperation in the public 1190 1129 goods game. According to Proposition 3, this can be achieved by using a ZD strategy with parameters 1=4,,-i, 4>O, and 1191 (n -2/n -1) <s < 1. When we choose the boundary case s=1 and efr= 1/c (which is the maximum value of 0, given the constraint 1130 1192 0 <psi S 1), the resulting ZD strategy according to Eq. S44 is proportional Tit-for-Tat with entries 1131 1193 1132 1194 1133 pTFTsi = i l (S51] 1195 n 1134 1196 1135 1197 which is independent of the player's own move. This rule says that the player's cooperation probability is given by the fraction of 1136 cooperators among the coplayers in the previous round (additionally, we need to specify the cooperation probability for the first round, 1198 1137 which needs to be set to one). 1199 1138 Another boundary case is given by the choice s = (rt - 2/n — 1) and 0= [n(n -1)]/(c[n(n - 2) +r]) (which is again the maximum value 1200 1139 of 0). We refer to the resulting ZD strategy as generous Tit-for-Tat, which has entries 1201 1140 1202 1141 gTFTsi j n -j -1 n(r - 1) 1552] 1203 1142 n - 1+ n - I (n -2)n +I 1204 1143 1205 1144 Also gTFT is independent of the player's own move, and it is generally more cooperative than prit because gTFTsi > pTFTsi for all 1206 1145 j <n - 1. 1207 1146 A last boundary case is given by 0-.0, in which case the resulting ZD strategy approaches the strategy Repeat, independent of the 1208 1147 choice of (n — 2/n — 1) <s <I. Due to Corollary 1, we can conclude that any linear combination of these three strategies of the form 1209 1148 1210 1149 p = Ai • p7FT + ArgTFT +A3 . Repeat, (S53] 1211 P 150 / lift ' with 0 <Ak <1, A3 < 1, and Ai +.22 +A3 =I is also a stable ZD strategy. 1213 Among the pure memory-one strategies, Proposition 4 allows us to conclude that Grim and TFT„..i are always Nash equilibria. 1152 1214 Moreover. the strategy WSLS is a Nash equilibrium if r≥ 2n /(n + 1), as illustrated in Fig. S3. 1153 1215 Volunteer's dilemma. In the volunteer's dilemma, at least one of the players needs to cooperate and pay a cost c > 0 in order for all group 1154 members to derive a benefit b>c> O. Thus, the payoffs are given by 1216 1155 1217 1156 ai = b —c for all b and b, =b if j> 1 and bo = 0. (S54] 1218 1157 1219 1158 ID strategies for the volunteer's dilemma. According to Eq. S14, the ZD strategies with equal weight on all coplayers have the form 1220 1159 1221 1160 0/2 1 + 41( 1 —s)(1—b +c)—n; j : I c] if S, =C 1161 /3 1162 1163 Psi = 0[(1 — s ) (1 — b) + 1+2 ci if S, =Dfl 1S551 0/4 l ms 1164 1226 1165 96 (I —4 1 if S, =D.j =O. 1227 1166 1228 1167 By condition S17, exactly those parameters 1 ands can be enforced for which either s=1 or 1229 1168 1230 1169 1 c s max{ 0,6 -- —} /sb -c. 15561 1231 1170 n -1 1 —s 1232 1171 1233 1172 This set of enforceable payoff relations is illustrated in Fig, S28. In the special case of extortionate strategies (1=0), condition S56 1234 1173 implies that a given slopes can only be enforced for sufficiently small groups 1235 1174 n <I + c 1236 1175 b(1-s)' 15571 1237 1176 1238 1177 For equalizers (s = 0), the inequalities in Eq. S56 imply that a player can only determine the average payoff of the coplayers if n=2, 1239 1178 and then the only enforceable baseline-payoff is I= b -c. 1240 lite et el. www.imes.org/eglkontenthhort/140788711t 10 of 19 EFTA01199764 1241 Strategy alliances in the volunteer's dilemma. For strategy alliances, the enforceable payoff relationships according to Eq. S34 become 1303 1242 1304 1243 max I 0. b — 1 c —} <1Sb — c. IS58] 1305 1244 n — n4 I — sA — 1306 1245 1307 1246 In particular, alliances that aim to enforce an extortionate relationship (with 1=0 and some ? >0) need to have a critical size 1308 1247 1309 list c 1310 1248 > 1 (SS9] 1249 n — nb(I —sA). 1311 1250 1312 1251 It follows that alliances cannot be arbitrarily extortionate (setting ? =0 on the right side implies that such alliances would need to 1313 1151 satisfy n't > n - 1). Instead, even a large alliance of size n4 =n —1 can only enforce slopes with s4> (b —c)/ b. 1314 The performance of ZD alliances against adapting outsiders. In addition to the static properties of ZD strategies, Press and Dyson (1) also 1253 1315 highlighted a remarkable dynamic property of ZD strategies: when a player with a fixed extortionate ZD strategy is matched with an 1254 1316 adapting opponent who is free to change his strategy over time, then the adapting opponent will move toward more cooperation. In the 1255 1317 following we present simulations suggesting that ZD alliances can have a similar effect. 1-x' 1318 To explore the performance of ZD alliances against adapting outsiders, we consider a group of n subjects. Let us assume that the 1257 1319 players (1. n4} form a synchronized alliance with ti4 <n and that the allies commit themselves to act according to a ZD strategy 1258 1320 with parameters 1, r 1 , and # (similar results could also be obtained under the assumption that the allies form a strategy alliance instead 1259 1321 of a synchronized alliance). Moreover, let us assume that each of the outsiders applies some arbitrary memory-one strategy. To p- 1260 1322 rametrize memory-one strategies, we note that the possible outcomes of a single round of the game can be written as 1261 1323 0= (S-4.S„.40 S„), where SA E (CD) is the joint action of the allies and where Si e {C. D} is the action of each of the outsiders. As 1/6) 1324 a result, there are 211-11A+1 possible outcomes a. The memory-one strategies for the outsiders are thus modeled as vectors p = (p,) with 1263 1325 2"-" A+I entriespa G [0,1]. 12.64 1326 We assume that the ZD alliance and the outsiders interact in a series of repeated games. Although the strategy of the ZD alliance is 1265 assumed to be fixed, outsiders are allowed to adapt their strategy from one repeated game to the next. Specifically, we assume that in each 1327 1266 1328 time step, the group interacts in a repeated public goods game, resulting in the payoff el for each of the allies and the payoffs x) for each 1267 outsider/. Because all players use memory-one strategies, these payoffs can be calculated using a Markov-chain approach (15). In the 1329 1268 1330 next time step, one of the outsiders is randomly picked to change his strategy from p to p'. The entries of the new strategy pi are 1269 1331 independently drawn from a normal distribution around the old strategy (using an SD of 0.01). If the outsider's payoff using the new 1270 strategy is xi, then we assume that the outsider keeps the new strategy with probability 1332 1271 1333 1272 1 1334 1273 P — I 4. exp (—racy' — sin (S60] 1335 1274 1336 1275 Otherwise, the outsider rejects the new strategy and continues to use the old strategy. The parameter fa> 0 corresponds to the strength 1337 1276 of selection. In the limit of weak selection (a—. 0, this yields p= 1/2, such that the choice between the new and the old strategy is fully 1338 1277 random. For the simulations, we consider the case of strong selection (we used m=100); in that case, the new strategy is likely to be 1339 1278 adopted if x' > x, and it is likely to be rejected when le <x. This elementary process, in which outsiders are allowed to experiment with 1340 1279 new strategies, is repeated for r time steps. For the initial population of outsiders, we assume that all outsiders start with unconditional 1341 1280 defection. 1342 1281 Fig. S5 reports the outcome of this evolutionary scenario for three different ZD strategies: a fair strategy, an extortionate strategy, and 1343 1282 a generous strategy. In all three scenarios, the outsiders become more cooperative over time; as a consequence, also the allies cooperate 1344 1283 more often, because all three ZD strategies are conditionally cooperative (Fig. S5, Upper). However, there are clear differences in the 1345 1284 final cooperation rates between the three scenarios. Extortionate alliances seem to be least successful to incentivize cooperation, 1346 whereas fair alliances tend to achieve full cooperation in the long run. This success of fair strategies can be attributed to their higher 1 85 1347 1286 slope values: because fair strategies uses =1, they perfectly recoup outsiders for increasing their cooperation rates. Both other strategy 1348 1287 classes use slopes with s < 1, which makes it less attractive for the outsiders to become more cooperative. As depicted in the lower 1349 panels of Fig. 85, fair ZD alliances therefore also yield the highest payoffs by the end of the simulation. 1288 1350 1289 Of course, the numerical simulations presented here only provide a snapshot of the full dynamical properties of ZD strategies (which 1351 deserve a careful analysis on their own). The simulations serve as a proof of principle: as previously shown for the iterated prisoner's 190 1352 dilemma (1, 21), ZD strategies can be used to generate a positive group dynamics in multiplayer dilemmas. 1291 1353 1292 Appendix Proofs. 1354 1293 Proof of Proposition 1: By the definition of ZD strategies. the cooperation probabilities after mutual cooperation and mutual defection are 1355 1294 given by 1356 1295 1357 1296 row n= 1 +0(1 - s)(1 —a„_t) 1358 1297 P61] 1359 po D) = 0(1 —s)ff — bo). 1298 1360 1299 As these two entries need to be in the unit interval, it follows that 1361 1300 1362 1301 0( 1 —s)(1—a„_i ) 5 0 1363 (S621 1302 0 S 0(1 —*/ — AO- 1364 HIlbe et el. www.pnes.orsecglkontenthhoit./140783711I II et 19 EFTA01199765 1365 Adding up these two inequalities implies 0(1 —s)(bo —a„..1)≤0, and because of Eq. S3 1427 1366 1428 1367 0(1 -s)a. O. 1368 (8631 1429 1430 1369 Analogously. let us consider outcomes a in which all players but one cooperate (i.e., a is a permutation of (C,...,C, D), in which case 1431 1370 1432 1371 Pe . { 1+ O[sa„_2 — (I — wl )a„_2 — wibn-i + (I —s)/) if the defector is a coplayerf 0 i (364] 1433 1372 O6.-1 -an-2 4' (1 -s)/I if the defector is player i 1434 1373 1374 Because all these entries pe need to be in the unit interval 1435 1375 O[sa,,-2— (1— wl )an -2 — Wibe-i + (I —s)/}≤ 0 143743386 1376 1S65] 1 0 ≤ egsba-t — an-2 + (1 —s)1. 4 1377 1439 1378 Adding up these inequalities yields #(s+ wi)(b, _1 — a„_2)≥ 0 for all jOi, and because of Eq. S2 1440 1379 1441 1380 0(s + tv.,) a 0 for all j*i. 1381 1566] 1442 1443 1382 Combining the inequalities 563 and 566 then yields 1444 1383 1445 1384 0(1 +ii/j) ≥ 0 for all /0/. (567] 1446 1385 1447 1386 and because at least one of the iv' is larger than zero (because all wi sum up to one), it follows that #≥ 0. The restriction efr$ 0 then 1448 1387 implies 417 O. Due to the inequalities S63 and 866, we may also conclude that —minfowi ss ≤ I. Because minf,,,w, ≤ 1/(n — 1) (again 1449 1388 because the wi sum up to one), it follows that —1/(n — 1)≤s ≤ 1. 1450 For s# 1, the inequalities S62 and S63 imply bo ≤/≤a„_,. 1389 1451 Proof of Proposition 2: For a given a= (Si S',,), the entries p„ of a ZD strategy according to Eq. 512 can be written as 1390 1452 1391 1453 pn = lir + 46[(1 —s)(1-10,) + E wi(gf, —gni. 1568] 1392 1454 1393 114 1455 1394 with Al, given by Eq. SS and with t ie and g, given by Eq. S7. Let e denote the set of i's coplayers who cooperate in state a, nd let up 1395 145 14567 denote the corresponding set of coplayers who defect. Using this notation, the entry pe is given by 1396 1458 1397 1459 1398 I +# [(I -s)(/ -aki..i) - E w, 0,,,, -al,-0I if S, = C 1460 1399 AAP Pe = 1569] 1461 1400 1462 1401 #1 1 -s)(/-4,1)+ E w,(blei -ale !) if S, =D. 1463 1402 icor 1464 1403 1465 1404 Because et, > 0 can be chosen arbitrarily small, the conditionpeep, I] is thus satisfied for all a if and only if the following inequalities hold 1466 1405 1467 (1 —s)(/—akfri) — Eivi(bm — apt ') SO for all a with S, =C 1406 1468 Se 1407 (I / -s)( -4 4) + E iv,o,,,, -nicht) ≥ 0 for all a with SD , = . 18701 1469 1408 1470 1409 160r 1471 1410 1472 Ifs =1, these inequalities are independent of the parameter I. and they are satisfied for any social dilemma (because wi≥ 0 for all/ by 1411 assumption and because bk,i >tilt , by Eq. S2). Fors < 1, we may divide the above inequalities by (I —s)> 0, implying that Eq. S70 is 1473 1412 equivalent to 1474 1413 1475 jadow)(biel —°101-1) 1414 E 1476 atel-I + I —s ≥l for all a with Si =C 1415 (S71] 1477 1416 Enew)(biel - °101-1) 1478 1417 voi — ≤/ for all a with Si =D. 1479 1—s 1418 1480 1419 The inequalities S71 in turn are satisfied if and only if 1481 1420 1482 1421 if ,150161 —alel-I)) EGew)(biel—alel-0) 1483 max thcei ≤/≤ min atei.., + (S72] 1422 I —s elvic I —s . 1484 elY,•11 1423 1485 1424 Because the terms (blei- Ott! )/(1—s) are positive, the respective maxima and minima are attained by choosing the weights wi as small 1486 1425 as possible. That is, for a given number of total cooperators lal, the extrema are attained for those states a for which E irrews and 1487 1426 E ja,,,,w, are minimal. This observation implies that condition S72 is equivalent to 1488 Hite et el. www.pnes.orgfcglkontentMort/1407811711 I 12 of 19 EFTA01199766 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 O max b 1—s —a/A)) <1< min + asisn-I Ittn_j_i 1—s (bit , —ai)} with 45 being the sum of the j smallest entries in (vi) . Proof of Proposition 3: We already know that for a ZD strategy to be a Nash equilibrium, one of the three conditions need to be fulfilled (otherwise there would be a different ZD strategy that yields a higher payoff). Conversely, let us assume that one of the three conditions of the proposition is fulfilled. I) Ifsit =0, then by Eq. S20, the mutant's payoff is 1=1, irrespective of the mutant's strategy. In particular, there is no incentive to deviate. ii) Suppose s R > 0,1= an-I, and let us assume to the contrary that the zero-determinant strategy is not a Nash equilibrium. Then there is a mutant strategy such that 1> a,,A. Because the residents collectively enforce the relation if =st k+ (1 —sR)a,,_i and because s' >0, we can conclude k>a„_i. However, then the average payoff of all group members exceeds a„..i, contradicting the assumption that an_i is the maximum average payoff per round. iii) Under the assumption that be is the minimum average payoff per round, the case s R <0 and 1 = bo can be treated analogously to the previous case. Proof of Proposition 4: Because p is a Nash equilibrium, the payoff of any mutant strategy j, satisfies I <an-1. Let us first consider a mutant who applies the strategy i.e., h i =0 for all (S. j). Because the mutant never cooperates, =0 for all j, and by Lemma 1 also Poi =0 for allj except/ e {0.n — 1}. The values of fia„_i and ODA can be obtained by calculating the left eigenvectors of the transition matrix (An -1) (D.0) n — 1) (D,O) PC4-2 I - PC„-2 • PD.0 1 -pD.0 If we hadpt .-2 =1, then the assumption that p players start with cooperation would imply i Da_i =1, such that the payoff of AHD was ir=b„_, >a„..1. Because this contradicts the assumption tha p is a Nash equilibrium, we conclude that pc.„_2 =O. A calculation of the left eigenvector of Eq. S74 with respect to the eigenvalue 1 then yields As a result, the payoff of AID is ~tDn 11 .[P no +Pas) 1 / (I -Fpoo) + bo k = b„_APD„,,A + bean! 1 +pn.o oot The requirement ir <an-I then implies the condition S28c. As another special case of a possible mutant strategy, let us consider a mutant p such that pca_l =0, and pso =I for all other states (S,j). With a similar calculation as in the previous case, one can determine this mutant's payoff as (ao-i + bo-0 'Nu + 0 0 = °n-iPco-1+bo-ipao-t +koPco • 1 + 2pn.1 The requirement if <€2,,_1 implies condition S28b. Suppose the n —1 residents apply the pure memory-one strategy p. Due to Akin's lemma n-I n-I E (pci- Ovci + Expo; =0, fro which because of Lemma 1 and =1, N o-2=0 simplifies to VC„-z=paimi +pnevn.o. Then, irrespective of the mutant's strategy, the payoff k satisfies — a n_t = r .(1 (ai—aa_Octi + Qtt, Lemma = (ae — a o.4)ke + (bo_i — a o..4)1t.o.4+(bo — )ip.° ) Lemma I, = Igo +(bo-3 —°n-i)vca-2+ (bo —ao-i)voa Etl = l"0 — an-0(Prkiva3 +PD.ovass)+ (b0 — an-dvao (528bal = frbn-I (72e-t — clo)] • V Al + [(bn-I "an-I)PD.0 (ao-i bad • VD.0 S 0 , that is, p is a Nash equilibrium. 1551 073] 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 IS74] 1570 1571 1572 1573 1574 1575 1576 1577 (S751 1578 1579 1580 1581 1582 IS761 1583 1584 1585 1586 1587 1588 IS77] 1589 1590 1591 1592 1593 1594 (S781 1595 1596 1597 1598 1599 (S791 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 Hite et el. www.pnas.oegfcglkontent/thort/140788711 13 of 19 EFTA01199767 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 164gio 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 Proof of Proposition 5: The proof follows along the same lines as the proof of Proposition 4. Suppose pis a Nash equilibrium and assume thatpal 00 (because p is a pure strategy it follows that pal =1). We have to show that these assumptions imply pc,' =pca-2 =0 and (b,,-.1+a0)12<bo. To this end, let us first consider a mutant with strategy AMC, such that psi =I for all (S./). Because of Lemma 1, and because the mutant always cooperates, the only possible outcomes in a given round (from the perspective of the mutant) are (C. is — 1) and (C.0). The transition matrix is given by The limit distribution of this transition matrix is such that the payoff of ABC becomes (C,ri —1) (C.0) — 1) (c 0) PCA-1 I 1 0 ( 1 Pco 1—Pc„ -1 ) , (PcA-1) = 2 —Pca-1 2 PC„-1 + (1 — = an-a0c0-1+ agto 2 —Pc„-1 If we had pc.„-I =1, this payoff would equal to ir=a„_i >60=x, contradicting our assumption that p is a Nash equilibrium. Thus, pc„-1 = 0. Let us now consider another mutant strategy p with PD.o=1 and psi =0 for all other states. Again by constructing the transition matrix [with possible states (C. n —1). (C.0), (D.n —1), and (D,0)], one can compute the payoff of this mutant as N-1+ —Pc.-2)(bo+ao) = 3 2Pcn-2 If pc,,_2 =1, then 1=6„-i >bo= if, again contradicting the assumption that p is a Nash equilibrium. Therefore, isc.,,_2= 0, and the mutant's payoff becomes ir= (b„_i +b0 +a0)/3. For p to be an equilibrium, this payoff needs to satisfy it <b0, which yields (b„..) +ao)/2sbo. Let us first consider the case pDA =0. Because the memory-one strategy also prescribes to defect in the first round and because pao= 0, it follows that all residents play defect throughout the game, irrespective of the strategy of the mutant. Thus, any mutant's payoff can be written as k=i)a.obo +itoao <bo= A. This shows that p is a Nash equilibrium. Let us now consider the second case, pc,,_) = pc„..2 =0. Without loss of generality, we can also set pal = 1, and by assumption, pa0= 0. Under these conditions, Alcin's lemma becomes = ka-1 +VC a-2. Then, irrespective of the strategy of the mutant, the mutant's payoff satisfies rt-1 ir - b0= Da / - bo)fti +(bi -bo)Poi i-0 Ltm=ma +(ao -bo)Pc.o+ (bn-i Isma I, = -bo)vc,„_) +(ao -60)voi + (b,,_) - bo)vc,,,_2 Ell I CI(an-1 - bo)vcA-1 + (ao -bo)(Vca-1+VCA-2) bO)9C4 -2 = (an-1 no - 2b0)' VC„-1 + (bn-i +a0 - 2b0). van-2 5 0, with the last inequality being due to the fact that (b„_) + ao)/2 sb0 and a,,..) <b,,_). Again, we conclude that p is a Nash equilibrium. Proof of Proposition 6: Due to our construction, strategy alliances require each ally to apply a ZD strategy p with parameter I, s, and 0 and weights w. To enforce an effective slope r 1, Eq. S33 implies that the parameter s needs to be chosen such that s = sA + (11A DIVA (] —SA). For? =1, we get s=1, and Proposition 2 guarantees that the payoff relationship is enforceable (independent of I and the weights wA). Let us now assume that s <I (and therefore? < I). Because of Proposition 2, the alliance can enforce the payoff relationship (I.?) if and only if we can find an appropriate weight vector w= (w)) such that 1675 1676 1677 1678 1679 1680 1681 (MO) 1682 1683 1684 1685 1686 1687 1688 1689 lS811 1690 1691 1692 1693 1694 1695 1696 (582) 1697 1698 1699 1700 1701 1702 1703 l 5831 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1 ISM) 714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 (5851 1732 1733 1734 1735 1736 Me et el. www.pnes.orgicglkontentishoit/140788711I 14 et 19 EFTA01199768 1737 1799 max {Is' — 1 _ 08.4 _ oirt l'nfi-i l —sii 1 </ ≤ min asisn-I fa, + l _ friA _ owA 1 _ 5,4 }. (S861 1800 -i 1738 1739 asssn i) gin- I bit. —a) 1801 1740 1741 As in Proposition 2,W) refers to the sum of the smallest entries in w (excluding the entry corresponding to the focal player). Because w 1802 1803 only has the two possible entries w4 and tv-A, we can write 67 as 1742 1804 1743 1805 1744 fivA if wA <w- A. j <nA —1 1806 1745 1746 w., —_ {(n A — Owl + (1 — nA +1)W-A if wA <w-A. j> n A - I 1807 jw-A if wA >w-A, tot -n4 15871 1808 1747 (I +11-4 -n)tvA + (n -nlitr A if wA >WA, j > it -nA 1809 1748 181 1749 The constraint (nA — 1)IvA + (is — nA)w-A =I implies iv-A= [1— (nA —1)w/fin —nA). By plugging this into Eq. S87, we can calculate 1811 the expression 1750 1812 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 iwA 1 < if w.4 <_ -1' j nA -1 n-1, if iiP4 < , j> nA - 1 1— (nA —1)wA 1 is —j— 1 if 1 - (nA -1)wA n -flA MB] 1 - (nA -1)wA 1 1 if HA > _. j< It -nA n —1 n —nA 1 - (n - j -1)wA if wA 1 i.j> n —nA. 1— (nA —1)wA 1813 1814 1815 1816 1817 1818 1819 182109 1821 1822 1823 1762 Due to condition 586. the space of enforceab e payoff relations becomes maximal when we choose the weight wA such that 1824 1763 k/[1— (nA —1)wAl becomes maximal. Eq. S88 suggests that by/[I — (nA —1)wAi is monotonically increasing in wA. Thus, considering 1825 1764 the restriction 0 <wA <1/(nA — 1), the maximum is attained for HA-. 1/(nA -1), which also implies 1A> 1/(n- 1). From Eq. S88 1826 1765 we obtain 1827 1766 1828 1767 1829 1768 fi m ii _ { n n A co if jsn-nA 1S891 1830 1769 .a -Inn' -If 1 — (nA —1)wA 1831 if j >n — nA . 1770 1832 1771 Thus, for wA sufficiently close to 1/(nA-1), condition S86 is satisfied if and only if 1833 1772 1834 1773 1774 0sisn-n-4 ' n — it— 1 .5" nA -min -1 ' is .11'- 1 SA ) . max {bi— j A i r): 1 ) <I < min {a, +n —* , i 1 .n. bs —ai (S981 1835 1836 1775 1837 I 71011 This is condition S34. Moreover, if nA <n/2, setting j=n/2 in the left side of Eq. S90 suggests 1838 1777 1839 1778 n/2 bna — a/t12-1 bn/2 I. 1591] 1840 1779 n — nA 1 — sA 1841 1780 1842 1781 Similarly, setting j=n/2 — 1 in the right side of Eq. 890 leads to 1843 1782 n —n/2 k ip —anfz-a 1844 1783 1784 1≤64/2-1 + n _ nA 1 — sA ' 1S92] 1845 1846 1785 Summing up these two inequalities shows that sA > — [nA /(n —nA)[. 1847 1786 Proof of Proposition 7: The proof follows the lines of Propositions I and 2. By its definition (Eq. S40), a ZD strategy for a synchronized 1848 1787 alliance has the form 1849 1788 1790 1789 1791 1792 1793 1794 1795 1796 1797 1798 1+0[0 —sA)(1—a„,,,n_i )—II: nn7 i (b„Ay —an.,n_id 18509 1851 if S =C 1852 Psi = { P 931 1853 4,[(1-sA)(/-M+ 4 01-ai_1)] if S=O, 1854 1855 for 0 <j S n —nA. The conditions pci <1 and poi ≥ 0 imply the following sign constraints 1856 1857 1858 ., n -nA -j ,.. 0 (S9421 1859 a. 0[(1 -sio- a„An_ij n n.„1 (a04,5 - anAti-id 1860 Hltbe et al. www.pfsas.org/eglkontenthhortn40703711I 15 et 19 EFTA01199769 1861 1923 0 ≤ 0 [(I -sA)e -1)1) +n - * 1.4 0 1- afri)]. 1862 [S94b] 1924 1863 1925 1864 1926 for all 0 ≤j < n -n-4. Setting j=n—nA in Eq. S94a yields 1865 1927 1866 441 —sA)(1 — an.° 1. 0. (S951 1928 1867 1929 1868 and setting j=0 in Eq. S94b yields 1930 1869 1931 1870 0≤0(1—sA)(1—bo). (S961 1932 1871 1933 1872 Adding up Eqs. S95 and S96 shows that 1934 1873 1935 1874 #(1 -sA) ≥ O. 1S971 1936 1875 1937 1876 For s 4 =1, we can ensure that 0 ≤psi S I for all entries in Eq. S93 by choosing a 0> 0 that is sufficiently small. Let us therefore 1938 1877 assume ? # 1. Because we also have 0# 0 by definition, condition S97 becomes 0(1 — s-4) > 0. Dividing the sign constraints in Eq. S94 1939 1878 by 0(1 —sA) then implies 1940 1879 1941 1880 max {b.- J 2—} Vs min la„A4)_i +n -"A -j bnAli - anA+)-1). (8981 1942 i b. -a._i 1881 0s)sn-nA I n flA I SA ssisn-nA n —11A I —sA 1943 1 1 882 944 1883 This condition in turn is equivalent to condition S42. 1945 Conversely, suppose Eq. S42 is satisfied for some (/,?) with : A O 1. It follows that the sign constraints S94 are satisfied for every 1884 1946 choice of A. subject to the condition 0(1 —sA)> 0, which implies that the entriespsi in Eq. S93 satisfypci s I and ppi ≥ 0 for all/ By 18112 choosing # sufficiently close to zero, we can also ensure that pc) >0 and ppo ≤1. This shows that the payoff relationship (1,sA) is 1947 188 enforceable. 1948 1887 Finally, suppose nA ≤n/2. Setting/ =0 in Eq. S94a yields 1949 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1899 1898 1900 0[(1 -sA) (I - a„A _I) - (b„A - a„A_ I)) 5,. O. 1S991 and setting j = nA in Eq. S94b results in „A 0 ≤ 0 [(1 —?)(/ — b„.i) + n — n ..1 (bo — anA ..1 )1 . 1S1001 Adding up these two inequalities shows that /I .4 H 0 (s4 + ≥ O. n ItA 01011 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1901 Combining Eqs. S97 and SIO1 gives 0> 0, which in turn implies —[nA /(n —n-4)] ≤sA ≤ I. 1963 1902 1964 1903 1. Press WH, Dyson PI (2012) Iterated Prisoners Dilemma contains strategies that dominate any evolutionary opponent. Prot Nati Arad Sc) USA 109126):10409-10411 1965 wan 2. Akin E (2013) Stable cooperative solution for the iterated prisoner's dilemma. 1966 3. Kerr B, Godfrey-Smith P. Feldman MW (2004) What is altruism? Trends (col Evol 190i:13S-140. 1905 4. ledyard 10 (1995) Pubk goods: A survey of experimental research. The Handbook of Experiments Economic. eds Kagel lit Roth AE (Princeton Univ Press. Prin(eton). 1967 19061e S. Diekmann A (1985) Volunteer's dilemma. 1 Conflict Resoled 29:60S-610. 1968 i96I" 6. HAW C. Trauisen A. Sigmund K Partners co nvals7 Strategies to, the Iterated prisoners dilemma. Working Paper. 2013. 1969 7. HAW C Rohl T. Milinski IA (2014) Extortion subdues human players but Is finally punished in the prisoners dilemma. Nat Common 5:3976. 1908 8. Stewart Al, Plotkin M (2012) Extortion and cooperation in the prisoner's dilemma Poe Nail Aced Sci USA 109(26):10134-1013S. 1970 1909 9. Hilbe C. Nowak MA. Sigmund K (2013) Evolution of extortion In Iterated Prisoners Dilemma games. Prot Nate Atad Sc? USA 110071:6913-6918. 1971 10. Stewart Al, Plotkin 1B(2013) From extortion to generosity, evolution in the Iterated Prisoner's Dilemma. Proc Nat1 Aced So USA 110(30:15348-15353. 1910 1972 11. Hite C, Novak MA, Trauben A (2013) Adaptive dynamics of extortion and compliance. PtoS ONE 8111)x+77886. 191m n. szomoki A. Pert M (2014) Evolution of extortion In structured populations. Phys Rev E stet monansott matter PhyS 89:022804. 1973 iusin 13. Pan 1., Ito D, Rang 2, thou T (2014) Zero-determinant strategies in the iterated pubic goods game. 1974 14. NOwak 6A. Sigmund K (1993) A Strategy 01 win-Start lose-shill that outperforms tit-fa-tat in the Prisoners Dilemma game. Nature 344(6432):56-93. 19 Via IS. Haven C. Schuster HG (1997) Effects of increasing the number of players and memory size In the iterated prisoners dilemma: a numerical approach. I00( WO SO 26t513-519. 1975 1914 IL Sigmund K (2010) The Calculus of Se/Pshness (Princeton Mir Press, Princeton). 1976 19 811 A. Friedman 1O971) A noncooperative eguikbrium for suPergarnes. Rev Ron Stud 313i-12. 1977 1B. Fudenberg D. Mackin E (19.96) the folk theorem in repeated games with discounting or with incomplete information. Econometrica S4(3):533-554. 1916 19. Boyd R. Richerson Pi (1988) The evolution of reciprodty in sizable grown./ Meer anal 132(3):337-356. 1978 1917 20. Kurokawa 5, tiara V (2009) emergence of cooperation In pude( goods games. Prix Old SCi 2760660):1379-13M. 1979 21. Chen 1, Zinger A (2014) The robustness of zero-determinant strategies in iterated Prisoner's Dilemma games. 1 Theor Bid 387%6-54. 1918 1980 1919 1981 1920 1982 1921 1983 1922 1984 Hilbe et al. www.pnas.OrgAgikontentishort/1407887111 16 of 19 EFTA01199770 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 A towage peon or co-players R' Public Goods Game •c -c ltd 10.03 1. Payoff of bcal player B Volunteers Dilemma Average00AB OICO-PlOYenl *o av ha Payoff of fccatplayer Fig. 51. Illustration of 20 strategies in the case of equal weights, iv) =1/(n —1), for all j ii, and for (A) the linear public goods game and (8) the volunteers dilemma. The blue-shaded area represent all feasible payoffs, with the x axis representing the payoff of player i and they axis representing the mean payoff of is coplayers. The dashed diagonal gives the payoff combinations for which rri =re'. In both graphs, the strategy of player i is fixed to some ZD strategy, wfiereas for the coplayers we sampled 104 random memory-one strategies. Red dots represent the resulting payoff combinations, and the gray line gives the 2002 prediction according to Eq. 513. For both graphs, we considered an infinitely repeated game in a group of size n=0. Parameters: (A) public goods game 2003 1a1= Q+ 1)rcin — c and bi =jrcinj with r = 2.4 and c = 1. For the strategy of player i, we used a generous ZD strategy with parameters / =rc—c, s = 2/3, 4=1/2. 2004 (8) Volunteers dilemma (ai = b - blip =b, and by =0) with b=1.S, c=1; player i applies an extortionate strategy with parameters 1=0, s =9/10, 0=1/2. 2005 2006 2007 A Public Goods Game 2008 snot 2009 2010 1 2011 2012 *It 2014 *AS- n n 2016 b et 2015 2017 2018 o Bawer* of l Baseline 2019 payoff -e Parr( 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2093 2032 2094 2033 2095 2034 2096 2035 2097 2036 2098 2037 2099 2038 2100 2039 2101 2040 2102 2041 2103 2042 2104 2043 2105 2044 2106 2045 2107 2046 2108 2013 B Volunteers Dilemma Slope. Fig. 52. Enforceable payoff relations in the case of equal weight on all coplayers for (A) the linear public goods game and (8) the volunteers dilemma. A pair (1,$) is enforceable for a given group size n if the point is within the respectively shaded area. The set of enforceable pairs for large n is a subset of the re- spective set for smaller n, i.e., the set of enforceable pairs shrinks with increasing group size. Parameters: (A) linear public goods game with r = 2.4, c=1; (B) volunteer's dilemma with b=1.5, c=1. 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 Hite et al. www.poes.org/cgikontentishortiMON14711I 17 of 19 EFTA01199771 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 Slopes 2135 2136 1 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 -1 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 5 a a to 04 g 2 2 4 5 6 7 Group size n Grim & TFTn-1 WSLS Grim & TFT Ng. 53. Stable memory-one strategies for the linear public goods game. The figure illustrates for which parameter regions the strategies Grim, TETA, and WSLS are Nash equilibria, provided that the public goods game constitutes a social dilemma 1 <r <n). Grim and TfT,i are always Nash equilibria. 7P7 for k <n— I is never a Nash equilibrium. WSLS is a Nash equilibrium when (N.' +b0)/2 san.i, which yields r z2n,t(n +1). In particular, WSLS is always a Nash equilibrium when r22, irrespective of group size. =1 =2 3 Baseline payoff Ng. 54. Strategic power of strategy alliances in the public goods game. Each colored area illustrates the set of enforceable payoff relations according to Proposition 6 for different alliance sizes M. T e set of enforceable payoff relations for small nA is a subset of the respective set with larger nA. Consequently, the larger the alliance, the more extreme payoff relations it can enforce. Parameters: linear public goods game with r= 2 4. c =1, and group size n=10. 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 1177 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 Hite et el. www.pnes.OrgeglkOntenthhcet/140788711I 18 of 19 EFTA01199772 2233 2234 2235 2236 2237 2238 2239 2240 1141 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 1161 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 A i.o a 0.B 5 0.6 10.4 02 C.2 0.0 a 2.0 1g 1.6 ros 0.5 Os Extortionate alliance 100.000 200.000 AIlance OWSklerS 0 100.000 lime 200.000 B l.0 OA OA 0.4 02 OA Fair alliance PAisnce ( Outsiders 2.0 Is 1.0 os OA 100.000 200.000 Miaow 100.400 0 Time 200.000 C Generous alliance 1.0 OA OA 0.4 02 00 0 2.0 I.5 IA OA AO 100.000 200.000 row= Outsiders ance 0 100.000 lime 200400 Fog. SS. Performance of three different 2O alliances against adaptive outsiders. For each simulation, the strategy of the 2O alliance was fixed, whereas outsiders were allowed to adapt their strategy as described in the text. (Upper) Average cooperation rate during each repeated game. (Lower) Resulting payoffs for allies and outsiders. fad) panel depicts the average of 20 simulations. All simulations were run for a public goods game with r =3 and c= 1, in a group of size n = S with n" =2 allies. For the strategies of the ZO alliances we used (A) an extortionate strategy with =0, s = 0.8, and #=1/2; (8) proportional Tit-for-Tat with s = 1 and 0=1; and ft') a generous strategy with i= rc — c= Z 5=0.8, and #=1/2. PNAS prooi Embargoed 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 14110e et el. mvw.pelaS.049/CgikOntent/160a/1407687111 19 of 19 EFTA01199773 AUTHOR QUERIES AUTHOR PLEASE ANSWER ALL QUERIES 1 Q: 1_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after "This": "This is merely done...." Q: 2_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after "This": "This always holds ...." Q: 3_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after "This": "This allows us to...." Q: 4_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after "This": "This holds more generally...." Q: 5_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after "This": "This holds more generally ...." Q: 6_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after "This": "This allows us to...." Q: 7_Please clarify if (2, 10) refers to references or equation numbers. Q: 8_Please clarify if (2, 6) is equations or references. Q: 9_Please rewrite the following for clarity: "...is also a stable...." Q: 10_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after `This": "This shows that...." Q: 11_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after `This": "This is condition ...." Q: 12_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after "This": `This shows that the...." Q: 13_For SI ref. 2, please provide journal name, volume, issue, and page range. Q: 14_For SI ref. 5, please provide issue number. Q: 15_Is SI ref. 6 published. References must be published to be listed in the reference list. Please provide type of reference and all necessary publishing information or delete if unpublished and renumber reference section accordingly. Q: 16_For SI ref. 12, please provide issue number. Q: 17_For SI ref. 13, please provide journal name, volume, issue, and page range. Q: 18_For SI ref. 15, please provide issue number. EFTA01199774 AUTHOR QUERIES AUTHOR PLEASE ANSWER ALL QUERIES 2 Q: 19_For SI ref. 17, please provide issue number. EFTA01199775

Technical Artifacts (39)

View in Artifacts Browser

Email addresses, URLs, phone numbers, and other technical indicators extracted from this document.

Domainwww.imes.org

Domainwww.pfsas.org

Domainwww.poes.org

Email[email protected]

Phone(3315101243

Phone

1231
1170

Phone

1237
1176

Phone

1258
1320

Phone

1291
1353

Phone

1389
1451

Phone

1435
1375

Phone1560-1563

Phone

1589
1590

Phone1660)1379

Phone

1760
1761

Phone

1845
1846

Phone

1849
1788

Phone

1894
1895

Phone

1899
1898

Phone2760660

Phone291-2294

Phone3003614363

Phone3743386

Phone6404321

Phone641291-1298

Phone

714
1715

Phone7687111

Phone7811711

Phone7837111

Phone7887111

Phone913-6918

Phone9134918

Phone929-9934

Phone

931
1853

Wire RefReferences

Wire Refreference

Wire Refreferences

Wire Refreferring

Forum Discussions

This document was digitized, indexed, and cross-referenced with 1,400+ persons in the Epstein files. 100% free, ad-free, and independent.

Donate More ways to support →

Annotations powered by Hypothesis. Select any text on this page to annotate or highlight it.