key: cord-0322294-cb2upg6s authors: Liang, Rizhou; Wang, Qinqin; Zhang, Jiqiang; Zheng, Guozhong; Ma, Lin; Chen, Li title: Dynamical reciprocity in interacting games: numerical results and mechanism analysis date: 2021-01-31 journal: nan DOI: 10.1103/physreve.105.054302 sha: 5ed1adb08e7c499bc5ec038aca83becf14f0c910 doc_id: 322294 cord_uid: cb2upg6s We study the evolution of two mutually interacting games with both pairwise games as well as the public goods game on different topologies. On 2d square lattices, we reveal that the game-game interaction can promote the cooperation prevalence in all cases, and the cooperation-defection phase transitions even become absent and fairly high cooperation is expected when the interaction goes to be very strong. A mean-field theory is developed that points out new dynamical routes arising therein. Detailed analysis shows indeed that there are rich categories of interactions in either individual or bulk scenario: invasion, neutral, and catalyzed types; their combination puts cooperators at a persistent advantage position, which boosts the cooperation. The robustness of the revealed reciprocity is strengthened by the studies of model variants, including asymmetrical or time-varying interactions, games of different types, games with time-scale separation, different updating rules etc. The structural complexities of the underlying population, such as Newman--Watts small world networks, ErdH{o}s--R'enyi random networks, and Barab'asi--Albert networks, also do not alter the working of the dynamical reciprocity. In particular, as the number of games engaged increases, the cooperation level continuously improves in general. Our work thus uncovers a new class of cooperation mechanism and indicates the great potential for human cooperation where concurrent issues are so often seen in the real world. The rise of human civilization is built upon the widespread cooperation at almost every corner of socioeconomical and other activities [1] . Its decay, by contrast, generally leads to regress of human welfare or even wars. The recent decades have witnessed some imminent crises such as global warming, trade wars, and more recently COVID- 19 [2] etc. Solutions to any of them require multilateral cooperation, a throughout understanding of what motivates cooperation and how it evolves, and why it fails, is then needed. According to the Darwinism, however, natural selection favors the fittest, those who are altruistic incur a cost to themselves, leading to a less chance to survive. Cooperation is not a reasonable option by logic. Revealing the hidden mechanisms behind is therefore of fundamental importance and has been listed as one of the grand scientific challenges within this century [3] . Within the framework of evolutionary game theory, many important progresses have been made [4, 5] . By analyzing those canonical models such as prisoner's dilemma (PD), the snowdrift game (SG), the public good game (PGG), and the collective-risk dilemma etc, valuable insights are obtained and several mechanisms are revealed. These include direct [6] and indirect reciprocity [7] , kin [8] and group selection [9, 10] , spatial or network reciprocity [11] , reward and punishment [12] , social diversity and hierarchy [13] [14] [15] , and so on. In particular, it is found that the presence of population structures, either static or dynamic, is able to promote cooperation com- * Email address: chenl@snnu.edu.cn pared to the well-mixed scenario, because cooperators form clusters that can prevail against the invasion of defectors. This so-called network reciprocity has been extensively studied in the past few decades [16, 17] . Alongside these theoretic insights, recent behavioral experiments also expand our understanding of cooperation [18] . As a new paradigm, the recruiting volunteers are configured with some given topologies and rules, they are well-motivated to play the games. But due to the complexities like human psychology, cultures or personalities etc, inconsistencies with theoretic predictions are often unveiled [19] . For example, experiments with static structured population do not found their advantage in promoting cooperation in general as the network reciprocity predicts [20] [21] [22] . Some extra conditions regarding the game experiment settings have to be satisfied for cooperation to thrive [23] . These facts imply that some essential factors could be missing in the most of current game-theoretic models, and the experiment-driven model improvement effort is required in the future. As a common practice, most of the existing works mainly focus on a single game scenario, with a belief in the mind that when the dynamics of a single game is well-understood, the wisdom obtained is supposed to be applicable to many other situations, even with more games being evolved. The philosophy behind is the reductionism essentially. In the real world, however, entities are generally evolved in multiple games simultaneously, where they could influence each other's evolution. For example, colleagues potentially work in a couple of concurrent projects, nations are involved in several issues such as trade, security, culture etc. In these contexts, the progress or conflicts in one issue is likely to affect the evolution of others, they have a stake in each other explicitly or implicitly. In non-human species, field research have made similar observations. For example, in chimpanzee societies, their activities such as grooming, hunting, sharing meat, supporting one another in conflicts, border patrols etc are found to be closely correlated with each other, e.g. a male chimpanzee with good hunting skill is more likely to be groomed by others and vice versa [24] . Till now, only a few models have considered the multiple games, but with different emphases. One line is along the multi-game dynamics. An early conceptually related work is the study by Cressman et al. [25] where two 2-strategy games are played and the eventual states can be described by the dynamics of the separate game. Later research show that the fate of a single game generally cannot be determined without incorporating the messages of other games [26] [27] [28] , e.g. persistent cycles could arise within coupled one-equilibrium games. These works are done in a mean-field sense, and only consider one-shot game scenario where the potential reciprocity in evolution is beyond their scope. Another line is within the framework of interdependent networks, where different games are played on different layers of networks and they are coupled by means of the payoff/utility function [29] [30] [31] [32] [33] [34] [35] [36] [37] . Zhen et al. [29] consider two PGGs being played on two symmetrically connected lattices, and the utility function includes not only the contribution of the payoff of the focal site plus its neighborhood's payoffs, but also the contributions of the neighborhood in the other lattice. They found that as the neighborhoods' contribution in both lattices increases, the cooperation level is promoted. Gómez-Gardenes et al. [30] extend the study to arbitrary number of layers where they adopt Erdős-Rény random graphs and PD for each layer, and the net payoff is through the equal contribution among all layers used for the strategy updating. Their work shows a resilience of cooperation for extremely large values of temptation to defect and this resilience is intrinsically related to a non-trivial organization of cooperation across different layers. Asymmetrical game settings are also studied [33, 34] , where different games are unfolded on different layers. For example, Santos et al. [33] considers PD and SG being posed on two layers of regular random networks respectively, individuals imitate neighbors from the same layer with a probability, and neighbors from the second layer with a complementary probability. Therefore the strategy transfer is allowed between layers and they find that while such coupling is able to promote the cooperation in the PD layer, but it is detrimental for the cooperation in the SG layer. Within the framework of interdependent networks, its specific construction matters [31, 32, 35, 36] . In particular, the link overlap across different layers is shown to be crucial [35] , there is no benefit for cooperation if without any structural correlation. The impact of other topological effects such as degree mixing is also studied in [36] , and some other dynamical processes like spontaneous symmetry breaking between different layers are uncovered in [37] . All these promotions are attribute to the interdependent network reciprocity [29] , a subcategory of network reciprocity. Intuitively, one can view it as an efficient construction of population relationship that tends to maximize the previously uncovered network reciprocity. The most related work is by Donahue et. al [38] , where they propose a framework termed multichannel games. Each channel represents a repeated game and players interact over multiple channels and these channels are dependent with each other. Based on two donation games, they uncovered the evolutionary advantage of cooperation due to the game linkage. This finding acts as a good starting point towards a new fundamental category of reciprocity -dynamical reciprocity, as a counterpart of the well-studied network reciprocity. It emphasizes that the reciprocity stems from the game-game dynamics rather than the underlying topology of population. A bunch of fundamental questions, however, remain unanswered: what's the typical evolutionary dynamics when more games are engaged? how robust is the dynamical reciprocity? what's mechanism behind this new type of reciprocity? These are what we are trying to answer in this work. The aim of the present paper is to introduce and formalize the interacting games and systematically investigate the impact of game interplay on the cooperation prevalence. Different games interact through a perceived payoff, a function of payoffs in all games. To our surprise, we are not only able to see the cooperation promotion, but also the cooperationdefection phase transitions could disappear, where an absorbing state of nearly full cooperation is approached. More numerical studies show this sort of promotion is quite robust, confirmed by variants with both different game dynamics (the types of game, the updating rule and synchronicity, the game coupling etc.) and different underlying topologies. A meanfield treatment indicates that new dynamical routes towards cooperation come into play, which is confirmed by detailed mechanism analysis. There, apart from the invasion category of interaction that is also present in the single game case, two other new categories of interactions -neutral and catalyzed -are identified. Working together, these three categories of interactions lead to a persistent advantage of cooperators over the defectors. The promotion is further enhanced when more games are engaged, where the revealed mechanism still holds. The paper is organized as follows: In Sec. II, we formulate the interacting games of arbitrary number. In Sec. III, the preliminary results of two symmetrically interacting games on 2d regular lattice are shown. In Sec. IV, we present a meanfield treatment within the framework of replicator equations to see what dynamics would arise in the presence of gamegame interactions. In Sec. V, the dynamical mechanism of the revealed reciprocity is discussed in details by classifying all interacting pairs. More numerical results regarding the robustness of the revealed reciprocity are provided regarding dynamical variations in Sec. VI, and structural variations in Sec. VII. More games are considered in Sec. VIII to extrapolate what would be expected when the number of games increases. Finally, some concluding remarks are given in Sec. IX, together with the implications and some open questions being listed. A short version of the work is presented in a short letter [39] . (a) Consider a group of networked players, they play two games G1,2 simultaneously, two payoffs are obtained accordingly, which can be interpreted as the fitness in their evolution. (b) When two neighboring individuals, say player x and y, are to update their strategies with respect to a given game (e.g. game G1 here), the update not only depends on the payoffs obtained in G1 (with weight 1 − θ), but also the one in the other game G2 (with θ), this combination is termed as the effective payoffs, see Eq.(3). We formulate the interacting games within the evolutionary framework for where m games are denoted as G = {G 1 , G 2 , ..., G m }, they are played simultaneously in the population of size N sharing the same underlying topology. The corresponding strategy set is denoted as S m . Assume that each player could be in one of two states in each game: cooperation (C) or defection (D), i.e. S 1 = {C, D}. By combination there are 2 m elements in S m , e.g. the strategy set for m = 2 is S 2 = {XY |CC, CD, DC, DD}, where X, Y correspond to the state in game G 1 , G 2 , respectively. In our study, we adopt the general pairwise games (GPG) and public goods game (PGG). The GPG is defined as follows: when both players cooperate, each gets a reward R; when both defect, then each gets a punishment P ; and the mixed encounter yields a temptation T for the defector while the cooperator becomes a sucker with a payoff S. Different rankings of the four payoffs lead to different game types. Specifically, four types of games are defined in S − T parameter space by fixing R = 1 and P = 0: i) 1 < T < 2 and 0 < S < 1 for the snowdrift game (SG); ii) 0 < T < 1 and 0 < S < 1 for the harmony game (HG); iii) 0 < T < 1 and −1 < S < 0 for stag hunt game (SH); iv) 1 < T < 2 and −1 < S < 0 for prisoner's dilemma (PD). The PGG can be considered as an extension of PD where an arbitrary number of players can play together in a group. It is defined as follows: in each round, every player in the group chooses either to contribute 1 to the common pool as a cooperator, or nothing as a defector; the sum of the contribution is then multiplied by a gain factor r > 1, reflecting the synergetic effect; finally, the resulting amount of benefit is equally shared among all members in the group, including those defectors. Therefore the net payoff of cooperators have to subtract 1 from the shared amount, whereas the defectors don't need to do so. If without any mechanism, defection is preferred. The system is initialized with random conditions if not stated otherwise where each player randomly chooses to cooperate or defect in each game. The evolution follows the stan-dard Monte Carlo (MC) procedure. At an elementary step, a random player x is chosen and a random game g ∈ G is played, then we compute its payoff Π x according to the game setting. Next, one of its neighbors y ∈ Ω x is picked randomly, also its payoff Π y is computed. Lastly, player x adopts y's strategy according to some function W (s g y → s g x ) = W (Π x , Π y ) that translates their payoff difference into the learning propensity. In our study, the Fermi rule [40] is adopted as where K is a temperature-like parameter, measuring the uncertainties in the strategy adoption, its inverse can be interpreted as the selection pressure. K is fixed at 0.1 throughout the work if not stated otherwise. Π g x is the effective payoff perceived by player x in game g defined below that is used to update its strategy. A full MC step is comprised of m × N such elementary steps, meaning that every player is going to update its strategy once for each game on average. The effective payoff Π g x that is formally defined as by which the game-game interactions come into play. It captures the facts that the decision-making of a given game g would base upon a perceived payoff through integrating payoffs in all games instead of simply the one under play. Here θ i ∈ [0, 1] is the contribution weight of game G i . The distribution P g (θ 1 , θ 2 , ..., θ m ) then determines to how m games influence the perceived payoffs in game g. Note that, the implemented MC simulation procedure is to mimic the continuous-time evolution as in the real world, where the strategy updating is asynchronous. To our interest, we also consider synchronous updating (SU) scheme as follows. All m games are repeatedly played in circular order G 1 , G 2 , ..., G m , G 1 , ...; for a given game, all players simultaneously compute and compare their payoffs to one randomly chosen neighbor, and make strategy update according to Eq.(1). m such discrete steps in the SU scheme guarantee that every game is precisely played once for each player. Four more different updating rules [41] : Moran-like rule, replicator rule, multiple replicator rule, and the unconditional rule are also investigated in Sec. VI for robustness studies. We adopt 2d square lattices with the size of N = L × L as the underlying structure for most studies, where each individual plays the games with its four nearest neighbors, and the periodic boundary condition is assumed. We will also study Newman-Watts networks, Erdős-Rényi networks, and Barabási-Albert networks in Sec. VII. As noted above that all games are assumed to share the same set of links, this is not very realistic of course, since many games are unfolded on their own sets of connections, as in the multiplex networks. However, these structural intricacies in multiplex networks would bring additional complexities that will confound our understanding of the reciprocity purely from the dynamical part, therefore we would like to avoid them. In this section, we shall only discuss the case of two interacting games G = {G 1 , G 2 } with perfect symmetry, they are unfolded on a 2d square lattice (see Fig. 1 ). A simple linear combination is used for the effective payoff as where E(...) is a linear function, P G1 (θ 1 , θ 2 ) = {1 − θ, θ} and P G2 (θ 1 , θ 2 ) = {θ, 1 − θ} are also symmetrical in Eq. (2). Here we interpret the contribution weight θ as the game-game interaction strength. The larger the value of θ, the stronger impact of the other game is posed. The case of θ = 0 reduces the model into two independent games, while the other extreme θ = 1 corresponds to the cross-playing scenario where the decision-making of a given game is entirely determined by the payoffs in the other game. More often cases in reality are supposed to occur in between. These constitute a parsimonious model of two interacting games. A typical example of GPG is two symmetrically interacting PD games reported in [39] , see Fig. 2 . A monotonic promotion of cooperation is observed as the interaction strength θ increases. As the case of cross-playing is adopted, the cooperation prevalence becomes independent of the game parameter b any more. The time series support these observations, and in particularly show that the decay in the initial stage is also become lesser when θ become larger. This means that, even before clusters are formed, the game interaction provide some protections for cooperators from being exploited by defectors compared to the case of independent game case, where the cooperators are invaded and go to be extinct for the given parameter in Fig. 2(b) . We further show that this promotion is universal for all game types within the GPG formulation, see Fig. 3 . We see that as θ increases, the defection region shrinks. In particular, when θ → 1, the cooperation maintains at a fairly high level (> 0.8) for the whole parameter S − T space, the difference among the four games nearly disappears. In fact, a similar observation is also made in two symmetrically interacting PGGs. Fig. 4 shows the prevalence of cooperation within a two-parameter space. While a larger value of normalized gain factorr tends to raise the cooperation propensity, a stronger interaction strength θ generally facilitates cooperation as well. Together with Fig. 3 , we conclude that as the game-game interaction becomes stronger, cooperation continues to improve, and fairly high cooperation is seen as θ → 1, irrespective of the game type. In theory, the evolutionary games can be described by a mean-field treatment based on the replicator equation (RE) [41] , which was introduced in 1978 by Taylor and Jonker [42] . RE characterizes the evolution of frequencies or fractions of different species in the population by taking into account their mutual influence on each other's fitness. Mathematically, it successfully captures the selection process and provides a bridge between the Nash equilibrium in static payoff matrix and the evolutionary stable strategies in evolution. However, there are some assumptions required explicitly or implicitly in the derivation of RE as follows. (1) The population is infinitely large; (2) The population is well-mixed so that each individual interacts with an equal probability with everyone else; (3) No mutation is allowed for the strategies, their frequency changes are only due to the reproduction; (4) The evolution of frequencies is linearly proportional to their fitness difference. Derivations from the finite-size effect or from the structured property in the real population are expected according to assumptions (1) and (2). With these assumptions, let's consider a population with m games, their evolution can formally be described aṡ where f s is the frequency or fraction of population within state s ∈ S m , Π s is its fitness Π s = g∈G Π g s , andΠ = s f s Π s is the average fitness of the whole population. The species with a high fitness tends to increase its fraction, the one with a low value tends to reduce instead. C in θ −r parameter space for two symmetrically interacting public goods games.r = r/(k + 1) is the normalized gain factor, and k + 1 is the number of games the individual is evolved. Also f G 2 C ≈ f G 1 C due to the symmetry. Here L = 128 and K = 0.5. Let's first recall the well-known single pairwise game case, where m = 1 and S 1 = {C, D}. f C,D are the fraction of cooperators and defectors respectively, with f C + f D = 1. By applying the payoff matrix given in Sec. II, the RE is theṅ where There are three fixed points: which correspond to full defection, full cooperation, and a mixed strategy, respectively. Their stabilities are determined by doing linearization of Eq. (5) and computing the corresponding eigenvalues around these fixed points. Only negative eigenvalues guarantee a stable solution. Well-known facts of the stable solution are as follows [43] : full defection for PD, mixed strategy for SD, coexistence of full cooperation and full defection for SH depending on the initial condition, and full cooperation for HG. Now let's extend the RE treatment to the case of two symmetrically interacting pairwise games, where S 2 = {CC, CD, DC, DD}. The four fractions satisfy s f s = 1, s ∈ S 2 . The overall fitness of a given state s is Π s = g Π g s = Π G1 s + Π G2 s . Along the setup in numerical simulations, the two games are symmetrical both in the game parameterization and interactions. Without game interactions, the fitness of the four species in the two games are and With these, we define the effective fitness analogously as corresponding to the effective payoffs as in Eq. (3) in the numerical simulations. The RE Eq. (4) is then rewritten aṡ where Π s = g Π g s = Π G1 s + Π G2 s . And the mean fitness is With some algebra (see Appendix A for details), we can derive the evolution of cooperator fraction with regard to either game, say game where The first term of the rhs. is exactly the same as in the single game dynamics shown in Eq. (5) . It means that the cooperator fraction in game G 1 tends to increase when Π G1 C > Π G1 D . The new dynamics lies in the second term, which captures the gamegame interaction via the interaction pairs of CC -DD and CD -DC. Specifically, when CC individuals meet up with DD ones, if their fitness satisfies Π G2 C > Π G2 D the advantage of cooperators in game G 2 transfer DD to be CD due to the game correlation. By contrast, when CD individuals meet up with DC ones, this advantage is instead to reduce the cooperation prevalence, transferring CD to DD. But for the way around (i.e. Π G2 D > Π G2 C ), however, the advantage of defectors in game G 2 leads to the growth of cooperators in G 1 , transferring DC to CC, which is unexpected when games evolve independently. Therefore, the above equation explicitly captures the cooperation dynamics from both intra-and inter-game interactions. Note that, by the exchange of game label 1 and 2 in Eq. (11) , the equation describes the cooperator fraction evolution of game G 2 , i.e. f G2 C = f CC + f DC . By analytically solving the mean field Eqs. (10) (see Appendix B), we found that the stable solutions are exactly the same as the single game case, meaning the game-game interaction brings no impact on the cooperation evolution at the mean-field level. The two games are decoupled essentially in the well-mixed scenario. This means that the dynamical reciprocity from the game-game interaction should go hand in The cooperator fraction f G 1 C for the two interacting pairwise games in the well-mixed population within S − T parameter space. The resulting cooperation prevalence exactly corresponds to the solution of single game; the less significant bistability in SH game is simply due to the random initial conditions, where fC ≈ fD ≈ 1/2. Due to the symmetry, C in all cases, except very few case along the line S = T − 1 in SH, depending on their initial conditions. Other parameters: R = 1, P = 0, N = 2 14 . hand with network reciprocity -it only works in the structured population. Numerical simulations of two interacting PD games in a fully connected population confirm this conclusion as shown in Fig. 5 , where by starting from random initial conditions, mixed strategies, full cooperation, coexistence of full cooperation and full defection, full defection are respectively seen in the quadrant I to IV, in accordance with the analytical results. In the following section, however, we show that the second term in Eq. (11) does play a role in the structured population, which brings fundamentally different dynamical routes towards cooperation. To understand the physics behind Eq. (11) , and the dynamical reciprocity better, let's list all possible interactions in the two interacting games on 2d square lattices and classify them into different categories according to the net effect in their offspring reproduction (see Table I ). For simplicity, we focus on the cross-playing case (θ = 1) in pairwise games, where the classification is most clear, but qualitatively the following analysis can also be applied to the cases with θ < 1. Before proceed, we need to distinguish two interacting scenarios -individual and bulk interactions. Consider two neighboring players x and y, and assume s x = s y . In the individual scenario, we only compute the payoffs of the two players through the interacting pair x − y. This scenario applies to the context when their surroundings are unknown, like the random initial condition case, where the information of their neighbors is stochastic. Instead, when players of the same state are well-bulked, their payoffs can be explicitly estimated by incorporating both the intra-and inter-bulk gaming. Specifically for player x, its payoff is assumed to include the payoffs from its three neighbors with s = s x (intra-bulk) and the one from x − y gaming (inter-bulk). These two scenarios are two extreme cases occurring in structured population, the actual evolution should occur somewhere in between. Since there are four different types of individuals, there are six interacting pairs by combination, which can be classified into the following three categories for either individual or the bulk interactions, see Table I . (i) Invasion type -for the individual interaction, DD dominates over CC for either game thus the CC individual tends to become CD or DC; while when bulk CCs meet up bulk DDs, CCs at the interface have higher payoff than DDs (typically 3R > T for the lined up interface), CC bulks are supposed to invade into DD regions. In either case, the two cannot coexist and invasion happens, where the one at the disadvantageous position will be invaded and become CD or DC. (ii) Neutral type -for both scenarios, there is a neutral outcome, where statistically no net change in cooperation is expected. Consider CC-CD interface as an example, the two individuals are of identical state regarding game G 1 , Π G1 CC = Π G1 CD = R and 4R for individual and bulk interactions, respectively. And since the evolution in game Individual interaction Bulk interaction Invasion type CC + DD Catalyzed type CD + DC (1), a random-walk-like movement of their boundaries is expected. This argument is applicable to the other three interfaces alike. (iii) Catalyzed type -when a CD meets up a DC and they play game G 1 ; accordingly Π G1 CD = Π G2 CD = T and Π G1 DC = Π G2 DC = S with T > S, the DC individual is then likely to become CC by learning (i.e. CD + DC G1 −→ CD + CC). Analogously, when they play game G 2 , the CD individual tends to become CC (i.e. CD + DC G2 −→ CC + DC). As a consequence, CC individuals are continuously produced. In the bulk scenario, however, the situation is reversed. When bulk CDs and bulk DCs meet up, at the lined-up interface Π G1 CD = Π G2 CD = T and Π G1 DC = Π G2 DC = 3R with T < 3R, CD individuals then tend to become DD while DC remains unchanged when playing game G 1 (i.e. CD + DC G1 −→ DD + DC). And DC individuals likewise are likely to become DD when playing game G 2 (i.e. CD + DC G2 −→ CD + DD). Therefore, DDs are continuously produced instead for bulk interactions. In all these cases, the advantage or disadvantage regarding a given game, is not directly translated to any strategy update in the game per se, but the state changes in the other game. This is reminiscent of catalyzed reactions in Chemistry. The presence of one game is to "catalyze" the evolution of the other game. They mutually catalyze each other' evolution. With this classification, it's straightforward to understand the impact of game-game interactions, or more specifically the second term in mean-field equation Eq. (11) . First, since the four neutral interactions give rise to null net effect, no related term is included. Second, the CC-DD and CD-DC interacting pairs do have net effect in the cooperation evolution therefore they appear in the equation, but because they always have the opposite impact on cooperation in either individual or bulk scenario, they have opposite signs. Third, the mean-field treatment assumes a well-mixed setup, which actually corresponds to the individual scenario, where the CD-DC pairs contribute to the increase of cooperation while the CC-DD pairs do the opposite because the cooperators are at a disadvantage position, i.e. Π To validate the arguments of the three categories, some numerical experiments are conducted. To prepare the two scenarios, we respectively start with random and half-half patched initial conditions to mimic the individual and bulk circumstances respectively, and focus on the early stage of evolution, since longer evolution is going to ruin the randomness or compactness. Each time we only place two species on the 2d square lattice for clarity. Figure 6 shows the time series for all six binary combinations in the individual scenario. Category (i) corresponds to Fig. 6(a) , a mixture of CC and DD individuals. As can be seen, the fraction of CC decreases rapidly at the early few steps, DDs' density doesn't increase either, instead the partial cooperators CD and DC show considerable increases. This is line with our argument that CC has no advantage against DD individually, and CD and DC are produced. The evolution at the later stage, the well-mixture is ruined and the CC players gain their strength to increase. Figure 6 (b)-6(e) reports the evolution of Category (ii). We observe that only the two prepared species are present, the others are of zero density. This is because a partial absorbing state is reached, e.g. CC-CD in Fig. 6(b) , game G 1 is in the absorbing state with full cooperation, the fraction of DC or DD continues to be zero. The remaining two fractions fluctuate around 1/2, the neutrality of their evolution is then confirmed. Category (iii) is illustrated in Fig. 6(f) , where CD and DC players are randomly blended. As expected, in the first few steps, the fraction of CC rapidly increases together with some DD individuals, and the fractions of CD and DC themselves decease. Later on, the fraction of CC individuals continue to increase as all other three fractions decrease, but the population by then is not well-mixed anymore and some clusters are formed. Figure 7 shows the evolution in the bulk circumstance. Category (i) is shown in Fig. 7(a) , where the evolution of the two fractions is now qualitatively different from Fig. 6 (a) -bulk CC individuals are now in an apparent advantage position over DDs, its fraction continues to increase and the faction of DD decreases until distinction. The fractions of CD and DC keep at relatively low densities. Typical spatiotemporal snapshots for this case are shown in Fig. 3 in CSL [39] . Category (ii) is illustrated in Fig. 7(b-e) . As can be seen, the dynamical evolution is qualitatively the same as shown in Fig. 6(b-e) , all fluctuating around 1/2. Category (iii) is shown in Fig. 7(f) , which is fundamentally different from the process in Fig. 6(f) , but in line with our arguments above. It shows that once CD and DC meet up in bulks, the payoffs from the intra-cluster gaming reverse the advantages, favoring the DD individuals. As a result, the fractions of CD and DC continue to decrease and the density of DD monotonically increases without any CC individuals being seen. Put together, the numerical experiments in Fig. 6 and Fig. 7 perfectly justify the classification of the three categories. C. Dynamical realities on 2d square lattices However, these above two initial conditions are peculiar that correspond to two extreme circumstances and only last for a short-term time window. What really happened to these diverse interactions along the whole time evolution? Figuring out these facts is the key to understanding the working of dynamical reciprocity, since the invasion and catalyzed types of interaction always produce the opposite results, the classification itself cannot explain why cooperation is preferred. Here, by adopting random initial conditions, we monitor the long-term evolution for both θ = 0.5 and 1, the dynamical processes typically experience two stages: i) At early stage t < t c (t c ∼ 10 MC steps for the size of 1024 × 1024), no sizeable clusters are supposed to be present in the system. Hence the individual interaction scenario applies. Statistically, as Fig. 8(a,c) show that, there is a detectable quantity difference in CC-DD and CD-DC pairs that the proportion P CD−DC > P CC−DD , meaning the catalyzed interactions occur more frequently. Net production of cooperators is thus expected. Though this stage is relatively short. ii) When t > t c , clusters are gradually formed. Once the cluster property is strengthened, the bulk interaction scenario comes into play. Interestingly, a proportion crossover is found now that P CD−DC < P CC−DD , whereby cooperation is again enhanced since net cooperators are also favored in this scenario according to Table I . Figure 8 (a,c) further show that the four neutral interacting pairs have the major proportions in most of the time for both cases, though they bring no net production of cooperators or defectors. The difference between Fig. 8(a) and 8(c) is that an equilibrium state of coexistence is reached for the case of θ = 0.5, while an absorbing state regarding G 1 is approached Put together, for the whole processes, the system selforganizes into states with different relative proportions of invasion and catalyzed interaction type that make cooperators continuously be produced, no matter clusters are formed or not. Look back to the Eq. (11), we are now sure that it also captures the cooperation mechanism within the structured population. When the strategies are not clustered, defectors dominate i.e. Π G1,2 C − Π G1,2 D < 0, but due to the number difference of the interacting pairs, f CC f DD − f CD f DC < 0, the second term in Eq. (11) is thus positive. When the strategies are clustered, the opposite is true Π again the second term is positive and cooperation is enhanced. This means that the game-game interactions prefer cooperation in the whole evolutionary process and thus explains the dynamical reciprocity. A simpler case is starting from half-half patched initial conditions, where bulk scenario sets in from the very beginning, see Fig. 8 . We can see that the interface proportion of CC-DD decreases and all others increase, but the inequality P CD−DC < P CC−DD holds along the whole process, no crossover as in Fig. 8(a,c) is seen. The ensuing dynamics exhibits insensitivity to the initial conditions, the evolution is qualitatively the same as in Fig. 8(a,c) in the long term. This means that in the absence of stage i), the dynamics in bulk scenario is still sufficient to yield high level of cooperation. To validate the appropriateness of the two-scenario division in the random initial condition case, we further conduct cluster size and compactness analysis. While the former is easy to understand and often adopted, the latter is to measure how compact of clusters, which is defined as the fraction of neigh- bors with identical state regarding the central player (here we adopt Moore neighborhood with eight neighbors), its average characterizes the overall compactness of clusters. The situation with both large average cluster size and compactness provides an ideal circumstance for bulk interaction scenario to play, the individual interaction scenario works for just the opposite case. There may also be cases that the average size is large but with small compactness or the way around, which are in between the two interaction scenarios discussed above. Figure 9 provides the statistical properties of the cluster size for both θ = 0.5 and 1. As can be seen, the initial size of clusters is pretty small, but they become larger as time evolves, and get saturated in the case of θ = 0.5 at around t t c [ Fig. 9(a) ]. For the cross-play case (θ = 1), however, the cluster sizes continue to increase except the DD players [ Fig. 9(b) ]. The PDF shown in Fig. 9(c,d) shows that the cluster size could typically reach an order of 10 2 at the late stage but not for CC or DD clusters in the case of θ = 1, where they are either too huge or too small. Figure 10 shows the corresponding compactness analysis. Fig. 10(a,b) show similar profiles of time series when compare with Fig. 9(a,b) . Note that the peaks of DD species in both Fig. 9(a,b) and Fig. 10(a,b) are simply due to the invasion of defectors at the initial stage for t < t c . The PDF shows quite a few individuals are of high compactness, especially for the case of θ = 1, where there are some considerably large clusters (expect DD clusters) within the population. These observations justified our the two-scenario division, whereby the above mechanism analysis of the dynamical reciprocity seems reasonable. In this section, we turn to provide more evidences to examine the robustness of the dynamical reciprocity by studying different variants of the above model, such as asymmetrically interacting games, games of the time-scale separation, and various updating rules etc. The first relaxation of the above model is to remove the symmetry assumption by using asymmetric settings, which could be more realistic. One variant could be that the two games are identical, while their impact on each other is assumed to be asymmetric. The effective payoffs defined in Eq. (3) are then naturally written as where the interaction strengths θ 1,2 ∈ [0, 1] represent the contribution fraction of game G 1,2 in the other game's effective payoff. If a game is more important for the other game than the way around, then generally θ 1 = θ 2 . Note that θ 1,2 are not necessarily negatively or positively correlated, both could view the other game important or unimportant in their decision-makings, up to the specific context. Figure 11(a,b) show respectively the cooperation levels for the two games within the θ 1 − θ 2 parameter space, where the symmetrical case studied above is along θ 1 = θ 2 . The overall trend is qualitatively the same as the symmetrical case that a high level of cooperation is expected when θ 1,2 become large. In particular, a high cooperation level of a given game, say game G 1 , is more likely to happen when the other game's contribution θ 2 is large given its own's contribution is not too small (θ 1 0.4); and nearly full cooperation is reached when θ 2 approaches one. But due to the asymmetry, however, the two cooperation prevalences could be very different. Another variant is to remove the symmetry in the two games per se. Interacting games in the real world more often involve different issues, which should be modeled by different types of game. Thus, here we study the case of interacting games composed of two different pairwise games, i.e. one is PD game, the other is SD, but with the same interaction strength θ. To reduce the parameters, a convenient way to do is as follows [33, 34] : the two games share three identical parameters R, P , T , but with opposite signs in S, since 0 < S < 1 for SD and −1 < S < 0 for PD are required by definition. Figure 12 shows three typical interaction strengths in S − T parameter space. Without game interaction, they are independent, almost no cooperation is expected for PD. As the interaction becomes stronger, both cooperation levels are promoted. Significant promotion is possible when the scenario becomes cross-playing, again high cooperation prevalences arise for the whole parameter space. Due to the difference of the two games, now the cooperation levels are not homogeneous, but with certain nontrivial distribution in the parameter space, remaining for further investigation. Note that here the game-game interaction promotes cooperation in both games, which is superior to the previous results within the framework of interdependent network. In [33, 34] , the PD and SD are placed on two interdependent networks, and they are coupled through network interdependency. They show that with the increment of coupling, the cooperation promotion in PD is at the expense of cooperation decline in SD. An underlying assumption in the above studies is that the time-scales of involved games are comparable, their updating rates are assumed to be identical. A further relaxed condition is to allow for different time scales, or even time-scale separation -some of them are fast games while others are slow ones. This could be the case in reality since different issues by nature has its own paces and hence is reasonably of different time scales. To see the impact of the time-scale separation, we study two interacting PD games, one is fast, the other is slow. The time-scale separation is characterized by the time scale ratio T r ≥ 1: when the slow game is updated by one generation for each player, the fast game is updated T r generations on average. Apparently, the case of T r = 1 is reduced to identical time scale scenario as studied above; and as T r increases, the time-scale separation becomes stronger. Figure 13 shows the cooperation prevalence for a wide range of time-scale separation for θ = 0.5 and 1. In both cases, the cooperation prevalence in the slow game monotonically declines, while the cooperation level for fast game keeps largely unchanged or even increases within T r < 10 in the case of θ = 0.5. As T r further increases, the declines of both games are observed in Fig. 13(a) , while the cooperation level for slow game keeps round 0.5 and full cooperation for the fast game in Fig. 13(b) . These results mean that a fairly high level of cooperation is still able to be maintained when the time scales are not very much separated. A systematic account of time-scale separation's impact will be presented elsewhere. The update rule determines how exactly the strategy of individuals evolves in time. There is a variety of update rules that have been adopted in the literature, each being conceived in different backgrounds, and sometimes they yield fairly different cooperation outcomes. For completeness, here we also examine four other often used updating rules [41] : (i) Replicator rule, also known as the proportional imitation rule, is inspired by the replicator dynamics. The procedure is similar as the Fermi rule case that we randomly pick one node x and one of its neighbor y and the imitation probability linearly depends on the payoff difference as: where Π g 0 = k(max(1, T ) − min(0, S)) to ensure the probability W g xy ∈ [0, 1]. (ii) Multiple replicator rule is a variation of the replicator rule, where we now check simultaneously the whole neighborhood of x, and therefore it's more probable to change the strategy of x. With this rule, the probability of player x maintaining its strategy is where W g ij is given by (13) . The more neighbors an individual has, the less likely to maintain its strategy. (iii) Moran-like rule, also know as death-birth rule, is inspired by the Moran process in biology. With this rule, an individual randomly picks one site in its neighborhood including itself, with the imitation probability defined as where Ω * x = Ω x ∪ {x}, the constant Π g 0 is to guarantee the numerator positive, Π g 0 = max j∈Ω * x (k j ) min(0, S) for pairwise games. (iv) Unconditional imitation rule, also known as follow-thebest rule, is a deterministic rule. At each time step, every player adopts the strategy of the individual who has the highest payoff in its neighborhood, given this payoff is greater than its own. Another complication of the model study is the synchronicity of the strategy update. As described in Sec. II, the one used above is the random sequential updating, or simply termed as asynchronous updating (AU). We also examine synchronous updating (SU), where each individual is updated simultaneously, thus each player is updated exactly once per generation. Previous studies show that the synchronicity issue is often relevant for the cooperation outcome [41] . In what follows, we investigate the cooperation dynamics of two interacting PD games using the above four updating rules, with both AU and SU. Figure 14 presents the cooperation prevalence of two interacting PD games for the above four updating rules, for AU. Without game interaction, the phase diagram are very similar, except the unconditional imitation rule case [ Fig. 14(j) ], where its phase diagram differs significantly from the others. As game-game interaction is engaged and becomes stronger, the overall cooperation is promoted in general, and this promotion reaches maximal as θ → 1, the cross-playing scenario, in line with the Fermi rule studied above. In this scenario, however, there are some new dynamical features emerging. The most significant distinction is that there is no monotonic dependence of cooperation on the temptation T , an intermediate value of T yields a cooperation valley for replicator rule, multiple replicator rule and unconditional imitation [ Fig. 14(c,f,l) ]. Besides, many "defection islands" arise within the parameter space for the the Moran rule [ Fig. 14(i) ]. Furthermore, the cooperation prevalence in some quadrants instead decreases, like HG and SH. SU doesn't alter the overall phase diagram of cooperation, as shown in Fig. 15 . The cooperative behaviors are qualitatively the same as the cases with AU, and values are even slightly larger compared to Fig. 14 , where the monotonic dependence of T and "defection islands" in the cross-playing scenario remain. Detailed analysis requires inspecting the evolutionary game dynamics, which will be presented elsewhere. Taken together, the observations in Fig. 14 and Fig. 15 mean that the dynamical reciprocity is robust against different rules and the synchronicity of the strategy updating. While the above robustness studies focus on the variations in dynamical aspects, we now turn to the impact of structural complexities, since the underlying connectivities of real populations are far more complex than regular lattices we studied. The stylized models for complex networks include Erdős-Rényi (ER) random networks, small-world (SW) networks, scale-free (SF) networks, the impact of which on cooperation in the single game case has been studied extensively [16] , both theoretically and experimentally. Here we are not aiming to examine exhaustedly different topologies, rather we only study three of them and focus on the question: whether the structural complexities alter qualitatively the working of dynamical reciprocity? C with other four update rules with AU, for θ = 0, 0.5, 1. The update rules are: replicator rule (first row, a-c), multiple replicator rule (second row, d-f), Moran rule (third row, g-i), and unconditional imitation (bottom row, j-l). Random initial conditions are adopted. The four numbers are the average cooperation prevalence for the corresponding quadrants. Also f G 2 C ≈ f G 1 C due to the symmetrical settings. Parameters: R = 1, P = 0, and L = 128 for the 2d square lattice. We adopt Newman-Watts network for the SW networks [44] , which are derived from d-dimensional square lattice (here d = 2) by adding some additional random links. This SW model is thought to be better behaved than the original network model [45] , such as the exclusion of detached possibility. Instead of rewiring, shortcuts are added with a probability φ corresponding to each bond of original lattice, so that there are dN φ shortcuts on average. The average degree is then k = 2d(1+φ). By tuning the parameter φ, the topology can vary continuously from the regular lattice to small-world networks, and to random networks in principle. Fig. 16 (a) provides phase transitions for a couple of interaction strength θ for φ = 0.01, showing that as θ increases, the cooperation prevalence is continuously promoted; when θ → 1, the phase transition disappears, and a fairly high level of cooperation is also observed, irrespective of the control parameter b. Fig. 16(b) shows the cooperation dependence on the network parameter φ for three b. While a higher temptation b reduces the cooperation prevalence as expected, a larger φ generally enhances the cooperation for large T . However, when the temptation is small (e.g. b = 1.1), there is an optimal φ that promotes the cooperation to the largest degree. This observation is in line with previous findings in the single game case that a moderate amount of randomness in smallworld networks is found to best enhance cooperation [46] . Erdős-Rényi random networks [47] represent a class of topologies found in nature. Its construction starts with an ensemble of N isolated individuals, any two nodes are then connected with a given probability. In such a way, the degrees of the resulting networks follow a Poisson distribution around the mean value k . The evolutionary outcome of two interacting PD games on ER networks is shown in Fig. 17(a) . Due to the structural disorder, the prevalence of cooperation is higher than the case of 2d square lattice in the absence of game interaction. By increasing θ, a continuing promotion of cooperation is observed as well, with the cross-playing scenario being the optimal case likewise. Fig. 17(b) shows the dependence of cooperation prevalence on the average degree k , where there is an optimal k for each case, further increasing the value of degree results in a cooperation decline. This is because the case of k → N − 1 is equivalent to the well-mixed populations, where no cooperation is expected according to the above mean-field analysis. We adopt Barabási-Albert (BA) networks [48] for the simulations on the scale-free network. The construction is via the growth and preferential attachment. The network is started with a small fully connected graph as the initial core, and every newly added node is going to connect k /2 existing nodes, with a probability proportional to their degrees. The generated networks follow a power-law degree distribution with an exponent −3. Different from the SW or ER random networks, which are taken as homogeneous networks, scale-free networks are typical heterogeneous networks. This heterogeneity is able to boost cooperation considerably, yielding fairly high level of cooperation even for a single game. The reason lies in the fact that the hubs are easily to form cooperator backbones that drives the whole network to be cooperative [13] . This boost is however based upon the accumulated total payoffs, whereby the hubs are very likely to have higher payoffs and thus be the model players. Once the accumulated payoff is replaced by the average payoff (i.e.Π x = Π x /k i ), which is argued more reasonable in reality in some previous studies [15, 49, 50] , the heterogeneity-induced-enhancement is much less significant. Here, we adopt the average payoff scheme in our simulations of two interacting games. As can be seen in Fig. 18 , without game interaction (θ = 0), the enhancement effect of heterogeneity is very much inhibited -a low level of cooperation is seen. As the game interaction θ increases, the cooperation curve is monotonically lifted, also the cross-playing scenario work best. Note that, in scalefree network case, a flipping noise in posed to inhibited the strong fluctuations caused by the strong degree heterogeneity in the following way: in each elementary step, with a small probability p f to flip the state of the focal player, and with 1 − p f to conduct the standard MC procedure as in Sec. II. With flipping noise, the absorbing state is thus avoided, and it can be interpreted as the deviation from the logic of imitation due to the irrationality. In brief, additional complexities in the underlying networks of population, as shown in this section, only bring some quantitative difference compare to the lattice case, the dynamical reciprocity still works in the complex networked populations. Given the results of two interacting games, one naturally wants to see what's the trend if more games are engaged, since in the real world there are many more issues unfolded simul-taneously. The question of interest is: what could be expected if the number of games involved increases, whether the above mechanism still holds? In [39] , we have shown that by assuming equal contribution for each game, the phase transitions and typical time series show clearly that a higher level of cooperation is expected when more games are involved (m = 1, 2, 3). Based upon these observations, one can reasonably extrapolate that a continuing promotion in cooperation is expected when more and more games are engaged. Here, we plan to study the three game case in a bit more depth. The effective payoffs following the linear combination for the three interacting games are now written as where the interaction strength θ 1,2,3 are respectively the contribution weights of G 1,2,3 in other games' effective payoffs. To reduce the number of parameters, here we adopt a circular parameterization as follows, where θ 1,2 ∈ [0, 1] and θ 1 + θ 2 ≤ 1. By varying these two interaction strengths, we can systematically study the case of three interacting games. The simplest case where θ 1 = θ 2 = θ is shown in Fig. 19(a) . When θ 0.25, cooperators cannot survive within the population for the given parameters, further increasing the interaction strength, a continuous phase transition is seen and the cooperation prevalence goes to be fairly high when θ → 1/2, which corresponds to a cross-playing scenario in the three interacting game case. Fig. 19 (b) provides the more general case where θ 1,2 can be different. We see that a stronger game interaction leads to better cooperation holds in general. In particular, high cooperation does not require the symmetry between θ 1,2 ; in fact, as long as θ 1 + θ 2 → 1, the general cross-playing scenario, fairly high cooperation is guaranteed. For three games case, the mechanism behind the promotion is the same as the case of the two interacting games and the above classification still holds. Specifically, there are now 8 different states and 28 pairwise interactions by combination. Even though, these interactions can still be classified into the above three categories also within a cross-playing scenario θ 1 = θ 2 = 1/2 for simplicity, as listed in Table II . Overall, also two scenarios are considered -individual and bulk interactions. And all interacting pairs can similarly be classified into invasion, neutral, and catalyzed categories. While no net production in cooperation is expected in the neutral category, cooperators are either yielded or ruined in the other two categories, and the net effect is opposite in two scenarios. Additional complexities here, however, are that for a given pairwise interaction, it could be classified into differ- ent categories for different games; e.g. DCC-CDD belongs to catalyzed type when playing game G 1 , but is of neutral type when playing game G 2 or G 3 . All classifications are also confirmed by numerical experiments (not shown). In summary, motivated by the facts that different games are often coupled in the real world, we formulate their evolution in the framework of interacting games. We show that this game-game interaction generally boosts the propensities of cooperation in all evolved games. To our surprise, the optimal promotion occurs when the decision-making of the games is completely blind to their own's payoffs. A mean-field treatment reveals that two new routes to cooperation arise, which are confirmed by further analysis. All these findings suggest a new mechanism for cooperation -dynamical reciprocity. Exhausted numerical evidences for variants both in dynamical and structural aspects have verified the robustness of the dynamical reciprocity. While the network reciprocity is to facilitate the growth of cooperator clusters via the structured connectivities [16, 51] , the mechanism to lift cooperation in the dynamical reciprocity is through the game-game correlation instead, more specifically through the invasion and catalyzed types of interactions. In this sense, the dynamical reciprocity manifests itself as an entirely new different mechanism. Nonetheless, as shown in the well-mixed case, the interacting games doesn't show any improvement compare to the single game case. This means that the dynamical reciprocity only works in structured population. Since most populations in the real world are structured, this precondition is easy to meet. Therefore, the two reciprocities are expected to go hand in hand to maintain high levels of cooperation in reality. In the present work, we have only treated the linear game interaction case as shown Eq.(3). There, the effective pay- TABLE II . Categories of interactions in 3 interacting PD games G1,2,3 within a cross-playing scheme (θ 1 = θ 2 = 1/2 in Eq. (17)). C 2 8 = 28 pairwise interactions by combination can still be classified into three categories in either individual or bulk scenarios. offs perceived is a linear combination of payoffs for different games, and the interaction strength is encoded within the combination weights. This is of course a simplified formulation for the real cases, more realistic modeling could be more complex functions of the payoffs and the weights. In addition, the effective payoff should also probably incorporate a memory of past history for all evolved games, which implies that a non-Markovian model [52, 53] seems more proper. The five strategy updating rules used in this work essen-tially belong to the outward learning, where they imitate those who are better off. This is the mainstream approach of modeling updating. However, a new paradigm proposed recently [54, 55] is through the inward learning, where the decision-making of individuals is through introspective actions based on their histories. With the help of machine learning [56] , they also model the evolution of cooperation, but is limited to the single game case. It would be interesting to see what if more games are played in this shifted paradigm, does the dynamical reciprocity still work? As the next step, maybe the most important open question, is to verify the dynamical reciprocity in behavioral experiments. But due to the great complexities in human beings, the experiment needs to be carefully designed to be convincing. Ideally, the game-game correlation is tunable; also the comparison of interacting games within structured and unstructured populations is necessary according to our results. Back to the real world, the implications of our works is at least two facets. On one hand, since different issues are often interweaved in the real world, and highly cooperative behaviors are abundant, the dynamical reciprocity seemingly provides a natural causality explanation for these two observations. On the other hand, to handle those cooperation failures in some vital issues, such as global warming and trade war, our work suggests that the players should get involved in as many games as possible, whereby a decent cooperation should be expected as a result. This advice could be applicable to different contexts from international affairs to interpersonal relationships. Finally, our work may also provide an inspiration for the complexity science. In complexity science, many systems are studied within the framework of structure plus dynamics (i.e. the function), where the structure and its impact on dynamics have been extensively studied with the rise of network science. Our findings suggest that the dynamics-dynamics interactions may also harbor a great amount of complexities, which have largely been underestimated before. Another good example of "more is different" [57] in dynamics is modeling the spread of multiple infectious diseases [58] [59] [60] [61] , where the physics revealed is fundamentally different from the single one case. We hope the related communities could put more attention to the potential complexities arising from the dynamics-dynamics interactions in the future. , s ∈ S 2 that players perceived and used in their strategy updating, as follows The overall effective payoffs are then The key observation here is that the interaction strength θ is cancelled out due to the linear combination, which is also reasonable since θ is the contribution weight that only adjusts the payoff values perceived in a specific game but not the overall payoffs. Specifically, we have Inset these terms into Eq. (10), the replicator equations are then where we used the normalization condition f CC +f CD +f DC +f DD = 1. These equations can actually be summarized aṡ The structure of Eq. (A6) is similar to Eq. (5) and its meaning is straightforward that the change in f s comes from the interaction of individuals within state s and s and their effective payoff difference. The overall effective fitness differences are as follows Now let us consider the evolution of overall cooperator fraction with regard to G 1 by adding the first two equations in Eqs. (A5) and insert the above related termṡ Similarly, one can also obtain the equation for G 2 by adding the first and the third equations in Eqs. (A5). Equation (11) is then obtained. To solve the mean-field equation Eq. (A5), we make some rearrangements that lead to where f G1 C = f CD + f CC and f G1 D = f DC + f DD by definition. Based on the observations in the numerical simulation as well as the symmetrical game setting, it's reasonable to assume f g CD = f g DC , g ∈ G. Together with the normalization condition, only two variables (let's select f CC and f DD ) are independent, the others can be expressed by Their equations are By settingḟ CC =ḟ DD = 0, we obtain the four solutions: (1) f CC = f DD = 0; (B6) (2) f CC = 1, f DD = 0; (B7) (3) f CC = 0, f DD = 1; (B8) The stability of these solutions is determined by the eigenvalues of the corresponding Jacobian matrix as (1) For f CC = f DD = 0, We have λ = ± (R + S − T − P ) 2 , the eigenvalues couldn't be both negative, thus this solution is unstable in any case. (2) For f CC = 1, f DD = 0, We have λ 1 = −(R − T ), λ 2 = −2(R − T ). Therefore, this solution is stable only when T < R. We have λ 1 = (S − P ), λ 2 = 2(S − P ). This solution is stable only when S < P . (B14) together, the stable solutions of the mean-field equations are exactly recovered to the single pairwise game: the mixed state (solution (B9)) is stable for SD game region, full cooperation (solution (B7)) is stable for HG game region, full defection (solution (B8)) is stable for PD game region, and SH region is bistable (full cooperation or full defection) because it is the overlapped region for solution The evolution of cooperation Levels of selection in evolution Genetic and cultural evolution of cooperation Games and Economic Behavior Emergent route towards cooperation in interacting games: the dynamical reciprocity Evolution and the Theory of games Random Graphs Stochastic Processes in Physics and Chemistry