Mutual helping for direct benefits can be explained by various game theoretical models, which differ mainly in terms of the underlying conflict of interest between two partners. Conflict is minimal if helping is self-serving and the partner benefits as a by-product. In contrast, conflict is maximal if partners are in a prisoner's dilemma with both having the pay-off-dominant option of not returning the other's investment. Here, we provide evolutionary and ecological arguments for why these two extremes are often unstable under natural conditions and propose that interactions with intermediate levels of conflict are frequent evolutionary endpoints. First, we argue that by-product helping is prone to becoming an asymmetric investment game, since even small variation in by-product benefits will lead to the evolution of partner choice, which in turn leads to investments by the chosen class. Second, iterated prisoner's dilemmas tend to take place in stable social groups where the fitness of partners is interdependent, with the effect that a certain level of helping is self-serving. In sum, intermediate levels of mutual helping are expected in nature, while efficient partner monitoring may allow higher levels to be reached.
Helping, defined as an act that increases the direct fitness of a recipient, has attracted great interest as it is at odds with general Darwinian notions of competition and self-interest. One solution has been Hamilton's [1,2] kin selection theory of altruism where helping is directed at genetically related individuals. However, nature is full of examples where helpers and recipients are unrelated, most obviously in interspecific interactions, that ‘have played a central role in both the ecology and evolution of life on Earth’ [3, p. 3]. Within the same species, there are also countless examples of individuals helping unrelated conspecifics, provided it yields overall direct fitness benefits. Given the great confusion regarding terminology, the best we can do is to define each term when using it for the first time. We follow Bshary & Bergmüller , who put together existing definitions in a coherent way. Based on Lehmann & Keller  and Bronstein [6,7], mutual helping for direct fitness has been termed ‘cooperation’ if it occurs between members of the same species and ‘mutualism’ if it occurs between members of different species (see also ).
Key topics in research on helping are to determine how population structure and life history lead to unconditional helping and to identify the decision rules and partner control mechanisms of conditional helping . Partner control occurs if a cooperator takes an action that lowers the pay-off of a defector, for instance by defecting, punishing or sanctioning the defector with premature termination and refusal to interact again, or by switching to a new partner .
Although stable mutual helping can be explained by numerous models, the literature is dominated by two scenarios. Either helping is inherently self-serving, with no danger of defection, or it is an ‘investment’ (a pay-off reduction irrespective of a partner's action) that is compensated by future benefits, which creates a temptation to defect. We will first present both scenarios in detail before arguing that, in most biological systems, the assumptions upon which they are based are ecologically implausible. We will further argue that, in the real world, most cases of dyadic mutual helping are accompanied by some level of conflict, which in turn has selected for partner monitoring and control, both between related and unrelated individuals. We will present stylized games that capture what may well be the most common stable endpoint—intermediate levels of conflict—and propose ways in which these games could be explored in the future.
2. Helping with minimal levels of conflict
Arguably, the most straightforward condition for stable mutual helping is a situation in which each individual performs a self-serving act that benefits a partner as a by-product. We call this ‘by-product helping’ as it may occur within and between species. Brown  has referred to the same condition as ‘by-product mutualism’, but this term clashes with the original use of the term ‘mutualism’, i.e. cooperation between species . A good example of ‘by-product helping’ is the ‘selfish herd’ effect . For instance, seals reduce inter-individual distances when swimming through zones with great white sharks, which is both self-serving for the actor and beneficial for the partner, as it reduces the predation risk for both . Another example is cooperative hunting, provided the hunting success increases with group size [11,12]. Cooperative hunting can also occur between species, as recently described in interspecies interactions between groupers (a predatory fish) and other predatory species [13,14]. In these cases, the hunters position themselves to maximize their own hunting success and immediately swallow any captured prey, with no sharing . Thus, the benefits accrue owing to self-serving coordination in time and space rather than through mutual investments.
A second form of stable helping without defection is positive pseudo-reciprocity . Positive pseudo-reciprocity involves an initial investment that enables the recipient to perform a self-serving act that in turn benefits the investor as a by-product. A prime example is the tri-trophic interaction between plants, insect herbivores and parasitoid wasps . When attacked by a herbivore, the plant produces volatiles, which enable the wasp to detect the herbivore. The wasp will then self-servingly lay her eggs into the herbivore, which will kill it and benefit the plant as a by-product. Positive pseudo-reciprocity can also be mutual. For example, in lichens, a composite organism of algae living in the filaments of a fungus, the fungus invests by producing a shelter that enhances the efficiency of the algae's photosynthesis and nutrient production, which enables the fungus to grow more rapidly and produce more shelter. As the fungus transfers the algae vertically to the next generation of fungi, the fitness of both partners is interdependent, which prevents defection from either side.
Owing to the inherent stability of by-product helping and positive pseudo-reciprocity, various authors have hypothesized that they should be abundant in nature [17–19]. While many well-documented examples are mutualisms, i.e. interactions between species, there are also examples of cooperation, i.e. intraspecific interactions, such as the selfish herd effect .
3. Helping with maximal levels of conflict
We define helping with maximal levels of conflict of interest as cases in which all individuals would do best by fully defecting from each other (i.e. in the absence of a partner control) as the evolutionarily stable strategy. The exploitation aspect distinguishes our conflict of interest from potentially stable lack of cooperation in coordination games, in which non-coordinators do not exploit coordinators. In interactions with maximal conflict, helping behaviour can only emerge if partners are able to provide mutually conditional investments. We use the term ‘mutually conditional investment’ as equivalent to Trivers’  ‘reciprocal altruism’, a term we avoid as it clashes with Hamilton's [1,2] definition of ‘altruism’, i.e. helping relatives owing to kin selection. We avoid the term ‘reciprocity’ as a shortcut for reciprocal altruism as the term is currently used in many different ways. The standard model for direct mutually conditional investments is the iterated prisoner's dilemma, a game in which two players repeatedly choose between cooperating and defecting. The pay-offs are such that mutual cooperating yields higher pay-offs than mutual defection, but in each single interaction defection maximizes immediate pay-offs independently of the partner's action (figure 1). Thus, to cooperate is an investment, where future return benefits can only be owing to the partner providing return investments. Owing to the pay-off structure, however, the temptation to defect is continuously present.
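The pay-off conditions described above can be written as a short check. This is a minimal sketch: the labels T (defect on a cooperator), R (mutual cooperation), P (mutual defection) and S (cooperate with a defector) and the numerical values are illustrative assumptions, not the entries of figure 1.

```python
def is_prisoners_dilemma(T, R, P, S):
    """True if defection pays more whatever the partner does (T > R and P > S)
    while mutual cooperation still beats mutual defection (R > P)."""
    return T > R and P > S and R > P


def best_reply(T, R, P, S, partner_move):
    """One-shot best reply of the focal player to a partner playing 'C' or 'D'."""
    if partner_move == "C":
        return "D" if T > R else "C"
    return "D" if P > S else "C"


# Assumed example values, chosen only to satisfy T > R > P > S.
T, R, P, S = 5, 3, 1, 0
print(is_prisoners_dilemma(T, R, P, S))
print(best_reply(T, R, P, S, "C"), best_reply(T, R, P, S, "D"))
```

With these assumed values, defecting is the best reply to either move, which is what makes cooperating an investment that only pays through the partner's return investment.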
Despite a perpetual temptation to defect, a variety of partner control mechanisms can nevertheless lead to stable mutual investments. In fact, economists have shown that solutions with almost any frequency of investments are possible, provided mutual helping creates a surplus in pay-offs and a sufficient number of rounds are played without a fixed known final round (‘folk theorem’: ). Largely unaware of the economic literature, evolutionarily minded scientists have devoted considerable effort to identifying successful strategies that start cooperatively and continue to cooperate as long as the partner also cooperates (reviewed by Dugatkin ). In the most famous strategy in an iterated prisoner's dilemma game, tit-for-tat, one individual cooperates on the first iteration of the game and then does exactly what the partner has done on the previous move, i.e. defect on a defector or cooperate with a cooperator . An alternative solution to achieve mutually conditional investments in an iterated prisoner's dilemma-type game is ‘negative reciprocal investment’, where a cooperative individual pays a cost to reduce the pay-off of a defecting partner (‘punishment’; ). Another particularly powerful partner control mechanism is to threaten with partner switching, which promotes mutual investments, at least in well-mixed populations .
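The tit-for-tat rule described above can be sketched as a small simulation. The pay-off values, the strategy function names and the fixed number of rounds are assumptions made for illustration only; in particular, a fixed number of rounds is used purely to keep the sketch short and ignores the point that the final round should not be known.

```python
# Assumed pay-offs (focal, partner) for each pair of moves, with T=5 > R=3 > P=1 > S=0.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, partner_history):
    """Cooperate first, then copy the partner's previous move."""
    return "C" if not partner_history else partner_history[-1]


def always_defect(own_history, partner_history):
    """Unconditional defector, used here as a comparison strategy."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the summed pay-offs of both players."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_defect))
```

Under these assumed pay-offs, two tit-for-tat players sustain mutual cooperation throughout, whereas tit-for-tat paired with an unconditional defector is exploited only in the first round.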
Various biologists have argued that, in contrast to by-product benefits and positive pseudo-reciprocity, direct mutually reciprocal investment is rarely found in non-human species, in both between- and within-species interactions [26–30], which is at odds with the theoretical literature. There are a few convincing examples for mutually conditional investments, i.e. experimental studies that demonstrate contingent helping: rats in a laboratory food pulling task , flycatcher mobbing behaviour , food provisioning in vampire bats ( in combination with ), apparent support for grooming in baboons  or tolerance and support for grooming in vervet monkeys . Despite these examples, there is still a discrepancy between the modelling literature on mutually reciprocal investments and the empirical results.
The discrepancy between theoretical efforts and empirical evidence is understandable but also surprising. It is understandable because theoreticians are not interested in modelling conflict-free helping as the stability of helping in such cases is self-evident and theoretically uninteresting, whereas the opposite is the case for helping to resolve maximal conflicts. What is currently debated, however, is why there are few convincing examples of helping that involve mutually conditional investments and what ‘rare evidence’ might actually mean. We therefore quickly summarize and comment on some proposed explanations or opinions.
One argument for why evidence for mutually conditional investments is rare is that such helping can only evolve if two mutations arise simultaneously in at least two individuals that are potential partners of each other: to invest and the ability to invest conditionally on the partner's return investment . Such an admittedly evolutionarily unlikely scenario has become known as the bootstrapping problem (; see also  for some potential solutions). Alternatively, several authors have pointed out that mutually conditional investments might be rare because non-human species lack the necessary cognitive abilities to keep track of the outcome of past social interactions, especially if this involves interactions with several partners [27,28].
While we acknowledge that evolvability and cognitive constraints may prevent the evolution of mutually conditional investments in many cases, we do not think that these explanations suffice to explain the apparent rarity of mutually conditional investments in nature. The evolvability argument seems to be based on assumptions concerning gene–behaviour relations that do not apply to species with brains. Vertebrates and invertebrates have been shown to learn appropriate behaviour via classical and operant conditioning . In the latter, animals condition their own behaviour as a function of the changes in the environment. It does not really matter whether the environment is abiotic or an interaction partner, and whether in the latter case the situation is potentially cooperative or competitive. There are countless examples in the optimal foraging literature showing that many species are capable of efficiently fine-tuning their responses to pay-off differences when moving between food patches , and Kacelnik  makes a strong case that such foraging decisions can be most parsimoniously explained with the all-purpose tool ‘operant conditioning’ rather than situation-specific evolved rules of thumb or heuristics. A similar point has been made by Bshary & Oliveira , who argued that selection on brain functioning works mainly on higher functional circuits involved in decision-making in a variety of different social situations. To our mind, the same line of argument can be applied to explain situation-specific conditional helping. All we need to explain is the evolution of learning rules (like strength of reinforcement, weighing of past interactions, exploration of behavioural repertoire) that allow individuals to adjust their behaviour during their lifetime. Dridi & Lehmann [43,44] give wonderful examples of how exploratory trial-and-error reinforcement learning rules evolve that allow individuals to solve an iterated prisoner's dilemma.
The argument made above puts the emphasis on the cognitive constraint hypothesis. As we see it, this hypothesis is difficult to reconcile with the fact that individual recognition is widespread at least in various vertebrate clades like mammals, birds and fishes [39,45], and has even been shown in social insects . Lack of mobility allows ‘individual recognition’ based on location, as is the case for ants interacting with their many partner species or for pollinators interacting with flowering plants. Moreover, memory capacities are apparently sufficient for learning through operant conditioning, with the evolution of crucial learning rule parameters subject to natural selection [43,44]. Indeed, it has repeatedly been argued that solving an iterated prisoner's dilemma game might not be as rare as often assumed because scientists were looking for strategies proposed by theoreticians, like tit-for-tat, while animals make decisions differently. For example, it has been argued that decisions about mutually conditional investments may be based on a more general assessment of recent social interactions (‘attitudinal reciprocity’, ), on the general quality of a relationship (‘emotional reciprocity’, ), or on general past experience rather than precise counting with each potential partner (‘generalized reciprocity’, ) (see also ). All of these proposed decision mechanisms could be based on the dynamics of learning through operant conditioning (or on higher cognitive processes if available). Two field experiments on baboons and on vervet monkeys fit the idea of attitudinal or emotional mutually conditional investments [35,36]. Crucially, having recently groomed another individual increases the probability of receiving his or her tolerance or coalitionary support. 
This implies that the return investments are not ‘all-or-nothing’; in fact, although effects that are conditional on recent grooming are present both in related/bonded pairs and in unrelated/non-bonded pairs, they come on top of different baseline levels for interactions without prior grooming. Various other experiments (with both positive and negative results; [47,51–66]) and a large number of correlational studies  provide additional evidence for such graded mutually conditional investments in primates. Based on this extensive experimental and correlational evidence, we predict that graded mutually conditional investments are indeed common in primates and will be found also in other taxa.
This statement applies not only to positive contingencies but also to negative ones, i.e. punishment, where current correlational and experimental evidence is even rarer in non-human animals . While cognitive constraints may limit the usefulness of punishment in various situations [30,68], the fact that aggressive responses to cheating can cause more cooperative behaviour in fish in both inter- and intraspecific interactions [69–71] suggests that at least many vertebrate species should have the cognitive requirements to use punishment.
This leads us to the issue of whether the phenomenon of strictly mutually conditional investments solving an iterated prisoner's dilemma is truly rare, as we argued above. To us it is a matter of perspective: if the game structure of every single case of helping for direct benefits were known and total sums were made, we would expect numerous cases of mutually conditional investments solving an iterated prisoner's dilemma pay-off matrix (i.e. in the 1000s, to give a rather preliminary estimate). To place this in perspective, helping for direct benefits is ubiquitous, as shown by the myriad of cases of pollination mutualism, plant–microbe interactions in the soil or mutualisms involving ants protecting partner species. A group of mutualism specialists proposed in 2003 that none of these cases is suspected to solve an iterated prisoner's dilemma , a view that has hardly changed since . We hope that these illustrations clarify our use of the term ‘rare’. Mutually conditional investments are so rare that we think that further arguments have to be explored in addition to (not instead of) arguments about genetic/cognitive constraints.
4. The ecology of helping for direct benefits
Here, we follow up on the possibility that prisoner's dilemma-type pay-off structures are rarely found in nature owing to ecological reasons . We do so not in mathematical terms but by developing socio-ecologically relevant scenarios. As all three of us are empiricists, we apologize for the realistic possibility that we might have missed relevant models that already make our points in a more elegant (i.e. mathematical) way. Also, while we think that models allowing for continuous investments are biologically more realistic, we illustrate our points with stylized games with discrete behavioural options. We do so because we think that the logic is easier to grasp and because we are interested in different classes of models, i.e. models that predict no conflict, intermediate conflict and maximal conflict (figure 1). This classification holds for both continuous and discrete behavioural options (see  or  for continuous options in snowdrift and prisoner's dilemma games). Before we challenge the ecological validity of prisoner's dilemma-type pay-off matrices, we will challenge the notion that conflict-free cooperation/mutualism is a stable endpoint of mutual helping in nature. Both scenarios (starting out with by-product benefits or with prisoner's dilemma pay-off structure) have in common that the resulting levels of conflict are intermediate, i.e. cooperative behaviour would persist to some extent even in the absence of partner control mechanisms. Furthermore, both scenarios have in common that selection leads to changes in individual strategy spaces, which in turn leads to changes in game structure and the corresponding pay-off matrix.
A specific example is provided by Friedman & Hammerstein  in their analysis of egg trading in the simultaneously hermaphroditic hamlet fish (see [75,76] for other models). Hamlets form pairs in the late afternoon. Partners alternate several times between releasing eggs and fertilizing eggs . This seems to be rather inefficient compared to each partner releasing all eggs in one bout. However, such a release pattern would be vulnerable to cheaters: as eggs are more costly to produce, the individual that releases its eggs first would face the risk that the partner fertilizes the eggs and then leaves to find a new partner that still has eggs. Releasing all eggs in one bout would correspond to a sequential one-shot prisoner's dilemma game, where defection is the only stable outcome. The evolution of a parcelling strategy, combined with waiting till late afternoon, overcomes the problem of defection. Parcels are so small that the best response to receiving a parcel for fertilization is to give a parcel, to which the best response is to provide the next parcel, until all eggs are fertilized. This is because the benefits of staying are larger than the benefits of leaving: leaving involves search costs and the risk of not finding another partner in the little time before sunset . Thus, the evolution of a parcelling strategy has transformed the pay-off matrix for each decision from a prisoner's dilemma game to a mutually positive pseudo-reciprocity game : for both partners it is self-serving to stay and invest at each moment of decision.
5. Why most cases of helping are likely to involve intermediate levels of conflict
(a) Shifts from chance by-product benefits to coordination to conditional helping in an asymmetric game
As discussed earlier, by-product mutualism and positive pseudo-reciprocity are based on the notion that helping is free of conflict, which provides ideal starting points for the evolution of helping, as often argued for mutualisms [26,78]. However, we argue that this game structure is often not stable. Where by-product benefits occur, there would inevitably be selection on increased association rates that cause coordination costs as well as selection on partner choice (cf. [79,80]). We illustrate these points with an example of collaborative hunting between groupers and partner species, such as moray eels. While the benefits are entirely owing to by-products of self-serving behaviour, the magnitude of these benefits is likely to vary between individuals. Empirically, it has been described that the willingness to participate in cooperative hunting is variable in both partners . Furthermore, there seems to be individual variation in the ability to coordinate movements, perseverance and the frequency with which prey is flushed towards the partner. Such variation can be owing to ontogenetic effects, with evidence for individuals changing their behaviour drastically between subsequent years (R.B. 2002 and 2003, unpublished data). In addition, it seems likely that partners of different sizes have different prey preferences and differential effects on prey escape behaviours. As a result, individuals have the choice between more or less suitable (profitable) partners, and laboratory experiments have demonstrated that groupers readily do choose better collaborators , something demonstrated first in chimpanzees .
The key point here is that these interspecific hunting associations form a biological market with individuals having the choice between different partners, sometimes belonging to different species, e.g. groupers may choose from moray eels, Napoleon wrasses and octopuses . As soon as there is exchange of goods or services, the market forces of supply and demand are expected to start operating [79,80]. In particular, groupers should preferentially associate with partner species or individuals that provide the best by-product benefits and partners should prefer groupers that provide the best by-product benefits for them in turn. As the hunting associations are mutually beneficial, being involved in more of them means increased foraging success, which should translate into an increased fitness. Therefore, individuals are under selection to choose good partners and to be chosen frequently. Competition within a class of traders over access to partners is predicted to lead to outbidding . In the cooperative hunting example, there is some observational evidence for outbidding at the partner recruitment stage. First, it seems clear that groupers preferentially seek moray eels rather than Napoleon wrasses to initiate a joint hunt in order to search for suitable prey : groupers associate above chance levels with moray eels but not with Napoleon wrasse. The preference breaks down in areas where there are few partners available (AL Vail, A Manica, R Bshary 2010, unpublished data). In contrast, in areas where partners are more abundant than at the initial study site , groupers alter their behaviour in an important aspect: they rarely initiate joint hunting but instead join moray eels that had already started to move through the reef (AL Vail, A Manica, R Bshary 2010, unpublished data). Thus, the coordination costs to start a hunting association are paid flexibly by different partner species depending on partner availability. 
Arguably, most by-product mutualisms are likely to involve such coordination costs. For example, mixed species associations in primates yield by-product benefits owing to the reduction in predation risk but the coordination requires deviation from optimal foraging routes. The presence of this trade-off explains why associations do not occur 100% of the time .
The conflict is well illustrated with the battle-of-the-sexes game in which two players want to be together but differ in their spatial preferences, which creates a conflict about who is paying the cost for being together. Biological market theory can make predictions about which class of traders is winning the battle and which one is losing it . In the absence of markets, differences in needs may result in ‘leaders’ and ‘followers’ ([83,84]; see  for an evolutionary scenario for mutualisms) and it is also possible that individuals alternate in paying the coordination cost. Note that this would lead to an alternating helping pattern without involving an iterated prisoner's dilemma pay-off matrix.
In the grouper example, there is currently only evidence for shifts in which partner pays the coordination costs. However, it is easy to imagine that the biological market would select newly arising strategies that go beyond the provisioning of by-product benefits. For example, while successful individuals immediately stopped the collaboration in all observed cases of successful hunts, an individual that continued the collaboration under such conditions might well benefit from such investments if this increases the chance that partners more readily accept invitations by this individual and/or choose this individual with increased probability.
In sum, we propose that any interaction that starts out as by-product benefits or as positive pseudo-reciprocity has the potential to evolve into a system that involves specific investments with the sole purpose of being chosen as a partner, as soon as the system involves a biological market. Under such circumstances, stable investments can be achieved through the threat of partner switching, a form of negative pseudo-reciprocity : individuals invest because it would otherwise be in the self-interest of the partner to stop the interaction and switch to a different individual. In conclusion, any form of by-product benefit may lead to partner choice whenever partners differ in the magnitude of by-product benefits they provide. This in turn leads to competition through outbidding and the evolution of investments, which are monitored and insured through the threat of partner switching.
Figure 2 summarizes the important steps from chance meetings to an asymmetric game in a biological market. The amount of extra investment will be a function of the relative abundance of the two classes of partners . If conditional choices by the choosing class have a minor cost (denoted ε in figure 2), negative frequency-dependent selection leads to the coexistence of bidding individuals (‘cooperators’) and non-bidding individuals (‘defectors’) and partner switching as the partner control mechanism employed by members of the choosing class. This scenario corresponds in various important features to arguments put forward recently by André , who argued that the initial presence of helping may facilitate the evolution of conditional strategies, as such a scenario resolves the bootstrapping problem.
(b) Shifts from maximal mutual levels of conflict to intermediate mutual levels of conflict
An assumption of the standard iterated prisoner's dilemma game is that current interaction partners have an independent past and an independent future once the game is over. As a consequence, an individual's fitness is independent of its partner's fitness, apart from the link that is created through the pay-off consequences of their decisions during interactions. However, these assumptions are frequently not met in nature, especially in animals that are most likely to have iterated interactions: animals that live in stable social groups characterized by kinship, long-term relationships and social bonds. Under such conditions, the fitness of the social animals can be strongly determined by interdependencies. The best-known and most-studied interdependency is based on genetic relatedness and the resulting biological altruism (kin selection, [1,2]). Interestingly, however, long-term social bonds between genetically unrelated individuals or even just being a member of the same group are likely to have similar effects [87–89]. As Roberts  proposed, the logic of ‘r’ in Hamilton's rule can in principle be applied to any form of interdependence and denoted the coefficient ‘s’ for stake. Note though that the functioning is quite different: the value of r is fixed for related individuals, while a coefficient of interdependence between genetically unrelated individuals can change with time and directly affects the values of b and c. As an example of interdependence between unrelated individuals, in a slave-making ant species, several unrelated queens team up and rapidly produce a sufficient number of workers that can defend the common nest. For individual queens, there is no temptation to cheat because any failure to contribute will automatically lead to the failure of all queens . Once enough workers exist, the interdependence between queens drops below a critical threshold and the queens fight each other to the death until a single victor remains . 
The principle of interdependence applies more generally to social animals as any helping that generates benefits from repeated interactions over a long time-period is likely to cause interdependence between partners. The higher the interdependence, the more frequently social situations arise in which it is a self-serving strategy to support partners when they need it. Long-term partners become social assets that need to be cherished and are costly to lose: helping is under positive selection as long as the benefits for the recipient multiplied by the degree of interdependency outweigh the cost of helping . Note that this rule does not predict that individuals always help, but that helping is contingent on the act yielding net benefits independently of any reciprocal investment. Hence, we do not necessarily expect the strict contingency postulated by the model of mutually conditional investments.
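The condition stated above, that helping is favoured when the recipient's benefit weighted by the degree of interdependency exceeds the helper's cost, can be sketched as follows. The function names and the numerical values are illustrative assumptions; the symbol s follows the ‘stake’ coefficient attributed to Roberts.

```python
def helping_favoured(benefit, cost, stake):
    """Stake-weighted analogue of Hamilton's rule: help if s * b > c."""
    return stake * benefit > cost


def critical_stake(benefit, cost):
    """Smallest degree of interdependency above which helping pays, s* = c / b."""
    return cost / benefit


# Assumed example: helping yields 4 units to the recipient and costs the helper 1 unit.
print(critical_stake(4, 1))
print(helping_favoured(4, 1, stake=0.5))
print(helping_favoured(4, 1, stake=0.2))
```

Because the stake between unrelated partners can change over time, the same act may be favoured at one stage of a relationship and not at another; the sketch treats s as a fixed number only for simplicity.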
So what cases of helping between unrelated individuals may involve important levels of interdependency? We propose that, as a general rule, interdependency is correlated with the stability of partner availability. Stable relationships are most likely to occur in stable groups. For example, in bi-parental bird species with lifelong monogamy, the death of the partner causes a decrease in the fitness of the surviving individual [91,92]. Similar effects seem also to be present in primates . In many primates, individuals of one sex (typically females) remain within their natal group all their life . Evidence is accumulating that under such circumstances, a stable core social network has a positive effect on individual fitness [95,96].
Interdependence is by no means restricted to group-living species. Mutualisms in the form of symbioses provide many good additional examples. Here, genetic interdependence is absent but interdependence might still be strong if partners live intimately together over extended time-periods; it is maximal in case of joint vertical transmission to the next generation. Well-known cases of such symbioses include gut bacteria, lichens, corals and some ant mutualisms [3,6]. For such cases, it has been argued that interests are rather aligned and conflicts small [3,6,18].
(c) Interdependency versus mutually conditional investments based on an iterated prisoner's dilemma
Despite the various critical reviews and the well-established role of mutual dependency on helping, there is a considerable literature that seeks to explain helping behaviour as forms of mutually conditional investments within an iterated prisoner's dilemma framework, often based on observational data. Various recent highly interesting studies on alternated helping with respect to vigilance in rabbitfish pairs , coordinated hunting in lionfish  and leading during migration flights in geese  yield great examples for coordination but not for mutually conditional investments solving an iterated prisoner's dilemma, at least not until it is demonstrated that investments are contingent and a prisoner's dilemma pay-off matrix is the most parsimonious assumption. We argue that the latter is the more challenging part because of the frequent occurrence of genetic and social interdependencies. We illustrate this concern in figure 3, in which we show how interdependencies can transform a pay-off matrix that looks like a prisoner's dilemma into other games once the effects of interdependency are included (see [22,100] for theoretical papers).
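A minimal sketch of the kind of transformation described here, under an assumption that is ours rather than the paper's: each player's effective pay-off is its own pay-off plus a share `s` of its partner's pay-off. The names `Payoffs`, `with_interdependence` and `classify`, and all numerical values, are hypothetical and are not taken from figure 3.

```python
from typing import NamedTuple


class Payoffs(NamedTuple):
    """Focal player's pay-offs in a symmetric two-player, two-strategy game."""
    R: float  # both cooperate
    S: float  # focal cooperates, partner defects
    T: float  # focal defects, partner cooperates
    P: float  # both defect


def with_interdependence(g: Payoffs, s: float) -> Payoffs:
    """Add a share s of the partner's pay-off to the focal player's own.

    When the focal player receives S the partner receives T, and vice versa.
    """
    return Payoffs(
        R=g.R + s * g.R,
        S=g.S + s * g.T,
        T=g.T + s * g.S,
        P=g.P + s * g.P,
    )


def classify(g: Payoffs) -> str:
    """Name the game from the focal player's best responses (ties count as defect)."""
    cooperate_if_partner_cooperates = g.R > g.T
    cooperate_if_partner_defects = g.S > g.P
    if cooperate_if_partner_cooperates and cooperate_if_partner_defects:
        return "cooperation dominant"
    if cooperate_if_partner_defects:
        return "snowdrift"
    if cooperate_if_partner_cooperates:
        return "coordination"
    return "prisoner's dilemma" if g.R > g.P else "defection dominant"


# Hypothetical prisoner's dilemma: T > R > P > S.
base = Payoffs(R=3, S=0, T=5, P=1)
for s in (0.0, 0.5, 0.9):
    print(s, classify(with_interdependence(base, s)))
```

Under these assumed values the same underlying matrix reads as a prisoner's dilemma without interdependence, as a snowdrift game at an intermediate stake, and as a game in which cooperating is the dominant option at a high stake.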
The best-known among the games emerging from interdependencies is the snowdrift game, which is also called the hawk–dove game (when emphasizing the competitive rather than cooperative nature of an interaction; ). In this game, an individual's best option depends on what the partner is doing. If the partner cooperates, the best option is to defect; if the partner defects, the best option is to cooperate. This is because mutual defection yields the lowest pay-off for both players. As a consequence, the success of cooperating and defecting displays negative frequency dependence . One possible solution is to cooperate and to defect with probabilities that generate stable frequency dependence . However, this will lead to cases in which both partners defect and both lose out, so that various mechanisms may lead to the emergence of cooperators and defectors [43,72]. An even better solution would be that the two partners cooperate. How partners would achieve this solution is not obvious, however, as an individual that knows that the partner always defects should always cooperate to maximize its own pay-off. Mutually conditional helping that is contingent on the partner's behaviour would provide a solution. The option to switch partners in a biological market might offer an alternative control mechanism to achieve high levels of mutual help, as cooperators could leave defectors. As the baseline level of helping is not zero like in a prisoner's dilemma pay-off matrix but b/c, the resulting high levels of helping would be partially self-serving and partially mutually conditional investments (figure 4).
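The probabilistic solution to the snowdrift game can be made concrete with a short sketch. It assumes a symmetric game in which both players choose independently, and it uses hypothetical pay-off values; the function name `mixed_cooperation_probability` is ours. The cooperation probability is the one at which cooperating and defecting earn the same expected pay-off.

```python
def mixed_cooperation_probability(R: float, S: float, T: float, P: float) -> float:
    """Probability of cooperating at the mixed equilibrium of a snowdrift game.

    R: both cooperate; S: focal cooperates, partner defects;
    T: focal defects, partner cooperates; P: both defect.
    """
    if not (T > R and S > P):
        raise ValueError("not a snowdrift game: need T > R and S > P")
    # Indifference condition: p*R + (1-p)*S == p*T + (1-p)*P, solved for p.
    return (S - P) / ((S - P) + (T - R))


# Hypothetical pay-offs with the snowdrift ordering T > R > S > P.
p = mixed_cooperation_probability(R=3, S=1, T=4, P=0)
mutual_defection = (1 - p) ** 2  # both partners defect, chosen independently
print(p, mutual_defection)  # 0.5 0.25
```

With these assumed values a quarter of all interactions end in mutual defection, which illustrates why this solution leaves both partners worse off than reliable mutual cooperation would.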
6. General discussion
Our general goal has been to understand the evolution of dyadic cooperation and mutualism. The theoretical literature is heavily biased towards the iterated prisoner's dilemma game, but there is little evidence for this game in natural systems. A currently popular alternative explanation is positive pseudo-reciprocity [17,18], a situation in which partners have come to rely on each other with a stake in each other's success, and so both benefit from helping. While positive pseudo-reciprocity is thus rather conflict-free, we have made verbal arguments for why ecology and evolution may often alter individual strategy space such that asymmetric games emerge in which conflicts of interest between partners are of intermediate level rather than minimal or maximal. Conversely, we argued that for games of maximal conflict, i.e. in the form of iterated prisoner's dilemmas, ecology and evolution may often drive individual strategy space such that interdependency leads to games with intermediate conflict levels, as exemplified by the snowdrift game.
The take-home message is that some level of conflict is bound to be widespread in both cooperation and mutualism. On the other hand, maximal conflict based on a prisoner's dilemma pay-off matrix is most likely rare. Even the primate examples presented earlier that provided convincing evidence for mutually conditional investments between non-bonded group members [35,36] might still be built on top of a low level of unconditional helping owing to interdependence (group augmentation, ). The temptation to fully defect is most likely to be found in some mutualisms characterized by short interactions where the fitness of a behavioural phenotype of an individual is hardly affected by the partner's survival. However, we do not know of cases based on mutual temptation to defect as assumed by the iterated prisoner's dilemma game. Instead, mutualisms often represent games with asymmetric strategic options. Marine cleaning mutualisms involving cleaner wrasses of the genus Labroides provide a case in point. These cleaners have an inherent preference for the clients' mucus over their ectoparasites, so that feeding on the latter can be interpreted as an investment . More importantly, cleaners that follow their inherent preferences (i.e. cheat by feeding on mucus) may slightly reduce client survival but will otherwise not suffer any further repercussions, mainly because they interact with many different clients  and typically live shorter lives than their clients do (data available on fishbase.org). In contrast, the vast majority of client species have no means to defect on a cleaner, i.e. to perform a behaviour that increases their own pay-off at the expense of a cooperating cleaner. Possibly because the conflict does not involve interdependencies, the system has yielded experimental evidence for a variety of partner control mechanisms, including punishment, partner switching and social prestige [69,105].
Asymmetric cheating options are highly abundant in mutualisms, with experimental evidence that contingent helping may occur even in plants and insects [106–108]. For within-species cooperation, evidence for pay-to-stay in cooperatively breeding systems  provides a class of examples for helping acts like brood care or territorial defense, though Hamilton & Taborsky  found that if the threat of eviction alone enforces helping, subordinates will not overcompensate for the costs they impose on dominants.
In conclusion, we see great scope for both theoreticians and empiricists to consider games with intermediate levels of conflict, and to explore how selection on changes in individual strategy spaces leads to changes in game structure and pay-off matrices. Mutual or alternating, conditional helping does not in itself provide evidence for an iterated prisoner's dilemma, as the same patterns may emerge in an iterated snowdrift game or an iterated battle-of-the-sexes game. Furthermore, as we have suggested repeatedly in this paper, the market conditions in each study system have to be understood and incorporated. While free markets favour both conflict and the easy solution of partner switching, more restricted markets reduce conflict but potentially offer no solution to the remaining conflict level.
Finally, it will be interesting to investigate to what extent the pay-off structure is linked to various features of social interactions in stable groups, i.e. features that have traditionally been introduced by primatologists to the study of animal behaviour: gradual build-up of relationships (bonds; ), reconciliation and other repair or servicing mechanisms , including negotiation  or inequity aversion . Linked to this issue, one can ask questions about the cognition underlying these features. Stake-based cooperation is predominantly described in long-lived animals with long-term social bonds, individual recognition and good memory capacities , but stakes also exist in other systems. The monitoring of partner behaviour within a broader social context, and the correspondingly high degrees of freedom with respect to one's own behaviour, may well make social relationships cognitively demanding and hence a key aspect of the social brain hypothesis . Of overarching importance is that we understand the ecology of our study species. Differences in life history, social systems, genetic relatedness and cognitive abilities are likely to have a major influence on levels of helping and underlying decision rules. Only if we know the diversity of social interactions and helping patterns observable in our study species in nature can we conduct meaningful field or laboratory experiments.
All three authors were involved in the initial discussion of the topic. R.B. drafted a first version that was then edited by K.Z. mainly for clarity and by C.v.S. for conceptual issues.
The authors declare no competing interests.
All authors are funded by individual grants from the Swiss Science Foundation.
We thank the editors for the invitation to write this paper. Furthermore, we are grateful to Michael Taborsky, Jean-Baptiste André and an anonymous referee for constructive comments.
One contribution of 18 to a theme issue ‘The evolution of cooperation based on direct fitness benefits’.
- Accepted November 20, 2015.
- © 2016 The Author(s)