Cheaters—genotypes that gain a selective advantage by taking the benefits of the social contributions of others while avoiding the costs of cooperating—are thought to pose a major threat to the evolutionary stability of cooperative societies. In order for cheaters to undermine cooperation, cheating must be an adaptive strategy: cheaters must have higher fitness than cooperators, and their behaviour must reduce the fitness of their cooperative partners. It is frequently suggested that cheating is not adaptive because cooperators have evolved mechanisms to punish these behaviours, thereby reducing the fitness of selfish individuals. However, a simpler hypothesis is that such societies arise precisely because cooperative strategies have been favoured over selfish ones—hence, behaviours that have been interpreted as ‘cheating’ may not actually result in increased fitness, even when they go unpunished. Here, we review the empirical evidence for cheating behaviours in animal societies, including cooperatively breeding vertebrates and social insects, and we ask whether such behaviours are primarily limited by punishment. Our review suggests that both cheating and punishment are probably rarer than often supposed. Uncooperative individuals typically have lower, not higher, fitness than cooperators; and when evidence suggests that cheating may be adaptive, it is often limited by frequency-dependent selection rather than by punishment. When apparently punitive behaviours do occur, it remains an open question whether they evolved in order to limit cheating, or whether they arose before the evolution of cooperation.
Why should an animal cooperate when it could cheat? By definition, cooperation benefits others, often at some cost to the cooperator. Thus, selection should favour ‘cheaters’ that benefit from the social contributions of others while offering little or nothing in return. But nature tells a different story: cooperative behaviour is taxonomically widespread, and it has persisted and even flourished over long periods of evolutionary time [1–4]. This apparent contradiction has led the evolution of cooperation to be regarded as one of evolutionary biology's greatest puzzles [5–8].
The evolution of cooperation among non-relatives is especially perplexing because it cannot be explained by kin selection. Hamilton  famously showed that when cooperation is directed at relatives, it can increase the cooperator's inclusive fitness, even if it decreases the cooperator's direct fitness. However, although cooperation among relatives is common, not all cooperation is directed at kin [4,8,10]. Thus, partners need not be related for cooperation to evolve.
Game theory has advanced our understanding of the evolution of cooperation among non-relatives. The ground-breaking works of Trivers  and Axelrod & Hamilton  discussed the parallels between the Prisoner's Dilemma game and cooperative animal behaviours ranging from alloparental care in birds to fig pollination by wasps. Axelrod & Hamilton  introduced their paper by noting that, prior to the 1960s, ‘cooperation was always considered adaptive’ because of misguided group-selectionist thinking. They then went on to show that individuals playing a one-shot Prisoner's Dilemma game always get a higher pay-off from ‘defecting’ than from cooperating, even though mutual cooperation generates a higher pay-off than mutual defection. Probably as a result of the lasting influence of the Prisoner's Dilemma on thinking about the evolution of cooperation, the antithetical view now dominates: cheating is almost always considered adaptive, raising the question of how cooperation between non-relatives can be evolutionarily stable (e.g. ).
Here, we review the empirical literature on cooperation in animal societies and ask whether animals that cooperate less than other members of their social group are really ‘cheating’. We define cheating as an ‘adaptive uncooperative strategy’  that increases the cheater's fitness at the expense of its partner or social group . This is very similar to the recent definition of Ghoul et al. , who said that cheating is ‘a trait that is beneficial to a cheat and costly to a cooperator in terms of inclusive fitness’.
This definition has two key components: first, cheaters must prosper from cheating, and second, they must reduce the fitness of the individual being cheated. A failure to cooperate, therefore, does not always represent cheating; for example, individuals with few resources may invest little in cooperation but also generally have low fitness. The empirical literature is rife with examples of individuals that cooperate little or not at all in social interactions; here, we critically examine whether these behaviours satisfy the evolutionary definition of cheating given above. We organize these examples into four categories, not because these represent the only scenarios in which cheating might be favoured, but because these are the behaviours most commonly cited as evidence that cheating and punishment have co-evolved in cooperative animal societies. These categories are (table 1): (i) ‘lazy’ group members that fail to provide alloparental care to a shared or communal brood; (ii) subordinates that reproduce themselves instead of caring for relatives; (iii) defectors on reciprocal partnerships; and (iv) free-riders on collective action. Although there is abundant evidence of animals that fall into each of these categories, our review reveals that the fitness consequences of not cooperating are often assumed, but rarely measured. Furthermore, uncooperative individuals almost never prosper. More frequently, uncooperative individuals appear to have low fitness, an observation with important implications for the evolutionary dynamics of cooperation.
There are several reasons why uncooperative individuals might have low fitness: (i) poor overall condition may result in lower levels of cooperation, (ii) cheating may be subject to negative frequency-dependent selection, (iii) cheaters may be punished by other members of their social group, or (iv) cooperation may confer direct or indirect fitness benefits that accrue either immediately or incrementally over the lifetime of the cooperator, making failing to cooperate maladaptive even in the absence of punishment.
In the Prisoner's Dilemma game, cooperation and defection are discrete strategies, but in nature, cooperation is usually a continuous trait. Furthermore, the expression of cooperation is likely to be condition-dependent, with higher condition individuals often being better able to make larger investments into cooperation than lower condition individuals. For example, in some cooperatively breeding birds and mammals, a helper's condition affects how much alloparental care it provides, and experimentally improving a helper's condition increases cooperation [16,17]. (Alternatively, the converse relationship may also be possible—i.e. helpers in good condition might provide less alloparental care because they are less reliant on the benefits that helping generates—but we are not aware of empirical examples from animal societies.) Recently, Friesen  cleverly called low-fitness partners that cooperate little in interspecific mutualisms ‘defective, not defectors’, and this label probably also applies to many animals in poor condition that fail to cooperate in their social groups. Importantly, unlike defectors, ‘defective’ genotypes do not threaten the evolutionary stability of cooperation because they are not favoured by selection and thus do not spread within populations of cooperators.
A second reason that uncooperative individuals can have low fitness is negative frequency-dependent selection. The idea that frequency-dependent selection might limit cheating has received comparatively little attention from behavioural ecologists, partly because Hamilton's  original formulation of inclusive fitness showed that the fitness benefits of cooperating with kin remain constant regardless of the proportion of cooperators in the population. However, frequency-dependent game theoretical models have explored the evolutionary stability of mixes of cheating and cooperative strategies in groups of non-relatives (e.g. [19,20]). These models assume that free-riders do well when they are rare because they exploit the efforts of a large number of cooperators. But as cheating becomes more common, the benefits of cooperating (and hence, the benefits of associating with the group) diminish, and more cooperative groups outcompete groups with more free-riders (e.g. ). Thus, the solution to the game is a stable equilibrium frequency of each tactic, potentially explaining how variation in cooperative behaviour is maintained in natural populations. Note, however, that at equilibrium, all strategies have equal average fitness in this scenario.
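The logic of negative frequency dependence can be illustrated with a minimal numerical sketch. The pay-off structure below is an assumption chosen for illustration (a two-strategy 'snowdrift'-style game with hypothetical benefit `b` and cost `c`); it is not the model used in the studies cited above. The sketch iterates simple replicator dynamics and shows that the population settles at an intermediate frequency of cooperators at which both strategies have equal average fitness.

```python
# Illustrative assumption: a snowdrift-style game, not a model from the cited papers.
# b = benefit of the shared task, c = cost of performing it (split if both cooperate).

def payoffs(x, b=4.0, c=2.0):
    """Return (cooperator fitness, free-rider fitness) when a fraction x cooperates."""
    w_coop = x * (b - c / 2) + (1 - x) * (b - c)  # partner cooperates / partner free-rides
    w_free = x * b                                # free-riders gain only from cooperators
    return w_coop, w_free


def equilibrium(x0, b=4.0, c=2.0, dt=0.01, steps=20000):
    """Iterate replicator dynamics from an initial cooperator frequency x0."""
    x = x0
    for _ in range(steps):
        w_coop, w_free = payoffs(x, b, c)
        # Cooperators increase when they out-perform free-riders, and vice versa.
        x += dt * x * (1 - x) * (w_coop - w_free)
    return x


if __name__ == "__main__":
    for start in (0.05, 0.95):
        x_star = equilibrium(start)
        print(start, round(x_star, 4), payoffs(x_star))
```

With these hypothetical values, free-riders out-perform cooperators when cooperators are common and under-perform them when cooperators are rare, so the same mixed equilibrium is reached from either starting frequency.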
A final possibility is that cheating has led to the evolution of punishment in animal societies, and thus that uncooperative individuals have low fitness because other members of their social group punish them. The term ‘punishment’ is usually reserved for behaviours that go beyond simply withholding the rewards of cooperation [22,23]. Preferentially rewarding more cooperative partners is termed ‘positive reciprocity’ or ‘partner choice’ instead [6,15,22], whereas punishment is generally defined as paying a cost to harm a cheating partner, for example by harassing or attacking uncooperative individuals in a social group [22–24]. Punishment will be selectively favoured only if it generates an inclusive fitness benefit, so the short-term cost of punishment must be recouped through either a long-term direct fitness benefit to the punisher or an indirect fitness benefit to related members of the social group (termed ‘selfish’ and ‘altruistic’ punishment, respectively, by West et al. ). We also restrict our definition of punishment to those behaviours that have evolved in order to reduce the fitness of uncooperative individuals and enforce cooperation, excluding many acts of aggression among individuals in social groups. As Clutton-Brock & Parker  pointed out, aggressive behaviours occur in a variety of social contexts, including in dominance and competitive interactions, but these behaviours will often be favoured even in the absence of cheating and thus need not have evolved in response to cheating.
Here, we ask whether apparently punitive animal behaviours likely evolved in response to cheating. To select for punishment, cheating must reduce the fitness of the individuals being cheated. However, our review of the literature reveals surprisingly scarce or weak evidence that behaviours often called ‘cheating’ actually decrease the fitness of other social group members, suggesting that cheating does not drive the evolution of punishment. Furthermore, the evolution of punishment presents a number of conceptual difficulties (box 1).
Box 1. Conceptual difficulties with the evolution of punishment.
Evolutionary origins. Any scenario in which cooperation is maintained by punishment is subject to a ‘chicken-and-egg’ problem : did punishment give rise to cooperation, or did cooperation give rise to punishment? If punishment is necessary for cooperation to be favoured over cheating, then it had to be present before cooperation first evolved. But if punishment is an adaptation to cheating within social groups, then it is evolutionarily derived [7,26]. Yet some mechanism that conditions an individual's fitness on its level of cooperation had to exist for cooperation to evolve, in which case cooperation was adaptive in the absence of punishment. And if cooperation is adaptive without punishment, then there should be little selection for cheaters and subsequently for punishment. So, which came first?
Co-evolutionary dynamics. The coevolution of cheating and either punishment or partner choice is paradoxical [27–29]: if cheating selects for punishment, and punishment selects for cooperation, then punishment removes the selective incentive for its own maintenance. Whether punishment selects for more cooperation or more cheating is open to debate, as it is often suggested that animals evolve ways to cheat that go unpunished (e.g. [15,17,30,31]). Nonetheless, if punishment is ineffective against these new cheaters, they do not help to maintain punishment in the population.
Timescale of enforcement. The timescale over which punishment potentially ‘enforces cooperation’ differs between animals capable of learning or otherwise modifying their behaviour in response to harassment or aggression, and animals in which selfishness may be a fixed property of a genotype . If animals can learn, punishment may cause individuals to cooperate as a learned behaviour and thus may generate immediate benefits to the punisher if they repeatedly interact . But if punishment selects for cooperation only in future generations of animals, then costly punishment may fail to produce immediate benefits within the lifetime of the punisher and should be selected against (but see ).
Second-order cheaters. If punishment involves individual costs but group benefits, selection may favour ‘lazy’ punishers that are in effect a kind of ‘second-order’ cheater [22,23,33]. This simply punts the question from ‘what prevents individuals from cheating instead of cooperating?’ to ‘what prevents individuals from cheating instead of punishing?’
Because our review reveals few systems in which punishment is parsimoniously interpreted as an adaptation to cheating, we suggest several alternative explanations for the evolution of the apparently punitive behaviours many animals exhibit in their social groups. In many cases, these behaviours have probably evolved in the absence of cheaters and are selectively favoured in contexts unrelated to cheating. Furthermore, in many animals, harassment or aggression towards uncooperative individuals likely predates the evolution of cooperation and represents the ancestral state. In other words, in many systems, ‘punishment’ is probably not derived from cheating, but is the background against which cooperation first evolved [27,34].
2. Empirical evidence for cheating and punishment in cooperative animal societies
(a) Withholding alloparental care
Complex cooperative societies often include ‘helpers’, non-breeding subordinates that help to raise the offspring of dominant breeders. Helpers are often closely related to the brood that they help to raise, so indirect fitness benefits may prevent selection for cheating. In some societies, however, genetic relatedness is low, and cheating behaviours could be adaptive if group members are able to gain direct fitness benefits from group membership without paying the costs of helping [4,35,36]. The direct fitness benefits for an unrelated helper that joins a social group are reasonably well documented, including future mating opportunities, territory inheritance or increased survival [37,38]. But if unrelated individuals can access the benefits of group membership without helping to rear the group's young, what prevents them from withholding alloparental care? Do helpers ever fail to help, and if so, can dominant individuals punish or coerce them into cooperating?
Many studies have found that helpers vary in the amount of help they provide, often substantially [39,40]. However, ‘lazy’ helpers are not necessarily increasing their fitness by avoiding the costs of alloparental care. Instead, individuals that do little alloparental care may be in low condition and unable to invest much in helping, often resulting in low fitness. Most available evidence suggests that lazy helpers do not enjoy enhanced fitness compared with cooperative members of a group. In fact, one reason that helping is evolutionarily stable may be that the levels of care that helpers provide are typically dependent on body condition, lowering the long-term costs of helping . In meerkats, for example, levels of alloparental care by subordinate females are positively correlated with body mass and rate of weight gain, so lazy helpers are apparently in poorer condition than hard-working ones . Food supplementation experiments suggest that the same is true in cooperatively breeding moorhens (Gallinula chloropus)  and carrion crows (Corvus corone) .
Even when withholding alloparental care might be adaptive for the helper, it may not reduce the brood's fitness: the benefits of helping are subject to diminishing returns as the level of care (or the number of helpers) increases, so once a sufficiently high level of provisioning is reached, additional helping no longer augments offspring fitness (reviewed in ). Therefore, helpers may adaptively lower their levels of effort without selecting for retaliatory behaviours by dominant breeders.
Another complication is that helpers that appear to be lazy in one context may participate in other activities or at other times . Context-dependent division of labour should not be confounded with ‘laziness’, since division of labour can increase efficiency and benefit all parties . Baglione et al.  found that 27% of subordinate carrion crows were lazy helpers, failing entirely to provision nestlings. When dominant individuals were experimentally removed, however, lazy helpers voluntarily began visiting the nest and provided enough food to fully compensate for the loss of a breeder, indicating that they may in fact represent a sort of ‘insurance’ workforce. Removal experiments suggest that the same may be true for lazy workers in eusocial insect colonies, who—unlike vertebrates—generally cannot increase their own future reproductive fitness by withholding care since many are functionally or completely sterile [48–50]. ‘Lazy’ workers in eusocial insect colonies therefore pose an interesting comparison to cooperatively breeding vertebrates, since inactive individuals appear to be widespread, generally cannot increase their own fitness by withholding help, and are well tolerated by dominants .
In vertebrates, a different interpretation for this ‘tolerance of laziness’ is that helpers deceive dominant group members into believing that they are providing higher levels of care than is actually the case. ‘False-feeding’, in which helpers bring food to the brood but then consume the food themselves rather than delivering it, has been reported in several species and is often interpreted as a deceptive strategy. In the most frequently cited example, Boland et al.  found that young white-winged chough helpers (Corcorax melanorhamphos) frequently brought food to the nest and even placed it into the bill of a nestling, but then removed the item and consumed the food themselves. Helpers were most likely to false-feed when they were alone at the nest (i.e. when other group members could not witness the deception), and dominant breeders sometimes aggressively chased helpers that arrived without food. In every other study of this behaviour, however, results have indicated that false-feeding is unlikely to have evolved to deceive dominants. Instead, it appears to reflect a simple trade-off between the hunger of the helper and the needs of the nestlings: helpers that perform false-feeding are more likely to be young, inexperienced or in poor condition; and they are generally insensitive to the presence or absence of other group members [17,52–54]. In most of these societies, though not in choughs, helpers are related to the brood that they feed, so it is unlikely that false-feeding behaviours would be selectively favoured if they actually reduced nestling growth or survival.
But why is cheating so rare in cooperative societies where helpers are unrelated to the group's offspring [55–57]? One hypothesis proposes that cheaters are rare because the threat of punishment is ubiquitous: essentially, that cheating and punishment have co-evolved to the point that neither is observed in natural populations, though both lurk under the surface [30,58]. Empirical support for the ubiquity of punishment comes from manipulative studies in which subordinates are experimentally prevented from helping. In cichlids and fairy-wrens, dominant individuals harassed subordinates that had been experimentally prevented from helping and thus appeared to be defecting [59,60]; and in naked mole-rats and paper wasps, subordinates became ‘lazier’ when dominant females were experimentally prevented from punishing them, increasing their levels of effort only when the dominants were restored [58,61]. Although these studies do not demonstrate that cheating itself should be selectively favoured (or, indeed, that cheating occurs) under natural conditions, they provide strong behavioural evidence that dominants can, and do, monitor the actions of subordinate group members; and that aggression by dominants can induce cooperation by subordinates [30,62–64].
It remains an open question, though, whether punishment by dominants evolved as a response to cheating by subordinates, or whether such responses existed before cheaters did [26,27]. Particularly in cooperative breeding systems, where helping is typically facultative, it is possible that dominant breeders tolerated extra-group individuals that showed submissive behaviours or provided help, and attacked or evicted those that did not (e.g. ). Helping behaviour might therefore function as a type of ‘pre-emptive appeasement’  that propitiates aggressive dominants. In either case, credible threats and punishment may help to maintain cooperation by subordinate helpers, particularly unrelated ones; the key question is whether punishment may have played a primary role in the origins of helping behaviour, rather than having subsequently evolved to lower the benefits of cheating [26,66].
(b) Reproduction by subordinates
Another behaviour that may or may not represent cheating is to reproduce instead of helping to rear the offspring of the dominant breeder(s). A well-studied potential example is worker reproduction in social insect colonies, although subordinate reproduction is also known from cooperatively breeding vertebrates [30,67]. In eusocial Hymenoptera, workers usually rear the offspring of their mother, the queen, instead of their own young, presumably because doing so increases their inclusive fitness . It is nonetheless relatively common for workers to lay eggs that may develop into males (worker-produced males or WPMs) or occasionally into females . Are reproductive workers ‘cheating’ and how has ‘punishment’ evolved in social insects (box 2)?
Box 2. Policing in theory.
Theory has focused on two, not necessarily mutually exclusive hypotheses  explaining the evolution of worker ‘policing’ in social Hymenoptera.
Relatedness. In a seminal article, Ratnieks  showed that a ‘police allele’ will spread when workers are more closely related to queen-laid than worker-laid male eggs. Workers are always related to their own sons by a coefficient of relatedness (r) of 0.5, but their relatedness to the sons of queens and the sons of other workers depends on the colony kin structure. Because Hymenoptera are haplodiploid, when a colony has one queen that has mated once, workers are more related to the sons of other workers (r = 0.375) than to the sons of the queen (r = 0.25), precluding the evolution of worker policing through relatedness considerations alone. When queens mate with enough males (more than two, if paternity is shared equally among them), however, workers are more related to the queen's sons (still r = 0.25) than to the sons of other workers (r falls towards 0.125 as the number of males that mate with a queen increases), meaning that workers can increase their inclusive fitness by policing egg-laying by other workers.
Colony productivity. Ratnieks  also showed that a police allele can be selectively favoured if ‘colony-level efficiency increases as a result of worker policing’. This hypothesis hinges on worker reproduction reducing colony productivity, for example because reproductive workers invest time or resources in their own brood at the expense of the queen's brood. If policing reduces worker investment in reproduction, in principle, it can increase colony productivity.
Frank [70,71] proposed a more general model for the evolution of any trait that represses competition within groups and thus increases group productivity, and El-Mouden et al.  extended Frank's model to incorporate more realistic costs and benefits of policing. Although Frank [70,71] originally found that policing evolves relatively easily (specifically, whenever r < 1 − c, in which r is relatedness and c is the cost of policing; thus, policing in this model evolves even at very low relatedness), El-Mouden et al.  showed that the conditions favouring the evolution of policing are likely to be much more restrictive. For example, in the El-Mouden et al.  model, policing cannot evolve at low relatedness and cannot fully suppress within-group conflicts, leading the authors to conclude that, ‘policing may be harder to evolve than originally thought’.
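The relatedness values quoted under the relatedness hypothesis in box 2 can be reproduced with a short calculation. The sketch below assumes a single queen whose k mates share paternity equally, so that a randomly chosen pair of workers are full sisters with probability 1/k and half-sisters otherwise. The function names are hypothetical and the code is only an illustration of the arithmetic, not a reimplementation of any published model.

```python
R_QUEEN_SON = 0.25         # worker's relatedness to a brother (the queen's son)
R_FULL_SISTER_SON = 0.375  # relatedness to the son of a full sister
R_HALF_SISTER_SON = 0.125  # relatedness to the son of a half-sister


def relatedness_to_worker_sons(k):
    """Average relatedness of a worker to other workers' sons.

    Assumes one queen mated with k males that contribute equally to paternity.
    """
    p_full = 1 / k  # chance that another worker shares the same father
    return p_full * R_FULL_SISTER_SON + (1 - p_full) * R_HALF_SISTER_SON


def policing_favoured(k):
    """True when workers are more related to the queen's sons than to worker-laid males."""
    return relatedness_to_worker_sons(k) < R_QUEEN_SON


if __name__ == "__main__":
    for k in (1, 2, 3, 10):
        print(k, relatedness_to_worker_sons(k), policing_favoured(k))
```

Under these assumptions the two relatedness values are equal at exactly two mates, so relatedness alone favours policing only when the queen mates with more than two males.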
Although we usually expect cheaters to invade populations of cooperators, it is possible that worker reproduction has simply been retained since before the evolution of eusociality (to quote Bourke [68, p. 304], ‘workers may have been selected to produce sons and rear sisters’). In fact, current evidence suggests that worker sterility is highly derived within eusocial Hymenoptera and that workers continued to reproduce in many lineages long after the evolution of eusociality . Recent phylogenies could be further leveraged to explore how often hymenopteran lineages with reproductive workers are nested in clades in which complete worker sterility is the ancestral condition, and thus whether cheating invades cooperation at macroevolutionary timescales.
Much as helpers vary substantially in the amount of help they provide, workers vary substantially in how much they reproduce [68,73–77]. A leading explanation to account for some of this variation is worker ‘policing’, which is possibly the most widely cited example of punishment in any animal society. In some social Hymenoptera, workers eat eggs laid by other workers or act aggressively towards workers with activated ovaries, thereby reducing the number of adult males that are worker-produced. Worker policing was first studied empirically in the honeybee, Apis mellifera , and has subsequently been described in some ants, wasps and other bees .
There are two main hypotheses about the evolution of worker policing in social insects: the relatedness and colony productivity hypotheses (box 2). The former predicts that policing is selectively favoured because workers increase their relatedness to the colony's males by policing (box 2 and ). However, under this hypothesis, worker policing does not really ‘enforce cooperation’ so much as bias relatedness in workers' favour. In other words, the relatedness hypothesis predicts that not all worker reproduction is policed (i.e. it is not cheating per se that is punished), just the production of male eggs related to workers at r < 0.25. A significant phylogenetic correlation between worker reproduction and colony kin structure provides some support for this hypothesis, although there is substantial unexplained variation; colonies produce few WPMs when workers are more related to the queen's sons than other workers' sons and many WPMs when this relatedness difference is reversed . Models have also explored how worker policing and worker adjustment of colony sex ratio may evolve in tandem [73,77].
There is less evidence for the colony productivity hypothesis, in which worker policing recoups reductions in colony productivity caused by selfish worker reproduction (i.e. cheating; box 2). Although several studies have attempted to measure the costs of worker reproduction to the colony as a whole, the resulting evidence of costs is very mixed [21,80–86]. To our knowledge, no studies have documented a negative correlation between the number of WPMs and the number of queen-produced reproductive offspring made by a colony. The time or energy that workers invest in rearing their own offspring may trade off with investments in other tasks, but how this affects colony productivity is largely unknown ([31,80,84,87–89]; but see ). Most experiments comparing colonies that vary in their number of reproductive workers have found few, if any, differences in colony productivity [82,83,85,86], although Dobata & Tsuji  recently showed that brood production decreased when there were more ‘cheaters’ (i.e. reproductive workers) in laboratory colonies of the parthenogenetic ant Pristomyrmex punctatus. However, while unrestrained worker reproduction appears to be costly to P. punctatus colonies, there is no evidence of policing in this ant species ; instead, the authors suggest that cheating and cooperative lineages of P. punctatus could be maintained by frequency-dependent selection. Worker policing occurs in other clonal ants (e.g. ), but there are no cheaters in these taxa because nest-mates are genetically identical (in contrast, the ‘cheaters’ in P. punctatus colonies are a distinct genetic lineage ). Thus, collectively, the results of these studies provide little evidence that cheating workers select for policing to increase colony productivity.
‘Policing’ could also be a manifestation of reproductive competition among subordinates in both social insects [68,91,92] and vertebrates . In the bumblebee Bombus terrestris, both queens and workers eat large numbers of worker-laid eggs, and when workers consume eggs they often have activated ovaries, though not always . In both cooperative and non-cooperative breeders, selection should generally favour females that gain access to a greater share of local resources for their own offspring by acting aggressively towards other reproductive females or the other females' young. Furthermore, brood cannibalism and aggression among reproductive females are likely ancestral to eusociality in Hymenoptera; both solitary and primitively eusocial taxa often engage in these behaviours (e.g. [68,92–97]). This suggests that worker policing is not a de novo adaptation in highly social lineages like Apis, but instead evolved in the context of a pre-existing behavioural repertoire that included egg eating and aggression among females. In addition, the expression of a ‘police allele’  in workers could result from it being selectively favoured in queens.
There are several other potential explanations for the evolution of worker policing behaviours that also have little to do with punishment of cheating. For example, workers may kill all eggs not laid by the queen as a defence against brood parasitism [69,98]. Or colony members may eat worker-laid eggs more often than queen-laid eggs because worker-laid eggs are less viable (, but see [69,100]).
Much as punishment of lazy helpers is often documented by experimentally preventing subordinates from helping (see previous section), worker policing is often documented by experimentally manipulating workers to lay eggs. Experiments demonstrating worker policing usually generate large numbers of worker-laid eggs by ‘orphaning’ groups of workers (i.e. separating workers from the queen; e.g. [78,86,87]) because workers rarely lay eggs when the queen is present . The fact that workers lay few eggs in the presence of the queen suggests that it is often the queen that represses worker reproduction, not other workers. Recently, Van Oystaeyen et al.  identified highly conserved queen pheromones that prevent worker reproduction in ants, bees and wasps, perhaps giving new life to the old hypothesis (e.g. ) that most worker reproduction is prevented by queen traits (e.g. pheromones or aggressive behaviour). The relative contribution of worker policing to ‘reproductive harmony’ in social insect colonies may be fairly small; one estimate is that worker policing reduces the proportion of WPMs produced by A. mellifera from approximately 7% to 0.12% , although Kärcher & Ratnieks  recently found only 14 worker-laid eggs in over 3000 empty drone cells in six A. mellifera colonies. In general, more studies are needed in which researchers measure how many eggs are actually laid by workers in un-manipulated queenright colonies, and how many are subsequently killed by workers before reaching adulthood (especially under natural conditions). One possibility is that worker reproduction is so rare that it imposes little selection for worker policing, and thus that the worker behaviours that have been called ‘policing’ have evolved for reasons other than punishment of cheating.
(c) Failure to reciprocate, or ‘defecting’ in a reciprocal interaction
Perhaps the most frequently proposed explanation of cooperation between unrelated individuals is that it is maintained by reciprocal exchanges of goods or services, in which one individual suffers a temporary fitness cost in order to help a partner who later reciprocates the cooperative act . Axelrod & Hamilton  showed that the Prisoner's Dilemma can lead to cooperation only if the game is repeated many times with the same participants: in a one-shot game, it always pays to defect. Models of the repeated Prisoner's Dilemma have shown that the strategies most likely to lead to stable cooperation are those in which individuals copy the previous behaviour of their partners, cooperating when they do and ceasing to cooperate if their partners defect (‘tit-for-tat’), especially if cooperative players occasionally forgive defectors (‘generous’ tit-for-tat ).
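The contrast between the one-shot and the repeated game can be made concrete with a minimal simulation. The sketch below is illustrative only: the payoff values (5, 3, 1, 0) are the conventional ones used in textbook treatments and are assumed here, not taken from any study reviewed, and the function names are hypothetical.

```python
# Assumed illustrative payoffs with the usual ordering T > R > P > S.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {
    ("C", "C"): (R, R),  # mutual cooperation
    ("C", "D"): (S, T),  # first player exploited
    ("D", "C"): (T, S),  # first player exploits
    ("D", "D"): (P, P),  # mutual defection
}

def tit_for_tat(own_history, partner_history):
    """Cooperate first, then copy the partner's previous move."""
    return "C" if not partner_history else partner_history[-1]

def always_defect(own_history, partner_history):
    """Defect in every round, whatever the partner does."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Return the total payoffs of two strategies over repeated rounds."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Under these assumed payoffs, a defector meeting a tit-for-tat player earns 5 against 0 in a single round, but over 10 rounds it earns only 14, whereas two tit-for-tat players earn 30 each. Defection pays in the one-shot game and loses in the repeated one.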
Recent models have extended the original two-partner dyad to encompass larger cooperative networks, considering situations in which many individuals can repeatedly interact to trade commodities under conditions that are less restrictive and more socially complex than those envisioned by original solutions of the Prisoner's Dilemma game . Several models support the hypothesis that generalized reciprocity can create evolutionarily stable levels of cooperation in a Prisoner's Dilemma situation by the simple decision rule of ‘help anyone if helped by someone’, and that following these rules need not be cognitively demanding [105,106]. Although empirical evidence for reciprocal cooperation in natural, non-captive settings still lags behind theory, recent studies on alloparental care have convincingly interpreted helping behaviour by non-relatives as a type of commodity trading—essentially, that helpers ‘pay to stay’ in the social group by providing cooperative services [57,59].
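The claim that generalized reciprocity need not be cognitively demanding can be illustrated by comparing the memory that each decision rule requires. The sketch below is a hypothetical illustration, not a reconstruction of any published model; the class names and the assumption that naive individuals start out helpful are ours.

```python
class GeneralizedReciprocator:
    """'Help anyone if helped by someone': stores a single bit of memory."""

    def __init__(self, helped_last=True):
        # Assumption: a naive individual starts out willing to help.
        self.helped_last = helped_last

    def decide(self, partner_id):
        # The partner's identity is ignored entirely.
        return self.helped_last

    def observe(self, partner_id, partner_helped):
        self.helped_last = partner_helped


class DirectReciprocator:
    """Tit-for-tat style rule: stores one record per known partner."""

    def __init__(self):
        self.memory = {}

    def decide(self, partner_id):
        # Assumption: unfamiliar partners receive help by default.
        return self.memory.get(partner_id, True)

    def observe(self, partner_id, partner_helped):
        self.memory[partner_id] = partner_helped
```

The generalized rule needs neither individual recognition nor a record of who gave what; the direct rule needs both, and its memory grows with the number of partners.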
One of the best examples of direct reciprocity in a natural setting is food sharing by common vampire bats (Desmodus rotundus), in which reciprocal exchanges of blood meals closely resemble the repeated dyadic encounters originally envisioned by Trivers . Vampire bats feed exclusively on blood, and bats that return to the communal roost without having fed are in danger of starvation unless another bat regurgitates blood to them. Although the interpretation of this behaviour as reciprocal altruism  has been criticized on the grounds that bats also feed kin , a recent study found that reciprocity was a stronger predictor of blood donations than genetic relatedness: individuals preferentially donated blood to those that had previously donated to them, and the proportion of partnerships that formed between unrelated individuals did not differ significantly from that predicted by random assortment . In order for vampire bats to preferentially repay those that have helped them, they must be able to recognize individuals as well as to remember who has given what, when—perhaps a valid assumption in vampire bats, since individuals sniff one another before donating blood and can recognize relatives, suggesting that odour plays a role in individual recognition. Moreover, interaction networks are relatively small, with each bat sharing blood with an average of fewer than four partners. Reciprocal interactions may be further stabilized by the asymmetry between the costs and benefits of blood donations: a well-fed individual pays a relatively low fitness cost when it donates blood, whereas the relative benefit to the recipient is much larger . The prediction, therefore, is that cheating is not favoured because defectors should pay a large direct fitness cost in future losses, although this has not been tested directly.
Vampire bats appear to meet the crucial requirement of theoretical solutions of the Prisoner's Dilemma: cooperators help those that have previously helped them; therefore, they presumably fail to help defectors. Recent theory, too, shows that reciprocity remains stable only when partners are able to end the relationship when faced with a defector (for example, by leaving or threatening to leave the social group [108,109]). But despite a substantial theoretical literature on costly punishment (that largely fails to support its evolutionary stability [110–112]), there is almost no empirical evidence for punishment that goes beyond withdrawing from a partnership in interactions involving sequential reciprocal exchanges. In the most robust examples of reciprocity, such as food exchange in vampire bats and allogrooming in ungulates, aggression and physical harassment have not been observed even though individuals sometimes fail to reciprocate [10,113]. Punishment might evolve more readily in primates, in which social bonds and individual recognition are well established and inter-individual aggression is frequent; but here too evidence is lacking (thoroughly reviewed by Silk ).
The sole report of systematic punishment of non-cooperators in a system characterized by reciprocal exchange is from Hauser & Marler , who studied food-associated calls in rhesus macaques. Rhesus macaques often give characteristic calls upon finding food, a response that may have evolved in order to alert group members to share food. Hauser & Marler  experimentally provided free-ranging macaques with food and observed the responses of discoverers and fellow group members. They found that macaques that failed to call when they found food were more likely to be attacked than were those that did call, a result widely interpreted as punishment for defecting . However, these results are also consistent with manipulation or coercion: attacks were usually directed at lower ranking females, and males that discovered food were rarely attacked even though they typically did not call . Nor is it clear that aggressive punishment was costly to enforce—in fact, ‘punishers’ often chased the discoverers away and consumed the food themselves, suggesting that the primary benefit to punishing was immediate access to food rather than ensuring that the discoverer would cooperate in the future. (It is not even obvious that ‘defectors’ stayed silent in order to conceal the food from other group members and increase their own consumption, since individuals were more likely to call when they were hungry than when they were satiated.) Given the lack of empirical evidence for costly punishment, it seems unlikely to play an important role in enforcing reciprocity in non-human societies.
(d) Free-riding on collective action
Collective actions occur when many individuals simultaneously contribute to a common good, such as cooperative hunting, predator mobbing or defence of a shared territory [8,117]. Although collective actions do occur in the context of cooperative breeding societies, we draw a distinction here between the alloparental care behaviours discussed earlier and examples of collective action, largely because collective actions often involve unrelated individuals and in most cases the immediate fitness pay-offs of cooperating are more evenly shared among participants. Consider, for example, a group of white-faced monkeys (Cebus capucinus) that collectively defends the boundaries of its foraging territory against neighbouring groups. When two groups are equally motivated to fight, the size of the group is the most important determinant of success. Each additional group member increases the odds of winning the interaction by 10%. Individuals that hold back (‘laggards’ that do not participate in the fight) contribute nothing to their group's strength, thereby increasing the probability that their group will suffer a loss.
In this case, cooperation generates immediate, synergistic rewards that are shared by all group members. Laggards decrease their own fitness as well as that of their group-mates, so ‘free-riding’ is in fact quite costly. Cheating is limited because the relative pay-offs of cooperating and defecting create a stable equilibrium: if other group members do their part, it is best for the laggard to do his as well. Maynard Smith & Szathmáry likened this pay-off structure to that of two people in a rowboat, each with one oar on opposite sides of the boat. The boat can only move forward if both row (both players benefit). If only one rows, the boat moves in circles (neither benefits, but one pays a cost); and if neither rows, the boat does not move at all (neither benefits and neither pays a cost). N-person rowing games (also known as stag hunts in game theory) may explain the evolutionary stability of many instances of collective action [120,122].
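The pay-off structure of the two-person rowing game can be written out and checked for stable outcomes. The following is a minimal sketch in which the numerical benefit and cost are arbitrary assumed values (any benefit larger than the cost gives the same result) and the function names are hypothetical.

```python
from itertools import product

def rowing_payoff(my_move, partner_move, benefit=2.0, cost=1.0):
    """Pay-off to one rower: the boat advances only if both row."""
    gain = benefit if (my_move == "row" and partner_move == "row") else 0.0
    paid = cost if my_move == "row" else 0.0
    return gain - paid

def pure_nash_equilibria(payoff, moves=("row", "rest")):
    """List move pairs from which neither player gains by switching alone."""
    equilibria = []
    for a, b in product(moves, repeat=2):
        a_ok = all(payoff(a, b) >= payoff(alt, b) for alt in moves)
        b_ok = all(payoff(b, a) >= payoff(alt, a) for alt in moves)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria
```

Calling `pure_nash_equilibria(rowing_payoff)` returns two outcomes, both rowing and both resting. Once a partner rows, resting gives 0 where rowing gives a net gain, so a laggard cannot profit by holding back, which is the sense in which cooperation here is self-enforcing without punishment.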
As relative group size increases, the magnitude of these shared benefits declines since a smaller proportion of group members is needed to provide the same collective good [123,124]. When non-cooperating group members are able to profit from the actions of cooperative group-mates, cheating is no longer costly and cheaters should increase their direct fitness relative to that of cooperators. Empirical studies on cheating in inter-group encounters support the prediction that cheating may become profitable in larger groups, finding that some individuals consistently fail to participate in aggressive inter-group conflicts (e.g. black howler monkeys, Alouatta pigra; free-ranging dogs, Canis lupus familiaris; ring-tailed lemurs, Lemur catta; and wolves, Canis lupus).
As with lazy helpers in alloparental care systems, though, it is not always clear whether individual variation in the level of cooperation represents free-riding. There is little evidence that laggards profit from their actions, although the fitness consequences of such behaviours are notoriously hard to measure. More importantly, lagging is generally correlated with asymmetries in the benefits accrued from an inter-group conflict and the costs incurred (reviewed in ). In other words, some individuals may invest less in collective actions simply because they have less to gain or more to lose, so their failure to participate may reflect individual differences in age or condition among group members rather than an attempt to cheat. A comprehensive review of inter-group interactions in non-human primates found that most occurrences of free-riding are correlated with variation in dominance rank, kinship ties and mating success rather than with immediate gains in direct fitness .
In a few cases, however, certain individuals fail to participate in inter-group conflicts regardless of any of these variables. An influential study by Heinsohn & Packer  found that group-territorial female lions (Panthera leo) could consistently be assigned to four categories: unconditional cooperators, who always led the way in territory defence; conditional cooperators, who participated only when most needed; conditional laggards, who participated less when most needed; and unconditional laggards, who always stayed behind regardless of the group's need. Along with the finding that the occurrence of laggards tends to increase with group size [119,126,132], these results suggest that individuals that hold back from collective actions do sometimes increase their own fitness at the expense of cooperative group-mates, though not as frequently as often supposed.
What role, if any, does punishment play in these situations? Despite a large theoretical literature, the evidence that punishment can suppress free-riders in public goods games is equivocal at best. Depending on the assumptions of the model, theory predicts that punishment can promote cooperation [133,134], have no effect on cooperation , or even undermine cooperation by favouring retaliation [136,137]. Empirical studies of humans playing N-player public goods games under laboratory settings suggest that punishment can be evolutionarily stable, but only under a very restricted set of conditions (reviewed in [23,138]).
In non-human animal societies, by contrast, there is virtually no empirical evidence that punishment enforces collective action. Heinsohn & Packer noted that although leading female lions appeared to recognize laggards, and remembered their past failures to cooperate, they failed to punish them directly (through aggression) or indirectly (by withdrawing further cooperation). Although this finding was initially considered surprising, it now appears to be the rule rather than the exception. A growing number of studies have explicitly sought behavioural evidence for punishment and have failed to find it, even when laggards are obvious (for example, in lemurs, chimpanzees, dogs and sociable weavers).
A promising alternative hypothesis is that free-riding on collective action is adaptive only in certain contexts—as above, when group size is sufficiently large that cheating is cost-free—and at low frequencies. Although the role of frequency-dependent selection in stabilizing cooperation has recently received increased attention, most empirical support comes from social microbes rather than cooperative vertebrates. For example, several studies have found that cooperative microbial strains—those that produce a public good, such as antibiotic compounds, biofilm polymers or iron-scavenging molecules—can coexist in stable equilibria with cheating strains, which avail themselves of these goods without producing any in return. Cheaters have higher fitness than cooperators when they are rare, since producing these goods is costly; but their fitness diminishes rapidly as their prevalence in the population increases [141,142]. It remains to be seen whether similar dynamics may help explain the coexistence of leaders and laggards in cooperative vertebrate groups, since selection for cooperative traits may be weaker than in social microbes, and animal groups are less highly structured . Nevertheless, recent intriguing evidence that negative frequency dependence limits cheating in socially polymorphic spiders (Anelosimus studiosus)  and bark beetles (Dendroctonus frontalis)  suggests that this hypothesis warrants further research in cooperative animal groups.
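The logic of negative frequency dependence can be sketched with a toy model. Everything in the block below is an assumption made for illustration: cheater fitness is taken to decline linearly with cheater frequency, cooperator fitness is fixed at 1, and the parameter values are arbitrary. It is not fitted to any of the microbial or animal systems cited.

```python
def cheater_fitness(p, advantage=0.2, decline=0.5):
    """Relative fitness of cheaters when they make up a fraction p of the group.

    Assumed linear form: cheaters do better than cooperators (fitness 1)
    when rare and worse when common.
    """
    return 1.0 + advantage - decline * p

def next_frequency(p, cooperator_fitness=1.0):
    """Cheater frequency after one generation of discrete replicator dynamics."""
    w_cheat = cheater_fitness(p)
    mean_fitness = p * w_cheat + (1.0 - p) * cooperator_fitness
    return p * w_cheat / mean_fitness

def equilibrium_frequency(p0, generations=500):
    """Iterate the dynamics from a starting cheater frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p)
    return p
```

With these assumed parameters, cheaters spread when rare and decline when common, and the population settles at the frequency where the two fitnesses are equal (advantage / decline = 0.4) whether it starts at 1% or 90% cheaters. No punishment term appears anywhere in the model.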
3. Conclusion and future directions
We find little evidence to suggest that cheating and punishment behaviours have co-evolved in cooperative animal societies. When cheating does occur—for example, when free-riding monkeys fail to participate in inter-group fights—punishment behaviours are rare or non-existent, and the effect of cheating seems to be to limit optimal group size rather than to select for retaliatory aggression [118,119]. By contrast, the best-documented examples of punishment come from manipulative experiments in which cheaters are artificially created, as when dominant cichlids or fairy-wrens attack subordinates that they perceive to be unhelpful [59,60] or when worker honeybees police eggs transferred from queenless colonies . Therefore, there is evidence that cheating occurs at low frequencies in some societies and that policing or physical aggression may help to maintain cooperation in others, but not that the latter has evolved in response to the former.
In many cases, cooperation appears to be evolutionarily stable not because cooperators were able to evolve responses to punish uncooperative cheats, but because such responses probably existed from the outset, selecting for cooperative partners. This conclusion has important implications for the evolutionary dynamics of cooperation, including its origins. As Cockburn pointed out, empiricists have generally succeeded in coming up with adaptive explanations for cooperative breeding when it has occurred, but have ‘failed miserably’ at predicting in which lineages it should evolve, or why it is absent in other lineages. Thinking about punishment as a precondition for stable cooperation rather than as an evolved response to cheating may help us to predict when cooperation can evolve. In the same way that understanding kin selection has generated testable predictions about which ancestral mating systems should be most likely to give rise to cooperation (e.g. [1,3,147]), understanding the life-history or behavioural traits that probably antedated the evolution of cooperation may help generate predictions about the ancestral conditions under which it arose.
Our review also suggests that, contrary to what is typically assumed, not cooperating is rarely an adaptive strategy for social animals; when cooperation generates direct or inclusive fitness benefits, a failure to cooperate lowers an animal's lifetime fitness. In these societies, cheating is not selectively favoured in the first place and non-cooperative phenotypes may be maintained only in mutation-selection balance . If cheaters are therefore rare, they are unlikely to impose much selection for punishment. These results are consistent with recent theory, which has increasingly shown that punishment—even in humans—can be evolutionarily stable only under limited circumstances [72,135,138,150], and that cooperation is unlikely to evolve when cheating is truly advantageous.
C.R. conceived the review. Both authors contributed to conceptual development, wrote the review and approved the final manuscript.
We have no competing interests.
C.R. was supported by the Harvard Society of Fellows and by Princeton University. M.E.F. was supported by an NSERC Discovery grant and the University of Toronto.
We thank Deborah M. Gordon, Egbert G. Leigh Jr., Dustin R. Rubenstein, Michael Taborsky and two anonymous reviewers for their thoughtful and constructive criticisms on earlier versions of this manuscript; and we thank Meghan J. Strong for editorial assistance in manuscript preparation.
One contribution of 18 to a theme issue ‘The evolution of cooperation based on direct fitness benefits’.
Accepted November 19, 2015.
© 2016 The Author(s)