The interplay of cognition and cooperation

Sarah F. Brosnan, Lucie Salwiczek, Redouan Bshary

Abstract

Cooperation often involves behaviours that reduce immediate payoffs for actors. Delayed benefits have often been argued to pose problems for the evolution of cooperation, both because such contingencies may be difficult to learn and because partners may cheat rather than return benefits. Therefore, the ability to achieve stable cooperation has often been linked to a species' cognitive abilities, which is in turn linked to the evolution of increasingly complex central nervous systems. However, in their famous 1981 paper, Axelrod and Hamilton stated that in principle even bacteria could play a tit-for-tat strategy in an iterated Prisoner's Dilemma. While to our knowledge this has not been documented, interspecific mutualisms are present in bacteria, plants and fungi. Moreover, many species which have evolved large brains in complex social environments lack convincing evidence in favour of reciprocity. What conditions must be fulfilled so that organisms with little to no brainpower, including plants and single-celled organisms, can, on average, gain benefits from interactions with partner species? On the other hand, what conditions favour the evolution of large brains and flexible behaviour, including the use of misinformation? These questions are critical, as they begin to address why cognitive complexity would emerge when ‘simple’ cooperation is clearly sufficient in some cases. This paper spans the literature from bacteria to humans in our search for the key variables that link cooperation and deception to cognition.

1. Introduction

Cooperation between unrelated individuals is of great interest for evolutionary biologists for several reasons. First, as cooperation involves investments (behaviour that reduces the immediate payoff of the actor) in the provision of benefits to another individual, one has to reconcile its existence with a theory of evolution that emphasizes the advantages of self-interest. How does a behaviour which benefits another individual evolve and how do actors exchanging these benefits deal with the potential of cheating? Many analytical models and computer simulations describe the conditions under which cooperation may promote individual fitness. Often, however, proximate issues such as the tendency to choose immediate benefits rather than delayed rewards have to be overcome to achieve stable cooperation. Second, cooperation may be at least partly responsible for the cognition with which it is associated. The ability to cooperate but also to manipulate and deceive partners is assumed to play an important role for an individual's fitness in social species. Therefore, cognitive abilities that may enhance an individual's competence may have been under strong positive selection and may have contributed to the evolution of (relatively) enlarged neocortices in birds, primates, cetaceans and other highly cooperative vertebrates (Machiavellian Intelligence hypothesis: Byrne & Whiten 1988; social brain hypothesis: Barton & Dunbar 1997; Emery, Clayton & Frith 2007). Thus, intelligence may not have evolved as a ‘universal capacity’, but instead as a ‘social competence’.

Nonetheless, it is important to note that social complexity involves a great variety of phenomena. For example, the Machiavellian Intelligence hypothesis in its original form (Byrne & Whiten 1988; Whiten & Byrne 1997) took a wide, permissive perspective on the variety of socio-cognitive adaptations through which an individual may exploit the potential benefits of its social world, as well as dealing with its hostile aspects; social knowledge, discovery techniques, social curiosity, social problem solving, innovation, flexibility, social expertise, social play, mind-reading, self-awareness, imitation and culture were all explicitly included. The importance of cooperation and deception relative to other factors remains an open question. This becomes more obvious as one realizes that cooperation per se does not require advanced cognition. Intraspecific cooperation and interspecific mutualisms are ubiquitous in nature, found in organisms ranging from single-celled organisms and plants to invertebrates and humans. This ubiquity implies that cooperation and cheating can be achieved by very simple means. Thus, while scenarios can be raised in which cooperation is cognitively demanding, it is also reasonable to assume that the advanced cognition which has been proposed is not required for all cooperation. As a consequence, cognitive complexity cannot be inferred from observing cooperation, but must be demonstrated experimentally.

In fact, pinnacles of social complexity appear to follow a bimodal distribution, where prominent species are either not very cognitive (single-celled organisms and social insects) or are among the most intelligent species on Earth (humans; Clutton-Brock et al. 2009; see also Connor 2010). The selective forces differ, however. It is clear that helping behaviour in bacteria and eusocial insects is (almost) entirely driven by kin selection (Clutton-Brock et al. 2009), leading to behavioural strategies that are largely unconditional on the behaviour of partners (although they may be conditional on recipient identity and one's body condition). While kin selection is important in large-brained social species as well, additional complexities arise in these species from cooperation with unrelated individuals. Here, we are particularly interested in the possibility that specific cognitive abilities are linked to large-scale cooperation.

Cognitive abilities may be important for two different aspects of cooperation. First, cognition may help make coordination between partners more efficient. Second, cognition may be important to make strategic decisions concerning the best behavioural option in a given situation. Following some important definitions (§2), we will only briefly discuss the former aspect in §3 and then focus on strategic decision making in §4, which comprises the main body of this paper. In order to identify conditions that warrant the existence or evolution of (more advanced) cognitive abilities for social success, we will first analyse some examples of cooperative and deceptive behaviour in which decision making requires no brain at all. This will spotlight key conditions where simple decision rules are insufficient to prevent exploitation by cheaters, and thus will help to identify conditions under which cooperation is mediated by specific cognitive skills. In this context, we also discuss whether increasing cognitive abilities invariably help the evolution of cooperative behaviour or whether they may in some circumstances hinder it.

2. Some operational definitions

(a) Complex

Here, we mean ‘complex’ to be a situation in which many factors go into the decision-making process at two levels: (i) the number of different factors taken into account and (ii) the extent of interactions of these factors. The relationships can be disproportional, e.g. at critical levels, a small change can make a big difference. In general, it is not possible to predict the outcome even if the factors going into the situation are known, because recursive causality exists.

(b) Cognition, emotion and impulsivity

We use ‘cognition’ as an umbrella term that starts with the acquisition of information from the environment and encompasses information processing, holding beliefs, desires and knowledge and some form of internal representation of this information. Cognitive mechanisms include elementary processes comprising perception, attention, action, memory, problem solving, concept formation, categorization and generalization (Shettleworth 2009, 2010). Here, we focus on cognitive processes that specifically aim at dealing with the social environment.

Emotion is an umbrella term for any internal state that makes certain behaviours more likely, including things such as anger, fear, frustration, pleasure, joy or euphoria. Emotions are valenced responses to internal and/or external stimuli mediated by different, though not necessarily exclusive, brain regions (e.g. Bechara et al. 2000; Damasio et al. 2000; LeDoux 2000; Bechara 2004; Berlin et al. 2004). Previous psychological and neuroscientific research reflected the long tradition of Western philosophy in viewing emotion and cognition as if they were separate processes. Today, this view has been transformed into one which emphasizes the bidirectional emotion–cognition interaction pathway (Mega & Cummings 1994) that may be necessary for adaptive functioning (e.g. Dolan 2002; Ochsner & Gross 2005; Ochsner & Phelps 2007). For example, emotional arousal has been associated with improved long-term memory causing an unusually high degree of detailed recall (Cahill et al. 1996; Roozendaal et al. 1996). Also, emotions exert a strong influence on reasoning and guide processes of decision making (Bechara 2004) in ways that are neither well understood nor systematically researched (Dolan 2002).

(c) Information and memory

Information and memory are terms with many meanings depending on context and discipline. In this paper, we use information loosely as environmental input to an organism and consider memory to be the storage of such information. This storage can have various substrates, including B cells in the immune system or synapses and neural circuits. The neural substrate comprises three storage systems: sensory memory, working memory and reference memory (or long-term memory). Experimental studies into memory indicate that these systems interact over the course of learning (e.g. Baddeley & Hitch 1974; Thompson & Kim 1996; Baddeley 2000; McGaugh 2000; LeDoux 2000; Kim & Baxter 2001). For our purposes, it may be less useful to talk about which memory system may support cognitive approaches to cooperation and more informative to discuss how memory in general may do so.

Memory may help in two ways. First, individuals may remember information about specific events and partners, such as what goods or services they gave to a partner and which ones they are owed in return. Such ‘book-keeping’ closely resembles what has been called calculated reciprocity, in which individuals must remember which goods or services they have received and return an appropriately equivalent good or service at a later time (Brosnan & de Waal 2002; de Waal & Brosnan 2006). Memory may also encode less specific information. Considering reciprocity again, it may be sufficient for animals (including people) to remember whether the partner cooperated or defected on a previous move, or even to simply encode a positive or negative ‘tag’ (e.g. due to respective emotions) towards the partner. This sort of reciprocity, which has been called attitudinal reciprocity (de Waal 2000; see also de Waal & Suchak 2010), may be easier to encode due to a smaller memory load, yet sufficient to yield outcomes which are beneficial to the actor.
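
To make the difference in memory load concrete, the following sketch (in Python; purely illustrative, with commodity names, update rule and parameter values chosen by us rather than taken from the cited studies) contrasts calculated reciprocity, which keeps a per-partner, per-commodity ledger, with attitudinal reciprocity, which stores a single valence tag per partner.

```python
# Illustrative sketch only: two ways an individual might store social information.
from collections import defaultdict

class CalculatedReciprocity:
    """'Book-keeping': remember what was given and received, per partner and commodity."""
    def __init__(self):
        # ledger[partner][commodity] = net balance (received minus given)
        self.ledger = defaultdict(lambda: defaultdict(float))

    def record(self, partner, commodity, received, given):
        self.ledger[partner][commodity] += received - given

    def is_owed_by(self, partner, commodity):
        # I have given more of this commodity than I received back
        return self.ledger[partner][commodity] < 0

class AttitudinalReciprocity:
    """A single positive/negative 'tag' per partner, nudged by each interaction."""
    def __init__(self, learning_rate=0.5):
        self.attitude = defaultdict(float)   # one number per partner
        self.learning_rate = learning_rate

    def record(self, partner, experience):   # experience scaled to [-1, +1]
        self.attitude[partner] += self.learning_rate * (experience - self.attitude[partner])

    def willing_to_help(self, partner):
        return self.attitude[partner] > 0

# The calculated version tracks every commodity with every partner; the
# attitudinal version stores one value per partner, a much smaller memory load.
calc, att = CalculatedReciprocity(), AttitudinalReciprocity()
calc.record("partner_A", "grooming", received=0.0, given=1.0)
att.record("partner_A", experience=0.8)
print(calc.is_owed_by("partner_A", "grooming"), att.willing_to_help("partner_A"))
```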

3. Coordination and cognition

Some scientists define cooperation not by its fitness consequences but more operationally as ‘acting together’ (Noë 2006; Taborsky 2007). This definition emphasizes a general perception that cooperation should include an aspect of coordination between partners. The notion of coordinated action is particularly well supported if active recruitment can be documented. Recruitment is widespread in animals, ranging from insects to various vertebrates (insects: Hölldobler & Wilson 2009; fish: Bshary et al. 2006; mammals: de Waal & van Hooff 1981; Gouzoules et al. 1984; birds: Bugnyar & Kotrschal 2001). Much research on cooperation in biology focuses on task sharing and division of labour in cooperatively breeding species, another form of coordination. Eusocial species represent the pinnacle of this organizational complexity, yet the processes that achieve this complexity are not cognitive: in the Hymenoptera, food quality received at the larval stage determines whether a female will become a worker or a queen. Furthermore, different castes with different functions are either based on anatomical specialization (in ants and termites) or on age-related task specializations (bees). Nevertheless, the efficiency of colonies is greatly enhanced by sophisticated communication that allows efficient exploitation of food sources, location of suitable new sites for nesting and communal defence against predators and competitors.

It is important to note that in the most extreme case, coordination of individuals ‘working together’ might be achieved without these individuals paying any attention at all to the others and the state of the other's work. For example, social Stegodyphus spiders catch and handle prey too large for one single individual (Ward & Enders 1985; Wickler & Seibt 1993). A family of individuals approaches the prey independently in reaction to the net vibrations caused by the prey. Then together, they pull the prey victim to their communal nest for consumption. This concerted cooperative effort results from sharing the same nest and consequently pulling in the same home direction; it does not require any communication or monitoring of partners.

In contrast to the spider example, cooperative hunting in vertebrates is the prime example where coordination has been linked to cognition. Boesch & Boesch (1989) proposed that collaborative hunting reflects the cognitive abilities of the species or population in question. They defined four levels of complexity of coordination during hunts: (i) similarity, in which all hunters concentrate similar actions on the same prey, but without any spatial or time relation between them; (ii) synchrony, in which each hunter concentrates similar actions on the same prey and tries to relate in time to each other's action; (iii) coordination, in which each hunter concentrates on the same prey and tries to relate in time and space to the others' actions; and (iv) collaboration, in which hunters perform different complementary actions directed to the same prey (e.g. encirclement). Collaboration has been observed in only a handful of species: chimpanzees, dolphins, orcas, lions and Harris's hawks (Bednarz 1988; Boesch & Boesch 1989; Stander 1992; Baird 2000; Connor 2000; Gazda et al. 2005). In contrast to task sharing in eusocial species, the coordination in intraspecific collaborative hunting is rather complex: individuals must learn to perform variable behaviours and to keep track of others' actions and outcomes for their efforts to be successful. An individual's best behavioural option depends on what other group members are doing, which will vary from one hunt to the next and even within the same hunt from moment to moment.

Currently, evidence for intraspecific collaboration is restricted to mammals and birds. In fishes, the most complex form of intraspecific cooperative hunting described so far occurs in mormyrid fishes, in which individuals swim in formation while searching for prey (pack hunting; Arnegard & Carlson 2005). Full collaboration in fishes has been observed only in the interspecific context, where predator species with complementary hunting strategies team up and gain from the effect of their joint actions on the prey (Bshary et al. 2006; case observations in Lukoschek & McCormick 2002). With respect to cognition, however, such interspecific collaborative hunting seems to be more similar to collaboration in eusocial species, as each partner does what it has been selected to do.

Finally, we note that coordination between group members concerning activity or movement patterns could be a very interesting topic to link cooperation and conflict with cognition, since coordination could become more difficult as group size increases. While much research focuses on functional aspects of decision-making processes (Conradt & List 2009), groups also function as information centres about ephemeral food patches. In this case, the question arises of what social skills are necessary to efficiently exploit the knowledge of others (Emery et al. 2004; Bugnyar & Heinrich 2005, 2006; Dally et al. 2006; Clayton et al. 2007; see also Earley 2010).

4. Strategic social behaviour and cognition

We use the term ‘strategic’ if an individual may choose between different options from its behavioural repertoire where (i) the actual choice has consequences for the payoffs of both the actor and the partner(s) with whom it is currently interacting and (ii) the optimal choice depends on the partner(s)' strategy and corresponding behaviour. In other words, the individual can choose between different levels of cooperative behaviour or between cooperative behaviour and cheating. Thus, the appropriate choice of behaviour must be based on some sort of information, be it information about an individual's internal state, specificities of the current situation or about its own or its partner's past behaviour.

Not all cases of cooperation or mutualism are strategic in this sense. Instead, plenty of examples exist where each individual performs self-serving behaviour that benefits its partner(s) as a by-product (termed ‘by-product mutualism’; Brown 1983). Cooperative hunting is a prime example because the benefits of cooperation can only be achieved if individuals act together. Hence the best response of an individual to its partner hunting is to join the hunt. In golden jackals, for example, Lamprecht (1978) observed a sixfold increase in hunting success in pairs compared with singleton hunts, making cheating an unprofitable action. As a consequence, by-product mutualisms do not require strategic behaviour. We do not consider these cases further.

(a) Strategic behaviour without a brain

Cooperation and deception based on strategic behaviour can sometimes occur even though partners lack a brain. A well-studied system is the interspecific mutualism between leguminous plants and Rhizobia (Kiers et al. 2003). In fact, this system does not require learning, memory or individual recognition. The plant makes the initial investment, providing shelter (nodules) and carbohydrates to the bacteria, which then fix nitrogen for the plant (the second investment). However, the bacteria differ in their ability to fix nitrogen and, since fixing nitrogen requires energy, lines that fix less nitrogen save energy due to their (genetically determined) cheating. Thus, the plant invests in a structure where benefits are delayed or even not reciprocated if Rhizobia enter the nodule and then fix little or no nitrogen. However, a plant interacts simultaneously with many bacteria of different lines spread over the nodules in its entire root system, which gives it some recourse.

Experiments demonstrate that plants can assess the quantity of nitrogen fixation in different parts of the root system and respond appropriately: in areas with a lot of nitrogen fixation, the plant grows new roots, whereas plants reduce root growth in areas with low nitrogen fixation (Kiers et al. 2003). The results demonstrate a plant's ability to detect cheaters and to sanction them. Plants are able to solve the various problems because (i) the initial investment in the construction of nodules is based on a genetic programme, so the initial investment in the interaction is guaranteed, (ii) the assessment of partner quality is based on the evaluation of current physiological activity (nitrogen fixation) in each nodule, (iii) partner ‘recognition’ is possible based on location, and (iv) the response to both cooperators and cheaters is immediately self-serving: the plant grows roots where its gains are high, which benefits cooperators and sanctions cheaters.

It is interesting to turn the point around and to ask what the plant does not need to do to gain benefits. First, the plant is not hindered by the initial cost of growing a nodule, nor does it need to assess initially whether the current investment of growing nodules will be fully compensated, since growing a nodule is genetically determined and, hence, an unconditional action. However, if the decision to grow nodules was based on learning and memory, it would be cognitively demanding to associate current costs with delayed benefits. Second, the plant does not need any long-term memory for its decision regarding where to grow new nodules, but can respond to the current situation. A change in the local composition of Rhizobia strains automatically leads to a change in local nitrogen fixation, which automatically leads to a change in local nodule growth rate by the plant. Third, the plant does not have to recognize bacteria partners as individuals because the bacteria's movements are restrained. The system seems to work, even though it is not perfectly discriminative: the plant's decision based on location means that cooperative strains will also be sanctioned if they share a nodule with cheating strains. Finally, the controlling action of the plant that reduces the fitness of cheating strains (growing new roots, and hence nodules, as a function of local nitrogen fixation) does not decrease immediate payoffs. Thus, the success of the controlling action does not depend on cheaters behaving more cooperatively in the future and so does not require any ability to plan for the future or mechanisms to get around the problem of temporal discounting.
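
A minimal sketch of the kind of memory-free decision rule described above may be useful here (this is our own toy formalization, not the model of Kiers et al. 2003): root growth is allocated in proportion to the nitrogen currently fixed in each region of the root system, so regions hosting cooperative strains are rewarded and regions hosting cheaters are sanctioned without any record of past interactions or recognition of individual partners.

```python
# Toy model: allocate a fixed root-growth budget in proportion to the nitrogen
# currently fixed in each nodule region. No memory and no partner recognition:
# the rule acts only on the current local state.

def allocate_root_growth(nitrogen_fixed_per_region, growth_budget=1.0):
    """nitrogen_fixed_per_region maps a region to its current nitrogen fixation rate."""
    total = sum(nitrogen_fixed_per_region.values())
    if total == 0:
        # nobody is fixing nitrogen: no investment anywhere
        return {region: 0.0 for region in nitrogen_fixed_per_region}
    return {region: growth_budget * rate / total
            for region, rate in nitrogen_fixed_per_region.items()}

# Region 'a' hosts mostly cooperative strains, region 'b' mostly cheaters.
print(allocate_root_growth({"a": 0.9, "b": 0.1}))
# If the strain composition of region 'b' changes, the allocation changes automatically.
print(allocate_root_growth({"a": 0.9, "b": 0.6}))
```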

The lack of these features is interesting because learning and memory, individual recognition and planning for the future are assumed to require increasingly complex nervous systems. As we illustrated with the plant–Rhizobia example above, plants and bacteria do not need these abilities for their mutualistic interactions. Nevertheless, simple forms of memory exist in bacteria (e.g. Casadesus & D'Ari 2002) and plants (e.g. Thellier et al. 2000; Volkov et al. 2008). Although these kinds of non-neuronal memories seem to be rather constrained in extent and variety, it may still turn out that they are used for strategic decision making in the context of cooperation.

(b) Strategic behaviour demanding higher cognitive abilities

In this part, we will explore factors which are likely to make strategic behaviour more complex, and hence demand at least some cognition.

(i) Partner mobility

The majority of animals are mobile. Strategic behaviour becomes more complex when partners are mobile for two reasons. First, mobility means that there may be both spatial and temporal separation between interactions. Therefore, any appropriate decision making that includes information about past interactions has to be based on memory. Second, mobility often causes encounters with several potential partners, requiring individual recognition and memory to choose appropriate partners and determine the appropriate behaviour (e.g. a biological market; Noë et al. 1991). Note that individual recognition is not automatically linked to mobility. In many cases of mutualism, only one partner is mobile and may remember the location of sessile partners rather than recognize partners as individuals. For instance, insect pollinators (who are mobile) could avoid deceptive plants that do not produce nectar based on spatial memory (avoidance of location where the non-mobile plant is), and further generalization of negative experiences may allow experienced pollinators to avoid cheater species entirely (Gigord et al. 2002).

One can imagine scenarios where even if both partners are mobile they could potentially use location rather than individual recognition for decision making. However, most situations would favour individual recognition of mobile partners. Indeed, this ability is widespread among vertebrates, and there is increasing evidence that some invertebrate species have this ability as well. Individual recognition has been demonstrated in paper wasps (Tibbetts 2002), the burying beetle Nicrophorus vespilloides (Steiger et al. 2008) and the lobster Homarus americanus (Karavanich & Atema 1998). Thus, while individual recognition of mobile partners seems to be a prerequisite for cognitively complex cooperation, it is not the case that only highly encephalized organisms are capable of partner recognition.

(ii) Delays and the problems of cooperation

While some decisions can be made on information which is currently available, in other cases individuals must make decisions based on information from the past or expectations of the future, in particular if individuals are mobile (see §4b(i)). Current information is easier to deal with not only because it does not require memory, but also because a lack of cooperation from the partner will immediately affect the internal state of an individual, affecting behaviour (e.g. a lack of fixed nitrogen could cause a lack of root growth in the vicinity). In contrast, if there are discrete interactions with a period of time between them (starting from a fraction of a second; see Frey & Morris 1997; Dudai 2009), it becomes increasingly unlikely that a partner's cheating during the last interaction will affect an actor's current state. Therefore, the individual has to base its behavioural decision on some sort of memory, either an explicit calculated memory of the interaction or an emotional reaction that is generated by the interaction (Brosnan & de Waal 2002).

Moreover, individuals must overcome the issue of delay of gratification and be capable of ‘paying’ now for a future benefit. This is a problem plants do not have when growing nodules or producing nectar because genetic programmes cause such investments. However, in organisms with cognition, if there is a time delay between investment and compensation, there is the risk that they will discount goods and services, potentially at different rates depending on an individual's role (e.g. the one owed versus the one owing). In fact, we know that animals (and humans) strongly discount the future, often preferring smaller immediate rewards to larger, temporally distant rewards (e.g. Stevens & Hauser 2004). Humans can plan for the future, but there is more debate about other species (see §4c(iii) for more discussion; see also Melis & Semmann (2010) for a discussion of how human cooperation differs from that of other species).
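
A small numerical illustration of this problem (reward sizes and discount rates are arbitrary, and exponential discounting is used purely for simplicity; this is not a model from the cited work): a steep discounter values a large delayed reward at less than a small immediate one, whereas a shallow discounter does not.

```python
import math

def discounted_value(reward, delay, discount_rate):
    """Exponential discounting: subjective value = reward * exp(-rate * delay)."""
    return reward * math.exp(-discount_rate * delay)

small_now       = discounted_value(reward=1.0, delay=0.0,  discount_rate=0.5)
large_later     = discounted_value(reward=3.0, delay=10.0, discount_rate=0.5)   # steep discounter
large_later_low = discounted_value(reward=3.0, delay=10.0, discount_rate=0.05)  # shallow discounter

print(small_now, large_later, large_later_low)
# The steep discounter values the delayed reward of 3 at about 0.02 and takes the
# immediate 1; the shallow discounter values it at about 1.82 and is willing to wait.
```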

(iii) Conditioning and temporal delays

Bacteria and plants may use innate decision rules for their behaviour/physiology, but the evolution of information storage and calculation in the brain allows individuals to change behaviour based on learning. The basic forms of associative learning, Pavlovian and operant conditioning, have been demonstrated across a wide range of animal taxa (Wynne 2001). In Pavlovian (classical) conditioning, an animal learns to associate stimuli with each other, while in operant (instrumental) conditioning, an animal learns to associate its own behaviour with outcomes in the environment. If the outcomes are favourable, this positive reinforcement will increase the probability that the animal will perform the behaviour again in the future. In contrast, if the changes are negative, the probability of showing the behaviour again will decrease. Associative learning mechanisms appear to be the most widely used learning mechanisms in animals (Mackintosh 1974, 1983; Wynne 2001), and researchers in animal cognition have found it challenging to properly demonstrate more complex cognitive abilities in animals. While there is evidence for other mechanisms, including insight learning, planning, perspective taking, experience projection and mental time travelling, the evidence is restricted to a few species and does not preclude associative learning in addition to these more complex mechanisms. Therefore, it is important to evaluate how associative learning may affect an animal's ability to cooperate or to deceive.

Empirical studies on associative learning have revealed the overwhelming importance of temporal contingency for efficient associative learning. With the known exception of food poisoning (e.g. conditioned taste aversion), stimulus and response must be closely linked in time, ranging from less than a second in some species up to minutes in others. Owing to this intimate link in time, any behaviour has to provide benefits quickly; without a positive or negative reinforcer, learning will not occur or extinction will eventually eliminate the appearance of the behaviour. The inherent constraints of associative learning mechanisms may explain why animals often do not achieve stable cooperation even though the conditions/payoffs favour cooperative solutions in the long term.
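
The constraint can be illustrated with a toy learning model (our own sketch: a Rescorla-Wagner-style update in which the effective reinforcement decays exponentially with the delay between the behaviour and its consequence; parameter values are arbitrary). With a long delay the association barely forms, so an investment whose benefits arrive late is never reinforced.

```python
import math

def learned_association(reward, delay, trials=50, alpha=0.3, trace_decay=1.0):
    """Rescorla-Wagner-like update in which the reinforcer is discounted by the
    delay between behaviour and outcome (a toy eligibility trace)."""
    effective_reward = reward * math.exp(-trace_decay * delay)
    strength = 0.0
    for _ in range(trials):
        strength += alpha * (effective_reward - strength)
    return strength

print(learned_association(reward=1.0, delay=0.5))   # short delay: association ~0.61
print(learned_association(reward=1.0, delay=30.0))  # long delay: association ~0.0
```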

Deception, too, may be affected by associative learning. Deceptive alarm calls by several species of birds and primates (e.g. Møller 1988; Wheeler 2009) may be due to a conditioned association by the ‘deceptive’ caller. At some point, that individual may have given a spontaneous alarm call in a non-predator context, perhaps due to the stress inherent in the situation (e.g. an attack by a dominant). If this call resulted in the cessation of the stressful situation (e.g. the dominant left), it would have created a strong association in the mind of the caller, leading to future false alarm calls. This is functional deception, but does not require any explicit understanding on the part of the caller of how the call affected others' behaviour. Moreover, conditioning may lead the other individuals in the group to learn whose alarm calls are legitimate and whose are not. If one individual's alarm calls routinely occur outside the context of actual danger (i.e. there is no reinforcer in the form of a predator), others may cease attending to its calls. Thus, conditioning may paradoxically both produce deception and provide a mechanism for avoiding being deceived. However, neither will occur if the time period between stimulus and response is too long.

Delays between actions and consequences may represent situations in which the evolution of a large brain is important; species with larger brains may be better equipped to deal with longer delays in the conditioning process and/or to refrain from impulsive choices. Although some species can postpone behaviours that yield small immediate benefits for only a very short time (in the order of seconds) in favour of delayed larger benefits, some apes can postpone for much longer (Beran & Evans 2006; Dufour et al. 2007). In cooperative situations, delays can range from 30–60 min, as in cleaning interactions (Bshary & Grutter 2002; Bshary & Schäffer 2002), to a day or more, as in blood provisioning in vampire bats (Wilkinson 1984). However, even this may not be long enough for all situations, essentially putting a hard cap on the ability of learning to influence cooperative behaviour.

(c) Strategic behaviours and temporal delays

(i) Cooperation and the problem of investments

Of the many concepts that can explain stable cooperative behaviour, only two assume that cooperative behaviour is better than cheating by default (Bshary & Bronstein 2004; see table 1). In by-product mutualism (Brown 1983), benefits to partner(s) are the result of immediately self-serving decisions. For example, individuals may self-servingly decide to remain in the vicinity of others in order to reduce predation risk, which as a by-product benefits all group members in addition to the actor (selfish herd; Hamilton 1971). Delayed benefits may still be predictable, as in positive pseudoreciprocity (Connor 1986). According to this concept, stable cooperation may be achieved if an investment by one partner enables the recipient to perform a self-serving behaviour that nonetheless benefits the investor as a by-product. For example, some fungus-harvesting ants provide services to the fungi, which allow the fungi to (self-servingly) grow and reproduce, which in turn benefits the ants because they harvest fungi for food (Mueller et al. 2005).

Table 1. Concepts which can explain cooperative behaviour.

In all other concepts of cooperation, there is a temptation to avoid investment in the cooperative behaviour, which constitutes cheating. For example, flowering plants would do best if pollinators provided their service without being rewarded (Brandenburg et al. 2009). Therefore, individuals must be able to detect any cheating from the partner and to respond in a way that increases the cost of cheating so that it does not yield net benefits to the cheater. We refer to these responses as partner control mechanisms. Effective partner control mechanisms are responses to cheating that reduce the cheater's payoff to a level that puts cheating under negative selection, and so lead to stable cooperation either over an individual's lifetime or over evolutionary time.

There are three basic situations in which control mechanisms encourage investments (Bshary & Bergmüller 2008). First, investment pays if by investing the actor avoids a self-serving response by its partner that would, as a by-product, reduce the payoff of a cheater (‘negative pseudoreciprocity’). For example, if a pollinator lays too many eggs in a yucca flower, the plant aborts the resulting fruit. Although this is because the larvae would eat all the seeds, it serves to encourage cooperation by the pollinator (Pellmyr & Huth 1994). Similarly, reef fish will visit another cleaner for their next inspection if their current cleaner wrasse cheats (Bshary & Schäffer 2002). Second, investment pays if it leads to return investments (‘positive reciprocity’). Tit-for-tat-like reciprocity is based on such mutual rewards (Axelrod & Hamilton 1981). Third, investment pays if by investing the actor avoids a costly response aimed at reducing a cheater's payoff (‘negative reciprocity’). Punishment (Clutton-Brock & Parker 1995) and policing (Ratnieks 1988) are control mechanisms based on negative reciprocity (see Gächter et al. (2010) and Jensen (2010) for further discussions of punishment and spite).

Appropriate strategies can in principle be encoded genetically and performed in response to key stimuli, as argued by Axelrod & Hamilton (1981). However, any species with a nervous system may learn about the behaviour of others, as well as about the consequences of their own behaviour, and adapt accordingly (Wynne 2001). Cooperative solutions may be made more likely by mechanisms for overcoming immediate costs, such as empathy or an innate tendency towards helping (see de Waal & Suchak 2010; Jaeggi et al. 2010). Alternatively, individuals may explore a variety of behaviours, some of which will be cooperative (McNamara & Leimar 2010). With time, their behavioural decisions will be based on learning what rules yield high payoffs. In this scenario, cooperative behaviour is likely to emerge if the partner uses negative pseudoreciprocity, or positive or negative reciprocity as control mechanisms. However, the corresponding controlling behaviours differ with respect to the ease with which they are learned. In negative pseudoreciprocity, the best option for the controlling individual is to cooperate as long as the partner cooperates (plants directing resources to the fruit so that it develops, clients returning to a cooperative cleaner wrasse), while stopping the interaction with a cheater is immediately self-serving. Therefore, individuals that explore a variety of behaviours could easily learn with associative learning to both cooperate and control the partner's behaviour. Under these conditions, stable cooperation appears to be achieved relatively easily.

In positive reciprocity, the best behaviour for the controlling individual is to stop investing if the partner cheats, which is immediately self-serving (defecting in an iterated Prisoner's Dilemma). Thus, the control mechanism can easily be acquired with associative learning. However, because cheating yields larger short-term payoffs than cooperating, the system is prone to end up in mutual defection. Finally, in negative reciprocity, the controlling behaviour is to punish a cheater. By definition (Clutton-Brock & Parker 1995), punishment reduces immediate payoffs of both punisher and victim. Thus, while punishment may be useful in promoting future cooperative behaviour in cheaters, it suffers from problems similar to those of investments: if the behaviour is not part of a genetic strategy, operant conditioning will disfavour punishment because of its immediate costs to the actor. An additional problem for negative reciprocity is that the incentives are negative reinforcers, so cooperation can only be learned after punishment for failure to cooperate. Thus, negative reciprocity seems to be particularly cognitively demanding compared with other concepts of cooperation. Nevertheless, it has been demonstrated in marine cleaning mutualism (Bshary & Grutter 2005; Raihani et al. 2010), suggesting that at least vertebrates show this control mechanism.
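
The logic of conditional investment and its vulnerability can be made concrete with a short simulation of an iterated Prisoner's Dilemma (a standard textbook game with payoffs T = 5 > R = 3 > P = 1 > S = 0; the strategies and round number below are illustrative choices of ours, not taken from the cited papers). Unconditional investment is exploited, tit-for-tat-like positive reciprocity stops investing after the partner cheats, and only mutual conditional cooperation yields the high long-term payoff.

```python
# Iterated Prisoner's Dilemma with standard payoffs (T=5 > R=3 > P=1 > S=0).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    return "C" if not their_history else their_history[-1]   # copy partner's last move

def always_defect(my_history, their_history):
    return "D"

def always_cooperate(my_history, their_history):
    return "C"

def play(strategy_a, strategy_b, rounds=20):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        act_a = strategy_a(hist_a, hist_b)
        act_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(act_a, act_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(act_a)
        hist_b.append(act_b)
    return score_a, score_b

print(play(always_cooperate, always_defect))  # (0, 100): unconditional investment is exploited
print(play(tit_for_tat, always_defect))       # (19, 24): exploited once, then mutual defection
print(play(tit_for_tat, tit_for_tat))         # (60, 60): conditional investment sustains cooperation
```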

(ii) Subjective rewards and emotions

Temporal discounting seems to work strongly against long-term investments; thus, a more convenient solution could be the evolution of mechanisms that make an animal subjectively perceive an investment as a benefit, hereafter called a subjective reward. Subjective rewards should be linked to investments where the likelihood of future benefits is high. The ‘objective’ cost of the investment would be perceived as a ‘subjective’ benefit. An excellent example is humans, who punish transgressors not to change their behaviour (they cannot in these experiments), but because it makes them feel good (de Quervain et al. 2004; Singer et al. 2006).

If the point of subjective payoffs is to achieve prosocial outcomes, as has been argued recently, then this may be present in other species as well (de Waal et al. 2008; de Waal & Suchak 2010; Jaeggi et al. 2010). Such subjective payoffs may also help individuals avoid situations in which cooperation leads to outcomes which, while positive, are insufficient relative to a partner's (Brosnan & de Waal 2003; Brosnan, Freeman & de Waal 2006; Brosnan 2008). Many subjective rewards are based on friendship, a voluntary, long-term affiliative social relationship between two or more individuals (Wasilewski 2003). Such long-term relationships may provide a solution to the problems deriving from temporal discounting by providing a ‘safe’ environment (Wickler 1976) in which to cooperate, based not on the memory of each past interaction but on the memory of the relationship quality. An open question is whether one needs a large brain to build subjective payoffs or whether these may also occur in less encephalized species. For example, hormones may be directly responsible for or affect cognitive processes that lead to investments (Soares et al. 2010).

(iii) Planning for the future

Mental time travelling, the ability to mentally re-live personal past experiences and to pre-live future events (Suddendorf & Corballis 1997; Boyer 2008), may provide another means to overcome temporal discounting. This is because individuals can voluntarily construct possible future events in their mind, and even incorporate emotions associated with the outcome of the imagined future scenarios (Ainslie 2007). In this way, individuals may ‘experience’ (‘pre-live’) future outcomes of current options in the present and compare them with short-term rewards side by side. Once a species has evolved the cognitive components necessary for voluntary access to remembered facts (semantic memory) and even for re-experiencing episodes from its own past (episodic memory; Tulving 1972, 1983, 2005), it may possess the prerequisites for investments based on combining recent events with information stored in long-term memory to predict future consequences (Suddendorf & Corballis 1997). Some argue that animals are caught in the present, unable to consider activities beyond those for which cues (internal or external) are immediately present (Roberts 2002; Suddendorf & Busby 2003; Suddendorf & Corballis 2007). If this were the case in the strict sense, then animals could not mentally value future benefits, since the future would not exist for them. However, comparative psychologists make a good case for future planning in species like food-caching western scrub-jays (e.g. Raby et al. 2007; Raby & Clayton 2009; see also Clayton et al. 2008) and apes (Call 2007; Osvath & Osvath 2008).

Nevertheless, mental time travelling does not always resolve the problem of discounting the future. The time window to look into the future, also called the ‘future time perspective’ (Fellows & Farah 2005), contributes to determining what priorities will be set and what anticipated outcomes and reinforcers (both positive and negative) will be considered. This temporal framework might vary between species (Fellows & Farah 2005) and hence is likely to affect their ability to cooperate. One way to increase acceptable time delays is to manipulate emotions/physiological states. Various forms of stress, such as anxiety and boredom, have been related to time perception (Hancock & Weaver 2005). For instance, primates can counteract impulsivity through self-distraction, which may function both to occupy the individual and to lengthen acceptable delays in gratification (Evans & Beran 2007; Heilbronner & Platt 2007).

5. Cooperation in networks/groups

(a) The use of public information

One additional factor, mentioned previously, which may make cooperation more complex is the use of ‘public’ information from observation rather than ‘private’ information from personal experience (for a detailed discussion surrounding this topic, see Earley 2010). Public information from eavesdropping has been found in many different species, mostly vertebrates (McGregor 2005). The concepts of indirect reciprocity based on image scoring (Alexander 1987; Nowak & Sigmund 1998) and indirect pseudoreciprocity based on image scoring (‘social prestige’, Zahavi 1995; Roberts 1998) deal with public information in the context of cooperation. The former concept has been demonstrated only in humans so far (Wedekind & Milinski 2000), while the latter can be found in cleaning mutualism (Bshary & Grutter 2006).
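
A highly simplified sketch of the image-scoring idea may help (the scoring rule, threshold and payoff values below are illustrative assumptions of ours, not the published model of Nowak & Sigmund 1998): discriminators help only recipients whose image score is good, helping raises the donor's own score and refusing lowers it, so an unconditional defector quickly loses its image and stops receiving help.

```python
import random

random.seed(0)
COST, BENEFIT, THRESHOLD = 1.0, 3.0, 0

# 'x' is an unconditional defector; 'a', 'b' and 'c' are image-scoring discriminators.
defectors = {"x"}
scores = {"x": 0, "a": 0, "b": 0, "c": 0}
payoffs = {name: 0.0 for name in scores}

for _ in range(1000):
    donor, recipient = random.sample(list(scores), 2)
    helps = donor not in defectors and scores[recipient] >= THRESHOLD
    if helps:
        payoffs[donor] -= COST
        payoffs[recipient] += BENEFIT
        scores[donor] += 1            # bystanders record the good deed
    else:
        scores[donor] -= 1            # refusing help damages the donor's image

# The defector's image collapses, so it is rarely helped and ends up with a far
# lower payoff than the discriminators, who keep helping one another.
print(scores["x"], round(payoffs["x"], 1))
print({name: round(value, 1) for name, value in payoffs.items() if name != "x"})
```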

Using public information for behavioural decisions requires certain cognitive abilities. First, the senses must be sufficiently developed to acquire the information. Bystanders have to recognize individuals and acquire information about what they are doing. Furthermore, bystanders have to evaluate the behaviour of others in an indirect way: the observed behaviour does not influence the payoffs of bystanders. Thus, bystanders have to deduce from the effects of an individual's behaviour on third parties how that individual would affect their own payoffs, which may be either difficult or misleading (Brosnan et al. 2003). Based on this evaluation, bystanders then have to decide whether they should seek or avoid interactions with this individual and, if an interaction takes place, whether or not to cooperate. This has the potential to make decision rules for cooperation considerably more complex, in part because the use of public information selects for behavioural changes among the observed individuals (e.g. audience effects). For example, cleaner wrasses behave more cooperatively if they are observed by non-resident clients (Bshary & Grutter 2006), which devalues the quality of public information. Models show that the possibility of collecting public information may cause not only an increase in cooperation, but potentially also in aggression (Johnstone & Bshary 2004) and even allow the evolution of tactical deception (Johnstone & Bshary 2007).

There are advantages if individuals are able to understand and use public information. In the case of conditional cooperation, using public information may allow better predictions because the information may be more recent than personal experience. Public information may be particularly relevant for an individual's ability to choose ideal cooperation partners from among potential candidates. Observation of others' interactions will yield information about newly forming alliances or newly arising conflicts that can be counteracted or reinforced with strategic behaviour. This may lead to the possibility of changing partners. At the same time, observed individuals should hide their intentions if detrimental.

(b) N-player games

In our evaluation of the factors that cause an increase in cognitive requirements for appropriate strategic decision making, we have until now focused on concepts for two-player interactions. It is clear that coordination increases in complexity as the number of simultaneous partners increases. With respect to strategic decision making, it is difficult to predict how the addition of partners relates to complexity. At the most basic level, n-player cooperation either constitutes a by-product mutualism (West et al. 2007) in which every individual should contribute or an n-player Prisoner's Dilemma (‘the tragedy of the commons’, Hardin 1968) in which nobody should contribute. However, Milinski and colleagues have pointed out that group-living species will face both group and pair situations, and that behaviour in one condition may have implications for behaviour in the other. In humans, individuals that contribute to a public good raise their image score and therefore receive more help in pairwise interactions (Milinski et al. 2002). The main cognitive challenge in such cases will be that as the number of interactors increases, so does the burden on memory. Individuals have to simultaneously monitor the behaviour of all partners in group situations to respond appropriately in pairwise situations. One way around the problem of direct monitoring may be the use of gossip (Sommerfeld et al. 2008), but this requires language. We do not discuss the issue of n-player interactions in more detail, but refer the reader to Connor (2010).
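
The payoff structure behind this dilemma can be illustrated with a standard linear public goods game (the group size, endowment and multiplier below are arbitrary illustrative values): every individual is better off free-riding regardless of what others do, yet the group as a whole does best when everyone contributes.

```python
def public_goods_payoffs(contributions, endowment=10.0, multiplier=1.6):
    """Linear public goods game: contributions are pooled, multiplied and shared
    equally; each player also keeps whatever they did not contribute."""
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]

everyone_contributes = public_goods_payoffs([10, 10, 10, 10])
one_free_rider       = public_goods_payoffs([10, 10, 10, 0])
nobody_contributes   = public_goods_payoffs([0, 0, 0, 0])

print(everyone_contributes)  # [16.0, 16.0, 16.0, 16.0]
print(one_free_rider)        # [12.0, 12.0, 12.0, 22.0]: the free-rider does best individually
print(nobody_contributes)    # [10.0, 10.0, 10.0, 10.0]: universal defection leaves everyone worse off
```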

6. Conclusions

Cooperation is widespread in nature, which precludes the possibility that it always requires advanced cognitive abilities. Stable investments may be achieved with minimal cognition in bacteria and plants. A lack of mobility combined with simple evaluation of current levels of cooperation by the partner and controlling behavioural responses to cheating that are immediately self-serving allow stable investments with minimal information processing. Moreover, these investments may be based on genetic programmes, further reducing complications.

Mobility and discrete interactions make cooperation more complex, and learning and memory become prerequisites for decision rules that allow individuals to invest without being exploited by cheaters. Simple learning mechanisms such as associative learning most likely hinder the establishment of stable cooperation with delayed outcomes since animals would learn to avoid investments because they fail to associate the investment with delayed benefits. Additional cognitive abilities are required to allow individuals to develop investment behaviours. Subjective rewards, empathy, friendship, future planning or other mechanisms may all allow cooperation where associative learning would not. More generally, the idea that associative learning leads to the maximization of immediate payoffs may explain the perception that complex societies can best be achieved either without a brain due to kin selection, or with a very large brain due to mutual investments.

Acknowledgements

We thank Frans de Waal and Peter di Scioli for useful comments on an earlier draft of the manuscript. Funding was provided to S.F.B. by a National Science Foundation Human and Social Dynamics Grant (SES 0729244) and an NSF CAREER Award (SES 0847351), to R.B. by the Swiss Science Foundation and to L.H.S. by NIH grant NIMH-061944.
