Royal Society Publishing

Cognitive ornithology: the evolution of avian intelligence

Nathan J Emery


Comparative psychologists interested in the evolution of intelligence have focused their attention on social primates, whereas birds tend to be used as models of associative learning. However, corvids and parrots, which have forebrains relatively the same size as apes, live in complex social groups and have a long developmental period before becoming independent, have demonstrated ape-like intelligence. Although ornithologists have documented thousands of hours observing birds in their natural habitat, they have focused their attention on avian behaviour and ecology, rather than intelligence. This review discusses recent studies of avian cognition, contrasting two different approaches: the anthropocentric approach and the adaptive specialization approach. It is argued that the most productive method is to combine the two approaches. This is discussed with respect to recent investigations of two supposedly unique aspects of human cognition: episodic memory and theory of mind. In reviewing the evidence for avian intelligence, corvids and parrots appear to be cognitively superior to other birds and in many cases even apes. This suggests that complex cognition has evolved in species with very different brains through a process of convergent evolution rather than shared ancestry, although the notion that birds and mammals may share common neural connectivity patterns is discussed.

1. Introduction

It is everywhere recognized that birds possess highly complex instinctive endowments and that their intelligence is very limited (Herrick 1924)

The quest for intelligence in non-human animals has focused traditionally on our closest living relatives, the monkeys and apes, and on other large-brained social mammals, such as cetaceans (whales and dolphins), pack-hunting carnivores and elephants. Birds have been relegated to use as models of simple associative learning. Although these latter studies have tended to focus on pigeons (Columba livia), chickens (Gallus gallus) and quail (Coturnix coturnix), there are over 9000 species of birds (Perrins 2003); some, such as the Corvidae (crows) and Psittacinae (parrots), which have forebrains that are relatively the same size as those of the great apes, live in complex social groups and have a long developmental period before becoming independent from their parents. These have been suggested as hallmarks of intelligence in primates (Humphrey 1976; Byrne & Whiten 1988). Although field ornithologists have documented thousands of hours observing birds in their natural habitat, they have restricted their interest to examples of avian behavioural ecology and ethology, rather than studies of avian intelligence (Marler 1996). This is in stark contrast to field primatologists who have conducted experiments into the cognitive aspects of tool use, social behaviour and memory of wild primates (de Waal & Tyack 2003).

Although this paper will attempt to provide an overview of what is known about the intelligence of birds as a taxon, it will rapidly become apparent that there is something special about the cognitive abilities of crows and parrots (Emery & Clayton 2004a). These two groups have consistently demonstrated intellectual skills that are qualitatively and quantitatively more sophisticated than have been demonstrated by other birds, and in many domains comparable to those of monkeys and apes (Emery 2004; Emery & Clayton 2004b). This apparent superiority may arise because the limited number of species tested in comparative psychology, such as pigeons, chickens or quail, neither have a large relative brain size nor are renowned for their intelligence, or because the processes examined in these representative species are relatively simple (e.g. classical and instrumental conditioning). However, it will be argued that certain aspects of corvid and parrot socioecology, neurobiology and life history, such as sociality, large relative forebrain size and long developmental period, are prerequisites for intelligence in birds, as they appear to be in primates.

Corvids and parrots encounter many of the same ecological problems as primates. First, corvids and parrots live in constantly variable environments. For example, many parrots live in the same neotropical regions as the arboreal primates (Forshaw 1989), whereas corvids are found throughout the world (Madge & Burn 1994). The Corvus species, in particular, survive in some of the harshest environments on earth, from the extreme cold of Alaska and Siberia, to the extreme heat of the Sahara and Mojave deserts. Second, many corvids and parrots are omnivorous, generalist foragers. Keas (Nestor notabilis), for example, are the only species of alpine parrot and eat a varied diet, which includes fat-, protein- and carbohydrate-rich foods found discarded in human settlements. Keas are also the only carnivorous parrot, and their voracious attacks on sheep and their curiosity almost led to them being hunted to extinction (Diamond & Bond 1999). Third, many species of corvids and parrots are highly social and demonstrate similar levels of social complexity to many monkeys and apes, particularly species with fission–fusion societies (Emery 2004). Fourth, corvid and parrot forebrains are very large relative to their body size, especially when the weight constraints of flying are high. Finally, corvids and parrots often have an extended developmental period before they become nutritionally independent from their parents, and have an extended life expectancy, compared to other birds.

2. No longer bird-brains…

The poor development in birds of any brain structures clearly corresponding to the cerebral cortex of mammals led to the assumption among neurologists not only that birds are primarily creatures of instinct, but also that they are very little endowed with the ability to learn. There is no doubt that this preconceived notion, based on a misconceived view of brain mechanisms, hindered the development of experimental studies on bird learning (Thorpe 1964, p. 336)

The idea that the six-layered neocortex of most mammals is the prerequisite for complex cognition still pervades popular culture. Indeed, intellectually less endowed individuals in Western society are often called ‘bird-brains’. Perhaps more surprisingly, this view is still held by many comparative psychologists and neuroscientists. One reason for this long-held, but ultimately incorrect view is the confusing terminology used to name the different regions of the avian telencephalon (forebrain). Traditionally, the names of regions in the avian cerebrum ended with the suffix ‘-striatum’, meaning derived from the basal ganglia (figure 1a). As the vertebrate basal ganglia are involved in species-specific behaviours, such as maternal care, sexual behaviour and feeding (Reiner et al. 1998), bird-brains were deemed incapable of producing flexible or intelligent behaviour. It is now known that this nomenclature is based on a fallacy; large parts of the avian forebrain are derived, not from the striatum, but from the pallium (figure 1b). Interestingly, the mammalian neocortex is also derived from the pallium (Jarvis & Consortium 2005). This places the avian forebrain in a new light, where bird behaviour may now be explained as an adaptation to solving socio-ecological problems similar to those faced by mammals, using hardware that is different to that of mammals, albeit evolved from the same structure. Pepperberg (1999) provides a useful computer analogy when comparing mammalian and avian brains: mammalian brains are like IBM-PCs, whereas avian brains are like Apple Macintoshes; the wiring and processing are different, but the resulting output (i.e. behaviour) is similar.

Figure 1

(a) Classic view of the avian telencephalon, in which the greatest proportion of the cerebrum is classified as striatal in origin (dark grey shading), compared to the smaller extent of the pallium (light grey shading). (b) Recent view of the avian telencephalon, in which the majority of the cerebrum has been reclassified as pallial in origin (light grey shading) compared to the smaller striatum (dark grey shading). Adapted from Jarvis & Consortium (2005). Abbreviations: CDL, area corticoidea dorsolateralis; E, ectostriatum (classic) or entopallium (revised); HA, hyperstriatum accessorium (classic) or hyperpallium apicale (revised); HP, hippocampal complex; IHA, interstitial nucleus of the hyperpallium intercalatum; L2, field L2; LPO, parolfactory lobe; OB, olfactory bulb.

(a) Brain size

Aside from differences in the structure of avian and mammalian forebrains, there are other gross neuroanatomical similarities, which may be important for discussing the relationship between bird-brains and intelligence. Jerison (1973) has suggested that absolute brain size may not be a useful indication of intelligence because the brain not only solves cognitive problems, but also perceives objects in the environment and subserves more basic regulatory and vegetative functions, such as controlling heart rate and breathing, which may in turn be related to body size. Brain size also correlates with body size. He, therefore, produced an index of relative brain size in which the effects of body weight were controlled (termed the encephalization quotient, EQ). EQ appears to separate animals based on their apparent cognitive skills. What is perhaps most interesting with respect to this review is that the relative size of the crow brain is the same as that of the chimpanzee brain, and in both cases much larger than predicted for their body size. The other birds represented in Jerison's figure (hummingbird, Trochilinae; and ostrich, Struthio camelus) were located on or much lower than the regression line, suggesting that their brains are either the same size as or much smaller than predicted.
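
As a rough illustration of the idea behind EQ, the sketch below divides an observed brain mass by the brain mass expected for an animal of that body mass under an allometric power law. The coefficient (0.12) and exponent (2/3) are the values commonly attributed to Jerison's mammalian baseline and are assumptions here, not figures taken from this review; the function name and the example masses are hypothetical and chosen only to show the arithmetic.

```python
def encephalization_quotient(brain_g, body_g, coeff=0.12, exponent=2 / 3):
    """Ratio of observed brain mass to the mass expected from body mass.

    coeff and exponent define the assumed allometric baseline
    expected_brain = coeff * body ** exponent (masses in grams).
    """
    if brain_g <= 0 or body_g <= 0:
        raise ValueError("masses must be positive")
    expected_brain_g = coeff * body_g ** exponent
    return brain_g / expected_brain_g


# Hypothetical masses, not measurements from any species.
example_eq = encephalization_quotient(brain_g=12.0, body_g=1000.0)
```

An EQ near 1 means the brain is about as large as the baseline predicts; values above 1 correspond to points above the regression line in a brain-body plot.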

There are many problems with this sort of comparison. First, comparing species across taxa, which live in very different environments, may not be useful, as these environments will have produced variable constraints or no constraints on the evolution of brain size. For example, aquatic animals do not have the same restrictions on body or brain weight as terrestrial animals, particularly those that fly. Second, body size may not be the most appropriate control variable (Deacon 1990). Third, total brain size may not be the most appropriate index of cognition, as this measure includes brain regions not important for cognition, such as the brainstem. Therefore, more appropriate brain regions should be used as variables, such as the isocortex in mammals or the nidopallium and mesopallium in birds. For example, when different regions of the forebrain of various bird species (nidopallium, mesopallium, etc.) are compared, using the brainstem as a control variable, there are clear differences between species, with relatively larger nidopallium and mesopallium in songbirds, and especially crows, when compared to quail and pheasant (Rehkamper et al. 1991).

There are also differences between species found within the same family. For example, a comparison of the telencephalon of Old World and New World corvids found that Old World corvids (with the exception of Clark's nutcrackers; Nucifraga columbiana) were located on or above the regression line, whereas the majority of the New World corvids were located below the regression line (figure 2). As western scrub-jays (a New World corvid; Aphelocoma californica) have a relatively small telencephalon, but have demonstrated some of the most sophisticated aspects of cognition yet described for birds (see later sections), this example represents another case where gross brain size may be a limited guide to intelligence.

Figure 2

Scatter plot comparing body size (g) with volume of telencephalon (mm³) in Old and New World corvids. The regression line is the line of best fit passing through the origin. Old World corvids (crow, rook, jackdaw, magpie, chough, red-billed blue magpie, European jay, Clark's nutcracker) are represented with black symbols. New World corvids (western scrub-jay, pinyon jay, Mexican jay) are represented with white symbols. Data taken from Basil et al. (1996) and Healy & Krebs (1992).

(b) Evolutionary ecology of the avian brain

Perhaps the best evidence that bird-brains have been adapted for cognitive processing is from comparative analyses of relative brain (or brain component) size and measures of behavioural complexity. For example, many birds develop new methods for extracting or processing foods or feed on novel foods. Anecdotal evidence of these innovations has been documented by professional and amateur ornithologists and published in a variety of specialist bird journals, such as Brit. Birds, Wilson Bulletin, Auk and Bird Study. Lefebvre and colleagues (Lefebvre et al. 1997, 2004; Timmermans et al. 2000; Lefebvre & Bolhuis 2003) have collected around 2000 of these anecdotes, grouped them by family, and correlated the frequency of anecdotes across families with either relative forebrain size or various brain components, such as nidopallium or mesopallium. Consistently, there was a significant relationship between high innovation rate and large relative brain size for corvids, parrots, and to a lesser extent, non-corvid songbirds, woodpeckers (Piculinae), hornbills (Bucerotidae), owls (Strigidae) and falcons (Falconidae). These patterns were similar when frequency of tool use (using the same method of collecting anecdotes) was also correlated with relative brain size (Lefebvre et al. 2002).

Bird-brains have also adapted to solving other types of socio-ecological problems. For example, many birds, like anthropoid primates, live in large individualized societies (Emery 2004). In such societies, group members recognize one another, have long-term relationships and track others' social relationships, and therefore require specialized neural systems to process such types of information (Humphrey 1976). In primates, there is a strong correlation between relative neocortex size and mean group size (Dunbar 1992). However, the same analysis cannot be performed on birds as there is no comparable quantitative measure for group size as this changes constantly throughout the year for many species. This issue has been circumvented by grouping species based on their social structure. Burish et al. (2004) found that relative telencephalon size was larger in ‘transactional’ species (i.e. species with individual recognition, transfer between groups and social memory) compared to solitary, covey and colonial species. An analysis performed at different levels of social structure (solitary, pair, family, small, medium and large groups), representing different avian families, found that there was no overall effect of social structure on forebrain size, but there was an effect for corvids and parrots in medium social groups (Emery 2004). Other socio-ecological variables do not appear to correlate with brain size: cooperative breeding (Iwaniuk & Arnold 2004) and play (Diamond & Bond 2003). However, the latter analysis only compared large-brained corvids and parrots, as there are few examples of play behaviour in other birds.

Perhaps the most controversial relationship between brain and behaviour in birds is that of the hippocampus and food-storing. There is now very good evidence from anatomy, lesions and electrophysiology, in all the vertebrate groups, that the hippocampus is important for spatial memory (Colombo & Broadbent 2000). This relationship is most pronounced in those species that hide and recover large numbers of food items over long periods (weeks to months), such as the Paridae (Krebs et al. 1989; Sherry et al. 1989; Hampton et al. 1995; Healy & Krebs 1996) and the Corvidae (Healy & Krebs 1992; Basil et al. 1996). However, a recent analysis using all published data from both passerine families found that such strong correlations disappeared (Brodin & Lundborg 2003). Interestingly, the addition of new data on parids and a reanalysis revealed a continental effect; European species have a significantly larger hippocampus than North American species. After controlling for this effect, the strong relationship between food-storing and the hippocampus reappears (Lucas et al. 2004).

3. Anthropocentric approach to avian cognition

One reason why the intellectual capabilities of birds have been neglected by ornithologists until the last 30 years has been a focus on the anthropocentric approach to avian cognition. This approach has centred on a limited number of species, such as pigeons, chickens and quail, which do not have the well-developed forebrains of corvids and parrots. The aim of this approach has been to examine fundamental processes of learning and cognition that are either the same in all human and non-human animals (i.e. associative learning) or abstract/relational concepts, such as number and language, which may be unique to humans. This section will review examples of the anthropocentric approach: categorization and concepts, learning sets, transitive inference, insight learning, numerical concepts and object permanence. The literature on avian decision-making (e.g. optimal foraging) is extensive and complex, and it would be difficult to do it justice within the space constraints of this review. Therefore, it will not be discussed further here. Bateson & Kacelnik (1998) is an excellent introduction to the behavioural ecology and psychology of decision-making for those readers interested in this important aspect of avian cognition.

Section 4 will review the alternative adaptive specialization approach, which suggests that species differences in cognitive processes are related to the ecological problems faced by that species, such as using tools, remembering the location of ephemeral food resources in space and time and interacting with conspecifics. Section 5 will report recent studies in corvids which have utilized information from a species' ecology and natural behaviour to answer questions about the uniqueness of certain facets of human cognition, such as theory of mind and episodic memory. It is suggested that all three approaches are essential for a comprehensive understanding of avian intelligence.

(a) Concepts and categorization

Birds are exceptionally skilled at discriminating between visual images. Such images can be categorized based on their perceptual similarities or may even be grouped together based on a human-like, abstract concept, such as same–different. Pigeons, for example, can discriminate images of aerial photographs (Lubow 1973), people (Herrnstein & Loveland 1964), pigeons (Poole & Lander 1971), trees and water (Herrnstein & Cable 1976), chairs, cars, humans and flowers (Bhatt et al. 1988) and even arbitrary stimuli, such as letters of the alphabet (Morgan et al. 1976) and the paintings of Picasso, Monet, Chagall and Van Gogh (Watanabe et al. 1995; Watanabe 2001). The classic example of this ability was discovered by Herrnstein & Loveland (1964), who found that pigeons could learn to discriminate between pictures which contained human beings in them and those that did not. All the stimuli were novel to the pigeons, the backgrounds were different, the people were either clothed or naked and the number of people in the photographs was not consistent across trials.

Pigeons can also solve simple same–different discriminations where three stimuli are presented simultaneously, two that are the same and one that is different. The subject has to respond to one of the two ‘same’ stimuli (matching task) or to the ‘different’ stimulus (oddity task). This sort of task may be solved either by responding to configural relationships between the stimuli or as a conditional discrimination (Macphail 1982); however, there is little supporting evidence that pigeons perform using this method (Cumming & Berryman 1961; Berryman et al. 1965; Carter & Eckerman 1975). The best evidence that birds can form abstract concepts such as same–different has been provided by Alex, the African grey parrot. Alex has been trained to vocally label more than 100 objects of different colours, shapes and materials. Alex can also request or refuse these objects (‘I want X’) and quantify numbers of them (2–6, Pepperberg 1999). Using this ability to investigate Alex's cognitive abilities, rather than his linguistic talents, Pepperberg has found that Alex can categorize objects based on their colour, shape and material and determine whether multiple exemplars of these properties are the same or different (Pepperberg 1987a).

(b) Rule versus rote learning

When pigeons, chickens or quail are presented with complex problems that require the application of a rule, they begin to fail. For example, when White Leghorn chickens, Bob White quail (Colinus virginianus), Yellow head parrots (Amazona ochrocephala) and Red-billed blue magpies (Urocissa occipitalis) were compared on a successive discrimination reversal task (i.e. two stimuli are discriminated based on reinforcement contingencies such that S+ leads to reward and S− leads to non-reward, then the contingencies are reversed), there were clear differences between species. The corvid outperformed the parrot, which outperformed the quail, which outperformed the chicken (Gossette et al. 1966). Another classic test which has been used to examine differences in intelligence between species (closely or distantly related) is the learning set. Briefly, an animal is presented with a pair of stimuli to discriminate, say a blue square and a red square. Choosing the red square always leads to a reward, choosing the blue square leads to non-reward, and then the next trial is initiated. The best method for solving this problem is to adopt a ‘win–stay’ strategy: if the animal's choice leads to reward, keep choosing the same stimulus; if not, respond to any other stimulus. A more complex solution is to adopt a win–stay, ‘lose–shift’ strategy, which is to keep responding to the previously rewarded stimulus, but if not rewarded, shift to the alternative stimulus. In the win–stay strategy, it is not automatically apparent that the correct course of action if the chosen stimulus was not rewarded is to shift to the opposite stimulus, as the correct response could be to withhold responding altogether. After being presented with six trials of the same problem, the stimuli are then changed to a novel set of stimuli, such as a green square and an orange square, with the green square now rewarded. 
When presented with these new stimuli, the animal does not possess knowledge of the appropriate reward contingencies and therefore can only respond at random on Trial 1. However, once the animal gains this information, it can adopt a win–stay, lose–shift strategy to choose the rewarded stimulus on Trial 2. Pigeons learn these problems very slowly, and cannot transfer to new sets of stimuli (Wilson et al. 1985), whereas corvids improve performance across trials and appear to have adopted a win–stay, lose–shift strategy (Kamil & Hunter 1970; Hunter & Kamil 1971, 1975; Kamil et al. 1973, 1977; Kamil & Mauldin 1975; Wilson et al. 1985). Another passerine species, the greater hill myna (Gracula religiosa), performed at comparable levels to blue jays (Kamil & Hunter 1970).
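
The win–stay, lose–shift rule described above can be sketched as a small simulation. This is only an illustrative model of the strategy, not the procedure or analysis of any cited study; the function names, the two-option format and the six-trial default are assumptions made for the example.

```python
import random


def win_stay_lose_shift(previous_choice, was_rewarded, options):
    """Repeat a rewarded choice; otherwise switch to the other option."""
    if was_rewarded:
        return previous_choice
    return next(o for o in options if o != previous_choice)


def run_problem(options, rewarded, n_trials=6, rng=random):
    """Return True/False outcomes for one two-choice problem."""
    choice = rng.choice(options)  # Trial 1: no information yet, so guess
    outcomes = []
    for _ in range(n_trials):
        correct = choice == rewarded
        outcomes.append(correct)
        # The outcome of this trial determines the next choice.
        choice = win_stay_lose_shift(choice, correct, options)
    return outcomes


outcomes = run_problem(("green", "orange"), rewarded="green")
```

With only two stimuli, an animal following this rule is at chance on Trial 1 of each new problem but correct from Trial 2 onwards, which is why Trial 2 performance on novel stimulus pairs is taken as the signature of a learning set.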

(c) Transitive inference

In social animals, the ability to make inferences about other individuals' relative place in a dominance hierarchy and, therefore, predict the outcome of competition should be a useful skill. This ability is called transitive inference and has been examined in squirrel monkeys, chimpanzees, rats and pigeons in the laboratory. Briefly, an animal is presented with a pair of stimuli (A and B), where response to A is rewarded, and response to B is non-rewarded. The animal is then presented with a different set of stimuli (B and C); however, for this pair, B is now rewarded, and C is non-rewarded. The animal is then presented with further pairs of stimuli; each time, the previously non-rewarded stimulus is now rewarded, and a novel stimulus is non-rewarded. When the animal is presented with novel combinations of stimuli, such as B and D, which have both been rewarded during previous training trials, but have never been presented together, the animal should infer that B is more valuable than D, and therefore choose B over D. Pigeons can solve this problem (von Fersen et al. 1991); however, it has been suggested that pigeons do not solve the problem through reasoning, only by associative learning. Von Fersen and colleagues suggest that within a pair of stimuli, the rewarded stimulus transfers some of its associative strength to the non-rewarded stimulus, and that this level of associative strength is reduced the further along the stimulus chain (A–E). Therefore, A has the highest associative strength, and E has the lowest associative strength. This is known as value transfer theory. One argument against this suggestion is the finding that if stimulus A is only rewarded on half the A/B trials, and stimulus E is only rewarded on half the D/E trials, then although the value of B has been reduced and the value of D has been increased, pigeons still tend to choose B over D (Weaver et al. 1996).
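
The logic of value transfer can be made concrete with a toy calculation. The sketch below is not the model fitted by von Fersen and colleagues; the direct values (1.0 for the always-rewarded first item, 0.5 for middle items that are rewarded in one of their two training pairs, 0.0 for the never-rewarded last item) and the transfer fraction of 0.3 are assumptions chosen purely for illustration.

```python
def value_transfer(chain, transfer=0.3):
    """Toy associative values for a chain trained as A+B-, B+C-, C+D-, ...

    Each stimulus receives an assumed direct value plus a fraction of the
    total value of the stimulus it was paired with as the non-rewarded item.
    """
    values = {}
    for i, stimulus in enumerate(chain):
        if i == 0:
            direct = 1.0  # first item: rewarded whenever it appears
        elif i == len(chain) - 1:
            direct = 0.0  # last item: never rewarded
        else:
            direct = 0.5  # middle items: rewarded in one of two pairs
        inherited = transfer * values[chain[i - 1]] if i > 0 else 0.0
        values[stimulus] = direct + inherited
    return values


values = value_transfer("ABCDE")
```

Under these assumptions the values fall monotonically from A to E, so B ends up with a higher value than D even though both were rewarded equally often, and a preference for B over D follows without any inference. In this toy setup the ordering of the middle items holds only for transfer fractions below 0.5.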

Transitive inference has also been examined in New World jays, with the prediction that the social pinyon jays (Gymnorhinus cyanocephalus) should be more successful on transitive inference tasks than the relatively less social western scrub-jays (Bond et al. 2003). Indeed, pinyon jays demonstrated more rapid and accurate learning when presented with stimulus colour pairs (i.e. A/B, B/C, C/D, D/E, E/F and F/G); however, there was no difference in the number of errors made by the two species in four out of six pairs. Although less accurate than the pinyon jays, the scrub-jay learning curve was almost identical to the pinyon jay learning curve. When tested for transitive inference (i.e. B/D, B/E, B/F, C/E, C/F and D/F), there was no difference between the species in their level of accuracy. Scrub-jays were more accurate when the ‘symbolic distance’ between the two stimuli was the greatest (i.e. B/F pair was more accurate than B/D pair). Pinyon jays and scrub-jays were equally accurate on B/D, C/E and D/F pairs. Finally, pinyon jays displayed longer response times to those pairs lower in the sequence (i.e. D/E and E/F) than scrub-jays, which Bond et al. suggest represents a more cognitive strategy. Although there are interesting differences between the two species, there are also many interesting similarities; therefore, it may be premature to suggest that this study conforms to the idea that the highly social pinyon jays are more successful at transitive inference than scrub-jays because they are social. Recently, Paz-y-Miño et al. (2004) have tested this hypothesis directly using social dominance tests in pinyon jays. Pinyon jays observed contests for food between pairs of jays of known dominance from either the same or a different group (e.g. bird B with bird A and bird B with bird 2). The observing jay (e.g. bird 3) was then placed into competition with bird B. 
If bird 3 had formed representations of the relative dominance of birds observed in the previous encounters, then bird 3 should have displayed a greater number of submissive displays to B and a reduction of dominance behaviours. This is what occurred. Unfortunately, scrub-jays have not been tested using the same paradigm.
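
The structure of the stimulus-pair design described above (training on adjacent pairs, testing on non-adjacent pairs) can be reproduced with a few lines of code. The sketch assumes, as the listed test pairs suggest, that the two end items are excluded from testing, presumably because an item that is always or never rewarded can be chosen or avoided without inference; the function name is hypothetical.

```python
from itertools import combinations


def inference_pairs(sequence):
    """Non-adjacent pairs of inner items, with their symbolic distance."""
    position = {item: i for i, item in enumerate(sequence)}
    inner = sequence[1:-1]  # drop the two end items
    return [
        (a, b, position[b] - position[a])
        for a, b in combinations(inner, 2)
        if position[b] - position[a] > 1  # skip trained (adjacent) pairs
    ]


pairs = inference_pairs("ABCDEFG")
```

For a seven-item sequence this yields the six test pairs B/D, B/E, B/F, C/E, C/F and D/F, with symbolic distances from 2 (for example B/D) to 4 (B/F).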

(d) Insight learning and problem-solving

Thorpe defined insight as ‘the sudden production of a new adaptive response not arrived at by trial behaviour or as the solution of a problem by the sudden adaptive reorganization of experience’ (Thorpe 1964, p. 110). Although the classic studies of insight were performed on chimpanzees by Kohler in the early part of the twentieth century (Kohler 1927), birds have also been tested for insight or solving novel problems without recourse to trial-and-error learning. One classic test for insight in birds is the string-pulling or patterned strings task. Although many birds can learn to pull up string which is attached to food (Thorpe 1943; Vince 1956, 1958, 1961) and choose the correct strings when more than one string is presented (Ducker & Rensch 1977), only hand-raised ravens (Corvus corax) and keas have been tested as to whether they can immediately solve novel problems related to string-pulling, such as crossing the strings, changing the string's colour, or attaching one string to food and the other to a stone. Both the ravens (Heinrich 1995) and the keas (Werdenich & Huber, in press) pulled up the string on the first attempt, and maintained a high level of performance across all the tasks. Whether this is really an example of insight is debatable; however, it is indeed suggestive of some form of rapid problem-solving (Heinrich 1995, 2000). Interestingly, performance on string-pulling tests appears to be compromised in language-trained African grey parrots (Psittacus erithacus), who can make requests (‘I want X’). By contrast, parrots with little language training pull up strings with ease (Pepperberg 2004).

(e) Numerical concept

Some of the earliest studies of numerical competence in birds were performed by Koehler in the 1950s on a collection of different birds (see Thorpe 1964 for a review of earlier studies). Koehler utilized two methods to test birds: simultaneous and successive presentation. In simultaneous presentation tests, the bird was presented with a card with a number of dots drawn onto it, and two smaller boxes: one with the same number of items (food), the second with a number of items deviating from the number on the card by one. A raven and a grey parrot learned to open the box with the same number of items as the card (from 2 to 6 items). Finally, Koehler examined whether the birds were successful because they were discriminating quantity, rather than number. On each trial, a number of items were made from a standard ball of modelling clay. Therefore, the quantity was the same across trials, but the number was different. The birds rapidly learned this problem, suggesting that they were responding to the number of items on the card and the corresponding number on the correct box. In successive presentation tests, the birds were rewarded for eating a certain number of food items, when smaller numbers of items were located in a series of boxes. For example, a bird was rewarded for eating five items: Box 1 contained one item, Box 2 contained two items, Box 3 contained one item, Box 4 was empty and Box 5 contained one item. Although other boxes were available, the birds had to stop opening boxes once they had eaten the correct number of items. The raven rapidly learned to choose the correct box even if all five boxes were presented simultaneously (Koehler 1950). Perhaps most impressively, some birds could master multiple forms of this task simultaneously; a jackdaw (Corvus monedula) could open black lids until it had eaten two pieces of food, green lids until three food pieces, red lids until four pieces and white lids until five pieces. 
How could the birds have done this without ‘counting’? One striking observation by Koehler of a jackdaw has been suggested to be a behavioural indicator of counting. This bird took items from the first three boxes, totalling four pieces, and then moved away from the boxes. This was about to be reported as failure, when the bird returned and passed in front of each box in turn, making a number of bowing gestures which appeared to correspond to the number of items which it had previously retrieved from the box. It then opened the empty box, moved onto the fifth box, removed the food item and then moved away, not attempting to open any of the remaining boxes!
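
The stopping rule in the successive presentation task can be written out explicitly. The sketch below is only a description of what a successful bird must do, not a claim about how the birds do it; the function name is hypothetical, the first five box contents follow the example in the text, and the contents of the extra boxes are invented for illustration.

```python
def open_boxes(box_contents, target):
    """Open boxes in order, eating what is found, until target is reached."""
    eaten = 0
    opened = []
    for index, items in enumerate(box_contents, start=1):
        if eaten >= target:
            break  # target reached: leave the remaining boxes closed
        opened.append(index)
        eaten += items
    return opened, eaten


# Boxes 1-5 as in the example above; boxes 6 and 7 are hypothetical extras.
opened, eaten = open_boxes([1, 2, 1, 0, 1, 1, 2], target=5)
```

Here the running total after the first three boxes is four, the empty fourth box must still be opened and passed over, and the rule stops after the fifth box, which matches the sequence the jackdaw eventually completed.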

Unfortunately, these experiments have never been replicated to the same degree or with the same level of control (however, see Swenson (1970) for a simpler, but equally convincing demonstration of number discrimination in white-necked ravens; Corvus cryptoleucus). However, similar studies have been performed with chimpanzees, finding comparable results (Boysen & Berntson 1989). An equally interesting demonstration of numerical competence in birds has been reported for Alex, the African grey parrot discussed earlier (Pepperberg 1999). Alex was presented with trays containing different numbers of different objects (e.g. three keys and two corks) and asked to distinguish the cardinal set (i.e. the total number of objects, in this case five). He was proficient in reporting the total number of both familiar and novel objects (Pepperberg 1987b). Alex was also successful in reporting the number of object X (e.g. keys) from a set of objects X and Y (e.g. pieces of wood) and focused on the number of objects independent of irrelevant information, such as their shape and colour (Pepperberg 1987b). Most impressively, Alex could report the number of items of a particular kind (e.g. keys) and colour. For example, if presented with one orange chalk, two orange wood, four purple wood and five purple chalk, and asked ‘How many purple wood?’, Alex was highly accurate in reporting the correct number, in this case ‘4’ (Pepperberg 1994).

(f) Object permanence

During early development, human children pass through a number of cognitive stages which relate to their understanding of their physical and social environment (Piaget 1952). Object permanence is the ability to keep track of objects and individuals that are not currently available to perception (out of sight). In children, object permanence develops in distinct developmental stages from tracking the movement of visible objects (Stage 2) and tracking partially hidden objects (Stage 3) to forming representations of fully hidden objects (Stage 4) and representing the visible (Stage 5) and invisible (Stage 6) displacement of hidden objects (Uzgiris & Hunt 1975). Object permanence would appear to be an important ability for many animals, particularly food-caching species and predators. Indeed, food-caching magpies (Pica pica) display Piagetian Stage 4 object permanence around the age at which they begin recovering cached food (44 days), and also achieve Stages 5 (65–107 days) and 6 (Pollok et al. 2000). In another developmental study, an African grey parrot was successful on Stage 4 tasks at 9–16 weeks, Stage 5 tasks at 17–20 weeks and Stage 6 at 21–33 weeks (Pepperberg et al. 1997). Object permanence (all stages) has been demonstrated in a number of adult psittacine birds: African grey (Pepperberg & Kozak 1986; Pepperberg & Funk 1990), Illiger mini macaw (Ara maracana), parakeet (Melopsittacus undulatus) and cockatiel (Pepperberg & Funk 1990). The only other avian species to be tested, ring doves (Streptopelia risoria), demonstrated successful performance on Stage 4 tasks (Dumas & Wilkie 1995).

4. Adaptive specialization approach to avian cognition

For cognitive abilities to have evolved, there must have been socio-ecological problems facing animals which could not be solved using trial-and-error learning or innate responses (Kamil 1988). There have been various suggestions as to what these social and ecological variables might have been, including life in a complex, individualized social group (Humphrey 1976; Dunbar 1998); finding food located in time and space (Milton 1988); tool use (van Lawick-Goodall 1970); and the extraction of shelled or cased food (Gibson 1986). Although these have been suggested to be important for the evolution of primate intelligence, it is now clear that there is equally impressive evidence that the same variables also influenced the evolution of avian intelligence. These variables will be discussed further in §7. This section, however, will discuss the adaptive specialization approach to avian cognition, focusing on spatial memory, food-caching, social learning and tool use.

(a) Spatial memory and food-caching

Many birds cache food for future consumption, either a large amount of seeds cached over a wide area which are stored seasonally or a smaller amount of higher-quality, perishable animal material or fruit which are recovered hours or days later (Sherry 1985; Vander Wall 1990). To efficiently recover these caches, storers need to process various types of information (often simultaneously) about the location of the cache site, the type and perishability of the cached item, and the social context during caching (de Kort et al., in press). Caching different foods in different contexts may require different cognitive abilities for successful retrieval. For example, Clark's nutcrackers living at high elevations cache up to 30 000 pine seeds over a wide area that are recovered up to six months later, which should require highly proficient long-term spatial memories (Balda & Kamil 1992). By contrast, western scrub-jays living in California cache fewer of a wider variety of food items which differ in their level of perishability and which are recovered after much shorter periods from caching (Clayton et al. 2001a).

How do species such as Clark's nutcrackers remember the location of thousands of individual caches made over a wide geographical area many months in the past? One suggestion is that each storer forms a ‘snapshot’ of every cache location. If so, this would be a highly inefficient system for a bird that caches thousands of food items in thousands of different places. This system would also be completely inflexible in responding to changes in the environment, such as increased snowfall over the cache site. However, there is some evidence that this is the process by which Clark's nutcrackers may retrieve some of their caches, as they were found at the time of recovery to orient themselves in exactly the same direction as when the snapshot was formed during caching, even if the cache site was approached from a different direction (Kamil et al. 1999).

A second suggestion is that the birds use visual cues, such as landmarks, or arrays of multiple landmarks to orient themselves with respect to the cache site. One area of controversy is the relative importance of local cues (objects located close to the goal object) versus distal cues (objects further away from the goal, such as on the periphery). Many corvids will use a combination of both distal and local cues to aid in finding their caches (Clayton & Krebs 1994; Gould-Beierle & Kamil 1999). European jays (Garrulus glandarius) use tall landmarks that are close to hidden food sites, rather than small, distant landmarks (Bennett 1993). Large landmarks provide more information to a caching animal than just the general location of a cache site. For example, Vander Wall (1982) allowed Clark's nutcrackers to cache in an arena containing multiple objects; however, between caching and recovery, the arena was extended by 20 cm to the right and all small objects in the right half were also moved by 20 cm. A large landmark in the left of the arena (a rock) remained in place. The birds displayed errors in recovery accuracy of approximately 20 cm for the caches they had made in the right-hand side of the arena, whereas the caches made in the left-hand side of the arena were recovered accurately, suggesting that the birds calculated the distance between the cache site and a landmark. However, in the wild, storers can use multiple landmarks to calculate the relative distance between cache sites and two or more landmarks. Clark's nutcrackers, for example, learn to find the half-way point between two landmarks, and transfer this ‘rule’ to changes in the distance between the landmarks (Kamil & Jones 1997). Clark's nutcrackers also respond to changes in directional information (Kamil & Jones 2000). Studies comparing Clark's nutcrackers with pigeons and jackdaws found that all three species could learn the distance rule, but only the jackdaws failed to transfer the rule to novel distances (Jones et al. 2002).
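The logic of the landmark-shift result can be illustrated with a one-dimensional sketch. It assumes, for illustration only, that each cache is stored as an offset from its nearest landmark and that the bird later searches at that landmark's current position plus the stored offset. The landmark and cache coordinates are hypothetical; only the 20 cm shift comes from the text. A helper for the half-way rule is included for comparison.

```python
def encode_cache(cache_x, landmarks):
    """Store a cache as (landmark name, offset from that landmark).

    Assumption: the nearest landmark at the time of caching is used.
    Positions are in cm along the left-right axis of the arena.
    """
    name = min(landmarks, key=lambda k: abs(landmarks[k] - cache_x))
    return name, cache_x - landmarks[name]


def predicted_search(memory, landmarks_now):
    """Search where the remembered landmark is now, plus the stored offset."""
    name, offset = memory
    return landmarks_now[name] + offset


def midpoint(a, b):
    """Half-way rule: the goal lies midway between two landmarks."""
    return (a + b) / 2.0


# Hypothetical layout: a fixed rock on the left, a small object on the right.
at_caching = {"rock": 30.0, "small_object": 150.0}
# Between caching and recovery the small object is moved 20 cm to the right.
at_recovery = {"rock": 30.0, "small_object": 170.0}

left_cache, right_cache = 40.0, 140.0  # hypothetical cache positions

left_error = predicted_search(encode_cache(left_cache, at_caching), at_recovery) - left_cache
right_error = predicted_search(encode_cache(right_cache, at_caching), at_recovery) - right_cache

# The half-way rule transfers to a new inter-landmark distance without retraining.
trained_goal = midpoint(20.0, 60.0)
novel_goal = midpoint(20.0, 120.0)
```

Under these assumptions the left-hand cache is found without error, whereas the predicted search for the right-hand cache is displaced by the full 20 cm, which is the pattern reported for the nutcrackers. A purely absolute (arena-based) encoding would predict no error on either side.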

(b) Social learning

There are many types of social learning, ranging from stimulus or local enhancement, through observational conditioning and goal emulation, to imitation (Whiten & Ham 1992). Social learning has been investigated in the field and the laboratory. All birds that have been studied or tested for social learning have been successful, which is not the case for many mammals (Lefebvre & Bouchard 2003). In the wild, social learning has been studied mainly with respect to feeding behaviour: where to eat, what to eat and how to eat. The classic example of social learning in birds was described by Fisher and Hinde (Fisher & Hinde 1949; Hinde & Fisher 1952). In late 1940s Britain, blue tits were observed to open the silver tops of milk bottles to access the cream found on the top of the milk. This behaviour appeared to spread quickly throughout Britain, much more quickly than would be predicted by trial-and-error learning. However, subsequent laboratory studies found that this could be explained as stimulus or local enhancement rather than imitation (Sherry & Galef 1984, 1990).

Two forms of imitative learning have been investigated in birds; vocal mimicry and motor imitation. Male songbirds not only copy the song of their fathers (Catchpole & Slater 1995), but some species such as mynas, lyrebirds and parrots can imitate the vocalizations of other birds, human speech and general noises (Baylis 1982). As there is little evidence for vocal imitation in non-human primates, the case of motor imitation may be more appropriate for comparison between birds and primates. To date, there have been many successful studies of motor imitation in birds (Zentall 2004); however, only one study on an African grey parrot has fulfilled the same criteria as used in non-human primates, i.e. the ability to imitate novel motor patterns (Moore 1992).

The two-action method of motor imitation has been proposed as the most appropriate method for examining imitative behaviour in animals (Heyes 1996). The technique was initially used to test whether budgerigar observers learned to remove a red cardboard square from a white pot demonstrated by a conspecific, either using the beak or the foot (Dawson & Foss 1965). Similar experiments where two demonstrated actions can result in the successful acquisition of a goal have been performed in starlings (Sturnus vulgaris, Campbell et al. 1999), Japanese quail (Coturnix japonica, Akins & Zentall 1996) and ravens (Fritz & Kotrschal 1999).

Like apes, many birds, particularly parrots, have been demonstrated to process complex covered foods, such as shells, and the tough skins of some fruits (Gibson 1986). Huber et al. (2001) examined the ability of keas to imitate the actions of a demonstrator to gain food located inside an artificial structure (clear Perspex box) that functionally resembles a hard cased fruit. The box could only be opened by performing three successive manipulations of three locking devices (a screw, split pin and a bolt). Conspecific demonstrators were trained to perform the appropriate actions used to gain entry to the box. One group of observers saw the demonstrators opening the box, whereas an additional control group was presented with the box without any experience of observing a demonstrator. Compared with the controls, the observers spent a longer time exploring the box and had a shorter latency to first contact it. The observers also appeared to understand the goal of the task as they displayed greater perseverance in manipulating the locking devices on the box. Unfortunately, no bird succeeded in opening the box, but there were differences in the levels of success in opening the individual locking devices.

Sociality has been suggested to be one of the prerequisites for the evolution of complex cognition (Humphrey 1976; Emery & Clayton 2004b). Indeed, solving social problems may have been the reason for developing such intellectual skills in the first place (Humphrey 1976; Byrne & Whiten 1988). In either case, social species may be predicted to demonstrate enhanced social learning compared to non-social species because of the increased opportunities available for learning socially in larger groups (Lefebvre & Giraldeau 1996). However, it is unclear whether being social only provides a selective advantage on learning tasks socially, or will also transfer to learning tasks individually. When social pigeons were compared with territorial, and therefore less social, zenaida doves (Zenaida aurita) on a task which could be performed individually or after witnessing a demonstration by a trained conspecific, both species were poor individual learners, but the pigeons were successful after observing a demonstrator (Lefebvre et al. 1996).

Social pinyon jays and relatively non-social Clark's nutcrackers were compared on two tasks: a motor task, which involved removing a lid covering a food well, and a discrimination task, which involved discriminating between two differently coloured wells, one of which contained food. Half the birds of each species performed the motor task after observing a demonstration, and half the birds were tested individually. The same occurred for the discrimination task. Pinyon jays were more accurate on both tasks after social learning, compared to individual learning, but there was no difference between social and individual learning for Clark's nutcrackers (Templeton et al. 1999). This result suggests that social pinyon jays benefit from observing conspecifics, whereas non-social Clark's nutcrackers do not. However, the results did not substantiate the adaptive specialization hypothesis as there was no difference in performance between the social and non-social species. Interestingly, Clark's nutcrackers are likely to have evolved from a social-living common ancestor with other corvids.

(c) Tool use and understanding the physical properties of tools

The first description of tool-use outside of humans was reported in wild chimpanzees (van Lawick-Goodall 1968); however, some birds have also been described as creating and using tools (for review, see Emery & Clayton 2004a). Although many birds, primates and other animals use tools, there is some controversy about the extent to which these species understand how tools work, the consequences of using a tool and the unobservable forces underlying their function (so-called folk physics).

Many birds appear to use and/or manufacture tools. Some examples of animal tool-use, however, do not fulfil the strict criteria of tool-use demonstrated by non-human primates. Tool-use has been described as ‘the use of physical objects other than the animal's own body or appendages as a means to extend the physical influence realized by the animal’ (Jones & Kamil 1973, p. 1076). Vultures (Neophron percnopterus), for example, crack open eggs by dropping them onto rocks (van Lawick-Goodall & van Lawick 1966). This is not a demonstration of tool-use per se, as the rock is not an extension of the vulture's body. However, vultures which throw stones at ostrich eggs are demonstrating tool-use (Thouless et al. 1989). Similarly, thrushes which open snail shells by smashing them onto stones (Gibson 1986), or crows in Japan and California that open hard shelled walnuts by dropping them from great heights onto hard-surfaced roads (Nikei 1995; Cristol & Switzer 1999) are not demonstrating tool-use. These may be innate responses, and they may not require the mental manipulations required to transform an object with one distinct function into a tool with a different function.

Tool use and manufacture have been demonstrated in wild birds. A species of Galapagos finch, the woodpecker finch (Camarhynchus pallidus), was reported to use a stick to probe for insects in the holes of trees (Millikan & Bowman 1967). The finches would break off a twig, leaf stem or cactus spine and then use it to dig into an inaccessible hole. The birds also transport the best tools with them when foraging and change the length of the tools when they are an inappropriate length for the next hole. Tebbich et al. (2001), using aviary-housed finches, examined whether this tool use was learned socially or through individual trial-and-error learning. Some captive adult finches learned to use a twig to gain access to a beetle larva hidden in a hole in an artificial tree trunk. However, the adult non-tool-users did not learn to use this technique, even after many weeks of exposure to tool-users. Moreover, hand-raised finches exposed to tool-users did not learn to use tools any better than young exposed to non-tool-users, thereby suggesting that tool use in finches is independent of social learning, and may represent an example of learning during a critical period. Tebbich et al. (2002) also found that the best finch tool users were found in dry habitats, where prey is located under dry bark that is difficult to access, and virtually none in humid habitats, where prey is located under wet moss.

The clearest example of tool-use and manufacture in corvids is by the New Caledonian crows (Corvus moneduloides) in the South Pacific. Hunt described how crows manufacture two types of tools (stepped-cut Pandanus leaves and hooked twigs) for use in retrieving insects (Hunt 1996). The crows often carried useful tools around with them on foraging expeditions. Each type of tool was used for a specific function, which required performance of a particular action. For example, Pandanus leaves were used to probe for prey under leaf detritus, utilizing a series of rapid back and forth movements, whereas hooked twigs were used to poke out insect larvae from within holes in trees using slow deliberate movements.

Although field studies are extremely important, they cannot help answer questions about an animal's understanding of the unobservable forces acting on tools, so-called folk physics. A number of birds certainly manufacture and use tools in the laboratory. Northern blue jays (Cyanocitta cristata), for example, were found manipulating the shape of newspaper strips provisioned at the bottom of their cage, and using them to pull in inaccessible food pellets (Jones & Kamil 1973). The jays did not use the paper tool when pellets were not present, and tended to use the tools more when the length of their food deprivation was greatest. The jays were also able to use a feather, thistle, straw grass, paper clip and plastic bag tie in similar ways when presented with these objects. Finally, the jays also wet the strips of paper, placed the strips in their empty food bowl and used them to collect food dust. Similar behaviour was observed by Clayton & Joliffe (1996) in another food-storing species, the marsh tit (Parus palustris).

Experiments on woodpecker finches have attempted to test whether they understand how tools work using three paradigms based on accessing food located in the middle of a transparent tube (Tebbich & Bshary 2004). These tasks have been used to great effect in tool using non-human primates, such as capuchin monkeys and chimpanzees, with variable results (reviewed in Tomasello & Call (1997)). In the first task, the finches were provided either with toothpicks with two smaller sticks attached at either end producing an H-shape or twigs with thorns at both ends pointing in opposite directions. As the toothpick and the twig with the thorns were wider than the opening to the tube, the bird had to remove the smaller sticks or thorns before inserting the twig into the tube. Three finches modified the artificial tool and four finches modified the natural tool. In the second task, the finches were provided with a series of sticks of differing lengths, some that were shorter than the length required to reach the food in the tube, some the exact length and some longer. There was no evidence that subjects chose the correct length of tool, but they were all successful in reaching the food on subsequent attempts. The final study is called the ‘trap tube’ task, which uses the same clear tube, but with a modification (trap) in the centre. In this task, pulling the food away from the trap resulted in success, whereas pulling towards the trap resulted in the food falling into the trap. One finch was successful in retrieving food from the tube. To determine whether the bird had used a rule to solve the task (i.e. pull away from the trap), the tube was inverted, so that pulling the food from either side would result in food. Surprisingly, the successful finch pulled equally from both sides, possibly suggesting that it had understood that the trap was no longer functional. Of course, the bird was rewarded for pulling from either side (i.e. it always received food); therefore, there was no punishment for pulling from either side. New Caledonian crows in the laboratory were provided with similar tasks, and chose the correct tool, a twig of certain length or diameter, from a ‘tool box’ (a collection of twigs of different lengths and widths) that was appropriate for reaching food placed in the middle of a transparent tube (Chappell & Kacelnik 2002), or passing through a small hole and so being able to push a food container into their reach (Chappell & Kacelnik 2004).

The most impressive example of ‘folk physics’ in any animal has been demonstrated by a female New Caledonian crow called Betty, who modified a non-functional, novel material (metal wire) into a new, functional shape (a hook). Two crows were initially provided with pieces of straight or bent wire, which were provided to enable them to pull up a bucket containing food located in a well (Weir et al. 2002). The male crow, Abel, however, stole the bent wire, leaving only the straight wire. The female crow initially attempted to lift up the bucket using the straight wire; however, when unsuccessful, she bent the wire into a hook. To perform this action, the crow would have required an understanding of the initial problem (access to food contained in the cup with a handle can only be achieved by pulling the container upwards and removing it from the well), the inadequacies of the available material (straight wire instead of a hook) and the properties of the wire (can be manipulated into a useful hook). However, examination of the hooks created by Betty and of the methods used to lift up the bucket from the tube reveals that only three out of the 10 wires were fashioned into ‘proper’ hooks (i.e. with a final angle less than or equal to 90°, where the wire would be bent greater than 90°), and that successful retrieval of the food bucket could be achieved without creating such a proper hook. Positioning the bent end of the wire under the handle of the bucket and lifting while pushing the edge of the wire against the side of the tube and pulling upwards was successful because the tube itself could be used as a tool to aid in bucket retrieval. A proper hook should have been able to lift up the bucket without aid from other structures.

When presented with straight pieces of wire, Betty often attempts to retrieve the food bucket without first creating a hook, and is often successful in using the straight wire to lever the bucket upwards or pierce the meat (N. J. Emery 2004, personal observation). More thorough tests of understanding in this task are required, such as increasing the diameter of the tube while keeping the size of the bucket the same, so that only a functional hook could be used to pull up the bucket, or decreasing the opening at the top of the tube, so that only a pull-up technique would be successful.

5. Anthropocentric questions, ecological answers

Although utilizing information about an animal's natural history can be a powerful tool in thinking about the evolutionary constraints leading to the development of a cognitive ability, we have to be careful in what we infer from this information. For example, small group size does not necessarily mean lack of social skills; reduced reliance on cached seeds does not mean poor spatial memory. Western scrub-jays are a case in point: they are semi-territorial, and so relatively non-social. They also do not rely heavily on stored food (although more so than most Old World crows). However, scrub-jays have demonstrated episodic-like memory (Clayton & Dickinson 1998) and experience projection (Emery & Clayton 2001), two capacities not yet demonstrated in other animals. These studies will be described in detail below.

(a) Episodic-like memory

For food-storing animals, cache recovery may require more than a proficient understanding of ‘where’ the caches are hidden. Caching animals may also need to have an understanding of ‘what’ information if the cached items differ in type, and an understanding of ‘when’ information if the items differ in time taken to perish. Laboratory studies have found that western scrub-jays form integrated memories of what item was cached where and when (Clayton & Dickinson 1998, 1999). When caching perishable food, it is prudent to learn something about the decay properties of the food (e.g. how long until it perishes), and if two or more perishable foods are cached, to learn their relative decay rates, so that food can be recovered when it is still fresh and edible. Clayton & Dickinson (1999) trained one group of scrub-jays (Degrade Group) that wax worms were still fresh 4 h after caching, but had degraded after 124 h. A second Replenish Group always received fresh wax worms at recovery. Less preferred peanuts were always available for caching. The Degrade Group birds rapidly learned to avoid searching for wax worms after 124 h when they had perished. When tested in probe trials (in which the food and any odour cues had been removed) after caching both worms and peanuts in different parts of a unique caching tray, the birds in the Degrade Group searched in wax worm sites after 4 h, but switched to searching for peanuts after 124 h (Clayton & Dickinson 1998, 1999). The Degrade Group birds were then taught that two different foods (mealworms and crickets) degraded after different times (28 and 100 h, respectively). They rapidly learned the different relative decay times of the two foods, compared to non-perishable peanuts. During probe trials, the scrub-jays in the Degrade Group switched their preference from mealworms to peanuts at the 28 h interval and from crickets to peanuts at the 100 h interval.

Scrub-jays may also cache two different perishable items at the same time, and therefore must learn the relative rates at which the two foods perish. Clayton et al. (2001b), therefore, compared how the jays responded to mealworms cached in one side of the tray, and crickets cached in the other side. As mealworms were found to be preferred to crickets, the jays should have searched specifically for mealworms when they were still fresh (4 h later), but switched to searching for crickets after 28 h, when the mealworms had degraded. This is what the birds did during un-rewarded test trials. These studies provide convincing evidence that during cache recovery, western scrub-jays remember not only the location of their caches, but also the different food types located within individual cache sites, and the relative time since they were cached. This representation of the time since caching is essential for the efficient recovery of perishable food items. It remains to be tested whether other corvid species, which are more or less dependent on caching perishable food, will also demonstrate such sophisticated understanding of the state of their caches at the time of recovery.
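The behavioural prediction tested in these experiments can be written as a simple decision rule over what-where-when records. The sketch below is not a model of the jays' memory; it only makes the predicted switch points explicit. The preference order and the 28 h and 100 h decay times are taken from the text, while the names Cache and choose_site, the tray locations, and the treatment of a food as degraded from exactly its decay time onwards are assumptions made for illustration.

```python
from dataclasses import dataclass

# Decay times (h) from the text; peanuts are treated as non-perishable.
DECAY_H = {"mealworm": 28.0, "cricket": 100.0, "peanut": float("inf")}
# Preference order, most preferred first.
PREFERENCE = ["mealworm", "cricket", "peanut"]


@dataclass
class Cache:
    what: str      # food type
    where: str     # hypothetical label for the cache location
    when_h: float  # time of caching, in hours


def choose_site(caches, now_h):
    """Return the location of the most preferred cache that is still fresh.

    Assumption: a food counts as degraded once the time since caching
    reaches its decay time. Returns None if nothing edible remains.
    """
    fresh = [c for c in caches if now_h - c.when_h < DECAY_H[c.what]]
    if not fresh:
        return None
    return min(fresh, key=lambda c: PREFERENCE.index(c.what)).where


caches = [
    Cache("mealworm", "left", 0.0),
    Cache("cricket", "right", 0.0),
    Cache("peanut", "middle", 0.0),
]

site_4h = choose_site(caches, 4.0)
site_28h = choose_site(caches, 28.0)
site_100h = choose_site(caches, 100.0)
```

The rule requires all three components at once: dropping the ‘when’ field would leave the choice fixed on the preferred food regardless of the retention interval, which is not what the birds did.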

(b) Theory of mind

Five years ago, the idea that a bird could think about another's mental states (theory of mind) was preposterous. Research into primate social cognition had revealed many interesting insights into what chimpanzees may know about other minds (Emery 2005), but a scathing paper reviewing the evidence for mental attribution in primates suggested that all previous experiments had demonstrated associative learning, rather than theory of mind (Heyes 1998). It took a series of experiments on chimpanzee food competition, with a high ecological validity, to revitalize the field (Hare et al. 2000, 2001). A similar approach has been adopted for food-caching corvids with equal amounts of success.

As well as integrating information about the location and time of caching, food-storers also need to be aware of the social context, especially those species living in large social groups, because caches are susceptible to pilfering (Vander Wall 1990). For pilferers, the ability to locate caches made by others, quickly and efficiently, may be an important difference between successful pilfering and aggression from the storer. A number of corvids observe conspecifics caching and demonstrate excellent observational spatial memory for the location of another bird's caches (Bednekoff & Balda 1996b; Heinrich & Pepper 1998; Clayton et al. 2001a), whereas there is little evidence for similar social learning in other caching species, such as parids (Baker et al. 1988).

Use of observational spatial memory as a pilfering strategy may differ between species depending on level of sociality, and as such may be an adaptive specialization (Balda et al. 1996). Bednekoff & Balda (1996a,b) tested whether social pinyon and Mexican jays and asocial Clark's nutcrackers remembered where another bird had cached, recording their cache retrieval accuracy after 1, 2 or 7 days. Pinyon jays remembered the specific location of caches after 1–2 days, and the general locations after 7 days, whereas Clark's nutcrackers and Mexican jays were more accurate than chance after 1 day. After 2 days, the Clark's nutcracker storers accurately recovered their own caches, but not those they had observed, whereas there was no difference between recovering their own caches and another's caches after 2 days in Mexican jays (Bednekoff & Balda 1996a). This finding supports the adaptive specialization of social learning hypothesis in corvids; however, the study comparing pinyon jays and Clark's nutcrackers described earlier is more ambiguous. Further evidence against the adaptive specialization hypothesis was provided by a study of observational spatial memory in western scrub-jays. Three groups, Storer Group, Observer Group (which saw another jay caching) and Control Group (which could hear another jay caching, but could not see it), were compared for accuracy in retrieving caches made 3 h previously by the storer. The observers were more accurate than controls in searching for the cached food, but less accurate than the storers (Clayton et al. 2001a). The social context of caching behaviour may be viewed as an arms race between storers and pilferers, in which storers use counter strategies to minimize the risk of having their caches pilfered (Bugnyar & Kotrschal 2002; Emery et al. 2004). In this arms race, however, an individual bird can play both roles. Field observations suggest that storers engage in a number of cache protection strategies which may or may not be dependent on cognitive processes. Examples of such strategies include waiting until pilferers are distracted or cannot see them before they resume caching, or making ‘false’ caches that contain either an inedible item such as a stone or nothing at all (e.g. rooks, Kallander 1978 and ravens, Heinrich & Pepper 1998; Heinrich 1999; Bugnyar & Kotrschal 2002). Some corvids return alone to caches they had hidden in the presence of conspecifics, and readily recache them in new places unbeknown to the potential thief (e.g. jays, Emery & Clayton 2001; ravens, Heinrich 1999).

While field observations are important for documenting natural behaviour, an experimental approach is crucial for understanding the mechanisms underlying these behaviours and determining the effects of experience, particularly in relation to simulation ‘theory of mind’ (Emery 2005). Consider the observation of birds moving food they had hidden in the presence of other individuals, and recaching the items in new places when those observers were no longer present. In the wild, one might explain the presence or absence of another bird as purely coincidental to the caching and recaching events. To test whether it was the presence of an observer at caching, and absence of one at recovery which elicited the storer's recaching behaviour, Emery & Clayton (2001) allowed hand-raised western scrub-jays to cache either in private or while a conspecific was watching and then recover their caches in private. Individuals that had prior experience of pilfering another bird's caches subsequently recached food in new sites, but only when they had been observed during caching. Because the two conditions were identical at the time of recovery, the birds had to remember whether or not they had been watched during the caching condition in order to know whether to recache during recovery, and if so, whether in new sites. Note that jays without experience of pilfering did not move their caches to new sites. The inference is that these birds engage in experience projection (Emery & Clayton 2004a), i.e. the jays relate information about their previous experience as a pilferer to the possibility of future stealing by another individual, and modify their recovery strategy appropriately. By focusing on the counter strategies of the storer when previously observed by a potential thief, this experiment raises the possibility that recaching behaviour is based on mental attribution (simulation account). However, we do not suggest that scrub-jays possess a belief–desire psychology (modular account of social cognition), which has been the focus of all previous research on comparative social cognition (Emery 2005). This will form the basis for §6.

Scrub-jays may appreciate how other jays see the world through visual perspective-taking in the context of cache protection strategies. Jays were provided with the opportunity to cache in two trays that differed in their distance from an observer, either ‘near’ or ‘far’. When observed during caching, jays cached predominantly in the tray located furthest from the observer (far), whereas during private caching trials, the storer cached equally in near and far sites. During recovery in private, the storer selectively recached food items from the tray closest to the observer, i.e. those food items that were most at risk of theft. These results suggest that the storers may evaluate the quality of visual information available to the observers for subsequent pilfering and so employ appropriate strategies which function to reduce this information (Dally et al. 2005). In a related experiment, storers were provided with a choice of two caching trays, one in full view of an observer, a second hidden behind an opaque barrier. The storers cached predominantly in the hidden tray; during recovery, however, the jays recovered their caches equally from both trays, but selectively recached food items from the tray not occluded by the barrier (Dally et al. 2005). Again, this suggests that the storers may have understood the visual perspective of the observer; however, an alternative explanation is that the jays cached in places where they could not see the potential thief.

One possible way to discount this explanation is to investigate cases where the observer's view is not completely occluded, but where there are cache locations in which the visual information available to accurately use observational spatial memory for pilfering is reduced. Dally et al. (2004) provided jays with the opportunity to cache in two trays, one located in an illuminated area of the cage, the other in a darkened part of the cage. When observed, the jays made more caches in the darkened tray; however, there was no preference for either the bright or dark tray when caching in private. The jays selectively recached items from the bright tray during recovery. In this case, the storer's view of the observer was not occluded; however, visual information available to the observer was impoverished.

Other corvids also implement a variety of cache protection strategies, which may be based on complex cognitive processes. In an aviary, Bugnyar & Kotrschal (2002) reported that individual ravens cached behind large natural barriers, such as trees and rocks, and repositioned themselves depending on the position of an observer. This may be an example of tactical deception (Whiten & Byrne 1988); however, a simpler explanation may be that ravens prefer to cache out of sight of conspecifics because this reduces the frequency of cache pilferage. This is substantiated by further data that ravens also cached at an increased distance from conspecifics, and they moved caches when others approached them (Bugnyar & Kotrschal 2002). Only laboratory experiments with appropriate controls, such as those described above for scrub-jays, can attempt to rule out such simpler explanations. Indeed, focusing on recaching behaviour when other individuals are no longer present can eliminate explanations based on ‘behaviour-reading’, and the fact that scrub-jays hide some food in high-risk cache sites, such as close to an observer, in view or in sunlight, and then selectively recache these items when alone at recovery is highly suggestive of social reasoning (Dally et al. 2004, 2005).

A second possible example of tactical deception in ravens was described in a study in which a subordinate that had visual access to the location of hidden food led a dominant away from the food, before attempting to access the food itself (Bugnyar & Kotrschal 2004). Again, similar behaviour has been reported in chimpanzees (Menzel 1974). Ravens also appear to understand the difference between an individual who possesses knowledge about a caching event and another individual who was ignorant of the event (Bugnyar & Heinrich 2005), although in this case discrimination learning (presence or absence of an individual at the time of caching) or behaviour-reading cannot be ruled out. Similar data and similar associative explanations have been reported for chimpanzees (Povinelli et al. 1990; however, see Hare et al. 2001).

6. Bird-brains revisited

In this brief overview of avian intelligence, we have seen that some birds possess many of the intellectual capacities of non-human primates. Indeed, corvids (and possibly parrots) appear to rival the great apes in many psychological domains (Emery & Clayton 2004b). Although corvids and parrots have brains that are the same relative size as chimpanzees, gorillas and orangutans, bird and mammal brains are very different structures. Indeed, Emery & Clayton (2004a) have suggested that corvids and apes may represent a case for convergent mental evolution (i.e. same cognitive processes, with the same outcome), but with divergent brains (i.e. very different brain structures). Although recent changes in the nomenclature of the avian brain go some way towards explaining how bird-brains can perform similar mental operations to mammalian brains, the brains themselves have not changed, only the way we view them. It may, therefore, be prudent to revise the earlier claim that bird and mammalian brains have diverged with respect to their anatomy. Although the gross structure of avian and mammalian brains is radically different, there is evidence that there are connectional similarities in the brains of these two taxa which may explain their similar behaviour and cognition. Three examples of these similarities are discussed.

(a) Visual processing

The mammalian neocortex is highly laminated, consisting of a series of six layers from the superficial layers on the surface to the deeper layers (figure 3a). Each layer has its own cell types, connectional patterns and neurochemical composition. By contrast, the avian telencephalon tends to be nucleated, with little or no laminar organization (figure 3b). There is one striking exception. The Wulst (dorsal pallial region) or hyperpallium is located on the dorsal surface of the telencephalon and consists of 3–4 layers depending on the size of the Wulst (small in pigeons and large in owls, Pettigrew 1979; Medina & Reiner 2000). As with the neocortex, each layer has its own connectional (Karten et al. 1973) and neurochemical (Shimizu & Karten 1990) patterns. Indeed, visual information appears to be processed by similar pathways in the avian and mammalian brains. The tectofugal pathway is important for orienting towards objects. Information is processed in the following manner in birds: retina—optic tectum (superior colliculus)—nucleus rotundus (pulvinar)—entopallium. The thalamofugal pathway is important for identifying objects. Information is processed in the following manner in birds: retina—lateral geniculate nucleus of the thalamus—Wulst or hyperpallium (striate cortex). Similar connectional architecture has also been suggested for the somatosensory and motor systems in birds and mammals (Medina & Reiner 2000). It is not yet certain what aspects of these anatomical traits have evolved from a common stem amniote ancestor and which have evolved independently.

Figure 3

Drawings of frontal sections through the telencephalon of (a) a rat and (b) a pigeon, with details of the similar connection patterns within (a) the laminated cortex of rats and (b) the laminated Wulst of pigeons. Adapted from Medina & Reiner (2000). Abbreviations: ac, anterior commissure; ACC, nucleus accumbens; cc, corpus callosum; Cl, claustrum; DB, diagonal band of Broca; EN, endopiriform region; HA, hyperpallium apicale; HD, hyperpallium densocellulare; HP, hippocampal complex; IHA, interstitial nucleus of the hyperpallium intercalatum; LC, lateral cortex; lv, lateral ventricle; M, mesopallium; N, nidopallium; NC, neocortex; OB, olfactory bulb; S, septum; STR, striatum; TU, olfactory tubercle.

(b) Vocal learning

The song control system of passerines, such as zebra finches and canaries, has been the focus of neurobiological study for almost 30 years (Farries 2004). Recent studies on parrots (Jarvis & Mello 2000) and hummingbirds (Jarvis et al. 2000) have revealed similar, but not identical, connectivity patterns in their vocal control pathways. This has led to the suggestion that vocal learning in birds (songbirds, parrots, hummingbirds) and mammals (humans, cetaceans and pinnipeds) has evolved via an analogous neuroarchitecture (Jarvis 2004).

(c) Avian ‘prefrontal cortex’

The final example of potential convergence in neural systems in bird and mammal brains is the existence of an avian prefrontal cortex. In mammals, the prefrontal cortex ‘contributes to the organization, planning and flexibility of behaviour based on previously acquired information’ (Dalley et al. 2004, p. 774). This definition encompasses many of the types of complex behaviour described throughout this review. Therefore, we would predict that those species which display many of these complex cognitive traits controlled by the prefrontal cortex in mammals should have functionally equivalent areas in the telencephalon. The strongest candidate is the caudolateral nidopallium (CDLN, Reiner 1986). Neurobiological studies in the CDLN of pigeons have revealed similarities in connectivity, neurochemistry, neurophysiology and function with the mammalian dorsolateral prefrontal cortex. For example, lesions of the CDLN affect delayed alternation tasks (Mogensen & Divac 1993; Gagliardo et al. 1996), reversal learning (Hartmann & Gunturkun 1998) and other working memory tasks (Diekamp et al. 2002a), including the Go/No Go task (Aldavert-Vera et al. 1999), and impair performance on some visual discrimination tasks (Aldavert-Vera et al. 1999), but not others (Mogensen & Divac 1993; Gagliardo et al. 1996; Hartmann & Gunturkun 1998). Neurons within the CDLN respond during the delay period of Go/No Go tasks similarly to neurons in the primate dorsolateral prefrontal cortex (Kalt et al. 1999; Diekamp et al. 2002b). With respect to neurochemistry, dopaminergic (DA) fibres and D1 receptors, but not D2 receptors, are highly concentrated in the CDLN, again similar to primate prefrontal cortex (Durstewitz et al. 1999), and blockade of D1 receptors in the CDLN disrupts tasks similar to those disrupted by permanent lesions (Diekamp et al. 2000). Finally, the CDLN is connected reciprocally with secondary sensory areas of all modalities (Leutgeb et al. 1996; Metzger et al. 1998; Kroener & Gunturkun 1999), and projects to somatomotor and limbic areas of the basal ganglia (Leutgeb et al. 1996; Metzger et al. 1998; Kroener & Gunturkun 1999), which allows it to influence behavioural and affective responses in a manner similar to primate prefrontal cortex.

Although this is striking evidence for functional and probably anatomical convergence between the avian CDLN and primate prefrontal cortex, a number of important questions remain to be answered. There is some controversy over the existence of dorsolateral prefrontal cortex in rodents (Preuss 1995; Dalley et al. 2004). At present, the only tasks that have been affected by both prefrontal cortex and CDLN lesions are working memory tasks. Other more complex tasks, such as attentional set-shifting, have not yet been tested in birds. Second, the only avian species to be examined is the pigeon, which although talented at visual discrimination problems does not demonstrate the same forms of complex cognition displayed by birds with larger forebrains, and importantly, a larger nidopallium. It remains to be seen what effect CDLN lesions will have on these species. Finally, many aspects of complex cognition in corvids, such as episodic-like memory and theory of mind, activate other regions of the human prefrontal cortex aside from the dorsolateral region, such as the ventromedial region (Maguire 2001; Saxe et al. 2004). If these abilities in corvids are functionally equivalent to those of humans, then we might expect to find areas within the corvid cerebrum similar to the ventromedial prefrontal cortex of humans. This remains to be tested.

(d) How can small brains achieve complex cognition?

Although there are significant differences in absolute brain size across bird species, there is an ultimate constraint on brain size due to flight. Indeed, those species which spend a large percentage of their life in flight (migratory species) tend to have a significantly smaller telencephalon than sedentary or nomadic species (Burish et al. 2004). Within a finite brain space, one solution to this size constraint is to increase the number of neurons. This increase in neuron number will lead to a decrease in (proportional) connectional density, as the absolute number of connections per neuron has to remain constant (Striedter 2005). To minimize wiring, shorter connections develop between brain regions. Therefore, only those areas which are ‘nearest neighbours’ become connected, and more distantly related areas become functionally independent. Increasing the number of processing steps between brain regions, thereby increasing the ‘degrees of separation’, causes those regions to become modular. A large number of processing steps between regions is less efficient than a small number. One suggestion as to how brains may overcome this inefficiency is to adopt a ‘small-world’ architecture (Watts & Strogatz 1998), where the majority of connections are between near areas, but with some connections developing between far areas which are functionally integrated. This form of modularity has been described in large mammalian brains, particularly the visual system of cats and monkeys (Young 1992; Scannell & Young 1993).
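The ‘small-world’ argument can be illustrated with a toy network. The following Python sketch is not taken from any of the studies cited here; it is a minimal illustration, assuming a ring of 200 abstract ‘areas’ each linked to its four nearest neighbours, in which a fraction of links (here 10%, an arbitrary choice) is redirected to random distant areas in the spirit of the Watts & Strogatz (1998) model. The function names, network size and rewiring probability are all illustrative assumptions and carry no anatomical meaning.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Link each of n nodes to its k nearest neighbours (k/2 on each side)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, k, p, rng):
    """With probability p, redirect each lattice link to a random other node."""
    n = len(adj)
    new = {i: set(nbrs) for i, nbrs in adj.items()}
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in new[i] and rng.random() < p:
                candidates = [m for m in range(n) if m != i and m not in new[i]]
                if candidates:
                    m = rng.choice(candidates)
                    new[i].discard(j)
                    new[j].discard(i)
                    new[i].add(m)
                    new[m].add(i)
    return new


def mean_path_length(adj):
    """Average number of steps between reachable pairs (breadth-first search)."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering(adj):
    """Mean fraction of a node's neighbours that are linked to each other."""
    coeffs = []
    for nbrs in adj.values():
        nbrs = list(nbrs)
        d = len(nbrs)
        if d < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a in range(d) for b in range(a + 1, d)
                    if nbrs[b] in adj[nbrs[a]])
        coeffs.append(2 * links / (d * (d - 1)))
    return sum(coeffs) / len(coeffs)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
lattice = ring_lattice(200, 4)
small_world = rewire(lattice, 4, 0.1, rng)
for name, net in (("nearest-neighbour", lattice), ("small-world", small_world)):
    print(name, round(mean_path_length(net), 2), round(clustering(net), 2))
```

In this toy model the purely nearest-neighbour network needs about 25 steps on average to get from one area to another, whereas adding a few long-range links shortens the average route considerably while most local clustering is retained. This is the sense in which a small-world layout may reduce the ‘degrees of separation’ without a large increase in wiring.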

Therefore, it is suggested that small brains which are constrained by their absolute size, such as those of corvids and parrots, may increase their number of neurons to achieve a highly efficient neuroarchitecture which is functionally analogous to the primate neocortex. Indeed, there is some evidence that crows, rooks, magpies and jackdaws have larger numbers of neurons and greater neuronal density in the forebrain compared to pigeons (Voronov et al. 1994). If this neural scenario is an explanation for the enhanced intelligence of corvids and parrots, then there should be a higher number of neurons in the forebrain of corvids and parrots, compared to other bird species from the same family and more distantly related species, and this high neuron number should be specific to those areas of the forebrain which are functionally equivalent to the primate neocortex, namely the nidopallium and mesopallium. These data are not yet available.

7. Evolving avian intelligence

Birds may be the most successful of the terrestrial vertebrates. They are found on every continent, in almost every ecological niche. There are 9000 species of birds, compared to amphibians, 6000 reptiles and 4100 mammals. This paper has presented a selective overview of the cognitive abilities of birds. The evidence suggests that not all birds were created equal. Some families, such as the corvids and parrots, appear to have evolved cognitive abilities that are superior to those of other birds and that in many cases compare favourably with those of the great apes. Although cognitive ornithology is still in its infancy, there are good reasons to propose a special status for corvids and parrots (Emery & Clayton 2004a). There are 120 species of corvid that are distributed over every non-polar continent, from Greenland and Northern Canada and Alaska, throughout Europe, North and Central America and Asia, to South America, Africa, New Guinea and Australia (Madge & Burn 1994). Parrots, by contrast, have a more conservative geographical distribution, located primarily in temperate jungle and forested areas, such as Central and South America, Central Africa, Southern Asia, New Guinea, Australia and New Zealand (Forshaw 1989). This may be due to human influence (trapping the colourful birds for export and cutting down forests) and the rather specialized diet of parrots (fruit). A similar picture to parrots emerges for primates, which have a similar diet and are located in overlapping areas (Central and South America, Africa, Southern Asia and Japan).

Birds and mammals share a relatively recent evolution, with modern birds and mammals both appearing around 65 Myr ago. It has been suggested that a single ancient avian species (Archaeopteryx) survived a mass extinction event (that destroyed the dinosaurs) and that all modern bird species evolved from this one survivor (Wyles et al. 1983). Within the birds, the passeriforms (perching songbirds) demonstrate the most recent evolution, first appearing in the fossil record around 37.5 Myr ago. This is compared to the anthropoid primates which first appeared around 40 Myr ago, with the common ancestor to the modern great apes appearing in the fossil record 14 Myr ago (with the chimpanzees diverging from humans 6 Myr ago). Mammals and birds also demonstrate comparable rates of anatomical evolution: ‘the anatomical differences among birds are no smaller than those among other vertebrates (frogs, lizards and mammals) of comparable taxonomic rank’ (Wyles et al. 1983, p. 4395).

The oldest corvid fossils in Europe date to 20–25 Myr ago (Goodwin 1986); however, the origin of the corvids has been traced to Central Asia in the Western Malaysian region (Hope 1989). The Eurasian and North American jays appear to have become specialized in the eating of nuts and acorns, living in forested environments similar to the primitive corvids, and therefore are probably more closely related to these early species. By comparison, the magpies and crow-like birds moved away from the forests into more open environments, becoming less dependent on seeds for food (crows and magpies tend to be omnivorous) and less constrained by their habitat, becoming more mobile (spreading across most regions of the world). This suggests that the magpies and crows are the most recently evolved of the corvids, and this in turn may account for their remarkable cognitive abilities compared to other birds. Similarly, the great apes evolved 5–10 Myr ago, and display similar advancements in their cognition when compared to other mammals.

Similarly, the oldest known Psittaciform fossil was discovered in France and dated to around 30 Myr ago (Miyaki et al. 1998), and the earliest modern genus was found in the USA and dated to 20 Myr ago (Forshaw 1989). The most recent diversification within the parrots is said to have been the separation of New World from the African parrots around 2 Myr ago (Smith 1975). Therefore, as with apes and corvids, parrots also appear to have had a very recent evolutionary history, which may go some way towards explaining their enhanced cognitive abilities.

As the last common ancestor to corvids, parrots and apes lived approximately 300 Myr ago, and not all birds and mammals share their cognitive abilities, it has been suggested that intelligence in these taxa can only have arisen by convergent evolution, driven by the need to solve comparable social and ecological problems (Emery & Clayton 2004a,b; Clayton & Emery 2005). Furthermore, the most recently evolved genera of corvids (Corvus, Pyrrhocorax) and apes (Pan) appeared at approximately the same point in evolutionary time (5–10 Myr ago). The Late Miocene to Pliocene was a period of great environmental and climatic variability and instability. This variability would have had a significant effect on food availability. As such, extant corvids and apes may have had to adapt strategies to locate food dispersed in time and space, extract food hidden within cased substrates, exploit meat as a high source of energy, and thus become innovative, omnivorous, generalist foragers. Indeed, there is good evidence that corvids, parrots and apes are highly innovative in their feeding strategies, and that this correlates significantly with relative brain size (Lefebvre et al. 1997; Reader & Laland 2002). Such ecological conditions will also have had an effect on the organization of social groups in apes, corvids and parrots. These ecological variables have already been suggested to have played an important role in the evolution of great ape cognition (Potts 2004), and it is easy to propose a similar scenario for the evolution of corvid (and probably parrot) cognition. Indeed, a simple examination of six socio-ecological variables (diet, social structure, relative brain size, innovation, life history and habitat) across corvids, parrots, other birds, monkeys, apes, elephants and cetaceans reveals that certain preconditions correlate with the development of complex cognition: omnivorous generalist diet, highly social, large relative brain size, innovative, long developmental period, extended longevity and variable habitat (Table 1). Although not exclusive, and vastly simplified, this exercise suggests that the evolution of intelligence was highly correlated with the ability to think and act flexibly within an ever changing environment. In such environments, climate was incredibly variable; food was located in patches, often only ripe during brief time periods, or had to be pursued; and increases in the size and complexity of social groups containing many long-lived individuals required the ability to track social relationships. What is not yet clear is how these seemingly disparate variables came together to influence intelligent behaviour in such distantly related groups.

Table 1

Socio-ecological conditions for the evolution of intelligence in birds and mammals. (Abbreviations: diet (O, Omnivorous; S, Seed; F, Fruit; Nut, Nuts; N, Nectar; M, Meat; V, Vegetation; I, Insects; C, Crustaceans); habitat (F, forest; W, wood; O, open; G, grass; H, exploits human environments; T, tundra; M, mountain). Text in italics represents those species which are omnivorous; highly social; have large relative brains; have a delayed maturation and an extended longevity (compared to other species within the same taxonomic group). Avian data from Forshaw (1989), Madge & Burn (1994), Lefebvre et al. (1997), Iwaniuk & Nelson (2003) and Perrins (2003). Mammalian data from Smuts et al. (1987), Macdonald (1999), Reader & Laland (2001), Whitehead (2003) and de Waal & Tyack (2003).)

8. The feathered ape in your garden

As I hope this review has demonstrated, birds are an important taxon in which to examine the anthropocentric and adaptive specialization approaches to cognition. Although there is abundant information on the behaviour, mating systems, ecology and life histories of many species of birds, carefully recorded in thousands of hours of field observations, there is a paucity of information on the cognitive abilities of birds. Although it is often impractical, and in many cases unethical, to remove birds from their natural environment for cognitive testing, a new wave of studies in cognitive ecology, testing birds in the field, has arisen through careful collaborations between behavioural ecologists (studying banded populations of known individuals) and comparative psychologists (Healy & Hurly 1995, 1998; Hurly & Healy 1996; Henderson et al. 2001). It is hoped that this review will have demonstrated that comparative psychologists do not need to travel to exotic locations or choose exotic species to study complex cognition in birds as there are ‘feathered apes in your garden’.


This paper is based on a talk given at the Ecological Intelligence symposium held in Tutzing, Germany in October 2002 in honour of Wolfgang Wickler's retirement. I would like to thank Lucie Salwizcek, Redouan Bshary and Hans Fricke for inviting me to the meeting and giving a primatologist masquerading in crow's feathers the opportunity to talk about avian cognition. The writing of this paper was supported by a Royal Society University Research fellowship and by grants from the BBSRC, The Royal Society and the University of Cambridge. I would like to thank Nicky Clayton for her insights and discussion of these issues over the years, and for comments on the manuscript.

  • Received March 9, 2005.
  • Accepted August 18, 2005.

