Evolutionary models of cultural change have acquired an important role in attempts to explain the course of human evolution, especially our specialization in knowledge-gathering and intelligent control of environments. In both biological and cultural change, different patterns of explanation become relevant at different ‘grains’ of analysis and in contexts associated with different explanatory targets. Existing treatments of the evolutionary approach to culture, both positive and negative, underestimate the importance of these distinctions. Close attention to grain of analysis motivates distinctions between three possible modes of cultural evolution, each associated with different empirical assumptions and explanatory roles.
Biological evolution and cultural change are both domains in which entities recur and are remade. They are also domains where some change is adaptive, leading to an improved handling of environmental pressures and opportunities. On the biological side, enormous progress was made through Darwin's realization that variation, heredity and differential reproduction can produce not only adaptive change, but also entirely new forms of life. Although analogies between biological and cultural change have long been noted, recent work has given them a new role. Especially since the arguments of Tomasello, the trans-generational accumulation of knowledge and skills by some form of cultural transmission has become an important element in explanations of the unusual course taken by human evolution over the past 2 Myr.
The evolutionary approach to culture is often treated as a ‘package deal’, either in general or in its application to pre-modern societies. In biological evolution, different theoretical concepts apply at different ‘grains’ of analysis. This phenomenon is, I suggest, both especially important and underappreciated in the case of cultural evolution. I begin with the case of biological evolution, and look at how different theoretical ideas get purchase at three different grains of description. The application of different Darwinian ideas also requires attention to exactly what is being explained—the origination of variants, or changes to distributions in a population. The analysis developed for the biological case is then applied to cultural change, where three grains can again be distinguished, with different evolutionary models relevant at each. As a result, three kinds of cultural evolution can be identified: Darwinian imitation, cumulative cultural adaptation and cultural phylogenetic change. Each model requires different empirical assumptions, and these requirements are not ordered from stronger to weaker but display a variety of logical relationships.
Cultural evolution is a domain in which individual cognition meets population-level dynamics. An additional theme of the paper is the relation between different kinds of success-tracking in human behavioural change, especially the relation between individual reinforcement learning, imitation of the successful, and more integrated and rational forms of cognitive change. The operation of integrated practical intelligence reduces the applicability of some Darwinian models of cultural change but not others.
2. Grain of analysis in the biological case
Different theoretical concepts become important in biology at different grains of analysis. This point is often made with a distinction between micro-evolution and macro-evolution: micro-evolution is evolution within a population, and macro-evolution is change in a collection of populations, including the formation of new species and clades, extinction and other large-scale trends. Although this is often described as a divide between ‘levels’, it is possible to zoom in and out of an evolving system in a continuous way, attending to more or less of the causal detail. Some of these relationships in the biological case are represented in an elegant picture due to Willi Hennig (figure 1).
At the most coarse-grained level, a species splits into two, producing a small branching of the kind represented in the ‘tree of life’. Zooming in, we see that this splitting event is composed of many events involving relations between individual organisms, which Hennig calls tokogenetic relations. Zooming in further, the diagram represents ontogenetic relations between stages of a single organism, or what Hennig called semaphoronts.
Evolution by natural selection is change in a population owing to variation, heredity and differential reproductive success. This is usually seen as a micro-evolutionary process acting on organisms, but the criteria required are abstract; genes, cells, social groups and species can all, in principle, enter into change of this kind. For any objects to be units of selection in this sense, however, they must be connected by parent–offspring relations; they must have the capacity to reproduce. Units of selection in this sense can be called Darwinian individuals. An evolutionarily relevant case of reproduction can take many forms. There need not be replication, the faithful production of copies. Replicators are Darwinian individuals with high-fidelity, asexual reproduction, but it is possible to have evolution by natural selection on units where reproduction is sexual and heredity is weak.
Evolution is one of a larger set of processes in which adaptive change occurs by variation and selection. An analogy has long been noted between biological evolution and various forms of learning by ‘trial and error’ [5–8]. In both evolution and reinforcement learning, variation is produced, and successful variants are retained, providing the basis for further rounds of variation and selection. In the evolutionary case, the successful become more common because they reproduce; they make more things like themselves. In the learning case, a successful behaviour does not make more behaviours. Rather, the nervous system is able to track the results of behaviours, and success induces the persistence and entrenchment of some internal structure capable of generating more behaviours of the same kind. This provides a model of a variation-and-selection process that does not work through reproduction by successful variants.
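The contrast between selection by reproduction and selection by reinforcement can be made concrete with a minimal sketch. The behaviour names, the reward value and the proportional-choice rule below are assumptions introduced for illustration; the argument does not depend on any particular learning rule. The point of the sketch is only that a rewarded behaviour becomes more probable because an internal propensity is strengthened, not because the behaviour token produces further tokens.

```python
def choice_probabilities(propensities):
    """Probability of each behaviour is proportional to its current propensity."""
    total = sum(propensities.values())
    return {b: w / total for b, w in propensities.items()}


def reinforce(propensities, behaviour, reward):
    """Strengthen the internal disposition that generated a rewarded behaviour.

    Nothing is copied here: the behaviour token does not make more tokens.
    Only the learner's own propensity for that behaviour is entrenched.
    """
    updated = dict(propensities)
    updated[behaviour] += reward
    return updated


# Hypothetical learner with three equally weighted behaviours.
propensities = {"press_lever": 1.0, "pull_chain": 1.0, "do_nothing": 1.0}
before = choice_probabilities(propensities)

# Suppose "press_lever" was tried and paid off (the reward value is illustrative).
propensities = reinforce(propensities, "press_lever", reward=2.0)
after = choice_probabilities(propensities)
```

Comparing `before` and `after` shows the rewarded behaviour gaining probability at the expense of its rivals, which is the selection step; variation would enter through the probabilistic choice itself.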
As noted earlier, ‘grain’ distinctions are continuous, and I now introduce a grain distinction within what would usually be called micro-evolution. This will be carried out with another graphical tool, the ‘adaptive landscape’ [9,10]. The landscape metaphor for evolution is not essential to the argument and is used here without endorsement of the ‘shifting balance’ theory that Wright sought to defend. The metaphor is helpful, however, in conveying some ideas. (This part of my analysis draws on ideas due to Jon Wilkins and is discussed in more detail in Godfrey-Smith & Wilkins .)
Imagine representing organisms of many species on a single abstract landscape. The horizontal axes represent phenotypic features, and the vertical axis represents fitness (figure 2). A certain kind of ‘adaptationist’ about evolution thinks that populations tend to climb hills as evolution proceeds. This is an attractive conception, but is it accurate? It depends, in part, on how close the imagined observer is to the landscape, and—as a consequence—on the scale in time.
Let us start by looking at the landscape from far away. At this distance, a species appears as a point. To most biologists working at this level of grain, what is striking is the emptiness of the landscape; organisms explore only a small fraction of the possible ways of making a living. We see a huge number of adaptive peaks, only a few of which are occupied, and there is no reason to believe that the occupied peaks are the highest ones. Populations have been restricted to small parts of the space by historical contingency and constraints that arise as a result. From this vantage point, the power of natural selection to produce adaptation appears limited.
This may change if we zoom in on a particular region of the landscape, containing one or a few peaks. At this intermediate level of grain, whole species or populations still occupy single points, but the points are vague or smudged. The change in zoom implies a change in the contrasts and the time-scale that are relevant. When a region of a certain size is chosen, this implicitly narrows the analysis to time-scales long enough for populations to be able to visibly move, but short enough for no population to move in or out of the region. Someone might then ask: ‘given that there is a population somewhere in this region, why is it in the particular location it is?’ Many biologists would answer by saying that we expect to find the population at or near one of the local peaks.
But next imagine we zoom in further, and focus on a very small region. We are now looking so closely that individual organisms can be recognized, and we are at a generational time-scale. Now evolutionary description must include drift, recombination and other factors studied in population-genetic models. Movement on the landscape is not continuous; new points appear some distance from their parents. The ‘adaptationist’ features that were prominent at the intermediate level of analysis recede.
Zooming out to the middle level of grain, and increasing the time-scale, the fate of any particular mutant is no longer important, as there will always be more. The constraints imposed by the genetic system can be broken by modifier alleles. There is more of a ‘search’ at this level of grain, and evolution shows a kind of ‘smartness’ that is not present at the other scales. Adaptive explanation resides primarily at the intermediate level of grain. Within a standard distinction between macro- and micro-evolution, this is micro-evolution, but in a sense it is meso-evolution. It is coarse-grained micro-evolution.
I next introduce a distinction between two kinds of explanations that can be given in an evolutionary context: distribution explanations and origin explanations. When giving a distribution explanation, the researcher assumes the existence of a set of variants in a population, in the present or at some earlier time, and explains why they have the distribution they do. Why are some common, and others rare? Why has such-and-such a trait been lost? An origin explanation, in contrast, is directed at the fact that a population has come to contain individuals of a particular kind at all. Then, the target of the explanation is the original appearance of the variants that are taken for granted when giving a distribution explanation.
Almost everyone will agree that natural selection can figure in distribution explanations. A more controversial matter is how selection can figure in origin explanations. It might initially seem that selection itself has no role there: selection has to do with sorting things that already exist, and mutation and recombination are the processes that figure in origin explanations. However, natural selection can reshape a population in a way that makes a given variant more likely to be produced via these immediate sources of variation than it otherwise would be. As selection changes the background in which mutation and recombination operate, it changes what those factors can produce.
Suppose we are explaining the evolution of the human eye. Building the eye involved bringing together many genes. Consider a collection of genetic material, X, that has everything needed, as far as genes go, to make an eye, except for one final mutation. So this background X is such that if a particular new mutation arises against X, it will complete the evolution of the eye. Initially, X was rare in the population. Selection can make the appearance of the eye more likely by making X more common. This increases the number of ‘independent experiments’ where a single mutation can give rise to the eye.
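The arithmetic behind ‘independent experiments’ can be sketched as follows. The population size, the frequencies of X and the mutation rate are hypothetical numbers chosen only to show the shape of the effect, and the sketch assumes that each carrier of X is an independent trial for the final mutation.

```python
def prob_new_variant(pop_size, freq_x, mutation_rate):
    """Chance that at least one carrier of background X acquires the final
    mutation in a single generation, treating carriers as independent trials."""
    carriers = round(pop_size * freq_x)
    return 1.0 - (1.0 - mutation_rate) ** carriers


# Illustrative values only: X rare versus X made common by selection.
rare = prob_new_variant(pop_size=10_000, freq_x=0.01, mutation_rate=1e-6)
common = prob_new_variant(pop_size=10_000, freq_x=0.90, mutation_rate=1e-6)
```

On these assumptions, raising the frequency of X raises the per-generation chance that the completing mutation arises somewhere in the population, even though selection does nothing to the mutation process itself.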
There I looked at one step in the process and told the story running backwards. The process itself runs forwards and involves many of these steps. The central principle also applies whether or not a series of design improvements is produced: whatever is favoured at one time becomes the ‘platform’ on which further mutational tinkering occurs.
This distinction can be seen in the history of biology. In Darwin's work , the emphasis is on origin explanations. The distribution explanations he gives are very simple: a new variant appears, and either spreads or is lost. The accumulation of many of these events explains how new kinds of organisms come to exist. In the ‘synthetic’ theory of the 1930s and later, more sophisticated distribution explanations appear, made possible by Mendelian genetics. In Fisher, Haldane and Wright [9,13,14], we see the idea of a discrete particle, a gene, inherited intact over many generations, and becoming more or less frequent in the gene pool. These explanations work because of the existence of simple and uniform processes of inheritance. Putting the two achievements together, the result is an evolutionary theory that can give both origin and distribution explanations.
Some empirical requirements of the two projects are different. It was not essential to Darwin's project that rules of inheritance as simple as Mendel's exist, even as approximations. The absence of complete ‘blending’ was a hidden assumption for Darwin's project, as pointed out by Jenkin , but on many matters described by population genetics, Darwin's Darwinism could be neutral. Putting the point differently, the distinctive role of selection in origin explanations can be understood in a fairly coarse-grained way. On the other side, the existence of selective gradients, making possible the accumulation of small improvements, is not something the project of population genetics strictly needs (though some writers, such as Fisher, have been committed to it).
Other requirements of the two projects are similar. One can be described as a kind of ‘localism’ or ‘looseness’ in biological populations. The operation of Mendelian principles in one breeding pair is not affected by the reproductive activities of other pairs in the population. (Whether a given pair get to breed at all may depend on what other individuals are doing, but that is not true of the function mapping parental properties to offspring properties, given that a pair does get to breed.) This assumption is needed for the aggregation of local events that drives population genetics models. The role of selection in Darwinian origin explanations requires something similar: when the proliferation of background X makes the novel phenotype Y accessible to mutation, this is because the instances of X are independent experiments, independent sites at which mutations may arise. An evolving population is a collection featuring some freedom of movement of the parts; a population is not like a car's engine, where the parts are very different and can do little without the other parts being in exactly the right place.
3. Cultural evolution
The rest of the paper applies this framework to cultural evolution. This is now a diverse field. (Landmarks and reviews include [16–19].) My starting point is a simple view: imitation is a bridge between human behaviour and an evolutionary dynamics. When one idea, behaviour or artefact is copied more than rival forms, it increases in frequency within the culture. Copying and imitation are processes in which one instance of a cultural variant becomes a parent to another, in the sense relevant to an evolutionary dynamics. Using the terminology mentioned earlier, copying and related activities make instances of cultural variants into Darwinian individuals in their own right.
A good way to understand a view is to see what it opposes or rules out. I will organize part of the discussion to follow with two challenges to the evolutionary approach to culture. The first is due to Sperber [20,21]. Sperber argues that many cases where an idea or behaviour recurs can be mistaken for copying in the relevant sense, when in fact they are not. Copying is a relation between tokens or instances, and involves a definite kind of causal process in which a copier attempts to match a copied instance or exemplar. One instance of a cultural variant becomes a parent of another. Sperber argues that this does happen, but not commonly, and as a general picture of the acquisition of culture it sits poorly with human psychology. For Sperber, people usually construct their own version of a cultural variant, often under the stimulus of many tokens. For example, some cultural variants become ‘attractors’ which people will converge on under a wide range of stimuli. ‘[M]ost cultural items are “re-produced” in the sense that they are produced again and again—with, of course, a causal link between all these productions—but are not reproduced in the sense of being copied from one another’ [21, p. 164].
Several issues can be distinguished here. First, Sperber thinks that even the superficial pattern needed for a Darwinian process will not often be found, as people transform the cultural variants to which they are exposed. Second, even when a pattern of recurrence and spread is present, the causal underpinnings will usually be wrong for a Darwinian explanation. Sperber is not opposed to a generally ‘populational’ approach to culture, but Darwinian populations do not provide the right model. Instead, Sperber favours an epidemiological model. In this paper, I will not make use of Sperber's specific psychological hypotheses, but I will make use of his argument that recurrence can be due to factors other than token-to-token copying.
A second criticism of evolutionary views of culture goes beyond Sperber's. Fracchia & Lewontin argue that cultural variants are not generally properties of individuals at all, and when they are, they are consequences of diverse interactions that an individual has with many others, and also with an environment built and maintained over many generations.
Acculturation occurs through a process of constant immersion of each person in a sea of cultural phenomena, smells, tastes, postures, the appearance of buildings, the rise and fall of spoken utterances. But if the passage of culture cannot be contained in a simple model of transmission, but requires a complex mode of acquisition from family, social class, institutions, communications media, the work place, the streets, then all hope of a coherent theory of cultural evolution seems to disappear [22, p. 73].
Fracchia and Lewontin reject a broadly ‘populational’ view of culture, as well as a specifically Darwinian one. They hold that no theory based on individual-to-individual interaction in populations will be adequate.
4. Darwinian imitation
Sections 4–6 look at recent work on cultural evolution, both empirical and theoretical. I start by looking more closely at imitation.
One body of work has explored mappings between copying rules of different kinds and evolutionary dynamics familiar from biological models. This work is often non-committal on what sorts of cultural variants are typically copied; they may be ideas, behaviours or artefacts. Some authors note that while a variety of things might be copied, ideas are fundamental . Sterelny , in contrast, argues that artefacts are more plausible ‘replicators’ than other candidates. Here I consider various cases without treating any as fundamental. I will use the term Darwinian imitation for any imitation-like process in which each instance, or token, of a cultural variant has one or more parent instances. The formation of a new instance need not be by faithful copying, and there may be more than one parent. But there must be some parent–offspring similarity, and the clarity of a ‘parent–offspring’ relation of the relevant kind is inversely related to the number of parents—if there are too many parents there are no parents at all.
A well-studied evolutionary dynamics is the replicator dynamics, which is approximately applicable to evolution in bacteria and other asexual organisms [25,26]. Suppose we are looking at change in the frequency in the population of some type of entity, the A type. In a continuous replicator dynamics, change in $p$, the frequency of that type, follows the rule $\mathrm{d}p/\mathrm{d}t = p\,(W_A - \bar{W})$, where $t$ is time, $W_A$ is the fitness of the A type, and $\bar{W}$ is mean fitness in the population. Björnerstedt and Weibull discuss the following process of behavioural imitation. Suppose each person in a population uses an initially assigned behavioural strategy until they decide to switch it for another. An individual chooses a new strategy by imitating a randomly chosen member of the population; payoff has no role at that stage. Payoff does have a role, however, in determining how often a person decides to switch. Suppose the probability of an agent revising their behaviour is inversely related to the payoff resulting from their present behaviour. More successful strategies are revised less often. If the rate of revision is a linear function of dissatisfaction, then the frequencies of behaviours in the population change according to a replicator dynamics.
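The link between the imitation process and the replicator dynamics can be checked in a simple two-strategy case. The sketch below is a mean-field calculation under assumptions of my own: constant payoffs `w_a` and `w_b`, and a revision rate of the form `base_rate - slope * payoff`, which is one way of making revision a linear function of dissatisfaction. The function and parameter names are hypothetical.

```python
def replicator_rhs(p, w_a, w_b):
    """Continuous replicator dynamics for type A: dp/dt = p * (W_A - mean W)."""
    mean_w = p * w_a + (1.0 - p) * w_b
    return p * (w_a - mean_w)


def imitation_rhs(p, w_a, w_b, base_rate, slope):
    """Expected flow of frequency under revise-then-imitate-at-random.

    An agent with payoff w revises at rate base_rate - slope * w (assumed to be
    linear and positive); a reviser copies a uniformly random member of the
    population, so payoff plays no role at the copying stage.
    """
    revise_a = base_rate - slope * w_a
    revise_b = base_rate - slope * w_b
    inflow = (1.0 - p) * revise_b * p    # B-users who revise and happen to copy an A-user
    outflow = p * revise_a * (1.0 - p)   # A-users who revise and happen to copy a B-user
    return inflow - outflow


# Illustrative parameter values; both revision rates stay positive.
w_a, w_b, base_rate, slope = 2.0, 1.0, 1.0, 0.25
comparison = [
    (p, imitation_rhs(p, w_a, w_b, base_rate, slope), slope * replicator_rhs(p, w_a, w_b))
    for p in (0.1, 0.5, 0.9)
]
```

In this setting the two expressions agree at every frequency, with `slope` acting only as a rescaling of time. That is the sense in which payoff-sensitive revision combined with payoff-blind copying yields a replicator dynamics.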
This model can also apply to artefacts. Introducing a discussion of Polynesian canoes, Rogers & Ehrlich  translate a 1908 passage from the French philosopher ‘Alain’ (Émile-Auguste Chartier), who claimed that as ‘every boat is copied from another boat’, a Darwinian approach to boats can be taken. Badly designed boats will sink and not be copied, so ‘it is the sea herself who fashions the boats’ (p. 3417). Suppose that was true of boats; whenever a boat sinks, a new one is made by copying a single surviving boat chosen at random. Then, every boat has a unique parent boat. If boats vary, and the variation can be described in terms of different types whose distinctive features are passed on in copying, the situation matches the assumptions of Björnerstedt and Weibull and the evolution of boats will follow a replicator dynamics. The sinking of a boat maps onto the abandoning of a behaviour owing to dissatisfaction in the Björnerstedt–Weibull model.
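A toy simulation of the boat case, under assumptions that go beyond the text: two hypothetical designs with fixed sinking risks, a fixed fleet size, and discrete seasons. Each lost boat is replaced by a copy of one surviving boat chosen at random, so every new boat has a single parent. Because time is discrete here, the sketch approximates a discrete-time analogue of the replicator dynamics rather than the continuous rule.

```python
import random


def step(boats, sink_prob, rng):
    """One season: some boats sink; each lost boat is replaced by a copy of a
    single surviving boat chosen at random."""
    survivors = [b for b in boats if rng.random() >= sink_prob[b]]
    if not survivors:  # degenerate case: nothing left to copy
        return boats
    n_lost = len(boats) - len(survivors)
    replacements = [rng.choice(survivors) for _ in range(n_lost)]
    return survivors + replacements


def simulate(n_boats=200, seasons=40, seed=0):
    """Return the frequency of the sturdy design at the start and after each season."""
    rng = random.Random(seed)
    # Hypothetical designs and sinking risks, chosen only for illustration.
    sink_prob = {"sturdy": 0.05, "fragile": 0.30}
    boats = ["sturdy"] * (n_boats // 2) + ["fragile"] * (n_boats // 2)
    history = [boats.count("sturdy") / n_boats]
    for _ in range(seasons):
        boats = step(boats, sink_prob, rng)
        history.append(boats.count("sturdy") / n_boats)
    return history


history = simulate()
```

No boat-builder in this sketch evaluates designs. The copying is blind, and the design that sinks less often spreads only because its instances remain available as models for longer.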
The Björnerstedt and Weibull mechanism uses a kind of ‘differential mortality’ in culture. Putting it metaphorically, every instance of a cultural variant ‘alive’ at any time has the same chance of reproducing, but some stay alive longer than others. It is also possible to have a model with ‘differential fertility’: assume synchronized generations so every idea or behaviour ‘dies’ at the same time, but before dying there is reproduction, and some cultural variants reproduce more than others. For example, suppose everyone updates their behaviour at each time-step, and does so by copying the behaviours of individuals who were successful at the previous time-step. Rules of this kind are used extensively in models by Skyrms . There are different forms of the rule depending on who is copied. One of the more psychologically plausible is imitate your best neighbour: at each time-step, copy the behaviour of whichever of your immediate neighbours was most successful on the previous step. This rule does not yield a replicator dynamics, however, because change depends both on fitnesses and on who is in whose neighbourhood and hence who is available to be copied. A broader class of dynamics is sometimes recognized: a dynamics is payoff monotone if for any two types in a population, the one with the higher average payoff at a time has a higher growth rate at that time . The replicator dynamics is a special case, and some other imitation rules are seen as relatives of the replicator dynamics because they are payoff monotone. For example, Bendor & Swistak  note that in cultural contexts, rare successful types may grow quickly, in part because of their very rarity. This process of ‘elite imitation’ differs causally from biological reproduction, but it is still payoff monotone. 
Imitate your best neighbour is not payoff monotone, however, as it is possible for a type to increase in frequency despite having low average payoff, if some successful individuals of that type are appropriately positioned in the network.
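A small constructed case shows how this can happen. The ring of eight agents, the type labels and the payoff values below are hypothetical, and payoffs are simply stipulated rather than derived from a game. Each agent copies whichever of its two immediate neighbours had the higher payoff on the previous step.

```python
def imitate_best_neighbour(types, payoffs):
    """Each agent on a ring adopts the type of its more successful immediate
    neighbour (ties go to the left neighbour). Updating is synchronous."""
    n = len(types)
    new_types = []
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        best = left if payoffs[left] >= payoffs[right] else right
        new_types.append(types[best])
    return new_types


def mean_payoff(types, payoffs, kind):
    """Average payoff over all agents currently of the given type."""
    values = [w for t, w in zip(types, payoffs) if t == kind]
    return sum(values) / len(values)


# Two adjacent high-scoring A agents, one poorly scoring A agent, five B agents.
types = ["A", "A", "B", "B", "B", "B", "B", "A"]
payoffs = [7, 7, 6, 6, 6, 6, 6, 0]

updated = imitate_best_neighbour(types, payoffs)
```

Here type A has the lower average payoff (14/3 against 6 for type B), yet the number of A agents rises from three to four after one round, because the two successful A agents sit where their neighbours can see them.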
It is also possible for a copying rule to give rise to a replicator dynamics without being a case of Darwinian imitation in my sense. Suppose individuals copy a behaviour on the next step with a probability proportional to the behaviour's summed success across the whole population on the previous step. This rule, which is perhaps psychologically implausible, gives rise to a replicator dynamics even though each behaviour instance does not have a small number of ‘parent’ instances. So replicators can exist without a replicator dynamics applying, and vice versa. (See Weibull and Alexander for further discussion of imitation and the replicator dynamics.)
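For the two-type case, the step from the summed-success rule to a replicator dynamics can be written out. The derivation assumes positive payoffs, a population of size $N$ in which everyone updates at each step, and expected rather than sampled frequencies; the symbols $N$, $W_B$ and $p'$ are introduced here for illustration. The result is the discrete-time form of the replicator dynamics.

```latex
% Two types, A and B, with frequencies p and 1 - p in a population of size N.
% Summed success of A on the previous step: N p W_A; of B: N (1 - p) W_B.
\begin{align*}
p' &= \frac{N p\, W_A}{N p\, W_A + N (1 - p)\, W_B}
    = \frac{p\, W_A}{\bar{W}},
    \qquad \bar{W} = p\, W_A + (1 - p)\, W_B, \\
\Delta p &= p' - p = \frac{p\,(W_A - \bar{W})}{\bar{W}}.
\end{align*}
```

The new frequency depends only on population-wide totals, so no particular instance of the behaviour serves as the parent of any new instance.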
Local imitation has been shown to produce prosocial outcomes in game-theoretic scenarios such as the Stag Hunt and Divide-the-Dollar [26,30]. These models assume a population of individuals located in space, interacting with neighbours and following a rule of behavioural updating as a result of the individual's experience. Skyrms argues that local imitation rules can lead to more prosocial outcomes than some ‘smarter’ rules used by other game theorists, and suggests that prosociality arises more easily in an evolutionary context than in one governed by classic rational choice.
Earlier I distinguished several kinds of explanatory project in biology, according to the grain of description and whether origins are being explained or not. This work on imitation is aimed at distribution explanations. A fixed range of options is assumed, and they change in frequency. This is work in the style of the synthetic theory in evolutionary biology. Like that work, it depends for its power on a reliable low-level rule for the local transmission of cultural variants; the aim is something like a population genetics of culture. As far as theory goes, recent work does establish substantial links between cultural and biological mechanisms. It gives us ‘how-possibly’ explanations for the maintenance of various behaviours. Its empirical application—its ability to give us ‘how-actually’ explanations—is another matter.
To be empirically applicable a model of this kind requires strong psychological assumptions. A defender of this approach must engage with the kind of challenge I associated with Sperber, for example, in §3. Do cultural variants spread by the copying of tokens, or not? Here I will discuss two simple empirical phenomena that degrade the relevant parent–offspring relations. These are conformism and certain kinds of practical intelligence.
Conformism has been shown to be empirically important in human societies. A conformist rule, of the kind relevant here, is one in which updating is sensitive to the overall frequency of behaviours observed by the updating individual. In that case, a newly produced behaviour has no ‘parent’ instances. Conformism may be more discriminate than this; it might involve copying of a smaller number of individuals according to their status, such as senior members of a clan. Then, there may be parent–offspring lineages between behaviours again, but copying is not success-driven.
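One way to formalize the first, frequency-sensitive kind of conformism is with a bias term that pushes the more common behaviour towards fixation. The functional form below is of a kind often used in models of conformist transmission; it is an assumption for illustration, not something the argument here relies on, and the parameter name `strength` is hypothetical.

```python
def conformist_step(p, strength):
    """Next frequency of a behaviour under frequency-dependent (conformist) bias.

    p        -- current frequency of the behaviour in the population
    strength -- bias strength in [0, 1]; 0 leaves the frequency unchanged
    """
    return p + strength * p * (1.0 - p) * (2.0 * p - 1.0)


# A majority behaviour gains ground, a minority behaviour loses it.
majority_next = conformist_step(0.6, strength=0.5)
minority_next = conformist_step(0.4, strength=0.5)
```

The update takes only the aggregate frequency `p` as input. Neither payoff nor any particular model instance appears in it, which is why a rule of this kind yields neither success-tracking nor parent–offspring lineages between behaviours.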
The second factor is ordinary individual intelligence. An observer may take on board the success of local individuals, but may combine many such observations into their decisions, attending to both positive and negative exemplars (avoiding failure as well as imitating success), and may engage in individualistic tinkering as well. Returning to the example from Rogers and Ehrlich, if each boat is faithfully copied off a single boat, then boats are replicators. If each boat is influenced by more than one ‘parent’ boat but the number is fairly small and variations are inherited to some extent, then boats may be Darwinian individuals without being replicators. But the multiplication of influences degrades parent–offspring relations. Once many boats, and other objects, have diverse influences, positive and negative, on each new boat, parent–offspring lineages fade and are lost.
The question of whether the requirements of this kind of evolutionary model are met connects to several debates in contemporary psychology. I have emphasized that phenomena such as conformism and flexible use of intelligence degrade reproductive lineages between cultural variants. It is an empirical question how widespread these tendencies are. Some results in developmental psychology suggest that human children show a distinctive tendency to ‘overimitate’ action sequences when copying a new skill. Other work has emphasized not a tendency for rigid imitation, but the sensitivity of a copying child to subtle indicators of likely success, in a way that fits a Bayesian model of rationality. One question here concerns how much practical intelligence intrudes on copying, and another question concerns how it does. On the latter issue, it would be possible for imitation to be guided in a fine-grained way by rational processing, but still yield parent–offspring lineages between cultural variants. You might spend a year deciding exactly which boat to copy, but then copy it in every detail, including features that have no apparent adaptive rationale. In contrast, you might attend in diverse ways to several models, tracking failure as well as success, and using other boats as prompts to intelligent tinkering. A person might switch between one mode and the other on different occasions. Reviewing the psychological literature, Shea suggests that humans may have the capacities for two different kinds of broadly imitative behaviour, one involving a kind of ‘automatic mirroring’, and the other being more guided by deliberate choice (p. 2439). There may be modes of copying in human societies that do bring about parent–offspring lineages in cultural variants, and modes of copying that are guided too much by general intelligence to do this.
The status of the assumptions made by recent models of imitation also connects to foundational issues about explanation. A defender of these models in the face of psychological criticism might point out that many biological and physical models that have yielded genuine understanding are quite idealized, introducing many deliberate simplifications. Similarly, social models that assume a simple and uniform copying rule may capture causal tendencies present in more complex empirical cases. A model of this kind may be compared with physical models of interactions between molecules in a gas . Many unrealistic assumptions are made about gases in physical models, but this does not stop them from being predictively useful and furnishing good explanations.
In sum, recent theoretical work has established clear bridges between some classic biological models and cultural change based on imitation. This work is at a micro-evolutionary level and aims primarily at giving distribution explanations. When applied to empirical systems, these models make strong psychological assumptions, and it is easy for familiar factors such as conformism and the exercise of practical intelligence to produce a violation of those assumptions. Where low-level choices are intelligent and combine various sources of information, there is a particular disanalogy between the cultural and biological cases. In the biological case, as discussed in §2, there is a kind of ‘smartness’ at the meso-level, whereas at the micro-level genetic details intrude. In the cultural case, the individual choices that make up the micro-level tend to literally be smart, and in some though not all cases, this degrades parent–offspring relations between cultural variants, reducing the role of Darwinian imitation.
5. Cumulative cultural adaptation
In this section, I discuss another project applying evolutionary theory to culture, one sometimes combined with the first and sometimes defended independently [1,23,24,38,39]. The aim of this work is the explanation of cultural innovation.
In a memorable discussion, Henrich & McElreath tell the tale of the 1860 Burke and Wills expedition in Australia. The Europeans set out with a well-equipped party, but were unable to survive despite being surrounded by Aboriginal people living in the same country quite successfully. Burke and Wills were smart individuals, but they had none of the locals' accumulated knowledge. The Aboriginal people were able to detoxify seeds from an aquatic fern, for example, whereas the Europeans poisoned themselves by eating them unprocessed. As Henrich and McElreath put it, human cultural abilities ‘generate adaptive strategies and bodies of knowledge that accumulate over generations’. Foraging in most environments requires ‘skills and know-how that no single individual could figure out in his lifetime’. The aim of this second research project is to explain this accumulation using evolutionary concepts. In both biology and culture, successive rounds of undirected variation can yield significant design improvements, provided that the successful variants in one generation proliferate and provide many independent platforms at which further innovation can occur.
Unlike the projects discussed in §4, the aim of this work is to give origin explanations, explanations of how a population could come to contain anyone who can build a canoe out of seal skin or catch fish without fishing line. The aim is not to track the dynamics of existing behaviours, except insofar as they contribute to the discovery of new ones.
This project differs from the first in what it requires and what it can be neutral about. Continuing with an example from §4, what must be present in a boat-building culture for origin explanations of this kind to be available? There is no need for a uniform rule of copying, applied by everyone in each case. What is important is that changes have a particular shape when viewed from a slightly more zoomed-out perspective. This shape is like a ‘chain of fountains’ (figure 3). When a useful new variant appears, it is able to proliferate through the population, and its proliferation creates independent platforms for further tinkering. That shape is compatible with a number of different patterns that might be seen when we zoom in. Looking closely, we might see a simple and uniform copying rule—such as the Björnerstedt–Weibull or Skyrms rules from §4, with some ‘noise’ creating new variation. But we might not see this; instead we might see individuals combining imitation, conformism and non-social tinkering. Different individuals might approach things differently from others, and one individual might approach a problem differently on different occasions. Individual variation in learning styles, and the haphazard intervention of ordinary intelligence, which make a ‘population genetics of culture’ less feasible, are no problem.
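The ‘chain of fountains’ shape can be illustrated with a toy simulation. The sketch below is an illustration only, not a model taken from the work discussed here: the function name `simulate`, the parameter values, and the rule that each agent surveys between one and five others are all assumptions made for the example. Tinkering is modelled as a small chance per generation that a blind modification turns out to be an improvement, and adoption is deliberately non-uniform, with agents consulting different numbers of others on different occasions.

```python
import random


def simulate(n_agents=200, generations=100, p_improve=0.02,
             diffusion=True, seed=0):
    """Toy model of cumulative change (all parameters are assumptions).

    Each agent holds a skill level. In every generation an agent has a
    small chance of stumbling on a one-step improvement to whatever it
    currently has. If `diffusion` is on, each agent then looks at a few
    others and adopts the best level it sees when that beats its own.
    Returns (best level, mean level) at the end of the run.
    """
    rng = random.Random(seed)
    levels = [0] * n_agents
    for _ in range(generations):
        # Tinkering: rare, undirected improvement on the current variant.
        for i in range(n_agents):
            if rng.random() < p_improve:
                levels[i] += 1
        if diffusion:
            # Adoption without a uniform rule: the number of others
            # consulted varies from agent to agent and occasion to occasion.
            snapshot = levels[:]
            for i in range(n_agents):
                n_observed = rng.randint(1, 5)
                best_seen = max(rng.sample(snapshot, n_observed))
                if best_seen > levels[i]:
                    levels[i] = best_seen
    return max(levels), sum(levels) / n_agents


if __name__ == "__main__":
    print("with diffusion:   ", simulate(diffusion=True))
    print("without diffusion:", simulate(diffusion=False))
```

Under these assumed settings the frontier of skill tends to advance much further when improvements can spread, because each proliferated variant becomes a platform for many independent attempts at the next step. No agent in the sketch follows a fixed copying rule, and no parent–offspring bookkeeping between tokens is needed.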
The required pattern at the middle level can exist without parent–offspring relations being present between all instances of cultural variants. If we look at one of the points in figure 3 where an innovation first appears, the instances of that variant appearing immediately after the first one will probably be ‘offspring’ of the first. These ‘second-generation’ tokens cannot have been influenced by much other than the success of the first, unless all of them were products of some common local environmental cause. But once we move past these early adopters, individuals acquiring the new cultural variant may be influenced by many others, with diversity in the kinds of influence exerted. The adoption of an idea or behaviour may be affected by the intelligent filtering of all the surrounding instances. Cultural variants certainly need not be replicators, and they need not always be Darwinian individuals either.
In thinking about these phenomena of recurrence without parent–offspring lineages, it is worth noting again known ways that variation and selection can operate without reproduction by the successful. This, as discussed in §2, is part of the message of the analogy between evolution by natural selection and trial-and-error learning. In trial-and-error learning, the entrenching of an internal structure within an individual produces further instances of a behaviour found to be successful. A collection of intelligent agents of the kind sketched above is not like a single trial-and-error learner; the process envisaged there is more distributed than that. When an innovation appears it diffuses locally, rather than by changing the state of the entire society. Diffusion occurs because in each location a collection of factors—an intelligent agent and a multi-faceted social setting—together give rise to a new instance of a successful variant.
As a result, some general arguments against evolutionary views of culture, the family of arguments I associated with Sperber, can be accepted by an advocate of this second kind of evolutionary approach. But views of the kind exemplified by the Fracchia and Lewontin critique must be resisted. For the proliferation of a successful variant to provide many independent sites at which further improvement can occur, the population must retain some of the ‘local’ or ‘loose’ character discussed earlier. If the members of the population are too tightly bound together in how they behave, then the proliferation of a behaviour may have almost any effect on what happens next, since the consequences are a function of the state of the entire society. The requirements for this second evolutionary project are not simply weaker than those of the project discussed in §4, however, because the requirement of cumulative change, of small adaptive steps, was not needed for that first project.
This second project also has to deal with a question that can be raised generally for selection-based theories of innovation. The aim of the work is to show how a culture can be ‘smarter than its members’, just as biological evolution can lead to a population-level ‘search’ that is beyond the capacities of anyone in the population. In the case of biological Darwinism, mutations are produced wholly unintelligently. In the cultural case, new moves are at least generally produced with some intelligence. The argument must be that the collective's results are not just smarter than what a single individual could come up with, but also smarter than a simple aggregation of smart individual choices. Cases where improvements are made by one smart person ‘standing on the shoulders of others’ (in Isaac Newton's words) do not support a Darwinian approach to innovation. Advocates of the evolutionary approach to innovation point to skills such as the building of canoes, skills that no one individual could have worked out. But if such capital is built by a few individuals in each generation engaging in intelligent goal-directed improvement based on the output of their precursors, then the population-level process is not smarter than its members in the required sense.
Summarizing this section, a second kind of work on cultural evolution operates at a meso-level and aims at giving origin explanations. Some of the psychological phenomena that cause problems for a ‘population genetics of culture’ do not cause problems for this project. Explanations of the spread of an innovation that do not involve reproductive lineages between all tokens of the variant can be accepted. On the other hand, it is no surprise that some change is adaptive in a culture made up of smart and adaptive individuals. This is unlike the biological case, where adaptation in a population is possible even when the constituent individuals and mutational process are entirely unintelligent. For evolutionary ideas to get ‘traction’ on culture here, innovation must occur through the accumulation and diffusion of small improvements in a system where the multiplication of tokens provides many independent platforms at which the next improvement can occur.
6. Cultural phylogenetic change
Sections 4 and 5 have considered cultural change from very fine-grained and somewhat coarser-grained perspectives. In biology, there is also a still coarser-grained perspective on evolution, the level at which phylogenetic relations appear. A third body of work on cultural evolution attempts to take a phylogenetic approach in the same sense [40–43].
Gray & Jordan looked at Polynesian cultures, aiming to resolve a debate about the origin and timing of human settlement of the Pacific islands. They approached this by taking common words from many different Pacific languages and working out the likely pattern of ancestry. They concluded that the languages did contain a signal of the pattern of splitting, enabling a reconstruction of the history of settlement, with a likely Taiwanese origin for the settlers.
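The general idea of recovering a pattern of splitting from shared vocabulary can be sketched with a toy example. The code below is not the method used in the study just described; it substitutes a simple average-linkage clustering on Hamming distances. The four ‘languages’ A to D and their binary profiles (1 if a word form is present, 0 if absent) are invented purely for illustration, as are the names `hamming`, `average_linkage_tree` and `toy_profiles`.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions at which two equal-length profiles differ."""
    return sum(a != b for a, b in zip(u, v))


def average_linkage_tree(profiles):
    """Build a nested-tuple tree by repeatedly merging the closest clusters.

    `profiles` maps a label to a list of 0/1 values. Distance between two
    clusters is the mean Hamming distance over all cross-cluster pairs.
    """
    # Each cluster is (subtree, list of member labels).
    clusters = [(name, [name]) for name in profiles]

    def avg_dist(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        total = sum(hamming(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical data: A and B share most forms, as do C and D.
toy_profiles = {
    "A": [1, 1, 1, 1, 0, 0, 0, 0],
    "B": [1, 1, 1, 0, 0, 0, 0, 0],
    "C": [0, 0, 0, 0, 1, 1, 1, 1],
    "D": [0, 0, 0, 0, 1, 1, 1, 0],
}

if __name__ == "__main__":
    print(average_linkage_tree(toy_profiles))
```

For the invented profiles the procedure groups A with B and C with D before joining the two pairs, which is the sense in which present-day similarities can carry a signal of past splitting. Real analyses face complications that this sketch ignores, such as borrowing between languages and unequal rates of change.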
What are the requirements for this third kind of project to work? It is not necessary to have parent–offspring relations between instances of cultural variants at the micro-level, as in the first project. It is also not necessary to have a pattern of cumulative adaptive evolution at the meso-level. Returning to the boat-building example, it might be that each boat built in a culture is the product of nothing less than the coordinated efforts of the entire society. Then, the proliferation of a new kind of boat in the culture does not produce more ‘independent experiments’ at which further innovation can occur, because no-one's boat-building activities are independent of anyone else's. In fact, it is possible to drop the requirement that we are dealing with a population, in my sense, at all. The human and artefactual elements of the culture might be so tightly knitted together that change in the society is more akin to change within a single organism than to change in an evolving population. I am not saying that this holistic view of culture is superior, either in general or in specific cases, to a more ‘localist’ view. The point concerns what this third research project requires. Within a culture, any degree of causal holism is possible.
A holistic character for change within each culture does not prevent the possibility of a ‘phylogenetic signal’, and a tree-like pattern of change, being present when we compare one culture with another (figure 4). The requirements for the applicability of phylogenetic methods in a case like this are strong, but they are different from those associated with the other two projects. There are two main requirements. First, there must be a reasonable degree of separateness of the branches. There cannot be too much reticulation and blending. The macro-structure present need not be a ‘tree’, with forks pointing only in one direction and no re-joining. But the edges making up the network must remain reasonably distinct from each other. Second, in order to reconstruct the history of the structure from its present state, change on each segment must be path-dependent. It must be reasonably gradual, or constrained in some other way by its history. As in the other cases, these requirements might be approximated rather than strictly met, but these are the requirements that are relevant, and they are different from the requirements of the other two projects. Work on cultural change in this third style has included the investigation of cultural traits that might also be addressed using evolutionary ideas at the micro-level, such as change in vocabulary, but has also included work on the political structure of societies. Applying a phylogenetic approach, Currie et al. found support for a gradualist model of change between four political forms in the Pacific: ‘acephalous’ societies, simple and complex chiefdoms, and states. Being a complex chiefdom is not a trait that can increase or decrease in frequency within a society, as it is an organizational feature of the whole.
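The second requirement stated above, that change along each branch be reasonably gradual, can be illustrated with a small simulation. This is a sketch under invented assumptions rather than a method from the work cited: cultures are represented as lists of binary traits, each trait flips independently with a fixed probability per time step, and the names `evolve`, `distance` and `sister_to_outgroup_ratio` are hypothetical. Two sister cultures A and B descend from a shared stem, and a third culture branches off earlier as an outgroup.

```python
import random


def evolve(traits, steps, flip_prob, rng):
    """Return a copy of `traits` after `steps` rounds of independent flips."""
    current = list(traits)
    for _ in range(steps):
        current = [1 - t if rng.random() < flip_prob else t for t in current]
    return current


def distance(u, v):
    """Number of traits on which two cultures differ."""
    return sum(a != b for a, b in zip(u, v))


def sister_to_outgroup_ratio(flip_prob, n_traits=2000, steps=10, seed=0):
    """Distance between sisters A and B, divided by distance from A to
    an outgroup that split off from the root before the A/B stem formed.

    A ratio well below 1 means the branching order is still visible in
    the present-day traits; a ratio near 1 means it has been erased.
    """
    rng = random.Random(seed)
    root = [rng.randint(0, 1) for _ in range(n_traits)]
    stem = evolve(root, steps, flip_prob, rng)   # shared ancestor of A and B
    a = evolve(stem, steps, flip_prob, rng)
    b = evolve(stem, steps, flip_prob, rng)
    outgroup = evolve(root, 2 * steps, flip_prob, rng)
    return distance(a, b) / distance(a, outgroup)


if __name__ == "__main__":
    print("gradual change (1% per step):  ", sister_to_outgroup_ratio(0.01))
    print("unconstrained change (50%):    ", sister_to_outgroup_ratio(0.5))
```

With slow change, the sisters remain noticeably closer to each other than to the outgroup, so the history of splitting can in principle be read off present states. When every step re-randomizes the traits, all pairs end up about equally different and no such signal survives. Nothing in the sketch depends on how change happens inside each culture, only on its being constrained by what came before.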
Returning to the challenges to evolutionary theories of culture mentioned earlier, the advocate of evolutionary methods operating on a phylogenetic scale need not argue with either of them. Imitation might be rare, as Sperber says. Culture might be holistic in its influence on each person, as Fracchia and Lewontin say. Those situations are consistent with there being reasonably discrete segments in the larger network of cultures. A more cohesive, less population-like form of culture may even have a better fit to the requirements of phylogenetic methods than looser forms of culture, because a more cohesive culture may be less likely to draw on outside influences, and maintain better boundaries. If so, a culture's good fit to the requirements of one evolutionary model may be associated with a poorer fit to the requirements of another evolutionary model working at a different grain.
The argument of this paper has been that evolutionary ideas can be applied to cultural change at several different levels of grain and to different explanatory targets. The resulting models make different empirical commitments, both social and psychological. Some of these relationships are summarized in table 1. I will close with some comments about the relation between these ideas and questions about human cognitive evolution.
A general feature of humankind is our ability to improve what we do by tracking the consequences of behavioural options. This can occur in various ways: it can work at the individual level, or in a way distributed across a population; it may happen through a simple variation-and-selection mechanism, or through a more complicated kind of information processing. Some combination of these success-tracking capacities figured in the evolutionary process that made humans into specialists in information-gathering and intelligent control. As Tomasello has emphasized, part of what must be explained here is the way knowledge is retained and accumulates in human societies over many generations [1,44]. This is due to some human-specific combination of individual cognitive capacities and social organization. As discussed in works such as Whiten & Erdal's contribution to this issue, the likely setting during important periods in the evolution of both our genus and species featured small groups with an egalitarian structure. Groups of this kind are conducive to cultural phylogenetic change. Whether Darwinian imitation and cumulative cultural adaptation were important in this context depends on further psychological facts. One possibility is that the breakthrough in knowledge acquisition seen in our lineage was due to the appearance of an unusual kind of imitation (high-fidelity token-to-token copying), a trait manifested experimentally in ‘over-imitation’ and related behaviours discussed at the end of §4. If this form of imitation is psychologically distinct, then the question arises of its relations to more flexible and rational ways of tracking success, in which humans can clearly engage, and also to the special role of teaching and ‘scaffolded’ learning in human societies.
The very high social cohesion that Whiten and Erdal hypothesize, amounting to a ‘band-level, central information processing system’ in hunter–gatherer groups, might in some cases lead away from the distributed ‘localist’ model of adaptation discussed earlier.
A number of psychologists have recently explored ‘dual process’ or ‘two system’ hypotheses about the human cognitive architecture. These views posit a separation between rational, deliberate processes showing flexibility and top-down control, and underlying parallel processes that are involuntary and unconscious [47,48]. I noted earlier the possibility that humans have the capacities for two different kinds of broadly imitative behaviour, one involving a kind of ‘automatic mirroring’, and the other guided by deliberate choice. A dual-system view of this kind will probably also recognize an alternative success-tracking device in the unconscious parallel system, the ancient and phylogenetically widespread capacity for individual reinforcement learning. A view such as this sees humans as containing two specialized mechanisms that facilitate variation-and-selection processes, one operating within the individual and one distributed socially. This approach can be contrasted with one that resists ‘dual process’ separations, and emphasizes the way that humans have integrated several kinds of success-tracking (attending to reinforcement, attending to the success of others and more sophisticated mental modelling) into a unified way of dealing with their social and non-social environment.
I am grateful to Celia Heyes, Nicholas Shea, Jon Wikins, and three anonymous referees for comments on earlier drafts. Rory Smead provided valuable assistance with the relations between evolutionary dynamics and imitation. I also thank Jane Sheldon and Eliza Jewett-Hall for assistance with the figures and all the participants at the ‘New Thinking’ conference, Oxford 2011.
One contribution of 15 to a Theme Issue ‘New thinking: the evolution of human cognition’.
This journal is © 2012 The Royal Society