Royal Society Publishing

On the shape and fabric of human history

Russell D. Gray, David Bryant, Simon J. Greenhill


In this paper we outline two debates about the nature of human cultural history. The first focuses on the extent to which human history is tree-like (its shape), and the second on the unity of that history (its fabric). Proponents of cultural phylogenetics are often accused of assuming that human history has been both highly tree-like and composed of tightly linked lineages. Critics have pointed out obvious exceptions to these assumptions. Instead of a priori dichotomous disputes about the validity of cultural phylogenetics, we suggest that the debate is better conceptualized as involving positions along continuous dimensions. The challenge for empirical research is, therefore, to determine where particular aspects of culture lie on these dimensions. We discuss the ability of current computational methods derived from evolutionary biology to address these questions. These methods are then used to compare the extent to which lexical evolution is tree-like in different parts of the world and to evaluate the coherence of cultural and linguistic lineages.

1. Introduction

The only figure in Darwin's (1859) On the Origin of Species is an evolutionary tree. This tree reflects Darwin's vision of descent with modification from a common ancestor. Today phylogenetic methods or ‘tree-thinking’ (O'Hara 1997) form the foundation of inferences in evolutionary biology (Harvey & Pagel 1991; Huelsenbeck & Rannala 1997; Felsenstein 2004). However, biologists are not alone, nor even first, in their use of trees to represent histories of descent with modification. There is a long parallel tradition of using trees to study linguistic and cultural genealogies (Spielman et al. 1974; Cavalli-Sforza et al. 1988; Atkinson & Gray 2005; Hunley et al. 2007, 2008). There is also a lengthy history of scepticism about the applicability of evolutionary analogies to culture. The influential American anthropologist Kroeber (1948) explicitly contrasted Darwin's idea of a ‘tree of life’ with that of a ‘tree of cultures’. Kroeber argued that the tree of cultures entwines around itself, with frequent borrowing and diffusion of traits between cultures. In this scenario, information not only flows vertically from parent to daughter cultures but, just as importantly, horizontally between them too.

There is a constant branching-out but the branches also grow together again, wholly or partially, all the time. Culture diverges, but it syncretizes and anastomoses too. … The tree of culture … is a ramification of such coalescences, assimilations, or acculturations. (Kroeber 1948, pp. 260–261)

The late palaeontologist Stephen Jay Gould was also a vocal critic of phylogenetic approaches to culture. In his 1987 book, An Urchin in the Storm, he proclaimed that:

Human cultural evolution proceeds along paths outstandingly different from the ways of genetic change… Biological evolution is constantly diverging; once lineages become separate, they cannot amalgamate (except in producing new species by hybridization—a process that occurs very rarely in animals). Trees are correct topologies of biological evolution… In human cultural evolution, on the other hand, transmission and anastomosis are rampant. Five minutes with a wheel, a snowshoe, a bobbin, or a bow and arrow may allow an artisan of one culture to capture a major achievement of another. (Gould 1987, p. 70)

Put bluntly, the obvious inference is that while phylogenetic methods are great in the biological realm, in studies of cultural evolution they are doomed to failure because cultural change is governed by completely different principles. Gould was not alone in holding this view (see Terrell 1988; Moore 1994 for total rejections of a phylogenetic approach to cultural evolution). Borgerhoff Mulder et al. (2006, p. 55) espouse the more moderate view that ‘… tree building is a powerful method and provides considerable insight, particularly when based on maximum likelihood and Bayesian inference procedures. However, without principled methods designed to uncover horizontal transmission, there is a danger of biasing findings towards vertical transmission if we only use tree-building methods’. They conclude their review with a cautionary statement that our ‘Current understanding of the relative importance of horizontal and vertical transmission is shaky, to say the least’ (Borgerhoff Mulder et al. 2006, p. 62).

A similar, if rather less polemical, debate exists about the coherence or fabric of cultural evolution. In an insightful article, Boyd et al. (1997) lay out a range of possibilities for the fabric of cultural evolution. First, culture could evolve as (vertebrate) species do. Factors such as shared worldview, cultural group selection and demographic events might act to ensure that cultures are coherent and tightly integrated systems with little horizontal transmission between cultures. Mace & Holden (2005, p. 117) argue that ‘population dynamics can lead to group-level selection occurring in human cultural evolution … Such processes could maintain the identity of discrete cultural groups even when genetic distinctions are more blurred or even absent’. The main pathway of information flow in such cases would be vertically between generations and hence phylogenetic methods should work well. Pagel & Mace (2004) and Mace & Holden (2005) defend something close to this viewpoint. Second, cultures could be hierarchically integrated systems. Here, cultures are comprised of ‘core traditions’ that are inherited vertically. Horizontal transmission occurs, but only affects peripheral traits and not the core of the system. In this scenario, phylogenetic methods will work well for the core traditions, but not for the peripheral traits. In the case of linguistic evolution, basic vocabulary trees might be highly congruent with trees based on innovations in morphology and phonology (e.g. Gray et al. 2009), but much less congruent with trees based on a sampling of the entire lexicon or typological features (Greenhill et al. 2010). A third possibility is that cultures are assemblages of coherent clusters. These clusters are tightly integrated and vertical change occurs inside each cluster, but each cluster can be transmitted horizontally and may thus have a quite distinct evolutionary trajectory. 
In this case, phylogenetic methods will only work on a cluster-by-cluster basis, and only if the boundaries of each cluster can be identified. Finally, if horizontal transmission is the predominant mode of cultural change, then cultures could just be collections of ephemeral entities. In this situation, there is no coherent cultural system beyond a non-structured set of highly diffusible traits. This could be the outcome when cultural evolution is either too rapid, or cultural selection is too strict (such that alternate variants die out almost immediately), or the constraints on culture are severe (i.e. there is only one way to build a mousetrap).

We believe that the current polarized debates about the shape and fabric of human history are not particularly productive. The way forward is not to be found by charging onward building trees in a blinkered and unreflective fashion. Reticulate cultural evolution and multiple cultural histories are real, if sometimes overemphasized. However, simply giving up at the first sign of horizontal transmission or an incongruent tree is no solution either. Despite the concerns about the tree-likeness and coherence of cultural evolution, computational phylogenetic methods have had considerable success recently in answering questions about cultural history ranging from the origin of Indo-European languages (Gray & Atkinson 2003) to the social impact of adopting pastoralism in Africa (Holden & Mace 2003). In this paper, we suggest that further progress can be achieved through a combination of conceptual reframing, new methods for quantifying the tree-likeness and coherence of cultural evolution, and most crucially, empirical research.

2. A reframing

Rather than dichotomous disputes about the validity of cultural phylogenetics, we suggest that the debates are better conceptualized as involving positions along three continuous dimensions (figure 1). The first dimension we propose is Rv, the rate of change in characters transmitted vertically between generations. If this rate is very slow relative to the time period being studied, then there will be too little character change to allow the construction of cultural phylogenies. If Rv is too fast then the trace left by ‘descent with modification’ will be erased. The second dimension is Rh, the rate of horizontal transmission. At low values of Rh, the estimated phylogenies will be good estimates of the cultural history. A recent simulation study by Greenhill et al. (2009) showed that phylogenetic tree estimates can be quite robust under realistic borrowing scenarios and moderate levels of undetected borrowing (e.g. less than 20% per 1000 years). At high values of Rh, the estimated phylogenies will become increasingly inaccurate and poor summaries of the overall history. The third dimension is C, a measure of the extent to which different aspects of culture are coupled together. The challenge for empirical research is therefore to determine where particular aspects of culture lie on these dimensions. Methods exist to quantify the relative and absolute rates of change in cultural traits (Pagel et al. 2007; Greenhill et al. 2010). What we need are methods that enable us to quantify the shape and fabric of cultural evolution.

Figure 1.

This figure positions linguistic traits on three dimensions. Rv is the rate of change of vertically inherited cultural traits, Rh is the rate of horizontal transmission and C is the degree of cultural cohesion (adapted from Gray et al. (2007)). In this hypothetical example, morpho-syntactical traits evolve slowly, are relatively rarely borrowed and are tightly bound together. In contrast, a random sampling of the total lexicon evolves rapidly, has lots of borrowing and reflects many different cultural histories.

3. The shape of cultural evolution

Imagine a dataset (either biological or cultural) that contains comparative information on a range of taxa. For the sake of simplicity, let us assume that each taxon has been assigned a discrete character state for a number of characters (e.g. the nucleotide present at a specific point on a DNA sequence or the presence or absence of a cognate word). For each character, the taxa can be partitioned into a group that shares a specific character state and those that do not. In phylogenetic jargon this is termed a ‘split’. The more characters that group the taxa in the same way, the stronger the support for that split. When the splits are compatible (none of the splits group the taxa in contradictory ways), we can represent a set of splits derived from the whole dataset in a tree. The branches of the tree represent the splits and the branch lengths indicate the split weights. When the splits are incompatible, we can use a split graph. A split graph is a graphical representation of a collection of weighted splits (Bandelt & Dress 1992). In a tree, each split corresponds to a single branch. Removing that edge partitions the taxa set into two parts making up the split. In a split graph, each split corresponds to a collection of parallel edges, all with length equal to the weight of the split. Removing those edges partitions the graph, and therefore taxa set, into the two parts making up the split.
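The notions of a split, its weight and compatibility can be made concrete with a small sketch. The Python below is an illustration only, not the procedure used by SplitsTree4: the taxon names, the four binary characters and the function names are all hypothetical. It derives one split per character, counts how many characters support each split, and tests compatibility with the standard criterion that two splits are compatible when at least one of the four pairwise intersections of their sides is empty.

```python
from collections import Counter

def character_split(states):
    """Split induced by one binary character (dict: taxon -> 0 or 1)."""
    present = frozenset(t for t, s in states.items() if s == 1)
    absent = frozenset(states) - present
    return frozenset({present, absent})

def compatible(split1, split2):
    """Two splits fit on one tree if some pair of sides does not overlap."""
    return any(not (x & y) for x in split1 for y in split2)

# Hypothetical presence/absence data for four taxa (illustrative only).
characters = [
    {"A": 1, "B": 1, "C": 0, "D": 0},  # supports AB | CD
    {"A": 1, "B": 1, "C": 0, "D": 0},  # supports AB | CD again
    {"A": 1, "B": 0, "C": 1, "D": 0},  # supports AC | BD
    {"A": 1, "B": 0, "C": 0, "D": 0},  # supports A | BCD
]

# Split weight taken here as the number of characters supporting the split.
weights = Counter(character_split(c) for c in characters)

ab_cd = character_split(characters[0])
ac_bd = character_split(characters[2])
a_bcd = character_split(characters[3])

print(weights[ab_cd])            # number of characters behind AB | CD
print(compatible(ab_cd, ac_bd))  # these two cannot sit on one tree
print(compatible(ab_cd, a_bcd))  # these two can
```

In this toy dataset the AB | CD and AC | BD splits conflict, which is exactly the situation in which a split graph, rather than a tree, is needed to display both.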

There are a number of methods for obtaining the set of splits to represent in a split graph (reviewed in Huson & Bryant 2006). One method that has proved useful in analysing conflicting signal in biological datasets is the NeighborNet algorithm (Bryant & Moulton 2002, 2004; Bryant et al. 2005; Kennedy et al. 2005). NeighborNet closely resembles agglomerative clustering algorithms like the single and average linkage methods. It constructs splits by progressively combining clusters in a way that allows overlap. The resulting graph provides a useful visualization of the extent to which the data is tree-like. A program that calculates NeighborNets and displays split graphs, SplitsTree4, can be downloaded from

Phylogenetic networks, such as the split graphs produced by the NeighborNet algorithm, give a broad brushstroke picture of conflicting signal within a dataset. The next step is to explore and measure aspects of the data that do not fit well into a tree, determine where the conflicting signal arises and find which taxa are involved. For this, we have found the delta score (Holland et al. 2002) to be useful. The method scores individual taxa from 0 to 1 according to how much each taxon is involved in conflicting signals. The scores returned are defined in terms of quartets, or subsets of four taxa selected from the complete set of taxa. Each quartet is given a score, and the score for a taxon is the average over all quartets that contain it. To determine the score for a quartet, e.g. the quartet containing i, j, k and l, we compute the three sums of the path lengths in the quartet dij + dkl, dik + djl and dil + djk, where d denotes the distance between taxa in the quartet. For example, in figure 2, dij equals the sum of the lengths of the branches a, b and c. Let m1 be the maximum of these three values, let m2 be the second largest value, and let m3 be the smallest. The score assigned to that quartet is then (m1 − m2)/(m1 − m3), or zero if the denominator is zero. The rationale behind this score is that it equals zero if the distances between the four taxa exactly fit a tree; otherwise, the score ranges between 0 and 1. In practice, we find that dividing by the normalization constant (m1 − m3) obscures some of the signal. Instead, we find that the simpler score (m1 − m2)^2 for the quartet (called a Q-residual score in SplitsTree4) is a more accurate measure of departures from a strict tree and provides a value much closer to the residual in standard statistics. Note that scaling distances by some constant has no effect on the delta-score, but it does affect the Q-residual scores.
For this reason, we rescale all of the distances before computing Q-residual scores so that the average of the distances between the taxa is 1.
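As a minimal sketch of the quartet calculation just described, the Python below computes the delta score and the Q-residual for a single quartet, and rescales a distance matrix so that the mean pairwise distance is 1. It is not the SplitsTree4 implementation; the function names and both small distance matrices are hypothetical examples chosen so the arithmetic is easy to follow by hand.

```python
from itertools import combinations

def quartet_sums(d, i, j, k, l):
    """Return the three pairwise path-length sums as (m1, m2, m3), largest first."""
    sums = [d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]]
    return sorted(sums, reverse=True)

def delta_score(d, i, j, k, l):
    """(m1 - m2) / (m1 - m3), or zero when the denominator is zero."""
    m1, m2, m3 = quartet_sums(d, i, j, k, l)
    denom = m1 - m3
    return 0.0 if denom == 0 else (m1 - m2) / denom

def q_residual(d, i, j, k, l):
    """(m1 - m2) squared; sensitive to the scale of the distances."""
    m1, m2, _ = quartet_sums(d, i, j, k, l)
    return (m1 - m2) ** 2

def rescale(d):
    """Divide all distances by the mean pairwise distance between taxa."""
    pairs = list(combinations(range(len(d)), 2))
    mean = sum(d[a][b] for a, b in pairs) / len(pairs)
    return [[x / mean for x in row] for row in d]

# Hypothetical distances that fit a tree exactly: (0,1) versus (2,3).
tree_d = [[0, 2, 3, 3],
          [2, 0, 3, 3],
          [3, 3, 0, 2],
          [3, 3, 2, 0]]

# Hypothetical distances with conflicting signal (box-like, not tree-like).
box_d = [[0, 2, 3, 2.5],
         [2, 0, 2.5, 3],
         [3, 2.5, 0, 2],
         [2.5, 3, 2, 0]]

print(delta_score(tree_d, 0, 1, 2, 3))          # tree-like quartet
print(delta_score(box_d, 0, 1, 2, 3))           # conflicting quartet
print(q_residual(rescale(box_d), 0, 1, 2, 3))   # Q-residual after rescaling
```

For the tree-like matrix the two largest sums are equal, so both scores are zero; for the box-like matrix the sums are 6, 5 and 4, giving a delta score of one half.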

Figure 2.

A quartet containing the taxa i, j, k and l. The path-length from taxon i to taxon j is the sum of branches a, b and c.

Once the scores are computed for each quartet, an overall estimate of the tree-likeness of the dataset can be obtained by summing the scores for all the quartets and dividing that sum by the total number of quartets (for n taxa there are n(n − 1)(n − 2)(n − 3)/24 quartets). The score for a specific taxon is simply the average over all quartets that contain it. Hence, if there are n taxa, the score for an individual taxon is an average of (n − 1)(n − 2)(n − 3)/6 quartets.
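The averaging step can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the authors' code: the five-taxon distance matrix is hypothetical, with taxon 4 placed close to both taxon 0 and taxon 2 to mimic a hybrid history. The counts of quartets are taken directly from the binomial coefficients, that is, the number of ways of choosing four taxa from n, and three further taxa from the remaining n − 1.

```python
from itertools import combinations
from math import comb

def quartet_delta(d, quartet):
    """Delta score for one quartet of taxon indices."""
    i, j, k, l = quartet
    m1, m2, m3 = sorted(
        [d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]],
        reverse=True,
    )
    return 0.0 if m1 == m3 else (m1 - m2) / (m1 - m3)

def delta_summary(d):
    """Return (overall mean delta, list of per-taxon mean deltas)."""
    n = len(d)
    total = 0.0
    per_taxon_total = [0.0] * n
    for quartet in combinations(range(n), 4):
        score = quartet_delta(d, quartet)
        total += score
        for taxon in quartet:
            per_taxon_total[taxon] += score
    overall = total / comb(n, 4)              # all quartets
    per_taxon = [t / comb(n - 1, 3) for t in per_taxon_total]  # quartets containing the taxon
    return overall, per_taxon

# Hypothetical distances: taxa 0-3 fit the tree (0,1),(2,3); taxon 4 is
# equally close to taxon 0 and taxon 2, so it conflicts with that tree.
d = [[0, 2, 3, 3, 2],
     [2, 0, 3, 3, 3],
     [3, 3, 0, 2, 2],
     [3, 3, 2, 0, 3],
     [2, 3, 2, 3, 0]]

overall, per_taxon = delta_summary(d)
print(overall)
print(per_taxon)
```

In this toy example the "hybrid" taxon receives the highest taxon-specific score, which is the pattern the taxon-specific scores are intended to expose.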

The delta score was introduced by Holland et al. (2002) primarily as a tool for data exploration. As such, there is little indication of how the statistical significance of various delta scores might be determined. We have implemented and tested a number of schemes for assessing the significance of delta score and Q-residual values, including non-parametric and parametric bootstrapping. Unfortunately, and curiously, none have proven to be sufficiently powerful and robust. Until such tests are available, we will continue to use delta scores and Q-residuals as indicators of the extent of tree-likeness.

Let us see how the combination of NeighborNets, delta scores and Q-residual scores might be put into practice in analysing the shape of linguistic evolution. We will start with a simple example, where the history is known to be more complex than a single tree. Sranan is a creole language developed by African slaves in Surinam on the northern coast of South America. The English established Surinam in 1651 as a slave colony but Dutch has been the official language since 1667 (McWhorter 2001). Sranan thus has words derived from both English and Dutch. Figure 3 shows a NeighborNet based on cognate-coded basic vocabulary for 12 Indo-European languages including Sranan, English and Dutch. The data consisting of 2355 cognate sets were derived from Dyen et al. (1992, 1997). Borrowings identified and removed by Dyen and co-workers were included in the analysis (see Bryant et al. 2005). Gene content distances were used in the NeighborNet analysis. This is an appropriate distance transformation for lexical data as it is equivalent to the stochastic Dollo model developed by Nicholls & Gray (2006, 2008) in which cognates can evolve only once but be lost multiple times. As NeighborNet can overfit the data, splits with small weights (less than 0.005) were filtered from the split graph. As might be expected given the hybrid history of Sranan, the split graph shows strong conflicting signal for the positioning of Sranan. One split labelled (a) groups Sranan most closely with English, while another one labelled (b) groups Sranan with Dutch and other closely related Germanic languages. The average delta score for this dataset = 0.23 and the average Q-residual = 0.03. Overall, this suggests that the data is moderately tree-like. This is not surprising given that basic vocabulary is known to be much less likely to be borrowed than a sampling of the total lexicon (Embleton 1986). 
However, Sranan stands out as having the highest taxon-specific scores reflecting its hybrid history (delta score = 0.29, Q-residual = 0.05).

Figure 3.

A split graph showing the results of a NeighborNet analysis of 12 Indo-European languages. The graph shows strong conflicting signal for the positioning of Sranan. The split labelled (a) with the short-dashed line groups Sranan most closely with English, while the other one labelled (b) with the long-dashed line groups Sranan with Dutch and other closely related Germanic languages. Scale bar, 0.01.

What can these methods reveal about the shape of lexical evolution on a much broader scale? It might be expected that factors such as geographical isolation and recent population expansions would promote relatively tree-like evolution, while ancient connections and geographical proximity would lead to more network-like patterns. If that were the case then the lexical evolution in the Polynesian language family should be much more tree-like than that of Indo-European. The far-flung Polynesian islands have only been settled in the last 3000 years (Spriggs 2010), whereas the Indo-European languages started to disperse across continental Europe approximately 8500 years ago, with the major radiation of the language families occurring around 6000 years BP (Gray & Atkinson 2003; Atkinson et al. 2005; Nicholls & Gray 2008). Figures 4 and 5 show the results of NeighborNet analyses of comparable basic vocabulary datasets for Polynesian and Indo-European languages. The Polynesian cognate set data were extracted from our Austronesian Basic Vocabulary Database (Greenhill et al. 2008). The Indo-European data came from Dyen et al. (1997). Known borrowings were included in the analyses. Gene content distances were used in the NeighborNet analysis and splits with small weights (less than 0.005) were filtered from the split graph. The split graphs and the associated delta scores and Q-residual scores reveal that the expectation that Polynesian languages would be more tree-like is completely wrong. For Polynesian, the average delta score was 0.41 and the average Q-residual value was 0.02. The respective figures for Indo-European were 0.22 and 0.002. It would be difficult to ascribe this difference to statistical sampling error.

Figure 4.

A split graph showing the results of NeighborNet analyses of the Polynesian lexical data. The network has three main regions: Fijian dialects plus Rotuman, western Polynesian and Eastern Polynesian. There is substantial conflicting signal within each region consistent with the break-up of a dialect chain. Scale bar, 0.1.

Figure 5.

A split graph showing the results of NeighborNet analyses of the Indo-European lexical data. Scale bar, 0.1.

Why is the evolution of even basic vocabulary in Polynesian so strikingly non-tree-like? There are a number of factors that may have jointly contributed to this pattern. There is increasing evidence that, far from being the consequence of chance voyages, the settlement of the Pacific required relatively complex sailing technology and considerable navigational skill. This is especially the case for the rapid settlement of the eastern and southern margins of Polynesia (Irwin 2008). Thus, the voyaging skills of the Polynesians meant that the substantial ocean distances were not necessarily a barrier to ongoing contact. In fact, both archaeological and linguistic evidence attest to substantial ongoing contact (Walter & Sheppard 1996; Weisler & Kirch 1996; Weisler 1998; Geraghty 2004). The lack of social and ecological resources on small islands may have also contributed to this (Irwin 1998).

On the basis of linguistic evidence, Pawley (1996) has argued that the settlement of Polynesia involved the establishment and break-up of a series of dialect chains. Figure 6 shows how the break-up of dialect chains can produce conflicting character distributions. According to Pawley, an initial Proto Central Pacific dialect chain broke up into a dialect chain consisting of Rotuman and Western and Central Fijian in the west and a Tokelau–Fijian and Polynesian dialect chain further to the east. This latter dialect chain subsequently split into northern and southern clusters with the southern cluster ultimately becoming the Tongic subgroup and the northern cluster giving rise to Proto Nuclear Polynesian. Finally, Proto Nuclear Polynesian split into Proto Eastern Polynesian and a non-monophyletic western group of languages. After this split, Eastern Polynesian split into a Marquesic and a Tahitic subgroup and there was substantial borrowing between parts of western and eastern Polynesia. For example, the western Polynesian language Pukapuka is known to have borrowed extensively from eastern Polynesia (Clark 1980).

Figure 6.

A diagram showing the problem dialect chains cause for the construction of bifurcating trees. The dialects A, B and C are initially all mutually intelligible (note the permeable boundaries between the dialects). Innovations evolve in these dialects (filled circles; filled triangles) and diffuse through the network. However, if a dialect splits off from the network (e.g. the split between C and the other two languages), and this diffusion is only partially complete, then conflicting character histories can result. The filled circle characters support topology 1, whereas the filled triangle characters support topology 2. So, under the Dialect Chain/Network-Breaking model, areas where dialect chains were present should be poorly resolved in a phylogenetic analysis, and are better represented by a network diagram rather than a tree.

The sequential break-up of Proto-Central Pacific dialect chains described by Pawley is consistent with the network-like evolution seen in figure 4. One region of the network separates off Fijian dialects and Rotuman. The lower right side of the network shows considerable conflicting signal within the western Polynesian languages including the Tongic subgroup. The upper left side of the figure shows strong support for the Eastern Polynesian subgroup, within which there is again substantial conflicting signal. The network also shows some conflicting signal between eastern and western Polynesia, with Pukapuka placed in an intermediate position. Within the eastern Polynesian part of the network, the Marquesic and the Tahitic groups do not form clean clusters. The hybrid history of Hawaiian is the likely cause of this local conflicting signal. Archaeological evidence suggests that Hawaii was initially settled from the Marquesas around AD 800–900, but its language and culture were subsequently influenced by contact with Tahiti (Spriggs 2010). The taxon-specific delta and Q-residual scores support the idea that the main source of conflicting signal in the Polynesian data has been the process of dialect chain formation and break-up. Dialect chain break-up should smear that conflicting signal across the whole dialect chain, e.g. within Eastern Polynesia. In contrast, if just a few taxa are involved in some relatively discrete borrowing, then those taxa should be picked out by the taxon-specific delta and Q-residual scores. This is not the case (table 1).

Table 1.

The taxon-specific delta and Q-residual scores for the Polynesian lexical data, ranked from the lowest Q-residual score to the highest.

Why is the evolution of Indo-European basic vocabulary relatively tree-like? One possibility is that the socio-linguistic situation in Europe was markedly different. Instead of the far-flung islands linked by kin connections in the Pacific, the relatively high population densities and thus intense competition in continental Europe and Asia may have meant that small linguistic differences became markers of cultural group identity and hence barriers to lexical diffusion. Alternatively, it might be the case that dialect chain formation and break-up are actually the dominant mode of lexical evolution around the globe. Holden & Gray (2006) argue that this has been the case for Bantu languages and Garrett (2006) advances a similar argument for Indo-European. The other obvious difference between Polynesian and Indo-European is time depth. According to the recent phylogenetic estimates (Gray & Atkinson 2003; Nicholls & Gray 2008), the initial divergence of Indo-European languages dates back approximately 8500 years, whereas Polynesian languages date back only 3000 years (Gray et al. 2009; Spriggs 2010). One possibility, discussed by Garrett (2006), is that over time networks get pruned by language extinction to appear more tree-like. If this were true, then older language families around the globe should be more tree-like. This is a possibility that deserves broader comparative testing.

4. The fabric of cultural evolution

It is often claimed that language must function as an inter-related system with strong dependencies between components: ‘un système où tout se tient’ (attributed variously to Antoine Meillet and Ferdinand de Saussure; see Peeters 1990). If these dependencies are very strong, then different aspects of language should all have similar histories and thus be similar in the extent to which their evolution is tree-like. To test this, we compared the evolution of basic vocabulary with that of typological linguistic features (Greenhill et al. 2010). We selected 20 Austronesian and 20 Indo-European languages for which both good lexical and good typological information was available. The Austronesian lexical data were sourced from the Austronesian Basic Vocabulary Database (Greenhill et al. 2008), and the Indo-European lexical data from Dyen et al. (1997). Typological information about these languages (e.g. information about word order, number of consonants, syllable structures, conjunctions, possessives, tenses, etc.) was obtained from the World Atlas of Language Structures (Haspelmath et al. 2005). The networks built from these datasets using the NeighborNet algorithm in SplitsTree v. 4.10 are shown in figure 7. The networks clearly show that the typological evolution is far less tree-like than that of the basic vocabulary. This difference is also reflected in the delta scores and Q-residuals (figure 7), where the delta scores for the structural information are much larger (twice as large in the Indo-European case), and the Q-residuals are at least an order of magnitude larger. This supports the view that typological features diffuse relatively easily between neighbouring languages (Matras et al. 2006), while basic vocabulary is less prone to diffusion.
For example, although over 50 per cent of the total English lexicon has come from Romance languages since the Norman conquest, this figure falls to around 6 per cent for basic vocabulary, such as the Swadesh 200 word list (Embleton 1986). So far from language being ‘un système où tout se tient’, different aspects of language can have quite different histories, some of which are relatively tree-like and others that are not.
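The caption to figure 7 states that these analyses used Hamming distances. As a rough sketch of what such a distance looks like for coded linguistic features, the Python below computes the proportion of features on which two languages differ. The three feature vectors are invented for illustration and are not taken from any database, and the choice to skip features that are missing (coded here as None) in either language is an assumption of this sketch, not something the source specifies.

```python
def hamming_distance(x, y):
    """Proportion of jointly observed features on which x and y differ.

    Features coded as None are treated as missing and skipped
    (an assumption of this sketch).
    """
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        raise ValueError("no features observed in both languages")
    return sum(a != b for a, b in shared) / len(shared)

# Hypothetical feature codings: word order, a yes/no feature,
# consonant inventory size class, and a second yes/no feature.
lang_a = ["SVO", "yes", "small", None]
lang_b = ["SVO", "no", "small", "yes"]
lang_c = ["SOV", "no", "large", "yes"]

d_ab = hamming_distance(lang_a, lang_b)
d_bc = hamming_distance(lang_b, lang_c)
d_ac = hamming_distance(lang_a, lang_c)
print(d_ab, d_bc, d_ac)
```

A full matrix of such pairwise distances is the kind of input a distance-based method such as NeighborNet operates on.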

Figure 7.

Split graphs showing the results of NeighborNet analyses of the lexical and typological data. The analyses used Hamming distances and splits were filtered to a threshold of 0.001. For Austronesian basic vocabulary, the average delta score was 0.33 and the average Q-residual = 0.0020. The average delta score for Austronesian typological data was 0.44 and the average Q-residual = 0.05. The respective figures for Indo-European were 0.21 and 0.001 (basic vocabulary) and 0.40 and 0.04 (typology). Known subgroups within each language family are colour-coded. Scale bar, 0.01.

It could be argued that linguistic evolution is a rather special case of cultural evolution. Despite the typological results discussed above, it could be claimed that the transmission mechanisms and social role of language mean that its evolution is likely to be much more coherent and tree-like than other aspects of culture. First, children mainly learn language from their parents, and this enforced vertical transmission tends to maintain intergenerational consistency (Labov 2007). Second, language change is strongly constrained by the need to communicate with others. So, while languages do change rapidly, they cannot change completely overnight. In contrast, many aspects of culture do not share these intergenerational and communicative stabilizing constraints. As Gould (1987) argued, all it takes is 5 min with a bobbin or a bow and arrow for cultural transmission to occur. So, there can be cultural, but not linguistic, revolutions. While we think that these arguments are plausible, we maintain that the extent to which linguistic evolution is unique is an issue that is best addressed empirically, rather than through armchair speculation.

Phylogenetic research on material culture is not common but includes studies of weaving motifs in Turkmen carpets (Collard & Tehrani 2005), basketry traditions in northern California (Jordan & Shennan 2003) and Palaeoindian projectile points (Darwent & O'Brien 2006). However, these studies rarely include an independent estimate of the population history with which to compare the material culture history. A recent study of the cultural evolution of canoe design in the Pacific (Rogers & Ehrlich 2008; Rogers et al. 2009) affords us the opportunity to assess the extent to which the evolution of this aspect of material culture mirrors the settlement history. Rogers et al. (2009) analysed 134 canoe design traits. Of these traits, 96 were classified as ‘functional’ and 38 as ‘symbolic’. Functional traits were those aspects of canoe design that affected canoe sailing performance and hence the prospect of surviving long Oceanic voyages. Symbolic traits were, ‘esthetic, social, and spiritual decorations that presumably have no differential effect on survival from group to group’ (Rogers & Ehrlich 2008, p. 3417). They claimed that population histories could be inferred from the canoe design data and that functional aspects of canoe design provided a stronger reflection of population history. Boldly, they suggest that this history may have included Maori sailing the 7000 km from Hawaii to Aotearoa/New Zealand.

To assess these claims, we calculated site-specific likelihoods for each canoe trait. We estimated the relative fit of functional and symbolic traits on a language tree for the 11 societies analysed by Rogers et al. (2009). The tree was constructed from lexical data in the Austronesian Basic Vocabulary Database (Greenhill et al. 2008). Following Gray et al. (2009), cognate sets were binary-coded. Obvious borrowings were eliminated from the analysis. A single substitution rate model of cognate gains and losses, with gamma-distributed rate heterogeneity and a strict clock, was implemented in the phylogenetic program BEAST v. 1.5.4 (Drummond & Rambaut 2007). To ensure that the language trees matched the population history as closely as possible, and to minimize the impact of undetected borrowing, we constrained the topologies in accordance with independent phonological and morphological evidence (Pawley 1966, 1996). From the posterior probability sample, we constructed a maximum clade credibility tree (figure 8), and then mapped the canoe data onto this tree using Mesquite v. 2.72 (Maddison & Maddison 2010). We calculated the site-specific likelihoods of each character under a 1-rate parameter Markov model. If the claims of Rogers et al. are correct, then it would be expected that both datasets should fit the language tree in figure 8 well, with the functional data fitting the best. Neither prediction is supported by our analyses. Both datasets fit poorly (close to a random distribution), and if anything the functional traits fit the worst (figure 9).
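To illustrate what a site-specific likelihood under a one-rate Markov model involves, the sketch below computes the log-likelihood of a single binary trait on a fixed tree using Felsenstein's pruning algorithm. It is not the Mesquite implementation and makes several labelled assumptions: the four-taxon tree, its branch lengths and the trait codings are hypothetical, the root states are given equal prior weight, and the rate is a fixed, arbitrary value rather than one estimated from the data.

```python
import math

def transition_prob(same, rate, t):
    """Symmetric two-state Markov model: probability of ending in the
    same (or the other) state after a branch of length t."""
    decay = math.exp(-2.0 * rate * t)
    return 0.5 + 0.5 * decay if same else 0.5 - 0.5 * decay

def partial_likelihoods(node, states, rate):
    """Likelihood of the data below `node`, conditional on its state.

    A tip is a taxon name (str); an internal node is a tuple of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return [1.0 if states[node] == s else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child, length in node:
        below = partial_likelihoods(child, states, rate)
        for s in (0, 1):
            result[s] *= sum(
                transition_prob(s == c, rate, length) * below[c] for c in (0, 1)
            )
    return result

def site_log_likelihood(tree, states, rate):
    """Log-likelihood of one character, with equal root state frequencies."""
    root = partial_likelihoods(tree, states, rate)
    return math.log(0.5 * root[0] + 0.5 * root[1])

# Hypothetical tree ((A,B),(C,D)) with all branch lengths set to 1.
tree = (
    ((("A", 1.0), ("B", 1.0)), 1.0),
    ((("C", 1.0), ("D", 1.0)), 1.0),
)

# One trait that follows the tree and one that cuts across it.
congruent = {"A": 1, "B": 1, "C": 0, "D": 0}
conflicting = {"A": 1, "B": 0, "C": 1, "D": 0}

rate = 0.1  # assumed value, for illustration only
ll_congruent = site_log_likelihood(tree, congruent, rate)
ll_conflicting = site_log_likelihood(tree, conflicting, rate)
print(ll_congruent, ll_conflicting)
```

The trait whose distribution matches the tree needs only one change and so has a log-likelihood closer to zero than the trait that cuts across it, which is the sense in which likelihood scores near zero indicate a better fit to the tree.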

Figure 8.

Maximum clade credibility language tree for the 11 societies analysed by Rogers et al. The tree is constructed from basic vocabulary data with the analyses constrained on the basis of phonological and morphological innovations. To match languages to cultures, we assumed that Societies = Tahitian, Australs = Rurutuan, Cooks = Rarotongan.

Figure 9.

Histograms showing the distribution of log-likelihood scores for (a) basic vocabulary, (b) functional aspects of canoe design, (c) symbolic aspects of canoe design and (d) randomization of the canoe data on the language tree. Scores close to zero indicate a good fit. The basic vocabulary data fit the tree the best (mean = −2.89, median = −2.89, s.d. = 2.31). Both the functional and symbolic aspects of canoe design are close to the random distribution (functional: mean = −6.64, median = −7.36, s.d. = 1.28; symbolic: mean = −6.13, median = −6.34, s.d. = 1.37; random: mean = −6.30, median = −6.92, s.d. = 1.45).

Why might this be the case? The trajectory of technological evolution does not need to be tightly tied to population history, especially for functional traits (Dunnell 1978). The global distribution of mobile phones across all kinds of cultural boundaries shows just how quickly useful technology can spread. This is likely to have been the case with functional aspects of canoe design. The large double-hulled drua canoes constructed in Fiji in the late eighteenth century derived their design and handling methods from Tonga and Uvea, while their fore-and-aft rig was Micronesian in origin (D'Arcy 2006). NeighborNet analyses reveal that the evolution of functional aspects of canoe design is indeed strikingly non-tree-like (figure 10). Not only is it clear that Pacific peoples borrowed good aspects of canoe design, they also borrowed, traded and exchanged both canoes (Rolett 2002) and canoe builders (D'Arcy 2006). For example, the drua canoes built in the Lau Group of Fiji were constructed by the Lemaki. The Lemaki were a Tongan and Samoan clan of specialist canoe builders renowned for their extremely watertight method of joining wooden planks without numerous holes and lashings (D'Arcy 2006). While Polynesians readily borrowed functional aspects of canoe design, the symbolic aspects of canoe design might be more closely tied to cultural identity and history. The prows of Maori waka were typically carved in a regional style (Hiroa 1949). This would explain why the symbolic traits fit the language trees slightly better than the functional traits.

Figure 10.

Split graphs showing the results of NeighborNet analyses of (a) the functional and (b) the symbolic aspects of canoe design. For functional traits, the average delta score was 0.46 and the average Q-residual was 0.03. For symbolic traits, the average delta score was 0.37 and the average Q-residual was 0.05. Scale bar, 0.01.
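For readers unfamiliar with the delta score reported in figure 10, the following Python sketch shows how an average delta score can be computed from a pairwise distance matrix, following the quartet-based definition of Holland et al. (2002). It is an illustrative reimplementation rather than the code used in this paper. The dictionary-of-dictionaries distance format, the function names, and both toy distance matrices are assumptions made for the example.

```python
from itertools import combinations

def symmetric(pairs):
    """Build a symmetric dict-of-dicts distance matrix from {(a, b): distance}."""
    d = {}
    for (a, b), dist in pairs.items():
        d.setdefault(a, {})[b] = dist
        d.setdefault(b, {})[a] = dist
    return d

def quartet_delta(d, x, y, u, v):
    """Delta for one quartet: (m1 - m2) / (m1 - m3), where m1 >= m2 >= m3
    are the three possible sums of paired distances."""
    m1, m2, m3 = sorted(
        [d[x][y] + d[u][v], d[x][u] + d[y][v], d[x][v] + d[y][u]],
        reverse=True,
    )
    if m1 == m3:
        return 0.0  # all three sums equal: treated as zero by convention
    return (m1 - m2) / (m1 - m3)

def average_delta(d):
    """Mean quartet delta over all quartets of taxa in the matrix."""
    scores = [quartet_delta(d, *q) for q in combinations(sorted(d), 4)]
    return sum(scores) / len(scores)

# Hypothetical distances that fit the tree ((A,B),(C,D)) exactly.
tree_like = symmetric({
    ("A", "B"): 2, ("C", "D"): 2,
    ("A", "C"): 3, ("A", "D"): 3, ("B", "C"): 3, ("B", "D"): 3,
})
# Hypothetical distances with maximal conflict between two groupings.
box_like = symmetric({
    ("A", "B"): 2, ("C", "D"): 2, ("A", "C"): 2, ("B", "D"): 2,
    ("A", "D"): 3, ("B", "C"): 3,
})

print(average_delta(tree_like))
print(average_delta(box_like))
```

Distances that fit a tree exactly make the two largest sums equal, giving a delta of zero, while values approaching one indicate increasingly strong conflicting signal.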

The canoe data reveal that, at least when it comes to highly functional aspects of material culture, the fabric of cultural evolution is rather different from the evolution of genes in vertebrate species. Different aspects of culture can have quite different evolutionary histories. One challenge for future research is to characterize the processes that promote the tight coupling of cultural lineages and those that lead the different threads to follow separate paths.

5. Conclusion

In this paper we have argued that we need to move beyond dichotomous disputes about the validity of cultural phylogenetics. Instead, we have suggested that the debate is better conceptualized as involving positions along continuous dimensions. The challenge for empirical research is to determine how tree-like and how tightly coupled the evolution of particular aspects of culture is. Both critics and proponents of cultural phylogenetics need to become ‘evidence-based’ in their claims about cultural evolution. Using new network methods derived from evolutionary biology, we have outlined how such investigations can reveal some surprising results: the far-flung Polynesian islands in the Pacific are a hotbed of horizontal lexical and cultural evolution. Properly characterizing the shape and fabric of human cultural history will no doubt require further methodological innovations. For example, it would be very useful to be able to test for significant differences in the degree of tree-likeness. However, the most fundamental requirement for further progress is the collection of more high-quality comparative cultural data. The days when all a study of cultural evolution required was a quick trawl through the Ethnographic Atlas (Murdock 1967) are rapidly drawing to an end. It is time for anthropologists to roll their sleeves up and get serious about gathering comparative data again. We can only echo the sentiments expressed by Shennan (2008, p. 3176) when he noted, ‘the creation of comparable sets of data across time and space has not been the tradition in either anthropology or archaeology, especially in these postmodern times… If cultural evolutionary studies are to progress, this situation needs to change’.


We thank Roger Green for his advice and enthusiastic support of phylogenetic studies of cultural evolution. He is sadly missed. We would like to thank Deborah Rogers for providing the canoe data, Barbara Holland for the original delta-score code and James Steele and Fiona Jordan for their useful comments on the manuscript.


