Abstract
How do we quantify patterns (such as responses to local selection) sampled across multiple populations within a single species? Key to this question is the extent to which populations within species represent statistically independent data points in our analysis. Comparative analyses across species and higher taxa have long recognized the need to control for the nonindependence of species data that arises through patterns of shared common ancestry among them (phylogenetic nonindependence), as have quantitative genetic studies of individuals linked by a pedigree. Analyses across populations lacking pedigree information fall in the middle, and not only have to deal with shared common ancestry, but also the impact of exchange of migrants between populations (gene flow). As a result, phenotypes measured in one population are influenced by processes acting on others, and may not be a good guide to either the strength or direction of local selection. Although many studies examine patterns across populations within species, few consider such nonindependence. Here, we discuss the sources of nonindependence in comparative analysis, and show why the phylogenybased approaches widely used in crossspecies analyses are unlikely to be useful in analyses across populations within species. We outline the approaches (intraspecific contrasts, generalized least squares, generalized linear mixed models and autoregression) that have been used in this context, and explain their specific assumptions. We highlight the power of ‘mixed models’ in many contexts where problems of nonindependence arise, and show that these allow incorporation of both shared common ancestry and gene flow. We suggest what can be done when ideal solutions are inaccessible, highlight the need for incorporation of a wider range of population models in intraspecific comparative methods and call for simulation studies of the error rates associated with alternative approaches.
1. Introduction
Comparative analysis across taxa ranging from individuals through populations to species has been central to the study of evolution. Key to such analyses is the incorporation of the nonindependence (autocorrelation) of taxon values resulting from genetic relationships between them. The need to deal with multiple causes of nonindependence in biological data has long been recognized [1], and in evolutionary genetics, methodologies are well developed for analyses at both extremes of the taxonomic scale—among individuals linked by a pedigree, and among species (or higher taxa) linked by a phylogeny. Far less consensus exists for analyses addressing patterns across populations within species, many of which assume populations to be statistically independent entities. Here we review sources of nonindependence in comparative analysis in general, and highlight the specific challenges associated with examining patterns across populations within species. Given the scope of this special issue, the scenarios we discuss are drawn from the field of community genetics; however, the issues raised apply more broadly to use of population data in studies of local adaptation [2].
Community genetics examines the impact of genetic diversity in one species on associated community structure and ecosystem processes [3–5]. The community genetics paradigm has developed primarily (but not exclusively, e.g. [6]) from studies on plants and their associated animal communities, encompassing issues ranging in scale from patterns among individuals within populations to those among populations within species [5,7–10]). One common question concerns the extent to which individual or population genetic diversity in keystone species predicts associated community species richness and abundance [11–13]. Alternatively, one might wish to know how key phenotypic traits of one trophic level (such as levels of defensive plant compounds, or budburst phenology) influence associated communities [14,15]. While some aspects of a community extended phenotype (sensu [7])—such as sessile herbivores—can be sampled for individual host plants, other community components are only meaningful at the population level (for example, measures of betweenplant synchrony in budburst phenology), or can only be sampled using methods that sum over plant stands or populations. Examples of the latter include estimates of insect diversity sampled using malaise traps or moth traps among discrete stands of a single tree species. How, then, should patterns across populations be analysed?
Many community genetic studies use statistical approaches developed in work on local adaptation and quantitative genetics [2,16–18], such as analysis of phenotypic patterns (for example, in herbivore communities) across sets of full or halfsib plants grown in a common garden [12,15,19–21]. Pursuing the insect–plant example, an alternative (often included alongside the first approach) is to examine patterns in communities sampled from natural populations whose pedigree relationships are unknown [11,22]. Multiple populations are sampled (in situ, or under common garden conditions) to provide degrees of freedom for hypothesis testing.
The key question in this second approach is how we regard the information from each population. Many studies treat populations as statistically independent replicates (if using population means) or incorporate population as a fixed effect in analyses using individual/clone data, with each population contributing equal weight [2,10–13,18,22–25]. However, two processes operating at the population level (figures 1 and 2)—exchange of migrants (gene flow) and shared common ancestry (phylogenetic nonindependence)—mean that sampled population phenotypes (and hence associated communityextended phenotypes) are often not statistically independent. Instead, the phenotypes measured in any one population are related to, and influenced by, those in others [26,27]. Revealing the true relationship between variables across a set of related taxa requires that such nonindependence is accounted for statistically.
Generally, variance in any population phenotypic trait that we measure comprises components owing to (i) populationspecific effects (including local selection and genetic drift), (ii) patterns of shared common ancestry among populations, (iii) gene flow between populations, and (iv) sampling error. The question is how to estimate (i) while controlling for (ii), (iii) and (iv). This requires more than adding population as a factor in our analysis because population history and gene flow influence expected patterns of covariance among traits. Put simply, we expect populations that are more closely related or exchange higher numbers of migrants to have more similar trait values before we even consider any populationspecific effects [16,26,28]. Analyses that fail to incorporate such links between populations face two problems: first, where effects of these processes are severe, measured phenotypes can be a poor guide to the processes at work in each population, increasing type 1 (false positive) and type 2 (false negative) error rates. Second, treating sets of linked populations as if they are independent overestimates the number of degrees of freedom available for hypothesis testing, inflating false positive error rates. To estimate the magnitude of populationspecific processes (such as responses to local selective forces), and so examine correlations between these and any other variable, we must transform observed phenotypes to control for contributions associated with shared common ancestry, gene flow and (where appropriate) sampling error. The magnitude of these effects must be expected to vary widely across studies and taxa, reflecting variation in the age of populations, rates of migration between them and the strength of local selection. It will not always be safe to assume that these impacts are negligible.
The issues involved are best understood at two ends of a sampling spectrum. At the individual level, wellestablished quantitative genetic methods control for shared common ancestry by incorporating pedigree information for the individuals involved [29–33]. At the other extreme, analyses across species and higher taxa assume no gene flow but routinely control for phylogenetic nonindependence using a range of comparative methods incorporating phylogenetic or taxonomic level information [28,33,34]. Caught in the middle, analyses across populations within species vary widely in the extent to which they consider phylogenetic nonindependence and gene flow, and most ignore both. Withinspecies analyses are perhaps more challenging than either pedigreebased analysis of individuals or phylogenybased analysis across higher taxa, primarily because (as we will explain below), population histories can be hard to estimate, and we need to consider gene flow—a problem that can usually be ignored in analyses of higher taxa.
In the following sections, we first survey methods that have been developed to incorporate phylogenetic nonindependence in crossspecies analyses (§2), and then (§3) discuss the extent to which these can be extended to crosspopulation analyses. We summarize a method developed by one of us [26] to incorporate effects of gene flow without population history, and (with thanks to Jarrod Hadfield) outline how both effects can be incorporated in the mixed model framework developed in quantitative genetics [33].
2. Coping with phylogenetic nonindependence in crossspecies comparative analysis
Comparative analysis across species and higher taxa requires explicit consideration of evolutionary history, because for any group of phylogenetically linked taxa, phenotypic similarity may reflect convergent evolutionary responses to a similar underlying cause, or limited divergence from a shared common ancestor, or both [28,34,35]. Four general approaches (discussed in detail below) have been developed to address phylogenetic nonindependence: analysis of phylogenetically independent contrasts (PICs, [34]), generalized least squares (GLS, [35–37]), phylogenetic autoregression [38,39] and phylogenetic mixed models [33,40,41]. The approaches differ in attempting either to incorporate phylogenetic and other effects simultaneously (PIC, GLS, phylogenetic mixed models), or to remove phylogenetic effects and examine patterns in residual trait variation (phylogenetic autoregression). Though the four methods have to some extent evolved in isolation, Lynch [40] and Hadfield & Nakagawa [33] have emphasized the similarity between phylogenetic mixed models (of which GLS can be seen as a restricted case) and the ‘animal model’ developed in quantitative genetics, with phylogeny playing the role of pedigree (§2c,d). Hadfield & Nakagawa [33] have recently shown that existing theoretical and computational algorithmics in quantitative genetics can be usefully brought to bear on comparative questions using existing software.
(a) Analysis of phylogenetically independent contrasts
This approach is based on the ‘radiation principle’—that evolutionary correlations between traits are, in principle, free to evolve anew each time daughter taxa diversify from a shared common ancestor (an internal node in a phylogeny) [34,35,42]. The impact of common ancestry is removed by considering only the variation across the daughter lineages at each internal node in a phylogeny (figure 2), summarized for each trait of interest as a form of weighted mean called a linear contrast ([34,43]; for worked examples, see also [28,44]). To reveal the evolutionary relationship between traits, independent contrasts in one trait are regressed against the corresponding contrasts in another trait for the same nodes. Initially developed for analyses structured using bifurcating phylogenies [34], the linear contrasts approach has been extended to incorporate less resolved phylogenies in a GLS framework [28,33,35,45]. While the number of independent contrasts for n species in a fully bifurcating phylogeny is (n − 1), in GLS approaches using incompletely resolved phylogenies, it is the number of internal nodes in the phylogeny.
Estimation of PICs and their statistical analysis both require assumptions about how evolution happens, (i) to allow the estimation of trait values for shared common ancestors, so that contrasts can be calculated for nodes deeper in the phylogeny, and (ii) to scale contrasts made at different nodes in the phylogeny such that the data meet the equal variance assumption of parametric statistical tests. The first [34] and most widely applied model in analytical software (e.g. Phylip [46], Caic [45], Ape [47]) is the Brownian, which assumes continuous characters to evolve by a random walk, with rates of evolutionary change per unit branch length constant in all branches of the phylogeny, and unaffected by the value of a trait or by other species. Though this assumption might appear restrictive, Brownian motion models can incorporate a wide range of evolutionary scenarios [34,48,49]. Other approaches scale by branch length but do so in a different way (phylogenetic regression [35]), or use weighted regression (GLS) to scale the contributions of individual contrasts [36,37]. Despite its convenience for parametric hypothesis testing, the Brownian motion model of character evolution is not supported for some datasets, and is inappropriate for some evolutionary scenarios [50–52]. Alternative plausible models exist that predict rates of trait change exceeding (e.g. resourcepartitioning models) or falling below (e.g. nichefilling models) Brownian predictions [51,52]. Though particularly appropriate to adaptive radiations of species, these alternatives may also apply to trait evolution within species—for example, where populations face a geographical mosaic of contrasting selective pressures [16,53–55]. The extent to which datasets conform to the predictions of a Brownian motion model can be tested [51], and a major strength of generalized linear mixed model approaches discussed below is that they can be applied to nonnormally distributed (more generally, nonGaussian) response variables [33].
PIC methods will only control for the effects of shared common ancestry if the phylogeny used (the ‘working phylogeny’; [35]) actually reflects the true relationships among taxa. This will depend on whether the relationships between sampled taxa are actually treelike, and, if they are, on whether the phylogeny used has the correct topology (branching pattern; so that the right contrasts are made), and relative branch lengths (so that, in methods using this information, contrasts are scaled appropriately). Computer simulations show that if the working phylogeny is inaccurate, the performance of PIC methods deteriorates significantly [48,56]. This is the major argument against applying this approach to withinspecies analyses, because populationsplitting events can be very hard to reconstruct with any confidence (§3). Although some early PIC analyses used existing taxonomy as a proxy for phylogenetic relationships (e.g. [57–59]), working phylogenies are now usually estimated using sequence data, often from a small number of loci. Reconstructing the working phylogeny from a small sample of gene trees makes the assumption that these accurately represent the species tree. When the periods between the branching events in the true species tree are long in terms of numbers of generations (relative to the effective population size; §3a)—usually true in comparative analyses across species—this assumption is reasonable [60]. Methods have been developed that allow uncertainty in the topology of the working phylogeny to be incorporated into comparative analyses, either by repeating the analysis on a set of trees generated by bootstrapping [61] or by using Bayesian methods [62]. Assuming that the working phylogeny is accurate, calculation of appropriate contrasts will then depend on the fit between real evolutionary processes and the assumptions of the evolutionary model [37,44,51,52,63,64]. Again assuming that the working phylogeny is accurate, it is possible to quantify the extent to which trait values are phylogenetically correlated, and so assess whether phylogenetic nonindependence needs to be controlled for [37,64–67]. However, this inference is still only as good as the underlying phylogeny.
(b) Phylogenetic autoregression
Originally developed for analysis of spatial patterns [68,69], autoregression approaches have been widely used in comparative analyses to partition variance in trait values into components attributable to (i) the phylogenetic or spatial relationships among taxa, and (ii) specific, independent evolution in each taxon [38,39,48]. The central tenet of autoregressionbased approaches is that only patterns in the specific phenotypic components are free of phylogenetic nonindependence (box 1). Though now rarely used in crossspecies analyses, spatial and genetic autocorrelation approaches have been used more recently in analyses of withinspecies character divergence [71] and community genetics [13,23,73].
Cheverud's phylogenetic autocorrelation method.
The specific, independent trait value for each species is estimated by subtracting from its total trait value a phylogenetic component owing to shared common ancestry, which is a weighted mean of the values for the same trait in all of the other species sampled. The weight of the contribution of one species i to trait values in another j, w_{ij}, is a function of their spatial or genetic similarity, summarized in a relationship matrix W of pairwise values. The pairwise values are modelled as w_{ij} = 1/d_{ij}^{α}, where d_{ij} is the pairwise distance between species i and j, and the exponent α allows scaling of the relative impact of one species on another's phenotype with phylogenetic distance. Where alpha is high, more distant populations have little impact on trait values [39,70]. Pairwise distances can be derived from a userdefined hierarchy (e.g. of spatial groupings or taxonomy) or from empirical, geographical or phylogenetic distances [13,23,70,71]. Specific trait values for each of n species are estimated by fitting an autoregression model of the form: where (with vectors in bold, lower case, and matrices in bold upper case): x is an n × 1 vector of standardized trait values for n species; ρ is the phylogenetic autocorrelation coefficient; W is an n × n matrix of pairwise phylogenetic similarities between sampled taxa; and e is an n × 1 vector of residuals.
ρWx is taken to represent the phylogenetic component of trait values, while the residuals represent the component that is free of phylogenetic effects, and which may be used to test evolutionary hypotheses. Fitting of the autoregression model requires estimation of ρ and α by maximum likelihood (for example, using the software Compare [72]). Evolutionary correlations between traits can be estimated by fitting separate autoregression models for each trait, and testing the correlation between the specific components of each trait using standard statistical approaches (see [44] for a worked example). See main text for potential drawbacks of this approach.
An attractive aspect of autocorrelation approaches is that the need to control for phylogenetic effects, and the extent to which this is achieved, can be assessed graphically using phylogenetic correlograms [39,48,70,71]. Correlograms show the relationship between the strength of any autocorrelation (usually expressed in terms of Moran's I coefficient) and the pairwise distance between taxa. Moran's I ranges from −1 (maximal negative autocorrelation) to 1 (maximal positive autocorrelation), with an expected value of zero where no association exists. Where there is strong phylogenetic autocorrelation in the original total phenotypic data, Moran's I values at low pairwise distances will be significantly positive, while values at larger distances tend to 0 or negative values. Such a pattern is taken to support fitting of an autoregression model [39]. If phylogenetic and specific effects are effectively partitioned in the model, a correlogram for the specific effects should show no trend in Moran's I with increasing genetic distance.
Autoregression approaches share with PIC methods the assumption that the way in which any two taxa evolve is independent of the relationship between them (i.e. the covariance between specific and phylogenetic components is assumed to be zero). This will not be the case where closely related taxa are exposed to similar habitats and adapt independently to them (parallel evolution). Under such circumstances, resulting similarity in trait values among taxa will be attributed to the phylogenetic component and, so, discarded. Harvey & Pagel [28, p. 134] observed that autoregression approaches ‘assign both parallel evolution and variance due to the interaction of the phylogenetic and specific components solely to the phylogenetic effect’, and for this reason, autoregression approaches have been strongly criticized and are generally avoided in comparative analysis [64,74].
An important assumption of autocorrelation methods is that the relationship matrix W (box 1) provides an appropriate estimate of pairwise taxon relationships; if not, estimates of Wx will be incorrect. Autoregression approaches are, however, less sensitive than PIC methods to errors in identifying the relationships between taxa [44,48,56]. This is important because although contrast/GLS methods can perform better than autocorrelation approaches when the working phylogeny is estimated accurately [48], the performance of contrastbased methods falls substantially when the working phylogeny is significantly wrong [44,48,56]. This is a potential advantage of autocorrelation approaches when the precise relationships linking taxa being compared are either hard to estimate or not treelike at all [39,70] (see below).
(c) Generalized least squares
What we may call the Cheverud approach [38] to dealing with autocorrelation is not the only way. In the Cheverud approach, the factor causing the autocorrelation is supposedly removed, leaving a set of uncorrelated residuals for analysis. GLS is another method. In familiar regressions and ANOVAs, for example, statistical analyses are based on the premise that our observations are independent observations from the same probability distribution—typically the normal. More technically, we suppose that the variance–covariance matrix of our data is diagonal, with all the covariances between the data equal to zero, as they are independent. But as explained above, taxon data may not be independent. Technically, there may be nonzero covariances. If we have a model to generate these covariances, we may construct a suitable variance–covariance matrix and use it to transform our data. GLS then effectively applies standard regression to these transformed data. This approach has been applied to analyses across species and higher taxa by Pagel [36,37] and Grafen [35], using phylogenetic information and a Brownian motion model of evolution to construct the variance–covariance matrix. Freckleton & Jetz [75] use GLS to study a crossspecies model in which both phylogeny and degree of range overlap together determine the variance–covariance structure. GLS has been criticized in a quantitative genetics context because it does not take into account uncertainty in fixed effects [33], and has been largely replaced by more sophisticated generalized linear mixed models, of which GLS can be seen as a restricted case.
(d) Generalized linear mixed models and the phylogenetic mixed model
Mixed models were first developed in quantitative genetics, and applied extensively to analysis of data for individuals linked through a pedigree—an analysis now known as the animal model [76]. This approach was first applied to phylogenetic comparative analysis by Lynch [40], and has been extended within a more inclusive generalized linear mixed model framework by Hadfield & Nakagawa [33]. The essence of what is now called the phylogenetic mixed model is the partitioning of variation in the data into various components. Suppose we denote the average value of a character of interest in taxon i as z_{i}. We model our data as follows: where m is the expected phenotype at the root of the phylogeny, a is the phylogenetic contribution and e is a residual. Such models are referred to as mixed models because, statistically, they mix both fixed effects (m) and random effects (a, e). They are widely used in many sciences where measurements are made on clusters of related (i.e. nonindependent) statistical units (another application is in analyses involving multiple observations on the same individual). For the purposes of intuition, we may suppose that m was the state of the character at the origin of the radiation of the phylogeny we are studying. Our data, the z_{i}, will be nonindependent and this will be revealed in the phylogenetic component, a, which will result in more closely related species having more similar values of z. As in other comparative approaches, the a_{i} are assumed to evolve along the phylogeny via Brownian motion, resulting in dependencies in the phylogenetic effects of species that have shared a common ancestor. The rate at which the phylogenetic effects evolve is proportional to the standard deviation of the phylogenetic effects and is estimated from the data. As the standard deviation of the phylogenetic effects increases relative to the standard deviation of the residuals, the dependency between the phenotypes of taxa increases. When the standard deviation of the residuals, is zero, this method then becomes equivalent to PICs. Substitute ‘pedigree’ for ‘phylogeny’ and ‘breeding value’ for ‘phylogenetic effect’ in the above and you have the quantitative genetics animal model [76]. This simplified view captures the essentials and makes clear the essential similarity between phylogenetic and quantitative genetic models first noted by Lynch [40] and recently emphasized by Hadfield & Nakagawa [33]. Phylogenetic or pedigreebased mixed models can both be fitted using existing software, including ASReml [77], BUGS [78] and the MCMCglmm R library [33,79], allowing estimation of the separate contributions of the components of the model to our data. Phylogenetic, taxonomic or pedigree information are incorporated in a covariance matrix, representing the amount of evolutionary history two taxa have shared. As with other comparative approaches, phylogenetic mixed models have usually assumed trait values to be normally distributed (but see [80]). A major strength of the mixed model approach, noted by Hadfield & Nakagawa [33], is the use of link functions to allow analysis of nonnormally distributed (more generally, nonGaussian) data, and hence application to nonBrownian motion models of evolution (§2a).
3. Analysis of patterns across populations within species
Populations within species are to some extent equivalent to species within higher taxonomic levels, in that their mean trait values include a component inherited from ancestral populations, and components owing to local population effects of selection and genetic drift. However, because populations within species can exchange migrants, their phenotypes may also contain additional components resulting from the combination of history, drift, selection and gene flow acting on neighbouring populations [26,81]. As with phylogenetic nonindependence, gene flow means that phenotypes sampled in a given population will often not be a good guide to the direction and strength of selection acting on that population; instead, migration smoothes phenotypic variation among sets of connected populations [26] (figure 1). Further, migration means that (as with phylogenetic nonindependence) trait values for linked populations cannot be considered statistically independent. Our problem is that the trait values from linked populations are ‘intertwined’, making them totally unsuitable in their current form for any kind of principled inference. For both consequences of migration just mentioned, the more migration there is, the worse it gets.
Analysis of patterns across populations thus requires incorporation of both phylogenetic and gene flow effects. Population trait values will otherwise only be statistically independent where either (i) selection is so rapid and strong that sampled phenotypes are unaffected by either ancestry or gene flow, or (ii) all populations are equally descended from the same common ancestor (i.e. they are branch tips in a starshaped phylogeny) and there is no migration between them [48,70,82] (figure 1). Analyses that do not incorporate these potential causes of nonindependence (and many do not) assume the above. Studies incorporating these issues in analyses across populations have found significant genetic and spatial nonindependence [16,27,70,71]. In the following sections, we consider first phylogenetic nonindependence (§3a) and then gene flow (§3b), before considering specific approaches to dealing with these issues (§3c–f). In contrast to the wellestablished methodologies available for crossspecies analysis, it is early days for withinspecies analysis, and from promising beginnings much remains to be done.
(a) Incorporating phylogenetic nonindependence in withinspecies analyses
It is tempting to use analysis of PICs for withinspecies comparisons, based on a withinspecies phylogeny constructed from sequence data for an appropriately polymorphic marker (such as the mitochondrial cytochrome oxidase c and cytochrome b genes often used in DNA barcoding in animals). This temptation should be resisted [26]. Analysis of PICs is only appropriate where the relationship between populations is both treelike (often it will not be! See below.) and can be inferred with reasonable accuracy. If there is extensive gene flow between populations, the genetic relationship between them will not be treelike, but reticulate, and no appropriate working phylogeny will exist [26,70]. We might be tempted to use the same data to generate a relationship matrix among populations for use in a phylogenetic mixed model, matrix correlation analysis (for a crossspecies example, see [67]) or autocorrelation analysis. Here again the key issue is whether such markers accurately capture the expected covariance in trait values among populations. Mitochondrial sequence data, though easy to generate and often useful in resolving population relationships, in arthropods and nematodes have coalescent histories that can be strongly influenced by selective sweeps imposed by maternally inherited symbionts, such as Wolbachia [83,84]. As we explain below, sequence data will often overestimate the age of splits between populations, leading to an inappropriate expectation of covariance in the data.
The relationships between populations inferred from molecular data are sensitive to two processes: the way in which genetic variation present in a parental population separates out between its daughter populations (sorting of ancestral polymorphism), and the independent accumulation of new mutations in each daughter population. As a general rule, the genetic diversity in very recently separated populations will be dominated by sorting of ancestral polymorphism [60,85]. Since ancestral polymorphism by definition evolves before the separation of daughter taxa, the variation it shows provides no information on their relationships. Instead, the ancestral sequences we sample are determined by the dynamics of the coalescent process at each locus (for diagrammatic illustrations of conflicts between gene trees and species trees, see [60,85,86]). As a result, relationships between populations based entirely on ancestral polymorphism are very likely to differ widely among loci, and to differ substantially in both topology and branch lengths from the true underlying population tree. The use of gene trees in comparative analyses within species risks applying a false working phylogeny if using PICs, and applying a false covariance/weighting matrix in any other method. Bad news all ’round.
As the timescale over which populations diverge increases, so patterns owing to sorting of ancestral polymorphism are augmented by the accumulation of lineagespecific mutation, increasing the concordance between gene trees and the underlying species tree [60,87,88]. As long as some lineagespecific signal exists (given also the randomness of mutation and the finiteness of the regions of the genome examined), the underlying population tree can be estimated in principle using coalescentbased approaches that combine information from many loci [89–96]. As genomics data accumulate for model systems in local adaptation and community genetics [97], this approach will become increasingly accessible. However, the effort required to get a useful answer will be beyond most comparative analyses. Confident inference of the topology and branch lengths linking a recent series of branching events in species with large effective population sizes can require data for many loci even for a small number of populations [96,98–100]. Accurate resolution of relationships among many local (and so recently diverged) populations using genetic data may be impossible.
Within species, phylogenybased methods are only likely to be appropriate where populations are resolvable as independently evolving lineages, i.e. (i) population splits are wellenough separated in time to allow adequate resolution of the population tree, and (ii) migration between populations is minimal. These conditions will only apply when the units we are calling populations are separate enough to effectively constitute incipient species. For multicellular organisms, this is demonstrably the case for populations in longseparated glacial refugia or other habitat islands [101–105]. For very rapidly evolving organisms with low dispersal ability occupying very specific habitat patches, the same may apply over much smaller spatial and temporal scales. In most intraspecific analyses, however, phylogenybased methods will be inappropriate.
(b) Analysis of patterns across populations linked by migration
As described above, migration influences the component of variance in a trait that we can ascribe to populationspecific effects, and hence the statistical independence of population data. Under specific models of population structure, migration can also result in spurious crosspopulation patterns in phenotypes, suggesting an underlying cause such as clinal selection where none beyond migration and drift exists [26]. This means that even though phylogenetic relationships between populations can be estimated in the face of some gene flow [95,106], applying phylogenybased methods (such as PICs) is inappropriate if gene flow significantly influences population phenotypes. We might ask what evidence there is for significant gene flow between a set of phylogenetically linked populations, before deciding whether to incorporate its effects in an analysis. The topology of the population tree and migration rates between populations can be estimated simultaneously using multilocus coalescent approaches [91,107,108], recently extended to multiple populations ([95]; http://genfaculty.rutgers.edu/hey/software). However, many loci are required to estimate splitting times and migration rates with enough accuracy to structure comparative analyses.
One way forward, explored by Felsenstein [26] and discussed in §3c and appendix A, is to consider an explicit population genetic model in which population phenotypes are only influenced by gene flow and local population effects, ignoring population history. An alternative is to use pairwise measures of genetic similarity between populations to sum the contributions of shared common ancestry and gene flow, generating a population relationship matrix for use in phylogenetic autocorrelation or (as the variance–covariance matrix) in a generalized linear mixed model. All of these approaches will be undermined if the population relationship or gene flow matrix is an inaccurate representation of the influences of populations on each others' phenotypes. As explained above, generating appropriate data to construct useful relationship matrices remains a challenge in any system that is not the specific focus of population genetic study. All methods may also fail to discriminate between effects of gene flow and selection where the predicted effects of these processes are confounded, as might be the case where populations facing similar selective pressures are also nearest neighbours experiencing the highest rates of gene flow [26,70,71,109]. The potential for such confounding can be examined post hoc by using matrix correlation analysis to assess the independence of spatial and genetic patterns in phenotypic data across populations (for examples of this approach, see [70,71]). Where population relationships are known in advance, this problem may be avoidable by sampling an appropriate set of populations for which predicted patterns based on relatedness and selection are not the same [26,110].
(c) An intraspecific contrasts method
Felsenstein [26] developed an intraspecific comparative method that generates amongpopulation contrasts controlling for the effects of gene flow between populations. We discuss this approach at some length in appendix A as it makes crystal clear how migration entangles population phenotypes, rendering naive crosspopulation statistical analyses invalid, and how (under specific conditions) this problem may be fixed.
The intraspecific contrasts approach assumes a network of populations at mutationdrift equilibrium, meaning that in each the measured phenotype represents the local selective optimum distorted only by immigration and genetic drift. The latter effects are incorporated using an explicit underlying model of population structure and character evolution (appendix A). In the example discussed by Felsenstein [26], populations have a constant population size, with equal numbers of migrant individuals exchanged between any pair of populations (requiring smaller populations to contribute a larger proportion of their number as migrants). Each population is selected towards a constant local optimum phenotype and trait heritability is assumed to be constant. The method requires the existence of genetic data that allow the generation of a matrix M containing the pairwise migration rates (m_{ij} between populations i and j) scaled by effective population size (i.e. the product n_{i}m_{ij}). This is possible for multilocus data in a coalescent framework using Migrate [111–113] for a range of population structures assuming constant population size, or for sequence data using IMa2 [95]. Given these assumptions, this intraspecific contrasts approach allows explicit transformation of population phenotypes to values that are independent of migration and reflect the outcome of selection acting on that population (i.e. under the assumptions of the model used, they are locally optimal phenotypes). These corrected phenotypes can be used to reveal correlations between population phenotypes and populationspecific explanatory variables. Transformed phenotypic values can be used to fit a model of the form p_{i} = const. + bz_{i}, where p_{i} is the transformed optimal phenotype in population i and z_{i} is a potentially explanatory variable, such as temperature, recorded for that population. Numerous hypotheses can be explored now that the disentangled, transformed data are in place and amenable to statistical analysis. For example, we could test the hypothesis that b = 0, and hence that populationtransformed phenotypes are independent of temperature.
As with any comparative method, applying this method requires (i) an appreciation of the extent to which violation of the underlying assumptions compromise the conclusions (and it is very early days as far as addressing this question is concerned) and (ii) generation of an appropriate pairwise migration matrix, M. Departures from the assumption that population phenotypes are in selection/migration/drift equilibrium can influence the strength of selection inferred in this model. For example, if population means differed historically and are now gradually converging owing to gene flow (as we might expect if there is a strong phylogeographic component to population relationships), then assumption of migration/drift/selection equilibrium will overestimate the role of current selection in maintaining the observed phenotypic difference. Similarly, if population phenotypes were historically similar but are now diverging, then the impact of current selection will be underestimated.
Recent work shows that incorporation of information on withintaxon trait variation (rather than simply using a taxon mean) can have important consequences for the scaling of PICs in PIC analyses [43,114,115]. Withinpopulation variation could be incorporated in a similar way as an extension of the withinspecies contrast method. Other extensions of this method are being developed to take into account sampling variation when there are finite samples from populations, rather than exact knowledge of the population means. The method can also be extended to multiple characters, with additive genetic covariance of characters inferred. In this context, quantitative genetic experiments can connect directly to the method. One particularly troublesome effect for all methods will be direct environmental effects on phenotypes (environmental plasticity), which can mimic genetic differentiation. However, even in that case, one can in principle separate the effects of natural selection from direct effects on the phenotype.
(d) Spatial and genetic autoregression approaches to withinspecies analysis
Early autoregression studies recognized the approach to be as applicable at the population level as it is to analyses across species [38,39]. Autocorrelationbased analyses of withinspecies data thus have the advantage of readily available software (e.g. Compare [72]). As with crossspecies analyses (and with the same caveats over the limitations of autoregression as before), specific components can be generated for traits individually, and then the relationship between traits examined. Intraspecific analyses have used relationship matrices based on mitochondrial sequence data [70] or a combination of mitochondrial sequence and nuclear allele frequencies [70], and used phylogenetic autocorrelograms to show first that there is a signature of genetic autocorrelation in the data considered, and that fitting of an autoregression model removes this signature in the specific (independent) components of trait values. An alternative is to incorporate a genetic similarity matrix in partial Mantel tests [116] of relationships with other variables (e.g. [21], using multilocus AFLP data for cottonwood poplars). One response to the challenge of obtaining pairwise genetic distances is to control instead for spatial autocorrelation in the data [13,23,73,110]. This will control for genetic nonindependence if nearby populations are generally more closely related. However, this model would not be appropriate if populations arise by the sorting of existing intraspecific genetic diversity among microhabitats, for example. If this alternative is the case, neighbouring populations in different habitats may be genetically very divergent.
(e) Generalized leastsquares approaches to withinspecies analysis
Hansen et al. [16] used a GLS model to examine local adaptation in the flower morphology of two Dalechampia species competing for the services of overlapping sets of pollinators. The approach they use is also an early example of a special case of the intraspecific contrasts method [26]. Their hypothesis is that where these species occur together, competition will lead to adaptive divergence in floral morphology. It thus matters whether shifts in phenotype in locations where the two species are sympatric are independently evolved, or derived from widespread dispersal of particular phenotypes in each species (so representing a single divergence event rather than multiple independent events). Their model assumes covariance in trait values between two populations to decay exponentially with their spatial separation, and under this assumption repeated character displacement in the two Dalechampia species was supported. The algebraic machinery employed by Felsenstein [26] (appendix A) also provides the covariance matrix needed for GLS and generalized linear mixed model approaches (below).
(f) Generalized linear mixed models
Notwithstanding the difficulty of accurately estimating both rates of gene flow and branching patterns among populations, the generalized linear mixed models recently discussed in detail by Hadfield & Nakagawa [33] can incorporate both migration and population history. Using the terminology already introduced above and in box 1 and §2d, where m is the expected phenotype at the root of the phylogeny, a is the phylogenetic contribution, b is the contribution owing to gene flow and e is a residual. This form of model can be fitted using available software, in ASReml [77] and the MCMCglmm R library [78]. The R script for fitting this combined history and gene flow model in MCMCglmm is provided in appendix A. As discussed in §2a,d, a major strength of the mixed model approach is its ability to incorporate nonnormally distributed data [33]. However, the reticulate nature of betweenpopulation migration may limit development of efficient computational methods, as has been done for pedigree and phylogenetic matrices (see discussion by Hadfield & Nakagawa [33]).
4. Conclusions, and the future of withinspecies comparative analysis
The need to explore the consequences of nonindependence in comparative data, and to control for it if we find it, is clear. The challenges of coping with such nonindependence underline the value of using established pedigreebased approaches wherever possible. But what should those working on nonpedigreed populations do? Population genomic data provide the best hope of estimating the parameters required to control for nonindependence statistically. Given that generation of population genomic data is in its infancy in almost all systems, an alternative is to use multilocus datasets for whatever neutral genetic markers are available. Use as many informative loci as possible and select those in which likely impacts of recombination and selection are minimized. For reasons described above, under some plausible scenarios, genetic data will be a more reliable guide to population relationships than spatial surrogates. If population trees and migration rates can be estimated with reasonable accuracy, incorporate both into the analysis using generalized linear mixed models to minimize impacts of nonindependence.
More work is needed on the impacts of alternative population models (and hence variance–covariance structures) in intraspecific comparative analysis. Examples include the metapopulation models that have received considerable attention in a coalescent framework [117,118], and models incorporating the alternating periods of reproductive isolation and betweenpopulation dispersal experienced by many taxa as a result of global climate cycles [119]. Where methods vary in their performance, we should choose the method whose assumptions best match the properties of our data [51,64,120]. Simulation studies have proved very helpful in revealing relative type 1 and type 2 error rates of alternative interspecific comparative methods under different scenarios (levels of phylogenetic information, evolutionary models and levels of withinspecies sampling) [43,45,48,56,63,65,114,115,121,122]. One outcome of such studies is the realization that even where alternative approaches have similar error rates, the estimates they return for the correlations between variables can be rather different [122]. The field of withinspecies comparative analysis would benefit enormously from the development of a similar body of critical simulation analysis.
Acknowledgements
G.N.S. is supported by NERC grants NE/H000038/1 and NE/E014453/1 and J.F. was supported by NIH grant GM071639, by NIH grant DEB0814322 (PI: Mary Kuhner) and, for the period 2009–2010, by Genome Sciences Department ‘life support’ funds. We thank Jarrod Hadfield, David Shuker, Richard Preziosi and Jenny Rowntree for their detailed and helpful feedback, Joe Bailey, Gina Wimp, Tom Whitham, James Nicholls and Konrad Lohse for their useful comments and critical reading of earlier versions and James Cook and Karsten Schönrogge for discussions leading to the writing of this paper.
Appendix A. An intraspecific contrasts method
We elaborate on the withinspecies contrast method [26] for several reasons. First, it illustrates how betweenpopulation genetic exchange entangles the features in which we are interested and, so, emphasizes the nature of the problem of withinspecies nonindependence. Second, the matrix algebra used to disentangle, statistically, the populations will be unfamiliar to many readers, but is widely applied and useful in many contexts. Finally, issues that arise in the context of this method will arise for all withinspecies comparative methods.
The basic recursion for the evolution of a phenotype, x, over time, t, in a network of n populations is with (as in box 1), bold face denoting vectors (lower case) and matrices (upper case). x contains the observed phenotype in each population; p contains the optimum phenotype p_{i} for each population; σ is a scalar describing the incremental per generation movement of a population phenotype, x_{i}, towards its local optimum p_{i}. It is the result of both the heritability and the strength of selection. M contains the pairwise migration rates between populations; e is a vector containing terms describing the impact of genetic drift on each population.
Each single population evolves as: where m_{ij} is the fraction of newborns in population i that have arrived from population j, and e_{i} is a random variable approximating the effect of genetic drift, drawn from a normal distribution with a mean of 0 and a variance proportional to the inverse of the population size.
The migration matrix M is entangling the evolution of all the data x; the essence of the method is to exploit M and matrix algebra to construct transformations of the data such that the transformed data are disentangled. To do this, we use a result (eigen decomposition, or spectral decomposition) from matrix algebra that is remarkably useful in all sciences, mathematics and statistics and allows us to use the migration matrix M, to obtain uncorrelated variables. The result is
M = C^{−}^{1}ΛC, where C and Λ are matrices generated from M, such that a diagonal matrix of the eigenvalues of M. C is a matrix of the eigenvectors of M.
We use the matrix C to generate a vector of transformed phenotype values, y, such that
Let us return to our original model in a strippeddown form to let us see clearly the crucial elements: the ‘stripping down’ involves removing the superscripts involving time, t, and looking at the expectation of x, E(x), although we will leave out the ‘E's’ to reduce clutter in this explanatory exercise. This somewhat unusual approach is analogous to the use of pseudocode when presenting computer algorithms where the emphasis is on exposition, not implementation. Here, matrix algebra is behaving like normal algebra.
Extracting one row of this, and using a as a symbol substitution to simplify visuals:
The crucial thing to note is that by using the migration matrix to provide us with a transformation of our original data, we now have transformed data that are not entangled, but scaled by the single value λ_{i}. Obviously, we have swept a lot under the rug to reveal this important core. For example, ‘a’ is a function of the migration matrix and the optimal phenotypes, p.
This is a convenient place to briefly discuss the meaning of the word ‘contrast’ in comparative analysis. ‘Contrasts’ are simply data that have been transformed in such a way that the effects of entangling factors have been removed. The transformation here differs from that used in PICs, but the end result is the same: transformed, disentangled data. In the first case, the entanglement is owing to shared phylogeny, here it is a result of migration.
This model is fitted to some exemplar data in Felsenstein [26], and appropriate software will be made available from http://evolution.gs.washington.edu/programs.html. The explicit underlying model of population structure and character evolution assumed in the application of this approach by Felsenstein [26] is described in the main text. This model can also be fitted in the quantitative genetics package ASReml [77] and using Bayesian approaches, in BUGS [78] and MCMCglmm [33]. With thanks to Jarrod Hadfield, the R scripts for fitting several alternative models in MCMCglmm are as follows. In each, M is the betweenpopulation covariance matrix owing to gene flow, equivalent to M above and Msvd is the single value decomposition of M. A generalized linear mixed model (glmm, §3f) incorporating gene flow is fitted using:
Msvd<svd(M)
Msvd<Msvd$v%*%(t(Msvd$u)*sqrt(Msvd$d))
testdata$Msvd<Msvd
M1<MCMCglmm(x∼1, random=∼idv(Msvd) data=testdata)
To fit a glmm model (M2) that also includes a phylogenetic term, we use:
M2<MCMCglmm(x∼1, random=∼idv(Msvd)+animal,
pedigree=phylogeny, data=testdata),
where animal is a column in test data that indexes taxa, and phylogeny is a phylogeny stored as a phylo object from the package APE [47].
Felsenstein's model (§3c) is a special case of a simultaneous autoregressive model [123,124], which has been extended to deal with random effects efficiently [125]. With the assumption that the population optimum phenotypes are independent and identically distributed, the Felsenstein model can be fitted using
M3<MCMCglmm(x∼1+sir(∼M, ∼units), data=testdata)
To fit a phylogenetic component also, we have
M4<MCMCglmm(x∼1+sir(∼M, ∼units), random=∼animal,
pedigree=phylogeny, data=testdata).
As discussed in Hadfield & Nakagawa [33], Markov Chain Monte Carlo and Gibbs sampling approaches incorporated in BUGS [78] and the MCMCglmm R library [33,79] allow Bayesian approaches to incorporate data for nonGaussian (including nonnormal) trait distributions. See main text, §2a for examples of relevant scenarios involving nonGaussian trait distributions. The models given above can also be fitted (and perhaps more flexibly) in the R package spdep [126].
Footnotes

One contribution of 13 to a Theme Issue ‘Community genetics: at the crossroads of ecology and evolutionary genetics’.
 This journal is © 2011 The Royal Society