Arabidopsis and relatives as models for the study of genetic and genomic incompatibilities

Kirsten Bomblies, Detlef Weigel


The past few years have seen considerable advances in speciation research, but whether drift or adaptation is more likely to lead to genetic incompatibilities remains unknown. Some of the answers will probably come not only from studying incompatibilities between well-established species, but also from investigating incipient speciation events, to learn more about speciation as an evolutionary process. The genus Arabidopsis, which includes the widely used Arabidopsis thaliana, provides a useful set of model species for studying many aspects of population divergence. The genus contains both self-incompatible and self-compatible species, providing a platform for studying the impact of mating system changes on genetic differentiation. Another important path to plant speciation is via formation of polyploids, and this can be investigated in the young allotetraploid species A. suecica. Finally, there are many cases of intraspecific incompatibilities in A. thaliana, and recent progress has been made in discovering the genes underlying both F1 and F2 breakdown. In the near future, all these studies will be greatly empowered by complete genome sequences not only for all members of this relatively small genus, but also for many different individuals within each species.

1. Introduction

It is widely accepted that a common path to speciation involves pre-mating barriers to gene flow, including geographic separation, occupation of distinct ecological or phenological (seasonal) niches or differences in mating system. Such barriers allow evolution to proceed along separate trajectories in different populations, which may ultimately lead to reproductive isolation, caused either by genetic drift or as a by-product of divergent selection for adaptive traits (Coyne & Orr 2004). Despite great advances in our understanding of the genetics of speciation over the last decade, much remains to be learned about how evolutionary forces such as drift or adaptation can lead to genetic incompatibilities. Furthermore, while speciation research has often focused on finding genes responsible for incompatibilities between well-established species, it is as important to investigate systems at different stages of divergence, to understand not only the outcome of speciation events, but also speciation as an evolutionary process.

Members of the genus Arabidopsis, which includes the workhorse of plant molecular genetics, Arabidopsis thaliana, provide a useful set of model plants for studying several aspects of speciation processes. Somewhat unfortunately for questions of interspecific barriers, A. thaliana is well separated from the rest of the genus, which comprises A. croatica, A. halleri, A. lyrata and two very closely related species pairs, A. cebennensis and A. pedemontana, as well as A. arenosa and A. neglecta. Finally, an allotetraploid (amphidiploid) species, A. suecica, which can be readily resynthesized in the laboratory, has formed by the hybridization of A. arenosa and A. thaliana (Hylander 1957; O'Kane et al. 1996; Chen et al. 1998). Ignoring the rarer species, the major lineages in the genus are A. thaliana, A. arenosa, A. halleri and A. lyrata (Al-Shehbaz & O'Kane 2002; Clauss & Koch 2006; Koch & Matschinger 2007). The haploid chromosome number is five for A. thaliana, eight for A. arenosa, A. halleri and A. lyrata and thirteen for A. suecica. It is unknown for the remaining four, less well-studied species, but based on the phylogenetic relationships, probably also eight. For several species, both diploid and tetraploid forms are known (Al-Shehbaz & O'Kane 2002). Importantly, viable offspring can be generated from almost any cross within the genus, although the chromosomal differences between A. thaliana and the other species impair fertility of diploid F1 hybrids.

Arabidopsis thaliana is a predominantly selfing species estimated to have made the transition from obligatory outcrossing to preferential self-fertilization about a million years ago (Bechsgaard et al. 2006; Tang et al. 2007). One can therefore think of A. thaliana as a collection of independently evolving natural lineages with only modest gene flow among them, making it a useful model for understanding the early consequences of genetic divergence on hybrid fitness. Studying incompatibilities within species faces, however, the inherent difficulty that the cases identified will not necessarily result in the formation of independent species. As a matter of fact, since speciation events are rare, only a minuscule fraction of incompatibilities found within species is likely to lead to irreversible barriers to gene flow between populations. However, we argue that within-species studies can provide convenient models for understanding the genetic basis of incompatibilities. Moreover, in those instances where such incompatibilities are genetically and mechanistically similar to those observed in other taxa or to ones that separate bona fide species, or where they show repeatable patterns, intraspecific variation opens the door to understanding large-scale patterns in plant evolution and the generation of stable gene flow barriers. Here, we focus on areas in which recent progress has been made in understanding several different types of potential barriers to gene flow in the Arabidopsis genus, and end by pointing out opportunities for future investigations in what remains one of the most fascinating topics in biology.

2. Interspecific incompatibility

(a) Dissecting the polyploid gene flow barrier

Polyploidy, the doubling of the entire genome, is common in plants and may provide an important route to speciation (Coyne & Orr 2004). It is often discussed as providing a mechanism for ‘immediate’ speciation because of meiotic pairing problems that strongly reduce the fertility of interploid hybrids between the nascent polyploid and its diploid progenitor (Coyne & Orr 2004). Apart from chromosomal segregation, mis-regulation of imprinted genes might affect the viability of interploid hybrids. Such imprinting differences, which are thought to arise from parental conflict, can lead to aberrant development of the endosperm and high levels of seed lethality (Haig & Westoby 1991).

In A. thaliana, naturally tetraploid strains are known and interploidy crosses within A. thaliana, as well as with A. lyrata and A. arenosa, have been very informative in bolstering our understanding of the molecular mechanisms of ploidy barriers, and how they can be overcome to allow gene flow across ploidy levels. A. thaliana triploids retain some fertility and give rise to aneuploid swarms, resolving eventually into stable diploids and tetraploids. In one such swarm, a single locus was shown to be selected in tetraploid and aneuploid derivatives, but not in diploids, suggesting that it contributes to aneuploid survivorship or fertility (Henry et al. 2007, 2009). Seed viability in A. thaliana interploidy crosses also varies in different genetic backgrounds. One of the genes associated with such variation has been found to encode TRANSPARENT TESTA GLABRA2 (TTG2), a maternally expressed WRKY family transcription factor controlling seed size (Garcia et al. 2005; Dilkes et al. 2008). That one locus can have a major effect on the viability of seeds with intermediate ploidy has important implications for the evolution of gene flow barriers associated with polyploidy.

Important opportunities for studying the relationship between changes in ploidy level and hybrid speciation, another important route for the generation of new taxa in plants (Rieseberg & Willis 2007), exist for a natural allopolyploid species, A. suecica, which apparently formed only once, and very recently, by hybridization of A. arenosa and A. thaliana (Säll et al. 2003; Jakobsson et al. 2006). A. suecica-like plants can be recreated by crosses between A. arenosa and A. thaliana in the laboratory. However, these crosses commonly lead to aborted seeds that are similar to those observed in interploidy crosses within A. thaliana, with endosperm hyperproliferation and delayed development. The hybrid seeds suffer from defects in epigenetic gene silencing that cause activation of paternal transposable elements, as well as inappropriate expression of the imprinted MEDEA (MEA) gene, its target PHERES1 (PHE1) and several AGL genes that, like PHE1, encode MADS domain proteins. Performing crosses of A. arenosa with A. thaliana plants that have single mutations in several of these genes significantly improves hybrid seed viability (Josefsson et al. 2006; Walia et al. 2009). These findings not only implicate imprinting and epigenetic regulation in interspecific hybrid failure in Arabidopsis, but also demonstrate that this can be caused by a comparatively small number of loci. As in A. thaliana, only the maternal copy of MEA is expressed in the endosperm of A. lyrata. Consistent with the parental conflict hypothesis, there is high allelic diversity at the MEA promoter in A. lyrata, and the distinct haplotypes are suggestive of balancing selection (Kawabe et al. 2007).

Together, the studies discussed above draw intriguing parallels between interspecies and intraspecies hybrid failures owing to mis-regulation of genes that are normally expressed preferentially from either the maternal or paternal allele, an indication of parental conflict. These parallels support the possibility that independent evolution of imprinting mechanisms in diverging populations may be a path to reproductive incompatibility.

(b) Species specificity of pollen recognition

An important aspect of successful reproduction in plants is the recognition of conspecific pollen and rejection of foreign pollen. Such species-specific discrimination can occur at different steps in the complex fertilization process. The first level of recognition specificity takes place on the stigma—the surface of the carpel where pollen initially lands (Swanson et al. 2004). In A. thaliana, pollen recognition is primarily mediated by the outer cell wall of the pollen grain, the exine (Zinkl & Preuss 2000; Swanson et al. 2004). The initial binding of the pollen grains to the waxy surfaces of the stigma shows a graded response: conspecific pollen grains adhere more effectively to A. thaliana stigmas than do those of other species, with the extent of interaction commensurate with phylogenetic distance (Zinkl et al. 1999).

Pollen binding is followed by hydration and germination, which is controlled by an extracellular matrix called the pollen coat (Zinkl & Preuss 2000). Primary components of the A. thaliana pollen coat proteome include oleosins and lipases (Mayfield et al. 2001). The genes encoding these proteins are located in two unlinked clusters, one comprising six oleosin genes, another six putative lipase genes (Mayfield et al. 2001). The oleosins contain repetitive motifs that probably play a role in recognition specificity. Similar to components of other types of self/non-self recognition systems, the pollen coat genes are characterized by higher-than-average levels of polymorphism, as well as repetitiveness and organization in clusters. Both the internal repeats and the arrangement in tandem arrays allow unequal meiotic crossovers, which might support the generation of unique ‘recognition cassettes’, whose diversification in turn may play a role in speciation (Mayfield et al. 2001). Indeed, the pollen-specific oleosins are one of the most rapidly evolving protein families in Arabidopsis (Fiebig et al. 2004; Schein et al. 2004).

A next level of recognition occurs when the pollen tubes extend into the style tissue of the carpel, as they navigate towards the ovules, where fertilization occurs (Swanson et al. 2004). Path finding requires close interaction between the paternal and maternal partners, and there is apparent species specificity in pollen tube guidance as well. Similar to pollen recognition by the stigma, A. thaliana ovules attract conspecific pollen tubes better than do ovules from other species, and the attraction quickly decreases with increasing phylogenetic distance (Palanivelu & Preuss 2006).

Another example of a possible mechanism conferring species specificity during pollen tube growth involves the FERONIA (FER) receptor-like kinase, which is required to stop pollen tube growth upon its arrival at the ovule (Escobar-Restrepo et al. 2007). Arabidopsis thaliana fer mutants suffer from pollen tube overgrowth that is nearly identical to the phenotype observed when pollen from other species is placed on A. thaliana wild-type plants. The fer-like phenotype in interspecific crosses correlates with sequence divergence in the FER extracellular domain, suggesting that this protein is involved in species-specific interactions during fertilization (Escobar-Restrepo et al. 2007). The FER locus is also strongly differentiated among populations of A. lyrata, a hallmark of local adaptation (Gos & Wright 2008), raising the possibility that FER could cause pollination events within populations to be more successful than ones that involve partners from different populations. We note that while none of these barriers completely block fertilization by close relatives on their own, the accumulation of many such biases probably results in strong species fidelity in fertilization.

(c) The evolution of selfing

The transition to self-fertilization, though it can come with a risk of inbreeding depression, may be favoured, for example, when populations are small and fertilization by conspecifics therefore difficult or when pollinators are scarce (Levin 2000). Selfing may also play a role in speciation by reducing gene flow with relatives (Rieseberg & Willis 2007), or by altering the selective constraints on the regulation of imprinting in diverging populations (Brandvain & Haig 2005).

An investigation of the evolutionary transition from self-incompatibility (SI) to self-compatibility (SC) in Capsella, a genus closely related to Arabidopsis, has revealed that the entire C. rubella species appears to be attributable to propagation from a single self-fertile individual that originated in a population of the SI species, C. grandiflora (Foxe et al. 2009; Guo et al. 2009). In contrast, in A. thaliana, the transition to SC is estimated to have occurred between 400 000 and at least 1 million years ago (Bechsgaard et al. 2006; Tang et al. 2007). At present it cannot be ruled out that SC played a causal role in the speciation of A. thaliana, but estimates of the antiquity of SC in A. thaliana are considerably lower than estimates of the timing of the divergence of A. thaliana from its closest extant relatives, at about 5 Myr (Koch et al. 2001). This suggests that SC considerably post-dates the species-level divergence of A. thaliana from the rest of the extant species in the genus. Nevertheless, the capacity to self-fertilize probably contributed to strengthening the gene flow barriers between A. thaliana and its relatives, as well as between diverging lineages within A. thaliana itself. Given the important role that the acquisition of SC may play in divergence and speciation in some cases, and in population history and structure, there is considerable interest in elucidating the genetic basis of the origin of SC from SI in the A. thaliana lineage, as well as in other related species.

In outcrossing Arabidopsis species, SI is mediated by the highly polymorphic S-locus, which consists of two tightly linked genes, SCR and SRK. SCR encodes a ligand that is expressed in pollen and SRK encodes its receptor in the stigma. Recognition of the SCR ligand by a matching SRK receptor, which triggers a block to fertilization, occurs when both originate from the same haplotype (Kusaba et al. 2001). Recombination within the S-locus is probably strongly selected against in SI species as it generates non-functional, SC alleles (Uyenoyama & Newbigin 2000). The S-locus itself shows high levels of polymorphism, with alleles that appear to be under balancing selection and that persist across speciation events. Loci linked to the S-locus, but not involved in SI, also show extreme levels of diversity and low recombination (Kamau & Charlesworth 2005; Hagenblad et al. 2006). The recombination-suppressed region around the S-locus is, however, not very large, and elevated nucleotide diversity suggestive of reduced recombination is observed only for the closest flanking loci (Kamau et al. 2007). Though extensive amino acid diversity is observed in SRK, clustered primarily in three hypervariable regions, recent evidence shows that variants at only a relatively small subset of these positions mediate recognition specificity (Boggs et al. 2009a). The functional role of variability elsewhere in the S-locus remains unknown, but presumably is related to the lack of recombination within the locus (Kamau & Charlesworth 2005; Hagenblad et al. 2006).
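The recognition rule described above, and the reason recombination within the S-locus yields SC alleles, can be illustrated with a deliberately simplified sketch. It assumes that every haplotype carried by a diploid parent is expressed (dominance relationships among haplotypes are ignored), and the haplotype labels and the function name `pollination_blocked` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Haplotype(NamedTuple):
    scr: str  # specificity of the pollen ligand (SCR)
    srk: str  # specificity of the stigma receptor (SRK)


def pollination_blocked(pollen_parent, stigma_parent):
    """Return True if any SCR ligand on the pollen matches an SRK receptor in the stigma.

    Simplifying assumption: all haplotypes of each parent are expressed.
    """
    ligands = {h.scr for h in pollen_parent}
    receptors = {h.srk for h in stigma_parent}
    return bool(ligands & receptors)


# Hypothetical intact haplotypes: SCR and SRK specificities match.
S1 = Haplotype("1", "1")
S2 = Haplotype("2", "2")
S3 = Haplotype("3", "3")
# Hypothetical recombinant haplotype: ligand and receptor no longer match.
RECOMBINANT = Haplotype("1", "2")

intact_plant = (S1, S3)
recombinant_plant = (RECOMBINANT, RECOMBINANT)

print(pollination_blocked(intact_plant, intact_plant))            # self-pollen rejected
print(pollination_blocked(recombinant_plant, recombinant_plant))  # self-pollen accepted
```

Under these assumptions, a plant homozygous for the recombinant haplotype accepts its own pollen, because its ligand and receptor specificities no longer coincide.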

In A. thaliana, remnants of multiple, old alleles of the S-locus are present, inconsistent with a selective sweep involving a single S-locus mutation causing the transition to SC in this species (Sherman-Broyles et al. 2007; Tang et al. 2007; Boggs et al. 2009b). Arabidopsis thaliana has sustained additional mutations affecting SC outside the S-locus, which could be deduced from the observation that only some accessions can be converted back to SI by transgenic expression of an S-locus allele from A. lyrata (Nasrallah et al. 2004). At least two mutations at the S-locus itself are implicated in the transition to SC in A. thaliana, and modifiers elsewhere in the genome are also beginning to be mapped (Boggs et al. 2009b). Perhaps providing clues as to how SC may have originated in an A. thaliana ancestor, varying levels of SC have been observed in several A. lyrata populations in North America (Mable et al. 2005; Mable & Adam 2007). A. lyrata thus offers opportunities to re-examine proposals that selfing may be selectively favoured under conditions of population fragmentation or pollinator limitation (Lloyd 1992). Similar forces that promoted the switch to self-fertilization might have reduced the gene flow of A. thaliana with the ancestral, SI species (Grant 1971; Jain 1976).

3. Intraspecific incompatibility

An alternative to investigating fixed differences between species is the study of nascent incompatibilities within species (Via 2009). Here, the availability of thousands of wild A. thaliana strains from throughout the range of the species provides an exceptionally rich source of material with which to identify genes that cause problems both in first-generation hybrids and in later generations.

(a) Duplicate gene resolution and genetic incompatibility

An early major result of genome-sequencing projects was the realization that gene duplication, which often involves wholesale duplication of entire genomes, or of large chromosomal segments, is a very frequent phenomenon in fungi, plants and animals (Skrabanek & Wolfe 1998). A close analysis of the first A. thaliana genome that had been completely sequenced confirmed the presence of extensive segmental duplications, but also revealed that paralogous genes were frequently lost from duplicate segments (Blanc et al. 2000; Vision et al. 2000). Lynch & Force (2000) were the first to propose that reciprocal loss of paralogous genes in different populations could cause gene flow barriers. Importantly, this process, dubbed duplicate gene resolution, would allow incompatibilities to arise without a change in function, in contrast to the requirements of Dobzhansky–Muller type incompatibilities (Coyne & Orr 2004). If a single locus is affected, incompatibility would manifest itself only in about 6 per cent (one-sixteenth) of second-generation hybrids, which would be left with no active copy of the gene in question (Lynch & Force 2000). However, if this occurred at several unlinked loci, there would be fewer and fewer second-generation segregants that still contained the full complement of active genes. Moreover, simultaneous copy number changes at many genes would be likely to affect even the first-generation individuals as well as the majority of second-generation offspring. Duplicate gene resolution was thus proposed as a potentially important mechanism for generating gene flow barriers among diverging populations or species (Lynch & Force 2000; Taylor et al. 2001).
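The arithmetic behind the one-sixteenth figure, and the decline in unaffected segregants as more loci are involved, can be made explicit with a small sketch. It assumes unlinked loci, full lethality only when no functional copy remains, independence between gene pairs and no dosage effects; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def f2_null_fraction():
    """Fraction of F2 plants with no functional copy of a reciprocally resolved gene pair."""
    # One parent is functional only at locus A, the other only at locus B.
    # Each F1 gamete carries one allele per locus; True marks a functional copy.
    gametes = list(product([True, False], repeat=2))  # unlinked loci, equal frequencies
    zygotes = list(product(gametes, repeat=2))
    nulls = sum(1 for egg, pollen in zygotes if not any(egg) and not any(pollen))
    return Fraction(nulls, len(zygotes))


def f2_unaffected_fraction(n_pairs):
    """Fraction of F2 plants keeping at least one functional copy at each of n_pairs pairs."""
    return (1 - f2_null_fraction()) ** n_pairs


print(f2_null_fraction())  # 1/16
for n in (1, 2, 5, 10, 11):
    print(n, float(f2_unaffected_fraction(n)))
```

With these simplifications, the unaffected fraction is (15/16)^n, which falls below one half once eleven independent pairs have been reciprocally resolved.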

Even though one might imagine duplicate gene resolution to appear quite frequently, it is only very recently that the first case with a demonstrated effect on genetic incompatibility has been molecularly characterized (Bikard et al. 2009). It had been noted several times before that certain combinations of genotypes were absent in collections of recombinant inbred lines produced from crosses between different natural strains of A. thaliana (Alonso-Blanco et al. 1998; Loudet et al. 2002; Werner et al. 2005; Törjék et al. 2006; Balasubramanian et al. 2009). This is also the case in populations that derive from a cross between the A. thaliana reference strain Col-0 and a different strain, Cvi (Bikard et al. 2009). Fine-mapping revealed that the causal regions, which are found on two separate chromosomes, are represented in Col-0 by recently duplicated copies of a gene encoding a histidinol-phosphate amino-transferase. One of the copies is, however, transcriptionally inactive. In Cvi, this copy is active, but the other one is deleted. In F2 populations, one out of sixteen plants inherit the two defective copies in the homozygous state and die as embryos. Interestingly, heterozygosity at the locus that is deleted in Cvi together with the homozygous Col-0 combination at the other locus already leads to a weak root phenotype, supporting the assertion made above, that pervasive copy number changes are likely to have negative fitness consequences. Strikingly, multiple loss-of-function mutations are segregating at these two loci in the worldwide A. thaliana population, and most strains have only one fully functional copy, suggesting that the inactivation of one of these loci, leaving only a single copy, might be advantageous over retaining both. However, which copy is inactivated appears to be random (Bikard et al. 2009). 
In conclusion, while it is unlikely that duplicate gene resolution at a single pair of paralogues significantly reduces gene flow among lineages, multiple cases might act in concert to form a barrier that is difficult to overcome. To what degree such recessive incompatibilities affect gene flow patterns during or after speciation in nature is certainly an important question that merits further study.

If divergent gene resolution has played a role in particular instances of speciation or population divergence, one would expect to find that related species have retained (or neo-functionalized) different members of ancestrally duplicated genes. Independent retention of duplicate gene copies in related species has indeed been observed (e.g. Scannell et al. 2006; Sémon & Wolfe 2007). In two related yeasts, for example, speciation seems to have occurred shortly after an autopolyploidization event, followed closely by independent resolution of thousands of duplicate gene copies (Scannell et al. 2007). However, that reciprocal gene losses actually cause hybrid incompatibilities in crosses between these species has not yet been experimentally demonstrated.

Lineages with a history of frequent and recurrent genome duplication, as is the case in many plant taxa, may be particularly prone to incompatibility by divergent gene resolution because of the large pool of duplicated genes present after polyploidy (Taylor et al. 2001). Another possible outcome of independent evolution after gene duplication is functional divergence, or ‘neo-functionalization’, rather than gene loss (Lynch & Conery 2000). Independent neo-functionalization of duplicate genes might have a similar effect on compatibility to gene loss, leading, for example, to outcomes such as developmental system drift, which can also provide barriers to gene flow (True & Haag 2001).

(b) Genetic incompatibility and the plant immune system

Breeders and naturalists have long noticed that many intraspecific and interspecific crosses in plants are associated with a common hybrid weakness condition. This syndrome, termed hybrid necrosis, is observed throughout the plant kingdom, consistently involves widespread cell death, tissue necrosis and weakness, and frequently results in death or sterility of first-generation hybrids (reviewed in Bomblies & Weigel 2007). After having serendipitously discovered the first case of hybrid necrosis in A. thaliana, we performed a systematic screen for intraspecies incompatibilities in this species, with the goal of investigating the earliest genetic consequences of lineage divergence. Some 300 A. thaliana strains were chosen more or less arbitrarily from the worldwide collection of strains present in the stock centres, and combined in nearly 1500 crosses. The survey of first-generation offspring from these crosses identified almost two dozen abnormal hybrids (Bomblies et al. 2007).

Genome-wide analyses revealed a strong and consistent change in gene-expression patterns that pointed to spontaneous pathogen response as a cause of hybrid necrosis. The gene-expression profile was coupled with widespread cell death and strong broad-spectrum disease resistance, further supporting the connection to autoimmunity. As is common with cases of hybrid necrosis in other species, two major genes that act in a dominant manner were found to be responsible in many cases. The first causal gene, DANGEROUS MIX1 (DM1), to be cloned turned out to be a TIR-NB-LRR class pathogen-resistance gene homologue, while the interacting locus (DM2) also mapped to a locus containing a cluster of TIR-NB-LRR genes (Bomblies et al. 2007).

Not all hybrid necrosis cases are dominant, and this dosage-sensitive effect can shed light on the evolution of incompatibilities (Turelli & Orr 2000; Demuth & Wade 2005). An example of recessive hybrid necrosis was discovered after crosses between A. thaliana parental strains that differ from the ones in which DM1 and DM2 were identified. Similar to the F1 cases, these plants were strongly resistant to pathogen attack, and this depended on the defence hormone salicylic acid (SA). Remarkably, one of the responsible genes appears to be a distinct allelic variant at the same TIR-NB-LRR cluster as DM2 (Alcázar et al. 2009). Since there are over 100 NB-LRR loci in the A. thaliana genome, this might indicate that not all of these are equally likely to cause hybrid problems. In this context it is interesting to note that if an allele at one locus can interact with multiple other loci, incompatibilities accumulate much faster than if incompatibilities are always due to independent pairs of loci (Kondrashov 2003). It would be worthwhile to determine how the third case found for DM2, with several alleles at one locus interacting with multiple other loci, affects the increase in incompatibilities over time.
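The contrast between dominant (F1) and recessive (F2) two-locus incompatibilities can be sketched by enumerating Mendelian segregation. This is an idealized illustration, not a description of any specific DM allele: it assumes two unlinked loci, full penetrance, and parents that are each homozygous for the incompatible allele at one locus only. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def f2_genotypes():
    """Yield (incompatible alleles at locus A, at locus B) for each equally likely F2 zygote."""
    gametes = list(product([0, 1], repeat=2))  # 1 = incompatible allele present in the gamete
    for egg, pollen in product(gametes, repeat=2):
        yield egg[0] + pollen[0], egg[1] + pollen[1]


def affected(n_a, n_b, dominant):
    """Dominant model: one copy per locus suffices; recessive model: two copies are needed."""
    threshold = 1 if dominant else 2
    return n_a >= threshold and n_b >= threshold


def f2_affected_fraction(dominant):
    genotypes = list(f2_genotypes())
    hits = sum(affected(a, b, dominant) for a, b in genotypes)
    return Fraction(hits, len(genotypes))


F1 = (1, 1)  # the F1 is heterozygous at both loci

for dominant in (True, False):
    label = "dominant" if dominant else "recessive"
    print(label, "F1 affected:", affected(*F1, dominant),
          "F2 affected:", f2_affected_fraction(dominant))
```

Under these assumptions the dominant model affects every F1 plant and 9/16 of the F2, whereas the recessive model spares the F1 and affects 1/16 of the F2.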

NB-LRR proteins are key molecules in signalling cascades that often culminate in the activation of programmed cell death (see for review Dangl & Jones 2001; Jones & Dangl 2006), explaining why their spontaneous activation in hybrid necrosis situations is associated with widespread tissue damage and severely curtailed growth. NB-LRR genes, many of which encode disease-resistance (R) proteins, are the most diverse family of genes in plant genomes, which also holds true for A. thaliana (Bakker et al. 2006; Clark et al. 2007). Since the products of these genes mediate the activation of cell death, which is highly detrimental when not adequately controlled, it is perhaps not surprising that divergence in these genes might commonly be involved in early and phenotypically severe genetic incompatibilities in many different species. Different forms of selection appear to be responsible for their diversity, including both rapid adaptation to shifting pathogen populations and frequency-dependent or balancing selection. Nevertheless, we propose that hybrid necrosis is a repeated by-product of presumably adaptive evolution of disease-resistance genes.

The work on hybrid necrosis has important implications for theoretical treatments of species differences. Several authors have pointed out the discontinuity between explicit quantitative genetic theories of speciation and Bateson–Dobzhansky–Muller type models that explain qualitative hybrid phenotypes such as lethality or sterility (e.g. Demuth & Wade 2005). The expression of hybrid necrosis is sensitive to the environment, and can be suppressed by temperatures that are above the normal range for a species (Bomblies & Weigel 2007; Bomblies et al. 2007). Moreover, the discovery of related F1 and F2 cases suggests that even weaker interactions might become fixed in inbred strains. Hybrid necrosis thus provides a powerful arena for empirical tests of some long-held notions regarding the role of epistasis in within-species divergence (Orr 2001; Orr & Turelli 2001; Demuth & Wade 2005). A final advantage of the hybrid necrosis system is that the lethal effects do not require additional, unknown genes from the hybrid background, as in other cases (Brideau et al. 2006), greatly facilitating their genetic and mechanistic dissection.

A recent mathematical modelling study has demonstrated that the evolution of interacting components of the plant immune system, coupled with hybrid necrosis, can under some scenarios lead to reproductive isolation and speciation (Ispolatov & Doebeli 2009), further supporting the notion that the study of hybrid necrosis is relevant to speciation research. In this context, it will be interesting to compare both the diversity and patterns of R gene polymorphisms between selfing and outcrossing species, and the effect of differences, if any, on the frequency of hybrid necrotic interactions among F1 and F2 offspring, as these will address several predictions that have been made for patterns of inter- and intraspecific differences (e.g. Orr 2001).

4. Outlook

Above, we have discussed various genetic changes that cause reduced gene flow. However, one of the first events in this process is often adaptation to different niches, which leads to physical separation of populations, which in turn can then accumulate genetic incompatibilities (Coyne & Orr 2004). A recent study of A. halleri has advanced our understanding of how plants can adapt to very different habitats. This species is a metal hyperaccumulator, which distinguishes it from A. lyrata and A. thaliana. Consistent with this trait having evolved over a quite short time frame, its genetic architecture appears to be relatively simple. An important component is the increased capacity of A. halleri roots to export metals such as cadmium or zinc to the leaves, which in A. halleri are more tolerant to high-level metal concentrations than in A. lyrata or A. thaliana. The more efficient xylem loading of metals in the root is to a large extent caused by increased expression of a heavy metal ATPase gene, HMA4, because of the combination of cis-regulatory differences and triplication of the gene (Hanikenne et al. 2008). HMA4 may also have been involved in the independent evolution of heavy metal tolerance and hyperaccumulation in another Brassicaceae, Thlaspi caerulescens (Papoyan & Kochian 2004). Furthermore, an increase in copy number appears to be responsible for elevated expression of additional genes associated with metal tolerance in A. halleri (Dräger et al. 2004).

Intriguingly, there are also several connections between metal tolerance and disease resistance, which we have previously discussed as a potential isolating mechanism. Arabidopsis halleri defensin genes, originally identified based on their involvement in pathogen defence, confer heavy metal tolerance when introduced into A. thaliana (Mirouze et al. 2006). Similarly, high levels of the defence hormone SA are associated with nickel hyperaccumulation in six Thlaspi species, and high SA levels can increase nickel tolerance in A. thaliana (Freeman et al. 2005). Although little is known about whether these particular local adaptations generate gene flow barriers, in at least one case increased genetic differentiation was observed at candidate metal tolerance loci in Thlaspi caerulescens (Besnard et al. 2008). Moreover, a locus causing genetic incompatibility between copper-tolerant and -intolerant varieties of Mimulus guttatus is very closely linked, if not identical, to a locus conferring copper tolerance (Macnair & Christie 1983), indicating that adaptations to the abiotic environment and genetic compatibility can be co-selected.

It will be important to pursue questions of habitat adaptation and gene flow barriers in Arabidopsis as well, and to determine whether such local adaptations alter gene flow among populations. Such studies will be greatly facilitated by whole-genome scans as powerful tools for examining genomic differentiation associated with local adaptation and its effects on gene flow (Turner et al. 2005, 2008), or for asking whether structural rearrangements can affect gene flow and how big their ‘footprints’ of linkage drag are. Genomic approaches can also be used, for example, to compare recombination within populations versus between populations. It is thus with great enthusiasm that we greet the current sequencing revolution, which promises to deliver an until recently unimaginable number of genome sequences. Having genome sequences for many strains from each of the Arabidopsis species (Ossowski et al. 2008; Weigel & Mott 2009) will provide the basis for analysing and comparing patterns of within-species sequence polymorphism, which in turn will inform us about gene flow between species in the past. In addition, we will be able to address questions such as how natural variation within species gives rise to species-level differentiation, and how this in turn leads to reproductive isolation and speciation.

Additional investigations in future may also focus on the genetics of isolation among extant sympatric species such as A. lyrata and A. arenosa. Some A. arenosa populations are diploid and have the same chromosome number as A. lyrata. As improved technologies make mapping more accessible in these species, perhaps using A. thaliana as a bridge, since A. thaliana hybrids can be produced with both A. arenosa and A. lyrata (Nasrallah et al. 2000; Beaulieu et al. 2009), it will become feasible to assess what factors are important in maintaining these species as distinct entities. Further investigation of largely unstudied species in the genus, such as the aptly named A. neglecta, as well as A. cebennensis, A. croatica and A. pedemontana, which often have very narrow distribution ranges, may provide additional models for speciation and local adaptation.


Our work on natural genetic variation has been supported by an NIH Ruth Kirschstein NRSA postdoctoral fellowship (K.B.), by DFG grant ERA-PG ARelatives, a Gottfried Wilhelm Leibniz Award of the DFG and the Max Planck Society (D.W.).


  • Present address: Department of Organismic and Evolutionary Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.

  • One contribution of 11 to a Theme Issue ‘Genomics of speciation’.

